Hacker News | liampulles's comments

Frankly, calling software development engineering is quite debatable. We should call fewer things "engineering" that aren't backed by actual engineering qualifications.

Being a branch of engineering implies a certain level of professionalism and accountability that the software development community actively resists.

Engineering like the guy in the booth at a show is a sound engineer? Talented: check; challenging work: check; valuable: check; creative: check. "Engineering" like designing a building, bridge, or power line? Nope.

It's not a protected term in the US, so it's jarring to those of us living where it is.


I work with someone who does great QA work. They know how to rip something apart, they understand the user's non-technical perspective and approach, they understand what edge cases to look out for, and they have the actual equipment to test on different physical devices (and so on).

Most importantly, they have the diligence and patience to methodically test subtly different cases, which I frankly don't have.

On the question of whether QA slows things down, I have to ask: slows down what? Slows down releasing something broken? Why is that something to optimize for? We should always be asking how long it takes to release the right thing (indeed I'm most productive when I can close a ticket after concluding nothing is needed).


If all/most QA people were like this then no one would be complaining.

Sure, but this issue is not specific to QA. Any role you depend on will lead to issues and frustration if incompetent people occupy it.

Understanding algorithmic complexity (in particular, avoiding rework in loops) is useful in any language, and is sage advice.

In practice though, for most enterprise web services, a lot of real world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.)
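As a rough sketch of that conversion (the `bulkSelectQuery` helper is hypothetical, and the `?` placeholder style assumes MySQL-flavored parameters), the N+1 pattern of one query per ID can become a single `IN (...)` statement:

```go
package main

import (
	"fmt"
	"strings"
)

// bulkSelectQuery builds one "IN (...)" query for a batch of IDs,
// replacing a loop that issues one query per ID.
func bulkSelectQuery(table string, ids []int) (string, []any) {
	placeholders := make([]string, len(ids))
	args := make([]any, len(ids))
	for i, id := range ids {
		placeholders[i] = "?"
		args[i] = id
	}
	query := fmt.Sprintf("SELECT id, name FROM %s WHERE id IN (%s)",
		table, strings.Join(placeholders, ", "))
	return query, args
}

func main() {
	q, args := bulkSelectQuery("users", []int{1, 2, 3})
	fmt.Println(q)
	fmt.Println(len(args))
}
```

One round trip instead of N is where the win comes from; the query text itself barely matters next to the saved network latency.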

I'm hopeful that improvements in LLMs mean we can ditch ORMs (adopted on the premise that they make it quicker to write queries and the in-between mapping code) and instead make good use of SQL to harness the power that modern databases provide.


Well before LLMs, I already ditched ORMs. What sometimes holds back SQL is not having a convenient way to call it. Statically typed languages require you to manually set result types unless you use a compile-time query builder, but that's a whole can of worms. Besides that, many client libs aren't so convenient out of the box, so you still need a few of your own helpers.

Also, before jsonb existed, you'd often run into big blobs of properties you didn't care to split up into tables. Now it takes some discipline to avoid shoving things into jsonb that shouldn't be there.


Let me introduce you to LINQ…

https://learn.microsoft.com/en-us/dotnet/csharp/linq/

It solves all of your issues with “ORMs” (it’s really more than just an ORM)


That's part of the query builder "can of worms" I mentioned. It might be one of the better examples, but I don't know what limitations it has vs. regular SQL - the page mentions not having native support for count/max, for example.

Author here. DB and external service calls are often the biggest wins - thanks for calling that out.

In my demo app, the CPU hotspots were entirely in application code, not I/O wait. And across a fleet, even "smaller" gains in CPU and heap compound into real cost and throughput differences. They're different problems, but your point is valid. The goal here is to get more folks thinking about other aspects of performance, especially when software is running at scale.


My experience profiling is that I/O wait is never the problem. However, the app may actually be spending most of its CPU time interacting with the database. In general, networks have gotten so fast relative to CPU that the CPU cost of marshalling or serializing data across a protocol ends up being the limiting factor. I got a major speedup once just by updating the JSON serialization library an app used.

> I'm hopeful that improvements in LLMs mean we can ditch ORMs (adopted on the premise that they make it quicker to write queries and the in-between mapping code) and instead make good use of SQL to harness the power that modern databases provide.

Maybe we can ditch active models like those we see in sqlalchemy, but the typed query builders that come with ORMs are going to become more important, not less. Leveraging the compiler to catch bad queries is a huge win.


I use Ecto with Elixir in my day job, and it has a pretty good typed query-building solution. BUT: I still regularly run into issues where I have to use a fragment to do the specific SQL operation I want, or I start my app and discover it hasn't caught the issue with my query (related to my specific MySQL version, or whatever). Which unfortunately defeats the purpose.

My experience with something like the latest Claude Code models these days has been that they are pretty good at SQL. I think some combination of LLM review of SQL code with smoke tests would do the trick here.


> a lot of real world performance comes down to how efficiently you are calling external services (including the database)

Apart from that, my experience over the last 20 years has been that a lot of performance is lost to memory allocation (in GCed languages like Java or JavaScript). Removing allocation in hot loops really goes a long way and can lead to 10 or 100 fold runtime improvements.
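A small illustrative Go sketch of the idea (the `joinIDs` helper is hypothetical): instead of allocating a new string on every iteration, the hot loop appends into one preallocated byte buffer, so only the initial `make` and the final conversion allocate.

```go
package main

import (
	"fmt"
	"strconv"
)

// joinIDs formats IDs into a comma-separated string. A naive version
// using s += "..." allocates a fresh string per iteration; this one
// reuses a single buffer across the whole loop.
func joinIDs(ids []int) string {
	buf := make([]byte, 0, len(ids)*8) // one up-front allocation
	for i, id := range ids {
		if i > 0 {
			buf = append(buf, ',')
		}
		buf = strconv.AppendInt(buf, int64(id), 10)
	}
	return string(buf)
}

func main() {
	fmt.Println(joinIDs([]int{10, 20, 30}))
}
```

The same pattern (preallocate once, append in the loop) applies to slices, maps with known sizes, and reusable scratch buffers.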


This has been the key for me as well, memory allocation in hot paths is usually the first optimization that I look for. It's quite surprising to see how far very inefficient algorithms (time complexity wise) can go as long as no allocations are made.

This applies to non-GC languages as well. Memory management is slow. Even with manual memory management I have been able to dramatically speed up code simply by modifying how memory is allocated.

Parts of the GC language crowd in particular have come to hold some false optimistic beliefs about how well a GC can handle allocations. Also, Java and C# can sneak in silly heap allocations in the wrong places (e.g. autoboxing). So there is a tendency for programs to overload the GC with avoidable work.


> Parts of the GC language crowd in particular have come to hold some false optimistic beliefs about how well a GC can handle allocations.

Yep, the idea is "we've made allocations fast, so allocate away!". But that's a trap — every allocation puts pressure on the GC, no matter how fast you've made the very act of allocating. It's a terrible mindset to encourage the users of your language to have.

Then there's the more insidious problem — to make allocations fast you must have traded something off, like GC throughput. So now your GC is slower and encourages programmers to allocate, which makes it even slower.


Autoboxing is more of a Java problem, mainly because of type erasure with generics. C# has "proper" generics, and no hidden boxing occurs there.

> ditch ORMs ... make good use of SQL

I think Java (or other JVM languages) are then best positioned, because of jooq. Still the best SQL generation library I've used.


Jooq with Kotlin for a back-end has been the best of both worlds for me.

Much cleaner, shorter code and type safety with Postgres (my schema tends to be highly normalized too). And these days I’ve got it well integrated with Zod for type safe JS/TS front-ends as well.


Anytime I use a language other than Java it's always jooq that I miss. It's that good.

I'm rather partial to MyBatis (and Liquibase) but I might have to give Jooq a try.

thumbs up for jooq

> Just converting a loop of queries into bulk ones can help loads

This is usually the first thing I look for when someone is complaining about speed. Developers often miss it because they are developing against a database on their local machine which removes any of the network latency that exists in deployed environments.


Easy to get wrong as well.

There's a balance with a DB. Doing 1 or 2 row queries 1000 times is obviously inefficient, but making a 1M row query can have its own set of problems all the same (even if you need that 1M).

It'll depend on the hardware, but you really want to make sure that anything you do with a DB gives other instances of your application a chance to interact with it too. Nothing worse than finding out the 2 row insert is being blocked by a million row read for 20 seconds.
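One hedged sketch of keeping statements moderate (the `chunkIDs` helper is invented for illustration): split a huge key set into batches, so each statement runs briefly and other transactions can interleave between them.

```go
package main

import "fmt"

// chunkIDs splits ids into batches of at most n, so one giant
// statement becomes several moderate ones. Each returned batch is a
// subslice of the input (no copying).
func chunkIDs(ids []int, n int) [][]int {
	var batches [][]int
	for len(ids) > n {
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	if len(ids) > 0 {
		batches = append(batches, ids)
	}
	return batches
}

func main() {
	batches := chunkIDs([]int{1, 2, 3, 4, 5}, 2)
	fmt.Println(len(batches)) // batches of 2, 2, and 1
}
```

The right batch size is workload-dependent; the point is only that it sits between "one row at a time" and "a million rows in one statement".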

There's also a question of when you should and shouldn't join data. It's not always a black and white "just let the DB handle it". Sometimes the better route to go down is to make 2 queries rather than joining, particularly if it's something where the main table pulls in 1000 rows with only 10 unique rows pulled from the subtable. Of course, this all depends on how wide these things are as well.

But 100% agree, ORMs are the worst way to handle all these things. They very rarely do the right thing out of the box and to make them fast you ultimately end up needing to comprehend the SQL they are emitting in the first place and potentially you end up writing custom SQL anyways.


ORMs are a caching layer for dev time.

They store up conserved programming time and then spend it all at once when you hit the edge case.

If you never hit the case, it's great. As soon as you do, it's all returned with interest :)


The question is why we don't have database management systems that integrate tightly with the programming language. Instead we have to communicate between two different paradigms using a textual language, which is itself inefficient.

We tried that in 90’s RAD environments like Foxpro and others. If it fits the problem, they were great! If not, it’s even worse than with an ORM. They rarely fit today since they were all (or mostly) local-first or even local-only. Scaling was either not possible or pretty difficult.

https://permazen.io/ exists and is a simpler yet still very powerful way to think about databases (for java but the concepts are general).

But it's only really efficient if it can run code right next to the data via fast access - ideally the same machine. The moment you have a DB running on separate hardware or far away from the client, it's going to be slower.

SQL is a very compact way to communicate what you want from a complex database in a way that can be statically analyzed and dynamically optimized. It's also sandboxable. Not so easy to replace.


Because every single database vendor will try to lock down their users to their DBMS.

Oracle is a prime example of this. Stored procedures are the place to put all business logic according to Oracle documentation.

This caused a backlash from escaping developers, who then declared that business logic should never live inside the database, to avoid vendor lock-in.

There's no ideal solution, just tradeoffs.


> Because every single database vendor will try to lock down their users to their DBMS.

I mean, that already happens. It's quite rare to see someone migrate from one database to another. Even if they stuck to pure SQL for everything, it's still a pretty daunting process as Postgres SQL and MSSQL won't be the same thing.


> It's quite rare to see someone migrate from one database to another.

I'm not discounting the level of effort involved, but I think the reason you don't see this often is that it's rare for simply changing DBMSs to be beneficial in and of itself.

And even if it were frictionless (i.e., if we had discovered the ORM Samarkand), the real choices are so limited that even if you did it regularly, you would soon run out of DBMSs to try.


It would obviously be beneficial to go from super expensive to free (Postgres). But no one does - why? Because SQL is just a veneer over two otherwise completely different things.

I migrated a database with some stored procedures from MSSQL Server to Oracle. Then lots of logic was added as stored procedures to the database. Then I migrated the same system to MySQL. Including the SP. Doesn't happen often, but it does happen.

The answer is simple: a model optimized for storage and a model designed for processing are two different things. The languages used to describe and query them have to be different.

> The languages used to describe and query them have to be different.

Absolutely not.

That which is asserted without evidence can be dismissed without evidence.


> Absolutely not.

Can also be dismissed without evidence


Sounds like you're too thick to understand Hitchens' quote.

Isn’t that what Convex is doing?

I agree with you fully yes. One has to watch out for overwhelmingly large or locking queries.

I’d say local data structures and the algorithms atop them, and external services like DBs, etc., are both just “resources” in a more abstract sense. Optimizing performance is a matter of using the right resources for the right things. Algorithms help a lot when you’re building FE components (even if the server is rendering them, or “rendering” responses for the FE).

I’d also argue “micro-ORMs” like Diesel (which isn’t really much like ActiveRecord, Hibernate, etc., but more a very thin DSL/interface that maps SQL types to Rust types), combined with LLMs, are the ideal solution (assuming we still want humans to be able to easily understand and trust the code generated). And there’s a big argument to be made for schema migration management being done at the app level (with plain SQL for migrations).

All that said, at work, we use Rails. And ActiveRecord’s “includes/preload/eager_load” methods are fantastic solutions to 99% of cases of querying for things efficiently, and are far more clear than all the SQL you’d have to write to replicate them.


I've been using sqlx + postgres very successfully with claude in the last couple of months. However, we've been raw dogging MySQL and node.js at work for over a year, and I also used raw SQLite from C++ before that (I am still traumatized by all the pointers, never again), so...

I've always found ORMs to be performance killers. It always worked out better to write the SQL directly. The idea that you should have a one-to-one correspondence between your data objects and your database objects is disastrous unless your data storage is trivial.

I worked with ORM (EclipseLink) and used SQL just fine.

When using JDBC, I quickly found myself implementing a poor man's ORM.


The issue of creating a DB wrapper doesn't go away by using an ORM. One of the complaints about ORMs I have, in practice, is people often create another wrapper around it.

I saw that a lot, too. I remember one project using Hibernate where the people involved decided to keep the Hibernate objects "pure" and then had them all wrapped in another object they used to keep information that didn't go into the database.

The whole project must have had 3x the number of classes that the actual complexity required, and keeping it all straight was something of a headache. As was onboarding new people, who always struggled with Hibernate.


EclipseLink never received enough love.

> external services (including the database)

Or even the local filesystem :)

CPU calls are cheap, memory is pretty cheap, disk is bad, spinning disk is very bad, network is 'good luck'.

You can O(pretty bad) most of the time as long as you stay within the right category of those.


> Understanding algorithmic complexity (in particular, avoiding rework in loops) is useful in any language, and is sage advice.

I recently fixed a treesitter perf issue (for myself) in neovim by just DFSing down the parse tree instead of doing what most textobject plugins do, which is:

-> walk the entire tree for all subtrees that match this metadata

-> now you have a list of matching subtrees, iterate through said subtree nodes, and see which ones are "close" to your cursor.

But in neovim, when I type "daf", I usually just want to delete the function right under my cursor. So you can implement the same behavior by just... DFSing down the parse tree (which has line numbers embedded per node) and detecting the matches yourself.
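A rough Go sketch of the approach described (the `Node` type and names are invented stand-ins for a real tree-sitter node): walk straight down the tree, pruning any subtree whose line range doesn't contain the cursor, and keep the innermost node of the wanted kind.

```go
package main

import "fmt"

// Node is a hypothetical parse-tree node with an inclusive line range,
// standing in for a tree-sitter node.
type Node struct {
	Kind       string
	Start, End int
	Children   []*Node
}

// enclosing descends the tree (the line-range check prunes every
// subtree not containing the cursor, so this follows a single path)
// and returns the innermost node of the given kind under the cursor.
func enclosing(n *Node, line int, kind string) *Node {
	if n == nil || line < n.Start || line > n.End {
		return nil
	}
	var best *Node
	if n.Kind == kind {
		best = n
	}
	for _, c := range n.Children {
		if inner := enclosing(c, line, kind); inner != nil {
			best = inner
		}
	}
	return best
}

func main() {
	tree := &Node{Kind: "source", Start: 1, End: 100, Children: []*Node{
		{Kind: "function", Start: 10, End: 40, Children: []*Node{
			{Kind: "function", Start: 20, End: 30},
		}},
	}}
	fmt.Println(enclosing(tree, 25, "function").Start) // inner function
}
```

Compared with "collect every matching subtree, then pick the closest", this touches only the nodes on the path to the cursor, which is what makes it fast on huge files.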

In school, when I did competitive programming and TCS, these gains often came from super clever invariants that you would just sit there for hours, days, weeks, just mulling it over. Then suddenly realize how to do it more cleverly and the entire problem falls away (and a bunch of smart people praise you for being smart :D). This was not one of them - it was just, "go bypass the API and do it faster, but possibly less maintainably".

In industry, it's often trying to manage the tradeoff between readability, maintainability, etc. I'm very much happy to just use some dumb n^2 pattern for n <= 10 in some loop that I don't really care much about, rather than start pulling out some clever state manipulation that could lead to pretty "menial" issues such as:

- accidental mutable variables and duplicating / reusing them later in the code

- when I look back in a week, "What the hell am I doing here?"

- or just tricky logic in general

I only noticed the treesitter textobject issue because I genuinely started working with 1MB autogen C files at work. So... yeah...

I could go and bug the maintainers to expose a "query over text range" API (they only have query, and node text range separately, I believe. At least from the minimal research I've done; I haven't kept up to date with it). But now that ties into considerations far beyond myself - does this expose state in a way that isn't intuitive? Are we adding composable primitives or just ad hoc adding features into the library to make it faster because of the tighter coupling? etc. etc.

I used to think of all of that as just kind of "bs accidentals" and "why shouldn't we just be able to write the best algorithms possible". As a maintainer of some systems now... nah, the architectural design is sometimes more fun!

I may not have these super clever flashes of insight anymore but I feel like my horizons have broadened (though part of it is because GPT Pro started 1 shotting my favorite competitive programming problems circa late 2025 D: )


You are not wrong. There are of course tradeoffs here. There are various things that can improve web service performance, but if we are talking about the performance of a web service in comparison to other more general concerns, like maintainability, then I agree trying to make small performance wins falls pretty low on the list.

After all, even if one has some slow and beastly, unoptimized Spring Boot container that chews through RAM, it's not that expensive (in the grand scheme of things) to just replicate more instances of it.


Please stop using the term prompt engineering, context engineering, etc. to define formatting the text that we send an LLM.

It's already quite debatable whether software developers should be called software engineers, but this is just ridiculous.


Why?


I think I want to know exactly what SQL ends up hitting the DB, and I want to fine tune it precisely.

This is the same issue I've had with ORMs - I get that they make it easier to generate functionality at speed, but ultimately I want control over the biggest performance lever I have available to me.


The economic divide is very apparent here in South Africa, and it leads to perverse effects on our infrastructure.

As the country's ability to provide basic utilities falters, sufficiently wealthy households go partially or completely off grid, depriving the utilities of revenue and further exacerbating the problem.


Yeah, given that the HN audience might be mostly from 1st world countries.

In the 3rd world - north of Limpopo - and everywhere else - you soon discover that where you're born is like being 60 points down with a 2-minute warning while you're in your own end zone.

Where you are born matters more than your own talent, etc., by a factor of 10.


It seems like a lot is being left out. What is considered sufficiently wealthy and why are they going off grid?


> What is considered sufficiently wealthy

What you might consider "middle class", suburban households, and up. These "upgrades" are something that you can roll into your mortgage.

> Why are they going off grid?

Electricity and water availability issues.

For example, I live in an area where water is unavailable for days to weeks at a time, every few weeks. The systemic issue in my municipality is that they have not kept up with maintenance of the water pipes, which means a significant portion of water pumped from the national provider is lost via leaks. Thus, when some pumps and reservoirs go offline due to power disruptions, the loss of flow can lead to no water in higher-lying areas, which takes time to restore.

There are private solutions to this - the most affordable option is to install a big water tank on your property that in effect acts as a water battery. A more expensive solution is to install a borehole on your property, and to draw water from the water table (this might make sense for a complex, or a rich household).


https://en.wikipedia.org/wiki/South_African_energy_crisis

Not OP but it's because there are rolling blackouts.

> What is considered sufficiently wealthy

Ability to afford your own solar. A lot of South Africans can't, and with prices increasing 100x over the past 20 years, it makes it difficult to get out of that poverty.


Likewise for school districts in wealthier areas of the United States.


My evaluation on this is always to ask "are these two things intrinsically the same or coincidentally the same?"

Answering that question requires domain interrogation. If after that I'm still unsure, then I err on the side of having two copies with maybe a comment on both pointing to the other.


I actually like the explicit error and context value stuff in Go, though I recognise I'm in the minority.

The main reason is more to do with maintaining Go code than writing it: I find it very helpful when reading Go code and debugging it, to see actual containers of values get passed around.

Also, whenever I write a bit of boilerplate to return an error up, that is a reminder to consider the failure paths of that call.
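A minimal sketch of that boilerplate-as-reminder pattern (the `Config` type and the file path are invented for illustration): each call that can fail gets a visible check, and each return wraps the error with local context via `%w` so it can still be unwrapped upstream.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Config is a hypothetical settings struct.
type Config struct {
	Port int `json:"port"`
}

// loadConfig returns errors explicitly; writing each wrap forces you
// to name what was being attempted when the failure happened.
func loadConfig(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("reading config %q: %w", path, err)
	}
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parsing config %q: %w", path, err)
	}
	return &cfg, nil
}

func main() {
	_, err := loadConfig("/no/such/file.json")
	fmt.Println(err != nil) // the failure path is a plain value
}
```

The payoff shows up while debugging: the error string reads as a trail of the exact operations that failed, with no stack-unwinding magic involved.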

Finally, I like the style of having a very clear control flow. I prefer to see the state getting passed in and returned back, rather than "hidden away".

I know that there are other approaches to having clear error values, like an encapsulated return value, and I like that approach as well - but there is also virtue in having simple values. And yes there are definitely footguns due to historical design choices, but the Go language server is pretty good at flagging those, and it is the stubborn commitment to maintaining the major API V1 that makes the language server actually reliable to use (my experience working with Elixir's language server has been quite different, for example).


It's well worth subscribing to the tz updates mailing list, not just to be cognisant of timezone changes, but to add a bit of bemusement to your day.


I see a future where I program at work less, which is sad but c'est la vie. I think the challenge of the job will be herding and managing my own context for larger codebases managed by smaller teams, and finding ways to allow for more experimental/less verified code in prod. And plenty of consulting work for companies which have vibe coded their business and who are left with a totally fucked data model (if not codebase).

A Private (system) Investigator. :)

