I was initially surprised by the pushback this article is getting.
Then I remembered that this is data-oriented design advice, and I imagine most people on this forum (myself included most of the time) are writing line-of-business web apps where this advice seems like nonsense. I had already internalised the context, and wasn't planning to go apply this to my Laravel backend code.
A heuristic: if in your usual daily work you don't need to think about the instruction cache, then you should probably ignore this advice.
If you haven't yet and want to get a taste of when this advice matters, go find Mike Acton's "Typical C++ Bullshit" and decipher the cryptic notes. This article is like an understandable distillation of that.
Despite what Casey Muratori is trying to argue (and I'm largely sympathetic to his efforts) most line-of-business software needs to optimise for changeability and correctness ("programming over time") not performance.
I think this is still very much applicable in OOP.
Developers tend to break complex business logic within classes down into smaller private methods to keep the code DRY. The “push ifs up” mantra is really useful here to ensure the branching doesn’t end up distributed amongst those methods.
The “push fors down” is also very relevant when most call threads end up at an expensive database query. It’s common to see upstream for loops that end up making many DB calls downstream somewhere when the looping could have been replaced by a “where” clause or “join” in the SQL.
In fact, the “push fors down” mantra is useful even at the architecture level, as it’s usually better to push looping logic for doing aggregations or filtering into a DAO where it can be optimized close to the DB, rather than pulling a bunch of objects out of the database and looping through them.
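To make that concrete, here's a minimal sketch of the difference (SQLite and invented table/column names, purely for illustration):

```python
import sqlite3

# Hypothetical sketch: the table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO orders VALUES (1, 10, 5.0), (2, 10, 7.5), (3, 11, 3.0);
""")

def customer_totals_loop(customer_ids):
    # Looping in the application: one query per customer (the N+1 pattern).
    result = {}
    for cid in customer_ids:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cid,),
        ).fetchone()
        result[cid] = row[0]
    return result

def customer_totals_pushed_down(customer_ids):
    # The "for" pushed down into SQL: one query, the database does the looping.
    placeholders = ",".join("?" * len(customer_ids))
    rows = conn.execute(
        "SELECT customer_id, SUM(total) FROM orders "
        f"WHERE customer_id IN ({placeholders}) GROUP BY customer_id",
        customer_ids,
    ).fetchall()
    totals = {cid: 0 for cid in customer_ids}
    totals.update(dict(rows))
    return totals

print(customer_totals_loop([10, 11]))         # {10: 12.5, 11: 3.0}
print(customer_totals_pushed_down([10, 11]))  # {10: 12.5, 11: 3.0}
```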
I love simple and clear design principles like this!
Though, as with all design principles, one needs to consider it deliberately vs applying it as dogma!
I've been experimenting with a similar idea to "pushing ifs up" in OOP, too: Question ifs in the code of a class and just use more classes.
For example - python terminology, as the idea came from there - don't have a single class "BackupStatus" that uses 2-3 booleans to figure out how to compute a duration, a status message and such. Instead, have a protocol "BackupStatus" that declares what a BackupStatus has to do, and implement these for a FailedBackup, a PartialBackup, a SuccessfulBackup and so on.
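A minimal sketch of what I mean (the class names and fields are made up for the example):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Protocol

class BackupStatus(Protocol):
    def duration(self) -> timedelta: ...
    def message(self) -> str: ...

@dataclass
class SuccessfulBackup:
    started: datetime
    finished: datetime

    def duration(self) -> timedelta:
        return self.finished - self.started

    def message(self) -> str:
        return f"Backup completed in {self.duration()}"

@dataclass
class FailedBackup:
    started: datetime
    error: str

    def duration(self) -> timedelta:
        return timedelta(0)  # never finished, so no meaningful duration

    def message(self) -> str:
        return f"Backup failed: {self.error}"

def report(status: BackupStatus) -> None:
    # No booleans to inspect here; each concrete class knows its own answer.
    print(status.message())

report(SuccessfulBackup(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 2, 5)))
report(FailedBackup(datetime(2024, 1, 1, 2, 0), "disk full"))
```

A PartialBackup slots in the same way, as another small class implementing the protocol.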
It's different from what I did for a long time, and some people are weirded out by it at first. But at the same time, it allows me to create a bunch of small, complete and "obviously correct" classes. Or at the least, very concrete cases to reason about with somewhat technical stakeholders. And once your backup status classes / enum members / whatever are just correct, then you can go and discuss when to create and use which of these states.
It's not for everything naturally, but it quickly gives a solid foundation to build more complex things upon.
It does enable proxy shenanigans though, which is something to be mindful of:
I can be reasonably certain of what a reasonably complicated piece of logic is doing/capable of doing.
Comprehending a function using an object implementing a protocol, on the other hand, will end up requiring me to cross reference logic across 4 or more files before I can feel confident knowing what to expect: the protocol definition and/or the basic implementation, the calling context, the factory context that was responsible for creating the object, and the actual definition of the current object. All of which may be trivial, but you need to actually look at them all.
It's the difference between a breaker box and a house of light switches: the switches are all trivial (you hope), but you still hit the breaker when you need to be sure that you aren't going to be surprised.
Yeah. I'm kinda thinking about writing a blog post about that project, because I kinda like how it turns out.
Imo, the method I described is, to me, mostly reasonable for implementing closed and fairly complete domain objects. In this case, understanding the protocol is more akin to understanding the concept from the problem domain, and the implementing classes are the different cases or states the thing can be in. In your analogy, I'd compare the main thing I use this for to fuses: simple, encapsulated and fulfilling a purpose - just implemented in different ways depending on the context.
On the other hand, if you're dealing with implementing a flow chart, I've grown to very much like a decently structured imperative function - supported by these smaller objects.
Yeah, data-oriented design (DOD) always seems to get people riled up. I think it largely gets this type of reaction b/c DOD implies that many aspects of the dominant object-oriented approach are wrong.
> most line-of-business software needs to optimise for changeability and correctness, not performance.
It's a shame that so many see changeability and performance in opposition with each other. I've yet to find compelling evidence that such is the case.
> It's a shame that so many see changeability and performance in opposition with each other. I've yet to find compelling evidence that such is the case.
I think I agree with some of the sibling comments that you can write in a default style that is both maintainable and performant. Regularly though, when profiling and optimising, the improvement can reduce maintainability.
Two examples from the last 6 months:
A key routine is now implemented in assembly for our embedded target because we couldn't reliably make the compiler generate the right code. We now have to maintain that in parallel with the C code implementation we use on every other target.
We had a clean layering between two components. We have had to "dirty" that layering a bit to give the lower layer the information needed to run fast in all cases.
With thought you can often minimise or eliminate the impact, but sometimes you take the trade-off.
If you had good tooling, you could annotate the assembly routine as being equivalent to the C implementation; then maintenance would be a lot easier, since the compiler would yell at you if they ever diverged.
Depending on how you've dirtied your abstractions, language tooling might help with that (e.g. if you could decouple the abstraction from the implementation, and map multiple abstractions to the same data), but I don't know whether that could work, or if it'd even be an improvement if it did.
> If you had good tooling, you could annotate the assembly routine as being equivalent to the C implementation; then maintenance would be a lot easier, since the compiler would yell at you if they ever diverged.
Is that something actual compilers can do right now? I don't think I've heard of anything like that, though I don't work that closely with compilers. Furthermore, if the compiler can recognise that the assembly routine is equivalent to the C implementation, wouldn't it also be able to generate the same routine?
>Is that something actual compilers can do right now?
it is not (outside of research)
>if the compiler can recognise that the assembly routine is equivalent to the C implementation, wouldn't it also be able to generate the same routine?
If you're willing to provide the equivalence proof yourself, it is much easier to verify such a proof than to produce one. The more work you're willing to put in when writing it, the simpler the proof verifier becomes. You could probably write a basic school arithmetic proof verifier within an hour or so (but it would not be pleasant to use).
I've thought about this idea a lot myself and I think it should be feasible, (maybe not with assembly right away). You could write an easily readable function in C for code review/verification purposes and then specify a list of transformations. Each transformation must preserve the semantics of the original function. For example "unroll this loop by a factor of 4", "Fuse these 4 additions in a single vector instruction". It would be a pain to write such a transformation list (AI assistance maybe?) but once written you can rely on the compiler to make sure there are no mistakes.
I guess you could run roughly the same set of unit tests on both implementations even if you can't formally prove equivalence. The unit tests would likely need to be adapted a little for each implementation. Of course, you then need to prove that the two sets of unit tests are equivalent :)
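Something like this minimal sketch, say, where the same randomised test vectors are run against the readable reference and a stand-in for the optimised version (all function names are invented for illustration):

```python
import random

def checksum_reference(data: bytes) -> int:
    # The simple, readable version: the one you review.
    total = 0
    for b in data:
        total = (total + b) & 0xFFFF
    return total

def checksum_optimized(data: bytes) -> int:
    # Stand-in for the hand-tuned version (in reality: assembly / SIMD).
    return sum(data) & 0xFFFF

def test_equivalence(iterations: int = 10_000) -> None:
    rng = random.Random(42)  # fixed seed so failures are reproducible
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(512)))
        assert checksum_reference(data) == checksum_optimized(data), data

test_equivalence()
print("implementations agree on all sampled inputs")
```

It's not a proof, of course, but it catches divergence cheaply and the same vectors can be replayed on the embedded target.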
Well it's hard to argue about that tradeoff in general, but I think the existence of languages like Python, Ruby and PHP is compelling. Though I'd accept the argument that they help optimise for neither performance nor changeability!
My perspective is necessarily limited, but I often see optimisation as a case of "vertical integration" and changeability as a case of "horizontal integration".
To make something fast, you can dig all the way down through all the layers and do the exact piece of work that is required with the minimum of useless faffing about for the CPU[0]. But to make something robust, you might want to e.g. validate all your inputs at each layer since you don't know who's going to call your method or service.
Regarding the DOD/OOP wars, I really love this article, which argues that OOP doesn't have to be bad[1]. I also think that when performance is a requirement, you just have to get more particular about your use of OOP. For example, the difference between Mesh and InstancedMesh[2] in THREE.js. Both are OOP, but have very different performance implications.
[0] Casey Muratori's "simple code, high performance" video is an epic example of this. When the work he needed to do was "this many specific floating point operations", it was so cool to see him strip away all the useless layers and do almost exactly and only those ops.
> Well it's hard to argue about that tradeoff in general, but I think the existence of languages like Python, Ruby and PHP is compelling. Though I'd accept the argument that they help optimise for neither performance nor changeability!
I see the point you're making, and I don't disagree with it. But I should be a bit more clear in what I really meant in my parent comment:
Given whatever language a developer is working in, whether it is fast like C++ or a slow language like Python (or perhaps even Minecraft Redstone), I think the programmer that takes a data oriented approach (meaning they write their code thinking about the kinds of data the program can receive and the kind it will likely receive, along with what operations will be the most expensive) will have better code than a programmer that makes a nice object model following all the SOLID principles. The majority of the OOP code I've worked with spends so much time caring about abstractions and encapsulation that performance is lost and the code is no better to work with.
> Regarding the DOD/OOP wars, I really love this article, which argues that OOP doesn't have to be bad[1]. I also think that when performance is a requirement, you just have to get more particular about your use of OOP. For example, the difference between Mesh and InstancedMesh[2] in THREE.js. Both are OOP, but have very different performance implications.
Absolutely agree here. Classic gamedev.net article. ECS != DOD and I think the next parts of the article illustrate how DOD isn't necessarily in opposition with programming paradigms like OOP and FP.
With that said, I think it can be argued that common patterns within both OOP and FP circles are a hurdle at times to utilizing hardware to its fullest. Here's Casey Muratori's argument against SOLID principles[0] for instance.
---------------------
I think the point still stands: performance isn't in opposition to making a maintainable/changeable program.
A missing element to the conversation is another interpretation of DOD, that is, domain-oriented design. My favorite writing on the matter is "Programming as if the Domain (and Performance) Mattered" (https://drive.google.com/file/d/0B59Tysg-nEQZSXRqVjJmQjZyVXc...)
When OOP has bad performance, or is otherwise another instance of ball of mud architecture, it often stems from not modeling the domain correctly. Only using or being forced to use an inferior conception of OOP (i.e. OOP as merely fancy structs + function pointers) doesn't help either, and ends up encouraging the sorts of patterns found in GOF even preemptively before they can maybe earn their keep. And what's especially insidious about Kingdom of Nouns style OOP is that just because you have named something, or created a type hierarchy, does not actually make it a particularly good model of anything. If you interview enough people, you'll find some thinking it's entirely reasonable to do the equivalent of making a Car a subclass of a Garage just because garages contain cars. When bad modeling infects production code, it's difficult to remove, especially when it's so overgrown that a bunch of other code is created not to fix the modeling but to try and wrangle a bit of sanity into localized corners (often at the further expense of performance -- frequently from a lot of data copying and revalidating).
On the other hand, modeling things too close to the hardware, or (as the linked paper goes through) too close to one ideal of functional programming purity (where you want to just express everything with map and reduce on sequences of numbers if you can, in theory giving high performance too), can severely get in the way of changing things later, because again you're not actually modeling the domain very correctly, just implementing one clever and specific mapping to raw numbers. When the domain changes, if your mapping was too coupled to the hardware, or too coupled to a nice map/reduce scheme, the change throws a big wrench in things. Sometimes, of course, such direct hardware and code exploitation is necessary, but we live in a fat world of excess and most business software isn't so constrained. An interesting example from the past could be Wolfenstein 3D, where its doors and especially its secret push walls are handled as special cases by the raycaster. Carmack initially refused to add push wall functionality to his engine, it seemed like it would be too big of a hack and possibly hinder performance too much. Big special cases as ifs down in the fors usually suck. After much needling and thinking though, and realizing the game really did need such a feature, he eventually got it done, surprising everyone when at a final attempt at convincing him to add them, he revealed he already did it.
> But to make something robust, you might want to e.g. validate all your inputs at each layer since you don't know who's going to call your method or service.
If you actually have layers, then repeated guarding is not necessary. The guards are necessary when the same function appears at multiple levels of abstraction - when calls reach across multiple layers or even back up the hierarchy.
In Angular you had services. Everything was meant to talk to those to get to the data, and I think a lot of people misunderstood the value this provides. If you want user data to look a certain way to the front end, you can do cleanup at insertion time (which can be spread out over years when you didn’t know it was a problem) or at read time. Ideally you want this in the backend, but it can be a challenge because you don’t know what you need until you need it, and expecting the backend people to drop everything to tackle your latest epiphany RFN is unreasonable.
But you can make the Service work the way you would like the backend to work, and make a pitch for that logic to move into the backend. It’s generally self explanatory code, without a lot of tendrils elsewhere. Airlifting it across process boundaries is far easier than surgically removing a set of assumptions and decisions smeared across the code.
“Beyond this point, all foo have bar, and you don’t need to worry about it.” This takes architectural discipline but the dividends are great.
There is a point where performance optimizations get in the way of clarity, but that's after a long plateau where simple software == fast software. And simple software is the most amenable to changeability. It might not be the fastest way to initially write the software though, as leveraging existing large complex frameworks can give a head start to someone familiar with them.
Somewhere, somewhen, we, as software developers, started thinking that other programmers would rather extend code rather than modify it. This has led us to write code that tries to predict the use cases of future programmers and to pre-emptively include mechanisms for them to use or extend our code. And because it has seeped so deeply into our culture, if we don't do this -- engage in this theater -- we get called out for not being good software engineers.
Of course, the extra hooks we put in to allow re-use and extensibility usually results in code that is slower and more complex than the simple thing. Worse, very often, when a customer needs a new feature, the current extension hooks did not predict this use case and are useless, and so the code has to be modified anyway, but now it's made 10x more difficult because of the extra complexity and because we feel that we have to respect the original design and not rip out all the complexity.
I like Knuth's quote [1] on this subject:
> I also must confess to a strong bias against the fashion for reusable code. To me, “re-editable code” is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.
Writing generally "reusable code", aka a library, warrants a different approach to software development than application code in many areas.
1. Application code = Fast-changing, poorly specified code. You need to have a rapid development cycle that supports "discovering" what the customer wants along the way. Your #1 job is pleasing the customer, as quickly, and as reliably, as possible.
2. Library code = Slow-changing, highly specified code. You have a long, conservative development cycle. Your #1 job is supporting application programmers (the customers of your library).
> Somewhere, somewhen, we, as software developers, started thinking that other programmers would rather extend code rather than modify it.
That was when stuff like "proper testing" was deemed to be too expensive. It's unlikely to break existing workflows with extending something, but very easy to do so during a modification.
Companies used to have hordes of manual testers/QA staff, that all got replaced by automated tools of questionable utility and capability.
The tools are very useful, and they have well-known capability. That capability is strictly less than the capability of most manual testers / QA staff, but it's a lot faster at it, and gets much closer to being exhaustive.
Automation should mean you can do a better job, more efficiently, more easily. Unfortunately, ever since the Industrial Revolution, it seems to mean you can do a quicker job with less money spent on labour costs.
> That capability is strictly less than the capability of most manual testers / QA staff, but it's a lot faster at it, and gets much closer to being exhaustive.
That's if you put the effort in to write good tests. When I look at the state of gaming in general, it's ... pretty obvious that this hasn't worked out. Or the GTA Online JSON debacle - I'm dead sure that this was known internally for a long time, but no one dared to modify it.
And even then: an automated test can't spot other issues unrelated to the test that a human would spot immediately. Say, a CSS bug causes the logo to be displayed in grayscale. The developer who has accidentally placed the filter on all img elements writes a testcase that checks if an img element in content is rendered in greyscale, the tests pass, the branch gets merged without further human review... and boom.
Simplest is definitely not more amenable to changes.
We've just implemented a large feature where some devs tried to "hardcode" all the logic of what is essentially a kind of rules engine. I was horrified because the whole thing was coupled to the rules we currently needed, but we all know this is just the beginning: we plan to add more rules, and even to allow custom rules to be defined by our customers. So, even though what they were trying to do is often lauded on HN and other forums because it applies KISS and YAGNI, in this case adding a new rule would mean, basically, changing everything - the engine, the data that the engine persists, potentially the end result... everything! Now, perhaps this was indeed simpler. However, it's the opposite of modifiable (and by the way, implementing it with abstract rules which store their own data, which the engine need not know about, is actually much cleaner, and the decoupling comes almost for free).
This doesn't sound like a simple solution from your fellow devs. It appears to have been an easy solution. If the distinction isn't familiar to you, there is a great talk of Rich Hickey [0], that explains that distinction and some more. The talk is definitely not only applicable to Clojure.
YAGNI is a great slogan, but it must not become dogma. If you already know that you are going to need custom rules, then prepare for it. But if, for example, the current rules are almost trivial and you don't know yet which rule engine will be the right fit later on, then it might be a good idea to postpone the decision about the rule engine until you know more. In the meantime a hard-coded solution could be enough.
I know that talk very well. And I still disagree with you that hardcoding the rules is not simple, just easy. It's always going to be simpler to implement code that's more specific (i.e. has less room for the unknown) than less. Or do you think there is anything Rich said that shows that not to be always true?
Rich didn't say it, if I remember correctly. But there are problems where a more general algorithm is simpler than a more specific straight-forward algorithm. Usually because you change the way you model the problem.
Otherwise, I have to take your word for it, because I cannot see your specific example.
YAGNI definitely doesn't apply in cases where you do actually know that you are gonna need it.
If the task is "build a rules engine with X, Y, and Z rules, that can add A, B, and C rules next year" then delivering hardcoded XYZ rules engine is an absolute failure to engineer and a very braindead use of "YAGNI"
> It's a shame that so many see changeability and performance in opposition with each other. I've yet to find compelling evidence that such is the case.
In the article, when I saw this
> For f, it’s much easier to notice a dead branch than for a combination of g and h!
My first thought was "yes, but now if anyone _else_ calls h or g, the checks never happen (because they live in f)." I'd much rather have h and g check what _they_ need in order to run correctly. That way, if another call to one of them is added, we no longer need to rely on _that_ call correctly checking the conditions. Plus it avoids duplication.
But... and this goes back to the original point from your post... this is a matter of code being correct over time; changeability. If you're worried about performance, then having the same check in 2 different places is a problem. If you're not (less) worried, then having the code less likely to break later as changes are made is helpful.
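To make the tradeoff concrete, here's a made-up sketch reusing the f/g/h names from the quoted article (the bodies are invented for illustration, not taken from it):

```python
# "Ifs pushed up": f owns the branching, g and h assume a valid input.
def f(order):
    if order is None or not order["items"]:
        return None          # the dead/empty case is visible in one place
    g(order)
    h(order)

def g(order):
    # Precondition: order is non-empty. If a new caller forgets to check,
    # nothing here will catch it.
    ...

def h(order):
    ...

# "Ifs pushed down": each callee defends itself. More robust to new call
# sites being added later, at the cost of duplicated checks and branching
# spread across the functions.
def g_defensive(order):
    if order is None or not order["items"]:
        return
    ...

def h_defensive(order):
    if order is None or not order["items"]:
        return
    ...
```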
They don't need to be in opposition: it's enough that the changeability, correctness, and performance of solutions is uncorrelated for you to frequently need to tradeoff between them, especially given the inevitable tradeoff of all of these against time to write.
Yeah. Is there an article showing that a clever, hard-to-read one-liner and an "inefficient looking" but easy-to-understand multiline version with proper variable names etc. result in the same compiled code?
I would assume compilers would be sufficiently advanced nowadays...
> most line-of-business software needs to optimise for changeability and correctness ("programming over time") not performance
These are not mutually exclusive, in fact, more often than not, they are correlated.
Maybe the most important aspect of performance is to make things small. Small code, small data structures, small number of executed instructions. Writing small code is what "thinking about the instruction cache" essentially is, btw.
And as it turns out, the smaller the code, the less room there is for bugs, the more you can understand at once, and the easier it is to get good coverage, good for correctness. As for changeability, the smaller the code, the smaller the changes. The same applies to data.
Now, some optimization techniques can make the code more complicated, for example parallelization, caching, some low level optimization, etc... but these only represent a fraction of what optimizing for performance is. And no serious performance conscious programmer will do that without proper profiling/analysis.
Then there are things that make the code faster with limited impact (positive and negative), and this is what the article is about. Functionally, if/for is not really different from for/if, but one is faster than the other, so why pick the slow one? And even if the compiler optimizes that for you, why rely on the compiler if you can do it properly at no cost. Just like looping over 2D arrays, it is good to know that there are two ways of doing it, and while they look equivalent, one is fast and one is slow, so that you don't pick the slow one by accident.
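A minimal sketch of the if/for vs for/if point (illustrative only, not from the article):

```python
def total_if_inside(prices, discounted: bool) -> float:
    # "for { if }": the same condition is re-evaluated on every iteration,
    # and the branch sits in the middle of the hot loop.
    total = 0.0
    for p in prices:
        if discounted:
            total += p * 0.9
        else:
            total += p
    return total

def total_if_outside(prices, discounted: bool) -> float:
    # "if { for }": branch once, then run a straight-line, branch-free loop.
    total = 0.0
    if discounted:
        for p in prices:
            total += p * 0.9
    else:
        for p in prices:
            total += p
    return total

assert total_if_inside([1.0, 2.0], True) == total_if_outside([1.0, 2.0], True)
```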
Is for/if faster since loops get started right away where ifs need to check conditionals constantly on top of whatever action they are supposed to execute?
Loops are fastest when they fit in the processor's instruction cache (and preferably, only touch data that fits in the data cache). Similarly, code is fastest when it has to execute the least amount of instructions. In the first example, the walrus(Option) function is designed to be executed unconditionally, only to return early when there is no walrus. That's an unnecessary function call that can be removed by changing the method signature (in Rust, because it has non-nullable types. In other languages you would need to do the non-null check anyway for safety reasons).
> Despite what Casey Muratori is trying to argue (and I'm largely sympathetic to his efforts) most line-of-business software needs to optimise for changeability and correctness ("programming over time") not performance.
Then consider listening to John Ousterhout instead: it turns out that in practice, changeability and correctness are not nearly as at odds with performance as we might initially think. The reason being, simpler programs also tend to run faster on less memory. Because in practice, simpler programs have shorter call stacks and avoid convoluted (and often costly) abstractions. Sure, top notch performance will complicate your program. But true simplicity will deliver reasonable performance most of the time.
By the way, while pushing fors down is mostly a data oriented advice, pushing ifs up is more about making your program simpler. Or more precisely, increasing its source code locality: https://loup-vaillant.fr/articles/source-of-readability that’s what concentrating all the branching logic in one place is all about.
Honestly, both of these I think are pretty applicable to line-of-business apps as well.
The "push loops down" advice most especially: for any CRUD app, handling creation and updates in bulk when possible will typically save huge amounts of time, much more than in CPU bound use cases. The difference between doing `items.map(insertToDb/postToServer) ` vs doing `insertToDb/postToServer(items)` is going to be orders of magnitude in almost all cases.
I have seen optimizations of this kind take operations from seconds or minutes down to milliseconds. And often the APIs end up cleaner and logs are are much easier to read.
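A minimal sketch of the difference (SQLite and an invented table, just for illustration; against a real database server the per-item version also pays a network round-trip, and often a transaction, per row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

def insert_per_item(rows):
    # items.map(insertToDb): one statement per row.
    for row in rows:
        conn.execute("INSERT INTO items VALUES (?, ?)", row)
    conn.commit()

def insert_batched(rows):
    # insertToDb(items): one batched statement, one commit.
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()

insert_batched([(i, f"item-{i}") for i in range(1000)])
```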
Great point. The n+1 sql queries antipattern comes to mind as a very common "push loops down" application in line of business stuff. Let the db do the work/join.
maybe - although most of the time, especially in the last few years, I write web apps, and I push ifs up all the time because it allows for early returns like the article says.
Determining that you need to do nothing should be done as soon as possible, especially in any system where performance is essential and web apps make more money the better their performance as a general rule.
Not even strictly DOD, this trivial principle produces better semantics and drives better code separation down the line.
Several years ago I was exposed to DOD (and then this principle) when working on complex, long-running JS/TS-based systems. It results in better code navigability, accurate semantic synthesis, and easier subsequent refactors.
The side effect: some people remarked that the code looked like C
I almost agree except that I may have in mind a different kind of engineer than you do: people who don't think about the instruction cache even though they are writing performance-critical code. This post won't change that, but I hope that people writing about performance know that audience exists, and I dearly hope that people in this situation recognize it and strive to learn more.
At a large enough scale, even line of business code can become a bottleneck for real-world business activity. And unsurprisingly, engineers who don't think about performance have a way of making small scales feel like large ones because they use architectures, algorithms, and data structures that don't scale.
This happens even in the FAANG companies famed for their interviewing and engineering. I've seen outages last for hours longer than they should have because some critical algorithm took hours to run after a change in inputs, all the while costing millions because one or more user-facing revenue-critical services couldn't run until the algorithm finishes (global control, security, quota, etc. systems can all work like this by design and if you're lucky the tenth postmortem will acknowledge this isn't ideal).
I've had to inherit and rework enough of these projects that I can definitively say the original developers weren't thinking about performance even though they knew their scale. And when I did inherit them and have to understand them well enough to make them as fast as they needed to be, some were at least written clearly enough that it was a joy, and others were a tangled mess (ironically, in some cases, "for performance" but ineffectively).
See also: the evergreen myth that "you don't need to learn algorithms and data structures, just learn to find them online" resulting in bullshit like a correct but exponential algorithm being put into production when a simple linear one would have worked if the engineer knew more about algorithms.
There's so much wrong with how many people do performance engineering that I don't think pushing fors down is even in the top 10 tips I would give, I just think that folks posting and commenting in this space recognize how large and impactful this section of the audience is.
I have practiced this advice in line-of-business software to great effect. Popular ORM libraries operate on individual records, but that is quite slow when software needs to import a large number of objects. Restructuring the code to operate on batches—push fors down—results in significantly fewer database queries and network round-trips. The batched routines can resolve lookups of foreign keys in batches, for example, or insert multiple records at once.
Granted, that is more difficult to maintain. I only batched the import routines. The rest still uses the more maintainable ORM.
> Despite what Casey Muratori is trying to argue (and I'm largely sympathetic to his efforts) most line-of-business software needs to optimise for changeability and correctness ("programming over time") not performance.
In investment banking (markets / trading), much of the most economically valuable software could easily run on a Raspberry Pi. I always call it "10,000 if-thens" (all the business rules that build up over 10+ years). Excel with VBA can still do a lot. A very tiny fraction of markets / trading software needs to be performant, yet, those tiny parts dominate most of the conversations (here and at work). It is tiring. Keep writing simple if-thens with a few for loops. Keep making money...
Pushing ifs up is one of my most common code review feedback on frontend prs. I think it's very applicable to standard web apps, especially on large teams where people are reticent to refactor.