
marginalia_nu · 2026-03-17T09:45:57 1773740757

This is very funny.

Middle verse of Gangsta's Paradise:

Reflecting on the current market landscape and the unique challenges of my professional journey.

Coming from a non-traditional background, I’ve had to pivot and align with high-performing teams to navigate complex environments. It’s easy to get distracted by the "noise," but I remain laser-focused on strategic growth and ROI.

I’m a lifelong learner with a growth mindset, keeping my eye on the prize while maintaining a competitive edge. I’m fully committed to my organization, and we prioritize high-stakes execution—so let’s keep the professional synergy positive.

In this fast-paced industry, agility is everything. I’m operating in a "do or die" climate where meeting KPIs is the only option. Looking at the current burn rate and market volatility, long-term forecasting is a challenge, but I’m staying resilient.

#GrowthMindset #Resilience #StrategicLeadership #MarketTrends #ProfessionalJourney

Freak_NL · 2026-03-17T10:06:49 1773742009

Good idea for a unit test for this: if you put the lyrics to "Weird Al" Yankovic's Mission Statement in, it should return the exact same text as output.
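A rough sketch of what that fixed-point test could look like. Everything here is an assumption: `to_corporate_speak` is a hypothetical stand-in for the real translator (a three-entry phrase table), and `ALREADY_JARGON` is a made-up placeholder sentence, not the actual lyrics.

```python
import re

# Hypothetical stand-in for the real translator: a tiny phrase table.
JARGON = {
    "use": "leverage",
    "help": "empower",
    "plan": "strategic roadmap",
}

_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, JARGON)) + r")\b")


def to_corporate_speak(text: str) -> str:
    """Replace plain words with jargon; leave everything else untouched."""
    return _PATTERN.sub(lambda match: JARGON[match.group(1)], text)


def is_fixed_point(text: str) -> bool:
    """True if the translator returns its input unchanged."""
    return to_corporate_speak(text) == text


# Placeholder for text that is already pure jargon (not the real lyrics).
ALREADY_JARGON = "We must leverage our strategic roadmap to empower stakeholders."
```

The same property doubles as an idempotence check: translating the translator's own output should change nothing.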

https://www.youtube.com/watch?v=GyV_UG60dD4

ashton314 · 2026-03-17T10:18:48 1773742728

That song is gold.

kileel · 2026-03-17T12:18:52 1773749932

marginalia_nu · 2026-03-16T19:34:23 1773689663

Big thing that made encryption required is arguably that ISPs started injecting crap into webpages.

Governments can still track you with little issue since SNI is unencrypted. It's also very likely that Cloudflare and the like are sharing what they see as they MITM 80% of your connections.
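A minimal sketch of the SNI point, using only Python's standard `ssl` module and no network access. It builds the first handshake message a client would send and checks that the hostname sits in it as plain bytes. The hostname `example.com` is a placeholder, and this assumes the client does not use Encrypted Client Hello, which the standard `ssl` module does not send as far as I know.

```python
import ssl


def client_hello_bytes(hostname: str) -> bytes:
    """Return the raw ClientHello a TLS client would send for hostname."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes from the (nonexistent) server
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and is now
        # waiting for a server reply that never comes.
        pass
    return outgoing.read()


hello = client_hello_bytes("example.com")
print(b"example.com" in hello)
```

Anyone on the path sees those same bytes, which is why the site name leaks even though the page contents do not.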

jopsen · 2026-03-16T21:03:32 1773695012

> It's also very likely that Cloudflare and the like are sharing what they see as they MITM 80% of your connections.

Maybe, I suspect not, but even so if we reduce the number of men in the middle that's pretty nice.

marginalia_nu · 2026-03-16T21:13:26 1773695606

Between what Snowden told us, and the CLOUD Act, it seems quite likely.

marginalia_nu · 2026-03-16T19:18:09 1773688689

Google and most search engines optimize for what is most likely to be clicked on. This works poorly and creates a huge popularity bias at scale because it starts feeding on its own tail: What major search engines show you is after all a large contributor to what's most likely to be clicked on.

The reason Marginalia (for some queries) feels like it shows such refreshing results is that it simply does not take popularity into account.
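A toy model of that feedback loop, purely illustrative. The position-bias numbers, the four equally relevant pages, and the slot rotation used as the "popularity-blind" baseline are all assumptions for the sketch; none of this is how Google or Marginalia actually ranks.

```python
def click_shares(rounds: int, rank_by_popularity: bool) -> list[float]:
    """Share of all clicks each of four equally relevant pages ends up with."""
    page_count = 4
    position_bias = [0.6, 0.25, 0.1, 0.05]  # assumed clicks per slot per round
    clicks = [1.0, 0.0, 0.0, 0.0]  # page 0 got one early, lucky click

    for round_no in range(rounds):
        if rank_by_popularity:
            # Most-clicked page first; ties keep their original order.
            order = sorted(range(page_count), key=lambda page: -clicks[page])
        else:
            # Stand-in for any ordering that ignores clicks: rotate the slots.
            order = [(slot + round_no) % page_count for slot in range(page_count)]
        for slot, page in enumerate(order):
            clicks[page] += position_bias[slot]

    total = sum(clicks)
    return [count / total for count in clicks]


popular = click_shares(100, rank_by_popularity=True)
blind = click_shares(100, rank_by_popularity=False)
print([round(share, 3) for share in popular])
print([round(share, 3) for share in blind])
```

With click-based ranking, the one lucky click locks page 0 into the top slot for good. With the click-blind ordering, the four pages end up roughly even.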

marginalia_nu · 2026-03-16T18:57:01 1773687421

Well to be fair, Marginalia is also developed by 1 guy (me), and Google has like 10K people and infinite compute they can throw at the problem. There have been definite improvements, and there will be more still, but Google's still got hands.

janalsncm · 2026-03-16T19:55:37 1773690937

Hey Marginalia, cheers. Imo fewer hands can also be an advantage.

There are no PMs breathing down your neck to inject more ads in the search results, you don’t depend on any broken internal bespoke tools that you can’t fix yourself, and you don’t need anybody’s permission to deploy a new ranking strategy if you want to.

marginalia_nu · 2026-03-16T18:47:25 1773686845

Thanks for shilling.

Regarding the financials, even though the second nlnet grant runs out in a few weeks, I've got enough of a war chest to work full time probably a good bit into 2029 (modulo additional inflation shocks). The operational bit is self-funding now, and it's relatively low maintenance, so if worse comes to worst I'll have to get a job (if jobs still exist in 2029, otherwise I guess I'll live in the shameful cardboard box of those who were NGMI ;-).

marginalia_nu · 2026-03-15T15:13:26 1773587606

The more I evaluate Claude Code, the more it feels like the world's most inconsistent golfer. It can get within a few paces of the hole, often in a single stroke, and then it'll spend hours, days, weeks trying to nail the putt.

There's some 80-20:ness to all programming, but with current state of the art coding models, the distribution is the most extreme it's ever been.

marginalia_nu · 2026-03-14T12:45:46 1773492346

My experience is that it gets you 80-90% of the way at 20x the speed, but coaxing it into fixing the remaining 10-20% happens at a staggeringly slow speed.

All programming is like this to some extent, but Claude's 80/20 behavior is so much more extreme. It can almost build anything in 15-30 minutes, but after those 15-30 minutes are up, it's only "almost built". Then you need to spend hours, days, maybe even weeks getting past the "almost".

Big part of why everyone seems to be vibe coding apps, but almost nobody seems to be shipping anything.

marginalia_nu · 2026-03-10T17:49:09 1773164949

Yeah it's a rewarding project. Getting a language that kinda works is surprisingly accessible. Though we must be mindful that this is still the "draw some circles" panel. Producing the rest of the famous owl is, as always, the hard bit.

marginalia_nu · 2026-03-10T16:04:05 1773158645

Expert reviews are just about the only thing that makes AI generated code viable, though doing them after the fact is a bit sketchy. To be efficient you kinda need to keep an eye on what the model is doing as it's working.

Unchecked, AI models output code that is as buggy as it is inefficient. In smaller green field contexts, it's not so bad, but in a large code base, it performs much worse as it will not have access to the bigger picture.

In my experience, you should be spending something like 5-15X the time the model takes to implement a feature on reviewing and making it fix its errors and inefficiencies. If you do that (with an expert's eye), the changes will usually have a high quality and will be correct and good.

If you do not do that due diligence, the model will produce a staggering amount of low quality code, at a rate that is probably something like 100x what a human could output in a similar timespan. Unchecked, it's like having a small army of the most eager junior devs you can find going completely fucking ape in the codebase.

locusofself · 2026-03-10T16:22:15 1773159735

If you spend 5-15x the time reviewing what the LLM is doing, are you saving any time by using it?

happytoexplain · 2026-03-10T16:31:32 1773160292

No, but that's the crux of the AI problem in software. Time to write code was never the bottleneck. AI is most useful for learning, either via conversation or by seeing examples. It makes writing code faster too, but only a little after you take into account review. The cases where it shines are high-profile and exciting to managers, but not common enough to make a big difference in practice. E.g., AI can one-shot a script to get logs from a paginated API, convert them to ndjson, and save them to files grouped by week, with minimal code review, but only if I'm already experienced enough to describe those requirements, and, most importantly, that's not what I'm doing every day anyway.
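For a sense of scale, here is roughly what that kind of script looks like. This is a sketch under assumptions: the page shape (`items` plus a `next` cursor), the `timestamp` field, and the `fake_fetch` function standing in for real HTTP calls are all hypothetical.

```python
import json
from collections import defaultdict
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterator, Optional

# Hypothetical page shape: {"items": [...], "next": "<cursor>" or None}
FetchPage = Callable[[Optional[str]], dict]


def iter_logs(fetch_page: FetchPage) -> Iterator[dict]:
    """Follow the cursor until the API reports no further page."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next")
        if cursor is None:
            return


def week_key(record: dict) -> str:
    """ISO year and week, e.g. '2026-W11', from an ISO 8601 timestamp."""
    year, week, _ = datetime.fromisoformat(record["timestamp"]).isocalendar()
    return f"{year}-W{week:02d}"


def ndjson_by_week(records) -> dict:
    """Map each week key to an ndjson document (one JSON object per line)."""
    buckets = defaultdict(list)
    for record in records:
        buckets[week_key(record)].append(json.dumps(record, sort_keys=True))
    return {week: "\n".join(lines) + "\n" for week, lines in buckets.items()}


def write_weekly_files(out_dir: str, chunks: dict) -> None:
    """Write one <week>.ndjson file per bucket."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for week, body in chunks.items():
        (target / f"{week}.ndjson").write_text(body, encoding="utf-8")


# In-memory fake API standing in for real HTTP calls.
FAKE_PAGES = {
    None: {"items": [{"timestamp": "2026-03-10T16:31:32", "msg": "a"}], "next": "p2"},
    "p2": {"items": [{"timestamp": "2026-03-11T05:55:57", "msg": "b"},
                     {"timestamp": "2026-03-17T09:45:57", "msg": "c"}], "next": None},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return FAKE_PAGES[cursor]
```

Calling `write_weekly_files("logs", ndjson_by_week(iter_logs(fake_fetch)))` would produce `logs/2026-W11.ndjson` and `logs/2026-W12.ndjson`. The requirements a reviewer has to know to ask about (cursor termination, week boundaries, line-delimited output) are all visible in the function names.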

brandensilva · 2026-03-10T18:25:33 1773167133

I'm finding that in some cases I'm dealing with even more code, given how much code AI outputs. So yeah, for some tasks I find myself extremely fast, but for others I find myself spending ungodly amounts of time reviewing code I never wrote to make sure it doesn't destroy the project from unforeseen convincing slop.

ritlo · 2026-03-10T16:36:18 1773160578

A related Dirty Secret that's going to become clear from all this is that a very large proportion of code in the wild (yes, even in 2026—maybe not in FAANG and friends, IDK, but across all code that is written for pay in the entire economy) has limited or no automated test coverage, and is often being written with only a limited recorded spec that's usually fleshed out only to the degree needed (very partial) as a given feature is being worked on.

What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage, and extensive specs.

I think we're going to find there's very little time-savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written, and much more detailed specs that otherwise never would have been generated. I guess the bright-side take of this is that we may end up with better-tested and better-specified software? Though so very much of the industry is used to skipping those parts, and especially the less-capable (so far as software goes) orgs that really need the help and the relative amateurs and non-software-professionals that some hope will be able to become extremely productive with these tools, that I'm not sure we'll manage to drag processes & practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs".

We may end up stuck at "it's very-aggressive autocomplete" as far as LLMs' useful role in them, for most projects, indefinitely.

On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.

slopinthebag · 2026-03-10T17:50:56 1773165056

Re. productivity, if LLMs are a genuine boost 1/3 of the time, neutral 1/3 of the time, and actually worse 1/3 of the time, it's likely we aren't really seeing performance improvements as 1) people are using them for everything and 2) we're still learning how to best use them.

So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and CEOs are expecting.
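To put rough numbers on the Amdahl's law argument, here is a small sketch. The split into thirds comes from the comment above; the per-part speedup factors (2x, 1x, 0.5x, and the 10x case) are illustrative assumptions, not measurements.

```python
def overall_speedup(parts: list) -> float:
    """Amdahl-style overall speedup.

    parts is a list of (fraction_of_work, local_speedup) pairs whose
    fractions sum to 1. A local_speedup below 1 means that part got slower.
    """
    if abs(sum(fraction for fraction, _ in parts) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    if any(speedup <= 0 for _, speedup in parts):
        raise ValueError("speedups must be positive")
    # New total time is the sum of each part's original share divided by
    # its local speedup; overall speedup is the inverse of that.
    return 1.0 / sum(fraction / speedup for fraction, speedup in parts)


# A third twice as fast, a third unchanged, a third twice as slow.
mixed = overall_speedup([(1 / 3, 2.0), (1 / 3, 1.0), (1 / 3, 0.5)])

# Only the third that benefits uses the tool, and it gets 10x faster.
selective = overall_speedup([(1 / 3, 10.0), (2 / 3, 1.0)])

print(round(mixed, 3), round(selective, 3))
```

Under these assumed factors, using the tool everywhere comes out slightly slower overall (about 0.86x), while using it only where it helps gives about 1.43x. The ceiling for speeding up a third of the work is 1.5x no matter how fast that third becomes.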

ansibsha · 2026-03-10T18:33:37 1773167617

> better-specified software

Code is the most precise specification we have for interfacing with computers.

xp84 · 2026-03-11T05:55:57 1773208557

Sure, but if you define the code as the only spec, then it is usually a terrible spec, since the code itself specifies bugs too. And one of the benefits of having a spec (or tests) is that you have something against which to evaluate the program in order to decide if its behavior is correct or not.

Incidentally, I think in many scenarios, LLMs are pretty great at converting code to a spec and indeed spec to code (of equal quality to that of the input spec).

tmaly · 2026-03-10T20:05:20 1773173120

There are some cases where AI is generating binary machine code, albeit small amounts. What do we have when we don't have the code?

marginalia_nu · 2026-03-10T20:44:54 1773175494

Machine code is still code, even if the representation is a bit less legible than the punch cards we used to use.

interestpiqued · 2026-03-10T22:51:21 1773183081

You’re missing the point of a spec

unselect5917 · 2026-03-11T05:03:50 1773205430

The spec is as much for humans as it is the machine, yes?

interestpiqued · 2026-03-11T05:36:17 1773207377

Spec should be made beforehand and agreed on by stakeholders. It says what it should do. So it’s for whoever is implementing, modifying, and/or testing the code. And unfortunately devs have a tendency toward poor documentation

mrguyorama · 2026-03-11T17:43:09 1773250989

Software development is only 70ish years old and somehow we have already forgotten the very very first thing we learned.

"Just get bulletproof specs that everyone agrees on" is why waterfall style software development doesn't work.

Now suddenly that LLMs are doing the coding, everyone believes that changes?

interestpiqued · 2026-03-11T23:08:19 1773270499

I’m confused, are you saying that making a design plan and high level spec beforehand doesn’t work?

unselect5917 · 2026-03-13T04:55:39 1773377739

I've seen it happen. Things that seem reasonable on a spec paper, then you go to implement and you realize it's contradictory or utter nonsense.

dboreham · 2026-03-10T20:55:18 1773176118

Bingo. Hopefully there are some business opportunities for us in that truth.

_wire_ · 2026-03-10T17:00:10 1773162010

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance not correctness.

Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.

You can push uncertainty around but you can't eliminate it.

This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.

As Douglas Adams noted: ultimately you've got to know where your towel is.

layer8 · 2026-03-10T22:04:15 1773180255

A competent programmer proves the program he writes correct in his head. He can certainly make mistakes in that, but it’s very different from writing tests, because proofs abstract (or quantify) over all states and inputs, which tests cannot do.
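A small illustration of the gap between spot tests and reasoning over all inputs. The leap-year functions and the chosen spot checks are made up for this sketch. Note that scanning a finite range is still not a proof; it only shows how easily a handful of tests can miss a whole class of inputs.

```python
def is_leap_year_buggy(year: int) -> bool:
    """Looks right on typical inputs, wrong on most century years."""
    return year % 4 == 0


def is_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# A plausible-looking test suite that the buggy version also passes.
SPOT_CHECKS = {2023: False, 2024: True, 2000: True}


def passes_spot_checks(candidate) -> bool:
    return all(candidate(year) == expected for year, expected in SPOT_CHECKS.items())


def disagreements(start: int, stop: int) -> list:
    """Years in [start, stop) where the two implementations differ."""
    return [
        year
        for year in range(start, stop)
        if is_leap_year_buggy(year) != is_leap_year(year)
    ]


print(passes_spot_checks(is_leap_year_buggy))
print(disagreements(1890, 2110))
```

Both versions pass the spot checks, yet they disagree on 1900 and 2100. An argument over all years ("what happens when the year is divisible by 100?") catches this immediately.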

shimman · 2026-03-10T16:39:07 1773160747

These companies don't care about saving time or lowering operating costs, they have massive monopolies to subsidize their extremely poor engineering practices with. If the mandate is to force LLM usage or lose your job, you don't care about saving time; you care about saving your job.

One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do outside of copying each other.

It has to end.

SchemaLoad · 2026-03-10T22:46:55 1773182815

The submitter with their name on the Jira ticket saves time, the reviewer who has to actually verify the work loses a lot of time and likely just lets issues slip through.

marginalia_nu · 2026-03-10T16:38:22 1773160702

To be honest, sometimes it's still beneficial.

For fairly straightforward changes it's probably a wash, but ironically enough it's often the trickier jobs where it can be beneficial, as it will provide an ansatz that can be refined. It's also very good at tedious chores.

misnome · 2026-03-10T18:23:24 1773167004

And spotting stuff in review! Sometimes it’s false positives but on several occasions I’ve spent ~15-30 minutes teaching-reviewing a PR in person, checked afterwards and it matched every one of the points.

bluGill · 2026-03-10T16:30:15 1773160215

Some, but not very much. Writing code is hard. AI will do a lot of tedious code that you procrastinate writing.

hard24 · 2026-03-10T16:34:53 1773160493

Also when you are writing code yourself you are implicitly checking it whilst at the back of your mind retaining some form of the entire system as a whole.

People seem to gloss over this... As a CEO if people don't function like this I'd be awake at night sweating.

bonesss · 2026-03-10T17:27:51 1773163671

That’s the reverse-centaur issue I see: humans are not great at repetitive nuanced similar seeming tasks, putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code, manual reviews of high volume superficially sound but subversive code is begging for issues over time.

Which results in the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.

An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client facing processes/resources once a quarter is likely a NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.

demosito666 · 2026-03-10T21:26:43 1773178003

> Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place

The math depends on importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time, the important thing is that the app was delivered in a week a year ago and was solving some problem ever since. It has already made enormous profit if you compare it with today’s (yesterday’s ?) manual development that would take half a year and cost millions.

A mistake in nuclear reactor control code would be a totally different thing. Whatever time savings you made on coding are irrelevant if it allowed for a critical bug to slip through.

Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from applying coding with LLMs. And there are also more axes than this low to high failure cost, which also affect the math. For example, even a non-important but large app will likely soon degrade into an unmanageable state if developed with too little human intervention, and you will be forced to start from scratch, losing a lot of time.

bluGill · 2026-03-10T21:46:48 1773179208

I have found AI extremely good at finding all those really hard bugs though. AI is a greater force multiplier when there is a complex bug than in green field code.

bluGill · 2026-03-10T17:21:30 1773163290

Sort of. I work on a system too large for anyone to know the whole thing. Often people who don't know each other do something that will break the other. (Often because of the number of different people - most individuals go years between this)

raw_anon_1111 · 2026-03-10T18:21:47 1773166907

No I’m keeping up with the system as a whole because I’m always working at a system level when I’m using AI instead of worrying about the “how”

ansibsha · 2026-03-10T18:37:20 1773167840

No you’re not. The “how” is your job to understand, and if you don’t you’ll end up like the devs in the article.

We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs give you the illusion of this.

raw_anon_1111 · 2026-03-10T19:14:38 1773170078

No in my case the “how” is

1. I spoke to sales to find out about the customer

2. I read every line of the contract (SOW)

3. I did the initial requirements gathering over a couple of days with the client - or maybe up to 3 weeks

4. I designed every single bit of AWS architecture and code

5. I did the design review with the client

6. I led the customer acceptance testing

> We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs

I assure you the mid level developers or god forbid foreign contractors were not “experts” with 30 years of coding experience and at the time 8 years of pre LLM AWS experience. It’s been well over a decade - ironically before LLMs - that my responsibility was only for code I wrote with my own two hands

ansibsha · 2026-03-10T23:00:38 1773183638

Yes, and trusting an LLM here is not a good idea. You know it will make important mistakes.

I’m not saying trusting cheap devs is a good idea either. I do think cheap devs are actually at risk here.

raw_anon_1111 · 2026-03-10T23:30:15 1773185415

I am not “trusting” either - I’m validating that they meet the functional and non functional requirements just like with an LLM. I have never blindly trusted any developer when my neck was the one on the line in front of my CTO/director or customer.

I didn’t blindly trust the Salesforce consultants either. I also didn’t verify every line of oSql (not a typo) they wrote.

icedchai · 2026-03-11T00:42:55 1773189775

Actually, it's SOQL. I did Salesforce crap for many years.

rectang · 2026-03-10T16:53:57 1773161637

> Expert reviews are just about the only thing that makes AI generated code viable

I disagree, in the sense that an engineer who knows how to work with LLMs can produce code which only needs light review.

* Work in small increments

* Explicitly instruct the LLM to make minimal changes

* Think through possible failure modes

* Build in error-checking and validation for those failure modes

* Write tests which exercise all paths

This is a means to produce "viable" code using an LLM without close review. However, to your point, engineers able to execute this plan are likely to be pretty experienced, so it may not be economically viable.
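As a concrete, if tiny, illustration of the last two bullets: a function small enough to review at a glance, with explicit validation for each failure mode and one check per path. The function `parse_port` and its rules are a made-up example, not taken from any particular codebase.

```python
def parse_port(raw: str) -> int:
    """Parse a TCP port from user input, rejecting anything suspicious."""
    text = raw.strip()
    # Failure mode 1: not a plain decimal number (empty, signed, letters).
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a decimal number: {raw!r}")
    port = int(text)
    # Failure mode 2: a number, but outside the valid port range.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def rejects(raw: str) -> bool:
    """True if parse_port refuses the input with ValueError."""
    try:
        parse_port(raw)
    except ValueError:
        return True
    return False


# One check per path through the function.
assert parse_port(" 8080 ") == 8080   # happy path, with whitespace
assert rejects("http")                # not a number
assert rejects("-1")                  # signed input is not all digits
assert rejects("0")                   # below range
assert rejects("65536")               # above range
```

With increments this small, the review question becomes "are these the right failure modes?" and no longer "is this code right?", which is where the experience mentioned above comes in.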

marginalia_nu · 2026-03-10T16:55:42 1773161742

By the time you're working in increments small enough that it doesn't introduce significant issues, you really might as well write the code yourself.

rectang · 2026-03-10T16:59:45 1773161985

That's not my experience — I'm significantly faster while guiding an LLM using this methodology.

The gains are especially notable when working in unfamiliar domains. I can glance over code and know "if this compiles and the tests succeed, it will work", even if I didn't have the knowledge to write it myself.

johnnyanmac · 2026-03-10T18:48:31 1773168511

> I'm significantly faster while guiding an LLM using this methodology.

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

>When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.

If we're being honest with ourselves, it's not making devs work faster. It at best frees their time up so they feel more productive.

intelkishan · 2026-03-11T06:19:00 1773209940

There's a new report from the same group which shows that the degree of slowdown has reduced. Link: https://metr.org/blog/2026-02-24-uplift-update/

rectang · 2026-03-10T19:27:47 1773170867

Fair point. I have definitely caught myself taking longer to revise a prompt repeatedly after the AI gets things wrong several times than it would have taken to write the code myself.

I'd like to think that I have this under control because the methodology of working in small increments helps me to recognize when I've gotten stuck in an eddy, but I'll have to watch out for it.

I still maintain that the LLM is saving me time overall. Besides helping in unfamiliar domains, it's also faster than me at leaf-node tasks like writing unit tests.

tmaly · 2026-03-10T20:02:44 1773172964

How long will that 19% hold as models grow in capability?

johnnyanmac · 2026-03-10T20:08:29 1773173309

I'm a bit tired of waiting for "tomorrow", so I'll just live in today's world. We'll burn that bridge when we get to it.

rafaelmn · 2026-03-11T07:12:46 1773213166

The study you quoted is from the Sonnet 3.5/3.7 era. You could see the promise with those models but the agentic/task performance of Opus 4.5/4.6 makes a huge difference - the models are pretty amazing at building context from a mid size codebase at this point.

otabdeveloper4 · 2026-03-11T09:36:27 1773221787

Google's latest research shows AI coding increases speed by 3% while also increasing bugs by 9%. (I.e., a net negative.)

AI doesn't make you code faster, it just makes the boring stretches somewhat more exciting.

marginalia_nu · 2026-03-10T17:02:20 1773162140

That's where the Gell-Mann amnesia will get you though. As much as it trips up on the domains you're familiar with, it also trips up in unfamiliar domains. You just don't see it.

rectang · 2026-03-10T17:08:22 1773162502

You're not telling me anything I don't know already. Only a person who accepts that they're fallible can execute this methodology anyway, because that's the kind of mentality that it takes to think through potential failure modes.

Yes, code produced this way will have bugs, especially of the "unknown unknown" variety — but so would the code that I would have written by hand.

I think a bigger factor contributing to unforeseen bugs is whether the LLM's code is statistically likely to be correct:

* Is this a domain that the LLM has trained on a lot? (i.e. lots of React code out there, not much in your home-grown DSL)

* Is the codebase itself easy to understand, written with best practices, and adhering to popular conventions? Code which is hard for humans to understand is also hard for an LLM to understand.

marginalia_nu · 2026-03-10T17:16:41 1773163001

Right, I think the latter part is my concern with AI generated code. Often it isn't easy to read (or as easy to read as it could be), and the harder it is to navigate, the more code problems the AI model introduces.

It introduces unnecessary indirection, additional abstractions, fails to re-use code. Humans do this too, but AI models can introduce this type of architectural rot much faster (because it's so fast), and humans usually notice when things start to go off the rails, whereas an AI model will just keep piling on bad code.

rectang · 2026-03-10T17:27:23 1773163643

I agree that under default settings, LLMs introduce way too many changes and are way too willing to refactor everything. I was only able to get the situation under control by adding this standing instruction:

    ---
    applyTo: '**'
    ---
    By default:
    Make the smallest possible change.
    Do not refactor existing code unless I explicitly ask.

Under this, Claude Opus at least produces pretty reliable code with my methodology even under surprisingly challenging circumstances, and recent ChatGPTs weren't bad either (though I'm no longer using them). Less powerful LLMs struggle, though.

raw_anon_1111 · 2026-03-10T18:13:18 1773166398

Besides building web apps for internal use, I’m never going to let AI architect something I’m not familiar with. I could care less whether it uses “clean code” or what design pattern it uses. Meaning I will go from an empty AWS account to fully fledged app + architecture because I’ve been coding for 30 years and dealing with every nook and cranny of AWS for a decade.

But I would never do the same for Azure.

rsynnott · 2026-03-10T22:03:35 1773180215

> I can glance over code and know "if this compiles and the tests succeed, it will work", even if I didn't have the knowledge to write it myself.

... Errr... Yeah, that's not a great approach, unless you are defining 'work' extremely vaguely.

rectang · 2026-03-10T23:37:38 1773185858

Haha I have usually found myself on the conservative side of any engineering team I’ve been on, and it’s refreshing to catch some flak for perceived carelessness.

I still make an effort to understand the generated code. If there’s a section I don’t get, I ask the LLM to explain it.

Most of the time it’s just API conventions and idioms I’m not yet familiar with. I have strong enough fundamentals that I generally know what I’m trying to accomplish and how it’s supposed to work and how to achieve it securely.

For example, I was writing some backend code that I knew needed a nonce check but I didn’t know what the conventions were for the framework. So I asked the LLM to add a nonce check, then scanned the docs for the code it generated.
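For readers unfamiliar with the pattern, a framework-independent sketch of a single-use nonce check follows. Real frameworks ship their own helpers with their own conventions, so the `NonceStore` class, its in-memory dict, and the session IDs are all hypothetical and only meant to show the moving parts.

```python
import hmac
import secrets


class NonceStore:
    """Issues one pending nonce per session and accepts it at most once."""

    def __init__(self) -> None:
        self._pending = {}  # session_id -> nonce; in-memory for the sketch

    def issue(self, session_id: str) -> str:
        """Create an unguessable nonce bound to this session."""
        nonce = secrets.token_urlsafe(32)
        self._pending[session_id] = nonce
        return nonce

    def verify(self, session_id: str, candidate: str) -> bool:
        """Accept the nonce only for the right session, and only once."""
        expected = self._pending.get(session_id)
        if expected is None:
            return False
        # Constant-time comparison; bytes avoid errors on non-ASCII input.
        if not hmac.compare_digest(expected.encode(), candidate.encode()):
            return False
        del self._pending[session_id]  # single use: replays must fail
        return True


store = NonceStore()
token = store.issue("session-1")
print(store.verify("session-1", token))  # first use
print(store.verify("session-1", token))  # replay
```

The things worth checking against the framework's docs are the same three shown here: how the nonce is bound to a user or session, how it is compared, and when it expires or is consumed.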

UncleMeat · 2026-03-10T20:30:23 1773174623

Sadly, the way people become expert in a codebase is through coding. The process of coding is the process of learning. If we offload the coding to AI tools we will never be as expert in the codebase, its complexity, its sharp corners, or its unusual requirements. While you can apply general best practices for a code review you can never do as much as if you really got your hands dirty first.

"Seniors will do expert review" will slowly collapse.

jonnycoder · 2026-03-10T17:37:57 1773164277

I tend to agree. I spent a lot of time revising skills for my brownfield repo, writing better prompts to create a plan with clear requirements, writing a skill/command to decompose a plan, having a clear testing skill to write tests and validate, and finally having a code reviewer step using a different model (in my case it's codex since claude did the development). My last PR was as close to perfect as I have got so far.

Skidaddle · 2026-03-10T16:38:38 1773160718

Just lead with “You are an expert software engineer…”, easy!

raw_anon_1111 · 2026-03-10T16:49:42 1773161382

In my experience, inefficient code is rarely the issue outside of data engineering type ETL jobs. It’s mostly architectural. Inefficient code isn’t the reason your login is taking 30 seconds. Yes I know at Amazon/AWS scale (former employee) every efficiency matters. But even at Salesforce scale, wringing out every bit of efficiency doesn’t matter.

No one cares about handcrafted artisanal code as long as it meets both functional and non functional requirements. The minute geeks get over themselves thinking they are some type of artists, the happier they will be.

I’ve had a job that requires coding for 30 years, and before that I was a hobbyist, and I’ve worked for everything from 60 person startups to BigTech.

For my last two projects (consulting) and my current project, while I led the project, got the requirements, designed the architecture from an empty AWS account (yes, using IaC) and delivered it, I didn’t look at a line of code. I verified the functional and non functional requirements, wrote the hand off documentation etc.

The customer is happy, my company is happy, and I bet you not a single person will ever look at a line of code I wrote. If they do get a developer to take it over, the developer will be grateful for my detailed AGENTS.md file.

sarchertech · 2026-03-10T18:19:08 1773166748

It’s not about hand crafted code or even code performance.

We know from experimentation that agents will change anything that isn’t nailed down. No natural language spec or test suite has ever come close to fully describing all observable behaviors of a non-trivial system.

This means that if no one is reviewing the code, agents adding features will change observable behaviors.

This gets exposed to users as churn, jank, and broken workflows.

raw_anon_1111 · 2026-03-10T18:26:34 1773167194

That’s easy enough to prevent with modular code; that’s what “plan mode” is for. But you probably never worked with a bunch of C# developers using R#

sarchertech · 2026-03-10T19:01:47 1773169307

1. Preventing agents from crossing boundaries, creating implicit and explicit dependencies, and building false layers requires much more human control over every PR and involvement with the code than you seem to espouse.

2. Assuming that techniques that work with human developers will also work with agents that have severely impaired judgement but are massively faster at producing code is a bad idea.

3. There’s no way you have enough experience with maintaining code written in this way to confidently hand wave away concerns.

rixed · 2026-03-11T06:24:07 1773210247

I solve this issue (agent looking at too much and changing too much) with the best abstraction ever invented : files and permissions.

One task is usually composed of 2 input files, a specification and a header file, and the task is to output the implementation and nothing more. The agent user has no other permissions in the file system and has no tools; it just outputs the code, which is directed into a file. I run 'make' whenever I update a specification. Token count is minimal.

Do I save time? Not much, but having to specify and argue about everything is interesting, and I trust myself that I'm not losing any knowledge this way, be it the why or the how.

raw_anon_1111 · 2026-03-10T19:18:09 1773170289

Absolutely no one in the value chain cares about “how many layers of abstractions your code has” - not your management or your customers. They care about functional and non functional requirements

sarchertech · 2026-03-10T21:51:43 1773179503

Of course they don’t. Please reread what I said, give it the slightest bit of thought, and re-respond if you want a response from me.

raw_anon_1111 · 2026-03-10T22:12:43 1773180763

By definition, coding agents are right now the worst they will ever be, and the industry as a whole is by definition the least experienced it will ever be at using them.

So many people on HN are so insulted that the people who put money in our bank accounts, and in some cases stock in our brokerage accounts, never cared about their bespoke clean code or GoF patterns. LLMs just made it more apparent.

It’s always been dumb for PRs to focus on for loops vs while loops instead of on whether functional and non-functional requirements are met.

sarchertech · 2026-03-11T00:45:56 1773189956

Wow you have completely lost the plot. It’s like you’re a bot that’s mixing up who he’s replying to.

raw_anon_1111 · 2026-03-11T00:54:09 1773190449

Just maybe you aren’t making the strong argument you think you are making

sarchertech · 2026-03-11T11:44:38 1773229478

I wouldn’t know sir, because you didn’t address a single lick of it. You went off and argued with some other argument you constructed in your head. I believe we have a name for that. It is an awfully good way to win arguments in your own mind and retain unshakable confidence in your position if that’s your goal though.

raw_anon_1111 · 2026-03-11T15:02:09 1773241329

Again exactly what argument are you making then? Maybe you should spend more time being clear about your argument? An LLM might help…

sarchertech · 2026-03-11T18:09:08 1773252548

Well there’s a numbered list if you’ll read back a few messages. That should be easy for even the most LLM addled brain to digest.

But instead you went off and had your own party arguing with someone (it certainly wasn’t me) about number of layers, GoF patterns, and “clean” code.

hard24 · 2026-03-10T17:10:54 1773162654

"No one cares about handcrafted artisanal code as long as it meets both functional and non functional requirements"

Speak for yourself. I don't hire people like you.

raw_anon_1111 · 2026-03-10T17:12:01 1773162721

And guess what? You probably don’t pay as much as I make now either…

Even in late 2023 with the shit show of the current market, I had no issues having multiple offers within three weeks just by reaching out to my network and companies looking for people with my set of skills.

hard24 · 2026-03-10T17:14:58 1773162898

I field a small team of experts who are paid upwards of a million GBP in cold-hard cash in London. Not stock. Cash.

You sound like a bozo, I can sniff it through my screen.

zepolen · 2026-03-10T22:16:05 1773180965

This sounds like a place I want to work at.

YCpedohaven · 2026-03-10T16:52:52 1773161572

[flagged]

raw_anon_1111 · 2026-03-10T17:07:19 1773162439

Yes because I didn’t check to see if Claude code used a for loop instead of a while loop? Or that it didn’t use my preferred GOF pattern and didn’t use what I read in “Clean Code”?

Guess what? I also stopped caring how registers are used and counting clock cycles in my assembly language code like it’s the 80s and I’m still programming on a 1 MHz 65C02.

icedchai · 2026-03-10T19:06:09 1773169569

I can see the argument both ways. Some code is just not worth looking at...

But do you look at any of the AI output? Or is it just "it works, ship it"?

raw_anon_1111 · 2026-03-10T20:25:35 1773174335

My last project was basically an ETL implementation on AWS starting with an empty AWS account and an internal web admin site that had 10 pages. I am yada yada yadaing over a little bit.

What I checked.

1. The bash shell scripts I had it write as my integration test suite

2. To make sure it wasn’t loading the files into Postgres the naive way, loading the file from S3 and doing bulk inserts, instead of using the AWS extension that lets it load directly from S3. It’s the difference between taking 20 minutes and 20 seconds.

3. I had strict concurrency and failure recovery requirements. I made sure it was done the right way.

4. Various security, logging, log retention requirements

What I didn’t look at - a line of the code for the web admin site. I used AWS Cognito for authentication and checked to make sure that unauthorized users couldn’t use the website. Even that didn’t require looking at the code - I had automated tests that tested all of the endpoints.

icedchai · 2026-03-10T22:25:45 1773181545

This all makes sense.

I've witnessed human developers produce incredibly convoluted, slow "ETL pipelines" that took 10+ minutes to load single digit megabytes of data. It could've been reduced to a shell script that called psql \copy.

getly_store · 2026-03-11T10:04:30 1773223470

Yep. Heavy ETL often adds latency; a staging table plus COPY into Postgres, then idempotent upserts, is usually enough. Keep it incremental and observable: checksums, counts, and replayable loads. For bigger scales, add CDC (logical decoding like Debezium) and parallelize ingestion across partitions; minimize in-Python transforms and push work into SQL.
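
A rough sketch of the staging-table-plus-idempotent-upsert pattern, with loud assumptions: SQLite (via Python's `sqlite3`) stands in for Postgres so the example runs anywhere, `executemany` stands in for `COPY`, and the table and function names (`staging`, `target`, `make_db`, `load_batch`) are made up for illustration. It needs SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`.

```python
import csv
import io
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a staging and a target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def load_batch(conn: sqlite3.Connection, csv_text: str) -> int:
    """Load one CSV batch through staging, upsert into target, return row count."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM staging")
        # Bulk load into staging (this is where COPY would go in Postgres).
        conn.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
        # Idempotent upsert: replaying the same batch leaves target unchanged.
        # "WHERE true" is required by SQLite when combining SELECT with upsert.
        conn.execute(
            """
            INSERT INTO target (id, name)
            SELECT id, name FROM staging WHERE true
            ON CONFLICT (id) DO UPDATE SET name = excluded.name
            """
        )
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Because the upsert is keyed on `id`, a failed or repeated load can simply be replayed, and comparing the returned count against the source is a cheap observability check.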

icedchai · 2026-03-11T15:00:42 1773241242

Yep, that's the pattern I follow.

Unfortunately, the "ETL pipeline" I mentioned didn't even use transactions and was opening a new connection for every insert. No wonder it was slow.

marginalia_nu · 2026-03-10T15:08:45 1773155325

(Flashbacks from the horrors in the Byte Order Mark wars)

xnorswap · 2026-03-10T15:25:51 1773156351

That still trips us up regularly to this day.

dionian · 2026-03-10T15:15:56 1773155756

Oh god