awakeasleep's comments

I’m surprised no one has mentioned that there was no safe course of action for the journalist because there was money on both sides of this outcome.

No matter what he reported, he would have the other side threatening him.


Defer to an actual authority. Where is the official report on whether or not it was an interception? Even with a large explosion, the fact that it landed in a wooded area implies it was intercepted. These are targeted missiles; an acceptable result of an interception is bringing one down in an undeveloped area.

So I would think there should be some sort of authority with official capacity to state what happened, not just a random journalist who doesn't cite concrete sources.


> No matter what he reported, he would have the other side threatening him.

That is why modern reporting's best practice is to always support both sides. /s


Perhaps because he is a journalist whose job is to report reality, not avoid threats?

Until they start trying the carrot instead of the stick. Then it becomes a bidding war to determine the "reality"

They can also just do that bidding war in the resolution contract

Always is. 'Reality' is a subjective accounting.


The reason no one responds to this list is because it's just one big gish gallop

That you brought a link to the Israeli Military Censor to hint at a conspiracy is enough to understand who we're dealing with.

But even if you go into the list, you'll see at the top that those who were shot were in the middle of the battle, where Israeli forces were caught by surprise, to the point of being massacred alongside civilians. And there, supposedly, they weren't shooting to save themselves by the skin of their teeth, but simply wanted to kill journalists.

Also, a quick search shows that "Mohammad Jarghoun" ("מוחמד ג'רגון") was not a journalist at all but a media worker, who according to the CPJ [1] is granted journalistic status during wartime. (Also not mentioned by AJ [2]. What a surprise...)

Another entry for the pantheon of "the most logical failures, in the fewest words". And then no one understands why the ICC will never consider such reports...

[1]: https://www.the7eye.org.il/501320

[2]: https://www.aljazeera.com/news/2023/10/10/at-least-six-pales...


How does one say that a media worker is fair game, but a journalist is not? Both are classified as civilians under international law [1]. Several of these journalists were executed by airstrike as well. Good job cherry-picking the "favourable" examples. Also, I don't know what you're on about, but the ICC is clearly investigating Israeli war crimes of targeted journalist executions [3].

[1]: https://cpj.org/2023/10/journalist-casualties-in-the-israel-...

[2]: https://cpj.org/data-methodology/

[3]: https://ifex.org/iccs-israel-palestine-investigation-will-in...

> Both are classified as civilians

Then say civilians. Don't claim what you can't support. And DO provide context (like, was it still while the massacre was ongoing [1]?). But all I can do is suggest.

> Good job cherry-picking

Me cherry-picking: taking literally the first entries, Array[0] and Array[1].

Also, you can't cry cherry-picking in defense of a gish gallop. You can't enjoy the size argument and then retract items from the list at the smallest pushback.

Otherwise I can prove God. How? Every sentence in the Bible... Oh, you found some that are wrong? "Good job cherry-picking"!

And that's not to mention dozens more problems with the list (no mention of any IDF comments, no sources for titles, etc.). This is just a bad list. Simple as.

> ICC is clearly investigating

Investigating != judgment. But good, send them more. Just please send them a list that starts with items that might survive the smallest scrutiny, and don't prove it by hinting at conspiracies just because Israel has a security censor. But all I can do is suggest.

[1]: https://13tv.co.il/item/news/abroad/dynw9-903794689/




Another "banger" comment that shows you did not read your own sources' links. Here is one from the wiki (Israeli Military Censor):

https://www.academia.edu/10481823/The_Israeli_paradox_The_mi...

Maybe your third comment will finally succeed...


Both researchers in that paper are Israeli residents. Do you have an independent report that corroborates their findings?

The wiki sources FROM YOUR OWN LINKS are suddenly not enough once they're against you?

Now now... one might mistakenly think you have some inconsistencies in your theory... and you ought to revisit them first before demanding more...


The link corroborates my claim: the State of Israel does not protect press freedom.

I am asking you to cite a better counterargument if you want to disprove it. Or concede that Israeli journalists are regularly threatened by their government.


I feel like being a journalist in a warzone is already exposure to a sufficient number of threats for the benefit of human society; we shouldn't simply accept them being exposed to an entirely different set of completely unnecessary threats from a pile of sociopaths running their own sick gambling dead pools.

Explain how fragility of implementation (spaghetti code, high coupling, low cohesion) fits into your worldview.

As human developers, I think we're struggling with "letting go" of the code. The code we write (or agents write) is really just an intermediate representation (IR) of the solution.

For instance, GCC will inline functions, unroll loops, and apply myriad other optimizations whose details we don't care about (and actually want!). But when we review the ASM that GCC generates, we are not concerned with the "spaghetti" or the "high coupling" and "low cohesion". We care that it works and is correct for what it is supposed to do.

Source code in a higher-level language is not really different anymore. Agents write the code, maybe we guide them on patterns and correct them when they are obviously wrong, but the code is just the work-item artifact that comes out of extensive specification, discussion, proposal review, and more review of the reviews.

A well-guided, iterative process and problem/solution description should be able to generate an equivalent implementation whether a human is writing the code or an agent.


A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent. It can do this because it is translating from one formal language to another.

Translating a natural-language prompt, on the other hand, requires the LLM to make thousands of small decisions that will come out differently each time you regenerate the artifact. Even ignoring non-determinism, prompt instability means that any small change to the spec will result in a vastly different program.

A natural language spec and test suite cannot be complete enough to encode all of these differences without being at least as complex as the code.

Therefore each time you regenerate large sections of code without review, you will see scores of observable behavior differences that will surface to the user as churn, jank, and broken workflows.

Your tests will not encode every user workflow, not even close. Ask yourself if you have ever worked on a non-trivial piece of software where you could randomly regenerate 10% of the implementation, while keeping to the spec, without seeing a flurry of bug reports.

This may change if LLMs improve such that they are able to reason about code changes to the degree a human can. As of today they cannot do this and require tests and human code review to prevent them from spinning out. But I suspect at that point they’ll be doing our job, as well as the CEOs and we’ll have bigger problems.


I don't see a world where a motivated soul can build a business from a laptop and a token service as a problem. I see it as opportunity.

I feel similarly about Hollywood and the creation of media. We're not there in either case yet, but we will be; that's pretty clear. And when I look at the feudal society that is the entertainment industry here, I don't understand why so many of the serfs are trying to perpetuate it in its current state. And I really don't get why engineers think this technology is going to turn them into serfs unless they let it happen to them. If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

I am assuming, given the rate of advance of AI coding systems in the past year, that there is plenty of improvement to come before this plateaus. I'm sure that will include AI-generated systems to do security reviews at human level or better. I've already seen Claude find 20-plus-year-old bugs in my own code. They weren't particularly mission-critical, but they were there the whole time. I've also seen it do amazingly sophisticated reverse engineering of assembly code only to fall flat on its face for the simplest tasks.


That depends on how fast that change happens. If 45% of jobs evaporate in a 5-year period, a complete societal collapse is the likely outcome.

Sounds like influencer nonsense to me. Touch grass. If the people are fed and housed, there's no collapse. And if the billionaire class lets them starve, they will finally go through some things just like the aristocracy in France once did. And I think even Peter Thiel is smarter than that. You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

OTOH if what you're really talking about is the long-term collapse in our ludicrous carbon footprint when we finally run out of fossil fuels and we didn't invest in renewables or nuclear to replace them, well, I'm with you there.


>Sounds like influencer nonsense to me. Touch grass.

I don't even know what this means.

The worst unemployment during the Weimar Republic was 25-30%. Unemployment in the Great Depression peaked at 25%.

So yeah if we get to 45% unemployment and those are the highest paying jobs on average then yeah it's gonna be bad. Then you add in second order effects where none of those people have the money to pay the other 55% who are still employed.

We might get to a UBI relatively quickly and peacefully. But I'm not betting on it.

>finally go through some things just like the aristocracy in France once did.

Yeah, that's probably the most likely scenario, but it quickly devolved into death and imprisonment for far more than just the aristocrats, and eventually ended with Napoleon trying to take over Europe and millions of deaths overall.

The world didn't literally end, but it was 40 years of war, famine, disease, and death, and not a lot of time to think about starting businesses with your laptop.


And the Dark Ages lasted a millennium. Sounds like quite an improvement on that. And if America didn't want a society hellbent on living the worst possible timeline, why did it re-elect President Voldemaga and give him the football? And then, even when he breaks nearly every political promise, his support remains better than his predecessor's? Anyway, I think the richest ~1135 Americans won't let you starve, but they'll be happy to watch you die young of things that had stopped killing people for quite some time whilst they skim all the cream. And that seems to be what the plurality wants, or they'd vote differently.

The good news is that America is ~5% of the world. And the more we keep punching ourselves in the face, the better the chance someone else pulls ahead. But still, we have nukes, so we're still the town bully for the immediate future.


What are you even arguing about? I have absolutely no idea where you are going with this.

Yeah I figured that. You think society is going to collapse because of AI. I don't. But I do think that stupid narrative is prevalent in the media right now and the C-suite happily proclaiming they're going to lay people off and replace them with AI got the ball rolling in the first place. Now it has momentum of its own with lunatics like Eliezer Yudkowsky once again getting taken seriously.

Fortunately, the other 95% of humanity is far less doomer about their prospects. So if America wants to be the new neanderthals, they'll be happy to be the new cro magnons.


I don't think society is going to collapse because of AI because I don't think the current architectures have any chance of becoming AGI. I think that if AGI is even something we're capable of it's very far off.

I think that if CEOs can replace us soon, it's because AGI got here much sooner than I predicted. And if that happens we have two options, Mad Max and Star Trek, and Mad Max is the more likely of the two.


What's with all the catastrophic thinking then? Mad Max? Collapse of society because of 45% unemployment? I really hate people on principle, but apparently I have more faith in them looking out for their own self-interest than you do. Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah, didn't think so), manufacturing for all the parts for all those crazy custom-built road-warrior wagons, and ranches of livestock for all the leather for all the cool outfits (and with all that cow, no one needs to starve, but oh, the infrastructure needed to keep the cows fed).

If doom porn is your thing, try watching Threads or The Day After, especially Threads. That said, I don't think Star Trek is possible, maybe The Expanse but more likely we run out of cheap energy before we get off world.

As for the AGI, it all depends on your definition. We're already at Amazon IC1/IC2 coding performance with these agents (I speak from experience previously managing them). If we get to IC3, one person will be able to build a $1B company and run it or sell it. If you're a purist like me and insist we stick to douchebag racist Nick Bostrom's superintelligence definition of AGI, then we agree. But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.


>Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah didn't think so)

Is this a joke or do you have a learning disability?

>But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.

Or I could do neither and write you off as a gasbag who doesn't know what he's talking about like all the other ex-amazon management I've had the pleasure to work with over the years.


I guess you have a really short context buffer with all this frequently forgetting things you've said yourself.

But that aside, how's all that self-righteousness working out for you?


I bet you have ex-Amazon prominently in your LinkedIn profile.

Don't have a LinkedIn profile, don't need one. But I'm guessing you're listed under LinkedIn Lunatics.

I read back through a few of your posts and you’re either schizophrenic, or a very elaborate troll.

I know a few older people who started posting like this when they hit their 50s. I’ve only got a few years left. Hopefully I can avoid it, but maybe it’s inevitable.


Ageism: now that's a warrior's flex, amIRight?

People like myself in their 50s to 60s, who had the experience of banging the metal on imperfect, buggy hardware late into the night to mine gems before Python made the entire software engineering community pivot to a core competency of syntax pedantry plus stringing library calls together, are having a real party with AI agents effectively doing the same thing they did 30 years ago. I personally never stopped coding, even through my one awful experience as an engineering manager.

But you do you, and hear me now, dismiss me later. There won't be 45% unemployment because the minute AI starts replacing current engineering skills for real is the minute the people it targets wake up and start learning how to work with AI coding agents that will be dramatically better than today. People resist change until there are no other options, just look at fossil fuels. The free market will work that one out too eventually.

And no amount of some nontechnical guy vibe-sciencing his way to a working mRNA vaccine for his cancer-ridden dog, or an engineer unlocking mods to Disney Infinity just from the binary and Claude Code, or an entire web browser ported to Rust, will ever convince you these things are not the enemy. And that's going to put you through some things down the road. So of course, since this will never happen, I'm an elaborate troll or a nutcase, just like the people who pulled all those things off, never mind all the evidence mounting that these things can be amazing in the right hands. That's CRAZYTALK! Stochastic Parrot! Glorified Autocomplete! Mad Max! Mad Max! DLSS 5!


>You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

You are the epitome of the tech bro.


Sure, sure. Understanding how these sociopaths think clearly makes me a tech bro rather than someone who incorporates worst-case scenarios into my planning. Suggesting they would maintain minimum viable society to save their own asses means I'm in favor of it, right? This is why I work remotely.

Peter Thiel might be smarter than that but I’m not sure about the other ones.

Look how Musk treated the Twitter devs or Bezos any of his workers or Trump anybody.


They're all quite intelligent. And they're world class experts in saving their own bacon. Doesn't mean they have any ethics though nor any emotional intelligence after decades of being surrounded by toadies and bootlickers.

Smart is not equal to intelligent.

You can be very intelligent but have a blind eye on some trivial things.

I’m certain that some of them think they are untouchable (or even just are well prepared). We will only see if that’s really true if shit hits the fan.


We all know they have bunkers, and we roughly know where they are. I got suspended on Reddit for "threatening harm to others" for saying that a couple weeks back. But I don't think we need to raid the bunkers in your TEOTWAWKI scenario; their bodyguards will do all the heavy lifting once they realize the power balance has shifted. But I also don't expect a SHTF scenario, just a slow creeping enshittification of living standards instead of actually implementing a UBI.

And then the survivors who band together to rebuild community instead of chasing some idiotic Mad Max scenario will ultimately prevail. And yes, they are blind to that other option because they wouldn't end up on top.


>If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

But you aren't building, your LLM is. Also, you are only thinking about ways as you, a supposed builder, will benefit from this technology. Have you considered how all previous waves of new technologies have introduced downstream effects that have muddied our societies? LLMs are not unique in this regard, and we should be critical on those who are trying to force them into every device we own.


Would you say the general contractor for your home isn’t a builder because he didn’t install the toilets?

I think that's precisely his thinking, and don't let him know about all those fancy, expensive unitasker tools they have that you probably don't, which let them do it far more cost-effectively and better than the typical homeowner. Won't you think of the jerbs(tm)? And to Captain Dystopia: life expectancies were increasing monotonically until COVID. Wonder what changed?

I think this argument would make more sense if you were talking about an architect, or the customer.

A contractor is still very much putting the house together.


The general contractor is not doing the actual building so much as he is coordinating all of the specialists, making sure things run smoothly, scheduling things based on dependencies, and coordinating with the customer. I've had two houses built from the ground up.

3 myself and I have yet to meet a "vibe" contractor.

And he is also not inspecting every screw, wire, etc. He delegates.

Oh you're preaching to the choir. I think we are entering a punctuated equilibrium here w/r to the future of SW engineering. And the people who have the free time to go on to podcasts and insist AI coding agents can't do anything useful rather than learning their abilities and their limitations and especially how to wield them are going to go through some things. If you really want to trigger these sorts, ask them why they delegate code generation to compilers and interpreters without understanding each and every ISA at the instruction level. To that end, I am devoid of compassion after having gone through similar nonsense w/r to GPUs 20 years ago. Times change, people don't.

I haven’t stayed relevant and able to find jobs quickly for 30 years by being the old man shouting at the clouds.

I started my career in 1996 programming in C and Fortran on mainframes, and got my first, only, and hopefully last job at BigTech at 46, seven jobs later.

I’m no longer there. Every project I’ve had in the last two years has had classic ML and then LLMs integrated into the implementation. I have very much jumped on the coding agent bandwagon.


Started mine around the same time, and yes, keeping up keeps one employed. What's disheartening, however, is how little keeping up the key decision makers and stakeholders at FAANG do, and it explains idiocy like already trying to fire engineers and replace them with AI. Hilarity ensued, of course, because hilarity always ensues for people like that, but hilarity and shenanigans appear to be inexhaustible resources.

I very much would rather get a daily anal probe with a cactus than ever work at BigTech again, even knowing the trade-off: I now, at 51, make the same as a 25-year-old L5 I mentored when they were an intern and in their first year back as an L4 before I left.

If you have FIRE money, getting off the hamster wheel of despair that is tech industry culture is the winning move. Well-played.

Not quite FIRE money. I still need to work for a while - I just don't need to chase money. I make "enough" to live comfortably, travel like I want (not first class), and save enough for retirement (max out 401K + catch-up contributions + max out HSA + max out Roth).

We did choose to downsize and move to state tax free Florida.

If I have to retire before I’m 65, exit plan is to move to Costa Rica (where we are right now for 6 weeks)


I've struggled a bit with this myself. I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things; the code was just a way to do that. If you want to get pedantic, I wasn't building things before AI either, the compiler/linker was doing that for me. I see this as just another level of abstraction. I still get to decide how things work, what "layers" I want to introduce. I still get to say, no, I don't like that. So instead of being the "grunt", I'm the designer/architect. I'm still building what I want. Boilerplate code was never something I enjoyed before anyway. I'm loving (like actually giggling) having the AI tie all the bits together for me and getting up and running with things working. It reminds me of my Delphi days: File->New Project, and you're ready to go. I think I was burnt out. AI is helping me find joy again. I also disable AI in all my apps, so I'm still on the fence about several things too.

This resonates. I spent years thinking I enjoyed coding, but what I actually enjoy is designing elegant solutions built on solid architecture. Inventing, innovating, building progressively on strong foundations. The real pleasure is the finished product (is it ever really finished though?) — seeing it's useful and makes people's lives easier, while knowing it's well-built technically. The user doesn't see that part, but we know.

With AI, by always planning first, pushing it to explore alternative technical approaches, making it explain its choices — the creative construction process gets easier. You stay the conductor. Refactoring, new features, testing — all facilitated. Add regular AI-driven audits to catch defects, and of course the expert eye that nothing replaces.

One thing that worries me though: how will junior devs build that expert eye if AI handles the grunt work? Learning through struggle is how most of us developed intuition. That's a real problem for the next generation.


> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.

Here are the reported miscompilation bugs in GCC so far in 2026. The ones labeled "wrong-code".

https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...

I count 121 of them.


If you can’t understand the difference between a bug that will rarely cause a compiler encountering an edge case to generate a wrong instruction and an LLM that will generate 2 completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.

The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.

I’m very aware of that, but I’m also aware that it’s rare enough that the compiler doesn’t emit semantically equivalent code that most people can ignore it. That’s not the case with LLMs.

I’m also not particularly concerned with non-determinism but with chaos. Determinism in LLMs is likely solvable, prompt instability is not.


Classic HN-ism. To focus on the semantics of a statement while ignoring the greater point in order to argue why someone is wrong.

I think it's a perfectly fine point. The OP said (my interpretation) that LLMs are messy, non-deterministic, and can produce bad code. The same is true of many humans, even those whose "job" is to produce clean, predictable, good code. The OP would like the argument to be narrowly about LLMs, but the bigger point even is "who generates the final code, and why and how much do we trust them?"

As of right now agents have almost no ability to reason about the impact of code changes on existing functionality.

A human can produce a 100k LOC program with absolutely no external guardrails at all. An agent can't do that. To produce a 100k LOC program, they require external feedback to keep them from spiraling off into building something completely different.

This may change. Agents may get better.


I argued the greater point: software code generation is not deterministic, whether it's done by expert humans or by LLMs.

It has nothing to do with determinism. It's the difference between nearly perfectly but not quite perfectly translating between rigorously specified formal languages and translating an ambiguous natural language specification into a formal one.

The first is a purely mechanical process, the second is not and requires thousands of decisions that can go either way.


And that’s no different than human developers

The difference is that a human can reason about their code changes to a much higher degree than an AI can. If you don't think this is true and you think we're working with AGI, why would you bother architecting anything at all or building in any guardrails? Why not just feed the AI the text of the contract you're working from and let it rip?

You give way too much credit to the average mid-level ticket-taker. And again, why do I care how the code does it as long as it meets the functional and non-functional requirements?

Because in a real application with real users all of the functional and non-functional requirements aren't documented anywhere but in the code.

If only a coding agent had access to your code…

You realize that coding agents aren't AGI, right? They aren't capable of reasoning about a code change's impact on anything other than their immediate goal, to anywhere near the level even a terrible human programmer can. That's why we have the agentic workflow in the first place. They absolutely require guardrails.

Claude will absolutely change anything that’s not bolted to the floor. If you’ve used it on legacy software with users or if you reviewed the output you’d see this.


Compilers are some of the largest, most complex pieces of software out there. It should be no surprise that they come with bugs as all other large, complex pieces of software do.

This seems to apply easily to LLMs as language coprocessors that can output code. How long was it before people trusted compilers?

If you don't understand the difference between something that rigorously translates one formal language to another one and something that will spit out a completely different piece of software with 0 lines of overlap based on a one word prompt change, I don't know what to tell you.

"rigorously" is doing a lot of heavy lifting here.

Let's substitute rigorously with "in an extremely thorough, careful, and methodical way."

As if when you delegate tasks to humans they are deterministic. I would hope that your test cases cover the requirements. If not, your implementation is just as brittle when other developers come online or even when you come back to a project after six months.

1. Agents aren’t humans. A human can write a working 100k LOC application with zero tests (not saying they should but they could and have). An agent cannot do this.

Agents require tests to keep them from spinning out and your tests do not cover all of the behaviors you care about.

2. If you doubt that your tests don't cover all your requirements, consider that 99.9% of every production bug you've ever had completely passed your test suite.


I have never known a human that could or did write 100K lines of bug free working code without running parts of it first and testing.

So humans also don’t write bug free code or tests that cover all use cases - how is that an argument that humans are better?


Not that humans can't write 100k line programs bug free or without running parts of it.

An AI cannot write a 100k line program on its own without external guard rails otherwise it spins out. This has nothing to do with whether the agent is allowed to run the code itself. This is well documented. Look at what was required to allow Claude to write a "C compiler".

This has nothing to do with whether it's bug free. It literally can't produce a working 100k LOC program without external guardrails.


Absolutely no one is arguing that you shouldn’t have a combination of manual and automated tests around either AI or human generated code or that you shouldn’t have a thoughtful design

In a non-trivial app you can't test your way through all of the e2e workflows and thoughtful design isn't what I'm talking about.

How many bugs have you seen that passed your automated and manual testing? Probably 99.9% of them.

Now imagine that you take those same test suites and you unleash an agent on the code that has far worse reasoning capabilities than a human and you tell them they can change anything in the code as long as the tests pass.


So if bugs pass through testing which they have forever, wouldn’t that imply that humans are just as fallible as AI - and slower?

I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker


If you have an employee who codes 2x faster than everyone else but produces 10x the bugs, would your suggestion to be to let him rip and stop reviewing his code output?

> I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker

It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

I regularly find Claude doing insane things that I never would have thought to test against, that would have made it into prod if I hadn’t reviewed the code.


> It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

You’re focused on the output, I’m focused on the behavior. That’s the difference. Just like when I delegate a task to another developer or another company, like a random Salesforce integration, or even a third party API I need to integrate with.


Unfortunately you are not equipped to observe and test all or even most of the behavior of a non-trivial system.

And if you attempt to treat every module in your system like it’s untrusted 3rd party code you’ll run into severe complexity and size limits. No one codes large systems like that because it’s not possible. There are always escape hatches and entanglements.


Actually, a little company you might have heard of called Amazon does…

Jeff Bezos mandated it in 2002.

https://konghq.com/blog/enterprise/api-mandate

AWS S3 by itself is made up of 200+ micro services


Except that they don’t. The API mandate gets violated all the time. And no one at Amazon actually treats every other team as a 3rd party.

Have you worked at Amazon?

I am not saying they treat every other team as a third party. I am saying they treat the code itself as a black box with well defined interfaces. They aren’t reaching into another service’s data store to retrieve information.


How do you know someone is ex-Amazon ? Don’t worry they’ll tell you.

I haven’t mentioned it 5 times already so you can be pretty sure I haven’t. I know too many people that have worked there to ever make that mistake.

But more importantly, have you worked there in the last decade?


Then you have no idea how Amazon works. I was there from 2020-2023.


Valid points. But a crucial part of not "letting go" of the code is that we are responsible for that code at the moment.

If, in the future, LLM providers take ownership of our on-calls for the code they have produced, I would write an "AUTO-REVIEW-ACCEPTER" bot to accept everything and deploy it to production.

If a company requires me to own something, then I should be aware of what that thing is, understand its ins and outs in detail, and be able to quickly adjust when things go wrong.


In the past ten years as a team lead/architect/person who was responsible for outsourced implementations (ie Salesforce/Workday integrations, etc), I’ve been responsible for a lot of code I didn’t write. What sense would it have made for me to review the web developer’s front end code for best practices when I haven’t written a web app since 2002?

as a team lead, if you are not aware of what's happening in the team, what kind of team lead is this?

on the other hand, you may have been an engineering manager, who is responsible for the team, but a lot of times they do not participate in on-call rotations (only as last escalation)


> what kind of team lead is this?

One that trusts the team?

Knowing what's happening in the team and personally reviewing parts of the code for best practices are very different things. Are the other team members happy? Does development seem to go smoothly, quickly and without constantly breaking? Does the team struggle to upgrade or refactor things? At some level you have to start trusting that the people working know what they're doing, and help guide from a higher level so they understand how to make the right tradeoffs for the business.


As a team lead, I know the architecture and the functional and non functional requirements. I know the website is supposed to do $x, but I definitely didn’t guide how, since I haven’t done web development in a quarter century. I know the best practices for architecture and data engineering (to a point).

That doesn’t mean I did a code review for all of the developers. I will ask them how they solved a problem that I know can be tricky, or whether they took something into account.


You are comparing compilers to a completely non deterministic code generation tool that often does not take observable behavior into account at all and will happily screw up a part of your system without you noticing, because you misworded a single prompt.

No amount of unit/integration tests cover every single use case in sufficiently complex software, so you cannot rely on that alone.


I just rewrote a utility for the third time - the first two were before AI.

Short version, when someone designs a call center with Amazon Connect, they use a GUI flowchart tool and create “contact flows”. You can export the flow to JSON. But it isn’t portable to other environments without some remapping. I created a tool before that used the API to export it and create a portable CloudFormation template.

I always miss some nuance; half can be caught by running the official CloudFormation linter and the other half by actually deploying it and seeing what errors you get.

This time, I did it with Claude Code. Ironically enough, it knew some of the complexity because it had been trained on one of my older open source implementations I did while at AWS. But I told it to read the official CloudFormation spec, and after every change to test it with the linter, try to deploy it, and fix it.

Again, I didn’t care about the code - I cared about results. The output of the script either passes the deployment or it doesn’t. Claude iterated until it got it right based on “observable behavior”. Claude has tested whether my deployments were working as expected plenty of times by calling the appropriate AWS CLI command and fixing things, or by reading from a dev database based on integration tests I defined.


That may be the future, but we're not there yet. If you're having the LLM write to a high level language, eg java, javascript, python, etc, at some point there will be a bug or other incident that requires a human to read the code to fix it or make a change. Sure, that human will probably use an LLM as part of that, but they'll still need to be able to tell what the code is doing, and LLMs simply are not reliable enough yet for you to just blindly have them read the code, change it, and trust them that it's correct, secure, and performant. Sure, you can focus on writing tests and specs to verify, but you're going to spend a lot more time going in agentic loops trying to figure out why things aren't quite right vs a human actually being able to understand the code and give the LLM clear direction.

So long as this is all true, then the code needs to be human readable, even if it's not human-written.

Maybe we'll get to the point that LLMs really are equivalent to compilers in terms of reliability -- but at that point, why would we have them write in Java or other human-readable languages? LLMs would _be_ the compiler at that point, with a natural-language UI, outputting some kind of machine code. Until then, we do need readable code.


Me: My code isn’t giving the expected result $y when I do $x.

Codex: runs the code, reproduces the incorrect behavior I described, finds the bug, reruns the code, and gets the result I told it I expected. It iterates until it gets it right and runs my other unit and integration tests.

This isn’t rocket science.


I've actually found that well-written well-documented non-spaghetti code is even more important now that we have LLMs.

Why? Because LLMs can get easily confused, so they need well written code they can understand if the LLM is going to maintain the codebase it writes.

The cleaner I keep my codebase, and the better (not necessarily more) abstracted it is, the easier it is for the LLM to understand the code within its limited context window. Good abstractions help the right level of understanding fit within the context window, etc.

I would argue that use of LLMs change what good code is, since "good" now means you have to meaningfully fit good ideas in chunks of 125k tokens.


I somewhat agree. But that’s more about modularity. It helps when I can just have Claude code focus on one folder with its own Claude file where it describes the invariants - the inputs and outputs.

If you don’t read the code how the heck do you know anything about modularity? How do you know that Module A doesn’t import module B, run the function but then ignore it and implement the code itself? How do you even know it doesn’t import module C?

Claude code regularly does all of these things. Claude code really really likes to reimplement the behavior in tests instead of actually exercising the code you told it to btw. Which means you 100% have to verify the test code at the very least.


Well I know because my code is in separately deployed Lambdas that are either zip files uploaded to Lambda or Docker containers run on Lambda, that only interact via API Gateway, a Lambda invoke, or SNS -> SQS to Lambda, and my IAM roles are narrowly defined to only allow Lambda A to interact with just the Lambdas I tell it to.

And if Claude tried to use an AWS service in its code that I didn’t want it to use, it would have to also modify the IAM IAC.

In some cases the components are in completely separate repositories.

It’s the same type of hard separation I did when there were multiple teams at the company where I was the architect. It was mostly Docker/Fargate back then.

Having separately defined services with well defined interfaces does an amazing job at helping developers ramp up faster and it reduces the blast radius of changes. It’s the same with coding agents. Heck back then, even when micro services shared the same database I enforced a rule that each service had to use a database role that only had access to the tables it was responsible for.

I have been saying repeatedly I focus on the tests and architecture and I mentioned in another reply that I focus on public interface stability with well defined interaction points between what I build and the larger org - again just like I did at product companies.

There is also a reason that at the seven companies I went into before consulting (including GE when it was still a F10 company), I was almost always coming into new initiatives where I could build/lead the entire system from scratch or could separate out the implementation from the larger system with well defined inputs and outputs. It wasn’t always micro services. It might have been separate packages/namespaces with well defined interfaces.

Yeah my first job out of college was building data entry systems in C from scratch for a major client that was the basis of a new department for the company.

And it’s what Amazon internally does (not Lambda micro services) and has since Jeff Bezos’s “API Mandate” in 2002.


This sounds like an absolute hellscape of an app architecture, but you do you. It also doesn’t stop anything except Module A importing Module C without you knowing about it. It doesn’t stop Module A from just copy-pasting the code from C and saying it’s using B.

>almost always coming into new initiatives

That says a lot about why you are so confident in this stuff.


When requirements change, a compiler has the benefit of not having to go back and edit the binary it produced.

Maybe we should treat LLM generated code similarly: just generate everything fresh from the spec anytime there’s a change, though personally I haven’t had much success with that yet.


It very much does have to modify the binary it produced to create new code. The entire Linux kernel has an unstable ABI where you have to recompile your code to link to system libraries.

The Linux userspace ABI is actually quite stable and rarely changes. If this wasn't true, every time you installed a new kernel you'd have to upgrade / reinstall everything else, including the C compiler, glibc, etc. This does not happen.

The Linux kernel ABI (kernel modules, like device drivers) on the other hand, is unstable and closely tied to the kernel version. Most people do not write kernel modules, so generally not an issue. (I did, many years ago.)


Isn’t that the reason that Android phones have piss poor support after being released?

This is fantasy completely disconnected from reality.

Have you ever tried writing tests for spaghetti code? It's hell compared to testing good code. LLMs require a very strong test harness or they're going to break things.

Have you tried reading and understanding spaghetti code? How do you verify it does what you want, and none of what you don't want?

Many code design techniques were created to make things easy for humans to understand. That understanding needs to be there whether you're modifying it yourself or reviewing the code.

Developers are struggling because they know what happens when you have 100k lines of slop.

If things keep speeding in this direction we're going to wake up to a world of pain in 3 years and AI isn't going to get us out of it.


I’ve found much more utility even pre AI in a good suite of integration tests than unit tests. For instance if you are doing a test harness for an API, it doesn’t matter if you even have access to the code if you are writing tests against the API surface itself.

I do too, but it comes from a bang-for-your-buck and not a test coverage standpoint. Test coverage goes up in importance as you lean more on AI to do the implementation IMO.

You did see the part about my unit, integration and scalability testing? The testing harness is what prevents the fragility.

It doesn’t matter to AI whether the code is spaghetti code or not. What you said was only important when humans were maintaining the code.

No human should ever be forced to look at the code behind my vibe coded internal admin portal. It was created with straight Python, no frameworks, server side rendered, producing the HTML and JS for the front end, all hosted in a single Lambda, including much of the backend API.

I haven’t done web development since 2002 with Classic ASP besides some copy and paste feature work once in a blue moon.

In my repos, post AI, my Claude/Agent files have summaries of the initial statement of work, the transcripts from the requirement sessions, my well labeled design diagrams, my design review session transcripts where I explained the design to the client and answered questions, and a link to the Google NotebookLM project with all of the artifacts. I have separate md files for different implementation components.

The NotebookLM project can be used for any future maintainers to ask questions about the project based on all of the artifacts.


> It doesn’t matter to AI whether the code is spaghetti code or not. What you said was only important when humans were maintaining the code.

In my experience using AI to work on existing systems, the AI definitely performs much better on code that humans would consider readable.

You can’t really sit here talking about architecting greenfield systems with AI using methodology that didn’t exist 6 months ago while confidently proclaiming that “trust me they’ll be maintainable”.

Well you can, and most consultants do tend to do that, but it’s not worth much.


I wasn’t born into consulting in 1996. AI for coding is by definition the worst today that it will ever be. What makes you think that the complexity of the code will increase faster than the capability of the agents?

You might have maintained large systems long ago, but if you haven't done it in a while your skill atrophies.

And the most important part is you haven't maintained any large systems written by AI, so stating that they will work is nonsense.

I won't state that AI can't get better. AI agents might replace all of us in the future. But what I will tell you is based on my experience and reasoning I have very strong doubts about the maintainability of AI generated code that no one has approved or understands. The burden of proof isn't on the person saying "maybe we should slow down and understand the consequences before we introduce a massive change." It's on the person saying "trust me it will work even though I have absolutely no evidence to support my claim".


Well seeing that Claude Code was just introduced last year - it couldn’t have been that long since I was coding without AI.

And did I mention I got my start working in cloud consulting as a full time, blue badge, RSU earning employee at a little company you might have heard of based in Seattle? Since I have worked at the second largest employer in the US, unless you have worked for Walmart, I don’t think you have worked for a larger company than I have.

Oh did I also mention that I worked at GE when it was #6 in market cap?

These were some of the business requirements we had to implement for the railroad car repair interchange management software

https://www.rmimimra.com/media/attachments/2020/12/23/indust...

You better believe we had a rigorous set of automated tests in something as highly regulated with real world consequences as the railroad transportation industry. AI would have been perfect for that because the requirements were well documented and the test coverage was extreme.

And unless your coding experience starts before 1986, when I was coding 65C02 assembly language as a hobby, I think I might have a wee bit more than you.

I think you should probably save your “I have more experience” for someone who hasn’t been doing this professionally for 30 years for everything from startups, to large enterprises, to BigTech.


>Well seeing that Claude Code was just introduced last year - it couldn’t have been that long since I was coding without AI.

That's my entire point!

>And unless your coding experience starts before 1986, when I was coding 65C02 assembly language as a hobby, I think I might have a wee bit more than you.

Yeah a real wee bit. I started in the late 80s in Tandy Basic.

>I think you should probably save your “I have more experience” for someone who hasn’t been doing this professionally for 30 years for everything from startups, to large enterprises, to BigTech.

I never said anything about having more experience than you, but I've been doing this almost as long as you have. Also at everywhere from startups to large enterprises to BigTech.

But relevant to the discussion at hand, I haven't been consulting for the last part of my career where I could just lob something over the fence and walk away before I have to deal with the consequences of my decisions. This is what seems to be coloring your experience.


> Well you can, and most consultants do tend to do that

Yeah they do.

I'm familiar enough with the claims to feel confident there is plenty of nefarious astroturfing occurring all over the web including on HN.


Indeed. Astro turfing posts have a particular smell to them.

>No human should ever be forced to look at the code behind my vibe coded internal admin portal

Except security researchers. I work in cybersecurity and we already see vulnerabilities caused by careless AI generated code in the wild. And this will only get worse (or better for my job security).


Yep. Unless you inspect the payload and actual code / validation logic, there is nothing to distinguish an API that "works" and is totally insecure from one that is actually secure. I've seen Claude Code generate insecure APIs that "worked", only to find it decided to add a user ID as a parameter and totally ignore any API token.

And you haven’t seen security vulnerabilities in the wild based on careless human generated code?

In my experience, consulting companies typically have a bunch of low-to-medium skilled developers producing crap, so the situation with AI isn't much different. Some are better than others, of course.

Also developer UX, common antipatterns, etc

This “the only thing that matters about code is whether it meets requirements” is such a tired take, and I can’t imagine anyone seriously spouting it has had to maintain real software.


The developer UX is the markdown files, if no developer ever looks at the code.

Whether you are tired of it or not, absolutely no one in your value chain - your customers who give your company money, or your management chain - cares about your code beyond whether it meets the functional and non functional requirements. They never did.

And of course whether it was done on time and on budget


As a consumer of goods, I care quite a bit about many of the “hows” of those goods just as much as the “whats”.

My home, which I own, for example, is very much a “what” that keeps me warm and dry. But the “how” of its construction is the difference between (1) me cursing the amateur and careless decision making of builders and (2) quietly sipping a cocktail on the beach, free of a care in the world.

“How” doesn’t matter until it matters, like when you put too much weight onto that piece of particle board IKEA furniture.


Do you know how every nail was put into your house? Does the general contractor?

I know where they fucked up and cost me thousands of dollars due to cutting corners during build-out and poor architectural decisions during planning. These kinds of things become very obvious during destructive inspection, which is probably why there are so many limitations on warranties; I digress.

He’s mildly controversial, but watch some @cyfyhomeinspections on YouTube to get a good idea of what you can infer of the “how” of building homes and how it affects homeowners. Especially relevant here because he seems to specialize in inspecting homes that are part of large developments where a single company builds out many homes very quickly and cuts tons of corners and makes the same mistakes repeatedly, kind of like LLM-generated code.


So you’re saying that whether it’s humans or AI - when you delegate something to others you have no idea whether it’s producing quality without you checking yourself…

> you have no idea whether it’s producing quality without you checking yourself

No, I can have some idea. For example, “brand perception”, which can be negatively impacted pretty heavily if things go south too often. See: GitHub, most recently.

I mean, there are already companies that have a negative reputation regarding software quality due to significant outsourcing (consultancies), or bloated management (IBM), or whatever tf Oracle does. We don’t have to pretend there’s a universe where software quality matters, we already live in one. AI will just be one more way to tank your company’s reputation with regards to quality, even if you can maintain profitability otherwise through business development schemes.


So as long as it is meeting the requirements of “it stays up consistently and doesn’t lose my code” you really don’t care how it was coded…

The same as I’ve been arguing about using an agent to do the grunt work of coding.

If GitHub’s login is slow, it isn’t because someone or something didn’t write SOLID code.


> So as long as it is meeting the requirements of “it stays up consistently and doesn’t lose my code” you really don’t care how it was coded…

I don’t think we’ll come to common ground on this topic due to mismatching definitions of fundamental concepts of software engineering. Maybe let’s meet again in a year or two and reflect upon our disagreement.


If you maintain software used by tens of thousands to millions of people, you will quickly realize that no specified functional and non-functional requirements cover anywhere near all user workflows or observable behaviors.

If you mostly parachute in solutions as a consultant, or hand down architecture from above, you won’t have much experience with that, so it’s reasonable for you to underestimate it.


AWS S3 by itself is made up of 300 microservices. Absolutely no developer at AWS knows how every line of code was written.

The scalability requirements are part of the “non functional requirements”. I know that the vibe coded internal admin website will never be used by more than a dozen people just like I know the ETL implementation can scale to the required number of transactions because I actually tested it for that scalability.

In fact, the one I gave to the client was my second attempt because my first one fell flat on its face when I ran it at the required scale


I'm not talking about scalability requirements. I'm talking about the different workflows that 10 million people will come up with when they use a program that won't exist in any requirements docs.

Do you think that AI coded implementations just magically get done without requirements?

You're not understanding what I'm saying. Say you tell your agents to add a new feature to an app, and you do it by writing up a new requirements doc. If you don't review the code, they will change a million different "implementation details" in order to add the new feature, and that will break workflows that aren't specified anywhere.

The code is the spec. No natural language specification will ever fully cover every behavior you care about in practice. No test suite will either.

If you don't know this, you haven't maintained non-trivial software.


And have you never seen the havoc an overzealous developer can wreak on an existing code base without a testing harness? Let a developer loose with something like ReSharper, which has existed since at least the mid 2000s.

If your tests don’t cover your use cases, you are just as much in danger from a new developer. It’s an issue with your testing methodology in either case.

And there is also plan mode that you should be reviewing


Of course they can. Those kinds of developers cause problems constantly. It's one of the biggest reasons we have code reviews. Automated tests help too.

But even with all of that we still have bugs and broken workflows. Now take that human and remove most of their ability to reason about how code changes affect non-local functionality and make them type 1000x faster. And don't have anyone review their code.

The code is the spec, someone needs to be reviewing it.


I personally haven't made up my mind either way yet, but I imagine that a vibecoding advocate could say to you that maintaining code makes sense only when the code is expensive to produce.

If the code is cheap to produce, you don't maintain it, you just throw it away and regenerate.


If you have users, this only works if you have managed to encode nearly every user observable behavior into your test suite.

I’ve never seen this done even with LLMs. Not even close. And even if you did it, the test suite is almost definitely more complex than the code and will suffer from all the same maintainability problems.


And in that case how is it different than when random developers come on and off projects?

For one you don't let random devs hop on and off projects without code reviews, which is what people who say they don't care about the code should be doing.

And two, agents are clearly worse at reasoning through code changes than humans are.


And the team lead with 7 developers isn’t going to be doing code reviews of all the code. At most he is going to be reviewing those critical paths.

I couldn’t care less about the implementation behind the vibe coded admin website that will only be used by a dozen people. I care about the authorization.

Even with the ETL job, I cared only about the performance characteristics, concurrency, logging, and the correctness of the results.


>And the team lead with 7 developers isn’t going to be doing code reviews of all the code. At most he is going to be reviewing those critical paths.

Why would the team lead need to review all 7 developers? If you're regularly swapping out every single developer on a team, you're gonna have problems.

>I could care less about the implementation behind the vibe coded admin website that will only be used by a dozen people. I care about the authorization.

If you only have 12 users sure do whatever you want. If you don't have users nothing is hard.


It was 12 users who monitored and managed the ETL job. If I had 1 million users and the login was taking 2 minutes, what difference would the front end code have made as long as the backend architecture was secure, scalable, etc.? I can guarantee you it’s not because the developer failed to write SOLID code…

There you go arguing with strawmen again. I don’t give a single flying flip about SOLID, or Clean Code, or GoF. People who read Clean Code as their first programming book and made that their identity have been the bane of my existence as a programmer.

It’s not about how long something is taking although that is an observable behavior. It’s about how 1 million users over time will develop ways of using your product that you never thought about, much less documented or tested.

Perhaps you’ve heard the phrase “The purpose of the system is what it does”?

The system is not the spec or the tests. An agent is only reasoning about how to add a new feature, and the only thing preventing it from changing observable behavior is the tests. So if an agent is changing untested behavior, it’s changing the purpose of the system.


That’s not exactly a great argument for depending on undefined behavior. Should I as a developer depend on “undefined behavior” in C (yes, undefined behavior is explicitly defined in C)?

On a user facing note, I did a project where I threw stats in DDB just for my own observation, knowing very well that it was the worst database to use since it does no aggregation type queries (sum, average, etc). I didn’t document it, I didn’t talk about it, and yet the developer on their side used it, when I had specifically documented that he should subscribe to the SNS topic that I emit events to and ingest the data into his own Oracle database.

No maintainer of, for instance, a C# or Java library is going to promise that private functions a developer got access to via reflection are not going to change.

I’m solely responsible for public documented interfaces and behaviors.

Oh and that gets back to an earlier point: how do I know that my systems will be able to be maintained? For the most part I design my systems to do a “thing” with clearly defined entry points, extension points, exit points, and interfaces. In the case I’m referring to above, it was a search system based on “agents” - some RAG based, one using a Postgres database with a similarity search - plus an orchestrator. You extend the system by adding a new lambda, registering it, and prioritizing its results in the vibe coded GUI.

Apple is famous for instance for not caring if you tried to use private APIs and it broke in a new version.


>UB

This is a topic I happen to know a little about. You as a programmer should probably avoid UB for the most part, but the key point here is that programmers don’t follow this rule.

A while back a study found that SQLite, PostgreSQL, GCC, LLVM, Python, OpenSSL, and Firefox all contained code that relied on signed integer overflow wrapping. Basically, even though the C spec says it’s UB, almost every CPU you’ll run into uses two’s complement, so it naturally wraps around.

When compiler authors tried to aggressively optimize and broke everything they had to roll that back and/or release flags to allow users to continue using the old behavior.

This kind of stuff happens all the time. The C spec is nearly worthless on paper, because what matters is what the compilers implement, not what the spec tells them to implement. If you spend time talking to LLVM folks, breaking the world because they changed some unspecified behavior is one of their top concerns.

And this is programmers who know how to read specs.

Imagine you’re working on software used by nearly every major movie studio. You think those users have ever read the spec for the software they are using? They don’t care about UB; they don’t even know the concept exists.

It doesn't matter how well tested I think my software is. Even very simple software will have unspecified and untested behavior. Give the software a little time and some users, and they will start exploiting that behavior. If I unleashed some agents on our code base to implement well-architected features without reviewing their output, and could somehow magically ensure they didn't break any workflow we had documented, tested, or even knew about in our organization, the head of NBCUniversal would be on the phone with my boss's boss's boss's boss demanding we change it back to the way it was within 24 hours.

Users depend on what the system does, not what you as a designer think it does. The purpose of a system is what it does. Not what it says it does.

We’ve been having this argument since the waterfall days. The code is the spec. We aren’t architects drawing blueprints. The code is the blueprint. If it was that easy to design systems like this all code would already be generated from UML graphs and flowcharts like we’ve been able to do for decades.


Back in my C days, I wrote C code that had to work on PCs I had access to and on mainframes I never got a chance to test on, built with ancient compilers. Some little-endian, some big-endian. We had a custom makefile that tried to warn against non-portable behavior.

But are you really arguing that I shouldn't feel free to change private methods because some developer somewhere might use reflection to access them, or that I shouldn't change the schema of a SQLite database that is deeply embedded in a library folder somewhere?

Or are you saying I should feel free to do

    char *foo() { char bar[] = "hello world"; return bar; }  /* returns a pointer to a dead local: UB */

and be upset when weird things happen when I upgrade my compiler?

What do you think Apple would do in that situation? They have multiple times over the past 3 decades said tough noogies if you didn’t do things using the documented APIs.

Jeff Bezos mandated documented service interfaces with his famous "API mandate" in 2002.


You can change whatever you want, but if you make an internal change without signaling that it's a breaking change, and it breaks a significant number of your important users' workflows, you're gonna have a bad time.

But that’s mostly irrelevant because most software isn’t written to be used by developers who should know better than to rely on undocumented behavior.

As for Amazon, the API mandate gets violated all the time.

And it's funny that you should mention them, because they just started requiring a code review from a senior engineer for all merges after issues with vibe coding.


So you know that from working at Amazon, i.e. that they aren't microservice focused (yes, I worked at AWS), or that they break it all of the time?

You keep saying you can't break users' workflows. But that doesn't jibe with reality. In B2B, the user isn't the customer, and B2B businesses break users' workflows all the time. People complain constantly about how often AWS changes the console UI, and you hear the same gripes from users of consumer software. How many people cancel their SaaS contracts over a change in UI if the features remain?

Photoshop users complained (or did when I followed it closely) every time Adobe broke their AppleScript automations. They kept buying it.

But the point is that you specifically said you can't treat a system as a set of black boxes with well-defined interfaces. You'd damn sure better believe that's how I built every implementation I started from scratch with a team at product companies. It's the only way to keep a system manageable as people ramp up.

And this is also part of the subject of Stevey's "Platform Rant":

https://gist.github.com/chitchcock/1281611

It’s the reason you can’t fathom that you don’t have to worry about spooky action at a distance when you enforce modularity at the system level.

And even for customers, Apple has a long history of breaking backward compatibility, and while Microsoft worships at the altar of backward compatibility, major versions of Office have been breaking muscle-memory UI for users since the 80s.

If an end user's workflow depends on mucking with the backend database (more of an issue with desktop software) or on an undocumented feature, it's the same thing.

Developers have been doing that for years - changing the UI.


You seem to have had a very specific career that consisted mostly of building something new and moving on before you had any idea how it held up long term. I've heard enough to be pretty confident that despite a 30-year career you don't actually have much experience in anything other than greenfield projects. This explains the weird overconfidence you have in a methodology with absolutely no track record.

There's a difference between breaking some user's workflow every now and again and doing it every time you add a feature or fix a bug.


I would like to introduce you to the concepts of interfaces and memory safety.

Well-designed interfaces enforce decoupling where it matters most. And believe it or not, you can do review passes after an LLM writes code, to catch bugs, security issues, bad architecture, reduce complexity, etc.


If I'm understanding you, it seems like you're struck by hindsight bias. No one knew the miasma theory was wrong... it could have been right! Only with hindsight can we say it was wrong. Seems like we're in the same situation with LLMs and AGI.

The miasma theory of disease was "not even wrong" in the sense that it was formulated before we even had the modern scientific method to define the criteria for a theory in the first place. And it was sort of accidentally correct in that some non-infectious diseases are caused by airborne toxins.

Plenty of scientific authorities believed in it through the 19th century, and they didn't blindly believe it: it had good arguments for it, and intelligent people weighed the pros and cons of it and often ended up on the side of miasma over contagionism. William Farr was no idiot, and he had sophisticated statistical arguments for it. And, as evidence that it was a scientific theory, it was abandoned by its proponents once contagionism had more evidence on its side.

It's only with hindsight that we think contagionism is obviously correct.


> It's only with hindsight that we think contagionism is obviously correct.

We, the mere median citizens on any topic outside our expertise, certainly can't judge it any other way. And that also has an impact, as social pressure, on which theory gets more credit.

That's not actually specific to science. Even theological arguments can be dumb as hell or super refined by the smartest people able to thrive in their society of the time.

Correctness of a theory, and how well it matches collected data, is only part of what drives its mass adoption, and not necessarily the most heavily weighted part. It's interdependence with feedback loops everywhere, so even the data collected, the tools used to collect and analyze them, and the metatheoretical frameworks used to evaluate competing models are nothing like absolute objective givens.


> Only with hindsight can we say it was wrong

It really depends on what you mean by 'we'. Laymen? Maybe. But people said it was wrong at the time with perfectly good reasoning. It might not have been accessible to the average person, but that's hardly to say that only hindsight could reveal the correct answer.


I wonder if they also intend to have a role in providing evidence about who blows up the next grade schools, hospitals, and desalination plants

If this bugs you, open ChatGPT's personality settings, choose the "efficient" base style, and turn off the enthusiasm and warmth sliders.

It makes a tremendous difference. Almost everything on this list is the emotional fluff ChatGPT injects to simulate a personality.


> Expressed as redaction poetry, combining the WaPo and NYT articles might go something like this:

  Anthropic’s Claude  
  the most advanced AI. 
  quickly prioritizing targets. 
  issuing precise location coordinates 
  supporting massive military operations  
  killed at least 175 people, many of them students attending class.
Read the article to understand the context of this excerpt

I really enjoy redaction poetry:

    Anthropic's
    supporting
    at least 175 people, many of them students attending class
This is great.

In the past when Apple had really well designed software, this would’ve been such an exciting announcement.

Now I feel confident that it will be half integrated, poor user experience nonsense.


Just where my mind went while reading the abstract. Babies hate this sound!

To them, it's like a baby crying.

Now there is a perk that is going to get expensive, and boy is it going to suck when there's a downturn or a new facilities manager decides to cut back and stop offering them.

The real story is how we draw that line and what can be done to prevent these cases.

Because it's a new situation, and mentally ill people exist and will be using these tools. It could be a new avenue for intervention.


Place it under the jurisdiction of existing public speech requirements of a company selling communication - advertising.

Agreed it could have been prevented; I don't think Google should pay for it, though. Tragic, but not suit-worthy.

If I tell you to kill yourself and you go through with it, will I get into legal trouble or not?

There are definitely jurisdictions in the US (perhaps most or all of them) that have laws which say yes, inciting suicide is a crime.

There are ways to get around that: "Hey, go drive a Tesla on autopilot"

Why not?

Unless someone starts getting slapped with fines, they won't put any equivalent of seat belts in.


We can perhaps say this is a first-time thing, so give a small fine this time. But that should come with the promise that next time the fine will be much bigger, and bigger still, until Google stops doing this.
