I see this behavior all the time. When it can’t read a file using its read tool, it escalates to trying with bash. Often it tries to search the entire file system (“find / …”).
Also, there is an existing system of 7-digit postal codes that autopopulate Prefecture, City, District, leaving you to enter Area # (丁目), Block #, House #.
If they amend this to make it alphanumeric and then autopopulate these last three fields as well, it fits very neatly into the existing scheme.
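To make that concrete, here is a rough sketch of how such a lookup might work. Everything in it is made up for illustration: the table contents, the `A1B` suffix format, and the names `POSTAL_TABLE`, `SUFFIX_TABLE`, and `lookup` are my assumptions, not how any real postal database is laid out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AddressPrefix:
    prefecture: str
    city: str
    district: str


# Placeholder data, not real postal codes.
POSTAL_TABLE = {
    "1234567": AddressPrefix("Prefecture A", "City B", "District C"),
}

# Hypothetical extension: an alphanumeric suffix that resolves to
# (area, block, house) within the 7-digit code's district.
SUFFIX_TABLE = {
    ("1234567", "A1B"): (1, 2, 3),
}


def normalize(code: str) -> str:
    """Drop hyphens and surrounding whitespace, uppercase any letters."""
    return code.replace("-", "").strip().upper()


def lookup(code: str) -> dict:
    """Autopopulate address fields from a 7-digit code plus optional suffix."""
    raw = normalize(code)
    base, suffix = raw[:7], raw[7:]
    if len(base) != 7 or not base.isdigit():
        raise ValueError(f"expected 7 leading digits, got {code!r}")
    prefix = POSTAL_TABLE.get(base)
    if prefix is None:
        raise KeyError(f"unknown postal code {base}")

    area: Optional[int] = None
    block: Optional[int] = None
    house: Optional[int] = None
    # Only the hypothetical suffix fills in the last three fields;
    # without it they stay empty for manual entry, as today.
    if suffix and (base, suffix) in SUFFIX_TABLE:
        area, block, house = SUFFIX_TABLE[(base, suffix)]

    return {
        "prefecture": prefix.prefecture,
        "city": prefix.city,
        "district": prefix.district,
        "area": area,
        "block": block,
        "house": house,
    }
```

With a plain code like `lookup("123-4567")` the last three fields come back as `None`, same as the current behavior; with the made-up suffix, `lookup("123-4567-A1B")` fills them in. Old 7-digit codes keep working unchanged, which is the "fits neatly" part.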
Now this isn't a killer argument, but your examples are about readability and safety, respectively - the quality of the result. LLMs seem to be more about shoveling the same or worse crap faster.
I have seen other people's results. Code LLMs seem to do some annoying stuff faster than doing it by hand, and are sometimes able to improve prose in comments and such. But they also mess up when it gets moderately difficult, especially when there are "long distance" connections between pieces.
That, and the probably seductive (to some) ability to crank out working but repetitive or partially nonsensical code, is what I call shoveling crap faster.
I dunno what to tell you, I am able to get consistently good quality work out of e.g. 3.7 sonnet and it’s saved me a ton of time. Garbage in, garbage out; maybe the people you’ve observed don’t know how to write good prompts.
I guess I should play around with that one then. My general impression was that we're already in the diminishing returns part of the sigmoid curve (calendar time or coefficient array size vs quality) for LLMs, until there's maybe a change other than making them bigger.
I remember when Siri replaced the original voice recognition. “Call XYZ” etc. after a long press of the wired remote.
To this day, it’s less reliable. Also, Siri can’t text my wife, who has two phone numbers in my address book. Although there is just a single iMessage chat, Siri always asks “Which one?” (without offering options!!!)
The old farts in Apple management need to go and pronto.
The ossification of the entire management layer is really, really frightening and a terrible sign for a company.
The worst part is when I ask Siri to call or text my partner, and then the phone goes ahead and calls a different person with a vaguely similar-sounding name that I have had no contact with in a decade. Why, Siri?
Or when I ask for directions to a hardware store, and I get a result 300km away across a national border...
The question for me is whether the web still has enough distributed creative spirit left, so that individuals still feel a need to create.
I.e., instead of feeding my reviews to the rent-seeking giants, I can just publish them on my own and they’ll be helpfully ingested and made useful by chattyG.