For "Trust", I believe patio11 described this system as the "Taxi Medallion of Email".
e.g. you spend a lot of money to show that you are a legitimate entity or you pay less money to rent something that shows you are connected to said entity.
I'm a DevOps/SRE and I've spent the past couple weeks trying to vibecode as much of what I do as possible.
In some ways, it's magical. e.g. I whipped up a web-based tool for analyzing performance statistics of a blockchain. Claude was able to do everything from building the GUI, optimizing the queries, adding new indices to the database etc. I broke it down into small prompts so that I kept it on track and it didn't veer off course. 90% of this I could have done myself but Claude took hours where it would have taken me days or even weeks.
Then yesterday I wanted to do a quick audit of our infra using Ansible. I first thought: let's try Claude again. I gave it lots of hints on where our inventory is, which ports matter etc but it was still grinding away after several minutes. I eventually Ctrl-C'ed and used a couple of one-liners that I wrote myself in a few minutes. In other words, I was faster than the machine in this case.
After the above, it makes sense to me that people may have conflicting feelings about productivity. e.g. sometimes it's amazing, sometimes it does the wrong thing.
I think there's an argument where if Claude had the knowledge map of your personal one liners and a tool for using them, it would often do the right thing in those cases. But it's definitely not as able to compress all the entropy of 'what can go wrong' operations wise as it is when composing code yet.
My experience is that with careful specs, Claude or Codex can whip up either CDK, CloudFormation, or Terraform code much quicker than I can and I’ve been using IaC for 8 years - developer/consultant specializing in development + cloud architecture
So one thing that hasn't changed is that the marginal cost of software is still effectively zero. That's where most of the money was being made b/c if you were a monopoly or oligopoly, each additional unit sold was an absolute increase in revenue and you spread out your fixed costs.
What has changed most dramatically is the "fixed" cost of writing the software to begin with. Given that the costs were being spread out over so many units beforehand, it's not entirely clear to me how that changes a lot of the economics.
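To put rough numbers on that, here is a minimal sketch. Every figure in it is made up for illustration (the function name, the $10M and $1M build costs, and the unit counts are all assumptions, not data from any real company). It just computes average cost per unit as fixed cost divided by units sold, plus marginal cost.

```python
def average_cost_per_unit(fixed_cost: float, marginal_cost: float, units: int) -> float:
    """Average cost per unit when a one-time fixed cost is spread over `units` sales."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + marginal_cost


# Hypothetical inputs: marginal cost of software treated as zero,
# and the fixed cost of writing it dropping 10x.
MARGINAL_COST = 0.0
for fixed_cost in (10_000_000, 1_000_000):
    for units in (1_000, 1_000_000):
        cost = average_cost_per_unit(fixed_cost, MARGINAL_COST, units)
        print(f"fixed=${fixed_cost:,} units={units:,} -> ${cost:,.2f} per unit")
```

Under these assumed numbers, a 10x cheaper build barely moves the per-unit cost for a vendor selling a million units, but it matters a lot at a thousand units, which is one way to read why the economics at scale are not obviously different.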
For the comments about the "SaaS vs build your own", we can use a home services metaphor. Sure, I can do a lot of what my plumber does. But they do it faster, know all of the issues that go wrong with the work and I can pay them a yearly fee to check my boiler to make sure it doesn't fail etc. The time saved by calling the plumber can then be spent with kids, more work or a combo of the two.
> Then ran interviews with the students about their project work, asking them to explain how it works etc. Took a lot of time with a class of 60 students, but worked pretty well, plus they got some experience developing the important skill of communicating technical ideas.
This is amazing and I wish professors had done this back when I did CS in the late 1990s.
My mom, who is from Italy, has some great lines about the Mafia:
"Italy will never go bankrupt b/c we have the Pope AND the Mafia"
I once asked her how the Mafia was reined in and she mentioned:
"The Mafia was once trying to kill some judge or politician and they blew up several hundred meters of highway to do it. They also killed a lot of innocent people and the outcry was so big that the Carabinieri (Italian FBI) got involved."
Carabinieri have been involved with (and occasionally fighting) the mafia since the late 1800s. That's got nothing to do with how we got to the current situation of relative tranquility.
What happened between the end of the 1980s and the 1990s was that, because of continuous feuds among mafiosi that produced too many civilian victims, political connections broke down, particularly with a few especially vicious bosses. Laws were passed to isolate the worst offenders, new connections were brokered with more moderate mafia leaders, and eventually the "bad" bosses were magically found, hiding more or less in plain sight.
That episode, the murder of Judge Falcone, was a bit of a last straw in the way most Italian political parties had dealt with the mafia till that point. They realized the issue couldn't be contained to the Sicilian cultural and political environment and they couldn't be... that complacent (they still are, but at least they try to save face when they're found out).
Long story short, every political authority at the time was pretty much aware the murder was going to happen, they just didn't expect a terrorist-like approach.
Once we got to that point, a newish department, the DIA[1], was given full authority to handle the issue... again, for a time. Then it got swallowed up too in the neverending whirlpool of shit that is Italian politics.
In the meantime, the Mafia got smarter, and rather than going for a full frontal attack on the authorities, they became much more... diplomatic, offering indirect support through some proxies to some new political figures that emerged shortly after. You probably heard about that Berlusconi guy.
Carabinieri are actually a military-status police force in Italy, which is a different setup from the FBI in the US.
Calling them the Italian FBI is ironically quite funny, because in Italy they’re the butt of a lot of jokes - "carabiniere" is a common stand-in for "someone dumb".
One of my kids recently had a no-contact knee injury while playing basketball. He immediately started limping and crying and I had to carry him from the court to the car.
I did some searching with Grok and I found out:
- no contact injuries are troubling b/c it generally means they pulled something
- kids don't generally tear an ACL (or other ligament)
- it's actually way more common for the ligament to pull the anchor point off of the bigger bone b/c kid bones are soft
I asked it to differentially diagnose the issue with the details of: can't hold weight, little to no swelling and some pain.
It was adamant, ADAMANT, that this was a classic case of bone being pulled off by the ligament and that it would require surgery. It even pointed out the no swelling could be due to a very small tear etc. It gave me a 90% chance of surgery too.
I followed up by asking what test would definitely prove it one way or the other and it mentioned getting an X-Ray.
We go off to the urgent care, son is already kind of hobbling around. Doctor says he seems fine, I push for an X-Ray and turns out no issue: he probably just pulled something. He was fully healed in 2-3 days.
As someone who has done a lot of differential diagnosing/troubleshooting of big systems (FinTech SRE) I find it interesting that it was basically correct in what could have happened but couldn't go the "final mile" to establish it correctly. Once we start hooking up X-Rays to Claude/Grok 4.2 etc equivalent LLMs, it will be even more interesting to see where this goes.
+1. I wish Gemini 2.5/3 Pro's "personality" and long context handling wasn't so erratic, because the medical stuff in there is great. Whatever they did to produce the MedGemma models is clearly built on a strong baseline. I haven't had need to try using MedGemma on x-ray imagery, but I'd be curious to hear results — imagery diagnostics is part of what it's built for.
Opus 4.5 seems good too, though getting dumber. OpenAI's fine tuning is clearly built to toe the professional medical advice line, which can be good and bad.
I like this post about a chat bot being 100% completely, confidently, adamantly wrong that characterizes it as being “basically right” about something that was untrue and did not happen.
It is like getting phished and then pointing out that the scammer was basically right about being Coinbase support aside from the fact that they did not work there
I remember reading that pine trees give off a chemical that is a natural human bronchodilator.
One thought of why people love hiking, especially in piney woods, is that the chemical allows humans to process more oxygen which in turn helps them feel more "energized".
I point this out for two reasons:
1. It's a fascinating bit of trivia
2. It highlights that there are MANY confounding variables so it will always be tough to figure out the isolated impact.
You sometimes hear people say "I mean, we can't just give an AI a bunch of money/important decisions and expect it to do ok" but this is already happening and has been for years.
Examples:
- Algorithmic trading: I once embedded on an Options trading desk. The head of desk mentioned that he didn't really know what the PnL was during trading hours b/c the swings were so big that only the computer algos knew if the decisions were correct.
- Autopilot: planes can now land themselves to an accuracy that is so precise that the front landing gear wheels "thud" as they go over the runway center markers.
and this has been true for at least 10 years.
In other words, if the above is possible then we are not far off from some kind of "expert system" that runs a business unit (which may be all robots or a mix of robots and people).
This is a piece of science fiction and has its own (inaccurate, IMO) view on how minimum wage McDonald's employees would react to a robot manager. Extrapolating this to real life is naive at best.
But those things were considered on the same level as current LLMs in the sense of "well, a computer might do part of my job but not ALL of it".
No, algorithmic trading didn't replace everything a trader did but it most certainly replaced large parts of the workload and made it much faster and horizontally scalable.
The problem here is that you are cherry picking examples of successful technology.
The inverse would be to list off Theranos, Google Stadia, and other failed tech and claim that people said that there were massive steps that subsequently didn't materialise. In fact a lot of times it was mostly fabricated by people with stuff to gain from ripping off VCs.
Look at how bad it is with Microsoft in Windows despite their "all in on AI".
Ultimately no one really knows how it will pan out, and if we will end up with an Enron or an Apple. Or even if it's a combination of a successful tech that ultimately is mishandled by corporations and fails, or a limited tech that regardless captures the imagination through pop culture and takes over.
The two key differences to me are infrastructure and specificity of purpose.
Autoland in planes requires a set of expensive, complex, and highly fine-tuned equipment to be installed on every runway that supports it (which, as a proportion of runways worldwide, is not a majority of them).
And as to specificity, this system does exactly one thing - land a specific model of plane on a specific runway equipped with instrumentation configured a specific way.
The point being: it isn’t a magic wand. Any serious conversation of AI in these types of life or death situations has to recognize that without the corresponding investment in infrastructure and specificity of purpose, things like this blog post are essentially just science fiction. The fact that previous generations of technology considered autoland and algorithmic trading to be magic doesn’t really change anything about that.
"Expert system" running a company is never going to happen unless shareholders are okay with no accountability from the company. You'll always need someone to blame in case things go wrong. You could have an executive using such an "expert system" for literally all their decisions, but it has to be a human being signing off on those decisions. There is no way to prosecute code and unless these expert systems can become sentient or appear in court, best of luck trying to let it run a company in the real sense of actually making those decisions with full autonomy and responsibility.
I'm saying there's something structurally different from autonomous systems generally and from an LLM corpus which has all of the information in one place and is, at least in theory, extractable by one user.