There is a lot of "my" floating around in this article. I always love getting peeks into experiences with this sort of thing, but I think the "mys" highlight something I see every day. These agents are really great at bespoke personal flows that build up a TON of tribal knowledge about how things get done, provided there is any consistency to those flows at all. Doing this in larger theaters is much more difficult because tribal knowledge is death for larger teams. It drives up the cost of everything, which is why individuals or extremely new small teams feel so much more productive. Everything is new there and consistency doesn't matter yet.
But today people can just vibe code their own sudo "with blackjack and hookers!"
/s
Really though, it is remarkable just how high we've built this towering house of cards on the selfless works of individuals. The geek in me immediately begins meditating on OSS funding mechanisms I've seen in the past, and what might work today. Then I remember that I don't believe it can work, but hope desperately that people like Todd can keep paying rent and continue getting some satisfaction from the efforts.
Pure anecdote. Over the last year I've taken the opportunity to compare app development in Swift (+ SwiftUI and SwiftData) for iOS with React Native via Expo. I used Cursor with both OpenAI and Anthropic models. The difference was stark. With Swift the pace of development was painfully slow, with confused outputs and frequent hallucinations. With React and Expo the AI was able to generate from the first few short prompts what it took me a month to produce with Swift. AI in development is all about force multipliers, speed of delivery, and driving down cost per product iteration. IMO, there is absolutely no reason to choose languages, frameworks, or ecosystems with weaker open corpora.
I always like finding people advocating for older sage knowledge and bringing it forward for new audiences. That said, as someone who wrote a book about Docker and has lived the full container journey, I tend to skip the containerized build altogether. Docker makes for great packaging. But containerizing every step of the build process, or even just doing it in one big container, is a bit extra. Positioning it as a build scripting solution was silly.
I’m inclined to agree with you about not building in containers. That said, I find myself going around in circles. We have an app that uses a specific toolchain version; how do we install that version on a build machine without requiring an SRE ticket every time the toolchain needs updating?
Containers nicely solve this problem. Then your builds get a little slow, so you want to cache things, and now your Dockerfile looks something like the sketch below. You want to run some tests - now it's even more complicated. How do you debug those tests? How do those tests communicate with external systems (database/redis)? Eventually you end up back at "let's just containerise the packaging".
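For reference, that Dockerfile ends up roughly like this hedged sketch - Go stands in for whatever toolchain gets pinned, and every name and path here is made up:

    # syntax=docker/dockerfile:1
    # Pin the toolchain in the image tag (or a digest) instead of on the build machine.
    FROM golang:1.22-bookworm AS build
    WORKDIR /src

    # Dependency layer: only rebuilt when go.mod/go.sum change.
    COPY go.mod go.sum ./
    RUN --mount=type=cache,target=/go/pkg/mod go mod download

    # Build with BuildKit cache mounts so module/compile caches survive across builds.
    COPY . .
    RUN --mount=type=cache,target=/go/pkg/mod \
        --mount=type=cache,target=/root/.cache/go-build \
        mkdir -p /out && CGO_ENABLED=0 go build -o /out/app ./cmd/app

    # Test stage, so CI can run: docker build --target test .
    FROM build AS test
    RUN --mount=type=cache,target=/go/pkg/mod \
        --mount=type=cache,target=/root/.cache/go-build \
        go test ./...

    # Runtime image only gets the artifact.
    FROM gcr.io/distroless/static-debian12
    COPY --from=build /out/app /app
    ENTRYPOINT ["/app"]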
Thanks - this is an interesting idea I had never considered. I do like the layer-based caching of Dockerfiles, which you give up entirely with this approach, but it allows for things like running containerised builds against cached SCM checkouts (our repository is 300GB…)
The benefit of this approach is it's a lot easier to make sure dependencies end up on the build node so you aren't redownloading and caching the same dependency for multiple artifacts. But then you don't get to take advantage of docker build caching to speed up things when something doesn't change.
That's the part about docker I don't love. I get why it's this way, but I wish there were a better way to have it reuse files between images. The best you can do is a cache mount, but that can run into size issues as time goes on, which is annoying.
Depending on how the container is structured, you could have the original container as a baseline default, and then have "enhanced" containers that use it as a base and overlay the caching and other errata to serve that specialized need.
I’ve tried this in the past, but it pushes the dependency management of the layers into whatever is orchestrating the container build, as opposed to multi-stage builds, which will parallelise!
Not dismissing, but it’s just caveats every which way. I think in an ideal world I just want Bazel or Nix without the baggage that comes with them - docker comes so close but yet falls so short of the finish line.
I quite strongly disagree; a Dockerfile is a fairly good way to describe builds and a uniform approach across ecosystems, and its self-contained nature is especially useful for building software without cluttering the host with build dependencies or clashing with other things you want to build. I like it so much that I've started building binaries in docker even for programs that will actually run on the host!
It can indeed be uniform across ecosystems, but it's slow. There's a very serious difference between being on a team where CI takes ~1 minute to run, vs. being on a team where CI takes a half hour or even, gasp, longer. A large part of that is the testing story, sure, but when you're really trying to optimize CI times, then every second counts.
If the difference is <1 minute vs >30 minutes, containers (per se) are not the problem. If I were guessing blindly, it sounds like you're not caching/reusing layers, effectively throwing out a super easy way to cache intermediate artifacts and trashing performance for no good reason. And in fact, this is also a place where I think docker - when used correctly - is quite good, because if you (re)use layers sensibly it's trivial to get build caching without having to figure out a per-(language|build system|project) caching system (see the sketch below).
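To make "sensibly" concrete, a rough Node-flavoured sketch (names and scripts are illustrative, not from this thread): copy only the dependency manifests first, so the install layer is reused until they actually change.

    FROM node:20-bookworm
    WORKDIR /app
    # Dependency layer: only invalidated when the manifests change.
    COPY package.json package-lock.json ./
    RUN npm ci
    # Source layer: code edits reuse the cached dependency layer above.
    COPY . .
    RUN npm run build && npm test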
I'm exaggerating somewhat. But I'm familiar with Docker's multi-stage builds and how to attempt to optimize cache layers. The first problem that you run into, with ephemeral runners, is where the Docker cache is supposed to be downloaded from, and it's often not faster at all compared to re-downloading artifacts (network calls are network calls, and files are files after all). This is fundamentally different from per-language caching systems, where libraries are known to be a dumb mirror of upstream and are often hash-addressed by modern packaging, so they're safe to share between builds. That means it's safe to keep them on the CI runner rather than being forced to download the cache before every build.
> without having to figure out a per-language caching system
But most companies, even large ones, tend to standardize on no more than a handful of languages. Typescript, Python, Go, Java... I don't need something that'll handle caching for PHP or Erlang or Nix (not that you can really work easily with Nix inside a container...) or OCaml or Haskell... Yeah I do think there's a lot of room for companies to say, this is the standardized supported stack, and we put in some time to optimize the shit out of it because the DX dividends are incredible.
I really don't see how that's different at all, certainly not fundamentally. You can download flat files over the network, and you can download OCI image layers over the network. I'm pretty sure those image layers are hash-addressed and safe to share between builds, too, and you should make every effort to keep them on the CI runner and reuse them.
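For ephemeral runners specifically, BuildKit can use a registry as the layer cache, which makes the "download it like any other file" point concrete; a hedged sketch, with registry and tag names made up:

    # Pull matching layers from the registry cache; push the updated cache after building.
    docker buildx build \
      --cache-from=type=registry,ref=registry.example.com/app:buildcache \
      --cache-to=type=registry,ref=registry.example.com/app:buildcache,mode=max \
      -t registry.example.com/app:ci \
      .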
You can have fast pipelines in containers - I’ve worked in quick containerised build environments and agonisingly slow non-containerised places, the difference is whether anyone actually cares and if there’s a culture of paying attention to this stuff.
Agree. Using a container to build the source that is then packaged as a "binary" in the resulting container always seemed odd to me. imho we should have stuck with the old ways: build the product on a regular computer. That outputs some build artifacts (binaries, libraries, etc). Docker should take those artifacts and not be hosting the compiler and whatnot.
If anything the build being in a container is the more valuable bit, though mainly because the container is usually more repeatable by having a scripted setup itself. Though I dunno why the build and the host would be the _same_ container in the end.
(and of course, nix kinda blows both out of the water for consistency)
Agree, and I would go another step to suggest dropping Docker altogether for building the final container image. It's quite sad that Docker requires root to run, and all the other rootless solutions seem to require overcomplicated setups. Rootless is important because, unless you're providing CI as a public service and you're really concerned about malicious attackers, you will get way, way, way more value out of semi-permanent CI workers that can maintain persistent local caches compared to the overhead of VM enforced isolation. You just need an easy way to wipe the caches remotely, and a best-effort at otherwise isolating CI builds.
A lot of teams should think long and hard about just taking build artifacts, throwing them into their expected places in a directory that stands in for the chroot, generating a manifest JSON, and wrapping everything in a tar, which is indeed a container.
I like to build my stuff inside of Docker because it is my moat against changes in the environment.
We have our base images, and in there we install dependencies by version (as apt seemingly doesn't have any lock file support?). That image then is the base for our code build.
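Concretely, "by version" means pinning at install time, which is the closest thing to a lock file apt gives you; a rough sketch (the version strings below are placeholders, not our real ones):

    FROM debian:bookworm-slim
    # Pin exact package versions; the build fails loudly if a pin disappears from the mirror.
    RUN apt-get update && apt-get install -y --no-install-recommends \
          curl=7.88.1-10+deb12u12 \
          ca-certificates=20230311 \
        && rm -rf /var/lib/apt/lists/*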
In the subsequent build EVERYTHING is versioned, which allows us to establish provenance all the way up to the base image.
And next to that when we promote images from PR -> main we don't even rebuild the code. It's the same image that gets retagged. All in the name of preserving provenance.
You can still use a base image; you download it from the registry, extract the tar, then add your build artifacts before re-generating a manifest and re-tarring. If you specify a base image digest, you can also use a hash-addressed cache and share it between builds safely without re-downloading.
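If you'd rather not hand-roll the manifest step, tools like crane (from go-containerregistry) do the append-and-push without a daemon or root; a rough sketch, assuming crane is installed and with made-up names and digest:

    # Package the build output as a single layer tarball.
    tar -C ./out -cf app-layer.tar .

    # Append it to a digest-pinned base and push; no Docker daemon involved.
    crane append \
      --base registry.example.com/base@sha256:<digest> \
      --new_layer app-layer.tar \
      --new_tag registry.example.com/app:1.2.3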
Once you have your container image, how you decide to promote it is a piece of cake: skopeo doesn't require root and often doesn't require re-pulling the full tar. Containerization is great; I'm specifically trying to point out that there are alternatives to Docker.
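e.g. promotion is just a registry-to-registry copy (names made up):

    # Retag/promote without pulling the image locally or rebuilding anything.
    skopeo copy \
      docker://registry.example.com/app:pr-123 \
      docker://registry.example.com/app:main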
I mean personally I find nspawn to be a pretty simple way of doing rootless containers.
Replace the manifest JSON with a systemd service file and you've got a rootless container that can run on most Linux systems without any non-systemd dependencies or strange configuration required. Don't even need to extract the tarball.
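Roughly the shape I mean (every name and path here is hypothetical; RootDirectory= wants an unpacked tree, while RootImage= can point at a disk image instead):

    # /etc/systemd/system/myapp.service (hypothetical)
    [Unit]
    Description=myapp run straight from an image rootfs

    [Service]
    # Chroot-style confinement using only systemd directives.
    RootDirectory=/var/lib/myapp/rootfs
    ExecStart=/usr/bin/myapp
    # Optional hardening; systemd has plenty more knobs (DynamicUser=, PrivateNetwork=, ...).
    PrivateTmp=yes
    NoNewPrivileges=yes

    [Install]
    WantedBy=multi-user.target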
Google created two kinds of value: content discovery via connection (value to the consumer), and market reachability for advertisers. Oh, and also the world's most inconvenient spell check.
AI proposes to solve: a content supply side problem which does not exist, and an analysis problem which also only maybe exists. Really what it does in the best of cases (assuming everything actually works) is drive the cost to produce content to zero, make discovery less trustworthy, make the discovery problem worse, and launder IP. In the best case it is a net negative economic force.
All that said, I believe the original comment is about the fact that the economy exists to serve market participants and AI is not a market participant. It can act as a proxy, but it doesn't buy or sell things in the economic sense. Through that lens, also in the best case the technology erodes demand by reducing economic power of the consumer.
That said, I'm stoked to hear about the next AI web site generator or spam email campaign manager. Let's set up an SPV to get it backed off-balance sheet.
It is such a pure thing when an engineer looks at the world and is surprised, frustrated, or disappointed at behavior at scale. This is a finance game, which in itself is a storytelling / belief-based system. It might seem like math, but when you're playing on the growth edges, valuation really is about the story you tell and the character of the players. That's only worse when people stop caring about cashflows or only expect them to happen "in the future", because that makes it someone else's problem.
It's generally difficult to do. The problem is you have no idea when the collapse in value will happen, or even if it will.
A lot of the companies I'd have bet against in the past, like AOL, sold for huge sums of money, and the purchasing company ended up regretting their decision. The actual AOL stock never collapsed.
As for the GME thing, the only reason why I sort of give it a pass is because it was sort of an unprecedented thing. I am not sure if regulations have been updated to address a future similar incident.
At least it resulted in the "This Is Financial Advice" video from Folding Ideas.
Fascinating watch after following the event back in the day - and losing €1500 because I didn't reach my goal of earning €500 to buy a PS5 with the profit. If shit had gone up for just one more day I would have reached my goal.
I happened to "find" a very old IRA I had from a prior employer that had about $1200 sitting in it. I threw it all into GME. I pulled $500 in profits, and left the initial investment to ride.
Today, I'm down about $300 on those shares (taken with the $500 in gains, I'm technically still up by $200), and that's fine. I believe in the leadership, I like the company's current state (flush with cash, little/no debt) and I'm just going to keep letting it ride.
When I retire in 10 years or so, we'll see where it's at. Worst case, I'm out $700 bucks. Best case, I get that new riding lawnmower, for free!
Otherwise, it's Index funds, have a nice day, because none of us can compete with Wall Street.
Often the difficult thing isn't predicting "this bubble will collapse eventually"; it's predicting the _date_ of the collapse. You really need both, to short.
Nobody is buying them today. But these shaky clumsy versions didn't exist even a few years ago. The hype promises these things tomorrow, which is obvious BS. But the better they look today the more investment will be poured into their R&D which accelerates real improvement, which accelerates investment, etc.
Generalist robotics are all about minimizing or at least front loading some portion of retooling cost, minimizing overhead associated with safety and compliance, and being able to capitalize what would have otherwise been human opex. Those pressures aren't going anywhere.
The most difficult part is managing the delivered/processed state and ordered delivery. Consistent ordering of receipt into a distributed buffer is a great challenge. Most stacks do that pretty well. But deciding when a message has been processed, and when you can safely decide not to deliver it again, is especially challenging in a distributed environment.
That is sort of danced around a bit in this article where the author is talking about dropped messages, etc. It is tempting to say "use a stream server" but ultimately stream servers make head-of-line accounting the consumer's responsibility. That's usually solved with some kind of (not distributed) lock.