There's a common conversation that goes on around AI: some people swear it's a complete waste of time and a total boondoggle, some that it's a good tool when used correctly, and others that it's the future and nothing else matters.
I see the same thing happen with Kubernetes. I've run clusters of various sizes for about half a decade now. I've never once had an incident that wasn't caused by the product itself. I recall one particular incident where we had a complete outage for about an hour. The people predisposed to hating Kubernetes did everything they could to blame it all on that "shitty k8s system." It turned out the service in question had simply DOS'd itself by opening up tens of thousands of ports in a matter of seconds when a particular scenario occurred.
I'm in neither the "k8s is the future" camp nor the "k8s is total trash" camp. It's a good system for when you genuinely need it. I've never understood the other two sides of the equation.
The complaints I see about Kubernetes typically boil down to one of two things: (a) this looks complex to learn, and I don't have a need for it - existing deployment patterns solve my use case, or (b) Kubernetes is much less efficient than running software directly on bare-metal (in energy or cost).
Which is an interesting perspective, considering I've led a platform based on Kubernetes running on company-owned bare-metal. I was actually hired because developers were basically revolting against leaving the cloud: all the "niceties" cloud providers add (in exchange for that hefty cloud tax) essentially go away on bare-metal. The existing DevOps team was baffled as to why the developers weren't happy being handed a plain Ubuntu VM and told to deploy their stack on it.
By the time I left, the developers didn't really know anything about how the underlying infrastructure worked. They wrote their Dockerfiles, a tiny little file to declare their deployment needs, and then they opened a platform webpage to watch the full lifecycle.
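That "tiny little file to declare their deployment needs" was presumably a declarative manifest of some kind. A minimal sketch of what such a Kubernetes Deployment could look like (the service name, image, and port here are illustrative, not from the original post):

```yaml
# Hypothetical minimal Deployment manifest; names, image, and
# port are made up for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing-api
spec:
  replicas: 3            # run three identical pods
  selector:
    matchLabels:
      app: billing-api
  template:
    metadata:
      labels:
        app: billing-api
    spec:
      containers:
        - name: billing-api
          image: registry.example.com/billing-api:1.4.2
          ports:
            - containerPort: 8080
          readinessProbe:   # platform only routes traffic once this passes
            httpGet:
              path: /healthz
              port: 8080
```

A developer writing this doesn't need to know which node the pods land on or how the load balancer is wired, which is the point being made above.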
If you're a single service shop, then yeah, put Docker Compose on it and run an Ansible playbook via GitHub Actions. Done. But for a larger org moving off cloud to bare-metal, I really couldn't see not having k8s there to help buffer some of the pain.
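For the single-service case, that paved path could be as small as one Compose file pushed out by a playbook. A sketch, with all names and images made up:

```yaml
# docker-compose.yml - illustrative single-service stack, copied to a
# VM and (re)started by an Ansible playbook run from GitHub Actions.
services:
  app:
    image: registry.example.com/app:1.0.0
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    restart: unless-stopped
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:
```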
For many shops, even Docker Compose is unnecessary. It's still possible to deploy software directly onto a VM or LXC container.
I agree that Kubernetes can help simplify the deployment model for large organizations with a mature DevOps team. It is also a model that many organizations share, and so you can hire for talent already familiar with it. But it's not the only viable deployment model, and it's very possible to build a deployment system that behaves similarly without bringing in Kubernetes. Yes, including automatic preview deployments. This doesn't mean I'm provided a VM and told to figure it out. There are still paved-path deployment patterns.
As a developer, I do need to understand the environment my code runs in, whether it is bare-metal, Kubernetes, Docker Swarm, or a single-node Docker host. It impacts how config is deployed and how services communicate with each other. The fact that developers wrote Dockerfiles is proof that they needed to understand the environment. This is purely a tradeoff (abstracting away one system, but now you need to learn a new one).
It can be inefficient because controllers (typically ~40 per cluster) can maintain big caches of resource metadata, and kubelet and kube-proxy usually run pretty tight while-loops. But such things can be tuned, and I don't really consider those issues. The main issue I've actually encountered is that etcd doesn't scale.
Yeah, if someone says that k8s is costing them energy, they are either using it very, very incorrectly, or they just don't know what they are talking about.
Running a Kubernetes deployment requires running many additional orchestration services that bare-metal deployments (whether running on-prem or in the cloud) do not.
There also seems to be confusion about what I meant by "bare-metal." I wasn't intending to refer to the server ownership model, but rather the deployment model where you deploy software directly onto an operating system.
Seems like this can be applied to an increasingly large pool of subjects, where things are polarized by default and holding a moderate or indifferent opinion is unusual. For example, I thought of US politics while reading your comment.
Good insight. It's always easy to blame that which you don't understand. I know nothing about k8s, and my eyes kinda glaze over when our staff engineer talks about pods and clusters. But it works for our team, even if not everyone understands it.
When all you have is a hammer, every problem starts to look like a nail. And the people with axes are wondering how (or indeed even why) so many people are trying to chop wood with a hammer. Further, some axewielders are wondering why they are losing their jobs to people with hammers when an axe is the right tool for the job. Easy to hate the hammer in this case.
Yeah, I would attribute that to tribalism. There's an intense amount of dogma in the Kubernetes community, likely stemming from the billions of dollars that get fed into the ecosystem by Big Tech. I genuinely think people adopt it as part of their identity and then become hostile to anyone who "doesn't understand the excellence of Kubernetes." I only say this because I've had many lunchtime conversations with random strangers at the various KubeCon conferences I've attended - and let's just say some were pretty eye-opening.
I would also say that a lot of people, even people who are professional k8s operators, don't understand enough of the "theory" behind it. The "why and how", to put it shortly.
And the end result is often that you have two tribes with totally incorrect ideas of even what tools they themselves are using and how, as if someone had swapped in an intentionally wrong dictionary, like in a Monty Python sketch.
At the end of the day it's all different levels of abstractions and whether or not you're using the abstraction correctly. With k8s, the best practices are mostly set in a lot of use cases. For LLMs, we still have no idea what the best practices are.
That part was really surprising to me because for the kind of compute lake he’s talking about building, k8s seems like a pretty good fit for the layer that sits just above it.
We run k8s with several VMs in a couple different cloud providers. I’d love it if I could forget about the VMs entirely.
Is there a simpler thing than k8s that gets you all that? Probably. But if you don’t use k8s, aren’t you doomed to reimplement half of it?
Like these things:
- Service discovery or ingress/routing (“what port was the auth service deployed on again?”)
- Declarative configuration across the board, including for scale-out
- Each service gets its own service account for interacting with external systems
- Blue/green deployments, readiness checks, health checks
- Strong auditing of what was deployed and mutated, when, and by whom
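A couple of those come nearly for free from stock manifests; for example, service discovery plus health checks might look like this sketch (all names and images are made up):

```yaml
# Illustrative Service: in-cluster DNS answers "what port was the
# auth service deployed on?" with a stable name instead of a lookup.
apiVersion: v1
kind: Service
metadata:
  name: auth
spec:
  selector:
    app: auth
  ports:
    - port: 80          # stable port clients use
      targetPort: 8080  # whatever port the container actually binds
---
# Readiness/liveness probes drive health checking and gate rollouts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 2
  selector:
    matchLabels:
      app: auth
  template:
    metadata:
      labels:
        app: auth
    spec:
      serviceAccountName: auth   # per-service identity
      containers:
        - name: auth
          image: registry.example.com/auth:2.0.1
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
          livenessProbe:
            httpGet:
              path: /alive
              port: 8080
```

Other services then just call `http://auth` in-cluster; nobody tracks ports by hand. Rebuilding even this much on plain VMs means standing up your own DNS or registry plus a health-checking supervisor.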