My company uses both Outlook and Slack. Teams is also used for scheduled meetings but never touched for chat. I personally don't find Teams to be significantly worse than Zoom, but I'd rather never use either.
I'd also strongly recommend this view of how Kubernetes uses cgroups, showing similar drill-downs into how everything gets managed. A lovely view of what's really happening! https://martinheinz.dev/blog/91
I've been a bit exasperated in the past that cgroups seemed not super helpful in Kubernetes, but this really showed me how the different Kubernetes QoS levels are driven by similar juggling of different cgroups.
There's a big difference between supporting food security and subsidising otherwise unviable land usage and farming practices. In the UK, there are subsidies for upland sheep farming, which produces a negligible amount of food at high cost (monetary and environmental) and next to no return for the farmers even after the subsidy.
Re. green subsidies: those are better characterised as investment in the technology of the future. You might also like to compare them with the subsidies given to the fossil fuel sector.
• `panic = "abort"` means that any panic terminates the program. This is not always desirable because you may want to catch and recover from panics, particularly in long-running servers.
• `strip = true` means that anything depending on DWARF no longer works: backtraces break, and so does unwinding (which is disastrous if you haven't set `panic` as above). The actual proposal has `strip = "debuginfo"` instead, so unwinding will work while backtraces won't.
• `codegen-units = 1` sets the number of parallel code generation units (CGUs) in the LLVM codegen phase. A single CGU significantly increases compilation time while allowing a bit more optimization. Otherwise this is fine.
• `lto = true` enables Rust-specific link-time optimization across crates. The actual benefit depends on the set of crates linked, but it is so much slower that many sufficiently large projects won't want it. It benefits small programs like "Hello, world" the most, though.
• `opt-level = "z"` is the same as C/C++ `-Oz`, and the same pros and cons apply. All of these settings combined are sketched below.
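Put together, the profile under discussion would look something like this in `Cargo.toml` (a sketch, using the proposal's `strip = "debuginfo"` variant rather than `strip = true`):

```toml
# Size-focused release profile combining the settings discussed above.
[profile.release]
panic = "abort"       # any panic terminates the program outright
strip = "debuginfo"   # proposal's variant: DWARF dropped, symbols kept
codegen-units = 1     # one CGU: slower builds, a bit more optimization
lto = true            # cross-crate LTO; slow on large dependency graphs
opt-level = "z"       # optimize for size, like -Oz
```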
There is also a physical USB cheat/hack/mod[0] that you can plug into your Switch that automatically farms eggs, moves you about to hatch them, and then releases entire boxes of Pokemon you don't need[1]. It's like factory farming Pokemon! There's support for other games, and even for other consoles' controllers; it's an interesting device.
For Pokemon I think it's more interesting than simply memory-hacking your save file and altering the Effort Values of your Pokemon, which is the simpler way to go about this, though a bit boring and, as I understand it, very widespread if you battle online.
This guy has gone to the zoo and interviewed all the animals. The tiger says that the secret to success is to live alone, be well disguised, have sharp claws and know how to stalk. The snail says that the secret is to live inside a solid shell, stay small, hide under dead trees and move slowly around at night. The parrot says that success lies in eating fruit, being alert, packing light, moving fast by air when necessary, and always sticking by your friends.
His conclusion: These animals are giving contradictory advice! And that's because they're all "outliers".
But both of these points are subtly misleading. Yes, the advice is contradictory, but that's only a problem if you imagine that the animal kingdom is like a giant arena in which all the world's animals battle for the Animal Best Practices championship [1], after which all the losing animals will go extinct and the entire world will adopt the winning ways of the One True Best Animal. But, in fact, there are a hell of a lot of different ways to be a successful animal, and they coexist nicely. Indeed, they form an ecosystem in which all animals require other, much different animals to exist.
And it's insane to regard the tiger and the parrot and the snail as "outliers". Sure, they're unique, just as snowflakes are unique. But, in fact, there are a lot of different kinds of cats and birds and mollusks, not just these three. Indeed, there are creatures that employ some cat strategies and some bird strategies (lions: be a sharp-eyed predator with claws, but live in communal packs). The only way to argue that tigers and parrots and snails are "outliers" is to ignore the existence of all the other creatures in the world, the ones that bridge the gaps in animal-design space and that ultimately relate every known animal to every other known animal.
So, yes, it's insane to try to follow all the advice on the Internet simultaneously. But that doesn't mean it's insane to listen to 37signals' advice, or Godin's advice, or some other company's advice. You just have to figure out which part of the animal kingdom you're in, and seek out the best practices that apply to creatures like you. If you want to be a stalker, you could do worse than to ask the tiger for some advice.
---
[1] The ants are gonna win. Hölldobler and Wilson told me so.
A few years ago I got testicular cancer. The information about the disease came in pieces: first all I knew was that there was a lump; then came the ultrasound, the CT scan, then biopsy of the testicle, then a second surgery to sample lymph nodes to which the cancer might have spread. At every step I would obsessively query my doctors for conditional probabilities: given what we'd just found out, what were the chances of dying? Of relapse? Of chemo? Of sterility? I was always incredibly frustrated at how vague their responses would be - they'd say, e.g. "we don't like to give probabilities because you just never know what will happen!". And I would think, "That's exactly the point of a probability! Please just tell me a number!"
One doctor eventually showed me a paper on outcomes for the lymph node surgery I had, with a relapse rate curve going out five years or so. I found this incredibly helpful for managing my emotions because it let me track my progress in a very precise way: with every monthly checkup that went by uneventfully, I knew exactly what my chance of relapse had dropped to. The goal was to get to zero. More importantly, having actual numbers gave me something on which I could focus my optimism. It's so much worse to hear "you might become sterile" than "there's a 5% chance of becoming sterile". With the 5% number in mind, I'd do things like imagine myself in a room full of 20 people and think "wow, it would be incredibly unlikely to be the one randomly chosen from this group". Having spent a lot of time in a cancer hospital now -- around people who were much worse off than I was -- I believe that almost everyone has incredible reserves of optimism. I think it's better when the hopeful possibility is concretely defined: it makes it easier to imagine a path forward while you're stuck waiting for more information.
Mine is obviously a completely different situation from the terminal cancer described by the author, where the question isn't, "when will I be free of this cancer", but rather "when will I die from it". Testicular cancer is very treatable, and I never faced a significant chance of death. I'm sure I would have been in a much different psychological state if I had.
Also, PSA: testicular cancer is REALLY common for young males (if you're male, you have a 1 in 500 chance of getting it between ages 20 and 34). Given HN user demographics, there are almost certainly some of you reading this who've gotten it already, or who will. You can save yourself a ton of trouble if you do a self-examination every once in a while. That's actually how I found mine, and it's a big reason that I avoided chemotherapy.
Lots of people make the mistake of thinking there are only two directions you can go to improve performance: high or wide.
High - throw hardware at the problem, on a single machine
Wide - add more machines
There's a third direction you can go, I call it "going deep". Today's programs run on software stacks so high and so abstract that we're just now getting around to redeveloping (again for like the 3rd or 4th time) software that performs about as well as software we had around in the 1990s and early 2000s.
Going deep means stripping away this nonsense and getting down closer to the metal: using smart algorithms, planning and working through a problem, and seeing if you can size the solution to run on one machine as-is. Modern CPUs, memory and disk (especially SSDs) are unbelievably fast compared to what we had at the turn of the millennium, yet we treat them like spare capacity to soak up ever lazier abstractions. We keep thinking that completing the task means successfully scaling out a complex network of compute nodes, but completing the task actually means processing the data and getting meaningful results in a reasonable amount of time.
This isn't really hard to do (though it can be tedious), and it doesn't mean writing system-level C or ASM code. Just see what you can do on a single medium-specc'd consumer machine first, then scale up or out if you really need to. It turns out a great many problems really don't need scalable compute clusters. And in fact, with the time you'd spend setting one up and building the coordinating code (which introduces yet more layers that soak up performance), you'd probably be better off just doing the whole thing on a single machine.
Bonus: if your problem gets too big for a single machine (it happens), there might be trivial parallelism in the problem you can exploit, and now going wide means you'll probably outperform your original design anyway, and the coordination code is likely to be much simpler and less performance-degrading. Or you can go high and toss more machine at it, getting more gains with zero planning or effort beyond copying your code and the data to the new machine and plugging it in.
Oh yeah, many of us, especially experienced people or those with lots of school time, are taught to overgeneralize our approaches. It turns out many big compute problems are just big one-off problems and don't need a generalized approach. Survey your data, plan around it, and then write your solution as a specialized approach just for the problem you have. It'll likely run much faster this way.
Some anecdotes:
- I wrote an NLP tool that, on a single spare desktop with no exotic hardware, was 30x faster than a distributed system of six high-end compute nodes doing a comparable task. That group eventually took my solution and went high with it: they run it on a big multi-core system with the fastest memory and SSDs they could procure, and it's about 5 times faster than my original code. My code was in Perl; the distributed system it competed against was C++. The difference was the algorithm I was using, and not overgeneralizing the problem. Because my code could complete their task in 12 hours instead of 2 weeks, it meant they could iterate every day. A 14:1 iteration opportunity made a huge difference in their workflow, and within weeks they were further ahead than they had been after 2 years of sustained work. Later they ported my code to C++ and realized even further gains. They've never had to even think about distributed systems. As hardware gets faster, they simply copy the code and data over and realize the gains; it now runs faster than they can analyze the results.
Every vendor that's come in since has been forced to demonstrate that their distributed solution is faster than the one already running in house. Nobody's been able to demonstrate a faster system to date. It has saved them literally tens of millions of dollars in hardware, facility and staffing costs over the last half-decade.
- Another group had a large graph they needed to conduct a specific kind of analysis on. They had a massive distributed system that handled the graph; it was about 4 petabytes in size. The analysis they wanted to do was O(n^2): each node potentially needed to be compared against every other node. So they naively set up some code to do the task, with all kinds of exotic data stores and specialized indexes in the mix. Huge amounts of data were flying around their network trying to run this task, but it was slower than expected.
An analysis of the problem showed that if you segmented the data in some fairly simple ways, you could skip all the drama and do each slice of the task without much fuss on a single desktop. O(n^2) isn't terrible if your data is small. O(k+n^2) isn't much worse if you can find parallelism in your task and spread it out easily.
I had a 4-year-old consumer-level Dell desktop to use, so I wrote the code and ran the task. Using not much more than Perl and SQLite, I was able to compute a largish slice of a few GB in a couple of hours. Some analysis of my code showed I could actually perform the analysis on insert into the DB, and that the data was small enough to fit in memory, so I set SQLite to :memory: and finished in 30 minutes or so. That problem solved, the rest was pretty embarrassingly parallel, and in short order we had a dozen of these spare desktops running the same code on different data slices, finishing the task two orders of magnitude faster than their previous approach. Some more coordinating code and the system was fully automated. A single budget machine was now theoretically capable of doing the entire task in 2 months of sustained compute time. A dozen budget machines finished it all in a week and a half. Their original estimate for the old distributed approach was 6-8 months with a warehouse full of machines, most of which would have been computing things that amounted to a bunch of nothing.
To my knowledge they still use a version of the original Perl code, with SQLite running in memory, without complaint. They could speed things up more with a better in-memory system and a quick code port, but why bother? It completes the task faster than they can feed it data, as the data set is only growing a few GB a day. Easily enough for a single machine to handle.
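For flavor, here's a minimal sketch of that shape of solution. The schema, the input format, and the trivial "matching signature" comparison standing in for the real analysis are all made up for illustration:

```perl
#!/usr/bin/env perl
# One slice of the graph loaded into an in-memory SQLite database,
# with the pairwise comparison done incrementally as each node is
# inserted, so the O(n^2) work happens while the slice loads.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, AutoCommit => 0 });
$dbh->do("CREATE TABLE nodes (id INTEGER PRIMARY KEY, sig TEXT)");
$dbh->do("CREATE INDEX nodes_sig ON nodes (sig)");

my $insert  = $dbh->prepare("INSERT INTO nodes (id, sig) VALUES (?, ?)");
my $compare = $dbh->prepare("SELECT id FROM nodes WHERE sig = ?");

while (my $line = <STDIN>) {          # one "id<TAB>signature" per line
    chomp $line;
    my ($id, $sig) = split /\t/, $line, 2;
    # Compare the incoming node against everything already in the slice.
    my $matches = $dbh->selectcol_arrayref($compare, undef, $sig);
    print "$id matches @$matches\n" if @$matches;
    $insert->execute($id, $sig);
}
$dbh->commit;
```

Run one slice per process and the embarrassingly parallel fan-out across spare desktops comes for free.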
- Another group was struggling with a large semantic graph, needing to perform a specific kind of query while walking it. It was ~100 million entities, but they needed interactive-speed query returns. They had built some kind of distributed Titan cluster (obviously a premature optimization).
The solution: convert the graph to an adjacency matrix, stuff it in a PostgreSQL table, build some indexes, and rework the problem as a clever dynamically generated SQL query (again, Perl). Suddenly they were seeing 0.01-second returns, fast enough for interactivity. Bonus: at 100m rows the dataset was tiny, only about 5GB; with a maximum table size of 32TB and disk space cheap, they were set for the conceivable future. Administration was now easy, performance could be trivially improved with an SSD and some RAM, and they could trivially scale to a point where dealing with Titan was far into their future.
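As a hypothetical sketch of the dynamically generated SQL idea (the `edges(src, dst)` table, the hop count, and the starting node are all assumptions, not their actual schema):

```perl
#!/usr/bin/env perl
# Unroll an N-hop graph walk into a single chain of indexed joins,
# built as a string at query time.
use strict;
use warnings;
use DBI;

sub n_hop_query {
    my ($hops) = @_;
    my $sql = "SELECT DISTINCT e${hops}.dst FROM edges e1";
    for my $i (2 .. $hops) {
        my $prev = $i - 1;
        $sql .= " JOIN edges e$i ON e$i.src = e$prev.dst";
    }
    return "$sql WHERE e1.src = ?";
}

my $dbh = DBI->connect("dbi:Pg:dbname=graph", "", "", { RaiseError => 1 });
# Everything reachable in exactly 3 hops from node 42 -- one indexed
# join per hop, the kind of query that returns in hundredths of a second.
my $reachable = $dbh->selectcol_arrayref(n_hop_query(3), undef, 42);
print "@$reachable\n";
```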
Plus, there's a chance PostgreSQL will start supporting proper scalability soon, putting that day even further off.
- Finally, an e-commerce company I worked with was building a dashboard reporting system that ran every night, taking all of their sales data and generating various kinds of reports: by SKU, by certain numbers of days in the past, etc. It was taking 10 hours to run on a 4-machine cluster.
A dive into the code showed that they were storing the data in a deeply nested data structure for computation, and that building and destroying that structure as the computation progressed was taking all the time. Furthermore, some metrics on the reports showed that the most expensive-to-compute reports were simply not being used, or were viewed only once a quarter or once a year around the fiscal year. And of the cheap-to-compute reports, millions were being pre-computed while only a small percentage were ever actually viewed.
The data structure was built on dictionaries pointing to other dictionaries and so on. A quick swap to arrays pointing to arrays (plus some dictionary<->index conversion functions so we didn't blow up the internal logic) transformed the entire thing. Instead of 10 hours, it ran in about 30 minutes, on a single machine. Where memory had been running out and crashing the system, it now never went above 20% utilization. It turns out allocating and deallocating RAM actually takes time, and switching to a smaller, simpler data structure makes things faster.
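The shape of that swap, as a hypothetical sketch (the names and structure are made up, not their actual code):

```perl
use strict;
use warnings;

# Before: nested hashes, e.g. $sales{$sku}{$day} += $amount;
# every run built and tore down millions of tiny hash allocations.

# After: intern SKUs and days to small integers once, then accumulate
# into flat arrays that are allocated once and reused.
my (%sku_idx, %day_idx);
my ($next_sku, $next_day) = (0, 0);
sub sku_i { my ($k) = @_; $sku_idx{$k} //= $next_sku++ }
sub day_i { my ($k) = @_; $day_idx{$k} //= $next_day++ }

my @sales;    # $sales[sku index][day index] = running total
sub record {
    my ($sku, $day, $amount) = @_;
    $sales[ sku_i($sku) ][ day_i($day) ] += $amount;
}

record("SKU-1234", "2015-06-01", 19.99);
```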
We changed some of the cheap-to-compute reports from pre-computed to compute-on-demand, which further trimmed what needed to run at night. And the infrequent reports were put on quarterly and yearly schedules so they only ran right before they were needed instead of every night. That improved performance even further, and as far as I know, 10 years later, even with huge increases in data volume, they've never had to touch the code or change the ancient hardware it runs on.
Seeing these problems in retrospect, it seems ridiculous that racks in a data center, or even entire data centers, were ever seriously considered necessary to make them solvable. A single machine's worth of today's hardware is almost embarrassingly powerful. Here's a machine that for $1k can break 11 TFLOPS [1]. That's insane.
It also turns out that most of our problems aren't compute speed; throwing more CPUs at a problem doesn't really improve things, because disk and memory are the problem. Why anybody would think shuttling data over a network to other nodes, exacerbating every I/O problem along the way, would improve things is beyond me. Getting data across a network and into a CPU that's sitting idle 99% of the time is not going to improve your performance.
Analyze your problem, walk through it, figure out where the bottlenecks are and fix those. It's likely you won't have to scale to many machines for most problems.
I'm almost tempted to coin a statement, Bane's rule: you don't understand a distributed computing problem until you can get it to fit on a single machine first.
A quote about salary that I bookmarked, as I could see myself making the same mistake:
"Salaries never stay secrets forever. Hiding them only delays the inevitable.
Last year we were having a discussion at lunch. Coworker was building a new house, and when it came to the numbers it was let loose that it was going to cost about $700K. This didn't seem like much, except to a young guy that joined the previous year and had done nothing but kick ass and take names..." (edited for brevity).
"...The conversation ended up in numbers. Coworker building the house pulled about $140K base (median for a programmer was probably $125K), and his bonus nearly matched the new guy's salary, which was an insulting $60K -- and got cut out of the bonus and raise in January for not being there a full year, only 11 months.
Turns out he was a doormat in negotiating, and his salary history was cringeworthy. It pained everyone to hear it, considering how nice a guy he was. In all honesty, $60K was a big step up for him. Worst of all, this wasn't a cheap market (Boston). The guy had probably shortchanged himself well over half a million dollars in the past decade. This was someone who voluntarily put in long hours, went out of his way to teach others, and did everything he could to help other departments like operations and other teams. On top of that, he was beyond frugal. Supposedly he saved something around 40% of his take-home pay, despite living alone in Boston. He grew up in a trailer park.
He spent the next day in non-stop meetings with HR, his manager and the CTO. That Friday he simply handed in his badge without a word, walked out and never came back.
Until 3 months later. As a consultant. At $175/hour."
> Can we please try to stop talking about this specific language ecosystem as an awful deplorable hell hole or whatever?
Back in the second century BC, Cato the Elder ended his speeches with the phrase 'Carthago delenda est,' which is to say, 'Carthage must be destroyed.' It didn't matter what the ostensible topic of the speech was: above all, Carthage must be destroyed.
My opinion towards JavaScript is much like Cato's towards Carthage: it must be rooted out, eliminated and destroyed entirely. I don't know if I'd go quite so far as to say that the fundamental challenge of mass computing is the final destruction of JavaScript — but I want to say it, even though it's false.
JavaScript is a pox, a disaster, a shame. It is the most embarrassingly bad thing to become popular in computing since Windows 3.1. Its one virtue (that it's on every client device) is outshone by its plethora of flaws in much the same way that a matchstick is outshone by the sun, the stars and the primordial energy of the Big Bang added together.
JavaScript is the XML, the Yugo, the Therac-25 of programming languages. The sheer amount of human effort which has been expended working around its fundamental flaws instead of advancing the development of mankind is astounding. The fact that people would take this paragon of wasted opportunity and use it on the server side, where there are so many better alternatives (to a first approximation, every other programming language ever used), is utterly appalling.