
The core of a properly built, resilient system is that you have compartmentalized the code into many small Erlang processes that work together to solve a problem. A bug in one of them is isolated to that particular process and can't take the whole system down. Instead, the rest of the system detects the failure and restarts the faulty process.
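To make the detect-and-restart idea concrete, here is a toy sketch in Python. It is not Erlang and not how OTP implements supervision. The names `Supervisor`, `make_flaky_worker`, and the worker names are all made up for illustration, and a plain try/except stands in for real process isolation.

```python
from typing import Callable, Dict, List


class Supervisor:
    """Toy supervisor: runs each worker in isolation and restarts one that crashes."""

    def __init__(self, max_restarts: int = 3) -> None:
        self.max_restarts = max_restarts
        self.workers: Dict[str, Callable[[], str]] = {}
        self.restarts: Dict[str, int] = {}
        self.log: List[str] = []

    def add(self, name: str, worker: Callable[[], str]) -> None:
        self.workers[name] = worker
        self.restarts[name] = 0

    def run_all(self) -> Dict[str, str]:
        # A crash in one worker never prevents the others from running.
        return {name: self._run_supervised(name, w) for name, w in self.workers.items()}

    def _run_supervised(self, name: str, worker: Callable[[], str]) -> str:
        while True:
            try:
                return worker()
            except Exception as exc:  # the "crash" is detected here
                self.log.append(f"{name} crashed: {exc}")
                if self.restarts[name] >= self.max_restarts:
                    self.log.append(f"{name} gave up")
                    return "failed"
                self.restarts[name] += 1  # restart the faulty worker


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that hits a transient fault `failures` times, then succeeds."""
    state = {"remaining": failures}

    def worker() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("transient fault")
        return "ok"

    return worker


def demo():
    sup = Supervisor(max_restarts=3)
    sup.add("billing", lambda: "ok")
    sup.add("search", make_flaky_worker(failures=2))
    sup.add("email", lambda: "ok")
    return sup.run_all(), sup.restarts
```

In this sketch only the hypothetical `search` worker is restarted; `billing` and `email` never notice that anything went wrong, which is the property the real thing gives you with actual processes.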

The reason this is a sound strategy is that in larger systems, there will be bugs, and some of those bugs will have to do with concurrency. Such bugs depend on timing, so if one only occurs relatively rarely, restarting from a known good state and trying again is very likely to get past it. In a sense, it's the observation that it is easier to detect a concurrency bug than it is to fix it. A larger system stays up because there's this onion-layered protection in place, so a single error doesn't have to become fatal to your whole system.
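The onion layering can be sketched the same way, again as a toy in Python rather than anything resembling real OTP code. The assumption here is that each layer has a restart budget and hands the problem outward once the budget is used up. `supervise`, `Escalate`, and `make_racy_task` are hypothetical names invented for this example.

```python
from typing import Callable, List


class Escalate(Exception):
    """Raised by a layer that has used up its restart budget."""


def supervise(task: Callable[[], str], max_restarts: int, log: List[str], layer: str) -> str:
    """Run `task`, restarting it on failure up to `max_restarts` times.

    When the budget is exhausted, raise Escalate so the next layer out can react.
    """
    for attempt in range(max_restarts + 1):
        try:
            return task()
        except Exception as exc:
            log.append(f"{layer}: attempt {attempt + 1} failed ({exc})")
    raise Escalate(f"{layer} exhausted {max_restarts} restarts")


def make_racy_task(failures: int) -> Callable[[], str]:
    """Hypothetical task that loses a race `failures` times before succeeding."""
    remaining = [failures]

    def task() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("race lost")
        return "ok"

    return task


def demo_layers():
    log: List[str] = []
    worker = make_racy_task(3)
    # Inner layer: small budget, close to the worker.
    inner = lambda: supervise(worker, max_restarts=1, log=log, layer="inner")
    # Outer layer: restarts the whole inner layer when it gives up.
    result = supervise(inner, max_restarts=2, log=log, layer="outer")
    return result, log
```

Here the inner layer gives up after two failed attempts, the outer layer restarts the inner one, and the task eventually succeeds. No single failure was fatal, but each layer still has a limit, so a bug that never goes away eventually surfaces at the top instead of looping forever.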

It's not really about types. It's about concurrency and also distribution. Type systems help eradicate bugs, but the bugs they tend to be great at catching are a different class from these.

However, if you do ship a bug to a customer, it's often the case that you don't have to fix it right away, because it doesn't bring down the rest of the application, so no other customer is affected. In many cases you can wait until the weekend is over, then triage the worst bugs first when you have time to do so.


