When I was young and really didn't understand Unix, my friend and I were summer st...

derefr · on Aug 21, 2014

> ...a minute or two later one of the folks who had 'root' ran into the machine room with a panic-stricken look because the system had mostly just locked up.

It's kind of weird that, while root has always had e.g. 5% reserved disk space on the rootfs for emergencies, one thing no Unix has ever done is enforce a 5% CPU reservation for root so administrators can "talk over" a cascading failure. I think this is possible just recently in Linux with CPU namespacing, but it's still not something any OS does by default.

zurn · on Aug 21, 2014

It's not specifically the lack of cpu timeslices that crowds out other programs, it's more like exhaustion of all the OS resources (process table fills up, file table fills up, memory runs out, swap death etc).

Sure if you carefully made everything fork-bomb-resistant then a cpu quota would be a part of it. Container systems use fork bombs as basic test cases.

derefr · on Aug 21, 2014

I'm surprised that this wasn't one of the primary goals of cgroups: the ability to group "all userspace processes" into one cgroup, and then say that that cgroup can in sum only use so much CPU, so many processes, so many inodes, etc. You know, a control plane/data plane separation, without requiring hypervision.

throwaway0010 · on Aug 21, 2014

It is. Cgroups provide limits for memory and CPU time. We already have other accounting mechanisms for processes/threads (rlimits) and for inodes and disk space (disk quota systems). We've had those for ages. I imagine there will be more work to integrate these various accounting mechanisms with cgroups as the work continues.
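
As a rough illustration of the rlimit mechanism mentioned above, here is a minimal Python sketch that lowers the soft `RLIMIT_NPROC` for the current process and anything it spawns afterwards. The helper names `clamp_soft_limit` and `cap_process_count` are made up for this example. Note that on Linux this limit is counted per real user ID rather than per process tree, and it is not enforced for sufficiently privileged processes.

```python
import resource


def clamp_soft_limit(requested, hard):
    """Return the soft limit to apply; it may never exceed the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def cap_process_count(max_procs):
    """Lower the soft RLIMIT_NPROC of this process; children inherit it.

    Hypothetical helper for illustration. Once the user's process count
    reaches the limit, fork() fails with EAGAIN instead of filling the
    process table.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    new_soft = clamp_soft_limit(max_procs, hard)
    resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))
    return new_soft
```

Calling something like `cap_process_count(200)` before launching untrusted student code would be one way to use it; the value 200 is only an example, and setting it below the user's current process count makes every later fork fail.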

baruch · on Aug 21, 2014

If you care about such things, the normal method is to have a backup ssh running on a different port with realtime priority. It is not used at any other time, only when some process has gone runaway and you can't do anything else.

danieltillett · on Aug 21, 2014

I haven't seen this in action. Do you know of a write up describing this?

baruch · on Aug 21, 2014

No write up that I know of. I used it in systems I've made in the past. Some of our services were running in a realtime priority and we needed a way to take care of such a system mostly in development.
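
Since there is no write up, here is a rough sketch of what such a setup might look like, not a description of the actual systems mentioned above. It assumes Linux, root privileges, an sshd binary at `/usr/sbin/sshd`, and a spare port; the port 2222, the priority 10, and all function names are hypothetical. The launcher switches itself to `SCHED_FIFO` and then execs sshd, which keeps the scheduling policy across the exec.

```python
import os


def sshd_argv(port, sshd_path="/usr/sbin/sshd"):
    """Build the command line for a foreground sshd on an alternate port.

    -D keeps sshd from detaching, -p selects the listening port.
    The default path is an assumption; adjust it for your system.
    """
    return [sshd_path, "-D", "-p", str(port)]


def check_fifo_priority(priority):
    """Validate a SCHED_FIFO priority against the range the kernel reports."""
    lo = os.sched_get_priority_min(os.SCHED_FIFO)
    hi = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lo <= priority <= hi:
        raise ValueError(f"priority must be between {lo} and {hi}")
    return priority


def exec_realtime(argv, priority):
    """Switch this process to SCHED_FIFO, then replace it with argv.

    Needs root (or CAP_SYS_NICE). Never returns on success.
    """
    param = os.sched_param(check_fifo_priority(priority))
    os.sched_setscheduler(0, os.SCHED_FIFO, param)
    os.execv(argv[0], argv)
```

Usage would be along the lines of `exec_realtime(sshd_argv(2222), 10)` from a root-owned startup script. Shells started through that sshd inherit the realtime policy as well, so a busy loop typed into that session can itself starve the machine.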

paulfurtado · on Aug 21, 2014

Most Linux distributions assign root processes a better scheduling priority than non-root processes, which should be good enough in most cases. Critical system processes also run at better priorities than other processes. It's not uncommon to see Linux users consciously decide on the priority of a process by using nice or renice.

Totally limiting the CPU utilization of a group of processes requires more overhead than changing the scheduling priority since you must actively account for the CPU usage. CPU cgroups should do just that though and in most cases the overhead should be acceptable.
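
To make the cgroup idea concrete, here is a small sketch using the cgroup v2 interface files `cpu.max` and `pids.max`, which are newer than this thread. It assumes a cgroup directory that already exists with the cpu and pids controllers enabled, and that the caller is allowed to write to it. The function names and the 95% figure are illustrative only.

```python
import os


def cpu_max_value(percent, ncpus, period_us=100_000):
    """Format a cpu.max line: '<quota_us> <period_us>'.

    The quota is CPU time per period summed over all CPUs, so capping a
    group at `percent` of an `ncpus` machine multiplies by ncpus.
    """
    quota_us = int(period_us * ncpus * percent / 100)
    return f"{quota_us} {period_us}"


def write_limits(cgroup_dir, percent, ncpus, max_pids):
    """Write CPU and process-count caps into an existing cgroup v2 directory.

    Hypothetical helper: cgroup_dir would be something like a directory
    under /sys/fs/cgroup that holds all non-administrative processes.
    """
    with open(os.path.join(cgroup_dir, "cpu.max"), "w") as f:
        f.write(cpu_max_value(percent, ncpus))
    with open(os.path.join(cgroup_dir, "pids.max"), "w") as f:
        f.write(str(max_pids))
```

Capping the group that holds ordinary users at 95% leaves roughly 5% of CPU time for whatever runs outside it, and `pids.max` bounds the fork bomb itself, which addresses the process-table exhaustion raised earlier in the thread.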

In your comment's parent, I don't think raw CPU utilization was the issue since kabdib mentioned fork and it was in response to a post about fork failures. The problems caused by a fork bomb are not limited to CPU utilization, see: https://en.wikipedia.org/wiki/Fork_bomb

In any case, there will likely always be some system call you can abuse to totally exhaust some resource of the kernel.

derefr · on Aug 21, 2014

> In any case, there will likely always be some system call you can abuse to totally exhaust some resource of the kernel.

If this is true, I would expect there to exist one or more articles entitled "how I brought down my Heroku host-instance" or something along those lines. Anyone got some links? :)

mjevans · on Aug 21, 2014

It would only be possible if a limit were enforced on all non-primary namespaces.

However something that has been /possible/ for a while (but not in practice done) would be to elevate root process priority over other processes. Probably not done due to daemons needing to run as root (which is decreasing as they're able to drop privileges these days).

dredmorbius · on Aug 21, 2014

Root has had the ability to assign negative nice values since long, long ago. Non-root users can only assign positive niceness. The range is -20 to +19.

In theory this can give higher priority to a process, but if you cannot get into the run-queue at all (fork bomb), or the problem is in kernel space (e.g., I/O access, hang, or a kernel space loop), then it's not going to help you much.

blueskin_ · on Aug 21, 2014

Technically, non-root users can use negative nice, if they are explicitly allowed to in /etc/security/limits.conf
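
For reference, a sketch of how that permission appears from inside a process on Linux. As I understand it, the `nice` item in limits.conf ends up as the `RLIMIT_NICE` resource limit, and getrlimit(2) documents the lowest reachable nice value as `20 - rlim_cur`. The function names below are invented for this example.

```python
import resource


def lowest_reachable_nice(rlim_cur):
    """Translate an RLIMIT_NICE soft limit into the lowest usable nice value.

    Per getrlimit(2) the floor is 20 - rlim_cur. A result of 20 means the
    process cannot lower its niceness at all (the usual default).
    """
    if rlim_cur == resource.RLIM_INFINITY:
        return -20
    return max(-20, 20 - rlim_cur)


def my_nice_floor():
    """Lowest nice value this unprivileged process may set for itself (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NICE)
    return lowest_reachable_nice(soft)
```

With the default limit of 0, `my_nice_floor()` comes out as 20, matching the rule that ordinary users can only make themselves nicer.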

0xbadcafebee · on Aug 21, 2014

And, sadly, most of the really hard hangs are kernel space. The general fix is to cut off all network requests/incoming jobs, powercycle, dig through logs, and try to shunt a future hang. (Sometimes just cutting incoming jobs will stop the hang, too)

dalore · on Aug 21, 2014

In theory root could nice all other processes.

mnw21cam · on Aug 21, 2014

On Linux, nice is not an absolute priority system.

In the old days, the Amiga operating system did use static absolute priorities for its multi-tasking. This meant that if a task with a priority of 1 wanted to use as much CPU as it wanted, then all tasks with a priority of 0 or below would be completely starved. This meant that you could boost a certain process (like, say, a CD writer) and get close to real-time behaviour. I was certainly writing coaster-free CDs on a much less powerful Amiga than a Linux box that constantly made coasters from buffer under-runs.

Linux, however, has virtual memory and "nice", which complicates matters. A process with a niceness of 19 will still take a small amount of CPU in the presence of another process with a niceness of -20. In the presence of a fork bomb, you may have a very large number of processes. If they all (by some miracle) have a niceness of 19, you still have very little CPU time left for a process with a normal or negative niceness. Infinity multiplied by a small number is still infinity. Real-time priorities are the only thing that will save you here.
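
A back-of-the-envelope version of that argument, assuming the Linux fair scheduler's nice-to-weight table (nice 0 is 1024, nice 19 is 15, nice -20 is 88761), a single CPU, every process runnable all the time, and no cgroups or autogrouping. This is a simplification for illustration, not a model of the real scheduler.

```python
# Selected entries from the kernel's nice-to-weight table.
NICE_WEIGHT = {-20: 88761, 0: 1024, 19: 15}


def cpu_share(my_nice, bomb_nice, bomb_count):
    """Fraction of one CPU a single process gets against bomb_count others.

    Under weighted fair sharing each runnable process receives
    weight / (sum of all runnable weights).
    """
    mine = NICE_WEIGHT[my_nice]
    others = NICE_WEIGHT[bomb_nice] * bomb_count
    return mine / (mine + others)


if __name__ == "__main__":
    for count in (1_000, 10_000, 100_000):
        print(count, cpu_share(0, 19, count), cpu_share(-20, 19, count))
```

Under these assumptions a nice-0 shell facing 10,000 processes at nice 19 gets well under 1% of the CPU. A nice -20 process does much better, but its share keeps shrinking as the process count grows.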

You also have the problem of being able to actually change the processes' nicenesses. That requires CPU time, which you no longer have. You would be better off sending a kill signal. You also have a race condition - you obtain (from the OS) a list of processes that are running that you want to renice or kill. By the time you have iterated through each one renicing or killing them, new processes have appeared.
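
One way to sidestep the enumerate-then-kill race, when the runaway job was started in its own process group, is to signal the whole group with a single call instead of walking a process list. The sketch below uses a bounded stand-in for a fork bomb (three rounds of `fork()`, eight sleeping processes). The names are made up, and this does not help against a bomb whose children move themselves into new process groups or sessions.

```python
import os
import signal
import subprocess
import sys

# Bounded stand-in for a runaway job: three rounds of fork() give 8 sleepers.
CHILD_CODE = "import os, time\nfor _ in range(3):\n    os.fork()\ntime.sleep(30)\n"


def spawn_in_own_group():
    """Start the job as leader of a new session, and so of a new process group."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        start_new_session=True,
    )


def kill_group(proc):
    """Send SIGKILL to every member of the job's process group in one call.

    Because the child called setsid(), its process group ID equals its PID.
    Returns the leader's exit status as reported by Popen.wait().
    """
    os.killpg(proc.pid, signal.SIGKILL)
    return proc.wait()
```

Running `kill_group(spawn_in_own_group())` should report that the leader died from SIGKILL; the forked descendants receive the same signal because they share the group.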

mcguire · on Aug 22, 2014

For several years I was a sysadmin for the University of Texas computer sciences department. (This was much later than your story, though.) If I remember correctly, the operating systems class was usually taught in the spring and they got to exploring processes sometime in late March or early April. And for about two weeks, none of our generally available systems would have an uptime of more than a couple of days.

Sure, you could get in and kill a fork-bomb before it did anything bad. But two or three on the same machine? And when you've got a couple hundred machines? It was easier to just reboot and let the victims who were inconvenienced handle explaining to the guilty how what they did was bad.

Then there were the guys who would log into one machine in a lab, fork-bomb it, move to the next machine over and make a change to their program, fork-bomb that machine, and expect to iterate that process until they passed the assignment. Leaving a wake of pitifully flailing workstations behind. Ahh, good times.

BuildTheRobots · on Aug 21, 2014

What are these "Lions Notes" of which you speak? Google is not being helpful to me :(

ciupicri · on Aug 21, 2014

"Lions' Commentary on UNIX 6th Edition, with Source Code" http://en.wikipedia.org/wiki/Lions%27_Commentary_on_UNIX_6th...

aqrashik · on Aug 21, 2014

Lions' Commentary on UNIX 6th Edition

http://www.lemis.com/grog/Documentation/Lions/

hadoukenio · on Aug 21, 2014

That was a cool story. I wanted to know more so I looked at your profile.

Oh, it's you. That story makes you even more awesome :)

mkhalil · on Aug 22, 2014

"which he had to repair by hand with ncheck and icheck, because this was before the days of fsck and that's what real programmers did"

No, real programmers were writing fsck.

:)

kjs3 · on Aug 21, 2014

Must have been V6; I recall V7 had patches to prevent this, at least to the extent that it wouldn't crater the whole machine. I haven't thought about using ncheck & icheck since fsdb showed up about BSD4.2 or thereabouts. I remember using adb as well to fix buggered filesystems back in the ancient days.

I remember well the day one of the elder neckbeards handed me my own photocopy of the Lions books. It was enlightenment in pure form.