I’ve written in the past on the topic of overcommit, which, depending on your perspective, is either a feature of Linux and some other kernels, or a bug left over from a time when folks didn’t know how to do virtual memory accounting properly. I’m a serious proponent of strict commit accounting (the opposite of overcommit), but for this article, I want to look at the state of the software ecosystem and how it often leaves overcommit-enabled Linux systems more failproof than their strict-accounting brothers and sisters.
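For reference, Linux chooses between these policies through /proc/sys/vm/overcommit_memory: 0 is the default heuristic, 1 always overcommits, and 2 enables strict commit accounting, with the commit limit derived from swap plus a vm.overcommit_ratio fraction of RAM. Here is a minimal sketch of a program that checks which mode a system is running in:

    #include <stdio.h>

    /* Print the kernel's overcommit policy:
     * 0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting. */
    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
        int mode;
        if (!f) {
            perror("fopen");
            return 1;
        }
        if (fscanf(f, "%d", &mode) != 1) {
            fclose(f);
            return 1;
        }
        fclose(f);
        printf("vm.overcommit_memory = %d\n", mode);
        return 0;
    }

Switching to strict accounting is then just a matter of writing 2 to the same file as root (e.g. sysctl vm.overcommit_memory=2).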
The idea of strict commit accounting is that malloc never reports success only to let your program crash when you actually try to use the memory. If the kernel cannot ensure that there’s no possible sequence of paging events that would cause it to run out of physical storage for all to-be-mapped pages, then allocating new pages fails, and malloc returns a null pointer. This gives your program the power, but also the responsibility, to check for out-of-memory conditions and handle them, which is, in principle, a very good thing. Where problems begin to arise is when programs don’t check the return value of malloc, or check it only for the sake of calling abort when allocation fails.
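As a concrete illustration of the well-behaved case, here is a sketch of a hypothetical daemon routine, queue_message, that treats allocation failure as a recoverable error for that one operation rather than a reason to die:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical example: a daemon queues a log message. On allocation
     * failure it degrades gracefully (drops the message and reports the
     * condition) instead of aborting or dereferencing a null pointer. */
    int queue_message(const char *msg)
    {
        char *copy = malloc(strlen(msg) + 1);
        if (!copy) {
            /* Out of memory: fail this one operation, keep the daemon alive. */
            fprintf(stderr, "dropping message: out of memory\n");
            return -1;
        }
        strcpy(copy, msg);
        /* ... append copy to the queue; it would be freed once written out ... */
        free(copy);  /* placeholder so the sketch doesn't leak */
        return 0;
    }

    int main(void)
    {
        return queue_message("hello") ? 1 : 0;
    }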
Let’s consider a possible (rather likely, these days) scenario: you have several core system components (things like init/upstart/systemd, inetd, sshd, etc.) that would leave the system in a crippled, unusable, or even kernel-panic state if they die, and these programs are making use of dynamic allocation. What happens when your machine runs out of memory?
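Before answering, note that you can observe the difference directly with a throwaway program that allocates and touches memory until something gives: under strict accounting malloc eventually returns a null pointer, while under overcommit the allocations keep “succeeding” and it’s the touching of pages that eventually triggers the OOM killer (possibly against some other process). A sketch, best run inside a disposable VM or container:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Allocate 16 MiB at a time and touch every byte, so the kernel must
     * actually back each allocation with physical storage or swap. */
    int main(void)
    {
        size_t chunk = 16 * 1024 * 1024, total = 0;
        for (;;) {
            char *p = malloc(chunk);
            if (!p) {
                /* Strict accounting: malloc itself reports the failure. */
                printf("malloc failed after %zu MiB\n", total / (1024 * 1024));
                return 0;
            }
            memset(p, 1, chunk);  /* under overcommit, the OOM killer strikes here */
            total += chunk;
        }
    }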
If strict commit accounting is in effect, one of their calls to malloc fails, resulting in one of the following:

- An immediate abort on the first allocation failure. A core system component is not likely to do this itself, but it may inadvertently do so by relying on a library (such as glib) that unconditionally aborts the calling program on allocation failure.
- A crash from never considering that malloc could fail, and dereferencing relative to a null pointer. This is the worst possible behavior, but there’s plenty of software that is this buggy.

If, on the other hand, overcommit is enabled, along with Linux’s heuristic OOM killer, there’s only one likely result: this critical system component was either manually marked as not a candidate for OOM killing, or was naturally not a candidate since it never engaged in allocation behaviors that the OOM killer judged as abusive. Some bloated desktop app like Firefox or OpenOffice (if this is a desktop system), or some runaway PHP program (if it’s a webserver), gets OOM-killed instead, and the system is back to “normal” (minus the user perhaps being angry about losing his or her session).
Does this mean I’ve changed my mind about overcommit and it’s actually a good thing? No, not really. What it means, at least in my mind, is that there’s a great deal of work that needs to be done auditing core system components for robustness and fail-safe behavior. In particular:
This is hard work, but I still believe it’s better than the current situation where stability of the essential core components of the system depends on sacrificing (OOM-killing) user applications which might have valuable unsaved data. It just means we still have a long way to go towards a rock-solid, crash-free FOSS platform...