OOM Killer Doesn’t Work Properly [SOLVED]

The out-of-memory (OOM) killer is a Linux kernel mechanism that is supposed to terminate (kill) the processes consuming the most memory when the system runs critically low on memory.

Unfortunately, the OOM killer doesn’t always work properly.

It often starts too late, when the system has already run out of memory and can barely allocate anything for itself, which leaves the system frozen.

This note shows how to call the OOM killer earlier, before the system gets into an unresponsive state.

Fix a Non-Working OOM Killer

If the OOM killer doesn’t save your system from running out of memory, you can try the earlyoom daemon as an alternative.

When both the available memory and the free swap drop below 10% of their respective totals, earlyoom kills the largest process.

Depending on your OS, run the commands below to install earlyoom and configure it to start on boot:

# Debian 10+ and Ubuntu 18.04+
$ sudo apt update
$ sudo apt install earlyoom
$ sudo systemctl enable --now earlyoom

# Fedora and RHEL 8 with EPEL
$ sudo dnf install earlyoom
$ sudo systemctl enable --now earlyoom

# Arch Linux
$ sudo pacman -S earlyoom
$ sudo systemctl enable --now earlyoom
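The 10% thresholds mentioned above are earlyoom's defaults, and they can be adjusted through its `-m` (minimum available memory) and `-s` (minimum free swap) options. A sketch of a stricter configuration, assuming your package reads options from /etc/default/earlyoom (the path used by the upstream systemd unit — check your distro's unit file):

```shell
# /etc/default/earlyoom — assumed location; verify it on your system.
# -m 15 : act when available memory drops below 15% of total RAM
# -s 15 : ... and free swap drops below 15% of total swap
# -r 60 : log a memory report every 60 seconds
EARLYOOM_ARGS="-m 15 -s 15 -r 60"
```

After editing the file, restart the service with `sudo systemctl restart earlyoom` for the new options to take effect.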

To check the earlyoom logs, execute:

$ journalctl -u earlyoom

In order to see earlyoom in action, simulate a memory leak and let earlyoom do its job:

$ tail /dev/zero

This command consumes memory without bound: /dev/zero contains no newlines, so tail keeps buffering the endless stream while waiting for one. Your system will start hanging, and then, voila, earlyoom kills the tail process.
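If you would rather not push the machine to the brink, a bounded variation of the same idea (assuming GNU coreutils, for the `M` size suffix) holds a fixed amount of memory and then exits:

```shell
# Pipe exactly 100 MiB of zeros into tail, which must keep all of it
# in memory (it retains the last 100M bytes), then count the bytes held.
head -c 100M /dev/zero | tail -c 100M | wc -c
# prints 104857600
```

Watching `journalctl -u earlyoom` while running a larger variant of this shows the available-memory percentage dropping in the reports.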

Once earlyoom is installed and enabled, a non-working OOM killer should no longer be a problem.


One Reply to “OOM Killer Doesn’t Work Properly [SOLVED]”

  1. I don’t remember this being a problem many years ago. It seems to be caused by “memory overcommitment”.

    I can’t believe we’ve reached the point where application developers became so bad at “allocating memory” that, instead of finding ways to force them to fix their work (or simply refusing to use bad software), we decided Linux should degrade itself to make it “fine”, causing these catastrophic memory management issues:

    -> Linux now grants more RAM than is physically available (std::bad_alloc no longer works for applications that were relying on it). Bad developers’ dreams came true: they are free to allocate an infinite amount of (never even used) memory without any check, and no unhandled exception will ever occur anymore (I don’t know whether this is supposed to be funny or sad)

    -> Successful memory allocations are now expected to be useless and never used. It is so unexpected for the allocated memory to actually be used that the system completely stalls and crashes when it happens.

    -> The OOM killer, when needed as a consequence of this insane situation, doesn’t even work

    While earlyoom tries to be useful by avoiding a situation that still hasn’t been solved, it makes part of the physically available memory unusable. Even worse: applications that still manage to check whether the needed memory is actually available may now be killed by earlyoom while using memory that is, in fact, available.

    This seemed impossible to me, but in a way, depending on the use case, this may turn out to be even worse.
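For readers who want to inspect the overcommit policy the comment refers to, it is exposed as a kernel sysctl:

```shell
# 0 = heuristic overcommit (the default), 1 = always overcommit,
# 2 = strict accounting (refuse allocations beyond swap plus a
#     percentage of RAM set by vm.overcommit_ratio)
cat /proc/sys/vm/overcommit_memory
```

Setting it to 2 (e.g. `sudo sysctl vm.overcommit_memory=2`) makes allocations fail up front instead of succeeding and faulting later, which restores std::bad_alloc-style error handling at the cost of refusing some allocations that would otherwise have succeeded.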
