Out of memory on rebuild with 4GB swap?

I’ve had several two-container bootstraps fail today with errors like

 ERR_PNPM_RECURSIVE_EXEC_FIRST_FAIL  Command was killed with SIGKILL (Forced termination): ember build -prod"]

I think I added swap once before and it worked the next time. But this server has 4GB of swap:

# free -h
               total        used        free      shared  buff/cache   available
Mem:           1.9Gi       1.1Gi       391Mi        45Mi       661Mi       830Mi
Swap:          4.0Gi       3.1Gi       911Mi

and it’s still failing. And if I stop the container before the bootstrap, it does succeed.

Is the build fetching/using pre-built assets? Or have you disabled that? (or patched core such that the pre-built assets can’t be used)

You have a lot of swap, but a great deal of it is already in use. Because a two-container setup has less downtime, the current invocation of the server is probably still running during the bootstrap and may have accumulated a lot of memory use, perhaps due to a leak.

A restart before the update might help. It would also be worth measuring swap use before and after the restart.

I have some kind of memory leak/expansion on one of my servers. At @RGJ’s sensible suggestion I schedule a reboot via cron every 7 days, early Monday morning (W. Europe time).

(We believe we know the plugin with the issue but I’ve not invested the time to work out why there is a memory leak/expansion)

This looks like an OOM kill: I’m on ~2 GiB RAM and in two-container rebuilds the old app + new build overlap pushes memory over the edge. Swap is already ~3 GiB used before bootstrap, so the ember build spike gets SIGKILL’d. Stopping the running container (or doing a one-container rebuild) avoids the overlap and succeeds. Next step is to confirm via dmesg and then either restart before rebuilds / investigate what’s driving swap up over time / add RAM (swap alone doesn’t seem to save it once it’s already heavily used).

This looks less like a pnpm or Ember issue, and more like the host simply running out of memory.

The key detail is the SIGKILL. That usually means the OS stepped in and killed the process (often via the OOM killer), not that ember build -prod failed on its own.
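
One way to confirm that the OOM killer was responsible is to search the kernel log for its kill messages. Below is a minimal Python sketch of that check. It assumes the log text comes from something like dmesg and that the kernel words the message as "Killed process <pid> (<name>)", which varies a little between kernel versions. The function name find_oom_kills and the sample log lines are made up for illustration and are not output from the affected server.

```python
import re

# Matches the kernel's OOM kill line, for example:
#   "Out of memory: Killed process 1234 (node) total-vm:..."
# The exact wording differs across kernel versions, so this pattern is an assumption.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log: str) -> list[tuple[int, str]]:
    """Return (pid, process name) for each OOM kill found in the log text."""
    return [(int(m.group(1)), m.group(2)) for m in OOM_KILL.finditer(kernel_log)]


# Hypothetical sample lines, not taken from the affected server.
SAMPLE_LOG = """\
[1001.000000] node invoked oom-killer: gfp_mask=0x100cca, order=0, oom_score_adj=0
[1001.000100] Out of memory: Killed process 4242 (node) total-vm:3000000kB, anon-rss:1500000kB
[1002.000000] eth0: link up
"""

if __name__ == "__main__":
    for pid, name in find_oom_kills(SAMPLE_LOG):
        print(f"OOM killer terminated {name} (pid {pid})")
```

On a real host the input could come from subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which usually needs root.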

On small hosts, Ember production builds can easily spike to a couple of GB of RAM. Even with swap enabled, once swap is mostly used, the kernel can still decide to kill a memory-hungry node process.

A few things that point in this direction:

  • Swap is already heavily used when the failure happens.
  • The failure is much more likely when another container is running at the same time.
  • If I stop the other container before running the bootstrap, the exact same build succeeds.

So swap helps a bit, but it mostly just delays the problem. Stopping other containers lowers memory pressure enough for the build to finish.

What helped / might help:

  • Avoid running multiple bootstraps or asset builds in parallel.
  • Stop other containers during ember build -prod.
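  • Cap Node’s heap (e.g. NODE_OPTIONS=--max_old_space_size=1024) to reduce peak usage. This limits V8’s old-generation heap (in MiB), not the whole process, and a build that genuinely needs more will fail with a heap out-of-memory error instead of being SIGKILL’d.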
  • If possible, bumping the host RAM (4GB+) makes this a lot more reliable.

Hopefully this helps explain why it feels a bit random and why stopping another container makes it work.

It feels like more swap would help. It wouldn’t hurt. Instead of looking at total swap and saying it seems like a lot, look at free swap and make sure you have the headroom for a rebuild.
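
A rough way to check that headroom before a rebuild is to add up available memory and free swap from /proc/meminfo. The Python sketch below does that. The sample text reuses the 830Mi available / 911Mi free swap figures from the free -h output earlier in the thread, converted to kB. The 2 GiB requirement is only an assumption based on the "couple of GB" estimate mentioned above, and the function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in kB}."""
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def rebuild_headroom_kb(meminfo: dict[str, int]) -> int:
    """Memory the kernel reports as available plus unused swap, in kB."""
    return meminfo.get("MemAvailable", 0) + meminfo.get("SwapFree", 0)


def has_headroom(meminfo: dict[str, int], required_kb: int) -> bool:
    """True if available memory plus free swap covers the required amount."""
    return rebuild_headroom_kb(meminfo) >= required_kb


# Figures from the thread's `free -h` output, converted from MiB to kB.
SAMPLE_MEMINFO = """\
MemAvailable:     849920 kB
SwapTotal:       4194304 kB
SwapFree:         932864 kB
"""

# Assumed requirement for a rebuild (2 GiB); tune this for your own site.
REQUIRED_KB = 2 * 1024 * 1024

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE_MEMINFO)
    print("headroom kB:", rebuild_headroom_kb(info))
    print("enough for rebuild:", has_headroom(info, REQUIRED_KB))
```

With the thread's numbers the total comes to about 1.7 GiB, which is below the assumed 2 GiB requirement. On a live server, pass the contents of /proc/meminfo to parse_meminfo instead of the sample text.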

Also check that you have enabled overcommit.
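
For reference, the current setting can be read from /proc/sys/vm/overcommit_memory. The Python sketch below interprets the value. It assumes "enabled" means mode 1 (always overcommit), since the post does not say which value is meant. The helper names are hypothetical.

```python
# Meaning of the values in /proc/sys/vm/overcommit_memory on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the configured limit",
}


def describe_overcommit(raw: str) -> str:
    """Translate the raw file contents into a human-readable description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read and describe the current overcommit mode (Linux only)."""
    with open(path) as f:
        return describe_overcommit(f.read())


if __name__ == "__main__":
    try:
        print(read_overcommit())
    except OSError as exc:
        # The file does not exist on non-Linux systems.
        print(f"could not read overcommit setting: {exc}")
```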

Sounds like a good idea! Do you reboot the server or just the container?

I’ll call this the “solution”, if only for tidiness. :person_shrugging:

Anyway, I had it happen again on another 2GB+3GB server. Then I rebooted web_only, tried again, and it worked just fine. I think I’ll add a step to my tooling that reboots web_only, perhaps when memory is “low” by some definition.

I didn’t do anything to disable it.

I do that on machines that I set up . . . and it looks like this one has that turned on already.

Thanks to everyone for your ideas!

The entire server :sweat_smile:

It’s not Windows, for heaven’s sake. :rofl:

Oh boy I remember those days! :grimacing:
