Rebake failing: how to diagnose and fix?

Memory usage is much greater during a rebuild than during normal operation. It seems that rebaking makes a similarly heavy demand. If so, regular monitoring wouldn’t add much value: the monitoring would be needed during those peaks, which fortunately happen only when the admin takes some specific action.

When I was running on a smaller, more marginal machine configuration, I would keep a second terminal window open, ssh’d into my server and running
vmstat 5
which gives a record of memory usage as it ebbs and flows. Watch the swpd column and compare it against your configured swap space. Commonly the failure will happen suddenly, not gradually, so even looking at short-term trends is not much help.
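Since the failure tends to be sudden, it can help to keep a timestamped copy of the vmstat output instead of relying on scrollback. A minimal sketch, assuming a POSIX shell; stamp_lines and the log file name are made-up names, not anything that ships with vmstat or Discourse:

```shell
# Prefix every line read from stdin with the current wall-clock time.
stamp_lines() {
  while IFS= read -r line; do
    printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
  done
}

# Intended use in the second terminal (hypothetical log file name):
#   vmstat 5 | stamp_lines | tee rebake-vmstat.log
```

If the rebake dies, the last few lines of the log show what swpd, si and so were doing just before it happened.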

If you have the disk space, there’s no harm at all in having lots of swap: half as much as RAM, or even as much as RAM. It’s there in this case to cope with peaks. You don’t want to see swapping/paging activity during normal use. Again, one can use vmstat 5 5 to get a short-term picture of paging activity (in the si and so columns).

Here’s an example:

# vmstat 5 5
procs -----------memory----------   ---swap--  -----io---- -system--  ------cpu-----
 r  b   swpd   free   buff  cache     si    so    bi    bo   in    cs us sy id wa st
 3  0 1392140  61200  11632  76432    41    32   117    93    0     1  2  1 97  0  0
 1  1 1467220  63416    324  67284  8786 20499 13178 20567 2539  8924 77 13  0 10  0
 0  2 1593340  57916   1096  53832 24262 46868 29986 46889 5377 18534 44 22  0 34  0
 4  0 1155632 120680   2772  86280 39111 35424 54768 37824 6987 25174 38 27  0 35  0
 3  0 1102988  74096   2852  85276 11261   246 12610   271 1879  6365 86  6  0  8  0

You can see that the swpd column peaked at over 1.5G, against the 2.0G I had configured. Swap-out (so) activity peaked in the same 5-second window, and swap-in (si) peaked in the next window.
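If the vmstat output has been saved to a file, the same peaks can be picked out with awk rather than by eye. This is only a sketch: it assumes the default column order shown in the example (swpd in field 3, si in field 7, so in field 8), and peak_swap is a made-up helper name.

```shell
# Read vmstat output on stdin and report the largest swpd, si and so seen.
# Header lines are skipped because their first field is not a number.
# Note: the first vmstat sample reports averages since boot, not a live window.
peak_swap() {
  awk '$1 ~ /^[0-9]+$/ {
         if ($3 > swpd) swpd = $3   # swap in use
         if ($7 > si)   si   = $7   # swapped in per second
         if ($8 > so)   so   = $8   # swapped out per second
       }
       END { printf "peak swpd=%d si=%d so=%d\n", swpd, si, so }'
}

# Example (hypothetical file name):
#   peak_swap < rebake-vmstat.log
```

Compare the reported swpd peak with the Swap total from free to see how much headroom was left.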

(Edit: I can see I had 2.0G swap configured because I’d previously run free:

# free
              total        used        free      shared  buff/cache   available
Mem:        1009140      696504       78544       51784      234092      118436
Swap:       2097144      154628     1942516

We can also see that at the time I was managing to run Discourse with only 1G of RAM.)
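On the earlier suggestion of having as much swap as RAM, here is a rough sketch of how one might set that up with a swap file on Linux. It only prints the commands so they can be reviewed before being run as root. The swap_plan name and the /swapfile path are assumptions for illustration, and some filesystems do not accept swap files created with fallocate, in which case the file has to be written out with dd instead.

```shell
# Print (do not run) the commands to create a swap file as large as RAM.
# $1: meminfo-format file to read (default /proc/meminfo)
# $2: swap file path (default /swapfile, a hypothetical choice)
swap_plan() {
  meminfo=${1:-/proc/meminfo}
  swapfile=${2:-/swapfile}
  ram_kib=$(awk '$1 == "MemTotal:" { print $2 }' "$meminfo")
  size_mib=$(( ram_kib / 1024 ))   # round down to whole MiB
  echo "fallocate -l ${size_mib}M $swapfile"
  echo "chmod 600 $swapfile"
  echo "mkswap $swapfile"
  echo "swapon $swapfile"
}

# Example:
#   swap_plan
```

Halve size_mib for the "half as much as RAM" variant. To keep the swap file across reboots it would also need an entry in /etc/fstab.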
