The log indicates that the zip operation runs out of space, but…
it leaves 3 files (two are .gz) spaced roughly 30 min apart.
The .gz files do not appear in the GUI.
It seems to happen when sidekiq gets restarted for using too much memory.
We have oodles of RAM, and there would be enough space on the drive if it weren’t for the 3 (failed) backups from that day.
The drive has 78G of space, and backups are about 16G each.
Here are the corpses left over after the job fails:
-rw-r--r-- 1 ubuntu www-data 17094434327 Mar 19 08:48 jag-lovers-forums-2018-03-19-083309-v20180309014014.tar.gz
-rw-r--r-- 1 ubuntu www-data 17099593947 Mar 19 09:19 jag-lovers-forums-2018-03-19-090524-v20180309014014.tar.gz
-rw-r--r-- 1 ubuntu www-data 17184337920 Mar 19 09:52 jag-lovers-forums-2018-03-19-093558-v20180309014014.tar
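For anyone who wants to check the numbers, here is a rough Python sketch that adds up the sizes in the listing above. The function names (`parse_sizes`, `total_bytes`) are made up for illustration, and it assumes the standard `ls -l` column layout (size in the fifth column, filename in the ninth).

```python
# Listing copied from the post above.
LISTING = """\
-rw-r--r-- 1 ubuntu www-data 17094434327 Mar 19 08:48 jag-lovers-forums-2018-03-19-083309-v20180309014014.tar.gz
-rw-r--r-- 1 ubuntu www-data 17099593947 Mar 19 09:19 jag-lovers-forums-2018-03-19-090524-v20180309014014.tar.gz
-rw-r--r-- 1 ubuntu www-data 17184337920 Mar 19 09:52 jag-lovers-forums-2018-03-19-093558-v20180309014014.tar
"""

GIB = 1024 ** 3  # bytes per GiB


def parse_sizes(listing):
    """Map filename -> size in bytes (assumes `ls -l` column order)."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        sizes[fields[8]] = int(fields[4])
    return sizes


def total_bytes(listing):
    """Total size in bytes of every file in the listing."""
    return sum(parse_sizes(listing).values())


if __name__ == "__main__":
    for name, size in parse_sizes(LISTING).items():
        print(f"{size / GIB:6.2f} GiB  {name}")
    print(f"{total_bytes(LISTING) / GIB:6.2f} GiB  total")
```

The three leftovers come to roughly 48 GiB, which is a large share of a 78G drive and explains why the next 16G backup attempt runs out of room.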
We started seeing the sidekiq restarts in December.
Yes, I have definitely seen this happen if sidekiq gets forcefully restarted (due to excess memory use) in the middle of a backup. cc @sam
Typically it was only an issue because of a global rebake. The post version was incremented in a commit about 1 month ago, which means every single post in the system must be rebaked, and that can take months. The global rebake process causes Sidekiq to run out of memory much more frequently over that time period.
It’s been a week since we increased the sidekiq headroom. Sidekiq restarts have gone from 3-12 restarts/day to zero, and the backup hasn’t failed once.