When backup fails, delete the useless backup


(Jay Pfaffman) #1

If a site backup fails due to disk space while it’s gzipping the backup, the .tar file is left behind. Discourse can’t see or use that .tar file. On one hand, a decent sysadmin would be alarmed enough by a failed backup to immediately solve the disk space problem and then gzip the backup by hand in a shell. On the other hand, someone who doesn’t like getting their hands dirty in a shell is out of luck.
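For the shell-comfortable, the manual fix is just compressing the leftover archive once space is freed. A runnable sketch using a throwaway directory and a made-up filename; in a real recovery you’d cd into your backups directory (e.g. /shared/standalone/backups/default/) and gzip the actual leftover .tar instead:

```shell
# Stand-in for the manual fix: gzip the leftover .tar by hand.
workdir=$(mktemp -d)
echo "pretend backup contents" > "$workdir/site-backup.tar"
gzip "$workdir/site-backup.tar"   # replaces site-backup.tar with site-backup.tar.gz
ls "$workdir"                     # -> site-backup.tar.gz
```

Once the .tar.gz exists in the backups directory, Discourse picks it up like any other finished backup.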

As an aside, 50GB would seem like a reasonable partition size for a site with a 13GB backup. But there are two copies of the current backup on disk while it’s gzipping, and the maximum backups setting doesn’t delete an old backup until there are more than that many, so 50GB is only enough for maximum backups to hold one backup. It took me quite a while to understand that math.
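The arithmetic spelled out (13GB is my backup size; peak usage is the retained backups, plus the new .tar, plus the growing .tar.gz):

```shell
backup_gb=13
for max in 1 2 3; do
  # retained backups, plus the new .tar and the growing .tar.gz during gzip
  peak=$(( max * backup_gb + 2 * backup_gb ))
  echo "maximum backups=$max -> peak ~${peak} GB"
done
# maximum backups=1 fits a 50 GB partition (39 GB); 2 already doesn't (52 GB)
```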


(Jeff Atwood) #2

Try a database-only backup, which skips the “combine all the uploaded files into the database archive” step and thus doesn’t need 2x the disk space in the process.


(Alan Tan) #3

If this cleanup isn’t happening, it’s a bug. I’m pretty sure we have code to handle cleanup on failure. Which folder is the backup left in? Is it the tmp folder?


(Rafael dos Santos Silva) #4

It’s in the same folder where finished backups reside, /shared/standalone/backups/default/.


(Alan Tan) #5

Hmm, that is strange… the entire backup process should take place in a tmp folder before the result is moved to the backups folder, so if anything blows up, it cleans up the tmp folder afterwards. Maybe we’re not catching the error when gzip blows up somehow. I’ll have a look. :eyes:
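The pattern described above, sketched in shell (an illustration of the intended behavior, not Discourse’s actual Ruby implementation): do all the work in a temp dir, publish the archive only on success, and let a trap clean up no matter how the job exits.

```shell
set -e
final_dir=$(mktemp -d)          # stand-in for the real backups directory
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT       # cleanup fires even if a later step blows up
echo "compressed bytes" > "$tmp/site.tar.gz"   # stand-in for the tar + gzip steps
mv "$tmp/site.tar.gz" "$final_dir/"            # publish only when everything worked
ls "$final_dir"                                # -> site.tar.gz
```

If the gzip stand-in failed, `set -e` would abort before the `mv`, the trap would remove the temp dir, and nothing half-finished would ever appear in the backups folder.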


(Jay Pfaffman) #6

Well, the site is up to date.

No. For a while I thought the problem was that it was writing to /tmp. I’ve got a whole separate partition just for backups, so now the site doesn’t crash when the backup fills the disk, but . . . what @falco just said. It could be complicated by having backups somewhere else, like this:

```yaml
  - volume:
      host: /mnt/backups
      guest: /shared/backups
```

If you’ll point me to the file where the script is (about 30% of the time it’s exactly where I think it’ll be), I’ll check, and if I can figure it out (unless it’s something bizarre, finding the script is 90% of the problem for me) I’ll submit a PR.


(Sam Saffron) #8

Sure, we want to fix this, but we need a proper repro of the issue with very careful and consistent steps.
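One way to get a deterministic local repro without filling a real disk is to cap the output file size so gzip dies partway through — the same shape of failure as a full partition. This is a simulation under assumptions, not the exact Discourse repro steps:

```shell
workdir=$(mktemp -d)
# 64 KB of incompressible data, standing in for the 13 GB .tar
dd if=/dev/urandom of="$workdir/backup.tar" bs=1024 count=64 2>/dev/null
# cap output files at 8 KB (16 x 512-byte blocks) so gzip fails mid-write,
# roughly what "no space left on device" does to it
( ulimit -f 16; gzip "$workdir/backup.tar" ) 2>/dev/null || true
ls "$workdir"   # backup.tar survives: gzip only removes its input on success
```

The leftover .tar after a failed compression is exactly the symptom reported here; what remains is tracing why Discourse’s cleanup doesn’t fire on that error path.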

I believe this somehow relates to running out of disk space, perhaps followed by a server crash; we may need to add cleanup code that runs either at boot or when the next backup starts.


(Andrew Waugh) #9

Bump.

The .tar and the final .tar.gz both appear in /var/discourse/shared..../backup while the backup is running.

The .tar.gz is visible in the web interface at /admin/backups while the uploads are being added to it (and its size ticks upward).

When it runs out of space, the .tar.gz disappears from /admin/backups, but the .tar file is still there, and the space is not reclaimed (this is 4 hours after the backup failed).
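A sketch for spotting those orphaned .tar files so the space can be reclaimed by hand. The default path is an assumption, so point `BACKUPS_DIR` at your actual backups directory, and review the list before deleting anything:

```shell
# List orphaned uncompressed archives; remove them manually after checking the list.
backups=${BACKUPS_DIR:-/var/discourse/shared/standalone/backups/default}
find "$backups" -maxdepth 1 -name '*.tar' -print
```

The `-name '*.tar'` pattern matches only the leftover uncompressed archives, not the finished .tar.gz backups, so nothing Discourse can actually use shows up in the list.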