I’ve got a site that’s pushing a 20GB backup to Wasabi S3. It works. Most of the time.
But sometimes it fails to upload to S3 and keeps the local .tar.gz. Eventually the disk fills and I’m left with the uncompressed .tar file (because there wasn’t enough space for the compressed version) and, soon after, a broken site because the disk is full.
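In the meantime I’ve been keeping an eye on orphaned archives with a little script along these lines. It’s only a sketch: the backup path and the 48-hour threshold are my guesses for a standard docker install, and it only reports, it doesn’t delete anything:

```python
#!/usr/bin/env python3
"""Report backup archives left behind by failed uploads.

Sketch only: the backup directory and the 48-hour threshold are
assumptions for a standard docker-based install; adjust as needed.
"""
import time
from pathlib import Path

BACKUP_DIR = Path("/var/discourse/shared/standalone/backups/default")
MAX_AGE_HOURS = 48  # anything older than this is probably an orphan

now = time.time()
for archive in sorted(BACKUP_DIR.glob("*.tar*")):
    age_hours = (now - archive.stat().st_mtime) / 3600
    size_gb = archive.stat().st_size / 1024 ** 3
    flag = "STALE" if age_hours > MAX_AGE_HOURS else "ok"
    print(f"{flag:5} {archive.name}  {size_gb:.1f} GB  {age_hours:.0f}h old")
```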
Before I punt on Wasabi, I’d like to try to see if there are any clues.
I’ve looked in production.log, production.errors, and the Sidekiq and Unicorn logs and don’t see “backu” anywhere, either on the day the backup failed or when it worked. Shouldn’t there be a log somewhere?
You should get a PM with the log output if it fails. It’s sent either directly to you if it’s a manual backup in the UI or to the admins group if it is an automatic backup.
An exception during the backup should also show up in /logs and, I think, in one of the log files as well. Try searching for EXCEPTION:
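If grepping by hand gets tedious, something like this sketch can sweep the rails log directory for both strings at once. The log path is an assumption based on a standard docker install, so adjust it to wherever your logs actually live:

```python
#!/usr/bin/env python3
"""Sweep the rails logs for backup-related lines and exceptions.

Sketch only: the log directory assumes a standard docker-based
install; adjust to your setup.
"""
from pathlib import Path

LOG_DIR = Path("/var/discourse/shared/standalone/log/rails")
NEEDLES = ("exception", "backu")  # "backu" also catches backup/backups

for log_file in sorted(LOG_DIR.glob("*.log*")):
    try:
        text = log_file.read_text(errors="replace")
    except OSError as err:
        print(f"skipping {log_file}: {err}")
        continue
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(needle in line.lower() for needle in NEEDLES):
            print(f"{log_file.name}:{lineno}: {line.strip()}")
```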
But the fact that it keeps temporary files around makes me wonder whether Sidekiq, or even Docker or the host, gets restarted during the backup. That would explain why the cleanup doesn’t run and why you aren’t getting a PM.
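One way to test that theory is to line up the leftover archive’s timestamp against the host’s boot time and the container’s start time, roughly like this (the archive path and the container name `app` are assumptions):

```python
#!/usr/bin/env python3
"""Print the leftover archive's mtime next to the host boot time and
the app container's start time, to see if a restart interrupted the
backup. The archive path and the container name "app" are assumptions.
"""
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path

# Point this at the orphaned archive on your install.
ARCHIVE = Path("/var/discourse/shared/standalone/backups/default/leftover.tar")

archive_time = datetime.fromtimestamp(ARCHIVE.stat().st_mtime, tz=timezone.utc)

# Host boot time, derived from /proc/uptime (Linux only).
with open("/proc/uptime") as f:
    uptime_seconds = float(f.read().split()[0])
boot_time = datetime.fromtimestamp(time.time() - uptime_seconds, tz=timezone.utc)

# Container start time via `docker inspect` (assumes the container is named "app").
started_at = subprocess.check_output(
    ["docker", "inspect", "--format", "{{.State.StartedAt}}", "app"],
    text=True,
).strip()

print(f"archive written:   {archive_time.isoformat()}")
print(f"host booted:       {boot_time.isoformat()}")
print(f"container started: {started_at}")
```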
Right. This is very odd. I didn’t get a failure notice, even for the run that left only a .tar and an almost-full disk (it’s an up-to-date site on tests-passed).
It’s as if the backup location just got changed on those days, but there’s nothing in the logs. I see “successful” notifications in admin messages for backups initiated from the web interface, but no failures. I’ve moved backup_location to an ENV setting.
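For what it’s worth, here’s the quick check I’m using to confirm the override is actually visible inside the container. It assumes the container is named `app` and that the setting is shadowed by a `DISCOURSE_BACKUP_LOCATION` environment variable, so treat it as a sketch:

```python
#!/usr/bin/env python3
"""Confirm the backup_location override is set inside the app container.

Sketch only: assumes the container is named "app" and that the site
setting maps to a DISCOURSE_BACKUP_LOCATION environment variable.
"""
import subprocess

env_dump = subprocess.check_output(["docker", "exec", "app", "env"], text=True)
matches = [line for line in env_dump.splitlines() if "BACKUP_LOCATION" in line]
print("\n".join(matches) if matches else "no BACKUP_LOCATION override found")
```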