Backups interrupted after changing site name


(Wes Osborn) #1

I just noticed today that after our upgrade to 0.9.9.6, our backup process has not been working properly. We just upgraded to 0.9.9.7 today and we’re still having issues.

If I launch a manual backup, the process runs and doesn’t generate any errors (here is the tail of the log):

[2014-06-10 17:39:06] Creating archive: beta-clc-discussion-and-email-list-archive-2014-06-10-213905.tar.gz
[2014-06-10 17:39:06] Making sure archive does not already exist...
[2014-06-10 17:39:06] Creating empty archive...
[2014-06-10 17:39:06] Archiving metadata...
[2014-06-10 17:39:06] Archiving data dump...
[2014-06-10 17:39:06] Archiving uploads...
[2014-06-10 17:39:07] Gzipping archive...
[2014-06-10 17:39:08] Executing the after_create_hook for the backup
[2014-06-10 17:39:11] Removing old backups...
[2014-06-10 17:39:11] Notifying 'wosborn' of the end of the backup...
[2014-06-10 17:39:12] Cleaning stuff up...
[2014-06-10 17:39:12] Removing tmp '/var/www/discourse/tmp/backups/default/2014-06-10-213905' directory...
[2014-06-10 17:39:12] Unpausing sidekiq...
[2014-06-10 17:39:12] Marking backup as finished...
[2014-06-10 17:39:12] Finished!

When I go back to the list of backups, I don’t see the backup listed there and I don’t see that any backups have run since 6/3 when we upgraded to 0.9.9.6. When I look on S3, I don’t see our backup file listed in our bucket either. I also don’t see the backup files listed in:

/var/docker/shared/standalone/backups/default

If I look in /logs I don’t see anything that appears to correlate to the backup time; the only thing I see is:

ActiveRecord::RecordNotFound (Couldn't find Upload with 'id'=10)

In case there is any question, here is a readout of the local disk usage on the server hosting Discourse:

administrator@proddiscourse:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       123G  9.9G  107G   9% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            986M  4.0K  986M   1% /dev
tmpfs           200M  264K  199M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            997M  1.3M  995M   1% /run/shm
none            100M     0  100M   0% /run/user
/dev/sda1       511M  3.4M  508M   1% /boot/efi

Any ideas what might be happening and why it appears that I’m getting successful backups according to the backup UI although they aren’t appearing in S3 or the local file system?

Have others who have upgraded to 0.9.9.6 checked to make sure that their backups are indeed still running?


(Jeff Atwood) #2

I update discourse.codinghorror.com very regularly (about once a day from latest) and I see no interruption to daily backups.

Any idea on how to troubleshoot this further @zogstrip?


(Wes Osborn) #3

Ok, I’ve got a theory. I went in and manually deleted one of the old backups, then ran the process again, and it worked!

Then I tried to run the process again and it did not work.

We changed our sitename between backups (added a beta label), so now the backup prefix name is: beta-clc-discussion-and-email-list whereas it used to be: clc-discussion-and-email-list

Is it possible that the part of the backup process that “clears” the old backups is looking for files with the new sitename to delete and not finding them, so they don’t get deleted and a “slot” is not opened up to store the new backup?
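
To make the theory concrete, here is a minimal Python sketch. It is only an illustration of the suspected behavior, not Discourse's actual cleanup code: the function names, the keep limit of 3, and the three older filenames are made up for the example. Only the two prefixes and the newest filename come from this thread.

```python
import re

# Backup names in this thread end with a timestamp such as 2014-06-10-213905.tar.gz
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}-\d{6})\.tar\.gz$")


def timestamp_of(filename):
    """Return the trailing timestamp string, or "" if the name has none."""
    match = TIMESTAMP.search(filename)
    return match.group(1) if match else ""


def to_delete_by_prefix(filenames, current_prefix, keep):
    """Hypothetical cleanup that only considers files named after the current site title."""
    candidates = [f for f in filenames if f.startswith(current_prefix + "-")]
    candidates.sort(key=timestamp_of)
    return candidates[:max(len(candidates) - keep, 0)]


def to_delete_any_prefix(filenames, keep):
    """Hypothetical cleanup that considers every backup, whatever its prefix."""
    candidates = sorted(filenames, key=timestamp_of)
    return candidates[:max(len(candidates) - keep, 0)]


# Three made-up backups from before the rename, plus the new one from the log above.
old_backups = [
    "clc-discussion-and-email-list-2014-06-01-213905.tar.gz",
    "clc-discussion-and-email-list-2014-06-02-213905.tar.gz",
    "clc-discussion-and-email-list-2014-06-03-213905.tar.gz",
]
new_backup = "beta-clc-discussion-and-email-list-2014-06-10-213905.tar.gz"
all_backups = old_backups + [new_backup]

print(to_delete_by_prefix(all_backups, "beta-clc-discussion-and-email-list", keep=3))
print(to_delete_any_prefix(all_backups, keep=3))
```

Under these assumptions, the prefix-based version finds nothing to delete because none of the old files match the new name, while the prefix-agnostic version would remove the oldest pre-rename backup and free up a slot.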


(Régis Hanol) #4

How did you rename your site?


(Wes Osborn) #5

By changing the title in the settings.


(Wes Osborn) #6

Just to let folks know, manually deleting all the previous backups has solved the problem for me (the automated backups are running properly again). So if you’re having this issue too, that workaround should do the trick for you.

But I’m assuming that the Discourse team will try to straighten out the backup routine so that others don’t get caught by this in the future.

