Out of the blue our backups have started failing. I’ve updated to the very latest version of Discourse and still have the same issue. Everything else appears to be working correctly.
System reports the following backup log… which I’m hoping someone might be able to explain:
BackupRestore::BackupStore::StorageError
/var/www/discourse/lib/backup_restore/s3_backup_store.rb:70:in `rescue in unsorted_files'
/var/www/discourse/lib/backup_restore/s3_backup_store.rb:58:in `unsorted_files'
/var/www/discourse/lib/backup_restore/backup_store.rb:21:in `files'
/var/www/discourse/lib/backup_restore/backup_store.rb:26:in `latest_file'
/var/www/discourse/app/jobs/scheduled/schedule_backup.rb:11:in `execute'
/var/www/discourse/app/jobs/base.rb:137:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/rails_multisite-2.0.4/lib/rails_multisite/connection_management.rb:63:in `with_connection'
/var/www/discourse/app/jobs/base.rb:127:in `block in perform'
/var/www/discourse/app/jobs/base.rb:123:in `each'
/var/www/discourse/app/jobs/base.rb:123:in `perform'
/var/www/discourse/app/jobs/base.rb:185:in `perform'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/mini_scheduler-0.8.1/lib/mini_scheduler/manager.rb:81:in `process_queue'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/mini_scheduler-0.8.1/lib/mini_scheduler/manager.rb:29:in `block in initialize'
Going to the web admin screen for backups gives me this error… which is not much to go on:
Check /logs for errors or click on the /admin/backups.json link in the error page to get additional details about the error.
I assume you are using S3 for your backups:
Make sure that the backup bucket is in the correct region as set in the s3_region site setting.
Make sure that the user and bucket have the right permissions. Carefully read Setting up file and image uploads to S3 and check your settings on S3 and in Discourse.
$ date
Thu Feb 14 05:32:16 EST 2019
$ ntpdate ntp.ubuntu.com
14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec
$ date
Thu Feb 14 05:27:02 EST 2019
So that was promising… but backup view still dead with the same error. So then I restarted the app:
It’s a DigitalOcean droplet running vanilla Ubuntu. It comes back up… still not working, and the time offset is gone:
$ ntpdate ntp.ubuntu.com
14 Feb 05:36:42 ntpdate[2375]: adjust time server 91.189.91.157 offset -0.001713 sec
I was convinced the 5-minute time correction would be key. Is it possible it’s just stuck now?
Any other ideas?
No. Separate bucket on S3 for backups; set-up unchanged for as long as anyone can remember.
So I couldn’t find anything interesting in the logs on the host, but I did look at the Discourse logs and noticed that they report a slightly different error from /admin/backups.json:
Failed to list backups from S3: The difference between the request time and the current time is too large.
Notice the extra prefix: Failed to list backups from S3
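That message means the timestamp on the request and the time S3 sees are too far apart. A minimal sketch of how the gap could be measured, assuming you have a Date header from any S3 response (for example from a HEAD request) and comparing it with the local clock. The function names are hypothetical, and the 300-second limit is only the "5 minutes" figure mentioned in this thread, not a value taken from AWS documentation:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: limit based on the "5 minutes" mentioned in this thread,
# not on a documented AWS value.
MAX_SKEW_SECONDS = 300.0


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Return local time minus server time, in seconds.

    date_header is an HTTP Date header value such as
    "Thu, 14 Feb 2019 10:26:58 GMT"; local_now must be timezone-aware.
    A positive result means the local clock is ahead of the server.
    """
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def is_too_skewed(skew_seconds: float, limit: float = MAX_SKEW_SECONDS) -> bool:
    """True when the absolute skew exceeds the assumed limit."""
    return abs(skew_seconds) > limit


# Example with made-up timestamps: local clock 6 min 48 s ahead of the server.
local = datetime(2019, 2, 14, 10, 33, 46, tzinfo=timezone.utc)
skew = clock_skew_seconds("Thu, 14 Feb 2019 10:26:58 GMT", local)
print(skew, is_too_skewed(skew))
```

In practice local_now would be datetime.now(timezone.utc) on the Discourse host, and the header would come from a live response.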
I was dealing with two Discourse forums with similar IPs: updating the time on one and checking the backups on the other. Looks like both are working now! Huzzah
AWS rejects requests once the server’s clock falls out of sync by 5 minutes or more.
The fix on Ubuntu was to reset the clock with ntpdate:
$ date
Thu Feb 14 05:32:16 EST 2019
$ ntpdate ntp.ubuntu.com
14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec
$ date
Thu Feb 14 05:27:02 EST 2019
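To catch this before backups fail again, the offset that ntpdate prints could be checked automatically. A rough sketch, assuming the output keeps the "offset N sec" format shown above. The function names are hypothetical, and the 300-second threshold is only the "5 minutes" figure from this thread:

```python
import re

# Matches the "offset -408.660323 sec" part of an ntpdate output line.
OFFSET_RE = re.compile(r"offset\s+([-+]?\d+(?:\.\d+)?)\s+sec")


def ntpdate_offset(line: str) -> float:
    """Extract the reported clock offset, in seconds, from one ntpdate line."""
    match = OFFSET_RE.search(line)
    if match is None:
        raise ValueError("no offset found in: " + line)
    return float(match.group(1))


def needs_clock_reset(line: str, limit_seconds: float = 300.0) -> bool:
    """True when the absolute offset reaches the assumed 5-minute limit."""
    return abs(ntpdate_offset(line)) >= limit_seconds


before = "14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec"
after = "14 Feb 05:36:42 ntpdate[2375]: adjust time server 91.189.91.157 offset -0.001713 sec"
print(needs_clock_reset(before), needs_clock_reset(after))
```

The two sample lines are the ntpdate outputs posted earlier in the thread: the first is the drifted clock, the second is after the correction.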