Backups have started failing due to server time being wrong


(Geoff Bowers) #1

Out of the blue our backups have started failing. I’ve updated to the very latest version of Discourse and still have the same issue. Everything else appears to be working correctly.

System reports the following backup log… which I’m hoping someone might be able to explain:

BackupRestore::BackupStore::StorageError
/var/www/discourse/lib/backup_restore/s3_backup_store.rb:70:in `rescue in unsorted_files'
/var/www/discourse/lib/backup_restore/s3_backup_store.rb:58:in `unsorted_files'
/var/www/discourse/lib/backup_restore/backup_store.rb:21:in `files'
/var/www/discourse/lib/backup_restore/backup_store.rb:26:in `latest_file'
/var/www/discourse/app/jobs/scheduled/schedule_backup.rb:11:in `execute'
/var/www/discourse/app/jobs/base.rb:137:in `block (2 levels) in perform'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/rails_multisite-2.0.4/lib/rails_multisite/connection_management.rb:63:in `with_connection'
/var/www/discourse/app/jobs/base.rb:127:in `block in perform'
/var/www/discourse/app/jobs/base.rb:123:in `each'
/var/www/discourse/app/jobs/base.rb:123:in `perform'
/var/www/discourse/app/jobs/base.rb:185:in `perform'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/mini_scheduler-0.8.1/lib/mini_scheduler/manager.rb:81:in `process_queue'
/var/www/discourse/vendor/bundle/ruby/2.5.0/gems/mini_scheduler-0.8.1/lib/mini_scheduler/manager.rb:29:in `block in initialize'

Going to the web admin screen for backups gives me this error… which is not much to go on:

Are there any other logs I should be looking at to shed light on the problem?


(Jeff Atwood) #2

My guess is you have weird S3 permissions / buckets set up?


(Jay Pfaffman) #3

Do you have uploads and backups in the same bucket?


(Gerhard Schlager) #4

  • Check /logs for errors or click on the /admin/backups.json link in the error page to get additional details about the error.

  • I assume you are using S3 for your backups:

    • Make sure that the backup bucket is in the correct region as set in the s3_region site setting.

    • Make sure that the user and bucket have the right permissions. Carefully read Setting up file and image uploads to S3 and check your settings on S3 and in Discourse.


(Geoff Bowers) #5

Weird… could have sworn I tried to click that link before. It yields more mystery :wink:

{"errors":["The difference between the request time and the current time is too large."]}

(Jeff Atwood) #6

Is the server clock time off?
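
One rough way to answer that question without touching the clock is to compare the local time with the Date header of an HTTPS response. This is only a sketch: the helper names are made up for illustration, the endpoint URL in the usage comment is an assumption, and it relies on curl and GNU date being installed.

```shell
#!/bin/sh
# Sketch: estimate clock skew from an HTTP Date header.
# skew_seconds and remote_epoch are hypothetical helper names.

# Absolute difference, in seconds, between two epoch timestamps.
skew_seconds() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Fetch the Date header from a URL and print it as epoch seconds.
# Needs network access, curl, and GNU date (for `date -d`).
remote_epoch() {
    header=$(curl -sI "$1" | tr -d '\r' | sed -n 's/^[Dd]ate: //p' | head -n 1)
    [ -n "$header" ] && date -u -d "$header" +%s
}

# Usage (the URL is an assumption; any reliable HTTPS endpoint would do):
#   remote=$(remote_epoch https://s3.amazonaws.com) &&
#     echo "skew: $(skew_seconds "$(date +%s)" "$remote")s"
```

A skew of a few seconds is normal with this method, since the header only has one-second resolution and the request itself takes time. A skew of several minutes would be consistent with the error above.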


(Geoff Bowers) #7

Well I tried the following:

$ date
Thu Feb 14 05:32:16 EST 2019
$ ntpdate ntp.ubuntu.com
14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec
$ date
Thu Feb 14 05:27:02 EST 2019

So that was promising… but backup view still dead with the same error. So then I restarted the app:

$ ./launcher restart app
+ /usr/bin/docker stop -t 10 app
app

starting up existing container
+ /usr/bin/docker start app
app

Still no dice :game_die: :game_die:

$ reboot
The system is going down for reboot NOW!

It’s a DigitalOcean droplet running vanilla Ubuntu. Comes back up… still not working and the time offset is gone:

$ ntpdate ntp.ubuntu.com
14 Feb 05:36:42 ntpdate[2375]: adjust time server 91.189.91.157 offset -0.001713 sec

I was convinced the 5 minute time correction would be key. Is it possible it’s just stuck now?

Any other ideas?


No. Separate bucket on S3 for backups; set-up unchanged for as long as anyone can remember.


So I couldn’t find anything interesting in the logs on the host, but I did look at the Discourse logs and noticed that they report a slightly different error from /admin/backups.json:

Failed to list backups from S3: The difference between the request time and the current time is too large.

Notice the extra prefix: Failed to list backups from S3


(Gerhard Schlager) #8

I’ve seen a lot of different errors from S3, but this one is new. Is your server using a timezone other than UTC? If so, try changing it to UTC.


(Geoff Bowers) #9

Well I’ve tried resetting the NTP to point at AWS servers… and just about everything else here:

Changed the AWS access key… it’s starting to get my goat.


(Geoff Bowers) #10

Argh.

Was dealing with two Discourse forums with similar IPs. Updating the time on one and checking the backups on the other. Looks like both are working now! Huzzah :raised_hands:

AWS rejects requests once the server clock falls out of sync by 5 minutes or more.

Fix for Ubuntu was to reset the clock.

$ date
Thu Feb 14 05:32:16 EST 2019
$ ntpdate ntp.ubuntu.com
14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec
$ date
Thu Feb 14 05:27:02 EST 2019
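
To catch this kind of drift before backups start failing, the offset that ntpdate prints could be checked against a limit. This is a minimal sketch, not a tested monitoring setup: the function names are hypothetical, and the 300-second limit simply mirrors the 5 minutes mentioned above, so the limit that actually applies may differ.

```shell
#!/bin/sh
# Sketch: flag a clock offset that exceeds a limit.
# ntp_offset and offset_exceeds are hypothetical helper names.

# Extract the offset (in seconds, sign preserved) from an ntpdate output
# line such as:
#   14 Feb 05:26:58 ntpdate[12447]: step time server 91.189.91.157 offset -408.660323 sec
ntp_offset() {
    printf '%s\n' "$1" | sed -n 's/.* offset \(-\{0,1\}[0-9.]*\) sec.*/\1/p'
}

# Succeed (exit status 0) when the absolute offset exceeds the limit.
# $1 = offset in seconds, $2 = limit in seconds.
offset_exceeds() {
    awk -v o="$1" -v max="$2" 'BEGIN { if (o < 0) o = -o; exit !(o > max) }'
}

# Usage (the 300-second limit is an assumption taken from this thread;
# `ntpdate -q` only queries the server and does not set the clock):
#   line=$(ntpdate -q ntp.ubuntu.com | tail -n 1)
#   if offset_exceeds "$(ntp_offset "$line")" 300; then
#       echo "clock is off by more than 5 minutes" >&2
#   fi
```

With the first session above, the extracted offset is -408.660323 seconds, which is over the assumed limit. The offset of -0.001713 seconds seen after the reboot is well under it.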
