Hi all,
We have run a self-hosted Discourse instance at https://discourse.bokeh.org for a number of years. Generally speaking, it has been rock-solid and almost no effort to maintain. In particular, performing updates is usually a complete non-event that finishes without any issues.
However, today, after an update to 2.7beta7 (which seemed to complete without issue), our site has completely imploded. It limped along for a bit with pages mis-rendered and JS console errors, but after I attempted a rollback, the UI became non-functional. After logging in to the droplet, I have also tried the following, to no avail:
Rebuild
./launcher rebuild app
This has failed in several ways over several tries.
Discourse Doctor
./discourse-doctor
Restore
./launcher enter app
discourse restore <backup file>
This failed.
Wipe
I also tried doing a “wipe” and then restoring:
./launcher stop app
./launcher destroy app
rm -r /var/discourse/shared/standalone/
After this I was at least able to get a rebuild to succeed, which led to a “fresh install” state, e.g. “Congratulations, you installed Discourse!”
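For anyone repeating the wipe step, a less destructive variant is to rename the data directory instead of deleting it, so the old files stay available for inspection. This is only a minimal sketch: the helper name `set_aside` and the `.bak-<timestamp>` suffix are made up for illustration, and it assumes the container has already been stopped and destroyed as above.

```shell
# Rename a directory with a timestamp suffix instead of deleting it,
# so the old data can still be inspected or copied back later.
# set_aside is a hypothetical helper, not part of Discourse or launcher.
set_aside() {
  # $1: directory to set aside, e.g. /var/discourse/shared/standalone
  src="${1%/}"                                  # drop any trailing slash
  dest="${src}.bak-$(date +%Y%m%d-%H%M%S)"      # e.g. standalone.bak-20210423-235404
  mv -- "$src" "$dest" && printf '%s\n' "$dest" # print the new location on success
}
```

Calling `set_aside /var/discourse/shared/standalone/` in place of the `rm -r` should leave the next rebuild with an empty data directory in the same way, while keeping the previous one on disk.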
So now I have tried running discourse restore again, but it has failed again:
EXCEPTION: 1 posts are not remapped to new S3 upload URL. S3 migration failed for db 'default'.
/var/www/discourse/lib/file_store/to_s3_migration.rb:131:in `raise_or_log'
/var/www/discourse/lib/file_store/to_s3_migration.rb:86:in `migration_successful?'
/var/www/discourse/lib/file_store/to_s3_migration.rb:357:in `migrate_to_s3'
/var/www/discourse/lib/file_store/to_s3_migration.rb:65:in `migrate'
/var/www/discourse/lib/file_store/s3_store.rb:240:in `copy_from'
/var/www/discourse/lib/backup_restore/uploads_restorer.rb:62:in `restore_uploads'
/var/www/discourse/lib/backup_restore/uploads_restorer.rb:44:in `restore'
/var/www/discourse/lib/backup_restore/restorer.rb:62:in `run'
script/discourse:145:in `restore'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.1.0/lib/thor/command.rb:27:in `run'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.1.0/lib/thor/invocation.rb:127:in `invoke_command'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.1.0/lib/thor.rb:392:in `dispatch'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/thor-1.1.0/lib/thor/base.rb:485:in `start'
script/discourse:286:in `<main>'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli/exec.rb:63:in `load'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli/exec.rb:63:in `kernel_load'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli/exec.rb:28:in `run'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli.rb:494:in `exec'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli.rb:30:in `dispatch'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/cli.rb:24:in `start'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/exe/bundle:49:in `block in <top (required)>'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/lib/bundler/friendly_errors.rb:130:in `with_friendly_errors'
/usr/local/lib/ruby/gems/2.7.0/gems/bundler-2.2.7/exe/bundle:37:in `<top (required)>'
/usr/local/bin/bundle:23:in `load'
/usr/local/bin/bundle:23:in `<main>'
Trying to rollback...
Rolling back...
Cleaning stuff up...
Dropping functions from the discourse_functions schema...
Removing tmp '/var/www/discourse/tmp/restores/default/2021-04-23-235404' directory...
Marking restore as finished...
Notifying 'system' of the end of the restore...
Finished!
[FAILED]
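Most of the trace above is thor and bundler plumbing; the informative part is the exception message and the frames under /var/www/discourse/lib. A minimal sketch for pulling just those lines out, assuming the restore output was saved to a file (the function name `app_frames` and the file name `restore.log` are placeholders, not anything Discourse provides):

```shell
# Print the exception message plus only the Discourse application frames
# from a saved restore log, dropping the thor/bundler frames.
app_frames() {
  # $1: path to a file containing the restore output (hypothetical name)
  grep -E '^(EXCEPTION:|/var/www/discourse/lib/)' "$1"
}
```

Usage would be something like `app_frames restore.log`, which makes it easier to see that the failure is raised from the S3 upload migration step inside the uploads restorer.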
What’s odd is that during the restore, the site seemed to be getting back to normal, with old content showing up. Then the failure happened, and now nothing shows up, accounts are gone, etc.
I could really use any guidance or suggestions here. We have daily backups going back a week (further back in Glacier if need be). We deleted an unused Category a few days ago; could that be the cause of the problems? I will try an older backup to see, but any pointers to an iron-clad restore process would be welcome.
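To work backwards through the daily backups in order, something like the following could list them newest first. This is a rough sketch that assumes the archives have been downloaded into a local directory as `.tar.gz` files; the function name and the directory argument are placeholders.

```shell
# List backup archives in a directory, newest first by modification time.
# list_backups_newest_first is a hypothetical helper for illustration.
list_backups_newest_first() {
  # $1: directory holding the *.tar.gz backup files (placeholder path)
  ls -1t "$1"/*.tar.gz 2>/dev/null
}
```

Each file name printed can then be tried in turn with discourse restore inside the container, stopping at the first one that restores cleanly.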