I found another backup/restore issue.
When I try to restore a corrupt database dump, the restore (obviously) fails, but it leaves the system in an unusable state: no rollback is attempted and all tables are left in the backup schema.
(EDIT: this doesn’t only happen with corrupted files; it also happens when one tries to restore a dump that was made with a pg_dump that is newer than the current psql.)
Repro:
First let’s create a corrupt dump file for test purposes:
discourse@test16:/var/www/discourse$ echo "-- Dumped by pg_dump version 9.5.12" >> dump.sql
discourse@test16:/var/www/discourse$ echo "broken" >> dump.sql
discourse@test16:/var/www/discourse$ gzip dump.sql
discourse@test16:/var/www/discourse$ mv dump.sql.gz public/backups/db5634/discourse-2018-04-05-174724-v20180308071922.sql.gz
Then restore:
discourse@test16:/var/www/discourse$ RAILS_DB=db5634 RAILS_ENV=production script/discourse restore discourse-2018-04-05-174724-v20180308071922.sql.gz
Starting restore: discourse-2018-04-05-174724-v20180308071922.sql.gz
[STARTED]
'system' has started the restore!
Marking restore as running...
Making sure /var/www/discourse/tmp/restores/db5634/2018-04-05-175722 exists...
Copying archive to tmp directory...
tar: /var/www/discourse/tmp/restores/db5634/2018-04-05-175722/discourse-2018-04-05-174724-v20180308071922.sql: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
No metadata file to extract.
Validating metadata...
Current version: 20180308071922
Restored version: 20180308071922
Cannot restore into different schema, restoring in-place
Enabling readonly mode...
Pausing sidekiq...
Waiting for sidekiq to finish running jobs...
Restoring dump file... (can be quite long)
ERROR: syntax error at or near "broken"
LINE 1: broken
^
EXCEPTION: psql failed
/var/www/discourse/lib/backup_restore/restorer.rb:326:in `restore_dump'
/var/www/discourse/lib/backup_restore/restorer.rb:67:in `run'
script/discourse:111:in `restore'
/home/discourse/.rvm/gems/ruby-2.3.1/gems/thor-0.19.4/lib/thor/command.rb:27:in `run'
/home/discourse/.rvm/gems/ruby-2.3.1/gems/thor-0.19.4/lib/thor/invocation.rb:126:in `invoke_command'
/home/discourse/.rvm/gems/ruby-2.3.1/gems/thor-0.19.4/lib/thor.rb:369:in `dispatch'
/home/discourse/.rvm/gems/ruby-2.3.1/gems/thor-0.19.4/lib/thor/base.rb:444:in `start'
script/discourse:273:in `<main>'
Trying to rollback...
There was no need to rollback
[FAILED]
Restore done.
Note the line near the end stating ‘There was no need to rollback’. Well, there is, since all the tables have been moved to the backup schema.
The culprit is here
https://github.com/discourse/discourse/blob/master/lib/backup_restore/restorer.rb#L439-L447
The problem is that @db_was_changed is never set, because it is set by switch_schema!, and that is never called when can_restore_into_different_schema? is false.
I think the fix is pretty simple: @db_was_changed = true should be inserted below this line?
EDIT: yes that fixes it.
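To make the failure mode concrete, here is a toy model of the in-place restore path. It is only a sketch and not the real restorer.rb: the class ToyRestorer, the apply_fix switch, the tables_in attribute, and the method move_tables_to_backup_schema are all made up for illustration, and the exact position of the flag assignment is an assumption. Only @db_was_changed, restore_dump, and the log messages come from the report above.

```ruby
# Toy model of the in-place restore path. NOT the real Discourse code:
# ToyRestorer, apply_fix, tables_in and move_tables_to_backup_schema
# are hypothetical names used only for illustration.
class ToyRestorer
  attr_reader :log, :tables_in

  def initialize(apply_fix:)
    @apply_fix = apply_fix
    @db_was_changed = false   # the flag that rollback checks
    @tables_in = "public"     # schema currently holding the tables
    @log = []
  end

  # Models only the case where can_restore_into_different_schema? is false.
  def run
    move_tables_to_backup_schema
    # Proposed one-line fix (placement here is an assumption):
    @db_was_changed = true if @apply_fix
    restore_dump
  rescue RuntimeError => e
    @log << "EXCEPTION: #{e.message}"
    rollback
  end

  private

  def move_tables_to_backup_schema
    @tables_in = "backup"
  end

  # Always fails, like psql does on the corrupt dump.
  def restore_dump
    raise "psql failed"
  end

  def rollback
    @log << "Trying to rollback..."
    if @db_was_changed
      @tables_in = "public"   # move the tables back
    else
      @log << "There was no need to rollback"
    end
  end
end

buggy = ToyRestorer.new(apply_fix: false)
buggy.run
puts "without fix: tables in #{buggy.tables_in}"

fixed = ToyRestorer.new(apply_fix: true)
fixed.run
puts "with fix: tables in #{fixed.tables_in}"
```

Without the flag, the rollback step sees nothing to undo and the tables stay in the backup schema; with it, they are moved back to public after the failed psql run.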