Error Restoring Backup on Migration

ariznaf · September 25, 2019, 9:04pm

Could you please provide guidance on which file we have to edit in the backup tar?

kerray · September 26, 2019, 6:43am

There is a dump.sql packed inside the archive. You need to modify it and then repack the archive with the modified version. I’ve solved my other problems too by modifying it: I removed some rogue custom fields that were causing crashes after login.
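The unpack, edit, repack cycle described above could be sketched in shell roughly as follows. This is a minimal sketch, not an official procedure: the function name `edit_backup_dump` is hypothetical, and it assumes the backup is a gzip-compressed tar with an uncompressed `dump.sql` at its top level, which may not hold for every Discourse version. Work on a copy of the backup.

```shell
# Hypothetical helper: unpack a backup, apply one sed edit to dump.sql, repack.
# Assumes a .tar.gz archive with an uncompressed dump.sql at the top level.
edit_backup_dump() {
  local archive="$1" output="$2" sed_expr="$3"
  local workdir

  # tar is run from inside the work directory below, so make the output path absolute.
  case "$output" in
    /*) ;;
    *) output="$PWD/$output" ;;
  esac

  workdir="$(mktemp -d)" || return 1
  tar -xzf "$archive" -C "$workdir" || return 1

  if [ ! -f "$workdir/dump.sql" ]; then
    echo "dump.sql not found in $archive" >&2
    rm -rf "$workdir"
    return 1
  fi

  # -i.bak works with both GNU and BSD sed; the .bak copy is dropped afterwards.
  sed -i.bak -e "$sed_expr" "$workdir/dump.sql" || return 1
  rm -f "$workdir/dump.sql.bak"

  # Repack the same top-level entries (dump.sql plus anything else that was there).
  ( cd "$workdir" && tar -czf "$output" * ) || return 1
  rm -rf "$workdir"
}
```

For example, `edit_backup_dump backup.tar.gz backup-fixed.tar.gz '/some_pattern/d'` would delete every line matching `some_pattern`; for anything less mechanical, opening dump.sql in an editor between the unpack and repack steps is probably the safer route. The backup file name shown later in this thread ends in a `-v<version>` suffix, so keeping that suffix on the repacked file seems safest.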

ariznaf · September 26, 2019, 9:24am

Thank you.

I will try to download the backup, unpack it and change that file following your instructions.

It is quite scary to have to do all that in order to restore a backup.

I suppose it is a bug in the new release.

But backup and restore are keystones of a disaster recovery plan.
They should be as robust as possible, and a bug in those processes has a great impact.

ariznaf · September 30, 2019, 8:07pm

Well I was able to do the restore without changing anything in the backup file.

I just tried several times and, oddly enough, one of the attempts restored with no error.

I was kicked out of Discourse and it did not work until I ran ./launcher rebuild app.

But now it is working correctly.

A strange issue.

Aaron_H · October 9, 2019, 12:57am

This is still giving me trouble restoring my forum from backup. It has been several weeks and the restore from backup functionality appears to still be broken.

Any fix for this?

usulrasolas · October 9, 2019, 1:27am

As far as I can tell, you alternate between updating, checking the formatting of the tables, making sure everything is similar between source and host, and watching it fail multiple times; that might or might not work without some minor database edits.

I have successfully migrated 2 of 3 sites, and am forced to spend less than one hour a day on it for sanity. I have begun talking to the clients about the issues this could cause in the future with any similar situation. shrug

ariznaf · October 9, 2019, 7:24am

I simply kept retrying the restore until I got it working.

The error complains about a column that does not exist in the user profile table.

But it has to be a timeout error or something like that on the database side, maybe a bug on the Postgres side. If the column were really not there, it would not be created on its own just because you keep retrying the restore.

Jaromir says that changing the script solves the issue.

Nobody from the Discourse developers here seems to have been concerned about this issue, but it is a strange and very disturbing error, as it affects your disaster recovery plan.

Maybe the topic has gone unnoticed among the others.

gerhard · October 9, 2019, 3:33pm

It hasn’t gone unnoticed. It will be the first thing I’ll be looking into tomorrow.

And I’m starting to work on improving backups and restores, because nobody should need to worry about those things in case of a disaster or when you simply want to migrate to a new server.

ariznaf · October 9, 2019, 3:41pm

Great. Thank you.
Glad to hear that.

pfaffman · October 9, 2019, 5:00pm

Thanks, Gerhard. I don’t know if you care now, but I’m also having trouble with a site that’s using PG 11 with GCP. It might be worth checking on that as it might affect the future move to PG12 that I understand should happen later this fall.

I just upgraded two instances that share an S3 backup bucket. I ran a backup on one and tried to restore on the other and get

No migration with version number 20191007140446.

gerhard · October 10, 2019, 5:05pm

PostgreSQL 11 and 12 are currently not supported.

gerhard · October 10, 2019, 5:05pm

Okay, I installed the latest version of Discourse (tests-passed) on a droplet and restoring of backups (uploads included, not using S3 for uploads) worked without problems.

If you are still encountering problems during a restore, please do the following:

Rebuild the container:

cd /var/discourse
git pull
./launcher rebuild app

Restore the backup either via web interface or command line:

cd /var/discourse
./launcher enter app
discourse enable_restore
discourse restore <filename>

If it doesn’t work, please post the version number of the backup file you are trying to restore as well as the error message you see during the restore.
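The version number asked for here appears to be embedded in the backup file name, judging by the names posted in this thread (a `-v` followed by 14 digits). Assuming that naming pattern, a small shell helper can pull it out; `backup_version` is a hypothetical name, not a Discourse command.

```shell
# Hypothetical helper: print the 14-digit version embedded in a backup file name.
# Assumes names of the form <site>-<date>-<time>-v<14 digits>.<extension>.
# Prints nothing if the name does not follow that pattern.
backup_version() {
  basename "$1" | sed -n 's/.*-v\([0-9]\{14\}\)\..*/\1/p'
}

# File name taken from a later post in this thread.
backup_version "wonderful-community-2019-10-10-184822-v20191007140446.tar.gz"
# prints 20191007140446
```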

pfaffman · October 10, 2019, 9:15pm

Both sites are 2.4.0.beta6 (8fc0cc9aaa). The backups (but not uploads) are on S3.

discourse restore returns

Starting restore: wonderful-community-2019-10-10-184822-v20191007140446.tar.gz
[STARTED]                                                                              
'system' has started the restore!                               
Marking restore as running...                                                                  
Making sure /var/www/discourse/tmp/restores/default/2019-10-10-211121 exists...             
Downloading archive to tmp directory...                                               
Unzipping archive, this may take a while...
EXCEPTION: Compression::Strategy::ExtractFailed
/var/www/discourse/lib/compression/gzip.rb:49:in `block in extract_file'
/var/www/discourse/lib/compression/gzip.rb:45:in `open'
/var/www/discourse/lib/compression/gzip.rb:45:in `extract_file'

Of course, and I think that site will be satisfied with direct database backups on GCP anyway, but at some point Sam said that he was running PG 11 on his dev site and that he’d be interested to know of problems with PG11.

Roman_Rizzi · October 10, 2019, 9:39pm

@pfaffman Please increase the decompressed_file_max_size_mb site setting (it’s hidden). The default is currently set at 1GB.

I have a PR ready to bump the default to 100GB but it wasn’t merged yet:

https://github.com/discourse/discourse/pull/8179
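Before raising the limit, it may help to know how large the archive actually is once decompressed. The sketch below streams a `.gz` file through `gzip -dc` and reports the size in whole mebibytes, rounded up. `decompressed_size_mb` is a hypothetical helper, and whether the site setting counts megabytes in exactly this way is an assumption.

```shell
# Hypothetical helper: decompressed size of a gzip file in MiB, rounded up.
# Streams the data rather than reading the gzip trailer, which stores the
# uncompressed size modulo 4 GiB.
decompressed_size_mb() {
  local bytes
  bytes="$(gzip -dc "$1" | wc -c | tr -d ' ')"
  echo $(( ($bytes + 1048575) / 1048576 ))
}
```

Comparing the result for a backup against the current value of `decompressed_file_max_size_mb` gives a hint as to whether the limit is the likely cause of the `ExtractFailed` exception.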

pfaffman · October 10, 2019, 11:29pm

Thanks, @Roman_Rizzi. Well, that solved that problem.

But now I’ve got a bunch of invalid command \N errors (and they filled the buffer before I could get what came before them), but maybe

ERROR:  syntax error at or near "Shiny"        
LINE 1: Shiny contest submission 2019-01-07 20:00:05.570573 2019-01-...
^       
EXCEPTION: psql failed
/var/www/discourse/lib/backup_restore/restorer.rb:324:in `restore_dump'
/var/www/discourse/lib/backup_restore/restorer.rb:75:in `run'

is what you need to know.
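Long runs of `invalid command \N` are typically follow-on noise: once an earlier statement (often a `COPY`) fails, psql tries to interpret the data rows that follow as commands. The first error is usually the informative one, so one option is to capture the restore output to a file and search it afterwards. A minimal sketch, assuming the `discourse restore` command shown earlier writes its progress to the terminal; `first_restore_error` and the log path are hypothetical.

```shell
# Capture everything the restore prints (run inside the app container):
#   discourse restore <filename> 2>&1 | tee /tmp/restore.log

# Hypothetical helper: print the first ERROR/EXCEPTION line with its line number.
first_restore_error() {
  grep -n -E '^(ERROR|EXCEPTION):' "$1" | head -n 1
}
```

Running `first_restore_error /tmp/restore.log` after a failed restore would then show the earliest failure even when the terminal buffer has scrolled past it.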

gerhard · October 11, 2019, 7:50am

Yes, I believe that’s caused by PG11.

pfaffman · October 11, 2019, 8:51am

If it were the pg11 instance I’d agree! But this is a standard 2 container install.

Wait! There is a version mismatch.

root@community:/var/discourse# ./launcher enter data
root@staging-data:/# psql --version
psql (PostgreSQL) 10.7 (Ubuntu 10.7-1.pgdg16.04+1)

The one I’m restoring on is 10.9! I bet that’s it. (I think the pg11 fails similarly but there I’m trying to restore on the same instance).

I’ll upgrade the data containers tomorrow and let you know. Thanks for your help.
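To compare the PostgreSQL client versions on the source and target data containers, the `psql --version` output shown in this post can be reduced to the bare version number. A small sketch; `pg_version_of` is a hypothetical helper and assumes the output format shown here.

```shell
# Hypothetical helper: extract "10.7" from a line like
#   psql (PostgreSQL) 10.7 (Ubuntu 10.7-1.pgdg16.04+1)
pg_version_of() {
  printf '%s\n' "$1" | sed -n 's/^psql (PostgreSQL) \([0-9][0-9.]*\).*/\1/p'
}

# Run `psql --version` in each data container and compare the two results.
pg_version_of "psql (PostgreSQL) 10.7 (Ubuntu 10.7-1.pgdg16.04+1)"
# prints 10.7
```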

pfaffman · October 11, 2019, 5:29pm

Well, I upgraded both to 10.10 (using the standard data templates) but still got the invalid command stuff.

When the invalid command errors started I force-quit the restore script. Further attempts to restore (to get the first error before the invalid command messages) resulted in:

ActiveRecord::StatementInvalid: PG::UndefinedTable: ERROR:  relation "theme_fields" does not exist

I then did a rake db:migrate on both instances, backed up again and the restore succeeded. Maybe a migration got missed somewhere along the way?

(after changing the setting mentioned above–here are complete instructions for those who might need them in the tiny amount of time before it’s unnecessary)

./launcher enter app
rails c
SiteSetting.decompressed_file_max_size_mb=1000000

pfaffman · October 15, 2019, 3:03am

I just had another one fail. This time both are 2.4.0.beta6 (one is 2c011252f1, the other may be a bit more recent).

I’m restoring via S3. I’ve tried both with and without uploads. Both seemed to be working and then failed like this:

...
COPY 11871
COPY 3689
COPY 0
COPY 36550
COPY 0 
COPY 14736
/usr/local/bin/discourse: line 2:  3232 Killed                  RAILS_ENV=production sudo -H -E -u discourse bundle exec script/discourse "$@"

RGJ · October 15, 2019, 7:32am

Is this the only message you’re getting?

What if you try to remove any s3 dependency and copy the backup file to local first?

@pfaffman it might be good to know that the two (or three) restore issues you have posted in this topic are not occurrences of the bug that this topic was originally about (the PG::UndefinedColumn: ERROR issue). You might consider opening new topics for these since they are clearly different issues.
