Finding UI generated backup and restoring site

Hi Discourse,

Last night I was pushing through the Discourse upgrades and rebuilt the app, which resulted in a host of Postgres errors. I realized this was a result of the recent upgrade, but kept getting permission denied errors, among other things (and yes, I chowned everything to 700 so it wasn’t global). So I moved my original /var/discourse somewhere that was supposed to be temporary and reinstalled a fresh instance of Discourse to try and at least get postgres up to date.

Here’s where it gets fun. I had a backup of the site (DB only, uploads are saved to a different volume) generated by the UI from three days ago. Or at least, I thought I did. What I have now is a file called wacky-writers-forum-2021-04-06-033906-v20210328233843.sql.gz which I think I’ve learned is not, in fact, the tar.gz file the actual backup should be in.

I have everybody redirected to a landing page currently, and I’m hoping someone may be able to tell me that it is still possible to retrieve the actual .tar.gz file from the server from 3 days ago, and how, exactly, I should go about doing that.

I have my backups and uploads saving to Digital Ocean block storage, and I still have the discourse folder from my old install that was functional, but moving/copying it back over to /var/discourse just breaks everything all over again, including throwing postgres errors. I’ve been working on this for 9 hours straight and I’m just about at my wits’ end. Can anybody help me, or at least try to point me in the right direction? :pray: We just hit our 1k user mark and I would really really like to try and avoid losing all of that.

Edited to amend my upload setup.

If you have your S3 configuration in the app.yml then you can just do a command line restore and it’ll pull the backup from S3.
Since you have your assets in S3, the backup contains only the database.

You should just be able to clone a new /var/discourse, copy your yml file, rebuild, and do the command line restore.

Using Object Storage for Uploads (S3 & Clones)

Restore a backup from command line

I guess I’m using the wrong term for how my backups/uploads are set up. I used this method: Move Uploads and Backups to DigitalOcean Block Storage

I’ll amend that and say my uploads and backups aren’t local to the main discourse folder (it’s partially how this all got started, I was working on trying to move us to DigitalOcean Spaces). So, no, unfortunately, I don’t have any of the S3 configurations done since I was just saving it to mounted storage.

The backups were being saved in mnt/my_storage/shared/standalone, but when I go to look for backups in there, all I have is the wacky-writers-forum-2021-04-06-033906-v20210328233843.sql.gz file. I did actually try to restore from that for lack of a better idea (which was probably wrong), but I got a permission denied error. I’m sure it’s something to do with how those backups are actually generated.

So are your uploads still on the DO block storage?

Yes, all the uploads are intact.

Ok, good.

So in that case you should be able to restore the SQL file, and then re-mount the block storage volume to get your uploads back.

There are two kinds of backups: sql.gz which does not include uploads, and tar.gz which does include uploads. So you had the wrong kind of backup but the fact that you had the uploads on an external volume saved your butt.
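As a quick illustration of the distinction above, the backup type can be read off the file name suffix. This is a minimal sketch; `backup_kind` is a hypothetical helper, not something Discourse ships.

```shell
# Classify a Discourse backup by its file name suffix.
# backup_kind is a hypothetical helper, not part of Discourse.
backup_kind() {
  case "$1" in
    *.tar.gz) echo "database and uploads" ;;
    *.sql.gz) echo "database only" ;;
    *)        echo "unknown" ;;
  esac
}

backup_kind "wacky-writers-forum-2021-04-06-033906-v20210328233843.sql.gz"
# prints "database only"
```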

So I enter the app and restore that sql.gz, but get a permission denied error. Any idea why that might be?

You’re telling me!! :slight_smile:

(Assuming you mean chmod). If the files are set to the wrong owner then they’re not writable.

I think this might have caused the permission denied.

What is the exact error?

Yes, thank you, I’ve been up all night and am a bit braindead.

EXCEPTION: lib/discourse.rb:93:in `exec': Failed to copy archive to tmp directory.
cp: cannot open '/var/www/discourse/public/backups/default/wacky-writers-forum-2021-04-06-033906-v20210328233843.sql.gz' for reading: Permission denied

Try chmod 644 /var/www/discourse/public/backups/default/*
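The same fix can be wrapped so that it only touches regular files and then lists the resulting modes for checking. This is a rough sketch; `make_backups_readable` is a hypothetical helper name, and it assumes you run it somewhere you are allowed to change those files.

```shell
# make_backups_readable is a hypothetical helper: set mode 644 on every
# regular file in the given directory, then list the directory so the
# resulting permissions can be checked by eye.
make_backups_readable() {
  dir="$1"
  for f in "$dir"/*; do
    [ -f "$f" ] || continue   # skip subdirectories and an unmatched glob
    chmod 644 "$f"
  done
  ls -l "$dir"
}
```

Inside the container this would be called as `make_backups_readable /var/www/discourse/public/backups/default`.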

Okay I’m working my way through this now, I’ll report back shortly. Thank you for taking time to help me out.

This worked to get the restore going, THANK YOU. :pray:

Now I just have to figure out why the site still isn’t loading. :grimacing:

Rebuild currently in progress with saved app.yml file from before everything broke.

Is there a command to move this backup straight into the app? Restore isn’t finding it and I can’t remember how I got it to load before.

You can download it from S3 and put it in

/var/discourse/shared/standalone/backups/default

You should be able to restore from the command line.

But after that, you should configure your S3 config as described in the link above; it makes things easier.

Thanks Jay. And yes that is absolutely my plan.

Okay so here’s where I am now.

  • The restore was successful from that .sql.gz file. (hooray! Thanks again Richard.)
  • I ensured app.yml was the same setup as before everything died
  • ./launcher rebuild app
  • Rebuild is successful with Postgres 13 (finally)

However, the site itself is still down. I use Cloudflare but I have Development Mode on right now, and flushed the DNS cache. Everything is pointed where it’s supposed to go. The Cloudflare template is in app.yml.

DNS is resolving correctly, hostnames is up to date, the Discourse install was done with the appropriate URL, and I’m running out of ideas.

https://forum.wackywriters.com is the URL, I’m just getting “server unavailable” errors. I feel like I’m going around in circles here (sorry) but any suggestions?

Edit: When I run ./discourse-doctor, I see that there are two instances of the app running in Docker:

Is this normal? (seems like it wouldn’t be, but everything I thought I knew about Discourse has been thrown out the window the last 24 hours :sweat_smile: )

Edit2: I’ve been putting this off as a last resort, but I’m going to try and set up an entirely new server with a clean Discourse install. I’m worried something has gotten fubared with all my mucking around and I can’t figure out what’s broken. Thankfully I still have the backup and all the uploads on block storage, so if I’m lucky, I should be able to connect that to a new droplet and move things over from there. If anyone has additional suggestions or tips, I’d still appreciate more tenured expertise than mine.

Edit3: Even with a new server and IP propagating (nslookup and ping both look good, whatsmydns.net looks good), forum won’t load. Still getting connection errors. It’s like it isn’t connecting the IP address to the Discourse instance and instead is trying to load a static page, which of course, doesn’t exist in this case.

So after almost 24 hours of fighting, I figured out why the site refused to load after I got the restore going.
:point_down:

Because of so many resets and reinstallations and god knows what else, I hit the Let’s Encrypt rate limit, so I’ve temporarily commented out the ssl templates and will get them going again in a week.

The site is “functioning” while I rebake all posts to fix the broken images but I really appreciate Jay and Richard for helping me out today, you got me through the parts I really just couldn’t figure out.

Now to get a real backup downloaded so I can get S3 setup this week without worrying about this again. :sweat_smile:
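For anyone in the same spot, commenting out the SSL templates as described above could look roughly like this. It is a sketch only: `disable_ssl_templates` is a hypothetical helper, and the two template names are assumed to match the entries in a standard app.yml. It writes the edited file to stdout rather than changing app.yml in place, so the result can be reviewed first.

```shell
# disable_ssl_templates is a hypothetical helper: print the given app.yml
# with the two SSL template entries commented out. The template file names
# are an assumption based on a standard install; adjust to match your file.
disable_ssl_templates() {
  sed -E 's|^([[:space:]]*)(- "templates/web\.(letsencrypt\.)?ssl\.template\.yml")|\1#\2|' "$1"
}
```

One possible use is `disable_ssl_templates containers/app.yml > /tmp/app.yml`, followed by comparing the two files, replacing the original, and running `./launcher rebuild app`.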

If you search, there is a way to add a second domain so that it’ll count as a separate request for Let’s Encrypt. But waiting is easier.

I recommend that you put Cloudflare to gray cloud with no speedups.

@pfaffman Aren’t you confusing object storage with block storage? Object storage is S3, but TS says they used block storage, which is just a disk mounted at their uploads directory.

Oh. :man_facepalming:

Yeah. So nothing I said makes any sense.

Thanks for noticing that, Richard.

Well, most things you said did make sense but you had me confused here :slight_smile:
