Move a Discourse site to another VPS with rsync

Hi!

I’m trying to follow the steps by @scottfsmith. I managed to get the rsync done. Getting the most recent changes via rsync isn’t important to me, since I’m just testing a new Linux version with my existing site to see if all my plugins work, so I’m not doing the second rsync run. But then running ./launcher rebuild app produces errors:

2022-12-13 14:43:01.974 UTC [59] LOG:  database system was interrupted; last known up at 2022-12-13 10:23:29 UTC
2022-12-13 14:43:02.075 UTC [59] LOG:  invalid primary checkpoint record
2022-12-13 14:43:02.075 UTC [59] PANIC:  could not locate a valid checkpoint record
2022-12-13 14:43:03.137 UTC [56] LOG:  startup process (PID 59) was terminated by signal 6: Aborted
2022-12-13 14:43:03.137 UTC [56] LOG:  aborting startup due to startup process failure
2022-12-13 14:43:03.231 UTC [56] LOG:  database system is shut down
I, [2022-12-13T14:43:06.699692 #1]  INFO -- : 
I, [2022-12-13T14:43:06.711862 #1]  INFO -- : > su postgres -c 'createdb discourse' || true
createdb: error: could not connect to database template1: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2022-12-13T14:43:06.917008 #1]  INFO -- : 
I, [2022-12-13T14:43:06.917421 #1]  INFO -- : > su postgres -c 'psql discourse -c "create user discourse;"' || true
psql: error: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2022-12-13T14:43:07.007654 #1]  INFO -- : 
I, [2022-12-13T14:43:07.008155 #1]  INFO -- : > su postgres -c 'psql discourse -c "grant all privileges on database discourse to discourse;"' || true
psql: error: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2022-12-13T14:43:07.087098 #1]  INFO -- : 
I, [2022-12-13T14:43:07.087319 #1]  INFO -- : > su postgres -c 'psql discourse -c "alter schema public owner to discourse;"'
psql: error: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I, [2022-12-13T14:43:07.167221 #1]  INFO -- : 
I, [2022-12-13T14:43:07.168041 #1]  INFO -- : Terminating async processes

I can’t make enough of this to find a solution. Some searching suggests the container needs to be stopped, but it isn’t started. Any ideas?

Thanks
David

I would Set up a staging server to solve your particular problem.

From those errors, it looks like the database is broken. You need to stop the database to be able to get a valid set of data for it to work. The second rsync isn’t optional.

Wow!

A four year old thread and a response within 3 minutes! :slight_smile:

Anyway, it’s basically a staging server I’m aiming for, using this rsync method. But do you recommend using a backup instead of rsync? I recall not getting all the Customize settings on a previous staging server I set up, but maybe I’m wrong.

Thanks

That’s what that link describes.

Everything, except the plugins (which are in your app.yml) is in the backup; the database and the uploads are all there is.

From my testing of this method it seems to be enough to ./launcher stop app before the initial rsync. Of course one of the reasons to use this method seems to be to keep the forum running on the old server as long as possible, in which case it’s obviously necessary to run the second rsync to maintain consistency. But for the relatively common process of moving a forum to a different server and/or host where a brief downtime is acceptable I really like the simplicity and portability of this method.
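
A minimal sketch of that stop-first sequence, run as root on the new server. The rsync and launcher commands are the ones quoted in this thread; the run wrapper, the DRY_RUN switch and the OLD_HOST variable are illustrative names of my own, and by default the script only prints what it would do.

```shell
#!/bin/sh
# Cold copy of /var/discourse: stop the old site first, then sync, then rebuild.
# OLD_HOST, DRY_RUN and run() are illustrative names, not part of Discourse.
OLD_HOST="${OLD_HOST:-old.ip.address.here}"

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cold_copy() {
    # 1. Stop the container on the old server so PostgreSQL shuts down cleanly.
    run ssh "root@${OLD_HOST}" "cd /var/discourse && ./launcher stop app" || return 1
    # 2. Copy everything; --delete removes files that no longer exist on the source.
    run rsync -avz --delete "root@${OLD_HOST}:/var/discourse" /var || return 1
    # 3. Rebuild on the new server.
    run sh -c "cd /var/discourse && ./launcher rebuild app"
}

cold_copy
```

Setting DRY_RUN=0 and OLD_HOST to the real address would execute the three steps instead of printing them.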

Right.

My preferred method is to rsync the Let’s Encrypt and SSL stuff, put the old server in read-only mode, back up, restore on the new server, and then switch DNS (or better, a static IP) when the new server is ready.

But if you don’t care about a bit of down time, your way is great.

I’m planning to migrate to a new VPS in January after some recent problems upgrading Discourse on my old Ubuntu.

My questions on migrating from an old Digital Ocean droplet to a new Digital Ocean droplet are:

  • I plan to lower the TTL on the DNS A record the day before my migration to something small, like 5 minutes. Does this sound reasonable?

  • The first post in this thread was last edited in June 2016. Is it still valid and correct?

  • Will this rsync method also copy the entire database from the old VPS to the new VPS?
    – We’re on a standard install

  • Will the existing Let’s Encrypt SSL certificate also be copied across? Is the SSL cert tied or linked to an IP address at all? Will it continue to automatically renew itself? Any gotchas here?

  • At what point should I change the public DNS A record to point to the new VPS?
    – And also change the TTL back to something higher again

That’s all correct.

If you’re using something that lets you have a permanent IP that can be assigned to multiple VMs, then you can do that so that you’re not counting on DNS to make the switch.

The only caution I would add is to shut down the old site for the final rsync and then restart it in read-only mode while the new one rebuilds.

The first post is still showing the incorrect /var/discourse/ path:

Could you edit/update please?

@Richie, @JammyDodger has now made this a wiki :+1:

I migrated to a new VPS today and thought I’d share my experience, as it looks like quite a few people are running into the old operating system version blocker on their updates lately :blush:

I’m on Digital Ocean, so I created a new droplet.

Old vps = Ubuntu Server 18.04.6 LTS

New vps = Ubuntu Server 23.10

I did the usual housekeeping on the new vps - please edit to suit yourself:

apt-get update

apt-get upgrade

apt-get install fail2ban

ufw default deny incoming

ufw default allow outgoing

ufw allow ssh

ufw allow http

ufw allow https

ufw enable

I then created a new empty directory for Discourse:

sudo mkdir -p /var/discourse

Then I installed Docker:

wget -qO- https://get.docker.com/ | sh

Then I changed the TTL on my DNS from 30mins to 10mins (the minimum GoDaddy allows).
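
One way to confirm that the lowered TTL is actually being served before migration day is to read it back with dig. This is only a sketch: forum.example.com is a placeholder hostname and answer_ttl is a helper name I made up. It simply pulls the TTL column out of dig’s answer section.

```shell
#!/bin/sh
# Read the TTL (second column, in seconds) of the first A record
# from `dig +noall +answer` output supplied on stdin.
answer_ttl() {
    awk '$4 == "A" { print $2; exit }'
}

# Usage against a real resolver (placeholder hostname):
#   dig +noall +answer forum.example.com A | answer_ttl
```

A caching resolver reports the remaining TTL rather than the configured one, so the number may be lower than what you set.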

On my old server, I downloaded a local copy of last night’s Discourse database backup (you can never have enough local backups). I also downloaded a copy of app.yml to my local PC.

As suggested by a few people above, I did a “root-to-root” rsync. I used the IP address rather than the hostname, so I could avoid any DNS confusion. Also as suggested above, I used -avz switches:

rsync -avz root@old.ip.address.here:/var/discourse /var

For reference, my discourse folder is 25GB.

It took ~25mins to rsync from the old server to the new server. This was simply between two Digital Ocean droplets in the same LON1 region. Your experiences may differ.

After rsync’ing and trying a rebuild, I hit the same error that @piratdavid hit re postgres database system is shut down.

So I then stopped the app on the old vps:

./launcher stop app

And did another rsync, for just the changes this time:

rsync -avz --delete root@old.ip.address.here:/var/discourse /var

Then I started the old Discourse app again and very quickly put it in Maintenance Mode - this is so people can still get to it and will see the usual maintenance warning message.

This also now buys me some time to work on the new vps :blush:

I updated my HOSTS file on my local pc so I could get to the discourse on the new vps without browser warnings / issues.
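
For anyone unsure what such a hosts entry looks like, a small sketch. The IP (taken from the documentation range) and the hostname are placeholders, and hosts_line is just an illustrative helper.

```shell
#!/bin/sh
# Build a hosts-file line that maps a hostname to the new server's IP.
hosts_line() {
    printf '%s\t%s\n' "$1" "$2"
}

hosts_line 203.0.113.10 forum.example.com

# To apply it on Linux or macOS (needs root), append it to /etc/hosts:
#   hosts_line 203.0.113.10 forum.example.com | sudo tee -a /etc/hosts
# On Windows the file is C:\Windows\System32\drivers\etc\hosts.
```

Remember to remove the entry again once the public DNS record points at the new server.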

On the new vps I then ran:

./discourse-setup

This was so it could update the ram and cpu settings in the app.yml file automatically.

I then did an app rebuild on the new vps:

./launcher rebuild app

Did some smoke tests, all good.

DNS updated - job done.

Thanks for the detailed topic, everyone :smiley:

Thanks guys, first post updated re /var/discourse paths.

If anyone is having trouble doing the root to root rsync because maybe they disabled root login on the old server, or you just want to do this as a non-root user, I found this post to be helpful to figure out how to use sudo on the remote server: permissions - Using rsync with sudo on the destination machine - Ask Ubuntu

Let’s say you have a user, discourse, on both sides that has sudo privileges. On the remote machine, you’re going to edit the /etc/sudoers file with sudo visudo. You’re going to add the line:

discourse ALL=NOPASSWD:/usr/bin/rsync

Then on the new machine, you’re going to run (as your non root user):

sudo rsync -avz --delete --rsync-path="sudo rsync" discourse@old.ip.address.here:/var/discourse /var

This will allow you to run everything described here as non root users. If you’re keeping the old server around, I’d go back into the /etc/sudoers file and delete the line you just put in.
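
A variation on the sudoers step, offered as a sketch rather than something from the linked answer: putting the rule in its own file makes the later clean-up a single delete. This assumes the old server’s sudoers configuration includes /etc/sudoers.d (the Ubuntu default, as far as I know). The file name discourse-rsync and the DROPIN variable are made up. By default the script writes to a temporary file, so nothing is changed until you point DROPIN at the real location.

```shell
#!/bin/sh
# Write the NOPASSWD rule from this thread to a drop-in file and syntax-check it.
# DROPIN is an illustrative variable; on the old server it would be something
# like /etc/sudoers.d/discourse-rsync (hypothetical file name), written as root.
RULE='discourse ALL=NOPASSWD:/usr/bin/rsync'
DROPIN="${DROPIN:-$(mktemp)}"

printf '%s\n' "$RULE" > "$DROPIN"
chmod 0440 "$DROPIN"

# visudo -c checks syntax only; -f points it at the file to check.
if command -v visudo >/dev/null 2>&1; then
    visudo -cf "$DROPIN" || echo "syntax check failed for $DROPIN" >&2
fi
```

Once the migration is finished, deleting that one file removes the passwordless rsync permission again.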

If I understand correctly, this allows the bulk of the transfer to happen while Discourse is running. The restoring from backup strategy requires at least read-only for the backup and moving the backup to the new server (or transfer via S3 bucket). For large sites, this can result in considerable read-only time that the rsync strategy neatly avoids.

It might be possible to squeeze out a bit more uptime by not shutting down PostgreSQL on the old system and “fixing” the problem on the new system with pg_resetwal. NB: I haven’t tried this, and letting the database shut down gracefully is almost certainly a better idea.

I wonder if there’s a way to start Discourse in read-only mode? I suspect the fastest way is via the command line after the container is running.

At any rate, thanks for reporting back your experience! Seems like a useful process to have in your back pocket. :slight_smile:

Very.

So useful in fact, that I’m tempted to do it all again to create a staging environment (on a lower spec’d VPS), just for testing and preempting any issues before implementing any changes on production.

Hello,

I’m in the middle of trying this process on an older Discourse instance I’m now responsible for maintaining – migrating off an EOL Ubuntu to something fresher because any upgrade fails if I try it in place – and although the rsync was successful, postgres is failing to launch citing file ownership issues. Running the rsync as root with the ownership-preservation options didn’t correct this (file ownership and permissions do now match the source, I checked), and because bootstrap has failed and I have no running container, I can’t attempt to fix this like Update failed (postgresql) - #7 by noezDE describes.

What’s the best way to normalize whatever postgres is expecting?

Can you chown the files outside of the container? It should be possible if you have root/sudo permissions.

Sure, but to what? From outside the container, the permissions are both correct and also howling nonsense.

Source (functioning):

root@ip-[...]:/var/discourse/shared/standalone# ls
total 54492
drwxr-xr-x 15 root       root         4096 Oct 22  2021 .
drwxr-xr-x  3 root       root         4096 Feb 28  2017 ..
drwxr-xr-x  3 ubuntu     www-data     4096 Feb 28  2017 backups
-rw-r--r--  1 root       root     55730645 Mar 15  2017 discussion.json
drwx------  7 root       root         4096 Mar  6  2017 letsencrypt
drwxr-xr-x  4 root       root         4096 Feb 28  2017 log
drwxr-xr-x  2 _apt       netdev       4096 Feb 28  2017 postgres_backup
drwx------ 19 _apt       netdev       4096 Sep 15 04:39 postgres_data
drwx------ 20 _apt       netdev       4096 Oct 22  2021 postgres_data_old
drwx------ 20 messagebus uuidd        4096 Apr  5  2018 postgres_data_older
drwxrwsr-x  5 _apt       netdev       4096 Sep 15 04:39 postgres_run
drwxr-xr-x  2 lxd        lxd          4096 Sep 16 01:03 redis_data
drwxr-xr-x  2 root       root         4096 Mar  6  2017 ssl
drwxr-xr-x  4 root       root         4096 Feb 28  2017 state
drwxr-xr-x  4 ubuntu     www-data     4096 Sep 15 04:39 tmp
drwxr-xr-x  5 ubuntu     www-data     4096 Apr 13  2017 uploads

Destination (broken):

root@ip-[...]:/var/discourse/shared/standalone# ls -al
total 54488
drwxr-xr-x 15 root       root         4096 Sep 15 04:31 .
drwxr-xr-x  3 root       root         4096 Sep 15 04:27 ..
drwxr-xr-x  3 ubuntu     www-data     4096 Sep 15 04:27 backups
-rw-r--r--  1 root       root     55730645 Sep 15 04:27 discussion.json
drwx------  7 root       root         4096 Sep 15 04:27 letsencrypt
drwxr-xr-x  4 root       root         4096 Sep 15 04:27 log
drwxr-xr-x  2 _apt       netdev       4096 Sep 15 04:27 postgres_backup
drwx------ 19 _apt       netdev       4096 Sep 15 04:27 postgres_data
drwx------ 20 _apt       netdev       4096 Sep 15 04:30 postgres_data_old
drwx------ 20 messagebus uuidd        4096 Sep 15 04:31 postgres_data_older
drwxrwsr-x  5 messagebus tss          4096 Sep 15 04:31 postgres_run
drwxr-xr-x  2 uuidd      _ssh         4096 Sep 15 04:38 redis_data
drwxr-xr-x  2 root       root         4096 Sep 15 04:32 ssl
drwxr-xr-x  4 root       root         4096 Sep 15 04:31 state
drwxr-xr-x  4 ubuntu     www-data     4096 Sep 15 04:31 tmp
drwxr-xr-x  5 ubuntu     www-data     4096 Sep 15 04:31 uploads

I’m assuming that these IDs might make more sense inside the container, maybe?

Yeah, I tried bruteforcing the numeric IDs from ls -aln and I still get the same failure.

2024-09-16 01:21:27.237 UTC [36] FATAL:  data directory "/shared/postgres_data" has wrong ownership

I don’t know what it wants.

I think I had a similar error recently.

One guess is that the old container and the new one have different /etc/passwd entries. You could compare those files, I guess.
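
To compare what the two hosts actually store on disk, numeric IDs are more telling than names, since the same name can map to different numbers on different Ubuntu releases. A sketch, assuming GNU stat; numeric_owners is a made-up helper name.

```shell
#!/bin/sh
# Print "uid:gid name" for each entry directly under a directory (GNU stat).
numeric_owners() {
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue
        printf '%s %s\n' "$(stat -c '%u:%g' "$entry")" "$(basename "$entry")"
    done
}

# Run on both servers and compare the results by eye or with diff:
#   numeric_owners /var/discourse/shared/standalone
# If the numbers differ, one thing to try is a re-sync that keeps raw IDs:
#   rsync -avz --delete --numeric-ids root@old.ip.address.here:/var/discourse /var
```

Differing numbers would be consistent with rsync having mapped owners by name, which is its default behaviour; --numeric-ids copies the raw numbers instead. I have not confirmed that this is the cause here.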

I think your best bet may be to restore from backup. I can’t remember if I did that or made something 777.