How to migrate Discourse from one server to another with the same DNS name

I’m trying to migrate Discourse from personal hosting to an Amazon Lightsail server. I’ve searched the forum and read all the posts about migrating servers and setting up Discourse:

Move your Discourse Instance to a Different Server
Restore a backup from command line
How To Install Discourse on Ubuntu 18.04 | DigitalOcean
https://github.com/discourse/discourse/blob/master/docs/INSTALL-cloud.md

As I understand it, the process is:

  1. Install new Discourse server
  2. Export backup from existing Discourse (currently the backup is configured to be saved on S3, but I understand this will be a manual file backup)
  3. Import backup to new Discourse (manually, since it can’t pick it up from S3 as I understand)

I’m a little stuck on how to do Step 1 given a few constraints: I have a single domain name, I want to keep the same domain name for the new server, and I don’t want any downtime. My goal is to complete the new server setup, restore the backup, and only then change the DNS entry to point to the new server, avoiding downtime since both servers will be running at the same time.

As I understand it, when I set up the new Discourse server, I can copy the app.yml from the existing server to the new server and then run discourse-setup. The problem is that this will use the same DNS name as the existing server (which is what I want), and I foresee a few issues that I’m trying to figure out:

  1. Let’s Encrypt won’t issue an SSL certificate for the new server since the domain name still points to the old server
  2. Without the SSL certificate the server won’t start, since the old server’s config is set to use SSL only and that setting is carried over in app.yml
  3. I’ve tried connecting to the Discourse server through a DNS name redirection, but if the URL entered doesn’t match the app.yml configuration, either NGINX or Discourse won’t work and you get an error in the browser while trying to connect. So without a web interface I can’t launch the restore process

So how do I complete the setup of the new Discourse server using the existing server’s app.yml, restore the backup, and then switch the DNS over? Or is there a different, easier way to do this?

If you’re not going to use the same S3 bucket then there is a hidden setting that forces the backup to download the S3 files. You can look in the settings file for the name and set it at the rails console. There’s a topic that discusses it, but it might be easiest to look in settings.yml.

You don’t need to run discourse-setup; just copy the app.yml and rebuild.

You can rsync over the Let’s Encrypt certificates. In fact, you can copy over the whole /var/discourse (perhaps excluding some logs and such).
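A minimal sketch of such a copy, run on the old server and assuming root SSH access to the new one. The host name is a hypothetical placeholder and the excluded log path is only an example of "logs and such". By default the script just prints the rsync command; set DRY_RUN=0 to execute it.

```shell
#!/usr/bin/env bash
# Sketch: copy /var/discourse to a new host with rsync.
# NEW_HOST is a hypothetical placeholder; replace it with your own server.
NEW_HOST="${NEW_HOST:-root@new-server.example.com}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

copy_discourse() {
  # -a preserves permissions, ownership and timestamps.
  # --numeric-ids keeps UIDs/GIDs as numbers instead of mapping them by name.
  # The exclude (an assumption) skips log files but keeps the log directory itself.
  run rsync -a --numeric-ids \
    --exclude 'shared/standalone/log/*' \
    /var/discourse/ "$NEW_HOST:/var/discourse/"
}

copy_discourse
```

The trailing slashes on both paths make rsync copy the contents of /var/discourse into /var/discourse on the target rather than nesting a second directory inside it.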

The goal is ideally to “lift n shift”, but that’s not possible with Amazon Lightsail since one cannot import an existing image. So yes, it would be using the exact same S3 bucket.

It seems like your approach is closest to lift n shift. If I understand what you’re saying, I can just tar/gz the entire /var/discourse folder on the original server, untar it on the new server, start Discourse, and then repoint the DNS to the new server. Is that it? Do I need to rebuild Discourse on the new server? What about Nginx, Docker and other dependencies outside the folder?

Yes, move the files however makes sense to you. Yes, you’ll need to do a rebuild to build and launch a new container.

Thanks. Apparently lift n shift wasn’t as clean as I thought: there are a few checks to do before and after to ensure a smooth operation (Postgres was being upgraded from 12.0 to 13.0, which taught me a few lessons in the process). Here’s a step-by-step guide for future reference for folks trying to move to an Amazon Lightsail server (1GB RAM):

Original Server

  • Create backup to S3
  • cd /var/discourse
  • ./launcher rebuild app # get the latest build for an easy transition
  • ./launcher cleanup # clean it up to remove old data and reduce the package size
  • ./launcher stop app # not doing this causes a failure while trying to rebuild it later with Postgres
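  • tar -zcvf ~/discourse.tar.gz /var/discourse # archive name first, then the directory; written outside /var/discourse so the archive does not include itself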

New Amazon Lightsail Server

  • Install Ubuntu 20.04 image from Amazon (1GB RAM)
  • Install Docker
  • Create 2GB swap # not doing this may cause rebuild to fail
  • Configure vm.overcommit_memory=1 # not doing this may cause a failure with Postgres during a rebuild
  • FTPS/transfer discourse.tar.gz from original server
  • tar -zxvf discourse.tar.gz -C /
  • cd /var/discourse
  • Set UNICORN_WORKERS in app.yml to 2 # going beyond 2 with 1GB RAM risks heavy swapping and throttling due to excessive disk activity
  • ./launcher rebuild app
  • Change DNS to point to new Amazon server
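Before that final DNS change, it may be worth checking that the new server answers for the real hostname. A small sketch using curl's --resolve option, which overrides DNS for a single request; the hostname and IP address below are hypothetical placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Sketch: request the forum from the NEW server while DNS still points at the old one.
# Both values are hypothetical placeholders; replace them with your own.
FORUM_HOST="${FORUM_HOST:-forum.example.com}"
NEW_IP="${NEW_IP:-203.0.113.10}"

# Build the value curl expects for --resolve: "host:port:address".
resolve_arg() {
  printf '%s:443:%s' "$1" "$2"
}

check_new_server() {
  # -I fetches only the response headers.
  # --resolve pins the hostname to NEW_IP for this request only,
  # so the TLS certificate is still validated against the real hostname.
  curl -sSI --resolve "$(resolve_arg "$FORUM_HOST" "$NEW_IP")" "https://$FORUM_HOST/"
}

# Only print the command here; call check_new_server yourself to run it.
echo "curl -sSI --resolve $(resolve_arg "$FORUM_HOST" "$NEW_IP") https://$FORUM_HOST/"
```

For a check in the browser, a temporary /etc/hosts entry on your own machine that maps the hostname to the new IP has the same effect; remove it again after the DNS switch.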

Is there an easier way to migrate servers (lift n shift) without having to go through a discourse setup process?

Do you mean without running discourse-setup or do you mean without building the container that is required to run Discourse? If you mean the latter, it’s possible by pushing the old image to a repo that the new server could use, but that’s not something that a novice could easily handle.

Your process was complicated by the PG13 upgrade. It might have been a bit easier to build a new image from scratch on the new server, then take a backup on the old one and restore it there, but you’d still have some fiddly bits in getting the Let’s Encrypt certs onto the new server.

Yep, that’s the only thing that was preventing me from running ./discourse-setup on the new server and then restoring from the S3 backup (along with how to do this without access to the web admin console, since the DNS would still be pointing to the old server and AFAIK Discourse won’t respond to an IP address in the browser). Since I had a live system and needed to switch the DNS on the fly from the old system to the new one, the lack of Let’s Encrypt certificates was the only roadblock for me.
If there’s a way to transfer the certs from the old system to the new system, complete ./discourse-setup without any Let’s Encrypt errors, and restore from the S3 backup without the web console, then that would be a simpler way to do this.

If you copy over the yml files inside the containers directory, then you don’t need discourse-setup (it can adjust memory params if they are different on the new server, but you could do that afterward). You’d just run ./launcher rebuild app.

Okay, I think I see what you’re saying, but to be sure, let me restate my understanding.

The original server was set up to back up Discourse to S3 (the settings and site content).

Copying the yml files from the containers directory brings all of the original server’s configuration over to the new server, so I no longer need to run discourse-setup there. Instead, ./launcher rebuild app will use the original server’s configuration to download the latest image and configure Discourse.

Now there are two things which are pending:

  1. How does one transfer the Let’s Encrypt certificates? Since the DNS is still pointing to the original server, they can’t be recreated, and I’m guessing this needs to be done before the ./launcher rebuild app.
  2. How do I restore Discourse (settings + content) from the S3 backup after the rebuild? Since the DNS is still pointing to the original server, is there a way to get access to the Discourse admin web interface using an IP address or localhost, or can the S3 backup be restored via the console?

If you copy the old /var/discourse you’ll get the certificates and the rebuild will work as expected.

You can restore from the command line from inside the container.
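As a rough sketch of that command-line restore: the helper below finds the newest backup file on the host and prints the commands to run. It assumes a standard single-container install, where backups sit under /var/discourse/shared/standalone/backups/default, and that the backup file has already been placed there (for example, downloaded from S3 by hand).

```shell
#!/usr/bin/env bash
# Sketch: find the newest local backup and print the restore commands to run.
# Assumes a standard install where backups live under this host directory.
BACKUP_DIR="${BACKUP_DIR:-/var/discourse/shared/standalone/backups/default}"

# Print the file name (without path) of the most recently modified *.tar.gz in $1.
# Returns non-zero when the directory holds no backup.
latest_backup() {
  local newest
  newest="$(ls -t "$1"/*.tar.gz 2>/dev/null | head -n 1)"
  [ -n "$newest" ] && basename "$newest"
}

print_restore_steps() {
  local file
  file="$(latest_backup "$BACKUP_DIR")" || {
    echo "no backup found in $BACKUP_DIR" >&2
    return 1
  }
  # The last three commands are typed inside the container shell
  # that ./launcher enter app opens.
  cat <<EOF
cd /var/discourse
./launcher enter app
discourse enable_restore
discourse restore $file
exit
EOF
}

print_restore_steps || true
```

The script only prints the steps, because the restore itself runs interactively inside the container.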

Thanks for the detailed steps. I just had to do something similar, moving to a new host.
Since the site was working, I didn’t want to go through backups, so I followed the steps here.

It almost worked, but the rebuild on the new host failed.
It turns out the UID/GID mapping was not entirely the same on the two hosts, so Postgres would break on startup due to incorrect ownership of the data folder.

This is something that can happen in other instances as well, but fortunately a fix is available.

There is one extra detail for the scenario in this post: the container is not built yet, so ./launcher enter app does not work at this stage. Since the rebuild goes on for quite some time, I was able to use docker ps to get the name of the container doing the build, enter that container, and fix the ownership:

docker exec -it <container_name> bash
chown -R postgres:postgres /shared/postgres_*

The rebuild then fails (or you can stop it with CTRL+C). After it has stopped, simply run it again, and the permissions are fixed:

./launcher rebuild app

And it is running again :sweat_smile: .
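One possible way to avoid the mismatch up front, sketched here as an assumption rather than something confirmed in this thread: when extracting as root, GNU tar maps file owners by user name unless --numeric-owner is given, so passing that flag on both ends keeps the numeric UIDs/GIDs that the container expects. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: archive and extract a tree while keeping numeric UIDs/GIDs.
# Assumes GNU tar; ownership is only restored on extract when run as root.

# pack SRC_DIR ARCHIVE: store SRC_DIR (an absolute path) in a gzip-compressed archive.
# tar strips the leading "/" from member names, so extracting with -C / restores the same path.
pack() {
  tar --numeric-owner -czf "$2" "$1"
}

# unpack ARCHIVE DEST_ROOT: extract below DEST_ROOT (use "/" on the new server).
unpack() {
  tar --numeric-owner -xzf "$1" -C "$2"
}

# Example (not executed here):
#   old server:  pack /var/discourse ~/discourse.tar.gz
#   new server:  unpack ~/discourse.tar.gz /
```

If ownership still ends up wrong, the chown fix above remains the fallback.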

For anyone using 1GB RAM, be sure to create at least 4GB of swap, otherwise the rebuild will fail. See 3.1.x to 3.2.0 upgrade hangs/fails on 1GB instance
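A minimal sketch of the swap and overcommit setup on Ubuntu, to be run as root. The /swapfile path is an assumption, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: create a swap file and set vm.overcommit_memory=1 on Ubuntu (run as root).
# Prints the commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# setup_swap SIZE [PATH], e.g. setup_swap 4G /swapfile
setup_swap() {
  local size="$1" path="${2:-/swapfile}"
  run fallocate -l "$size" "$path"   # reserve the space on disk
  run chmod 600 "$path"              # swap files should only be readable by root
  run mkswap "$path"                 # write the swap signature
  run swapon "$path"                 # enable the swap file immediately
}

set_overcommit() {
  # Takes effect right away, but is not persistent across reboots.
  run sysctl -w vm.overcommit_memory=1
}

setup_swap 4G
set_overcommit
```

To survive a reboot, the swap file also needs an entry in /etc/fstab and the sysctl setting a line in /etc/sysctl.conf.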