Set up an environment to migrate another forum to Discourse

Want to set up a Discourse import environment? Let’s get started!

Set up a Discourse instance on DigitalOcean

Set up a Discourse instance on DigitalOcean using our official guide. Email setup is not required for a migration. I accessed the instance via the IP address allotted by DigitalOcean instead of setting up a domain name.

I forked the discourse_docker repository instead of cloning the original one. The reason is explained in the step on preparing the import container.

Download the import data to the server

cd /var/discourse/shared/
mkdir import_data

Download the import data into the import_data folder (using curl, wget, sftp, etc.) and unzip the files if required. Let’s assume the downloaded files are vanilla_mysql.sql (MySQL database dump) and import_uploads (uploads/attachments folder).

Move the SQL dump and uploads to shared docker storage

Create the import/data/ folder in /var/discourse/shared/standalone/.

cd /var/discourse/shared/standalone/
mkdir import
cd import/
mkdir data

Move the downloaded data to the folder created above.

cd /var/discourse/shared/import_data
mv vanilla_mysql.sql /var/discourse/shared/standalone/import/data/
mv import_uploads/ /var/discourse/shared/standalone/import/data/
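
The folder creation and the two moves above can also be wrapped in one function that checks its inputs first. This is a minimal sketch, assuming the file names used in this guide; stage_import_data is a hypothetical helper name, and the shared folder is passed as an argument so the function can be tried against a scratch directory before touching the server.

```shell
#!/bin/bash
# Sketch: stage the SQL dump and uploads for the import container.
# stage_import_data is a hypothetical helper name, not part of Discourse.
stage_import_data() {
  local shared="${1:-/var/discourse/shared}"
  local src="$shared/import_data"
  local dest="$shared/standalone/import/data"

  # Fail early if the files from the download step are missing.
  [ -f "$src/vanilla_mysql.sql" ] || { echo "missing $src/vanilla_mysql.sql" >&2; return 1; }
  [ -d "$src/import_uploads" ] || { echo "missing $src/import_uploads" >&2; return 1; }

  # mkdir -p creates import/ and import/data/ in one call.
  mkdir -p "$dest"
  mv "$src/vanilla_mysql.sql" "$dest/"
  mv "$src/import_uploads" "$dest/"
}

# Usage on the server:
#   stage_import_data /var/discourse/shared
```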

Prepare and build import container

Copy the existing app.yml to create a new import.yml file.

cd /var/discourse
cp containers/app.yml containers/import.yml

In the first step I mentioned that I forked discourse_docker. That is because we need to create a new template file specific to our import requirements.

In my case I created a new import template file called vanilla.template.yml for the Vanilla bulk import script and pushed it to a new branch called import in my fork.

Vanilla Template
# This template installs MariaDB and all dependencies needed for importing from vanilla.

params:
  home: /var/www/discourse

hooks:
  after_web_config:
    - exec:
        cd: /etc/service
        cmd:
          - rm -R unicorn
          - rm -R nginx
          - rm -R cron

    - exec:
        cd: /etc/runit/3.d
        cmd:
          - rm 01-nginx
          - rm 02-unicorn

    - file:
        path: /etc/mysql/conf.d/import.cnf
        contents: |
          [mysqld]
          # disable InnoDB since it is extremely slow in Docker container
          default-storage-engine=MyISAM
          default-tmp-storage-engine=MyISAM
          innodb=OFF
          sql_mode=NO_AUTO_CREATE_USER

          datadir=/shared/import/mysql/data

          skip-host-cache
          skip-name-resolve

    - exec:
        cmd:
          - mkdir -p /shared/import/mysql/data
          - apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y nano libmariadb-dev mariadb-server
          - sed -Ei 's/^log/#&/' /etc/mysql/my.cnf

    - file:
        path: /etc/service/mysql/run
        chmod: "+x"
        contents: |
          #!/bin/bash
          cd /
          umask 077

          # Make sure the datadir exists, is accessible and contains all system tables
          mkdir -p /shared/import/mysql/data
          chown mysql -R /shared/import/mysql/data
          /usr/bin/mysql_install_db --user=mysql

          # Shamelessly copied from http://smarden.org/runit1/runscripts.html#mysql
          MYSQLADMIN='/usr/bin/mysqladmin --defaults-extra-file=/etc/mysql/debian.cnf'
          trap "$MYSQLADMIN shutdown" 0
          trap 'exit 2' 1 2 3 15
          /usr/bin/mysqld_safe & wait

    - file:
        path: /etc/runit/3.d/99-mysql
        chmod: "+x"
        contents: |
          #!/bin/bash
          sv stop mysql

    - file:
        path: /usr/local/bin/import_vanilla.sh
        chmod: "+x"
        contents: |
          #!/bin/bash
          set -e

          chown discourse -R /shared/import/data

          # Allow connection as root user without password
          mysql -uroot -e "ALTER USER 'root'@'localhost' IDENTIFIED VIA mysql_native_password"
          mysql -uroot -e "FLUSH PRIVILEGES"

          if [ -f "/shared/import/data/vanilla_mysql.sql" ]; then
            if [ -f "/shared/import/mysql/imported" ] && ! sha256sum --check /shared/import/mysql/imported &>/dev/null ; then
              echo "Checksum of database dump changed..."
              rm /shared/import/mysql/imported
            fi

            if [ ! -f "/shared/import/mysql/imported" ]; then
              echo "Loading database dump into MySQL..."
              mysql -uroot -e "DROP DATABASE IF EXISTS vanilla"
              mysql -uroot -e "CREATE DATABASE vanilla"
              mysql -uroot --default-character-set=utf8 --database=vanilla < /shared/import/data/vanilla_mysql.sql
              sha256sum /shared/import/data/vanilla_mysql.sql > /shared/import/mysql/imported
            fi
          else
            sv stop mysql
          fi

          cd $home
          echo "The Vanilla import is starting..."
          echo
          su discourse -c 'bundle exec ruby script/import_scripts/vanilla.rb'

    - exec:
        cd: $home
        cmd:
          - mkdir -p /shared/import/data
          - chown discourse -R /shared/import

  after_bundle_exec:
    - exec:
        cd: $home
        cmd:
          - echo "gem 'mysql2'" >> Gemfile
          - echo "gem 'ruby-bbcode-to-md', :github => 'nlalonde/ruby-bbcode-to-md'" >> Gemfile
          - su discourse -c 'bundle config unset deployment'
          - su discourse -c 'bundle install --no-deployment --path vendor/bundle --jobs 4 --without test development'

Note that the template file will need changes as per your import requirements. In my case I edited it to install MariaDB (for MySQL) and import the database dump.
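
One part of the template worth understanding before you adapt it is the checksum guard in import_vanilla.sh: the dump is reloaded only when there is no marker file yet or when the recorded checksum no longer matches. The sketch below isolates that logic so it can be tried on any file; needs_reload and mark_loaded are hypothetical helper names, not part of the template.

```shell
#!/bin/bash
# Sketch of the checksum guard used in import_vanilla.sh.
# needs_reload and mark_loaded are hypothetical names for illustration.

# Succeeds (exit 0) when the dump should be (re)loaded.
# The marker file holds the sha256sum output recorded after the last load,
# which includes the path of the dump it was computed from.
needs_reload() {
  local marker="$1"
  # No marker yet: the dump has never been loaded.
  [ -f "$marker" ] || return 0
  # Marker present: skip the reload only if the checksum still matches.
  if sha256sum --check "$marker" &>/dev/null; then
    return 1
  fi
  return 0
}

# Record a successful load the same way the template script does.
mark_loaded() {
  local dump="$1" marker="$2"
  sha256sum "$dump" > "$marker"
}
```

Use an absolute path for the dump, as the template does, so the check works from any working directory.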

Let’s fetch that custom branch now.

git fetch origin import
git reset --hard origin/import

Edit containers/import.yml to include the custom import template file, "templates/import/vanilla.template.yml". Afterwards the templates section should look something like this:

templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
  - "templates/import/vanilla.template.yml"
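
If you would rather script this edit than make it by hand, something like the following could work. It is only a sketch: add_template is a hypothetical helper, and it assumes the active template entries are indented with two spaces and quoted exactly as in the listing above. Calling it twice does not add the entry twice.

```shell
#!/bin/bash
# Sketch: append a template entry after the last active entry in the
# templates list of a container definition.
# add_template is a hypothetical helper name, not part of discourse_docker.
add_template() {
  local file="$1" entry="$2"

  # Nothing to do if the entry is already listed.
  if grep -qF -- "\"$entry\"" "$file"; then
    return 0
  fi

  # Remember the last line that looks like an active template entry,
  # then print the file again with the new entry right after it.
  awk -v entry="$entry" '
    { lines[NR] = $0; if ($0 ~ /^  - "templates\//) last = NR }
    END {
      for (i = 1; i <= NR; i++) {
        print lines[i]
        if (i == last) printf "  - \"%s\"\n", entry
      }
    }
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage on the server:
#   add_template /var/discourse/containers/import.yml "templates/import/vanilla.template.yml"
```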

That’s it. Now build the import container.

/var/discourse/launcher stop app
/var/discourse/launcher rebuild import

Enter the import container and verify/clean up the database

:bulb: Starting here I recommend using tmux so that the import can run in the background and you can shut down your computer if needed.
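
A minimal sketch of that tmux workflow, run on the server before entering the container. The session name import is an arbitrary choice and import_session is a hypothetical wrapper; the key binding mentioned in the comments assumes tmux's default configuration.

```shell
#!/bin/bash
# Sketch: open (or return to) a named tmux session for the import.
# import_session is a hypothetical wrapper; "import" is an arbitrary name.
import_session() {
  local name="${1:-import}"
  # -A attaches to the session if it already exists, otherwise creates it.
  tmux new-session -A -s "$name"
}

# Detach without stopping anything: press Ctrl-b, then d (default bindings).
# Reattach later by calling import_session again.
```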

Enter the import container.

/var/discourse/launcher enter import

Now, to import the database dump, run the import_vanilla.sh script you created in the custom template.

import_vanilla.sh
#!/bin/bash
set -e

chown discourse -R /shared/import/data

# Allow connection as root user without password
mysql -uroot -e "ALTER USER 'root'@'localhost' IDENTIFIED VIA mysql_native_password"
mysql -uroot -e "FLUSH PRIVILEGES"

if [ -f "/shared/import/data/vanilla_mysql.sql" ]; then
  if [ -f "/shared/import/mysql/imported" ] && ! sha256sum --check /shared/import/mysql/imported &>/dev/null ; then
    echo "Checksum of database dump changed..."
    rm /shared/import/mysql/imported
  fi

  if [ ! -f "/shared/import/mysql/imported" ]; then
    echo "Loading database dump into MySQL..."
    mysql -uroot -e "DROP DATABASE IF EXISTS vanilla"
    mysql -uroot -e "CREATE DATABASE vanilla"
    mysql -uroot --default-character-set=utf8 --database=vanilla < /shared/import/data/vanilla_mysql.sql
    sha256sum /shared/import/data/vanilla_mysql.sql > /shared/import/mysql/imported
  fi
else
  sv stop mysql
fi

cd $home
echo "The Vanilla import is starting..."
echo
su discourse -c 'bundle exec ruby script/import_scripts/vanilla.rb'

The above script will import the database for you. Now let’s verify the MySQL database.

mysql -uroot
use vanilla;
show tables;

Now is the time to clean up the database if needed. In my case I had to delete the duplicate users in the GDN_User table, so I ran this command from the MariaDB console:

DELETE t1 FROM GDN_User t1 INNER JOIN GDN_User t2 WHERE t1.UserID < t2.UserID AND t1.Email = t2.Email;
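
Before running a destructive statement like the one above, it can help to preview which rows it will touch. The sketch below lists each email that appears more than once, along with the highest UserID, which is the row the DELETE keeps. list_duplicate_emails is a hypothetical helper name; it assumes the dump was loaded into the vanilla database and that root can connect without a password, as set up by import_vanilla.sh.

```shell
#!/bin/bash
# Sketch: preview duplicate emails in GDN_User before deleting anything.
# list_duplicate_emails is a hypothetical helper name.
list_duplicate_emails() {
  # Rows with a NULL email are excluded because the DELETE's
  # "t1.Email = t2.Email" comparison never matches them either.
  mysql -uroot --database=vanilla -e \
    "SELECT Email, COUNT(*) AS copies, MAX(UserID) AS kept_user_id
     FROM GDN_User
     WHERE Email IS NOT NULL
     GROUP BY Email
     HAVING COUNT(*) > 1"
}

# Usage inside the import container:
#   list_duplicate_emails
```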

Start the import process

The source code inside the import container is pulled from the latest Discourse repo. To make custom changes to the import script you are using, I recommend pushing them to your own Discourse repo and then fetching them from inside the container. For example, with my fork and its vanilla branch:

git remote add import https://github.com/techAPJ/discourse.git
git fetch import vanilla
git reset --hard import/vanilla
rake db:migrate

Now start the import process:

su discourse -c 'bundle exec ruby script/bulk_import/vanilla.rb'

The time the import takes depends on the size of the database and uploads. If you used tmux as suggested above, simply detach from the session and the import will keep running in the background. You can reattach at any time to check on progress.

Verify the import data and take a backup

After the import process is complete, let’s verify the data. Exit the import container and start the app container.

/var/discourse/launcher stop import
/var/discourse/launcher start app

Create an Admin account for yourself.

/var/discourse/launcher enter app
rake admin:create

Visit the DigitalOcean server IP and log in. Now change the site settings as required. I recommend enabling the login required setting to prevent data from leaking to the public.

Now is the time to run any custom rake task or perform any post import process, like a rebake.

If everything looks good, take a backup of the data and download it to your computer.

scp root@{DROPLET_IP_ADDRESS}:/var/discourse/shared/standalone/backups/default/backup.tar.gz .
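
The backup.tar.gz in the command above is a stand-in, so the archive in your backups folder may have a different name. A small sketch for finding the most recent archive on the server, assuming backups are written to the folder shown above; latest_backup is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the most recently modified .tar.gz in a backups folder.
# latest_backup is a hypothetical helper name.
latest_backup() {
  local dir="${1:-/var/discourse/shared/standalone/backups/default}"
  local newest="" f
  for f in "$dir"/*.tar.gz; do
    # Skip the unexpanded pattern when the folder has no archives.
    [ -e "$f" ] || continue
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  # Return non-zero when nothing was found.
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

# Usage on the server:
#   latest_backup
```

The printed path can then be used in place of backup.tar.gz in the scp command.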

:tada: Voila!

That’s it! The downloaded backup file is now ready to be restored on Discourse servers.


Credits

This import setup is inspired by @gerhard’s excellent Importing from phpBB3 guide.


Last Reviewed by @cocococosti on 2023-08-09T04:00:00Z
