Importing an IPB 3.1 Forum into Discourse


(CeBe) #1

Hi Discourse Community,

I want to share my experience with importing an IPB 3.1 Forum into Discourse 2.1, hoping that this will be useful for others.

A short summary about the Community:

  • Topic: Yii PHP Framework (code related discussions and support)
  • Members: ~26k
  • Topics: ~64k
  • Posts: ~293k

The import took 27h 46min on a machine with 16GB RAM and 4 CPU cores.

Import Requirements:

  • Keep members, but clean up all SPAM accounts (~250k accounts of which ~26k remain after cleanup)
  • Implement SSO from the website (user accounts are not managed by Discourse)
  • Keep topics, posts and categories with their original URLs so web search results and
    links from other platforms like Stack Overflow still work

This is based on "Migrating from Invision Power Board to Discourse", so thanks to @pfaffman for the great work done on the importer.

Preparation

Exporting Data from IPB

mysqldump <databasename> > /tmp/ipb.sql
cd /var/www/yiiframework.com/forum/ && tar czvf uploads.tgz uploads/

Copy the sql dump and uploads to the new server and put them into /var/discourse/shared/standalone/.
I am assuming a simple Docker setup of Discourse here.

Which script to use?

There are two import scripts, ipboard.rb and ipboard3.rb. The ipboard3.rb script looks very rough and does not fit the table schema we have, so I went with ipboard.rb.

The current version of the ipboard.rb import script does not handle attachments well and
does not convert code tags, which is very important to us because we talk a lot about PHP
code. So I made the following changes to the script:

Making Uploaded Attachments available

The import script replaces post attachments with URLs to the uploaded file.
If you are going to keep your IPB instance online at its old URL, you can simply point
UPLOADS (a configuration variable of the import script, see below) at the existing uploads directory and you are done:

UPLOADS="https://www.yiiframework.com/forum/uploads"

But we are importing to Discourse to remove the old forum completely, so we have to put the
uploads somewhere else. If you are using an nginx proxy in front of Discourse, you
can configure it to serve the uploaded files from a directory on the server. Put the following
into the server block of the nginx config:

location /ipb_uploads/ {
    alias /var/www/ipb_uploads/;
}

And configure the attachment URL (see below) like this:

UPLOADS="https://forum.yiiframework.com/ipb_uploads"
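
For nginx to serve the files, the contents of the uploads archive have to end up in the aliased directory. A minimal sketch of that step, assuming the archive was created as shown above (so it contains a top-level uploads/ directory) and that nginx runs on the host. The function name unpack_ipb_uploads and the paths in the usage line are made up for this example:

```shell
# Hypothetical helper: unpack uploads.tgz so that uploads/foo.png
# ends up as <dest>/foo.png (the top-level "uploads/" directory is dropped).
unpack_ipb_uploads() {
    local archive="$1" dest="$2"
    mkdir -p "$dest" || return 1
    tar xzf "$archive" -C "$dest" --strip-components=1
}

# Usage (paths are assumptions based on the nginx snippet above):
# unpack_ipb_uploads /var/discourse/shared/standalone/uploads.tgz /var/www/ipb_uploads
```

Dropping the top-level directory keeps the old relative paths intact below /ipb_uploads/, which is what the attachment links produced by the importer rely on.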

Setting up MySQL in the Discourse container

Start a bash shell in the discourse app container:

docker exec -it app bash

In the container, install MySQL and import the database:

apt-get install mysql-server mysql-client libmysqlclient-dev
service mysql start
echo "create database ipb" | mysql -uroot -p
mysql -uroot -p ipb < /shared/ipb.sql

When I ran a first test import, the script took multiple days to import 200k users, many of which we knew were SPAM accounts. So we wrote some SQL queries to delete accounts that had never posted anything.

Note that we are going to use SSO, so deleted users will be re-created when they log in.
If you are not going to use SSO, your criteria for deleting users might be different.
You can also import your data without cleaning it up.

mysql -uroot -p
# then apply the cleanup queries
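
The exact cleanup queries depend on your data. As a rough sketch of the "never posted anything" criterion, the helper below prints one possible pair of statements. It is not the set of queries we ran: the table and column names (members.member_id, posts.author_id, posts.pid) are assumptions about the IPB 3.x schema, and ipb_cleanup_sql is a made-up name. Note that this criterion also matches staff accounts that never posted.

```shell
# Hypothetical sketch: print SQL that removes members without any post.
# Table and column names (members.member_id, posts.author_id, posts.pid)
# are assumptions about the IPB 3.x schema; verify them against your dump.
ipb_cleanup_sql() {
    local prefix="${1:-ipb_}"
    cat <<SQL
-- How many accounts would be removed?
SELECT COUNT(*) FROM ${prefix}members m
  LEFT JOIN ${prefix}posts p ON p.author_id = m.member_id
  WHERE p.pid IS NULL;
-- Remove them.
DELETE m FROM ${prefix}members m
  LEFT JOIN ${prefix}posts p ON p.author_id = m.member_id
  WHERE p.pid IS NULL;
SQL
}

# Usage: review the output first, then pipe it into MySQL.
# ipb_cleanup_sql ipb_ | mysql -uroot -p ipb
```

Running the SELECT on its own first gives a feel for how many accounts the DELETE would touch before anything is removed.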

Next we need to install dependencies for the import script:

cd /var/www/discourse
echo "gem 'mysql2'" >>Gemfile
echo "gem 'reverse_markdown'" >>Gemfile
bundle install --no-deployment

To give the import script access to the Discourse Postgres database, replace peer with trust in /etc/postgresql/10/main/pg_hba.conf. Note that 10 stands for the Postgres version. If the file does not exist in your setup, replace 10 with the version of Postgres you are currently running.
Restart Postgres to load the changes: service postgresql restart
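
If you prefer not to edit the file by hand, the replacement can be scripted. This is a minimal sketch, assuming the usual pg_hba.conf layout where the auth method is the last field of each "local" line. The function name pg_hba_peer_to_trust is made up for this example:

```shell
# Hypothetical helper: switch "peer" to "trust" on the "local" lines of a
# pg_hba.conf file. A copy of the original is kept as <file>.bak.
pg_hba_peer_to_trust() {
    local file="$1"
    sed -i.bak -E 's/^(local[[:space:]].*[[:space:]])peer[[:space:]]*$/\1trust/' "$file"
}

# Usage (adjust the version number in the path to your setup):
# pg_hba_peer_to_trust /etc/postgresql/10/main/pg_hba.conf && service postgresql restart
```

Commented lines and "host" entries are left alone, and copying the .bak file back after the import restores the original authentication settings.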

Importing

Prepare avatars and uploaded files:

mkdir /shared/imports
mv /shared/uploads.tgz /shared/imports
cd /shared/imports && tar xzvf uploads.tgz

Run the importer script:

cd /var/www/discourse
DB_HOST="localhost" DB_NAME="ipb" DB_USER="root" DB_PW="root" \
TABLE_PREFIX="ipb_" IMPORT_AFTER="1970-01-01" \
UPLOADS="https://forum.yiiframework.com/ipb_uploads" \
AVATARS_DIR="/shared/imports/uploads/" USERDIR="user" \
bundle exec ruby script/import_scripts/ipboard.rb | tee import.log

Make sure to adjust the UPLOADS URL as discussed above, as uploads will be included in the posts as links to the original upload file.

Cleanup

If all went fine, clean up with service mysql stop, apt-get purge mysql-server, rm -rf /var/lib/mysql.

Setting up URL redirection

To keep existing forum URLs intact, the import script creates permalinks that reflect
the URLs of topics and categories in IPB.
These permalinks, however, do not cover links to specific posts in a topic or to different pages of a topic.
For these URLs to work properly you have to configure some URL rewriting rules. There are 3 options:

  • Using the permalink normalizations setting in Discourse to remove unnecessary parts from the URLs
  • Rewrite rules in nginx, if you have an nginx proxy in front of Discourse
  • If the old forum was on a different URL/Host than Discourse you can have a custom script to rewrite the URLs (this is what I did)
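
Whichever option you choose, the core of the rewriting is the same: reduce a link to a specific post or page to the plain topic URL that has a permalink. The following is only an illustration of that idea in shell, not the redirect code we use. It assumes old links end in a "page__..." segment (for example page__p__456 or page__st__20), optionally followed by a #fragment. The function name and example paths are made up:

```shell
# Hypothetical sketch: reduce an old IPB topic URL path to the plain topic
# path by dropping a trailing "page__..." segment and any #fragment.
# The "page__" pattern is an assumption about IPB 3 friendly URLs.
normalize_ipb_path() {
    printf '%s\n' "$1" | sed -E 's/#.*$//; s#/page__[^/]*/?$#/#'
}

# Usage (example path is made up):
# normalize_ipb_path "/forum/index.php/topic/123-some-title/page__p__456#entry456"
```

The same regular expression could be adapted for a permalink normalization or an nginx rewrite rule. Check it against a sample of real links from your old forum first.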

Here is the PHP code we use for URL redirection:

Related resources


Need help on importing IPB to Discourse