Migrate a vBulletin 4 forum to Discourse

In this tutorial we will learn how to migrate a vBulletin forum to the Discourse platform using the vBulletin importer script.

I have tested the script with a big database and the result was very good. The script is well written and maintained.

Let’s get started :slight_smile:

What data can be imported?

  • Groups
  • Users
    • Banned Users => Suspended Users
  • Forums
    • Each parent and child forum => Category and Sub-category
  • Topics
    • Sticky Thread => Pinned Topic
    • Closed Thread => Closed Topic
    • Soft Deleted Thread => Not listed Topic
    • Moderated Thread => Not listed Topic
  • Posts
  • PMs
  • Attachments
  • Avatars (requires embedding into database, see below)
  • Permalinks for Topics (just mapping)

Caveats

  • Resuming an incomplete import is fine, and updating a previously completed import from a new database snapshot also works, but any edits to users, categories, posts, topics, … that have already been imported will not be updated.
  • Categories that are nested too deep will be flattened by the importer. Be aware.
  • [ul], [ol] and [li] support for BBCode is incomplete and will need manual fixing. (This issue will not be fixed by the Discourse team. See the topic "CommonMark testing started here!")

Export Note

The import script will import avatars, but only if they are included in the database export. Visit /admincp/avatar.php?do=storage on your site to enable this before you export the database. (And @pfaffman added this without ever visiting that page himself, so if it works, you might add a note here.)

Using development environment (recommended)

  • Set up your development environment by following the guides for Ubuntu (or use Vagrant if you are on Mac or PC)

  • Install MySQL server:

    sudo apt-get install libmysqlclient-dev mysql-server-5.7
    

    The installer will ask you to create a password for the root user (database root, not OS root), which we will use later.

    After installing MySQL, check its status:

    sudo service mysql status
    

    If the MySQL service is not running/active:

    sudo service mysql start
    
  • Install the importer dependencies:

    cd ~/discourse
    echo "gem 'mysql2', require: false" >> Gemfile
    echo "gem 'php_serialize', require: false" >> Gemfile
    bundle install
    sudo apt-get update
    sudo apt-get install jhead libjpeg-turbo-progs jpegoptim gifsicle optipng
    npm install -g svgo
    
  • Export the database dump (from vBulletin server):

    mysqldump -u [:user-name] -p [:name-of-database] > vb4_dump.sql
    
  • Copy the database to your Discourse server (scp or rsync). And of course you can gzip it first if it’s a big one.

  • Import the database (on Discourse server):

    mysql -uroot -p -e 'CREATE DATABASE vb4'
    mysql -uroot -p vb4 < vb4_dump.sql
    
  • Clear existing data from your local Discourse instance

    cd ~/discourse
    bundle exec rake db:drop db:create db:migrate
    
  • Copy the attachments to your Discourse instance (for the path check your VB4 settings).

  • Run the importer and wait until the import is done. You can restart it if it slows down.

    export DB_NAME="vb4" # Change this to the name of your VB4 database.
    export DB_USER="root"
    export DB_PW="" # Add the root password that you set during the MySQL installation.
    export TABLE_PREFIX="vb_" # Change this to match your table prefix (or leave it empty if there is none).
    export ATTACHMENT_DIR='/path/to/your/attachment/folder' # The path for attachments you copied from the old vb4 server.
    export TIMEZONE="America/Los_Angeles" # Change this if needed
    
    cd ~/discourse
    bundle exec ruby script/import_scripts/vbulletin.rb
    
  • Start your Discourse instance:

    bundle exec rails server
    
  • Start Sidekiq and let it do its work. Depending on your forum size this can take time. You can monitor the progress at http://localhost:3000/sidekiq/queues

    bundle exec sidekiq
    
  • Perform a backup and upload it to your production server by following this tutorial.

  • Congratulations! :tada:

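Before starting a long import, it can save time to check that the variables exported above are actually set. The following is a minimal Ruby sketch and not part of the importer: the helper name `import_env_problems` and the choice of which variables count as required are assumptions (DB_PW and TABLE_PREFIX are allowed to be empty, as in the example above).

```ruby
# Variables from the export step that should not be empty.
# (Assumption: DB_PW and TABLE_PREFIX may legitimately be empty.)
REQUIRED_IMPORT_VARS = %w[DB_NAME DB_USER ATTACHMENT_DIR TIMEZONE].freeze

# Returns a list of human-readable problems; an empty list means "looks fine".
# `env` defaults to the real environment but accepts any Hash-like object.
def import_env_problems(env = ENV)
  problems = REQUIRED_IMPORT_VARS
    .select { |name| env[name].to_s.strip.empty? }
    .map { |name| "#{name} is not set" }

  # Only check the directory when a path was actually given.
  dir = env["ATTACHMENT_DIR"].to_s
  if !dir.empty? && !Dir.exist?(dir)
    problems << "ATTACHMENT_DIR does not exist: #{dir}"
  end

  problems
end

# Print any problems found in the current shell environment to stderr.
import_env_problems.each { |problem| warn problem }
```

Saving this as a standalone file and running it with `ruby` in the same shell where the variables were exported should print nothing if everything is in place.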

Using Docker container

  • Set up your production environment by following the official installation guide.
    Afterwards go to the Admin section and configure a few settings:

    • Change the value of slug_generation_method. See this post for more information.
    • Enable login_required (recommended, at least until the import is finished).
    • Disable download_remote_images_to_local if you don’t want Discourse to download images embedded in posts.
    • Enable disable_edit_notifications if you enabled download_remote_images_to_local and don’t want your users to get lots of notifications about posts edited by the system user.

    :bulb: You can change any of the settings above (or any other settings) manually, or you can automate this by editing the base.rb file. This is very helpful if you are in the testing phase. For example, to change the value of slug_generation_method, look for the method get_site_settings_for_import and add slug_generation_method: 'encoded' to the hash.

  • Prepare the Docker container:

    cd /var/discourse
    ./launcher enter app # Now you are inside the container.
    
  • Install MySQL server:

    apt-get update && apt-get install libmysqlclient-dev mysql-server-5.7
    

    The installer will ask you to create a password for the root user (database root, not OS root), which we will use later.

    After installing MySQL, check its status:

    sudo service mysql status
    

    If the MySQL service is not running/active:

    sudo service mysql start
    
  • Install dependencies

    echo "gem 'mysql2', require: false" >> /var/www/discourse/Gemfile
    echo "gem 'php_serialize', require: false" >> /var/www/discourse/Gemfile
    cd /var/www/discourse
    su discourse -c 'bundle install --no-deployment --without test --without development --path vendor/bundle'
    
  • Export the database dump (from vBulletin server):

    mysqldump -u [:user-name] -p [:name-of-database] > vb4_dump.sql
    
  • Copy the database to your Discourse server (scp or rsync). And of course you can gzip it first if it’s a big one.

  • Import the database (inside Discourse container):

    mysql -uroot -p -e 'CREATE DATABASE vb4'
    mysql -uroot -p vb4 < vb4_dump.sql
    
  • Copy the attachments to your Discourse instance (for the path check your VB4 settings).

  • Run the importer and wait until the import is done. You can restart it if it slows down.

    export DB_NAME="vb4" # Change this to the name of your VB4 database.
    export DB_USER="root"
    export DB_PW="" # Add the root password that you set during the MySQL installation.
    export TABLE_PREFIX="vb_" # Change this to match your table prefix (or leave it empty if there is none).
    export ATTACHMENT_DIR='/path/to/your/attachment/folder' # The path for attachments you copied from the old vb4 server.
    export TIMEZONE="America/Los_Angeles" # Change this if needed
    
    cd /var/www/discourse
    su discourse -c 'bundle exec ruby script/import_scripts/vbulletin.rb'
    

    :bulb: It’s a good idea to start the import inside a tmux or screen session so that you can reconnect to the session in case of SSH connection loss.

  • Cleaning up:

    mysql -uroot -p -e 'DROP DATABASE vb4'
    apt-get remove libmysqlclient-dev mysql-server-5.7
    apt-get autoremove
    gem uninstall mysql2
    gem uninstall php_serialize
    

Congratulations! :tada:
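As a rough illustration of the base.rb tip in the Docker section, here is a self-contained Ruby sketch. It does not touch Discourse itself: `HypotheticalBase` and `HypotheticalVbulletinImport` are made-up stand-ins, the placeholder entry in the base hash is not the real default, and overriding the method with `super.merge` is an alternative to editing the hash in base.rb directly. Only the method name get_site_settings_for_import and the setting names come from the guide above.

```ruby
# Stand-in for the importer base class. The real class lives in
# script/import_scripts/base.rb and returns a different, larger hash.
class HypotheticalBase
  def get_site_settings_for_import
    { placeholder_setting: 1 } # placeholder entry, NOT a real default
  end
end

# Stand-in for an importer that wants a few extra settings during the import.
class HypotheticalVbulletinImport < HypotheticalBase
  def get_site_settings_for_import
    # Keep whatever the base class sets and add the settings discussed above.
    super.merge(
      slug_generation_method: "encoded",
      login_required: true,
      download_remote_images_to_local: false
    )
  end
end

settings = HypotheticalVbulletinImport.new.get_site_settings_for_import
puts settings.inspect
```

Keeping the overrides in one place like this makes it easier to repeat test imports with the same settings, which is the point of the original tip.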


vBulletin 3

I have tested the vbulletin.rb script with vBulletin 3.8.7 Patch Level 1 on a very small data set and got very good results. The script ran without any errors and migrated the data successfully, so the vBulletin script appears to be compatible with vBulletin 3.

The script will now import avatars, but they must be embedded into the database. To do so, I’m told that you visit /admincp/avatar.php?do=storage and do something there. It seemed to work for the last client I did an import for.

Hey, @erlend_sh, make this a wiki and I’ll add to it in a few days when I submit a PR for this importer.

I was following your guide to import my vBulletin forum to Discourse. I am using a docker container for Discourse. I ran the following command:

I got a long list of errors:

sudo: no tty present and no askpass program specified
sudo: no tty present and no askpass program specified
[...]
sudo: no tty present and no askpass program specified
sudo: no tty present and no askpass program specified

Bundler::SudoNotPermittedError: Bundler requires sudo access to install at the moment. Try installing again, granting Bundler sudo access when
prompted, or installing into a different path.
An error occurred while installing rake (11.2.2), and Bundler cannot continue.
Make sure that `gem install rake -v '11.2.2'` succeeds before bundling.

I am new to Ruby on Rails and Docker. I do understand the concept of gems and Bundler.
What do these errors mean and how can I resolve them?

Ok so basically I had to install all the necessary gems.
The script worked quite well.

However, I am not able to login with one of the users I created on my vBulletin forum just before importing the database over to discourse.

You’ll need to set passwords. The script doesn’t import them (last I looked).

And even if it did, you’d need to install a plugin to make them work.

I am planning to write a script that will import the passwords for my case.
I know the MySQL tables I need to consider for vBulletin. What is the table in PostgreSQL for Discourse that stores passwords and is used during login?

Prakarsh, would you be able to share the URL of the vBulletin forum that you are switching from? Will the Discourse one have the same URL?

Yes. Here it is: http://neuroimage.usc.edu/forums/

Our plan is to keep the URL the same for now. I cannot say when it will be live. Probably by the end of next week if everything goes right :stuck_out_tongue:

Before you do that, you should read this topic.

You don’t want to directly touch the Discourse tables. You want a script like this. When I last used it and tried to import the passwords, it ran really slowly. I’m not sure why, but after reading this I decided that it’s probably not a bad idea to make people change their passwords.

@pfaffman - You mentioned you used import_pass field in your import script.
Where exactly do you put the import_pass field? And how do you use it?
If you could share it, that would be a great help.

I have tried using this script but it failed 90% into the import of the users:

su discourse -c 'RAILS_ENV=production ruby script/import_scripts/vbulletin.rb'
root:XXXX@localhost wants vb4
loading existing groups...
loading existing users...
loading existing categories...
loading existing posts...
loading existing topics...

importing groups...
        6 / 19 ( 31.6%)  Failed to create group id 7 Moderators: ["Name has already been taken"]
       19 / 19 (100.0%)  
importing users
     2746 / 15319 ( 17.9%)  W, [2017-10-12T10:07:05.104837 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
W, [2017-10-12T10:07:05.106562 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
W, [2017-10-12T10:07:05.107305 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
    13901 / 15319 ( 90.7%)  script/import_scripts/vbulletin.rb:137:in `strip': invalid byte sequence in UTF-8 (ArgumentError)
        from script/import_scripts/vbulletin.rb:137:in `block (2 levels) in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:226:in `block in create_users'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `each'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `create_users'
        from script/import_scripts/vbulletin.rb:133:in `block in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:784:in `block in batches'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `loop'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `batches'
        from script/import_scripts/vbulletin.rb:117:in `import_users'
        from script/import_scripts/vbulletin.rb:78:in `execute'
        from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
        from script/import_scripts/vbulletin.rb:902:in `<main>'

Anyone have any suggestions regarding how to debug and fix this?

You have an encoding problem in your database.

The database claims it is using one encoding, but some of the data is encoded some other way. A job I just did had a similar problem. UTF-8 likely did not exist when you started your forum.

This is probably a problem for Stack Exchange or a database person. It’s not a problem with the importer.

It’s the kind of problem that will take 10 minutes following a recipe that you don’t understand, ten hours looking for that recipe, or 100 hours fixing it by hand.
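To make the symptom concrete, here is a small Ruby sketch of checking a value for invalid UTF-8 and falling back to another encoding. It is not what the importer does: the helper name `to_valid_utf8` is made up, and the guess that the stray bytes are Windows-1252 is purely an assumption that has to be verified against the real data.

```ruby
# Returns a valid UTF-8 string for the given raw bytes.
# If the bytes are already valid UTF-8 they are returned unchanged;
# otherwise they are reinterpreted as `fallback` (an assumed legacy
# encoding) and converted, replacing anything that cannot be mapped.
def to_valid_utf8(bytes, fallback: "Windows-1252")
  utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  bytes.dup
       .force_encoding(fallback)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

raw = "Jos\xE9".b # 0xE9 is "é" in Windows-1252 but invalid on its own in UTF-8

puts raw.dup.force_encoding(Encoding::UTF_8).valid_encoding? # false
puts to_valid_utf8(raw)                                       # re-encoded value
puts "Jos\xE9".scrub("?")                                     # blunt alternative: replace bad bytes
```

`String#scrub` is the blunt option: it always yields a valid string but loses the original character, while the fallback conversion only helps if the guess about the legacy encoding is right.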

Cheers @pfaffman I was approaching the same conclusion… I started writing up the following before your comment…

It looks like it might be caused by non-ASCII characters in usernames, so I dumped the usernames and searched for non-ASCII characters:

echo "select username from user" | mysql -p vb4 | grep --color='auto' -P -n "[^[:ascii:]]"

These were the entities this threw up:

’ ´ – é ã Ñ í ó 

So I used this page to generate encoded versions and updated the usernames in the database, for example:

UPDATE user SET username="&rsquo; &acute; &ndash; &eacute; &atilde; &Ntilde; &iacute; &oacute;" WHERE username="’ ´ – é ã Ñ í ó"

But the script still fails:

script/import_scripts/vbulletin.rb:137:in `strip': invalid byte sequence in UTF-8 (ArgumentError)
        from script/import_scripts/vbulletin.rb:137:in `block (2 levels) in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:226:in `block in create_users'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `each'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `create_users'
        from script/import_scripts/vbulletin.rb:133:in `block in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:784:in `block in batches'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `loop'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `batches'
        from script/import_scripts/vbulletin.rb:117:in `import_users'
        from script/import_scripts/vbulletin.rb:78:in `execute'
        from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
        from script/import_scripts/vbulletin.rb:902:in `<main>'

So I then dumped the whole users table into a text file:

echo "select * from user" | mysql -p vb4 > /tmp/allusers.txt

And then I tried all the suggestions in this Stack Exchange thread for finding non-ASCII characters and none were found, so now to search the whole database for the problem…

When I find a solution I’ll post it here in case it helps others…

The problem almost certainly exists in other tables.

You need to figure out what it’s encoded in and fix it.

Or, it could be that for some years it’s in one encoding and other years it’s another.

Best of luck. I was relieved that my client had engineers who solved the problem so I didn’t have to. I may have some notes that I’ll try to find when I get out of bed.

The database on the old server was UTF-8, but clearly it has some non-UTF-8 data in it. The site has been in use for a long time and has 13k users, so I took care when dumping the old database:

mysqldump -uroot -p database -r dump.sql

And then I tried various methods to check the encoding (thanks to Stack Exchange):

file dump.sql 
  dump.sql: ASCII text, with very long lines

apt install uchardet 
uchardet dump.sql
  UTF-8

iconv -f utf-8 -t utf-8 dump.sql > dump.utf-8.sql
  iconv: illegal input sequence at position 29618109

apt install moreutils
isutf8 dump.sql 
  dump.sql: line 2995, char 42, byte 29618109: Expecting bytes in the following ranges: 00..7F C2..F4.

And this is an image line:

INSERT INTO `customavatar` VALUES (6355,'ÿØÿà\0^PJFIF\0^A^A\0\0^A\0^A\0\0ÿþ\0;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 75

And before I delete the line using vim I need to increase the size of the partitions as it is a 1.4G file and I’m running out of space when editing it…
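The checks above can also be scripted so that every offending line is tied to the table it belongs to, which helps separate binary columns (such as the avatar data here) from genuinely mis-encoded text. This is only a sketch: the function name `invalid_utf8_lines` is made up, and it assumes the dump has one INSERT statement per line starting with "INSERT INTO `table`", as in the line shown above.

```ruby
require "stringio"

# Scans a mysqldump stream and reports lines that are not valid UTF-8.
# Returns an array of hashes: { line:, table:, byte_offset: } where
# byte_offset is the 0-based position of the first invalid byte in the line.
def invalid_utf8_lines(io)
  report = []
  io.each_line.with_index(1) do |line, number|
    bytes = line.b # raw bytes, so regex matching cannot raise on bad data
    text = bytes.dup.force_encoding(Encoding::UTF_8)
    next if text.valid_encoding?

    # Walk the characters until the first one that is not valid UTF-8.
    offset = 0
    text.each_char do |char|
      break unless char.valid_encoding?
      offset += char.bytesize
    end

    # Table name from "INSERT INTO `name` ...", or nil for other lines.
    table = bytes[/\AINSERT INTO `([^`]+)`/, 1]
    report << { line: number, table: table, byte_offset: offset }
  end
  report
end

# Tiny in-memory demo; for a real dump, open the file in binary mode instead:
#   File.open("dump.sql", "rb") { |f| puts invalid_utf8_lines(f).inspect }
demo = "INSERT INTO `user` VALUES (1,'alice');\n" \
       "INSERT INTO `customavatar` VALUES (6355,'\xFF\xD8\xFF\xE0');\n"
puts invalid_utf8_lines(StringIO.new(demo.b)).inspect
```

If every hit turns out to be in a table that stores binary data, the text columns are probably fine and the problem lies elsewhere.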

The problem was a user with &#55357;&#56444; in the username field. Once the HTML entities were removed the import script ran, but there were a lot of errors like this:

ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): @user_script:14: @user_script: 14: -MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error. 

Is this something that I should be concerned about and something that I should fix and then redo the import?
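Coming back to the username that broke the import: a plausible explanation (an assumption, not confirmed in this thread) is that 55357 and 56444 are the two halves of a UTF-16 surrogate pair, which is not valid as separate characters once the entities are decoded. The Ruby sketch below finds and strips such entities, mirroring the manual fix described above; all helper names are made up for illustration.

```ruby
# Code points reserved for UTF-16 surrogates; they are not valid on their own.
SURROGATES = (0xD800..0xDFFF).freeze
NUMERIC_ENTITY = /&#(\d+);/

# True if the text contains a numeric HTML entity in the surrogate range.
def surrogate_entities?(text)
  text.scan(NUMERIC_ENTITY).any? { |(digits)| SURROGATES.cover?(digits.to_i) }
end

# Removes surrogate-range entities and leaves all other entities untouched.
def strip_surrogate_entities(text)
  text.gsub(NUMERIC_ENTITY) do |entity|
    SURROGATES.cover?(Regexp.last_match(1).to_i) ? "" : entity
  end
end

# Combines a high and a low surrogate into the code point they encode together.
def combine_surrogates(high, low)
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

name = "angel&#55357;&#56444;"
puts surrogate_entities?(name)                       # true
puts strip_surrogate_entities(name)                  # angel
puts format("U+%X", combine_surrogates(55357, 56444)) # U+1F47C
```

Running `surrogate_entities?` over every username before the import would flag rows like this one up front instead of 90% of the way through.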

Are you out of disk space?

Yes! I’m now spinning up a new virtual server with a huge amount of disk space, CPUs and RAM and I’m going to start from scratch…

By the way, I’m following the instructions above for the Docker version and found that I needed to add an additional step in the container:

echo "discourse ALL = NOPASSWD: ALL" >> /etc/sudoers

Before running:

su discourse -c 'bundle install --no-deployment --without test --without development'

In the non-docker instructions above there is this step to delete data from the database before doing the import:

  • Clear existing data from your local Discourse instance

    cd ~/discourse
    bundle exec rake db:drop db:create db:migrate

Was this omitted from the Docker instructions on purpose?

There are two different vBulletin import scripts in the Docker container:

cd /var/www/discourse
find ./ -name vbulletin.rb
./script/bulk_import/vbulletin.rb
./script/import_scripts/vbulletin.rb

The second one is twice the size of the first one and it is the one I have been using.