Migrate a vBulletin 4 forum to Discourse

Instructions need to be updated. Here’s what works for me as of Nov 2020. Note it is indeed better to run this import using screen because it’d be hours to do an import, and using nohup is likely not helpful because the import script will constantly be updating number of each item imported so the stdout file will likely be large.

Install Database To Host vBulletin Data

Download Latest Packages

Note MySQL is no longer available unless Oracle MySQL repo is added explicitly into the repo list. MariaDB has replaced MySQL.

root@uat-app:~# apt-get update
root@uat-app:~# apt-get install libmariadb-dev
root@uat-app:~# apt-get install default-mysql-server

Start Database

root@uat-app:~# service mysql status
[info] MariaDB is stopped..
root@uat-app:~#
root@uat-app:~# service mysql start
[ ok ] Starting MariaDB database server: mysqld.
root@uat-app:~# service mysql status
[info] /usr/bin/mysqladmin Ver 9.1 Distrib 10.3.25-MariaDB, for debian-linux-gnu on x86_64
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Server version 10.3.25-MariaDB-0+deb10u1
Protocol version 10
Connection Localhost via UNIX socket
UNIX socket /var/run/mysqld/mysqld.sock
Uptime: 4 sec

Threads: 7 Questions: 461 Slow queries: 0 Opens: 177 Flush tables: 1 Open tables: 31 Queries per second avg: 115.250.

Install Gems For Database Connectivity

Following shows that the latest ‘bundle’ doesn’t like some of the flags in the original instructions and there is a need to unset ‘deployment’ mode.

root@uat-app:~# echo "gem 'mysql2', require: false" >> /var/www/discourse/Gemfile

root@uat-app:~# echo "gem 'php_serialize', require: false" >> /var/www/discourse/Gemfile

root@uat-app:~# cd /var/www/discourse
root@uat-app:/var/www/discourse# su discourse -c 'bundle install --no-deployment --without test --without development --path vendor/bundle'
[DEPRECATED] The `--path` flag is deprecated because it relies on being remembered across bundler invocations, which bundler will no longer do in future versions. Instead please use `bundle config set path 'vendor/bundle'`, and stop using this flag
[DEPRECATED] The `--without` flag is deprecated because it relies on being remembered across bundler invocations, which bundler will no longer do in future versions. Instead please use `bundle config set without 'development'`, and stop using this flag
You are trying to install in deployment mode after changing
your Gemfile. Run `bundle install` elsewhere and add the
updated Gemfile.lock to version control.

If this is a development machine, remove the /var/www/discourse/Gemfile freeze by running `bundle config unset deployment`.

The dependencies in your gemfile changed

You have added to the Gemfile:
* mysql2
* php_serialize

Update Configuration and Rerun Install

Check By CLI

Checking configuration confirmed that it is set to ‘deployment’ mode.

root@uat-app:/var/www/discourse# bundle config list
Settings are listed in order of priority. The top value will be used.
deployment
Set for your local app (/var/www/discourse/.bundle/config): true

jobs
Set for your local app (/var/www/discourse/.bundle/config): 4

retry
Set for your local app (/var/www/discourse/.bundle/config): 3

path
Set for your local app (/var/www/discourse/.bundle/config): "vendor/bundle"

without
Set for your local app (/var/www/discourse/.bundle/config): [:development, :test]

Check By Inspecting Config File

Following is doing the same check by inspecting the config file.

root@uat-app:/var/www/discourse# cat /var/www/discourse/.bundle/config
---
BUNDLE_DEPLOYMENT: "true"
BUNDLE_JOBS: "4"
BUNDLE_RETRY: "3"
BUNDLE_PATH: "vendor/bundle"
BUNDLE_WITHOUT: "development:test"

Update Configuration

root@uat-app:/var/www/discourse# bundle config set path 'vendor/bundle'
Your application has set path to "vendor/bundle". This will override the global value you are currently setting
root@uat-app:/var/www/discourse# bundle config set without 'development:test'
Your application has set without to "development:test". This will override the global value you are currently setting
root@uat-app:/var/www/discourse# bundle config unset deployment

Validate Configuration Again

root@uat-app:/var/www/discourse# bundle config list
Settings are listed in order of priority. The top value will be used.
path
Set for your local app (/var/www/discourse/.bundle/config): "vendor/bundle"
Set for the current user (/root/.bundle/config): "vendor/bundle"

without
Set for your local app (/var/www/discourse/.bundle/config): [:development, :test]
Set for the current user (/root/.bundle/config): [:development, :test]

jobs
Set for your local app (/var/www/discourse/.bundle/config): 4

retry
Set for your local app (/var/www/discourse/.bundle/config): 3

Attempt Install Again

Run install again for the Gems and exit the container.

root@uat-app:/var/www/discourse# su discourse -c 'bundle install'
...........
Bundle complete! 125 Gemfile dependencies, 163 gems now installed.
Gems in the groups development and test were not installed.
Bundled gems are installed into `./vendor/bundle`
root@uat-app:/var/www/discourse# exit

Create Directory For vBulletin Data

Create Directory

[root@uat standalone]# pwd
/var/discourse/shared/standalone
[root@uat standalone]# mkdir vbulletin

Copy vBulletin Database

[root@uat standalone]# scp <login user>@<vbulletin server IP>:/home/backup/vbulletin/vbulletin-2020-11-14-03:30:01.sql.bz2 ./vbulletin/.

Unzip vBulletin Database

[root@uat containers]# docker exec -it app bash
root@uat-app:/# cd /shared/vbulletin
root@uat-app:/shared/vbulletin# bunzip2 vbulletin-2020-11-14-03\:30\:01.sql.bz2

Setup Data Source

Create Database vb4

root@uat-app:/shared/vbulletin# mysql -uroot -p -e 'CREATE DATABASE vb4'
Enter password:

Import vBulletin Into MariaDB

root@uat-app:/shared/vbulletin# mysql -uroot -p vb4 < vbulletin-2020-11-14-03\:30\:01.sql
Enter password:

Unzip Profile Archives

[root@uat vbulletin]# tar xvfz signaturepics.tar.gz
[root@uat vbulletin]# tar xvfz customavatars.tar.gz
[root@uat vbulletin]# tar xvfz customprofilepics.tar.gz

Update Database Root Password

root@uat-app:/var/www/discourse# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 77
Server version: 10.3.25-MariaDB-0+deb10u1 Debian 10

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> ALTER USER 'root'@'localhost' IDENTIFIED BY '1234';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> quit
Bye

Import Into Discourse

Set Data Source Connection Details

[root@uat vbulletin]# export DB_NAME="vb4"
[root@uat vbulletin]# export DB_USER="root"
[root@uat vbulletin]# export DB_PW="1234"
[root@uat vbulletin]# export TABLE_PREFIX="vbulletin"
[root@uat vbulletin]# export ATTACHMENT_DIR='/shared/vbulletin'
[root@uat vbulletin]# export TIMEZONE="America/Vancouver"
[root@uat vbulletin]# cd /var/www/discourse
root@uat-app:/var/www/discourse# su discourse -c 'bundle exec ruby script/import_scripts/vbulletin.rb'
root:1234@localhost wants vb4
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...

importing groups...
15 / 15 (100.0%) [3272 items/min] n]
importing users
117 / 11033 ( 1.1%) [145 items/min] in]
3 Likes

So the problem with your initial connection to the database was mixing libraries?

2 Likes

I don’t think so. I literally just changed the password explicitly and it was able to run. I am having a problem when importing private messages now. I am seeing a lot of the following. Any idea what this is?

importing private messages...
      139 / 177409 (  0.1%)  [399 items/min]  one of the participant's id is nil -- [nil, 270]
pm-149 has no target (a:1:{i:486;s:5:"TonyN";})
      364 / 177409 (  0.2%)  [418 items/min]  one of the participant's id is nil -- [nil, 276]
pm-420 has no target (a:1:{i:623;s:14:"the other side";})
      571 / 177409 (  0.3%)  [414 items/min]  one of the participant's id is nil -- [nil, 445]
pm-702 has no target (a:1:{i:767;s:6:"greatg";})
      572 / 177409 (  0.3%)  [414 items/min]  one of the participant's id is nil -- [nil, 445]
      605 / 177409 (  0.3%)  [416 items/min]  one of the participant's id is nil -- [nil, 461]
1 Like

It’s either that those users didn’t get imported for some reason (missing email address used to a cause, but that should be resolved now), or for some other reason the code that looks up imported usernames isn’t working (case of the username, perhaps?).

3 Likes

@pfaffman yeah it does look that way although it’s not clear on some of the specifics. Let’s look at the first one for example.

  1. What does pm-149 mean?
  2. For a:1:{i:486;s:5:"TonyN";} the text “TonyN” looks like the username but what about the other numbers?
  3. How about [nil, 270]? What does 270 signify?

If I can understand what it’s complaining I can at least try to look up the database to see if there are any data problems. But I’m not sure what these really mean.

BTW I also noticed that all imported forums have permission for everyone. This means all the forums permitted only for moderators were set to be everyone visible. Is there a way to control this?

1 Like

Sorry. I don’t remember well enough to explain. This is about all of the free help I have to offer on this one.

Of course. See How to use category security settings to create private categories - faq - Discourse Meta

Some importers endeavor to import groups, but few of them know how to apply those permissions to the categories that get imported. You’ll need to fix those up by hand.

2 Likes

Following @titusc 's instructions, and I seem to be having issues importing the database…

root@DO-Discourse-app:/shared/vbulletin# mysql -uroot -p vb4 < CC12-Sat-Full-Backup.sql
Enter password: 
ERROR 1265 (01000) at line 4928: Data truncated for column 'method' at row 1
root@DO-Discourse-app:/shared/vbulletin#

Any suggestions on what it’s looking for?

N/M It’s errors in the original database…

2 Likes

I’m only a recent discourse convert, so after a lot of trial and error I’ve combined everything above into a full command by command list (thanks @titusca and @enigmaty).

Hopefully this will help (or at least accelerate) fellow newcomers go from start to finish. Would like to incorporate this into the first post given the updates to mysql->mariadb that I think have thrown a lot of confusion into the process.

Background:

  • 1.6 million post transfer.
  • Utilized Digital Ocean Droplet (CPU Optimized 4 vCPU/8GB)

#1 - Install Digital Ocean Discourse 1-click droplet

#2 - Finish discourse install through SSH by following prompts

Open SSH console
root
(yourrootpassword)
(enter)
(yourdomain).com
(etc…)

#3 - Login to SFTP to upload database dump

sftp root@XXX.XXX.XX.XX
y
yes
(yourrootpassword)
put db.sql /var/discourse/shared/standalone/db.sql

#4 - Login to new discourse website to setup admin account

#5 - Login to SSH - begin process

ssh root@XXX.XXX.XX.XX
cd /var/discourse
./launcher start app
docker exec -it app bash
sudo apt-get update
sudo apt-get upgrade
y

#6 - Install MariaDB (replacement for mysql)

apt-get update && apt-get install mariadb-server-10.3 libmariadbd-dev
y

#7 - Mysql Database Setup

service mysql start
mysql -u root -p
password
create database vbulletin;
exit;

#8 - Vbulletin -> Mysql Database Transfer

mysql -u root -p vbulletin < /shared/db.sql
password

#9 - GEM File

echo “gem ‘mysql2’” >>Gemfile
echo “gem ‘mysql2’, require: false” >> /var/www/discourse/Gemfile
echo “gem ‘php_serialize’, require: false” >> /var/www/discourse/Gemfile
cd /var/www/discourse
su discourse -c ‘bundle install --no-deployment --without test --without development --path vendor/bundle’
(Ignore red text result)

#10 - Configure install script

vi /var/www/discourse/script/import_scripts/vbulletin.rb

#10.a - Make edits to text file as needed

DB_HOST ||= ENV[‘DB_HOST’] || “localhost”
DB_NAME ||= ENV[‘DB_NAME’] || “vbulletin”
DB_PW ||= ENV[‘DB_PW’] || “password”
DB_USER ||= ENV[‘DB_USER’] || “root”
TIMEZONE ||= ENV[‘TIMEZONE’] || “America/Los_Angeles”
TABLE_PREFIX ||= ENV[‘TABLE_PREFIX’] || “”
ATTACHMENT_DIR ||= ENV[‘ATTACHMENT_DIR’] || ‘/shared/attachments/’

#10.c - End edits

:wq

#11 - Bundle Config

bundle config set path ‘vendor/bundle’
bundle config set without ‘development:test’
bundle config unset deployment
su discourse -c ‘bundle install’

#12 - Mysql config (may be possible to do this with previous)

mysql --version
sudo mysql -u root -p
password
ALTER USER 'root'@'localhost' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
exit

#13 - Install Script

su discourse -c ‘bundle exec ruby script/import_scripts/vbulletin.rb’

Good luck!

5 Likes

Just wanted to leave feedback after our migration from vB4:

  • FIXED Soft-Deleted posts where not properly hidden: https://github.com/discourse/discourse/pull/12057
  • [ul] + [li] and nested [LIST] were not migrated properly and the BBcode plugin doesn’t seem to handle this either → This seems to be expected: CommonMark testing started here! (Quote: Core will not implement [ul] [ol] and [li] support for BBCode cause it is a recipe for failure.) → I will need to build some RegEx magic post-fixup for this.
  • We made an initial migration using the normale importer (took > 3 days) and restarted the migration with newer DB snapshots a couple of times to keep the import “fresh” and reduce the downtime to effectively 30 minutes. This procedure worked quite well, except for everything that was edited after we initially imported the threads, posts. We need to manually rework this information now.
  • Creating Plugins for Discourse is really hard due to lack of documentation and a big picture of how the folder structure works. Though it is getting nicer and better after you understand how it works.

Questions that i have left:

  • I not not sure how the importer maps already imported posts and how to match the old vB4 post_id to the new Discourse post_id to hide those “soft-deleted” post. If someone can give me a hint that would be very welcome! Found it: import_id inside the post_custom_fields table. Nice. Now i need to write some handy script to fix this :slight_smile: → Edit: An even better way is to use the importer script, which maps all imported id’s for easy use.
2 Likes

Unfortunately I can’t edit my previous post :slight_smile:

I found another issue: Every attachment that is not linked into a post, will not be available to Discourse.

My draft PR for fixing this issue: https://github.com/discourse/discourse/pull/12187

Thanks!

3 Likes

Just a quick followup on my issue list. I fixed the visibility problem.

Dump all affected posts from your old vBulletin database:

SELECT postid
FROM `vb4_post`
WHERE `visible` > '1'
ORDER BY postid

Make an imported_post_ids.txt file which has all the postid’s line by line

Create a new file for the fixing script:

nano script/import_scripts/fix_visibility.rb 

Content:

require_relative '../../config/environment'
require_relative 'base/lookup_container'

@lookup = ImportScripts::LookupContainer.new

broken_postids = []
broken_real_postids = []

File.foreach("imported_post_ids.txt") do |line|
  broken_postids.append(line.to_i)
end

broken_postids.each do |id|
  broken_real_postids.append(@lookup.post_id_from_imported_post_id(id))
end

broken_real_postids.each do |id|
  puts id
  Post.find(id).trash!
end

Run the script:

su discourse -c 'bundle exec ruby script/import_scripts/fix_visibility.rb'

The script will use the logic from the importer to map the imported post_id’s to the read discourse post_id’s which we want to hide.

4 Likes

Hi folks,

I’ve got the script churning away on a vb3 migration. I’m doing one step at a time and it’s currently churning through 122k users at 330/minute. Then we’ll have 2.5 million posts to go through.

We’re doing this on a production server. Nobody’s using the discourse site, we just set it up and it’s at an anonymous url. If I log in I can see the new user notifications incrementing. Probably a dumb question, but I wonder if the migration would process faster if we suspended or disabled the live site somehow?

1 Like

That depends on the load and amount of CPU’s on your production server. You can always try to stop the web server for 5 minutes and see if the import goes faster.

3 Likes

Importing really takes a while. As far as i know the bulk importer should be faster. We did a first import from a backup on our beefy dev machine and then did an incremental one from another backup to make the switch to Discourse with only half an hour of downtime. Beware of the things that can go wrong when doing incremental updates :slight_smile: (See here: Importing / migrating from vBulletin 4 - #132 by paresy)

paresy

3 Likes

I see one core pegged that I think is the server ingesting the updated data, and another core pegged when running the import script. I really don’t have the domain knowledge to know if competition between those two processes for the db resource could be slowing down the importer, and I also don’t have the domain knowledge to know if it’s even possible to stop the ingestion while leaving the container up. The ingestion has to happen anyway, so I suppose the safest thing to do is just let it keep on churning away.

One tip for future readers, I see that 27k (22%!) of our users are banned spambots. We’ll purge them on the source side before doing the final import.

[adding] One necessary edit that I don’t see mentioned above:

--- a/script/import_scripts/vbulletin.rb
+++ b/script/import_scripts/vbulletin.rb
@@ -134,6 +133,7 @@ EOM
        , usertitle
        , usergroupid
        , joindate
+       , lastvisit
        , email
        , password
        , salt

And an edit that may be vb3 specific:

--- a/script/import_scripts/vbulletin.rb
+++ b/script/import_scripts/vbulletin.rb
@@ -987,7 +989,7 @@ EOM
   end

   def parse_timestamp(timestamp)
-    Time.zone.at(@tz.utc_to_local(timestamp))
+    Time.zone.at(@tz.utc_to_local(Time.at(timestamp)))
   end

[adding] The import is running on an oracle cloud 4-core ampere instance. For comparison I installed a discourse dev server local/native on an M1 macbook air and was surprised that the import process ran significantly slower.

5 Likes

Were you getting errors with the pre-existing script? I lost the date & time information from all our old vBulletin 4 posts because of that. If this is a fix, I’d love to know if reimporting would be a good idea if all posts have been copied over.

2 Likes

Yes, the script would error out because it was feeding an integer to a time function.

3 Likes

No. The script skips posts that have been imported already.

3 Likes

Hi,

Did you figure out how to fix this?

Our two main/bottom forums have parentid = -1 (I think this is due to us converting from v3 back in the day).

I’m not sure how to go ahead, do I just set them to 0 if -1 in the conversion script? Assuming 0 is the main discourse category?

Actually, looking at the discourse site now; those two seems to be the only two that has been imported?

 importing top level categories...
         2 / 2 (100.0%)  [211 items/min]  in]
 importing children categories...
 Traceback (most recent call last):
         5: from script/import_scripts/vbulletin.rb:1003:in `<main>'
         4: from /var/www/discourse/script/import_scripts/base.rb:47:in `perform'
         3: from script/import_scripts/vbulletin.rb:84:in `execute'
         2: from script/import_scripts/vbulletin.rb:287:in `import_categories'
         1: from script/import_scripts/vbulletin.rb:287:in `each'
script/import_scripts/vbulletin.rb:289:in `block in import_categories': undefined method `[]' for nil:NilClass (NoMethodError)
1 Like

Probably. I’ve done a bunch of vBulletin imports since then. :person_shrugging:

You’ll just have to give it a try and see what happens. It does look like the same thing I described.

I’d just modify the script to . . . do something … if that thing is nil.

1 Like