Importing from vBulletin 4

Ok so basically I had to install all the necessary gems.
The script worked quite well.

However, I am not able to log in as one of the users I created on my vBulletin forum just before importing the database into Discourse.

You’ll need to set passwords. The script doesn’t import them (last I looked).

And even if it did, you’d need to install a plugin to make them work.

I am planning to write a script that will import the passwords for my case.
I know which MySQL tables I need to consider for vBulletin. Which PostgreSQL table does Discourse store passwords in and use during login?

Prakarsh, would you be able to share the URL of the forum you are trying to switch from (the vBulletin one)? Will the Discourse one have the same URL?

Yes. Here it is: http://neuroimage.usc.edu/forums/

Our plan is to keep the same URL for now. I cannot say when it will be live. Probably by the end of next week if everything goes right :stuck_out_tongue:

Before you do that, you should read this topic.

You don’t want to directly touch the Discourse tables. You want a script like this. When I last used it and tried to import the passwords, it ran really slowly. I’m not sure why, but after reading this I decided that it’s probably not a bad idea to make people change their passwords.

@pfaffman - You mentioned you used an import_pass field in your import script.
Where exactly do you put the import_pass field, and how do you use it?
If you could share it, that would be a great help.

I have tried using this script, but it failed 90% of the way through the user import:

su discourse -c 'RAILS_ENV=production ruby script/import_scripts/vbulletin.rb'
root:XXXX@localhost wants vb4
loading existing groups...
loading existing users...
loading existing categories...
loading existing posts...
loading existing topics...

importing groups...
        6 / 19 ( 31.6%)  Failed to create group id 7 Moderators: ["Name has already been taken"]
       19 / 19 (100.0%)  
importing users
     2746 / 15319 ( 17.9%)  W, [2017-10-12T10:07:05.104837 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
W, [2017-10-12T10:07:05.106562 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
W, [2017-10-12T10:07:05.107305 #3578]  WARN -- : Bad date/time value "0000:00:00 00:00:00": mon out of range
    13901 / 15319 ( 90.7%)  script/import_scripts/vbulletin.rb:137:in `strip': invalid byte sequence in UTF-8 (ArgumentError)
        from script/import_scripts/vbulletin.rb:137:in `block (2 levels) in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:226:in `block in create_users'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `each'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `create_users'
        from script/import_scripts/vbulletin.rb:133:in `block in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:784:in `block in batches'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `loop'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `batches'
        from script/import_scripts/vbulletin.rb:117:in `import_users'
        from script/import_scripts/vbulletin.rb:78:in `execute'
        from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
        from script/import_scripts/vbulletin.rb:902:in `<main>'

Anyone have any suggestions regarding how to debug and fix this?

You have an encoding problem in your database.

The database claims it is using one encoding, but some of the data is encoded some other way. A job I just did had a similar problem. UTF-8 was likely not in common use when you started your forum.

This is probably a problem for Stack Exchange or a database person. It’s not a problem with the importer.

It’s the kind of problem that will take 10 minutes following a recipe that you don’t understand, ten hours looking for that recipe, or 100 hours fixing it by hand.
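
For anyone hitting the same `invalid byte sequence in UTF-8` error, one common shape of that recipe is to keep values that are already valid UTF-8 and reinterpret everything else as a legacy single-byte encoding. A minimal Ruby sketch, assuming the stray bytes are Windows-1252 (a guess that has to be checked against your own data); `to_utf8` is a hypothetical helper, not part of the Discourse import scripts:

```ruby
# Hypothetical helper: return a valid UTF-8 copy of a raw database value.
# ASSUMPTION: anything that is not valid UTF-8 is Windows-1252.
def to_utf8(raw, fallback: "Windows-1252")
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Reinterpret the original bytes in the fallback encoding and transcode.
  # Bytes the fallback cannot map are replaced instead of raising.
  raw.dup.force_encoding(fallback)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

to_utf8("caf\xE9".b) # => "café"
```

If the guess about the fallback encoding is wrong, the result is mojibake rather than an error, so spot-check a few known names before trusting it.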

Cheers @pfaffman I was approaching the same conclusion… I started writing up the following before your comment…

It looks like it might be caused by non-ASCII characters in usernames, so I dumped the usernames and searched them for non-ASCII characters:

echo "select username from user" | mysql -p vb4 | grep --color='auto' -P -n "[^[:ascii:]]"

These were the characters this threw up:

’ ´ – é ã Ñ í ó 
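
Worth noting that a non-ASCII match is not necessarily a bad one: characters like é or Ñ are fine as long as they are stored as valid UTF-8. A rough Ruby sketch that separates the three cases; `classify_encoding` is a made-up name for illustration only:

```ruby
# Hypothetical helper: classify a raw value by how it reads as UTF-8.
#   :ascii   - plain ASCII, never an encoding problem
#   :utf8    - non-ASCII but valid UTF-8, also fine
#   :invalid - bytes that do not form valid UTF-8 (the real suspects)
def classify_encoding(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  return :invalid unless text.valid_encoding?

  text.ascii_only? ? :ascii : :utf8
end

classify_encoding("caf\u00E9") # => :utf8
classify_encoding("caf\xE9".b) # => :invalid
```

Only the `:invalid` rows need fixing; rewriting the `:utf8` ones as HTML entities changes the usernames without addressing the error.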

So I used this page to generate encoded versions and updated the usernames in the database, for example:

UPDATE user SET username="&rsquo; &acute; &ndash; &eacute; &atilde; &Ntilde; &iacute; &oacute;" WHERE username="’ ´ – é ã Ñ í ó"

But the script still fails:

script/import_scripts/vbulletin.rb:137:in `strip': invalid byte sequence in UTF-8 (ArgumentError)
        from script/import_scripts/vbulletin.rb:137:in `block (2 levels) in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:226:in `block in create_users'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `each'
        from /var/www/discourse/script/import_scripts/base.rb:225:in `create_users'
        from script/import_scripts/vbulletin.rb:133:in `block in import_users'
        from /var/www/discourse/script/import_scripts/base.rb:784:in `block in batches'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `loop'
        from /var/www/discourse/script/import_scripts/base.rb:783:in `batches'
        from script/import_scripts/vbulletin.rb:117:in `import_users'
        from script/import_scripts/vbulletin.rb:78:in `execute'
        from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
        from script/import_scripts/vbulletin.rb:902:in `<main>'

So I then dumped the whole users table into a text file:

echo "select * from user" | mysql -p vb4 > /tmp/allusers.txt

And then I tried all the suggestions in this Stack Exchange thread for finding non-ASCII characters and none were found, so now to search the whole database for the problem…

When I find a solution I’ll post it here in case it helps others…

The problem almost certainly exists in other tables.

You need to figure out what it’s encoded in and fix it.

Or, it could be that for some years it’s in one encoding and other years it’s another.

Best of luck. I was relieved that my client had engineers who solved the problem so I didn’t have to. I may have some notes that I’ll try to find when I get out of bed.

The database on the old server was UTF-8, but it clearly has some non-UTF-8 data in it. The site has been in use for a long time and has 13k users, so I took care when dumping the old database:

mysqldump -uroot -p database -r dump.sql

Then I tried various methods to check the encoding (thanks to Stack Exchange):

file dump.sql 
  dump.sql: ASCII text, with very long lines

apt install uchardet 
uchardet dump.sql
  UTF-8

iconv -f utf-8 -t utf-8 dump.sql > dump.utf-8.sql
  iconv: illegal input sequence at position 29618109

apt install moreutils
isutf8 dump.sql 
  dump.sql: line 2995, char 42, byte 29618109: Expecting bytes in the following ranges: 00..7F C2..F4.

Line 2995 turns out to be image data:

INSERT INTO `customavatar` VALUES (6355,'ÿØÿà\0^PJFIF\0^A^A\0\0^A\0^A\0\0ÿþ\0;CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 75
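
The byte-offset check can also be done in Ruby without installing extra packages. A sketch, assuming it is acceptable to hold one dump line in memory at a time (mysqldump lines can be very long); `first_invalid_utf8_line` is a hypothetical helper, and the sample data is invented:

```ruby
require "stringio"

# Hypothetical helper: report the first line that is not valid UTF-8.
# Returns the 1-based line number and the 0-based byte offset at which
# that line starts, or nil when every line is valid.
def first_invalid_utf8_line(io)
  offset = 0
  io.each_line.with_index(1) do |line, number|
    unless line.dup.force_encoding(Encoding::UTF_8).valid_encoding?
      return { line: number, byte_offset: offset }
    end
    offset += line.bytesize
  end
  nil
end

# Invented sample: the second line contains a byte that is never valid UTF-8.
sample = StringIO.new("ok\nbad \xFF here\nfine\n".b)
p first_invalid_utf8_line(sample) # line 2, which starts at byte offset 3
```

On a real dump this would be called as `File.open("dump.sql", "rb") { |f| p first_invalid_utf8_line(f) }`. A hit inside a binary column such as an avatar image is expected and is not a text-encoding problem; mysqldump's `--hex-blob` option writes binary columns as hex, which keeps them from tripping UTF-8 checks.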

Before I delete the line using vim, I need to increase the size of the partitions: it is a 1.4G file and I’m running out of space when editing it…

The problem was a user with &#55357;&#56444; in the username field. Once the HTML entities were removed, the import script ran, but there were a lot of errors like this:

ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): @user_script:14: @user_script: 14: -MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error. 

Is this something that I should be concerned about and something that I should fix and then redo the import?
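
Going back to the `&#55357;&#56444;` username for a moment: those two numbers are 0xD83D and 0xDC7C, a UTF-16 surrogate pair that together encode a single character, U+1F47C (an emoji). Taken one at a time they are not valid Unicode characters, which may be why the importer choked on them, though that is a guess. A Ruby sketch that merges such pairs into the real character instead of deleting them; `combine_surrogate_entities` is a hypothetical helper:

```ruby
require "strscan"

HIGH_SURROGATES = (0xD800..0xDBFF)
LOW_SURROGATES  = (0xDC00..0xDFFF)

# Hypothetical helper: replace adjacent decimal entities that form a UTF-16
# surrogate pair with the single character they encode. Everything else,
# including other entities, is copied through unchanged.
def combine_surrogate_entities(text)
  scanner = StringScanner.new(text)
  out = +""
  until scanner.eos?
    if scanner.check(/&#(\d+);&#(\d+);/)
      high = scanner[1].to_i
      low = scanner[2].to_i
      if HIGH_SURROGATES.cover?(high) && LOW_SURROGATES.cover?(low)
        code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        out << [code_point].pack("U")
        scanner.pos += scanner.matched_size
        next
      end
    end
    out << scanner.getch
  end
  out
end

combine_surrogate_entities("&#55357;&#56444;") == "\u{1F47C}" # => true
```

Whether the merged character is then acceptable as a Discourse username is a separate question; removing it, as was done here, sidesteps that.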

Are you out of disk space?

Yes! I’m now spinning up a new virtual server with a huge amount of disk space, CPUs and RAM and I’m going to start from scratch…

By the way, I’m following the instructions above for the Docker version and found that I needed to add an additional step in the container:

echo "discourse ALL = NOPASSWD: ALL" >> /etc/sudoers

Before running:

su discourse -c 'bundle install --no-deployment --without test --without development'

In the non-docker instructions above there is this step to delete data from the database before doing the import:

  • Clear existing data from your local Discourse instance
cd ~/discourse
bundle exec rake db:drop db:create db:migrate

Was this omitted from the Docker instructions on purpose?

There are two different vBulletin import scripts in the Docker container:

cd /var/www/discourse
find ./ -name vbulletin.rb
./script/bulk_import/vbulletin.rb
./script/import_scripts/vbulletin.rb

The second one is twice the size of the first one and it is the one I have been using.

Posting a link to this thread in case anyone else does an import from vBulletin and then can’t work out why the Rails console doesn’t work: you need to back up your data after an import and migrate it to a new server before things work properly (I don’t know why).

It might also be worth linking to this thread on the matter of vBulletin imported user background images.

Thanks for this info. I’m planning to do the migration as well.
@enigmaty How big was your database? Were the attachments on the file server or in the DB? I’m at a 1GB DB and 100GB+ on the file system, running vBulletin 4.

I am working on a couple of vBulletin imports now. One has almost five million posts and took about a week to run in an earlier test. I’ve made some improvements to the script that handle PMs, internal links, 301 redirects, and a bunch of formatting stuff.

If I ever get caught up I’ll submit a PR.
