Help with the vBulletin importer script


(Tor) #1

Can somebody help me with a guide for the vBulletin importer script? I have no idea how it works or what I should do to install or use it. Please help!
I'm currently running on Ubuntu.


(Kirupa Chinnathambi) #2

+1

I’m looking for this information as well.

I see the scripts here: discourse/script/import_scripts at master · discourse/discourse · GitHub

Some of my questions are: How do I run this script? How do I point it to my vBulletin forum/database and provide my username/password? Do I need to copy (e.g. via rsync) my MySQL database from my current vBulletin server to my DigitalOcean account?


(James Cook) #3

I’m in the process of running this script myself and making changes so that it will work without forum and group mappings, as well as working through other little quirks.

Basically, the original script was written by @zogstrip for a specific client request, so there are some very client-specific things in there.

The basic way of running this script is to export your vBulletin database data to CSV files and point to these files when you run the script using the options that it asks you to pass in.

You run a rails script by using the following command:

I will put a guide together for this importer once I’ve got everything up and running and made a few changes.


(Jeff Atwood) #4

Feel free to submit PRs to improve the importer and make it more general! Very much appreciated.


(Helmi) #5

Just saying I’d also be interested. I still have a vBulletin migration in mind that should happen this year. Please keep us posted, @jamesmarkcook, about how things go for you and which data does and does not get migrated.


(Mittineague) #6

I was not involved with technical aspects of the migration of our vB forum to Discourse, but here are some facts as best as I can think of ATM.

Many threads were successfully migrated over to topics. This was done in “small” batches.
Many member accounts were successfully migrated over.

Known issues:
Converting bbCode to equivalent Markdown was a bit gnarly. Most was good, some required “rebaking”, and some failed.
Similarly, some vB smilies now show as text.
Many attachments in posts were lost.
Some vB member names did not meet the Discourse member name rules.
User “meta data” was lost, e.g. profile info and join date.
All imported member accounts started at TL0.
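
To give an idea of what the bbCode conversion involves, here is a minimal Ruby sketch. It is not the importer's actual code: the `bbcode_to_markdown` helper and the three tags it handles are assumptions chosen purely for illustration, and real posts contain nested and malformed tags that a few regular expressions will not catch.

```ruby
# Hypothetical helper: convert a handful of simple bbCode tags to Markdown.
# Non-greedy matches keep each tag pair separate; /m lets a tag span lines.
def bbcode_to_markdown(text)
  text
    .gsub(/\[b\](.*?)\[\/b\]/im, '**\1**')             # [b]bold[/b]   -> **bold**
    .gsub(/\[i\](.*?)\[\/i\]/im, '*\1*')               # [i]ital[/i]   -> *ital*
    .gsub(/\[url=(.*?)\](.*?)\[\/url\]/im, '[\2](\1)') # [url=u]t[/url] -> [t](u)
end

puts bbcode_to_markdown("[b]bold[/b] and [url=http://example.com]a link[/url]")
```

Anything the patterns do not recognise is passed through unchanged, which is roughly how leftover tags and smilies end up showing as plain text.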


(silfax) #7

I’m having trouble with the vBulletin importer script. I made the CSV files from my vBulletin database, but when I run the script I get this error and I don’t know how to handle it :frowning:

/usr/lib64/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:45:in `require': cannot load such file -- /opt/config/environment (LoadError)
	from /usr/lib64/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:45:in `require'
	from /opt/discourse/scripts/base.rb:19:in `initialize'
	from scripts/vbulletin.rb:24:in `initialize'
	from scripts/vbulletin.rb:20:in `new'
	from scripts/vbulletin.rb:20:in `run'
	from scripts/vbulletin.rb:680:in `<main>'

I understand that the script is looking for a (configuration?) file, but do I need to create one? If so, what should it contain?

My Discourse instance is located under /opt/discourse and is running on openSUSE 13.1.

Any help would be really appreciated.


(Kane York) #8

I think there’s a relative path error somewhere.
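
To illustrate, here is a small sketch. It assumes (I have not confirmed this against the file) that line 19 of base.rb requires config/environment two directories above its own location; `environment_path_for` is a hypothetical helper, not something from the importer.

```ruby
# Hypothetical helper: resolve "../../config/environment" relative to the
# directory holding the import script (assumed to be what base.rb does).
def environment_path_for(script_dir)
  File.expand_path("../../config/environment", script_dir)
end

# Script copied into /opt/discourse/scripts, as in the stack trace above:
puts environment_path_for("/opt/discourse/scripts")
# Script left in the repo's own script/import_scripts directory:
puts environment_path_for("/opt/discourse/script/import_scripts")
```

If that assumption holds, the first call resolves to /opt/config/environment, which is exactly the path in the error, while the second lands inside the Discourse root. So running the script from where the repo keeps it, rather than creating a new file, may be all that is needed.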


#9

I have spent the past few weeks importing a vB forum with @kirupa. The script in the Discourse repo was a helpful start, but as others have mentioned, it suffers from being used for a one-off project. I’ll post some thoughts and code here in the hopes that it helps others. (I’ll consider contributing to the in-repo script, but I’m not sure my stuff is that much more robust.)

SQL export

One thing I noticed is that the script expects you to generate some CSVs from your vB database, but it unhelpfully doesn’t give you any example export queries. So I’ll share a link to mine here.

Some things to note about the export process:

  • You’ll probably need to do some custom escaping and CSV line/field termination settings. Otherwise Ruby’s CSV parser will choke. The existing vbulletin.rb importer has different field/line termination rules for different exports (like the user table vs. permission table). I ended up picking one format and modifying the import script to ignore the defaults, because I didn’t understand why they were different.
  • As @Mittineague mentioned, you’ll need to import large forums in batches, otherwise the script will choke. On a huge DigitalOcean instance, I managed to import all of the first-posts of each thread successfully (about 300k threads) in one batch. But I batched the reply import to 500k at a time (we have 2.3 million posts).
  • To get only the first posts in a thread to a separate CSV, I used some funky joins and sorting (see the gist for details.)
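
As a rough sketch of the "pick one format" approach from the first point: the snippet below parses every export with the same explicit settings instead of relying on per-table defaults. The tab separator, the `EXPORT_FORMAT` constant, the `parse_export` helper, and the column names in the sample are all assumptions for illustration, not what vbulletin.rb or my export queries actually use.

```ruby
require "csv"

# Assumed export format (hypothetical): tab-separated fields, "\n" line
# endings, double-quote quoting, and a header row on the first line.
EXPORT_FORMAT = { col_sep: "\t", row_sep: "\n", quote_char: '"', headers: true }.freeze

# Parse an export into an array of hashes keyed by column name.
def parse_export(text)
  CSV.parse(text, **EXPORT_FORMAT).map(&:to_h)
end

# A quoted field may contain commas and newlines without breaking the row.
sample = "postid\tthreadid\tpagetext\n1\t10\t\"Hello,\nworld\"\n2\t10\tSecond post\n"
rows = parse_export(sample)
puts rows.length
puts rows.first["pagetext"]
```

Whatever format you choose, the export side has to quote and escape fields the same way, otherwise post bodies containing the separator or a newline will split rows.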

Forum to Category mappings

This was the most confusing/undocumented part of the vbulletin.rb import script for me, perhaps aside from getting the script to start running at all. I ended up ripping out the existing code there and writing my own, since I spent a few hours trying to understand the mapping and never got close to getting it working.

My fix was to pre-create the Discourse categories that I wanted, then I made a CSV to map my old forum IDs onto the Discourse category IDs. I can share that code (or my whole modified vbulletin.rb) if there’s interest, but I might be the only one who couldn’t get that mapping to work.
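
Roughly, the idea looks like the sketch below. This is not the code from my modified script and not the mapping format the original vbulletin.rb expects: the two-column layout, the column names, the sample IDs, and the `load_category_map` helper are all hypothetical.

```ruby
require "csv"

# Hypothetical mapping file: one row per old vBulletin forum, giving the ID
# of the pre-created Discourse category its threads should land in.
MAPPING_ROWS = <<~ROWS
  forum_id,category_id
  2,5
  3,5
  7,12
ROWS

# Build a Hash of old forum ID => Discourse category ID.
def load_category_map(text)
  CSV.parse(text, headers: true).each_with_object({}) do |row, map|
    map[Integer(row["forum_id"])] = Integer(row["category_id"])
  end
end

category_map = load_category_map(MAPPING_ROWS)
puts category_map.fetch(7)
```

Using `fetch` rather than `[]` means a thread from an unmapped forum raises a `KeyError` right away instead of silently importing without a category.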

Misc Notes

  • A fairly significant chunk of posts were lost in the import, but it was really a pain to debug: tiny import batches imported completely, while larger ones took hours, so it was not feasible to iterate much on the script or to get a good idea of which posts were missing and why.
  • The script renames users with invalid names, but sometimes the new ones are really terrible. For example, we have a few users with single-character Unicode names, like λ, who was renamed to 955 because that’s λ’s HTML entity stripped of the HTML entity escape characters.
  • There’s really heavy CPU usage all the time after the import, even if it’s been a few days since the import and nobody is browsing the site. I haven’t figured out if this is normal or anomalous, but I might post a question about that in a new topic here. (I’m pretty sure they’re not Sidekiq tasks.)
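
To show how λ can end up as 955, here is a small illustration. It only mimics the effect described above and is not the importer's renaming code; `naive_sanitize` and `fallback_username` are hypothetical names, and the `user_<id>` fallback is just one possible alternative.

```ruby
# Illustration only: drop every character outside letters, digits, and
# underscore from the HTML-entity form of a name.
def naive_sanitize(entity_encoded_name)
  entity_encoded_name.gsub(/[^A-Za-z0-9_]/, "")
end

# Hypothetical alternative: if nothing but digits survives, fall back to a
# name built from the old vB user ID instead.
def fallback_username(entity_encoded_name, vb_user_id)
  cleaned = naive_sanitize(entity_encoded_name)
  cleaned.match?(/[A-Za-z]/) ? cleaned : "user_#{vb_user_id}"
end

puts naive_sanitize("&#955;")         # the entity for λ (code point 955)
puts fallback_username("&#955;", 42)
```

A fallback like this is at least recognisable as a placeholder, so affected users can be found and renamed by hand afterwards.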

You wouldn’t believe how much time I spent googling around to figure out the right syntax for invoking that script with the right rails context. It was the runner bit I was missing. I tried all sorts of rails c + load variants, plain old ruby, etc. In the end (since I did this before you replied 11 days ago) I wrote bunches of the script so that it wouldn’t read command line arguments and could be started using rails c + load 'scriptname.rb'.

You’ve helped many future generations of vB importers just with that simple detail!


#10

@krilnon, would really like to see your version of the vbulletin.rb script. I have a 2M post vBulletin forum that I want to start experimenting with…


#11

Hi @jpg,

I haven’t reviewed this script in many months, but here it is:

I broke my post exports into roughly 500k post chunks, otherwise the importer would eventually die on our host. Your mileage may vary.
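
For planning the chunks, a sketch like this works out the offset and size of each batch. The `batch_ranges` helper is hypothetical and the LIMIT/OFFSET wording is only an assumption about how you might slice the export query; the 2.3 million and 500k figures are the ones mentioned earlier in this topic.

```ruby
# Hypothetical helper: split total_rows into consecutive batches of at most
# batch_size rows, returning the offset and row count of each batch.
def batch_ranges(total_rows, batch_size)
  (0...total_rows).step(batch_size).map do |offset|
    { offset: offset, limit: [batch_size, total_rows - offset].min }
  end
end

batch_ranges(2_300_000, 500_000).each do |batch|
  puts "LIMIT #{batch[:limit]} OFFSET #{batch[:offset]}"
end
```

With those numbers this gives five batches, the last one holding the remaining 300k posts. Slicing by offset only stays consistent if every export uses the same stable sort order.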


#12

Thanks krilnon, I’ll post up my results eventually.


(Torrey Rozycki) #13

I’d like to replace vBulletin with Discourse on a server but cannot find anyone on Elance who has experience.
It’s only 34k posts.
Is anyone here a freelancer who’s confident in their skills at this, by chance?

Update: got a few responses! Thanks! Working with someone now who seems to be very sharp and I am happy with the progress. Looking forward to posting rave reviews when complete.


(Andrius) #14

Is there a chance you still have an example for group_mapping.csv and forum_mapping.csv files? This is the last missing piece.