I would just leave it running until it finishes. 1000 items/minute is about what I usually get when importing large forums.
Just confirming here that the Vanilla import script is a real winner. Our export file was ~1.3GB and imported just fine.
Last one (I think). Something I’d like the Vanilla Importer to do is auto-assign a trust level to imported users. All my users are getting the ‘new user post limit’ notice even when some of them have been users for 10+ years. Annoying introduction to Discourse for them…
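As a rough illustration of what auto-assigning could mean, here is a minimal sketch that picks a trust level from a user's original join date and post count. The function name `trust_level_for` and the thresholds are hypothetical, not something the Vanilla importer provides. The only fixed fact is that Discourse trust levels run from 0 (new user) to 4 (leader); the chosen value would still have to be applied to each imported user.

```ruby
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Hypothetical helper: choose a starting trust level for an imported user.
# The thresholds are illustrative assumptions, not Discourse defaults.
def trust_level_for(joined_at, post_count, now: Time.now)
  years_registered = (now - joined_at) / SECONDS_PER_YEAR

  return 2 if years_registered >= 1 && post_count >= 50   # long-standing, active member
  return 1 if years_registered >= 0.25 || post_count >= 10 # established enough to skip new-user limits
  0                                                        # treat as a new user
end
```

A ten-year member with a few hundred posts would land on level 2 here, which avoids the new-user post limits.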
loading existing groups...
loading existing users...
loading existing categories...
loading existing posts...
loading existing topics...
parsing file...
reading file...
parsing categories...
parsing comments...
parsing discussions...
parsing media...
parsing permissions...
parsing roles...
parsing users...
parsing user_comments...
parsing user_discussions...
parsing user_roles...
importing users...
18 / 18 (100.0%)
importing categories...
importing first-level categories...
4 / 4 (100.0%)
importing topics...
42 / 42 (100.0%) [151 items/min]
importing posts...
35 / 35 (100.0%) [236 items/min]
importing private topics...
/var/www/discourse/script/import_scripts/base.rb:444:in `create_posts': undefined method `size' for nil:NilClass (NoMethodError)
    from script/import_scripts/vanilla.rb:190:in `import_private_topics'
    from script/import_scripts/vanilla.rb:25:in `execute'
    from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
    from script/import_scripts/vanilla.rb:254:in `<main>'
How do I fix this error on the new version?
Perhaps you have no private topics and you should comment out the call to that function.
How do I do that? Can you tell me more?
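For anyone else wondering, skipping the step could look roughly like the sketch below. `SketchImporter` is a hypothetical stand-in, not the real `vanilla.rb`. The only names taken from the backtrace are `execute` and `import_private_topics`. In the real script the equivalent change would be to put a `#` in front of the `import_private_topics` call inside `execute`, or to guard it as shown here.

```ruby
# Hypothetical stand-in for an import script; the real vanilla.rb differs.
class SketchImporter
  attr_reader :steps_run

  def initialize(private_topics)
    @private_topics = private_topics # nil when the export has no private topics
    @steps_run = []
  end

  def execute
    import_users
    import_topics

    # Simplest fix: comment the call out.
    #   # import_private_topics
    # Alternative: keep the call, but only run it when there is data.
    if @private_topics.nil? || @private_topics.empty?
      puts "no private topics found, skipping"
    else
      import_private_topics
    end

    steps_run
  end

  private

  def import_users
    @steps_run << :users
  end

  def import_topics
    @steps_run << :topics
  end

  def import_private_topics
    @steps_run << :private_topics
  end
end
```

The guard has the advantage that the same script still works for an export that does contain private topics.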
Actually, I’ve found a problem with this importer: it doesn’t import the join dates of existing users. We’ve got members who have been registered for 10+ years whose join dates have now all been reset.
That should be a fairly easy fix. Find where the data is in the database and look at some other importers to see how to stick it in.
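A minimal sketch of that fix, under stated assumptions: the export row is assumed to carry the join date under a `date_inserted` key, and the user hash handed to the importer is assumed to accept a `created_at` value, as other importers appear to do. Both names should be checked against the actual export and an existing importer before relying on them.

```ruby
require "time"

# Parse a join date string; return nil when it is missing or unparseable
# so that a bad value never aborts the whole import.
def parse_join_date(value)
  return nil if value.nil? || value.strip.empty?
  Time.parse(value)
rescue ArgumentError
  nil
end

# Build the attributes for one user. The row keys ("user_id", "name",
# "email", "date_inserted") are assumptions about the export format.
def user_attributes(row)
  {
    id: row["user_id"],
    username: row["name"],
    email: row["email"],
    created_at: parse_join_date(row["date_inserted"]),
  }
end
```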
This command has to be run through localhost, by the way.
If you run it directly on your VPS, you’ll get a “No such file or directory” error.
I know that sounds logical to a developer, but not to a newbie like me.
Thank you - this worked for me.
Hi, could somebody please post a sample file format that Vanilla Porter produces and that the Vanilla.rb importer expects?
I want to use vanilla.rb as the basis of a migration from MVCForum and my Ruby skills are too basic to follow the code.
Unsurprisingly I ran into lots of post limit, rate limit, permissions and performance issues. But the resultant forum has served to show the current membership what they might expect from Discourse.
The good news is that the membership wants to move to Discourse but the bad news is that I’ve got to migrate the data in the next 3-4 weeks with as little downtime as possible.
So, as I’m hopefully going to be an active contributor to the community, I’ve decided to learn a bit of Ruby and change vanilla.rb for my needs.
I am thinking about turning my version of the vanilla.rb routine into a generic json import that can be configured through mapping files so that those with low coding skills (myself currently) might be able to configure it instead of coding it.
Thanks in anticipation.
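The mapping-driven idea could be sketched like this. Everything here is hypothetical: the source field names (`MemberId`, `UserName`, and so on) are invented for illustration, and `FIELD_MAP` is written inline although it could just as well be loaded from a JSON or YAML file so that non-coders can edit it.

```ruby
require "json"

# Hypothetical mapping from source-forum field names to importer field names.
FIELD_MAP = {
  "MemberId" => :id,
  "UserName" => :username,
  "Email" => :email,
  "CreateDate" => :created_at,
}.freeze

# Copy only the mapped fields of one record, renaming them on the way.
def apply_mapping(record, mapping)
  mapping.each_with_object({}) do |(source_key, target_key), out|
    out[target_key] = record[source_key] if record.key?(source_key)
  end
end

# Parse a JSON array of records and apply the mapping to each one.
def load_records(json_text, mapping)
  JSON.parse(json_text).map { |record| apply_mapping(record, mapping) }
end
```

Unmapped fields are dropped silently in this sketch; a real tool would probably want to log them so that nothing goes missing unnoticed.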
It’s easier to import from the database and then just change the database query (from an importer that supports what you want to import).
Can you get a dump of the MVCForum database?
Oh. It looks like mssql. I don’t know if there are any importers available that use mssql, but I’m about to submit a PR for Telligent, which uses mssql. (I think I’ve written two, but that one was a custom forum).
If you want to hire someone to solve the problem, please see Discourse Migration – Literate Computing.
Thanks for the reply @pfaffman and the advice - I really appreciate it.
Yes, it is MSSQL and yes I can get a dump of the data. However, I’m going to have to do a lot of post-processing to tidy up the post bodies that tend to feature markup that gives Discourse heartburn.
My plan at the moment is to query the MVCForum from Python, as I have no real Ruby experience, and then output the massaged data in a format that is compatible with the file expected by the vanilla.rb importer.
I’m sure I’ll have to change the vanilla.rb code too, but as I’m a Ruby newbie it would help me enormously to get a peek at the format of the file that vanilla.rb is expecting.
I do understand that others make a living from this sort of thing and if we were a larger forum that charged for membership or carried advertising (we do neither), I’d be throwing a few quid at this. Alas, we’re run on a tight budget by volunteers who fund the hosting between us. So I’m going to try to get this done myself - with some help from here I hope. (enough sob story?)
I wrote several importers before I knew any Ruby, so you probably could too. If you’re doing your post-processing with regexes, then that’s pretty language independent. Also, there are lots of importers that do post-processing, so you could poke at those for hints.
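To give an idea of what regex-based post-processing looks like, here is a small sketch that turns a few common HTML tags into Markdown. The function name `clean_post_body` and the choice of tags are assumptions for illustration. It is not taken from any existing importer, and real post bodies will need more rules than this.

```ruby
# Hypothetical clean-up of a post body: a few HTML tags to Markdown.
def clean_post_body(raw)
  body = raw.dup
  body.gsub!(/\r\n?/, "\n")                                   # normalize line endings
  body.gsub!(%r{<br\s*/?>}i, "\n")                            # <br> to newline
  body.gsub!(%r{<(b|strong)>(.*?)</\1>}im, '**\2**')          # bold
  body.gsub!(%r{<(i|em)>(.*?)</\1>}im, '*\2*')                # italic
  body.gsub!(%r{<a\s+href="([^"]+)"[^>]*>(.*?)</a>}im, '[\2](\1)') # links
  body.gsub!(/\n{3,}/, "\n\n")                                # collapse runs of blank lines
  body.strip
end
```

The same patterns carry over to Python's `re` module almost unchanged, which is why the regex work is largely language independent.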
But if it’s json import you want, you should also check out the Ning importer.
Just had a quick peek and it instantly looks more intuitive. It also appears to have regex hooks for dodgy markup too.
Thanks for the steer and your general advice.
I’m running into an error in trying to run the import and can’t figure out why. I’ve checked the line mentioned in the error (and the ones before and after it) but the quotes all look correct and extra quotes and commas are escaped in the text. The line falls in the comments data being imported. Any idea on what to look for? Error:
/usr/local/lib/ruby/2.4.0/csv.rb:1875:in `block (2 levels) in shift': Missing or stray quote in line 40980 (CSV::MalformedCSVError)
    from /usr/local/lib/ruby/2.4.0/csv.rb:1868:in `each'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1868:in `block in shift'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1828:in `loop'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1828:in `shift'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1770:in `each'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1784:in `to_a'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1784:in `read'
    from /usr/local/lib/ruby/2.4.0/csv.rb:1324:in `parse'
    from script/import_scripts/vanilla.rb:63:in `parse_file'
    from script/import_scripts/vanilla.rb:17:in `execute'
    from /var/www/discourse/script/import_scripts/base.rb:46:in `perform'
    from script/import_scripts/vanilla.rb:254:in `<main>'
Make sure that the CSV file does not contain mixed line endings. I’ve seen similar errors with a file that contained Windows and Unix line endings. Try converting it to Unix line endings.
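If a tool like `dos2unix` is not at hand, the conversion can be sketched in a few lines of Ruby. The helper names are hypothetical. Note that this also rewrites any bare carriage returns inside quoted fields, which is usually what you want here but does change the data.

```ruby
# Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
def normalize_line_endings(text)
  text.gsub(/\r\n?/, "\n")
end

# Read a file as raw bytes, normalize it, and write the result to a new file
# so the original export stays untouched.
def convert_file(in_path, out_path)
  data = File.binread(in_path)
  File.binwrite(out_path, normalize_line_endings(data))
end
```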
Thanks. Turns out it was some corrupted/oddly encoded IP addresses in the data causing the problem. I cleaned those out and it seems to be working.
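For anyone hitting the same thing, one way to hunt for oddly encoded data is to list the lines that are not valid UTF-8. This is only a sketch under the assumption that the bad values show up as invalid byte sequences; corrupted data that happens to be valid UTF-8 would not be caught. The helper name is hypothetical.

```ruby
# Return the 1-based numbers of lines that are not valid UTF-8.
def invalid_utf8_line_numbers(text)
  text.each_line.with_index(1).reject do |line, _number|
    line.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  end.map { |_line, number| number }
end
```

Run against the export with `invalid_utf8_line_numbers(File.binread(path))`, this gives a short list of lines to inspect by hand.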
Is there anything I can do to increase the import performance? I’m currently only getting 63 items/min on user import with over 240k users. Server has plenty of resources.