I would just leave it running until it finishes. 1000 items/minute is about what I usually get when importing large forums.
Just confirming here that the Vanilla import script is a real winner. Our export file was ~1.3GB and imported just fine.
Last one (I think). Something I'd like the Vanilla Importer to do is auto-assign a trust level to imported users. All my users are getting the "new user post limit" notice even when some of them have been users for 10+ years. Annoying introduction to Discourse for them...
loading existing groups...
loading existing users...
loading existing categories...
loading existing posts...
loading existing topics...
parsing file...
reading file...
parsing categories...
parsing comments...
parsing discussions...
parsing media...
parsing permissions...
parsing roles...
parsing users...
parsing user_comments...
parsing user_discussions...
parsing user_roles...
importing users...
18 / 18 (100.0%)
importing categories...
importing first-level categories...
4 / 4 (100.0%)
importing topics...
42 / 42 (100.0%) [151 items/min]
importing posts...
35 / 35 (100.0%) [236 items/min]
importing private topics...
/var/www/discourse/script/import_scripts/base.rb:444:in `create_posts': undefined method `size' for nil:NilClass (NoMethodError)
from script/import_scripts/vanilla.rb:190:in `import_private_topics'
from script/import_scripts/vanilla.rb:25:in `execute'
from /var/www/discourse/script/import_scripts/base.rb:45:in `perform'
from script/import_scripts/vanilla.rb:254:in `<main>'
How can I fix this error on the new version?
Perhaps you have no private topics and you should comment out the call to that function.
So how do I do this? Can you tell me more?
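A rough sketch of what that suggestion could look like. Only the method names `execute` and `import_private_topics` come from the stack trace above; the class, its other methods, and the `conversations` data are hypothetical stand-ins, not the actual contents of vanilla.rb.

```ruby
# Hypothetical stand-in for the importer. Only the method names `execute`
# and `import_private_topics` are taken from the stack trace; everything
# else is illustrative.
class ImporterSketch
  attr_reader :steps

  def initialize(conversations)
    @conversations = conversations # nil when the export has no private messages
    @steps = []
  end

  def execute
    import_posts

    # Option 1: disable the step by commenting out the call:
    # import_private_topics

    # Option 2: keep the call, but skip it when there is nothing to import.
    import_private_topics unless @conversations.nil? || @conversations.empty?
  end

  private

  def import_posts
    @steps << :posts
  end

  def import_private_topics
    # Calling .size on nil here is the kind of failure shown in the trace.
    @steps << [:private_topics, @conversations.size]
  end
end

importer = ImporterSketch.new(nil)
importer.execute
p importer.steps # prints [:posts]
```

In the real script the equivalent of option 1 would be putting a `#` in front of the `import_private_topics` line inside `execute` (line 25 of vanilla.rb according to the trace) and running the import again.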
Actually I've found a problem with this importer: it doesn't import the join dates of previous users. We've got members who registered 10+ years ago in our database who've now all had their join dates reset.
That should be a fairly easy fix. Find where the data is in the database and look at some other importers to see how to stick it in.
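As a minimal sketch of that idea: other importers appear to pass a `created_at` value along with each user, so the join date could be parsed from the export and added to the user's attributes. The helper names and the row keys (`:user_id`, `:name`, `:email`, `:date_inserted`) are assumptions for illustration, as is treating the stored timestamp as UTC; check them against your own export.

```ruby
require "time"

# Parse a join date string, treating it as UTC (an assumption).
# Returns nil for blank or unparseable values so the caller can fall back.
def parse_join_date(value)
  return nil if value.nil? || value.strip.empty?
  Time.parse("#{value} UTC")
rescue ArgumentError
  nil
end

# Hypothetical helper: build the attributes for one imported user.
# `row` stands in for one parsed row of the exported users table.
def user_attributes(row)
  {
    id: row[:user_id],
    username: row[:name],
    email: row[:email],
    created_at: parse_join_date(row[:date_inserted]),
  }
end
```

A row with `date_inserted: "2008-03-14 09:30:00"` would then carry that timestamp as `created_at` instead of the time of the import.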
This command has to be done through localhost by the way.
If done directly in your VPS, you'll get a "No such file or directory".
I know it sounds logical to a developer, but not to a newbie like me.
Thank you - this worked for me.
Hi, could somebody please post a sample file format that Vanilla Porter produces and that the Vanilla.rb importer expects?
I want to use vanilla.rb as the basis of a migration from MVCForum and my Ruby skills are too basic to follow the code.
Background
I'm about to migrate a forum from MVCForum to a Discourse version. I've done this once by writing a routine in Python and using the API to add the data.
Unsurprisingly I ran into lots of post limit, rate limit, permissions and performance issues. But the resultant forum has served to show the current membership what they might expect from Discourse.
The good news is that the membership wants to move to Discourse but the bad news is that I've got to migrate the data in the next 3-4 weeks with as little downtime as possible.
So, as I'm hopefully going to be an active contributor to the community, I've decided to learn a bit of Ruby and change vanilla.rb for my needs.
I am thinking about turning my version of the vanilla.rb routine into a generic json import that can be configured through mapping files so that those with low coding skills (myself currently) might be able to configure it instead of coding it.
Thanks in anticipation.
It's easier to import from the database and then just change the database query (from an importer that supports what you want to import).
Can you get a dump of the MVCForum database?
Oh. It looks like mssql. I don't know if there are any importers available that use mssql, but I'm about to submit a PR for Telligent, which uses mssql. (I think I've written two, but that one was a custom forum).
If you want to hire someone to solve the problem, please see Discourse Migration - Literate Computing, LLC.
Thanks for the reply @pfaffman and the advice - I really appreciate it.
Yes, it is MSSQL and yes I can get a dump of the data. However, I'm going to have to do a lot of post-processing to tidy up the post bodies that tend to feature markup that gives Discourse heartburn.
My plan at the moment is to query the MVCForum from Python, as I have no real Ruby experience, and then output the massaged data in a format that is compatible with the file expected by the vanilla.rb importer.
I'm sure I'll have to change the vanilla.rb code too, but as I'm a Ruby newbie it would help me enormously to get a peek at the format of the file that vanilla.rb is expecting.
I do understand that others make a living from this sort of thing and if we were a larger forum that charged for membership or carried advertising (we do neither), I'd be throwing a few quid at this. Alas, we're run on a tight budget by volunteers who fund the hosting between us. So I'm going to try to get this done myself - with some help from here I hope. (enough sob story?)
I wrote several importers before I knew any Ruby, so you probably could too. If you're doing your post-processing with regexes, then that's pretty language independent. Also, there are lots of importers that do post-processing, so you could poke at those for hints.
But if itās json import you want, you should also check out the Ning importer.
Just had a quick peek and it instantly looks more intuitive. It also appears to have regex hooks for dodgy markup too.
Thanks for the steer and your general advice.
I'm running into an error in trying to run the import and can't figure out why. I've checked the line mentioned in the error (and the ones before and after it) but the quotes all look correct and extra quotes and commas are escaped in the text. The line falls in the comments data being imported. Any idea on what to look for? Error:
/usr/local/lib/ruby/2.4.0/csv.rb:1875:in `block (2 levels) in shift': Missing or stray quote in line 40980 (CSV::MalformedCSVError)
from /usr/local/lib/ruby/2.4.0/csv.rb:1868:in `each'
from /usr/local/lib/ruby/2.4.0/csv.rb:1868:in `block in shift'
from /usr/local/lib/ruby/2.4.0/csv.rb:1828:in `loop'
from /usr/local/lib/ruby/2.4.0/csv.rb:1828:in `shift'
from /usr/local/lib/ruby/2.4.0/csv.rb:1770:in `each'
from /usr/local/lib/ruby/2.4.0/csv.rb:1784:in `to_a'
from /usr/local/lib/ruby/2.4.0/csv.rb:1784:in `read'
from /usr/local/lib/ruby/2.4.0/csv.rb:1324:in `parse'
from script/import_scripts/vanilla.rb:63:in `parse_file'
from script/import_scripts/vanilla.rb:17:in `execute'
from /var/www/discourse/script/import_scripts/base.rb:46:in `perform'
from script/import_scripts/vanilla.rb:254:in `<main>'
Make sure that the CSV file does not contain mixed line endings. I've seen similar errors with a file that contained Windows and Unix line endings. Try converting it to Unix line endings.
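One way to do that conversion, as a small sketch in Ruby. The function names and the file paths are made up for illustration; the file is read and written in binary mode so that oddly encoded bytes pass through untouched.

```ruby
# Normalize Windows (CRLF) and old Mac (lone CR) line endings to Unix (LF).
def normalize_line_endings(text)
  text.gsub(/\r\n?/, "\n")
end

# Read the source file as raw bytes, normalize, and write a cleaned copy.
# Both paths are hypothetical; point them at your own export.
def convert_file(src, dest)
  data = File.binread(src)
  File.binwrite(dest, normalize_line_endings(data))
end
```

Writing to a separate destination file keeps the original export around in case the cleaned copy needs to be regenerated.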
Thanks. Turns out it was some corrupted/oddly encoded IP addresses in the data causing the problem. I cleaned those out and it seems to be working.
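For anyone hitting something similar, here is a hedged sketch of how oddly encoded data could be located before running the import. It assumes the export is meant to be UTF-8; the function names are hypothetical. Note that the line numbers are physical lines in the file, which may differ from what the CSV parser reports when quoted fields span several lines.

```ruby
# Return the 1-based numbers of lines that are not valid UTF-8.
def invalid_utf8_lines(path)
  bad = []
  File.foreach(path, encoding: "UTF-8").with_index(1) do |line, number|
    bad << number unless line.valid_encoding?
  end
  bad
end

# Replace invalid byte sequences in a single line with "?".
def scrub_line(line)
  line.scrub("?")
end
```

Listing the affected lines first makes it possible to inspect them by hand and decide whether to fix, scrub, or drop the data.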
Is there anything I can do to increase the import performance? I'm currently only getting 63 items/min on user import with over 240k users. Server has plenty of resources.
Hello. Is this tutorial still working?
I would imagine so. Do you have evidence to the contrary?