For the same reason, I also wouldn’t remove personal messages, just replace their content with a bit of random text.
It’s good to know that my work on that was not for nothing.
You can remove most of the data from the phpbb_users table. Only group_id are really needed. I’d also recommend cleaning the phpbb_config table. Importing private messages is an optional feature of the importer, so the content of the phpbb_privmsgs table isn’t essential.
What about the email address?
I can fill that up with dummy data. No problem.
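For anyone else preparing a dump to share, here is a minimal sketch of the scrubbing idea discussed above: keep the rows, but swap e-mail addresses for dummy ones and private message bodies for random text. It works on plain dictionaries rather than a live database. The column names `user_id`, `user_email` and `message_text` are assumptions about the phpBB3 schema, and the helper names are made up for illustration, so check both against your own tables.

```python
import random
import string


def dummy_email(user_id):
    # The .invalid TLD is reserved, so no mail can ever be delivered from a
    # test system. Deriving the address from the id keeps every address unique.
    return f"user{user_id}@example.invalid"


def random_text(word_count, rng):
    # Build `word_count` nonsense words of 3 to 8 lowercase letters each.
    words = [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(rng.randint(3, 8)))
        for _ in range(word_count)
    ]
    return " ".join(words)


def scrub_user(row):
    # Assumed column names: user_id, user_email. Other columns are kept as-is.
    scrubbed = dict(row)
    scrubbed["user_email"] = dummy_email(row["user_id"])
    return scrubbed


def scrub_private_message(row, rng):
    # Assumed column name: message_text. The word count is preserved so the
    # scrubbed dump stays roughly the same size as the original.
    scrubbed = dict(row)
    word_count = max(1, len(row["message_text"].split()))
    scrubbed["message_text"] = random_text(word_count, rng)
    return scrubbed
```

Keeping the number of rows and the rough message length unchanged should mean the scrubbed dump still reproduces size-related problems such as the slowdown discussed in this thread.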
After ~24h, the import on my local VM is at 17.8% and the import speed keeps going down. It started at 320 items/min, but right now it is at 271 items/min and still dropping (quite slowly, but still). We can assume it will crawl at 45%, just like on the production machine.
It also turns out that I made a wrong assumption a few posts back. “The longer it runs, the slower it gets” should be changed to “the further it gets, the slower it works”. I can say that because even with @gerhard’s fork (which greatly reduced the time spent skipping), the import is super slow at 45%, so it has nothing to do with how long the script has been running. It seems to be related to the number of posts already imported or skipped. Does that make any sense? Maybe that will be a clue for someone.
P.S. I’m going to send a DB dump to gerhard.
Sorry for the confusion, guys. I’ve decided to test yet another solution before sharing the database (not sure why I didn’t do that earlier). I’m using a docker-less installation script and it’s been using ruby
After moving to 2.2.0, the speed increased drastically to about 437 posts/min, which is even faster than when I was at 2% progress. I’m still waiting for the results at 45% progress; that will drive the final verdict.
btw, the posts/min counter is broken. It computes the rate from the first post ever imported, not from the posts imported in the current “session”. So if you run the import script, stop at 100 posts and run it again an hour later, the counter will display a speed of 100 posts/hour. That is technically true, but not practical at all. It would be much better to display the import speed since the most recent script start. I was forced to measure the import speed with a stopwatch.
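To illustrate the per-session measurement suggested above, here is a rough sketch. It is not the importer’s actual code; the class name `SessionRate` and its parameters are hypothetical. The idea is simply to remember how many posts were already done when the script started and to divide only the posts handled since then by the time elapsed since then.

```python
import time


class SessionRate:
    """Tracks items per minute since this run started (hypothetical helper)."""

    def __init__(self, already_done, clock=time.monotonic):
        # `already_done` is the number of items finished by earlier runs.
        # `clock` is injectable so the calculation can be tested without waiting.
        self._clock = clock
        self._start_time = clock()
        self._start_count = already_done

    def per_minute(self, total_done):
        # Only count items processed in this session, over this session's time.
        elapsed = self._clock() - self._start_time
        if elapsed <= 0:
            return 0.0
        return (total_done - self._start_count) * 60.0 / elapsed
```

With this approach, resuming at 100 posts and reaching 900 posts two minutes later would report 400 posts/min, regardless of how long the script sat idle between runs.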
I’m happily running the importer with two small issues/questions:
- Quoted posts are being converted with text markers like this:
— Begin quote from "SomeUser"
— End quote
rather than proper Discourse quotes. Is there something I can do to change this?
- Small thing, but the euro symbol is being rendered as “â‚¬”. If I look at the phpbb3 database I can see it’s stored as the € character there. Is there any way I can get it to come across correctly?
Thanks in advance.
Importing of quotes should work. There’s something strange going on here. Can you send me the raw text from the phpBB3 database of a post with quotes? I’ll look into it…
There seems to be an encoding issue with your database. What’s the encoding of your phpBB3 database? Something like this can happen when you have tables with latin1 encoding and the forum stored the content in utf8 encoding. I have an experimental script that tries to fix this (don’t run it on your production server!).
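To show what this kind of mismatch does to a single string, here is a small sketch. It is not the experimental script mentioned above, just an illustration, and the function name `repair_mojibake` is made up. It assumes the UTF-8 bytes were read as Windows-1252 (which is close to what MySQL calls latin1), and reverses that by re-encoding to Windows-1252 and decoding as UTF-8.

```python
def repair_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as Windows-1252.

    Returns the input unchanged when it does not look like that kind of
    mojibake, i.e. when the round trip is not possible.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text contains characters outside Windows-1252, or the
        # resulting bytes are not valid UTF-8. Leave it alone in both cases.
        return text


# "\u00e2\u201a\u00ac" is the garbled three-character form of the euro sign.
print(repair_mojibake("\u00e2\u201a\u00ac"))
```

The fallback matters: text that is already correct, such as a plain “€” or “café”, fails the round trip and is returned as-is. A few byte values have no Windows-1252 mapping in Python, so some damaged strings will not be repaired by this sketch.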
Yikes, I think I may have been using the wrong ruby-bbcode-to-md package. I’ll try again with the correct one.
As for the DB: if an encoding mismatch could cause that, let me have a go at forcing utf8. Thanks for your help, I’ll report back once I’ve tested.
Yup, my quoting issue was due to using the default ruby-bbcode-to-md gem rather than specifying the git repo.
I wonder if @neil has any plans to work on nested [quote] objects in his ruby-bbcode-to-md package, which this importer uses?
I’m not a fan of the bbcode-to-md gem, so I haven’t tested it with it enabled. But nested quotes should work without it as long as there are no empty lines between the quotes.
I don’t have any phpbb migrations to work on, so I won’t be working on ruby-bbcode-to-md any time soon.
Understood. Do you recall if there was much involved in nested quotes? Debating jumping in for a look but not if it’s a huge deal…
I have no memory of that. I didn’t write that gem, I only forked it and tweaked it to serve my purposes.
So I came here looking for a way to import a phpBB forum into Discourse, but I’m running a Docker installation and I’m a bit confused right now about what I have to do to import from phpBB3 into a Docker-installed Discourse. Maybe I’m blind, but I did see something about this posted here, and it wasn’t finished back in November. So may I ask whether there is already a simple solution for a Docker installation, or can someone at least tell me what to do here?
Yes, there’s a simple solution for importing with the Docker container. I just didn’t find the time to update the original tutorial. Maybe there are some technical writers in the community who would like to do that.
Anyway, I added an outline of all the necessary steps to the first post.
So maybe this will work better than my previous attempt, where I set up a dev environment. Everything worked so far; the only problem I stumbled across is that when I look at my users (the normal list at /users), Discourse shows two entries without any username and with all values at zero, and it tells me there are 2 users. When I look at /admin/users/list/active, all users are there. Any idea what I did wrong, or how to fix this without doing a new import? (Many thanks for the tutorial at this point; I will happily try it if I can’t get the users to show up!)
The list of users won’t show up right away. You’ll probably have to wait until all the postprocessing of the imported data is finished.
Ah, there is still some postprocessing; I didn’t realize that. Thank you for all the great work and help so far. I’ll sit back and wait for everything to finish. Is this something I can watch in the Sidekiq web interface?