Importing / migrating from phpBB3

Do you have polls enabled?

Yes, some polls are imported, but not all.

You’ll need to create that user or modify the script to, say, use system if the owner doesn’t exist.
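
A minimal sketch of that fallback idea, written in Python rather than the importer’s Ruby. The function and variable names are hypothetical and do not come from the phpBB3 import script; the only fact borrowed from Discourse is that its built-in system user has ID -1.

```python
# Hypothetical sketch: these names are illustrative, not the importer's API.
SYSTEM_USER_ID = -1  # Discourse's built-in system user


def resolve_poll_owner(phpbb_user_id, imported_users):
    """Return the Discourse user ID for a poll owner.

    Falls back to the system user when the phpBB owner was never imported
    (for example, because the account was deleted).
    """
    return imported_users.get(phpbb_user_id, SYSTEM_USER_ID)


# Made-up mapping: phpBB user ID -> Discourse user ID
imported_users = {2: 101, 3: 102}

print(resolve_poll_owner(3, imported_users))   # owner exists
print(resolve_poll_owner(99, imported_users))  # owner missing, system user is used
```

The real change would have to be made wherever the script looks up the poll’s owner before creating the poll.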

Normal threads from deleted users are imported properly. How can I modify the import script so that polls from deleted users are imported as well?

I’m currently trying a migration from phpBB for the first time, and I wonder: if I transfer a Discourse backup later, will it contain all the meta information from the import (like the original post ID from phpBB)? The reason is that I’m thinking about doing the import on a bigger machine for speed and running the forum on a VPS later, but I may do another incremental import after moving everything to the VPS. Is a Discourse backup and restore enough, or should I dump the DB some other way (and if so, which)?

And one more question: are the URL rewrites in the importer safe for a domain change afterwards? I was planning to use a different domain or subdomain for the initial migration and later switch to the real domain.

Yes, the backup contains the original post ID in a custom field. Running the final import on the cloud should be fine.

Changing the hostname does require a bit of work; see the topic “Change the domain name or rename my Discourse?”

I just finished the first test migration, and before anything else I have to say thank you to @gerhard and everyone else who contributed to this. It’s amazing how well this worked on the first try. It took me roughly 24 hours to import just shy of 900k posts with users, private messages and so on, and things really look good for a first shot. It’s just awesome to have such a great importer to get this done.

There are a few issues, though, and the most important one that I’m currently trying to hunt down is that some internal links have gone completely wrong while others worked quite well. To be more precise, far more seem to be right than wrong. I’m trying to find a pattern in the ones that didn’t work. Can anyone offer some insight into how the internal link rewriting works? Are there any gotchas where it might fail?

Sidenote: Right after the import I saw all avatars missing in quotes but then found out that they seem to be generated and looks like after 20 minutes everything’s there now. Fascinating :wink:

Regarding import time: this was on a Hetzner 4-core VPS with exclusive CPU cores. I think I’m going to retry this on a bare metal server just for the import to improve the migration time. I need to see how moving the Discourse backup works first.

The code for replacing internal links is here: https://github.com/discourse/discourse/blob/43ddf60cdf27a865b7b1aa0d54a144a3e46c74cf/script/import_scripts/phpbb3/support/text_processor.rb#L73-L114

Rewriting internal links will fail when a post contains a link to another post that hasn’t been imported yet. Posts are imported in the order of their original creation.
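
To illustrate the ordering problem, here is a small Python sketch (not the importer’s Ruby code). The link pattern, the `/p/<id>` target format and the sample posts are assumptions for the example; the real text processor handles more URL shapes. It shows a forward link surviving a single pass untouched, and a hypothetical second pass over the imported text resolving it once every post has a new ID.

```python
import re

# Assumed link shape for this sketch only.
LINK_RE = re.compile(r"viewtopic\.php\?p=(\d+)")


def rewrite_links(text, id_map):
    """Rewrite links whose target is already in id_map; leave the rest alone."""
    def replace(match):
        old_id = int(match.group(1))
        if old_id in id_map:
            return f"/p/{id_map[old_id]}"
        return match.group(0)  # target not imported yet: keep the original link
    return LINK_RE.sub(replace, text)


def import_posts(posts):
    """Import posts in creation order, rewriting links as each post is created."""
    id_map, imported = {}, {}
    for new_id, (old_id, text) in enumerate(posts, start=1):
        imported[new_id] = rewrite_links(text, id_map)
        id_map[old_id] = new_id
    return id_map, imported


posts = [
    (10, "See viewtopic.php?p=30 for the update"),   # edited later, links forward
    (30, "Update, replying to viewtopic.php?p=10"),  # links back
]

id_map, imported = import_posts(posts)
print(imported[1])  # forward link is still the old phpBB URL
print(imported[2])  # backward link was rewritten

# Hypothetical workaround: a second pass once the ID map is complete.
fixed = {new_id: rewrite_links(text, id_map) for new_id, text in imported.items()}
print(fixed[1])
```

Whether a second pass is practical for a real import depends on how the script stores the raw post text, so treat this as a way to think about the failure rather than a patch.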

That’s an interesting hint. I can’t see why this would have been the case in some of the cases I checked, but I will definitely go down that route. The weird thing is that it links to the wrong posts in my case. In case of a failure I’d assume it would do nothing and leave the link alone. Unfortunately my Ruby knowledge is close to zero, but I’ll have a look at the code anyway.

Is there any way to work around this issue? I think it could happen a lot with tutorial-style posts that get edited and amended over and over as links are added. Is there any chance of getting around this apart from manually changing those posts?

Thanks for helping! :+1:

Hmm, I just noticed that user groups haven’t been imported. They aren’t listed in the initial post here, but they are one of the first things listed by the importer. I wonder if there’s a way to import them, including attaching users to those groups, or if that’s just not working yet? In the latter case we would likely need to find a workaround.

Yes, that’s weird. It suggests that some of your local links are in some format other than what the import script expects, and that makes the regex pick the wrong post to look for. If someone has gone back and edited a post to point to a post that didn’t exist when the post was originally written (have you confirmed that is the case?), then they might have used some other means to make the link, which confused the script.

It’s surprising the degree to which each import (especially for a mature forum) is a snowflake. It’s fairly rare that an import is just a matter of running the script (but, I suppose, people are more likely to hire me if their import does have complications).

I can at least say that it doesn’t only happen in cases where older posts link to newer ones.

I got one example where I first found the issue.

The post with original ID 842948 links to page 22 of a thread that already existed, and at least the first post on that page (&start=220) has a lower ID (842880) than the linking post.

After the import, this post (which was written only days before the dump was created) links to a 7-year-old thread whose first post has an original ID of 1353.

I can’t find any hint as to why this happens, or any similarity in the numbers… The original link was just a plain posted URL that was auto-linked by phpBB. Those links in general don’t cause any issues; they work in several other places.
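
One thing that might help with hunting for a pattern is to list which query parameters each broken link actually carries, since a link to a page identifies its target by topic and offset (`t` and `start`) rather than by post ID (`p`). The Python sketch below only does that classification; the helper name and the example URL (including the topic ID 1234) are made up, and it says nothing about how the importer itself interprets these parameters.

```python
from urllib.parse import urlparse, parse_qs


def describe_phpbb_link(url):
    """Extract the topic, post and start parameters from a phpBB viewtopic URL."""
    query = parse_qs(urlparse(url).query)

    def first_int(name):
        values = query.get(name)
        if values and values[0].isdigit():
            return int(values[0])
        return None  # parameter missing or not numeric

    return {
        "topic": first_int("t"),
        "post": first_int("p"),
        "start": first_int("start"),
    }


# Made-up URL in the same shape as the link described above.
url = "https://forum.example.com/viewtopic.php?f=5&t=1234&start=220"
print(describe_phpbb_link(url))
```

Running this over the links that went wrong and the ones that went right could show whether the failures share a parameter combination.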

Generally I don’t expect an import to go well immediately. I’m really happy with how it went so far, but I assume this issue might be one of the hardest to fix, especially because I don’t yet have any clue about setting up a dev/debug environment for Discourse. I might need to dig into that soon.

Hi,
From what I remember, the phpBB importer by default filled in the Discourse user’s full name by guessing it from the user’s email address. Am I right, and is that still the case? I can’t find anything about this in the importer files…

This is something I wouldn’t want to happen in the next forum I may import.

Also, another question.

On the current phpBB forum, there are custom user fields (like users’ Facebook or Instagram URLs). I would like to import these into Discourse custom fields. I guess I’d first install and configure Discourse, adding these custom user fields, and then import the phpBB data with my custom import script?

That’s not happening anymore.

That sounds like a good plan.

Is there a reliable way to estimate the monthly volume of emails that Discourse will send to users after a migration from phpBB? New user registrations, mentions and replies, weekly digests and so on… phpBB sends very few emails by default, and I think we’ll have to change the current email provider.
I don’t currently have many stats from the current phpBB forum. It has existed since 2013, with 200,000 messages and 5,500 members. New members register every day.

How many posts per day?

I don’t know yet. I currently have nothing but the public stats. Maybe 20.

With only 20 posts per day I think you’d probably be looking at 3000 emails a month, at most, well within the free Mailgun plan.
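
For a rough feel of the arithmetic, here is a back-of-the-envelope estimator in Python. Every parameter is an assumption: the figure of 5 notification emails per post is simply one value that lands on 3000 for 20 posts a day, not how that number was actually derived, and the digest term assumes one digest per recipient per week.

```python
def estimate_monthly_emails(posts_per_day, emails_per_post, digest_recipients,
                            digests_per_month=4, days=30):
    """Very rough monthly email estimate: notifications plus digests."""
    notifications = posts_per_day * days * emails_per_post
    digests = digest_recipients * digests_per_month
    return notifications + digests


# Notifications only, with an assumed 5 emails per post.
print(estimate_monthly_emails(20, 5, 0))

# Same, but with all 5,500 imported members receiving a weekly digest.
print(estimate_monthly_emails(20, 5, 5500))
```

Under these assumptions the digest term dominates as soon as most imported members receive one, which is why the digest settings matter more than the posting rate for a forum of this size.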

However, seeing as you’re migrating with lots of users, I’d recommend you turn off the digest emails for everyone who hasn’t been to your forum within the last month or two (they can always turn it on themselves if they want). You can do that with a query in the Rails console, but it’s been a year since I did it for mine, so I don’t remember the exact code, sorry.
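
This is not that Rails console query, only a Python sketch of the selection rule it would apply: pick everyone whose last visit is older than a cutoff, or who has never visited. The record layout, the field names and the 60-day threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta


def users_to_switch_off(users, now, max_inactive_days=60):
    """Return usernames whose last visit is older than the cutoff, or unknown."""
    cutoff = now - timedelta(days=max_inactive_days)
    return [
        user["username"]
        for user in users
        if user["last_seen_at"] is None or user["last_seen_at"] < cutoff
    ]


# Made-up sample data
now = datetime(2024, 1, 1)
users = [
    {"username": "alice", "last_seen_at": datetime(2023, 12, 20)},  # recent visitor
    {"username": "bob", "last_seen_at": datetime(2023, 6, 1)},      # inactive
    {"username": "carol", "last_seen_at": None},                    # never visited
]

print(users_to_switch_off(users, now))
```

The actual change would then disable the digest option for the selected users inside Discourse itself.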

That should actually be something the import script could do for you IMO, ideally with a setting for the time since last visit to enable digest emails.

I disagree. That’s not something an import script should do, but there’s a site setting for this: suppress digest email after days

Indeed :slight_smile: After my first phpBB import on my current forum, I had to decrease the default value; my email provider automatically blocked the sending address because of all the digest emails sent. They were flagged as spam.