Questions About Migrating From Xenforo

Do you know if that was all automatic through the import or something that had to be messed with quite a bit?

What had to be modified for post formats?

I think I was the first to migrate from Xenforo using that script, so the importer that’s there now was written after doing mine. That makes it hard to say how much of it was automatic, because it hadn’t been done before.

Post format issues were mainly related to username tagging, media embeds and the like. My forum had come from:

  • SMF
  • then to Kunena
  • then to vBulletin
  • then to IP Board
  • then to Xenforo

and then to Discourse. So there were lots of legacy things hanging around from past migrations that weren’t specific to this one. One thing that was difficult with the large volume of posts was rebaking them all to update for new embedding etc.

So getting a @user name into @user_name worked fine but then each post needed to be updated to convert the second tag into a link to that new user_name. That sort of issue didn’t really bother me though and those older posts were soon irrelevant.
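For anyone wanting to script the first half of that, here is a minimal Ruby sketch of rewriting old-style mentions in raw post text. The `USERNAME_MAP` contents and the `rewrite_mentions` helper are hypothetical names for illustration, not part of the actual importer, and the sketch assumes you already have a mapping from old names to the imported usernames. Turning the rewritten mention into a link still needs the rebake described above.

```ruby
# Hypothetical mapping from old forum names (with spaces) to the
# usernames created during import. Illustrative data only.
USERNAME_MAP = {
  "user name" => "user_name",
  "Jane Doe"  => "Jane_Doe",
}.freeze

# Match "@" followed by any known old name, longest names first,
# and refuse to match when the name runs on into another word character.
MENTION_RE = /@(#{Regexp.union(USERNAME_MAP.keys.sort_by { |name| -name.length })})(?!\w)/

# Replace every known old-style mention with its new username.
def rewrite_mentions(raw)
  raw.gsub(MENTION_RE) { "@#{USERNAME_MAP.fetch(Regexp.last_match(1))}" }
end

puts rewrite_mentions("Thanks @user name, see @Jane Doe's post")
```

Matching only against known names avoids guessing where a mention with spaces ends, which a generic pattern cannot do reliably.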

@techAPJ & @riking since you two are the main contributors to the xenforo importer do you know if there are any issues I should be aware of before doing an import? Such as:

  • Will the categories transfer over?
  • Will user groups transfer over?
  • Will all accounts transfer over, even those that use unsupported characters? It’s fine if the characters are removed, as long as users can still log in with their email address.
  • Will their logins be the same or would they need to reset their passwords?
  • What is the highest you’d recommend setting the batch limit on the importer to speed up the process? I can deploy any AWS EC2 instance you recommend for the import, but does anything else need to be changed (server side?) if the batch number is increased?

Do you know if that is implemented now? As most members would like to retain control over their previous content.

Most of our team wants to move to this software, however, we want to ensure it’ll work first as losing a chunk of our content would be something that will hold the community back from wanting the move to go forward.

Basically, it’s hard to say until you try it. Imports are pretty weird, there’s no one-size-fits-all.

Expect to wipe the Discourse database a few times in case you don’t get it quite right. This is required if you make changes to the import script to move data that got missed, as already-imported records are skipped.

Pay special attention to data from plugins. Categories will definitely be transferred. User groups - probably not, but read the import script (if groups are important, you’ll want to add that to the script).

The default behavior is for everyone to need a password reset, but you can use this plugin (will probably require import script changes):

Really, if the users, categories, topics, and posts are carried over default then we’ll be fine. Does the current import script successfully cover those?

So the steps are:

Anything else?

Also the import server would be a r3.8xlarge - 32 vCPU, 104 ECU, 244GB RAM, 320GB SSD.

Looking at it briefly, yes, it should.

This is probably overkill. Ruby has a GIL (so only 1 thread executes user-mode code at any time). Go for the highest single-core performance server available to do the import; 8GB RAM will be more than enough.

Batch size 1000 was fine in the past for me.

So there isn’t any way to speed up the import? Some said it took up to 2 days for less content than mine has, and I’d prefer to minimize the lost content and posting time as much as possible.

Sorry, but wouldn’t more cores help Sidekiq run its jobs?

You can re-run the importer with an updated source database and it will take less time, i.e. an incremental import. That’s typically what is done.

So basically run the import as is right now. Then once it’s complete, run it again and it will add any new users / posts / threads that were posted on the other board since the last backup and won’t duplicate data?

That’s what I was thinking, surely there is a way to dedicate more of the resources to it to speed up the process?

Right, I forgot about the sidekiq worker count, and you can probably scale the batch size in proportion to your available working RAM.

I didn’t explain myself very well. All posts and users and owners transferred across. It was just an issue with in-post tagging. Even then, the posts were updated fine cosmetically but it wasn’t worth the hassle of rebaking them to get the new hyperlinks to work properly.

Similar issues with YouTube embeds, for example. They converted to showing the URL correctly (I think there was a media tag that needed to be stripped out). But they weren’t oneboxed without rebaking. Again, not worth the effort on the large volume of posts.
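As an illustration of the kind of tag stripping involved, here is a minimal Ruby sketch. It assumes the old posts use a BBCode tag shaped like `[media=youtube]VIDEO_ID[/media]`; check your own data for the exact form. The `media_tags_to_urls` helper is a hypothetical name, not something from the importer.

```ruby
# ASSUMPTION: media tags look like [media=youtube]VIDEO_ID[/media]
# (case-insensitive). Verify against your own post data.
MEDIA_TAG_RE = %r{\[media=youtube\]([\w-]+)\[/media\]}i

# Replace each media tag with a plain YouTube URL on its own line,
# since a link has to stand alone on a line to be oneboxed.
def media_tags_to_urls(raw)
  raw.gsub(MEDIA_TAG_RE) do
    "\nhttps://www.youtube.com/watch?v=#{Regexp.last_match(1)}\n"
  end
end

puts media_tags_to_urls("Watch [MEDIA=youtube]dQw4w9WgXcQ[/MEDIA] now")
```

This only fixes the raw text. As noted above, existing posts still need a rebake before the onebox appears.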

In short, no significant or annoying issues. A few minor trade-offs in the interests of efficiency.

Any idea what the batch limit per Sidekiq worker should be?
And how many Sidekiq workers there should be per GB of RAM?

@codinghorror – any idea for that since I’ve seen you mention doing migrations for the paid hosting customers?

Also, if I import it today, then once it finishes I re-import for whatever posts were made during the time of the import (100 or so new threads, 1-2k new posts/day, 20 new users or so) – it will just skip past the content that was already imported without duplicating it, right?

So all the old content will be there correctly, properly attached to the correct users, etc, it just may not look as pretty and formatted as before? I’m fine with that.

Yep, the import_id custom field is created with the database id from the source database.

It also skips entire batches at a time if possible:

https://github.com/discourse/discourse/blob/master/script/import_scripts/xenforo.rb#L43-L45
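To make the idea concrete, here is a minimal, self-contained Ruby sketch of an incremental import that skips known records and whole batches. It is not the code from the linked script: `FakeTarget`, `import_rows` and the row hashes are hypothetical stand-ins, with a `Set` of ids playing the role of the import_id custom field.

```ruby
require "set"

BATCH_SIZE = 1000 # the batch size reported as fine earlier in the thread

# Hypothetical stand-in for the target site. It remembers which source
# ids were already imported, the role the import_id field plays.
class FakeTarget
  attr_reader :created

  def initialize(already_imported = [])
    @imported_ids = Set.new(already_imported)
    @created = []
  end

  def all_imported?(ids)
    ids.all? { |id| @imported_ids.include?(id) }
  end

  def import(row)
    return if @imported_ids.include?(row[:id]) # skip a single known record
    @imported_ids << row[:id]
    @created << row[:id]
  end
end

# Walk the source rows in batches and skip a whole batch when every id
# in it is already known. Returns the number of batches skipped.
def import_rows(source_rows, target, batch_size: BATCH_SIZE)
  skipped = 0
  source_rows.each_slice(batch_size) do |batch|
    if target.all_imported?(batch.map { |row| row[:id] })
      skipped += 1
      next
    end
    batch.each { |row| target.import(row) }
  end
  skipped
end

rows = (1..25).map { |id| { id: id } }
target = FakeTarget.new((1..20).to_a)          # first 20 rows imported earlier
puts import_rows(rows, target, batch_size: 10) # batches skipped on this run
puts target.created.inspect                    # ids created on this run
```

The same property explains the earlier advice about wiping the database: a record that was imported wrongly is skipped on the next run rather than corrected.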

As you seem to have discovered it’s not nearly that simple.

Seems the import script is nowhere near ready for prime time.

Your steps are the only actual instructions I’ve read on how to perform the import. How is that even possible?

There’s also quite a bit of need-to-know information missing, such as the fact that you need to run that command from within the container.

Mine definitely seemed to be one of the first self-hosted Xenforo imports to be discussed on this site, at least; I tried finding any details I could. In the end it was just too messy a process, so I decided to order the paid hosting option here and get the Discourse team to handle the migration. That adds peace of mind that it’ll be a smooth process and lets me focus solely on managing and growing the community rather than spending time worrying about the server side of things.

It’s a bit more expensive than self-hosting, but the price is worth the time I’d otherwise spend managing it myself, and it helps fund future development of Discourse, so it’s a win-win for everyone.

Thanks! Every time we improve the importers all our effort is fully open sourced, so you’re also making things better for the next person or site down the line. :bow:

I’m planning to migrate from XenForo after XF 2.0 gets released, if I don’t like its changes.

Does Discourse also support redirecting all Xenforo links to the new ones?

Glancing quickly from my phone, it doesn’t appear that the Xenforo importer does redirects. That would require a bit of code be added.
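As a rough idea of what that extra code might involve, here is a minimal Ruby sketch that maps an old thread path to a new topic path. It assumes old thread URLs look like `/threads/some-title.123/` or `/threads/123/`. The `TOPIC_PATHS` table and `redirect_target` helper are hypothetical; a real importer would have to record the old-id-to-topic mapping as it creates topics.

```ruby
# ASSUMPTION: old thread paths look like /threads/some-title.123/ or
# /threads/123/, optionally followed by a page segment.
THREAD_PATH_RE = %r{\A/threads/(?:[^/]*\.)?(\d+)(?:/|\z)}

# Hypothetical lookup from old thread id to new topic path.
# Illustrative data only.
TOPIC_PATHS = { 123 => "/t/some-title/45" }.freeze

# Return the new path for an old thread path, or nil when the path is
# not a thread URL or the thread id is unknown.
def redirect_target(old_path, topic_paths = TOPIC_PATHS)
  match = THREAD_PATH_RE.match(old_path)
  return nil unless match
  topic_paths[match[1].to_i]
end

puts redirect_target("/threads/some-title.123/page-2")
```

A table like this could then feed whatever does the redirecting, for example Discourse’s permalinks or rewrite rules on the old domain.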
