Questions About Migrating From Xenforo


(Dylan) #1

I have a forum with 30k users, 3m posts, and 100k+ threads currently running Xenforo, and I’m looking to migrate to Discourse. I have a few pre-migration questions. I had this typed up in a draft, but when I came back the draft was gone, so this may not be as detailed this time. D:

  1. Are there any known issues with Xenforo -> Discourse, any common aspects that will not be carried over during the migration?

  2. If price doesn’t matter and I could spin up any size AWS EC2 instance (one server, for a day or two) to handle the migration, which one would you recommend? Would I need to change any settings to let the import use the full resources? For example, Xenforo has rate limits built into its import tool that you have to change manually to speed up the import. If Discourse has something similar, how would I change it to make the import go as fast as possible and limit downtime?

  3. Is the default install of Discourse good, or do we need to customize anything due to the size of the board? Xenforo, for example, requires its Enhanced Search (through elasticsearch) once you get over a million posts or so, or else it has severe performance issues. Also, it’s fine to run the import on one server, then copy the DB to the production environment, correct?

  4. Any way to enable multi-factor authentication (at least through FB/Twitter/Github)? Any other security recommendations (e.g. htaccess for the admin panel)?

  5. Anything else we should know for this migration?

Unimportant questions:

  1. Is there a way to let users use different themes? There was a makeshift workaround for this at one point, unsure if it still works or if it’s been replaced, I haven’t actively followed Discourse in the last year. Some like a light theme while others require a dark theme.

  2. What’s the best way for integrating a blog on the home page made up of topics on the forum? So for example, popular topics could be shown as blog entries (similar to how Discourse’s blog posts look) then the comments pull from a thread on the forum where it was posted, with some way for admins to moderate what makes it to the front page. Even if we have to manually activate each one it’s fine, just want to have the option to do so.

  3. Any other recommended plugins?

Looking forward to migrating to Discourse, thank you. :slight_smile:


(Rafael dos Santos Silva) #2

Nothing extra.

Nope.

You need another site, using something like WordPress, a CMS, or something handcrafted for that.


#3

I migrated my forum from Xenforo (or rather, the Discourse team guys did it for me). Far fewer users but > 1m posts, so it was large enough. Can’t think of anything that didn’t come across nicely. There were some minor workarounds for post formats and things like that. I think the biggest challenge was the username conventions, which didn’t allow for spaces, but nothing important was missing feature-wise.


(Dylan) #4

Do you know if that was all automatic through the import or something that had to be messed with quite a bit?

What had to be modified for post formats?


#5

I think I was the first migration from Xenforo using that script anyway so I think the importer that’s there now was written after doing mine. So hard to say how much of it was automatic because it hadn’t been done before.

Post format issues mainly related to things like username tagging and media embeds and things like that. My forum had come from:

  • SMF
  • then to Kunena
  • then to vBulletin
  • then to IP Board
  • then to Xenforo

and then to Discourse. So there were lots of legacy things hanging around that were part of other migrations in the past and weren’t specific to this migration. One thing that was difficult with the large volume of posts was rebaking them all to update for new embedding etc.

So getting a @user name into @user_name worked fine but then each post needed to be updated to convert the second tag into a link to that new user_name. That sort of issue didn’t really bother me though and those older posts were soon irrelevant.
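
As a rough sketch of the two steps described above (renaming users, then updating mentions inside posts), something like the following could work. The helper names are hypothetical and this is not the importer’s actual code; it also assumes a mention is simply an @ followed by the old username.

```ruby
# Hypothetical helpers for illustration; not part of the Discourse importer.

# Turn a source username into one without spaces: "user name" -> "user_name".
def normalize_username(name)
  name.strip.gsub(/\s+/, "_")
end

# Map each old username that changes to its new form.
def build_username_map(old_names)
  old_names.each_with_object({}) do |old, map|
    new_name = normalize_username(old)
    map[old] = new_name unless new_name == old
  end
end

# Rewrite "@old name" mentions in a post body to "@new_name".
# Longer names are tried first so "@user name two" is not cut short
# by a match on "@user name".
def rewrite_mentions(raw, username_map)
  return raw if username_map.empty?

  longest_first = username_map.keys.sort_by { |name| -name.length }
  pattern = Regexp.union(longest_first)
  raw.gsub(/@(#{pattern})/) { "@#{username_map[$1]}" }
end

map = build_username_map(["user name", "dylan", "user name two"])
puts rewrite_mentions("Thanks @user name two and @user name!", map)
```

Rewriting the raw text only changes what is stored; as noted above, the posts would still need to be rebaked before the mentions turn into links.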


(Dylan) #6

@techAPJ & @riking since you two are the main contributors to the xenforo importer do you know if there are any issues I should be aware of before doing an import? Such as:

  • Will the categories transfer over?
  • Will user groups transfer over?
  • Will all accounts (even those whose usernames use unsupported characters) transfer over? It’s fine if the importer removes those characters, as long as users can still log in with their email address.
  • Will their logins be the same or would they need to reset their passwords?
  • What is the highest you’d recommend setting the batch limit on the importer to speed up the process? I can deploy any AWS EC2 instance you recommend for the import, but does anything else need to be changed (server side?) if the batch number is increased?

Do you know if that is implemented now? As most members would like to retain control over their previous content.

Most of our team wants to move to this software, however, we want to ensure it’ll work first as losing a chunk of our content would be something that will hold the community back from wanting the move to go forward.


(Kane York) #7

Basically, it’s hard to say until you try it. Imports are pretty weird, there’s no one-size-fits-all.

Expect to wipe the Discourse database a few times in case you don’t get it quite right. This is required if you make changes to the import script to move data that got missed, as already-imported records are skipped.

Pay special attention to data from plugins. Categories will definitely be transferred. User groups - probably not, but read the import script (if groups are important, you’ll want to add that to the script).

The default behavior is for everyone to need a password reset, but you can use this plugin (will probably require import script changes):


(Dylan) #8

Really, if the users, categories, topics, and posts are carried over default then we’ll be fine. Does the current import script successfully cover those?

So the steps are:

  • Open the import script, xenforo.rb, in the discourse/discourse repository on GitHub (master branch)
  • Add DB information
  • Modify batch size at the top (any recommendations on the number to set it to? do we need to edit batch size elsewhere on the script?)
  • Then run: RAILS_ENV=production bundle exec ruby script/import_scripts/xenforo.rb

Anything else?

Also the import server would be a r3.8xlarge - 32 vCPU, 104 ECU, 244GB RAM, 320GB SSD.


(Kane York) #9

Looking at it briefly, yes, it should.

That instance is probably overkill: Ruby has a GIL (global interpreter lock), so only 1 thread executes user-mode code at any time. Go for the highest single-core performance server available to do the import; 8GB RAM will be more than enough.

Batch size 1000 was fine in the past for me.
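
To illustrate what the batch size controls, here is a minimal sketch. The names are hypothetical, and an in-memory array stands in for the source database; this is not the importer’s actual code. Rows are read and processed one batch at a time, so a larger batch means fewer round trips but more rows held in memory at once.

```ruby
# Hypothetical illustration of batched reading; not the importer's code.
BATCH_SIZE = 1000 # the value reported to work above

# Stand-in for a "SELECT ... LIMIT batch_size OFFSET offset" query.
def fetch_batch(rows, offset, batch_size)
  rows[offset, batch_size] || []
end

# Walk the source in fixed-size batches, yielding each batch and its offset.
def each_batch(rows, batch_size = BATCH_SIZE)
  offset = 0
  loop do
    batch = fetch_batch(rows, offset, batch_size)
    break if batch.empty?

    yield batch, offset
    offset += batch_size
  end
end

rows = (1..2500).to_a
each_batch(rows) do |batch, offset|
  puts "offset #{offset}: #{batch.size} rows"
end
```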


(Dylan) #10

So there isn’t any way to speed up the import? Some people said it took up to 2 days for less content than mine has, and I would prefer to minimize the amount of content / posting time lost as much as possible.


(Rafael dos Santos Silva) #11

Sorry, but wouldn’t more cores help sidekiq run its jobs?


(Kane York) #12

You can re-run the importer with an updated source database and it will take less time, i.e. an incremental import. That’s typically what is done.


(Dylan) #13

So basically run the import as is right now. Then once it’s complete, run it again and it will add any new users / posts / threads that were posted on the other board since the last backup and won’t duplicate data?

That’s what I was thinking, surely there is a way to dedicate more of the resources to it to speed up the process?


(Kane York) #14

Right, I forgot about the sidekiq worker count, and you can probably scale the batch size in proportion to your available working RAM.


#15

I didn’t explain myself very well. All posts and users and owners transferred across. It was just an issue with in-post tagging. Even then, the posts were updated fine cosmetically but it wasn’t worth the hassle of rebaking them to get the new hyperlinks to work properly.

Similar issues with YouTube embeds, for example. They converted to showing the URL correctly (I think there was a media tag that needed to be stripped out). But they weren’t oneboxed without rebaking. Again, not worth the effort on the large volume of posts.
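
A minimal sketch of that kind of media-tag cleanup, assuming the source posts use XenForo-style [media=youtube]VIDEO_ID[/media] tags. The helper name is hypothetical and this is not the importer’s actual code.

```ruby
# Assumption: posts contain tags of the form [media=youtube]VIDEO_ID[/media].
YOUTUBE_MEDIA_TAG = %r{\[media=youtube\]([\w-]+)\[/media\]}i

# Replace each tag with a bare watch URL on its own line,
# which is the form Discourse can turn into a onebox.
def convert_youtube_media(raw)
  raw.gsub(YOUTUBE_MEDIA_TAG) { "\nhttps://www.youtube.com/watch?v=#{$1}\n" }
end

puts convert_youtube_media("Watch this: [MEDIA=youtube]dQw4w9WgXcQ[/MEDIA]")
```

Posts that were already imported would still show the plain URL until they are rebaked, which matches the trade-off described above.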

In short, no significant or annoying issues. A few minor trade-offs in the interests of efficiency.


(Dylan) #16

Any idea what the batch limit per sidekiq worker should be?
And how many sidekiq workers there should be per GB of RAM?

@codinghorror – any idea for that since I’ve seen you mention doing migrations for the paid hosting customers?

Also, if I import it today, then once it has finished I re-import to pick up whatever was posted during the import (100 or so new threads, 1-2k new posts/day, 20 new users or so), it will just skip past the content that was already imported without duplicating it, right?

So all the old content will be there correctly, properly attached to the correct users, etc, it just may not look as pretty and formatted as before? I’m fine with that.


(Kane York) #17

Yep, the import_id custom field is created with the database id from the source database.

It also skips entire batches at a time if possible:
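
As a rough illustration of that idea, and not the importer’s actual code: the sketch below uses a hypothetical in-memory set in place of the import_id custom field lookup, skips a whole batch when every id in it is already known, and otherwise skips individual rows that were imported earlier.

```ruby
require "set"

# Hypothetical stand-in for looking up import_id values in Discourse.
class ImportedIds
  def initialize(ids = [])
    @ids = Set.new(ids)
  end

  def include?(id)
    @ids.include?(id)
  end

  def add(id)
    @ids.add(id)
  end

  # True when every id in the batch has already been imported.
  def all_imported?(ids)
    ids.all? { |id| @ids.include?(id) }
  end
end

# Import rows in batches and return the ids created in this run.
def import_incrementally(rows, imported, batch_size: 1000)
  created = []
  rows.each_slice(batch_size) do |batch|
    # Cheap path: nothing new in this batch, so skip it entirely.
    next if imported.all_imported?(batch.map { |row| row[:id] })

    batch.each do |row|
      next if imported.include?(row[:id])

      imported.add(row[:id]) # a real import would create the record here
      created << row[:id]
    end
  end
  created
end

imported = ImportedIds.new
first_run = import_incrementally((1..2500).map { |i| { id: i } }, imported)
second_run = import_incrementally((1..2600).map { |i| { id: i } }, imported)
puts "first run: #{first_run.size} created, second run: #{second_run.size} created"
```

Under these assumptions, the second run only creates the records that were added to the source since the first run, which is the incremental behaviour discussed earlier in the thread.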


(g0st) #18

As you seem to have discovered, it’s not nearly that simple.

Seems the import script is nowhere near ready for prime time.

Your steps are the only actual instructions I’ve read on how to perform the import. How is that even possible?

There’s also quite a bit of need-to-know information missing, such as the fact that you need to run that command from within the container.


(Dylan) #19

It definitely seemed to be one of the first self-hosted xenforo imports to be discussed on the site, at least; I tried finding any details I could. In the end it was just too messy a process, so I decided to order the paid hosting option here and get the Discourse team to handle the migration. That adds peace of mind that it’ll be a smooth process and lets me focus solely on managing and growing the community rather than worrying about the server side of things.

It’s a bit more expensive than self-hosting, but the price is worth the time I’d otherwise spend managing it myself, and it also helps fund future development of Discourse, so it’s a win-win for everyone.


(Jeff Atwood) #20

Thanks! Every time we improve the importers all our effort is fully open sourced, so you’re also making things better for the next person or site down the line. :bow: