On the forum I migrated, xengallery had been installed at one point, so I had to make the following change because the xfgallery table no longer existed.
def get_xf_sql(type, id)
  case type
  when :gallery
    # The gallery table no longer exists, so return a query that matches no rows.
    return "SELECT NULL WHERE 1=0;"
  when :attachment
    <<-SQL
      SELECT a.attachment_id, a.data_id, d.filename, d.file_hash, d.user_id
      FROM #{TABLE_PREFIX}attachment AS a
      INNER JOIN #{TABLE_PREFIX}attachment_data d ON a.data_id = d.data_id
      WHERE attachment_id = #{id}
      AND content_type = 'post'
    SQL
  end
end
Running the above gives me the following error. I checked the Gemfile and it only contains this one line: gem 'mysql2'
This Gemfile does not include an explicit global source.
Not using an explicit global source may result in a different lockfile being generated depending on the gems you have installed locally before bundler is run.
Instead, define a global source in your Gemfile like this: source "https://rubygems.org".
Could not find gem 'mysql2' in locally installed gems.
root@ip-172-566-459-13-app:/#
OK, so I managed to move on to the next step. Someone above posted that we need to be in the /var/www/discourse folder in the container and then add the gem.
I am getting this error. What could I be doing wrong?
/var/www/discourse/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/connection_adapters/postgresql_adapter.rb:63:in "rescue in new_client": We could not find your database: discourse. Available database configurations can be found in config/database.yml. (ActiveRecord::NoDatabaseError)
To resolve this error:
- Did you not create the database, or did you delete it? To create the database, run: bin/rails db:create
- Has the database name changed? Verify that config/database.yml contains the correct database name.
Solved it: I was running as root user, had to switch to the "discourse" user. Import has started.
So I picked up a reasonably good server with 4 CPUs and 16 GB RAM. At the rate the posts are being migrated, it will take 9 days for the posts alone. The users took 2.5 hours. Safe to say this is a no-go for me as is, but at least I can spend some months familiarizing myself with Discourse until I figure out a solution for this bulk migration.
PS:
In the migration script I see that duplicate emails are not imported. What are the different ways that a duplicate is determined? I noticed that xyz@gmail.com is treated the same as xyz+1@gmail.com and xy.z@gmail.com.
I've tried doing migrations on a VPS with specs similar to my personal computer, but for some reason it was always much, much slower than using my computer.
Nowadays, I always do my migrations locally. How many posts do you have?
On my machines, a rate of 800-1000 users or posts/minute is fairly typical.
Note that when you do the final import, it'll import only the users and posts that haven't been imported already, so it won't take very long.
Turn off the Normalize emails site setting (off was the default until recently). It should probably get turned off in this function here:
You can put it in your customized version of the xenforo script with SiteSetting.normalize_emails=false. I'm not sure what happened to those duplicate email users; there are two obvious things to do, give them a bogus email address or skip importing them. Looks like it gives them bogus ones? (And there's a pretty good chance that they are, in fact, bogus users anyway). If it skipped them, then running the script again will import them.
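To make the duplicate detection concrete, here is a minimal sketch of the kind of normalization that would make those three addresses collide. It is not Discourse's actual code; normalize_email is a hypothetical helper that assumes the rule is "ignore dots and any +suffix in the part before the @".

```ruby
# Hypothetical helper, NOT Discourse's implementation: collapse common
# aliases of one mailbox into a single canonical address.
def normalize_email(email)
  local, domain = email.strip.downcase.split("@", 2)
  return email if domain.nil? # no "@" present, leave the value untouched

  local = local.split("+", 2).first.to_s # drop a "+tag" suffix
  local = local.delete(".")              # drop dots in the local part
  "#{local}@#{domain}"
end

emails = ["xyz@gmail.com", "xyz+1@gmail.com", "xy.z@gmail.com"]

# Group the raw addresses by their normalized form to spot collisions.
groups = emails.group_by { |e| normalize_email(e) }
groups.each { |canonical, raw| puts "#{canonical}: #{raw.join(', ')}" }
```

Running something like this over the XenForo user emails before the import gives a list of accounts that would be treated as duplicates while the setting is on.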
Yes, on my laptop it is churning through things much faster, at 1000 items per minute. That's about 2 times faster than on the server. Still, that's about 3 days.
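For anyone who wants to redo this estimate with their own numbers, a rough sketch of the arithmetic follows. The post count is hypothetical (the actual total is not stated here); it is chosen only so that 1000 items per minute works out to 3 days.

```ruby
MINUTES_PER_DAY = 24 * 60

# Days needed to process item_count items at a steady items_per_minute rate.
def estimated_days(item_count, items_per_minute)
  item_count.to_f / items_per_minute / MINUTES_PER_DAY
end

total_posts = 4_320_000 # hypothetical count, for illustration only
puts estimated_days(total_posts, 1000) # prints 3.0
```

It assumes a constant rate, which a real import will not hold exactly.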
I went through the skipped emails and it seems it's doing a good job ignoring those accounts. I will just merge them prior to the final import. Hardly 20-odd such cases.
Note that when you do the final import, it'll import only the users and posts that haven't been imported already, so it won't take very long.
Thank you for pointing this out. I observed this myself and it seems this is what is going to save the day when I do the final import. So I take a backup and restore on D-3 and then another backup and restore with the new DB backup file on Day 0. Is that correct?
Are those backups and restores on the Xenforo site, or do you have some live Discourse site that you're going to import the Xenforo data to?
As long as you don't make changes to the script that require re-importing data, and what you have on your laptop now is what you want on your Discourse server, then you can just keep getting new dumps of the Xenforo database and importing them (to test, see how long it takes, and so on) and then on the cut-over day, you freeze the Xenforo site, get that database, run the script once more and upload to your Discourse server.
If you already have data on your Discourse site that you want to keep, things are much more complicated since you'll need to freeze that site, then get the Xenforo data and then proceed as described above.
It'll be a fresh install of Discourse so that makes it straightforward.
I have a decent amount of time at hand as I want to test migrations multiple times, familiarize myself with Discourse thoroughly, get all add-ons configured the way I want and maybe also get my hands dirty with some add-on customization myself.
What you've explained lifts one pain point off my chest completely as I thought I would have to figure out bulk imports too.
I have come back with a query: does the import script output any logs? My test import has been stuck at 98.2% for a few hours.
Another thing I realized: if I restart the migration, it takes around 30 seconds to skip over a batch of 1000 posts. So effectively the speed is now 2000 items per minute. That is not a significant improvement over the 1000 posts per minute for the first import, as even the last import on the day of the cutover will take about a day's time, 23 hours of which will just be skipping already imported items.
Yes, it'll skip all data that's been imported already. And it does it much faster than 2000 posts/minute. I suspect you'll see when you restart it now.
So managed to get the avatars and attachments imported. Copied these folders.
/internal_data/attachments
/data/avatars
To answer my question, the avatars and attachments get finalized once imported. If a user changes their avatar after their ID is imported, it will not get imported/updated because that post or user will get skipped in the second run.
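A toy sketch of why that happens, assuming the importer decides what to skip purely by the source ID. TinyImporter and its fields are hypothetical and far simpler than the real import script.

```ruby
# Hypothetical, simplified importer: a record is imported once, keyed by
# its source ID, and skipped on every later run even if its data changed.
class TinyImporter
  attr_reader :users

  def initialize
    @users = {} # source ID => record as it was first imported
  end

  # Returns [created, skipped] counts for one run.
  def import_users(rows)
    created = 0
    skipped = 0
    rows.each do |row|
      if @users.key?(row[:id])
        skipped += 1 # already imported: later changes are ignored
        next
      end
      @users[row[:id]] = row.dup
      created += 1
    end
    [created, skipped]
  end
end

importer = TinyImporter.new
first_run = importer.import_users([{ id: 1, avatar: "a.png" }, { id: 2, avatar: "b.png" }])

# User 1 changed their avatar on the old forum and user 3 is new.
second_run = importer.import_users(
  [{ id: 1, avatar: "new.png" }, { id: 2, avatar: "b.png" }, { id: 3, avatar: "c.png" }]
)

puts first_run.inspect             # [2, 0]
puts second_run.inspect            # [1, 2]
puts importer.users[1][:avatar]    # a.png
```

Only user 3 is created on the second run; user 1 keeps the avatar from the first import.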
Now just need to figure out the conversations import (can skip too but good to have) and permanent redirects.
@Fajfi - Thank you for your contribution to the import script. It worked flawlessly for avatars and attachments. It's still running and has not reached the likes portion yet.
Fixed the conversations import. I was able to import over half a million messages from XF2.3 into Discourse. I have raised a PR in case someone is interested.
----EDIT----
Raised another PR with a fix for the likes import. It is surprising that nobody has migrated from XF2.1+ to Discourse until now. Likes were renamed to reactions in 2019 when XF2.1 was released.
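For anyone hitting the same thing, this is roughly the shape of the change, written in the style of get_xf_sql above. It is a sketch and not the code from the PR. It assumes the XF2.1+ table is reaction_content with reaction_user_id and reaction_date columns, that the older one is liked_content with like_user_id and like_date, and that TABLE_PREFIX is "xf_"; likes_sql is a hypothetical helper, so check the names against your own schema.

```ruby
TABLE_PREFIX = "xf_" # assumption: default XenForo table prefix

# Hypothetical helper: pick the likes query based on the XenForo version.
def likes_sql(xf_version)
  if Gem::Version.new(xf_version) >= Gem::Version.new("2.1")
    # Assumed XF2.1+ schema: likes are stored as reactions.
    <<~SQL
      SELECT content_id AS post_id, reaction_user_id AS user_id, reaction_date AS created_at
      FROM #{TABLE_PREFIX}reaction_content
      WHERE content_type = 'post'
    SQL
  else
    # Assumed pre-2.1 schema.
    <<~SQL
      SELECT content_id AS post_id, like_user_id AS user_id, like_date AS created_at
      FROM #{TABLE_PREFIX}liked_content
      WHERE content_type = 'post'
    SQL
  end
end

puts likes_sql("2.3")
```

This treats every reaction as a like; whether to filter by reaction type is a separate decision the sketch does not make.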