Migrate an SMF2 forum to Discourse

OK after MANY hours back and forth trying to get this to work I have identified the issue and am posting it here for any others that might get stuck like I did.

The MySQL Docker container is version 8, which for some reason means the MariaDB library that the MySQL template pulls in doesn’t work.

I did not include the MySQL template file in the import docker container config file, this was the first change.

I built the import container, entered it with ./launcher enter import, and ran:


echo "gem 'mysql2'" >> Gemfile
wget https://dev.mysql.com/get/mysql-apt-config_0.8.17-1_all.deb
dpkg -i mysql-apt-config_0.8.17-1_all.deb

This has an interactive prompt and you can select the defaults (MySQL 8 with tools)

Then I installed the normal MySQL 8 library and continued the build:

apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y libmysqlclient-dev
su discourse -c 'bundle config unset deployment'
su discourse -c 'bundle install --no-deployment --path vendor/bundle --jobs 4 --without test development'

Once this was all done, the rest was the same, and the import is now running.

The Discourse devs may want to adjust the import scripts to account for this. At least it’s now on the forums for others to see if they get stuck like I did.

  1. Destroy according to these instructions? how-to-migrate-import-from-smf2-to-discourse/90129#cleanup-5

  2. I’ve manually transferred the attachments directory from smf2 to Discourse prior to importing, then ran the importer once, but uploads aren’t appearing in posts in Discourse. Any ideas why?

  3. I’ve already run the importer once without S3 enabled, but I want to use S3. What should I do?

  1. Yes
  2. Hmm…I’d try enabling S3 and re-importing.
  3. Enable S3 before starting the import.

I already ran the importer once without S3 though. Would uploads from topics that were already imported be transferred to S3 during a delta import?

It’s best if you just start over and enable S3 before importing.

Getting this error upon creating users:

oxipng worker: oxipng not found; please provide proper binary or disable this worker (--no-oxipng argument or :oxipng => false through options)

Then after that it begins creating posts.

  • Could this be why attachments aren’t appearing in Discourse?
  • Where should oxipng be installed?

Update: I’m confused why I’m getting this oxipng error. I can’t find anything about it anywhere in Discourse, and I’m experiencing this issue only upon running the smf2 import script. Could it possibly be related to this? https://meta.discourse.org/t/faster-and-smaller-uploads-in-discourse-with-rust-webassembly-and-mozjpeg-blog

That is just a warning from the image optimization library we use at server side, shouldn’t block anything.


So I thought my import issues were possibly related to S3 issues, but turns out S3 seems to be working just fine. The issue I’m having is that after running the import script (with S3 enabled) and rebuilding the import container, it seems all topics in Discourse are lacking the uploads that are attachments in smf2 topics. In other words, there’s no visual cue that there is even an upload in a Discourse topic, which is clearly seen as an attachment in its smf2 topic equivalent. It’s the same result when I import with S3 disabled. A bit out of ideas here. :confused: Any thoughts?

It may be that the images are attached to the post, so the import script knows to import them, but that they are not referenced in the post so that they don’t appear. You need to modify the post text to include a link to the images. I’ve seen that, I think with a different forum.

You mean modify the post on smf2 to include links to the attached files before importing to Discourse?

Well, that could be a way.

But I mean to modify the import script to append the attachment to the raw post. Something like
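A minimal sketch of that idea, written as plain Ruby so it can be tried outside the importer. The Attachment struct, its short_url field, and both method names are made up for illustration and are not part of smf2.rb; in the real script the markup for each upload would come from whatever helper the importer already provides, and the upload:// values below are placeholders.

```ruby
# Hypothetical stand-in for an imported upload; not a Discourse class.
Attachment = Struct.new(:filename, :short_url)

IMAGE_EXTENSIONS = %w[.png .jpg .jpeg .gif .webp].freeze

# Builds the markup for one attachment: an inline image for image files,
# a plain link for everything else.
def attachment_markup(attachment)
  ext = File.extname(attachment.filename).downcase
  if IMAGE_EXTENSIONS.include?(ext)
    "![#{attachment.filename}](#{attachment.short_url})"
  else
    "[#{attachment.filename}](#{attachment.short_url})"
  end
end

# Returns the post body with one line of markup per attachment appended.
# The text is returned unchanged when the post has no attachments.
def append_attachments(raw, attachments)
  return raw if attachments.empty?

  lines = attachments.map { |a| attachment_markup(a) }
  "#{raw.rstrip}\n\n#{lines.join("\n")}"
end
```

The point is only that the reference has to end up in the raw text of the post; an upload that exists but is never mentioned in raw will not show up when the post is rendered.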

We tried and we are facing these issues: it has imported all posts, but topic titles won’t appear and external images won’t render. The current SMF2 Forum is: https://forum.mundofotografico.com.br we are trying to migrate to Discourse here: https://discourse.fotografos.online - All the topics and proper descriptions didn’t come through, images are not loading… please help! @marcozambi @miligraf @FireAllianceNX @pfaffman

I’m just embarking on the SMF migration process and I’m currently importing posts into a test instance at about 1000/hour so all is well so far apart from the MySQL performance script where MySQL didn’t like the ‘ALTER USER’ command for some reason. I manually did a ‘CREATE USER’ and all was well after that.

I read the comment about deleted users but I can’t easily create new users/fake emails to cover all my deleted users (my forum has been running for over 20 years and I’ve probably got more deleted users than real users now). I suspect I’ve got 4-5000 deleted users. Not all will have posted but a lot will have done so I probably have many hundreds of ‘missing’ users.

The posts are being imported as belonging to ‘system’ which isn’t really ideal. I did wonder whether the following would work.

  1. Before importing, create a dummy user, e.g. ‘Deleted User’.
  2. Find out the user number for ‘Deleted User’
  3. Modify the line “user_id: user_id_from_imported_user_id(message[:id_member]) || -1,” in smf2.rb and replace the ‘-1’ with the user number of ‘Deleted User’ (I think system user is -1?)

Would that work? Also, are there other places in smf2.rb where I’d need to make a similar change?
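To make the question concrete, here is a rough sketch of the substitution in step 3, in standalone Ruby. The owner_id_for method and the plain Hash used as the lookup are stand-ins invented for this example; in smf2.rb the lookup is user_id_from_imported_user_id, and the id 57 is just a placeholder for whatever id ‘Deleted User’ ends up with.

```ruby
# -1 is the fallback currently hard-coded in the smf2.rb line quoted above.
SYSTEM_USER_ID = -1

# Returns the Discourse user id that should own an imported message.
# `lookup` maps SMF member ids to Discourse user ids (a plain Hash here,
# standing in for the importer's own lookup method).
def owner_id_for(id_member, lookup, fallback_id = SYSTEM_USER_ID)
  lookup[id_member] || fallback_id
end

lookup = { 12 => 340 }          # SMF member 12 was imported as Discourse user 340
deleted_user_id = 57            # hypothetical id of the 'Deleted User' account

owner_id_for(12, lookup, deleted_user_id)  # known member keeps their own posts
owner_id_for(0, lookup, deleted_user_id)   # deleted member falls back to 57
```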

Hi there, by “deleted” do you mean that they’re actually deleted from the SMF database, or are they still in the database with their username and email and marked as suspended? How do posts by those “deleted” users currently display in SMF?

I’m in the middle of a huge migration from Drupal to Discourse, also from a long-established forum with tons of suspended users. I definitely wanted to maintain those same suspended usernames and their associated email addresses in Discourse, so I had to add that function to the Drupal importer script for Discourse. Basically the script imports all the users as normal active users, and if they had any publicly visible posts those will also get imported just like on the original forum. Then at the very tail end of the process I added a function that I lifted from another importer script to go through the Drupal database and if the user was marked as suspended to also suspend the Discourse account. You can see the code for that in my post history here.

Hi. The users are actually deleted, i.e. there’s no longer a record for them in the smf_member table. SMF doesn’t have a function to suspend users. You can ban users, but that doesn’t seem right to do for an account where the user has died or lost interest in the hobby/forum. It’s not really right from a data protection perspective either.

SMF posts have two fields stored for each record…the user member number, which is set to zero for deleted users, and the poster name, which contains the username of the poster. So you can see which user posted the message but there are no longer any details (email, full name, etc) available for the user. Their posts have a ‘Guest’ marker when displayed.

I guess I could create a new user account for every user who has posted a message that has a member ID of zero and assign a dummy email address for the account, then mark the user as suspended afterwards. I could mark accounts as suspended based on the format of the dummy email account if I used something unique but identifiable. That feels a bit weird in some cases though…creating accounts for people who I know died 10-15 years ago!

I have time to think about this though…the migration partially worked but I now have to figure out why attachments weren’t attached, in-forum links weren’t modified and why passwords for migrated users don’t work. There may be other problems too, but I’ll work at fixing those problems first and then see what else crops up.

You mean Postgres? I’m not sure what this is about.

What I would do is this: if the user ID is 0, use the username as the import ID. Then, if find_username_by_import_id fails to find the user, create the user, setting the email address to fake_email (a function in base.rb that generates a fake email address) and the username to the one you have. If you’re ambitious, you could suspend all users that have @email.invalid in their email address at the end of the script. They won’t be active, so I don’t think it would matter much if you didn’t suspend them.
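Roughly, that could look like the sketch below. It is standalone Ruby rather than a patch to smf2.rb: the message hashes with :id_member and :poster_name keys, the "deleted-" prefix, and all four method names are assumptions made for the example, and placeholder_email only imitates what fake_email is described as doing.

```ruby
require "securerandom"

# Imitates the importer's fake_email helper: a random address in a domain
# that can never receive mail. (Stand-in, not the real helper.)
def placeholder_email
  "#{SecureRandom.hex}@email.invalid"
end

# Import id to use for a message's author. Deleted SMF members have
# id_member == 0, so fall back to an id derived from the stored poster name.
def author_import_id(id_member, poster_name)
  return id_member.to_s unless id_member.to_i.zero?

  "deleted-#{poster_name.strip.downcase}"
end

# Builds one placeholder user per distinct deleted poster (case-insensitive).
def placeholder_users(messages)
  messages
    .select { |m| m[:id_member].to_i.zero? }
    .map { |m| m[:poster_name].strip }
    .uniq { |name| name.downcase }
    .map do |name|
      { id: author_import_id(0, name), username: name, email: placeholder_email }
    end
end

# True for accounts created with a placeholder address, which is how they
# could be picked out for suspension at the end of the run.
def placeholder_email?(email)
  email.end_with?("@email.invalid")
end
```

Using the same author_import_id when creating the placeholder users and when importing posts is what lets each post find its owner again.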

Another way would be to do a query that somehow generated a list of all of the deleted users and then created them before you started doing posts, but that seems harder.

If you want to create a single “deleted user” account and have all of those posts owned by that account instead of system, you could do that and just replace the -1 with the user number of the deleted user. You could create it as a regular user or do something fancy and give it a user id of -2 or something like that.

In some systems this happens because attachments are sometimes referenced in the body of the post and sometimes exist only as attachment records in the database.

Did you install the Migrated password hashes support plugin after you ran the import? (It can interfere with running imports in at least some circumstances.) Does SMF2 hash passwords the same way that SMF does?

Sorry wrong name for the script. It’s the MySQL script referred to in the first post

-- file: ~/smf2/script_for_mysql_tuning.sql
ALTER USER 'user'@'%' IDENTIFIED WITH mysql_native_password BY 'pass';

Thanks for the suggestions about users and particularly fake_email. My first task is to learn enough Ruby to be able to make changes to the import script!

SMF2 attachments are records in the database. Having dug a bit deeper it looks like some have been imported, but only a few hundred out of tens of thousands. I’ll keep digging to see if I can figure out why.

Ahhh, that’s probably what I’m missing! I’m pretty sure that SMF2 uses the same hashing (salted MD5 IIRC) as SMF1 so the plugin will hopefully fix the problem. I need to do more import runs before I worry too much about users logging in.

One other question comes to mind. Is there a way to reset the system to allow me to do another import? I should have taken a backup before I started but forgot :anguished:

Oh. You mean just getting mysql set up. I see.

If you know some other languages, you can probably just muck along.
I wrote several importers before I did anything like learning Ruby. :slight_smile:

Here is one way to drop and create a new Discourse database.

sv stop unicorn; DISABLE_DATABASE_ENVIRONMENT_CHECK=1 IMPORT=1 rake db:drop db:create db:migrate; sv start unicorn

If you can remember to make a backup that can be a bit faster. Maybe.

Another trick, once you have the users figured out, is to stop the script after users are imported and make a backup then. That will let you debug the post import without having to import all the users again.

I know a few. I wrote my first program in 1976 in binary machine code on an Intel 4004. I’m starting to make sense of smf2.rb with a bit of DuckDuckGoing to understand some of the code structures that are new to me.

Thanks for database drop/create method. Time to start over and see if I can make some incremental changes to the importer for my data.

I’ve managed to mod the importer to create dummy accounts with a fake email address for deleted users and the dummy accounts own their correct posts so that’s a good start.

I’m trying to understand attachments next because I don’t see any on any of the posts I’ve imported so far (and there should be some).

If I create a message normally via the Discourse web page I get a record in the posts table (id=4346), two records in the uploads table (ids=403 and 404), four records in upload_references (403/Draft/4, 403/Post/4346, 404/Draft/4, 404/Post/4346). I also see 403 in the image_upload_id field for post 4346 and HTML referring to the two uploads in the posts/cooked field.

For imported posts, I get a post table record for each imported SMF message and a record in the uploads table for each attachment associated with an imported SMF message. The uploads table records refer to disk files that contain the correct images, so that part is working OK. However, I don’t get any upload_references records for the uploaded images or any of the upload ids in the image_upload_id field in the posts table.

I assume that I need to try to get the upload_references records created and posts-image_upload_id and cooked fields populated, but I wanted to check first that there isn’t some other way of associating uploads with posts that’s being used (or attempting to be used) by the importer?