Importing / migrating SMF2 to Discourse - The Ultimate Guide

SMF2 to Discourse - The Ultimate Guide

Starting from the idea of creating an up-to-date guide to migrate from SMF2 to Discourse discussed in this excellent thread by @Vincent and @cmwebdev, I have prepared an ambitious “Ultimate Guide” using some notes I took merged with details already written in this other topic, which was fundamental to even start working on the import concept.

We are more than happy to update this guide with any feedback and further experiences as they come in.

Preparation work on SMF2

In order to minimise the issues while porting your SMF2 content to Discourse, please perform those of the following steps that are relevant for your current SMF2 configuration

  • Provide a description for each Category
  • Check that there are no users having the same e-mail.
  • Check for corrupted attachments. In certain cases corrupted attachments are crashing the import process.
  • If you have deleted users, but their posts are still in the SMF2 DB, in Discourse they’ll be assigned to the user “System”. Consider the idea to re-create the deleted users (each having a unique and valid e-mai to avoid import problems). This step might need a bit of behind-the-curtains SMF2 DB tweak, but this is beyond the purpose of this Guide.
  • Consider writing a script to deal with BBcode and other oddities that the Discourse importer is not dealing with (more below, in the “Bonus track” section). We strongly recommend to install the BBCode official Plugin, which substantially extends the number of BBcode accepted by Discourse.
  • If you have your attachments split in multiple directories (a feature that SMF2 allows), be aware that the Discourse import script expects them to be in a single directory.

Install Discourse

We’re not going to provide a step-by-step guide to install Discourse, as you can choose to do it as a paid service, or using (as we did) this excellent guide: discourse/INSTALL-cloud.md at master · discourse/discourse · GitHub
From here onwards we assume that you have been successful with your Discourse installation, and that you have command-line access to the host system where Discourse is installed.
We also assume that you have familiarity with the linux command line and with a few basic linux commands and text editors (e.g. vi or nano).

Prepare the SMF2 data for export

So, at the moment all your SMF2 data is sitting in a MySQL DB on a server which is, possibly, different than the one running your Discourse installation. In principle it is possible to connect directly to the SMF2 MySQL DB server from the Discourse server, provided that you have port 3306 exposed to the open Internet and that you have the credentials to connect to it.

[Optional] For our own SMF2 migration we have decided to do an extra step. We have created a duplicate of the SMF2 database on the server hosting the SMF2 forum, then run a pre-export script onto this clone, then export the cloned DB rather then the original one. This is because we are still testing the Discourse migration and we want to be absolutely sure we have the most complete and transparent migration possible. For further details see the “Bonus track section”.

In this guide we have chosen to go for a different solution, creating a minimalistic MySQL container on the Discourse server where we’re going to import the SLQ dump of our SMF2 database.
On your SMF2 DB server perfom a SMF2 DB dump.

/usr/bin/mysqldump -u<your_user> -p<your_password> --databases <your_smf2_db> > smf2.db

Prepare the SMF2 data for being imported into Discourse

There are a few steps that shall be performed for a successful first import. Execute the following on your Discourse host server.

  1. Make yourself at home and prepare the necessary directories
cd ~
mkdir smf2
mkdir smf2/attachments
  1. Copy over the SMF2 database, the attachments and the SMF2 Settings.php file.
rsync -aruviP user@smf2server:/path_to_db/smf2.sql ~/smf2/
rsync -aruviP user@smfd2server:/path_to_smf2_root/attachments/* ~/smf2/attachments/
rsync -aruviP user@smf2server:/path_to_smf2_root/Settings.php ~/smf2/
# If you have more than one attachments dir, this is a good time to just copy everything into one on the Discourse server
# rsync -aruviP user@smf2server:/path_to_smf2_root/attachments_A/ ~/smf2/attachments/
# rsync -aruviP user@smf2server:/path_to_smf2_root/attachments_B/ ~/smf2/attachments/
# rsync -aruviP user@smf2server:/path_to_smf2_root/attachments_C/ ~/smf2/attachments/
# ...
  1. Create and start the MySQL container.
docker run -d -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -v ~/smf2:/backup --name=mysql mysql

The container has been created to mount our directory ~/smf2 as volume, in the /backup directory inside the container. In other words all files and directories that you may have in the host directory ~/smf2 will be visible and available inside the mysql container under /backup.

  1. In case you need to (re)start the mysql container later, for any reason, you can just use the basic docker commands to do it
# Starting the mysql container
docker start mysql

# Stopping the mysql container
docker stop mysql
  1. Create a SQL script to configure the MySQL server running in the mysql container. This is necessary in case you have a large SMF2 database. Our own SMF2 forum has 320000+ posts and about 8 GB of attachments to import, and the first import tries were just frustrating. Crash, after crash, after crash. The crashes were due to the connection between the Discourse importer and the MySQL running in the container timing out. After reading several contributions, all leading us towards the right direction, we put together a SQL script containing every setting we needed for an ultra-stable importer-to-mysql connection.
    After this script was executed, we were able to run a full import without any timeout errors (btw, in our case it lasted for about 48 hours). Here is SQL the script (we saved it in ~/smf2 as “script_for_mysql_tuning.sql”)
-- file: ~/smf2/script_for_mysql_tuning.sql
ALTER USER 'user'@'%' IDENTIFIED WITH mysql_native_password BY 'pass';
SET GLOBAL net_write_timeout=3600;
SET GLOBAL net_read_timeout=3600;
SET GLOBAL delayed_insert_timeout=3600;
SET GLOBAL max_length_for_sort_data=8388608;
SET GLOBAL max_sort_length=8388608;
SET GLOBAL net_buffer_length=1048576;
SET GLOBAL max_connections=10000;
SET GLOBAL connect_timeout=31536000;
SET GLOBAL wait_timeout=31536000;
SET GLOBAL max_allowed_packet=1073741824;
SET GLOBAL mysqlx_read_timeout=2147483;
SET GLOBAL mysqlx_idle_worker_thread_timeout=3600;
SET GLOBAL mysqlx_connect_timeout=1000000000;

SET SESSION net_write_timeout=3600;
SET SESSION net_read_timeout=3600;
SET SESSION max_length_for_sort_data=8388608;
SET SESSION max_sort_length=8388608;
SET SESSION wait_timeout=31536000;

It’s important to remember that this .sql script will be “automatically” visible in the mysql container under /backup

  1. It’s now time to enter the mysql container
docker exec -it mysql bash
  1. Let’s import the smf2.db content. We configure the MySQL server first, and then we import the data. Environment variable $MYSQL_PASSWORD $MYSQL_DATABASE are pre-defined in the container. Keep in mind this step can be quite time consuming, depending on how much data you have in your smf2.db file.
mysql -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE < /backup/script_for_mysql_tuning.sql
mysql -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE < /backup/smf2.sql
  1. Exit the mysql container with
CTRL+D
  1. We need to get the IP address of the mysql container, that will be used later, when importing into Discourse. Note down the IP address.
docker inspect mysql | grep IPAddress

Preparing the the Discourse Host and container for importing

  1. First of all, we need to create a copy of the original app.yml file, for example import.yml. We’re going to edit the content of import.yml to enable the mysql2 gem and mount as volume the directory containing our smf2 attachments.
cd /var/discourse
cp containers/app.yml containers/import.yml
nano containers/import.yml
  1. Now inside containers/import.yml add - “templates/import/mysql-dep.template.yml” to the list of templates. Afterwards it should look something like this:
templates:
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
  - "templates/web.ssl.template.yml"
  - "templates/web.letsencrypt.ssl.template.yml"
 # Un-comment the line below to enable the MySQL library in the Discourse container
  - "templates/import/mysql-dep.template.yml"

  ...

## The Docker container is stateless; all data is stored in /shared
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log
  - volume:
      host: /root/smf2 # Here is where we wave copied, on the host, all attachments
      guest: /shared/smf2 # Here is the mounting point of the volume in the Discourse import container
  1. Stop the app container and rebuild the import container. Wait patiently.
/var/discourse/launcher stop app
/var/discourse/launcher rebuild import
  1. Copy back the smf2.rb file into the import container
docker cp ~/smf2/smf2.rb import:/var/www/discourse/script/import_scripts/
  1. Edit the Settings.php file from SMF2 with the correct connection details. To be completely clear, this step is not really necessary, as the DB connection parameters can also be passed to the smf2.rb import script as parameters, but I found this solution to be faster and more flexible.
########## Database Info ##########
$db_type = 'mysql';
$db_server = '172.17.0.X'; # This is the IP address of the mysql container - use yours!
$db_name = 'db';
$db_user = 'user';
$db_passwd = 'pass';
$ssi_db_user = '';
$ssi_db_passwd = '';
$db_prefix = 'smf_';
$db_persist = 1;
$db_error_send = 0;

Import!

  1. Enter the import containers
/var/discourse/launcher enter import
  1. Start the import!
su discourse -c "bundle exec ruby script/import_scripts/smf2.rb /shared/smf2 -t Europe/Rome"
  1. Grab a beer and wait…

Further import sessions

Following the first, massive import, we are doing “delta” imports every night, untill we will be ready to move to Discourse 100%. To do this we are basically re-running the rsync to copy over whatever new attachment was created each day, then repeating the MySQL dump and import in the mysql container, and to finish we re-launch the import script.

Everything that has already been imported will just be ignored, so importer runs quite faster. If you really do want to minimise the import time you could alter the smf2.rb code adding a WHERE id_msg clause to the Query that prepares the data for importing posts.
For our deltas we have changed it like this:

create_posts(query(<<-SQL), total: total) do |message|
  SELECT m.id_msg, m.id_topic, m.id_member, m.poster_time, m.body,
         m.subject, t.id_board, t.id_first_msg, COUNT(a.id_attach) AS attachment_count
  FROM {prefix}messages AS m
  LEFT JOIN {prefix}topics AS t ON t.id_topic = m.id_topic
  LEFT JOIN {prefix}attachments AS a ON a.id_msg = m.id_msg AND a.attachment_type = 0
  WHERE m.id_msg > 304000
  GROUP BY m.id_msg
  ORDER BY m.id_topic ASC, m.id_msg ASC
SQL

Bonus track

:mega: The script now sanitizes BBCode on import.

As mentioned above, we have developed a PHP script which helps us sanitising some of the unsupported BBCcode and does also some extra stuff with our embedded images/links and unsupported emoji.
I paste here the main function of this script, in case all of part of it can be useful for your migration efforts.

Please note that this script is supposed to run on the SMF2 server, and that it does do potentially harmful changes to your SMF2 DB!. Some of the preg_replace have been commented out as we have installed the BBCode Plugin which supports them.

function cleanup()
{
	global $exportDbConnection; // This is a PDO connection object, which connects to the SMF2 DB.

	// Unsupported or custom emoji translation
	$emoF = array(
		0 => '/:tease:/',
		1 => '/\[emoji1\]/',
		2 => '/:agree:/',
		3 => '/:happy:/',
		4 => '/\[emoji28\]/',
		5 => '/:surprise:/',
		6 => '/:embarrassed:/',
		7 => '/:evil:/',
		8 => '/:sad:/',
		9 => '/:undecided:/',
		10 => '/:death:/',
		11 => '/:help:/',
		12 => '/:hurt:/',
		13 => '/:sick:/',
		14 => '/:spam:/',
		15 => '/:surprise:/',
		16 => '/:vomit:/',
		17 => '/:wounded:/',
		18 => '/:yes:/',
		19 => '/:badmood:/',
		21 => '/:stica:/',
		22 => '/:spank:/',
		23 => '/:shock:/',
		24 => '/:censored:/',
		25 => '/:rtfm:/',
		26 => '/:police:/',
		27 => '/:blindfold:/',
		28 => '/:canadian:/',
		29 => '/:clown:/',
		30 => '/:crazy:/',
		31 => '/:educated:/',
		32 => '/:gum:/',
		33 => '/:hungry:/',
		34 => '/:snore:/',
		35 => '/:suspious:/',
		36 => '/:tired:/',
		37 => '/:ugly:/',
		38 => '/:whatever:/',
		39 => '/:whistle:/',
		40 => '/:ninja:/',
		41 => '/:pirate:/',
		42 => '/:\[emoji16\]:/'
	);

	$emoT = array(
		0 => ':tongue:',
		1 => ':smiley:',
		2 => ':ok_hand:',
		3 => ':smile:',
		4 => ':sweat_smile:',
		5 => ':astonished:',
		6 => ':flushed:',
		7 => ':japanese_ogre:',
		8 => ':disappointed:',
		9 => ':thinking:',
		10 => ':skull:',
		11 => ':ambulance:',
		12 => ':face_with_head_bandage:',
		13 => ':face_with_thermometer:',
		14 => ':wastebasket:',
		15 => ':astonished:',
		16 => ':face_vomiting:',
		17 => ':face_with_head_bandage:',
		18 => ':ok_hand:',
		19 => ':angry:',
		21 => ':rocket:',
		22 => ':facepunch:',
		23 => ':dizzy_face:',
		24 => ':face_with_symbols_over_mouth:',
		25 => ':bookmark_tabs:',
		26 => ':policeman:',
		27 => ':see_no_evil:',
		28 => ':man_dancing:',
		29 => ':clown_face:',
		30 => ':crazy_face:',
		31 => ':notebook:',
		32 => ':smiley:',
		33 => ':spaghetti:',
		34 => ':confuse:',
		35 => ':thinking:',
		36 => ':weary:',
		37 => ':thinking:',
		38 => ':expressionless:',
		39 => ':kissing_smiling_eyes:',
		40 => ':martial_arts_uniform:',
		41 => ':skull_and_crossbones:',
		42 => ':grinning:'
	);

	$sta = 0;
	$step = 30000;

	do
	{
		$end = $sta + $step;

		$query = "SELECT id_msg, subject, body FROM smf_messages WHERE id_member != 19754 AND id_msg >= " . $sta . " AND id_msg <= " . $end . ";";
		echo $query.PHP_EOL;

		$sta += $step;

		try {
			$stmt = $exportDbConnection->query($query);
		} catch(PDOException $ex) {
			echo "An Error occured!";
			echo $ex->getMessage();
		}

		$results = $stmt->fetchAll(PDO::FETCH_ASSOC);
		foreach ($results as $k => $line) {

			$bbcode = $line['body'] ;

      // echo $line["id_msg"]." - ".$line["subject"].PHP_EOL;
			// echo $bbcode.PHP_EOL;
			// echo PHP_EOL." => ".PHP_EOL;

			// HTML line breaks to \n
			$bbcode = preg_replace('/(<br\s?\/?>)/is', "\n", $bbcode);

			//$bbcode = html_entity_decode ($bbcode,ENT_COMPAT | ENT_HTML401,"UTF-8");
			$bbcode = preg_replace('/\[hr\]/i', "\n---\n", $bbcode);

			/*
			$bbcode = preg_replace('/\[b\]/i', " **", $bbcode);
			$bbcode = preg_replace('/\[\/b\]/i', "** ", $bbcode);
			$bbcode = preg_replace('/\[u\]/i', "", $bbcode);
			$bbcode = preg_replace('/\[\/u\]/i', "", $bbcode);
			$bbcode = preg_replace('/\[i\]/i', " *", $bbcode);
			$bbcode = preg_replace('/\[\/i\]/i', "* ", $bbcode);
			$bbcode = preg_replace('/\[(ul|list|list type=decimal)\]/is', "", $bbcode);
			$bbcode = preg_replace('/\[\/(ul|list|li)\]/is', "", $bbcode);
			$bbcode = preg_replace('/\[li\]/is', " * ", $bbcode);
			$bbcode = preg_replace('/(\[(ol|ul|list|list type=decimal)\])\[/is', "$1\n", $bbcode);
			*/

			// We get rid of the [img] bbcode and we just keep the image url
			$bbcode = preg_replace('/(\[img]|\[img width(=|\d|")+\])(.+?)\[\/img]/i', "\n$3\n", $bbcode);

			// Fix double URLs like [url=http://www.website.it/xyz]http://www.website.it/xyz[/url]
			$regexp = '/\[url=(https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*))\]https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)\[\/url\]/i';
			$bbcode = preg_replace($regexp,'$1',$bbcode);

			// Images in [center] do not work in Discourse, removing the BBCcode
			$regexp = '/\[center\](\r\n|\r|\n)?(\S+\.(png|jpe?g|gif))(\r\n|\r|\n)?\[\/center\]/i';
			$bbcode = preg_replace($regexp, "$2", $bbcode);

			// LaTeX
			$bbcode = preg_replace('/\[tex\](.+?)\[\/tex\]/i', ' $ $1 $', $bbcode);

			// Each bbcode goes to newline,
			$bbcode = preg_replace('/\]\[/is', "]\n[", $bbcode);

			/*
			$bbcode = preg_replace('/\[center\](.+?)\[\/center\]/i', '### $1', $bbcode);
			$bbcode = preg_replace('/\[color.+?\](.+?)\[\/color\]/i', '$1', $bbcode);
			$bbcode = preg_replace('/\[size=?\d+pt\](.+?)\[\/size\]/i', '$1', $bbcode);
			$bbcode = preg_replace('/\[font.+?\](.+?)\[\/font\]/i', '$1', $bbcode);
			*/

			$bbcode = preg_replace('/\[sub\](.+?)\[\/sub\]/i', '$1', $bbcode);

			// Replace multiple (3 ore more) line breaks with a single one.
			$bbcode = preg_replace('/[\r\n]{3,}/s', "\n\n", $bbcode);
			$bbcode = preg_replace('/[\n]{3,}/s', "\n\n", $bbcode);

			// Handle some special case here...
			$bbcode = preg_replace('/(&amp;#039;|&#039;)+/',"'", $bbcode);
			$bbcode = preg_replace('/&nbsp;/'," ", $bbcode);

			$bbcode = preg_replace($emoF,$emoT,$bbcode);

			$subject = html_entity_decode ($subject,ENT_COMPAT | ENT_HTML401,"UTF-8");

			$upd = $exportDbConnection->prepare("UPDATE smf_messages SET subject=?, body=? WHERE id_msg=?");
			$upd->execute(array($subject,$bbcode, $line['id_msg']));
			$affected_rows = $upd->rowCount();
			echo $affected_rows.PHP_EOL;
			echo '<hr>'.PHP_EOL;
		}

		ob_flush();
		sleep (5);
	} while ($end <= 360000);
}
18 Likes

Great guide! It really helped me migrate a large SMF2 forum to Discourse and I’m loving it.

Just two notes:

  • The host’s smf2 path might be different. In my case it was /home/ubuntu/smf2 (EC2 instance running Ubuntu 16.04 LTS image).
  • After the import is finished, the user needs to destroy the import container because if not, that’s the one that will run on server boot (instead of the app container).

Tips:

  • Run the import using a large EC2 instance (c5.2xlarge) and then downgrade to a smaller type (t2.medium) - for me it took about 1.5 hours to import a forum with 28k messages and 10GB of attachments!
  • If you’re planning on using S3 for uploads - then set it up before running the import!
7 Likes

Does this import script support the importing of thread views?

So I’ve pretty much followed this guide all the way through (except for the script running with CLI settings rather than the settings.php) I adjusted all the SQL settings, but I am still getting attachment fails.

It’s not very detailed in the error however. Was wondering how I should be troubleshooting this?

Here’s a sample:

    11573 / 27235 ( 42.5%)  [1089 items/min]  Attachment for post 13812 failed: sf-logo-1.jpg
Attachment for post 13812 failed: sf-logo-2.jpg
    13854 / 27235 ( 50.9%)  [1097 items/min]  Attachment for post 16815 failed: Kaiah3.JPG
    14757 / 27235 ( 54.2%)  [1099 items/min]  Attachment for post 18085 failed: 2010-10-29_19-35-06_292.jpg
    17536 / 27235 ( 64.4%)  [1094 items/min]  Attachment for post 20762 failed: WoWScrnShot_052511_221316.jpg
    17559 / 27235 ( 64.5%)  [1094 items/min]  Attachment for post 20792 failed: WoWScrnShot_052511_221316.jpg
Attachment for post 20792 failed: WoWScrnShot_052511_221316.jpg
    17562 / 27235 ( 64.5%)  [1094 items/min]  Attachment for post 20793 failed: WoWScrnShot_052511_221316.jpg

All the topics and posts went through, along with users. (There are image links pointing to old servers still and no idea how to fix those either, but that’s for another day.)

My old hosting provider also decided that our SMF hosted server was too outdated now and will be cutting us off next month so now I’m trying to get this imported off the ground ::sigh::

Thank you for this wonderful guide and any help is appreciated.

3 Likes

First, look and see if you have those filenames anywhere. If you are lucky you just didn’t configure the importer to look for the files in the right place. Sometimes it’s something more difficult to straighten out.

If those image links work, It’s possible to pull in those images from the other server. You can see system settings that have “download” in them.

If you’re stuck and have a budget see Discourse Migration – Literate Computing, LLC.

3 Likes

So I definitely don’t have all of the attachments for sure - but the ones that failed with the longer entries are the one that do exists. The way the importer is updating the name of the file from the hash name of the actual file makes me think that it’s able to convert the hash… but I am not sure where the importer is looking for the files (No clue how to read ruby, but I can read simple PowerShell/bash scripts :open_mouth: )

Definitely don’t have a budget that they are asking for xD

There is usually a variable to set at the top of the file that says where to find the attachments. Also, note that the file structure inside the container and outside (is you’re running this in a container, that is) is different.

1 Like

I’m happy to report that I finally got the import working. It would appear that I was using all the switches for the database but never pointed to the root of the SMF2 file structure so it was using the the patch it defaults to (/var/www/discourse)
Apparently running the .rb script with no argument will let you know all these things lol sorry for these replies, but I didn’t really have anyone to talk to that would have any idea what I’m talking about :laughing:

2 Likes

Great! Glad you got it working!

3 Likes

Thanks for this great guide! I followed the steps and get the migration finished in far less time than expected.

But I’m having a big issue: users can not log in to discourse using the passwords that they were using on the smf2 forum… of course they can reset the passwords using the email but I’m wondering if you got the same problem.

Thanks!

If you configured things right, this might help: Migrated password hashes support

4 Likes

Thank you! I will try that

1 Like

Hi,

I’m having trouble to complete a full import of our SMF database (that has almost 6 million posts with about 7000 users).

Everything went well until about 50% of the process, although with a lot of connection timeouts and extreme slowlyness after around 40%. After several tries moving forward a few percents (until about 50%, as said before), I can’t now do a rebuild of my import container to continue the process. When I execute it, I get this:

FAILED
--------------------
Pups::ExecError: cd /var/www/discourse && apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y libmysqlclient-dev failed with return #<Process::Status: pid 421 exit 100>
Location of failure: /pups/lib/pups/exec_command.rb:112:in `spawn'
exec failed with the params {"cd"=>"$home", "cmd"=>["echo \"gem 'mysql2'\" >> Gemfile", "apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y libmysqlclient-dev", "su discourse -c 'bundle install --no-deployment --path vendor/bundle --jobs 4 --without test development'"]}
309ce696d5c0816b0799ae8f2f0b8f74af2c5cb5cd56b28215e727bb4d488ac0
** FAILED TO BOOTSTRAP ** please scroll up and look for earlier error messages, there may be more than one.
./discourse-doctor may help diagnose the problem.

Any ideas? Can it be related to some mysql version not being supported by the database I’m trying to import?

Not a lot to go on here. My guesses are disk full and not enough ram. Oh, but . . . . this rebuild problem could be a problem with the new docker image and something being different about debian vs ubuntu. Perhaps @gerhard could take a look.

You might enter the container and see that the mysql client library is installed (unless those words make no sense to you).

1 Like

Thanks, this part is what pointed me in the right direction. Inside the container the MySQL library failed to install due to the libmysqlclient-dev instruction not being accepted. Fortunately the system instructed the use of libmariadb-dev instead and the launcher could then rebuild the import container successfully.

Btw, is there any way to increase the speed of the import process? Would allocating more resources to the mysql container help? Is there a way to tune it with something like a my.cnf or similar?

1 Like

Mysql probably isn’t the bottleneck (You can increase the number of items that it pulls at once, but I don’t think it makes a lot of difference), but Postgres and rails do. Fast CPU and ram for postgres and Rails are what you need.

One more question (@pfaffman might know a thing or two about this :smile:),

I know the importer scripts are made in such way to skip already imported topics, however I have a lot of them right now (meaning duplicated topics). It might have been something that happened around the first 10-20% of imported topics, because it hasn’t happened again as far as I can tell.

What’s the solution for this? Drop the database entirely and start a new import? Or is there a way to find the duplicates and erase them (ideal solution)?

Thanks in advance.

When posts (and users and topics) are imported a post_custom_field is created so that you shouldn’t get duplicated data if you restart the script, or even just run it twice in a row. If you do have duplicated data somehow, then dropping the database and restarting is the easiest solution.

If you do have duplicated data then you’ll probably want to figure out why before you move forward.

2 Likes

I hit the same problem. Here’s a PR to fix it.

https://github.com/discourse/discourse_docker/pull/446

Maybe @gerhard or @Falco can accept it.

EDIT: Thanks, Gerhard.

4 Likes

Thanks for putting together this fantastic guide! My group has been on SMF for years and years and now that SMF seems to officially be going the way of the dodo with the next PHP upgrade we’ve decided to move on.

Up front, I am not a sys admin. I was a developer in another life, so I know my way around linux but I’ve never worked with databases from a command line. I’m currently on step 7 and I’m not sure if I’m stuck or if things are happening, but there isn’t any visible output.

Two strange things I’ve noticed when running

mysql -uroot -p$MYSQL_PASSWORD $MYSQL_DATABASE < /backup/script_for_mysql_tuning.sql
  1. My terminal on digital ocean keeps replacing “<” with “>” and I have no idea why or what to do about it. This may be my issue.
  2. After running the above, but with the wrong direction angle bracket, I get a prompt to enter a password. I enter the db password and then nothing. It appears to be hung at this point.

I’d appreciate any pointers you can share. Thanks!