PHPBB to Discourse Migration Speed Optimization

Hey been with phpbb since 2004 so transferring a decent size forum. Right now I’m two days in 303139/2167314 (14%) 446 per min. How can I speed it up? I’m on digital ocean and only using 4% CPU. Redis keeps timing out on me. I have have attachments turned off.

A few useful details would help narrow this down:

  • Droplet size: RAM, vCPU count, disk type/size
  • The exact import command you are running
  • Your app.yml resource-related settings, with secrets removed, especially:
    • UNICORN_WORKERS
    • db_shared_buffers
    • db_work_mem
  • Whether the import is running inside the standard Discourse Docker container
  • Any Redis timeout messages/log excerpts
  • Whether the database is local in the same container or external

One thing to note: UNICORN_WORKERS probably will not speed up the import much, because the phpBB importer is not primarily served through the web workers. Increasing it may just consume more RAM. The PostgreSQL memory settings and disk I/O are more likely to matter.

Since CPU usage is only around 4%, this may be I/O-bound, database-bound, or waiting on Redis rather than lacking CPU. Redis timing out is worth investigating separately.

image

My updated log
script/import_scripts/phpbb3.rb:15:in ‘’
302570 / 2167314 ( 14.0%) [448 items/min] Exception while creating post 304683. Skipping.
Waited 1.0 seconds
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/buffered_io.rb:214:in ‘block in RedisClient::RubyConnection::BufferedIO#fill_buffer’
internal:kernel:168:in ‘Kernel#loop’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/buffered_io.rb:197:in ‘RedisClient::RubyConnection::BufferedIO#fill_buffer’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/buffered_io.rb:187:in ‘RedisClient::RubyConnection::BufferedIO#ensure_remaining’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/buffered_io.rb:152:in ‘RedisClient::RubyConnection::BufferedIO#getbyte’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/resp3.rb:113:in ‘RedisClient::RESP3.parse’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection/resp3.rb:50:in ‘RedisClient::RESP3.load’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/ruby_connection.rb:97:in ‘RedisClient::RubyConnection#read’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/connection_mixin.rb:37:in ‘RedisClient::ConnectionMixin#call’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:374:in ‘block (2 levels) in RedisClient#call_v’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client/middlewares.rb:16:in ‘RedisClient::BasicMiddleware#call’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:373:in ‘block in RedisClient#call_v’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:781:in ‘RedisClient#ensure_connected’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-client-0.28.0/lib/redis_client.rb:372:in ‘RedisClient#call_v’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis/client.rb:90:in ‘Redis::Client#call_v’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/rack-mini-profiler-4.0.1/lib/mini_profiler/profiling_methods.rb:90:in ‘block in Redis::Client#profile_method’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:152:in ‘block in Redis#send_command’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:151:in ‘Monitor#synchronize’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis.rb:151:in ‘Redis#send_command’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis/commands/scripting.rb:110:in ‘Redis::Commands::Scripting#_eval’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/redis-5.4.0/lib/redis/commands/scripting.rb:97:in ‘Redis::Commands::Scripting#evalsha’
/var/www/discourse/lib/discourse_redis.rb:38:in ‘Kernel#public_send’
/var/www/discourse/lib/discourse_redis.rb:38:in ‘block in DiscourseRedis#method_missing’
/var/www/discourse/lib/discourse_redis.rb:29:in ‘DiscourseRedis.ignore_readonly’
/var/www/discourse/lib/discourse_redis.rb:38:in ‘DiscourseRedis#method_missing’
/var/www/discourse/lib/discourse_redis.rb:270:in ‘DiscourseRedis::EvalHelper#eval’
/var/www/discourse/lib/distributed_mutex.rb:82:in ‘DistributedMutex#get_lock’
/var/www/discourse/lib/distributed_mutex.rb:50:in ‘block in DistributedMutex#synchronize’
/var/www/discourse/lib/distributed_mutex.rb:49:in ‘Thread::Mutex#synchronize’
/var/www/discourse/lib/distributed_mutex.rb:49:in ‘DistributedMutex#synchronize’
/var/www/discourse/lib/distributed_mutex.rb:34:in ‘DistributedMutex.synchronize’
/var/www/discourse/lib/post_creator.rb:410:in ‘PostCreator#transaction’
/var/www/discourse/lib/post_creator.rb:200:in ‘PostCreator#create’
/var/www/discourse/script/import_scripts/base.rb:611:in ‘ImportScripts::Base#create_post’
/var/www/discourse/script/import_scripts/base.rb:555:in ‘block in ImportScripts::Base#create_posts’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/rack-mini-profiler-4.0.1/lib/patches/db/mysql2/alias_method.rb:8:in ‘Mysql2::Result#each’
/var/www/discourse/vendor/bundle/ruby/3.4.0/gems/rack-mini-profiler-4.0.1/lib/patches/db/mysql2/alias_method.rb:8:in ‘Mysql2::Result#each’
/var/www/discourse/script/import_scripts/base.rb:542:in ‘ImportScripts::Base#create_posts’
/var/www/discourse/script/import_scripts/phpbb3/importer.rb:212:in ‘block in ImportScripts::PhpBB3::Importer#import_posts’
/var/www/discourse/script/import_scripts/base.rb:943:in ‘block in ImportScripts::Base#batches’
internal:kernel:168:in ‘Kernel#loop’
/var/www/discourse/script/import_scripts/base.rb:942:in ‘ImportScripts::Base#batches’
/var/www/discourse/script/import_scripts/phpbb3/importer.rb:293:in ‘ImportScripts::PhpBB3::Importer#batches’
/var/www/discourse/script/import_scripts/phpbb3/importer.rb:208:in ‘ImportScripts::PhpBB3::Importer#import_posts’
/var/www/discourse/script/import_scripts/phpbb3/importer.rb:38:in ‘ImportScripts::PhpBB3::Importer#execute’
/var/www/discourse/script/import_scripts/base.rb:47:in ‘ImportScripts::Base#perform’
/var/www/discourse/script/import_scripts/phpbb3/importer.rb:22:in ‘ImportScripts::PhpBB3::Importer#perform’
script/import_scripts/phpbb3.rb:35:in ‘module:PhpBB3
script/import_scripts/phpbb3.rb:16:in ‘module:ImportScripts
script/import_scripts/phpbb3.rb:15:in ‘’
333919 / 2167314 ( 15.4%) [441 items/min]

Seems to be working though, not complaining.
so far

stopped unicorn

disabled attachments and avatars

That droplet size looks reasonably strong for this kind of import, so I would not expect UNICORN_WORKERS to be the main limiter here.

A couple of observations:

  • Stopping Unicorn is probably fine for an import-only/staging container, but it likely will not massively speed up the importer itself.
  • Attachments and avatars being disabled should remove two of the slower parts, so if you are still only getting ~440 posts/min, the bottleneck may be database I/O, the source phpBB MySQL query side, or Redis/PostgreSQL waits.
  • The log line is worth watching carefully:

Exception while creating post 304683. Skipping.

Even though the import continues, that means at least that post was skipped. If this happens repeatedly, the final migration may be missing posts.

The Redis part looks like a 1-second Redis client timeout during Discourse’s distributed mutex lock while PostCreator is creating a post. I would check whether Redis is actually overloaded or whether the timeout is just too aggressive during a long import.

Useful next checks would be:

free -h
df -h
iostat -xz 1
vmstat 1
redis-cli INFO memory
redis-cli INFO stats
redis-cli SLOWLOG GET 20

Also, is the phpBB MySQL database local on the same droplet, or being read over the network? Since the stack trace is iterating through mysql2, a slow source database or slow disk can keep CPU usage low while the importer waits.

At the current rate, from 333919 / 2167314 at about 441 items/min, you still have roughly 69 hours left, so it may finish, but I would mainly be concerned about whether those Redis timeout exceptions are causing skipped posts.

I’d not suggest raising UNICORN_WORKERS. On an import-only run, Unicorn is mostly irrelevant. The important thing is that his import is continuing but skipping posts, which is the part I’d flag.