Hi there, this topic gives some background on the migration that I’m slowly planning and testing. I finally tried the Drupal importer last Friday on a testbed VPS using a combination of this and this. The importer is still running as I type this, so I haven’t been able to actually test the functionality of the test site yet, but it’s about to finish soon.
The biggest issue I’m facing is a “duplicate key value” on 8 apparently random nodes (the equivalent of topics in Discourse) out of ~80,000 total nodes. These are the specific nid
numbers just in case there’s some really weird Y2K-esque sort of math bug at stake:
42081, 53125, 57807, 63932, 66756, 76561, 78250, 82707
This same error always happens on those same nid
s when re-running the importer:
Traceback (most recent call last):
19: from script/import_scripts/drupal.rb:537:in `<main>'
18: from /var/www/discourse/script/import_scripts/base.rb:47:in `perform'
17: from script/import_scripts/drupal.rb:39:in `execute'
16: from script/import_scripts/drupal.rb:169:in `import_forum_topics'
15: from /var/www/discourse/script/import_scripts/base.rb:916:in `batches'
14: from /var/www/discourse/script/import_scripts/base.rb:916:in `loop'
13: from /var/www/discourse/script/import_scripts/base.rb:917:in `block in batches'
12: from script/import_scripts/drupal.rb:195:in `block in import_forum_topics'
11: from /var/www/discourse/script/import_scripts/base.rb:224:in `all_records_exist?'
10: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/transactions.rb:209:in `transaction'
9: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_statements.rb:316:in `transaction'
8: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction.rb:317:in `within_new_transaction'
7: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
6: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
5: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
4: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
3: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction.rb:319:in `block in within_new_transaction'
2: from /var/www/discourse/script/import_scripts/base.rb:231:in `block in all_records_exist?'
1: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:56:in `exec'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:56:in `exec': ERROR: duplicate key value violates unique constraint "import_ids_pkey" (PG::UniqueViolation)
DETAIL: Key (val)=(nid:42081) already exists.
20: from script/import_scripts/drupal.rb:537:in `<main>'
19: from /var/www/discourse/script/import_scripts/base.rb:47:in `perform'
18: from script/import_scripts/drupal.rb:39:in `execute'
17: from script/import_scripts/drupal.rb:169:in `import_forum_topics'
16: from /var/www/discourse/script/import_scripts/base.rb:916:in `batches'
15: from /var/www/discourse/script/import_scripts/base.rb:916:in `loop'
14: from /var/www/discourse/script/import_scripts/base.rb:917:in `block in batches'
13: from script/import_scripts/drupal.rb:195:in `block in import_forum_topics'
12: from /var/www/discourse/script/import_scripts/base.rb:224:in `all_records_exist?'
11: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/transactions.rb:209:in `transaction'
10: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/database_statements.rb:316:in `transaction'
9: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction.rb:317:in `within_new_transaction'
8: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
7: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
6: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
5: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activesupport-7.0.3.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
4: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/activerecord-7.0.3.1/lib/active_record/connection_adapters/abstract/transaction.rb:319:in `block in within_new_transaction'
3: from /var/www/discourse/script/import_scripts/base.rb:243:in `block in all_records_exist?'
2: from /var/www/discourse/script/import_scripts/base.rb:243:in `ensure in block in all_records_exist?'
1: from /var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:56:in `exec'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/pg.rb:56:in `exec': ERROR: current transaction is aborted, commands ignored until end of transaction block (PG::InFailedSqlTransaction)
The only way I could make it proceed was by hacking the SQL conditions:
...
LEFT JOIN node_counter nc ON nc.nid = n.nid
WHERE n.type = 'forum'
AND n.status = 1
AND n.nid != 42081
AND n.nid != 53125
AND n.nid != 57807
AND n.nid != 63932
AND n.nid != 66756
AND n.nid != 76561
AND n.nid != 78250
AND n.nid != 82707
LIMIT #{BATCH_SIZE}
OFFSET #{offset};
...
I inspected the first failed node as well as the previous and next nid
s on either side of it in the source Drupal database, and I can’t see anything wrong. The nid
is set as the primary key and it has AUTO_INCREMENT
, and the original Drupal site works fine, so there can’t be any fundamental issue with the source database integrity.
Apart from the above bug, these are the limitations I’m having with the script:
-
Permalinks: It looks like the importer script will create permalinks for the former node URLs
example.com/node/XXXXXXX
. But I also need to maintain links to specific comments within those nodes, which have the format of:example.com/comment/YYYYYYY#comment-YYYYYYY
(YYYYYYY
is the same in both occurences). The Drupal URL scheme does not include the node ID that the comment is associated with, whereas Discourse does (example.com/t/topic-keywords/XXXXXXX/YY
), so that looks like a major complication. -
Username limitations: Drupal allows spaces in usernames. I understand that Discourse does not, at least it doesn’t allow new users to create them that way. This post suggests that the importer script will automatically “convert” the problematic usernames, but I can’t see any code for that inUpdate: Actually it looks like Discourse handled this automatically in the correct manner./import_scripts/drupal.rb
. -
Banned users: It appears that the script imports all users, including banned accounts. I could probably add a condition fairly easily to the SQL selection
WHERE status = 1
to only import active user accounts, but I’m not sure if that would cause issues with the serialization of the records. Above all I would prefer to keep those previously banned account names with their associated email addresses permanently blocked off so the same problem users don’t sign up again on Discourse. -
User profile fields: Does anybody know if there is example code in another one of the importers for importing personal information fields from user account profiles? I have just one profile field (“Location”) that I need to import.
-
Avatars (not Gravatars): It seems kind of strange that there is code in the Drupal importer to import Gravatars but not for the far more commonly used local account avatar pictures.
-
Private messages: Almost all Drupal 7 forums will probably be using the third-party privatemsg module (there is no official Drupal PM functionality). The importer doesn’t support importing PMs. In my case I need to import about 1.5M of them.
Thanks in advance for your help and for making the Drupal importer script available.