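I'd like to apologize up front to anyone who feels attacked by this post. Honestly, I have been dealing with these problems since Monday, and I am tired of debugging and hot-fixing Discourse code.
After countless attempts (I stopped counting at the seventh), I think I am going to give up, because it looks like Discourse has not put much effort into migration support.
I think the biggest problem is that this huge database uses the utf8mb4 character set, which the script (?) does not support.
Using utf8 (the default) just produces a flood of error reports, and it is not clear what actually happens, because the script keeps going. Are database entries skipped? Or are they copied over with unsupported characters in them (the classic squares)?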
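One way to answer the "skipped or garbled?" question is to audit the source text before importing it. The sketch below is plain Ruby and is not part of the Discourse import scripts; `needs_utf8mb4?`, `classify_raw` and `audit_rows` are hypothetical names. It relies on one fact: MySQL's legacy `utf8` charset stores at most 3 bytes per character, so any code point above U+FFFF (most emoji, for example) can only be represented in utf8mb4.

```ruby
# Hypothetical helpers for auditing post bodies before an import.
# They are NOT part of script/bulk_import; names are made up for illustration.

# True when the text contains a code point above U+FFFF, which MySQL's
# legacy 3-byte `utf8` charset cannot represent.
def needs_utf8mb4?(text)
  return false if text.nil?

  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  return false unless utf8.valid_encoding?

  utf8.each_char.any? { |char| char.ord > 0xFFFF }
end

# Sort one raw value into a bucket.
def classify_raw(raw)
  return :nil if raw.nil?
  return :invalid_utf8 unless raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  return :needs_utf8mb4 if needs_utf8mb4?(raw)

  :ok
end

# Count how many values fall into each bucket.
def audit_rows(raws)
  raws.each_with_object(Hash.new(0)) { |raw, tally| tally[classify_raw(raw)] += 1 }
end
```

Feeding it the post bodies read from the source database (however they are fetched) gives a rough count of rows that are nil, rows that are not valid UTF-8, and rows that need utf8mb4, which is a starting point for telling skipped rows apart from garbled ones.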
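Worst of all, the last three runs (with the bulk importer) followed exactly the same instructions and still gave different results. The last run imported the topics, started reporting errors right away, and kept going anyway (???):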
Loading application...
Starting...
Preloading I18n...
Fixing highest post numbers...
Loading imported group ids...
Loading imported user ids...
Loading imported category ids...
Loading imported topic ids...
Loading imported post ids...
Loading groups indexes...
Loading users indexes...
Loading categories indexes...
Loading topics indexes...
Loading posts indexes...
Loading post actions indexes...
Importing categories...
Importing parent categories...
5 - 1104/sec
Importing children categories...
500 - 1539/sec
ERROR: duplicate key value violates unique constraint "unique_index_categories_on_name"
DETAIL: Key (COALESCE(parent_category_id, '-1'::integer), name)=(-1, Armata Brancaleone) already exists.
CONTEXT: COPY categories, line 69
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/pg-1.4.5/lib/pg/connection.rb:204:in `get_last_result'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/pg-1.4.5/lib/pg/connection.rb:204:in `copy_data'
/var/www/discourse/script/bulk_import/base.rb:720:in `create_records'
/var/www/discourse/script/bulk_import/base.rb:361:in `create_categories'
script/bulk_import/vbulletin5.rb:291:in `import_categories'
script/bulk_import/vbulletin5.rb:69:in `execute'
/var/www/discourse/script/bulk_import/base.rb:98:in `run'
script/bulk_import/vbulletin5.rb:779:in `<main>'
Importing topics...
600 - 4073/sec
ERROR: undefined method `[]' for nil:NilClass
/var/www/discourse/script/bulk_import/base.rb:513:in `process_topic'
/var/www/discourse/script/bulk_import/base.rb:724:in `block (2 levels) in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/script/bulk_import/base.rb:721:in `block in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/pg-1.4.5/lib/pg/connection.rb:196:in `copy_data'
/var/www/discourse/script/bulk_import/base.rb:720:in `create_records'
/var/www/discourse/script/bulk_import/base.rb:364:in `create_topics'
script/bulk_import/vbulletin5.rb:321:in `import_topics'
script/bulk_import/vbulletin5.rb:70:in `execute'
/var/www/discourse/script/bulk_import/base.rb:98:in `run'
script/bulk_import/vbulletin5.rb:779:in `<main>'
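The first error in the log above is a unique constraint on the pair (parent category, name). A minimal sketch of one possible workaround is to make names unique per parent before the rows reach the COPY. `dedupe_category_names` is a hypothetical helper, the row shape (`:id`, `:parent_id`, `:name`) is an assumption, and names are compared exactly as strings, which may not match how the real index behaves.

```ruby
# Hypothetical helper, not part of script/bulk_import.
# Assumes each category is a Hash with :parent_id and :name keys.
# Appends " (2)", " (3)", ... to repeated names under the same parent.
def dedupe_category_names(categories)
  seen = Hash.new(0)

  categories.map do |category|
    # Mirror the COALESCE(parent_category_id, -1) from the error message.
    key = [category[:parent_id] || -1, category[:name]]
    seen[key] += 1
    next category if seen[key] == 1

    category.merge(name: "#{category[:name]} (#{seen[key]})")
  end
end
```

This only handles duplicates inside the batch being imported. It does not check names that already exist in the target database from an earlier run, and it does not guard against a renamed category colliding with another existing name.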
It finally crashed here:
script/bulk_import/vbulletin5.rb:779:in `<main>'
572329 - 531/sec
Importing replies...
client_loop: send disconnect: Connection reset
Before that, though, it had been spamming these two errors more or less nonstop:
ERROR: undefined method `gsub!' for nil:NilClass
script/bulk_import/vbulletin5.rb:727:in `preprocess_raw'
script/bulk_import/vbulletin5.rb:369:in `block in import_topic_first_posts'
/var/www/discourse/script/bulk_import/base.rb:723:in `block (2 levels) in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/script/bulk_import/base.rb:721:in `block in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/pg-1.4.5/lib/pg/connection.rb:196:in `copy_data'
/var/www/discourse/script/bulk_import/base.rb:720:in `create_records'
/var/www/discourse/script/bulk_import/base.rb:367:in `create_posts'
script/bulk_import/vbulletin5.rb:361:in `import_topic_first_posts'
script/bulk_import/vbulletin5.rb:71:in `execute'
/var/www/discourse/script/bulk_import/base.rb:98:in `run'
script/bulk_import/vbulletin5.rb:779:in `<main>'
and
ERROR: invalid byte sequence in UTF-8
script/bulk_import/vbulletin5.rb:727:in `gsub!'
script/bulk_import/vbulletin5.rb:727:in `preprocess_raw'
script/bulk_import/vbulletin5.rb:369:in `block in import_topic_first_posts'
/var/www/discourse/script/bulk_import/base.rb:723:in `block (2 levels) in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/rack-mini-profiler-3.0.0/lib/patches/db/mysql2/alias_method.rb:8:in `each'
/var/www/discourse/script/bulk_import/base.rb:721:in `block in create_records'
/var/www/discourse/vendor/bundle/ruby/2.7.0/gems/pg-1.4.5/lib/pg/connection.rb:196:in `copy_data'
/var/www/discourse/script/bulk_import/base.rb:720:in `create_records'
/var/www/discourse/script/bulk_import/base.rb:367:in `create_posts'
script/bulk_import/vbulletin5.rb:361:in `import_topic_first_posts'
script/bulk_import/vbulletin5.rb:71:in `execute'
/var/www/discourse/script/bulk_import/base.rb:98:in `run'
script/bulk_import/vbulletin5.rb:779:in `<main>'
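Both errors come from calling `gsub!` on a value that is either nil or not valid UTF-8. A minimal sketch of a guard that could run before any regex-based preprocessing is shown below. `sanitize_raw` is a hypothetical name and is not the importer's actual code. It assumes the bytes are meant to be UTF-8; if the source data is really Latin-1, transcoding with `String#encode` would preserve more than scrubbing does.

```ruby
# Hypothetical guard, not part of script/bulk_import/vbulletin5.rb.
# Returns a String that is always safe to pass to gsub/gsub!.
def sanitize_raw(raw)
  # A nil body becomes an empty string instead of raising NoMethodError.
  return "" if raw.nil?

  # Work on a copy so the caller's string is left untouched.
  text = raw.dup.force_encoding(Encoding::UTF_8)

  # Replace invalid byte sequences with U+FFFD so regex calls stop raising
  # "invalid byte sequence in UTF-8".
  text.scrub("\uFFFD")
end
```

Scrubbing is lossy: every invalid byte sequence turns into a replacement character, so it is worth logging the affected row IDs to review them after the import.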
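Note that I went step by step: I enabled one function at a time by commenting out the rest, ran rake import:ensure_consistency before moving on, then commented out the functions that had already run, and so on. If I simply let the whole script rerun the earlier steps, it would crash on the duplicate IDs it found.
Before the usual "you can't complain about free software" argument comes up, let me clarify that I contribute to other open source projects and write free software myself. For me, the most important thing is that if I release something, it should work and be well documented (if only to avoid thousands of legitimate "how does this work" questions), or else I should be ready to fix whatever bugs come up.
Discourse seems to offer a good out-of-the-box experience, but it should be clear that this is 2022 and communities have existed far longer than this product. "Adoption" requires strong migration support, and that does not seem to be where Discourse is right now.
I admit a 20 GB database is an edge case, but size is not the problem we hit. The problem is the character set, or something else unknown, because there is not even one consistent error. And for the most part there is no documentation: all you can do is search for posts and clues left by people who went through the same pain, hoping to find a workaround and hoping the source code has not changed much since then.
At this point, I strongly recommend that anyone migrating from vbulletin put any migration on hold until the overhaul of the migration scripts (which seems to be in progress?) is complete.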