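I'm working on a Yahoo Groups mbox import but have run into some errors. I'm not sure which direction to take with debugging and the import. Here are the errors I'm seeing so far: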
https://pastebin.com/raw/2WTN3GTj
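You're using the mbox script, right? It worked well for me, with no errors. Attachments are missing, but that's not a big problem for me.
That's right, @tobiaseigen. The import ran for more than 2 hours.
Besides the last question, I'd like to add one more point: I'm not sure whether I should continue the import with these failures. I'm wondering whether, if I import again after fixing the errors/failures, the system will skip the messages that were already imported and carry on with a normal import.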
Not sure sidekiq is relevant here - the import script runs outside discourse I think.
In case it helps, here’s my import log. There are in fact a few lines that are similar to yours, but I just decided not to worry about it. Life is too short.
Since you have so many errors, you seem to have a more systematic problem. Are you sure the system has enough RAM available? If you haven't already, you may want to look at the import file more closely and see whether anything stands out - maybe you just need to adjust the split_regex in some way, or upload the file to your server in a different format?
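One way to look at the import file more closely is to split it the way an mbox reader would and list the messages that have no Date header. This is only a rough sketch: the helper name `messages_missing_date` is made up for illustration, and the default separator pattern `/^From .+/` is an assumption, so pass whatever split_regex your import is actually configured with.

```ruby
# Return the zero-based positions of messages in an mbox text whose header
# block contains no Date field. The default separator pattern is an
# assumption; pass the split_regex your import actually uses.
def messages_missing_date(mbox_text, split_regex = /^From .+/)
  messages = []
  mbox_text.each_line do |line|
    if line.match?(split_regex)
      messages << []            # a separator line opens a new message
    elsif messages.any?
      messages.last << line     # lines before the first separator are ignored
    end
  end

  messages.each_index.select do |i|
    # Headers run from the start of the message to the first blank line.
    headers = messages[i].take_while { |l| !l.match?(/\A\s*\z/) }
    headers.none? { |l| l.match?(/\ADate:/i) }
  end
end
```

To try it on a real file, read the file in binary mode so that odd encodings don't raise, for example `messages_missing_date(File.binread("/shared/import/data/list/18929486-3.mbox"))`. If the count is far higher than expected, the separator is probably splitting in the wrong places.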
If you keep having trouble, you could ask for help in marketplace - there are some consultants hanging out here who are quite experienced at doing imports. I’m certainly no expert - this was my first attempt.
root@discourse:/var/discourse# ./launcher enter import
root@discourse-import:/var/www/discourse# RAILS_DB=secondsite
root@discourse-import:/var/www/discourse# export RAILS_DB
root@discourse-import:/var/www/discourse# import_mbox.sh
The mbox import is starting...
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...
creating index
indexing files in /shared/import/data/list
indexing /shared/import/data/list/18929486-3.mbox
indexing /shared/import/data/list/18929486-2.mbox
indexing replies and users
creating categories
1 / 1 (100.0%) [4916421 items/min]
creating users
69 / 69 (100.0%) [1178 items/min] ]
creating topics and posts
Date is missing. Skipping 0462b41b966d8c11e6e32cc14c0b576d
1 / 2333 ( 0.0%) [179689 items/min] Date is missing. Skipping 0adb9bd80082595a33130f7749d7f530
2 / 2333 ( 0.1%) [224693 items/min] Date is missing. Skipping 3bd86d7adb396fbeb7d6dfcfe9f0be5f
3 / 2333 ( 0.1%) [283328 items/min] Date is missing. Skipping 4f5397838e6c7f96eedfe116ce27be13
4 / 2333 ( 0.2%) [184374 items/min] Date is missing. Skipping c8c14ab80e92ae1cacd4af99351319bd
45 / 2333 ( 1.9%) [334 items/min] Failed to map post for 2f401ce90708241252h30bdae5iad2ae0096e067b71@mail.gmail.com
undefined method `hex' for nil:NilClass
/var/www/discourse/app/models/upload.rb:132:in `base62_sha1'
/var/www/discourse/app/models/upload.rb:386:in `short_url_basename'
/var/www/discourse/app/models/upload.rb:115:in `short_url'
/var/www/discourse/lib/upload_markdown.rb:17:in `image_markdown'
/var/www/discourse/lib/upload_markdown.rb:10:in `to_markdown'
/var/www/discourse/lib/email/receiver.rb:1085:in `block in add_attachments'
/var/www/discourse/lib/email/receiver.rb:1060:in `each'
/var/www/discourse/lib/email/receiver.rb:1060:in `add_attachments'
/var/www/discourse/script/import_scripts/mbox/importer.rb:137:in `format_raw'
/var/www/discourse/script/import_scripts/mbox/importer.rb:121:in `map_post'
/var/www/discourse/script/import_scripts/mbox/importer.rb:145:in `map_first_post'
/var/www/discourse/script/import_scripts/mbox/importer.rb:103:in `block (2 levels) in import_posts'
/var/www/discourse/script/import_scripts/base.rb:491:in `block in create_posts'
/var/www/discourse/script/import_scripts/base.rb:490:in `each'
/var/www/discourse/script/import_scripts/base.rb:490:in `create_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:97:in `block in import_posts'
/var/www/discourse/script/import_scripts/base.rb:870:in `block in batches'
/var/www/discourse/script/import_scripts/base.rb:869:in `loop'
/var/www/discourse/script/import_scripts/base.rb:869:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:83:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:91:in `import_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:35:in `execute'
/var/www/discourse/script/import_scripts/base.rb:47:in `perform'
script/import_scripts/mbox.rb:16:in `<module:Mbox>'
script/import_scripts/mbox.rb:10:in `<module:ImportScripts>'
script/import_scripts/mbox.rb:9:in `<main>'
940 / 2333 ( 40.3%) [398 items/min] Failed to map post for BBCAF42471FF9540868B4DC02B885B1BBCDA1F@wn1217.or.providence.org
undefined method `hex' for nil:NilClass
/var/www/discourse/app/models/upload.rb:132:in `base62_sha1'
/var/www/discourse/app/models/upload.rb:386:in `short_url_basename'
/var/www/discourse/app/models/upload.rb:115:in `short_url'
/var/www/discourse/lib/upload_markdown.rb:17:in `image_markdown'
/var/www/discourse/lib/upload_markdown.rb:10:in `to_markdown'
/var/www/discourse/lib/email/receiver.rb:1085:in `block in add_attachments'
/var/www/discourse/lib/email/receiver.rb:1060:in `each'
/var/www/discourse/lib/email/receiver.rb:1060:in `add_attachments'
/var/www/discourse/script/import_scripts/mbox/importer.rb:137:in `format_raw'
/var/www/discourse/script/import_scripts/mbox/importer.rb:121:in `map_post'
/var/www/discourse/script/import_scripts/mbox/importer.rb:159:in `map_reply'
/var/www/discourse/script/import_scripts/mbox/importer.rb:105:in `block (2 levels) in import_posts'
/var/www/discourse/script/import_scripts/base.rb:491:in `block in create_posts'
/var/www/discourse/script/import_scripts/base.rb:490:in `each'
/var/www/discourse/script/import_scripts/base.rb:490:in `create_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:97:in `block in import_posts'
/var/www/discourse/script/import_scripts/base.rb:870:in `block in batches'
/var/www/discourse/script/import_scripts/base.rb:869:in `loop'
/var/www/discourse/script/import_scripts/base.rb:869:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:83:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:91:in `import_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:35:in `execute'
/var/www/discourse/script/import_scripts/base.rb:47:in `perform'
script/import_scripts/mbox.rb:16:in `<module:Mbox>'
script/import_scripts/mbox.rb:10:in `<module:ImportScripts>'
script/import_scripts/mbox.rb:9:in `<main>'
944 / 2333 ( 40.5%) [399 items/min] Failed to map post for 3A1D6C799D451B41BD0500303339622A023AA1@s-mail.integral-corp.com
undefined method `hex' for nil:NilClass
/var/www/discourse/app/models/upload.rb:132:in `base62_sha1'
/var/www/discourse/app/models/upload.rb:386:in `short_url_basename'
/var/www/discourse/app/models/upload.rb:115:in `short_url'
/var/www/discourse/lib/upload_markdown.rb:17:in `image_markdown'
/var/www/discourse/lib/upload_markdown.rb:10:in `to_markdown'
/var/www/discourse/lib/email/receiver.rb:1085:in `block in add_attachments'
/var/www/discourse/lib/email/receiver.rb:1060:in `each'
/var/www/discourse/lib/email/receiver.rb:1060:in `add_attachments'
/var/www/discourse/script/import_scripts/mbox/importer.rb:137:in `format_raw'
/var/www/discourse/script/import_scripts/mbox/importer.rb:121:in `map_post'
/var/www/discourse/script/import_scripts/mbox/importer.rb:159:in `map_reply'
/var/www/discourse/script/import_scripts/mbox/importer.rb:105:in `block (2 levels) in import_posts'
/var/www/discourse/script/import_scripts/base.rb:491:in `block in create_posts'
/var/www/discourse/script/import_scripts/base.rb:490:in `each'
/var/www/discourse/script/import_scripts/base.rb:490:in `create_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:97:in `block in import_posts'
/var/www/discourse/script/import_scripts/base.rb:870:in `block in batches'
/var/www/discourse/script/import_scripts/base.rb:869:in `loop'
/var/www/discourse/script/import_scripts/base.rb:869:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:83:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:91:in `import_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:35:in `execute'
/var/www/discourse/script/import_scripts/base.rb:47:in `perform'
script/import_scripts/mbox.rb:16:in `<module:Mbox>'
script/import_scripts/mbox.rb:10:in `<module:ImportScripts>'
script/import_scripts/mbox.rb:9:in `<main>'
1149 / 2333 ( 49.2%) [408 items/min] Failed to map post for FF35EE5B30156244A4370DC859B7F650F50626@s-mail.integral-corp.com
undefined method `hex' for nil:NilClass
/var/www/discourse/app/models/upload.rb:132:in `base62_sha1'
/var/www/discourse/app/models/upload.rb:386:in `short_url_basename'
/var/www/discourse/app/models/upload.rb:115:in `short_url'
/var/www/discourse/lib/upload_markdown.rb:17:in `image_markdown'
/var/www/discourse/lib/upload_markdown.rb:10:in `to_markdown'
/var/www/discourse/lib/email/receiver.rb:1085:in `block in add_attachments'
/var/www/discourse/lib/email/receiver.rb:1060:in `each'
/var/www/discourse/lib/email/receiver.rb:1060:in `add_attachments'
/var/www/discourse/script/import_scripts/mbox/importer.rb:137:in `format_raw'
/var/www/discourse/script/import_scripts/mbox/importer.rb:121:in `map_post'
/var/www/discourse/script/import_scripts/mbox/importer.rb:159:in `map_reply'
/var/www/discourse/script/import_scripts/mbox/importer.rb:105:in `block (2 levels) in import_posts'
/var/www/discourse/script/import_scripts/base.rb:491:in `block in create_posts'
/var/www/discourse/script/import_scripts/base.rb:490:in `each'
/var/www/discourse/script/import_scripts/base.rb:490:in `create_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:97:in `block in import_posts'
/var/www/discourse/script/import_scripts/base.rb:870:in `block in batches'
/var/www/discourse/script/import_scripts/base.rb:869:in `loop'
/var/www/discourse/script/import_scripts/base.rb:869:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:83:in `batches'
/var/www/discourse/script/import_scripts/mbox/importer.rb:91:in `import_posts'
/var/www/discourse/script/import_scripts/mbox/importer.rb:35:in `execute'
/var/www/discourse/script/import_scripts/base.rb:47:in `perform'
script/import_scripts/mbox.rb:16:in `<module:Mbox>'
script/import_scripts/mbox.rb:10:in `<module:ImportScripts>'
script/import_scripts/mbox.rb:9:in `<main>'
2328 / 2333 ( 99.8%) [467 items/min]
Updating topic status
Updating bumped_at on topics
Updating last posted at on users
Updating last seen at on users
Updating topic reply counts...
70 / 70 (100.0%) [10745 items/min]
Updating first_post_created_at...
Updating user post_count...
Updating user topic_count...
Updating topic users
Updating post timings
Updating featured topic users
Updating featured topics in categories
9 / 9 (100.0%) [2505 items/min] n]
Updating user topic reply counts
70 / 70 (100.0%) [9174 items/min] ]
Resetting topic counters
Done (00h 06min 58sec)
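Reading through the log above, five messages were skipped with "Date is missing" and four failed with the `hex` error. All four of those failures pass through `add_attachments`, so they appear to be tied to attachments. For a longer log, a small script can do the same tally. This is a sketch only: `summarize_import_log` is a hypothetical helper, and it assumes the log lines are worded exactly as in the output above.

```ruby
# Summarize an import log: collect the IDs reported as skipped for a missing
# date and the Message-IDs for which mapping failed. The patterns assume the
# exact wording seen in the log above.
def summarize_import_log(log_text)
  {
    missing_date: log_text.scan(/Date is missing\. Skipping (\S+)/).flatten,
    failed_to_map: log_text.scan(/Failed to map post for (\S+)/).flatten
  }
end
```

If you saved the console output to a file (say `import.log`, a name chosen here for illustration), `summarize_import_log(File.read("import.log"))` gives you two lists of IDs that you can look up in the mbox files.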
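So I just let it carry on (I'll look at the errors later), but now something very strange has happened. I tried to import these messages into a folder named "old-yahoo-group" by first creating that category in the system and then pushing all the mbox folders to the following directory: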
/var/discourse/shared/standalone/import/data/old-yahoo-group
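I thought I understood the instructions to mean that the messages would show up in the corresponding category after the import, but they are hidden across the whole system.
I can find the old messages through search, but they don't appear in any of the aggregated views.
How can I adjust this last import so that it goes into a designated category, with all ~35,000 messages shown in one convenient section and marked as old messages?
After looking further, I seem to have found the cause:
Now I need to figure out how to recover from this...
The following worked perfectly (provided the old-yahoo-group category has already been created and there are no other uncategorized posts in the system (in fact, that option was already disabled in the settings)):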
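# On the host: enter the app container and open a Rails console
/var/discourse/launcher enter app
rails c
# In the console: look up both categories, then move every uncategorized topic
un = Category.find_by_slug('uncategorized')
newcat = Category.find_by_slug('old-yahoo-group')
Topic.where(category_id: un.id).update_all(category_id: newcat.id)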
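If either slug is mistyped, `find_by_slug` returns nil and the last line fails with a `NoMethodError` on `.id`. A slightly more defensive version of the same steps might look like the sketch below. The helper name `move_topics` and its keyword arguments are made up for illustration. The model classes are passed in as arguments only so the helper can be tried outside Rails.

```ruby
# Move every topic from one category to another, refusing to run when either
# slug does not resolve. category_model is expected to respond to
# find_by_slug, and topic_model to where(...).update_all(...), as the
# Category and Topic models do in the console commands above.
def move_topics(category_model, topic_model, from_slug:, to_slug:)
  from = category_model.find_by_slug(from_slug)
  to = category_model.find_by_slug(to_slug)
  raise ArgumentError, "category not found: #{from_slug}" if from.nil?
  raise ArgumentError, "category not found: #{to_slug}" if to.nil?

  # update_all runs one bulk update without model callbacks; the caller gets
  # back whatever it returns (in ActiveRecord, the number of rows changed).
  topic_model.where(category_id: from.id).update_all(category_id: to.id)
end
```

In the Rails console this would be called as `move_topics(Category, Topic, from_slug: 'uncategorized', to_slug: 'old-yahoo-group')`. Because `update_all` skips callbacks, the topic counts shown on the categories may need to be refreshed separately.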
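By the way, I had a similar experience. For some reason the import script ignored the category I had created, even though its slug was the same. It created a new category for me instead, so I didn't run into a problem. I just deleted the category I had created and renamed the one the script created.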