I don’t have mbox files, and I’m not aware of any way to get them; Yahoo certainly won’t let me download them. Do you know of something that would convert JSON to mbox? Google shows a number of tools for going in the other direction, but a quick search doesn’t turn up anything for this one.
I’d expected that, since there were existing scripts designed specifically to migrate Yahoo groups, those scripts would actually work, and that using them would be the most straightforward way to accomplish this task. It appears my expectation was optimistic. The scripts “work” in that they migrate the messages, and they kind of migrate the users, but missing most of the email addresses and assigning most of the messages to the wrong user is a bit of a problem.
The thing that’s frustrating me is that it seems like this should be a trivial fix for someone who actually knows a thing or two about Ruby, but unfortunately I’m not such a person (I’m trying, but there’s never enough time for everything). My group is small enough that I can probably fix it manually if I need to, but I’d rather not need to. Even more to the point, I’m trying to come up with a general method that other Yahoo group owners can use.
Edit: I guess I should be glad that I’m managing as much as I am in a language I really don’t know anything about, but I still feel like there’s something major (that should be obvious) that I’m missing. I’ve tried using a different method with the Mail gem. The portion of import_users that I’ve edited reads as follows:
create_users(profiles.to_a) do |u|
  user_id = user_id + 1
  # fetch last message for profile to pickup latest user info as this may have changed
  user_info = @collection.find("ygData.profile": u["_id"]["profile"]).sort("ygData.msgId": -1).limit(1).to_a[0]
  # Store user_id to profile lookup
  @user_profile_map.store(user_info["ygData"]["profile"], user_id)
  puts "User created: #{user_info["ygData"]["profile"]}"
  user_email = Mail::Address.new(HTMLEntities.new.decode(user_info["ygData"]["from"]))
  user =
    {
      id: user_id, # yahoo "userId" sequence appears to have changed mid forum life so generate this
      username: user_info["ygData"]["profile"],
      name: user_info["ygData"]["authorName"],
      email: user_email.address, # mandatory
      created_at: Time.now
    }
  user
end
And it works! Well, mostly. Of 302 distinct users counted by the script, it imports 289. They show up on the admin page with the correct usernames, full names (when provided), and email addresses. The script says it imports all 302 and reports no errors. But when it starts importing topics, I get this:
Importing discussions
Topic: 1 / 12232 (0.01%) Subject: Newspapers
Topic: 2 / 12232 (0.02%) Subject: Ents
Traceback (most recent call last):
8: from script/import_scripts/yahoogroup.rb:168:in `<main>'
7: from /home/dan/discourse/script/import_scripts/base.rb:47:in `perform'
6: from script/import_scripts/yahoogroup.rb:40:in `execute'
5: from script/import_scripts/yahoogroup.rb:101:in `import_discussions'
4: from script/import_scripts/yahoogroup.rb:101:in `each_with_index'
3: from script/import_scripts/yahoogroup.rb:101:in `each'
2: from script/import_scripts/yahoogroup.rb:132:in `block in import_discussions'
1: from /home/dan/discourse/script/import_scripts/base.rb:535:in `create_post'
/home/dan/.rbenv/versions/2.6.2/lib/ruby/gems/2.6.0/gems/activerecord-6.0.0/lib/active_record/core.rb:177:in `find': Couldn't find User with 'id'=298 (ActiveRecord::RecordNotFound)
…which isn’t surprising, since the highest user id is 290.
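One possible reading of these numbers, offered as a guess rather than a diagnosis: the counter in the script reaches 302, but the forum assigns its own ids and only 289 users are actually stored, so a counter value such as 298 never exists as a real user id. The sketch below reproduces that effect in isolation using only the standard library. Everything in it is hypothetical: `FakeUserStore`, `extract_address`, the sample profiles, and the assumption that two profiles sharing one email address end up as a single stored user. `CGI.unescapeHTML` stands in for the HTMLEntities and Mail gems used in the real script.

```ruby
require "cgi"

# Hypothetical stand-in for the forum's user table: it assigns its own ids
# and (by assumption) reuses an existing row when the email is already known.
class FakeUserStore
  def initialize
    @ids_by_email = {}
    @next_id = 1
  end

  # Returns the stored id for this email, creating a new row only if needed.
  def find_or_create(email)
    @ids_by_email[email] ||= begin
      id = @next_id
      @next_id += 1
      id
    end
  end

  def max_id
    @next_id - 1
  end
end

# Decode HTML entities, then take the address between angle brackets
# (or the whole value when there are no brackets).
def extract_address(raw_from)
  decoded = CGI.unescapeHTML(raw_from)
  (decoded[/<([^<>]+)>/, 1] || decoded).strip.downcase
end

# Made-up sample data: two profiles share one email address.
profiles = [
  { profile: "alice",     from: "&quot;Alice&quot; &lt;alice@example.com&gt;" },
  { profile: "alice_old", from: "Alice@example.com" },
  { profile: "bob",       from: "Bob &lt;bob@example.com&gt;" }
]

store = FakeUserStore.new
import_id = 0
import_id_for_profile = {} # what the script's counter hands out
db_id_for_import_id = {}   # what the store actually assigned

profiles.each do |p|
  import_id += 1
  import_id_for_profile[p[:profile]] = import_id
  db_id_for_import_id[import_id] = store.find_or_create(extract_address(p[:from]))
end

bob_import_id = import_id_for_profile["bob"]
puts "counter reached #{import_id}, highest stored id is #{store.max_id}"
puts "bob: import id #{bob_import_id}, stored id #{db_id_for_import_id[bob_import_id]}"
```

If this is what is happening, posts would need to be attributed through a lookup from the script's id to the stored id rather than with the counter value directly. Discourse's base importer appears to offer such a lookup (`user_id_from_imported_user_id`), though that name should be checked against the `base.rb` in your checkout before relying on it.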