How to migrate from Yahoo groups to Discourse

I’d started a topic (Migration from Yahoo! Groups) for some guidance on this, and another topic (Yahoo Groups to Discourse migration?) asks about it, so here’s what I came up with to migrate.

Background

Yahoo announced a few weeks ago that they’re significantly curtailing the services available through Yahoo Groups. As of 28 Oct 2019, they’ve disabled uploading of user content. On 14 Dec 2019, they say they will remove all uploaded content, including message archives. I stress this last point because it wasn’t obvious to me initially, and it made migrating the group much more urgent. They say that groups will remain usable as mailing lists after that date, but with no archives.

There’s another service at groups.io that is pretty much a turnkey replacement for Yahoo Groups, and it’s the obvious choice for someone wanting to migrate a Yahoo group: they’ll handle moving everything, and users will keep the same interface they’re used to. The latter point, IMO, is one of the larger drawbacks to this service; the other is the $220 cost to migrate a group. I felt that, if I were going to bother migrating a group, it would be good to update the interface to something more modern that could still be used as a mailing list.

If your group has lots of photos or uploaded files, you might want to look at a different method of hosting them. Otherwise, you can post them into topics on your site, perhaps in separate categories. If you have other types of data there (e.g., databases or calendars) that you want to save, I’ll need to leave to others the best way to migrate those.

Preparation

Key to this process is Yahoo’s “Get my data” tool, which will allow you to download certain data from your groups. Specifically, it will let you get:

  • All messages for all groups of which you are a member, and
  • All uploaded files (but not photos) for all groups of which you are a member.

The downloaded messages come in .mbox format, and include full email addresses, regardless of whether you’re a moderator or admin for the group.

This tool lets you submit a request. Once Yahoo processes it, you’ll get an email notification that your download is ready; that took about a week for me.

As noted above, the “Get my data” tool does not download photos. For those, I used yahoo-group-archiver (IgnoredAmbience/yahoo-group-archiver on GitHub), which scrapes and archives a Yahoo group’s email archives, photo galleries and file contents using the non-public API. It downloads all the other data too (and there’s no way to limit it AFAIK), but it will get the photos along with their metadata.

Installation

Install Discourse following INSTALL-cloud.md in the discourse/discourse repository on GitHub, on a VPS host of your choosing (I use contabo.com, but there’s no shortage of VPS providers). Get a domain if you don’t already have one (freenom.com if you want one for free; easydns.com or name.com work well for me as paid registrars). Set up DNS using your preferred host (I like cloudflare.com for this). Set up outgoing email (I used mailgun.com) and incoming email following the topic Direct-delivery incoming email for self-hosted sites.

Configure your installation as desired; the import won’t overwrite anything you’ve already configured.

Migrate messages

The “Get my data” tool will give you a single .zip file. It will have a directory for every group of which you’re a member, and in each directory will be messages.zip and files.zip. When you unzip messages.zip, you’ll have .mbox files containing all messages from the group, split into as many 10 MB files as are necessary to hold them (there were 15 of them for the 38,000 messages in the group I was migrating). Once you have those, you can follow the instructions at Importing / migrating mailing lists (mbox, Listserv, Google Groups, emails, ...) to import them into your Discourse installation. If you have existing users, the script will match messages to those users by email address. A new user is created for any email address that doesn’t match an existing user.
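Before running the import, it can be worth checking that the .mbox files unpacked cleanly and hold roughly the number of messages you expect. Here is a minimal Python sketch of that check, using only the standard library. The function names and the paths you pass in are hypothetical, chosen for illustration; nothing here comes from Yahoo or from the Discourse import script.

```python
import mailbox
import zipfile
from pathlib import Path


def extract_messages(messages_zip, dest_dir):
    """Unpack messages.zip into dest_dir and return the .mbox files found there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(messages_zip) as archive:
        archive.extractall(dest)
    # Search recursively in case the archive contains subdirectories.
    return sorted(dest.rglob("*.mbox"))


def count_messages(mbox_paths):
    """Return a dict mapping each mbox file name to its message count."""
    counts = {}
    for path in mbox_paths:
        # create=False: never create an empty mailbox if the path is wrong.
        box = mailbox.mbox(str(path), create=False)
        try:
            counts[path.name] = len(box)
        finally:
            box.close()
    return counts
```

For example, calling extract_messages("messages.zip", "mbox") and then sum(count_messages(paths).values()) gives a total you can compare against the message count Yahoo shows for the group.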

Issues

Since user creation is email-based, Yahoo users who have changed email addresses over time will end up as multiple users in your Discourse installation. The Merge Users Plugin should address this, though identifying the duplicate accounts will still be a manual process.
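One way to shorten that manual search is to scan the .mbox files for display names that appear with more than one email address. The Python sketch below does this with the standard library. It assumes that the same person used the same display name across addresses, which will not always hold, and it does not decode RFC 2047 encoded names. The function names are hypothetical, and the output is only a list of candidates to review by hand.

```python
import mailbox
from collections import defaultdict
from email.utils import parseaddr


def addresses_by_name(mbox_paths):
    """Map each lowercased display name to the set of addresses it posted from."""
    names = defaultdict(set)
    for path in mbox_paths:
        box = mailbox.mbox(str(path), create=False)
        try:
            for message in box:
                # parseaddr splits "Jane Doe <jane@example.com>" into name and address.
                name, address = parseaddr(str(message.get("From", "")))
                if name and address:
                    names[name.strip().lower()].add(address.lower())
        finally:
            box.close()
    return names


def merge_candidates(mbox_paths):
    """Return only the names seen with two or more distinct addresses."""
    return {
        name: sorted(addresses)
        for name, addresses in addresses_by_name(mbox_paths).items()
        if len(addresses) > 1
    }
```

Common names such as “John” will produce false positives, so treat each entry as a prompt to look at the accounts rather than as proof that they belong to one person.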

Conclusion

That’s where I am at this point: my site is working, the messages are there, the users are there, and now it’s down to tinkering and adjusting. I hope this helps others take the plunge and move their Yahoo groups to a Discourse-based site rather than stay with more of the same.
