Did you look into either of these projects?
- icy/google-group-crawler: [Deprecated] Get (almost) original messages from google group archives. Your data is yours.
- henryk/gggd: Get Google Groups Data – Tool to fetch all messages of a Google Group as raw mbox data
Both output mbox data, which Discourse already has an import script for.
So as long as you can get either of those projects to output clean mbox data that the importer script can work with, we’re golden.
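If it helps, here is a rough way to sanity-check that an mbox file is "clean" before handing it to the importer. It is only a sketch using Python's standard `mailbox` module. The list of headers to check is my own assumption about what an importer would likely need, not something taken from the Discourse script, and `check_mbox` is a made-up helper name.

```python
import mailbox

# Assumption: these are the headers worth checking. Adjust the list to
# whatever the importer you use actually relies on.
REQUIRED_HEADERS = ("Message-ID", "From", "Date", "Subject")


def check_mbox(path):
    """Return (total, problems) for the mbox file at `path`.

    `problems` is a list of (message_index, missing_headers) tuples for
    every message that lacks one or more of REQUIRED_HEADERS.
    """
    # create=False makes a missing file raise an error instead of
    # silently creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    problems = []
    total = 0
    try:
        for index, message in enumerate(box):
            total += 1
            missing = [h for h in REQUIRED_HEADERS if not message.get(h)]
            if missing:
                problems.append((index, missing))
    finally:
        box.close()
    return total, problems
```

Calling `check_mbox("group.mbox")` (the filename is hypothetical) on the scraper output gives you the message count and the indexes of any messages with missing headers, which should tell you quickly whether the scrape needs cleaning up first.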
P.S. @pacharanero also took the scraper approach and successfully migrated several sites with it.