Parsing/importing portions of very large JSON export from another forum

I have a forum we’ve just migrated with a lot of discussions (unfortunately not all of them, about 500 of the thousands) that need to be archived/accessible for the life of the project. It’s coming from commonwealth.im - a fairly similar platform.

I have the entire history exported by their API as JSON files - communities, topics, threads, comments and reactions - three of them fairly large (3-16 MB each). (Two of those I don’t need, the small ones…)
It doesn’t even really matter if they’re imported in their original Thread > comment > comment structure. Each thread plus its comments can just be combined into a single transcript-style message, something like the sketch below. I’ll be adding them as “new threads” in the “archive” topic and immediately locking them anyway.
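To make that concrete, here’s roughly what I mean as an untested Python sketch. The file names and field names (“id”, “title”, “body”, “author”, “thread_id”, “created_at”) are guesses, not what the commonwealth.im export actually uses:

```python
import json

# Untested sketch: combine each thread and its comments into one
# transcript-style body. File names and field names are guesses at
# the export schema, not confirmed against the real dump.
with open("threads.json") as f:
    threads = json.load(f)
with open("comments.json") as f:
    comments = json.load(f)

# Group comments under their parent thread.
by_thread = {}
for c in comments:
    by_thread.setdefault(c["thread_id"], []).append(c)

# Build one markdown transcript per thread.
transcripts = {}
for t in threads:
    parts = [f"**{t.get('author', 'unknown')}** wrote:\n\n{t['body']}"]
    for c in sorted(by_thread.get(t["id"], []), key=lambda c: c.get("created_at", "")):
        parts.append(f"**{c.get('author', 'unknown')}** replied:\n\n{c['body']}")
    transcripts[t["title"]] = "\n\n---\n\n".join(parts)

print(f"Built {len(transcripts)} transcripts")
```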

Basically I just don’t know where to begin. I can figure out some level of ‘jq’, and I tried loading one of the JSON files into a web-based viewer, but it made my PC nearly unusable.
One of the support team here mentioned some script to dump JSON into an SQL db or something? I’m really not super familiar with most of this, but I can probably pick it up; I just need some direction to start with.

See https://github.com/discourse/discourse/blob/main/script/import_scripts/drupal_json.rb for an example, and look in that directory for other scripts that process JSON.


I just made my own thing here, although I still need to translate it into a form the Discourse API can handle.
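For anyone following along, this is roughly the shape I think that last step takes, assuming Python with the `requests` library and an admin API key; these look like the standard create-topic and close-topic endpoints, but I haven’t run this yet:

```python
import requests

BASE = "https://forum.example.com"  # placeholder: your Discourse URL
HEADERS = {
    "Api-Key": "REPLACE_ME",   # admin API key from /admin/api/keys
    "Api-Username": "system",
}

def post_archived_thread(title, transcript, category_id):
    # Create a new topic whose first post is the combined transcript.
    r = requests.post(f"{BASE}/posts.json", headers=HEADERS, json={
        "title": title,
        "raw": transcript,
        "category": category_id,
    })
    r.raise_for_status()
    topic_id = r.json()["topic_id"]

    # Immediately close (lock) the topic so it stays read-only.
    r = requests.put(f"{BASE}/t/{topic_id}/status.json", headers=HEADERS, json={
        "status": "closed",
        "enabled": "true",
    })
    r.raise_for_status()
    return topic_id
```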

The JSON import scripts just take the JSON, stick it into an SQL database, and read from that.
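In Python terms the pattern looks roughly like this; a sketch only, since the real import scripts are Ruby and the field names here are invented:

```python
import json
import sqlite3

# Flatten the JSON into a table once, then query it cheaply instead
# of re-parsing multi-megabyte files on every run. Column names are
# made up; match them to the actual export.
conn = sqlite3.connect("import.db")
conn.execute("""CREATE TABLE IF NOT EXISTS comments (
    id INTEGER PRIMARY KEY,
    thread_id INTEGER,
    author TEXT,
    body TEXT
)""")

with open("comments.json") as f:
    for c in json.load(f):
        conn.execute(
            "INSERT OR REPLACE INTO comments VALUES (?, ?, ?, ?)",
            (c["id"], c["thread_id"], c.get("author"), c["body"]),
        )
conn.commit()

# Plain SQL instead of jq gymnastics:
for author, body in conn.execute(
    "SELECT author, body FROM comments WHERE thread_id = ?", (42,)
):
    print(author, body)
```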


Eh, I know very, very little about any of this and even less about SQL, so that was where I decided to go, lol… it was kind of fun anyway.
