Advice on archiving a site

I’m looking less for “how to” advice and more for strategy advice.

I intend to retire comments from a blog. I’ve been using Discourse to provide comments for that blog. I have comments that I don’t want to lose, but I also don’t want to merely convert existing comments into static HTML to include alongside the posts. I have in mind to rewrite old posts, integrating content from the comments into the new versions, and to do this gradually.

I infer that I want to archive the Discourse site, so that I can then serve it locally and browse it and grab content in an “as I need it” fashion to rewrite or edit a post. I’m curious whether anyone has any “better” suggestions for what to do in order to be able to mine these comments for content later. I’m leaving “better” up to your interpretation, because I’d like to take advantage of the creativity of the group. Please don’t ask me what kind of “better” I have in mind; I want your unfiltered ideas. :slight_smile:

Many thanks.

Hey @jbrains — I’m moving this to Community since I think that’s a more appropriate category for this topic.

Perhaps the Wayback Machine?

Would this work for archiving your site?

Might not be the best option, but this older discussion seems like a useful start.

This reminds me of “Mirroring a read-only mailing list in Discourse”, but without the mailing list part.

Maybe running the site in read-only mode or development mode on your network is the way to go?

Also, if you only need to sift through post content and don’t need the Discourse UI, a simple Discourse Data Explorer query can let you download a CSV/JSON file with the data, which you can then search with other applications.
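
To make the “search it with other applications” part concrete, here is a minimal Python sketch for sifting through such an export. It assumes the CSV has a header row and that the post text sits in a column named `raw`; that column name, like the `topic_title` column and the `posts.csv` file name used below, is only an assumption for illustration and depends entirely on the query you write.

```python
import csv
from typing import Iterable, Iterator


def load_rows(path: str) -> Iterator[dict]:
    """Yield each row of the exported CSV as a dict keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        yield from csv.DictReader(f)


def search_posts(rows: Iterable[dict], keyword: str, column: str = "raw") -> list[dict]:
    """Return the rows whose `column` contains `keyword`, ignoring case.

    `column` defaults to "raw", a hypothetical name for the post text;
    change it to match whatever your export actually contains.
    """
    needle = keyword.casefold()
    # Missing or empty cells are treated as empty text rather than errors.
    return [row for row in rows if needle in (row.get(column) or "").casefold()]
```

Something like `search_posts(load_rows("posts.csv"), "refactoring")` would then give you the matching comments as dictionaries, and you could print `row.get("topic_title")` next to each hit to see which blog post it belongs to. A plain substring match is probably enough for occasional mining while rewriting a post; if the export gets large, loading it into SQLite or a full-text search tool may be more comfortable.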