Update!
This might be the answer:
Provides a simple archive of Discourse content
I looked at:
Improving Discourse static HTML archive.
It’s old.
I’m going to retire https://forum.talksurf.com/.
Yes, I’m going to archive a backup.
But what if I just want some browseable HTML files?
Should I just run archive-discourse.py from kitsandkats/ArchiveDiscourse (master branch) on GitHub?
Or is there something better?
Thanks in advance!
CC: @pfaffman
Aloha,
Justin
Would something like Wayback Machine be similar?
This worked. I had to make a slight code update.
Pull request: master ← justin808:patch-1, opened 11:59PM - 10 Jul 25 UTC
pfaffman
(Jay Pfaffman)
July 11, 2025, 10:36pm
4
But not much older than your Discourse version!
I have had some luck mirroring sites with wget. Something like:
wget --mirror --page-requisites --convert-links --adjust-extension --compression=auto --reject-regex "/search" --no-if-modified-since --no-check-certificate --execute robots=off --random-wait --wait=1 --user-agent="Googlebot/2.1 (+http://www.google.com/bot.html)" --no-cookies --header "Cookie: _t=$COOKIE" https://forum.talksurf.com/
But you need to get the cookie named _t from a logged-in browser session.
Send me an email and I’ll see what I can do.
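For anyone who would rather not paste the cookie into a shell command, here is a minimal Python sketch that builds the same wget invocation. It assumes the _t value has been exported in an environment variable called DISCOURSE_T_COOKIE; that name and the build_wget_command helper are made up for this example. The flags are copied from the command above.

```python
from __future__ import annotations

import os
import subprocess


def build_wget_command(site_url: str, t_cookie: str) -> list[str]:
    """Return the argument list for the wget mirror command from the post above."""
    return [
        "wget",
        "--mirror",
        "--page-requisites",
        "--convert-links",
        "--adjust-extension",
        "--compression=auto",
        "--reject-regex", "/search",
        "--no-if-modified-since",
        "--no-check-certificate",
        "--execute", "robots=off",
        "--random-wait",
        "--wait=1",
        "--user-agent=Googlebot/2.1 (+http://www.google.com/bot.html)",
        "--no-cookies",
        # The _t cookie is sent as a plain header, exactly as in the original command.
        "--header", f"Cookie: _t={t_cookie}",
        site_url,
    ]


if __name__ == "__main__":
    # DISCOURSE_T_COOKIE is a hypothetical variable name, not a Discourse or wget convention.
    cookie = os.environ["DISCOURSE_T_COOKIE"]
    subprocess.run(build_wget_command("https://forum.talksurf.com/", cookie), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the user-agent string and the cookie value.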
翔_贺
(翔 贺)
July 14, 2025, 1:50am
5
I’ve been doing this recently, and this is how I did it.
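# Serve a pre-rendered HTML file from public/ for the requested path.
def serve
  public_root = File.expand_path("../../public", File.dirname(__FILE__))
  file_path = File.expand_path("#{params[:path]}.#{params[:format]}", public_root)

  # Only serve files that resolve inside public/, so "../" in the path cannot escape it.
  if file_path.start_with?(public_root + File::SEPARATOR) && File.file?(file_path)
    send_file file_path, type: "text/html", disposition: "inline"
  else
    render plain: "404 Not Found", status: 404
  end
end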
Just to let you know, this does not pull the images with new URLs. The photos will still point to your server (which is about to be decommissioned!).
Jay kindly sent me the dump, and I compared it to mine.
His technique works better in the sense that it saves the images.
His internal links, though, point to the decommissioned site rather than to the archived articles. The articles themselves are all in the dump, with their images.
It would be a “nice to have” if Discourse supported a static export.
The good thing is that you have all the data, so a static exporter could be written that reads the data directly from a backup, if anyone had the inclination to do so.
But we’re not likely to write one.
pfaffman
(Jay Pfaffman)
July 15, 2025, 12:52am
9
It shouldn’t be too hard to fix the internal links, and it looks like they just need .html added.
I thought that --convert-links would fix those links…
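If the links really do just need .html appended, a post-processing pass over the mirror could look roughly like the sketch below. It is untested against the actual dump and makes a few assumptions: the mirrored pages are .html files under one directory, internal links are either root-relative or point at forum.talksurf.com, and any link whose last path segment has no extension is a page link. The function names are made up for this example, and the sketch only appends the suffix; it does not turn absolute links into relative ones.

```python
import re
import sys
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit

SITE_HOST = "forum.talksurf.com"  # host of the mirrored forum, taken from the thread

# Simplification: only double-quoted href attributes are handled.
HREF_RE = re.compile(r'href="([^"]+)"')


def add_html_suffix(url: str, site_host: str = SITE_HOST) -> str:
    """Append ".html" to the path of an internal page link; return other URLs unchanged."""
    parts = urlsplit(url)
    if parts.scheme not in ("", "http", "https"):
        return url  # mailto:, javascript:, and similar
    if parts.netloc not in ("", site_host):
        return url  # link to another site
    last_segment = parts.path.rsplit("/", 1)[-1]
    if not last_segment or "." in last_segment:
        return url  # directory-style path, or the path already has an extension
    return urlunsplit(parts._replace(path=parts.path + ".html"))


def fix_links_in_html(html: str, site_host: str = SITE_HOST) -> str:
    """Rewrite every double-quoted href in an HTML string."""
    return HREF_RE.sub(
        lambda m: 'href="{}"'.format(add_html_suffix(m.group(1), site_host)), html
    )


def fix_tree(root: Path) -> int:
    """Rewrite all .html files under root in place; return how many files changed."""
    changed = 0
    for page in root.rglob("*.html"):
        original = page.read_text(encoding="utf-8")
        fixed = fix_links_in_html(original)
        if fixed != original:
            page.write_text(fixed, encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    # Usage: python fix_links.py path/to/mirror
    print(fix_tree(Path(sys.argv[1])), "files updated")
```

Run it on a copy of the mirror first, since it rewrites files in place. Query strings and #fragments are kept, so a link to a specific post still lands on that post's anchor.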