Any updates on the best way to create a static HTML archive of a site?

Update!

This might be the answer:

I looked at:

Improving Discourse static HTML archive.

It’s old.

I’m going to retire https://forum.talksurf.com/.

Yes, I’m going to archive a backup.

But what if I just want some browseable HTML files?

Should I just run archive-discourse.py from kitsandkats/ArchiveDiscourse (master branch) on GitHub?

Or is there something better?

Thanks in advance!

CC: @pfaffman

Aloha,

Justin

Would something like the Wayback Machine be similar?

This worked. I had to make a slight code update.

But not much older than your Discourse version!

I have had some luck mirroring sites with wget. Something like:

wget --mirror --page-requisites --convert-links --adjust-extension --compression=auto --reject-regex "/search" --no-if-modified-since --no-check-certificate --execute robots=off --random-wait --wait=1 --user-agent="Googlebot/2.1 (+http://www.google.com/bot.html)" --no-cookies --header "Cookie: _t=$COOKIE" https://forum.talksurf.com/

But you first need to get the value of the cookie named _t (from a logged-in browser session) and put it in $COOKIE.

Send me an email and I’ll see what I can do.

I did this recently, and this is how I did it:

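def serve
  # Resolve the requested page against the public directory.
  public_dir = File.expand_path("../../public", File.dirname(__FILE__))
  file_path = File.expand_path("#{params[:path]}.#{params[:format]}", public_dir)
  # Serve only regular files inside public_dir, so "../" in the path cannot escape it.
  if file_path.start_with?(public_dir + File::SEPARATOR) && File.file?(file_path)
    send_file file_path, type: "text/html", disposition: "inline"
  else
    render plain: "404 Not Found", status: 404
  end
end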

Just so you know, that doesn't pull the images down with new URLs. The images will still point to your server (which is about to be decommissioned!).
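One way to check how many images are affected is to scan the archived HTML for img tags whose src still names the old host. This is only a rough sketch: remote_image_urls is a hypothetical helper, the regex assumes double-quoted src attributes, and the default host is simply the forum named in this thread.

```ruby
require "uri"

# Hypothetical helper: list <img> sources that still point at the old host.
# Assumes src attributes are double-quoted; a real HTML parser would be more robust.
def remote_image_urls(html, host = "forum.talksurf.com")
  html.scan(/<img\b[^>]*?\bsrc="([^"]+)"/i).flatten.select do |src|
    begin
      URI.parse(src).host == host
    rescue URI::InvalidURIError
      false # skip anything that is not a parseable URL
    end
  end.uniq
end
```

Running it over each archived page gives a list of files that would need to be downloaded before the server goes away.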

Jay kindly sent me the dump, and I compared it to mine.

His technique works better in the sense that it saves the images.

His internal links, though, point to the decommissioned site rather than to the archived articles. The articles themselves are all in the dump, with their images.

It would be a “nice to have” if Discourse supported a static export. :smile:

The good thing is that you have all the data, so an exporter could be written that reads the data directly from a backup, if anyone had the inclination to do so.

But we’re not likely to write one :wink:

It shouldn’t be too hard to fix the internal links, and it looks like they just need .html added.
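A minimal sketch of that fix, assuming the dump's links are absolute URLs to the old host and that each page was saved as its path plus .html. The localize_links helper and the /index.html name for the home page are my own assumptions, not something taken from the dump. Links with query strings are left alone.

```ruby
# Host of the forum being retired (from this thread).
OLD_HOST = "forum.talksurf.com"

# Hypothetical helper: turn absolute links to the old host into
# root-relative links ending in .html.
def localize_links(html, host = OLD_HOST)
  pattern = %r{href="https?://#{Regexp.escape(host)}(/[^"#?]*)?(#[^"]*)?"}
  html.gsub(pattern) do
    path = Regexp.last_match(1).to_s
    fragment = Regexp.last_match(2).to_s
    # Assumption: the home page was saved as /index.html.
    path = "/index" if path.empty? || path == "/"
    path = path.chomp("/")
    # Only add .html when the last path segment has no extension already.
    path += ".html" if File.extname(path).empty?
    %(href="#{path}#{fragment}")
  end
end
```

To apply it to a whole dump, something like Dir.glob("public/**/*.html") { |f| File.write(f, localize_links(File.read(f))) } should do, where public is whatever directory holds the files. Root-relative links only resolve when the archive is served from the web root, so browsing the files straight from disk would need relative paths instead.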