Importing from plain HTML of an old custom forum without source code or DB


(cosy) #1

Hello! I'm stuck on the case described in the subject.

Is there any way to import ‘plain’ HTML containing topics, usernames, and timestamps (including years)?
The old forum's Perl source code and DB are lost, so I only have the text.

Could anybody please give me some steps, tool recommendations, or practical ideas for parsing the HTML and importing it into Discourse?
Thank you.

(Jeff Atwood) #2

Probably start here

(cosy) #3

Should I use the base.rb script? And how should I prepare the data for it in this case?

Maybe a better approach would be to find a proper import tool for one of the platforms listed in the link above (those already supported for import into Discourse)?

It’s kind of a hard task for me, but it fits my broader goal of ‘learning some programming’.
I'd just be glad to get a roadmap to avoid difficulties.

(Jens Maier) #4

No, base.rb contains an abstract class that importers are supposed to inherit useful functions from. phpbb.rb contains an example of an actual and pretty straightforward importer. Parsing those old HTML files sounds much more difficult than replicating what the other importers do to push data into Discourse…
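
To give a rough idea of the parsing half, here is a minimal sketch in Ruby that turns one saved page into a list of post records (username, timestamp, text). Everything specific in it is an assumption for illustration: the `post` / `author` / `date` / `body` class names, the date format, and the helper names `strip_tags` and `parse_posts` are all hypothetical and would have to be adapted to what the old forum's HTML really looks like. It also uses regular expressions only to stay dependency-free; a real HTML parser such as Nokogiri would probably cope better with messy markup.

```ruby
require "cgi"
require "time"

# Hypothetical markup for a single post in the saved pages.
# Adjust this pattern to match the real HTML of the old forum.
POST_PATTERN = %r{
  <div\s+class="post">\s*
  <span\s+class="author">(?<author>.*?)</span>\s*
  <span\s+class="date">(?<date>.*?)</span>\s*
  <div\s+class="body">(?<body>.*?)</div>\s*
  </div>
}mx

# Assumed timestamp format, e.g. "2004-03-17 21:05".
DATE_FORMAT = "%Y-%m-%d %H:%M"

# Turn <br> into newlines, drop all other tags, decode entities like &amp;.
def strip_tags(html)
  text = html.gsub(/<br\s*\/?>/i, "\n").gsub(/<[^>]+>/, "")
  CGI.unescapeHTML(text).strip
end

# Return one hash per post found in the page, in page order.
def parse_posts(html)
  html.scan(POST_PATTERN).map do |author, date, body|
    {
      username: strip_tags(author),
      created_at: Time.strptime(date.strip, DATE_FORMAT), # local time zone
      raw: strip_tags(body)
    }
  end
end

# Usage: ruby parse_posts.rb saved_topic.html
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  parse_posts(File.read(ARGV[0])).each do |post|
    puts "#{post[:created_at]} #{post[:username]}: #{post[:raw][0, 60]}"
  end
end
```

Once every page can be reduced to records like these, the remaining work is mapping them onto users, topics and posts the way an existing importer such as phpbb.rb does with its database rows.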