Suggestions for where to start importing an XML dump from Squarespace

Our legacy forum runs on Squarespace 5. I can export it as a giant XML dump. Do you have any recommendations for which of the existing Discourse import tools might be the most appropriate starting point? I only need to pull in the basics like author, title, date, body. The structure is more like a blog than a forum, so I’m hoping that’s going to make it easier…

Any ideas?
Thanks!

Hi Greg, I’ve come from the future (it’s bleak by the way, don’t ask me who won the election :grimacing:) and since you’re clearly clueless, (note to self: were you always this way?) here’s what I’d recommend:

  1. Start by taking a look at https://github.com/discourse/discourse/tree/master/script/import_scripts. There are close to 40 importers there, and while none of them will do what you need, you can see what’s possible and generally how it’s done.
  2. You will notice they are all written in Ruby. Remember how you learned Ruby and thought it was so amazingly great? You’re thinking at this point that it was developed by space creatures with no index fingers. Just breathe, it will come back to you over time.
  3. Most of the importers are MySQL-based: they import directly from the source forum’s database, because many of them are trying to suck in as many of the suck-in-able features as can be sucked. Clearly this is not what you want to do, but you’ll also notice that most of the scripts spend their time transforming data from one concept to another. You’re going to have to do that as well, so you can find some helpful tips & tricks there, and especially see what valid data is supposed to look like.
  4. There are a handful of scripts that actually process CSV and XML files. Specifically, have a look at the Disqus importer. It’s so right for you in a lot of ways. More specifically, look at the line Nokogiri::XML::SAX::Parser.new(@parser) – aaah? aaaaaaah??? Check out the Nokogiri page too, as well, also. You’re going to like this brave new world.
  5. As you’re going through the scripts, look for def execute; that’s going to give you a quick idea of how much that particular script is going to bite off as well as what’s basically possible to import… all the way from:

down to something more your speed:

That should get you pointed in the right direction. I’m off now to watch :skull_crossbones:President Camacho’s State of the Union address.:skull_crossbones: You worry too much!
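
To make items 3 and 4 a bit more concrete, here is a minimal sketch of the stream-parse-then-transform idea. It is not the Disqus importer’s code. It uses REXML’s stream listener (bundled with Ruby) in place of Nokogiri’s SAX parser so that it runs without extra gems. The element names (entry, author, title, date, body), the EntryListener class, and the keys of the resulting hashes are all assumptions for illustration; the real Squarespace export schema and the fields Discourse expects will likely differ.

```ruby
require "rexml/document"
require "rexml/streamlistener"
require "time"

# Hypothetical export layout: <entry>, <author>, <title>, <date> and <body>
# are made-up element names, not the real Squarespace schema.
SAMPLE_XML = <<~XML
  <export>
    <entry>
      <author>greg</author>
      <title>First post</title>
      <date>2012-03-04T05:06:07Z</date>
      <body><![CDATA[<p>Hello world</p>]]></body>
    </entry>
    <entry>
      <author>sam</author>
      <title>Second post</title>
      <date>2012-03-05T08:00:00Z</date>
      <body>Plain text body</body>
    </entry>
  </export>
XML

# Collects one hash per <entry> while the parser streams through the input.
class EntryListener
  include REXML::StreamListener

  FIELDS = %w[author title date body].freeze

  attr_reader :entries

  def initialize
    @entries = []
    @current = nil # hash for the entry currently being read
    @field = nil   # name of the field currently being read
  end

  def tag_start(name, _attrs)
    if name == "entry"
      @current = Hash.new { |hash, key| hash[key] = String.new }
    elsif @current && FIELDS.include?(name)
      @field = name
    end
  end

  # Plain text and CDATA sections are both appended to the open field.
  def text(data)
    @current[@field] << data if @current && @field
  end

  def cdata(data)
    @current[@field] << data if @current && @field
  end

  def tag_end(name)
    if name == "entry"
      @entries << @current
      @current = nil
    elsif name == @field
      @field = nil
    end
  end
end

# Transforms a raw entry into the kind of hash an importer might build.
# The key names are illustrative, not a confirmed Discourse contract.
def to_post(entry)
  {
    username: entry["author"].strip,
    title: entry["title"].strip,
    created_at: Time.iso8601(entry["date"].strip),
    raw: entry["body"].strip
  }
end

def parse_posts(xml)
  listener = EntryListener.new
  REXML::Document.parse_stream(xml, listener)
  listener.entries.map { |entry| to_post(entry) }
end

posts = parse_posts(SAMPLE_XML)
posts.each { |post| puts "#{post[:created_at].year} #{post[:username]}: #{post[:title]}" }
```

Moving this to Nokogiri should mostly be a matter of putting the same logic in a Nokogiri::XML::SAX::Document subclass, where the corresponding callbacks are start_element, characters, cdata_block and end_element.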


Hi Greg,

Curious what ended up happening here?


Hi Sam, I got it basically working, through a mix of automated screen-scraping and whatever XML a Squarespace backup coughs up, but I lost the UNIX image I was working on, and then the need for a solution evaporated a few months later anyway (they decided to just shutter the forum, not move it). So I don’t have any shareables.

HOWEVER, on a totally different tangent, they’re looking at replacing a bunch of SharePoint with a forum concept and I’ve told them they should definitely host with you, so if they actually decide to change, I’m sure they will. So there’s that.
