Populating topic_links table

I’m in the middle of writing the process to migrate from bespoke forum software to Discourse. My import script is not currently adding entries to the topic_links table, and I wanted to ask whether there is something, such as a rake task, that can do that.

(Click tracking for imported posts is a “nice to have” rather than an essential. Granted, I don’t see anything for it under rake --tasks, but I can’t see rake search:reindex there either, and I’m using that during the import, so I thought it might be worth checking.)

Any response appreciated, even if it’s just a “no, not currently possible.”


How are you including the links? I thought those links would be created when the post was baked.


Could be! I have 750,000 posts to import and about one week to get it all finalized (because of hosting company shenanigans, don’t ask), so my current approach is to clean up the imported posts manually with regexes rather than go through the rebake process. If that is how the links are generated, we can look at rebaking everything at some point in the future.


That is unlikely to end well. Posts will need to be rebaked at some point in the future.

You can check one post with a link and see if a rebake fixes it. You could then rebake just the ones with links.
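A rough sketch of how "just the ones with links" might be picked out, assuming the raw text of each post is available and that a link shows up either as an a tag with an href or as a bare http(s) URL. PostRow, LINK_PATTERN, and ids_with_links are hypothetical names for illustration, not part of Discourse:

```ruby
# Hypothetical stand-in for rows from the posts table (id and raw text only).
PostRow = Struct.new(:id, :raw)

# Assumed definition of "has a link": an <a ... href=> tag or a bare http(s) URL.
LINK_PATTERN = %r{<a\s[^>]*href=|https?://}i

# Return the ids of posts whose raw text appears to contain a link.
def ids_with_links(posts)
  posts.select { |post| post.raw =~ LINK_PATTERN }.map(&:id)
end

sample_posts = [
  PostRow.new(1, "No links here."),
  PostRow.new(2, 'See <a href="/t/123">this topic</a>.'),
  PostRow.new(3, "Docs at https://example.com/docs"),
]

puts ids_with_links(sample_posts).inspect
```

On a real site the same filter would run against the posts table, and each matching post would then be rebaked (for example via the Post model's rebake! method in a Rails console). That part is not shown here and should be checked against your Discourse version.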

Are you starting with one of the existing import scripts?

If you have a database dump, worst case you could extend your timeline by putting up a placeholder page for a bit while you finish the import. Putting up a botched import and letting people add new posts would be much worse, because it’s much, much easier to do an import on an empty site. Another solution would be to put up a new forum while you finish your script and then put the forum on hold for a bit while you run the final import to add the old data to the new forum.


As far as I can tell it should be okay. The forum software I’m migrating from basically allowed users to put whatever HTML tags they wanted in their posts, so my method was to strip out all tags, with a handful of exceptions like <b> (and </b>), <a>, <blockquote> and so on. Currently the only difference between cooked and raw posts in my database is that there are newlines instead of p and br tags in the raw version.
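A minimal sketch of that kind of allowlist stripping, assuming a regex-based approach. The actual import script is not shown in this thread, so ALLOWED_TAGS, TAG_PATTERN, strip_disallowed_tags, and cooked_to_raw are hypothetical names, and the allowlist contains only the three tags named above:

```ruby
# Hypothetical allowlist; the post names b, a, and blockquote "and so on".
ALLOWED_TAGS = %w[b a blockquote].freeze

# Matches an opening or closing HTML tag and captures its name.
TAG_PATTERN = %r{<\s*/?\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>}

# Remove every tag whose name is not in the allowlist; keep allowed tags as-is.
def strip_disallowed_tags(html)
  html.gsub(TAG_PATTERN) do |tag|
    ALLOWED_TAGS.include?(Regexp.last_match(1).downcase) ? tag : ""
  end
end

# Convert br and closing p tags to newlines first, since the raw version
# uses newlines where the cooked version has p and br tags.
def cooked_to_raw(html)
  text = html.gsub(%r{<br\s*/?>}i, "\n").gsub(%r{</p\s*>}i, "\n\n")
  strip_disallowed_tags(text).strip
end

puts cooked_to_raw('<p>Hello <b>world</b><br>bye</p><script>x</script>')
```

Regexes are not a full HTML parser, so malformed markup or a ">" inside an attribute value can slip through; spot-checking a sample of converted posts is worthwhile.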

I’ve set a couple of hundred posts as uncooked and started a rebake uncooked posts task on my dev server just now and it does seem to be populating the topic_links table, so thanks! Problem solved.

No, I rolled my own, following the steps of one of the bulk importers.


Oh hooray. That’s great! As long as you have something useful in raw then you should be good. And you could rebake them anytime after you go live.

