What happens to internal links upon import from another forum

import

(Christoph) #1

I’m trying to wrap my head around how internal links (between posts) are handled when migrating from another forum software to Discourse. The sparse information I found on Meta seems partly contradictory, either because I’m misunderstanding something or because some of it is outdated (e.g. because a permalink UI was added to the site settings or import scripts were improved).

So here’s what I’d like to clarify:

  1. Do (most) import scripts convert (i.e. rewrite, not redirect) internal links?
  2. Is it correct that the purpose of permalinks is primarily to redirect incoming traffic?
  3. If so, do permalinks have any relevance for making sure internal links work?
  4. If, for whatever reason, internal links were not converted upon import (i.e. the links on the new forum are pointing to posts on the old forum), is there a way to retrospectively rewrite those internal links so as to get rid of all links pointing to the old site?

I understand that additional problems exist when the old and new forums are on the same subdomain, so to keep things simple(r), let’s assume that the new forum is on a different subdomain.


#2

As your questions overlap with mine, I’d like to add an aspect to your topic: posts can also be accessed in raw form. For example, https://meta.discourse.org/raw/90474/1 returns the Markdown source of this topic’s first post (and /2 would return this one).

When images are uploaded, they appear in the raw text as something like [foo|325x210](upload://some-hash-reference). This presumably refers to a hash in the database, which would therefore be available after import into a new Discourse instance. But such links are useless outside Discourse, e.g., when compiling posts for display elsewhere. In that case, the raw version of a post with embedded images would simply yield a broken image, or a broken link if the link had been written manually as, e.g., [bar](/t/some-topic-slug/123/45).

So an additional question would be – if relevant to this topic:

  • how can one control “link export”, i.e., produce links that are not meant for internal use but that work from an external site? (Sometimes it’s easier to search and replace the domain name than to rework all links, especially when they don’t follow a consistent structure.)
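
To illustrate the relative-link half of this question, here is a minimal sketch in Python that prefixes site-relative Markdown link targets with a base URL. The function name and the forum.example.com domain are placeholders chosen for illustration, not anything Discourse provides. It deliberately leaves upload:// references untouched, because resolving those requires information from the database that the raw text does not contain.

```python
import re

# Matches "](" followed by a site-relative target: a single leading "/"
# (not "//", which would be a protocol-relative URL), up to ")" or whitespace.
RELATIVE_LINK = re.compile(r"(\]\()(/(?!/)[^)\s]*)")


def absolutize_links(raw: str, base_url: str) -> str:
    """Prefix site-relative Markdown link targets in `raw` with `base_url`."""
    base = base_url.rstrip("/")
    return RELATIVE_LINK.sub(lambda m: m.group(1) + base + m.group(2), raw)


if __name__ == "__main__":
    raw = ("See [bar](/t/some-topic-slug/123/45) and "
           "[foo|325x210](upload://some-hash-reference).")
    # The relative link becomes absolute; the upload:// reference is unchanged.
    print(absolutize_links(raw, "https://forum.example.com/"))
```

Because the result contains full URLs, a later plain search and replace on the domain name would still work if the site moves.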

(Jay Pfaffman) #3

Most importers do not rewrite internal links.

Permalinks are for incoming links only.

Given that the internal links contain the ID that is stored in import_id, it’s not too hard to write code that will replace them.
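
As a rough sketch of that idea (in Python for brevity, not taken from any actual import script): assume the old forum’s post links look like https://old.example.com/viewtopic.php?p=ID, and that a mapping from old post IDs to new post URLs has already been built from the stored import_id values. Both the URL pattern and the id_map contents below are hypothetical and would need to be adapted to the real old forum.

```python
import re

# Hypothetical old-forum link format; adjust the pattern to the real one.
# The optional "#p123" fragment is consumed so it does not linger after rewriting.
OLD_POST_LINK = re.compile(
    r"https?://old\.example\.com/viewtopic\.php\?p=(\d+)(?:#p\d+)?"
)


def rewrite_internal_links(raw: str, id_map: dict) -> str:
    """Replace old-forum post links in `raw` using `id_map` (old ID -> new URL).

    Links whose ID is not in the map are left as they are, so they can be
    found and handled separately later.
    """
    def replace(match):
        new_url = id_map.get(match.group(1))
        return new_url if new_url is not None else match.group(0)

    return OLD_POST_LINK.sub(replace, raw)


if __name__ == "__main__":
    # Assumed to be derived from the import_id recorded for each imported post.
    id_map = {"4711": "/t/some-topic-slug/123/45"}
    raw = ("As discussed in https://old.example.com/viewtopic.php?p=4711#p4711 "
           "and https://old.example.com/viewtopic.php?p=9999.")
    print(rewrite_internal_links(raw, id_map))
```

Leaving unmapped links untouched is a deliberate choice here: a link to the old site that still works through a permalink redirect is better than one rewritten to a wrong target.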