Redirecting old forum URLs to new Discourse URLs

import
(Jay Pfaffman) #42

It does get to the server. What you want to do, @marcozambi, is to make the permalink something like /oldforumpost/ID and then use permalink_normalizations to re-map /whatever/link.php?1234#ID to /oldforumpost/ID.

1 Like

(Cameron:D) #43

No, it doesn’t. Try making a request to yoursite.com/#anythingyouwant and look at the logs (or at the requests in your dev tools): the request will just be for /. Once the client has loaded /, it deals with the #anythingyouwant part itself, usually by scrolling to that part of the page or by handling it in JS.

Alternatively you can test this with Discourse itself by making a new permalink for, say, test#404, and when you try to load that exact url it will 404. Add a new permalink for just test and it should load happily.
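
A quick way to see the same thing without digging through server logs is to split a URL the way a client does. This is only a minimal sketch using Python's standard urllib.parse; the helper name request_target and the https:// scheme are assumptions added for illustration.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that is sent to the server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately ignored: it stays in the browser.
    return target


for url in ("https://yoursite.com/#anythingyouwant", "https://yoursite.com/test#404"):
    print(request_target(url), "| fragment kept client-side:", urlsplit(url).fragment)
```

The fragment only appears in the second value printed on each line; it never becomes part of what is requested from the server, which is why a permalink containing # cannot match.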

Now, permalink normalisation is new to me, and it may be worth looking at that instead of my rewrites.

3 Likes

(Marco) #44

Continuing my effort to translate old SMF2 forum URLs into Discourse ones, I’m now spending some time on Permalink Normalization so that I can get rid of the unnecessary parts of the SMF2 links.

For example, given this SMF2 link

 https://www.myforum.it/index.php?topic=27962.msg305350#msg305350

I need to get rid of the unneeded part

.msg305350#msg305350

as I’m already successfully translating old topic ids into the discourse standard (see here).

In order to do that I’m trying to use this regexp (which works when tested in https://regex101.com/)

 | P1                       | P2  | 
 |--------------------------|-----|
/(\/index\.php\?topic=[0-9]+)(\..+)/\1

Basically I need to keep what is found in P1, while P2 can just be discarded.
If I understood correctly how to use Permalink Normalization, P1 should contain /index.php?topic=27962, while P2 is generic enough to catch anything after P1 (which is fine).
Setting the replacement (P3) to \1 should then return /index.php?topic=27962, but I still end up on the 404 error page.

For completeness, here’s the screenshot of my current Permalink Normalization setting.

What am I doing wrong?

2 Likes

(Gerhard Schlager) #45

The regex shouldn’t start with a slash. So, the following should work:

/(index\.php\?topic=[0-9]+)(\..+)/\1
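
To see why the leading slash matters, here is a rough sketch in Python. It assumes that the path is matched with its leading slash already removed (which is what the fix above implies) and that Python's re module behaves like the regex engine Discourse uses for this particular pattern. The function normalize and the constant names are made up for the illustration.

```python
import re

# The two parts of the rule /pattern/replacement
PATTERN = re.compile(r"(index\.php\?topic=[0-9]+)(\..+)")
PATTERN_WITH_SLASH = re.compile(r"(\/index\.php\?topic=[0-9]+)(\..+)")
REPLACEMENT = r"\1"


def normalize(path_and_query: str, pattern: re.Pattern = PATTERN) -> str:
    # Assumption: the leading slash is dropped before the rule is applied.
    candidate = path_and_query.lstrip("/")
    return pattern.sub(REPLACEMENT, candidate)


# The #msg305350 fragment is not included: it never reaches the server.
print(normalize("/index.php?topic=27962.msg305350"))
print(normalize("/index.php?topic=27962.msg305350", PATTERN_WITH_SLASH))
```

Under those assumptions the first call keeps only the topic part, while the second one leaves the string untouched because the pattern with the slash never matches, so no permalink is found.
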
4 Likes

(Marco) #46

Perfect! It did the trick. Thank you very much! :medal_military:
I will add this to the SMF2 migration guide…

3 Likes

(Marco) #47

Back on this topic to report a really weird behavior of the old topic redirection:

  • If, in one of the Discourse messages imported from the old forum, I click an internal link that uses the old SMF2 URL pattern, e.g. https://www.myforum.com/index.php?topic=123, I land on the 404 page, even though I do have a correct permalink for this old topic pointing to the Discourse-style link.
  • If I copy/paste the exact same old SMF2 link, https://www.myforum.com/index.php?topic=123, into the browser’s URL bar, or simply press F5 on the 404 page I was sent to in the first place, the redirection works like a charm and I get to the new Discourse-style link.

One of our beta testers (we’re still not operational with Discourse) has noticed that when we click on an internal old SMF2-style link, a GET request is generated (maybe to keep track of the number of clicks on that link?). It looks something like https://www.myforum.com/clicks/track?https://www.myforum.com/index.php?topic=123&post_id=56789 286778&topic_id=19650&redirect=false&_=1533275034978 and gets an empty 200 OK response.

If I change the parameter redirect=false to redirect=true in the GET request above, I finally get a 302 Found response whose Location header points to the correct new Discourse-style topic URL.

Any idea on how to avoid this?

0 Likes

(Jay Pfaffman) #48

What you really need to do is to replace the internal links with Discourse URLs in the importer (in the raw post).
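
As a rough idea of what such a replacement could look like, here is a standalone Python sketch. It is not code from any Discourse importer: the mapping TOPIC_URLS, its values, the function name and the link regex are all hypothetical, and a real import would build the mapping from the ids recorded during the import.

```python
import re

# Hypothetical mapping: old SMF2 topic id -> new Discourse topic path.
TOPIC_URLS = {123: "/t/some-topic/456"}

# Old-style internal links, with the optional ".msgNNN" and "#msgNNN" tails.
OLD_LINK = re.compile(
    r"https?://www\.myforum\.com/index\.php\?topic=(\d+)(?:\.\w+)?(?:#\w+)?"
)


def rewrite_internal_links(raw: str, topic_urls: dict) -> str:
    """Replace old forum links in a raw post with their new URLs."""

    def replace(match: re.Match) -> str:
        new_url = topic_urls.get(int(match.group(1)))
        # Links to topics that are not in the mapping are left untouched.
        return new_url if new_url is not None else match.group(0)

    return OLD_LINK.sub(replace, raw)


raw = "See https://www.myforum.com/index.php?topic=123.msg305350#msg305350 for details."
print(rewrite_internal_links(raw, TOPIC_URLS))
```

Links rewritten this way point straight at the new topic, so they no longer depend on the permalink redirect when clicked from inside a post.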

1 Like

(Marco) #49

Yes, that would certainly work.
The reason I went with this approach is that it spares me a very difficult matching of the old SMF2 topic ids with the new Discourse topic ids, and it also takes care of the pages already indexed by search engines.

I think that an option in the settings to switch off “counting” the clicks of internal links would be a possible solution…

0 Likes

(Jay Pfaffman) #50

The permalinks will handle the pages indexed by Google, but as you have learned, not the internal links.

The old topic ids should be in the post custom fields table, but it’s usually easier to do the replacement during the import rather than afterward, though I’ve done it both ways.

1 Like

#51

Since the permalinks table can get quite big for large forum migrations, I wonder if it could incorporate some mechanism to store the last time each permalink was used, along with a usage count.

This would allow admins to get some notion of whether they can

  • delete entries from the permalink table (not accessed for a long time, or never)
  • take steps to update old links somewhere out there on the Internet, where possible (those that get used a lot and generate lots of redirects)

Maybe this is already there (I didn’t check, because I don’t know where to check…). Thanks for any comments anyone may have on this idea.

4 Likes