Problem with permalinks, or regex?


(Jay Pfaffman) #1

I’m trying to improve the permalink redirects on the vBulletin importer. vBulletin can have any of several URL formats. My solution is to create generic permalinks which will then be handled by permalink normalizations. Like this:

Old URL: http://somesite/forums/f38/slug-blah-blah-blue-11769.html
Permalink: oldlinks/topics/thread-11769

Normalization: /forums\/f[0-9]+\/.+-([0-9])+\.html.*/oldlinks\/topics\/thread-\1

It looks to me like it should work, but it doesn’t.

Have I missed a character somewhere?


(Eli the Bearded) #2

Trailing slash on the RE?


(Jay Pfaffman) #3

Thanks, @elijah. Are you saying I’m missing one, or that I’ve got an extra? In either case, it looks OK, right? I added an image of the normalizations since there’s no way to copy them.


(Eli the Bearded) #4

I’m good with REs, but that means I know there is a maze of similar-looking implementations, all with slightly different syntax. Your RE looks to be / delimited, but has no closing /:

/forums\/f[0-9]+\/.+-([0-9])+\.html.*/oldlinks\/topics\/thread-\1/

(And here I also wonder is it \1 or $1 for substitutions?)


(Jay Pfaffman) #5

Here’s the most convenient answer to those questions:

There’s no trailing / in their example (I think I included one in one of my earlier attempts), and yes, it’s \1 to insert captures.


(Mittineague) #6

Isn’t there? I.e.
/(topic.*)\?.*/

I can’t help but wonder, do you actually have URLs that have “.html” towards the end, and if you do, how often are they followed by zero or more nothing, anything, everything characters?


(Jay Pfaffman) #7

I meant after the replacement string. I have a / at the end of my RE (right?), but not the end of the replacement.

I don’t know. Perhaps not. In other systems I’ve seen various cruft after the URL (as the example replacement suggests). I’m probably safe removing that last .* to be sure.
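
To make the trailing `.*` question concrete, here is a minimal Python sketch, not Discourse's own code. It assumes a substitution that replaces only the matched part of the URL (as Python's `re.sub` does), and it uses a simplified pattern with no capture group so that only the effect of the tail is visible. The `?highlight=foo` suffix is a made-up example of cruft.

```python
import re

# Hypothetical old URL path with a query string after ".html".
url = "forums/f38/slug-blah-blah-blue-11769.html?highlight=foo"

# Simplified patterns (no capture group), differing only in the trailing ".*".
with_tail = re.sub(r"forums/f[0-9]+/.+\.html.*", "oldlinks/topics/thread", url)
without_tail = re.sub(r"forums/f[0-9]+/.+\.html", "oldlinks/topics/thread", url)

print(with_tail)     # oldlinks/topics/thread
print(without_tail)  # oldlinks/topics/thread?highlight=foo
```

Under that assumption, dropping the `.*` means anything after `.html` is carried over into the normalized URL, so it may no longer equal the permalink exactly.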

#### To recap:

Normalization: /forums\/f[0-9]+\/.+-([0-9])+\.html/oldlinks\/topics\/thread-\1

I’ve checked the regex at http://regexr.com/ & it seems to work there. I’ve done something similar in the past. I’m stumped.


(Jay Pfaffman) #8

I thought that perhaps my problem was that I didn’t need to \ escape the /s in my replacement string, but that doesn’t seem to have solved it either.

And I’ve confirmed that the regex works; it’s the replacement that’s broken. If I create a permalink badword-11769 and use this

/forums\/f[0-9]+\/.+-([0-9])+\.html/badword-\1

The redirect works fine.

Edit: No, it doesn’t.

Final edit:

WRONG: /forums\/f[0-9]+\/.+-([0-9])+\.html/oldlinks\/topics\/thread-\1
RIGHT: /forums\/f[0-9]+\/.+-([0-9]+)\.html/oldlinks\/topics\/thread-\1
                                  ^^

Sigh.
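
The difference between the two patterns can be reproduced outside Discourse. The following is a minimal Python sketch, assuming a regex engine in which a repeated capture group keeps only its last iteration (true of Python's `re`, and consistent with the behaviour described in this thread). `normalize` is a hypothetical helper, and the slashes are left unescaped because Python patterns are not `/`-delimited.

```python
import re

URL = "forums/f38/slug-blah-blah-blue-11769.html"
REPLACEMENT = r"oldlinks/topics/thread-\1"

WRONG = r"forums/f[0-9]+/.+-([0-9])+\.html"  # group holds one digit, repeated
RIGHT = r"forums/f[0-9]+/.+-([0-9]+)\.html"  # group holds the whole number


def normalize(pattern: str, url: str) -> str:
    """Hypothetical helper: apply one pattern/replacement pair to a URL."""
    return re.sub(pattern, REPLACEMENT, url)


print(normalize(WRONG, URL))  # oldlinks/topics/thread-9
print(normalize(RIGHT, URL))  # oldlinks/topics/thread-11769
```

Both patterns match the URL, which is why a tester that only highlights the match reports success; only the text captured by `\1` differs.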