Topic with Japanese in URL doesn't redirect if URL doesn't perfectly match

In community.wanikani.com, opening any links to topics with Japanese in the fully qualified URL stopped working when opening them in a new tab or directly copying and pasting the link. Clicking the link to navigate within the same tab still works.

For example, opening this link in a new tab should navigate to

キノの旅 Home Thread (Intermediate Book Club) - Book Clubs - WaniKani Community

But instead it tries to navigate to

キノの旅 Home Thread (Intermediate Book Club) - Book Clubs - WaniKani Community

which fails to load.

If the link happens to match exactly, it works fine. But of course with topic renames this is often not the case.

I also tried to reproduce on try.discourse.org, but on that install Japanese characters never get added to the URL even when included in the topic title. I’m not sure why that’s the case, but without that happening I can’t demonstrate the bug there.


Both are links to topic 34890. Both load fine for me in Firefox. What is the problem?


Sigh. It looks like it might be yet another Chrome bug. It works fine for me in both Firefox and Edge. Weirdly, it works the first time in an incognito window but fails the second time. The same thing happens after clearing the site cache and cookies and restarting my computer.

Chrome says the error is too many redirects.


Would you mind checking in Chrome to confirm if it’s an issue there in general and not just a problem for me? Just be sure to try to open the page multiple times since the first one seems to work fine. I appreciate the help!


I can repro it on Chrome mobile. :bug:


Thanks. I guess I’ll report it to Google.


On my Android phone in Chrome, the second link redirects indefinitely.


Still with Chrome? Just want to be sure before reporting it to them. I assume nothing changed related to this recently on Discourse? (Regardless, this would probably still be a Chrome issue since it just happens there, even if something did change on Discourse.)


Wait a bit, this may be Discourse. Even a service worker bug.


Okay, thanks for the update.


Any update on this?

This is going to take a while to sort out. It is assigned, so it will not fall through the cracks.


I don’t expect to find any kind of solution or workaround. We just have to wait for it to be fixed.

Are you still seeing this issue? It looks like maybe it’s fixed, unless I’m doing something differently this time.


Well scratch that. It started happening again today. I have no idea how it got fixed for a bit.


Since this might take a while, I’d be open to workarounds for now. As I mentioned in the OP, on try (and, as I have since checked, here on meta too) the Japanese characters never get added to the URL, which sidesteps this issue. Is that a site or category setting that I can talk to my site admin about? Any other suggestions for a workaround besides that?

When I enter the short URL of a topic with an Arabic title into the browser, like:

https://forums.coretabs.net/t/2456

I get an infinite redirect loop (and the generated link is not right; I guess this is related to the encoding).

Instead, it should redirect to:

https://forums.coretabs.net/t/ماذا-يجب-ان-نتعلم-في-javascript-؟/2456

Why don’t I share links with their titles?

Because of the poor Arabic support on Twitter and Facebook.

  • This bug didn’t exist before the latest updates (the last time I shared a link was around two weeks ago, and it worked perfectly).

I have dug into our codebase and it looks like the error is fairly simple, but I would like to verify my assumptions.

We have a site setting named slug_generation_method which must be changed from the default ascii value to encoded in order to trigger this bug. When you change this site setting, we clear all slugs and generate those again.
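To make the difference between the two modes concrete, here is a rough sketch. It is a toy stand-in, not our actual slug code: `toy_slug` and its `mode:` argument are hypothetical names, and the character filtering is deliberately simplified.

```ruby
# Toy illustration only; NOT the real Discourse slug generator.
# "ascii" mode drops every character outside a-z, 0-9, whitespace and "-".
# "encoded" mode keeps Unicode letters and digits as they are.
def toy_slug(title, mode:)
  base = title.downcase
  cleaned =
    case mode
    when "ascii"   then base.gsub(/[^a-z0-9\s\-]/, "")
    when "encoded" then base.gsub(/[^\p{L}\p{N}\s\-]/, "")
    else raise ArgumentError, "unknown mode: #{mode}"
    end
  # Collapse runs of whitespace and hyphens into single hyphens.
  cleaned.strip.gsub(/[\s\-]+/, "-")
end

title = "キノの旅 Home Thread (Intermediate Book Club)"
puts toy_slug(title, mode: "ascii")   # home-thread-intermediate-book-club
puts toy_slug(title, mode: "encoded") # キノの旅-home-thread-intermediate-book-club
```

Under this simplified model, a site left on the default never gets Japanese characters in its URLs, which would be consistent with what was observed on try and on meta.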

What I don’t understand is why, when the site setting is set to “encoded”, we generate a slug like this:

[3] pry(main)> SiteSetting.slug_generation_method
=> "encoded"
[4] pry(main)> Slug.for(t.slug)
=> "キノの旅-home-thread-intermediate-book-club"

whereas I expected “encoded” to mean something like this:

[5] pry(main)> CGI.escape(Slug.for(t.slug))
=> "%E3%82%AD%E3%83%8E%E3%81%AE%E6%97%85-home-thread-intermediate-book-club"

This appears to come from

https://github.com/discourse/discourse/pull/3370

The raw slug from the table is returned in the Location header of the 301 response when a topic slug doesn’t match, and IMO we should return a valid (percent-encoded) URL there.
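A minimal sketch of what a valid redirect target could look like, assuming the `/t/<slug>/<id>` route shape seen in the links in this topic. `topic_redirect_path` is a hypothetical helper for illustration, not an existing method in the codebase.

```ruby
require "cgi"

# Hypothetical helper: percent-encode the stored slug before it goes into
# the Location header, so the header value contains only ASCII characters.
def topic_redirect_path(slug, topic_id)
  "/t/#{CGI.escape(slug)}/#{topic_id}"
end

raw_slug = "キノの旅-home-thread-intermediate-book-club"
path = topic_redirect_path(raw_slug, 34890)

puts path
# /t/%E3%82%AD%E3%83%8E%E3%81%AE%E6%97%85-home-thread-intermediate-book-club/34890
puts path.ascii_only? # true
```

Note that `CGI.escape` turns spaces into `+`, which is fine here only because slugs contain hyphens instead of spaces.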


Yeah we should clean up the slug generation method for encoded so it relies less on browser magic.


So are you saying the URL itself would show the encoded version? Or just that the redirect would internally handle it by using the encoded version? Either way, getting this to “just work” and not rely on browser quirks would be great.


Hello,

Has this been resolved? I am still facing the issue, as I described in the topic I opened about it.


No, as denoted by the open topic in the bug category here :sweat_smile:

@sam I took another look at this today and there are two ways to go about it:

  1. Store an actual encoded slug in the slug column when the slug generation setting is set to encoded. Make a migration to clear out all current slugs when using encoded slugs so they regenerate over time properly.

  2. Keep the current UTF-8 slug and patch it on the fly when sending it out in a header for 301 redirect.

IMO option 1 is “more correct”, and it makes it harder to accidentally pass the raw slug to a client. However, just patching the slug generator wasn’t enough: the browser gets an encoded URL in the 301 redirect but decodes it for the next request, so our slug comparison fails and we redirect again. That means I will also need to patch the slug comparison method in the topics controller, and maybe other places.
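Roughly, the comparison would need to treat the encoded and decoded forms of a slug as the same thing. A minimal sketch of that idea follows; `same_slug?` is a hypothetical name, not the actual method in the topics controller, and it assumes slugs never contain a literal `+` or `%`.

```ruby
require "cgi"

# Hypothetical comparison: two slugs match if they are equal once both
# have been percent-decoded, regardless of which form each arrived in.
def same_slug?(requested, stored)
  CGI.unescape(requested.to_s) == CGI.unescape(stored.to_s)
end

stored  = "%E3%82%AD%E3%83%8E%E3%81%AE%E6%97%85-home-thread-intermediate-book-club"
decoded = "キノの旅-home-thread-intermediate-book-club"

puts same_slug?(decoded, stored)      # true: no further redirect needed
puts same_slug?(stored, stored)       # true
puts same_slug?("old-title", stored)  # false: redirect to the current slug
```

With a comparison like this, the request that follows the 301 would match the stored slug even though it arrives in decoded form, which is what should break the loop.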

Should I continue on this path?
