UTF-8 slugs for categories

Is it possible to add this? When I create a category with a Russian name, links to it are converted to SEO-friendly Latin letters. And that’s awesome! I bet it’s possible to do the same with nicknames. If a name is in Russian, Japanese, or Hindi, it could be converted to a Latin version for links to user profiles, while the localized UTF-8 name is still shown on pages.


Planned at some point; see:


The ASCII conversion for slugs is not as SEO-friendly as it could be.

For example, “Nos Œuvres Cachées” could produce nos-oeuvres-cachees (ASCII) but actually produces nos-uvres-cachees, which breaks the word.

A non-ASCII character is automatically replaced with a -, which often breaks words in non-English languages (e.g., French, German, and other languages with lots of accents). In German, an umlaut (e.g., ü) is traditionally replaced with ue in URLs.
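To make the problem concrete, here is a minimal Ruby sketch of one way such output can arise. It is an assumption for illustration, not Discourse's actual Slug code: the hypothetical helper `accent_stripping_slug` strips combining accents after Unicode decomposition and then replaces everything that is still not ASCII with a hyphen. Œ has no decomposition, so it is lost, and ü becomes u instead of ue.

```ruby
# Hypothetical helper for illustration only; NOT Discourse's Slug implementation.
def accent_stripping_slug(title)
  title.unicode_normalize(:nfd)      # split "é" into "e" + combining accent
       .gsub(/\p{Mn}/, "")           # drop the combining marks
       .downcase
       .gsub(/[^a-z0-9]+/, "-")      # anything still non-ASCII becomes a hyphen
       .gsub(/\A-+|-+\z/, "")        # trim leading and trailing hyphens
end

puts accent_stripping_slug("Nos Œuvres Cachées") # the ligature is dropped
puts accent_stripping_slug("Küche")              # "u" rather than the German "ue"
```

With this approach the French title comes out as nos-uvres-cachees and the German word as kuche, which is the kind of breakage described above.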

(Feel free to move this post to a new topic.)


Discourse uses transliteration rules to replace Unicode characters with ASCII characters. The rules depend on the site’s default_locale.

It works perfectly fine for German (transliterate.de.yml). It doesn’t work for French because there are no transliteration rules yet.

EDIT: Oh, looking at the output below, it actually matches your preferred “nos-oeuvres-cachees”. I’m not sure why… Maybe Rails started to include transliteration rules by default? :thinking:

[1] pry(main)> SiteSetting.default_locale = :de
=> :de
[2] pry(main)> Slug.for("Küche", "")
=> "kueche"
[3] pry(main)> SiteSetting.default_locale = :fr
=> :fr
[4] pry(main)> Slug.for("Nos Œuvres Cachées", "")
=> "nos-oeuvres-cachees"

Could you create a PR with French transliteration rules?
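For anyone wondering what locale-dependent rules mean in practice, here is a rough, self-contained Ruby sketch. The `RULES` table and the `slug_for` helper are hypothetical names made up for this example; the real rules live in files such as transliterate.de.yml and are not reproduced here, and the French entries are only a guess at what such rules could contain.

```ruby
# Hypothetical per-locale rule tables, for illustration only.
RULES = {
  de: { "ä" => "ae", "ö" => "oe", "ü" => "ue", "ß" => "ss" },
  fr: { "œ" => "oe", "é" => "e", "è" => "e", "ê" => "e", "à" => "a", "ç" => "c" }
}.freeze

# Apply the rules for the given locale, then hyphenate whatever is left over.
def slug_for(title, locale)
  rules = RULES.fetch(locale, {})
  mapped = title.unicode_normalize(:nfc)   # use precomposed characters as lookup keys
                .downcase
                .each_char
                .map { |ch| rules.fetch(ch, ch) }
                .join
  mapped.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

puts slug_for("Küche", :de)
puts slug_for("Nos Œuvres Cachées", :fr)
puts slug_for("Küche", :fr) # no rule for "ü" in this fr table, so it becomes a hyphen
```

The last call shows why the rules have to match the site's locale: a character without a rule in the active table falls through to the hyphen replacement.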


I picked an example, and maybe it was a bad one. But thank you for the console tips; I will definitely have a look and see if I can play with the transliteration files. If there’s a way to customize some transliteration cases to override the defaults, that would be nice to have.

I’d be happy to do so, but I’m already overbooked. Maybe someone in the @translators group (especially people in the Quality of French Translations topic) would be faster than I could be.

I was concerned about a couple of edge cases that would be better solved by overriding rules. For example, sometimes you want a subscript number to be replaced with a normal number, or with nothing, rather than with a -: CO₂ should become co2 rather than co :slight_smile:
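As a sketch of what I mean by overriding rules, assuming a custom table could be applied before the default cleanup (the names `SUBSCRIPT_DIGITS` and `slug_with_overrides` are made up for this example and are not part of Discourse):

```ruby
# Hypothetical override table: subscript digits U+2080..U+2089 map to plain digits.
SUBSCRIPT_DIGITS = (0..9).to_h { |d| [(0x2080 + d).chr(Encoding::UTF_8), d.to_s] }.freeze

# Apply the overrides first, then fall back to the generic hyphen replacement.
def slug_with_overrides(title, overrides = SUBSCRIPT_DIGITS)
  replaced = title.gsub(Regexp.union(overrides.keys), overrides)
  replaced.downcase
          .gsub(/[^a-z0-9]+/, "-")
          .gsub(/\A-+|-+\z/, "")
end

puts slug_with_overrides("CO₂")                # override applied
puts slug_with_overrides("CO₂", {})            # no overrides: the subscript is lost
puts slug_with_overrides("CO₂ emissions")
```

With the override table the slug keeps the digit (co2); with an empty table the subscript is dropped and only co remains.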
