Force Lowercase slug URLs when set to "encoded"

I wanted to know if there is an option to force lowercase slugs instead of keeping the capitalization the title was written with. I prefer lowercase URLs to mixed-case ones.

Looking at meta, it seems to be lower-case… can you provide an example where it isn’t lowercase?

When I edit the title of the welcome page on my site, the title and the slug are both shown in mixed case. On your site, meta, it’s OK. I also posted a topic in a category, and its slug is mixed case too.

It may be a wild guess, but…

What LANG and DISCOURSE_DEFAULT_LOCALE are set in app.yml?

Where is app.yml located? I need the full path on Ubuntu. I can’t find it under /var/

If you followed the default installation steps, it’s /var/discourse/containers/app.yml.

That’s what I have there:

env:
  LANG: en_US.UTF-8
  DISCOURSE_DEFAULT_LOCALE: en


Any idea how to solve this?

Can anyone help me with this? I can’t continue using the platform until this is solved.

I’ve found out that the slug is lowercase in ‘ascii’ mode, but when I change to ‘encoded’ it stays with the same letter case that I’ve written the topic with.


Why do you need it in “encoded”? ASCII is correct for western languages.

Because I want the keywords to be in the URL, even if they are written in Japanese, Chinese, or any other non-English language.

If I use ascii, this topic title: “どういたしまして どういたしまして”

becomes

/t/topic/17
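A rough sketch of why an ASCII-only slug ends up as a generic placeholder for a title like this. The `ascii_slug` helper and its `fallback` parameter are hypothetical and written only for illustration; this is not Discourse's actual slug code, which may transliterate accented characters rather than drop them.

```ruby
# Hypothetical helper: keep only lowercase ASCII letters and digits,
# joining the surviving runs with hyphens.
def ascii_slug(title, fallback = "topic")
  slug = title.downcase
              .gsub(/[^a-z0-9]+/, "-")   # every non-ASCII run becomes a hyphen
              .gsub(/\A-+|-+\z/, "")     # trim leading and trailing hyphens
  # Nothing survives for a purely non-ASCII title, so use the placeholder.
  slug.empty? ? fallback : slug
end

puts ascii_slug("Hello World!")                       # hello-world
puts ascii_slug("どういたしまして どういたしまして")  # topic
```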

I use it on all my websites. It’s great for usability, it makes the URL meaningful in search engines after indexing, and it’s useful for SEO as well.

You already have a percent-encoding option, so you just need to apply lowercasing on top of it. It would then behave like ASCII mode for English letters (all lowercase) while staying compatible with non-ASCII characters. WordPress and all the other popular blog and forum software have this.

Google also shows the URL with the decoded characters, not the encoded ones.

What happens when you lowercase unicode? This is complex and indeterminate. Not sure if Ruby can even do it.

You lowercase before encoding. Every platform that I use does that (I have 15 years of development experience, though not in Ruby). It’s a missed opportunity. I will look into Ruby and see what I find.

If you are aiming your platform at multiple languages, you have to do that.
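The "lowercase before encoding" order could be sketched roughly like this in Ruby. The `encoded_slug` helper is a hypothetical name, not part of Discourse, and the sketch assumes Ruby 2.4 or newer, where `String#downcase` handles non-ASCII letters; `ERB::Util.url_encode` is from the standard library.

```ruby
require "erb"

# Hypothetical helper: lowercase first, then percent-encode.
def encoded_slug(title)
  # Scripts without letter case (such as Japanese kana) pass through
  # downcase unchanged.
  hyphenated = title.downcase.split.join("-")
  # Percent-encode everything except unreserved ASCII characters.
  ERB::Util.url_encode(hyphenated)
end

puts encoded_slug("Hello World")      # hello-world
puts encoded_slug("Ünïcode Title")    # %C3%BCn%C3%AFcode-title
```

Because lowercasing happens on the decoded text, the percent-encoded bytes themselves are never touched, only the letters they represent.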

You are making inconsistent and mutually exclusive requests – the idea is that by explicitly enabling Unicode display in the URL you care about exactly what characters display, whereas passing Unicode through a lowercase algorithm will mangle it.

Ruby 2.4.0 preview 1 launched today with Unicode-aware downcase/upcase.

In the meantime, Rails Active Support provides this.

If the code path isn’t critical, it can be added. Not sure if we must add it, though :smile:.
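For reference, a small sketch of how `String#downcase` behaves on Ruby 2.4 or newer (an assumption about the runtime; older Rubies only lowercase A-Z). The variable names are illustrative only.

```ruby
mixed = "STRING ÁÂÃÀÇÉÊÍÓÔÕÚ"

# Default mode applies full Unicode case mapping.
full = mixed.downcase
puts full          # string áâãàçéêíóôõú

# The :ascii option restricts the mapping to A-Z, like pre-2.4 Ruby.
ascii_only = mixed.downcase(:ascii)
puts ascii_only    # string ÁÂÃÀÇÉÊÍÓÔÕÚ

# Scripts with no letter case are returned unchanged.
kana = "どういたしまして"
puts kana.downcase == kana   # true
```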


Sorry, English is not my native language.

This is how WordPress does it.

The goal is for the slug to appear percent-encoded in links and in the canonical URL, so that titles in different languages show up as-is in the URL when search engines index them, and that is important.

Sorry for not explaining myself properly, but for a foreign-language forum this has to be done. As I mentioned, all the popular forum software does this.

I am probably confusing the two, but what I don’t understand is this: if you encode the URL anyway, why not lowercase the English characters?

Good luck – as you can see above, Ruby just added support for this. Sorry, you may wish to pick different discussion software if this is a critical requirement for you.

(it is also quite performance intensive to properly lowercase among all known human languages and character sets)

OK, so the Ruby you are using still doesn’t support it.

The Rails Active Support gem provides upcase, downcase, swapcase, capitalize, etc. methods with internationalization support:

gem install activesupport
irb -ractive_support/core_ext/string
"STRING ÁÂÃÀÇÉÊÍÓÔÕÚ".mb_chars.downcase.to_s
=> "string áâãàçéêíóôõú"
"string áâãàçéêíóôõú".mb_chars.upcase.to_s
=> "STRING ÁÂÃÀÇÉÊÍÓÔÕÚ"

source

Performance-wise: 10,000,000 iterations in C#

ToLower(): 1054 ms
ToLowerInvariant(): 1724 ms
ToLower(CultureInfo): 884 ms [fastest]

Don’t think it’s an issue.