Force Lowercase slug URLs when set to "encoded"


(Spooky) #1

I wanted to know if there is an option to force lowercase slugs instead of keeping the capitalization the title was written with. I prefer lowercase URLs to mixed-case ones.


(cpradio) #2

Looking at meta, it seems to be lower-case… can you provide an example where it isn’t lowercase?


(Spooky) #3

When I edit the title of the welcome page on my site, the title and the slug are both presented in mixed case. On your site, meta, it’s OK. I also posted a topic in a category, and its slug is mixed case too.


(Felix Freiberger) #4

It may be a wild guess, but…

What are LANG and DISCOURSE_DEFAULT_LOCALE set to in app.yml?


(Spooky) #5

Where is app.yml located? I need the full path on Ubuntu. I can’t find it under /var/.


(Felix Freiberger) #6

If you followed the default installation steps, it’s /var/discourse/containers/app.yml.


(Spooky) #7

That’s what I have there:

env:
  LANG: en_US.UTF-8
  DISCOURSE_DEFAULT_LOCALE: en


(Spooky) #8

Any idea how to solve this?


(Spooky) #9

Can anyone help me with this? I can’t continue using the platform until this is solved.


(Spooky) #10

I’ve found out that the slug is lowercase in ‘ascii’ mode, but when I change to ‘encoded’ it stays with the same letter case that I’ve written the topic with.


(Jeff Atwood) #11

Why do you need it in “encoded”? ASCII is correct for western languages.


(Spooky) #12

Because I want the keywords to be in the URL, even if they are written in Japanese, Chinese or any other non-English language.

If I use ascii, this topic title: “どういたしまして どういたしまして”

becomes

/t/topic/17

I use it on all my websites. It’s great for usability, it makes the URL meaningful in search engines after indexing, and it’s useful for SEO as well.

You already have a percent-encoding option, so you just need to apply lowercasing. It would then behave like ASCII mode for English letters (all lowercase) while staying compatible with non-ASCII characters. WordPress and all the other popular blog and forum software have this.

Google also shows the URL with the decoded characters, not the encoded ones.
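
To illustrate why an ASCII-only slug loses a title like the one above, here is a minimal Ruby sketch. The `ascii_slug` helper and its `"topic"` fallback are hypothetical, written only to mirror the behaviour described in this post; this is not Discourse's actual slug code.

```ruby
# Hypothetical helper, NOT Discourse's real implementation.
# Keeps only ASCII letters and digits, so a title containing none of them
# collapses to the fallback string.
def ascii_slug(title, fallback = "topic")
  slug = title.downcase
              .gsub(/[^a-z0-9]+/, "-")  # any run of other characters becomes one dash
              .gsub(/\A-+|-+\z/, "")    # trim leading and trailing dashes
  slug.empty? ? fallback : slug
end

puts ascii_slug("Hello World")                        # hello-world
puts ascii_slug("どういたしまして どういたしまして")  # topic
```

The Japanese title contains no ASCII letters or digits, so nothing is left to build a slug from and only the fallback remains.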


(Jeff Atwood) #13

What happens when you lowercase unicode? This is complex and indeterminate. Not sure if Ruby can even do it.


(Spooky) #14

You lowercase before encoding. Every platform that I use does that (and I have 15 years of development experience, though not in Ruby). It’s a missed opportunity. I will look into Ruby and see what I find.

If you are aiming your platform at multiple languages, you have to do that.
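
A minimal sketch of "lowercase first, then percent-encode", assuming Ruby 2.4 or later (where `String#downcase` handles non-ASCII letters). The `encoded_slug` helper is a hypothetical name used for illustration only, not Discourse's slug code.

```ruby
require "uri"

# Hypothetical helper, not Discourse's implementation.
# Assumes Ruby >= 2.4 so that String#downcase covers non-ASCII letters.
def encoded_slug(title)
  # Lowercase first, then split on any Unicode whitespace and join with dashes.
  words = title.downcase.split(/[[:space:]]+/).reject(&:empty?)
  # Percent-encode last, so the encoded bytes are never touched by downcase.
  URI.encode_www_form_component(words.join("-"))
end

puts encoded_slug("Force Lowercase Slug")  # force-lowercase-slug
puts encoded_slug("Ünïcode Title")         # %C3%BCn%C3%AFcode-title

# Scripts without letter case pass through unchanged and decode back as-is.
slug = encoded_slug("どういたしまして どういたしまして")
puts URI.decode_www_form_component(slug)   # どういたしまして-どういたしまして
```

Because the lowercasing happens on the decoded text, English letters end up lowercase while characters with no case distinction are left alone.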


(Jeff Atwood) #15

You are making inconsistent and mutually exclusive requests – the idea is that by explicitly enabling Unicode display in the URL you care about exactly what characters display, whereas passing Unicode through a lowercase algorithm will mangle it.


(Rafael dos Santos Silva) #16

Ruby 2.4.0-preview1 launched today with Unicode-aware downcase/upcase.

In the meantime, Rails ActiveSupport provides this.

If the code path isn’t critical, it can be added. Not sure if we must add it, though :smile:.
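
As a small illustration of the built-in behaviour, assuming Ruby 2.4 or later (the sample strings are arbitrary):

```ruby
# Assumes Ruby >= 2.4, where String#downcase is Unicode-aware.
accented = "STRING ÁÂÃÀÇÉÊÍÓÔÕÚ".downcase
kana     = "どういたしまして".downcase
turkic   = "I".downcase(:turkic)

puts accented  # string áâãàçéêíóôõú
puts kana      # どういたしまして (kana have no letter case, so nothing changes)
puts turkic    # ı (dotless i: lowercasing can depend on the language)
```

The last line shows the kind of language-specific rule that makes lowercasing more involved than it looks for ASCII alone.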


(Spooky) #17

Sorry, English is not my native language.

This is how WordPress does it.

The goal is for the slug to be percent-encoded in links and in the canonical URL, so that titles in other languages appear as-is in the URL when search engines index them. That is important.

I’m sorry for not explaining myself clearly, but for a foreign-language forum this has to be done. As I mentioned, all the popular forum software does that.

I am probably confusing the two, but what I don’t understand is this: if you encode the URL anyway, why not lowercase the English characters?


(Jeff Atwood) #18

Good luck – as you can see above, Ruby just added support for this. Sorry, you may wish to pick different discussion software if this is a critical requirement for you.

(it is also quite performance intensive to properly lowercase among all known human languages and character sets)


(Spooky) #19

OK, so the Ruby version you are using still doesn’t support it.


(Spooky) #20

The Rails ActiveSupport gem provides upcase, downcase, swapcase, capitalize, etc. methods with internationalization support:

gem install activesupport
irb -ractive_support/core_ext/string
"STRING ÁÂÃÀÇÉÊÍÓÔÕÚ".mb_chars.downcase.to_s
=> "string áâãàçéêíóôõú"
"string áâãàçéêíóôõú".mb_chars.upcase.to_s
=> "STRING ÁÂÃÀÇÉÊÍÓÔÕÚ"

source

Performance-wise, for 10,000,000 iterations in C#:

ToLower(): 1054 ms
ToLowerInvariant(): 1724 ms
ToLower(CultureInfo): 884 ms [fastest]

Don’t think it’s an issue.
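
The C# figures above do not say anything about Ruby directly. A rough way to take a comparable measurement in Ruby is sketched below; the `time_downcase` helper, the sample title, and the iteration count are all arbitrary assumptions, and no timings are claimed here.

```ruby
# Rough timing sketch. The helper name, sample title, and iteration count
# are illustrative assumptions; results will vary by machine and Ruby version.
def time_downcase(title, iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { title.downcase }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start  # elapsed seconds
end

elapsed = time_downcase("STRING ÁÂÃÀÇÉÊÍÓÔÕÚ", 100_000)
puts format("100000 downcase calls took %.1f ms", elapsed * 1000)
```

Since a slug is only generated when a title is created or edited, even a comparatively slow result here may not matter much in practice.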