Discourse is trying to shoehorn the concept of “pretty slugs” onto languages whose character sets are not ASCII-based. Transliteration may be acceptable for Norwegian or Turkish, but things get really weird when you do it for Korean, Chinese, or Arabic.
When you look at the Alexa top 100 for, say, Egypt or China, or any country with a non-Latin alphabet, you never see this concept of a “slug” that attempts to convert your alphabet to English. It is offensive and comical.
Our default behavior, which is to “only keep English”, is even more offensive. Even in India there is no such practice (though there is this very creative practice in some places: http://videos.jagran.com/news-pm-modi-defends-land-bill-says-it-is-aimed-at-welfare-of-farmers-in-mann-ki-baat-v8323.html ).
I am willing to bet big bucks that this issue would never have been raised repeatedly if the URL were site.com/t/42. Speakers of most non-ASCII languages are used to this anyway, 99% of the time.
Personally, I think going forward we should allow site operators 3 options:
- English slugs (default for Latin-based languages)
- Percent-encoded slugs (for full Unicode support)
- No slugs (default for non-Latin-based languages)
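To make the three options concrete, here is a rough sketch in Python. It is not how Discourse implements slugs; the names `make_slug` and `topic_url` and the mode strings `"ascii"`, `"encoded"`, and `"none"` are made up for illustration, and the whitespace-only handling in the encoded mode is a deliberate simplification.

```python
import re
import unicodedata
from urllib.parse import quote


def make_slug(title: str, mode: str = "ascii") -> str:
    """Build a slug under one of three hypothetical site-level modes."""
    if mode == "none":
        # No slug at all; the URL carries only the topic id.
        return ""
    if mode == "ascii":
        # "Only keep English": strip accents, then drop anything non-ASCII.
        text = unicodedata.normalize("NFKD", title)
        text = text.encode("ascii", "ignore").decode("ascii").lower()
        return re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    if mode == "encoded":
        # Keep the original script; join words with hyphens and percent-encode.
        text = "-".join(title.lower().split())
        return quote(text, safe="-")
    raise ValueError(f"unknown slug mode: {mode!r}")


def topic_url(base: str, topic_id: int, title: str, mode: str) -> str:
    """Fall back to the bare id URL whenever the slug comes out empty."""
    slug = make_slug(title, mode)
    if slug:
        return f"{base}/t/{slug}/{topic_id}"
    return f"{base}/t/{topic_id}"


print(topic_url("https://site.com", 42, "Hello World!", "ascii"))
print(topic_url("https://site.com", 42, "한국어", "ascii"))
print(topic_url("https://site.com", 42, "你好 世界", "encoded"))
print(topic_url("https://site.com", 42, "你好 世界", "none"))
```

Note that in the `"ascii"` mode a Korean title leaves nothing behind, so the URL collapses to site.com/t/42 anyway, which is part of why "no slugs" looks like the more honest default for those sites.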
I also think we should junk StringEx. It just does not fit with the philosophy of slugs and simply looks buggy. If people want to turn their language into an English farce, so be it, but do that in a plugin.