About friendly URLs and non-Latin characters

I tried the encoded setting. The URL looks fine in the browser's address bar, but when I copy or send it, the real URL is a percent-escaped string and is no longer friendly.

I checked lib/slug.rb and suggest changing the encoded_generator function like this:
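What is described here matches how percent-encoding behaves: browsers usually show the decoded form in the address bar, while the copied link carries the escaped bytes. A minimal sketch using only Ruby's standard library (the sample title is made up for illustration):

```ruby
require "cgi"

# Hypothetical topic title, used only for illustration.
title = "tiếng-việt"

# CGI.escape percent-encodes every non-ASCII byte of the UTF-8 string.
escaped = CGI.escape(title)

puts escaped.ascii_only?               # true: only ASCII characters remain
puts escaped.include?("%")             # true: the slug now contains percent escapes
puts CGI.unescape(escaped) == title    # true: decoding restores the readable title
```

So the readable and the escaped forms are the same URL; which one you see depends on whether the application decodes it for display.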

  def self.encoded_generator(string, downcase: true)
    # This generator will sanitize almost all special characters,
    # including reserved characters from RFC3986.
    # See also URI::REGEXP::PATTERN.

    # Added: transliterate accented characters to plain Latin letters,
    # using the site's default locale.
    string = I18n.transliterate(string, locale: SiteSetting.default_locale)

    string = string.strip.gsub(/\s+/, "-").gsub(CHAR_FILTER_REGEXP, "")

    string = string.downcase if downcase

    # Changed: return the unescaped string (was: CGI.escape(string)).
    string
  end

I tried it with the Vietnamese locale and got the desired results. I hope it works with other languages too.

I guess this is a compatibility consideration. Some apps only support links in Latin text and will cut off links that contain non-Latin text.

I’m not a Ruby developer, but I read the documentation and found this:
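For anyone who wants to try the idea outside Discourse, here is a rough, self-contained sketch of the same pipeline. It is not the code from lib/slug.rb: `demo_transliterate` is a stand-in that strips diacritics via Unicode normalization instead of calling I18n.transliterate, and `DEMO_CHAR_FILTER` is a hypothetical, shorter replacement for CHAR_FILTER_REGEXP.

```ruby
# Hypothetical stand-in for CHAR_FILTER_REGEXP; the real constant in
# lib/slug.rb may cover a different set of characters.
DEMO_CHAR_FILTER = /[:\/?#\[\]@!$&'()*+,;=_.~%]+/

# Stand-in for I18n.transliterate (NOT the I18n implementation):
# decompose accented letters (NFD), drop the combining marks, and map
# the Vietnamese letter "đ", which has no Unicode decomposition.
def demo_transliterate(string)
  string.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").tr("đĐ", "dD")
end

# Same order of steps as the modified generator: transliterate,
# turn whitespace into hyphens, filter special characters, downcase.
def demo_slug(string, downcase: true)
  string = demo_transliterate(string)
  string = string.strip.gsub(/\s+/, "-").gsub(DEMO_CHAR_FILTER, "")
  string = string.downcase if downcase
  string
end

puts demo_slug("Xin chào Việt Nam")   # prints "xin-chao-viet-nam"
```

One thing to keep in mind for other languages: this stand-in leaves characters without a decomposition (for example Chinese or Japanese text) unchanged, whereas I18n.transliterate by default replaces characters it cannot map with "?". So scripts that are not Latin-based probably need a different approach.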

https://www.rubydoc.info/gems/activesupport/2.3.17/I18n.transliterate

Setting a Hash in <locale>.yml:

i18n:
  transliterate:
    rule:
      ü: "ue"
      ö: "oe"

So this is the way to customize transliteration through settings. At the end of the function we get Latin-only text (a friendly URL). I tried it with my native language and did not need to define a custom hash.

Is there any way to keep the code change? If I run:

./launcher destroy app && ./launcher start app

the file ./lib/slug.rb reverts to the original code. Sorry, I’m new to the Ruby on Rails environment.
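To make the effect of such a rule table concrete, here is a small sketch in plain Ruby. It does not load anything from a <locale>.yml file and does not use I18n; `DEMO_RULES` and `apply_rules` are hypothetical names that only mirror the two rules from the YAML example.

```ruby
# Hypothetical rule table mirroring the YAML example above.
DEMO_RULES = { "ü" => "ue", "ö" => "oe" }.freeze

# Replace every character that has a rule; leave everything else untouched.
def apply_rules(string, rules = DEMO_RULES)
  pattern = Regexp.union(rules.keys)
  # Normalize to NFC first so precomposed keys like "ü" match the input.
  string.unicode_normalize(:nfc).gsub(pattern, rules)
end

puts apply_rules("München Köln")   # prints "Muenchen Koeln"
```

Characters without a rule pass through unchanged here, so a table like this only needs the letters where the default result is not what you want.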

I’m so sorry for being annoying!

I decided to change the ascii function instead of the encoded function. I don’t know why, but I got different results when using encoded.

What am I doing wrong here? Thank you very much!

  def self.ascii_generator(string)
    #I18n.with_locale(SiteSetting.default_locale) { string.tr("'", "").parameterize }
    string = I18n.transliterate(string, :locale => SiteSetting.default_locale)
    string = string.strip.gsub(/\s+/, "-").gsub(CHAR_FILTER_REGEXP, "")
    string
  end

After playing around, I found that:

  • Never change ascii_generator. It makes Discourse compile templates, themes, and CSS incorrectly.
  • Only use encoded_generator, and maybe hard-code your locale there, keeping the system default locale setting (English).
  • Add the changed slug.rb code to the .yml so that the customized slug.rb is kept every time the app is rebuilt.

Have a nice day!