Message Format support for localization

translation

(Sam Saffron) #1

For the feature I was working on yesterday, @codinghorror wanted a rather complex sentence.

“There is 1 unread and 9 new topics remaining, or browse other topics in [category]”

This seemingly simple sentence was a royal nightmare to localize with our existing localization system. Think through all the permutations:

“There are 2 unread and 9 new topics remaining, or browse other topics in [category]”
“There are 2 unread and 1 new topic remaining, or browse other topics in [category]”
“There is 1 unread and 1 new topic remaining, or browse other topics in [category]”

The trouble with our current system is that there is no sane way of building these kinds of sentences (see: internationalization - Clean pattern for localizing sentences in Rails i18n - Stack Overflow). You can only easily localize one count in a non-compound sentence.


To alleviate this I introduced a new mechanism that is (optionally) available client side. The above sentence is localized using:

There {UNREAD, plural, 
   one {is <a href='/unread'>1 unread</a>} 
   other {are <a href='/unread'># unread</a>}
} and {NEW, plural, 
  one {<a href='/new'>1 new</a> topic} 
  other {<a href='/new'># new</a> topics}} remaining, or browse other topics in {catLink}

The client localization file has a special rule: if a key ends with _MF, it is interpreted as a MessageFormat message. To access it on the client you use:

I18n.messageFormat("topic.read_more_in_category_MF", {"UNREAD": unreadTopics, "NEW": newTopics, catLink: opts.catLink})

You can see a few other examples here:

We do not plan at the moment to move to Message Format style localization everywhere. However, it is nice to have this extra bit of flexibility that lets us generate interesting sentences.


On a technical note, this feature adds almost no weight to the client-side JavaScript: all Message Format strings are pre-compiled into JavaScript functions with no external dependencies. The tricks used can be viewed here: discourse/js_locale_helper.rb at master · discourse/discourse · GitHub
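To give a rough idea of what "pre-compiled into a JavaScript function" means, here is a hand-written sketch of a function that behaves like the message above. This is an assumption for illustration only: the names pluralEn and readMoreInCategory are made up, and the real output of js_locale_helper.rb may look quite different.

```javascript
// Hypothetical hand-written equivalent of a precompiled message.
// The function names are illustrative, not what Discourse generates.

// English plural rule: "one" for exactly 1, "other" for everything else.
function pluralEn(n) {
  return n === 1 ? "one" : "other";
}

// Stand-in for the compiled form of topic.read_more_in_category_MF.
function readMoreInCategory(d) {
  // Each plural block becomes a lookup keyed by plural category;
  // "#" in the message is replaced by the count itself.
  var unread = {
    one: "is <a href='/unread'>1 unread</a>",
    other: "are <a href='/unread'>" + d.UNREAD + " unread</a>"
  };
  var fresh = {
    one: "<a href='/new'>1 new</a> topic",
    other: "<a href='/new'>" + d.NEW + " new</a> topics"
  };
  return "There " + unread[pluralEn(d.UNREAD)] +
    " and " + fresh[pluralEn(d.NEW)] +
    " remaining, or browse other topics in " + d.catLink;
}

console.log(readMoreInCategory({ UNREAD: 1, NEW: 9, catLink: "[category]" }));
```

The point is that nothing here needs a parser or a library at runtime. The message has already been turned into plain string concatenation plus a small plural rule for the locale.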


1 minute Message Format primer

f = "hello"
f() => "hello"

f = "hello {WORLD}"
f(WORLD: "world") => "hello world" 
f(WORLD: "other world") => "hello other world" 

f = "I have {HATS, plural, one {one hat} other {# hats}}"
f(HATS: 1) => "I have one hat"
f(HATS: 10) => "I have 10 hats" 

f = "I am a {GENDER, select, male {boy} female {girl} other {child}}"
f(GENDER: "male") => "I am a boy"
f(GENDER: "female") => "I am a girl"

Our plan for now is to use this strategically. However, it is worth noting that this gives more flexibility in localization. For example, in Czech the plural form is rather interesting, as @kuba could attest:

MessageFormat.locale.cs = function (n) {
  if (n == 1) {
    return 'one';
  }
  if (n == 2 || n == 3 || n == 4) {
    return 'few';
  }
  return 'other';
};

Message Format supports this out of the box:

f = "I have {HATS, plural, one {one hat} other {# hats} few {# few hats}}"
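
As a rough sketch of how the Czech rule and the "few" branch fit together, the snippet below pairs the plural function from above (rewritten as a standalone function) with a hand-written stand-in for the compiled hats message. The names pluralCs and hats are hypothetical, and the English wording is kept only so the selected branch is easy to see.

```javascript
// Czech plural rule, same logic as MessageFormat.locale.cs above,
// written as a standalone function so the snippet runs on its own.
function pluralCs(n) {
  if (n == 1) {
    return "one";
  }
  if (n == 2 || n == 3 || n == 4) {
    return "few";
  }
  return "other";
}

// Hypothetical hand-written equivalent of the compiled hats message.
function hats(d) {
  var forms = {
    one: "one hat",
    few: d.HATS + " few hats",
    other: d.HATS + " hats"
  };
  // The locale rule picks the category, the category picks the text.
  return "I have " + forms[pluralCs(d.HATS)];
}

[1, 3, 10].forEach(function (n) {
  console.log(hats({ HATS: n }));
});
```

With these inputs, 1 selects the "one" branch, 3 selects "few", and 10 falls through to "other".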

(Bill Ayakatubby) #2

I know this isn’t really the point of your post, but I don’t think that’s correct. At least, it seems super-awkward to this American. Is it possible to rewrite the sentence entirely? Maybe something like this:

You have 1 unread topic and 9 new topics left, or browse other topics in [category].

If I were to have written what you have now, I wouldn’t have done any word switching based on pluralization:

There are (x unread and y new) topics

Parentheses added to illustrate my point. It’s the “and” in there that does it. Now if you really want to switch “topic”/“topics”, this gets you closer:

There (is/are) 2 unread topics and 3 new topics

There (is/are) 1 unread topic and 3 new topics

There (is/are) 2 unread topics and 1 new topic

There (is/are) 1 unread topic and 1 new topic

But that leaves an issue of “is”/“are”. One of them probably is correct, but neither of them would look correct. This may help with that, but now we’re getting kind of verbose:

There are 2 unread topics, and there are 3 new topics; or browse other topics in [category].

There is 1 unread topic, and there are 3 new topics; or browse other topics in [category].

There are 2 unread topics, and there is 1 new topic; or browse other topics in [category].

There is 1 unread topic, and there is 1 new topic; or browse other topics in [category].


(Jeff Atwood) #3

Simpler to just drop “There”.

2 unread and 9 new topics remaining, or browse other topics in [category]

I could go either way, but it reads fine to me as-is.


(Marco) #4

You have to be a Ruby programmer to translate this. Do I have an option to test the code somewhere to see if it reads well in my language?


(Sam Saffron) #5

Message Format is not a “you have to be a programmer to translate it” thing.

Just figure out how to translate:

There {UNREAD, plural, 
   one {is <a href='/unread'>1 unread</a>} 
   other {are <a href='/unread'># unread</a>}
} and {NEW, plural, 
  one {<a href='/new'>1 new</a> topic} 
  other {<a href='/new'># new</a> topics}} remaining, or browse other topics in {catLink}

And you should be good.


(Marco) #6

I can understand it (maybe), but I’m thinking of the average non-techy translator. This is code, by the way: an elegant form of a conditional expression with some basic HTML. It is not completely clear which words have to be translated and which do not.

Someone could translate one and other, which I guess are keywords. And if I cannot test the translated block, I’m not sure how it reads. Shouldn’t code be left to coders?

There must be a simpler way. A precompiler, maybe?


(Sam Saffron) #7

This is only used in one spot, and it is pretty much the only way to translate strings that have 20 permutations.