Is there enough interest in the community to merit adding inline markup for language?
This would be helpful in mixed-language communities, and especially in language-learning forums.
For example, something like this:
[lang=ja]…[/lang]
To produce html:
<span lang="ja">…</span>
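To make the proposal concrete, here is a minimal sketch of the conversion in Python. It is not how Discourse processes posts; the function name `render_lang_markup` and the regex for language codes are assumptions made for illustration, and nested [lang] tags are not handled.

```python
import html
import re

# Hypothetical syntax from the proposal above: [lang=xx]...[/lang].
# The code part is restricted to letters, digits, and hyphens (e.g. "ja",
# "zh-Hant"), so it can be placed in an attribute without further escaping.
LANG_TAG = re.compile(
    r"\[lang=([A-Za-z]{2,3}(?:-[A-Za-z0-9]{2,8})*)\](.*?)\[/lang\]",
    re.DOTALL,
)


def render_lang_markup(text: str) -> str:
    """Escape the text, then turn [lang=xx]...[/lang] into <span lang="xx">...</span>."""
    # Escape first so any raw HTML in the post is neutralized; the square
    # brackets used by the markup are left untouched by html.escape.
    escaped = html.escape(text, quote=False)
    return LANG_TAG.sub(r'<span lang="\1">\2</span>', escaped)


if __name__ == "__main__":
    print(render_lang_markup("Compare 直 with [lang=ja]直[/lang]."))
```

Anything that does not match the pattern (for example, an invalid language code) is left as plain text rather than turned into a span.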
The same Unicode character is rendered differently in different languages:
In the image above, the browser renders it differently when it knows the text is Japanese.
I realize there are site-wide and user-specific locale settings, but in a mixed-language post (like on a language learning site), two languages could be used in the same post.
Very interesting. I wonder if we should simply whitelist <span lang="ja"> in core; it seems pretty low risk.
Not sure about the general interest here, this is the first time I have seen this request come up, but totally get that mixed Chinese / Japanese communities exist and they surely need this feature.
That’s certainly a simpler solution.
I guess I would lean toward whitelisting the lang attribute on all Discourse-supported html tags, since the html standard allows it on all html tags.
One other tag set worth whitelisting is ‘ruby’ tags, which are a standard part of html for Japanese language support.
Since there are thousands of kanji (e.g. 漢字) in the Japanese language, Japanese students are still learning them all the way through high school. So, it is common for publications to use “furigana” to mark the pronunciation above kanji (see snapshot below from NHK News website).
I did some testing, and there are side-effects if <ruby> is allowed in thread titles. Post body seems fine, though.
I suppose CSS could hide them in thread titles. That seemed to work fine in tests.
Without additional research, I’m not sure how important it is to support <ruby> in titles. But I do feel like it would be a big boost for the message body.
Anyway, here are some samples (randomly inserted text):
Since ‘lang’ is a global attribute, I would be inclined to whitelist it on any Discourse-supported tags… but I don’t know what the implications of that would be for Discourse’s filtering efficiency.
If there’s good reason for keeping the whitelisting to a minimum, it would still be helpful to support at least one block-level element (e.g. <div>) and one inline element (e.g. <span>).
<ruby>, <rb>, <rt>, and <rp>
And the [lang] attribute on <ruby>, <rb>, and <rt>
(<rp> is just for enclosing parentheses as a fallback for old browsers, so [lang] isn’t useful there.)
Based on the tests I ran above, I’d recommend only whitelisting them in the post body, not in titles. It would cause layout problems in titles, but is okay in the post body.
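As a rough illustration of the whitelist described above, here is a minimal Python sketch built on the standard-library HTML parser. It is not Discourse's sanitizer; the names `ALLOWED_TAGS`, `LANG_ALLOWED_ON`, and `sanitize_body` are hypothetical, and including <span> and <div> is an assumption based on the earlier suggestion. It is meant for the post body only, not titles.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: the ruby tag set plus one inline and one block element.
ALLOWED_TAGS = {"ruby", "rb", "rt", "rp", "span", "div"}
# <rp> is deliberately excluded: it only wraps fallback parentheses.
LANG_ALLOWED_ON = {"ruby", "rb", "rt", "span", "div"}


class BodySanitizer(HTMLParser):
    """Keeps allowlisted tags, keeps only the lang attribute, escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still emitted
        lang = ""
        if tag in LANG_ALLOWED_ON:
            for name, value in attrs:
                if name == "lang" and value:
                    lang = f' lang="{escape(value, quote=True)}"'
        self.out.append(f"<{tag}{lang}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize_body(fragment: str) -> str:
    parser = BodySanitizer()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize_body('<ruby lang="ja">漢字<rp>(</rp><rt onclick="x">かんじ</rt><rp>)</rp></ruby>'))
```

Tags outside the allowlist are dropped but their text is kept, and every attribute other than lang is discarded. A production sanitizer would also need to handle unbalanced tags, which this sketch does not.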
Also, a slightly increased font size makes quite a bit of difference for readability:
before:
after:
ruby {
  font-size: 16px; /* default is 14px from css on 'html' in base.scss */
}
rt {
  font-size: 10px; /* default is 50% in Chrome */
}
@awesomerobot do you want to add some basic styling here? We can't have px-based rules, but something out of the box would be nice for CJK communities.
I made an update to increase the rt font-size from 50% to 72%, which is approximately 10px (based on our default 14px font size).
The font-size in the ruby tag is based off of our base font-size for all content, so it seems like that should be increased along with all site or post text and not on its own.
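For anyone checking the numbers, the percentages above work out as follows. This is just the arithmetic, assuming the 14px base font size mentioned in the thread; the helper name `rt_size_px` is made up for the example.

```python
BASE_FONT_PX = 14.0  # default base font size cited in the thread


def rt_size_px(percent: float, base_px: float = BASE_FONT_PX) -> float:
    """Pixel size of <rt> text when its font-size is a percentage of the base size."""
    return base_px * percent / 100


if __name__ == "__main__":
    print(rt_size_px(50))  # old 50% rule
    print(rt_size_px(72))  # updated 72% rule
```

With a 14px base, 50% gives 7px and 72% gives about 10px. Because the rule is a percentage, the annotation scales with the post text if a site changes its base font size.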