Automatically add a space before the emoji if the user is typing Chinese and uses the emoji picker


Hi Discourse team:

I have been using Discourse for 4 years and love the modern design and quick updates. Thank you!

Recently I have noticed an issue on my forum: in Chinese we seldom use spaces the way you do in English, so basically every emoji inserted from the emoji picker ends up broken like this:


This is annoying and confuses most Chinese users. Can you think of a way to improve the Chinese user experience?

Thank you!


(Jeff Atwood) #2

Any ideas @joffreyjaffeux?

(Sam Saffron) #4

We half fixed this with:

But what we need to do is match the implementation here:

That implementation checks for isSpace or isPunctChar.


Any progress? Got complaints from my users again today ;(

(Stephen Chung) #6

You can do this much more simply by always putting a space before the emoji code when it is selected from the picker.

You can also skip this space if the character immediately before is already a space.
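A minimal sketch of that idea, assuming the picker knows the text and the cursor position. It is not Discourse code; the function name `insert_emoji` and its parameters are hypothetical.

```python
from typing import Tuple


def insert_emoji(text: str, cursor: int, emoji_code: str) -> Tuple[str, int]:
    """Insert emoji_code (e.g. ":smile:") at cursor, adding a leading space if needed.

    Hypothetical helper for illustration only. Returns the new text and
    the new cursor position.
    """
    before, after = text[:cursor], text[cursor:]
    # No extra space at the very start of the text, or when the previous
    # character is already whitespace.
    needs_space = bool(before) and not before[-1].isspace()
    inserted = (" " if needs_space else "") + emoji_code
    return before + inserted + after, cursor + len(inserted)


# Typing 呵 and then picking :t_rex: yields "呵 :t_rex:" rather than "呵:t_rex:".
new_text, new_cursor = insert_emoji("呵", 1, ":t_rex:")
```

Because the check only looks at the previous character, it behaves the same for Chinese and English text.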


Do you mean doing this manually or programmatically?

(Stephen Chung) #8

That will need Discourse core to be modified. Probably somebody can write a simple plugin to do this…

(Sam Saffron) #9

We have a few options here:

  1. Allow 呵:t_rex: so that :t_rex: renders as an emoji. The problem, though, is that detecting Chinese is very expensive (see ischinese.js in alsotang/is-chinese on GitHub), so this would add a bunch of extra slowness to the MD engine.

  2. Fix it so autocomplete behaves consistently with the way the engine interprets it. Meaning 呵:t should not pop open an emoji window.

  3. @schungx's suggestion, where after autocomplete is done we insert a space before the emoji if needed.

  4. Add a “special mode” to Markdown that requires no special punctuation around emoji, so that a:t_rex:a would render as a :t_rex: a.

I feel like the best way forward here is (2) and (4) combined. So:

  • Out of the box, autocomplete does not do any funny business where it pretends to add an emoji and then does not.

  • In Chinese forums, where words are just joined together, we would set SiteSetting.require_punct_before_emoji = false (and have autocomplete respect this).
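A rough sketch of how (2) and (4) could share one rule, so autocomplete and rendering never disagree. This is an illustration in Python, not the Discourse or Markdown engine code: `can_start_emoji` and `find_emoji` are hypothetical names, and the space/punctuation test is only an approximation of the isSpace / isPunctChar checks mentioned earlier in the thread.

```python
import re
import unicodedata
from typing import List

EMOJI_CODE = re.compile(r":([a-z0-9_+-]+):")


def can_start_emoji(before: str, require_punct_before_emoji: bool = True) -> bool:
    """Decide whether a ':' typed after `before` may begin an emoji.

    Hypothetical helper: both the autocomplete popup and the renderer
    would call this, so they always agree.
    """
    if not require_punct_before_emoji or not before:
        return True
    ch = before[-1]
    # Approximation: whitespace, or any Unicode punctuation category (P*).
    return ch.isspace() or unicodedata.category(ch).startswith("P")


def find_emoji(text: str, require_punct_before_emoji: bool = True) -> List[str]:
    """Return the names of emoji codes that would be rendered in `text`."""
    names = []
    for match in EMOJI_CODE.finditer(text):
        if can_start_emoji(text[: match.start()], require_punct_before_emoji):
            names.append(match.group(1))
    return names


default_result = find_emoji("呵:t_rex:")                                    # strict mode
chinese_result = find_emoji("呵:t_rex:", require_punct_before_emoji=False)  # relaxed mode
```

With the setting left at its default, 呵:t would neither open the popup nor render; with it turned off, both would work.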

@misaka4e21 / @tgxworld what are your thoughts here?

(Alan Tan) #10

Yeah, I think a locale-based site setting is the best way to go here.

(Yihan "Misaka 0x4e21" X.) #11

There are two problems:

  • (a) If an emoji is chosen from the emoji picker, it will lack a space before the first colon (:), not only in Chinese posts but in English posts as well. Users will get an unrendered a:smile: if they type a and then select :smile: from the emoji picker.
  • (b) A colon : after a Latin character won’t trigger auto-completion, yet a colon after a CJK Unihan character will. This is exactly what (2) would solve.

To solve (a), (3) should be implemented in the emoji picker, not (or not just) in auto-completion.

(1) and (4) follow a different logic.
To check whether a character is Chinese, we could try:

Though I don’t know whether the regular expression above would perform any better.
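For illustration only, and not the expression referred to above: a minimal sketch of a range-based check in Python, assuming it is enough to cover the basic CJK Unified Ideographs block and Extension A. The names `HAN` and `is_chinese_char` are hypothetical, and a real check may need more ranges.

```python
import re

# Assumption: only these two blocks are covered.
#   U+3400..U+4DBF  CJK Unified Ideographs Extension A
#   U+4E00..U+9FFF  CJK Unified Ideographs
HAN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def is_chinese_char(ch: str) -> bool:
    """Return True if `ch` is a single character inside the ranges above."""
    return bool(HAN.fullmatch(ch))
```

A check like this treats full-width punctuation such as ， as non-Chinese, which may or may not be what the emoji rule wants.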

(Alan Tan) #12

I remember going down the regular expression route in the past, but that means we carry this giant regexp everywhere we go. For Chinese forums, we can just have a site setting that does not require a space before : to render emojis.