Support emoji in CJK context without spaces

In a CJK (Chinese, Japanese, and Korean) language context, an emoji code only renders if it is preceded by a space. For example,

你好:hugs:
displays
你好:hugs:

You have to add a space here:
你好 :hugs:
displays
你好 🤗

Spaces are rarely used in CJK, and most users don’t know about this issue, so they end up with unrendered emoji codes in their posts. It would be really great if you could make a special rule for CJK and emoji :+1:

Also raised a few times before

And a few others. We are open to changing this for CJK.

@nbianca can you add a new site setting require_punct_before_emoji that is enabled by default but disabled for CJK locales?

They can simply insert the actual Unicode emoji via their native OS emoji picker. I am afraid this might be a giant can of worms @zogstrip so I am not sure if we should take it on right now.

This will only affect CJK locales and is one click away to be disabled if needed.

Ok, as long as it can be tightly scoped!

That’s a good point. I use an IM client that lets me insert emoji directly. As long as it’s shown as emoji in the source view, users won’t notice the syntax at all.

Scoping wise I think we need to go with the ULTRA simple… if you see a : assume an emoji is coming up when this is enabled.

At the moment we have:

https://github.com/discourse/discourse/blob/93d4281706ab9a038e67963d659ef582efbf5504/app/assets/javascripts/pretty-text/engines/discourse-markdown/emoji.js.es6#L65-L73

Meaning that a : can only start an emoji if it is preceded by a space or a punctuation character.

The new algorithm for CJK with the client setting enable_inline_emoji_translation would be to remove that restriction. Basically if the site setting is enabled that whole if statement is skipped.
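To make that concrete, here is a rough sketch of the check with and without the setting. It is not the code from the linked file: `canStartEmoji`, its parameters, and the ASCII-only whitespace and punctuation tables are all made up for illustration, and it deliberately uses plain string lookups rather than a regex.

```javascript
// Hypothetical helper for illustration only; not the actual Discourse code.
// ASCII-only tables, assumed here to keep the sketch small.
const WHITESPACE = " \t\n\r\f\v";
const PUNCTUATION = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

// Returns true if the ":" at index `pos` in `text` may open an emoji code.
// `inlineEmojiEnabled` stands in for the site setting discussed above.
function canStartEmoji(text, pos, inlineEmojiEnabled) {
  if (text[pos] !== ":") {
    return false;
  }
  // With the setting on (or at the very start of the text),
  // the preceding-character restriction is skipped entirely.
  if (inlineEmojiEnabled || pos === 0) {
    return true;
  }
  // Default rule: the previous character must be whitespace or punctuation.
  const prev = text[pos - 1];
  return WHITESPACE.includes(prev) || PUNCTUATION.includes(prev);
}
```

With this sketch, `canStartEmoji("你好:hugs:", 2, false)` is false because the colon follows 好, while the same call with the setting passed as `true` returns true.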

The alternative, which I am not too keen on, is adding some kind of regex test (e.g. against the Unicode Kanji code table), because it would impact performance in the Markdown engine.

Even the reverse is hard: it is easy to detect “letter/number”, but then we would also need to deal with Vietnamese and other heavily accented languages, so building this in is just too expensive.

We also need to make sure the emoji autocompleter is aware of this if possible.

Implemented in

https://github.com/discourse/discourse/pull/6669
