ballistic
(Ballistic Tire)
April 5, 2018, 9:11pm
1
PrettyText.markdown("❤️❤️❤️", {})
Will not generate the same code as:
PrettyText.markdown(":heart::heart::heart:", {})
It generates:
=> "<p><img src=\"/images/emoji/apple/heart.png?v=5\" title=\":heart:\" class=\"emoji\" alt=\":heart:\">️:heart:️:heart:️</p>"
(There are two different colon sequences in there, ':️' and ':'; paste the output into https://www.soscisurvey.de/tools/view-chars.php to see the invisible character.)
I think it has something to do with
replacement = "\u200b" + replacement;
in lib/pretty_text/shims.js (on 1.9.4).
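For anyone who would rather not paste into a web tool, the invisible characters can also be made visible locally. This is only a sketch in plain Ruby, not part of PrettyText: the `describe` helper and the sample strings are made up for illustration, and the pasted string is assumed to carry a U+FE0F after each shortcode, as the output above appears to.

```ruby
# Replace every non-ASCII character with a visible <U+XXXX> marker.
# `describe` is a hypothetical helper, not a Discourse API.
def describe(text)
  text.each_char.map do |ch|
    ch.ascii_only? ? ch : format("<U+%04X>", ch.ord)
  end.join
end

# Shortcodes typed by hand: plain ASCII only.
typed = ":heart::heart:"
# Assumed shape of the pasted text: a selector trails each shortcode.
pasted = ":heart:\uFE0F:heart:\uFE0F"
# Unicode hearts as they usually arrive from an emoji keyboard.
hearts = "\u2764\uFE0F" * 2

puts describe(typed)   # :heart::heart:
puts describe(pasted)  # :heart:<U+FE0F>:heart:<U+FE0F>
puts describe(hearts)  # <U+2764><U+FE0F><U+2764><U+FE0F>
```

The two colon sequences then differ visibly: one is followed by `<U+FE0F>` and the other is not.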
1 Like
The control code insertion isn’t happening on latest:
[1] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[2] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
although I do note it behaves differently:
[1] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[2] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
❤️❤️❤️
:heart::heart::heart:
generates:
1 Like
riking
(Kane York)
April 5, 2018, 9:30pm
3
FFEF is not an assigned Unicode character, so I wonder what’s putting it in. (note: it’s not the BOM, that’s FEFF.)
I think locale character encoding is involved.
I don’t know what it means, but U+FFB8 is “Halfwidth Hangul Letter Cieuc”.
1 Like
riking:
it’s not the BOM
That was my initial hunch, a big endian - little endian paste thing. (been there, done that)
If the leading colon isn’t the “beginning of a ‘word’”, Discourse adds a zero-width space, the \u200b.
Looking at bytes, if it is endian related, I don’t see how.
\u200b
01011100 01110101 00110010 00110000 00110000 01100010
\uffef
01011100 01110101 01100110 01100110 01100101 01100110
\uffb8
01011100 01110101 01100110 01100110 01100010 00111000
\uffe2
01011100 01110101 01100110 01100110 01100101 00110010
So I think it must be the regex’s interpretation of what it considers to be a “word”, i.e. locale related.
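One caveat on the bit patterns above: they appear to be the ASCII encoding of the six-character escape text (backslash, u, and four hex digits), not the encoding of the characters the escapes stand for. A small Ruby sketch of the difference, with variable names chosen here purely for illustration:

```ruby
# Single quotes keep the escape as six literal ASCII characters: \ u 2 0 0 b
escape_text = '\u200b'
# Double quotes produce the actual character U+200B (ZERO WIDTH SPACE).
zwsp = "\u200b"

# Bits of the escape text, one group per byte.
escape_bits = escape_text.bytes.map { |b| format("%08b", b) }.join(" ")
# UTF-8 bytes of the real character, in hex.
zwsp_hex = zwsp.bytes.map { |b| format("%02x", b) }.join(" ")

puts escape_bits  # 01011100 01110101 00110010 00110000 00110000 01100010
puts zwsp_hex     # e2 80 8b
```

The first line matches the bits listed for \u200b above, while the character itself is three bytes in UTF-8. Note that e2 is also the low byte of \uffe2, so those \uffXX values may have been single bytes of a multi-byte sequence as displayed by some viewer; that is only a guess.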
2 Likes
ballistic
(Ballistic Tire)
April 6, 2018, 3:57pm
8
This is what I see on the console:
$ echo ":heart:️:heart:️" | hexdump
0000000 3a 68 65 61 72 74 3a ef b8 8f 3a 68 65 61 72 74
0000010 3a ef b8 8f 0a
0000015
Well, if I type it, it will be:
$ echo ":heart::heart:" | hexdump
0000000 3a 68 65 61 72 74 3a 3a 68 65 61 72 74 3a 0a
000000f
(I have a problem formatting this post.)
The first one, with the invisible characters, is 6 bytes longer (ef b8 8f, twice), and it prevents the following emoji from being converted to an image URL.
Maybe OS’s clipboard or terminal changed something, but the problem we want to fix is why unicode hearts cannot be escaped to urls.
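To see what those extra bytes are, the hexdump can be decoded back into characters. A minimal Ruby sketch, assuming the bytes are UTF-8; the hex string is copied from the first dump above, without the trailing 0a newline, and the variable names are only illustrative:

```ruby
# Bytes from the first hexdump, trailing newline (0a) left out.
hex = "3a 68 65 61 72 74 3a ef b8 8f 3a 68 65 61 72 74 3a ef b8 8f"

# Turn the hex pairs into a byte string and label it as UTF-8.
text = hex.split.map { |h| h.to_i(16) }.pack("C*").force_encoding("UTF-8")

# Collect the code points of everything that is not plain ASCII.
non_ascii = text.each_char.reject(&:ascii_only?).map do |ch|
  format("U+%04X", ch.ord)
end

puts text.valid_encoding?  # true
puts text.bytesize         # 20
puts text.length           # 16
puts non_ascii.join(", ")  # U+FE0F, U+FE0F
```

So each ef b8 8f run decodes to a single character, U+FE0F, sitting right after a shortcode.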
ballistic
(Ballistic Tire)
April 6, 2018, 4:02pm
9
What is the best way to debug it? console.log doesn’t work on the server side js.
Please satisfy my curiosity and let me know what the locale is.
I’m guessing it’s something where the typical western concept of what constitutes a “word” doesn’t apply. But if I’m completely off-base it’s the wrong rabbit hole to go down into.
ballistic
(Ballistic Tire)
April 6, 2018, 8:10pm
11
it is from Bitnami Discourse Stack for Virtual Machines
So en_US.
$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
1 Like
Thanks. “Bitnami install” is a completely different rabbit hole.
Not that the Bitnami install is necessarily the cause, but I have seen many posts here dealing with problems it had, so I’m leaning towards thinking it is more a Bitnami thing than Discourse itself. Is there any way to reach out to others with Bitnami installs to see if they have the same issue?
1 Like
ballistic
(Ballistic Tire)
April 6, 2018, 9:57pm
13
@Mittineague Can you reproduce the problem?
Because @supermathie did reproduce it. I feel he is not on bitnami.
1 Like
Yes, 1-2 are supermathie’s results. 5,6,7,8 are mine
[1] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[2] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[5] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[6] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[1] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[2] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[7] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[8] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
sam
(Sam Saffron)
April 6, 2018, 10:32pm
15
@joffreyjaffeux is there some internal bug in the emoji remapper?
1 Like
riking
(Kane York)
April 7, 2018, 12:48am
16
Mittineague:
:️:
Uhhh what’s with the two different colons here?
% echo -n :<fe0f>: | xxd
00000000: 3aef b88f 3a :...:
There’s a U+FE0F VARIATION SELECTOR-16, also known as “force emoji display”, that’s not being picked up by the remapper.
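To illustrate the idea (this is not Discourse’s actual remapper, just a sketch with a hypothetical one-emoji mapping for the heart): a pattern that treats the variation selector as an optional part of the emoji consumes it together with the heart, whereas a naive pattern leaves it behind between the shortcodes.

```ruby
# Hypothetical remappers for a single emoji, U+2764 HEAVY BLACK HEART.
# Neither function is taken from Discourse; both are illustrations only.

# Naive version: replaces the heart but leaves any U+FE0F in place.
def naive_remap(text)
  text.gsub(/\u2764/, ":heart:")
end

# Selector-aware version: an optional U+FE0F is matched with the heart.
def selector_aware_remap(text)
  text.gsub(/\u2764\uFE0F?/, ":heart:")
end

input = "\u2764\uFE0F" * 3

leftover = naive_remap(input).count("\uFE0F")
puts leftover                     # 3 selectors remain in the text
puts selector_aware_remap(input)  # :heart::heart::heart:
```

With the naive version, every shortcode after the first is preceded by an invisible character rather than by a word boundary, which would be consistent with the later hearts not being turned into images.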
1 Like
Your acuity is much better than mine. I still can’t discern a difference even when I look for it.
As to where they came from, I simply copied supermathie’s example puts into my console to see if I could reproduce the results.
ballistic
(Ballistic Tire)
April 13, 2018, 5:41pm
18
Should we move this to the bug category? Or should I create a new topic in the bug category?
sam
(Sam Saffron)
April 14, 2018, 1:58am
19
There is probably a minor issue with the emoji -> image converter, but I am struggling to see the real-world impact of this.
Can you provide an example in a post of where this is an actual issue short of hexedit showing something off?
2 Likes
ballistic
(Ballistic Tire)
April 14, 2018, 4:24pm
20
1 Like
ballistic
(Ballistic Tire)
April 14, 2018, 4:27pm
21
Mobile view, this is another problem.
ballistic
(Ballistic Tire)
April 14, 2018, 4:30pm
22
The real world problem is when people are excited, they use a lot of emojis, and this issue breaks their hearts.