ballistic  
                
                  
                    April 5, 2018,  9:11pm
                   
                  1 
               
             
            
              PrettyText.markdown("❤️❤️❤️", {}) 
Will not generate the same code as:
PrettyText.markdown(":heart::heart::heart:", {})
It generates:
=> "<p><img src=\"/images/emoji/apple/heart.png?v=5\" title=\":heart:\" class=\"emoji\" alt=\":heart:\">️:heart:️:heart:️</p>" 
(The output contains two different colons, ':️' and ':'. Paste it into https://www.soscisurvey.de/tools/view-chars.php to see the difference.)
I think it has something to do with
replacement = "\u200b" + replacement;
In lib/pretty_text/shims.js
( on 1.9.4 )
             
            
            
                
            
           
          
            
            
              The control code insertion isn’t happening on latest:
[1] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[2] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
although I do note it behaves differently:
[1] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[2] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
❤️❤️❤️
:heart::heart::heart:
generates:
             
            
            
            
           
          
            
              
                riking  
              
                  
                    April 5, 2018,  9:30pm
                   
                  3 
               
             
            
              FFEF is not an assigned Unicode character, so I wonder what’s putting it in. (note: it’s not the BOM, that’s FEFF.)
             
            
              
            
           
          
            
            
I think locale character encoding is involved.
I don’t know what it means, but U+FFB8 is “Halfwidth Hangul Letter Cieuc”.
  
  
   
  
    
    
  
  
 
             
            
            
            
           
          
            
            
              
 riking:
 
it’s not the BOM
 
 
That was my initial hunch, a big endian - little endian paste thing. (been there, done that)
If the leading colon isn’t at the “beginning of a ‘word’”, Discourse adds a zero-width space, the \u200b.
  
  
    
    
      
       
     
  
    
    
  
  
 
Looking at bytes, if it is endian related, I don’t see how.
\u200b
01011100 01110101 00110010 00110000 00110000 01100010
\uffef
01011100 01110101 01100110 01100110 01100101 01100110
\uffb8
01011100 01110101 01100110 01100110 01100010 00111000
\uffe2
01011100 01110101 01100110 01100110 01100101 00110010
So I think it must be the regex’s interpretation of what it considers to be a “word”, i.e. locale related.
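For comparison: the bit patterns above are the ASCII bytes of the six-character escape text (a backslash, `u`, and four hex digits), not the encoded characters themselves. Here is a minimal Ruby sketch, using only core String and Array methods, that prints the actual UTF-8 bytes of each code point. The helper names `utf8_bits` and `escape_text_bytes` are made up for illustration, and U+FE0F is included only because its bytes (ef b8 8f) match the hexdump seen elsewhere in this topic.

```ruby
# Hypothetical helper: the UTF-8 bytes of one code point, as 8-bit binary strings.
def utf8_bits(codepoint)
  [codepoint].pack("U").bytes.map { |b| format("%08b", b) }
end

# Hypothetical helper: the bytes of the escape *text* itself.
# Single quotes mean Ruby does no \u processing, so this is six ASCII characters.
def escape_text_bytes
  '\u200b'.bytes
end

# Code points mentioned in this topic, plus U+FE0F for comparison.
[0x200B, 0xFFEF, 0xFFB8, 0xFFE2, 0xFE0F].each do |cp|
  puts format("U+%04X  %s", cp, utf8_bits(cp).join(" "))
end

# Bytes of the literal text backslash-u-2-0-0-b, in hex.
puts escape_text_bytes.map { |b| format("%02x", b) }.join(" ")
```

Seen this way, every code point here encodes to three UTF-8 bytes, so byte order (endianness) does not come into it.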
             
            
            
            
           
          
            
              
                ballistic  
              
                  
                    April 6, 2018,  3:57pm
                   
                  8 
               
             
            
              
This is what I see on the console:
$ echo ":heart:️:heart:️" | hexdump
0000000 3a 68 65 61 72 74 3a ef b8 8f 3a 68 65 61 72 74
0000010 3a ef b8 8f 0a                                 
0000015
If I type it instead, it will be:
$ echo ":heart::heart:" | hexdump
0000000 3a 68 65 61 72 74 3a 3a 68 65 61 72 74 3a 0a
000000f
The first one, with the invisible characters, is 6 bytes longer (0x15 vs. 0x0f bytes in total), and it prevents the following emoji from being converted to an image URL.
Maybe the OS clipboard or the terminal changed something, but the problem we want to fix is why Unicode hearts are not converted to image URLs.
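The same comparison can be reproduced without depending on the clipboard or terminal. This is a rough Ruby sketch using only core String methods; `hex_bytes` is a hypothetical helper, and the pasted string is written with `\uFE0F` escapes on the assumption that the invisible bytes ef b8 8f in the dump are that single character.

```ruby
# Hypothetical helper: render a string's bytes the way a byte-wise hex dump would.
def hex_bytes(str)
  str.bytes.map { |b| format("%02x", b) }.join(" ")
end

pasted = ":heart:\uFE0F:heart:\uFE0F\n" # copied text, two invisible characters
typed  = ":heart::heart:\n"             # the same text typed by hand

puts hex_bytes(pasted)
puts hex_bytes(typed)

# Size difference in bytes and in characters.
puts pasted.bytesize - typed.bytesize
puts pasted.length - typed.length
```

The byte difference comes out as 6 and the character difference as 2, because each invisible character takes three bytes in UTF-8.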
             
            
              
            
           
          
            
              
                ballistic  
              
                  
                    April 6, 2018,  4:02pm
                   
                  9 
               
             
            
What is the best way to debug it? console.log doesn’t work in the server-side JS.
             
            
              
            
           
          
            
            
              Please satisfy my curiosity and let me know what the locale is.
I’m guessing it’s something where the typical western concept of what constitutes a “word” doesn’t apply. But if I’m completely off-base it’s the wrong rabbit hole to go down into.
             
            
              
            
           
          
            
              
                ballistic  
              
                  
                    April 6, 2018,  8:10pm
                   
                  11 
               
             
            
It is from the Bitnami Discourse Stack for Virtual Machines.
So en_US.
$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8 
            
            
            
           
          
            
            
              Thanks. “Bitnami install” is a completely different rabbit hole 
Not that the Bitnami install is necessarily the cause, but I have seen many posts here dealing with problems it had. So I’m leaning towards thinking it is more a Bitnami thing than Discourse itself. Is there any way to reach out to others with Bitnami installs to see if they have the same issue?
             
            
            
            
           
          
            
              
                ballistic  
              
                  
                    April 6, 2018,  9:57pm
                   
                  13 
               
             
            
@Mittineague Can you reproduce the problem?
I ask because @supermathie did reproduce it, and I believe he is not on Bitnami.
             
            
            
            
           
          
            
            
              
Yes, 1-2 are supermathie’s results. 5,6,7,8 are mine
[1] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[2] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[5] pry(main)> puts PrettyText.markdown("❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️</p>
[6] pry(main)> puts PrettyText.markdown(":heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[1] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[2] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
[7] pry(main)> puts PrettyText.markdown("❤️❤️❤️", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:">️:heart:️:heart:️</p>
[8] pry(main)> puts PrettyText.markdown(":heart::heart::heart:", {});
<p><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"><img src="/images/emoji/twitter/heart.png?v=5" title=":heart:" class="emoji" alt=":heart:"></p>
 
            
              
            
           
          
            
              
                sam  
              
                  
                    April 6, 2018, 10:32pm
                   
                  15 
               
             
            
              @j.jaffeux  is there some internal bug in the emoji remapper?
             
            
            
            
           
          
            
              
                riking  
                
                  
                    April 7, 2018, 12:48am
                   
                  16 
               
             
            
              
 Mittineague:
 
:️:
 
 
Uhhh what’s with the two different colons here?
% echo -n :<fe0f>: | xxd
00000000: 3aef b88f 3a                              :...:
There’s a U+FE0F VARIATION SELECTOR-16, also known as “force emoji display”, that’s not being picked up by the remapper.
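To make the invisible character visible, one can list the code points of the input. The sketch below uses only core Ruby String methods. Both helpers are hypothetical: `codepoints_of` is just for inspection, and `strip_vs16` is a pre-processing workaround shown for illustration, not how the Discourse remapper handles (or should handle) the selector.

```ruby
# Hypothetical helper: list a string's code points as U+XXXX labels.
def codepoints_of(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

# Hypothetical workaround: drop every U+FE0F before further processing.
# Assumption: losing the "emoji presentation" hint is acceptable for this input.
def strip_vs16(str)
  str.delete("\uFE0F")
end

# Three hearts, each written as U+2764 followed by U+FE0F
# (assumed to match the pasted hearts in this topic).
hearts = "\u2764\uFE0F" * 3

p codepoints_of(hearts)             # each heart is followed by U+FE0F
p codepoints_of(strip_vs16(hearts)) # only U+2764 remains
```

Whether stripping is appropriate depends on the input, since the selector can change how some characters are displayed; treating the heart plus its selector as one emoji in the remapper would be the more thorough fix.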
             
            
            
            
           
          
            
            
              
Your acuity is much better than mine. I still can’t discern a difference even when I look for it.
As to where they came from, I simply copied supermathie’s example puts into my console to see if I could reproduce the results.
             
            
              
            
           
          
            
              
                ballistic  
              
                  
                    April 13, 2018,  5:41pm
                   
                  18 
               
             
            
Should we move this to the bug category? Or should I create a new topic in the bug category?
             
            
              
            
           
          
            
              
                sam  
              
                  
                    April 14, 2018,  1:58am
                   
                  19 
               
             
            
There is probably a minor issue with the emoji -> image converter, but I am struggling to see the real-world impact of this.
Can you provide an example in a post of where this is an actual issue short of hexedit showing something off?
             
            
            
            
           
          
            
              
                ballistic  
              
                  
                    April 14, 2018,  4:24pm
                   
                  20 
               
             
            
            
            
            
           
          
            
              
                ballistic  
                
                  
                    April 14, 2018,  4:27pm
                   
                  21 
               
             
            
              
Mobile view, this is another problem.
             
            
              
            
           
          
            
              
                ballistic  
              
                  
                    April 14, 2018,  4:30pm
                   
                  22 
               
             
            
              The real world problem is when people are excited, they use a lot of emojis, and this issue breaks their hearts.