Hi there, I just installed Discourse and found out that it doesn’t support the zero-width space at all. It is a special Unicode character used in Khmer as a hidden space to separate words. You can take a look here: http://www.askcambodia.org/t/angelababy/21
Can you give us an example of how it should look?
Not being familiar with Khmer it’s hard to tell what’s going wrong.
It should work; all Unicode is supported. Where are you trying to use this character, specifically?
unless … this is another nokogiri bug …
The OP did not say “entity”, and what was referred to is a Unicode character, not a string representing an HTML entity… so unless it is a strange HTML entity, it should work.
@supermathie the thing is that it replaces the zero-width space with a regular space, for example here:
ប្រទេស កម្ពុជា this one with zero width space
ប្រទេសកម្ពុជា this one with no zero width space
Thanks,
@codinghorror something is turning the zero-width space into a regular space. Do you have any idea?
It is a bug and we will try to find it. In the meantime, perhaps you should look at translating Discourse.
Hi Sam, I just requested the Khmer language on Transifex. I hope it is approved soon so I can invite my team to work on it.
Just making sure @techAPJ sees this ^^
Approved! Thanks for contributing translations.
I am not seeing this bug in our markdown cooking code:
    it 'does not strip zero width spaces' do
      from = "hi\u{200B}\u{FEFF}there"
      cooked = PrettyText.cook(from)
      cooked = cooked.gsub(/<[^>]*>/, "")
      expect(cooked).to eq from
    end
So let me take this one step back: does the preview look right before you post?
@sam Yes, it works fine in the preview. It only breaks the words after posting.
@sam Any update on this?
I am confused about the repro. In your example:
ប្រទេស កម្ពុជា
has a real space in the raw markdown. I need an example I can work with.
@sam, I was able to replicate the issue this way:
I copied a sample from:
and pasted here:
Word Word Word
After each Word you should find one zero-width space (U+200B).
Discourse displayed ‘WordWordWord’ correctly in preview pane (separated with U+200B), but after I posted this, I saw that it got cooked/refreshed and this operation replaced ‘WordWordWord’ with ‘Word Word Word’.
So indeed, Discourse replaces U+200B with a real space.
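For anyone trying to confirm the same thing on their own post: since U+200B is invisible, one way to tell it apart from a real space is to dump the code points of the raw text. This is only a rough sketch in plain Ruby; `codepoints_of` is a made-up helper name, not something from Discourse.

```ruby
# Hypothetical helper (not part of Discourse): list every code point
# of a string in U+XXXX notation so invisible characters become visible.
def codepoints_of(text)
  text.each_char.map { |ch| format("U+%04X", ch.ord) }
end

with_zwsp  = "Word\u200BWord" # separated by a zero-width space
with_space = "Word Word"      # separated by a regular space (U+0020)

puts codepoints_of(with_zwsp).join(" ")
puts codepoints_of(with_space).join(" ")

# A quick yes/no check for the zero-width space itself.
puts with_zwsp.include?("\u200B")
puts with_space.include?("\u200B")
```

Running this against the raw text before posting and against the text after posting should show whether U+200B turned into U+0020 somewhere in between.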
AHA I can see why this is happening
@zogstrip added this …
https://github.com/discourse/discourse/commit/f4208ae83fd43e0cdd663d82a73fabfb65f327bb
This is actually a case of us being too smart for our own good. It was added as a way to stop people from “gaming” edits and bypassing rules. However, it is really not acceptable that we break formatting here, so I patched it with:
https://github.com/discourse/discourse/commit/58c95f64d2887134604b8024afb0d673c48f433f
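To illustrate the general idea (this is not the code from either commit, just a sketch under my own assumptions): a whitespace normalizer that maps "exotic" space characters to a plain space will break Khmer text if U+200B is in its character class, and stops doing so once U+200B is left out. The method names and the exact character classes below are hypothetical.

```ruby
# Hypothetical character classes, chosen for illustration only.
# The first one includes U+200B (zero-width space), the second does not.
EXOTIC_SPACES_WITH_ZWSP = /[\u00A0\u2000-\u200B\u202F\u205F\u3000]/
EXOTIC_SPACES           = /[\u00A0\u2000-\u200A\u202F\u205F\u3000]/

# Replaces every matched character, including U+200B, with a plain space.
def normalize_aggressive(text)
  text.gsub(EXOTIC_SPACES_WITH_ZWSP, " ")
end

# Same idea, but U+200B is left untouched so word breaks stay invisible.
def normalize_preserving_zwsp(text)
  text.gsub(EXOTIC_SPACES, " ")
end

sample = "Word\u200BWord\u00A0Word"

# Zero-width space becomes a visible space: the behaviour reported above.
p normalize_aggressive(sample) == "Word Word Word"

# Zero-width space survives; only the no-break space is normalized.
p normalize_preserving_zwsp(sample) == "Word\u200BWord Word"
```

The trade-off is that leaving U+200B alone reopens a small loophole for invisible edits, which seems acceptable compared with breaking a whole script's word separation.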
@sam So how can I fix it? Or should I wait for another release of Discourse?
Wait for the tests to pass, then update your instance if you are on the tests-passed branch, which is the default in our configs.