Search results are not highlighted in some languages

Hi there! In my installation of Discourse 2.5.0.beta1, searched keywords are not highlighted in the search results when the search is done in a language other than English.
Is it a bug or a feature? :)
Thanks

4 Likes

What is expected here @sam?

2 Likes

We only highlight complete words. This looks like a partial word.

5 Likes

Hi! I expected the matched keyword to be highlighted, like so:

sorry, not in this case:

4 Likes

@vinothkannans can you have a quick look at:

https://github.com/discourse/discourse/blob/4635be10c88cb53779a12eb83daa38bf8101b799/app/assets/javascripts/discourse/lib/highlight-text.js.es6#L13-L16

@smith can you make a post here with the exact Greek word? (I think this is Greek.) This could be a bug in our highlighting JavaScript library.

4 Likes

This is Cyrillic (which is based on Greek, though). Here is an example:

Ибо угодно Святому Духу и нам не возлагать на вас никакого бремени более, кроме сего необходимого:
воздерживаться от идоложертвенного и крови, и удавленины, и блуда, и не делать другим того, чего себе не хотите. Соблюдая сие, хорошо сделаете. Будьте здравы.

4 Likes

https://github.com/discourse/discourse/blob/7b194743d7e10a425086d52c8e8ddd42ce1856f3/vendor/assets/javascripts/highlight.js#L100-L102

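The issue is in the jQuery Highlight plugin. To highlight words, it uses the word-boundary assertion \b in its regex. In JavaScript, \b only treats ASCII letters, digits and underscore as word characters, so it never matches around Unicode words such as Cyrillic ones.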

(?<=[\s,.:;"']|^)UNICODE_WORD(?=[\s,.:;"']|$)

It looks like a possible solution :thinking:
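To make the difference concrete, here is a minimal sketch comparing \b with the lookaround pattern above, using a fragment of the Cyrillic sample posted earlier. The names escapeRegExp and highlightWholeWord and the square-bracket markers are made up for illustration. This is not the plugin's code and not necessarily what the final fix looks like. Note that lookbehind, (?<=...), is an ES2018 feature, so it needs a reasonably recent JavaScript engine.

```javascript
// Fragment of the sample text posted earlier in this topic.
const text = "воздерживаться от идоложертвенного и крови, и удавленины";

// Escape regex metacharacters so the search term is matched literally.
// (Hypothetical helper, not part of the jQuery Highlight plugin.)
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// \b is a boundary between \w and \W, and \w is only [A-Za-z0-9_].
// A space followed by a Cyrillic letter is \W next to \W: no boundary.
function matchesWithAsciiBoundary(haystack, term) {
  return new RegExp("\\b" + escapeRegExp(term) + "\\b", "i").test(haystack);
}

// Delimiter set copied from the pattern proposed above.
const DELIMITERS = "[\\s,.:;\"']";

// Build the lookaround pattern: the term must be preceded by a delimiter
// or the start of the text, and followed by a delimiter or the end.
function wholeWordRegExp(term, flags) {
  return new RegExp(
    "(?<=" + DELIMITERS + "|^)" + escapeRegExp(term) + "(?=" + DELIMITERS + "|$)",
    flags
  );
}

// Hypothetical highlighter: wraps each whole-word match in [ ] instead of
// the <span> a real highlighter would insert.
function highlightWholeWord(haystack, term) {
  return haystack.replace(wholeWordRegExp(term, "gi"), (m) => "[" + m + "]");
}

console.log(matchesWithAsciiBoundary("flesh and blood", "blood")); // true
console.log(matchesWithAsciiBoundary(text, "крови"));              // false
console.log(wholeWordRegExp("крови", "i").test(text));             // true
console.log(wholeWordRegExp("кров", "i").test(text));              // false (partial word)
console.log(highlightWholeWord("и крови, и", "крови"));            // и [крови], и
```

The explicit delimiter list is a trade-off: it keeps the "complete words only" behaviour, but any punctuation not in the list (parentheses, for example) will prevent a match.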

5 Likes

I think @gerhard dealt with this particular issue for Unicode usernames (at least I vaguely recall).

Super happy to see a fix here, but we have to be ultra careful that whatever regex we choose does not have pathologically bad performance in certain cases, for example when the text is very long or otherwise awkward for the regex. Also, I think in Chinese we don’t even care about word boundaries?

6 Likes

I created a new PR with the fix:

https://github.com/discourse/discourse/pull/9163

3 Likes
