Cannot search by Thai tag

Discourse version: 2.6.0.beta1

These are example posts that have a tag in Thai.

When I go to the search page and search for a tag made of Thai characters with no extra vowel marks, the number of results matches the count shown by the tag filter.
image

However, when I search for a tag that includes an extra vowel mark, no results are found, even though the tag filter shows 17 posts.
image
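To make the difference between the two searches easier to see, here is a minimal Python sketch that lists the Unicode code points in a Thai word. The two words are illustrative examples chosen for this sketch, not the actual tags from the screenshots, and `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode category, character name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in text
    ]


# Illustrative words only (assumption: the real tags are in the screenshots).
plain = "\u0e20\u0e32\u0e29\u0e32"  # "ภาษา": consonants and an inline vowel
marked = "\u0e14\u0e35"             # "ดี": a consonant plus a vowel sign above it

for label, word in (("plain", plain), ("marked", marked)):
    print(label)
    for row in describe(word):
        print("  ", row)
```

Every character in the first word is in category `Lo` (letter), while the vowel sign in the second word is in category `Mn` (nonspacing mark). Code that treats only letters as part of a word would handle the two cases differently.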

4 Likes

Hi K. @siriwatknp and Sawatdee from Thailand,

My guess is that the app's search engine has problems with almost all Thai vowels, the (4) tone marks, and the (5) diacritics.

Reference:

man20090100461
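As a rough illustration of this guess, the sketch below splits text into search tokens in two ways: one treats only letters and digits as word characters, the other also keeps combining marks. This is not Discourse's actual search code. The `tokens` function and its `keep_marks` parameter are hypothetical names, and the sample tag is an assumed example.

```python
import unicodedata


def tokens(text, keep_marks):
    """Split text into runs of word characters.

    Letters (L*) and numbers (N*) always count as word characters.
    Combining marks (M*) count only when keep_marks is True.
    """
    out, current = [], []
    for ch in text:
        cat = unicodedata.category(ch)
        is_word = cat[0] in ("L", "N") or (keep_marks and cat[0] == "M")
        if is_word:
            current.append(ch)
        elif current:
            # A non-word character ends the current token.
            out.append("".join(current))
            current = []
    if current:
        out.append("".join(current))
    return out


tag = "\u0e14\u0e35"  # "ดี": consonant + vowel sign (assumed example tag)

print(tokens(tag, keep_marks=False))  # the vowel sign is dropped from the token
print(tokens(tag, keep_marks=True))   # the tag survives as a single token
```

With `keep_marks=False` the term that reaches the index no longer equals the tag, which would explain an empty result. Whether this is what happens inside Discourse is an assumption that would need checking against the search code.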

1 Like

Any suggestions for a workaround for this issue?

My first suggestion is to search the net for similar documented issues and find out how other search engines manage this layer of complexity. Then look at the Discourse code and see what changes might be required to improve its search algorithm.

OBTW: Have you tried other Thai char sets in your browser keyboard settings?

Note:

From a quick search, I see that some experts have proposed the “Two-Pass Search Algorithm” approach:

6 Conclusion

We have presented a discriminative learning approach for Thai morphological analysis. We consider Thai morphological analysis as a search problem. We propose the two-pass search algorithm that finds the most likely path in the expanded search space. The objective of our algorithm is to increase the coverage of word hypotheses based on probability estimation in the lattice. The experimental results on ORCHID corpus show that the two-pass search algorithm can improve the performance over the standard search approach.

See also: Computers and the Thai Language

This article explains the history of Thai language development for
computers, examining such factors as the language, script, and
writing system, among others. The article also analyzes characteristics
of Thai characters and I/O methods, and addresses key issues involved
in Thai text processing. Finally, the article reports on language
processing research and provides detailed information on Thai
language resources.

2 Likes

@siriwatknp Can you provide me with the post text and the search term as text so that I can try to reproduce the problem locally?

7 Likes

@siriwatknp Just saw that you submitted a PR to fix this issue :slight_smile: The PR looks good to me and has been merged.

https://github.com/discourse/discourse/pull/10488

5 Likes