My first suggestion is to search the web for similar documented issues and see how other search engines handle this layer of complexity, then look at the Discourse code to work out what changes might be needed to improve its search algorithm.
OBTW: Have you tried other Thai character sets in your browser's keyboard settings?
Note:
From a quick search, I see that some experts have proposed a “Two-Pass Search Algorithm” approach:
6 Conclusion
We have presented a discriminative learning approach for Thai morphological analysis. We consider Thai morphological analysis as a search problem. We propose the two-pass search algorithm that finds the most likely path in the expanded search space. The objective of our algorithm is to increase the coverage of word hypotheses based on probability estimation in the lattice. The experimental results on ORCHID corpus show that the two-pass search algorithm can improve the performance over the standard search approach.
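To make the "search problem" idea in that quote a bit more concrete: Thai is written without spaces between words, so a search engine first has to decide where the words are. Below is a minimal sketch of a plain single-pass, Viterbi-style search for the most likely path through a lattice of dictionary words. It is not the paper's two-pass algorithm and it is not Discourse code. The `WORD_PROBS` table, its probability values, and the `segment` function are all made up for illustration.

```python
import math

# Toy unigram probabilities (hypothetical values, for illustration only).
WORD_PROBS = {
    "ตา": 0.2,
    "กลม": 0.2,
    "ตาก": 0.1,
    "ลม": 0.3,
}


def segment(text, word_probs=WORD_PROBS, max_len=10):
    """Return the most likely word segmentation of text, or None if no path exists."""
    n = len(text)
    # best[i] = (log-probability, start index of last word) for the best path covering text[:i]
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            # Only extend positions that are reachable through dictionary words.
            if word in word_probs and best[start][0] > -math.inf:
                score = best[start][0] + math.log(word_probs[word])
                if score > best[end][0]:
                    best[end] = (score, start)

    if best[n][0] == -math.inf:
        return None  # the toy dictionary cannot cover the whole string

    # Follow the back-pointers from the end of the string to recover the words.
    words, pos = [], n
    while pos > 0:
        start = best[pos][1]
        words.append(text[start:pos])
        pos = start
    return words[::-1]


print(segment("ตากลม"))
```

The string "ตากลม" can be split as ตา + กลม or as ตาก + ลม, and the sketch picks whichever path has the higher combined probability under the toy table. A string that the dictionary cannot fully cover returns `None`, which is the coverage gap the quoted paper says its expanded search space is meant to address.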
See also: Computers and the Thai Language
This article explains the history of Thai language development for computers, examining such factors as the language, script, and writing system, among others. The article also analyzes characteristics of Thai characters and I/O methods, and addresses key issues involved in Thai text processing. Finally, the article reports on language processing research and provides detailed information on Thai language resources.