Chinese search doesn't work for some words

Hi all,
I've found that some Chinese words cannot be found by search, and I don't know why. Any ideas?

The screenshot below is from this site. You can view the topic at https://meta.discourse.org/t/discourse/6780?source_topic_id=13287

You have to enable the "search tokenize chinese japanese korean" site setting and then rebuild the search index, for example by editing the post.

This should be enabled by default in site settings if you have a Chinese locale selected.

Thanks, Sam. I'm not sure how to rebuild the search index, so I will do some research on it.

If memory serves:

cd /var/discourse
./launcher enter app
rake search:reindex

Thanks, Stephen. Do you know how to do it in a development environment?

You can run rake from ./bin/docker

I can’t be more specific without details of your dev environment install.
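As a rough sketch only, assuming you are in the root of the Discourse source checkout: the helper below (reindex_command is a made-up name) prints the reindex command to use. It picks the ./bin/docker/rake wrapper when that script is present and otherwise falls back to a plain bundle exec rake, which is an assumption about a non-Docker dev setup.

```shell
# Sketch: choose how to invoke the search:reindex rake task in development.
# Assumption: run from the root of the Discourse source tree.
# reindex_command is a hypothetical helper name, not part of Discourse.
reindex_command() {
  if [ -x ./bin/docker/rake ]; then
    # Docker-based dev environment: use the wrapper script under ./bin/docker
    echo "./bin/docker/rake search:reindex"
  else
    # Assumed fallback for a native (non-Docker) dev install
    echo "bundle exec rake search:reindex"
  fi
}

# Print the command so you can check it before running it.
reindex_command
```

Once the printed command looks right for your setup, run it directly, or use eval "$(reindex_command)".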

It still doesn't work on my Discourse instance.
I found an example on this site:
https://meta.discourse.org/t/discourse/6780?source_topic_id=13287

The topic above contains some Chinese words, but searching for some of them returns no results.

Going by the text as it appears in the screen captures, it looks like missed

@fantasticfears any tips here? Is the segmenter broken here?

This screen capture can be clearer.

Did you enable "search tokenize chinese japanese korean", or set the default locale to zh_CN? Meta probably doesn't have those settings enabled. Mind trying a search here? https://master1.discoursecn.org/

@sam The segmenter works for me. It must be something else.

I tried some words on your site. It still doesn't work.

As the screenshot below shows, the results do not include the topic I was viewing, although they should. The search does not return all of the correct results.

I'm not really sure what is broken, since both texts come back with the expected segmentation.

Can you update cppjieba_rb to 0.3.3? @sam

I'm pasting some of the missed words here. If you copy them and search for them in Discourse, you will see that the expected search results are missing.

For this site: (the results should include this page: https://meta.discourse.org/t/discourse/6780?source_topic_id=13287)
指南发布

For Erick Guan's site (https://master1.discoursecn.org/):
重启服务器 (search results should include: 关于重启服务器需要重建容器以及vps搬家的疑问 - 支持 - Discourse中文论坛)
