Improve word_count calculation for CJK posts, or use char_count

Hmm, if we are smart about our pipeline we could use cppjieba.

It would require update_index! to take care of this:


Char count is probably the simplest thing though, given that reading the word "bla" is far faster than reading "supercalifragilisticexpialidocious".

I wonder if you can make a PR that changes this so we lean on char count? Then we could divide char count by, say, 4 for English and 2 for Chinese (via some setting).
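
To make the char-count idea concrete, here is a rough sketch in Python (not the actual codebase). The names `CHARS_PER_WORD` and `estimated_word_count` are made up for illustration, the divisors 4 and 2 are just the ballpark figures from above, and skipping whitespace when counting characters is my own assumption. In practice the divisor would come from a site setting rather than a hard-coded table.

```python
# Hypothetical per-locale divisors: roughly how many characters make one "word".
# 4 for English and 2 for Chinese are ballpark guesses, not measured values.
CHARS_PER_WORD = {"en": 4, "zh": 2}
DEFAULT_CHARS_PER_WORD = 4


def estimated_word_count(text: str, locale: str = "en") -> int:
    """Approximate a word count from the number of non-whitespace characters."""
    # Assumption: whitespace does not contribute to the character count.
    char_count = sum(1 for ch in text if not ch.isspace())
    divisor = CHARS_PER_WORD.get(locale, DEFAULT_CHARS_PER_WORD)
    # Ceiling division, so any non-empty text counts as at least one word.
    return -(-char_count // divisor)
```

With this, a long word weighs more than a short one, and CJK text no longer collapses to a handful of "words" just because it has no spaces.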

@lindsey this is an interesting topic for you.
