Is Discourse search immune to typos and how does it work with several words?

For instance, will the Discourse search engine find “John Max Dolittle” with the following queries?

  • hohn <== typo
  • john dolittle <== missing string
  • john mx doelitle <== fuzzy

I was alerted by this post: “Discourse search is awfully unsmart!”
But it is quite old (2017). Have things improved since then?

It seems that the Algolia plugin is no longer official?

1 Like

AI-based search is immune to typos, but it is not fast:

However, the technique used means that it is a bit slower, because we need to expand the term using an LLM prior to looking for similarity.

General search stems words using Snowball: Snowball Stemmer - NLP - GeeksforGeeks

It catches some typos, but only as a side effect of stemming. We are not using Metaphone or other sophisticated typo-correction techniques, and there is nothing simple built into Postgres for that.
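To make the "side effect" point concrete, here is a rough sketch. It is not the real Snowball algorithm and not Discourse's code: `toy_stem` is a made-up suffix stripper, and `matches` assumes every query word must match some stem in the document. Both names are hypothetical and only there for illustration.

```python
# Toy illustration only: a crude suffix stripper standing in for a real
# Snowball stemmer. The suffix list and the "all query words must match"
# rule are assumptions made for this sketch.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Lowercase the word and strip one common suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query: str, document: str) -> bool:
    """True when the stem of every query word also appears among the document stems."""
    doc_stems = {toy_stem(w) for w in document.split()}
    return all(toy_stem(w) in doc_stems for w in query.split())


name = "John Max Dolittle"
print(matches("john dolittle", name))     # True: a missing word is fine
print(matches("hohn", name))              # False: the stem "hohn" differs from "john"
print(matches("john mx doelitle", name))  # False: no fuzzy matching at all

# A typo that happens to stem to the same root is "caught" by accident:
print(matches("searchs forum", "Searching the forums"))  # True
```

So with stemming alone, a query that leaves words out can still match, but a misspelling only matches when it happens to reduce to the same stem as the correct word.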

The Algolia plugin remains supported and official: Discourse Algolia Search

5 Likes

FWIW the Algolia search plugin is still official :+1:

(If you’re hosted by us, it’s available on the Enterprise plans)

2 Likes