Is full page semantic search only in English?

Full page semantic search… does it only work in English? And does it need some Rails magic to make its life easier?

It could work, provided you switch the embeddings model to the multilingual one. I have not tested it, but theoretically it should work.

I just started to wonder because in most cases it can’t offer anything, and when there are search results, they are irrelevant big time.

What model are you using for embeddings?
Have you generated embeddings for all topics?
What model are you using for HyDE Search?

  • text-embedding-ada-002
  • as far as I know yes
  • gpt-3.5-turbo

I’ve done a little bit of testing. Sorry, not very consistently, more in the style of a hare caught in a car’s headlights.

It can definitely handle Finnish too. I think the more fundamental issues are with AI and minor languages. And users.

First of all, OpenAI doesn’t have enough material to handle Finnish, and I’m sure the same goes for every language where there isn’t enough material for the AI to steal, sorry, use for learning. That makes semantics much more difficult than other kinds of questions, and those are already really difficult for ChatGPT in any language other than English or the other major ones.

It looks like GPT-4 is more accurate than GPT-3.5-turbo. With 3.5, the hits were just noise perhaps 8 times out of 10, and Discourse could have offered those 2 right ones purely from tags anyway, whereas GPT-4 had something like a 50% success ratio. And yes, those are Stetson statistics, i.e. pulled out of a hat.

Creating a search where the semantic approach is… helpful is actually quite difficult. For me anyway, because I had expectations about what I should get. So it is not only a matter of real semantic searches, but more or less of searching with an inexact search sentence over a list of search terms created from that sentence. Yes, I know, that is a semantic search too.
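To make the "inexact sentence turned into search terms" idea a bit more concrete, here is a rough sketch of how a HyDE-style search could work. It is only an illustration, not how Discourse actually implements it: `embed` is a toy bag-of-words stand-in for a real embeddings model, `hypothetical_post` fakes the LLM step with a canned answer, and the topic titles and bodies are made up.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def hypothetical_post(query):
    """Stand-in for the LLM step (assumption): a real setup would ask the
    model to write a post that answers the query. Here it is a canned reply."""
    canned = {
        "site feels slow":
            "the forum is slow because the server has too little ram and cpu",
    }
    return canned.get(query, query)

def semantic_search(query, topics):
    """Embed the hypothetical post instead of the raw query, then rank topics."""
    query_vec = embed(hypothetical_post(query))
    scored = [(cosine(query_vec, embed(body)), title)
              for title, body in topics.items()]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]

# Made-up topics, for illustration only.
topics = {
    "Server sizing": "how much ram and cpu does the server need",
    "Theme colours": "changing colours of a default theme",
}

results = semantic_search("site feels slow", topics)
print(results)
```

The raw query "site feels slow" shares no words with either topic, so matching it directly would find nothing. The generated post is what bridges the gap, which also means that the quality of the hits depends on how well the model writes in the language of the forum.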

My very weak point is that the semantic component works as it should, and the issues come from the limitations of AI itself and from users’ too-high expectations. A language other than English is not an issue per se.

But…

Semantic full page search is awfully slow. Am I right to blame the technical weakness of my VPS (not enough RAM, magical creatures, etc.)? Because here it is fast.

Secondly… can we at some point offer AI hits as the default, over those generated by Discourse?

Just to keep things and topics together: I was very wrong. This has nothing to do with 3.5 and 4. The reason was how semantic search behaves on mobile. It starts searching after three characters, and at that point the result is very wrong. When the advanced filter is opened, or the search button is clicked if I remember right, the AI does a new search and updates the results, and then the “hit ratio” is closer to right.