Getting a lot of no results for semantic search

I’m having a hard time getting the semantic AI search to return any results, so I wanted to check whether anyone has ideas about what could be wrong.

If I take a simple example and search for “shopify”, you can see I get plenty of normal results, but no AI results. When I compare this to searching for “shopify” in the Discourse Meta community, you can see I get many normal results and many AI results. I’ve tried more complex and specific questions, but I still end up with no AI results.

As for our setup, we are using gpt-4o-mini, and it is correctly set up under LLMs. The “AI embeddings semantic search enabled” setting is turned on. We are using “text-embedding-ada-002” for the embeddings.



Thanks for reporting; the team will have a look!

Thanks Sam! I wanted to make sure I wasn’t missing something obvious here since it seems to be working better on your own site.

The only thing that comes to mind is that we may not be done backfilling embeddings on your site. We will have a look.

Hey @tyler.lamparter,

At first glance, one issue I found with your site’s current config is that, while you are using text-embedding-ada-002, you had filled in the embedding prompt settings, which that model does not support. I removed the instructions you had set there and regenerated the embeddings on your site.

I’m also updating the tooltip on those settings to try to avoid this confusion going forward.

Other than that, I tried a search for “shopify integration”, and the hypothetical search document it generated is aligned with what we expect.

Can you try searching now and share your experience?

@Falco this seems to be working much better, thank you! I had added the prompts under the embedding configuration in an attempt to improve it, but of course it had no effect.


@Falco maybe I spoke too soon. Whenever I search now, the AI search always returns exactly 40 results, regardless of what I search for. Many of the results are also not that relevant (it suggests the “about this category” topic, for example).




Let me try with a different embeddings model. Will report back in ~1h.


What would be the ideal topic result for this query?

I would expect 0 search results and 0 AI results in that particular case. We support ES6/ECMAScript 2015 JS (yes, very old), but it isn’t mentioned in any community post yet.

Ohhhhh I see. This won’t work in this case.

The way our current AI search works, it:

  • Takes user input
  • Creates a new post about it taking into account the forum description
  • Returns the most semantically similar topics to it
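
The three steps above can be sketched roughly as follows. This is a toy illustration, not the Discourse AI implementation: `embed`, `write_hypothetical_post`, and `semantic_search` are hypothetical names, the "embedding" is a plain bag-of-words vector standing in for a real model such as text-embedding-ada-002, and the hypothetical post is built from a fixed template where the real system would call the LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embeddings model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def write_hypothetical_post(query, forum_description):
    """Step 2 (assumption: a fixed template replaces the LLM call)."""
    return (
        f"On {forum_description}: question about {query}. "
        f"How do I set up {query}?"
    )


def semantic_search(query, topics, forum_description, limit=40):
    # Step 1: take the user input (query).
    # Step 2: turn it into a hypothetical post and embed that post.
    query_vector = embed(write_hypothetical_post(query, forum_description))
    # Step 3: rank every topic by similarity and keep the closest ones.
    ranked = sorted(
        topics,
        key=lambda title: cosine_similarity(query_vector, embed(title)),
        reverse=True,
    )
    return ranked[:limit]


topics = [
    "Password reset email not arriving",
    "About this category",
    "Shopify integration setup guide",
]
results = semantic_search(
    "shopify integration", topics, "a support forum for an e-commerce app", limit=2
)
print(results)
```

Note that the ranking step always fills the result list up to `limit`, so a loosely related topic can still appear after the genuinely relevant one.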

There is no distance threshold at which we cut off the search, because finding a general threshold that works across thousands of Discourse instances is non-trivial. This is discussed at Setting a similarity threshold for semantic search.
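
To make the effect of a missing cutoff concrete, here is a minimal sketch that compares plain top-k ranking with a thresholded variant. The function names, the similarity scores, and the `min_similarity=0.5` value are all made up for illustration; no such setting is described in this thread, and a value that works for one forum and embeddings model may not work for another.

```python
def top_k(scored_topics, limit=40):
    """Return the `limit` best-scoring topics, however weak the scores are."""
    ranked = sorted(scored_topics, key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]


def top_k_with_cutoff(scored_topics, limit=40, min_similarity=0.5):
    """Hypothetical variant: also drop anything below a similarity threshold."""
    return [pair for pair in top_k(scored_topics, limit) if pair[1] >= min_similarity]


# 100 topics that are all weak matches for the query (scores are invented).
weak_matches = [(f"topic {i}", 0.10 + 0.001 * i) for i in range(100)]

print(len(top_k(weak_matches)))              # the list is always filled up to the limit
print(len(top_k_with_cutoff(weak_matches)))  # nothing clears the cutoff
```

With only weak matches available, the first function still returns a full list, while the second returns nothing. The hard part is choosing a cutoff that behaves sensibly on every site.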

We are looking into releasing a new approach that will do a more standard LLM + RAG search and return a conversational response, where you can tweak the prompt to say “no results found”. This is coming in the next few weeks; I will ping you here when you can test it.


That would be great. That’s what I was attempting to do by adding the prompts to the embeddings section. Since you can’t set a threshold now, is that why we’re always seeing about 40 results?

Yes, exactly.

Today, the AI search works like a failover system when the standard search returns no results, acting like an overpowered synonyms dictionary. However, in your case, the search query doesn’t have any possible results in the entire forum, which isn’t the use case it was designed to address.

Stay tuned for the next version of the search; it will be closer to what you want.


Am I good to set our embedding model back to text-embedding-ada-002?
