Differences in search latency between AI semantic and keyword search

Is there any data on latency for Semantic search and Semantic Related Topics vs. keyword search and Suggested Topics?

Thanks in advance.

Can you expand on what you mean by latency here?

For Related Topics, since every embedding is pre-calculated, there is no extra runtime cost. Quite the contrary: the SQL query that finds related topics is faster than our old Suggested Topics query, and we cache related topics for even faster performance.
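To illustrate why pre-calculated embeddings make this cheap, here is a minimal Python sketch with toy data; the `related_topics` helper and the in-memory vectors are hypothetical stand-ins, not Discourse's actual implementation (which does this lookup in SQL). Once vectors are stored at write time, finding related topics at read time is just a nearest-neighbour lookup:

```python
import numpy as np

# Toy data: embeddings are computed ONCE when a topic is created/edited,
# then stored — nothing model-related runs at read time.
rng = np.random.default_rng(0)
topic_ids = np.array([101, 102, 103, 104])
embeddings = rng.normal(size=(4, 8))                        # pre-calculated
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def related_topics(topic_id, n=2):
    """Return the n nearest topics by cosine similarity (hypothetical helper)."""
    query = embeddings[topic_ids == topic_id][0]
    sims = embeddings @ query                # dot product = cosine (unit vectors)
    order = np.argsort(-sims)                # best match first
    return [int(t) for t in topic_ids[order] if t != topic_id][:n]
```

The key point is that `related_topics` touches no model at all, which is why it can be faster than the old Suggested Topics query and is also trivially cacheable.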

As for AI Search, our current HyDE[1] approach incurs serious latency, which is why it runs asynchronously: the user is presented first with the standard search results, plus the option to augment them with AI results once those are ready. Here on Meta, the AI search results are ready 4 seconds after the normal search results, on average.


  1. GPT-4: HyDE stands for Hypothetical Document Embeddings, a technique used in semantic search to find documents based on similarities in their content. This approach enables more precise and contextually relevant search results by assessing the conceptual similarities between documents, rather than relying solely on keyword matching. It represents a zero-shot learning technique that combines GPT-3’s language understanding capabilities with contrastive text encoders, enhancing AI’s ability to comprehend and process natural language data in a more nuanced and effective manner. ↩︎
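To make the latency trade-off concrete, here is a hedged Python sketch of the HyDE flow described above. `generate_answer` and `embed` are hypothetical stubs standing in for an LLM call and a text encoder; this is not Discourse's actual code, just the shape of the technique:

```python
import hashlib
import math

def generate_answer(query):
    # Stub: a real system would ask an LLM to write a hypothetical
    # document that answers the query. This is the slow step — a full
    # LLM round-trip — and the main source of HyDE's latency.
    return f"A plausible answer to: {query}"

def embed(text):
    # Stub: deterministic toy embedding derived from a hash, standing
    # in for a real contrastive text encoder.
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255.0 for b in digest[:8]]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def hyde_search(query, corpus):
    """Rank docs against the embedding of a *hypothetical answer*,
    not the raw query — the core idea of HyDE."""
    hypothetical = generate_answer(query)        # slow: LLM generation
    qvec = embed(hypothetical)                   # fast: one embedding
    cos = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sorted(corpus, key=lambda doc: cos(qvec, embed(doc)), reverse=True)
```

The generation step is what the thread's 4-second average is paying for, which is why running it asynchronously behind the regular results makes sense.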


Exactly what I was looking for. Thanks Falco.

Has there been any investigation re: ways to reduce that latency for Semantic search?

The first version of AI Search had much better latency, but also much worse results.

As for the next version, we have several plans to reduce latency:

  • Use post level embeddings instead of topic level embeddings

  • Use a re-ranker model to sort search results

  • Make HyDE optional
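
The re-ranker item above can be sketched in a few lines. `score_pair` below stands in for a cross-encoder-style model that scores each (query, document) pair jointly; the toy word-overlap scorer is purely illustrative, not the model Discourse plans to use:

```python
def rerank(query, candidates, score_pair):
    """Sort candidate results by a pairwise relevance score.

    `score_pair(query, doc)` is a hypothetical stand-in for a
    re-ranker model scoring each (query, document) pair.
    """
    return sorted(candidates, key=lambda doc: score_pair(query, doc), reverse=True)

# Toy scorer: count of words shared between query and document.
def overlap_score(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = ["tuning search latency", "gardening tips", "semantic search latency data"]
rerank("search latency", docs, overlap_score)
# → ['tuning search latency', 'semantic search latency data', 'gardening tips']
```

A real cross-encoder is more expensive per pair than embedding lookup, but it only runs over a short candidate list, so the overall latency can still drop when it replaces slower stages like HyDE generation.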

We believe this will give us better search results and make search faster in the process. Paired with the new hardware we are offering at no additional cost to all our hosted customers, capable of embedding inference in just 2 ms, we are just getting started with what’s possible here.


Cool. Thanks for the insight Falco.

A couple more questions as we’re looking at turning this on for our communities.

  1. It appears that when you toggle the switch to show Semantic search results, what is displayed to the user is a mix of results from the Semantic search API and the keyword search API. Is that correct? If so, how are those two sets of results ranked against each other?
  2. Relatedly, can you comment on how the Sort by options work with Semantic results? I notice, for example, an article that has the star icon next to it under one sort order but not under another.




Yes, exactly.

They are merged using a technique called “reciprocal rank fusion”. We may switch to a re-ranker in the future.
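For readers unfamiliar with it, reciprocal rank fusion can be implemented in a few lines. This is a generic sketch of the standard algorithm, not Discourse's client code; `k=60` is the damping constant from the original RRF paper:

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Merge several ranked result lists into one.

    Each document's fused score is the sum of 1 / (k + rank) over
    every list it appears in; documents ranked highly in multiple
    lists rise to the top.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword = ["t1", "t2", "t3", "t4"]     # keyword search ranking
semantic = ["t3", "t1", "t5"]          # semantic search ranking
reciprocal_rank_fusion([keyword, semantic])
# → ['t1', 't3', 't2', 't5', 't4']
```

Note that RRF only needs the two rank orderings, no comparable relevance scores, which is exactly why it suits merging results from two unrelated search back ends.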

Semantic search is incompatible with sort options, since we don’t have any distance-cutoff calculation. It’s supposed to be disabled/blocked any time the sort order is not relevance.


Cool, thanks Falco. From what we’re seeing, the Semantic search API provides semantic search results only to the client. So, presumably, the Reciprocal Rank Fusion is happening on the client. Is that correct? Also, would we have the option of swapping out that re-ranking algorithm if we wanted to experiment with different approaches ourselves?


Yes, exactly.

Technically, since it’s all client-based, you could override this.

That said, in the long run I see us relying more and more on re-ranker models, which will all be server-side for obvious reasons.

Got it. Thank you!
