Add search synonyms

I have been reading a lot here lately and see that ‘post’ and ‘reply’ seem to be used somewhat interchangeably.

If search treated them as synonyms, there would be fewer bothersome questions from people who searched with the wrong one of the two before opening a new topic (LOL, it just happened to me: ‘delete post after’ did not produce the same results as ‘delete reply after’…)

Hence my topic question…

1 Like

Reply and post are not 100% interchangeable. In most usage we see here on Meta they are, but not always.

I’d suggest reviewing Discourse New User Guide, which describes what a post is. A reply is any post that is not the OP.

5 Likes

But I would rather find what I am searching for even if I do not know the correct terminology.

For those more ‘in the know’, would they not still have the option of doing explicit searches with quotes around their explicit term of interest, for example “reply” :question:

Thanks, I will read that but do many other people read that before they make new topics here?

So, I read the ‘Discourse New User Guide’ and I am unable to find any explicit definition of ‘reply’.

But as I have quoted you above, a ‘reply’ is necessarily a ‘post’, so when someone searches for ‘post’ all ‘reply’ matches should also be presented…

Whether a search for ‘reply’ should bring up all ‘post’ entries is also unclear after reading that guide.

So, I would still like to see the request in this topic’s title acted upon (but again, that is only my opinion).

A reply is necessarily a post, but some posts are not replies, so searching on post should not automatically add the reply search term.

If your preference is satisfied then it will annoy other users like myself who are only searching for post and not reply.

3 Likes

But you are obviously ‘in the know’ and would likely just use an explicit search term without bothering people here with a new topic about why so many search results for ‘post’ are showing up in your ‘reply’ searches.

Regardless of the semantics of post/reply — adding synonyms to search isn’t something that can be configured in Discourse at the moment.

9 Likes

Ok, that shuts me up :wink: but perhaps there should be a way to add them, I predict it could lessen the burden on the good people who respond to newbies on this great forum :slight_smile:

Actually, I do general searches and then follow relevant links that have some overlap with what I’m searching for.

Search engines have an idea of which links are followed. Discourse has something similar. “Suggested messages” at the end of the topic are a fruitful source of relevant topics not directly related to the specific search terms.

1 Like

I am recategorizing it as #feature; the feature request is pretty clear to me. It is asking for a place in the UX to define custom synonyms.

Postgres technically supports synonyms per:

https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-SYNONYM-DICTIONARY

So if you wanted to take your gloves off and be mega technical, you could wire something up today, but I agree that at some point in the future adding a UI to allow mods to define this may be interesting.
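To make the one-way relationship discussed above concrete (a search for ‘post’ also matching ‘reply’, but not the reverse), here is a minimal query-side sketch. It is not the Postgres synonym dictionary itself, which is configured on the database server; it only builds a `to_tsquery`-style string. The `SYNONYMS` table and the `expand_to_tsquery` helper are hypothetical names made up for illustration.

```python
# Hypothetical one-way synonym table; entries are illustrative only.
SYNONYMS = {
    "post": ["reply"],   # searching "post" also matches "reply"
    "user": ["member"],  # searching "user" also matches "member"
}


def expand_to_tsquery(search: str, synonyms: dict = SYNONYMS) -> str:
    """Build a to_tsquery-style string, OR-ing each term with its synonyms."""
    groups = []
    for term in search.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        if len(alternatives) > 1:
            groups.append("(" + " | ".join(alternatives) + ")")
        else:
            groups.append(term)
    # Terms are AND-ed together, synonyms within a term are OR-ed.
    return " & ".join(groups)


print(expand_to_tsquery("delete post after"))
print(expand_to_tsquery("delete reply after"))
```

Because the table is one-directional, ‘reply’ is left untouched, which would keep explicit searches narrow for people who know the terminology.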

Not putting a #pr-welcome on this because it is complicated and would take quite a while to get right, with possibly limited benefit.

Timeframe-wise, I do not expect us to get to this in the next year, but we will probably get to it within the next 5 years.

9 Likes

Congratulations Dale :partying_face:


1 Like

We’ve made an update to our terminology (User is now “Member”) and we’ve updated our documentation accordingly, but I’d like to be able to have anyone who searches for User to automatically see the results mentioning “Member”. Any thoughts on an easy method to accomplish this?

CC: @michellefs

It is a reasonably hard one. We could potentially build a plugin that injects synonyms into the indexed data, but we would be talking anywhere between 1 and 5 days of work.
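As a rough idea of what "injecting synonyms into the indexed data" could look like, here is a small sketch that appends missing synonyms to the text before it is handed to the indexer. It is not Discourse plugin code; `SYNONYM_GROUPS` and `text_for_index` are hypothetical names, and matching is on exact whole words only (no plurals or stemming).

```python
import re

# Hypothetical two-way synonym groups; entries are illustrative only.
SYNONYM_GROUPS = [
    {"user", "member"},
    {"desktop app", "native client"},
]


def text_for_index(raw: str, groups=SYNONYM_GROUPS) -> str:
    """Return the text to index: the original plus any missing synonyms."""
    lowered = raw.lower()
    extra = []
    for group in groups:
        # Which phrases of this group already occur as whole words?
        present = [
            phrase for phrase in group
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        ]
        if present:
            # Add the group's other phrases so searches for them match too.
            extra.extend(sorted(p for p in group if p not in present))
    return raw if not extra else raw + " " + " ".join(extra)


print(text_for_index("Every member can edit their profile"))
```

The displayed post stays unchanged; only the copy sent to the search index grows, which is why a reindex would be needed whenever the synonym list changes.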

I guess the big question here is how important is this to you? It is doable but would require some custom consulting on our part.

1 Like

I don’t know anything about this, but isn’t that just a matter of changing texts from the customize side? Or am I, as usual, understanding this totally wrong?

I think the hope is to have the ability to influence the search algorithm indirectly through a tool like tag synonyms, but for any keywords within a post (or the original post, at least).

An example use case would be community members/site visitors who search for their colloquial phrases rather than the equivalent brand jargon. The search algorithm prioritizes very different topics for each. An example on our site would be searching for “desktop app” versus “native client” topics.

Curious whether viewpoints on typos have changed over the years:

In Discourse-AI we started experimenting with semantic search. This is still early days and we are still exploring these systems.

Using LLMs to improve the search prompt is also a possible (albeit slow today) approach:

This technique is mentioned here: GitHub - texttron/hyde: HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels
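For readers who have not seen HyDE, the idea is roughly: ask an LLM to write a hypothetical answer to the query, embed that answer instead of the raw query, and retrieve the real documents closest to it. The sketch below only illustrates that flow. The `embed` function is a toy bag-of-words stand-in for a real embedding model, `fake_llm` is a hard-coded stand-in for an LLM call, and none of these names come from Discourse-AI or the HyDE repository.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(query: str, documents: list, generate: Callable[[str], str],
                top_k: int = 3) -> list:
    """Rank documents by similarity to an LLM-written hypothetical answer."""
    hypothetical = generate(query)   # step 1: LLM drafts a plausible answer
    query_vec = embed(hypothetical)  # step 2: embed the draft, not the query
    ranked = sorted(documents,
                    key=lambda d: cosine(query_vec, embed(d)),
                    reverse=True)
    return ranked[:top_k]            # step 3: nearest real documents


def fake_llm(query: str) -> str:
    """Hard-coded stand-in for a real LLM call (assumption for the demo)."""
    return "you can set a topic timer to delete replies automatically after a delay"


docs = [
    "topic timer can automatically delete replies after a set time",
    "how to change the site logo",
]
print(hyde_search("delete post after", docs, fake_llm, top_k=1))
```

The extra LLM call is what makes this slow today: every search pays for one generation before the usual embedding lookup.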


Besides the 100% automated approaches

Our general strategy here is to iterate. We already have “watched words” in the product, I would not mind a feature that adds “Search Synonyms” where you specify common typos and common phrases you wish to “stuff”. It is not scheduled work but certainly something you could look at sponsoring.

There is precedent for this exact feature in Postgres per: https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-SYNONYM-DICTIONARY

The other area I am open to exploring (I am only lukewarm on this, though) is allowing for a hidden “metadata” place on posts, where admins can stuff search terms. It is very, very invisible, and generally I recommend just “properly” stuffing the words so stuff is not hidden, e.g.:

SEO

semantic, related, improving

2 Likes


That is a pure genius idea, it solves the main problem of the embeddings based search: bad user input.

And it requires minimal changes from our existing setup, as you only need to add a small step “enriching” the search query :exploding_head:


On this topic, something else we can do is a hybrid search:

  • Search using existing PG full text search
  • Search using embeddings
  • Gather the best 50 results from each
  • Pass to a search re-ranker service
  • Show the re-ranked results

We already ship a super capable re-ranker in our existing embeddings API under a separate endpoint, so all the necessary pieces are ready for this to happen.
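The hybrid flow in the list above can be sketched as follows. This is only an outline under assumptions: the two searchers and the re-ranker are passed in as plain callables, and the stubs at the bottom are made-up placeholders, not the real full text search, embeddings search, or re-ranker endpoint.

```python
from typing import Callable, Sequence

# Assumed shapes: a searcher returns post ids, best first;
# a re-ranker returns the same ids in a new order.
Searcher = Callable[[str, int], list]
Reranker = Callable[[str, Sequence[str]], list]


def hybrid_search(query: str, full_text: Searcher, semantic: Searcher,
                  rerank: Reranker, per_source: int = 50,
                  limit: int = 10) -> list:
    """Merge full text and embedding results, then let a re-ranker order them."""
    candidates = []
    seen = set()
    for search in (full_text, semantic):
        for post_id in search(query, per_source):
            if post_id not in seen:  # drop duplicates found by both searches
                seen.add(post_id)
                candidates.append(post_id)
    return rerank(query, candidates)[:limit]


# Placeholder stubs so the sketch runs; real services would go here.
def stub_full_text(query: str, n: int) -> list:
    return ["a", "b", "c"][:n]


def stub_semantic(query: str, n: int) -> list:
    return ["c", "d"][:n]


def stub_rerank(query: str, ids: Sequence[str]) -> list:
    return sorted(ids, reverse=True)


print(hybrid_search("desktop app", stub_full_text, stub_semantic, stub_rerank))
```

Deduplicating before the re-ranker keeps its input at no more than 100 candidates, which matters if the re-ranker is the slow step.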

Example here:

5 Likes