Add search synonyms

I have been reading a lot here lately and see that ‘post’ and ‘reply’ seem to be used somewhat interchangeably.

If synonyms were supported, there would be fewer bothersome questions from people who search with the wrong one of the two terms before opening a new topic (LOL, it just happened to me: ‘delete post after’ did not produce the same results as ‘delete reply after’…)

Hence my topic question…

1 Like

Reply and post are not 100% interchangeable. In most usage we see here on Meta they are, but not always.

I’d suggest reviewing the Discourse New User Guide, which describes what a post is. A reply is any post that is not the OP.

5 Likes

But I would rather find what I am searching for even if I do not know the correct terminology.

For those more ‘in the know’, would they not still have the option of doing explicit searches with quotes around their explicit term of interest, for example “reply” :question:

Thanks, I will read that but do many other people read that before they make new topics here?

So, I read the ‘Discourse New User Guide’ and I am unable to find any explicit definition of ‘reply’.

But as I have quoted you above, a ‘reply’ is necessarily a ‘post’, so when someone searches for ‘post’ all ‘reply’ matches should also be presented…

Whether a search for ‘reply’ should bring up all ‘post’ entries is also unclear after reading that guide.

So, I would still like to have the request in this topic’s title acted upon (but again, that is only my opinion).

A reply is necessarily a post, but some posts are not replies, so searching on post should not automatically add the reply search term.

If your preference is satisfied then it will annoy other users like myself who are only searching for post and not reply.

3 Likes

But you are obviously ‘in the know’ and would likely just use an explicit search term without bothering people here with a new topic about why so many search results for ‘post’ are showing up in your ‘reply’ searches.

Regardless of the semantics of post/reply — adding synonyms to search isn’t something that can be configured in Discourse at the moment.

9 Likes

Ok, that shuts me up :wink: but perhaps there should be a way to add them. I predict it could lessen the burden on the good people who respond to newbies on this great forum :slight_smile:

Actually, I do general searches and then follow relevant links that have some overlap with what I’m searching for.

Search engines have an idea of which links are followed. Discourse has something similar. “Suggested messages” at the end of the topic are a fruitful source of relevant topics not directly related to the specific search terms.

1 Like

I am recategorizing this as feature; the feature request is pretty clear to me. It is asking for a place in the UX to define custom synonyms.

Postgres technically supports synonyms per:

So if you wanted to get your gloves off and be mega technical you could wire something today, but I agree that some time in the future adding a UI to allow mods to define this may be interesting.
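To make the query-side idea concrete, here is a minimal sketch in Python. It is not anything Discourse or Postgres ships: `SYNONYM_GROUPS`, `expand_term` and `to_tsquery_text` are hypothetical names, and the sketch only assumes that the resulting string would be handed to something like Postgres `to_tsquery`, where `|` means OR and `&` means AND.

```python
# Hypothetical synonym groups; Discourse has no such setting today.
SYNONYM_GROUPS = [
    {"post", "reply"},
    {"user", "member"},
]


def expand_term(term: str) -> list[str]:
    """Return the term plus any synonyms, sorted so the output is stable."""
    for group in SYNONYM_GROUPS:
        if term in group:
            return sorted(group)
    return [term]


def to_tsquery_text(search: str) -> str:
    """Build a tsquery-style string: synonyms are OR-ed, terms are AND-ed."""
    parts = []
    for term in search.lower().split():
        variants = expand_term(term)
        if len(variants) > 1:
            parts.append("(" + " | ".join(variants) + ")")
        else:
            parts.append(variants[0])
    return " & ".join(parts)


print(to_tsquery_text("delete reply after"))
# prints: delete & (post | reply) & after
```

With this kind of expansion, ‘delete post after’ and ‘delete reply after’ produce the same query string, which is the behaviour the original request asks for. Expanding at query time also leaves the existing index untouched, at the cost of larger queries.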

Not putting a pr-welcome on this cause it is complicated and would take quite a while to get right with possible limited benefit.

Timeframe wise I would say this is something I expect not to get to in the next year and probably to get to within the next 5 years.

9 Likes

Congratulations Dale :partying_face:

1 Like

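We have updated our terminology (“users” are now “members”) and updated the documentation accordingly, but I would like anyone who searches for “user” to automatically see results that mention “member”. Is there a simple way to achieve this?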

CC:@michellefs

This is a fairly tricky problem. We could create a plugin that injects synonyms into the index data, but that would take 1 to 5 days of work.
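As a rough illustration of what “injecting synonyms into the index data” could mean, here is a small Python sketch. It is not how such a plugin would actually be written (a real one would hook into the Discourse search indexer); `INDEX_SYNONYMS` and `inject_synonyms` are hypothetical names, and the idea is simply to append extra terms to the text before it is indexed.

```python
import re

# Hypothetical mapping from a word used in posts to extra words to index with it.
INDEX_SYNONYMS = {
    "member": ["user"],
    "members": ["users"],
}


def inject_synonyms(raw_text: str) -> str:
    """Return the text to index: the original words plus injected synonyms."""
    words = re.findall(r"[a-z0-9']+", raw_text.lower())
    extra = []
    for word in words:
        for synonym in INDEX_SYNONYMS.get(word, []):
            # Skip synonyms that are already present so nothing is indexed twice.
            if synonym not in words and synonym not in extra:
                extra.append(synonym)
    return " ".join(words + extra)


print(inject_synonyms("Members can edit their profile"))
# prints: members can edit their profile users
```

Because the extra terms live in the index, a search for “users” would match this post without any change to how queries are parsed. The trade-off is that changing the synonym list means reindexing existing posts.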

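I guess the biggest question here is how important this is to you. It is achievable, but it would require some custom consulting from us.

1 Like

I don’t know anything, but isn’t that just a matter of changing the text in the customization settings? Or have I completely misunderstood, as usual?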

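I think the hope is to be able to indirectly influence the search algorithm through a tool similar to tag synonyms, but one that applies to any keyword in a post (or at least in the original post).

One use case is that community members and site visitors search using their own everyday wording rather than the equivalent branded terminology, and the search algorithm then prioritizes entirely different topics. An example from our site is searching for “desktop app” versus “native client” topics.

Curious whether the thinking about typos has changed over the years: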

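In Discourse-AI we have started experimenting with semantic search. This is still at an early stage, and we are still exploring these systems.

Using an LLM to improve search prompts is another possible approach (although it is slow today):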

The technique is described here: GitHub - texttron/hyde: HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels
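For anyone unfamiliar with HyDE, the core idea is to have an LLM write a hypothetical answer to the query and then embed that answer, rather than the raw query, for the nearest-neighbour lookup. The Python sketch below is only a toy under loud assumptions: `fake_llm` is a canned stand-in for a real LLM call, `embed` is a bag-of-words counter instead of a real embedding model, and the documents are made up.

```python
import math
from collections import Counter


def fake_llm(query: str) -> str:
    """Stand-in for an LLM that writes a hypothetical answer to the query."""
    canned = {
        "remove old replies": (
            "You can delete a post automatically with a topic timer "
            "that removes replies after a set number of days."
        ),
    }
    return canned.get(query, query)


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().replace(".", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(query: str, docs: list[str]) -> str:
    """Embed the hypothetical answer instead of the raw query, then pick the nearest doc."""
    query_vector = embed(fake_llm(query))
    return max(docs, key=lambda doc: cosine(query_vector, embed(doc)))


docs = [
    "Changing the site logo and colors",
    "How to delete a post automatically using a topic timer",
]
print(hyde_search("remove old replies", docs))
```

In this toy setup the raw query “remove old replies” shares no words with either document, while the hypothetical answer overlaps with the topic-timer document, so the lookup has something to match on. The extra LLM call is also where the slowness mentioned above comes from.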


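Outside of 100% automated approaches

Our general strategy here is to iterate. The product already has “watched words”, and I would be happy to see a “search synonyms” feature where you can specify common misspellings and common phrases you want to “fill in”. This is not planned work, but it is definitely something you could consider sponsoring.

There is precedent for this exact feature in PostgreSQL, per: PostgreSQL: Documentation: 18: 12.6. Dictionaries

Another area I am open to exploring (though I am only lukewarm on it) is allowing a hidden “metadata” area in posts that admins can populate with search terms. This is very, very unobtrusive, but in general I recommend populating the content “properly” so that nothing is hidden, for example: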

SEO

semantic, related, improving

2 Likes

Shocked Cosmo Kramer GIF

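This is a really brilliant idea, and it solves the main problem with embeddings-based search: poor user input.

And it requires only minimal changes to our existing setup, since you just need to add a small step that “enriches” the search query :exploding_head:


On this topic, there is something else we can do, which is hybrid search: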

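  • Search using the existing PG full-text search
  • Search using embeddings
  • Collect the best 50 results from both
  • Pass them to a search reranking service
  • Display the reranked results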
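The steps above can be sketched roughly as follows in Python. This is an illustration under assumptions, not the Discourse-AI implementation: the three callables are hypothetical stand-ins for the PG full-text search, the embeddings search and the reranking service, and “best 50 from both” is read here as up to 50 results from each source.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]
RerankFn = Callable[[str, list[str]], list[str]]


def hybrid_search(
    query: str,
    keyword_search: SearchFn,
    embedding_search: SearchFn,
    rerank: RerankFn,
    limit: int = 50,
) -> list[str]:
    """Merge the top results of both searches, then let a reranker order them."""
    keyword_hits = keyword_search(query)[:limit]
    embedding_hits = embedding_search(query)[:limit]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    candidates = list(dict.fromkeys(keyword_hits + embedding_hits))
    return rerank(query, candidates)


# Toy stand-ins so the sketch runs; real ones would call Postgres and a reranker API.
def toy_keyword_search(query: str) -> list[str]:
    return ["topic-1", "topic-2"]


def toy_embedding_search(query: str) -> list[str]:
    return ["topic-2", "topic-3"]


def toy_rerank(query: str, candidates: list[str]) -> list[str]:
    scores = {"topic-1": 0.5, "topic-2": 0.7, "topic-3": 0.9}
    return sorted(candidates, key=lambda c: scores.get(c, 0.0), reverse=True)


results = hybrid_search("delete reply after", toy_keyword_search, toy_embedding_search, toy_rerank)
print(results)
# prints: ['topic-3', 'topic-2', 'topic-1']
```

Passing the three stages in as functions keeps the merge step independent of whichever search backend or reranker ends up being used.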

We already offer a capable reranker in our existing embeddings API, under a separate endpoint, so all the pieces needed to make this happen are in place.

Example here:

https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search.py#L67-L70

6 Likes