Search improvements in 2.3

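For the upcoming 2.3 release, we have added a new feature and made several fixes to how posts are indexed for search, which should lead to better search results.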

1. Search priority for categories


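Each category can be configured with a search priority; the option is on the Settings tab when creating or editing a category. There are five new priority levels: ignore, very low, low, high, and very high. Each level works by multiplying a preconfigured weight into the search rank of each result, and the weights can be changed through hidden site settings in the console.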

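For example, setting a category's search priority to very_high boosts its search rank by 40%, while setting it to very_low reduces its search rank by 40%. Setting a category's search priority to ignore removes it from search results. However, you can still search the posts in that category by scoping the search to it through advanced search. Note that search priority is not inherited, so subcategories remain searchable even when the parent category is set to ignore.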
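To make the weighting concrete, here is a minimal sketch of how a per-category multiplier could be applied to a result's rank. This is not Discourse's actual code: the function names, the `normal` level, and the weights for `low` and `high` are assumptions made for illustration. Only the 40% boost and penalty for `very_high` and `very_low` come from the description above.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative weights. Only very_low (-40%) and very_high (+40%) are stated
# in the post; "low", "normal" and "high" are assumed values.
PRIORITY_WEIGHTS: Dict[str, float] = {
    "very_low": 0.6,
    "low": 0.8,      # assumed
    "normal": 1.0,   # assumed default for categories with no priority set
    "high": 1.2,     # assumed
    "very_high": 1.4,
}


def weighted_rank(base_rank: float, priority: str) -> Optional[float]:
    """Multiply the base rank by the category weight; None means 'excluded'."""
    if priority == "ignore":
        return None
    return base_rank * PRIORITY_WEIGHTS[priority]


def rank_results(
    results: List[Tuple[str, str, float]],
    priorities: Dict[str, str],
) -> List[Tuple[str, float]]:
    """Take (post_id, category, base_rank) tuples and return sorted scores.

    Each category is looked up on its own, so a subcategory is not affected
    by the priority of its parent (priorities are not inherited).
    """
    ranked = []
    for post_id, category, base_rank in results:
        score = weighted_rank(base_rank, priorities.get(category, "normal"))
        if score is not None:
            ranked.append((post_id, score))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    results = [("a", "tutorials", 0.5), ("b", "spam", 0.6), ("c", "hidden", 0.9)]
    priorities = {"tutorials": "very_high", "spam": "very_low", "hidden": "ignore"}
    for post_id, score in rank_results(results, priorities):
        print(post_id, round(score, 2))
```

In this example the post in the `very_high` category overtakes one with a higher base rank in the `very_low` category, and the post in the ignored category is dropped entirely.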

2. Improvements and fixes to search results

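  1. Search relevance now takes document length into account when ranking. Previously, search rank was determined only by how many times a post matched the given search terms. This was a problem because longer posts were more likely to rank higher simply because they contain more matches. Ranking of search results now factors in document length.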
    FIX: Relevance search will now consider document length in ranking. · discourse/discourse@e87ca59 · GitHub

  2. Improved the quality of the raw data used to generate the search index. PERF: Improve quality of `PostSearchData#raw_data`. (#7275) · discourse/discourse@cfd5078 · GitHub

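    • URLs in posts were sometimes tokenized incorrectly and indexed more than once, which caused words inside links to rank higher than they should.
    • Lightbox content in posts polluted search results with its image metadata.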
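  3. Empty posts (such as moderator actions or small post actions, e.g. close, assign) are no longer included in the search index. This makes the index smaller and reduces noise in search results. FIX: Don't index posts with empty `Post#raw` for search. (#7263) · discourse/discourse@daeda80 · GitHub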

  4. Reduced the size of the search index by removing posts from trashed topics. PERF: Delete search data of posts from trashed topics periodically. (… · discourse/discourse@d151425 · GitHub

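  5. When a post was moved to a different topic, its search data was not updated, so the post incorrectly appeared in searches matching the old topic's category. FIX: Reindex post for search when post is moved to a different topic. · discourse/discourse@d808f36 · GitHub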

  6. Reduced the size of the search results payload by excluding the rendered version of posts. PERF: Reduce number of queries and size of payload when searching. · discourse/discourse@03c6b22 · GitHub

  7. When searching for an exact phrase, the post blurb in search results was broken, which left the client without search term highlighting. FIX: Post blurb incorrect when search contains a phrase match. · discourse/discourse@dae0bb4 · GitHub
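The effect of item 1 in the list above (taking document length into account) can be illustrated with a small sketch. The post does not give the formula Discourse uses, so the normalization below, dividing the match count by 1 plus the logarithm of the word count, is only an assumed, commonly used form of length normalization. The function names are hypothetical.

```python
import math


def raw_match_rank(text: str, term: str) -> float:
    """Rank by number of matching words only (the 'before' behaviour)."""
    return float(text.lower().split().count(term.lower()))


def length_normalized_rank(text: str, term: str) -> float:
    """Rank by match count dampened by document length (assumed formula)."""
    words = text.lower().split()
    if not words:
        return 0.0
    matches = words.count(term.lower())
    return matches / (1.0 + math.log(len(words)))


if __name__ == "__main__":
    short_post = "backup restore guide"
    # A long post that mentions the term twice among many unrelated words.
    long_post = "backup " + " ".join(["filler"] * 40) + " backup"

    for name, post in (("short", short_post), ("long", long_post)):
        print(
            name,
            raw_match_rank(post, "backup"),
            round(length_normalized_rank(post, "backup"), 3),
        )
```

With raw match counts the long post wins because it simply contains more matches; with the length-normalized score the short, focused post ranks higher.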

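Please let us know whether your search results have improved after these changes. We would also like to hear if you think search results have gotten worse, so that we can keep tuning. Thanks, everyone!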

43 Likes

Is the age of a post part of the weight in ranking search results? Information gets stale fairly quickly in our forum, so it would be nice to have a way to reduce the relevance of older postings without actually eliminating them.

11 Likes

Not yet, but it is an interesting idea, even in the weaker form of simply factoring in the date the topic was last touched.

7 Likes

I would love to see a way to “pin” topics in search results so they appear first. That way common searches can be led directly to our tutorials.

You do this by changing search priority for the category those topics are in.

My “issue” is that those topics are spread through all forum categories on contextual basis. Hence I can’t use the category-based approach.

Then you are stuck because there is no random per topic way to do this, and there never will be.

Maybe search priorities for tags could be a thing eventually? Although not as straightforward since topics can have multiple tags…

Being able to weight categories is a great feature anyway! I already have several spammy special-use categories on my forum that will be nice to push down a bit in search.

6 Likes

Yes, I see this happening at some point.

10 Likes

That would be awesome! That way blog posts can be tagged as such and prioritized for search…

1 Like

I would love to see a way for certain tags to affect the search priority also.

4 Likes

Is there a plan to add this in 2.4.X or 2.5?

Is this a thing now?

I don't have enough knowledge to interpret this resource: Search Controller Need help with understanding how discourse search works - #3 by neounix

2 Likes

Not yet. Currently only categories have a configurable search priority.

3 Likes