Enable Related Topics

This topic covers the configuration of the Related Topics feature, part of the Embeddings module of the Discourse AI plugin.

Overview

Related Topics helps you find the most relevant topics to read next after finishing the current one. Recommendations are based on semantic textual similarity between the topic you are reading and every other topic in your Discourse instance, which surfaces more relevant topics and encourages continued engagement in communities.
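The core idea behind semantic similarity is comparing embedding vectors, typically with cosine similarity. A minimal Ruby sketch with toy vectors (these are illustrative numbers, not the output of any real embedding model):

```ruby
# Cosine similarity between two embedding vectors: values near 1.0 mean
# very similar meaning, values near 0.0 mean unrelated.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Toy 3-dimensional vectors; real models produce hundreds of dimensions.
current = [0.9, 0.1, 0.3]
topic_a = [0.8, 0.2, 0.4] # similar subject matter
topic_b = [0.1, 0.9, 0.2] # unrelated subject matter

most_related =
  cosine_similarity(current, topic_a) > cosine_similarity(current, topic_b) ? "A" : "B"
```

Unlike keyword matching, two topics with no words in common can still score highly if their embeddings point in similar directions.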

The following is an example; note that the current topic is about “Related Topics”.

related topics example

Features

  • Semantic textual similarity: going beyond just a keyword match and using semantic analysis to find textual similarity
  • Toggle between “Suggested” and “Related” topics
  • Applicable to both anonymous and logged-in users

Availability

:discourse: Hosted by us? Currently, this module is available to all Discourse-hosted customers on any plan. :tada:

It comes pre-installed on most plans, and if you’re an Enterprise customer you can contact us to have it added to your site on request. :discourse:

:information_source: Self-hosted users can install the plugin anytime by following Install Plugins in Discourse - sysadmin - Discourse Meta

Enabling Related Topics

Prerequisites

If you are hosted by Discourse, we will provide Embeddings for you through an open-source model.

If you are self-hosted, you must bring your own embeddings via a third-party API key from one of the provider options below:

Configuration

  1. Go to Admin → Plugins, search for discourse-ai, and make sure it’s enabled
  2. Enable ai_embeddings_enabled to turn on the Embeddings module required by Related Topics
  3. Enable ai_embeddings_semantic_related_topics_enabled to activate the Related Topics feature
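On self-hosted sites with console access, the same two settings can be toggled from the rails console. A hedged equivalent of steps 2–3 above (the admin UI remains the recommended route; setting names are taken directly from the steps):

```ruby
# Inside the container: ./launcher enter app, then `rails c`.
SiteSetting.ai_embeddings_enabled = true
SiteSetting.ai_embeddings_semantic_related_topics_enabled = true
```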

If hosted by Discourse / self-hosting the model

  • If your site discussions aren’t in English, set ai embeddings model to multilingual-e5-large.

If using Cloudflare Workers AI

  • Set ai embeddings model to bge-large-en.

If using OpenAI / Azure OpenAI

  • Set ai embeddings model to text-embedding-ada-002.

Technical FAQ

Architecture Diagram

In overview, when a topic is created or updated, the following happens:

sequenceDiagram
    User->>Discourse: Creates topic
    Discourse-->>Embedding Microservice: Generates embeddings
    Embedding Microservice-->>Discourse: 
    Discourse-->>PostgreSQL: Store embeddings

And during topic visit:

sequenceDiagram
    User->>Discourse: Visits topic
    Discourse-->>PostgreSQL: Query closest topics
    PostgreSQL-->>Discourse: 
    Discourse->>User: Presents related topics 
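The two flows above can be mimicked with a toy in-memory index. This is a Ruby sketch for illustration only: `ToyEmbeddingIndex` is a hypothetical name, and real Discourse stores the vectors in PostgreSQL rather than in memory.

```ruby
# Store an embedding when a topic is created; on visit, rank stored topics
# by cosine distance to the current topic's embedding.
class ToyEmbeddingIndex
  def initialize
    @store = {} # topic_id => embedding vector
  end

  # "Topic created" flow: generate and store the embedding.
  def index(topic_id, embedding)
    @store[topic_id] = embedding
  end

  # "Topic visit" flow: query the closest topics (smallest distance first).
  def related(embedding, limit: 2)
    @store.keys.sort_by { |id| cosine_distance(embedding, @store[id]) }.first(limit)
  end

  private

  def cosine_distance(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    1.0 - dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
  end
end

idx = ToyEmbeddingIndex.new
idx.index(1, [1.0, 0.0])
idx.index(2, [0.0, 1.0])
idx.index(3, [0.9, 0.1])
closest = idx.related([1.0, 0.05]) # topics 1 and 3 point the same way
```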
  • How is topic/post data processed?
    • Hosted by Discourse: The Embeddings microservice is run alongside the other servers that host your existing forums. There is no third party involved, and information never leaves your internal network in our virtual private datacenter.
    • Self-hosted: Embeddings data is processed by third-party providers. Please refer to your specific provider for more details.
  • Where does the data go?
    • Hosted by Discourse/Self-hosted: Embeddings data is stored in the same database where we store your topics, posts and users. It’s another data table in there.
  • What does the “semantic model” look like? How was it “trained”, and is there a way to test that it can accurately apply to the topics on our “specialized” communities?
    • Hosted by Discourse: By default we use and recommend this model. We have it deployed to many customers and have found that it performs well for both niche and general communities. If the performance isn’t good enough for your use case, we have more complex models ready to go, but in our experience the default option is a solid choice.
    • Self-hosted: Different models are trained differently, which may affect the outcome. Please refer to the specific model provider for more information.

Something worth keeping an eye on.

In reviewing many posts in Related Topics for an English site (OpenAI), I’m starting to notice that topics in Spanish tend to be grouped together. I suspect that if they were first translated to English, each post would have a different vector and thus be clustered with other posts. :slightly_smiling_face:



A side benefit of this feature for moderators is to check that the categories of the topics listed in Related Topics are correct.

As I review each new post I also check the Related Topics. This is becoming an effective way to identify topics created with the wrong category.

FYI - A related idea was noted in this feature request.



I often need the following link when trying to find this topic, and it isn’t easy to find, so I’m noting it here.


That behavior is governed by the model, and it appears to be a known problem:

I think the OSS model we recommend for multilingual sites does a better job at this, but we still need to roll it out to more customers to validate this.


It won’t let me enable this option:

Am I missing something here or is Gemini alone not enough?

UPDATE: The instructions and error description should be updated to note that ai embeddings model must also be set to match the provider; otherwise ai_embeddings_enabled can’t be enabled. The parameter description is also missing Gemini as an option.


7 posts were split to a new topic: “Net::HTTPBadResponse” errors on Gemini Embeddings

What do I fill here pls:

I want to fill the above, because I want to enable the first option among the 4 shown below:

If you use OpenAI, nothing.


Then this 1st option (Embeddings Module) troubles me, doesn’t let me enable it:

Most of those are empty. But ai embeddings discourse service api key is your OpenAI API key, and ai embeddings discourse service api endpoint is https://api.openai.com/v1/embeddings. Model should be text-embedding-3-large (it can be text-embedding-3-small too, but that has some issues).
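Before filling in those settings, you can sanity-check your key and endpoint by building the same kind of request yourself. A hedged Ruby sketch; the payload shape follows OpenAI’s public embeddings API, and the actual request is left commented out:

```ruby
require "net/http"
require "json"

uri = URI("https://api.openai.com/v1/embeddings")
req = Net::HTTP::Post.new(uri)
req["Content-Type"]  = "application/json"
req["Authorization"] = "Bearer #{ENV.fetch("OPENAI_API_KEY", "sk-...")}"
req.body = { model: "text-embedding-3-large", input: "test sentence" }.to_json

# Uncomment to actually send it; a 200 response means key and endpoint work:
# res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
# puts res.code
```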