This topic covers the configuration of the Related Topics feature from the Embeddings module of the Discourse AI plugin.
Overview
Related Topics help you find the most relevant topics to read next after finishing reading a topic. These topics are recommended using semantic textual similarity between the current topic you are reading and all other topics in your Discourse instance. This results in the discovery of more relevant topics and continued engagement in communities.
The following is an example; note that the current topic is about “Related Topics”.
Features
- Semantic textual similarity: goes beyond keyword matching and uses semantic analysis to find textually similar topics
- Toggle between “Suggested” and “Related” topics
- Applicable to both anonymous and logged-in users
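As a rough illustration of the semantic textual similarity idea listed above, the sketch below compares toy embedding vectors with cosine similarity. The vectors and topic titles are hand-made, hypothetical values rather than output from a real model, and cosine similarity is assumed here as the comparison measure. The source does not state which measure the plugin uses.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy embeddings (hypothetical values, not from a real model).
current_topic = [0.9, 0.1, 0.0]    # "How do I reset my password?"
similar_topic = [0.8, 0.2, 0.1]    # "Can't log in after changing credentials"
unrelated_topic = [0.0, 0.1, 0.9]  # "Best themes for a gaming community"

score_similar = cosine_similarity(current_topic, similar_topic)
score_unrelated = cosine_similarity(current_topic, unrelated_topic)
```

The two password-related titles share almost no keywords, yet their vectors point in nearly the same direction, so they score much higher than the unrelated pair.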
Availability
Hosted by us? This module is currently available to all customers hosted by Discourse, on any plan.
It comes pre-installed on most plans, and if you’re an Enterprise customer you can contact us to have it added to your site on request.
Self-hosted users can install the plugin at any time by following the “Install Plugins in Discourse” guide on Discourse Meta.
Enabling Related Topics
Prerequisites
If you are hosted by Discourse, we will provide Embeddings for you through an open-source model.
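Self-hosted sites must supply their own embeddings, either by self-hosting the model or through a third-party API key from one of the providers covered under Configuration (Cloudflare Workers AI, OpenAI or Azure OpenAI).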
Configuration
- Go to Admin settings → Plugins → search for or find discourse-ai and make sure it's enabled
- Enable ai_embeddings_enabled for the Embeddings module needed for the Related Topics feature
- Enable ai_embeddings_semantic_related_topics_enabled to activate the Related Topics feature
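For admins who prefer scripting, the two settings above could in principle be toggled over HTTP instead of through the admin UI. The sketch below only builds the requests and does not send them. It assumes the site exposes the admin site-settings endpoint (`PUT /admin/site_settings/<name>.json`) with `Api-Key` and `Api-Username` headers; the base URL, key and username are placeholders. Verify the endpoint against your own instance before relying on it.

```python
import json
import urllib.request


def build_setting_request(base_url, api_key, api_username, name, value):
    """Build (but do not send) a PUT request that changes one site setting.

    Assumption: the admin site-settings endpoint and header names below
    match your Discourse version.
    """
    body = json.dumps({name: value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/admin/site_settings/{name}.json",
        data=body,
        method="PUT",
        headers={
            "Api-Key": api_key,
            "Api-Username": api_username,
            "Content-Type": "application/json",
        },
    )


# Placeholder values: replace with your own site URL and admin API key.
requests_to_send = [
    build_setting_request(
        "https://forum.example.com", "YOUR_API_KEY", "system", name, "true"
    )
    for name in (
        "ai_embeddings_enabled",
        "ai_embeddings_semantic_related_topics_enabled",
    )
]
# To actually apply a setting: urllib.request.urlopen(request)
```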
If on Discourse Hosting / Self-hosting the model
- If your site discussions aren’t in English, set ai embeddings model to multilingual-e5-large.
If using Cloudflare Workers AI
- Set ai embeddings model to bge-large-en.
If using OpenAI / Azure OpenAI
- Set ai embeddings model to text-embedding-ada-002.
Technical FAQ
Architecture Diagram
At a high level, the following happens when a topic is created or updated:
sequenceDiagram
User->>Discourse: Creates topic
Discourse-->>Embedding Microservice: Generates embeddings
Embedding Microservice-->>Discourse:
Discourse-->>PostgreSQL:Store Embeddings
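The write path in the diagram above can be sketched roughly as follows. This is a minimal in-memory illustration, not the plugin's implementation. `toy_embed` is a hypothetical stand-in for the embedding microservice (it hashes words into buckets rather than running a model), and the `embeddings_table` dictionary stands in for the PostgreSQL table.

```python
import hashlib
import math

EMBEDDING_DIMENSIONS = 8  # tiny on purpose; real models use far more


def toy_embed(text):
    """Hypothetical stand-in for the embedding service.

    Hashes each word into one of a few buckets and L2-normalises the
    result. A real model would capture meaning, which this does not.
    """
    vector = [0.0] * EMBEDDING_DIMENSIONS
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % EMBEDDING_DIMENSIONS] += 1.0
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


# In-memory stand-in for the embeddings table: topic_id -> vector.
embeddings_table = {}


def on_topic_saved(topic_id, text):
    """Regenerate and store the embedding when a topic is created or updated."""
    embeddings_table[topic_id] = toy_embed(text)


on_topic_saved(1, "Configuring related topics")
on_topic_saved(1, "Configuring related topics with embeddings")  # update overwrites
on_topic_saved(2, "Setting up single sign-on")
```

Keying the store by topic ID means an update replaces the previous vector, so each topic has exactly one current embedding.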
And during topic visit:
sequenceDiagram
User->>Discourse: Visits topic
Discourse-->>PostgreSQL: Query closest topics
PostgreSQL-->>Discourse:
Discourse->>User: Presents related topics
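The read path in the diagram above amounts to a nearest-neighbour lookup over the stored embeddings. The sketch below shows one way to express that in memory. The topic IDs and vectors are hypothetical, cosine similarity is an assumed measure, and a plain dictionary stands in for the PostgreSQL query.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def related_topics(current_id, stored, limit=3):
    """Return up to `limit` topic IDs closest to the current topic."""
    current = stored[current_id]
    scored = (
        (cosine_similarity(current, vector), topic_id)
        for topic_id, vector in stored.items()
        if topic_id != current_id  # never recommend the topic being read
    )
    return [topic_id for _, topic_id in heapq.nlargest(limit, scored)]


# Hypothetical stored embeddings: topic_id -> vector.
stored_embeddings = {
    101: [0.9, 0.1, 0.0],
    102: [0.8, 0.3, 0.1],
    103: [0.1, 0.9, 0.2],
    104: [0.0, 0.2, 0.9],
}

top_two = related_topics(101, stored_embeddings, limit=2)
```

Scanning every stored vector is fine for a toy example; a production database would normally rely on an index to avoid comparing against all topics on each visit.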
- How is topic/post data processed?
  - Hosted by Discourse: The Embeddings microservice runs alongside the other servers that host your existing forums. There is no third party involved, and information never leaves your internal network in our virtual private datacenter.
  - Self-hosted: Embeddings data is processed by third-party providers. Please refer to your specific provider for more details.
- Where does the data go?
  - Hosted by Discourse/Self-hosted: Embeddings data is stored in the same database where we store your topics, posts and users. It’s another data table in there.
- What does the “semantic model” look like? How was it “trained”, and is there a way to test that it can accurately apply to the topics on our “specialized” communities?
  - Hosted by Discourse: By default we use and recommend this model. We have deployed it for many customers and have found that it performs well for both niche and general communities. If the performance isn’t good enough for your use case, we have more complex models ready to go, but in our experience the default option is a solid choice.
  - Self-hosted: Different models are trained differently, which may affect the outcome. Please refer to the specific model provider for more information.