Discourse AI - Embeddings

:bookmark: This topic covers the configuration of the Embeddings module of the Discourse AI plugin. It explains what embeddings are, how they’re used, and how to set them up.

:person_raising_hand: Required user level: Administrator

Embeddings are a crucial component of the Discourse AI plugin, enabling features like Related topics and AI search. This guide will walk you through the setup and use of embeddings in your Discourse instance.

What are Embeddings?

Embeddings are numerical representations of text that capture semantic meaning. In Discourse, they’re used to:

  1. Generate related topics at the bottom of topic pages
  2. Enable semantic search functionality
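The idea behind both features can be sketched in a few lines. This is a toy illustration, not Discourse's implementation: the three-dimensional vectors are made up for the example, while real embedding models produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: close to 1.0 means semantically similar, close to 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

topic_a = [0.9, 0.1, 0.0]   # e.g. "How do I configure backups?"
topic_b = [0.8, 0.2, 0.1]   # e.g. "Backup restore fails on S3"
topic_c = [0.0, 0.1, 0.9]   # e.g. "Changing the site logo"

print(cosine_similarity(topic_a, topic_b))  # high -> shown as a related topic
print(cosine_similarity(topic_a, topic_c))  # low  -> not related
```

Related topics and semantic search are both nearest-neighbour lookups over such vectors; only the query differs (an existing topic versus a search term).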

Setting up Embeddings

For hosted customers

If you’re a hosted customer, embeddings are pre-configured. You can simply enable the AI features that depend on them.

For self-hosted instances

If you’re self-hosting, refer to the Discourse AI self-hosted guide for detailed setup instructions.

Configuring Embedding Definitions

Embedding models are now configured as Embedding Definitions in the admin UI. Navigate to Admin → AI plugin → Embeddings tab. When adding a new embedding definition, you can choose from pre-configured presets or configure one manually.

Available presets include:

  • text-embedding-3-large (OpenAI)
  • text-embedding-3-small (OpenAI)
  • text-embedding-ada-002 (OpenAI)
  • gemini-embedding-001 (Google)
  • bge-large-en (Hugging Face)
  • bge-m3 (Hugging Face)
  • multilingual-e5-large (Hugging Face)

Each embedding definition includes: display name, provider, URL, API key (or AI Secret), tokenizer, dimensions, distance function, max sequence length, and optional embed/search prompts.
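As a rough sketch, an embedding definition bundles the fields above into one record. Every value below is illustrative, not a shipped default; the endpoint URL, key, and numbers are assumptions you would replace with your provider's details.

```python
# Hypothetical illustration of the fields an embedding definition holds.
# Field names mirror the admin UI list above; values are examples only.
embedding_definition = {
    "display_name": "text-embedding-3-small",
    "provider": "open_ai",
    "url": "https://api.openai.com/v1/embeddings",  # provider endpoint
    "api_key": "sk-...",                            # or a reference to an AI Secret
    "tokenizer": "OpenAiTokenizer",
    "dimensions": 1536,                             # length of the returned vector
    "distance_function": "cosine",                  # how vector similarity is measured
    "max_sequence_length": 8191,                    # tokens per request; excess is truncated
    "embed_prompt": "",                             # optional prefix for indexed content
    "search_prompt": "",                            # optional prefix for search queries
}
```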

Configuring embeddings

Navigate to Admin → Plugins → Discourse AI and ensure the following settings are enabled.

  1. ai embeddings enabled: Turn the embeddings module on or off
  2. ai embeddings selected model: Select which embedding definition to use for generating embeddings

Optional settings that can be tweaked:

  • AI embeddings generate for pms: Decide whether to generate embeddings for personal messages
  • AI embeddings semantic related topics enabled: Enable or disable the “Related topics” feature
  • AI embeddings semantic related topics: The maximum number of related topics to be shown
  • AI embeddings semantic related include closed topics: Include closed topics in related topic results
  • AI embeddings semantic related age penalty: Apply an exponential age penalty to topics in related results (0.0 disables, higher values penalize older topics more)
  • AI embeddings semantic related age time scale: Time scale in days for age penalty calculation (default: 365)
  • AI embeddings semantic search enabled: Enable full-page AI search
  • AI embeddings semantic quick search enabled: Enable semantic search option in the search menu popup
  • AI embeddings semantic search use hyde: Enable HyDE (Hypothetical Document Embedding) for semantic search
  • AI embeddings semantic search hyde agent: The AI agent used to expand search terms when HyDE is enabled
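The two age-penalty settings can be read together: an exponential factor grows with topic age measured against the time scale, so older topics rank lower in related results. The function below is a guessed illustration of that general shape, not the plugin's exact formula.

```python
import math

def age_penalized_distance(distance, age_days, penalty=1.0, time_scale_days=365):
    """Illustrative only: inflate a topic's embedding distance exponentially
    with age. penalty=0.0 leaves the distance unchanged (feature disabled);
    larger values push older topics further down the related list."""
    return distance * math.exp(penalty * age_days / time_scale_days)

fresh = age_penalized_distance(0.20, age_days=30)    # one-month-old topic
old = age_penalized_distance(0.20, age_days=730)     # two-year-old topic
print(fresh < old)  # True: the older topic looks "farther away"
```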

Providers

Discourse AI supports multiple embedding providers:

  • OpenAI
  • Google
  • Hugging Face (for open source/open weights models)
  • Cloudflare Workers AI

For hosted customers, Discourse provides pre-configured (seeded) embedding definitions that work out of the box.

Features

Related Topics

When enabled, a “Related Topics” section appears at the bottom of topic pages, linking to semantically similar discussions.

AI Search

Embeddings power the semantic search option on the full-page search interface.

Semantic search can optionally use HyDE (Hypothetical Document Embedding). When enabled via ai embeddings semantic search use hyde, the search term is expanded using the AI agent configured in ai embeddings semantic search hyde agent. The expanded search is then converted to a vector and used to find similar topics. This technique adds some latency to search but can improve results.
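In outline, the HyDE flow looks like this. It is a pseudocode-style sketch: `llm_expand`, `embed`, and `topic_index` stand in for the configured HyDE agent, the embedding definition, and the vector store, and are not real plugin APIs.

```python
def hyde_search(query, llm_expand, embed, topic_index, top_k=5):
    """Sketch of HyDE: generate a hypothetical answer first, then search
    with *its* embedding instead of the raw query's."""
    # 1. The HyDE agent expands the short query into a hypothetical document.
    hypothetical_doc = llm_expand(
        f"Write a short forum post that would answer: {query}"
    )
    # 2. Embed the hypothetical document rather than the original query.
    query_vector = embed(hypothetical_doc)
    # 3. Nearest-neighbour lookup against the stored topic embeddings.
    return topic_index.nearest(query_vector, top_k)
```

The LLM call in step 1 is where the extra latency mentioned above comes from, which is why a fast model is recommended for the HyDE agent.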

When selecting an agent for HyDE, choose a fast model like Gemini Flash, Claude Haiku, GPT-4o Mini, or the latest available models.

Generating embeddings

Embeddings are generated automatically for new posts. For existing content:

  • Discourse automatically backfills embeddings for older topics via a scheduled job that runs every 5 minutes
  • The backfill processes topics in order of recent activity first

FAQs

Q: How are related topics determined?
A: Related topics are based solely on embeddings, which include the title, category, tags, and post content.

Q: Can I exclude certain topics from related topics?
A: Yes, there’s a site setting to remove closed topics from the results.

Q: Do embeddings work for historical posts?
A: Yes, the system automatically backfills embeddings for all your content.

Additional resources

Last edited by @tobiaseigen 2025-09-25T15:06:15Z

Last checked by @hugh 2024-08-06T04:16:01Z


Great work, thanks! But somehow I can’t see related topics under topics; my settings are like this, and I added an OpenAI key. Semantic search works, but how can I show related articles under topics?

If you want to use OpenAI for embeddings you must set ai embeddings model to text-embedding-ada-002.


How are the jobs to generate embeddings scheduled? From the code it seems like embeddings are only generated when the page is viewed and embeddings are missing. Is there a way to generate embeddings for the whole site when turning the feature on?


You can also run rake ai:embeddings:backfill to generate embeddings for all topics eagerly.


Suggestion

Sometimes when reading a topic one knows most of the background, but there are also some mentions that are unfamiliar. While summarization exists for condensing an entire topic up to a given point, what would also help is an AI option that inserts a glossary for the topic as a post near the top, and updates it when a user selects a word or phrase they want the AI to include in the glossary.


Today, while reading this topic, there was one reference I did not recognize, so I looked it up and added a reply with a reference for it. While I know the remaining references, I am sure there are others, especially those new to LLMs and such, who would have no idea about many of the references mentioned; if the AI could help them, they would visit the site much more often.

While I know what RAG means in this starting post, how many really know that?

What is RAG?

How do domain-specific chatbots work? An Overview of Retrieval Augmented Generation (RAG)


Note: Did not know with which topic to post this but since it needed embeddings to work posted it here. Please move this if it makes more sense elsewhere or as the Discourse AI plugin changes.


Are embeddings the only variable when determining “Related Topics”? Or are there any other factors that are considered (e.g. author, topic score, topic age, category, etc)?


Only the embeddings, but those contain the title, category, tags and posts. There is a site setting to remove closed topics from the results too.



I wish I found this a few months ago. I already created embeddings using bge-small-en-v1.5 and hosted them in an external database.

I will see if it can be shoehorned into this ‘standard’ set-up!

I found a little bug in the recent version that causes rake ai:embeddings:backfill to fail:

root@nbg-webxj:/var/www/discourse# rake ai:embeddings:backfill
rake aborted!
NameError: uninitialized constant Parallel (NameError)

  Parallel.each(topics.all, in_processes: args[:concurrency].to_i, progress: "Topics") do |t|
  ^^^^^^^^
/var/www/discourse/plugins/discourse-ai/lib/tasks/modules/embeddings/database.rake:27:in `block in <main>'
/usr/local/bin/bundle:25:in `load'
/usr/local/bin/bundle:25:in `<main>'
Tasks: TOP => ai:embeddings:backfill
(See full trace by running task with --trace)

I suspect the culprit is that the parallel gem is installed neither in this plugin nor in Discourse core (the only occurrence is in the if ENV["IMPORT"] == "1" block: gem "parallel", require: false).

I found that the ruby-progressbar gem is also required to run rake ai:embeddings:backfill.

I made a simple PR on GitHub:


Note to others: this rake task seems to have been demoted/semi-deprecated, since per Falco on GitHub:

Thanks for the PR @fokx, but I’ve left those out intentionally as the rake task fell out of favor and should only be used on rare occasions by experienced operators who can easily install those out of band.

Is the semantic search option no longer shown in that dropdown, and instead covered by or enabled through the AI toggle?


Can you confirm for me if the embeddings will only work on posts after installing or will it also allow us to semantic-search all historical posts? I’m hoping the latter! Thanks.


It’s the latter, as it will automatically backfill embeddings for all your content.


I’m trying to set up AI Embeddings using Gemini Flash but I can’t get it to work. I can’t find good descriptions/examples of all the settings fields though, so I might have missed one or two that are important. I don’t know if the ‘ai_embeddings_model’ setting is required, but if I set it to ‘gemini’ I get the following error…

I’ve not been able to find the ai_gemini_api_key setting. I do have Gemini Flash set up as an LLM with an API key and that’s working elsewhere, e.g. summarization, but I’m assuming this is wanting the API key entered somewhere else?

I suppose this would work with OpenAI too, wouldn’t it?

It would be great if it could support their Batch API (50% discount)

Yes, but nowadays we backfill automatically in the background, so this isn’t mandatory.

For price-conscious peeps, we support great open-weights models that you can run on your own hardware.


Thanks. Do I understand it correctly that backfill is when the vectorization happens? When switching between models, do the vectors need to be recalculated (Are they “proprietary”)? I assume yes.

It’d be useful to know how the costs of using the OpenAI API stack up against investing in a GPU-powered server running an open-source solution. Is there a formula or any way to estimate the number of tokens used? We’re only using the API to vectorize posts, not for calculating vector distances, right? So the number of tokens used depends on how much content we have, correct?

I assume that for both related topics and AI-powered search, all posts need to be vectorized only once, so I can calculate the total number of words in the posts table and derive the number of tokens needed. The same process would apply to daily new posts. I’m neglecting the search phrases for now.
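Along those lines, a back-of-the-envelope estimate can be sketched as follows. The words-to-tokens ratio and the price per million tokens below are assumptions for illustration, not quoted rates; substitute your provider's current pricing.

```python
# Rough one-off embedding cost estimate. Assumptions, not facts:
# ~0.75 words per token for English text, and an illustrative price of
# $0.02 per million tokens.
WORDS_PER_TOKEN = 0.75
PRICE_PER_MILLION_TOKENS = 0.02  # USD, illustrative placeholder

def embedding_cost(total_words):
    """Estimate the cost of embedding a corpus once."""
    tokens = total_words / WORDS_PER_TOKEN
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# e.g. a forum with 10 million words of post content:
print(f"${embedding_cost(10_000_000):.2f}")  # -> $0.27
```

The same function applied to the daily word count of new posts gives the ongoing cost, which is typically tiny compared with running a dedicated GPU server.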