Self-Hosting Embeddings for Discourse AI

The Discourse AI plugin has many features that require embeddings to work, such as Related Topics, AI Search, and AI Helper category and tag suggestions. While you can use a third-party API such as OpenAI, Cloudflare Workers AI, or Google Gemini (see the Configure API Keys guide for each), we built Discourse AI from day one so that you are not locked into those.

Running with HuggingFace TEI

HuggingFace provides an awesome container image that can get you running quickly.

For example:

mkdir -p /opt/tei-cache
docker run --rm --gpus all --shm-size 1g -p 8081:80 \
  -v /opt/tei-cache:/data \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id BAAI/bge-large-en-v1.5

This should get you up and running with a local instance of BAAI/bge-large-en-v1.5, a very well-performing open-source model.

You can check if it’s working with:

curl http://localhost:8081/ \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{ "inputs": "Testing string for embeddings" }'

Under normal operation, this should return an array of floats.
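
To go a step further than eyeballing the output, you can count how many floats come back. This is only a minimal sketch: embedding_dim is a hypothetical helper name, and it assumes the response for a single input is a flat or nested JSON array of numbers. BAAI/bge-large-en-v1.5 produces 1024-dimensional vectors, so that is the count to expect from a live server running that model.

```shell
#!/bin/sh
# Hypothetical helper: count the numbers in a JSON array read from stdin.
# Assumes a single input string, so all numbers belong to one embedding.
embedding_dim() {
  # Strip brackets and whitespace, put one number per line, count non-empty lines.
  sed 's/[][[:space:]]//g' | tr ',' '\n' | grep -c .
}

# Canned three-value response, used here instead of a live server:
printf '[[0.1,-0.2,3.0e-2]]' | embedding_dim

# Against a live server (not run here):
#   curl -s http://localhost:8081/embed -X POST \
#     -H 'Content-Type: application/json' \
#     -d '{ "inputs": "Testing string for embeddings" }' | embedding_dim
```

If the count does not match the dimension of the model you started, the server is probably serving a different model than you expect.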

Making it available for your Discourse instance

Most of the time, you will be running this on a dedicated server because of the GPU speed-up. When doing so, I recommend running a reverse proxy that handles TLS termination, and securing the endpoint so that only your Discourse instance can connect to it.
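
One piece of this can be sketched with the same docker command as above. Assuming the reverse proxy runs on the same host as the container, publishing the port on the loopback address means the container is reachable only through the proxy. The variable names below are illustrative, and the proxy and TLS setup itself is not shown.

```shell
#!/bin/sh
# Assumption: the reverse proxy runs on the same host as the TEI container.
BIND_ADDR="127.0.0.1"   # loopback only, so other machines cannot connect directly
HOST_PORT="8081"

# Build the command as a single string so it can be reviewed before running.
tei_cmd="docker run --rm --gpus all --shm-size 1g \
-p ${BIND_ADDR}:${HOST_PORT}:80 \
-v /opt/tei-cache:/data \
ghcr.io/huggingface/text-embeddings-inference:latest \
--model-id BAAI/bge-large-en-v1.5"

echo "$tei_cmd"
# To start the server, run: eval "$tei_cmd"
```

The reverse proxy would then forward requests from its public TLS port to 127.0.0.1:8081 and restrict who may connect, for example by allowing only the IP address of your Discourse server.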

Configuring Discourse AI

Discourse AI includes site settings to configure the inference server for open-source models. You should point it to your server using the ai_hugging_face_tei_endpoint setting.

After that, set ai_embeddings_model to the model you are using.
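
Before changing the model setting, it can help to confirm which model the server is actually serving. The following is a rough sketch under an assumption: that your TEI server exposes a GET /info route whose JSON includes a model_id field (the Swagger page at /docs on your server lists the routes it really offers). tei_model_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: extract the "model_id" string from JSON on stdin.
# Plain sed keeps the sketch dependency-free; prefer jq for anything robust.
tei_model_id() {
  sed -n 's/.*"model_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Canned response for illustration (the field name is an assumption about /info):
sample='{"model_id":"BAAI/bge-large-en-v1.5"}'
printf '%s\n' "$sample" | tei_model_id

# Against a live server (not run here):
#   curl -s http://localhost:8081/info | tei_model_id
```

The printed model should correspond to the one you select in ai_embeddings_model.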

Should the bge-m3 model work for multilingual (or non-English) sites?

Yes, I played with it the week it got silently shared on GitHub and it works well. Still waiting to see how it lands on the MTEB leaderboards, as it wasn’t there last I looked.

That said, we have large hosted Discourse instances using the multilingual model the plugin ships, e5, and it performs very well.

Thanks. Do you have plans to enable open-source custom endpoints for embeddings? I’m trying to use these models on HuggingFace.

Sorry, I don’t understand what you are trying to convey here. This topic is a guide on how to run open-source models for Discourse AI embeddings.

Oh, sorry about that. I’m trying to use an open-source model from a HuggingFace custom endpoint, and I wonder whether that’s possible or whether there are plans to enable it in the near future :slight_smile:

To check if it’s working, the following command works for me (with the BAAI/bge-m3 model):

curl -X 'POST' \
  'http://localhost:8081/embed' \
  -H 'Content-Type: application/json' \
  -d '{ "inputs": "Testing string for embeddings"}'

BTW, you can also use the Swagger web interface at http://localhost:8081/docs/.

This is also a nice embeddings server:
