Estimating cost of enabling Discourse AI for related content and search

SubStrider · October 28, 2025, 4:34am

Is there a cost benchmark, yardstick, or guesstimate formula that would help me understand the one-time (mass embedding) and ongoing (embedding and search) cost of enabling Discourse AI using a cloud-based LLM?

For a self-hosted LLM, what server config and cost would typically be required?

NateDhaliwal · October 28, 2025, 4:37am

I believe ~~you would need a GPU~~ it is better with a GPU if you want to self-host. Check out things like Ollama.

Also see:

Falco · October 28, 2025, 1:11pm

Related Topics and AI search don’t use an LLM.

It’s one request per topic for mass embeddings, so most sites should be able to do it using something like the Gemini Free tier.

Search is one request per search, which most likely fits in the free tier as well.
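As a rough way to turn that into numbers, here is a minimal sketch in Python. It assumes one embedding request per topic and one per search, as described above; the function names, the example traffic figures, and the daily quota are all hypothetical placeholders, so substitute your own site stats and your provider's actual limits.

```python
import math


def embedding_requests(existing_topics, new_topics_per_day, searches_per_day):
    """Return (one-time backfill requests, ongoing requests per day).

    Assumes one request per topic and one request per search.
    Re-embedding of edited topics is NOT modelled here.
    """
    backfill = existing_topics
    daily = new_topics_per_day + searches_per_day
    return backfill, daily


def days_to_backfill(existing_topics, daily_quota, daily_ongoing):
    """Days needed to embed all existing topics under a daily request quota.

    daily_quota is a placeholder for whatever your provider allows;
    ongoing traffic is served first and the backfill uses what is left.
    """
    spare = daily_quota - daily_ongoing
    if spare <= 0:
        raise ValueError("ongoing traffic already uses the whole quota")
    return math.ceil(existing_topics / spare)


if __name__ == "__main__":
    # Hypothetical site: 50k topics, 40 new topics/day, 300 searches/day.
    backfill, daily = embedding_requests(50_000, 40, 300)
    # Hypothetical quota of 1,000 requests/day (not a real provider limit).
    print(backfill, daily, days_to_backfill(backfill, 1_000, daily))
```

If the backfill fits in the quota within an acceptable number of days, the cloud cost is effectively zero; otherwise multiply the request counts by your provider's per-request or per-token price.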

Since this is just an embeddings model, you should be able to self-host Qwen/Qwen3-Embedding-0.6B (on Hugging Face) using huggingface/text-embeddings-inference (on GitHub) on a basic 2 vCPU / 4 GB RAM server easily.

It is faster on a server with a GPU, of course, but it runs just fine on one without.
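To check that a self-hosted embeddings server responds before pointing Discourse at it, something like the sketch below may help. It assumes a text-embeddings-inference instance is already running and reachable at `http://localhost:8080` (that address is an assumption, adjust it to your setup) and posts to its `/embed` route; the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: where your text-embeddings-inference container is listening.
TEI_URL = "http://localhost:8080"


def build_embed_request(texts, base_url=TEI_URL):
    """Build a POST request for the /embed route with a list of input texts."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=TEI_URL, timeout=30):
    """Send the texts to the server and return one vector (list of floats) per text."""
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Requires a running server; prints the number of dimensions of the vector.
    vectors = embed(["How do I estimate the cost of Discourse AI?"])
    print(len(vectors[0]))
```

Timing a few calls like this on the 2 vCPU box gives a feel for how long a full backfill would take without a GPU.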
