Adding a Semantic Search feature for our self-hosted Discourse site

I’m new to Discourse AI. I’m using “sentence-transformers/all-mpnet-base-v2” as my embedding model. Is this enough to do semantic search, or do I also need to add a HyDE model?

Please guide me on this.

You also need an LLM for semantic search. If you want to self-host, see Self-Hosting an OpenSource LLM for DiscourseAI.
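For context, HyDE (Hypothetical Document Embeddings) is why the LLM is needed: the LLM drafts a hypothetical answer to the query, and the embedding model embeds that answer instead of the raw query. A minimal sketch of the flow, where `generate`, `embed`, and `index_search` are stand-in functions (not Discourse’s actual API):

```python
# HyDE sketch: search with the embedding of a hypothetical answer,
# not of the raw query. All three callables below are placeholders.
def hyde_search(query, generate, embed, index_search):
    # 1. Ask the LLM to write a plausible answer to the query.
    hypothetical = generate(f"Write a short forum post answering: {query}")
    # 2. Embed that hypothetical document (e.g. with all-mpnet-base-v2).
    vector = embed(hypothetical)
    # 3. Nearest-neighbour search over the stored topic embeddings.
    return index_search(vector)

# Toy stand-ins so the sketch runs end to end:
generate = lambda prompt: "You can enable semantic search in the admin panel."
embed = lambda text: [float(len(word)) for word in text.split()[:4]]
index_search = lambda vector: ["topic-123"]

print(hyde_search("how do I enable semantic search?",
                  generate, embed, index_search))
```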

Thank you so much.

Can you give me an idea of the requirements to host a model like “mistralai/Mistral-7B-Instruct-v0.2” on-prem, and in the cloud as well, for an enterprise-level website?

Also, I can’t find any tokenizers for this model in the admin panel.

There is nothing Discourse-specific here, so standard rules apply. A 7B model, if run using fp16, will take ~14 GB of VRAM, plus space for the context. You can use fp8 quantization to halve that, but that older model isn’t the best fit for it.
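The arithmetic behind that estimate, counting weights only (it ignores the KV cache, activations, and framework overhead): 2 bytes per parameter at fp16, 1 byte at fp8.

```python
def weight_vram_gb(params_billions, bytes_per_param):
    # Weights only: excludes KV cache, activations, and framework overhead.
    return params_billions * 1e9 * bytes_per_param / 1e9  # decimal GB

fp16 = weight_vram_gb(7, 2)  # 7B params * 2 bytes -> ~14 GB
fp8  = weight_vram_gb(7, 1)  # halved by fp8 quantization -> ~7 GB
print(fp16, fp8)
```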

As it isn’t feasible to ship every possible tokenizer, you should pick the closest one from the available tokenizers.
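To illustrate why the closest match matters (this is an illustration, not Discourse’s actual trimming code): the configured tokenizer is used to count tokens against the model’s context budget, so a badly mismatched tokenizer miscounts and trims too much or too little. Here a character-level stand-in massively over-counts relative to a word-level one:

```python
def word_tokens(text):
    # Stand-in for a tokenizer reasonably close to the model's real one.
    return text.split()

def char_tokens(text):
    # A badly mismatched substitute that over-counts tokens.
    return list(text)

text = "semantic search needs an embedding model and an LLM"
budget = 10  # hypothetical context budget, in tokens

print(len(word_tokens(text)) <= budget)  # fits under the budget
print(len(char_tokens(text)) <= budget)  # wrongly judged far too long
```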


Can you suggest some LLMs for this scenario? We will likely host our model on-prem, so we’d like to know which models are compatible with Discourse.

Thank you.

Depends on your budget, your target language support, and which Discourse AI features you want.

Today, Qwen 2.5 Instruct in 32B or 72B are strong contenders.

Is there any way we can use a smaller model for the summarization feature? LLMs take a bigger budget, and we might have to settle for something smaller for now…

Yes, you can use any model you want.
