I want to add a new “Chat Bot” and link it to a self-hosted LLM.
I have tried to use the “ai hugging face model display name” setting, but that value doesn’t seem to appear anywhere; perhaps I have to reference it in the prompts associated with a persona?
I have also tried to “create” a new bot via the “ai bot enable chat bots” drop-down, but anything I create shows up in the chat bot drop-down as “[en.discourse_ai.ai_bot.bot_names.XXXX]”, where XXXX is the name I provided.
Any tips or pointers to documentation or a guide on how to do this would be appreciated.
Can anyone offer any suggestions, or is this a known limitation?
@Roman_Rizzi is working on refactoring this section; expect more news in the coming weeks.
Am I interpreting this correctly that it is currently not possible to use a self-hosted LLM, but that this will change soon?
It is not possible atm, but hopefully in a week or 2 we will have this working.
Thanks. I was surprised it didn’t work, since OpenAI is supported. I think many people run their own LLMs behind an OpenAI-compatible endpoint. I will look forward to the update in 2 weeks.
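For context, this is what I mean by an OpenAI-compatible endpoint: local servers such as vLLM or llama.cpp’s server expose the same chat completions API, so the standard client only needs a different base URL. A rough sketch (the URL, key, and model name below are just placeholders for a local setup):

```python
# Sketch: pointing the standard OpenAI client at a self-hosted,
# OpenAI-compatible server (e.g. vLLM or llama.cpp's server).
# The base_url and model name are placeholders for a local setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local inference server
    api_key="not-needed",                 # most local servers ignore the key
)

response = client.chat.completions.create(
    model="mistral-7b-instruct",          # whatever model the server has loaded
    messages=[{"role": "user", "content": "Summarise this topic in one sentence."}],
)
print(response.choices[0].message.content)
```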
Out of interest @Isambard what’s your estimate for how much a sufficiently powerful local LLM will cost you to host on a monthly basis (dollar equivalent)?
A minimum of about $5 per month in additional electricity to keep the GPU idling, although in reality the incremental cost for Discourse is zero, since I already run the LLM for other purposes.
But for sure, it would be more economical for small forums with low usage to use an LLM as a service. At the scale of Discourse’s hosted offering, though, I suspect it might make sense to host internally (and also to develop knowledge of an area that is likely to be important).
And $15k for the A100?
What model particularly are you running locally?
I’m running several different things. For Discourse stuff, I will run a 7B model based on Mistral and fine-tuned for the tasks. I’m looking at various BERT-like models for classification tasks and am still undecided on the embeddings. This runs on a second-hand 3090 Ti which I bought for $700.
I would love to have an A100, but instead I built a separate 4-GPU system ‘on the cheap’ for only $1,000 that runs Llama 3 70B Q4 at over 20 tok/s.
In many (if not most) cases it would make sense to just go with a provider; however, DIY might make sense if:
- You want to learn
- You want control and certainty over your models (so you don’t lose access to them, and aren’t beholden to a company to keep using their non-public embeddings)
- You have a lot of bulk processing to do which would be cheaper to do in-house
- You want reserved and reliable capacity (there are limits on both requests and tokens available from providers) for bulk processing
I benchmarked the 3090 and was getting a maximum sustained throughput of around 2600 tokens per second running Llama 3 8B FP16. I live in an expensive electricity region, but running continuously at a 285 W power limit it would cost around $0.007 per million output tokens, or roughly $0.01 per million tokens if you fully depreciate the equipment cost over 3 years.
This compares quite favourably to Claude Haiku, provided you have a reasonable utilization rate.
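Rough math behind those numbers, as a sketch (the ~$0.25/kWh electricity rate is an assumption for an expensive region, not an exact figure from my bill):

```python
# Back-of-envelope cost per million output tokens for the 3090 benchmark.
# Assumptions: ~$0.25/kWh electricity, $700 card depreciated over
# 3 years of continuous use.
throughput_tok_s = 2600      # sustained Llama 3 8B FP16 throughput
power_w = 285                # power limit during the run
electricity_per_kwh = 0.25   # assumed rate

hours_per_million = 1_000_000 / throughput_tok_s / 3600            # ~0.107 h
electricity_cost = hours_per_million * (power_w / 1000) * electricity_per_kwh
depreciation_per_hour = 700 / (3 * 365 * 24)                        # ~$0.027/h
total_cost = electricity_cost + hours_per_million * depreciation_per_hour

print(f"electricity only:  ${electricity_cost:.4f} / 1M tokens")    # ~= $0.008
print(f"with depreciation: ${total_cost:.4f} / 1M tokens")          # ~= $0.010
```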
I made an interesting discovery: the web server that I’m hosting my forum on has sufficient grunt to run a small LLM at modest speeds (6 tok/s without batching) even without a GPU. This will be useful for offline/background tasks.
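If anyone wants to try the same, this is roughly how I’d run it CPU-only with llama-cpp-python, as a sketch (the GGUF path, model choice, and thread count are placeholders for my own setup):

```python
# Sketch: CPU-only inference with llama-cpp-python for background tasks.
# The model path and thread count are placeholders; pick a small
# quantised model that fits in the web server's RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
    n_threads=8,   # leave spare cores so the forum itself isn't starved
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Suggest tags for: 'My SSL cert expired after renewal'"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```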