You need over 16GB of free RAM, and plenty of spare CPU / GPU / disk to run these services.
Also keep in mind that running these services is mildly complicated, and we are in a preview period where everything is changing quickly.
This is a guide aimed at running your own instances of the services that power Discourse AI modules.
Introduction
If you want to use Discourse AI on your self-hosted instance, you may need to also run the companion services for the modules that you want to enable.
Each module requires one or more companion services, and those services use more CPU / GPU / disk space than Discourse itself, so this is not recommended for people unfamiliar with Linux server administration and Docker.
Toxicity
To run a copy of the toxicity classification service, use:
docker run -it --rm --name detoxify -e BIND_HOST=0.0.0.0 -p6666:80 ghcr.io/discourse/detoxify:latest
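Once the container is up, a quick request from the host confirms it is reachable (the actual classification endpoint is documented in the detoxify service's README):

```
# Any HTTP response here means the service is up and listening on port 6666
curl -i http://localhost:6666/
```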
NSFW
To run a copy of the NSFW classification service, use:
docker run -it --rm --name nsfw -e BIND_HOST=0.0.0.0 -p6666:80 ghcr.io/discourse/nsfw-service:latest
Sentiment
To run a copy of the sentiment classification service, use:
docker run -it --rm --name sentiment -e BIND_HOST=0.0.0.0 -p6666:80 ghcr.io/discourse/sentiment-service:latest
Summarization / AI Helper / AI Bot
These modules depend on an LLM to work. You can deploy an open source LLM using HuggingFace's text-generation-inference (TGI) container, for example:
docker run -d --rm --gpus all --shm-size 1g -p 80:80 -v /mnt:/data \
  -e GPTQ_BITS=4 -e GPTQ_GROUPSIZE=32 -e REVISION=gptq-4bit-32g-actorder_True \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id TheBloke/Upstage-Llama-2-70B-instruct-v2-GPTQ \
  --max-batch-prefill-tokens=12000 --max-total-tokens=12000 --max-input-length=10000 \
  --quantize=gptq --sharded=true \
  --num-shard=$(lspci | grep NVIDIA | wc -l | tr -d '\n') \
  --rope-factor=2
The command above gives reasonable inference performance for those modules on a g5.24xlarge instance. Alternatively, you can get a compatible API endpoint using the https://ui.endpoints.huggingface.co/ service.
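To sanity-check the deployment, you can hit TGI's standard /generate endpoint (port 80 here matches the -p 80:80 mapping above; the best prompt format depends on the model you chose):

```
# Ask the model for a short completion; max_new_tokens caps the response length
curl -s http://localhost:80/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Discourse?", "parameters": {"max_new_tokens": 64}}'
```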
Embeddings
To run a copy of the embeddings service, use:
docker run -it --rm --name embedding -e BIND_HOST=0.0.0.0 -p6666:80 ghcr.io/discourse/embedding-service:latest
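Note that every example above publishes host port 6666, so if you run several of these services on the same machine you need to pick a distinct host port for each one, for example:

```
# Host ports are arbitrary; each container still listens on port 80 internally
docker run -d --rm --name detoxify  -e BIND_HOST=0.0.0.0 -p6666:80 ghcr.io/discourse/detoxify:latest
docker run -d --rm --name sentiment -e BIND_HOST=0.0.0.0 -p6667:80 ghcr.io/discourse/sentiment-service:latest
docker run -d --rm --name embedding -e BIND_HOST=0.0.0.0 -p6668:80 ghcr.io/discourse/embedding-service:latest
```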
Running in production
When running on a live site, you may want to put these services behind a reverse proxy to enable features like load balancing, TLS, health checks, rate limits, etc.
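As a minimal sketch, assuming you use Caddy and that ai.example.com (a placeholder) points at this host, something like the following would terminate TLS and proxy traffic to the detoxify container from above:

```
# Caddy obtains a TLS certificate for ai.example.com automatically
docker run -d --name proxy --network host caddy:latest \
  caddy reverse-proxy --from ai.example.com --to localhost:6666
```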
After the service is up and running, configure the module to connect to the domain where the service is running using the appropriate site setting and then enable the module.
Configuration
All the services we ship respect the following environment variables (see the example after this list):
- BIND_HOST: the address the web server will bind to
- BIND_PORT: the port the web server will bind to
- API_KEYS: a pipe-delimited list of valid API keys for the service
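For example, to run the sentiment service on a non-default port with API key authentication enabled (assuming the container then listens on BIND_PORT rather than the default port 80 used in the examples above):

```
# Clients must present one of the listed keys; check the service's README
# for the exact header the key is expected in.
docker run -it --rm --name sentiment \
  -e BIND_HOST=0.0.0.0 -e BIND_PORT=9090 -e API_KEYS="key-one|key-two" \
  -p9090:9090 ghcr.io/discourse/sentiment-service:latest
```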