This guide covers running your own instances of the services that power the Discourse AI modules.
Introduction
If you want to use Discourse AI on your self-hosted instance, you may need to also run the companion services for the modules that you want to enable.
Each module requires one or more companion services, and those services use more CPU / GPU / disk space than Discourse itself, so keep in mind that this is not recommended for people unfamiliar with Linux server administration and Docker.
Summarization / AI Helper / AI Bot
Embeddings
Sentiment
Running in production
You may want to put this service behind a reverse proxy to enable features like load balancing, TLS, health checks, and rate limits when running a live site.
After the service is up and running, configure the module to connect to the domain where the service is running, using the appropriate site setting, and then enable the module.
Composer Helper only works with the OpenAI and Anthropic APIs for now, so it will work just fine in self-hosted situations provided you have a key for one of those APIs.
Does Summarization require a local classification service, or will it run with just an OpenAI API key if using the ChatGPT 3.5 model? I turned it on but am not seeing it on topics.
Per Discourse AI - Summarization, you can use it with OpenAI by configuring the OpenAI key (which you already did), selecting one of the GPT models as the summarization model, and enabling the summarization module.
The summary button is only showing for topics with >50 replies at the moment, but we will enable it for all topics soon.
Can you please share some sample requests? I am currently trying to set this up in an AWS ASG on an EC2 instance and I can’t get it to work; I only see 400 Bad Request errors in the Discourse logs.
Furthermore, a healthcheck URL would be great; requesting / currently returns a 404 error.
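For reference, here is a minimal sketch of the kind of probe I’m sending; the service host, the /api/v1/classify path, and the payload fields are assumptions on my part rather than a documented interface, so corrections welcome:

```python
import requests

# Host, port, endpoint path, and payload shape are all assumptions --
# substitute whatever your deployment actually exposes.
SERVICE = "http://your-service-host:3000"

resp = requests.post(
    f"{SERVICE}/api/v1/classify",
    json={"model": "bart-large-mnli", "content": "This is a test post."},
    timeout=30,
)
print(resp.status_code)  # expect 200 when the service accepts the payload
print(resp.json())       # classification scores; also usable as a liveness probe
```

A small script like this doubles as a load balancer health check until the service exposes a dedicated endpoint.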
Summarization already works with the OpenAI and Anthropic APIs, so that will give you multilingual capabilities. You may need to hack a bit and translate the prompt to keep it grounded in the topic’s language, though.
@Falco Would you be kind enough to give an example of a server configuration that has ‘plenty of CPU / GPU / Disk’ and can run the self-hosted AI alongside an average Discourse forum?
I’d like to see that as well, please. Also, given the resource requirements, would it be better (or possible, or more cost-effective) to offload the companion AI services to a separate VPS?
It depends on the exact models and modules of Discourse AI you want to run. For example, the toxicity module uses 5 GB of RAM and the NSFW module uses 1 GB. Disk space needs are similar, and CPU/GPU is used for inference, so your requirements depend on the number of requests per second you expect to handle.
Some LLMs, such as Falcon or the various LLaMA-based models (which come with licensing questions), are open source and can be self-hosted, but to date they all underperform GPT-4 or even GPT-3.5.
Your back-of-the-napkin calculation there is wildly off; if you are going to self-host an LLM you probably want an A100 or H100, maybe a few of them… try googling for prices.
Well anyway, I’ll try to contribute something and come back to update it when I have some user data to compare.
Here are the calculations I ran for using ChatGPT 3.5’s API with the modules above, based on the very vague assumption that an average active user generates about 100 words per execution:
ChatGPT 3.5 API Costs
$0.0003 per ~100 words in one execution
1 active user averages about 100 words per day on each AI module
Average monthly cost per AI plugin/component, per user: 30 × $0.0003 = $0.009
Total monthly cost per user for all 6 plugins: 6 × $0.009 = $0.054, if they run on ChatGPT 3.5
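Here is the same estimate as a quick script, in case anyone wants to plug in their own numbers; the per-execution price and usage figures are the rough assumptions above, not measured data:

```python
# Back-of-the-napkin ChatGPT 3.5 cost estimate using the figures above.
COST_PER_EXECUTION = 0.0003  # USD per ~100-word execution (assumed)
EXECUTIONS_PER_DAY = 1       # one ~100-word execution per user, per module (assumed)
DAYS_PER_MONTH = 30
MODULES = 6

per_module = COST_PER_EXECUTION * EXECUTIONS_PER_DAY * DAYS_PER_MONTH
total = per_module * MODULES

print(f"Per module, per user: ${per_module:.3f}/month")       # $0.009
print(f"All {MODULES} modules, per user: ${total:.3f}/month")  # $0.054
```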
Thanks. Current pricing is given here for anyone wondering what a g4dn.xlarge is. Hopefully you will be able to post utilization data at some point so we can get a handle on real-world costs.