To use certain Discourse AI features, an LLM (Large Language Model) provider is required. Please see the documentation for each AI feature to determine which LLMs are compatible.
If cost is a significant worry, Discourse AI has several built-in tools to help manage spending:
- AI Usage dashboard — track token consumption per feature, model, and user with estimated costs
- Usage quotas — set per-model, per-group limits on tokens or request counts within configurable time windows (hourly, daily, weekly)
- Credit allocations — set overall credit budgets per model with soft and hard limits
- Vendor-side budgets — set usage limits directly with your LLM provider as an additional safeguard
- Group restrictions — only let select users and groups access the AI features
There are several variable factors to consider when calculating the cost of using LLMs. A simplified view follows; it is important to understand what tokens are and how to count them:
- LLM model and pricing → Identifying the specific LLM model you plan to use and finding its latest pricing details for input and output tokens
- Input tokens → The average length of your input prompts in tokens
- Output tokens → The average length of the model's responses in tokens
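The three factors above combine into one simple formula: tokens times price per token, for input and output separately. Below is a minimal sketch of that arithmetic; the function name and the example numbers are illustrative placeholders, and the prices should be replaced with your provider's current rates:

```python
# Rough per-request LLM cost estimate. Prices are USD per 1M tokens, as
# providers typically quote them; token counts come from your own usage data.

def cost_per_request(input_tokens: int, output_tokens: int,
                     input_price_per_1m: float,
                     output_price_per_1m: float) -> float:
    """Estimated USD cost of one request: tokens x per-token price, in + out."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000

# Placeholder example: 85 input tokens and 340 output tokens
# at $0.75 (input) / $4.50 (output) per 1M tokens.
print(cost_per_request(85, 340, 0.75, 4.50))  # ~0.0016 per request
```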
Now let’s go through an example of AI Bot usage right here on Meta:
A number of simplifications were made in this calculation, such as the token usage, the number of users using AI Bot, and the average number of requests per user. These numbers should only be taken as general guidelines, especially since we do a lot of experimentation with AI Bot.
- Use the built-in AI Usage dashboard at `/admin/plugins/discourse-ai/ai-usage` to review your actual request/response token usage, broken down by feature, model, and user
- On average, response tokens were 3x to 5x larger than request tokens [1]
- Assume an average user request to be 85 tokens, equivalent to just under one paragraph [2]
- Assume an average response to be 85 x 4 = 340 tokens, about 3 paragraphs’ worth
- Using GPT-5.4 mini from OpenAI, the cost for input tokens would be $0.75 / 1M tokens = $0.00000075 per token x 85 tokens ≈ $0.000064 for input
- For output tokens, it would be $4.50 / 1M tokens = $0.0000045 per token x 340 tokens = $0.00153 for output
- The total cost per request is $0.000064 + $0.00153 ≈ $0.0016
- During February 2024, around 600 users were using the AI Bot, making an average of 10 requests each for that month. Now assume these numbers hold for your community
- This would mean the February cost for AI Bot would be $0.0016 x 600 users x 10 requests ≈ $9.56 (using the unrounded per-request cost of $0.00159375)
- Extrapolating to a full year of running AI Bot, this would be $9.56 x 12 ≈ $115 for the year with GPT-5.4 mini as your LLM of choice
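As a sanity check, the whole worked example can be reproduced in a few lines. The rates and counts are exactly those quoted above; note that the monthly figure is computed from the unrounded per-request cost, which is why it comes out to $9.56 rather than $9.60:

```python
# Reproducing the worked example: GPT-5.4 mini at $0.75 (input) / $4.50
# (output) per 1M tokens; 85 input and 340 output tokens per request.
INPUT_PRICE = 0.75 / 1_000_000   # USD per input token
OUTPUT_PRICE = 4.50 / 1_000_000  # USD per output token

per_request = 85 * INPUT_PRICE + 340 * OUTPUT_PRICE
monthly = per_request * 600 * 10   # 600 users x 10 requests each
yearly = monthly * 12

print(round(per_request, 6))  # 0.001594
print(round(monthly, 2))      # 9.56
print(round(yearly))          # 115
```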
For even lower costs, consider budget models like GPT-5.4 nano ($0.20/$1.25 per 1M tokens), Gemini 2.5 Flash ($0.075/$0.30 per 1M tokens), or Claude Haiku 4.5 — which can reduce costs by another 75–95% compared to the example above. Always check the latest pricing from your provider as costs continue to drop.
- An estimation based on the OpenAI community and our own response-to-request token ratio ↩︎
- While looking at average user request token usage, I found numbers ranging from as low as 20 to over 100. More requests clustered near 100, and the assumption is that those were closer to fully formed sentences: well-thought-out prompts with lots of questions asked of the bot ↩︎
Last edited by @Saif 2024-11-04T21:45:13Z