This guide explains how to configure and manage usage quotas for Large Language Models (LLMs) in Discourse AI.
Required user level: Administrator
Summary
LLM Usage Quotas allow administrators to control and monitor AI resource consumption by setting limits on token usage and interactions for different user groups. This helps maintain cost efficiency while ensuring fair access to AI features across your community.
Configuration
Accessing quota settings
- Navigate to your site’s admin panel
- Go to Admin > Plugins > Discourse AI > LLM Models
- Select the LLM model you want to configure
Setting up quotas
For each user group, you can configure:
- Maximum token usage
- Maximum number of AI interactions (instead of, or in addition to, a token limit)
- Reset period duration
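As a rough illustration of how these three settings fit together, the sketch below models a single group's quota in Python. It is not the plugin's actual schema or code: the `GroupQuota` class, its field names, and the rule that reaching either configured limit exhausts the quota are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupQuota:
    """Hypothetical model of one group's quota; not the plugin's real schema."""
    group: str
    duration_hours: int
    max_tokens: Optional[int] = None  # None means no token limit is set
    max_usages: Optional[int] = None  # None means no interaction limit is set

    def is_exceeded(self, tokens_used: int, usages: int) -> bool:
        # Assumption: reaching either configured limit exhausts the quota.
        over_tokens = self.max_tokens is not None and tokens_used >= self.max_tokens
        over_usages = self.max_usages is not None and usages >= self.max_usages
        return over_tokens or over_usages


quota = GroupQuota(group="trust_level_1", duration_hours=24,
                   max_tokens=10_000, max_usages=50)
print(quota.is_exceeded(tokens_used=4_000, usages=12))  # False
print(quota.is_exceeded(tokens_used=4_000, usages=50))  # True
```

Leaving a limit unset (`None`) in this sketch means that dimension is simply not checked, which mirrors the option of configuring only one of the two limits.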
Duration options
Choose from preset reset periods:
- 1 hour
- 6 hours
- 24 hours
- 7 days
- Custom duration (specified in hours)
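Because custom durations are entered in hours, it can help to convert longer periods before filling in the field. The following is a minimal sketch of that arithmetic; the `PRESET_HOURS` table and the `reset_period` helper are hypothetical names used only for illustration and are not part of Discourse AI.

```python
from datetime import timedelta
from typing import Optional

# The preset reset periods listed above, expressed in hours.
PRESET_HOURS = {"1 hour": 1, "6 hours": 6, "24 hours": 24, "7 days": 7 * 24}


def reset_period(choice: str, custom_hours: Optional[int] = None) -> timedelta:
    """Return the reset period for a preset label or a custom hour count."""
    if choice == "custom":
        if custom_hours is None or custom_hours <= 0:
            raise ValueError("a custom duration needs a positive number of hours")
        return timedelta(hours=custom_hours)
    return timedelta(hours=PRESET_HOURS[choice])


print(reset_period("7 days"))                   # 7 days, 0:00:00
print(reset_period("custom", custom_hours=36))  # 1 day, 12:00:00
```

For example, a 30-day reset period would be entered as a custom duration of 720 hours (30 × 24).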
Usage monitoring
Viewing statistics
Administrators can monitor token consumption and overall usage at: https://SITENAME/admin/plugins/discourse-ai/ai-usage
- Navigate to Admin > Plugins > Discourse AI
- Select the “Usage” tab
- Filter by date range, user group, or specific metrics
User experience
Quota notifications
Users receive clear feedback when approaching or reaching quota limits:
- Current usage status
- Time until next quota reset
Error messages
When a quota is exceeded, users see:
- A clear notification that the quota limit has been reached
- The time remaining until their next quota reset
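The "time until next reset" value shown to users can be thought of as the end of the current quota window minus the current time. The sketch below works through that calculation under the assumption that a window starts at a recorded time and lasts for the configured number of hours; the function names and the message format are illustrative and do not reflect the plugin's actual code or wording.

```python
from datetime import datetime, timedelta, timezone


def time_until_reset(period_start: datetime, duration_hours: int,
                     now: datetime) -> timedelta:
    """Time left before the quota window that began at period_start ends."""
    remaining = period_start + timedelta(hours=duration_hours) - now
    # A window that has already ended has nothing left to wait for.
    return max(remaining, timedelta(0))


def describe(remaining: timedelta) -> str:
    """Format a timedelta as whole hours and minutes."""
    total_minutes = int(remaining.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
now = datetime(2025, 1, 1, 20, 30, tzinfo=timezone.utc)
print(describe(time_until_reset(start, 24, now)))  # 12h 30m
```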
Best practices
- Start conservative: Begin with lower quotas and adjust based on actual usage patterns
- Group-based allocation: Assign different quotas based on user group needs and roles
- Regular monitoring: Review usage patterns to optimize quota settings
- Clear communication: Inform users about quota limits and reset periods
Common issues and solutions
Issue: Users frequently hitting limits
Solution: Consider one or more of the following:
- Increasing quota limits for specific groups
- Shortening the reset period so quotas refresh sooner
- Creating specialized groups for high-usage users
Issue: Unused quotas
Solution:
- Adjust limits downward to optimize resource allocation
- Review group assignments to ensure quotas match user needs
FAQs
Q: Can quotas be temporarily suspended?
A: Yes, administrators can temporarily disable quota enforcement for specific groups or the entire site.
Q: Do unused quotas roll over?
A: No, quotas reset completely at the end of each period.
Q: Can different LLM models have different quotas?
A: Yes, quotas can be configured independently for each LLM.
Q: What happens if multiple quotas are set for a single LLM?
A: Quotas are defined per group and applied per user. A user who belongs to several groups with quotas is only considered over quota after exceeding the quota in all of those groups, so the most permissive quota effectively applies. For example, if you give admins a very relaxed quota and trust level 1 a more restrictive one, the admin quota will apply to admins.
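The rule in this answer can be sketched as a small Python function. This is a simplified model that only considers token limits and assumes one usage counter per user; the `is_blocked` name and its parameters are hypothetical, and the behavior for a user who is in none of the quota groups is an assumption rather than something this guide specifies.

```python
from typing import Dict, Set


def is_blocked(tokens_used: int, user_groups: Set[str],
               token_limits: Dict[str, int]) -> bool:
    """Blocked only when the user exceeds the quota of every group they are in."""
    applicable = [limit for group, limit in token_limits.items()
                  if group in user_groups]
    if not applicable:
        # Assumption: a user covered by no quota is treated as unmetered.
        return False
    return all(tokens_used >= limit for limit in applicable)


limits = {"admins": 1_000_000, "trust_level_1": 10_000}

# An admin is also in trust_level_1, but the relaxed admin quota still applies.
print(is_blocked(50_000, {"admins", "trust_level_1"}, limits))  # False
# A regular trust level 1 user with the same usage is over quota.
print(is_blocked(50_000, {"trust_level_1"}, limits))            # True
```

In this model, requiring every applicable limit to be exceeded is equivalent to comparing usage against the largest of those limits.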
Q: What if no quota is applied to an LLM?
A: Nothing special happens. All usage of that LLM is unmetered.
Q: What if I want different quotas for different features?
A: Discourse AI allows you to define multiple LLMs that all contact the same endpoint and can even reuse the same keys. If you wish to give one quota to AI helper and a different one to AI Bot, define two LLMs, one for each feature.


