Configuring LLM Usage Quotas in Discourse AI

:bookmark: This guide explains how to configure and manage usage quotas for Large Language Models (LLMs) in Discourse AI.

:person_raising_hand: Required user level: Administrator

Summary

LLM Usage Quotas allow administrators to control and monitor AI resource consumption by setting limits on token usage and interactions for different user groups. This helps maintain cost efficiency while ensuring fair access to AI features across your community.

Configuration

Accessing quota settings

  1. Navigate to your site’s admin panel
  2. Go to Admin > Plugins > Discourse AI > LLM Models
  3. Select the LLM model you want to configure

Setting up quotas

For each user group, you can configure:

  • Maximum token usage
  • Maximum number of AI interactions (either limit, or both, can be set)
  • Reset period duration
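The per-group settings above can be pictured as a small record plus a check. This is only an illustrative sketch, not Discourse's actual schema: the field names (`max_tokens`, `max_usages`, `duration_hours`) and the helper `is_exceeded` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of the per-group quota fields described above;
# field names are illustrative, not Discourse's actual schema.
@dataclass
class GroupQuota:
    group_name: str
    max_tokens: Optional[int]   # maximum token usage (None = no token limit)
    max_usages: Optional[int]   # maximum number of AI interactions
    duration_hours: int         # reset period duration

def is_exceeded(quota: GroupQuota, tokens_used: int, usages: int) -> bool:
    """Reaching either configured limit exhausts the quota."""
    over_tokens = quota.max_tokens is not None and tokens_used >= quota.max_tokens
    over_usages = quota.max_usages is not None and usages >= quota.max_usages
    return over_tokens or over_usages
```

Because either limit can be left unset, a group can be metered on tokens only, on interactions only, or on both.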

Duration options

Choose from preset reset periods:

  • 1 hour
  • 6 hours
  • 24 hours
  • 7 days
  • Custom duration (specified in hours)
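Since both the presets and the custom option boil down to a number of hours, the next reset time is just the period start plus that duration. A minimal sketch (the `PRESETS` mapping and `next_reset` helper are illustrative, not part of the plugin):

```python
from datetime import datetime, timedelta

# Illustrative only: preset reset periods expressed as hour counts,
# matching the options listed above.
PRESETS = {"1 hour": 1, "6 hours": 6, "24 hours": 24, "7 days": 168}

def next_reset(period_start: datetime, duration_hours: int) -> datetime:
    """A custom duration is specified in hours, same as the presets."""
    return period_start + timedelta(hours=duration_hours)

start = datetime(2024, 1, 1, 0, 0)
print(next_reset(start, PRESETS["7 days"]))  # 2024-01-08 00:00:00
```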

Usage monitoring

Viewing statistics

Administrators can monitor token consumption and usage statistics at: https://SITENAME/admin/plugins/discourse-ai/ai-usage

  1. Navigate to Admin > Plugins > Discourse AI
  2. Select the “Usage” tab
  3. Filter by date range, user group, or specific metrics
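The filtering step above amounts to slicing usage records by date range and group. The records and the `total_tokens` helper below are hypothetical, meant only to illustrate the kind of aggregation the Usage tab performs:

```python
from datetime import date

# Hypothetical usage records mirroring the filters on the Usage tab;
# this is not the plugin's actual data format.
records = [
    {"day": date(2024, 5, 1), "group": "staff", "tokens": 1200},
    {"day": date(2024, 5, 2), "group": "trust_level_1", "tokens": 300},
    {"day": date(2024, 5, 9), "group": "staff", "tokens": 800},
]

def total_tokens(records, start, end, group=None):
    """Sum tokens within [start, end], optionally for one group."""
    return sum(
        r["tokens"]
        for r in records
        if start <= r["day"] <= end and (group is None or r["group"] == group)
    )

print(total_tokens(records, date(2024, 5, 1), date(2024, 5, 7), group="staff"))  # 1200
```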

User experience

Quota notifications

Users receive clear feedback when approaching or reaching quota limits:

  • Current usage status
  • Time until next quota reset

Error messages

When a quota is exceeded, users see:

  • A clear notification that the quota limit has been reached
  • The time remaining until their next quota reset

Best practices

  1. Start conservative: Begin with lower quotas and adjust based on actual usage patterns
  2. Group-based allocation: Assign different quotas based on user group needs and roles
  3. Regular monitoring: Review usage patterns to optimize quota settings
  4. Clear communication: Inform users about quota limits and reset periods

Common issues and solutions

Issue: Users frequently hitting limits

Solution: Consider:

  • Increasing quota limits for specific groups
  • Reducing the reset period
  • Creating specialized groups for high-usage users

Issue: Unused quotas

Solution:

  • Adjust limits downward to optimize resource allocation
  • Review group assignments to ensure quotas match user needs

FAQs

Q: Can quotas be temporarily suspended?
A: Yes, administrators can temporarily disable quota enforcement for specific groups or the entire site.

Q: Do unused quotas roll over?
A: No, quotas reset completely at the end of each period.

Q: Can different LLM models have different quotas?
A: Yes, quotas can be configured independently for each LLM.

Q: What happens if multiple quotas are set for a single LLM?
A: Quotas are group-based and applied per user. A user is rate limited only when they have exceeded the quota in every group they belong to, so the most permissive quota effectively wins. For example, if you give admins a very relaxed quota and trust level 1 a more restrictive one, the admin quota will apply to admins.
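The "exceed in all groups" rule above can be sketched in a few lines. The group names and token limits are made-up illustrations:

```python
# Sketch of the rule above: a user is limited only when over quota in
# *every* group they belong to, so the most permissive group's quota
# effectively applies. Limits below are illustrative max-token values.
group_limits = {"admins": 1_000_000, "trust_level_1": 10_000}

def is_blocked(user_groups, tokens_used):
    return all(tokens_used >= group_limits[g] for g in user_groups)

# An admin who is also TL1 gets the relaxed admin quota:
print(is_blocked(["admins", "trust_level_1"], 50_000))  # False
print(is_blocked(["trust_level_1"], 50_000))            # True
```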

Q: What if no quota is applied to an LLM?
A: Nothing special happens; all usage of that LLM is unmetered.

Q: What if I want different quotas for different features?
A: Discourse AI lets you define multiple LLMs that all contact the same endpoint and can even reuse the same keys. If you want to give one quota to AI helper and a different one to AI Bot, define two LLMs.
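The workaround above can be pictured as two LLM definitions sharing an endpoint and key, each carrying its own quota and assigned to a different feature. Everything here (names, endpoint, the `"sk-..."` key placeholder, limits) is illustrative:

```python
# Illustrative sketch of the two-LLM workaround described above;
# all names and values are made up, including the key placeholder.
llms = {
    "gpt-4o-helper": {"endpoint": "https://api.openai.com/v1", "key": "sk-...",
                      "max_tokens_per_period": 5_000},
    "gpt-4o-bot":    {"endpoint": "https://api.openai.com/v1", "key": "sk-...",
                      "max_tokens_per_period": 50_000},
}
feature_model = {"ai_helper": "gpt-4o-helper", "ai_bot": "gpt-4o-bot"}

def quota_for(feature):
    """Each feature routes to its own LLM definition, hence its own quota."""
    return llms[feature_model[feature]]["max_tokens_per_period"]

print(quota_for("ai_bot"))  # 50000
```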


It seems we can’t completely prohibit a group from using a specific model by setting the quota to 0.

Could you add support for this setting?

Sorry, can you expand here? Each feature is also group-gated, so you can enable the helper for only a subset of users anyway.

I want some premium models to be restricted to specific groups only. It would be great if we could set a model’s quota to 0 to disable access for certain groups.

Yeah it is an interesting problem, will have a think about it.

You may want helper to use GPT4o for “special group 1” and GPT4o mini for the rest of the people.

At the moment we only allow you to select 1 model for AI helper, so we would need a reasonably big change to support this.

@Falco / @Saif / @awesomerobot something to think about.
