AI exceeds LLM token thresholds randomly and unpredictably

For context, the problem first appeared when the translation service got stuck and ran out of tokens:

DiscourseAi::Completions::Endpoints::OpenAi: status: 429 - body: {"error":{"message":"Rate limit reached for model openai/gpt-oss-120b in organization org_01kccx1be8fffaz5sbe17 service tier on_demand on tokens per day (TPD): Limit 200000, Used 193487, Requested 7464. Please try again in 6m50.832s. Need more tokens? Upgrade to Dev Tier today at https://console.groq.com/settings/billing","type":"tokens","code":"rate_limit_exceeded"}}

I then paused the service for 24 hours so the daily rate limit could reset. After restarting it, I noticed this error:

DiscourseAi::Completions::Endpoints::OpenAi: status: 413 - body: {"error":{"message":"Request too large for model openai/gpt-oss-120b in organization org_01kccx1be8fffaz5sbe17 service tier on_demand on tokens per minute (TPM): Limit 8000, Requested 8102, please reduce your message size and try again. Need more tokens? Upgrade to Dev Tier today at https://console.groq.com/settings/billing","type":"tokens","code":"rate_limit_exceeded"}}

I then reduced the max output tokens from 7000 to 6800 in the LLM configuration, and it started working again.
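My rough mental model of why this helped, assuming the TPM check counts prompt tokens plus the configured max output tokens (that's my guess from the numbers above, not something I've confirmed in the Groq docs):

```python
# Hypothetical arithmetic for the 413 error above. Assumption: the rate
# limiter counts prompt tokens + configured max output tokens per request.
TPM_LIMIT = 8000  # Groq on_demand tokens-per-minute limit reported in the error

def requested_tokens(prompt_tokens: int, max_output_tokens: int) -> int:
    """Total tokens the rate limiter appears to count for one request."""
    return prompt_tokens + max_output_tokens

# The failing request reported "Requested 8102" while max output tokens was
# 7000, which under this assumption implies a prompt of about 1102 tokens.
prompt = 8102 - 7000

print(requested_tokens(prompt, 7000) > TPM_LIMIT)  # True  -> rejected with 413
print(requested_tokens(prompt, 6800) > TPM_LIMIT)  # False -> fits under 8000
```

If that's right, the 6800 setting only works while prompts stay under roughly 1200 tokens, which might explain why the errors feel random.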

What am I missing here? Are you suggesting it's related to the context window and has nothing to do with max output tokens? I'm just trying to figure out how to map the Groq model limits to the Discourse LLM configuration values.