Just FYI, the problem first started with the translation service getting stuck and running out of tokens:
```
DiscourseAi::Completions::Endpoints::OpenAi: status: 429 - body: {"error":{"message":"Rate limit reached for model `openai/gpt-oss-120b` in organization `org_01kccx1be8fffaz5sbe17` service tier `on_demand` on tokens per day (TPD): Limit 200000, Used 193487, Requested 7464. Please try again in 6m50.832s. Need more tokens? Upgrade to Dev Tier today at https://console.groq.com/settings/billing","type":"tokens","code":"rate_limit_exceeded"}}
```
Then I paused the service for 24 hours to let the daily rate limit reset. After restarting it, I noticed this error:
```
DiscourseAi::Completions::Endpoints::OpenAi: status: 413 - body: {"error":{"message":"Request too large for model `openai/gpt-oss-120b` in organization `org_01kccx1be8fffaz5sbe17` service tier `on_demand` on tokens per minute (TPM): Limit 8000, Requested 8102, please reduce your message size and try again. Need more tokens? Upgrade to Dev Tier today at https://console.groq.com/settings/billing","type":"tokens","code":"rate_limit_exceeded"}}
```
Then I reduced max output tokens from 7000 to 6800 in the LLM configuration, and it started working again. That dropped the requested total from 8102 to roughly 7902, just under the 8000 TPM limit.
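For what it's worth, here's how I'm reasoning about the numbers. This is just a sketch under my assumption (not confirmed anywhere in Groq's docs that I've found) that the provider counts a request as prompt tokens plus the configured max output tokens, with the prompt size back-calculated from the 413 error:

```python
# Sketch of how the TPM check seems to work. Assumption (unconfirmed):
# a request is counted as prompt tokens + configured max output tokens.
TPM_LIMIT = 8000  # from the 413 error for openai/gpt-oss-120b on_demand

def requested_tokens(prompt_tokens: int, max_output_tokens: int) -> int:
    """Tokens the provider appears to reserve for a single request."""
    return prompt_tokens + max_output_tokens

# Back-calculated from the error: 8102 requested while max output was 7000
prompt_tokens = 8102 - 7000  # ~1102 prompt tokens

print(requested_tokens(prompt_tokens, 7000) <= TPM_LIMIT)  # False: 8102 > 8000 -> 413
print(requested_tokens(prompt_tokens, 6800) <= TPM_LIMIT)  # True: 7902 <= 8000 -> works
```

The arithmetic lines up with what I observed, which is why lowering max output tokens by 200 was enough to unblock it.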
What am I missing here? Are you suggesting it's related to the context window and has nothing to do with max output tokens? I'm just trying to figure out how to map the numbers from Groq's model limits onto the Discourse LLM configuration.