AI exceeds LLM token thresholds randomly and unpredictably

The context window is set to 130k.

But that brings me back to the same issue. The model limit on Groq is 131,072; I’ve already set it to 130,000. I shouldn’t have to experiment with the limits to figure out how many tokens Discourse is sending. Discourse should be able to operate within the limits provided by the LLM configuration.

What I don’t understand is why reducing the max output tokens seems to fix the issue. I haven’t made any changes to the context window; I only reduced the max output tokens further, and it has started working again, picking up from where it left off.
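
One possible explanation, which I have not confirmed against either the Discourse or the Groq documentation: if the provider counts the prompt tokens plus the requested max output tokens against the model limit, then a prompt that fits the 130,000 context window can still be rejected once the output reservation is added on top. The sketch below only illustrates that arithmetic. The function names and the prompt and output sizes are hypothetical; only the 131,072 limit comes from my setup.

```python
# Sketch of the assumed rule: prompt tokens + reserved output tokens
# must not exceed the model limit. This rule is an assumption, not
# confirmed provider or Discourse behaviour.

MODEL_LIMIT = 131_072  # Groq model limit quoted above


def fits(prompt_tokens: int, max_output_tokens: int,
         model_limit: int = MODEL_LIMIT) -> bool:
    """Return True if the request stays within the model limit."""
    return prompt_tokens + max_output_tokens <= model_limit


def prompt_budget(max_output_tokens: int,
                  model_limit: int = MODEL_LIMIT) -> int:
    """Largest prompt that still leaves room for the reserved output."""
    return model_limit - max_output_tokens


# Hypothetical numbers: a 128,000-token prompt is under the 130,000
# context window setting in both cases.
print(fits(128_000, 8_192))   # 136,192 total, over the limit
print(fits(128_000, 2_000))   # 130,000 total, within the limit
print(prompt_budget(8_192))   # room left for the prompt
```

If that is what is happening, it would explain why lowering the max output tokens helps without touching the context window: the prompt stays the same size, but the total reserved for the request drops back under the limit.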