Add in the API key (depending on the model, you might have more fields to input manually) and save
(Optional) Test your connection to make sure it’s working
Supported LLMs
You can always add a custom option if you don’t see your model listed. Supported models are continually added. Pre-configured models are templates — you can always achieve the same result using “Manual configuration”.
Anthropic
Claude Opus 4.6
Claude Sonnet 4.6
Claude Haiku 4.5
Google
Gemini 3 Pro
Gemini 3 Flash
OpenAI
GPT-5.4
GPT-5 Mini
GPT-5 Nano
Open Router
DeepSeek V3.2
Moonshot Kimi K2.5
xAI Grok 4 Fast
MiniMax M2.5
Z-AI GLM-5
… and many many more
Additionally, hosted customers can use the CDCK Hosted Small LLM pre-configured in the settings page. This is an open-weights LLM hosted by Discourse, ready for use to power AI features.
Configuration fields
You will only see the fields relevant to your selected LLM provider. Please double-check any pre-populated fields, such as Model name, against the provider's documentation.
Core fields:
Display name — the friendly name shown in dropdowns
Model name — the model identifier sent to the API (e.g. claude-sonnet-4-6, gpt-5.2)
Provider — the service hosting the model (e.g. Anthropic, OpenAI, Google, AWS Bedrock, Azure, Open Router, etc.)
URL — the API endpoint URL (not shown for AWS Bedrock)
API Key — configured via the AI Secrets system
Tokenizer
Max prompt tokens — controls prompt trimming to prevent oversized requests
Max output tokens
Input cost / Output cost — cost per million tokens, used for usage tracking
Cached input cost / Cache write cost — for providers that support prompt caching
OpenAI: Organization ID, Reasoning effort, Service tier
Google: Enable thinking, Thinking level
Open Router: Provider order, Provider quantizations
Quotas (available after initial save):
Per-group usage quotas can be configured with max tokens, max usages, and duration
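To see how these fields map onto an actual API request, here is a minimal sketch in Python that builds (but does not send) the kind of payload an Anthropic-style endpoint expects. The URL, model name, and key values are illustrative assumptions, not values copied from any real configuration:

```python
import json

# Illustrative values only -- substitute your own provider, model, and key.
config = {
    "url": "https://api.anthropic.com/v1/messages",  # URL field
    "model_name": "claude-sonnet-4-6",               # Model name field
    "api_key": "sk-ant-...",                         # API Key (via AI Secrets)
    "max_output_tokens": 1024,                       # Max output tokens field
}

# The Model name goes straight into the request body; if it does not match
# an identifier the provider recognizes, every request will fail.
payload = {
    "model": config["model_name"],
    "max_tokens": config["max_output_tokens"],
    "messages": [{"role": "user", "content": "Hello"}],
}
headers = {
    "x-api-key": config["api_key"],
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
}

print(json.dumps(payload))
```

This is why getting the Model name exactly right matters more than any other field: the other values mostly shape headers and limits, but the model identifier must match the provider's catalog verbatim.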
Technical FAQ
What is a tokenizer?
The tokenizer translates strings into tokens, which are the units a model actually processes to understand the input.
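As a toy illustration of the idea (real model tokenizers use subword schemes such as BPE and have vocabularies of tens of thousands of entries; the tiny vocabulary below is invented for the example), tokenization maps a string to a sequence of integer IDs:

```python
import re

# A toy whitespace/punctuation tokenizer with a tiny fixed vocabulary.
# Real tokenizers (e.g. BPE) split on learned subword units instead.
VOCAB = {"hello": 0, "world": 1, "!": 2, "<unk>": 3}

def tokenize(text: str) -> list[int]:
    # Split into lowercase words and punctuation, then look up IDs,
    # falling back to an "unknown" token for anything out of vocabulary.
    pieces = re.findall(r"\w+|[^\w\s]", text.lower())
    return [VOCAB.get(p, VOCAB["<unk>"]) for p in pieces]

print(tokenize("Hello world!"))  # [0, 1, 2]
```

Token limits like Max prompt tokens are counted against these IDs, not against characters or words, which is why the same text can cost a different number of tokens on different models.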
What number should I use for Max prompt tokens?
A good rule of thumb is 50% of the model's context window, which is the sum of the tokens you send and the tokens the model generates. If the prompt gets too big, the request will fail; this number is used to trim the prompt and prevent that from happening.
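As a worked example of the 50% rule (the context-window size below is an assumption for illustration; check your provider's documentation for the real figure for your model):

```python
# Suppose the model advertises a 200,000-token context window (illustrative).
context_window = 200_000

# The 50% rule of thumb: reserve half for the prompt, leaving the rest
# as headroom for the model's output.
max_prompt_tokens = context_window // 2  # 100,000

def trim(tokens: list[int], limit: int) -> list[int]:
    # A sketch of the idea behind prompt trimming: when the prompt exceeds
    # the limit, keep only the most recent tokens so the request still fits.
    return tokens[-limit:] if len(tokens) > limit else tokens

oversized_prompt = list(range(150_000))
print(len(trim(oversized_prompt, max_prompt_tokens)))  # 100000
```

The exact trimming strategy Discourse uses may differ; the point is that Max prompt tokens is the ceiling the prompt is cut down to before the request is sent.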
Caveats
You may not always see the model you want listed. We add popular models as they come out, but you can always add others manually in the meantime.
A lot to unpack here. Which LLM are you trying to choose, and for which feature?
The CDCK LLMs are only available for very specific features. To see which, head to /admin/whats-new on your instance and click "only show experimental features"; you will need to enable those features to unlock the CDCK LLM for them.
Any LLM you define outside of CDCK LLMs is available to all features.
Is there also a topic that provides a general rundown of the best cost/quality balance? Or even which LLM can be used for free for a small community and basic functionality? I can dive into the details and play around. But I’m a bit short in terms of time.
For example, I only care about spam detection and a profanity filter. I had this for free, but those plugins are deprecated or soon to be. It would be nice if I can retain this functionality without having to pay for an LLM.
Done! It was indeed pretty easy, but for a non-techie it may still be a bit hard to set up. For example, the model name was automatically set in the settings, but it wasn't the correct one. Luckily I recognized the right model name in a curl example for Claude on the API page, and then it worked.
Estimated costs are maybe 30 euro cents per month for spam control (I don’t have a huge forum). So that’s manageable! I’ve set a limit of 5 euros in the API console, just in case.
Good to note, yeah, sometimes there can be a disconnect. The auto-populated info should act as guidance; it tends to work most of the time, but it falls short in certain cases such as yours (given all the different models and provider configs).