What LLM to use for Discourse AI?

Saif · January 23, 2025, 9:22pm

When choosing a Large Language Model (LLM) to power Discourse AI features, it’s important to understand both your needs as the community admin and the needs of your members.

Several factors may influence your decision:

Performance for your use case: Are you looking for the best-performing model? Performance varies by task (e.g., summarization, search, complex reasoning, spam detection). Judge a model by its ability to generate correct, relevant, and coherent responses for the task at hand.
Context length: The context window is the amount of text a model can “see” and consider at one time. Larger context windows allow for processing more information (e.g., longer topics for summarization) and maintaining coherence over longer interactions.
Compatibility: Is the model supported out of the box by the Discourse AI plugin? Will it require specific API endpoints or configuration? Check the plugin documentation for supported providers and models.
Language support: While many top LLMs handle multiple languages well, performance can vary. If your community primarily uses a language other than English, testing specific models for that language is recommended.
Multimodal capabilities: Some features, like image captioning in AI Helper, require models that can process images (vision). Ensure the chosen model supports the required modalities.
Speed & Modes: Larger, more powerful models can be slower. For real-time features like AI Helper or Search, faster models might provide a better user experience. Some models offer different modes (e.g., extended thinking or reasoning effort levels), allowing a trade-off between speed and deeper reasoning.
Cost: Budget is often a key factor. Model costs vary significantly by provider and model tier. Providers typically bill per token, counting both input and output. Faster, smaller models are generally cheaper than large, high-performance models. Open-source models can be cheaper to run, depending on how they are hosted.
Privacy concerns: Different LLM providers have varying data usage and privacy policies. Review the terms of service, especially regarding whether your data might be used for training purposes. Some providers offer zero data retention options.
Open vs. Closed Source: Open-source models offer transparency and the potential for self-hosting or fine-tuning, though they may require more technical effort. Closed-source models are typically easier to use via APIs but offer less control and transparency.
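
To make the context-length and cost factors above more concrete, here is a minimal Python sketch for back-of-the-envelope planning. Everything in it is an assumption for illustration: the function names are hypothetical, the 4-characters-per-token ratio is only a rough rule of thumb for English text (real tokenizers vary by model and language), and the prices in the example are placeholders, not any provider's actual rates.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. ASSUMPTION: ~4 characters per token,
    a rule of thumb for English only; this is not a real tokenizer."""
    return max(1, round(len(text) / chars_per_token))


def fits_context(prompt_tokens: int, reserved_output_tokens: int,
                 context_window: int) -> bool:
    """True if the prompt plus the space reserved for the reply fits the window."""
    return prompt_tokens + reserved_output_tokens <= context_window


def monthly_cost(calls_per_month: int, input_tokens: int, output_tokens: int,
                 input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Estimated monthly spend, given per-call token counts and
    prices expressed per one million tokens."""
    per_call_micro = (input_tokens * input_price_per_mtok
                      + output_tokens * output_price_per_mtok)
    return calls_per_month * per_call_micro / 1_000_000


if __name__ == "__main__":
    topic_text = "word " * 20_000  # stand-in for a long topic
    tokens = estimate_tokens(topic_text)
    print("estimated tokens:", tokens)
    print("fits a 200K window:", fits_context(tokens, 1_000, 200_000))
    # HYPOTHETICAL prices (1.00 input / 5.00 output per million tokens);
    # replace them with the figures from your provider's pricing page.
    print("monthly cost:", monthly_cost(10_000, 3_000, 300, 1.0, 5.0))
```

Running the numbers for each feature separately is usually more informative than a single site-wide estimate, since call volume and prompt size differ a lot between, say, spam detection and summarization.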

Choosing an LLM for Discourse AI Features

The LLM landscape evolves rapidly. The table below provides a general overview of currently popular and capable models suitable for various Discourse AI features, categorized by their typical strengths and cost profiles. Models within each category are listed alphabetically.

These are general guidelines. Always check the official Discourse AI plugin documentation for the most up-to-date list of supported models and required configurations. Performance and cost change frequently; consult the LLM provider’s documentation for the latest details. Open-source model availability and performance can depend on the specific provider or hosting setup.

Alternatively, hosted customers can use the pre-configured LLMs available through the Discourse AI plugin’s admin interface. Set these up via Admin → Plugins → AI → LLMs, which provides one-click presets for popular models from Anthropic, Google, OpenAI, and OpenRouter.

Category	Model	Provider	Key Strengths / Use Cases	Notes
Top Performance/Reasoning	Claude Opus 4.6	Anthropic	Maximum reasoning capability, complex tasks, analysis, generation	Highest cost Anthropic model, 200K context, excellent vision
	Gemini 3 Pro	Google	High performance, very large context window, strong multimodal	1M token context, excellent all-rounder
	GPT-5.2	OpenAI	State-of-the-art reasoning, complex tasks, generation, vision	400K context, strong all-rounder from OpenAI
	xAI Grok 4 Fast	xAI (via OpenRouter)	Strong reasoning, competitive performance	Available via OpenRouter, vision capable
Balanced (Multi-Purpose)	Claude Sonnet 4.6	Anthropic	High performance, good reasoning, large context, vision, fast	Excellent default choice, balances speed and capability, 200K context
	DeepSeek V3.2	DeepSeek (via OpenRouter)	Strong general performance, good value	Open Source option, cost-effective for broad use, 163K context
	Moonshot Kimi K2.5	Moonshot (via OpenRouter)	Strong performance, very large context, vision	262K context window, good value
Cost-Effective/Speed	Claude Haiku 4.5	Anthropic	Fast and low cost, suitable for simpler tasks, vision capable	Best for high-volume, low-latency needs like search, basic summaries
	Gemini 3 Flash	Google	Very fast and cost-effective, good general capabilities, vision	1M context, good for summarization, search, helper tasks
	GPT-5 Mini	OpenAI	Fast, affordable, good for many tasks	400K context, good balance of cost/performance for simpler features
	GPT-5 Nano	OpenAI	Extremely fast and cheapest OpenAI option	Best for highest-volume, lowest-cost needs
	Arcee Trinity Large (Free)	Arcee (via OpenRouter)	Free tier option	128K context, good for testing or very budget-conscious deployments
Vision Capable	Claude Opus 4.6 / Sonnet 4.6 / Haiku 4.5	Anthropic	All current Anthropic models support vision	Useful for image captioning in AI Helper
	Gemini 3 Pro / 3 Flash	Google	Strong vision capabilities	Useful for image captioning in AI Helper
	GPT-5.2	OpenAI	Integrated text and vision	Useful for image captioning in AI Helper
	Moonshot Kimi K2.5	Moonshot (via OpenRouter)	Vision capable	Available via OpenRouter
	xAI Grok 4 Fast	xAI (via OpenRouter)	Vision capable	Available via OpenRouter
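
If you want to narrow the table down by hard requirements, a small filter like the one below can help. This is only a sketch: the `Model` class and `shortlist` function are hypothetical names, and the data is copied from the table above. Where the table does not state a context size, the value is `None`, and `vision=False` only means "not listed as vision capable in the table", not "confirmed unsupported".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str
    category: str
    context_tokens: Optional[int]  # None = not stated in the table
    vision: bool                   # True only where the table says so


# Data transcribed from the table above; verify against provider docs.
MODELS = [
    Model("Claude Opus 4.6", "top", 200_000, True),
    Model("Gemini 3 Pro", "top", 1_000_000, True),
    Model("GPT-5.2", "top", 400_000, True),
    Model("xAI Grok 4 Fast", "top", None, True),
    Model("Claude Sonnet 4.6", "balanced", 200_000, True),
    Model("DeepSeek V3.2", "balanced", 163_000, False),
    Model("Moonshot Kimi K2.5", "balanced", 262_000, True),
    Model("Claude Haiku 4.5", "cost_effective", None, True),
    Model("Gemini 3 Flash", "cost_effective", 1_000_000, True),
    Model("GPT-5 Mini", "cost_effective", 400_000, False),
    Model("GPT-5 Nano", "cost_effective", None, False),
    Model("Arcee Trinity Large (Free)", "cost_effective", 128_000, False),
]


def shortlist(min_context: int = 0, need_vision: bool = False) -> List[str]:
    """Names of models that meet the requirements, in table order.
    Models with an unknown context size are skipped whenever a
    minimum context is requested."""
    result = []
    for m in MODELS:
        if need_vision and not m.vision:
            continue
        if min_context > 0 and (m.context_tokens is None
                                or m.context_tokens < min_context):
            continue
        result.append(m.name)
    return result


if __name__ == "__main__":
    # Example: image captioning on very long topics.
    print(shortlist(min_context=500_000, need_vision=True))
```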

General Recommendations Mapping (Simplified):

AI Bot (Complex Q&A, Agents): Top Performance/Reasoning models (Claude Opus 4.6, Gemini 3 Pro, GPT-5.2) or strong Balanced models (Claude Sonnet 4.6, DeepSeek V3.2).
AI Search / Discover: Cost-Effective/Speed models (Haiku 4.5, Gemini 3 Flash, GPT-5 Mini/Nano) or Balanced models for slightly better understanding (Sonnet 4.6, DeepSeek V3.2).
AI Helper (Title Suggestions, Proofreading, Translation): Cost-Effective/Speed models or Balanced models. Speed is often preferred. Claude Sonnet 4.6 or GPT-5 Mini are good candidates.
Summarize: Balanced models (Claude Sonnet 4.6, GPT-5.2, DeepSeek V3.2) or Cost-Effective models (Gemini 3 Flash, GPT-5 Mini). Longer context windows (Gemini 3 Pro/Flash at 1M tokens) are beneficial for long topics.
Spam Detection: Cost-Effective/Speed models are usually sufficient and cost-efficient (Haiku 4.5, Gemini 3 Flash, GPT-5 Mini/Nano).
Translation: Cost-Effective/Speed models work well for locale detection and translation tasks (Haiku 4.5, Gemini 3 Flash, GPT-5 Mini).
Automation (Triage, Reports): Depends on complexity. Simple triage rules work well with Cost-Effective models. Complex agent-based triage benefits from Balanced or Top Performance models.
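
The mapping above can also be written down as a small lookup, which is handy when deciding which of your already-configured LLMs to assign to each feature. The sketch below is a planning aid only and does not call any Discourse API: `RECOMMENDED` and `plan` are hypothetical names, and the lists simply repeat the models named in the recommendations above, in the order given. Automation is left out because its recommendation depends on how complex the triage rules are.

```python
from typing import Dict, Iterable, Optional

# Models named per feature in the recommendations above, in the order given.
RECOMMENDED: Dict[str, list] = {
    "AI Bot": ["Claude Opus 4.6", "Gemini 3 Pro", "GPT-5.2",
               "Claude Sonnet 4.6", "DeepSeek V3.2"],
    "AI Search / Discover": ["Claude Haiku 4.5", "Gemini 3 Flash", "GPT-5 Mini",
                             "GPT-5 Nano", "Claude Sonnet 4.6", "DeepSeek V3.2"],
    "AI Helper": ["Claude Sonnet 4.6", "GPT-5 Mini"],
    "Summarize": ["Claude Sonnet 4.6", "GPT-5.2", "DeepSeek V3.2",
                  "Gemini 3 Flash", "GPT-5 Mini"],
    "Spam Detection": ["Claude Haiku 4.5", "Gemini 3 Flash",
                       "GPT-5 Mini", "GPT-5 Nano"],
    "Translation": ["Claude Haiku 4.5", "Gemini 3 Flash", "GPT-5 Mini"],
}


def plan(configured: Iterable[str]) -> Dict[str, Optional[str]]:
    """For each feature, pick the first recommended model that is already
    configured on the site, or None if none of them is."""
    available = set(configured)
    return {
        feature: next((m for m in models if m in available), None)
        for feature, models in RECOMMENDED.items()
    }


if __name__ == "__main__":
    # Example: a site with two LLM connections set up.
    for feature, model in plan(["Claude Sonnet 4.6", "Gemini 3 Flash"]).items():
        print(f"{feature}: {model}")
```

Picking the first match is a deliberate simplification; in practice you may prefer to weigh cost and speed per feature rather than take the first name on the list.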

Remember to set up the LLM connections in your Discourse Admin under Plugins → AI → LLMs, then select the LLM to use for each feature under Plugins → AI → Features.
