What LLM to use for Discourse AI?

When choosing a Large Language Model (LLM) to power Discourse AI features, it’s important to understand your own needs as the community admin as well as those of your members.

Several factors may influence your decisions:

  1. Performance for use-case: Are you looking for the best-performing model? Performance can vary depending on the task (e.g., summarization, search, complex reasoning, spam detection). Assessment is based on the model’s ability to generate correct, relevant, and coherent responses.
  2. Context length: The context window is the amount of text a model can “see” and consider at one time. Larger context windows allow for processing more information (e.g., longer topics for summarization) and maintaining coherence over longer interactions.
  3. Compatibility: Is the model supported out of the box by the Discourse AI plugin? Will it require specific API endpoints or configuration? Check the plugin documentation for supported providers and models.
  4. Language support: While many top LLMs handle multiple languages well, performance can vary. If your community primarily uses a language other than English, testing specific models for that language is recommended.
  5. Multimodal capabilities: Some features, like AI Triage (NSFW detection), require models that can process images (vision). Ensure the chosen model supports the required modalities.
  6. Speed & Modes: Larger, more powerful models can be slower. For real-time features like AI Helper or Search, faster models might provide a better user experience. Some models (like Claude 3.7 Sonnet) offer different modes, allowing a trade-off between speed and deeper reasoning.
  7. Cost: Budget is often a key factor. Model costs vary significantly based on the provider and the model tier. Costs are typically measured per token (input and output). Faster/smaller models are generally cheaper than large/high-performance models. Open source models can often be run more cost-effectively depending on hosting.
  8. Privacy concerns: Different LLM providers have varying data usage and privacy policies. Review the terms of service, especially regarding whether your data might be used for training purposes. Some providers offer zero data retention options.
  9. Open vs. Closed Source: Open-source models offer transparency and the potential for self-hosting or fine-tuning, though they may require more technical effort. Closed-source models are typically easier to use via APIs but offer less control and transparency.
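To make the cost factor above concrete, here is a minimal sketch of turning per-token pricing into a monthly estimate. The prices and usage volumes are illustrative placeholders, not current rates for any provider, and the 4-characters-per-token rule is only a rough heuristic for English text:

```python
# Rough sketch of how per-token pricing translates into a monthly budget.
# The rates below are illustrative placeholders, NOT real provider pricing;
# always check your provider's pricing page. Token counts use the common
# ~4 characters/token heuristic, which is approximate for English text.

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English."""
    return max(1, len(text) // 4)

def monthly_cost(input_chars: int, output_chars: int,
                 usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Estimated monthly USD cost from character volumes and per-million-token rates."""
    input_tokens = input_chars // 4
    output_tokens = output_chars // 4
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000

# Example: 50k requests/month averaging 2,000 characters in and 400 out,
# at hypothetical rates of $3 per million input tokens and $15 per million output.
cost = monthly_cost(50_000 * 2_000, 50_000 * 400, 3.0, 15.0)
print(f"~${cost:.2f}/month")  # ~$150.00/month
```

Even a back-of-the-envelope estimate like this makes it easier to compare a "Top Performance" model against a "Cost-Effective" one for your actual traffic.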

Choosing an LLM for Discourse AI Features

The LLM landscape evolves rapidly. The table below provides a general overview of currently popular and capable models suitable for various Discourse AI features, categorized by their typical strengths and cost profiles. Models within each category are listed alphabetically.

:warning: These are general guidelines. Always check the official Discourse AI plugin documentation for the most up-to-date list of supported models and required configurations. Performance and cost change frequently; consult the LLM provider’s documentation for the latest details. Open Source model availability and performance can depend on the specific provider or hosting setup.

An alternative option for hosted customers is using the pre-configured open-weight LLMs hosted by Discourse. These can often be enabled via Admin → Settings → AI → ai_llm_enabled_models or specific feature settings.

| Category | Model | Provider | Key Strengths / Use Cases | Notes |
|---|---|---|---|---|
| Top Performance/Reasoning | Claude 3.7 Sonnet (Thinking) | Anthropic | Maximum reasoning capability, complex tasks, analysis, generation | Uses more resources/time than regular mode, excellent vision |
| Top Performance/Reasoning | DeepSeek-R1 | DeepSeek | Strong reasoning, competitive with top tiers, coding, math | Open-source option, potentially lower cost than proprietary equivalents |
| Top Performance/Reasoning | Gemini 2.5 Pro | Google | High performance, very large context window, strong multimodal | Excellent all-rounder, integrates well with Google ecosystem |
| Top Performance/Reasoning | OpenAI o1 / o1-pro | OpenAI | State-of-the-art reasoning, complex tasks, generation | Highest cost; o1-pro likely needed for maximum capability via API |
| Balanced (Multi-Purpose) | Claude 3.7 Sonnet (Regular) | Anthropic | High performance, good reasoning, large context, vision, faster mode | Excellent default choice, balances speed and capability |
| Balanced (Multi-Purpose) | DeepSeek-V3 | DeepSeek | Strong general performance, good value | Open-source option, cost-effective for broad use |
| Balanced (Multi-Purpose) | GPT-4o | OpenAI | Very strong all-rounder, good multimodal, widely compatible | Great balance of performance, speed, and cost |
| Balanced (Multi-Purpose) | OpenAI o3-mini | OpenAI | Good performance and reasoning for the cost | A flexible, intelligent reasoning model suitable for many tasks |
| Cost-Effective/Speed | Claude 3.5 Haiku | Anthropic | Extremely fast and low cost, suitable for simpler tasks | Best for high-volume, low-latency needs like search and basic summaries |
| Cost-Effective/Speed | Gemini 2.0 Flash | Google | Very fast and cost-effective, good general capabilities | Good for summarization, search, helper tasks |
| Cost-Effective/Speed | GPT-4o mini | OpenAI | Fast, affordable version of GPT-4o, good for many tasks | Good balance of cost/performance for simpler features |
| Cost-Effective/Speed | Llama 3.3 (e.g., 70B) | Meta | Strong open-source model, often a cost-effective multi-purpose option | Performance varies by provider/hosting; check compatibility |
| Vision Capable | Claude 3.7 Sonnet | Anthropic | Strong vision capabilities (both modes) | Required for AI Triage/NSFW Detection |
| Vision Capable | Gemini 2.5 Pro / 2.0 Flash | Google | Strong vision capabilities | Required for AI Triage/NSFW Detection |
| Vision Capable | GPT-4o / GPT-4o mini | OpenAI | Integrated text and vision | Required for AI Triage/NSFW Detection |
| Vision Capable | Llama 3.2 | Meta | Open-source vision capabilities | Check compatibility/hosting/provider support |
| Vision Capable | Discourse Hosted LLM | Discourse | Pre-configured vision model for hosted sites | Check specific feature settings (e.g., NSFW Detection) |
| Vision Capable | Qwen-VL / others | Various | Check the Discourse AI plugin for specific supported vision models | Configuration may vary |

General Recommendations Mapping (Simplified):

  • AI Bot (Complex Q&A, Persona): Top Performance/Reasoning models (Claude 3.7 Sonnet - Thinking, R1, Gemini 2.5 Pro, o1-pro) or strong Balanced models (GPT-4o, Claude 3.7 Sonnet - Regular, o3-mini).
  • AI Search: Cost-Effective/Speed models (Haiku 3.5, Gemini 2.0 Flash, GPT-4o mini, Llama 3.3) or Balanced models for slightly better understanding (GPT-4o, DeepSeek-V3).
  • AI Helper (Title Suggestions, Proofreading): Cost-Effective/Speed models or Balanced models. Speed is often preferred. Claude 3.7 Sonnet (Regular) or GPT-4o mini are good candidates. Llama 3.3 can also work well here.
  • Summarize: Balanced models (Claude 3.7 Sonnet - Regular, GPT-4o, o3-mini, DeepSeek-V3) or Cost-Effective models (Gemini 2.0 Flash, Llama 3.3). Longer context windows (Gemini 2.5 Pro, Sonnet 3.7) are beneficial for long topics if budget allows.
  • Spam Detection / AI Triage (Text): Cost-Effective/Speed models are usually sufficient and cost-efficient (Haiku 3.5, Gemini 2.0 Flash, GPT-4o mini, Llama 3.3).
  • AI Triage (NSFW Image Detection): Requires a Vision Capable model (GPT-4o/mini, Sonnet 3.7, Gemini 2.5 Pro/2.0 Flash, Llama 3.2, specific Discourse hosted/supported models).
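Since Discourse AI lets you assign different LLMs to different features, it can help to keep your feature-to-model choices explicit in one place. A minimal sketch of encoding the recommendations above; the keys and model names are descriptive labels for illustration, not the exact identifiers used in Discourse admin settings:

```python
# Illustrative feature-to-model mapping, following the recommendations above.
# The keys and values are descriptive labels, NOT the exact identifiers used
# in Discourse admin settings; adapt them to your provider's model names.
FEATURE_MODEL_MAP = {
    "ai_bot": "claude-3-7-sonnet-thinking",  # complex Q&A, personas
    "ai_search": "claude-3-5-haiku",         # high volume, low latency
    "ai_helper": "gpt-4o-mini",              # title suggestions, proofreading
    "summarize": "gpt-4o",                   # balances quality and cost
    "spam_triage": "gemini-2.0-flash",       # cheap text classification
    "nsfw_triage": "gpt-4o",                 # must be vision capable
}

def model_for(feature: str, default: str = "gpt-4o") -> str:
    """Return the chosen model for a feature, falling back to a balanced default."""
    return FEATURE_MODEL_MAP.get(feature, default)

print(model_for("ai_search"))       # claude-3-5-haiku
print(model_for("new_feature"))     # gpt-4o
```

Keeping the mapping in one place makes it easy to review costs feature by feature and to swap in a cheaper or stronger model as the landscape changes.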

Remember to configure the selected LLM(s) in your Discourse Admin settings under the relevant AI features.

Last edited by @sam 2025-03-31T02:00:15Z


I’m sure you are going to support Gemini 2.0. Can you estimate when?


It is already supported.
