When choosing a Large Language Model (LLM) to power Discourse AI features, it’s important to understand your needs as the community admin as well as those of your members.
Several factors may influence your decisions:
- Performance for use-case: Are you looking for the best-performing model? Performance can vary depending on the task (e.g., summarization, search, complex reasoning, spam detection). Assessment is based on the model’s ability to generate correct, relevant, and coherent responses.
- Context length: The context window is the amount of text a model can “see” and consider at one time. Larger context windows allow for processing more information (e.g., longer topics for summarization) and maintaining coherence over longer interactions.
- Compatibility: Is the model supported out of the box by the Discourse AI plugin? Will it require specific API endpoints or configuration? Check the plugin documentation for supported providers and models.
- Language support: While many top LLMs handle multiple languages well, performance can vary. If your community primarily uses a language other than English, testing specific models for that language is recommended.
- Multimodal capabilities: Some features, like AI Triage (NSFW detection), require models that can process images (vision). Ensure the chosen model supports the required modalities.
- Speed & Modes: Larger, more powerful models can be slower. For real-time features like AI Helper or Search, faster models might provide a better user experience. Some models (like Claude 3.7 Sonnet) offer different modes, allowing a trade-off between speed and deeper reasoning.
- Cost: Budget is often a key factor. Model costs vary significantly based on the provider and the model tier. Costs are typically measured per token (input and output). Faster/smaller models are generally cheaper than large/high-performance models. Open source models can often be run more cost-effectively depending on hosting.
- Privacy concerns: Different LLM providers have varying data usage and privacy policies. Review the terms of service, especially regarding whether your data might be used for training purposes. Some providers offer zero data retention options.
- Open vs. Closed Source: Open-source models offer transparency and the potential for self-hosting or fine-tuning, though they may require more technical effort. Closed-source models are typically easier to use via APIs but offer less control and transparency.
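Since costs are billed per input and output token, a rough monthly estimate can be computed before committing to a model. The sketch below uses hypothetical per-million-token prices and usage figures purely for illustration; real prices vary by provider and change frequently.

```python
# Hypothetical per-million-token prices (USD): (input, output).
# These are illustrative placeholders, NOT real provider pricing.
PRICES = {
    "large-model": (5.00, 15.00),
    "small-model": (0.15, 0.60),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate monthly spend from total input/output token counts."""
    price_in, price_out = PRICES[model]
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out

# e.g. 2,000 summaries per month, ~4,000 input + ~300 output tokens each
estimate = monthly_cost("small-model", 2000 * 4000, 2000 * 300)
print(f"${estimate:.2f}/month")  # $1.56/month at these assumed prices
```

Running the same numbers against a top-tier model's pricing quickly shows why high-volume features like search or spam detection are usually pointed at smaller, cheaper models.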
## Choosing an LLM for Discourse AI Features
The LLM landscape evolves rapidly. The table below provides a general overview of currently popular and capable models suitable for various Discourse AI features, categorized by their typical strengths and cost profiles. Models within each category are listed alphabetically.
These are general guidelines. Always check the official Discourse AI plugin documentation for the most up-to-date list of supported models and required configurations. Performance and cost change frequently; consult the LLM provider’s documentation for the latest details. Open Source model availability and performance can depend on the specific provider or hosting setup.
An alternative option for hosted customers is using the pre-configured open-weight LLMs hosted by Discourse. These can often be enabled via Admin → Settings → AI → `ai_llm_enabled_models`, or via specific feature settings.
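When evaluating compatibility (see the Compatibility point above), a quick sanity check is confirming that a provider exposes an OpenAI-style chat completions endpoint, since many self-hosted and third-party providers do. A minimal sketch of building such a request, with a placeholder base URL and model name:

```python
import json

def build_chat_request(model: str, prompt: str,
                       api_base: str = "https://api.example.com/v1"):
    """Build an OpenAI-compatible chat completion request.

    The base URL and model name here are placeholders; substitute your
    provider's actual endpoint and a model it serves.
    """
    url = f"{api_base}/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return url, json.dumps(payload)

url, body = build_chat_request("gpt-4o-mini",
                               "Summarize this topic in one sentence.")
```

POSTing `body` to `url` with an `Authorization: Bearer <key>` header should return a completion if the endpoint is truly OpenAI-compatible; a 404 or schema error suggests the provider needs its own dedicated configuration in the plugin.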
| Category | Model | Provider | Key Strengths / Use Cases | Notes |
|---|---|---|---|---|
| Top Performance/Reasoning | Claude 3.7 Sonnet (Thinking) | Anthropic | Maximum reasoning capability, complex tasks, analysis, generation | Uses more resources/time than regular mode, excellent vision |
| | DeepSeek-R1 | DeepSeek | Strong reasoning, competitive with top tiers, coding, math | Open source option, potentially lower cost than proprietary equivalents |
| | Gemini 2.5 Pro | Google | High performance, very large context window, strong multimodal | Excellent all-rounder, integrates well with Google ecosystem |
| | OpenAI o1 / o1-pro | OpenAI | State-of-the-art reasoning, complex tasks, generation | Highest cost, o1-pro likely needed for max capability via API |
| Balanced (Multi-Purpose) | Claude 3.7 Sonnet (Regular) | Anthropic | High performance, good reasoning, large context, vision, faster mode | Excellent default choice, balances speed and capability |
| | DeepSeek-V3 | DeepSeek | Strong general performance, good value | Open source option, cost-effective for broad use |
| | GPT-4o | OpenAI | Very strong all-rounder, good multimodal, widely compatible | Great balance of performance, speed, and cost |
| | OpenAI o3-mini | OpenAI | Good performance and reasoning for cost | A flexible, intelligent reasoning model suitable for many tasks |
| Cost-Effective/Speed | Claude 3.5 Haiku | Anthropic | Extremely fast and low cost, suitable for simpler tasks | Best for high-volume, low-latency needs like search, basic summaries |
| | Gemini 2.0 Flash | Google | Very fast and cost-effective, good general capabilities | Good for summarization, search, helper tasks |
| | GPT-4o mini | OpenAI | Fast, affordable version of GPT-4o, good for many tasks | Good balance of cost/performance for simpler features |
| | Llama 3.3 (e.g., 70B) | Meta | Strong open source model, often a cost-effective multi-purpose option | Performance varies by provider/hosting; check compatibility |
| Vision Capable | Claude 3.7 Sonnet | Anthropic | Strong vision capabilities (both modes) | Required for AI Triage/NSFW Detection |
| | Gemini 2.5 Pro / 2.0 Flash | Google | Strong vision capabilities | Required for AI Triage/NSFW Detection |
| | GPT-4o / GPT-4o mini | OpenAI | Integrated text and vision | Required for AI Triage/NSFW Detection |
| | Llama 3.2 | Meta | Open source vision capabilities | Check compatibility/hosting/provider support |
| | Discourse Hosted LLM | Discourse | Pre-configured vision model for hosted sites | Check specific feature settings (e.g., NSFW Detection) |
| | Qwen-VL / others | Various | Check Discourse AI plugin for specific supported vision models | Configuration may vary |
General Recommendations Mapping (Simplified):
- AI Bot (complex Q&A, personas): Top Performance/Reasoning models (Claude 3.7 Sonnet in Thinking mode, DeepSeek-R1, Gemini 2.5 Pro, o1-pro) or strong Balanced models (GPT-4o, Claude 3.7 Sonnet in Regular mode, o3-mini).
- AI Search: Cost-Effective/Speed models (Claude 3.5 Haiku, Gemini 2.0 Flash, GPT-4o mini, Llama 3.3), or Balanced models (GPT-4o, DeepSeek-V3) for slightly better understanding.
- AI Helper (title suggestions, proofreading): Cost-Effective/Speed or Balanced models; speed is often preferred. Claude 3.7 Sonnet (Regular) and GPT-4o mini are good candidates, and Llama 3.3 can also work well here.
- Summarize: Balanced models (Claude 3.7 Sonnet in Regular mode, GPT-4o, o3-mini, DeepSeek-V3) or Cost-Effective models (Gemini 2.0 Flash, Llama 3.3). Longer context windows (Gemini 2.5 Pro, Claude 3.7 Sonnet) are beneficial for long topics if budget allows.
- Spam Detection / AI Triage (text): Cost-Effective/Speed models are usually sufficient and cost-efficient (Claude 3.5 Haiku, Gemini 2.0 Flash, GPT-4o mini, Llama 3.3).
- AI Triage (NSFW image detection): requires a Vision Capable model (GPT-4o / GPT-4o mini, Claude 3.7 Sonnet, Gemini 2.5 Pro / 2.0 Flash, Llama 3.2, or specific Discourse-hosted/supported models).
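The mapping above can be sketched as a simple lookup, useful as a starting point when deciding which configured LLM each feature should use. The feature keys below are informal labels for this sketch, not actual Discourse setting names:

```python
# Simplified feature -> model-category mapping, following the
# recommendations above. Category names match the table; the
# dictionary keys are illustrative labels, not real setting names.
FEATURE_CATEGORY = {
    "ai_bot": "Top Performance/Reasoning",
    "ai_search": "Cost-Effective/Speed",
    "ai_helper": "Cost-Effective/Speed",
    "summarize": "Balanced (Multi-Purpose)",
    "spam_detection": "Cost-Effective/Speed",
    "nsfw_triage": "Vision Capable",
}

def suggested_category(feature: str) -> str:
    """Return the suggested model category for a Discourse AI feature."""
    return FEATURE_CATEGORY.get(feature, "Balanced (Multi-Purpose)")
```

Treat this purely as a default: communities with heavy AI Bot usage and tight budgets, for example, may deliberately trade down to a Balanced model.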
Remember to configure the selected LLM(s) in your Discourse Admin settings under the relevant AI features.
Last edited by @sam 2025-03-31T02:00:15Z