This guide explains how to configure and use Discourse AI’s spam detection feature, including the setup process, scanning criteria, classification logic, customizations, and contrasts with AI triage.
Required user level: Administrator
Discourse AI provides an efficient spam detection feature that identifies and flags spam posts with minimal configuration. While designed for simplicity, it complements the more versatile AI triage system, which supports broader workflows and larger use cases.
Summary
In this guide, you will learn:
- How AI spam detection works and what content is scanned
- The classification logic and context used by the AI
- Steps to configure spam detection through /admin/plugins/discourse-ai/ai-spam
- Guidelines for Large Language Model (LLM) selection
- Key differences between spam detection and AI triage
- How to manage flagged and missed posts
How AI spam detection works
What content gets scanned?
AI spam detection evaluates posts based on these criteria:
- User trust level:
  - Scans posts from users with trust level 1 or lower.
  - Excludes posts from higher trust levels.
- Post type:
  - Public posts (excluding private messages).
  - Both reply posts and first topic posts are included, based on additional thresholds.
- Post edits:
  - Scans posts with significant edits (e.g., changes exceeding 10 characters).
  - Enforces a 10-minute delay between scans of the same post.
- Post frequency:
  - Prioritizes posts from new users with fewer than 4 total posts in public topics.
  - Excludes posts from users at or above this threshold.
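The criteria above can be read as a single eligibility check. The following is a minimal Python sketch of that reading, not the plugin's actual implementation: `PostEvent`, `should_scan`, and every field name are hypothetical, and it assumes the number of changed characters and the time of the last scan are already known for an edited post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the criteria listed in this guide.
MAX_TRUST_LEVEL = 1                    # trust level 1 or lower is scanned
MAX_PUBLIC_POSTS = 4                   # only users with fewer than 4 public posts
MIN_EDIT_CHARS = 10                    # an edit must change more than 10 characters
RESCAN_DELAY = timedelta(minutes=10)   # minimum gap between scans of one post


@dataclass
class PostEvent:
    """Hypothetical container for the facts the check needs."""
    trust_level: int
    public_post_count: int
    is_private_message: bool
    changed_chars: Optional[int] = None          # None means a brand-new post
    last_scanned_at: Optional[datetime] = None   # None means never scanned


def should_scan(event: PostEvent, now: datetime) -> bool:
    """Return True when the post qualifies for a spam scan."""
    if event.is_private_message:
        return False
    if event.trust_level > MAX_TRUST_LEVEL:
        return False
    if event.public_post_count >= MAX_PUBLIC_POSTS:
        return False
    # Edits only qualify when the change is large enough.
    if event.changed_chars is not None and event.changed_chars <= MIN_EDIT_CHARS:
        return False
    # Respect the delay between scans of the same post.
    if event.last_scanned_at is not None and now - event.last_scanned_at < RESCAN_DELAY:
        return False
    return True
```

Whether the post count includes the post being evaluated, and how the size of an edit is measured, are assumptions of this sketch.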
The classification process
Posts that meet the criteria are sent to an AI model (LLM) for analysis. The model evaluates whether the post is “SPAM” or “NOT SPAM” based on:
- Context: Includes post content, topic title, user account data (e.g., account age and trust level), and site guidelines.
- Custom instructions: Admin-defined rules that reinforce or adapt the scanning criteria.
- Automated detection:
  - Flags irrelevant or promotional content (e.g., ads or commercial materials).
  - Identifies automated or bot-like behaviors.
  - Assesses content relevance to the discussion.
Default prompt and context
The AI uses a default system prompt to guide spam detection. This prompt outlines spam classification rules. For example:
You are a spam detection system. Analyze the following content and context.
Notes:
- Replies must remain relevant to the discussion thread.
- Mark as SPAM if the content is irrelevant, promotional, or automated.
- Consider new user posts with links as potential SPAM unless explicitly relevant to the topic.
Respond only with "SPAM" or "NOT SPAM".
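Because the prompt restricts the reply to two labels, whatever consumes the reply has to tell them apart carefully: a plain substring check for "SPAM" would also match "NOT SPAM". The Python sketch below shows one strict way to interpret such a reply. The function name `parse_verdict` is hypothetical, and this is not a description of how the plugin parses responses.

```python
def parse_verdict(raw: str) -> bool:
    """Return True for SPAM and False for NOT SPAM; raise on anything else."""
    # Trim whitespace, surrounding quotes and a trailing period, then
    # collapse inner whitespace so "not  spam" and "NOT SPAM" compare equal.
    normalized = " ".join(raw.strip().strip('"\'.').upper().split())
    if normalized == "SPAM":
        return True
    if normalized == "NOT SPAM":
        return False
    # Anything else is treated as an error rather than guessed at.
    raise ValueError(f"unexpected model reply: {raw!r}")
```

Raising on an unexpected reply, instead of defaulting to either label, is a design choice of this sketch: it keeps a malformed answer from silently flagging or clearing a post.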
The scanner also compiles a context package, including:
- Metadata from topics and categories.
- Relevance of replies to the thread.
- Author data (e.g., account creation date, total posts, trust level).
- Post text truncated to 5000 characters for processing.
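As an illustration of the context package described above, here is a minimal Python sketch that assembles the listed fields into one text block and applies the 5000-character limit. The `PostContext` class, its field names, and the output layout are all assumptions made for this example; the real format used by the plugin may differ.

```python
from dataclasses import dataclass

MAX_POST_CHARS = 5000  # truncation limit stated in this guide


@dataclass
class PostContext:
    """Hypothetical bundle of the data listed in this section."""
    topic_title: str
    category_name: str
    author_created_at: str
    author_post_count: int
    author_trust_level: int
    raw: str          # full post text
    is_reply: bool


def build_context(ctx: PostContext) -> str:
    """Assemble the text sent alongside the system prompt."""
    post_text = ctx.raw[:MAX_POST_CHARS]  # keep only the first 5000 characters
    lines = [
        f"Topic title: {ctx.topic_title}",
        f"Category: {ctx.category_name}",
        f"Post type: {'reply' if ctx.is_reply else 'first post'}",
        f"Author account created: {ctx.author_created_at}",
        f"Author total posts: {ctx.author_post_count}",
        f"Author trust level: {ctx.author_trust_level}",
        "Post content:",
        post_text,
    ]
    return "\n".join(lines)
```

In this layout the post text comes last, so truncation never cuts into the metadata.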
How to configure AI spam detection
Configuration guide
- Access settings: Navigate to /admin/plugins/discourse-ai/ai-spam.
- Select an LLM:
  - Choose a language model suited to your forum’s needs. See the Large Language Model (LLM) settings page for configuring LLMs.
  - Access /admin/plugins/discourse-ai/ai-llms for LLM configurations.
- Activate spam detection: Enable spam detection by toggling the feature on.
  Note: A connected LLM is mandatory.
- Add customized instructions:
  - Define rules specific to your forum (e.g., stricter monitoring of external links).
  - Save any changes to apply them.
Tip: Disable Akismet when using Discourse AI spam detection to avoid redundancy.
Differences from AI triage
While spam detection is designed specifically for identifying spam, AI triage supports broader post management tasks.
Feature | AI Spam Detection | AI Triage |
---|---|---|
Complexity | Streamlined, opinionated setup | Highly customizable and flexible |
Primary use case | Detecting spam with minimal overhead | Advanced workflows for categorization, tagging, replies, spam detection, NSFW detection |
Actions | Flags spam, silences users | Tags, categorizes, hides posts, adds replies, flags posts, silences users |
Recommendation | Use instead of Akismet | Use for rich, highly customizable workflows |
For more details, see Discourse AI - AI triage.
LLM selection recommendations
The performance of spam detection depends on the chosen LLM.
Most low-cost LLMs work effectively, such as:
- GPT-4o-mini
- Claude 3.5 Haiku
- Gemini 2.0 Flash
Experiment with different models to find the best fit. Configure your models via /admin/plugins/discourse-ai/ai-llms.
Testing spam scanner behavior
You can test spam detection rules directly from the configuration page.
- Paste a post URL or ID into the test field.
- Review the classification result (e.g., “SPAM” or “NOT SPAM”) and analyze logs to understand reasoning.
- Unsaved changes are applied during testing, enabling experimentation without risk.
Managing flagged and missed posts
Handling flagged posts
Flagged posts appear in the moderation queue. Admins can:
- Approve legitimate posts wrongly classified as spam.
- Reject spam topics to keep the system accurate.
Important: When a post has been incorrectly classified, reject the spam flag promptly. The author remains silenced until the flag is resolved.
Handling missed spam
Missed spam refers to posts that bypass detection but are later flagged by the community. Moderators can manage these as necessary.
Best practices
- Monitor flagged and missed spam regularly to refine system accuracy. Clickable metrics simplify this process.
- Use test cases to evaluate custom instructions against edge cases.
- Review and adjust LLM settings when needed.
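One way to make the test-case practice concrete is to keep a small set of posts that moderators have already labeled and count where a classifier agrees or disagrees with them. The Python sketch below is a self-contained illustration only. `Case`, `evaluate`, and `toy_classifier` are hypothetical names, and the toy classifier is a stand-in for a real model call; it does not use the plugin's test field or any Discourse API.

```python
from typing import Callable, Dict, Iterable, NamedTuple


class Case(NamedTuple):
    text: str
    is_spam: bool  # label assigned by a human moderator


def evaluate(cases: Iterable[Case], classify: Callable[[str], bool]) -> Dict[str, int]:
    """Count correct and incorrect verdicts for a labeled set of posts."""
    counts = {"caught_spam": 0, "false_flags": 0, "missed_spam": 0, "passed_clean": 0}
    for case in cases:
        flagged = classify(case.text)
        if flagged and case.is_spam:
            counts["caught_spam"] += 1
        elif flagged:
            counts["false_flags"] += 1
        elif case.is_spam:
            counts["missed_spam"] += 1
        else:
            counts["passed_clean"] += 1
    return counts


def toy_classifier(text: str) -> bool:
    # Stand-in for a real model call: flags any post containing a link.
    return "http" in text.lower()


cases = [
    Case("Buy cheap watches at http://example.com", True),
    Case("The docs at https://example.org fixed my install issue", False),
    Case("DM me for a great crypto deal", True),
    Case("Thanks, that solved it!", False),
]
results = evaluate(cases, toy_classifier)
print(results)
```

With these four cases the toy rule catches one spam post, wrongly flags one legitimate post, and misses one spam post. Those are the kinds of edge cases custom instructions are meant to address.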
Additional resources
Configuring AI spam detection effectively reduces manual moderation efforts, ensuring a clean, spam-free community.
Last edited by @MarkDoerr 2024-12-21T02:05:11Z