Customizing Discourse AI Spam Handling Guide - AI Discussion

sam 2025-12-19

Read through the implementation in discourse and create a great guide for customizing the Discourse AI spam handling. Cover 3 examples, and make one of them something that makes you smile and laugh.

GitHub_Helper_bot 2025-12-19

GitHub code search
Searched for ‘spam AI’ in discourse/discourse-ai

GitHub code search
Searched for ‘spam’ in discourse/discourse-ai

GitHub code search
Searched for ‘spam’ in discourse/discourse

GitHub file content
Retrieved content of plugins/discourse-ai/lib/ai_moderation/spam_scanner.rb, plugins/discourse-ai/lib/personas/spam_detector.rb, plugins/discourse-ai/app/models/ai_moderation_setting.rb from discourse/discourse-ai@main

Customizing the Discourse AI Spam Detection system allows you to tune the “gut feeling” of your AI moderator to match your community’s unique culture. Based on the implementation in the discourse-ai plugin, here is a guide to mastering this feature.

How the Implementation Works

The AI spam handling is primarily driven by three components:

- SpamScanner: This is the “brain” that decides when to scan. It triggers on new posts, on edits, and after a post is cooked (rendered from raw Markdown to HTML). It gathers a wealth of context for the AI, including:
  - User Metadata: Account age, trust level, total post count, and even registration/last IP location via DiscourseIpInfo.
  - Thread Context: For replies, it includes the parent post content so the AI knows if the user is actually responding or just shouting into the void.
- SpamDetector Persona: This defines the base system prompt. It instructs the LLM to look for unsolicited commercial content, automated patterns, and irrelevant links while being “fair to new users and non-native speakers.”
- AiModerationSetting: This is where your Custom Instructions are stored. These instructions are appended directly to the system prompt, allowing you to override or refine the AI’s judgment.
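As a rough illustration of the flow described above, here is a minimal Python sketch of how a base prompt, custom instructions, and post context could be combined. It is not the plugin's code (the plugin is written in Ruby). Every name in it (`BASE_PROMPT`, `PostContext`, `build_system_prompt`, `build_context`) is hypothetical, and the exact fields and text layout are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the persona's base prompt; wording paraphrased from this guide.
BASE_PROMPT = (
    "You are a spam detector. Look for unsolicited commercial content, "
    "automated patterns, and irrelevant links. "
    "Be fair to new users and non-native speakers."
)


@dataclass
class PostContext:
    """Assumed subset of the metadata the scanner gathers."""
    account_age_days: int
    trust_level: int
    post_count: int
    raw: str
    parent_raw: Optional[str] = None  # only set when the post is a reply


def build_system_prompt(custom_instructions: str = "") -> str:
    """Append custom instructions to the base prompt, if any were provided."""
    custom = custom_instructions.strip()
    if not custom:
        return BASE_PROMPT
    return f"{BASE_PROMPT}\n\n{custom}"


def build_context(ctx: PostContext) -> str:
    """Render user metadata and thread context as plain text for the model."""
    lines = [
        f"Account age (days): {ctx.account_age_days}",
        f"Trust level: {ctx.trust_level}",
        f"Post count: {ctx.post_count}",
    ]
    if ctx.parent_raw is not None:
        lines.append(f"Replying to: {ctx.parent_raw}")
    lines.append(f"Post: {ctx.raw}")
    return "\n".join(lines)
```

Because the custom text is appended after the base prompt, it reads to the model as a refinement of the default rules. That is why short, specific instructions tend to work better than long rewrites of the whole policy.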

Customizing with “Custom Instructions”

To customize your handling, navigate to /admin/plugins/discourse-ai/ai-spam. The “Custom Instructions” field is your primary tool. Here are three ways to use it:

1. The “Niche Expert” (Fine-Tuning Relevance)

On a highly technical forum (e.g., a Rust development community), standard AI might flag a post containing many links to obscure GitHub crates or package registries as “suspicious external links.”

Custom Instruction:

“Our community frequently shares links to crates.io, github.com, and docs.rs. Do not flag posts containing these domains as spam unless they are accompanied by ‘make money fast’ or ‘crypto’ keywords. We are a developer forum; technical jargon is expected and should not be treated as ‘excessive keywords’.”
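Since the instruction is plain language interpreted by an LLM, it can help to write the intended rule down precisely before trying it on sample posts. The Python sketch below is a deterministic approximation of that intent, useful only for sanity-checking test posts. It is not something the plugin runs, and the function names and the three outcome labels are assumptions.

```python
import re
from urllib.parse import urlparse

# Domains and keywords taken from the custom instruction above.
TRUSTED_DOMAINS = {"crates.io", "github.com", "docs.rs"}
KEYWORD_PATTERN = re.compile(r"\b(?:make money fast|crypto)\b", re.IGNORECASE)
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def link_domains(text: str) -> set[str]:
    """Return the hostnames of all http(s) links in the text, without a leading 'www.'."""
    hosts = set()
    for url in URL_PATTERN.findall(text):
        # Drop sentence punctuation that the URL pattern may have swallowed.
        host = urlparse(url.rstrip(".,;:!?")).hostname
        if host:
            hosts.add(host.removeprefix("www."))
    return hosts


def intended_verdict(text: str) -> str:
    """Approximate what the instruction asks of the model (labels are hypothetical)."""
    if not (link_domains(text) & TRUSTED_DOMAINS):
        return "defer_to_model"  # the instruction says nothing about this post
    if KEYWORD_PATTERN.search(text):
        return "may_flag"  # trusted link, but accompanied by a spam keyword
    return "do_not_flag"
```

The word-boundary match is deliberate: on a Rust forum, "cryptography" should not count as the "crypto" keyword. Writing the rule out also exposes a gap in the wording: a post that mixes one trusted link with unrelated links is still exempt, so you may want to tighten the instruction.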

2. The “Slang Defender” (Preventing False Positives)

In gaming or meme-heavy communities, users often use repetitive phrases (“pog”, “LFG”, “F in the chat”) or “suspicious formatting” (ASCII art) that a default LLM might see as “low-effort bot behavior.”

Custom Instruction:

“We are a gaming community. Phrases like ‘LFG’, ‘poggers’, and ‘kappa’ are standard vocabulary and do not indicate spam. Additionally, ASCII art ‘copypastas’ are a form of community expression here—only flag them if they contain external links to non-gaming sites.”

3. The “Acorn Arbitrator” (The Funny One)

Imagine you run a forum for “The International Conspiracy of Squirrels” (a roleplay/nature community). Your members are very protective of their nuts, and the AI keeps getting confused.

Custom Instruction:

“Any posts mentioning ‘Acorn Futures’, ‘High-Yield Walnut Investing’, or ‘Offshore Peanut Shelters’ are 100% spam from the rival Chipmunk Union. Flag them immediately. However, do NOT flag posts written entirely in ‘Squeak’ (e.g., ‘Squeak squeakity squeak!’). This is our formal diplomatic language, and even though it looks like ‘repetitive text patterns’ to you, it actually translates to high-level geopolitical strategy. If a user tries to sell ‘Magic Glowing Seeds’, silence them and hide the post—we all know what happened last time.”

Pro-Tips for Implementation

- Trust Levels: By default, the scanner only targets users at or below a certain trust level (defined in SiteSetting.ai_spam_detection_max_trust_level). Use this to ensure your veterans are never bothered by the AI.
- The “Paper Trail”: Check the AiSpamLog in your database or admin UI to see the “Reason” the AI gave. If it’s being too sensitive, copy that reason into your Custom Instructions and tell the AI why it was wrong!
- Automated Action: Remember that if the AI flags a post as spam, it will automatically silence the user and hide the post to prevent further damage while waiting for a human moderator.
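The trust-level gate and the automated action described in these tips can be sketched as a small decision function. This is an illustrative Python model of the behavior as this guide describes it, not the plugin's Ruby code. `ScanOutcome`, `should_scan`, `handle_post`, and the action labels are all hypothetical names, and `is_spam` stands in for the model's verdict.

```python
from dataclasses import dataclass, field


@dataclass
class ScanOutcome:
    scanned: bool
    actions: list[str] = field(default_factory=list)
    reason: str = ""  # the explanation you would look up in the spam log


def should_scan(user_trust_level: int, max_trust_level: int) -> bool:
    """Scan only users at or below the configured trust level."""
    return user_trust_level <= max_trust_level


def handle_post(
    user_trust_level: int, max_trust_level: int, is_spam: bool, reason: str
) -> ScanOutcome:
    """Model the flow: gate by trust level, then act on the verdict."""
    if not should_scan(user_trust_level, max_trust_level):
        return ScanOutcome(scanned=False)  # veterans are never evaluated
    if is_spam:
        # Assumed labels for "silence the user, hide the post, wait for a moderator".
        return ScanOutcome(True, ["silence_user", "hide_post", "queue_for_review"], reason)
    return ScanOutcome(True, [], reason)
```

For example, `handle_post(3, 1, True, "ad links")` returns an outcome with `scanned=False` and no actions, because a trust level 3 user is above the threshold of 1 and is skipped before any verdict is considered.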