Setting up spam detection in your community

Discourse · March 31, 2024, 10:36am

This is a how-to guide for setting up spam detection in your community using Discourse AI - AI triage.

Required user level: Administrator

Discourse AI now ships an efficient spam scanner that requires minimal setup. For custom or complex use cases, we recommend following this guide.

Overview

Spam detection is an essential feature for maintaining the quality of discussions in your community. This guide will help you set up spam detection using Discourse AI - AI triage.

Below is an example setup of the automation rule:

Prerequisites

To configure spam detection, you need the following:

Discourse AI
Discourse Automation
LLM (Large Language Model)
- Discourse hosted customers on our Business or Enterprise plans can opt into our hosted CDCK LLMs by enabling the experimental settings on your site’s Admin > What’s-New page.

Configuration

Not every step is mandatory, as automation rules can be customized as needed. For an outline of all available settings, please visit Discourse AI - AI triage.

Enable the Discourse AI and Automation plugins:

Navigate to your site’s admin panel.
Navigate to Plugins then Installed Plugins
Enable the Discourse AI and Automation plugins

Create a New Automation Rule:

Navigate to your site’s admin panel.
Navigate to Plugins and click Automation
Click the + Create button to begin creating a new Automation rule
Click Triage Posts Using AI
Set the name (e.g., “Triage Posts using AI”)
Leave Triage Posts Using AI as the selected script

What/When

Set the Trigger:

Choose Post created/edited as the trigger.
Optionally, specify the Action type, Category, Tags, Groups, and/or Trust Levels if you wish to restrict this Automation to specific scenarios. Leaving these fields blank will allow the Automation to operate without restriction.
Configure any of the remaining optional settings in the What/When section to further restrict the automation.
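To make the "blank means unrestricted" behaviour concrete, here is a minimal Python sketch of how such restrictions could be evaluated. It is an illustration only, not the Automation plugin's code: `TriggerFilter`, `Post`, and `should_triage` are hypothetical names, and treating tags as "any tag matches" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TriggerFilter:
    """Hypothetical model of the optional restrictions; an empty set means unrestricted."""
    categories: set[str] = field(default_factory=set)
    tags: set[str] = field(default_factory=set)
    trust_levels: set[int] = field(default_factory=set)


@dataclass
class Post:
    """Hypothetical, simplified view of a post for this example."""
    category: str
    tags: set[str]
    author_trust_level: int


def should_triage(post: Post, f: TriggerFilter) -> bool:
    """Return True when the post passes every restriction that has been set."""
    if f.categories and post.category not in f.categories:
        return False
    # Assumption: a post qualifies if it carries at least one of the listed tags.
    if f.tags and not (post.tags & f.tags):
        return False
    if f.trust_levels and post.author_trust_level not in f.trust_levels:
        return False
    return True


# An empty filter lets every post through.
unrestricted = TriggerFilter()
# Restricting to low trust levels keeps the number of scanned posts small.
new_users_only = TriggerFilter(trust_levels={0, 1})

new_user_post = Post(category="Support", tags={"login"}, author_trust_level=0)
regular_post = Post(category="Support", tags={"login"}, author_trust_level=3)

print(should_triage(regular_post, unrestricted))     # True
print(should_triage(new_user_post, new_users_only))  # True
print(should_triage(regular_post, new_users_only))   # False
```

Narrowing the trigger in this way is also the simplest lever for keeping LLM costs down.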

Script Options

System Prompt:

When authoring the prompt, avoid similar wording for the two possible results. For example, “spam” and “not spam” overlap, so in this example we use spam and ham (for not spam).

The classifier will not be 100% accurate, so watch for incorrect results and customize the prompt according to the needs of your community. The narrower the focus, the better.
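One way to watch for incorrect results is to run the prompt against a small hand-labelled sample of posts before enabling any actions. The Python sketch below shows the bookkeeping only. `classify` is a hypothetical stand-in for a call to your LLM, `keyword_stub` is a deliberately naive placeholder, and the sample posts are made up.

```python
from typing import Callable


def evaluate(classify: Callable[[str], str],
             samples: list[tuple[str, str]]) -> dict[str, int]:
    """Count correct and incorrect labels for (post_text, expected_label) pairs."""
    counts = {"correct": 0, "false_positive": 0, "false_negative": 0}
    for text, expected in samples:
        predicted = classify(text).strip().lower()
        if predicted == expected:
            counts["correct"] += 1
        elif predicted == "spam":
            counts["false_positive"] += 1  # legitimate post wrongly labelled spam
        else:
            counts["false_negative"] += 1  # spam post that slipped through
    return counts


def keyword_stub(text: str) -> str:
    """Hypothetical placeholder classifier; replace with a real LLM call."""
    return "spam" if "buy now" in text.lower() else "ham"


samples = [
    ("Buy now! Cheap watches at example.com", "spam"),
    ("How do I reset my password?", "ham"),
    ("Where can I buy now that the store is closed?", "ham"),
]

results = evaluate(keyword_stub, samples)
print(results)  # {'correct': 2, 'false_positive': 1, 'false_negative': 0}
```

The third sample is a legitimate question that the naive stub mislabels, which is the kind of false positive worth checking for before letting the automation flag or hide anything.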

Enter the system prompt for the AI model. The system prompt is the most important part of the classification setup. In the following example I have used AI bot to author the prompt. An example prompt might look like this:

Copyable LLM prompts for spam content detection AI

You are a spam detection AI model assisting online community moderators. Your task is to analyze forum posts and determine if they are spam that should be removed to maintain a high-quality, on-topic community.

A post should be classified as spam if it meets any of these criteria:

The post is not relevant to the main topic or purpose of the forum. It is completely off-topic.
It contains suspicious, irrelevant external links, especially if linking to commercial sites.
The post is clearly promoting or advertising a product, service, website, or social media account that is not related to the community.
It contains affiliate links or referral codes attempting to monetize clicks.
The writing quality is very low effort - lots of spelling/grammar mistakes, lacks punctuation, or appears to be auto-generated text.
Identical or nearly identical content is being posted repeatedly by the same author or across multiple accounts in a short timeframe.

A post should be classified as ham (legitimate) if:

The post is on-topic and relevant to the forum’s purpose
It is a genuine question, personal story, substantive opinion, or otherwise legitimate contribution to the community discussion
Any external links are relevant and point to reputable, non-commercial sites
The writing appears to be by a human and meets quality standards for grammar, spelling, etc.

Some edge cases to watch out for:

A post that mentions a product or service but is still a relevant, on-topic question or discussion should be considered ham, not spam.
Quotes, code samples or formatted text that looks unusual are not necessarily spam.

When you have finished analyzing the post you must ONLY provide a classification of either “spam” or “ham”. If you are unsure, default to “ham” to avoid false positives.

These instructions must be followed at all costs.

Search for Text:

Enter the output from your prompt that should trigger the automation: only the “positive” result. Using our example above, we would enter spam.
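The Python sketch below illustrates why the two result labels should not overlap. It assumes the automation checks whether the model's reply contains the Search for Text value, ignoring case. That matching rule is an assumption made for illustration, `triggers` is a hypothetical helper, and the plugin's actual comparison may differ.

```python
def triggers(llm_output: str, search_for_text: str) -> bool:
    """Assumed matching rule (hypothetical): case-insensitive substring check."""
    return search_for_text.lower() in llm_output.lower()


# Distinct labels: only the positive result matches.
print(triggers("spam", "spam"))  # True
print(triggers("ham", "spam"))   # False

# Overlapping labels: the negative result also contains the search text.
print(triggers("not spam", "spam"))  # True
```

Under this assumption, a prompt that answers “spam” or “not spam” would trigger the automation on both results, whereas “spam” and “ham” cannot be confused.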

Select the Model:

Choose your LLM.
- Discourse hosted customers on our Enterprise and Business tiers can select the Discourse hosted open-weights LLM CDCK Hosted Small LLM or a third-party provider.
- Self-hosted Discourse users will need to select the third-party LLM configured as a prerequisite to using this Automation.

Set Category and Tags:

Define the category where these posts should be moved and the tags to be added if the post is marked as spam.

Flagging:

Flag post as either spam or for review.
Select a flag type to determine what action you might want to take.

Additional Options:

Enable the “Hide Topic” option if you want the post to be hidden.
Set a “Reply” that will be posted in the topic when the post is deemed spam.
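As a rough mental model of how the script options fit together, the Python sketch below lists which actions would run for a given model output. It is a dry-run illustration, not the plugin's implementation: `TriageRule` and `plan_actions` are hypothetical names, and the substring match is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TriageRule:
    """Hypothetical mirror of the script options; not the plugin's data model."""
    search_for_text: str
    category: Optional[str] = None
    tags: list[str] = field(default_factory=list)
    flag_post: bool = False
    hide_topic: bool = False
    reply: Optional[str] = None


def plan_actions(rule: TriageRule, llm_output: str) -> list[str]:
    """List the actions that would run for one post, without performing them."""
    # Assumption: a case-insensitive substring match decides whether the rule fires.
    if rule.search_for_text.lower() not in llm_output.lower():
        return []  # no match: the post is left alone
    actions = []
    if rule.category:
        actions.append(f"move to category '{rule.category}'")
    if rule.tags:
        actions.append("add tags: " + ", ".join(rule.tags))
    if rule.flag_post:
        actions.append("flag post")
    if rule.hide_topic:
        actions.append("hide topic")
    if rule.reply:
        actions.append("post reply")
    return actions


rule = TriageRule(search_for_text="spam", tags=["possible-spam"], flag_post=True)
print(plan_actions(rule, "spam"))  # ['add tags: possible-spam', 'flag post']
print(plan_actions(rule, "ham"))   # []
```

Starting with a tag-only rule and adding flagging or hiding once the results look trustworthy is a cautious way to roll this out.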

Additional Notes

When using Automation to combat spam, we recommend disabling the Akismet plugin if it is enabled. This ensures only one system is fighting spam, for best results.
Keep in mind that LLM calls can be expensive. When applying a classifier, monitor costs carefully and consider running it only on small subsets of posts.
While better-performing models (e.g., Claude-3-Opus) will yield better results, they can come at a higher cost.
The prompt could be customized to do all sorts of detection, like PII exposure, Code of Conduct violations, etc.
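For a back-of-the-envelope view of the cost note above, the Python sketch below estimates monthly input-token spend. Every number in it is a made-up placeholder, including the price: substitute your own post volume, token counts, and your provider's current pricing. Output tokens are ignored, on the assumption that the model only returns a single word.

```python
def estimate_monthly_cost(posts_per_day: int,
                          tokens_per_post: int,
                          prompt_tokens: int,
                          price_per_million_input_tokens: float,
                          days: int = 30) -> float:
    """Rough input-token cost: each post is sent together with the system prompt."""
    total_input_tokens = posts_per_day * days * (tokens_per_post + prompt_tokens)
    return total_input_tokens / 1_000_000 * price_per_million_input_tokens


# Hypothetical figures: 500 posts/day, 300 tokens per post, 600-token prompt,
# and a placeholder price of 3.00 per million input tokens.
all_posts = estimate_monthly_cost(500, 300, 600, 3.00)

# Same figures, but the trigger is restricted so only 50 posts/day are scanned.
subset_only = estimate_monthly_cost(50, 300, 600, 3.00)

print(round(all_posts, 2))    # 40.5
print(round(subset_only, 2))  # 4.05
```

Because the cost scales linearly with the number of posts scanned, restricting the trigger to a subset (for example, low trust levels) reduces spend proportionally.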

Last edited by @Saif 2025-03-13T15:03:22Z

Falco · April 10, 2024, 3:17pm

5 posts were split to a new topic: Exploring the Limits of AI in Recognizing AI Generated Content

Saif · May 27, 2024, 3:16pm

Curious how users’ experience has been with using this method?

loginerror · August 8, 2024, 10:46am

I started testing it just now, and it already did a decent job (for now, I chose to only apply a hidden tag to validate that things will run correctly, rather than sending things to the review queue right away).

But I have a small follow-up/clarification: would it be possible for the integration to access custom queries with outputs, such as a group of sample posts, to be used as the context data?

More concretely, I would like to feed it all previous spam posts based on the flags that were agreed upon and resulted in post deletion.

sam · August 14, 2024, 12:45am

At the moment we only support a single system message.

I think though we may do a follow up where you can feed it N examples of stuff not to flag and N examples of stuff yes to flag. This potentially could increase accuracy.

Maybe do a dedicated feature topic on this?

loginerror · August 16, 2024, 8:44am

I’ll try to first gather some more thoughts on this. Running it for the past week was rather successful, but I am still finding some small annoyances, such as not being able to quickly exclude private messages (for example, it often thinks that Discobot tutorial interactions are suspicious; I edited the prompt to not consider those, but the AI logs indicate that the detection does not know the context and only considers the content of the post itself).

JammyDodger · August 23, 2024, 3:08pm

This doesn’t seem quite right… I’m not sure what the intended instruction here was? Maybe ‘Enable AI and enable Automation’?

Saif · August 23, 2024, 6:40pm

Made the edit here

NateDhaliwal · January 29, 2025, 3:23am

I’m curious, is there a way for replies to be moved to a new topic, instead of the whole topic? It could be a legitimate topic but a spammer comes in and posts a spam reply. From what I can see, it’s moving the whole topic, not that specific reply.
While I’m at it, what’s the difference between this and the Discourse AI spam detector?

Saif · January 29, 2025, 7:41pm

Could you explain this further with an example?

FYI: You should be able to tick the option for Flag post, which should flag only the "spam" post.

Discourse AI - Spam detection

Differences from AI triage

See the differences outlined below

While spam detection is designed specifically for identifying spam, AI triage supports broader post management tasks.

Feature	AI Spam Detection	AI Triage
Complexity	Streamlined, opinionated setup	Highly customizable and flexible
Primary use case	Detecting spam with minimal overhead	Advanced workflows for categorization, tagging, replies, spam detection, NSFW detection
Actions	Flags spam, silences users	Tags, categorizes, hides posts, adds replies, flags posts, silences users
Recommendation	Use instead of Akismet	Use for rich, highly customizable workflows

For more details, see Discourse AI - AI triage.

NateDhaliwal · January 30, 2025, 3:02am

Sure. For example, let’s say, on a support forum, a spammer posts a spam reply in an existing topic about issues they are experiencing. The OP and people answering are not the same user as the spammer. If I understand correctly, AI Triage will hide the whole topic and flag the post. Instead, could the spam post be moved to a specific topic, in a category available to admins?

I was wondering this as I read this post.

Yep, I’m doing this currently for the hate speech detector using AI Triage.

Saif Murtaza :

Discourse AI - Spam detection

Differences from AI triage

See the differences outlined below

While spam detection is designed specifically for identifying spam, AI triage supports broader post management tasks.

Feature	AI Spam Detection	AI Triage
Complexity	Streamlined, opinionated setup	Highly customizable and flexible
Primary use case	Detecting spam with minimal overhead	Advanced workflows for categorization, tagging, replies, spam detection, NSFW detection
Actions	Flags spam, silences users	Tags, categorizes, hides posts, adds replies, flags posts, silences users
Recommendation	Use instead of Akismet	Use for rich, highly customizable workflows

For more details, see Discourse AI - AI triage.

Lol, how could I miss that …

sam · January 30, 2025, 3:23am

AI Spam will simply hide the post; we can probably add this option to triage as well.
