Discourse AI - Spam detection

pfaffman · August 28, 2025, 3:35pm

And here’s an example of the same spammers getting caught on this forum: https://meta.discourse.org/t/full-list-of-quickbooks-desktop-support-contact-numbers-a-complete-call-center-in-the-usa/380776 (it’s already hidden).

These guys are definitely working hard.

haydenjames · September 3, 2025, 1:12pm

Great work on this feature. This is exactly how I like to see AI used.

Quick question: when a new TL0 user submits a reply or topic, is there a time delay while the content is scanned?

I see a short delay in the built-in tester (screenshot below), but when I post from a test account, there’s no similar pause. Is the live scan asynchronous after publishing, with the post hidden only if it trips a rule? (context: I’m using the OpenAI ChatGPT 5 API.)

For what it’s worth, AI > Spam & Stats increment as expected with the test account, so the post IS being scanned; it just isn’t introducing the same delay as the Test button does.

Thanks.

stance455 · September 25, 2025, 10:52am

OK, so this works pretty well, but what happens when it flags dozens of topics/users? I’m not seeing a way to bulk ban/delete these users/posts.

sps · February 3, 2026, 7:49pm

Thanks for the detailed thread. We have Discourse AI spam detection enabled on our instance, and one of the things that we’re seeing is the auto-silencing default when the first post made by an account is flagged.

I understand this is meant to stop one-shot spammers. However, approving/accepting a flag currently leaves the user silenced, which is a problem when we want to approve the flag but not silence the user. It would be good to have:

- an “agree and keep silenced” button, and
- a separate “agree and lift silencing” button.

sam · February 3, 2026, 11:57pm

This is a tricky one. We don’t want to paralyze people with choice here, but I totally get that at scale this can be a problem.

Let me check with the enterprise xp team; maybe there is a small customization we can make for your forum.

singi2016cn · February 28, 2026, 3:52am

I published a test spam topic in my local development environment, but it did not automatically enter the review queue.

The AI detection result does classify the post as spam.

The post also meets the other conditions for entering the review queue:

User Trust Level:
- Scan posts from users with a trust level of 1 or lower.
- Exclude posts from users with a higher trust level.
Post Type:
- Public posts (excluding private messages).
- Include reply posts and first topic posts based on other thresholds.
Post Edits:
- Scan posts with significant edits (e.g., changes exceeding 10 characters).
- Enforce a 10-minute delay between scans of the same post.
Post Frequency:
- Prioritize cases where new users have posted a total of fewer than 4 posts in public topics.
- Exclude posts from users exceeding this threshold.
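
When debugging a case like this, it can help to read the conditions above as a single eligibility check and see which one fails. The following is a minimal Python sketch of that reading only. It is not the plugin's actual code; `PostInfo`, `skip_reason`, and the constant names are hypothetical, the thresholds are copied from the list, and whether the post count includes the post being scanned is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical names; thresholds copied from the list above, not from the plugin.
MAX_TRUST_LEVEL = 1                    # "trust level of 1 or lower"
MAX_PUBLIC_POSTS = 4                   # "fewer than 4 posts in public topics"
MIN_EDIT_CHARS = 10                    # "changes exceeding 10 characters"
RESCAN_DELAY = timedelta(minutes=10)   # "10-minute delay between scans"


@dataclass
class PostInfo:
    trust_level: int
    is_private_message: bool
    author_public_post_count: int              # assumed to include this post
    edit_char_delta: Optional[int] = None      # None for a brand-new post
    last_scanned_at: Optional[datetime] = None


def skip_reason(post: PostInfo, now: datetime) -> Optional[str]:
    """Return why a post would be skipped, or None if it would be scanned."""
    if post.trust_level > MAX_TRUST_LEVEL:
        return "trust level too high"
    if post.is_private_message:
        return "private message"
    if post.author_public_post_count >= MAX_PUBLIC_POSTS:
        return "author has too many public posts"
    if post.edit_char_delta is not None and post.edit_char_delta <= MIN_EDIT_CHARS:
        return "edit too small"
    if post.last_scanned_at is not None and now - post.last_scanned_at < RESCAN_DELAY:
        return "scanned too recently"
    return None


# A trust level 1 author with one public post: nothing in the list excludes it.
print(skip_reason(PostInfo(1, False, 1), datetime(2026, 2, 28, 4, 0)))  # None
```

Returning a reason instead of a plain boolean makes it easier to see which condition excluded a post, which is the question being asked here.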

However, the final result is that it did not enter the review queue.

Where should I check to find the problem?

per1234 · February 28, 2026, 4:11am

Hi @singi2016cn.

Do you mean that you verified this with the testing tool?

You can access that tool by following these instructions:

1. Log into an account on your forum that has administrator privileges.
2. Navigate to this page on your forum: /admin/plugins/discourse-ai/ai-spam
3. Click the “Test…” button on that page.
4. The “Test spam detection” dialog will open.
5. Enter the URL or post ID of your test spam post into the “Post URL or ID” field in the dialog.
6. Click the “Run test” button.

singi2016cn · February 28, 2026, 6:24am

Yes, the testing tool clearly returned Spam, but when I posted the exact same content, it didn’t enter the moderation queue.

Moin · February 28, 2026, 10:59am

Who posted this? Did you use a new user you created for testing or did you for example use an account with moderator permissions?

singi2016cn · March 2, 2026, 1:20am

A regular user at trust level 1 (trust_level_1), not an administrator or a moderator.

LotusJeff · March 10, 2026, 2:34am

Here is the custom instruction set I am using for spam detection. It is more detailed than the stock version, so it will use more tokens. What are others using for custom instruction sets for spam detection?

## Concise Spam Detection Instruction Set

You are a spam detection system reviewing forum posts.

Your task is to determine whether a post is primarily intended to promote, deceive, manipulate search rankings, distribute malicious links, or disrupt discussion — rather than genuinely participate in the community.

Evaluate:

* Post content
* Post type (REPLY or NEW TOPIC)
* Thread context (for replies)
* Site information

---

### Classify as Spam if the post:

* Promotes products, services, or external sites without meaningful engagement
* Contains suspicious, unrelated, or multiple promotional links
* Uses SEO-style keyword stuffing or repetitive patterns
* Appears automated, templated, or bot-generated
* Is irrelevant to the forum topic
* For REPLY posts: ignores the thread and injects unrelated content

Strong spam indicators include:

* Affiliate/referral links
* “Buy now,” discounts, or sales language
* Contact info unrelated to discussion
* Generic praise + link
* Copy-paste structure
* Nonsensical or AI-spun text

---

### Do NOT classify as spam solely because:

* The user is new
* English is imperfect
* The post is short
* The tone is enthusiastic
* A relevant product or supplier is mentioned in context

Legitimate signals include:

* Specific references to the thread
* Topic-relevant technical discussion
* Genuine questions
* Personal experience related to the forum subject

---

### Decision Rule

If the primary intent appears promotional, malicious, or disruptive → spam = true.
If the post meaningfully participates in discussion → spam = false.

When uncertain but multiple red flags are present, prioritize community safety.

---

### Output Format

Return valid JSON only:

{"spam": true or false, "reason": "Brief explanation (1–2 sentences)."}

Do not include additional commentary.
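
For anyone trying this instruction set against a model outside Discourse, the output format above is easy to validate. The sketch below is in Python and only checks the shape requested in the prompt. `parse_verdict` is a hypothetical helper, and nothing here describes how the Discourse AI plugin itself reads the model's reply.

```python
import json
from typing import Tuple


def parse_verdict(raw: str) -> Tuple[bool, str]:
    """Parse a reply of the form {"spam": <bool>, "reason": "<text>"}."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    spam = data.get("spam")
    reason = data.get("reason")
    if not isinstance(spam, bool):
        raise ValueError("'spam' must be true or false")
    if not isinstance(reason, str):
        raise ValueError("'reason' must be a string")
    return spam, reason.strip()


verdict = parse_verdict('{"spam": true, "reason": "Unrelated phone-number listing."}')
print(verdict)  # (True, 'Unrelated phone-number listing.')
```

Rejecting anything other than a real boolean for `spam` catches replies such as `"yes"` or `"true"` as strings, which a model may produce if it drifts from the requested format.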

LotusJeff · March 10, 2026, 4:04pm

There should be a report on the Admin->Plugin->AI->SPAM page that shows the details of the summary box. The summary box shows the number of posts scanned, spam detected, and false positives and negatives.

Does the detail report exist somewhere that I have not found?
Is there a Data Explorer query that provides the lower-level detail?

Thanks in advance.

Falco · March 10, 2026, 4:52pm

This Data Explorer query gives you all the details:

SELECT * FROM ai_spam_logs ORDER BY 1 DESC LIMIT 50
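
In this query, `ORDER BY 1 DESC` sorts by the first column of the result, so if that column is an increasing id the newest log rows come first. Here is a small Python illustration using an in-memory SQLite table. The three-column schema is a hypothetical stand-in and the real `ai_spam_logs` table may have different columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in schema; the real ai_spam_logs columns may differ.
conn.execute(
    "CREATE TABLE ai_spam_logs (id INTEGER PRIMARY KEY, post_id INTEGER, is_spam INTEGER)"
)
conn.executemany(
    "INSERT INTO ai_spam_logs (post_id, is_spam) VALUES (?, ?)",
    [(101, 0), (102, 1), (103, 0)],
)

# ORDER BY 1 refers to the first result column (id here), so the latest rows lead.
rows = conn.execute("SELECT * FROM ai_spam_logs ORDER BY 1 DESC LIMIT 2").fetchall()
print(rows)  # [(3, 103, 0), (2, 102, 1)]
```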
