And here’s an example of the same spammers getting caught here: https://meta.discourse.org/t/full-list-of-quickbooks-desktop-support-contact-numbers-a-complete-call-center-in-the-usa/380776 (it’s already hidden).
Great work on this feature. This is exactly how I like to see AI used.
Quick question: when a new TL0 user submits a reply or topic, is there a time delay while the content is scanned?
I see a short delay in the built-in tester (screenshot below), but when I post from a test account, there’s no similar pause. Is the live scan asynchronous after publishing, with the post hidden only if it trips a rule? (context: I’m using the OpenAI ChatGPT 5 API.)
For what it’s worth, AI > Spam & Stats increment as expected with the test account, so the post IS being scanned; it just isn’t introducing the same delay as the Test button does.
Thanks for the detailed thread. We have Discourse AI spam detection enabled on our instance, and one of the things that we’re seeing is the auto-silencing default when the first post made by an account is flagged.
I understand this is meant to silence one-shot spammers. However, it causes problems when we want to approve a flag without silencing the user: approving/accepting the flag leaves the user silenced. It would be good to have:
Here is the custom instruction set I am using for spam detection. It is more detailed than the stock version, so it will use more tokens. What are others using for custom instruction sets for spam detection?
Concise Spam Detection Instruction Set
You are a spam detection system reviewing forum posts.
Your task is to determine whether a post is primarily intended to promote, deceive, manipulate search rankings, distribute malicious links, or disrupt discussion — rather than genuinely participate in the community.
Evaluate:
Post content
Post type (REPLY or NEW TOPIC)
Thread context (for replies)
Site information
Classify as Spam if the post:
Promotes products, services, or external sites without meaningful engagement
Contains suspicious, unrelated, or multiple promotional links
Uses SEO-style keyword stuffing or repetitive patterns
Appears automated, templated, or bot-generated
Is irrelevant to the forum topic
For REPLY posts: ignores the thread and injects unrelated content
Strong spam indicators include:
Affiliate/referral links
“Buy now,” discounts, or sales language
Contact info unrelated to discussion
Generic praise + link
Copy-paste structure
Nonsensical or AI-spun text
Do NOT classify as spam solely because:
The user is new
English is imperfect
The post is short
The tone is enthusiastic
A relevant product or supplier is mentioned in context
Legitimate signals include:
Specific references to the thread
Topic-relevant technical discussion
Genuine questions
Personal experience related to the forum subject
Decision Rule
If the primary intent appears promotional, malicious, or disruptive → spam = true.
If the post meaningfully participates in discussion → spam = false.
When uncertain but multiple red flags are present, prioritize community safety.
Output Format
Return valid JSON only:
{"spam": true or false, "reason": "Brief explanation (1–2 sentences)."}
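If anyone wants to try an instruction set like this outside the built-in tester, here is a minimal Python sketch of checking that a model reply matches the output format above. This is only an illustration, not how Discourse AI processes responses internally; `parse_verdict` is a hypothetical helper name, and the sample reply is made up.

```python
import json


def parse_verdict(raw: str) -> tuple[bool, str]:
    """Validate a reply shaped like {"spam": <bool>, "reason": <str>}.

    Hypothetical helper for testing prompts; not part of Discourse AI.
    """
    # json.loads raises json.JSONDecodeError (a ValueError subclass)
    # if the model returned anything other than valid JSON.
    data = json.loads(raw.strip())
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    spam = data.get("spam")
    if not isinstance(spam, bool):
        raise ValueError("'spam' must be true or false")

    reason = data.get("reason", "")
    if not isinstance(reason, str):
        raise ValueError("'reason' must be a string")

    return spam, reason


# Made-up sample reply, for illustration only.
is_spam, why = parse_verdict('{"spam": true, "reason": "Generic praise plus an unrelated link."}')
```

Rejecting malformed replies outright, rather than guessing at the intent, makes it easier to spot when a longer instruction set causes the model to drift from the requested format.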