Tips for preventing spam

:bookmark: This documentation provides a comprehensive guide on preventing spam in Discourse forums, and includes information about various settings and tools designed to help maintain a spam-free community environment.

:person_raising_hand: Required user level: Administrator

On most forums spam is rare. However, if you’re having problems with spam on your site, Discourse comes with numerous tools to help you automatically prevent spam.

The following guide offers some recommendations on how you can help prevent spam, while still maintaining a positive and welcoming environment for your community.

Spam Detection with Discourse AI

AI Spam Detection is one of the best Discourse features for automated spam detection. Unlike other tools, it can automatically block users and posts based on preconfigured rules. AI Spam Detection is available to all users on Discourse hosting, and on self-hosted sites with an LLM configured.

Benefits of AI Spam Detection include:

  • Automation: No manual intervention is needed to block obvious spam.
  • Customizability: You can tailor it to your community’s unique requirements.
  • Scalability: Works well even when communities are under heavy spam attacks.
  • Broad compatibility: Free (on Discourse hosting) and budget-friendly LLMs like GPT-4, Claude 3.5, and Gemini Flash can handle spam detection effectively.

Setting up AI spam detection

:megaphone: This is now turned on by default for Starter and Standard customers.

Simply turn it on in Admin settings → plugins → AI → Spam Handling (details here).

By default it uses a prompt that Discourse has tailored for our sites, but you may add custom instructions specific to your site.

Example tailored prompt

:information_source: With Discourse AI you can also use the creative AI bot to generate tailored prompts that are specific to your site’s needs.

Default Trust Levels

The default trust level for new users on your site can be adjusted on the .../admin/site_settings/category/trust page; however, we recommend keeping the default trust level set to 0.

If you’ve modified the value of this setting, we strongly recommend changing it back to 0: new user, as changing this setting can put your site at serious risk for spam, due to the way that trust levels interact with Discourse’s spam related settings.

Spam Related Site Settings

:warning: Unless you are specifically having trouble with spam, we recommend keeping the following settings at their default values.

Discourse has several spam related site settings that you can access on your site’s .../admin/site_settings/category/spam page.

These settings can be adjusted to increase or decrease the sensitivity of spam detection, and the strictness of the consequences associated with posting spam.

The following are some of the more commonly adjusted spam related settings that have a notable impact on how spam is handled on a site.

The default values for all settings are shown below.

Hiding Posts

The hide post sensitivity and cooldown minutes after hiding posts settings control the likelihood that a flagged post will be automatically hidden by Discourse, and how long a user must wait before they can edit a flagged and hidden post.

Silencing New Users

Discourse has a num users to silence site setting, which will automatically silence a new user if they receive a certain number of spam flags.

By default this is set to 3, so you may want to consider lowering this if you’re consistently having problems with spam coming from the same user(s).
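The threshold idea above can be illustrated with a small model. This is only a sketch, not Discourse's implementation: it assumes flags are counted once per distinct flagging user, and the `NewUser` class and `record_spam_flag` function are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NewUser:
    """Hypothetical stand-in for a new account; not a Discourse model."""
    username: str
    spam_flaggers: set = field(default_factory=set)
    silenced: bool = False


def record_spam_flag(user: NewUser, flagged_by: str, threshold: int = 3) -> bool:
    """Record one spam flag and silence the user once `threshold`
    distinct users have flagged them (assumed counting rule)."""
    user.spam_flaggers.add(flagged_by)  # repeat flags from one user count once
    if len(user.spam_flaggers) >= threshold:
        user.silenced = True
    return user.silenced
```

In this model, lowering `threshold` from 3 to 2 means two different members flagging the same new account is enough to silence it, which reacts faster but also gives a small group of users more power over newcomers.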

Limiting Links

Discourse limits the number of posts a new user can make that contain links to an outside domain with the newuser spam host threshold setting. If new users on your site are frequently spamming links to the same domain, you may want to consider lowering the value of this setting.
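Before lowering the value, it can help to check how often the same host actually shows up in a suspicious user's posts. The sketch below is a rough offline approximation, not Discourse's own logic: it assumes each host is counted once per post and that a host becomes a problem when it reaches the threshold. The function names and the threshold value of 3 are illustrative.

```python
import re
from collections import Counter
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")


def hosts_in(post: str) -> set:
    """Return the set of lower-cased host names linked in one post."""
    hosts = set()
    for raw in URL_RE.findall(post):
        url = raw.rstrip(".,;:!?)\"'")  # drop trailing punctuation
        host = urlparse(url).hostname
        if host:
            hosts.add(host)
    return hosts


def hosts_over_threshold(posts: list, threshold: int = 3) -> list:
    """List hosts that appear in at least `threshold` of the given posts."""
    counts = Counter()
    for post in posts:
        counts.update(hosts_in(post))  # each host counted once per post
    return sorted(h for h, n in counts.items() if n >= threshold)
```

For example, running `hosts_over_threshold` over the raw text of one new user's posts shows which domains would trip a given threshold, so you can judge whether a lower value would also catch legitimate members who link to the same site repeatedly.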

Limiting IP Addresses

Discourse limits the number of new accounts a user can make from any given IP address. If you’re finding that problematic users on your site are repeatedly creating accounts to spam your site, you could consider lowering this from the default value.

There’s also a flag sockpuppets checkbox that you can enable to prevent users from creating multiple accounts and then commenting on the same topic.

Additionally, you can manually look up the IP addresses of problematic users on their admin page under the Last IP Address and Registration IP Address fields, and delete other accounts associated with the same IP address.

Or consider blocking IP addresses that spammers are using on the “Logs → Screened IPs” page (.../admin/logs/screened_ip_addresses).
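Before blocking a whole range, it is worth checking which known addresses the range would cover, so that legitimate members are not caught by it. The sketch below uses Python's standard `ipaddress` module for that offline check. It says nothing about how Discourse matches screened IPs internally, and `is_screened` is a hypothetical helper name.

```python
import ipaddress


def is_screened(ip: str, blocked: list) -> bool:
    """Return True if `ip` falls inside any entry of `blocked`.

    Entries may be single addresses or CIDR ranges such as "203.0.113.0/24".
    """
    addr = ipaddress.ip_address(ip)
    for entry in blocked:
        network = ipaddress.ip_network(entry, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

A typical use is to take the Last IP Address values of a few trusted members and confirm that `is_screened` returns `False` for each of them against the range you are about to block.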

Adjusting Flag Requirements

By default, a topic needs to be flagged by 5 unique users before Discourse will automatically suspend posting to that topic.

You can adjust the num flaggers to close topic site setting to raise or lower the number of flaggers required to suspend posting on a topic, and adjust the auto close topic sensitivity setting to change the likelihood that the topic in question will get automatically closed instead.

Watched Words

Watched Words are another great feature for helping block or limit posts that contain words, phrases, or URL links that spammers might be repeatedly using.

Consider adding some “Blocked” or “Silence” Words to your site if you’re finding that spammers are frequently using the same types of text in their posts.

For a more advanced use of Watched Words, you could also consider Using Regex with Watched Words.

Increase Trust Level Requirements

If you’re finding that spam is coming mainly from TL0 users, you may also want to adjust some of the trust level settings to make it harder to get to TL1.

hCaptcha Plugin

The Discourse hCaptcha plugin aims to enhance security and bot protection by integrating hCaptcha into the local sign-up form.

:sparkles: On all Discourse hosted sites, this plugin is automatically included.

Additional Steps

It’s important to understand why users are spamming your site. Are they bored, malicious, or looking to promote themselves?

Suggestions for dealing with The Difficult User, along with a variety of other moderation topics can be found in our Discourse Moderation Guide, so you may want to read through this guide for some additional ideas regarding moderating your site.

Outside of the above, ramping up your moderation team for the short term so that you have full coverage is another good approach to combatting spam. The key is to wear the problem users down so they get bored and move on.

If you’re continually having problems with spam after going through this guide, you could also consider placing all or some posts from new users into the review queue with the approve post count, approve unless trust level, or approve new topics unless trust level settings.

However, it’s important to make sure you have enough moderators on hand to handle this, as it has the potential to make it difficult for new users to start interacting with the site if posts go unapproved.
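If you only want approval switched on while an attack is in progress, the change can be scripted instead of being made by hand. The sketch below only builds the HTTP request and does not send it. It rests on assumptions that are not stated in this guide: that the admin API accepts `PUT /admin/site_settings/<name>.json` with `Api-Key` and `Api-Username` headers, and that the setting's internal name is `approve_post_count`. Verify both against your Discourse version before relying on it. `build_setting_update` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_setting_update(base_url: str, api_key: str, api_username: str,
                         name: str, value) -> Request:
    """Build (but do not send) a request that updates one site setting.

    Endpoint path and header names are assumptions; check them first.
    """
    body = urlencode({name: value}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/admin/site_settings/{name}.json",
        data=body,
        method="PUT",
        headers={
            "Api-Key": api_key,
            "Api-Username": api_username,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping the build step separate makes it easy to inspect the URL and body first, and to write a matching request that sets the value back to its default once the attack has passed.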

Last edited by @Saif 2025-03-13T15:11:05Z

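17 Likes

I can’t speak for all forums, but on a forum where I used to be TL3, there was always at least one spam post in the categories I followed when I first logged in for the day. On the forum where I’m currently a moderator, we get about 2 spam posts a day on average. So in my experience, spam is fairly common on a lot of forums.

5 Likes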

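One very useful regex is \d{3}-\d{4}|[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+, which blocks email addresses and phone numbers. Don’t forget to enable the “watched words regular expressions” setting under Settings → Posting.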
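A pattern like this is worth trying against sample text before adding it as a Watched Word. The sketch below runs the pattern from this post, written with single backslashes, through Python's `re` module. Discourse evaluates watched-word regexes with its own engine, so treat the result as an approximation. The case-insensitive flag and the `contains_contact_info` name are assumptions made for the illustration.

```python
import re

# The pattern from the post: a 3-4 digit phone fragment OR an email address.
PATTERN = re.compile(
    r"\d{3}-\d{4}|[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+",
    re.IGNORECASE,  # assumption: matching is case-insensitive
)


def contains_contact_info(text: str) -> bool:
    """Return True if the text contains something the pattern would catch."""
    return PATTERN.search(text) is not None
```

Note that the phone-number half also matches year ranges such as "2024-2025", since three digits, a hyphen, and four digits appear inside them. If your community writes ranges like that, the pattern will need tightening before use.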

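7 Likes

:wave:

I’ve been putting these tips to good use on my forum, so… thank you! :heart:

Is there a setting I can enable to send only new users who register with a gmail.com address to the review queue?

Currently I send all new users to the queue for review, but I’ve found that most spam accounts are created with gmail addresses. For me at least, sending only those to the review queue would reduce the load and the time spent reviewing :sweat_smile:

1 Like

@SaraDev do you know whether this is possible? I’d love to know too, as blocking IPs and specific domains would be very helpful!

1 Like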

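There is no core Discourse feature that sends only posts from users on a specific domain (e.g. gmail.com) to the review queue.

The closest related feature is the auto approve email domains site setting, which lets certain email domains bypass the manual user approval process by automatically approving users from those domains.

There are also the blocked email domains and allowed email domains settings, which provide a way to restrict or control who can register on your site based on email domain.

However, all of these settings require the must approve users setting to be enabled, and they only affect a user’s initial registration on the site, not the interaction between post creation and the review queue.

As a workaround, you could use groups to achieve something similar. For example, you could create a custom group, automatically add users who register with a specific email domain to that group, and then add the group to the approve unless allowed groups and approve new topics unless allowed groups settings.

With this setup, you can effectively bypass the review queue for users from specific domains while still sending other posts to the review queue as needed.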
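The logic of this workaround is easy to get backwards, so here is a minimal model of it. It is a sketch under assumptions, not Discourse code: `TRUSTED_DOMAINS` and both function names are hypothetical, and the model simply treats "email domain is in the list" as "member of the bypass group".

```python
# Hypothetical domains whose users are auto-added to the bypass group.
TRUSTED_DOMAINS = {"example.edu", "example.org"}


def email_domain(email: str) -> str:
    """Return the lower-cased part of the address after the last '@'."""
    return email.rsplit("@", 1)[-1].strip().lower()


def skips_review_queue(email: str, trusted_domains: set = TRUSTED_DOMAINS) -> bool:
    """Users from trusted domains are in the group and bypass the queue;
    everyone else, including gmail.com users, is still reviewed."""
    return email_domain(email) in trusted_domains
```

In other words, this is an allow list: you name the domains you trust rather than the domain you distrust. It reduces the review load only if a meaningful share of your legitimate users register from a known set of domains.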

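2 Likes

Hi, I’d like to know whether it’s possible to require a CAPTCHA when creating topics and/or posts?

I don’t know, but if a bot can get past the CAPTCHA at login, it can do the same when posting, so how would that help?

True, but there seems to be CAPTCHA support for sign-up, so I was wondering whether it also exists for topic/post creation.

Recently we’ve seen many customers hit by large spam attacks, and they all had one thing in common: they had opened one or more categories to “everyone” with “Create” permission, which bypasses all trust level restrictions.

To experienced Discourse admins this is obviously a bad idea, but not to less experienced ones. So it may be a good idea to spell out what is (to us) obvious and add it to the first post of this topic.

7 Likes

Recently we’ve been dealing with spammers who use automated sign-ups and then try to create new topics containing AI-generated content that looks like a genuine request for advice but includes Amazon affiliate links. They usually hide these links behind various URL shorteners. They are even able to reply to posts, and even to chat in PMs in an amusing way. Has anyone else run into this? Since these attempts appear to be fully automated, I wonder whether many other Discourse forums are being targeted. Do you have any strategy suggestions for getting rid of them?

1 Like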

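Hi @Overgrow

Here are some suggestions you can try to prevent this:

  • Set up spam detection in your community with Discourse AI - AI triage to detect this type of content
  • Add URL shortener services and Amazon affiliate link patterns to your Watched Words list
  • Lower newuser spam host threshold and raise the requirements for TL1
  • Reduce max new accounts per registration IP and enable flag sockpuppets
  • Use the Discourse hCaptcha plugin to help prevent automated spam/AI sign-ups on your site.
  • Consider placing everything new users post into the review queue until the attack subsides, by adjusting:
    • approve post count
    • approve unless trust level
    • approve new topics unless trust level

The approach here is similar to preventing spam in general, but with more focus on shortened URLs and AI-generated content.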
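To decide which shortener hosts and affiliate patterns are worth adding as Watched Words, it can help to scan a sample of the spam posts first. The sketch below is one way to do that offline. The host list, the parameter names, and the `suspicious_links` function are illustrative assumptions, not a list maintained by Discourse.

```python
import re
from urllib.parse import parse_qs, urlparse

# Illustrative values only; extend them from what you actually see.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}
AFFILIATE_PARAMS = {"tag", "ref", "affiliate"}
URL_RE = re.compile(r"https?://\S+")


def suspicious_links(post: str) -> list:
    """Return (url, reason) pairs for shortened or affiliate-tagged links."""
    findings = []
    for raw in URL_RE.findall(post):
        url = raw.rstrip(".,;:!?)\"'")  # drop trailing punctuation
        parsed = urlparse(url)
        host = (parsed.hostname or "").removeprefix("www.")
        if host in SHORTENER_HOSTS:
            findings.append((url, "shortener"))
        elif AFFILIATE_PARAMS & set(parse_qs(parsed.query)):
            findings.append((url, "affiliate parameter"))
    return findings
```

Hosts that turn up repeatedly in the findings are good candidates for “Blocked” or “Silence” Watched Words, while one-off hits are probably better left to the review queue.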

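For your situation, you could try an AI prompt aimed specifically at detecting AI-generated content, such as the following:

You are a spam detection system. Analyze the following content and context.

Note:
- Replies must stay relevant to the discussion thread.
- Flag content as spam if it is irrelevant, promotional, or automated.
- Treat content with links posted by new users as possible spam unless it is clearly relevant to the topic.

Watch for content that looks genuine but has unnatural patterns.
Look for odd phrasing, a mix of overly formal and casual language, or generic advice that does not quite fit the context.
Flag content containing hidden affiliate links, especially when the post seems designed to lead naturally to a product recommendation.

Pay particular attention to these red flags:
1. Content disguised as a genuine request for advice but containing promotional elements
2. Posts that introduce a problem and then recommend a specific product as the solution
3. The presence of URL shortening services (bit.ly, tinyurl, t.co, goo.gl, etc.), which may hide affiliate links
4. Amazon product links or references, especially with affiliate parameters (tag=, ref=, affiliate=)
5. Content that appears to seek recommendations but subtly steers toward specific products
6. Artificially polished text: overly formal language mixed with casual expressions or awkward structure
7. New accounts posting content that contains any of the patterns above

Reply only with "SPAM" or "NOT SPAM".

3 Likes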
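If you try a prompt like this outside Discourse, for example against a handful of known spam posts with an LLM of your choice, keep in mind that the string "NOT SPAM" contains "SPAM", so a plain substring check gets the answer backwards. The sketch below shows one way to read the verdict strictly. It is purely illustrative: `parse_verdict` is a hypothetical helper and says nothing about how Discourse AI handles the model's reply.

```python
from typing import Optional


def parse_verdict(reply: str) -> Optional[bool]:
    """Map a model reply to True (spam), False (not spam), or None (unclear).

    Compares the whole normalized reply, so "NOT SPAM" is never
    mistaken for "SPAM".
    """
    cleaned = reply.strip().strip("\"'.")           # drop quotes and a final period
    normalized = " ".join(cleaned.upper().split())  # collapse whitespace
    if normalized == "NOT SPAM":
        return False
    if normalized == "SPAM":
        return True
    return None  # anything else needs a human look
```

Returning `None` for unexpected replies keeps borderline cases visible, so they can be checked by hand instead of being silently counted as spam or as clean.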

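I’ve had a lot of trouble with bot accounts lately. I’ve had to disable new user registration for the second time. Yesterday I had to delete about 50 bot accounts, with about 30 spam posts between them. I’ve enabled hCaptcha and set it to a difficult challenge, but they haven’t stopped. I was on version 3.5.0, but updated to 3.6.0 right after the attack. We already don’t allow links at trust level 0 and require 30 posts before links can be posted, but these posts are just blocks of text about travel agencies and other random nonsense. There are also some AI accounts and posts that reference actual forum content but don’t quite make sense. Our users find these posts somewhat amusing. In any case, I don’t want to enable AI on the forum, but I feel like I’ve tried every other option. However, I get this message:

But I don’t see anywhere to add said configuration?

Most importantly, while AI might help deal with spam, I don’t think enabling AI can stop bot accounts from being created, or am I wrong?

1 Like

If approve post count is set to 1, is it still necessary to change these?

I really don’t know the answer to that question.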