Blocking recent wave of spam

SailReal · July 3, 2024, 10:43am

We have been hit by a massive spam wave for days now. Others, like https://ask.learncbse.in/, seem to have given up for the moment.

I’m searching here just for one variant:

The content changes often, as do the email addresses and IPs, so blocking reduces the amount, but we haven’t found a real fix yet. For privacy reasons we do not want to send everything to Akismet.

If we blocked

AS55836: Reliance Jio Infocomm Limited
AS9498: Bharti Airtel Ltd.
AS45609: Bharti Airtel Ltd.
AS24560: Bharti Airtel Ltd.

we would be fine, but these networks could serve a large (or small) part of the Indian population.

j127 · July 3, 2024, 8:10pm

Have you tried adding certain words to Admin → Customize → Watched Words → Require Approval?

From your screenshot, I’d try adding these words:

cash
credit
money
loan
toll-free
customer care
care number
0779*
helpline
:point_left:

It can be slightly inconvenient for users, but I have Discourse send a webhook to a Firebase cloud function (free) that pings my phone in a Slack chat room, so I can often approve posts in moderation within 60 seconds from my phone, if I’m awake.
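A minimal sketch of that webhook-to-Slack relay, written in Python rather than as a Firebase function. The payload fields (`post`, `username`, `topic_id`, `post_number`, `raw`) and the `/t/<topic_id>/<post_number>` link format are assumptions about what Discourse sends and accepts, so check them against a real webhook delivery before relying on this. `build_slack_message` and `send_to_slack` are hypothetical helper names.

```python
import json
import urllib.request


def build_slack_message(event: dict, base_url: str) -> dict:
    """Turn an (assumed) Discourse post webhook payload into a Slack message."""
    post = event.get("post", {})
    # Assumed link format; adjust if your forum needs the slug in the URL.
    link = f"{base_url}/t/{post.get('topic_id')}/{post.get('post_number')}"
    excerpt = (post.get("raw") or "")[:200]  # keep the notification short
    author = post.get("username", "unknown")
    return {"text": f"Post by {author} needs review: {link}\n{excerpt}"}


def send_to_slack(webhook_url: str, message: dict) -> int:
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

The message building is kept separate from the network call so it can be tested without a Slack workspace.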

SailReal · July 3, 2024, 8:41pm

Thanks for the hint, but please check out https://ask.learncbse.in/ (it’s not my instance, but the posts are more or less the same as the ones I’m fighting) and scroll through the last few days: they use a ton of combinations and variations of each keyword. I’m in the process of writing a lot of regexes for each keyword because they insert a “.”, a “,” or a “|” everywhere, replace a “0” with an “O” or an “e” with a “3”, add a (so far) random character in the middle of a word, and so on. It is really difficult to fight this type of spam.
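One alternative to writing a regex per variant is to normalize the text first and then match plain keywords. The sketch below is only an illustration in Python, not something Discourse's watched words do: the substitution table and the keyword list are assumptions, and it does not handle a random extra letter inserted into a word.

```python
import re
import unicodedata

# Assumed look-alike substitutions; extend as new variants show up.
LOOKALIKES = str.maketrans({"0": "o", "3": "e", "1": "i", "5": "s", "@": "a", "$": "s"})

# Hypothetical keywords, written without spaces because spaces are stripped below.
# Very short words are left out on purpose: once spaces are removed,
# "hello and" contains "loan".
KEYWORDS = ["customercare", "carenumber", "helpline", "tollfree"]


def normalize(text: str) -> str:
    """Fold styled Unicode letters to ASCII, undo look-alikes, drop separators."""
    folded = unicodedata.normalize("NFKC", text).lower()
    folded = folded.translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", folded)  # removes ".", ",", "|", spaces, digits


def matched_keywords(text: str) -> list[str]:
    """Return every keyword found in the normalized text."""
    flat = normalize(text)
    return [keyword for keyword in KEYWORDS if keyword in flat]
```

For example, `matched_keywords("C.u,s|t0m3r C@re")` returns `["customercare"]`, while an ordinary support question returns an empty list. Because word boundaries are thrown away, false positives are possible, so this fits a "require approval" flow better than an outright block.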

Or if you do not want to click on a random link, here is a screenshot of the last few hours, but these are just the last few hours, they vary a lot over time:

JammyDodger · July 3, 2024, 8:49pm

Just to check, but do you use the min first post typing time admin setting? I find that quite useful for catching a lot of ours.

SailReal · July 3, 2024, 8:51pm

Yes, thanks for the hint, this is set, but it is not that hard for a bot to just wait a few minutes.

JammyDodger · July 3, 2024, 8:59pm

This spam seems like a different type to the AI based answers/content the other topic is focused on so I’ve split it out.

We do have a new AI-based tool for spam detecting which has proven to be quite effective:

SailReal · July 3, 2024, 9:10pm

Thanks for the tip, but setting up an LLM just to fight another spammer’s LLM for our Discourse is way too expensive for our use case.

As a spammer you can easily increase the cost for the org by just creating more users/posts, so depending on what you want to achieve, this could even be a motivation to create more posts.

anon82911141 · July 3, 2024, 9:24pm

Hi,

Have you tried using Akismet? Seems like their solution would work for you.

(free for personal use, not for commercial use - don’t know how you’d categorise yourself)

Firepup650 · July 3, 2024, 9:25pm

Perhaps requiring every user’s first post to be approved would help a bit here? That way at least they’d never make it onto the forum publicly, and as long as you don’t have a lot of real users signing up daily, I think it would help at least some.

SailReal · July 3, 2024, 9:43pm

Thanks for all the tips.

We did think about it, but we have a privacy and security product, which means we need to protect our users as much as possible. The content is public, of course, but the IP address, user agent, referrer and email are not, and if I understood Discourse Akismet correctly, these are transmitted to Akismet (I would also read the privacy policy, but the overview already gives enough information for the decision).

That would be an idea. With ~2 signups per day it shouldn’t be too much trouble. Waiting for approval is not the best experience, but if we explain it properly it might be the best option we have for now.

anon82911141 · July 3, 2024, 9:46pm

Yes, you are unfortunately correct - they do transmit some additional data to Akismet which may not align with your privacy policy. In that case, @Firepup650’s suggestion is the best one out there.

RGJ · July 3, 2024, 10:33pm

FYI my Geo Blocking plugin can deny access to Discourse based on the source AS network. Indeed a lot of this kind of spam seems to originate from those networks, especially AS45609.

If you don’t want to block half of India then it might be worth investigating how hard it would be to reuse some of the functionality in that plugin to add network or IP based rules to the approval logic (“require approval for new posts from networks”)
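As a rough illustration of the “require approval for new posts from networks” idea, here is a Python sketch. It is not the Geo Blocking plugin's code. The prefix table uses documentation-only address ranges as stand-ins, since the real prefixes announced by these AS numbers would have to come from a current routing or GeoIP dataset, and the `trust_level` parameter and function names are hypothetical.

```python
import ipaddress

# AS numbers named in this thread.
REVIEW_ASNS = {55836, 9498, 45609, 24560}

# Hypothetical prefix-to-ASN table. These are documentation ranges, NOT the
# real announcements of these networks.
PREFIX_TO_ASN = {
    "192.0.2.0/24": 45609,
    "198.51.100.0/24": 55836,
    "2001:db8::/32": 9498,
}

NETWORKS = [(ipaddress.ip_network(prefix), asn) for prefix, asn in PREFIX_TO_ASN.items()]


def asn_for(ip: str):
    """Return the ASN whose prefix contains the address, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for network, asn in NETWORKS:
        if address.version == network.version and address in network:
            return asn
    return None


def needs_approval(ip: str, trust_level: int) -> bool:
    """Queue posts from listed networks, but only for brand-new users."""
    if trust_level > 0:
        return False  # established users are never held back
    return asn_for(ip) in REVIEW_ASNS
```

Restricting the rule to trust level 0 is a design choice made for this sketch: it keeps existing users on those networks unaffected while new accounts still go through review.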

j127 · July 3, 2024, 11:36pm

I scrolled through many pages on that example site and think it might be possible to block nearly all of those with the watched words feature, if Discourse regex can work on Unicode ranges.

Regular users probably don’t use things like this:

2+ slashes in a row
unusual punctuation like ^ (unless it’s a math site)
uncommon Unicode ranges:
- ✓ (Dingbats)
- ∆ (Greek and Coptic)
- ❽, ➁, ❸, 3, ❷ (Dingbats)
- 𝘾, 𝙪, 𝙨, 𝙩 (Mathematical Alphanumeric Symbols)

ChatGPT could probably write a regex for those, if Discourse supports it.
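A sketch of such a pattern in Python, useful for trying it against sample posts offline. Whether Discourse's watched-words regex mode accepts the same syntax is not verified here. The block boundaries are the standard Unicode ones; the lookbehind that skips the `//` in URLs is an addition of this sketch, and `suspicious_fragments` is a hypothetical helper name.

```python
import re

SUSPICIOUS = re.compile(
    "["
    "\u2600-\u26FF"          # Miscellaneous Symbols
    "\u2700-\u27BF"          # Dingbats (contains ✓, ❽, ➁, ❸, ❷)
    "\U0001D400-\U0001D7FF"  # Mathematical Alphanumeric Symbols
    "]"
    "|(?<!:)/{2,}"           # 2+ slashes in a row, except right after "http:" or "https:"
)


def suspicious_fragments(text: str) -> list[str]:
    """Return each matched character or slash run, in order of appearance."""
    return [match.group(0) for match in SUSPICIOUS.finditer(text)]
```

For example, `suspicious_fragments("Call ❽❸ now")` returns `["❽", "❸"]`, and a post that only contains a normal `https://` link returns an empty list. Greek letters are left out on purpose, since blocking the whole Greek and Coptic block would also catch legitimate Greek text.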

One more idea is to try Cloudflare with the Bot Fight Mode feature (free) and challenge all bots.

SailReal · July 4, 2024, 8:18am

Oh, that would be the perfect solution, I will have a look into the code, thanks!

The problem here is that this bot somehow knows how Discourse works. In the following scenario I’m watching for ❽ in the “Require Approval” section. Those bots often first create a post with random text and then edit it into the actual content. Editing a post is not checked against the “Require Approval” list, see e.g.

VS

(here I added the ❽ directly during post creation)

which means our only option is to add it to the block section, but blocking too many words and characters can easily lead to problems where normal users get a confusing message when creating valid posts. I think this is where most of our problems come from. In my opinion, this is a bug: when a post is edited, the edited content should also be checked against the “Require Approval” list when the change is published.
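The behaviour being asked for can be modelled in a few lines of Python. This is a toy model of the idea, not Discourse's code: the watched list, the state names and both function names are made up for illustration.

```python
# Hypothetical watched entries; "❽" is the character from the scenario above.
WATCHED = ("❽", "helpline")


def hits(text: str) -> list[str]:
    """Return the watched entries that occur in the text."""
    lowered = text.lower()
    return [entry for entry in WATCHED if entry in lowered]


def review_state_on_create(raw: str) -> str:
    """Checked when a post is first created."""
    return "needs_approval" if hits(raw) else "published"


def review_state_on_edit(old_raw: str, new_raw: str) -> str:
    """Run the same check on edits, looking only at entries the edit introduced."""
    before = set(hits(old_raw))
    introduced = [entry for entry in hits(new_raw) if entry not in before]
    return "needs_approval" if introduced else "published"
```

With this model, a post created as random filler text is published, but editing it to include ❽ sends the edit to review. Comparing against the old content avoids re-queuing a post that was already approved with the same words in it.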

j127 · July 4, 2024, 8:54am

I guess watched words won’t help then. I haven’t had a spam attack from that yet but I’m worried about it because users started to figure it out.

j127 · July 13, 2024, 5:11pm

It looks like one of my forums just got hit by that same kind of spam attack. I don’t know if they used the editing trick, since I didn’t have the spam words on the watched words list yet.

juanjosegzl · July 13, 2025, 6:09pm

Hello everyone

I have a proof of concept of this, if you wanna take a look

RGJ · July 13, 2025, 9:29pm

Nice work @juanjosegzl , I would gladly accept this as a PR!

juanjosegzl · July 19, 2025, 2:33pm

thank you @RGJ I just opened a PR

Shelim · July 29, 2025, 5:24pm

Hello @juanjosegzl ,

Your last PR broke the plugin: it now asks everyone to confirm their posts via moderation (even the moderators themselves), regardless of their geolocation. Any workarounds or ETA on a fix?
