Spam bots tricking Discourse filter by editing

Some new spam bots have appeared that are smart enough to work around Discourse’s built-in spam filters. They first post a comment without any links, then come back later and edit the link in. Discourse doesn’t catch them this way. For example, the following revision:

I’ve experienced this too. The most insidious ones bury links in punctuation with their edits. Rather than generating clicks from the victim site, they seem mostly concerned with creating inbound links, and they are oblivious to the nofollow being applied to those links.

The other, more worrying trend is wiki edits. Unlike posts and post edits, these don’t appear in the user activity. I can only tell that it has happened because the user has received a wiki editor badge without ever posting a wiki post.

Is this spam bot TL1 or TL0?

I don’t see a link in that post. I just see text. Can you show raw?

I deleted the user, and I don’t remember the TL.

The links were like the following:

 <a href="https://shareit.onl/">shareit</a>  <a href="https://mxplayer.pro/">MX player</a>

or

<a href="https://messenger.red/">https://messenger.red/</a>  <a href="https://kodi.software/download/">https://kodi.software/download/</a>

or

 <a href="https://viamichelin.onl/">viamichelin</a> <a href="https://putlocker.ooo/">putlocker</a>

(The end of three different posts from the same user)
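
Since the pattern is a clean first revision followed by an edit that adds anchors like the ones above, one way to spot it is to compare the links found in two revisions of the same post. The following is only a minimal sketch: the function names are hypothetical, the regular expressions are a rough approximation rather than a full HTML parser, and nothing here is part of Discourse itself.

```python
import re

# Rough patterns (an assumption, not a complete HTML/URL parser):
# href="..." attributes, plus bare http(s) URLs in the raw text.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
URL_RE = re.compile(r'https?://[^\s<>"\')\]]+', re.IGNORECASE)


def extract_links(raw: str) -> set[str]:
    """Return every link target found in the raw text of one revision."""
    return set(HREF_RE.findall(raw)) | set(URL_RE.findall(raw))


def links_added(before: str, after: str) -> set[str]:
    """Return links present in the edited revision but not in the earlier one."""
    return extract_links(after) - extract_links(before)


before = "Thanks for your post!"
after = 'Thanks for your post! <a href="https://shareit.onl/">shareit</a>'
print(links_added(before, after))  # {'https://shareit.onl/'}
```

An edit that returns a non-empty set from a low-trust user is a candidate for manual review, not proof of spam.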

TL is critical for diagnosis here, because you can just disallow edits for TL0, which is fine. If the spam bot is smart enough to get to TL1 … well, we have a different problem.

It was able to make 3 posts and add 6 links without triggering the spam system, so I think it must have been TL1, but I might check it in a backup.

Honestly, these bots are really smart. They post a “thanks for your post” reply first. There is absolutely nothing suspicious about it; even their email address is similar to their user name. Only googling the email turns up results on spam lists, nothing else really.

They wait for you to approve the post. Only then do they activate their spam-posting bits.

These are not bots, they are humans. There has been a vast increase in human spammers in the last 8 years.

That’s been my impression too. It’s borne out in a variant of the technique described in the OP that we’ve seen. In this case the spammer “replies” to a comment and uses the Discourse quote feature to copy some of the other person’s text into their message. Then they insert their link into the copied block, thus making it look like the other user did it. Not sure if this is supposed to spoof the system into thinking the link is from someone of a higher trust level or what. Kinda stupid, really, but it definitely seems like something that had to be done manually, not by a bot. In particular, they don’t just drop the URL into the quoted text; they highlight some text and use the link tool, adding a further layer of disguise. We’ve seen a few of these over the last couple of months.

I’ve just noticed that spammers have been posting legitimate looking posts and then going back a few weeks later to insert links to things like [free netflix] and [tech news].

Is there any way to prevent the insertion of links in edits by anyone below, say, TL3? Even blocking the insertion of URLs in edits for anyone below TL4 would be fine.

Or has anyone found another way to stop it?

Is it possible to cause all edits by non-admins/non-moderators to bump the post? I think it would be good to see every edit. The human spammers are getting more sophisticated.

Edit: I’m looking at one spammer’s post, and it looks completely legitimate and on-topic. There is no clue that it’s a spammer, except for the injected links.

The first step is to tighten up your allowed editing interval by lowering the post edit time limit site setting from the default to something like one day. Unless your users regularly need to edit posts from weeks ago, you can close that in your site settings in about 15 seconds.

I’m changing the setting tonight, but I’m hoping that there is another way, since that would probably annoy some users. People tend to be more cautious about speaking freely if they know they can’t go back and edit things later. (I don’t post as often in forums that lock editing, and I’m generally less comfortable there.)

Ideally, I like unlimited editing windows, and every edit bumps the topic.

This is tricky, because post #12 could be edited in a 40-post topic. If we bump the topic for that, it would be incredibly surprising to see it. You would have to scroll through every post.

I think one alternative here is moderation tools that list all edits that happened beyond a certain threshold. But this would introduce a lot of extra overhead here.

Another alternative might be to give TL2 and up longer edit time limits.

Why? Coming back “weeks later” to edit something is highly anomalous. And you can make things wiki if you want to signal that they are especially editable. There’s a nice middle ground of “a few days” you can test first.

I’m going to dial down the default on this setting a bit now actually, from 60 days to 30 days, since the use case for coming back so much later to edit is increasingly absurd to me.

That might be useful.

For the moment, I’ve changed the trust levels required to add links and edit posts and made it a little harder to reach TL1.

The last spam post I saw wasn’t the usual obvious spammer — it was someone fully blending into the site, posting a thoughtful question like a regular user.

I’ll try to find the old spam by querying all edits done by TL0 users.

If one post were marked unread, wouldn’t it just add the blue dot next to the post and auto-scroll to it when a user visits that topic?

Sometimes people feel like they might have said something that they didn’t want to say, and they want to remove it. We’re living in a world where everything a person ever says can follow them around for the rest of their life, and it can lead to problems. People aren’t the same people for their whole lives, and they might not want their old self (or just an angry moment) to remain online forever. I tend to not speak as freely online in places where editing is limited.

I just remembered that there is a post webhook for “when there is a new reply, edit, deleted or recovered.” I didn’t check yet, but if I can get the action (“edited”) out of the header, then I can write a script to post those into an external dashboard for manual review. That would solve it on my site.
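
A rough sketch of that webhook filter might look like the following. It rests on assumptions that should be checked against a real delivery from your own instance: that the event name arrives in an `X-Discourse-Event` header with the value `post_edited`, that the body is signed with HMAC-SHA256 in an `X-Discourse-Event-Signature` header of the form `sha256=<hex>`, and that the JSON body carries the post under a `post` key. The function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Mapping, Optional


def signature_ok(secret: str, body: bytes, signature: str) -> bool:
    """Check the body against an assumed 'sha256=<hex>' HMAC signature."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    return hmac.compare_digest(expected, signature.encode("utf-8"))


def edited_post(headers: Mapping[str, str], body: bytes, secret: str) -> Optional[dict]:
    """Return the post payload for a verified edit event, else None."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if not signature_ok(secret, body, lowered.get("x-discourse-event-signature", "")):
        return None
    # Assumed event name for edits; new replies, deletions and recoveries are skipped.
    if lowered.get("x-discourse-event") != "post_edited":
        return None
    return json.loads(body).get("post")
```

Whatever `edited_post` returns can then be forwarded to the external dashboard; rejecting unsigned requests keeps anyone else from filling the review queue.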

If it’s over the 30 day (or 1 day, whatever you have it set to) limit, they could flag it for removal.

You may find the sibling topic Human-driven copy-paste spam informative if you’re not following it yet.

This form of spam only works because it’s invisible to moderators and the active community. That’s the only reason it’s happening. Perhaps all edits could bump the thread in the latest activity view — if the topic has already been read, then it’d be a direct link to that edited post. That would completely solve both issues (the spam and the worthless initial copy-paste content) in one fell swoop.

Even simpler (although not quite as effective), I know my fellow moderators and I would happily keep tabs on a special view that simply displayed posts that have been edited, sorted by their edit time (and perhaps optionally restricted by trust level).
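
To make the idea of such a view concrete, here is a minimal sketch of the filtering and sorting it would need. Everything in it is hypothetical: `EditedPost`, its fields, and `edit_review_queue` are illustrative names, not Discourse’s data model. The optional `min_delay` filter covers the case of edits made long after the original post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class EditedPost:
    # Hypothetical record of one edited post.
    post_id: int
    author_trust_level: int
    created_at: datetime
    edited_at: datetime


def edit_review_queue(
    posts: Iterable[EditedPost],
    max_trust_level: Optional[int] = None,
    min_delay: timedelta = timedelta(0),
) -> List[EditedPost]:
    """Return edited posts, most recent edit first.

    max_trust_level: if given, keep only authors at or below this level.
    min_delay: keep only edits made at least this long after posting.
    """
    selected = [
        p for p in posts
        if (max_trust_level is None or p.author_trust_level <= max_trust_level)
        and p.edited_at - p.created_at >= min_delay
    ]
    return sorted(selected, key=lambda p: p.edited_at, reverse=True)
```

For example, `edit_review_queue(posts, max_trust_level=1, min_delay=timedelta(days=7))` would list only edits by TL0 and TL1 users made a week or more after the original post.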

I think you are right, @sam: we need two site settings here, one for TL0 and TL1 and one for higher trust levels. Can you assign that next week? Should be easy.

I recommend allowed edit window settings of

  • TL0 and TL1 — 1 day
  • everyone else at TL2 and higher — 30 days (current default)
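
The two-tier rule above can be sketched in a few lines. This only illustrates the logic; the constant and function names are hypothetical and are not the names of any Discourse site setting.

```python
from datetime import datetime, timedelta

# Windows taken from the recommendation above; names are illustrative only.
LOW_TRUST_EDIT_WINDOW = timedelta(days=1)    # TL0 and TL1
HIGH_TRUST_EDIT_WINDOW = timedelta(days=30)  # TL2 and higher


def edit_window(trust_level: int) -> timedelta:
    """Return how long after posting a user at this trust level may edit."""
    if trust_level <= 1:
        return LOW_TRUST_EDIT_WINDOW
    return HIGH_TRUST_EDIT_WINDOW


def can_edit(trust_level: int, posted_at: datetime, now: datetime) -> bool:
    """True if the post is still inside the author's edit window."""
    return now - posted_at <= edit_window(trust_level)
```

With these values, a TL1 user who returns three weeks later to add a link is refused, while a TL2 user editing the same post is still allowed.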