Spam account scanner script

https://github.com/TannerFilip/discourse-spam-check

I’ll start off by saying, I’m not a great programmer. This is the first “real” tool I’ve written that’s (potentially) useful to people other than me. I’d love any feedback/criticism you have.

I’ve written a Python script that scans through the list of suspect and/or silenced users and lets you delete them if necessary. I ran it over on Mozilla’s Discourse and deleted a few dozen accounts - this was only after I deleted close to a hundred by hand.

There are a few things that seem pretty hacky, especially lines 174 to 191. As I said, I’d appreciate any feedback you might have, and would be happy to answer any questions!

11 Likes

Very cool! One thing you’ll want to do is be sure Akismet is enabled, as we recently (within the last 2-3 months) added a feature where the Akismet plugin will scan new user accounts for spammy stuff and flag them for you thanks to @Roman :clap:

Yes, completely human spam account signups (accounts that never post once, just set up an account with profile info and walk away forever) are indeed still a problem. The below is even after Akismet checking:

But bear in mind user profiles aren’t indexed at all, and new user profiles have seriously suppressed info… and our Akismet change helps tremendously.

Having a cleanup tool is still needed though!

7 Likes

I didn’t know that! I’ll have to talk to @LeoMcA to see if we want to enable that.

4 Likes

Suspect users are now being sent to the Review Queue, which removed the suspect users list this script was using. As they’re being pushed to manual review, is this needed now?

3 Likes

Has there been any progress on this?

Our community is experiencing several spam/bot account signups per day that have 0 posts read, 0 topics viewed, and less than 1 minute of read time. It would be good to have an auto-remove function for all accounts matching certain selected parameters.
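To make the request concrete, here is a minimal sketch of the kind of filter described above. It is not part of Discourse or of the script linked at the top of this topic. The `Account` class, its field names, and the `select_for_review` helper are all hypothetical, and the extra "no posts" condition is an assumption added as a safety check. It only selects usernames for a human to look at; it does not delete anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    # Hypothetical record; these field names are assumptions, not an API.
    username: str
    posts_read: int
    topics_viewed: int
    read_time_seconds: int
    post_count: int


def looks_abandoned(account: Account, max_read_seconds: int = 60) -> bool:
    """True when the account shows no reading or posting activity at all."""
    return (
        account.post_count == 0  # assumption: never flag accounts that posted
        and account.posts_read == 0
        and account.topics_viewed == 0
        and account.read_time_seconds < max_read_seconds
    )


def select_for_review(accounts: List[Account], max_read_seconds: int = 60) -> List[str]:
    """Return the usernames matching the criteria, in input order."""
    return [a.username for a in accounts if looks_abandoned(a, max_read_seconds)]
```

Keeping the selection separate from any deletion step means the resulting list can be checked by a moderator before anything is removed.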

Also, is there an option for a Captcha or similar plugin to help filter bots?

If those accounts have no activity, they’re harmless. They are invisible to other users (including a public user list). And user profiles, regardless of their trust level, are disallowed in robots.txt and not visible in search engines.

Plus, inactive accounts are periodically cleaned up; see the "clean up inactive users after days" setting (“Number of days before an inactive user (trust level 0 without any posts) is removed. To disable clean up set to 0.”).

It’s triggered by the CleanUpInactiveUsers Sidekiq job.
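As a rough illustration of the rule that the setting description states, here is a sketch in Python. It is not the code of the actual job, which may apply further conditions. The function name, its parameters, and the use of a single `last_active` timestamp are assumptions made for the example.

```python
from datetime import datetime, timedelta


def is_cleanup_candidate(
    trust_level: int,
    post_count: int,
    last_active: datetime,
    now: datetime,
    clean_up_after_days: int,
) -> bool:
    """Apply the rule from the setting text: TL0, no posts, inactive for N days."""
    if clean_up_after_days == 0:
        # Per the setting description, 0 disables the cleanup entirely.
        return False
    if trust_level != 0 or post_count > 0:
        return False
    # Assumption: "inactive" is measured from one last-activity timestamp.
    return now - last_active > timedelta(days=clean_up_after_days)
```

Lowering the number of days makes idle trust level 0 accounts eligible for removal sooner.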

1 Like

That disallows nothing. robots.txt is only a polite suggestion, and at the same time it points crawlers in the right direction.

It may be innocuous, but in the past spammers have used these accounts to “age” their profile before activating it, knowing that we keep an eye on new accounts. Then suddenly an account from 3 months ago starts trying to link to whatever spam, or DMs users with phishing attempts.

Personally, I would like better tools to head those off before they become a problem rather than waiting. It would also help if we had stronger tools to prevent bots from signing up in the first place.

Sure, it can still be an issue sometimes. I experience a lot of spam, but so far I have seen no spam accounts suddenly posting after a long time.

If they posted spam, they would quickly be flagged by other users anyway.

And you can still drastically lower the duration after which an inactive account is deleted.