Spam account scanner script

https://github.com/TannerFilip/discourse-spam-check

I’ll start off by saying, I’m not a great programmer. This is the first “real” tool I’ve written that’s (potentially) useful to people other than me. I’d love any feedback/criticism you have.

I’ve written a Python script that scans through the list of suspect and/or silenced users and lets you delete them if necessary. I ran it over on Mozilla’s Discourse and deleted a few dozen accounts - this was only after I deleted close to a hundred by hand.

There are a few things that seem pretty hacky, especially lines 174 to 191. As I said, I’d appreciate any feedback you might have, and would be happy to answer any questions!

11 Likes

Very cool! One thing you’ll want to do is be sure Akismet is enabled, as we recently (within the last 2-3 months) added a feature where the Akismet plugin will scan new user accounts for spammy stuff and flag them for you thanks to @Roman :clap:

Yes, completely human spam account signups – accounts that never post once, just set up an account with profile info and walk away forever – are indeed still a problem. The below is even after Akismet checking:

But bear in mind user profiles aren’t indexed at all, and new user profiles have seriously suppressed info… and our Akismet change helps tremendously.

Having a cleanup tool is still needed though!

7 Likes

I didn’t know that! I’ll have to talk to @LeoMcA to see if we want to enable that.

4 Likes

Suspect users are now sent to the Review Queue, and that change removed the suspect users list this script relied on. Since they're being pushed to manual review anyway, is this script still needed?

3 Likes

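Any progress on this?

Our community gets hit several times a day by spam/bot account signups. These accounts have 0 posts read, 0 topics viewed, and less than 1 minute of read time. It would be great to have an automatic removal feature that removes every account matching specific parameters.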
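To make the request concrete, here is a minimal Python sketch of the kind of filter described above: 0 posts read, 0 topics viewed, under 1 minute of read time. The `UserStats` record and its field names are hypothetical, not a Discourse API, and the sketch only selects candidates. Actually deleting anything would be a separate step and should probably stay behind manual review.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    # Hypothetical record; field names are illustrative, not a Discourse schema.
    username: str
    posts_read: int
    topics_viewed: int
    read_time_seconds: int


def looks_like_dormant_signup(user: UserStats, max_read_time_seconds: int = 60) -> bool:
    """True when the account matches all three criteria from the post above."""
    return (
        user.posts_read == 0
        and user.topics_viewed == 0
        and user.read_time_seconds < max_read_time_seconds
    )


def select_removal_candidates(users, max_read_time_seconds: int = 60):
    """Return only the accounts that match; nothing is deleted here."""
    return [u for u in users if looks_like_dormant_signup(u, max_read_time_seconds)]


if __name__ == "__main__":
    sample = [
        UserStats("bot123", posts_read=0, topics_viewed=0, read_time_seconds=12),
        UserStats("lurker", posts_read=40, topics_viewed=9, read_time_seconds=1800),
    ]
    for candidate in select_removal_candidates(sample):
        print(candidate.username)
```

Keeping the thresholds as parameters makes it easy to tighten or loosen the rule per community.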

Also, is there a CAPTCHA or similar plugin option to help filter out bots?

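If these accounts have no activity, they are harmless. They are not visible to other users (including in the public user list). Also, user profiles, regardless of trust level, are disallowed in robots.txt and do not show up in search engines.

In addition, inactive accounts are purged periodically; see the **clean up inactive users after days** setting ("Number of days before an inactive user (trust level 0 without any posts) is removed. To disable clean up set to 0.").

It is triggered by the CleanUpInactiveUsers Sidekiq job.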
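Read literally, the setting description amounts to a rule like the one below. This is only a Python paraphrase of that description, not the actual CleanUpInactiveUsers job. The function name, the parameters, and the use of a "last seen" timestamp as the measure of inactivity are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_purge_candidate(
    trust_level: int,
    post_count: int,
    last_seen_at: datetime,
    now: datetime,
    clean_up_after_days: int,
) -> bool:
    """Paraphrase of the setting text: TL0, no posts, inactive longer than N days."""
    if clean_up_after_days == 0:
        # A value of 0 disables clean up entirely.
        return False
    if trust_level != 0 or post_count != 0:
        return False
    # Assumption: "inactive" is measured from the last time the user was seen.
    return now - last_seen_at > timedelta(days=clean_up_after_days)


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    last_seen = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(is_purge_candidate(0, 0, last_seen, now, clean_up_after_days=14))
```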

1 Like

That doesn't disallow anything. robots.txt is just a polite suggestion, and it doubles as a signpost pointing in the right direction.
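A small Python sketch of why that is: the robots.txt check runs in the crawler, not on the server. The rules and the forum.example.com URLs below are made up for illustration and are not copied from any real site's robots.txt. A well-behaved client asks `urllib.robotparser` and skips the page; a badly behaved one simply never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules resembling "user profiles are disallowed".
ROBOTS_TXT = """\
User-agent: *
Disallow: /u/
"""


def polite_crawler_may_fetch(url: str, user_agent: str = "*") -> bool:
    """Return what a cooperative crawler would decide for this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    profile_url = "https://forum.example.com/u/some-user"
    topic_url = "https://forum.example.com/t/welcome/1"
    print(polite_crawler_may_fetch(profile_url))  # the polite crawler declines
    print(polite_crawler_may_fetch(topic_url))    # the polite crawler proceeds
    # Nothing on the server enforces either answer: a scraper that never
    # consults robots.txt can still request the profile URL directly.
```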

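It may be harmless, but in the past spammers have used these accounts to "age" their profiles before activating them, because they know we keep an eye on new accounts. Then, suddenly, a 3-month-old account starts trying to link to spam or attempts to phish users.

Personally, I'd like better tools to deal with these accounts before they become a problem, rather than waiting. Stronger tools to stop bot signups would help too.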

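Sure, it can still be a problem sometimes. I get a lot of spam, but so far I haven't seen a long-dormant spam account suddenly start posting.

If they do post spam, other users will flag it quickly anyway.

And you can still greatly shorten the time before inactive accounts are deleted.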