Google just released “No CAPTCHA reCAPTCHA” as new spambot measure

(Erlend Sogge Heggen) #1

Seems like a suitable companion to go along with the Akismet Plugin to fight spam.

(Khoa Nguyen) #2

The Akismet plugin sends data to Akismet’s servers, which analyze it and respond to the client with whether or not it is spam.
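For context, the round trip looks roughly like this. This is a minimal sketch: the endpoint shape and field names follow Akismet’s public comment-check API, but the helper functions here are illustrative, not the actual plugin code.

```python
# Sketch of the Akismet round trip: the plugin POSTs the comment plus
# request metadata to Akismet's comment-check endpoint, and the server
# replies with the literal string "true" (spam) or "false" (ham).

def build_akismet_request(api_key, blog_url, user_ip, user_agent, content):
    """Assemble the endpoint URL and form fields for a comment-check call."""
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    payload = {
        "blog": blog_url,       # the site the comment was posted to
        "user_ip": user_ip,     # IP of the commenter, not of the server
        "user_agent": user_agent,
        "comment_content": content,
    }
    return url, payload

def is_spam(response_body):
    """Akismet answers with a bare "true" / "false" string."""
    return response_body.strip() == "true"
```

In the real plugin the POST happens server-side with whatever HTTP client the forum stack uses; only the request/response shape is shown here.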

(Jeff Atwood) #3

Almost all the spam we see is fairly definitively human entered so I don’t know how much this would help.

(cpradio) #4

That is definitely our experience. I’m not sure we’ve come across ANY that were scripted.

(Erlend Sogge Heggen) #5

Fair point. I can see it being very useful for traditional e-mail signup though. If you check off “I’m not a bot” during registration, the confirmation mail could be bypassed (still sent as a backup measure, but the account is already activated) and the user doesn’t even have to switch tabs before engaging with the forum. Derp, this is not safe.

I’m always happily surprised when I’m registering for a new website and they let me start doing things immediately after pressing the “Sign up” button.

(Jeff Atwood) #6

No. Email has to be verified regardless, or I could claim I control your email address in my sign up.

(Scott Trager) #7

Captcha doesn’t just help against automated spam bots.

It does 5 major things:

  1. It reduces automated registrations and comments. Go make yourself a random Drupal site/forum and add it to Google and see how quickly you start getting these - next, add a Captcha and watch the difference… Right now Discourse is still relatively small and unknown - when it becomes bigger and in use on more major sites it will become more of a target for these scripts.
  2. It prevents registration bombs.
  3. It makes it more painful for spammers to create multiple accounts. While it doesn’t stop them by any means, it does slow them down - you can create far fewer accounts in an hour if there is a captcha than if there isn’t one.
  4. It creates the IMPRESSION to your community that security is important to you. Sorta like the lock icon in your browser.
  5. It slows down attempts to poke at your system’s login. It’s especially effective against some forms of brute force attacks.

(Mittineague) #8

It also places a hurdle in front of legitimate users that may have problems getting past it, to the point of abandoning their attempt to become a member.

As @cpradio mentioned above, though we have had a good number of SPAM mill accounts, there has been no hard evidence of any bot account activity, save a very few incidents of flood posting that might have been automated.

As far as the impression of security, IMHO nothing works better than the absence (or at least the quick removal) of SPAM posts.

The days of xRumer bombing are a thing of the past.

(Scott Trager) #9

Make them optional, sure, but don’t exclude them! I’d go so far as to say there should be multiple Captcha options - include ReCaptcha 2, NCCap, SolveMedia, maybe a few lightweight basic ones (while not unbreakable, they do help some)… This way forum operators have a choice of what provider to use, if any at all.

It’s also extremely rare these days for users to get frustrated at captchas as they have become universal almost everywhere you go.

(Dan Porter) #10

I get frustrated at Captchas every time I come across them. My frustration has not been weakened by their prevalence.

I agree with the ‘operator choice’ you put forward, with the exception that No Captcha is supposed to fix the problem all the others face, and I personally see no reason to support them. Additionally, Discourse does not have notable trouble with spambots. This feature would be better suggested as a plugin. Maybe, if the time comes when most forums would benefit from it, that plugin could be rolled into the main release?

(Jeff Atwood) #11

We already do so much client checking in JavaScript (and remember, unlike ancient circa 1999 Drupal, we are a giant ball of JavaScript app) that automated registration has never been a practical issue. However, on busier / popular sites we do see a fair bit of 100% human account creation where the account is a profile spammer, e.g. will never post but adds random spam stuff to their profile pic, about me, etc.

None of the rationales you are listing really make sense in that context. “It slows them down” only in the sense that it slows everyone down. And it doesn’t seem very effective, if you are dealing with a human who has to spend an extra 10 seconds decoding a captcha versus zero seconds… how does that matter?

The cold, hard reality is that you need a whole different set of approaches when dealing with human spammers, as we almost exclusively are these days. There seem to be a LOT more humans out there spamming 24/7. The days of eliminating spambots and eliminating almost all spam are well and truly over. The bots lost. Now the humans are taking over.

One thing I was thinking might help is a custom question at signup that requires a simple answer only someone familiar with the site would know. For example if you are signing up for an account at @howtogeek it might ask, “name one writer at how to geek” where the answer (even if you don’t know) is easily obtainable by looking at the home page.
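A minimal sketch of such a site-specific challenge check follows; the question, the accepted answers, and the normalization are all hypothetical illustrations, not an actual Discourse feature.

```python
# Sketch of a site-specific signup challenge: accept any answer from a
# small whitelist the site operator maintains, compared case-insensitively.

# Hypothetical example answers for a "name one writer at the site" question.
VALID_ANSWERS = {"lowell heddings", "chris hoffman"}

def normalize(answer):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(answer.strip().lower().split())

def challenge_passed(answer):
    """True if the submitted answer matches any accepted answer."""
    return normalize(answer) in VALID_ANSWERS
```

The fuzziness of matching (exact string vs. partial match, spelling tolerance) is the design choice that decides how many casual signups this costs you.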

(Lowell Heddings) #12

I think if you asked a question like that you would pretty much eliminate a ton of casual contributors from your forum. I don’t know that it is a bad thing but definitely something to think about.

SEO spam is enormous and it has been getting worse for years. It is just that now there is so much money involved and people are so cheap that it is more effective to pay humans. These firms will use every trick imaginable to get their links listed somewhere. And as Google gets better it isn’t just about PageRank, but rather about hijacking PageRank for the clicks.

If you can get the right link in the right place you can literally make $1+ per click with some affiliate agreements. There is one article on my site with an affiliate link that consistently makes $150 for every 100 clicks. Imagine if you are a comment spammer hijacking a popular article and manage to get your link in there.

This is why I want stronger tools. The edit window is a big problem that keeps getting ignored. There is no way to even strip Amazon affiliate codes from posts, much less other ones. And that whole nofollow thing is just a giant target.

People that come and create one post with one link and don’t come back probably shouldn’t have that link indexed in Google, so links for TL1 and below shouldn’t be clickable for anons at all.

(Jeff Atwood) #13

Yeah, but once someone posts they are easier to filter out. The users who never post and use their profile for weird spammy-ish stuff is much harder to stop.

Even though we already exclude user profiles from robots.txt as a global rule, don’t show user page links to anons for TL0 users, don’t make links clickable at all on TL0 profile pages, and all that stuff. They just… keep… coming… like crazy. It’s really disturbing how many new accounts these profile spammer guys (again, all 100% human) create on a regular basis.

More specifically to the situations you described:

  • You can set your max edit window to a day if you’re worried about someone coming in 60 days later and editing a post from “Hello dear friends” to “link to spam (url)”.

  • You can turn off following links for TL3 if you like in site settings

So that’s all handled…

That’s a bit off topic for this topic, though, which in a general sense is about:

Should we be really worried about spambots?

To which the answer is

No, you should be worried a hell of a lot more about 100% real live human spammers today.

And in your specific case

Should I be more worried about betrayal by my TL3 users or new users?

The answer is definitely the new users. It is a massive investment in time and effort to get to TL3.

It’s also a lot easier to deal with on Discourse because you can’t just join, wait 60 days and bypass the new user sandbox. Nope, on Discourse you are a new user until you actually read a bunch of stuff as measured by JavaScript. One of the easiest ways to identify a new human spam account? New signup, zero or one topics entered, zero or one posts read. And if that same user has filled out their profile? 99% chance they are a profile spammer.

(The above heuristic is what the new “suspicious” tab on the users page does. It is extremely accurate.)
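That heuristic can be sketched as follows; the field names and thresholds here are assumptions for illustration, not the actual Discourse implementation.

```python
# Sketch of the "suspicious account" heuristic described above:
# a fresh signup with almost no reading activity but a filled-out
# profile is very likely a profile spammer.

def looks_suspicious(topics_entered, posts_read, profile_filled):
    """Flag accounts that barely read anything yet customized a profile.
    Thresholds (<= 1) are illustrative, not Discourse's real values."""
    barely_read = topics_entered <= 1 and posts_read <= 1
    return barely_read and profile_filled
```

A legitimate lurker trips the first condition but not the second; the combination is what makes the signal accurate.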

(Scott Trager) #16

While I do agree that human spammers are now the norm, I can also say that having a Captcha has made a world of difference for us. When we added the Captcha our spam levels dropped significantly while our regular traffic was unaffected. Mind you, this is with a VERY basic Captcha that has a MAJOR flaw (color-blind people can’t use it - we have an alternate way for these users to log in using a reCAPTCHA) and could be broken in about 5 minutes by anyone with any kind of technical savvy.

More importantly, attempts at breaking the login system all but vanished. The same thing happens with many of the forms on our main pages - without a Captcha, people (or bots) were hitting them all day long looking for vulnerabilities; with a Captcha the amount of this traffic slowed tremendously - all without any (noticeable) loss in users.

Our company is rather large [and is a large target as a result] and we use Captchas EVERYWHERE for just these very reasons… despite this we receive almost no complaints about them. Once in a blue moon an email will come in from a user who is having trouble with one, but almost never do people write in anymore saying “I hate your Captchas please don’t use them” or anything even remotely similar.

(Kane York) #17

Here’s the thing: automated Discourse registrations can’t be done by just submitting HTML forms; you need to do 3+ web requests per signup, minimum, more if your registration bot accounts for non-standard site settings (TOS acceptance required, SSO, local logins disabled…). Plus another two to confirm & activate the account. And don’t forget the requests-per-second/minute rate limits.

All those web requests take time, and if you leave any out, the registration will fail. So Discourse just plain isn’t getting those large quantities of automated registrations.
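As a rough sketch, a registration bot has to walk an ordered sequence like the one below. The step names are descriptive placeholders, not actual Discourse endpoints; the point is that a form-only bot that skips any step fails.

```python
# Sketch of the multi-request signup flow a registration bot must complete.
# Step names are illustrative labels, not real Discourse routes.

def required_signup_steps(tos_required=False, sso_enabled=False):
    """Return the ordered steps for one automated signup.
    Leaving any step out makes the registration fail."""
    if sso_enabled:
        # With SSO the bot's problem moves to the identity provider entirely.
        return ["authenticate_with_sso_provider"]
    steps = [
        "fetch_signup_page",       # obtain CSRF token and session cookie
        "fetch_challenge_fields",  # hidden/honeypot values the form expects
        "post_registration_form",
    ]
    if tos_required:
        steps.insert(2, "accept_tos")
    # Plus the two extra requests to confirm & activate the account:
    steps += ["receive_activation_email", "follow_activation_link"]
    return steps
```

Add per-IP rate limits on top of this sequence and bulk automated signups stop being economical.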

(Mittineague) #18

Agree that any kind of CAPTCHA is unnecessary at this time and would likely be ineffective and more harmful than beneficial.

If anything, I’d much prefer blocking anonymous proxies (the ones that can’t be traced back).
In my experience that was the top SPAM stopper, not a CAPTCHA

(Scott Trager) #19

You can’t do that because of China and other restrictive regions. Too many legitimate users use them to avoid their governments snooping :hushed:

(Yaron Oliker) #20

Hi Jeff

How can you say with such confidence that the spam is indeed human entered and not a bot using, for example, browser automation software like Selenium? To the best of my knowledge there is no way to tell the difference between the two. Also, being a ‘giant ball of JavaScript’ is not really an issue for a bot driving a full browser stack. I don’t know if spammers actually use this kind of technology, but I think we should be very careful in classifying interactions as human when these tools are out there…


(Jeff Atwood) #21

We can tell based on the timestamps. Bots are inhumanly fast; humans take time to copy, paste, or type. Also, only a human could really change profile images and make other in-depth profile customizations that are highly specific to Discourse…
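The timing signal can be sketched as a simple check on the gaps between consecutive events; the 2-second threshold is an illustrative guess, not a number from Discourse.

```python
# Sketch of timestamp-based bot detection: flag a session if any two
# consecutive actions (form fields, posts) land closer together than a
# human could plausibly type or paste.

def inhumanly_fast(event_timestamps, min_gap_seconds=2.0):
    """True if any consecutive pair of timestamps (in seconds) is
    closer than min_gap_seconds. Threshold is illustrative."""
    gaps = [b - a for a, b in zip(event_timestamps, event_timestamps[1:])]
    return any(gap < min_gap_seconds for gap in gaps)
```

A headless-browser bot can of course insert artificial delays, which is why timing is combined with behavioral signals like profile customization rather than used alone.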

(Mittineague) #22

FWIW I have been a moderator for some time (at least 3 years) and have seen literally tens of thousands of problem accounts.

After a while certain patterns emerge, and with experience bot accounts stand out like a sore thumb.

The closest thing to bot activity we’ve seen since the move to Discourse is extremely rare toe-to-heel registrations.

And that has now been taken care of.

Believe me, the days of xRumer “blasting” are a thing of the past.

Not to say that there aren’t plenty of human SPAMmers to contend with. :frowning: