CAPTCHA – eeeewwwww

Dave Shuck posted about making CAPTCHA easier on your users this morning.  While I can see the point, CAPTCHA sucks.  I can't even tell you how many times I go to post something somewhere, see a CAPTCHA image and hit the Back button.  It's not that it's terribly difficult to deal with usually, but the whole premise is flawed.  Though I can say I've had a couple occasions where it's taken several tries to get right, and I've got good eyes.

Dave makes a good point about capitalization, but my question is why the hell would you make a CAPTCHA case sensitive to begin with?  And it should NEVER contain 'l', 'L', '1', 'o', 'O', '0', or any other even remotely ambiguous characters.  And the characters should be friggin' HUGE so it's stupid easy to read for anything that has visual processing.  In other words, basic visual processing capabilities should be the ONLY requirement.
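
As a rough sketch of those rules in Python (the alphabet, the length, and the function names are illustrative assumptions, not taken from Dave's post or from any CAPTCHA library), a challenge could be drawn from a look-alike-free alphabet and checked without regard to case:

```python
import secrets

# Illustrative alphabet: lowercase letters and digits with the look-alikes
# removed (no i, l, o, s, z, 0, 1, 2, 5, 8). Exactly which characters count
# as ambiguous is a judgment call and depends on the font used in the image.
UNAMBIGUOUS = "abcdefghjkmnpqrtuvwxy34679"


def make_challenge(length: int = 5) -> str:
    """Pick a random challenge string from the unambiguous alphabet."""
    return "".join(secrets.choice(UNAMBIGUOUS) for _ in range(length))


def is_correct(challenge: str, answer: str) -> bool:
    """Compare case-insensitively and ignore stray surrounding whitespace."""
    return answer.strip().lower() == challenge.lower()
```

Rendering the string as a large image is a separate step and is left out here.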

While I'll admit to having had some spam problems on my blog, I've solved them without CAPTCHA.  It's not terribly difficult.  A combination of form obfuscation and content filters has basically erased all comment spam with zero user impact.  And note also that I use MovableType under the hood, which is a very common blogging platform and therefore has a large number of spammers specifically targeting it because it's so widely used.
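
For illustration only, a content filter of the general kind mentioned above might look like the following Python sketch. The patterns, the link limit, and the function name are hypothetical; this is not the actual filter running on this blog or anything built into MovableType.

```python
import re

# Hypothetical rules. A real filter would grow its pattern list from the
# spam that slips through, as described in the comments.
BLOCKED_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bviagra\b", r"\bonline casino\b")
]
MAX_LINKS = 3  # assumed threshold: legitimate comments rarely carry more links


def looks_like_spam(comment: str) -> bool:
    """Flag a comment that has too many links or matches a blocked pattern."""
    link_count = len(re.findall(r"https?://", comment, re.IGNORECASE))
    if link_count > MAX_LINKS:
        return True
    return any(pattern.search(comment) for pattern in BLOCKED_PATTERNS)
```

The commenter never sees any of this, which is the point: the check runs after submission and costs a legitimate user nothing.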

8 responses to “CAPTCHA – eeeewwwww”

  1. Peter J. Farrell

    Barney,

    While you may not like Captcha – the technology is here to stay until there is an international identification / authentication system in place. According to the research, Captchas are usually only solvable about 80% of the time for humans. Sometimes you have to guess, and computers just don't guess well.

    This is because OCR technology has gotten pretty good. For example, the Captcha on the PayPal signup page is 100% defeatable with consumer-grade OCR software. Many people have already defeated it.

    I think spammers are just not targeting you fully – yet. Matt Woodward did similar stuff like you have done to combat spam. After a while, the spammers were adapting to changes after just a few minutes.

    Lastly, I'll take a Captcha over having to create an account on somebody's blog to make a comment any day! I'll fill in a Captcha, but I won't create an account.

  2. Sami Hoda

    What do you guys think of Math Captchas? Like the one on Kurt Wiersma's blog?

  3. charlie griefer

    I just signed up for a forum today, and at the bottom there were two radio buttons. The one that was checked by default said something along the lines of "I am a bot and won't know enough to click the other radio button", and the second (unchecked) button had text that read, "I'm a human. Let me in".

    I'll admit, I thought that was a better alternative than trying to figure out some CAPTCHAs that I've seen. Although I don't profess to know how successful something like that is vs. a CAPTCHA system's success rate.

  4. Barney

    Peter, I know all about spambots hosing my blog. Initially, I had few problems, but there was a point when I was getting several thousand per day on a regular basis. Some of the simple obfuscation tricks (like renaming mt-comments.cgi) produced results like Matt's: it stopped the spam for a few hours. However, doing a JS-based rewrite seems to have almost completely cured it. Check the source for how it works. Wouldn't be hard to defeat, but like everything else in the world of security, it only has to be more difficult than it's worth.

    The handful that don't get foiled are mostly handled by content filters (which, I might add, have yet to kill a legit comment), and the occasional one that slips through both is immediately manually deleted and used to improve the content filter.

    I'll certainly agree that registering for a site (particularly if it requires email-based positive confirmation) is far worse than a CAPTCHA check. I can see the utility for "important" stuff (like PayPal), but for blog comments on most blogs, the benefits of computer posting aren't sufficient to endeavor to overcome simple, user-transparent security measures, which makes those measures equally effective.

  5. Barney

    Charlie,

    Another alternative that I considered (but didn't end up having to implement), is a session-based check to ensure the submission matches a form that's been rendered by the app. You could even do it without a session and just use form/URL variables, if you wanted.

    The trick is to make the computer have to jump through hoops to do what it wants. As soon as navigating the hoops isn't worth the benefits, you've won. A link on my blog comments that I'm going to delete anyway isn't worth much, so people don't try very hard.

  6. Ben Nadel

    Sami Hoda, I would definitely be interested in a math captcha. I am currently using a math de-spamming method on my site and it is working quite nicely. I do, however, have to get a bit complicated when obfuscating the display of the math. Check it out at http://bennadel.com/index.cfm?dax=blog:197.view

  7. Tuggle

    I am certainly up for correction on this, but I thought CAPTCHA was primarily developed for stopping automated signups on high-traffic sites or hacking/DoS-style attacks. When did "the average Joe" start thinking they should apply it to their blog comments? Comment spam bots are looking to hit the web in droves. So even having a field that says "Type the word 'happy' in the box" would stop a bot, because they're not going to write a custom submission just for your blog. Any question would work: "What color is the sky?", etc.

    So aside from the cool factor which wears off in about 7 minutes, I think CAPTCHA is like killing an ant with a nuke.

  8. Barney

    CAPTCHA is designed to prevent computers from doing what you only want humans to do. Whether it's logging into a bank, registering a domain, or posting a blog comment, the goal is the same.

    But you make exactly my point. Security needs to be more difficult to crack than the benefits of cracking it, but at the same time, it needs to be as transparent as possible. Hence I use biometrics and two separate computer-controlled keys to get in my datacenter, but I use stupid JS obfuscation to keep blog spammers out. Both are secure enough, and both are minimally intrusive.
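
The session-free check floated in comment 5 (make sure a submission matches a form the app actually rendered, carried in form/URL variables) could be sketched roughly as below. This is one possible reading of that idea in Python, not the mechanism used on this blog; `SECRET_KEY`, `MAX_AGE_SECONDS`, `issue_token`, and `verify_token` are all hypothetical names. The app would embed the token in a hidden field when rendering the comment form and verify it on submission.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to clients
MAX_AGE_SECONDS = 3600     # hypothetical lifetime of a rendered form


def _sign(entry_id: str, timestamp: str) -> str:
    """HMAC over the entry id and render time, so neither can be altered."""
    message = f"{entry_id}:{timestamp}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(entry_id: str, now: Optional[float] = None) -> str:
    """Create the hidden-field value when the form is rendered."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_sign(entry_id, timestamp)}"


def verify_token(entry_id: str, token: str, now: Optional[float] = None) -> bool:
    """Accept a submission only if its token was issued by us, recently."""
    timestamp, sep, signature = token.partition(":")
    if not sep or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    expected = _sign(entry_id, timestamp)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    age = (time.time() if now is None else now) - int(timestamp)
    return 0 <= age <= MAX_AGE_SECONDS
```

A bot can still defeat this by fetching the form first and replaying the token, so it is one more hoop rather than a wall, which fits the "more difficult than it's worth" argument made in the thread.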