Thursday, July 31, 2008

Another Captcha Idea (Identify/Captcha)

With the recent failures of the captchas used by Gmail and other major mail service providers, it has become apparent that captchas need to be improved. I pondered this for a while and came up with the identify/captcha idea: keep the standard captcha text, but also include a background image containing items the user has to identify. Here is a mockup of what I am talking about:





A background that is something other than whitespace would trip up the bot makers for a while, but the real kicker is the “What is in the picture?” bit. Having 6 check boxes gives 2^6 = 64 possible answers, so a bot guessing at random has a 1/64 chance of getting it right. If the bots get the text right 1/3 of the time, that is a 1/192 chance of a bot answering both parts correctly. Sounds like good odds to me.
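Here is that arithmetic as a quick Python sketch. It assumes the bot guesses every check box at random and that its text-reading success is independent of the check box guess; the function name `guess_odds` is just something I made up for illustration.

```python
from fractions import Fraction


def guess_odds(num_checkboxes: int, text_success_rate: Fraction) -> Fraction:
    """Chance that a bot passes both the check boxes and the text.

    Assumes a uniformly random guess over every check box combination
    and independence between the two parts of the captcha.
    """
    # Each box is either checked or unchecked, so n boxes give 2**n answers.
    checkbox_combinations = 2 ** num_checkboxes
    return Fraction(1, checkbox_combinations) * text_success_rate


if __name__ == "__main__":
    print(guess_odds(6, Fraction(1, 3)))  # prints 1/192
```

Adding a seventh check box would halve the odds again, to 1/384.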

Now let’s think about how to break this system, the way the bot makers would. They could easily create a database of pictures and correct answers. To foil that, we need a very large collection of pictures to make the database unmanageable, and we need to distort the pictures so that any hashing attempt fails. The rest is left as an exercise for the reader.
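To illustrate the hashing point, here is a small sketch of why even a tiny distortion breaks an exact-match lookup. The "image" is just a stand-in byte string, and the bot's answer database is a hypothetical dict keyed by SHA-256 digest. This only covers exact hashes; a bot that compares images by similarity is a different problem.

```python
import hashlib


def fingerprint(pixels: bytes) -> str:
    """Exact-match fingerprint of raw pixel data (SHA-256 hex digest)."""
    return hashlib.sha256(pixels).hexdigest()


# Stand-in for an 8x8 grayscale image: 64 bytes of made-up pixel values.
original = bytes([10, 20, 30, 40] * 16)

# The bot's hypothetical database: fingerprint -> items in the picture.
known_answers = {fingerprint(original): ["cat", "tree"]}

# Distort a single pixel by the smallest possible amount.
distorted = bytearray(original)
distorted[5] ^= 1
distorted = bytes(distorted)

if __name__ == "__main__":
    print(fingerprint(original) in known_answers)   # the undistorted image is found
    print(fingerprint(distorted) in known_answers)  # the distorted one is not
```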

Now all that is needed is for me to convince Google to start using this.

EDIT:

My brother pointed out that relying on a database of images is pretty much unacceptable, because image recognition can eventually defeat any distortion. So to make this idea better, the image should be a 3D rendering of the objects with random textures, positions, angles, and lighting, so that you can generate a virtually unlimited number of images. This would force the bot makers to actually do object recognition. This approach is more processor intensive, but you no longer have to create or store the database.
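A rough sketch of how the random scene and its answer key could be generated together. Everything here is an assumption for illustration: the object and texture names, the coordinate ranges, and the `make_challenge` function are made up, and the actual 3D rendering step is left out entirely.

```python
import random
from dataclasses import dataclass

# Hypothetical catalog: one object type per check box, plus some textures.
OBJECT_TYPES = ["chair", "lamp", "ball", "cup", "book", "shoe"]
TEXTURES = ["wood", "metal", "checker", "marble", "fabric"]


@dataclass
class Placement:
    kind: str
    texture: str
    position: tuple      # (x, y, z) in arbitrary scene units
    rotation_deg: tuple  # rotation around the x, y, z axes


def make_challenge(rng: random.Random):
    """Build a random scene description and the matching check box answers."""
    # Each object is independently present or absent, so all 2**6 = 64
    # check box combinations are equally likely.
    present = [kind for kind in OBJECT_TYPES if rng.random() < 0.5]
    placements = [
        Placement(
            kind=kind,
            texture=rng.choice(TEXTURES),
            position=tuple(rng.uniform(-1.0, 1.0) for _ in range(3)),
            rotation_deg=tuple(rng.uniform(0.0, 360.0) for _ in range(3)),
        )
        for kind in present
    ]
    light = {
        "azimuth_deg": rng.uniform(0.0, 360.0),
        "elevation_deg": rng.uniform(15.0, 75.0),
    }
    # The answer key stays on the server; only the rendered image is sent.
    answer_key = {kind: kind in present for kind in OBJECT_TYPES}
    return placements, light, answer_key


if __name__ == "__main__":
    # A real deployment would want random.SystemRandom() rather than a seed.
    placements, light, answer_key = make_challenge(random.Random(2008))
    print(answer_key)
```

Because the answer key is produced at the same time as the scene, nothing needs to be labeled by hand or stored between requests.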


1 comment:

DanStory said...

I've never been a big fan of CAPTCHA, mostly due to the annoyance it can cause for a guest/visitor. Solutions for detecting bot form submissions should aim to avoid interrupting the guest/visitor as far as possible. One thing that bots don't do is render/process the page the form is on (HTML/CSS/JS). With nearly every browser supporting JavaScript, CSS, and sessions (cookies), these technologies could be used to help reduce (or completely stop) bot form submissions. CodeViewer.org (my website) first prevented bot posts by making form submission a two-step process (preview->finish). After some time, of course, they got around it. Just three days ago, I implemented more bot detection: the client solves a math problem (using JavaScript), and the server checks the answer against the stored answer for that form session. If the answers don't match, then either the browser doesn't support JavaScript (or has it disabled) or the submitter was a bot that can't (or doesn't) execute the JavaScript.
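A minimal sketch of what the server side of such a check might look like, written in Python. This is not CodeViewer.org's actual code: the session is a plain dict standing in for whatever session store the site uses, and the `bot_check` field name and both function names are hypothetical.

```python
import hmac
import secrets


def new_challenge(session: dict) -> str:
    """Store the expected answer in the session and return the JavaScript
    that a real browser would run to fill in a hidden form field."""
    a = secrets.randbelow(90) + 10  # 10..99
    b = secrets.randbelow(90) + 10  # 10..99
    session["expected_answer"] = str(a * b + a)
    # "bot_check" is a hypothetical id for a hidden input on the form.
    return f"document.getElementById('bot_check').value = String({a} * {b} + {a});"


def verify_submission(session: dict, submitted) -> bool:
    """Compare the submitted answer with the one stored for this form session."""
    # pop() makes each challenge single-use, so a replayed form fails.
    expected = session.pop("expected_answer", None)
    if expected is None or submitted is None:
        return False
    return hmac.compare_digest(expected.encode(), str(submitted).encode())


if __name__ == "__main__":
    session = {}
    print(new_challenge(session))
    print(verify_submission(session, session.get("expected_answer")))
```

A bot that posts the form without executing the script sends an empty or wrong `bot_check` value and gets rejected, but one that bothers to evaluate the expression will pass.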

The thing is that someone devoted enough, resourceful enough, what have you, will find some way to bypass/fake the form input bot detection.