


o Definition
o Background
o Types
o Applications
o Constructing CAPTCHAs
o Breaking CAPTCHAs
o Issues with CAPTCHAs
o Conclusion

CAPTCHA = Completely Automated Public Turing test to tell Computers and Humans Apart

Invented at CMU by Luis von Ahn, Manuel Blum, et al.

A CAPTCHA is a program that administers a challenge-response test to separate humans from computer programs.

Generic CAPTCHAs distort letters and numbers.
The distorted characters are presented to the user.
The user has to recognize the distorted characters.

If the characters entered are correct, the user is inferred to be human and allowed access.
Otherwise, the user is treated as a bot and denied access.

Humans can read the distorted, noisy text; current OCR programs cannot.

Why were CAPTCHAs needed?
o Sabotage of online polls
o Spam emails
o Abuse of free online accounts
o Tampering with rankings on recommendation systems (like eBay, Amazon)

AltaVista first used a crude CAPTCHA on its sites, which resulted in a 95% reduction in spam.
Yahoo partnered with CMU to counter these threats in its Messenger chat service.
Luis von Ahn and Manuel Blum of CMU trademarked CAPTCHA in 2000.

What is a Turing test?

o Proposed by Alan Turing to test a machine's level of intelligence
o A human judge asks questions of two participants, one of which is a machine; the judge doesn't know which is which
o If the judge can't tell which is the machine, the machine passes the test
o A CAPTCHA employs a reverse Turing test: the judge is the CAPTCHA program and the participant is the user
o If the user passes the CAPTCHA, the user is taken to be human; if the user fails, it is taken to be a machine

Types of CAPTCHAs
Text-based:
Simple, normal-language questions:
o What is the sum of three and thirty-five?
o If today is Saturday, what is the day after tomorrow?
o Which of mango, table, and water is a fruit?

These are very effective but need a large question bank. Cognitively challenged users find them hard.
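The simple-question type can be sketched in a few lines of Python. This is a minimal illustration, not a production generator: the `NUMBER_WORDS` table and the function names `make_sum_question` and `check_answer` are assumptions made for this example, and a real deployment would need the large question bank mentioned above.

```python
import random

# Hypothetical number-word table; a real question bank would be far larger.
NUMBER_WORDS = {
    2: "two", 3: "three", 5: "five", 8: "eight",
    12: "twelve", 35: "thirty-five",
}


def make_sum_question(rng):
    """Return (question_text, expected_answer) for two randomly chosen numbers."""
    a, b = rng.sample(sorted(NUMBER_WORDS), 2)
    question = f"What is the sum of {NUMBER_WORDS[a]} and {NUMBER_WORDS[b]}?"
    return question, str(a + b)


def check_answer(expected, given):
    """Compare answers after trimming whitespace and ignoring case."""
    return given.strip().lower() == expected.strip().lower()


if __name__ == "__main__":
    question, answer = make_sum_question(random.Random())
    print(question)
```

Spelling the numbers out as words, as in the slide's example, means a bot has to parse natural language rather than just evaluate digits.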

Gimpy:
o Designed by Yahoo and CMU
o Picks 10 random words from a dictionary, distorts them, and fills the image with noise
o The user has to recognize at least 3 of the words
o If the user is correct, he or she is admitted
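The pass rule described above (10 words shown, at least 3 recognized) can be sketched as follows. Only the selection and scoring logic is shown; the image distortion and noise are left out. The function names and the `required` parameter are assumptions for this sketch.

```python
import random


def pick_challenge_words(dictionary, count=10, rng=None):
    """Pick `count` distinct random words from the dictionary."""
    rng = rng or random.Random()
    return rng.sample(dictionary, count)


def passes_gimpy(shown_words, typed_words, required=3):
    """Pass if at least `required` distinct shown words were typed correctly."""
    shown = {w.lower() for w in shown_words}
    typed = {w.strip().lower() for w in typed_words}
    # Sets are used so that typing the same word twice counts only once.
    return len(shown & typed) >= required
```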

EZ-Gimpy:
o A modified version of Gimpy
o Yahoo used this version in Messenger
o Shows only 1 random string of characters
o The string is not a dictionary word, so it is not prone to a dictionary attack
o Not a good implementation; it has already been broken by OCR
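Generating a challenge string that is not a dictionary word might look like the sketch below. It covers only the string generation, not the rendering or distortion. Excluding look-alike characters such as `0` and `O` is an assumption added for usability, and the names `ALPHABET`, `AMBIGUOUS`, and `random_challenge` are hypothetical.

```python
import secrets
import string

# Assumption: drop characters that are easy to confuse once distorted.
AMBIGUOUS = set("0O1lI")
ALPHABET = [c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS]


def random_challenge(length=6):
    """Return a random string drawn from ALPHABET using a secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The `secrets` module is used instead of `random` so that the next challenge cannot be predicted from earlier ones.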

MSN's Passport service CAPTCHAs:

o Provided for Microsoft's MSN services
o Use 8 characters
o Warping is used to distort the characters
o Very strong implementation; hasn't been broken
o Segmentation-resistant

Graphic-based CAPTCHAs:

Bongo:
o Named after M. M. Bongard, a pattern recognition expert
o The user has to solve a pattern recognition problem
o The user has to identify the characteristic that distinguishes two sets of figures
o The user then tells which set a given figure belongs to

PIX:
o Uses a large database of labelled images
o Shows a set of images; the user has to recognize the feature common to them
o E.g., asked to pick the characteristic common to four pictures, the answer might be "Aeroplane"
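An image-label challenge of this kind can be sketched as below. The `IMAGE_DB` contents, the file names, and the multiple-choice presentation are all assumptions for illustration; the source only says that a labelled image database is used and that the user names the common feature.

```python
import random

# Hypothetical labelled-image database: label -> image file names.
IMAGE_DB = {
    "aeroplane": ["plane1.jpg", "plane2.jpg", "plane3.jpg", "plane4.jpg", "plane5.jpg"],
    "flower": ["rose.jpg", "tulip.jpg", "daisy.jpg", "lily.jpg"],
    "dog": ["dog1.jpg", "dog2.jpg", "dog3.jpg", "dog4.jpg"],
}


def make_pix_challenge(db, rng, images_per_challenge=4, choices=3):
    """Pick one label, show several of its images, and offer label options."""
    label = rng.choice(sorted(db))
    images = rng.sample(db[label], images_per_challenge)
    distractors = rng.sample([l for l in sorted(db) if l != label], choices - 1)
    options = distractors + [label]
    rng.shuffle(options)
    # "answer" stays on the server; only images and options go to the user.
    return {"images": images, "options": options, "answer": label}
```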

Audio CAPTCHAs:
o Consist of a downloadable audio clip
o The user listens and enters the spoken word
o Help visually disabled users
o Example: Google's audio-enabled CAPTCHA
o Not popular

Applications

o Protect online polls
o Prevent Web registration abuse and protect passwords from brute-force attacks
o Prevent comment spam and spam emails
o E-ticketing: prevent scalping

Verify digitized books: reCAPTCHA

o Used in the Google Books project
o Two words are shown; the program knows the first word
o If the user enters the first word correctly, the program assumes that the second, unknown word has also been entered correctly
o The second word becomes known
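The two-word logic can be sketched as follows. The class name `TwoWordPool` and the word identifier `scan-17` in the usage note are hypothetical. The `votes_needed` threshold is an assumption added here so that a single user's guess is not trusted on its own; the slide itself only says the second word becomes known once the first is entered correctly.

```python
from collections import Counter, defaultdict


class TwoWordPool:
    """Tracks user transcriptions of unknown words (illustrative sketch)."""

    def __init__(self, votes_needed=3):
        self.votes_needed = votes_needed      # assumption: agreement threshold
        self.votes = defaultdict(Counter)     # unknown word id -> guess counts
        self.known = {}                       # unknown word id -> accepted text

    def submit(self, control_answer, control_typed, unknown_id, unknown_typed):
        """Return True if the user passes; record the guess for the unknown word."""
        if control_typed.strip().lower() != control_answer.lower():
            return False                      # control word wrong: reject, no vote
        guess = unknown_typed.strip().lower()
        self.votes[unknown_id][guess] += 1
        if self.votes[unknown_id][guess] >= self.votes_needed:
            self.known[unknown_id] = guess    # the second word becomes known
        return True
```

For example, with `votes_needed=2`, two users who both pass the control word and both type "overlooks" for the word `scan-17` cause that transcription to be accepted.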

Help advance AI knowledge

CAPTCHAs are called Hard-AI problems. A win-win scenario:

If a CAPTCHA is broken by a bot, a Hard-AI problem is solved.
If it's not yet broken, the current implementation is able to withstand attacks.

Thus AI knowledge is advanced if CAPTCHAs are broken.

Constructing CAPTCHAs
Things to keep in mind:

o Don't store the CAPTCHA solution in the Web page's metadata
o A CAPTCHA is no good if it doesn't distort
o Need a large database of different CAPTCHA questions
o Avoid repetition of questions
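One way to keep the solution out of the page is to hold it on the server and send the browser only an opaque token. The sketch below assumes an in-memory dictionary as a stand-in for real session storage; the class name `CaptchaStore`, the function `render_form`, and the form field name `captcha_token` are hypothetical.

```python
import secrets


class CaptchaStore:
    """In-memory stand-in for server-side session storage (an assumption)."""

    def __init__(self):
        self._answers = {}

    def issue(self, answer):
        """Remember the answer server-side and return an opaque random token."""
        token = secrets.token_urlsafe(16)
        self._answers[token] = answer
        return token

    def verify(self, token, typed):
        """Check the typed answer; each token can be used only once."""
        expected = self._answers.pop(token, None)
        if expected is None:
            return False
        return secrets.compare_digest(
            expected.lower().encode(), typed.strip().lower().encode()
        )


def render_form(token):
    """The page carries only the token, never the solution itself."""
    return f'<input type="hidden" name="captcha_token" value="{token}">'
```

Removing the token on first use also means a solved challenge cannot be replayed, which supports the advice to avoid repeating questions.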

Generate the question

Persist the correct answer

Present the question to the user

Evaluate the answer; if it is incorrect, start again and generate a different CAPTCHA

If it is correct, allow the user access
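The steps above can be sketched as a single loop. The `generate` and `present` callables are placeholders for whatever produces a question and collects the user's reply, and the `max_attempts` limit is an assumption added so the loop terminates; the slides only say to start again with a different CAPTCHA.

```python
def captcha_gate(generate, present, max_attempts=3):
    """Run the generate / persist / present / evaluate cycle.

    generate() -> (question, answer); present(question) -> user's reply.
    Returns True if access should be allowed.
    """
    for _ in range(max_attempts):
        question, answer = generate()   # generate the question
        persisted = answer              # persist the correct answer (server side)
        reply = present(question)       # present the question to the user
        if reply.strip().lower() == persisted.strip().lower():
            return True                 # correct: allow access
        # incorrect: loop again, which generates a different CAPTCHA
    return False
```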

Embeddable CAPTCHAs:
o Freely available; just embed the code into the Web page's HTML, from e.g.,
o No maintenance

Custom CAPTCHAs:
o Fit the theme of the page
o Better protected from spammers
o Can be written in any language: Perl, .NET, ASP, JavaScript

o Accessibility
o Image security
o Script security
o Security after widespread adoption
o Custom implementation or a general CAPTCHA?

Issues with CAPTCHAs

Usability issues:
o W3C guidelines call for the Web to be accessible to all people
o Some CAPTCHAs are inaccessible to visually impaired and cognitively challenged people

Compatibility issues:
o JavaScript may need to be enabled in the browser
o Some CAPTCHAs may need the Adobe Flash plugin installed

Conclusion

o CAPTCHAs are an effective way to counter bots and reduce spam
o They serve a dual purpose: they also help advance AI knowledge
o Applications vary from stopping bots to character recognition and pattern matching
o Some issues with current implementations represent challenges for future improvements

