
ARTIFICIAL INTELLIGENCE
- ABOUT CAPTCHA

Presented By

G.SindhuPallavi
Roll no: 07AT1A0519
IV B. Tech C. S. E
Email Id: gundasindhu19@gmail.com

E.JayaLakshmi
Roll no: 07AT1A0511
IV B. Tech C. S. E
Email Id: jaya4lakshmi@gmail.com


Contents

1. Introduction
2. History
3. Characteristics
4. Applications
5. Guidelines
6. How to break a CAPTCHA
7. Protection by enhanced CAPTCHAs
8. Conclusion
9. References

INTRODUCTION:

A CAPTCHA is a type of challenge-response test used in computing to ensure
that the response is not generated by a computer. The process usually involves one computer (a
server) asking a user to complete a simple test which the computer is able to generate and grade.
Because other computers are assumed to be unable to solve the CAPTCHA, any user entering a
correct solution is presumed to be human. Thus, it is sometimes described as a reverse Turing
test, because it is administered by a machine and targeted to a human, in contrast to the standard
Turing test that is typically administered by a human and targeted to a machine. A common type
of CAPTCHA requires that the user type letters or digits from a distorted image that appears on
the screen. The term CAPTCHA stands for Completely Automated Public Turing Test To Tell
Computers and Humans Apart.

CAPTCHAs must satisfy three basic properties. The tests must be:

– Easy for humans to pass.
– Easy for a tester machine to generate and grade.
– Hard for a software bot to pass.

HISTORY:

Moni Naor was the first person to theorize a list of ways to verify that a request
comes from a human and not a bot. Primitive CAPTCHAs seem to have been developed in
1997 by Andrei Broder, Martin Abadi, Krishna Bharat, and Mark Lillibridge to prevent bots
from adding URLs to their search engine. In order to make the images resistant to OCR (Optical
Character Recognition), the team simulated situations that scanner manuals claimed resulted in
bad OCR. In 2000, Luis von Ahn and Manuel Blum coined the term 'CAPTCHA' and improved and
publicized the notion, which covers any program that can distinguish humans from
computers. They invented multiple examples of CAPTCHAs, including the first CAPTCHAs to
be widely used, which were those adopted by Yahoo!.

CHARACTERISTICS:

A CAPTCHA system is a means of automatically generating new challenges which:

– Current software is unable to solve accurately.
– Most humans can solve.
– Do not rely on the type of CAPTCHA being new to the attacker.

Although a checkbox "check here if you are not a bot" might serve to distinguish
between humans and computers, it is not a CAPTCHA because it relies on the fact that an
attacker has not spent effort to break that specific form.

Withholding the algorithm can increase the integrity of a limited set of systems,
as in the practice of security through obscurity. The most important factor in deciding whether
an algorithm should be made open or restricted is the size of the system. Although an algorithm
which survives scrutiny by security experts may be assumed to be more conceptually secure
than an unevaluated algorithm, an unevaluated algorithm specific to a very limited set of
systems is usually of less interest to those engaging in automated abuse. Breaking a CAPTCHA
generally requires some effort specific to that particular CAPTCHA implementation, and an
abuser may decide that the benefit gained from automated bypass is outweighed by the effort
required to abuse that system in the first place.

An example of Text-Based CAPTCHA.


APPLICATIONS OF CAPTCHAs:

CAPTCHAs have several applications for practical security, including:

Preventing Comment Spam in Blogs:

Most bloggers are familiar with programs that submit bogus comments, usually
for the purpose of raising search engine ranks of some website (e.g., "buy penny stocks
here"). This is called comment spam. With a CAPTCHA in place, only humans can enter
comments on a blog. There is no need to make users sign up before they enter a
comment, and no legitimate comments are lost.

Protecting Website Registration:

Several companies (Yahoo!, Microsoft, etc.) offer free email services. Up until a
few years ago, most of these services suffered from a specific type of attack: "bots" that
would sign up for thousands of email accounts every minute. The solution to this problem
was to use CAPTCHAs to ensure that only humans obtain free accounts. In general, free
services should be protected with a CAPTCHA in order to prevent abuse by automated
programs.

Search Engine Bots:

It is sometimes desirable to keep web pages unindexed to prevent others from
finding them easily. There is an HTML tag (the robots meta tag) to prevent search engine bots
from reading web pages. The tag, however, doesn't guarantee that bots won't read a web page; it
only serves to say "no bots, please." Search engine bots, since they usually belong to large
companies, respect web pages that don't want to allow them in. However, in order to truly
guarantee that bots won't enter a web site, CAPTCHAs are needed.
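
The advisory nature of the tag can be illustrated with a minimal sketch. It assumes the tag in question is the robots meta tag and shows the check a well-behaved bot would make with Python's standard html.parser module. The class and function names (RobotsMetaParser, may_index) are hypothetical, and nothing forces a bot to run such a check at all, which is why the tag alone is no guarantee.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def may_index(html_text):
    """Return True if a polite bot would treat the page as indexable."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" not in parser.directives and "none" not in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(may_index(page))  # False: a polite bot stays away, a rude one can ignore this
```
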
Online Polls:

In November 1999, http://www.slashdot.org released an online poll asking which
was the best graduate school in computer science (a dangerous question to ask over the
web!). As is the case with most online polls, IP addresses of voters were recorded in
order to prevent single users from voting more than once. However, students at Carnegie
Mellon found a way to stuff the ballots using programs that voted for CMU thousands of
times. CMU's score started growing rapidly. The next day, students at MIT wrote their
own program and the poll became a contest between voting "bots." MIT finished with
21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000.
Can the result of any online poll be trusted? Not unless the poll ensures that only humans
can vote. This is why CAPTCHAs are considered very important for online polls: they
prevent bots from voting.

Preventing Dictionary Attacks:

CAPTCHAs can also be used to prevent dictionary attacks in password systems.
The idea is simple: prevent a computer from being able to iterate through the entire space
of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful
logins.
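
The idea can be sketched as a small login guard. This is only an illustration under stated assumptions: the threshold of three free attempts, the in-memory password table, and the names LoginGuard, try_login and captcha_passed are all hypothetical, and the actual CAPTCHA check is represented by a boolean supplied by the caller.

```python
import hmac

MAX_FREE_ATTEMPTS = 3  # assumed threshold, not taken from the text


class LoginGuard:
    """Demands a solved CAPTCHA once an account has too many failed logins."""

    def __init__(self, passwords, max_free_attempts=MAX_FREE_ATTEMPTS):
        # Hypothetical in-memory store (username -> password); a real system
        # would keep salted password hashes instead of plain text.
        self._passwords = passwords
        self._failures = {}
        self._max_free_attempts = max_free_attempts

    def captcha_required(self, username):
        return self._failures.get(username, 0) >= self._max_free_attempts

    def try_login(self, username, password, captcha_passed=False):
        if self.captcha_required(username) and not captcha_passed:
            return "captcha_required"
        stored = self._passwords.get(username)
        if stored is not None and hmac.compare_digest(
            stored.encode("utf-8"), password.encode("utf-8")
        ):
            self._failures.pop(username, None)  # reset the counter on success
            return "ok"
        self._failures[username] = self._failures.get(username, 0) + 1
        return "wrong_password"


guard = LoginGuard({"alice": "s3cret"})
for guess in ["123456", "password", "qwerty"]:
    print(guard.try_login("alice", guess))                      # wrong_password (three times)
print(guard.try_login("alice", "s3cret"))                       # captcha_required
print(guard.try_login("alice", "s3cret", captcha_passed=True))  # ok
```

After the third failure every further guess costs the attacker one solved CAPTCHA, so a program can no longer run through a dictionary unattended.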

Worms and Spam:

CAPTCHAs also offer a plausible solution against email worms and spam: "I will
only accept an email if I know there is a human behind the other computer." A few
companies are already marketing this idea.

These are the applications of CAPTCHAs used in websites. In general, there are
some guidelines for websites that implement CAPTCHAs.

GUIDELINES:

If a website needs protection from abuse, it is recommended that it use a
CAPTCHA. There are many CAPTCHA implementations, some better than others. The
following guidelines are strongly recommended for any CAPTCHA:

Accessibility:

CAPTCHAs must be accessible. CAPTCHAs based solely on reading text (or
other visual-perception tasks) prevent visually impaired users from accessing the protected
resource. Such CAPTCHAs may make a site incompatible with Section 508 in the United
States. Any implementation of a CAPTCHA should allow blind users to get around the
barrier, for example, by permitting users to opt for an audio CAPTCHA.

Image Security:

Images of text should be distorted randomly before being presented to the user.
Many implementations of CAPTCHAs use undistorted text, or text with only minor
distortions. These implementations are vulnerable to simple automated attacks. For example,
CAPTCHAs that render their text in a consistent font can be broken using image processing
techniques.

Script Security:

Building a secure CAPTCHA is not easy. In addition to
making the images unreadable by computers, the system should ensure that there are no easy
ways around it at the script level. Common examples of insecurities in this respect include:

(1) Systems that pass the answer to the CAPTCHA in plain text as part of the
web form.

(2) Systems where a solution to the same CAPTCHA can be used multiple times
(this makes the CAPTCHA vulnerable to so-called "replay attacks").
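
Both pitfalls can be avoided by keeping the answer on the server and accepting each challenge only once. The following is a minimal sketch under that assumption; the class CaptchaStore and its methods issue and grade are hypothetical names, and rendering the answer as a distorted image is left out.

```python
import hmac
import secrets
import string


class CaptchaStore:
    """Keeps expected answers on the server and lets each token be used once."""

    def __init__(self):
        self._pending = {}  # token -> expected answer, never sent to the client

    def issue(self, length=6):
        answer = "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))
        token = secrets.token_urlsafe(16)
        self._pending[token] = answer
        # Only the token goes into the web form; the answer would be rendered
        # as a distorted image and never included as plain text.
        return token, answer

    def grade(self, token, response):
        # pop() removes the token, so the same solution cannot be replayed.
        expected = self._pending.pop(token, None)
        if expected is None:
            return False
        given = response.strip().upper()
        return hmac.compare_digest(expected.encode("utf-8"), given.encode("utf-8"))


store = CaptchaStore()
token, answer = store.issue()
print(store.grade(token, answer))  # True: first use of a correct solution
print(store.grade(token, answer))  # False: replaying the same solution fails
```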

Security Even After Wide-Spread Adoption:

There are various "CAPTCHAs" that would be insecure if a significant number of
sites started using them. An example of such a puzzle is asking text-based questions, such as a
mathematical question ("what is 1+1"). Since a parser could easily be written that would
allow bots to bypass this test, such "CAPTCHAs" rely on the fact that few sites use them,
and thus that a bot author has no incentive to program their bot to solve that challenge. True
CAPTCHAs should be secure even after a significant number of websites adopt them.
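
To show how little effort such a parser takes, here is a minimal sketch that answers simple arithmetic questions with a regular expression. The function name solve_text_question and the supported operators are assumptions made for illustration; a real bot would need only slightly more work to handle numbers written as words.

```python
import operator
import re

OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
QUESTION = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)")


def solve_text_question(question):
    """Return the answer to a question like 'what is 1+1', or None if unparsed."""
    match = QUESTION.search(question)
    if match is None:
        return None
    left, symbol, right = match.groups()
    return OPERATORS[symbol](int(left), int(right))


print(solve_text_question("what is 1+1"))  # 2
```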

AUDIO CAPTCHA:

Because CAPTCHAs rely on visual perception, users unable to view a
CAPTCHA (for example, due to a disability or because it is difficult to read) will be unable
to perform the task protected by a CAPTCHA. Therefore, sites implementing CAPTCHAs
may provide an audio version of the CAPTCHA in addition to the visual method. The official
CAPTCHA site recommends providing an audio CAPTCHA for accessibility reasons.

HOW TO BREAK A CAPTCHA:


Here are some approaches to defeating CAPTCHAs:

Human solvers:

CAPTCHAs are vulnerable to a relay attack that uses humans to solve the puzzles.
One approach involves relaying the puzzles to a group of human operators who can solve
CAPTCHAs. In this scheme, a computer fills out a form and, when it reaches a CAPTCHA, it
gives the CAPTCHA to a human operator to solve.

Another variation of this technique involves copying the CAPTCHA images and
using them as CAPTCHAs for a high-traffic site owned by the attacker. With enough traffic, the
attacker can get a solution to the CAPTCHA puzzle in time to relay it back to the target site. In
October 2007, a piece of malware appeared in the wild which enticed users to solve CAPTCHAs
in order to see progressively further into a series of striptease images.

These methods have been used by spammers to set up thousands of accounts on
free email services such as Gmail and Yahoo!. Since Gmail and Yahoo! are unlikely to be
blacklisted by anti-spam systems, spam sent through these accounts is less likely to
be blocked.

Computer character recognition:

A number of research projects have attempted (often with success) to beat visual
CAPTCHAs by creating programs that contain the following functionality:
1. Pre-processing: Removal of background clutter and noise.
2. Segmentation: Splitting the image into regions which each contain a single character.
3. Classification: Identifying the character in each region.

Steps 1 and 3 are easy tasks for computers. The only step where humans still
outperform computers is segmentation. If the background clutter consists of shapes similar to
letter shapes, and the letters are connected by this clutter, the segmentation becomes nearly
impossible with current software. Hence, an effective CAPTCHA should focus on making
segmentation hard.
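
Why connected clutter defeats segmentation can be seen in a toy sketch. It assumes a deliberately naive segmenter that splits a binary image wherever a column contains no ink; this is an illustration chosen for brevity, not the method of any particular research project, and the tiny "images" below are made up.

```python
def ink_columns(image):
    """For each column, report whether it contains at least one ink pixel ('#')."""
    width = len(image[0])
    return [any(row[x] == "#" for row in image) for x in range(width)]


def segment(image):
    """Split the image into (start, end) column ranges separated by blank columns."""
    regions, start = [], None
    for x, has_ink in enumerate(ink_columns(image)):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            regions.append((start, x))
            start = None
    if start is not None:
        regions.append((start, len(image[0])))
    return regions


# Two characters ("H" and "T") separated by blank columns.
clean = [
    "#.#..###",
    "###...#.",
    "#.#...#.",
]

# The same characters with a clutter line running through both.
cluttered = [
    "#.#..###",
    "########",
    "#.#...#.",
]

print(segment(clean))      # [(0, 3), (5, 8)]: one region per character
print(segment(cluttered))  # [(0, 8)]: the clutter merges both characters
```

Once the regions are merged, the classification step receives a blob that matches no single character, which is the effect the paragraph above describes.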

Shape Context based approach:

The Shape Context based approach was used to break Gimpy, the CAPTCHA test used at
Yahoo! to screen out bots. This approach makes use of general purpose algorithms that have been
designed for generic object recognition. The same basic ideas have been applied to finding
people in images, matching handwritten digits, and recognizing 3D objects.

In experiments on EZ-Gimpy using 191 images, the method correctly identified the word in
176 of these images: a success rate of 92%. The algorithm takes only a few seconds to process
one image.

Examples of words found in images analyzed using this method: SCREW, SPACE.

PROTECTION BY ENHANCED CAPTCHAs:

Extreme Distortion:
One way to improve security is to use more heavily distorted text in images, so
that bots will not be able to break them.

If extremely distorted images are used, it is almost impossible for current
programs to break them.

reCAPTCHAs:

reCAPTCHA is a system developed at Carnegie Mellon University that uses
CAPTCHA to help digitize the text of books while protecting websites from bots attempting to
access restricted areas. reCAPTCHA supplies subscribing websites with images of words that
optical character recognition (OCR) software has been unable to read. The subscribing websites
present these images for humans to decipher as CAPTCHA words, as part of their normal
validation procedures.

Image-Recognition Captchas:
Some researchers promote image-recognition CAPTCHAs as a possible
alternative to text-based CAPTCHAs. One image-recognition CAPTCHA system presents a question
requiring the user to select a stated type of animal from an array of thumbnail images of assorted
animals. Another approach requires the user to answer yes/no for each picture. With
sixteen images, a bot has a 1 in 65536 (2^16) chance of getting the CAPTCHA right purely by
chance.
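
The figure follows from a short calculation, sketched here under the assumption that the bot guesses each yes/no answer independently with probability 1/2. The function name blind_guess_probability is hypothetical.

```python
from fractions import Fraction


def blind_guess_probability(num_images):
    """Chance of answering every yes/no picture correctly by coin flips."""
    return Fraction(1, 2) ** num_images


print(blind_guess_probability(16))  # 1/65536
```

Each extra picture halves the bot's chance, so the protection grows exponentially with the number of images, at the cost of more work for the human user.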

An Example of Image Recognition Captcha

Here the challenge is presented in image form: the user has to click the 3 pictures of
kittens. This is difficult for a bot to overcome, and the probability of getting it right by
chance is very low.

CONCLUSION:
Thus CAPTCHAs have taken many forms, from text-based to image
recognition. Many new forms of CAPTCHAs are yet to come. There are even proposals to
introduce common-sense questions as CAPTCHAs in the future. We hope for
many more sophisticated CAPTCHAs in the future so that they will lock out bots.

REFERENCES:

http://www.captcha.net/

http://en.wikipedia.org/

http://www.captcha.ru/en/breakings/

http://www.cs.sfu.ca/~mori/research/gimpy/
