Seminar on
PATTERN RECOGNITION USING OPENCV
2010-11
Submitted To:- Submitted By:-
CHAPTER 1
Introduction
Pattern recognition has become increasingly popular and important since the 1960s, and it attracts attention from a wide range of fields. There have been various definitions of Pattern Recognition:
• 1973 (Duda and Hart) defined pattern recognition as a field concerned with machine recognition of meaningful regularities in noisy or complex environments.
• 1977 (Pavlidis) defined pattern recognition in his book: “the word pattern is derived from the same root as the word patron and, in its original use, means something which is set up as a perfect example to be imitated. Thus pattern recognition means the identification of the ideal which a given object was made after.”
• 1992 (Schalkoff) defined PR as “the science that concerns the description or classification (recognition) of measurements”.
• 1996 (Ripley) outlined pattern recognition in his book: “Given some examples of complex signals and the correct decisions for them, make decisions automatically for a stream of future examples”.
• 2002 (Robert P.W. Duin) described the nature of pattern recognition as engineering; the final aim of pattern recognition is to design machines that bridge the gap between application and theory.
1.2 Pattern Recognition: Methods
Pattern recognition has undergone substantial development over many years. It comprises many methods, which have driven the development of numerous applications in different fields. What makes these methods practical is their ability to emulate intelligent behaviour.
Statistical decision and estimation theories have long been used in PR. Statistical pattern recognition (SPR) is a classical approach that emerged over a long period of development; it is based on the distribution of feature vectors obtained from a probability and statistical model. The statistical model is defined by a family of class-conditional probability density functions Pr(x|ci) (the probability of feature vector x given class ci). In SPR, the features are placed in some fixed order so that the set of features can be regarded as a feature vector. Statistical pattern recognition deals with the features only, without considering the relations between them.
Data clustering aims to find a few similar clusters in a mass of data without requiring any information about the known clusters; it is an unsupervised method. In general, data clustering methods can be partitioned into two classes: hierarchical clustering and partition clustering.
The thinking process of human beings is often fuzzy and uncertain, and human language is often fuzzy as well. In reality we cannot always give complete answers or classifications, so the theory of fuzzy sets came into being. Fuzzy sets can describe the extension and intension of a concept effectively. The application of fuzzy sets in pattern recognition started in 1966, when the two basic operations, abstraction and generalization, were examined by Bellman et al. Two principles, proposed by Marr (1982) and Keller (1995), can be regarded as describing the general role of fuzzy sets in PR. A PR system based on fuzzy set theory can imitate the human thinking process broadly and deeply.
1.2.4 Neural networks
Neural networks have developed very fast since the first neural network model (the MP model) was proposed in 1943, especially after the Hopfield network and the famous BP (back-propagation) algorithm came into being. A neural network classifier is a data clustering method based on distance measurement, and the method is model-independent. The neural approach applies biological concepts to machines to recognize patterns. The outcome of this effort is the artificial neural network, which was inspired by physiological knowledge of the human brain. A neural network is composed of a series of different, associated units. In addition, genetic algorithms applied to neural networks are statistically optimized algorithms proposed by Holland (1975). NeurPR is very attractive since it requires minimal a priori knowledge, and with enough layers and neurons an ANN can create any complex decision region.
The concept of structural pattern recognition was put forward by Pavlidis (1977). Structural pattern recognition is not based on a firm theory; it relies on segmentation and feature extraction. It emphasizes the description of structure, namely explaining how some simple sub-patterns compose one pattern. There are two main methods in structural pattern recognition: syntax analysis and structure matching. The basis of syntax analysis is the theory of formal languages; the basis of structure matching is a set of special mathematical techniques built on sub-patterns. When the relations among the parts of an object must be considered, structural pattern recognition is best. Unlike other methods, structural pattern recognition handles symbolic information, and it can be used in higher-level applications such as image interpretation. Structural pattern recognition is often combined with statistical classification or neural networks, through which more complex pattern recognition problems, such as recognition of multidimensional objects, can be dealt with.
Syntactic pattern recognition, a special kind of structural pattern recognition, mainly emphasizes the rules of composition. An attractive aspect of syntactic methods is their suitability for dealing with recursion. Once a series of rules describing the relations among the parts of the object has been defined, syntactic pattern recognition can be used (mid-1960s, 1978).
An approach that uses two concepts, fuzzy applications and the compositional rule of inference, can cope with the problems of rule-based pattern recognition (Kumar S. Ray and J. Ghoshal, 1996).
Another method, presented mainly in Spanish- and Russian-language work, operates on descriptions of the objects. This approach can be applied to both supervised and unsupervised pattern recognition.
The support vector machine (SVM) is a relatively new method with a simple structure; it has been researched widely since it was proposed in the 1990s. SVM is based on statistical learning theory, and it is an effective tool for solving pattern recognition and function estimation problems, especially classification and regression. It has been applied to a wide range of pattern recognition tasks such as face detection, verification and recognition, object detection and recognition, and speech recognition.
A pattern recognition system based on any PR method mainly includes three mutually associated yet distinct processes: data building, pattern analysis, and pattern classification. Data building converts the original information into vectors that can be handled by a computer. The task of pattern analysis is to process these vectors, for example by feature selection, feature extraction, and dimensionality reduction. The aim of pattern classification is to use the information acquired from pattern analysis to train the computer to carry out the classification.
A very common description of a pattern recognition system includes five steps. The classification/regression/description step shown in Fig. 1 is the kernel of the system. Classification is the PR problem of assigning an object to a class; the output of the PR system is an integer label, such as classifying a product as “1” or “0” in a quality control test. Regression is a generalization of the classification task; the output of the PR system is a real-valued number, such as predicting the share value of a firm based on past performance and stock market indicators. Description is the problem of representing an object in terms of a series of primitives; the PR system produces a structural or linguistic description. A general composition of a PR system is given below.
1.4 Applications
Applications have always been one of the most important driving forces for PR theory. Pattern recognition has been developed for many years, and PR technology has been applied in many fields such as artificial intelligence, computer engineering, neurobiology, medical image analysis, archaeology, geological reconnaissance, space navigation, armament technology, and so on.
• Computer vision:-
The first vision systems presented assumed objects with geometric shapes and optimized edges extracted from images.
• Computer-aided diagnosis:-
• Character recognition:-
Automated mail sorting and processing of bank checks; a scanner captures an image of the text, and the image is converted into its constituent characters.
• Speech recognition
CHAPTER 2
Facial Recognition
2.1 Introduction
Image analysis of faces has been an active research topic in computer vision and image
processing. This strong interest is driven by some promising applications such as surveillance and
security monitoring, advanced human-machine interface, video conferencing and virtual reality.
Generally speaking, major research areas include face detection, tracking and recognition, face
animation, expression analysis, lip reading, etc. As the basis for all other related image analysis of
human faces, face detection and tracking are of great importance.
The information age is quickly revolutionizing the way transactions are completed. Everyday actions are increasingly being handled electronically instead of with pencil and paper or face to face. This growth in electronic transactions has resulted in a greater demand for fast and accurate user identification and authentication. Access codes for buildings, bank accounts and computer systems often use PINs for identification and security clearance.
Using the proper PIN gains access, but the user of the PIN is not verified. When credit and ATM cards are lost or stolen, an unauthorized user can often come up with the correct personal codes. Despite warnings, many people continue to choose easily guessed PINs and passwords: birthdays, phone numbers and social security numbers. Recent cases of identity theft have heightened the need for methods to prove that someone is truly who he or she claims to be. Face recognition technology may solve this problem, since a face is undeniably connected to its owner, except in the case of identical twins, and it is nontransferable. The system can then compare scans to records stored in a central or local database or even on a smart card.
The face is an important part of who you are and how people identify you. Except in the case of identical twins, the face is arguably a person's most unique physical characteristic. While humans have had the innate ability to recognize and distinguish faces for millions of years, computers are only now catching up. For face recognition there are two types of comparison. The first is verification: the system compares the given individual with who that individual claims to be and gives a yes or no decision. The second is identification: the system compares the given individual to all the other individuals in the database and returns a ranked list of matches. All identification or authentication technologies operate using the following four stages:
a) Capture: a physical or behavioural sample is captured by the system during enrollment and also during the identification or verification process.
b) Extraction: unique data is extracted from the sample and a template is created.
c) Comparison: the template is then compared with a new sample.
d) Match/non-match: the system decides whether the features extracted from the new sample are a match or a non-match.
Face recognition technology analyzes the unique shape, pattern and positioning of the facial features. Face recognition is a very complex technology and is largely software based. This biometric methodology establishes the analysis framework with tailored algorithms for each type of biometric device. Face recognition starts with a picture, attempting to find a person in the image. This can be accomplished using several methods, including movement, skin tones, or blurred human shapes.
The face recognition system locates the head and finally the eyes of the individual. A matrix is then developed based on the characteristics of the individual's face. The method of defining the matrix varies according to the algorithm (the mathematical process used by the computer to perform the comparison). This matrix is then compared to matrices that are in a database, and a similarity score is generated for each comparison.
Artificial intelligence is used to simulate human interpretation of faces. In order to increase accuracy and adaptability, some kind of machine learning has to be implemented. There are essentially two methods of capture: video imaging and thermal imaging. Video imaging is more common, as standard video cameras can be used. The precise position and angle of the head and the surrounding lighting conditions may affect system performance. The complete facial image is usually captured, and a number of points on the face can then be mapped, for example the positions of the eyes, mouth and nostrils. More advanced technologies make a 3-D map of the face, which multiplies the possible measurements that can be made. Thermal imaging has better accuracy, as it uses facial temperature variations caused by vein structure as the distinguishing traits. Since the heat pattern is emitted from the face itself without any source of external radiation, these systems can capture images regardless of the lighting conditions, even in the dark. The drawback is high cost: thermal cameras are considerably more expensive than standard video cameras.
Fig 2: Block diagram for face recognition
The implementation of face recognition technology includes the following stages:
• Data acquisition
• Input processing
• Face image classification and decision making
The input can be recorded video of the speaker or a still image. A sample of 1 second duration consists of a 25-frame video sequence. More than one camera can be used to produce a 3D representation of the face and to protect against the use of photographs to gain unauthorized access.
A pre-processing module locates the eye positions and compensates for the surrounding lighting conditions and colour variance. First, the presence of a face or faces in the scene must be detected. Once a face is detected, it must be localized, and a normalization process may be required to bring the dimensions of the live facial sample into alignment with the one on the template.
Some facial recognition approaches use the whole face, while others concentrate on facial components and/or regions (such as the lips or eyes). The appearance of the face can change considerably during speech and due to facial expressions. In particular the mouth is subject to fundamental changes, but it is also a very important source for discriminating faces. So an approach to person recognition has been developed based on spatio-temporal modeling of features extracted from a talking face.
Models are trained specific to a person's speech articulation and the way that the person speaks. Person identification is performed by tracking mouth movements of the talking face and by estimating the likelihood that each model generated the observed sequence of features. The model with the highest likelihood is chosen as the recognized person.
Fig 5: Face image classification and decision making
Synergetic computers are used to classify the optical and audio features, respectively. A synergetic computer is a set of algorithms that simulate synergetic phenomena. In the training phase, the BIOID system creates a prototype called a faceprint for each person. A newly recorded pattern is preprocessed and compared with each faceprint stored in the database. As comparisons are made, the system assigns a value to each comparison on a scale of one to ten. If a score is above a predetermined threshold, a match is declared.
From the image of the face, particular traits are extracted; the system may measure various nodal points of the face, such as the distance between the eyes or the width of the nose. These are fed to a synergetic computer, which consists of algorithms to capture and process the sample and compare it with the ones stored in the database. We can also track the lip movement, which is likewise fed to the synergetic computer. By observing the likelihood of each sample against those stored in the database, we can accept or reject the sample.
CHAPTER 3
3.1 Introduction
3.2 Haar-Based Detection
One of the difficult tasks of face recognition is differentiating the face from the background. Each frame of a real-time image is a collection of color and/or light intensity values. Analyzing each pixel to determine where the human face is located can be very difficult due to the wide variation in pigmentation and shape of the human face. However, Haar-based detection simplifies the detection of objects within a digital image.
Instead of using intensity values directly, Haar detection uses changes in contrast between adjacent rectangular groups of pixels. The contrast variances between the pixel groups are used to find relative light areas and dark areas. Areas with contrast variances form features, as shown in Figure 1, which are used to detect the desired objects within the image [5]. These features can be easily scaled by increasing or decreasing the area of the pixels being examined, which allows features to be used to detect objects of various sizes.
The integral image array can be calculated with one pass through the original image. Using the integral image, only six to eight array references are required to compute a feature, so calculating a feature is extremely fast and efficient. It also means that calculating a scaled feature requires the same effort as a non-scaled feature, so detecting various sizes of the same object requires the same amount of effort and time as detecting objects of a single size.
Although calculating individual features is efficient, computing all of the more than 180,000 different rectangle features associated with a 24 × 24 sub-window is not feasible. However, only a small number of features are actually needed to determine whether the sub-window contains the desired object [3, 8]. The goal is to select the one feature that best distinguishes the desired object, such as a face, from other objects. If a sub-window does not have that feature, it can be eliminated. All non-eliminated sub-windows are then analyzed for more features, creating a slightly more complex classifier. This process continues until the desired detection rate is achieved. This method of identifying an object is fast and efficient. Viola and Jones [8] were able to achieve a 95% detection rate for human faces with only 200 features. It took less than one second to identify the faces within a 384 × 288 pixel image. It is possible to achieve a higher degree of accuracy with more features, at only a minor cost in detection time.
The Open Computer Vision library (OpenCV) provides support for Haar classifiers. In addition to the ability to use the classifiers, it also provides several pre-trained classifiers for face detection, as shown in Figure 2. This detection classifier is capable of detecting a face in real time on a personal computer with a 1 GHz or better processor, using a fifteen frames per second video stream with a resolution of 320 × 240 pixels. OpenCV also includes a program that can create new classifiers with a desired accuracy, given images that contain the desired object and images that lack it. This allows one to easily expand or modify a detection routine to be more effective or more precise; one could, for example, build classifiers to identify individual facial features as well as the whole face.
Intel’s Open Computer Vision Library is a free, open source set of C/C++ libraries for computer vision applications. OpenCV is designed to be used in applications pertaining to human-computer interaction, robotics, biometrics, and other areas where visualization is important. The main focus of this paper is the possible applications of OpenCV in the field of facial detection and recognition. In addition to the previously mentioned support for Haar classifiers, there are many library functions that are useful for face detection and recognition. OpenCV provides a simple application programmer interface (API) for interacting with various types of digital cameras, such as a web camera. Many facets of this API are implemented on both Windows and Linux platforms, allowing programs to be easily ported from one operating system to another. OpenCV can utilize two cameras simultaneously and can synchronize the two camera images to create a stereo image. This functionality is useful if one wishes to explore three-dimensional techniques of facial recognition.
One of the most common methods for facial recognition is principal component analysis (PCA). This method involves projecting a facial image into a subspace called “eigenspace”. This subspace is formed from a set of basis vectors called eigenvectors or “eigenfaces”. The basis vectors are computed by calculating the covariance matrix of a dataset and then deriving its eigenvalues and eigenvectors; in this case the dataset is a large set of facial images. By combining all of the eigenvectors, one gets an image of a generalized human face. To perform facial recognition, one projects the desired image into face space and compares the resulting vectors to another image's vectors. If the vast majority of the vector components are the same, then the two faces are the same [4].
One of the major advantages of OpenCV is its library support for performing principal component analysis on images. It provides functions to calculate the covariance matrix and basis vectors and to perform the projections into eigenspace using the image data structure used by all capture functions in OpenCV. This offers a researcher considerable time savings, since all of the algorithms have already been designed and written.
In order to create the covariance matrix, a large dataset of facial images is needed. The best approach to acquiring such a dataset is to request the Facial Recognition Technology (FERET) database from the National Institute of Standards and Technology (NIST). This database contains thousands of color and grayscale images that are used to test the accuracy of different facial recognition systems [7]. The database is free to researchers studying this field. One advantage of the FERET database is that it provides many different images of the same individual, including different angles and lighting conditions; in addition, some of the images are grayscale while others are color. Since the accuracy of face recognition systems can depend on changes in camera angle and lighting conditions between the captured image and the comparison images [4], having various poses can test the versatility of a system. The various poses also allow the training set to be composed of images with the same angles and lighting conditions, producing the most accurate eigenfaces to be used in comparisons. The most important aspect of researching facial recognition is the results of the research. Prior to the FERET database, many facial recognition systems boasted fantastic results.
However, many of the databases that these systems used contained very few images, which made the results unreliable. With FERET, all researchers have access to a large quantity of images for the recognition process. This allows researchers to obtain more accurate statistics on the false positive and false negative rates of their algorithms and recognition systems.
CHAPTER 4
4.1.1 Advantages:
a. There are many benefits to face recognition systems, such as convenience and social acceptability: all you need is your picture taken for it to work.
b. Face recognition is easy to use, and in many cases it can be performed without a person even knowing.
c. Face recognition is also one of the most inexpensive biometrics on the market, and its price should continue to go down.
4.1.2 Disadvantage:
a. Face recognition systems can’t tell the difference between identical twins.
4.2 APPLICATIONS:-
The natural use of face recognition technology is the replacement of PINs, physical tokens or both in automatic authorization or identification schemes. Additional uses are the automation of human identification or role authentication in cases where the assistance of another human would otherwise be needed to verify an ID card and its holder.
There are numerous applications for face recognition technology:
• Surveillance and security monitoring
• Access control for buildings, bank accounts and computer systems
• Advanced human-machine interfaces and video conferencing
CHAPTER
SOURCE CODE
#define CV_NO_BACKWARD_COMPATIBILITY
#include "cv.h"
#include "highgui.h"
#include <iostream>
#include <cstdio>
#ifdef _EiC
#define WIN32
#endif
using namespace std;
using namespace cv;

void detectAndDraw( Mat& img,
                    CascadeClassifier& cascade, CascadeClassifier& nestedCascade,
                    double scale );

String cascadeName =
"../../data/haarcascades/haarcascade_frontalface_alt.xml";
String nestedCascadeName =
"../../data/haarcascades/haarcascade_eye_tree_eyeglasses.xml";
const String scaleOpt = "--scale=";
size_t scaleOptLen = scaleOpt.length();
const String cascadeOpt = "--cascade=";
size_t cascadeOptLen = cascadeOpt.length();
const String nestedCascadeOpt = "--nested-cascade";
size_t nestedCascadeOptLen = nestedCascadeOpt.length();
String inputName;
}
cvNamedWindow( "result", 1 );
if( capture )
{
for(;;)
{
IplImage* iplImg = cvQueryFrame( capture );
frame = iplImg;
if( frame.empty() )
break;
if( iplImg->origin == IPL_ORIGIN_TL )
frame.copyTo( frameCopy );
else
flip( frame, frameCopy, 0 );
detectAndDraw( frameCopy, cascade, nestedCascade, scale );
if( waitKey( 10 ) >= 0 )
goto _cleanup_;
}
waitKey(0);
_cleanup_:
cvReleaseCapture( &capture );
}
else
{
if( !image.empty() )
{
detectAndDraw( image, cascade, nestedCascade, scale );
waitKey(0);
}
else if( !inputName.empty() )
{
/* assume it is a text file containing the
list of the image filenames to be processed - one per line */
FILE* f = fopen( inputName.c_str(), "rt" );
if( f )
{
char buf[1000+1];
while( fgets( buf, 1000, f ) )
{
int len = (int)strlen(buf), c;
while( len > 0 && isspace(buf[len-1]) )
len--;
buf[len] = '\0';
cout << "file " << buf << endl;
image = imread( buf, 1 );
if( !image.empty() )
{
detectAndDraw( image, cascade, nestedCascade, scale );
c = waitKey(0);
if( c == 27 || c == 'q' || c == 'Q' )
break;
}
}
fclose(f);
}
}
}
cvDestroyWindow("result");
return 0;
}
CV_RGB(255,0,0),
CV_RGB(255,0,255)} ;
Mat gray, smallImg( cvRound (img.rows/scale), cvRound(img.cols/scale), CV_8UC1 );
t = (double)cvGetTickCount();
cascade.detectMultiScale( smallImg, faces,
1.1, 2, 0
//|CV_HAAR_FIND_BIGGEST_OBJECT
//|CV_HAAR_DO_ROUGH_SEARCH
|CV_HAAR_SCALE_IMAGE
,
Size(30, 30) );
t = (double)cvGetTickCount() - t;
printf( "detection time = %g ms\n", t/((double)cvGetTickFrequency()*1000.) );
for( vector<Rect>::const_iterator r = faces.begin(); r != faces.end(); r++, i++ )
{
Mat smallImgROI;
vector<Rect> nestedObjects;
Point center;
Scalar color = colors[i%8];
int radius;
center.x = cvRound((r->x + r->width*0.5)*scale);
center.y = cvRound((r->y + r->height*0.5)*scale);
radius = cvRound((r->width + r->height)*0.25*scale);
circle( img, center, radius, color, 3, 8, 0 );
if( nestedCascade.empty() )
continue;
smallImgROI = smallImg(*r);
nestedCascade.detectMultiScale( smallImgROI, nestedObjects,
1.1, 2, 0
//|CV_HAAR_FIND_BIGGEST_OBJECT
//|CV_HAAR_DO_ROUGH_SEARCH
//|CV_HAAR_DO_CANNY_PRUNING
|CV_HAAR_SCALE_IMAGE
,
Size(30, 30) );
for( vector<Rect>::const_iterator nr = nestedObjects.begin(); nr != nestedObjects.end(); nr++ )
{
center.x = cvRound((r->x + nr->x + nr->width*0.5)*scale);
center.y = cvRound((r->y + nr->y + nr->height*0.5)*scale);
radius = cvRound((nr->width + nr->height)*0.25*scale);
circle( img, center, radius, color, 3, 8, 0 );
}
}
cv::imshow( "result", img );
}
CHAPTER 7
BIBLIOGRAPHY
1) M. Bishop. Computer Security, Art and Science. Massachusetts: Pearson Education Inc.
2003.
2) B. Draper, B. Kyungim, M.S. Bartlett, and J.R. Beveridge. “Recognizing faces with PCA
and ICA,” Computer Vision and Image Understanding, vol. 91. pp 115-137, July 2003.
3) P. Menezes, J.C. Barreto, and J. Dias. “Face Tracking Based on Haar-like Features and
Eigenfaces,” 5th IFAC Symposium on Intelligent Autonomous Vehicles, Lisbon, Portugal, July 5-7,
2004.
4) N. Muller, L. Magaia, B.M. Herbst. “Singular Value Decomposition, Eigenfaces, and 3D
Reconstructions,” SIAM Review, Vol. 46 Issue 3, pp. 518 – 545. Dec. 2004.
5) Open Computer Vision Library Reference Manual. Intel Corporation, USA, 2001.
6) P. J. Phillips, P. Grother, R.J. Micheals, D.M. Blackburn, E. Tabassi, and J.M. Bone.
“FRVT 2002: Overview and Summary,” Technical Report, National Institute of Standards
and Technology, March 2003.
7) P. J. Phillips, A. Martin, C.L. Wilson, and M. Przybocki. “An Introduction to Evaluating
Biometric Systems,” IEEE Computer, pp. 56-63, February, 2000.
8) B. Weyrauch, J. Huang, B. Heisele and V. Blanz “Component-Based Face recognition
with 3D Morphable Models,” IEEE Workshop on Face processing in Video, FPIV04,
Washington, D.C., 2004.
9) P. Viola and M. Jones. “Rapid Object Detection Using a Boosted Cascade of Simple
Features,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
2001.