Face Detection
INTRODUCTION
Face detection involves separating image windows into two classes: one containing faces and one containing the background (clutter). It is difficult because, although commonalities exist between faces, they can vary considerably in terms of age, skin color and facial expression. The problem is further complicated by differing lighting conditions, image qualities and geometries, as well as the possibility of partial occlusion and disguise. An ideal face detector would therefore be able to detect the presence of any face under any set of lighting conditions, upon any background.
DIFFERENT APPROACHES TO FACE DETECTION:
There are two predominant approaches to the face detection problem: geometric (feature-based) and photometric (view-based). As researcher interest in face recognition continued, many different algorithms were developed, three of which have been well studied in the face recognition literature.
Recognition algorithms can be divided into two main approaches:
1.1.1. Geometric: Based on the geometrical relationships between facial landmarks, in other words the spatial configuration of facial features. The main geometrical features of the face, such as the eyes, nose and mouth, are first located, and faces are then classified on the basis of various geometrical distances and angles between the features (figure 2).
3. Elastic Bunch Graph Matching using the Fisherface algorithm.
1.2 FACE DETECTION:
The face detection task can be broken down into two steps. The first step is a classification task that takes some arbitrary image as input and outputs a binary value of yes or no, indicating whether there are any faces present in the image. The second step is the face localization task, which takes an image as input and outputs the location of any face or faces within that image as a bounding box (x, y, width, height).
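In practice both steps are usually realized by scanning a classifier across image windows at multiple positions and scales. A minimal Python sketch of the window generation (the step size and scale factor are illustrative choices, not values taken from this report):

```python
def sliding_windows(img_w, img_h, win=24, step=8, scale=1.25, max_scale=4.0):
    """Yield (x, y, w, h) candidate boxes over an img_w x img_h image.

    The base window is `win` pixels square and is enlarged by `scale`
    until it exceeds `max_scale` times the base size or the image.
    """
    size = win
    while size <= min(img_w, img_h) and size <= win * max_scale:
        s = int(size)
        for y in range(0, img_h - s + 1, step):
            for x in range(0, img_w - s + 1, step):
                yield (x, y, s, s)
        size *= scale

# A (hypothetical) face classifier would then label each window and keep
# the ones classified "face" as bounding boxes (x, y, width, height).
boxes = list(sliding_windows(64, 48))
```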
The face detection system can be divided into the following steps:
3. Localization: The trained neural network is then used to search for faces in an image and, if present, localize them in a bounding box. The features of a face on which the work is done are: position, scale, orientation and illumination.
1.3 The History of Face Detection
Face detection began as early as 1977, with the first automated system being introduced by Kanade using a feature vector of human faces. In 1987, Sirovich and Kirby introduced principal component analysis (PCA) for feature extraction. Building on PCA, Turk and Pentland developed the Eigenface method in 1991, considered a major milestone in the technology. Local binary pattern analysis for texture recognition was introduced in 1994 and was later improved for facial recognition by incorporating histograms (LBPH) [4], [5]. In 1997, Fisherface was developed using linear discriminant analysis (LDA) for dimensionality reduction; it can identify faces under different illumination conditions, which was an issue with the Eigenface method [6]. In 2001, Viola and Jones introduced a face detection technique using Haar cascades and AdaBoost [7]. In 2007, a face recognition technique was developed by Naruniec and Skarbek using Gabor jets, which resemble the receptive fields of mammalian eyes. In this project, Haar cascades are used for face detection, and Eigenface, Fisherface and LBPH are used for face recognition.
CHAPTER-2
LITERATURE SURVEY
Active Shape Model: Active Shape Models (ASMs) focus on complex non-rigid features, that is, the actual physical and higher-level appearance of features. ASMs aim to automatically locate landmark points that define the shape of any statistically modelled object in an image; for faces, these are features such as the eyes, lips, nose, mouth and eyebrows. The training stage of an ASM involves the building of a statistical shape model from a set of annotated training images.
The snake, or active contour, approach needs heavy computation. Huang and Chen, and Lam and Yan, both employ fast iteration methods based on greedy algorithms. Snakes have some demerits: the contour often becomes trapped on false image features, and snakes are not suitable for extracting non-convex features.
Independently of computerized image analysis, and before ASMs were developed, researchers developed statistical models of shape. The idea is that once you represent shapes as vectors, you can apply standard statistical methods to them just like any other multivariate object. These models learn allowable constellations of shape points from training examples and use principal components to build what is called a Point Distribution Model. These have been used in diverse ways, for example for categorizing Iron Age brooches. Ideal Point Distribution Models can only deform in ways that are characteristic of the object. Cootes and his colleagues were seeking models which do exactly that, so that if a beard, say, covers the chin, the shape model can "override the image" to approximate the position of the chin under the beard. It was therefore natural (but perhaps only in retrospect) to adopt Point Distribution Models. This synthesis of ideas from image processing and statistical shape modelling led to the Active Shape Model. The first parametric statistical shape model for image analysis based on principal components of inter-landmark distances was presented by Cootes and Taylor. Building on this approach, Cootes, Taylor, and their colleagues then released a series of papers that culminated in what we call the classical Active Shape Model.
Low-level analysis is based on low-level visual features such as color, intensity, edges and motion.
Skin Color Base: Color is a vital feature of human faces. Using skin color as a feature for tracking a face has several advantages. Color processing is much faster than processing other facial features. Under certain lighting conditions, color is orientation invariant. This property makes motion estimation much easier, because only a translation model is needed. Tracking human faces using color as a feature also has several problems: for example, the color representation of a face obtained by a camera is influenced by many factors (ambient light, object movement, etc.).
Fig 2.2. face detection
The perceived human color varies as a function of the relative direction to the illumination.
The pixels of a skin region can be detected using a normalized color histogram, and can be normalized for changes in intensity by dividing by luminance. An [R, G, B] vector is thus converted into an [r, g] vector of normalized color, which provides a fast means of skin detection. This algorithm fails when there are other skin regions such as legs and arms. Chai and Ngan suggested a skin color classification algorithm in the YCbCr color space. Research found that pixels belonging to the skin region have similar Cb and Cr values. Thresholds [Cr1, Cr2] and [Cb1, Cb2] are chosen, and a pixel is classified as having skin tone if its [Cr, Cb] values fall within them. The skin color distribution then gives the face portion of the color image. This algorithm also has the constraint that the face should be the only skin region in the image. Kjeldsen and Kender defined a color predicate in HSV color space to separate skin regions from the background. Skin color classification in HSI color space works like that in YCbCr color space, but here the relevant values are hue (H) and saturation (S). Similarly, thresholds [H1, S1] and [H2, S2] are chosen, and a pixel is classified as having skin tone if its [H, S] values fall within them; this distribution gives the localized face image. Like the two algorithms above, this one has the same constraint.
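A pixel-level sketch of the YCbCr scheme in plain Python; the conversion follows ITU-R BT.601, and the default thresholds are the [77, 127] / [133, 173] ranges commonly attributed to Chai and Ngan (tune them for your own imaging conditions):

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 conversion from 8-bit RGB to Y, Cb, Cr."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Classify a pixel as skin if its (Cb, Cr) falls inside the thresholds.

    The default ranges are the values commonly attributed to Chai and
    Ngan; they are an assumption here, not taken from this report.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]
```

Applying is_skin per pixel and taking the largest connected skin blob gives the face portion, subject to the single-skin-region constraint noted above.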
Gray information within a face can also be treated as an important feature. Facial features such as eyebrows, pupils and lips generally appear darker than their surrounding facial regions. Various recent feature extraction algorithms search for local gray minima within segmented facial regions. In these algorithms, the input images are first enhanced by contrast stretching and gray-scale morphological routines to improve the quality of local dark patches and thereby make detection easier. The extraction of dark patches is achieved by low-level gray-scale thresholding. Yang and Huang presented a new approach based on the gray-scale behavior of faces in pyramid (mosaic) images. This system uses a hierarchical face-location method consisting of three levels: the higher two levels are based on mosaic images at different resolutions, and at the lowest level an edge-detection method is applied. Moreover, this algorithm gives a fine response in complex backgrounds where the size of the face is unknown.
Face detection based on edges was introduced by Sakai et al. This work was based on analyzing line drawings of faces from photographs, aiming to locate facial features. Later, Craw et al. proposed a hierarchical framework based on Sakai et al.'s work to trace a human head outline, and remarkable work has since been carried out by many researchers in this specific area. The method suggested by Anila and Devarajan is very simple and fast. They proposed a framework consisting of three steps: first, the images are enhanced by applying a median filter for noise removal and histogram equalization for contrast adjustment; second, the edge image is constructed from the enhanced image by applying the Sobel operator; then a novel edge-tracking algorithm is applied to extract sub-windows from the enhanced image based on edges. Finally, they used a back-propagation neural network (BPN) algorithm to classify each sub-window as either face or non-face.
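In a real implementation these steps would be library calls (median filter, histogram equalization, Sobel); as a self-contained illustration of just the Sobel edge step, in plain Python (the threshold value is an arbitrary choice):

```python
def sobel_edges(img, thresh=200):
    """Gradient magnitude via the 3x3 Sobel operator, then threshold.

    img: 2-D list of ints. Returns a 2-D list of 0/1 edge labels;
    border pixels are left as 0. `thresh` is an illustrative value.
    """
    h, w = len(img), len(img[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = 1 if (gx * gx + gy * gy) ** 0.5 >= thresh else 0
    return out

# A vertical step edge: dark left half, bright right half.
img = [[0] * 4 + [255] * 4 for _ in range(5)]
edges = sobel_edges(img)
```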
2.4 FEATURE ANALYSIS
These algorithms aim to find structural features that exist even when the
pose, viewpoint, or lighting conditions vary, and then use these to locate faces.
These methods are designed mainly for face localization.
Paul Viola and Michael Jones [39] presented a fast and robust approach to object detection that minimizes computation time while achieving high detection accuracy: roughly 15 times quicker than any technique at the time of release, with 95% accuracy at around 17 fps. The technique relies on simple Haar-like features that are evaluated quickly through a new image representation. Based on the concept of an "Integral Image", it generates a large set of features and uses the boosting algorithm AdaBoost to reduce the overcomplete set, and the introduction of a degenerate tree of boosted classifiers provides robust and fast inference. The detector is applied in a scanning fashion on gray-scale images, and the scanned window can be scaled, as can the features evaluated.
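The "Integral Image" mentioned above is what makes Haar-like features cheap: any rectangle sum costs four lookups. A minimal sketch on a toy image (the values are illustrative, not from the paper):

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y-1][0..x-1].

    One extra row and column of zeros keeps the box-sum formula branch-free.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, x, y, w, h):
    """Sum of the w x h box with top-left corner (x, y), in O(1)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

# A two-rectangle Haar-like feature: right half minus left half.
img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
ii = integral_image(img)
feature = box_sum(ii, 2, 0, 2, 3) - box_sum(ii, 0, 0, 2, 3)
```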
When a match occurs, it means that the faces in the image are detected. The equation of the Gabor filter [40] is shown below.
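The cited equation is not reproduced in this report; for reference, the widely used complex Gabor filter takes the following standard form (the notation here is the conventional one and may differ from [40]):

```latex
g(x, y; \lambda, \theta, \psi, \sigma, \gamma)
  = \exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)
    \exp\!\left(i\left(2\pi \frac{x'}{\lambda} + \psi\right)\right),
\qquad
\begin{aligned}
x' &= x\cos\theta + y\sin\theta,\\
y' &= -x\sin\theta + y\cos\theta,
\end{aligned}
```

where lambda is the wavelength, theta the orientation, psi the phase offset, sigma the width of the Gaussian envelope, and gamma the spatial aspect ratio.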
All the methods discussed so far are able to track faces, but some issues remain: locating faces in various poses against a complex background is truly difficult. To reduce this difficulty, investigators group facial features into face-like constellations using more robust modelling approaches such as statistical analysis. Various types of face constellations have been proposed by Burl et al., who make use of statistical shape theory on the features detected by a multiscale Gaussian derivative filter. Huang et al. also apply a Gaussian filter for pre-processing in a framework based on image feature analysis.
The network architecture, however, has to be extensively tuned (number of layers, number of nodes, learning rates, etc.) to get exceptional performance. One of the earliest hierarchical neural networks was proposed by Agui et al. The first stage has two parallel subnetworks whose inputs are filtered intensity values from the original image. The inputs to the second-stage network consist of the outputs of the subnetworks and extracted feature values, and the output of the second stage indicates the presence of a face in the input region. Propp and Samal developed one of the earliest neural networks for face detection. Their network consists of four layers with 1,024 input units, 256 units in the first hidden layer, eight units in the second hidden layer, and two output units. Feraud and Bernier presented a detection method using auto-associative neural networks, based on the result that an auto-associative network with five layers is able to perform a nonlinear principal component analysis. One auto-associative network is used to detect frontal-view faces and another to detect faces turned up to 60 degrees to the left or right of the frontal view. Later, Lin et al. presented a face detection system using a probabilistic decision-based neural network (PDBNN) [49]. The architecture of a PDBNN is similar to that of a radial basis function (RBF) network, with modified learning rules and a probabilistic interpretation.
The subspace is chosen so that the mean squared error between the projection of the training images onto this subspace and the original images is minimized. They call the set of optimal basis vectors eigenpictures, since these are simply the eigenvectors of the covariance matrix computed from the vectorized face images in the training set. Experiments with a set of 100 images show that a face image of 91 x 50 pixels can be effectively encoded using only 50 eigenpictures.
SVMs were first applied to face detection by Osuna et al. SVMs can be seen as a new paradigm for training polynomial function, neural network, or radial basis function (RBF) classifiers. SVMs work on an induction principle called structural risk minimization, which aims to minimize an upper bound on the expected generalization error. An SVM classifier is a linear classifier in which the separating hyperplane is chosen to minimize the expected classification error on unseen test patterns. Osuna et al. developed an efficient method to train an SVM for large-scale problems and applied it to face detection. On two test sets totalling 10,000,000 test patterns of 19 x 19 pixels, their system has slightly lower error rates and runs approximately 30 times faster than the system by Sung and Poggio. SVMs have also been used to detect faces and pedestrians in the wavelet domain.
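The large-margin idea behind SVMs can be illustrated with a toy linear classifier trained by subgradient descent on the hinge loss. This is only a sketch of the principle; it is not the quadratic-programming training method of Osuna et al., and the two 2-D clusters merely stand in for face and non-face feature vectors:

```python
def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.1):
    """Train w, b for a linear SVM by subgradient descent on hinge loss.

    xs: list of feature vectors; ys: labels in {-1, +1}.
    lam is the regularization strength, lr the learning rate.
    """
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:                     # point violates the margin
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:                              # only apply regularization
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Two separable 2-D clusters standing in for "face" / "non-face" windows.
xs = [[2, 2], [3, 3], [2.5, 3], [-2, -2], [-3, -3], [-2.5, -3]]
ys = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(xs, ys)
```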
CHAPTER-3
FEASIBILITY REPORT
Preliminary investigation examines project feasibility, the likelihood that the system will be useful to the organization. The main objective of the feasibility study is to test the technical, operational and economical feasibility of adding new modules and debugging the old running system. Any system is feasible given unlimited resources and infinite time. There are three aspects in the feasibility study portion of the preliminary investigation:
1. Technical Feasibility
2. Operational Feasibility
3. Economical Feasibility
3.1 Technical Feasibility
The technical issues usually raised during the feasibility stage of the investigation include the following:
Proposed projects are beneficial only if they can be turned into information systems that meet the organization's operating requirements. The operational feasibility aspects of the project are to be taken as an important part of project implementation. Some of the important issues raised to test the operational feasibility of a project include the following:
A well-planned design would ensure the optimal utilization of computer resources and would help improve performance.
A system that can be developed technically, and that will be used if installed, must still be a good investment for the organization. In the economical feasibility study, the development cost of creating the system is evaluated against the ultimate benefit derived from the new system. Financial benefits must equal or exceed the costs. The system is economically feasible: it does not require any additional hardware or software, since the interface for this system is developed using existing resources and technologies. There is only nominal expenditure, and economical feasibility is certain.
CHAPTER-4
4.1 INTRODUCTION
4.1.1 Purpose: The main purpose of preparing this document is to give a general insight into the analysis and requirements of the existing system or situation, and to determine the operating characteristics of the system.
4.1.2 Scope: This document plays a vital role in the software development life cycle (SDLC), as it describes the complete requirements of the system. It is meant for use by the developers and will be the baseline during the testing phase. Any changes made to the requirements in the future will have to go through a formal change approval process.
Developing the system, which meets the SRS and solves all the requirements of the system.
Demonstrating the system and installing the system at the client's location after the acceptance testing is successful.
Submitting the required user manual describing the system interfaces, along with the documents of the system.
Conducting any user training that might be needed for using the system.
Maintaining the system for a period of one year after installation.
Usability
The system is designed as a completely automated process, hence there is little or no user intervention.
Reliability
The system is reliable because of the qualities inherited from the chosen platform, PHP. Code built using PHP is more reliable.
Performance
This system is developed in high-level languages, and by using advanced front-end and back-end technologies it will respond to the end user on the client system in very little time.
Supportability
Implementation
The system is implemented in a web environment using core PHP. Apache is used as the web server and Windows XP Professional as the platform.
Hard Disk :- 50 GB
Monitor :- Color monitor
Keyboard :- 104 keys
Mouse :- Any pointing device
CHAPTER-5
SELECTED SOFTWARE
5.1 SELECTED SOFTWARE
WINDOWS 10 HOME
Battery Saver, for those unfamiliar, is a feature that makes your system more power efficient. It does so by limiting the background activity on the device. A TPM is a microchip that offers additional security-related functions. Many motherboard manufacturers install a TPM chip on their devices. Microsoft assures that if your motherboard has that chip, Windows 10 Home will provide support for it.
Home users will also be able to utilise the all-new Virtual Desktops option and the Snap Assist feature with up to four apps on one screen. Furthermore, they can also give a whirl to Continuum, a flagship feature of Windows 10 that lets you quickly switch from desktop mode to tablet mode. You are also given Microsoft Edge, the brand new browser in town.
Assigned Access 8.1, for instance, allows you to lock user accounts and prevent them from accessing specific apps. BitLocker, on the other hand, is one of the most powerful disk-encryption tools on Windows; it lets you encrypt your external USB drives. You also get tools that facilitate seamless connectivity when joining Azure Active Directory, and a Business Store for Windows 10. So should you get the Pro edition instead?
It all comes down to this: do you need features such as Client Hyper-V, the built-in virtualisation solution in Windows? Does your work require you to connect to a Windows domain? If yes, you should purchase the Pro edition. Else, the Home edition is what you need.
While Windows 10 Home and Pro are the direct paths for retail users, there are other variants of Windows 10 as well, such as Windows 10 Enterprise and Windows 10 Education. The Enterprise edition, as you may expect, is meant to meet the demands of medium- and large-sized organisations. It comes with even more sophisticated features such as Device Guard, which gives a company the ability to lock down devices. Unlike the other two Windows 10 editions, however, the Enterprise variant won't be available for sale in retail stores. Instead, it will be sold through volume licensing.
The Education edition likewise won't be sold in retail stores, and will be seeded out through academic volume licensing.
Also worth noting: if you already have a legitimate copy of Windows 7 Starter, Windows 7 Home Basic, Windows 7 Home Premium, or Windows 8.1, you will get a free upgrade to Windows 10 Home. Existing legitimate users of Windows 7 Pro, Windows 7 Ultimate, or Windows 8.1 Pro will get a free upgrade to Windows 10 Pro.
What is Python?
It is used for:
software development,
mathematics,
system scripting.
Python can connect to database systems. It can also read and modify
files.
Why Python?
Python has syntax that allows developers to write programs with fewer
lines than some other programming languages.
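As a one-line illustration of that brevity, a list comprehension replacing an explicit loop:

```python
# Squares of the even numbers below ten: one line instead of a loop.
squares = [n * n for n in range(10) if n % 2 == 0]
```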
Good to know
Python was designed for readability, and has some similarities to the
English language with influence from mathematics.
Python supports multiple programming paradigms, including procedural, object-oriented and functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library.
The Python interpreter and the extensive standard library are freely
available in source or binary form for all major platforms from the
Python Web site, https://www.python.org/, and may be freely
distributed. The same site also contains distributions of and pointers to
many free third party Python modules, programs and tools, and
additional documentation.
The Python interpreter is easily extended with new functions and data
types implemented in C or C++ (or other languages callable from C).
Python is also suitable as an extension language for customizable
applications.
This tutorial introduces the reader informally to the basic concepts and
features of the Python language and system. It helps to have a Python
interpreter handy for hands-on experience, but all examples are self-
contained, so the tutorial can be read off-line as well.
History
Python 2.0 was released on 16 October 2000 with many major new
features, including a cycle-detecting garbage collector and support for
Unicode.
Python 2.7's end-of-life date was initially set at 2015 then postponed to
2020 out of concern that a large body of existing code could not easily
be forward-ported to Python 3. In January 2017, Google announced
work on a Python 2.7 to Go transcompiler to improve performance
under concurrent workloads.
Python is a multi-paradigm programming language, and many of its features support functional programming and aspect-oriented programming (including by metaprogramming [45] and metaobjects (magic methods)). Many other paradigms are supported via extensions, including design by contract and logic programming.
Readability counts
Rather than having all of its functionality built into its core, Python was
designed to be highly extensible. This compact modularity has made it
particularly popular as a means of adding programmable interfaces to
existing applications. Van Rossum's vision of a small core language with
a large standard library and easily extensible interpreter stemmed from
his frustrations with ABC, which espoused the opposite approach.
Python strives for a simpler, less-cluttered syntax and grammar while
giving developers a choice in their coding methodology. In contrast to
Perl's "there is more than one way to do it" motto, Python embraces a
"there should be one—and preferably only one—obvious way to do it"
design philosophy. Alex Martelli, a Fellow at the Python Software
Foundation and Python book author, writes that "To describe something
as 'clever' is not considered a compliment in the Python culture."
Users and admirers of Python, especially those considered
knowledgeable or experienced, are often referred to as Pythonists,
Pythonistas, and Pythoneers.
PYCHARM FEATURES
PyCharm’s smart code editor provides first-class support for Python,
JavaScript, CoffeeScript, TypeScript, CSS, popular template languages
and more. Take advantage of language-aware code completion, error
detection, and on-the-fly code fixes!
Use smart search to jump to any class, file or symbol, or even any IDE
action or tool window. It only takes one click to switch to the
declaration, super method, test, usages, implementation, and more.
Refactor your code the intelligent way, with safe Rename and Delete,
Extract Method, Introduce Variable, Inline Variable or Method, and
other refactorings. Language and framework-specific refactorings help
you perform project-wide changes.
Save time with a unified UI for working with Git, SVN, Mercurial or
other version control systems. Run and debug your application on
remote machines. Easily configure automatic deployment to a remote
host or VM and manage your infrastructure with Vagrant and Docker.
Database tools
Web Development
Live Edit
Live Editing Preview lets you open a page in the editor and the browser
and see the changes being made in code instantly in the browser.
PyCharm auto-saves your changes, and the browser smartly updates
the page on the fly, showing your edits.
Scientific Tools
You can run a REPL Python console in PyCharm which offers many
advantages over the standard one: on-the-fly syntax check with
inspections, braces and quotes matching, and of course code
completion.
Conda Integration
Visual Debugging
Some coders still debug using print statements, because the concept is
hard and pdb is intimidating. PyCharm’s python debugging GUI makes it
easy to use a debugger by putting a visual face on the process. Getting
started is simple and moving on to the major debugging features is
easy.
Debug Everywhere
Of course PyCharm can debug code that you’re running on your local
computer, whether it’s your system Python, a virtualenv, Anaconda, or
a Conda env. In PyCharm Professional Edition you can also debug code
you’re running inside a Docker container, within a VM, or on a remote
host through SSH.
Test-driven development, or TDD, involves exploration while writing
tests. Use the debugger to help explore by setting breakpoints in the
context you are investigating:
This investigation can be in your test code or in the code being tested,
which is very helpful for Django integration tests (Django support is
available only in PyCharm Professional Edition). Use a breakpoint to find
out what is coming from a query in a test case:
PDB is a great tool, but requires you to modify your code, which can
lead to accidentally checking in `pdb.set_trace()` calls into your git
repo.
Breakpoints
All debuggers have breakpoints, but only some debuggers have highly
versatile breakpoints. Have you ever clicked ‘continue’ many times until
you finally get to the loop iteration where your bug occurs? No need for
that with PyCharm’s conditional breakpoints.
Exceptions can ruin your day; that's why PyCharm's debugger is able to break on exceptions, even if you're not entirely sure where they're coming from.
See Variable Values at a Glance
As soon as PyCharm hits a breakpoint, you’ll see all your variable values
inline in your code. To make it easy to see what values have changed
since the last time you hit the breakpoint, changed values are
highlighted.
Watches
If you want to know where your code goes, you don’t need to put
breakpoints everywhere. You can step through your code, and keep
track of exactly what happens.
Speed
5.4 HAAR CASCADE GUI TRAINER
1. Introduction
Cascade Trainer GUI is a program that can be used to train, test and
improve cascade classifier models. It uses a graphical interface to set the
parameters and make it easy to use OpenCV tools for training and testing
classifiers.
If you are new to the concept of object detection and classifiers please
consider visiting http://opencv.org for more information.
2. Installation
2.1. Prerequisites
3. How to Use
When Cascade Trainer GUI is first started you will be presented with the following screen. This is the starting screen, and it can be used for training classifiers. To train classifiers you usually need to provide the utility with thousands of positive and negative image samples, but there are cases when you can achieve the same with fewer samples.
To start the training, you need to create a folder for your classifier. Create
two folders inside it. One should be “p” (for positive images) and the other
should be “n” (for negative images).
For example, you should have a folder named “Car” which has the
mentioned folders inside it.
Positive image samples are images of the object that you want to train your classifier on and detect.
For example, if you want to train and detect cars then you need to have
many car images.
And you should also have many negative images. Negative images are
anything but cars.
Important Note 2: In theory, negative images can be any image that is not the positive image, but in practice negative images should be relevant to the positive images. For example, using sky images as negative images is a poor choice for training a good car classifier, even if it doesn't have a bad effect on the overall accuracy.
Start by pressing the Browse button in the Train tab. Select the folder you have created for the classifier.
The Common, Cascade and Boost tabs can be used for setting the numerous parameters that customize the classifier training. Cascade Trainer GUI sets the most optimized and recommended settings for these parameters by default; still, some parameters need to be modified for each training. Note that a detailed description of all these parameters is beyond the scope of this help page and requires deep knowledge of cascade classification techniques.
You can set the pre-calculation buffer sizes to help with the speed of the training process. You can assign as much memory as you can spare, but be careful not to set them too high or too low. For example, if you have 8 GB of RAM on your computer, you can safely set both of the buffer sizes below to 2048.
Next you need to set the sample width and height. Make sure not to set a very big size, because it will make your detection very slow; it is quite safe to always use a small value. The recommended setting for sample width and height is to keep one side at 24 and set the other accordingly. For example, if you have sample images that are 320x240, first calculate the aspect ratio, which in this case is 1.33:1, then multiply the larger number by 24: you get 32x24 for sample width and height.
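That aspect-ratio arithmetic can be captured in a couple of lines (the function name is hypothetical, for illustration only):

```python
def sample_size(img_w, img_h, base=24):
    """Keep the smaller side at `base` and scale the other side to
    preserve the aspect ratio, per the guideline above."""
    if img_w >= img_h:
        return round(base * img_w / img_h), base
    return base, round(base * img_h / img_w)

# 320x240 samples: aspect ratio 1.33:1, so the longer side becomes 32.
w, h = sample_size(320, 240)
```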
You can also set the feature type to HAAR, LBP or HOG (make sure to use HOG only if you have OpenCV 3.1 or later). HAAR classifiers are very accurate but require a lot more time to train, so it is much wiser to use LBP if you can provide your classifiers with many sample images. LBP classifiers, on the other hand, are less accurate but train much more quickly and detect almost three times faster.
After all the parameters are set, press the Start button at the bottom to start training your cascade classifier. You'll see the following log screen while training is going on.
Now if you exit Cascade Trainer GUI and go to the classifier folder, you will notice that new files and folders have been created in it. "n" and "p" are familiar, but the rest are new. The "classifier" folder contains XML files that are created during the different stages of training; if you check inside it, you'll notice something similar to the following.
“params.xml” contains the parameters you have used for the training.
(Just a reminder file)
To test your classifiers, go to the Test tab in the tab bar at the top,
set the options as described below, and finally press Start.
Select your cascade classifier using the Browse button at the top. You can
also manually enter the path to cascade XML file in the Cascade Classifier
XML field. This cascade classifier will be used for detection in images
and/or videos.
You can select one of the following for Input Settings and you need to set
the path according to this option:
• Single Image: A single image will be used as the scene in which
detection will be done. In this case Path should point to a single image file.
(Only supported images can be selected.)
• Video: A video file will be used for testing the classifier. In this case Path
should point to a video file that will be used as input.
After setting the input settings it is time for output settings. Output type
and path should be set according to what was chosen in input settings. See
below:
• Video: A video file with the detected objects highlighted with a red
rectangle will be created. In this case Path should point to a video file that
will be created after the test completes. Note that this option will only work
if Video is selected in the input settings.
After clicking the Start button, the test will start. You'll see a
progress bar and the following preview screen with results.
You can also use the Cropper tool to crop images of the object you
want to detect; it provides very simple tools for this. Try to use
relevant negative images, and always train visually very different
objects separately. You can crop images from frames in a video file or
from a folder full of images.
CHAPTER-6
DIGITAL IMAGE PROCESSING
6.1 DIGITAL IMAGE PROCESSING
Interest in digital image processing methods stems from two principal
application areas:
1. Improvement of pictorial information for human interpretation
2. Processing of scene data for autonomous machine perception
In this second application area, interest focuses on procedures for extracting
image information in a form suitable for computer processing.
Examples include automatic character recognition, industrial machine
vision for product assembly and inspection, military reconnaissance,
automatic processing of fingerprints, etc.
Image:
An image refers to a 2D light intensity function f(x, y), where (x, y)
denotes spatial coordinates and the value of f at any point (x, y) is
proportional to the brightness or gray level of the image at that
point. A digital image is an image f(x, y) that has been discretized
both in spatial coordinates and in brightness. The elements of such a
digital array are called image elements or pixels.
A simple image model:
To be suitable for computer processing, an image f(x, y) must be
digitized both spatially and in amplitude. Digitization of the spatial
coordinates (x, y) is called image sampling, and amplitude digitization
is called gray-level quantization.
The storage and processing requirements increase rapidly with the spatial
resolution and the number of gray levels.
Example: a 256 gray-level image of size 256×256 occupies 64 KB of
memory.
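The figure in that example is easy to verify; a small sketch (the function name is ours):

```python
import math

def image_storage_bytes(width, height, gray_levels):
    """Storage for an uncompressed image: one pixel takes
    ceil(log2(gray_levels)) bits, e.g. 256 levels -> 8 bits."""
    bits_per_pixel = math.ceil(math.log2(gray_levels))
    return width * height * bits_per_pixel // 8

# A 256 gray-level image of size 256x256:
print(image_storage_bytes(256, 256, 256))  # -> 65536 bytes, i.e. 64 KB
```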
The input to low-level processing is the raw image. Medium-level
processing means extracting regions of interest from the output of
low-level processing; it deals with the identification of boundaries,
i.e. edges, a process called segmentation. High-level processing deals
with adding artificial intelligence to the medium-level processed
signal.
6.2 FUNDAMENTAL STEPS IN IMAGE PROCESSING
[Figure: the fundamental steps in image processing: image acquisition,
pre-processing, segmentation, representation and description, and
recognition, all linked to a common knowledge base.]
A digital image processing system contains the following blocks as shown in
the figure
[Figure: block diagram of a digital image processing system: image
acquisition equipment (video, scanner) feeding a processing unit
(computer, workstation), with storage (optical discs, tape, video tape,
magnetic discs), display units (TV monitors, printers, projectors) and
a communication channel.]
1. Acquisition
2. Storage
3. Processing
4. Communication
5. Display
The commonly used image file formats, their full names, descriptions
and recognized extensions are:

• TIFF (Tagged Image File Format): a flexible file format supporting a
variety of image compression standards, including JPEG. Extensions:
.tif, .tiff
• JPEG (Joint Photographic Experts Group): a standard for compression
of images of photographic quality. Extensions: .jpg, .jpeg
• BMP (Windows Bitmap): a format used mainly for simple uncompressed
images. Extension: .bmp
• PNG (Portable Network Graphics): compresses full-color images with
transparency (up to 48 bits/pixel). Extension: .png
CHAPTER-7
FACE DETECTION
The problem of face recognition is all about face detection, a fact
that seems quite bizarre to new researchers in this area. However,
before face recognition is possible, one must be able to reliably find
a face and its landmarks. This is essentially a segmentation problem,
and in practical systems most of the effort goes into solving this
task. In fact, the actual recognition based on features extracted from
these facial landmarks is only a minor final step.
There are two types of face detection problems:
Most face detection systems use an example-based learning approach
to decide whether or not a face is present in the window at a given
instant (Sung and Poggio, 1994; Sung, 1995). A neural network or some
other classifier is trained using supervised learning with 'face' and
'non-face' examples, thereby enabling it to classify an image (a
window, in a face detection system) as 'face' or 'non-face'.
Unfortunately, while it is relatively easy to find face examples, how
would one find a representative sample of images which represent
non-faces (Rowley et al., 1996)? Therefore, face detection systems
using example-based learning need thousands of 'face' and 'non-face'
images for effective training. Rowley, Baluja, and Kanade (Rowley et
al., 1996) used 1,025 face images and 8,000 non-face images (generated
from 146,212,178 sub-images) for their training set!
For example, an individual's eye region is almost always darker than
his forehead or nose. Therefore an image will match the template if it
satisfies these 'darker than' and 'brighter than' relationships (Sung
and Poggio, 1994).
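A sketch of such a relational template check (the region names and the grey values are hypothetical, for illustration only):

```python
def matches_relational_template(regions):
    """A candidate window matches when the required 'darker than'
    relations hold: the eye region must be darker than both the
    forehead and the nose."""
    return (regions["eyes"] < regions["forehead"] and
            regions["eyes"] < regions["nose"])

# Mean grey levels measured in two candidate windows (made-up numbers):
print(matches_relational_template({"eyes": 60, "forehead": 140, "nose": 150}))   # -> True
print(matches_relational_template({"eyes": 160, "forehead": 140, "nose": 150}))  # -> False
```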
7.2 REAL-TIME FACE DETECTION
Since in real-time face detection the system is presented with a
series of frames in which to detect a face, spatio-temporal filtering
(finding the difference between subsequent frames) can be used to
identify the area of the frame that has changed and so detect the
individual (Wang and Adelson, 1994; Adelson and Bergen, 1986).
Furthermore, as seen in the figure, exact face locations can be easily
identified by using a few simple rules, such as:
1) the head is the small blob above a larger blob, the body;
2) head motion must be reasonably slow and contiguous; heads won't
jump around erratically (Turk and Pentland, 1991a, 1991b).
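The frame-differencing step can be sketched without any library, with frames as 2-D lists of grey levels (the threshold value is an assumption):

```python
def changed_mask(prev_frame, next_frame, threshold=25):
    """Spatio-temporal filtering: flag pixels whose grey level changed
    by more than `threshold` between two subsequent frames."""
    return [[abs(a - b) > threshold for a, b in zip(pr, nr)]
            for pr, nr in zip(prev_frame, next_frame)]

prev = [[10, 10], [10, 10]]
curr = [[10, 90], [10, 10]]
print(changed_mask(prev, curr))  # -> [[False, True], [False, False]]
```

The True entries mark the moving region (the head and body blobs) that the rules above are then applied to.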
It is the process of identifying different parts of human faces, such
as the eyes, nose and mouth; this process can be achieved using MATLAB
code. In this project the author will attempt to detect faces in still
images by using image invariants. To do this it is useful to study the
grey-scale intensity distribution of an average human face. The
following 'average human face' was constructed from a sample of 30
frontal-view human faces, of which 12 were from females and 18 from
males. A suitably scaled colormap has been used to highlight grey-scale
intensity differences.
The grey-scale differences, which are invariant across all the sample
faces, are strikingly apparent. The eye-eyebrow area always seems to
contain dark (low-intensity) grey levels, while the nose, forehead and
cheeks contain bright (high-intensity) grey levels. After a great deal
of experimentation, the researcher found that the following areas of
the human face were suitable for a face detection system based on image
invariants and a deformable template.
Figure 5.3.2: Area chosen for face detection, indicated on the average
human face in grey scale (shown with a scaled colormap and its
negative).
The above facial area performs well as a basis for a face template,
probably because of the clear division of the bright intensity
invariant area by the dark intensity invariant regions. Once this pixel
area is located by the face detection system, any particular area
required can be segmented based on the proportions of the average human
face. After studying the above images, the author subjectively decided
to use the following as a basis for dark-intensity-sensitive and
bright-intensity-sensitive templates. Once these are located in a
subject's face, a pixel area 33.3% (of the width of the square window)
below this.
Note the slight differences made to the bright intensity invariant
sensitive template (compare Figures 3.4 and 3.2), which were needed
because of the pre-processing done by the system to overcome irregular
lighting (chapter six). Now that suitable dark and bright intensity
invariant templates have been decided on, it is necessary to find a way
of using these to make two A-units for a perceptron, i.e. a
computational model is needed to assign neurons to the distributions
displayed.
7.4 FACE DETECTION ALGORITHM
Fig 5.4.3 Eye detection
7.5 Face Detection using Haar-Cascades
Figure 2: Several Haar-like features matched to the features of the
author's face.
7.6 PRINCIPAL COMPONENT ANALYSIS (PCA)
A typical 100×100 image used in this thesis will have to be transformed
into a 10,000-dimensional vector!
Fig 6.6.1 Eigenfaces
The transformation of a face from image space (I) to face space (f)
involves just a simple matrix multiplication. If the average face image is A
and U contains the (previously calculated) eigenfaces,
f = U * (I - A)
This is done for all the face images in the face database (the
database of known faces) and for the image (the subject's face) which
must be recognized. The possible results when projecting a face into
face space are given in the following figure.
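With U holding one eigenface per row, the projection f = U * (I - A) is an ordinary matrix-vector product. A toy example in plain Python (the tiny 3-pixel "image space" and the values are made up for illustration):

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Two orthonormal "eigenfaces" (rows of U) in a toy 3-pixel image space:
U = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [5.0, 5.0, 5.0]   # average face
I = [7.0, 4.0, 5.0]   # face to project

f = matvec(U, [i - a for i, a in zip(I, A)])  # f = U * (I - A)
print(f)  # -> [2.0, -1.0]
```

The two entries of f are the weights of the face along each eigenface; a real system does exactly this with 10,000-dimensional vectors.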
2. The projected image is a face, but does not project near any face
in the face database:
I' = UT * U * (I - A)
sufficient for a very good description of human faces, since the
reconstructed images have only about 2% RMS pixel-by-pixel error.
7.7.1 Eigenface
Figure 4: Pixels of the image are reordered to perform calculations for
Eigenface
The next step is computing the covariance matrix from the result. To
obtain the eigenvectors of the data, eigenanalysis is performed using
principal component analysis. In the basis where the covariance matrix
is diagonal, the direction with the highest variance is taken as the
first eigenvector. The second eigenvector is the direction of the next
highest variance, at 90 degrees to the first; the third captures the
next highest variation, and so on. Each column, treated as an image and
visualized, resembles a face; these are called eigenfaces. When a face
is to be recognized, the image is imported and resized to match the
dimensions of the training data, as mentioned above. By projecting the
extracted features onto each of the eigenfaces, weights can be
calculated. These weights correspond to the similarity between the
features extracted from the different image sets in the dataset and the
features extracted from the input image. The input image can be
identified as a face by comparing it with the whole dataset, and by
comparing it with each subset it can be identified as belonging to a
particular person. By applying a threshold, detection and
identification can be controlled to eliminate false detections and
recognitions. PCA is sensitive to large values and assumes that the
subspace is linear. If the same face is analyzed under different
lighting conditions, the values mix when the distribution is calculated
and cannot be effectively classified. This makes differing lighting
conditions a problem in matching features, as they can change
dramatically.
7.7.2 Fisherface
Figure 5: The first component of PCA and LDA. Classes in PCA look more
mixed than in LDA.
7.7.3 Local Binary Pattern Histogram
if the brightness is changed as in figure 7, the result is the same as
before. Histograms are used in larger cells to find the frequency of
occurrence of values, making the process faster. By analyzing the
results in a cell, edges can be detected as the values change. By
computing the values of all cells and concatenating the histograms,
feature vectors can be obtained. Images can be classified by processing
them with an ID attached. Input images are classified using the same
process and compared with the dataset, and a distance is obtained. By
setting a threshold, it can be determined whether the face is known or
unknown. Eigenface and Fisherface compute the dominant features of the
whole training set, while LBPH analyzes them individually.
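The basic LBP operator on a single 3×3 cell can be sketched as follows. Note that the code is unchanged when a constant is added to every pixel, which is the illumination invariance described above (the neighbour ordering is one common convention; the pixel values are made up):

```python
def lbp_value(cell):
    """LBP code for the centre pixel of a 3x3 cell: threshold the 8
    neighbours against the centre, then read the results as an 8-bit
    number, clockwise from the top-left corner."""
    c = cell[1][1]
    neighbours = [cell[0][0], cell[0][1], cell[0][2], cell[1][2],
                  cell[2][2], cell[2][1], cell[2][0], cell[1][0]]
    return sum(1 << i for i, n in enumerate(neighbours) if n >= c)

cell = [[90, 40, 20],
        [70, 50, 30],
        [60, 50, 80]]
brighter = [[v + 30 for v in row] for row in cell]
print(lbp_value(cell))                         # -> 241
print(lbp_value(brighter) == lbp_value(cell))  # -> True
```

A histogram of these codes over each cell, concatenated across cells, gives the LBPH feature vector.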
CHAPTER 8
METHODOLOGY
Below are the methodology and descriptions of the applications used for
data gathering, face detection, training and face recognition. The
project was coded in Python using a mixture of the IDLE and PyCharm
IDEs.
Classifier objects are created using the classifier class in OpenCV
through cv2.CascadeClassifier(), loading the respective XML files. A
camera object is created using cv2.VideoCapture() to capture images.
By using CascadeClassifier.detectMultiScale(), objects of various sizes
are matched and their locations returned. Using the location data, the
face is cropped for further verification. An eye cascade is used to
verify that there are two eyes in the cropped face. If satisfied, a
marker is placed around the face to illustrate that a face has been
detected at that location.
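The eye-based verification step can be sketched as a pure function over the (x, y, w, h) rectangles that detectMultiScale() returns; the helper name and the "upper half of the face" rule are our assumptions, not taken from the project code:

```python
def eyes_confirm_face(face, eyes):
    """Accept a candidate face only when exactly two detected eye
    rectangles have their centres inside the upper half of the face.
    All rectangles are (x, y, w, h), as returned by detectMultiScale()."""
    fx, fy, fw, fh = face
    inside = [
        (ex, ey, ew, eh) for (ex, ey, ew, eh) in eyes
        if fx <= ex + ew / 2 <= fx + fw and fy <= ey + eh / 2 <= fy + fh / 2
    ]
    return len(inside) == 2

face = (0, 0, 100, 100)
print(eyes_confirm_face(face, [(20, 20, 10, 10), (60, 20, 10, 10)]))  # -> True
print(eyes_confirm_face(face, [(20, 80, 10, 10)]))                    # -> False
```

In the live pipeline this function would receive the face rectangle from the face cascade and the eye rectangles from the eye cascade run on the cropped face.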
Figure 9: Flowchart for the image collection
The application starts with a request for a name, which is stored with
an ID in a text file. The face detection system forms the first half.
However, before capturing begins, the application checks the brightness
level and will capture only if the face is well illuminated.
Furthermore, after a face is detected, the positions of the eyes are
analyzed. If the head is tilted, the application automatically corrects
the orientation. These two additions were made considering the
requirements of the Eigenface algorithm. The image is then cropped and
saved using the ID as a filename, to be identified later. A loop runs
this program until 50 viable images are collected from the person. This
application made data collection efficient.
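The two pre-capture checks can be sketched as follows; the brightness threshold of 80 and both function names are our assumptions, not values taken from the application:

```python
import math

def bright_enough(gray_rows, minimum=80):
    """Pass only when the mean grey level of the cropped face region
    reaches `minimum`, i.e. the face is well illuminated."""
    pixels = [p for row in gray_rows for p in row]
    return sum(pixels) / len(pixels) >= minimum

def tilt_angle(left_eye, right_eye):
    """Angle (in degrees) of the line joining the two eye centres;
    rotating the image by the negative of this angle levels the eyes."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    return math.degrees(math.atan2(ry - ly, rx - lx))

print(bright_enough([[100, 120], [90, 110]]))    # -> True
print(round(tilt_angle((30, 40), (70, 80)), 1))  # -> 45.0
```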
OpenCV enables the creation of XML files to store features extracted
from datasets using the FaceRecognizer class. The stored images are
imported, converted to grayscale and saved with IDs in two lists with
matching indexes. Face recognizer objects are created using the face
recognizer class. Each recognizer can take in the parameters described
below:
cv2.face.createEigenFaceRecognizer()
cv2.face.createFisherFaceRecognizer()
1. The first argument is the number of components of the LDA used for
the creation of Fisherfaces; OpenCV recommends keeping it at 0 if
uncertain.
2. A threshold similar to the Eigenface threshold; -1 is returned if
the threshold is passed.
cv2.face.createLBPHFaceRecognizer()
1. The radius from the center pixel used to build the local binary
pattern.
2. The number of sample points used to build the pattern; a
considerable number will slow down the computer.
3. The number of cells to be created along the X axis.
4. The number of cells to be created along the Y axis.
5. A threshold value similar to those of Eigenface and Fisherface; if
the threshold is passed, the object returns -1.
Recognizer objects are created and images are imported, resized,
converted into numpy arrays and stored in a vector. The ID of each
image is gathered by splitting the file name and stored in another
vector. By using FaceRecognizer.train(NumpyImage, ID), all three of the
objects are trained. It must be noted that resizing the images was
required only for Eigenface and Fisherface, not for LBPH. Next, the
configuration model is saved as an XML file using
FaceRecognizer.save(FileName). In this project, all three are trained
and saved through one application for convenience. The flow chart for
the trainer is shown in figure 10.
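A hedged sketch of that trainer loop. The filename pattern "ID.sample.png" is our assumption (the text only says the ID is encoded in the filename), and the OpenCV calls require the opencv-contrib-python package; newer OpenCV spells the factory LBPHFaceRecognizer_create rather than createLBPHFaceRecognizer:

```python
import os

def id_from_filename(path):
    """Recover the numeric person ID from a sample file name such as
    'dataset/12.37.png' (hypothetical ID.sample-number convention)."""
    return int(os.path.basename(path).split(".")[0])

def train_and_save(image_paths, model_path):
    """Import, resize and train, mirroring the steps described above.
    cv2/numpy are imported here so the sketch stays self-contained."""
    import cv2
    import numpy as np
    faces, ids = [], []
    for p in image_paths:
        img = cv2.imread(p, cv2.IMREAD_GRAYSCALE)
        faces.append(cv2.resize(img, (100, 100)))  # needed for Eigen/Fisher
        ids.append(id_from_filename(p))
    recognizer = cv2.face.LBPHFaceRecognizer_create()
    recognizer.train(faces, np.asarray(ids))
    recognizer.save(model_path)

print(id_from_filename("dataset/12.37.png"))  # -> 12
```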
8.2.3 The Face Recognition using Detection
8.3 Discussion
CHAPTER 9
OUTPUT SCREENS
CHAPTER 10
CONCLUSION
This paper describes the mini-project for the visual perception and
autonomy module. It explains the technologies used in the project and
the methodology applied, then presents the results and discusses the
challenges and how they were resolved. Using Haar-cascades for face
detection worked extremely well, even when subjects wore spectacles.
Real-time video speed was satisfactory as well, devoid of noticeable
frame lag. Considering all factors, LBPH combined with Haar-cascades
can be implemented as a cost-effective face recognition platform.
Examples include a system to identify known troublemakers in a mall or
a supermarket, keeping the owner alert, or automatic attendance taking
in a class.
CHAPTER 11
FUTURE ENHANCEMENT
Today, one of the fields that uses facial detection the most is
security. This facial detection system can be enhanced with facial
recognition, a very effective tool that can help law enforcement
recognize criminals, and software companies are leveraging the
technology to help users access their devices. The technology can be
further developed for use in other avenues such as ATMs, accessing
confidential files, or other sensitive materials. This could make other
security measures such as passwords and keys obsolete.
Another way innovators are looking to implement facial detection and
recognition is within subways and other transportation outlets,
leveraging the technology to use faces as credit cards to pay the
transportation fee. Instead of going to a booth to buy a ticket for a
fare, the face recognition system would take your face, run it through
a system, and charge the account that you've previously created. This
could potentially streamline the process and optimize the flow of
traffic drastically. The future is here.
CHAPTER 12
BIBLIOGRAPHY
[1] Takeo Kanade. Computer Recognition of Human Faces, volume 47.
Birkhäuser Basel, 1977.
[2] Lawrence Sirovich and Michael Kirby. Low-dimensional procedure for
the characterization of human faces. JOSA A, 4(3):519–524, 1987.
[3] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of
Cognitive Neuroscience, 3(1):71–86, Jan 1991.
[4] Dong-Chen He and Li Wang. Texture unit, texture spectrum, and
texture analysis. IEEE Transactions on Geoscience and Remote Sensing,
28(4):509–512, Jul 1990.
[5] X. Wang, T. X. Han, and S. Yan. An HOG-LBP human detector with
partial occlusion handling. In 2009 IEEE 12th International Conference
on Computer Vision, pages 32–39, Sept 2009.
[6] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs.
Fisherfaces: recognition using class specific linear projection. IEEE
Transactions on Pattern Analysis and Machine Intelligence,
19(7):711–720, Jul 1997.
[7] P. Viola and M. Jones. Rapid object detection using a boosted
cascade of simple features. In Proceedings of the 2001 IEEE Computer
Society Conference on Computer Vision and Pattern Recognition (CVPR
2001), volume 1, pages I-511–I-518, 2001.
[8] John G. Daugman. Uncertainty relation for resolution in space,
spatial frequency, and orientation optimized by two-dimensional visual
cortical filters. JOSA A, 2(7):1160–1169, 1985.
[9] S. Marčelja. Mathematical description of the responses of simple
cortical cells. JOSA, 70(11):1297–1300, 1980.
[10] Rainer Lienhart and Jochen Maydt. An extended set of Haar-like
features for rapid object detection. In Proceedings of the 2002
International Conference on Image Processing, volume 1, pages I–I.
IEEE, 2002.
[11] John P. Lewis. Fast template matching. In Vision Interface,
volume 95, pages 15–19, 1995.
[12] Meng Xiao He and Helen Yang. Microarray dimension reduction, 2009.