Final Report For Btech Sem 8th Engineering
CHAPTER 1
INTRODUCTION
This project aims to create a system that automatically estimates whether each student is present or absent and marks his/her attendance accordingly. Using Concentration Analysis, it is also possible to know whether students are awake or sleeping and whether they are interested or bored during a lecture.
For simplicity, we test the system with a single person per trial using a simple webcam. MATLAB is used to test the system's behavior under various external conditions such as noise and illumination. The entire process of marking a student's attendance together with concentration analysis is divided into separate modules, namely: Registration using Facial Detection (using the Viola-Jones algorithm), Facial Recognition (using Principal Component Analysis) and Concentration Analysis (using Thresholding).
This project is able to register images of the student from video feed. These images form
the necessary training database required for Facial Recognition. The images undergo
subsequent intensity normalization and noise removal techniques for Image Enhancement.
After registration, the registering user needs to add a name corresponding to his/her images. Then, for marking attendance, the project accepts a live video feed of a single student as input. The facial recognition algorithm runs in the background and the name of the student is displayed on the screen. The presence or absence of the student is marked against his/her name in the database, which can be displayed. For concentration analysis, an eye pair is first detected for the student whose concentration is to be measured. The project then counts the number of blinks per frame set. These form the basis for measuring the concentration percentage of the person.
The available systems such as RFIDs, tokens and fingerprint biometrics are too costly and also require additional hardware.
Given a real time video of an ongoing class, the system should be able to detect
and recognize students to record their attendance automatically.
It should utilize minimum resources in terms of hardware and cost.
It is also expected to know whether students are awake or sleeping and whether
students are interested or bored in lecture using Concentration Analysis.
CHAPTER 2
LITERATURE SURVEY
This paper describes a face detection framework that is capable of processing images
extremely rapidly while achieving high detection rates. There are three key contributions.
The first is the introduction of a new image representation called the “Integral Image”
which allows the features used by our detector to be computed very quickly. The second is
a simple and efficient classifier which is built using the AdaBoost learning algorithm
(Freund and Schapire, 1995) to select a small number of critical visual features from a very
large set of potential features. The third contribution is a method for combining classifiers
in a “cascade” which allows background regions of the image to be quickly discarded
while spending more computation on promising face-like regions. A set of experiments in
the domain of face detection is presented. The system yields face detection performance
comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998;
Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
This paper brings together new algorithms and insights to construct a framework for robust and extremely rapid visual detection. In other face detection systems, auxiliary
information, such as image differences in video sequences, or pixel color in color images,
have been used to achieve high frame rates. This system achieves high frame rates working
only with the information present in a single grey scale image. These alternative sources of
information can also be integrated with our system to achieve even higher frame rates.
There are three main contributions of our face detection framework.
The first contribution of this paper is a new image representation called an integral image
that allows for very fast feature evaluation. Motivated in part by the work of Papageorgiou
et al. (1998) our detection system does not work directly with image intensities. Like these
authors we use a set of features which are reminiscent of Haar Basis functions (though we
will also use related filters which are more complex than Haar filters). In order to compute
these features very rapidly at many scales we introduce the integral image representation
for images (the integral image is very similar to the summed area table used in computer
graphics (Crow, 1984) for texture mapping). The integral image can be computed from an
image using a few operations per pixel. Once computed, any one of these Haar-like features can be computed at any scale or location in constant time.
The second contribution of this paper is a simple and efficient classifier that is built by
selecting a small number of important features from a huge library of potential features
using AdaBoost (Freund and Schapire, 1995). Within any image sub-window the total
number of Haar-like features is very large, far larger than the number of pixels. In order to
ensure fast classification, the learning process must exclude a large majority of the
available features, and focus on a small set of critical features. Motivated by the work of
Tieu and Viola (2000) feature selection is achieved using the AdaBoost learning algorithm
by constraining each weak classifier to depend on only a single feature. As a result each
stage of the boosting process, which selects a new weak classifier, can be viewed as a
feature selection process. AdaBoost provides an effective learning algorithm and strong
bounds on generalization performance (Schapire et al., 1998).
The third major contribution of this paper is a method for combining successively more
complex classifiers in a cascade structure which dramatically increases the speed of the
detector by focusing attention on promising regions of the image. The notion behind focus
of attention approaches is that it is often possible to rapidly determine where in an image a
face might occur. More complex processing is reserved only for these promising regions.
The key measure of such an approach is the “false negative” rate of the attention process. It
must be the case that all, or almost all, face instances are selected by the attention filter.
We will describe a process for training an extremely simple and efficient classifier which
can be used as a “supervised” focus of attention operator. A face detection attention operator can be learned which will filter out over 50% of the image while preserving 99% of the faces (as evaluated over a large dataset). This filter is exceedingly efficient.
1216110074, 1216110080, 1216110091, 1216110094, 1216110109, 1216110124 Page
AUTOMATED ATTENDANCE AND CONCENTRATION ANALYSIS SYSTEM
Viola–Jones Face Detection: The Viola–Jones method for face detection combines three techniques. The first is the integral image for feature extraction: the Haar-like features are rectangular, and their values are obtained efficiently from the integral image.
Figure 2.1 An Integral Image whose value will be calculated at point (x, y)
As shown in Figure 2.1, the value of the integral image at point (x, y) is the sum of all the pixels above and to the left.
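The report's implementation is in MATLAB; as an illustrative sketch only, the same construction can be written in a few lines of NumPy, since the integral image is just two cumulative sums, one along each axis:

```python
import numpy as np

def integral_image(img):
    # ii(x, y) = sum of all pixels above and to the left of (x, y), inclusive.
    # Two cumulative sums build the whole table in one pass per axis.
    return np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)

img = np.arange(1, 10).reshape(3, 3)   # [[1,2,3],[4,5,6],[7,8,9]]
ii = integral_image(img)
print(ii)                              # bottom-right entry = sum of the whole image
```

For this toy image the bottom-right entry is 45, the sum of all nine pixels.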
Face recognition systems have been attracting high attention from the commercial market as well as the pattern recognition field, and stand high in the research community. Face recognition has been a fast-growing, challenging and interesting area in real-time applications. A large number of face recognition algorithms have been developed over the decades. The present paper refers to different face recognition approaches and primarily focuses on principal component analysis; the analysis and implementation are done in the free software Scilab. This face recognition system detects the faces in a picture taken by a web-cam or a digital camera, and these face images are then checked against a training image dataset based on descriptive features, which are used to characterize images. MATLAB's IMAQ toolbox is used for performing image analysis.
Face recognition systems have been grabbing high attention from commercial market point
of view as well as pattern recognition field. Face recognition has received substantial attention from researchers in the biometrics, pattern recognition and computer vision
communities. The face recognition systems can extract the features of face and compare
this with the existing database. The faces considered here for comparison are still faces.
Machine recognition of faces from still and video images is emerging as an active research
area. The present paper is formulated based on still or video images captured either by a
digital camera or by a web cam. The face recognition system detects only the faces from
the image scene, extracts the descriptive features. It later compares with the database of
faces, which is collection of faces in different poses.
This paper mainly addresses the building of face recognition system by using Principal
Component Analysis (PCA). PCA is a statistical approach used for reducing the number of
variables in face recognition. In PCA, every image in the training set is represented as a
linear combination of weighted eigenvectors called Eigen faces. These eigenvectors are
obtained from covariance matrix of a training image set. The weights are found out after
selecting a set of most relevant Eigen faces. Recognition is performed by projecting a test
image onto the subspace spanned by the Eigen faces and then classification is done by
measuring minimum Euclidean distance. A number of experiments were done to evaluate
the performance of the face recognition system.
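The PCA pipeline described above (project a test image onto the subspace spanned by the Eigen faces, then classify by minimum Euclidean distance) can be sketched as follows. The paper works in Scilab/MATLAB; this is an illustrative NumPy version with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((6, 64))                # 6 flattened "face" images of 8x8 pixels
mean_face = faces.mean(axis=0)
A = faces - mean_face                      # centered images, one per row

# Eigen faces are the principal components; the SVD of A yields them directly.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:4]                        # keep the components with largest eigenvalues

train_weights = A @ eigenfaces.T           # project every training image onto face space

def recognize(test_img):
    # Project the test image, then classify by minimum Euclidean distance.
    w = (test_img - mean_face) @ eigenfaces.T
    return int(np.argmin(np.linalg.norm(train_weights - w, axis=1)))

print(recognize(faces[3]))                 # a training image maps back to itself: 3
```

Using the SVD of the centered data matrix avoids forming the covariance matrix explicitly while giving the same eigenvectors.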
Over the last ten years or so, face recognition has become a popular area of research in
computer vision and one of the most successful applications of image analysis and
understanding. Because of the nature of the problem, not only computer science researchers but also neuroscientists and psychologists are interested in it. It is the general opinion that advances in computer vision research will provide neuroscientists and psychologists with useful insights into how the human brain works, and vice versa. The goal is to implement the system (model) for a particular face and distinguish it from a large number of stored faces, with some real-time variations as well. It gives us an efficient way to find the lower-dimensional space. Further, this algorithm can be extended to recognize the gender of a person or to interpret his or her facial expression. Recognition can be carried out under widely varying conditions such as a frontal view, a 45° view, a scaled frontal view and subjects with spectacles, while the training data set covers limited views. The algorithm can also model varying real-time lighting conditions, but this is out of the scope of the current implementation. The aim of this research paper is to study and
develop an efficient MATLAB program for face recognition using principal component
analysis and to perform test for program optimization and accuracy. This approach is
preferred due to its simplicity, speed and learning capability.
Eigen faces are a set of eigenvectors used in the computer vision problem of human face
recognition. Eigen faces have a ghostly appearance. They refer to an appearance-based
approach to face recognition that seeks to capture the variation in a collection of face
images and use this information to encode and compare images of individual faces in a
holistic manner. Specifically, the Eigen faces are the principal components of a distribution
of faces, or equivalently, the eigenvectors of the covariance matrix of the set of face
images, where an image with N × N pixels is considered a point (or vector) in N²-dimensional space.
The idea of using principal components to represent human faces was developed by
Sirovich and Kirby and used by Turk and Pentland for face detection and recognition. The
Eigen face approach is considered by many to be the first working facial recognition
technology, and it served as the basis for one of the top commercial face recognition
technology products. Since its initial development and publication, there have been many
extensions to the original method and many new developments in automatic face
recognition systems. Eigen faces are still considered the baseline comparison method to demonstrate the minimum expected performance of such a system. Eigen faces are mostly
used to:
Extract the relevant facial information, which may or may not be directly
related to human intuition of face features such as the eyes, nose, and lips. One
way to do so is to capture the statistical variation between face images.
Represent face images efficiently. To reduce the computation and space
complexity, each face image can be represented using a small number of
dimensions.
The Eigen faces may be considered as a set of features which characterize the global
variation among face images. Then each face image is approximated using a subset of the
Eigen faces, those associated with the largest Eigen values. These features account for the
most variance in the training set. In the language of information theory, we want to extract
the relevant information in face image, encode it as efficiently as possible, and compare
one face with a database of models encoded similarly. A simple approach to extracting the
information contained in an image is to somehow capture the variations in a collection of
face images, independently encode and compare individual face images. Mathematically, it
is simply finding the principal components of the distribution of faces, or the eigenvectors
of the covariance matrix of the set of face images, treating an image as a point or a vector
in a very high dimensional space. The eigenvectors are ordered, each one accounting for a
different amount of the variations among the face images. These eigenvectors can be
imagined as a set of features that together characterize the variation between face images.
Each image location contributes more or less to each eigenvector, so that we can display the eigenvector as a sort of “ghostly” face, which we call an Eigen face. The face images
that are studied are shown in the Figure 2.3, and their respective Eigen faces are shown in
Figure 2.4.
Each of the individual faces can be represented exactly as a linear combination of the Eigen faces. Each face can also be approximated using only the “best” Eigen faces, those with the largest Eigen values, together with the set of face images. The best M Eigen faces span an M-dimensional space called the “Face Space” of all the images. The basic idea of using Eigen faces was proposed by Sirovich and Kirby, as mentioned earlier; using principal component analysis they were successful in representing faces with the above-mentioned analysis. In their analysis, starting with an ensemble of original face images, they calculated a best coordinate system for image compression, where each coordinate is actually an image that they termed an Eigen picture. They argued that, at least in principle, any collection of face images can be approximately reconstructed by storing a small collection of weights for each face and a small set of standard pictures (the Eigen pictures). The weights that describe a face can be calculated by projecting each image onto the Eigen pictures. According to Turk and Pentland [1], face images can be reconstructed as weighted sums of a small collection of characteristic features or Eigen pictures, and an efficient way to learn and recognize faces could be to build up the characteristic features by experience and (approximately) reconstruct faces from the weights associated with known individuals. Each individual would therefore be characterized by a small set of feature or Eigen picture weights.
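The reconstruction argument, a face ≈ mean face + weighted sum of Eigen pictures, can be checked numerically. This NumPy sketch (synthetic data, illustrative only) keeps every component, in which case the reconstruction is exact:

```python
import numpy as np

rng = np.random.default_rng(1)
faces = rng.random((10, 36))               # 10 flattened 6x6 images
mean_face = faces.mean(axis=0)
A = faces - mean_face

_, _, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt                            # keep every Eigen picture

weights = A @ eigenfaces.T                 # project each face onto the Eigen pictures
reconstructed = mean_face + weights @ eigenfaces

print(np.allclose(reconstructed, faces))   # exact when all components are kept: True
```

Dropping all but the few components with the largest eigenvalues turns this into the approximate, compressed representation described in the text.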
Eigen Face Approach: One of the simplest and most effective PCA approaches used in
face recognition systems is the so-called Eigen face approach. This approach transforms
faces into a small set of essential characteristics, Eigen faces, which are the main
components of the initial set of learning images (training set).
Recognition is done by projecting a new image in the Eigen face subspace, after which the
person is classified by comparing its position in Eigen face space with the position of
known individuals. The advantage of this approach over other face recognition systems is
in its simplicity, speed and insensitivity to small or gradual changes on the face.
The problem is limited in the kinds of images that can be used to recognize a face: namely, the images must be vertical frontal views of human faces. The whole recognition process involves two
steps:
Initialization process
Recognition process
Calculate the Eigen faces from the training set, keeping only the M Eigen faces with the highest Eigen values; these define the face space. As new faces are experienced, the Eigen faces can be updated or recalculated.
Calculate distribution in this M-dimensional space for each known person by
projecting his or her face images onto this face-space.
These operations can be performed from time to time whenever there is free excess operational capacity. This data can be cached and used in the further steps, eliminating the overhead of re-initializing and decreasing execution time, thereby increasing the performance of the entire system [4]. Having initialized the system, the next process
involves the steps:
Calculate a set of weights based on the input image and the M Eigen faces by
projecting the input image onto each of the Eigen faces.
Determine if the image is a face at all (known or unknown) by checking whether the image is sufficiently close to the face space.
If it is a face, then classify the weight pattern as either a known person or as
unknown.
Update the Eigen faces or weights as either known or unknown; if the same unknown person's face is seen several times, calculate its characteristic weight pattern.
The last step is not usually a requirement of every system and hence the steps are left
optional and can be implemented when there is a requirement.
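The recognition steps above can be sketched as follows (NumPy, toy data; the two thresholds are illustrative assumptions, not values from the report):

```python
import numpy as np

rng = np.random.default_rng(2)
train = rng.random((5, 16))                 # 5 flattened training faces
mean_face = train.mean(axis=0)
A = train - mean_face
_, _, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:3]                         # M = 3 Eigen faces
train_weights = A @ eigenfaces.T

def classify(img, face_thresh=10.0, known_thresh=1e-6):
    centered = img - mean_face
    # 1. Weights: project the input onto each of the M Eigen faces.
    w = centered @ eigenfaces.T
    # 2. Distance to face space = reconstruction error; too large means "not a face".
    if np.linalg.norm(centered - w @ eigenfaces) > face_thresh:
        return "not a face"
    # 3. Known person (nearest stored weight vector) or unknown.
    d = np.linalg.norm(train_weights - w, axis=1)
    i = int(np.argmin(d))
    return f"person {i}" if d[i] < known_thresh else "unknown"

print(classify(train[2]))                   # a stored face is recognized
```

In a real system both thresholds would be tuned on validation data rather than fixed.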
CHAPTER 3
PROPOSED METHODOLOGY
The system design for the proposed model has been broken down into three key steps:
Registration, Recognition and Concentration Analysis.
3.1.1 Registration:
In this module we are taking video feed as input. To register the images we are using facial
detection. Noise removal, averaging and resizing of images to proper resolution is
performed here. These images form the training database.
3.1.2 Recognition:
In this module we are taking video feed as input, with one student at a time. The face is
recognized with the help of PCA facial recognition and the name of the recognized student
is displayed in the annotation on the video input.
In this module the attendance is marked automatically and results are displayed. This tells
us about the attendance of the student.
In this module, the number of blinks is calculated per frame set, and it is determined whether concentration is increasing or decreasing.
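A minimal sketch of the blink-based trend decision, assuming (as an illustration, not a value from the report) that a higher blink count per frame set indicates lower concentration:

```python
def concentration_trend(blinks_per_set):
    # More blinks in the current frame set than in the previous one is read as
    # decreasing concentration; fewer as increasing; equal counts as steady.
    trend = []
    for prev, cur in zip(blinks_per_set, blinks_per_set[1:]):
        trend.append("decreasing" if cur > prev else
                     "increasing" if cur < prev else "steady")
    return trend

print(concentration_trend([2, 2, 5, 1]))   # ['steady', 'decreasing', 'increasing']
```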
1. Face Detection.
2. Face Recognition.
3. Concentration Analysis.
A face detector has to tell whether an image of arbitrary size contains a human face and if
so, where it is. One natural framework for considering this problem is that of binary
classification, in which a classifier is constructed to minimize the misclassification risk.
Since no objective distribution can describe the actual prior probability for a given image
to have a face, the algorithm must minimize both the false negative and false positive rates
in order to achieve an acceptable performance.
This task requires an accurate numerical description of what sets human faces apart from
other objects. It turns out that these characteristics can be extracted with a remarkable
committee learning algorithm called AdaBoost, which relies on a committee of weak
classifiers to form a strong one through a voting mechanism. A classifier is weak if, in
general, it cannot meet a predefined classification target in error terms.
To study the algorithm in detail, we start with the image features for the classification task.
3.2.1.1 Features:
The Viola-Jones algorithm uses Haar-like features, that is, a scalar product between the image and some Haar-like templates. More precisely, let I and P denote an image and a pattern, both of the same size N × N as shown in Figure 3.6. The feature associated with pattern P of image I is defined by
feature = ∑_{1 ≤ i ≤ N, 1 ≤ j ≤ N} I(i, j) 1_{P(i, j) is white} − ∑_{1 ≤ i ≤ N, 1 ≤ j ≤ N} I(i, j) 1_{P(i, j) is black}
As shown in Figure 3.6, the example rectangle features shown relative to the enclosing
detection window. The sums of the pixels which lie within the White rectangles are
subtracted from the sum of pixels in the grey rectangles. Two-rectangle features are shown
in (A) and (B). Figure (C) shows a three-rectangle feature, and (D) a four-rectangle feature.
To compensate for the effect of different lighting conditions, all the images should be mean- and variance-normalized beforehand. Those images with variance lower than one, having little information of interest in the first place, are left out of consideration.
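Mean and variance normalization, with the variance-below-one rejection mentioned above, might look like this (illustrative NumPy sketch):

```python
import numpy as np

def normalize(window):
    # Mean/variance normalization compensates for lighting differences.
    window = np.asarray(window, dtype=float)
    var = window.var()
    if var < 1.0:
        return None                        # too little information: left out
    return (window - window.mean()) / np.sqrt(var)

out = normalize([[1.0, 5.0], [3.0, 7.0]])  # variance 5 >= 1, so it is kept
```

The returned window has zero mean and unit variance by construction.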
Our face detection procedure classifies images based on the value of simple features. There
are many motivations for using features rather than the pixels directly. The most common
reason is that features can act to encode ad-hoc domain knowledge that is difficult to learn
using a finite quantity of training data. For this system there is also a second critical
motivation for features: the feature-based system operates much faster than a pixel-based
system.
More specifically, we use three kinds of features. The value of a two-rectangle feature is
the difference between the sums of the pixels within two rectangular regions. The regions
have the same size and shape and are horizontally or vertically adjacent as shown in Figure
3.6. A three rectangle feature - computes the sum within two outside rectangles subtracted
from the sum in a center rectangle. Finally a four-rectangle feature computes the difference
between diagonal pairs of rectangles.
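The three kinds of features can be written directly as rectangle sums (NumPy sketch with a toy image; a real detector would use the integral image for speed, and the sign conventions here follow the text):

```python
import numpy as np

def rect_sum(img, r, c, h, w):
    return int(img[r:r + h, c:c + w].sum())

def two_rect(img, r, c, h, w):
    # Difference between two horizontally adjacent, equal-size rectangles.
    return rect_sum(img, r, c, h, w) - rect_sum(img, r, c + w, h, w)

def three_rect(img, r, c, h, w):
    # Sum of the two outside rectangles subtracted from the centre rectangle.
    return (rect_sum(img, r, c + w, h, w)
            - rect_sum(img, r, c, h, w) - rect_sum(img, r, c + 2 * w, h, w))

def four_rect(img, r, c, h, w):
    # Difference between diagonal pairs of rectangles.
    return (rect_sum(img, r, c, h, w) + rect_sum(img, r + h, c + w, h, w)
            - rect_sum(img, r, c + w, h, w) - rect_sum(img, r + h, c, h, w))

img = np.ones((4, 6), dtype=int)   # on a flat image, difference features vanish
```

On a uniform image the two- and four-rectangle features are zero, which is exactly why they respond to contrast edges and lines rather than to absolute brightness.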
Given that the base resolution of the detector is 24 × 24, the exhaustive set of rectangle features is quite large: 160,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.
Rectangle features can be computed very rapidly using an intermediate representation for
the image, which we call the integral image. The integral image at location (x, y) contains the sum of the pixels above and to the left of (x, y), inclusive:
ii(x, y) = ∑_{x′ ≤ x, y′ ≤ y} i(x′, y′)
where ii(x, y) is the integral image and i(x, y) is the original image. Using the following pair of recurrences:
s(x, y) = s(x, y − 1) + i(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y)
(where s(x, y) is the cumulative row sum, s(x, −1) = 0, and ii(−1, y) = 0) the integral image
can be computed in one pass over the original image. Using the integral image any
rectangular sum can be computed in four array references. Clearly the difference between
two rectangular sums can be computed in eight references. Since the two-rectangle features
defined above involve adjacent rectangular sums they can be computed in six array
references, eight in the case of the three-rectangle features, and nine for four-rectangle
features.
The authors point out that in the case of linear operations (e.g. f · g), any invertible linear operation can be applied to f or g if its inverse is applied to the result. For example, in the case of convolution, if the derivative operator is applied both to the image and the kernel, the result must then be double integrated:
f ∗ g = ∫∫ (f′ ∗ g′) ………………………………(v)
Viewed in this framework, computation of the rectangle sum can be expressed as a dot product i · r, where i is the image and r is the boxcar image (with value 1 within the rectangle of interest and 0 outside). This operation can be rewritten
i · r = (∫∫ i) · r″ .……….……………………..(vi)
The integral image is in fact the double integral of the image (first along rows and then
along columns). The second derivative of the rectangle (first in row and then in column)
yields four delta functions at the corners of the rectangle. Evaluation of the second dot
product is accomplished with four array accesses.
As shown in Figure 3.7, the sum of the pixels within rectangle D can be computed with four array references. The value of the integral image at location 1 is the sum of the pixels in rectangle A. The value at location 2 is A + B, at location 3 is A + C, and at location 4 is A + B + C + D. The sum within D can be computed as 4 + 1 − (2 + 3).
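The four-reference computation of Figure 3.7 can be verified numerically; in this illustrative NumPy sketch the integral image carries an extra zero row and column so that every corner reference is well defined:

```python
import numpy as np

img = np.arange(24).reshape(4, 6)
# Pad with a zero row and column so ii[r, c] = sum of img[:r, :c].
ii = np.zeros((5, 7), dtype=int)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def area(r0, c0, r1, c1):
    # Sum of img[r0:r1, c0:c1] using four array references: 4 + 1 - (2 + 3).
    return int(ii[r1, c1] + ii[r0, c0] - ii[r0, c1] - ii[r1, c0])

print(area(1, 2, 3, 5) == img[1:3, 2:5].sum())   # True
```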
How to make sense of these features is the focus of AdaBoost. A classifier maps an observation to a label valued in a finite set. For face detection, it assumes the form f : R^d → {−1, 1}, where 1 means that there is a face, −1 the contrary, and d is the number of Haar-like features extracted from an image. Given the probabilistic weights w_i ∈ R+ assigned to a training set made up of n observation-label pairs (x_i, y_i), AdaBoost aims to iteratively drive down an upper bound of the empirical loss
∑_{i=1}^{n} w_i 1_{y_i ≠ f(x_i)} ..………………………..(vii)
under mild technical conditions. Remarkably, the decision rule constructed by AdaBoost remains reasonably simple, so that it is not prone to overfitting, which means that the empirically learned rule often generalizes well.
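A toy version of this scheme, AdaBoost in which each weak classifier is a decision stump on a single feature (so every boosting round doubles as a feature-selection step), can be sketched as follows; this is an illustration, not the report's implementation:

```python
import numpy as np

def adaboost(X, y, T):
    # Each weak classifier is constrained to depend on a single feature.
    n, d = X.shape
    w = np.full(n, 1.0 / n)                 # probabilistic example weights
    stumps = []
    for _ in range(T):
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, j] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)      # mistakes gain weight for the next round
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def predict(stumps, X):
    s = sum(a * np.where(sg * (X[:, j] - t) >= 0, 1, -1) for a, j, t, sg in stumps)
    return np.where(s >= 0, 1, -1)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
model = adaboost(X, y, T=3)
```

The final strong classifier is the α-weighted vote of the selected stumps, exactly the combination that the cascade stages threshold.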
This section describes an algorithm for constructing a cascade of classifiers which achieves
increased detection performance while radically reducing computation time. The key
insight is that smaller, and therefore more efficient, boosted classifiers can be constructed
which reject many of the negative sub-windows while detecting almost all positive
instances. Simpler classifiers are used to reject the majority of sub-windows before more
complex classifiers are called upon to achieve low false positive rates. Stages in the
cascade are constructed by training classifiers using AdaBoost. Starting with a two-feature
strong classifier, an effective face filter can be obtained by adjusting the strong classifier
threshold to minimize false negatives. The initial AdaBoost threshold,
(1/2) ∑_{t=1}^{T} α_t ……………………………(viii)
is designed to yield a low error rate on the training data. A lower threshold yields
higher detection rates and higher false positive rates. The detection performance of the
two-feature classifier is far from acceptable as a face detection system. Nevertheless the
classifier can significantly reduce the number of sub-windows that need further processing
with very few operations:
The overall form of the detection process is that of a degenerate decision tree, what we call
a “cascade”. A positive result from the first classifier triggers the evaluation of a second
classifier which has also been adjusted to achieve very high detection rates. A positive
result from the second classifier triggers a third classifier, and so on. A negative outcome
at any point leads to the immediate rejection of the sub-window. The structure of the
cascade reflects the fact that within any single image an overwhelming majority of sub-
windows are negative. As such, the cascade attempts to reject as many negatives as
possible at the earliest stage possible. While a positive instance will trigger the evaluation
of every classifier in the cascade, this is an exceedingly rare event.
Much like a decision tree, subsequent classifiers are trained using those examples which
pass through all the previous stages. As a result, the second classifier faces a more difficult
task than the first. The examples which make it through the first stage are “harder” than typical examples. At a given detection rate, deeper classifiers have correspondingly higher
false positive rates.
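The cascade's early-rejection control flow can be sketched in a few lines; the stage score functions and thresholds here are toy stand-ins for the boosted classifiers:

```python
def cascade_classify(window, stages):
    # Degenerate decision tree: a negative at any stage rejects immediately,
    # so most (negative) sub-windows only ever pay for the cheap early stages.
    for score, threshold in stages:
        if score(window) < threshold:
            return False
    return True                             # survived every stage: face

# Toy stages of increasing strictness (illustrative score functions).
stages = [(lambda w: sum(w), 10), (lambda w: max(w), 6)]
print(cascade_classify([7, 8], stages))     # True: passes both stages
print(cascade_classify([1, 1], stages))     # False: rejected by the first stage
```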
1. User selects values for f, the maximum acceptable false positive rate per layer
and d, the minimum acceptable detection rate per layer.
2. User selects target overall false positive rate, Ftarget.
3. P = set of positive examples
4. N = set of negative examples
5. F0 = 1.0; D0 = 1.0
6. i = 0
7. while Fi > Ftarget
7.1. i ← i + 1
7.2. ni = 0; Fi = Fi−1
7.3. while Fi > f × Fi−1
7.3.1. ni ← ni + 1
7.3.2. Use P and N to train a classifier with ni features using AdaBoost
7.3.3. Evaluate current cascaded classifier on validation set to determine Fi and Di.
7.3.4. Decrease threshold for the ith classifier until the current cascaded
classifier has a detection rate of at least d×Di−1 (this also affects Fi )
7.4. N ←∅
7.5. If Fi >Ftarget then evaluate the current cascaded detector on the set of non-
face images and put any false detections into the set N
Since the final detector is insensitive to small changes in translation and scale, multiple
detections will usually occur around each face in a scanned image. The same is often true
of some types of false positives. In practice it often makes sense to return one final detection per face. Toward this end it is useful to post-process the detected sub-windows in
order to combine overlapping detections into a single detection. In these experiments
detections are combined in a very simple fashion. The set of detections are first partitioned
into disjoint subsets. Two detections are in the same subset if their bounding regions
overlap. Each partition yields a single final detection. The corners of the final bounding
region are the average of the corners of all detections in the set. In some cases this post
processing decreases the number of false positives since an overlapping subset of false
positives is reduced to a single detection.
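The merging step, partition the detections into disjoint subsets of overlapping boxes and average the corners of each subset, can be sketched as follows (illustrative):

```python
def overlap(a, b):
    # Boxes as (x0, y0, x1, y1); True if the bounding regions intersect.
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def merge_detections(boxes):
    # Partition detections into disjoint subsets of overlapping boxes, then
    # return one final detection per subset: the average of all its corners.
    groups = []
    for box in boxes:
        hit = [g for g in groups if any(overlap(box, b) for b in g)]
        merged = [box] + [b for g in hit for b in g]
        groups = [g for g in groups if g not in hit] + [merged]
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]

dets = [(0, 0, 10, 10), (2, 2, 12, 12), (50, 50, 60, 60)]
print(merge_detections(dets))   # the first two boxes merge into their average
```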
The face recognition system detects only the faces from the image scene and extracts the descriptive features. It later compares these with the database of faces, which is a collection of faces in different poses. The present system is trained with a database in which the images are taken in different poses: with glasses, and with and without a beard.
Eigen faces are a set of eigenvectors used in the computer vision problem of human face recognition. Eigen faces have a somewhat ghostly appearance. They refer to an appearance-based approach to face recognition that seeks to capture the variation in a collection of face images and use this information to encode and compare images of individual faces in a holistic manner. Specifically, the Eigen faces are the principal components of a distribution of faces, or equivalently, the eigenvectors of the covariance matrix of the set of face images, where an image with N × N pixels is considered a point (or vector) in N²-dimensional space. The idea of using principal components to represent human faces was developed by Sirovich and Kirby and used by Turk and Pentland for face detection and recognition. The Eigen face approach is considered by many to be the first working facial recognition technology, and it served as the basis for one of the top commercial face recognition products. Since its initial development and publication, there have been many extensions to the original method and many new developments in automatic face recognition systems. Eigen faces are still considered the baseline method for demonstrating the minimum expected performance of such a system. Eigen faces are mostly used to:
Extract the relevant facial information, which may or may not be directly
related to human intuition about face features such as the eyes, nose, and lips.
One way to do so is to capture the statistical variation between face images.
Represent face images efficiently. To reduce the computation and space
complexity, each face image can be represented using a small number of
dimensions. The Eigen faces may be considered as a set of features which
characterize the global variation among face images. Each face image is then
approximated using a subset of the Eigen faces, those associated with the
largest Eigen values. These features account for the most variance in the
training set. In the language of information theory, we want to extract the
relevant information in a face image and encode it as efficiently as possible.
Each of the faces can be represented exactly as a linear combination of the Eigen faces. Each face can also be approximated using only the "best" Eigen faces, those having the largest Eigen values, over the set of face images. The best M Eigen faces span an M-dimensional space called the "face space" of all the images. The basic idea of the Eigen faces was proposed by Sirovich and Kirby, as mentioned earlier, using principal component analysis, and they were successful in representing faces with this analysis. In their analysis, starting with an ensemble of original face images, they calculated a best coordinate system for image compression, where each coordinate is actually an image that they termed an Eigen picture. They argued that, at least in principle, any collection of face images can be approximately reconstructed by storing a small collection of weights for each face and a small set of standard pictures (the Eigen pictures). The weights that describe each face are calculated by projecting the image onto each Eigen picture.
According to Turk and Pentland, face images can be reconstructed by weighted sums of a small collection of characteristic features or Eigen pictures, so an efficient way to learn and recognize faces is to build up the characteristic features from experience and to recognize particular faces by comparing the feature weights needed to (approximately) reconstruct them with the weights associated with known individuals.
Each individual, therefore, would be characterized by the small set of feature or Eigen picture weights needed to describe and reconstruct them, which is an extremely compact representation compared with the images themselves.
Face recognition using Eigen faces involves two main operations:
1. Initialization process
2. Recognition process
The initialization operations can be performed from time to time whenever there is free operational capacity. The resulting data can be cached and reused in later steps, eliminating the overhead of re-initializing and decreasing execution time, thereby increasing the performance of the entire system.
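The caching idea above can be sketched as follows. This is a minimal Python illustration (the project itself uses MATLAB); the cache file name and the toy initialisation function standing in for the real Eigen face computation are both hypothetical:

```python
import os
import numpy as np

CACHE = "eigenfaces_cache.npz"  # hypothetical cache file name

def load_or_init(train_images, init_fn):
    """Reuse cached initialisation results when present, else compute and save.
    init_fn(train_images) -> (mean_face, eigenfaces); its cost is paid once."""
    if os.path.exists(CACHE):
        data = np.load(CACHE)
        return data["psi"], data["U"]
    psi, U = init_fn(train_images)
    np.savez(CACHE, psi=psi, U=U)  # cache for later recognition runs
    return psi, U

# Toy initialisation: mean face plus an identity stand-in for the eigenfaces.
faces = np.random.default_rng(1).random((6, 16))
psi, U = load_or_init(faces, lambda g: (g.mean(axis=0), np.eye(16, 4)))
print(psi.shape, U.shape)
```

On the second call the expensive initialisation is skipped entirely, which is the point made in the text: recognition runs pay only the cost of loading the cached data.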
Having initialized the system, the recognition process involves the following steps -
1. Calculate a set of weights based on the input image and the M Eigen faces by
projecting the input image onto each of the Eigen faces.
2. Determine whether the image is a face at all (known or unknown) by checking
whether it is sufficiently close to the "face space".
3. If it is a face, classify the weight pattern as either a known person or
unknown.
4. Update the Eigen faces and/or weights. If the same unknown face is seen
several times, calculate its characteristic weight pattern and incorporate it
into the known faces. This last step is not usually a requirement of every
system, so it is left optional and can be implemented when required.
Let the training set of face images be Γ1, Γ2, …, ΓM. The average face of the set is defined by
Ψ = (1/M) ∑n Γn …………………………(ix)
where the sum runs over n = 1, …, M. Each face differs from the average by the vector
Φi = Γi − Ψ ……………………………(x)
Principal component analysis seeks the orthonormal vectors µk that best capture the distribution of the data; the k-th vector is chosen such that
λk = (1/M) ∑n (µk^T Φn)² …………………………(xi)
is a maximum. The vectors µk and scalars λk are the eigenvectors and Eigen values, respectively, of the covariance matrix
C = (1/M) ∑n Φn Φn^T = A·A^T ………………………(xii)
where the matrix A = [Φ1 Φ2 … ΦM].
The matrix C, however, is N² × N², and determining its N² eigenvectors and Eigen values is an intractable task for typical image sizes. A computationally feasible method is needed to calculate these eigenvectors. If the number of data points in the image space is M (M < N²), there will be only M − 1 meaningful eigenvectors, rather than N². The eigenvectors can be determined by solving a much smaller matrix of order M × M, which reduces the computation from the order of N² (the number of pixels) to M (the number of training images). Therefore we construct the matrix L
L = A^T·A ……………………………….(xiii)
where,
Lmn = Φm^T Φn …………………………..(xiv)
and find the M eigenvectors vl of L. These vectors determine linear combinations of the M training-set face images that form the Eigen faces µl:
µl = ∑k vlk Φk, k = 1, …, M ……………………………..(xv)
where l = 1, …, M.
Once the Eigen faces are created, identification becomes a pattern recognition task. The Eigen faces span an M'-dimensional subspace of the original N²-dimensional image space. The M' significant eigenvectors of the L matrix are chosen as those with the largest associated Eigen values. In the test cases, based on M = 6 face images, M' = 4 Eigen faces were used. The number of Eigen faces to be used is chosen heuristically based on the Eigen values. A new face image Γ is transformed into its Eigen face components (projected into "face space") by the simple operation
Ωk = µk^T (Γ − Ψ) …………………………….(xvi)
where k = 1, …, M'.
The projections form a weight vector
Ω^T = [Ω1 Ω2 … ΩM'] ……………………….(xvii)
that describes the contribution of each Eigen face in representing the input face image, treating the Eigen faces as a basis set for face images. The vector is used to find which of a number of predefined face classes, if any, best describes the face. The simplest method for determining which face class provides the best description of an input face image is to find the face class k that minimizes the Euclidean distance
εk = ||Ω − Ωk|| ……………………………..(xix)
A face is classified as belonging to class k when the minimum εk is below some chosen threshold θε; otherwise the face is classified as "unknown". The distance threshold θε is half the largest distance between any two face images in the training set, which can be expressed mathematically as
θε = (1/2) max ||Ωj − Ωk||, where j, k = 1, …, M.
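The derivation above maps directly onto a few lines of linear algebra. The following is a minimal NumPy sketch (the project itself uses MATLAB) of the training, projection, and classification steps; the random toy data, the M = 6 / M' = 4 choice mirroring the test cases, and the threshold value are illustrative only:

```python
import numpy as np

def train_eigenfaces(gamma, m_prime):
    """gamma: (M, N*N) matrix of flattened face images, one per row."""
    psi = gamma.mean(axis=0)                 # average face, eq. (ix)
    A = (gamma - psi).T                      # columns are Phi_i = Gamma_i - psi, eq. (x)
    L = A.T @ A                              # small M x M surrogate matrix, eq. (xiii)/(xiv)
    vals, vecs = np.linalg.eigh(L)           # eigenvectors of L (ascending order)
    order = np.argsort(vals)[::-1][:m_prime] # keep the M' largest eigenvalues
    U = A @ vecs[:, order]                   # eigenfaces mu_l = sum_k v_lk Phi_k, eq. (xv)
    U /= np.linalg.norm(U, axis=0)           # normalise each eigenface
    return psi, U

def project(psi, U, face):
    """Weight vector Omega: contribution of each eigenface, eq. (xvi)/(xvii)."""
    return U.T @ (face - psi)

def classify(psi, U, known, face, theta):
    """Nearest face class by Euclidean distance, eq. (xix), else unknown (None)."""
    omega = project(psi, U, face)
    dists = [np.linalg.norm(omega - project(psi, U, k)) for k in known]
    best = int(np.argmin(dists))
    return best if dists[best] < theta else None

rng = np.random.default_rng(0)
faces = rng.random((6, 64))                    # M = 6 toy "images" of 8x8 pixels
psi, U = train_eigenfaces(faces, m_prime=4)    # M' = 4 eigenfaces, as in the test cases
print(classify(psi, U, faces, faces[2], theta=1.0))  # a training face matches itself
```

Because L is only M × M, the eigenproblem stays cheap regardless of image resolution, which is exactly the point of the A^T·A construction.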
For simplicity, we are trying to measure the concentration of a single student per trial. For
Concentration Analysis following steps should be followed:
Eye Tracking
o Track the eyes in detected faces to estimate their viewpoint with
respect to the camera/blackboard.
Concentration Quotient Calculation
o Calculate the number of eye-blinks of the student per pre-defined
number of frames.
o Compare each new set of total blinks with the previous set of total
blinks.
o Calculate the concentration percentage using the above collected data.
Based on the concentration percentage we determine whether the student's concentration increases or decreases. The main steps are explained below.
First we detect the user's face with a Haar Cascade Classifier. We again use the Viola-Jones algorithm, this time to detect the ROI of the student, i.e. the eye pair. It works in almost the same way as described previously; the only difference is that it now detects the eye pair instead of a face.
As shown in Figure 3.8, an example of a Haar feature that resembles the eye region, which is darker than the upper cheeks, is applied to a face.
Our main aim is to calculate the number of eye blinks of the student per pre-defined number of frames. For this purpose, thresholding plays the major role. The eye-pair image is converted into binary format using a specified threshold that depends upon the illumination. Our system gives the best results at a threshold of 55.
When the eyes are closed, the binary image shows a completely black region, while when the eyes are open, some white objects are visible, as shown in the figure. This forms the basis of blink detection and is used to calculate the value of s in the algorithm.
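The thresholding logic described above can be sketched as follows. This is an illustrative Python version (the project uses MATLAB); the open-to-closed transition counting is an assumption, since the report does not spell out its exact blink-counting rule:

```python
import numpy as np

THRESHOLD = 55  # intensity cut-off; the report found 55 works best

def eye_is_open(eye_gray):
    """Binarise the eye-pair image: white pixels survive the threshold only
    while the eyes are open, so any white object signals an open eye."""
    binary = eye_gray > THRESHOLD
    return bool(binary.any())

def count_blinks(frames):
    """Count open-to-closed transitions across a sequence of eye-pair frames."""
    blinks, prev_open = 0, True
    for f in frames:
        now_open = eye_is_open(f)
        if prev_open and not now_open:
            blinks += 1        # eyes just closed: one blink counted
        prev_open = now_open
    return blinks

# Toy frames: bright (open), dark (closed), bright, dark -> two blinks.
open_f = np.full((4, 8), 120, dtype=np.uint8)
closed_f = np.full((4, 8), 20, dtype=np.uint8)
print(count_blinks([open_f, closed_f, open_f, closed_f]))  # -> 2
```

A closed-eye frame is entirely black after thresholding (no pixel exceeds 55), while an open-eye frame retains white objects, matching the behaviour described in the text.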
In the next step, we compare each new set of total blinks with the previous set of total blinks. This comparison is used to calculate the concentration percentage from the collected data, which is given by the formula,
Based on the concentration percentage we determine whether the student's concentration increases or decreases.
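Since the report's concentration formula itself is not reproduced in the text, the following Python sketch uses a purely hypothetical formula of the same general shape: the current frame set's blink count is compared with the previous one, and a rise in blinking lowers the percentage:

```python
def concentration_percentage(prev_blinks, curr_blinks):
    """Illustrative metric only -- this is NOT the report's own formula.
    A rise in blinks over the previous frame set lowers the score, while
    steady or reduced blinking keeps it at 100%."""
    if prev_blinks == 0:
        return 100.0 if curr_blinks == 0 else max(0.0, 100.0 - 10.0 * curr_blinks)
    ratio = curr_blinks / prev_blinks
    return max(0.0, min(100.0, 100.0 / ratio)) if ratio > 1 else 100.0

print(concentration_percentage(4, 8))  # blinking doubled -> 50.0
print(concentration_percentage(4, 3))  # blinking fell    -> 100.0
```

Comparing consecutive frame sets rather than using an absolute blink count makes the measure robust to individual differences in baseline blink rate, which is presumably why the report compares each set with the previous one.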
A Data Flow Diagram (DFD) is a graphical representation of the "flow" of data through an
information system, modeling its process aspects. A DFD is often used as a preliminary
step to create an overview of the system, which can later be elaborated. DFDs can also be
used for the visualization of data processing (structured design).
3.4 Advantages
There are various advantages of our system. They are illustrated as follows:-
Reduce errors: Time and attendance software reduces the risk of human error
and ensures an easy, impartial, and orderly approach to addressing specific
needs without any confusion. In fact, time and attendance software has been
shown to have an accuracy rate of more than 99% versus manual systems, by
eliminating errors in data entry and calculations.
Increase productivity: Productivity increases because the process is seamless
and makes day-to-day operations more efficient and convenient.
Reduces manual work: As the system is automated, it does not require
resources such as a handwritten record of students' attendance; instead,
the record is maintained in the database.
The system has fewer hardware requirements than other biometric systems
such as RFID-based ones. It does not require additional components like a
microcontroller; it works with just a camera and a computer.
As the system uses fewer resources, its cost is lower.
The system also reduces human effort.
The system not only marks attendance but also checks the concentration of
a person in the class.
This system uses facial recognition technology and can further be used in
various applications such as surveillance or checking the concentration of
a person while driving.
This system is efficient and works well under ideal conditions.
The system also works in real time.
Hardware Requirements:
RAM: 2 GB
HDD: 5 GB
Software Requirements:
CHAPTER 4
EXPERIMENTAL RESULT
Here face detection is done using the cascade object detector based on the Viola-Jones algorithm. We use a bounding box to mark the faces in the image, detecting the face of each and every person. The Viola-Jones result is efficient, as it detects all the faces in the images.
In this result we used the Viola-Jones algorithm to detect faces in a real-time video. The faces are marked using a rectangular annotation with the label "face".
We have applied our face recognition algorithm, PCA, to match the training images to the test image. Here we have used the KEC database, which was created for testing the algorithm. The database has proper illumination; the algorithm gives efficient results and recognizes most of the images correctly.
We have also applied our algorithm to the standard database, which is properly illuminated. The results obtained with this database were very good: the recognition percentage is 100%.
[Plot: y-axis 0–3 vs. x-axis "No. of Persons": 10, 20, 30, 40]
Figure 4.5 Comparisons between Execution Time of Recognition Module vs. Number of Faces
This graph shows that as the number of persons in the database increases, the average recognition time also increases.
This table provides, for each database, the results of matching the training images with the test images, along with the average time per match for that database.
In this result of concentration analysis, we have detected the eyes by applying Viola-Jones in real time.
CHAPTER 5
CONCLUSION
We have designed a real-time automated attendance system which reduces the time and resources that are required when taking attendance manually. This system uses face detection and recognition technology. The system also tells us whether the student is concentrating in class by calculating the concentration of the person. Various efficient algorithms are used in order to get the desired results. The system works well under ideal conditions, and further improvement can be made for non-ideal conditions such as poor illumination or lighting.
ADVANTAGES:
Reduced errors: time and attendance software reduces the risk of human error
and has been shown to have an accuracy rate of more than 99% versus manual
systems.
Increased productivity: the automated process makes day-to-day operations
more efficient and removes the need for handwritten attendance records,
since the record is maintained in the database.
The system not only marks attendance but also checks the concentration of
a person in the class.
The facial recognition technology used can be extended to applications such
as surveillance and checking the concentration of a person while driving.
The system is efficient and works well under ideal conditions.
SCOPE:
REFERENCES
[2]. Paul Viola and Michael J. Jones, "Robust Real-Time Face Detection",
International Journal of Computer Vision, Vol. 57, No. 2, pp. 137–154, 2004.
[3]. William Robson Schwartz, Huimin Guo, Jonghyun Choi, Larry S. Davis, "Face
Identification Using Large Feature Sets", IEEE Transactions on Image
Processing, Vol. 21, No. 4, 2012.
[5]. Matthew A. Turk and Alex P. Pentland, "Face Recognition Using Eigenfaces",
Proceedings of the IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (CVPR '91), pp. 586–591, 1991.
[6]. Yi-Qing Wang, "An Analysis of the Viola-Jones Face Detection Algorithm",
Image Processing On Line, Vol. 4, pp. 128–148, 2014.
[8]. Patrik Polatsek, "Eye Blink Detection", Proceedings of the 9th Student
Research Conference in Informatics and Information Technologies, Bratislava,
Slovakia, STU, 2013.
[9]. Deepak Ghimire and Joonwhoan Lee, "A Robust Face Detection Method Based on
Skin Color and Edges", Journal of Information Processing Systems, Vol. 9,
2013.
[11]. Richard M. Jiang, Abdul H. Sadka, Huiyu Zhou, "An Automatic Human Face
Detection Method", International Workshop on Content-Based Multimedia
Indexing, IEEE, 2008.
APPENDIX
1. vision.CascadeObjectDetector()
It uses the Viola-Jones algorithm to detect people's faces, eyes, mouths and upper bodies.
2. step()
It runs a System object, such as the cascade detector, on an input image and returns the detected bounding boxes.
3. eig()
It returns the eigenvalues and eigenvectors of a matrix.
4. imfill()
It fills holes in a binary image.
5. bwareaopen()
It removes small objects (below a given pixel count) from a binary image.
6. regionprops()
It measures properties, such as area and centroid, of image regions.
7. set()
It sets the properties of a graphics object.
8. insertObjectAnnotation()
It annotates an image with labelled rectangles or circles, e.g. around detected faces.
9. imread()
It reads an image from a file.
10. imshow()
It displays an image.
11. imcrop()
It creates an interactive crop tool associated with the image displayed in the current figure,
called the target image.
12. imresize()
It resizes an image.
13. mean()
It returns the mean value of the elements of an array.
14. load()
It loads variables from a MAT-file into the workspace.
15. save()
It saves workspace variables to a MAT-file.
16. get()
It queries the properties of a graphics object.
17. cla()
It deletes from the current axes all graphics objects whose handles are not hidden.
18. peekdata()
It returns the most recently acquired frames from a video input object without removing them from the buffer.
19. strcat()
It concatenates strings.
20. strcmp()
It compares strings and returns true when they are identical.