Professional Documents
Culture Documents
Facial Recognition Using Histogram of Gradients and SVM
Facial Recognition Using Histogram of Gradients and SVM
Facial Recognition Using Histogram of Gradients and SVM
net/publication/317418143
CITATIONS READS
19 2,064
2 authors:
Some of the authors of this publication are also working on these related projects:
Computer Vision Based Automated Detection of Object Tracking in Video Surveillance Applications View project
All content following this page was uploaded by Josephine Julina J. K. on 08 November 2017.
Abstract— Face recognition is widely used in computer vision stored in the database or face gallery. The details about the
and in many other biometric applications where security is a face and its different facial components can be obtained based
major concern. The most common problem in recognizing a face on features which are chosen in such a way that they are
arises due to pose variations, different illumination conditions robust to noise and different orientation. Most of the facial
and so on. The main focus of this paper is to recognize whether a recognition problems deal with feature extraction and machine
given face input corresponds to a registered person in the learning techniques. Facial recognition tasks are performed in
database. Face recognition is done using Histogram of Oriented many areas of image and vision applications where security is
Gradients (HOG) technique in AT & T database with an more focused without any second thought of compromising it.
inclusion of a real time subject to evaluate the performance of the
Many applications deal with mug shot images which are used
algorithm. The feature vectors generated by HOG descriptor are
used to train Support Vector Machines (SVM) and results are
to recognize faces of any matched criminals in the database.
verified against a given test input. The proposed method checks The remaining sections of the paper is organized as
whether a test image in different pose and lighting conditions is follows: Section II provides details about related work carried
matched correctly with trained images of the facial database. The in recognition of faces and feature extraction. Section III
results of the proposed approach show minimal false positives provides details about the architecture of the proposed method
and improved detection accuracy. for detecting whether an image corresponds to matched face in
the gallery. Section IV provides qualitative results obtained
Keywords- AT & T facial database; Face recognition; Feature
extraction; Histogram of Oriented Gradients; Support Vector
after testing with images of AT & T facial database [16]. The
Machine system is able to detect 90% of faces over 41 test images
resulting in few false positives. Section V provides conclusion
and future work followed by references.
I. INTRODUCTION
The face is an important identity of a person. It is obvious II. RELATED WORK
that humans are experts in recognizing people at a farther The well-known datasets available to detect and recognize
distance, predicting the identity of a person in more prominent faces and facial expressions [18] includes CVL [21], CMU
occlusions and in different lighting conditions. However, for pose, illumination & expression (CMU - PIE) [25], FERET
the system, it requires many iterations and tuning of threshold [8], extended M2VTS dataset [17], Cohn-Kanade (CK) [24],
parameters to recognize a person and to reject appropriately JAFFE dataset [20] and so on. This paper uses AT & T facial
when a wrong input is given. database [16] and real time images to recognize faces based on
The location of facial components namely eyes, nose, and feature extraction using HOG [10].
mouth constitute prime landmarks in marking the presence of Faces can be detected most generally using Viola-Jones
a face in an image. However, this process would be made algorithm [9]. Facial features being extracted should be
difficult when a person tends to exhibit different expressions invariant to pose variations. Such invariant features can be
and pose variations. Some of the commonly encountered obtained based on Active Shape Model (ASM) [4] and Active
challenges in facial detection and recognition include Appearance Model (AAM) [12]. Features of an image include
variations in lighting conditions, occlusion, wearing of edge details, texture, color information, corners, and interest
spectacles, having more facial hair and so on. Certain points of regions that constitute what is known as the Region
techniques like template matching methods [19], geometry of Interest (ROI). The feature space can be reduced by
based approaches [1], appearance-based models [3] are identifying dominant and unique feature using Principal
adopted to overcome such challenges. Component Analysis (PCA) [5] technique or by means of
Face detection is the first and foremost step in identifying distance metric like Euclidean [2]. Some of the well known
the presence of faces in an image. Then face recognition or facial recognition algorithms use Gabor filters [30], LBP
verification is carried out which checks whether a given test (Local Binary Pattern) [30] and Linear Discriminant Analysis
input containing face is matched with already available faces (LDA) [30]. The raw pixel information does not discriminate
unique pattern of images that are prone to pose variations. based on input data alone. Supervised learning performs
Hence, it is essential to extract information in larger blob areas classification if the output is discrete, else regression if output
based on the application of interest. Most of the feature is continuous. Unsupervised learning deals with clustering as
extraction methods are based on the type of data. There are the pattern is hidden in the data. Some of the algorithms
many features namely SURF (Speeded-up Robust Features) include Support Vector Machine (SVM) [11], Bayesian
[26], FREAK (Fast Retina Keypoint) [27], BRISK (Binary learning [14], neural networks [13], Hidden Markov Model
Robust Invariant Scalable Keypoints) [28], MSER (Maximally (HMM) [7], K-Means [14] and so on. Choosing an appropriate
Stable Extremal Regions) [29], SIFT (Scale-invariant Feature algorithm is somewhat trivial when data is not given
Transformation) [30] and HOG (Histogram of Oriented importance. It is also essential to avoid over-fitting and under
Gradients) [23]. fitting problems while bringing out optimal decision boundary
[11].
The HOG descriptor can overcome the problem of varying
illumination as it is invariant to lighting conditions. It is used
to extract magnitude of edge information and works well even III. SYSTEM ARCHITECTURE
during variations in poses and illumination conditions. HOG The system is designed to handle pose and illumination
works well under such challenging situations as it represents problems in recognition of faces. This paper focuses on
directionality of edge information thereby making it finding an exact match of the face from 40 subjects in AT & T
significant for the study of pattern and structure of the facial database in addition to user defined subject, based on
interested object. The HOG feature descriptor can be used
HOG features and matching. Fig. 2 shows the overall system
along with other feature descriptors [10]. SURF features are
architecture which covers the entire workflow that includes
mainly used in facial recognition applications. FAST/SURF
and Harris/Freak [6] combination lessens illumination setting up of the database, feature extraction using HOG,
distortion to a greater extent. The combination of HOG and facial landmark detection, building up of classifier model and
SVM [7] yields improved detection accuracy. Fig. 1. shows feature matching.
general marking of facial landmarks such as eyes, mouth, and
nose using Harris corner points.
the database are in the same aspect ratio with differences in C. Feature extraction and classification
poses and expressions with constant illumination. Fig. 3 shows Both training and testing phase requires undergoing the
sample images of AT & T facial database. A new subject is same type of feature extraction technique which in this case is
added with 10 different poses taken in varying lighting done by using HOG features.
conditions and different periods of time. The database is split
into a training set and testing set. Fig. 4 shows the images Algorithm: Feature-based facial recognition
present in the testing set.
Input: Images of persons S = {1,2,...41} each with
S i = {1,2,...,10} where i = {1,2,....41} forming a total of
410 images
Output: Matched subject in the database based on training
features F and labels generated by means of HOG feature and
Harris feature matching of different facial components
Training set: T = I 1 , I 2 ,..., I 9 from 41 subjects forming a total
of 369 images
Fig. 3. Sample images from AT & T facial database Testing set: Te = I 10 each from 41 subjects forming a total of
41 images. These images used to test the system are not
included in the training set.
1. Create database by accessing files from original AT &
T database in which original images in .pgm format
are converted to .jpg format to match with inclusion
of real-time images
2. Include 10 new user images by detecting faces using
Viola-Jones face detector
3. Partition the database into training and testing set
4. Extract facial features for all images in the training
Fig. 4. Test input
dataset T
5. Train the dataset T using HOG feature descriptor
B. Pre-processing techniques
Facial alignment and normalization are essential in 6. Choose a test image from the test dataset Te
achieving higher recognition rates as all feature vectors work 7. Extract feature vector F using HOG feature
very well in same feature space. A set of 10 images of the descriptor
newly added self-defined images requires pre-processing to
match with images of the database. All faces of the newly 8. Predict the class of test input using SVM classifier
added subject are cropped using Viola-Jones algorithm [9]. It based on matching data available in training features
uses Haar features and Adaboost [31] classifier to detect faces and labels
in an integral image [9]. All images in the new image set 9. Perform feature matching to display the
should be resized to match with the common size of corresponding matching between a test image and
112X 92 pixels. Fig. 5 shows the set of newly added images in trained images.
the training set.
The model built in the training phase is used to predict the
class of the subject by using SVM classifier. SVM can be
used most frequently as a binary classifier. However, it can
be used to classify multi-class data.
The labels and features are used to train a model which
can be used as an input to the SVM classifier for finding an
exact match of the given test data against relevant subjects
available in the database. SVM is trained to correct even
possible misclassification [7] when an unknown test input is
Fig. 5. Many appearances of same subject in different illumination and poses given for recognition.
¦ TruePositive + ¦ TrueNegative
Accuracy = (1)
¦ TotalPopulation
[7] Martin Schels and Friedhelm Schwenker, "A Multiple Classifier System
Approach for Facial Expressions in Image Sequences Utilizing GMM
Supervectors" in International Conference on Pattern Recognition
[8] P.J. Phillips, H. Moon, S.A. Rizvi, P.J. Rauss, "The FERET Evaluation
Methodology for Face-Recognition Algorithms" in IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090-
1104, October 2000
[9] Paul Viola, Michael J. Jones, "Robust Real-Time Face Detection" in Int.
J. of Computer Vision, vol. 57, no. 2, pp. 137 - 154, 2004
[10] Pierluigi Carcagni, Marco Del Coco, Marco Leo, Cosimo Distante,
"Facial expression recognition and histograms of oriented gradients: a
comprehensive survey" in Springer Open Journal, pp. 1-25, 2015
[11] R. Chellappa, C.L. Wilson, S. Sirohey, "Human and Machine
Recognition of Faces: A Survey" in Proceedings of the IEEE, vol. 83,
no. 5, pp. 705-740, May 1995
[12] Shan Du, Rabab Ward, "Face recognition under pose variations" in
Journal of Franklin Institute, pp. 596-613, 2006
[13] Sun Y, Wang X, Tang X., "Deep learning face representation from
predicting 10,000 classes" in IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), pp. 1898-1898, 2014
[14] Y. Bengio, "Learning deep architectures for AI" in Foundations and
trends in machine learning, vol. 2, no. 1, pp. 1-127, 2009
[15] Zhao, W., Chellappa, R., Phillips, P.J., Rosenfeld, Face recognition: A
literature survey in ACM Computing Survey, vol. 34, no. 4, pp. 399-
485, 2003
[16] http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
[17] http://www.ee.surrey.ac.uk/CVSSP/xm2vtsdb/
[18] http://www.face-rec.org/databases/
[19] http://www.i-programmer.info/babbages-bag/1091-face-recognition.html
[20] http://www.kasrl.org/jaffe.html
[21] http://www.lrv.fri.uni-lj.si/database.html
[22] https://en.wikipedia.org/wiki/Sensitivity_and_specificity
[23] https://in.mathworks.com/help/vision/feature-detection-and-
extraction.html
[24] http://www.pitt.edu/~emotion/downloads.html
[25] http://www.ri.cmu.edu/research_project_detail.html?project_id=418&m
enu_id=261
[26] https://www.cse.unr.edu/~bebis/CS491Y/Papers/Bay06.pdf
[27] https://infoscience.epfl.ch/record/175537/files/2069.pdf
[28] https://e-collection.library.ethz.ch/eserv.php?pid=eth:7684&dsID=eth-
7684-01.pdf
[29] http://in.mathworks.com/help/vision/ref/detectmserfeatures.html
[30] http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
[31] Dwarakesh T P, S Ananda Subramaniam, T Sree Sharmila, Vacant Seat
Detection System using Adaboost and Camshift in International Journal
of Electrical and Computing Engineering, vol. 1, no. 3, pp. 41518, April
2015