A THESIS
Submitted by
KAVIYAPRIYA.G
(910619405001)
MASTER OF ENGINEERING
IN
COMPUTER SCIENCE
APRIL 2021
ANNA UNIVERSITY, CHENNAI
BONAFIDE CERTIFICATE
SIGNATURE SIGNATURE
Dr. D. VIJAYALAKSHMI, M.E., Ph.D. Dr. S. MIRUNA JOE AMALI, M.E., Ph.D.
PROFESSOR AND HEAD PROFESSOR
Computer Science and Engineering Computer Science and Engineering
K.L.N. College of Engineering K.L.N College of Engineering
Pottapaliyam Pottapaliyam
Sivagangai – 630 611 Sivagangai – 630 611
Submitted for the project phase-II viva-voce examination held on ..............
ACKNOWLEDGEMENT
Firstly, I thank almighty God for His presence throughout the project, and
my parents, who helped me in every way possible towards the completion of this
project work. I extend my gratitude to our college President,
Er. G. KAVIYA PRIYA, B.TECH. (IT), K.L.N. College of Engineering, for
making me march towards the glory of success.
ABSTRACT
We build an age and gender classification system that classifies age and gender
with the help of a deep learning framework. Automatic age and gender classification
has lagged behind the tremendous leaps in performance recently reported for the
related task of face recognition. Keypoint features are used: the descriptor contains
the visual description of a patch and is used to compare the similarity between image
features. We show that, by learning representations through the use of a deep
architecture, good results can be obtained even when the amount of learning data is
limited. We evaluate our method on a recent dataset for age and gender estimation
and show that it outperforms existing methods.
TABLE OF CONTENTS
ABSTRACT iii
LIST OF FIGURES vii
1 INTRODUCTION 1
1.1 INTRODUCTION 1
1.2 DOMAIN EXPLANATION 3
1.2.1 Image Processing 3
1.2.2 Images and Pictures 7
1.2.3 Images and Digital Images 8
1.3 IMAGE PROCESSING FUNDAMENTALS 8
1.3.1 Pixel 8
1.3.2 Pixel Connectivity 9
1.3.3 Pixel Values 10
1.3.4 Color Scale 12
1.3.5 Some applications 13
1.3.6 Aspects of Image Processing 14
1.3.7 An Image Processing Task
1.4 EXISTING SYSTEM 16
2 LITERATURE REVIEW 18
3 METHODOLOGY 26
3.1 PROPOSED SYSTEM 26
3.2 SYSTEM ARCHITECTURE 27
3.3 FLOW DIAGRAM 28
4 TESTING OF PRODUCT 29
4.1 TESTING OF PRODUCT 29
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
As for age classification, Young and Niels first put forward age estimation in
1994, roughly dividing people into three groups: kids, youngsters and the aged.
Hayashi studied wrinkle texture and skin analysis based on the Hough transform.
Zhou used boosting as regression, and Geng proposed the Aging Pattern Subspace
method. Recently, Guo presented an approach to extract features related to
senility via an embedded low-dimensional aging manifold obtained from
subspace learning.
In order to avoid the overfitting and enormous computation that come from
expanding the network architecture, GoogLeNet rationally moves from fully
connected to sparsely connected architectures, even inside the convolutions.
However, for numerical computation on non-uniform sparse data structures,
computing infrastructure becomes inefficient. The Inception architecture is used
to cluster sparse structures into denser submatrices. This not only maintains the
sparsity of the network structure but also takes full advantage of the high
computing performance of dense matrices. GoogLeNet adds softmax layers
amid the network to prevent the vanishing gradient problem. These softmax
layers are used to obtain extra training loss. The gradient calculated from this
loss is added to the gradient of the whole network during back-propagation.
We can regard this kind of method as merging sub-networks of various depths
together. Thanks to convolution kernel sharing, computed gradients accumulate;
as a result, the gradient will not fade too much.
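The auxiliary-loss idea above can be sketched in plain Python. This is an illustrative sketch, not GoogLeNet's actual implementation: `softmax` and `cross_entropy` are the standard definitions, the 0.3 auxiliary-loss weight is the value reported for GoogLeNet, and the function names are our own.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating, for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, true_index):
    # Negative log-likelihood of the true class.
    return -math.log(probs[true_index])

def total_loss(main_logits, aux_logits_list, true_index, aux_weight=0.3):
    # Total training loss = main loss + discounted auxiliary losses,
    # one per intermediate softmax classifier.
    loss = cross_entropy(softmax(main_logits), true_index)
    for aux in aux_logits_list:
        loss += aux_weight * cross_entropy(softmax(aux), true_index)
    return loss
```

During back-propagation the auxiliary terms inject gradient into the middle of the network, which is exactly how the intermediate classifiers counteract vanishing gradients.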
A grey scale image is what people normally call a black and white image,
but the name emphasizes that such an image will also include many shades of
grey.
Figure 1.2 Each pixel has a value from 0 (black) to 255 (white). The possible
range of the pixel values depends on the colour depth of the image; here 8 bit =
256 tones or greyscales.
A normal grayscale image has 8-bit colour depth = 256 grayscales. A “true
colour” image has 24-bit colour depth = 3 × 8 bits = 256 × 256 × 256 colours =
~16 million colours. Some grayscale images have more grayscales, for instance 16
bit = 65,536 grayscales. In principle, three 16-bit grayscale images can be combined
to form an image with 65,536 × 65,536 × 65,536 = 281,474,976,710,656 colours.
There are two general groups of ‘images’: vector graphics (or line art) and
bitmaps (pixel-based ‘images’). Some of the most common file formats are:
GIF – mostly used for the web. Has several sub-standards, one of which is the
animated GIF.
JPEG – a very efficient (i.e. much information per byte) destructively compressed
24-bit (16 million colours) bitmap format. Widely used, especially for the web and
Internet (bandwidth-limited).
PSD – a dedicated Photoshop format that keeps all the information in an image,
including all the layers.
We shall be concerned with digital image processing, which involves using
a computer to change the nature of a digital image. It is necessary to realize that
these are two separate but equally important aspects of image processing: a
procedure which satisfies one condition (making an image look better) may be
the very worst procedure for satisfying the other.
Humans like their images to be sharp, clear and detailed; machines prefer their
images to be simple and uncluttered.
Suppose we take an image, a photo, say. For the moment, let’s make things
easy and suppose the photo is black and white (that is, lots of shades of grey), so
no colour. We may consider this image as being a two-dimensional function,
where the function values give the brightness of the image at any given point. We
may assume that in such an image brightness values can be any real numbers in the
range 0.0 (black) to 1.0 (white).
A digital image differs from a photo in that the values are all discrete. Usually
they take on only integer values, with the brightness values ranging from 0 (black)
to 255 (white). A digital image can be considered as a large array of discrete dots,
each of which has a brightness associated with it. These dots are called picture
elements, or more simply pixels. The pixels surrounding a given pixel constitute its
neighborhood. A neighborhood can be characterized by its shape in the same way
as a matrix: we can speak, for example, of a 3×3 neighborhood. Except in very
special circumstances, neighborhoods have odd numbers of rows and columns;
this ensures that the current pixel is in the centre of the neighborhood.
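Extracting such a neighborhood can be written in a few lines of Python. This is a minimal illustrative sketch (the function name and the choice to clamp out-of-image pixels to the border are our own):

```python
def neighborhood(image, row, col, size=3):
    """Return the size x size neighborhood centred on (row, col).

    `size` must be odd so the current pixel sits in the centre;
    pixels falling outside the image are clamped to the border.
    """
    if size % 2 == 0:
        raise ValueError("neighborhood size must be odd")
    h, w = len(image), len(image[0])
    r = size // 2
    return [[image[min(max(row + dr, 0), h - 1)][min(max(col + dc, 0), w - 1)]
             for dc in range(-r, r + 1)]
            for dr in range(-r, r + 1)]
```

For a 3×3 image, `neighborhood(img, 1, 1)` returns the whole image, since the centre pixel's 3×3 neighborhood covers it exactly.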
Any information that may have existed within the area of the pixel before the
image was discretized is lost. However, if the area of each pixel is very small, then
the discrete nature of the image is often not visible to the human eye.
Other pixel shapes and formations can be used, most notably the hexagonal
grid, in which each pixel is a small hexagon. This has some advantages in image
processing, including the fact that pixel connectivity is less ambiguously defined
than with a square grid, but hexagonal grids are not widely used. Part of the reason
is that many image capture systems (e.g. most CCD cameras and scanners)
intrinsically discretize the captured image into a rectangular grid in the first
instance.
First, in order for two pixels to be considered connected, their pixel values
must both be from the same set of values V. For a grayscale image, V might be
any range of graylevels, e.g. V = {22, 23, ..., 40}; for a binary image we simply
have V = {1}.
To formulate the adjacency criterion for connectivity, we first introduce the
notion of a neighborhood. For a pixel p with coordinates (x, y), the set of pixels
given by
N4(p) = {(x+1, y), (x−1, y), (x, y+1), (x, y−1)}
is called its 4-neighbors.
From this we can infer the definitions of 4- and 8-connectivity:
Two pixels p and q, both having values from a set V, are 4-connected if q is
in the set N4(p) and 8-connected if q is in N8(p).
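These definitions translate directly into code. A minimal Python sketch (the names are ours, and the image is represented as a dictionary from coordinates to values purely for brevity):

```python
def n4(p):
    # The four horizontal/vertical neighbors of p = (x, y).
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def n8(p):
    # All eight surrounding pixels, including the diagonals.
    x, y = p
    return {(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)} - {p}

def connected(p, q, values, image, kind=4):
    """p and q are connected if both pixel values lie in `values`
    and q is in the chosen neighborhood of p."""
    neigh = n4(p) if kind == 4 else n8(p)
    return image[p] in values and image[q] in values and q in neigh
```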
A set of pixels in an image which are all connected to each other is called a
connected component. Finding all connected components in an image and marking
each of them with a distinctive label is called connected component labeling.
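Connected component labeling as just described can be implemented with a simple breadth-first flood fill. A plain-Python sketch (assuming a binary image given as a list of rows of 0/1 values; names are ours):

```python
from collections import deque

def label_components(binary, connectivity=4):
    """Label 4- or 8-connected foreground (value 1) components with
    distinct positive integers; background pixels stay 0."""
    h, w = len(binary), len(binary[0])
    if connectivity == 4:
        offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    else:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    labels = [[0] * w for _ in range(h)]
    current = 0
    for i in range(h):
        for j in range(w):
            if binary[i][j] == 1 and labels[i][j] == 0:
                current += 1               # start a new component
                queue = deque([(i, j)])
                labels[i][j] = current
                while queue:               # flood-fill the component
                    r, c = queue.popleft()
                    for dr, dc in offsets:
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and binary[nr][nc] == 1
                                and labels[nr][nc] == 0):
                            labels[nr][nc] = current
                            queue.append((nr, nc))
    return labels, current
```

Note that the choice of connectivity matters: two diagonally touching pixels form one component under 8-connectivity but two components under 4-connectivity.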
Each of the pixels that represents an image stored inside a computer has a
pixel value which describes how bright that pixel is, and/or what color it should be.
In the simplest case of binary images, the pixel value is a 1-bit number indicating
either foreground or background. For a grayscale image, the pixel value is a
single number that represents the brightness of the pixel. The most common pixel
format is the byte image, where this number is stored as an 8-bit integer giving a
range of possible values from 0 to 255. Typically, zero is taken to be black, and
255 is taken to be white. Values in between make up the different shades of gray.
To represent color images, separate red, green and blue components must be
specified for each pixel (assuming an RGB color space), and so the pixel `value' is
actually a vector of three numbers. Often the three different components are stored
as three separate `grayscale' images known as color planes (one for each of red,
green and blue), which have to be recombined when displaying or processing.
Multispectral Images can contain even more than three components for each pixel,
and by extension these are stored in the same kind of way, as a vector pixel value,
or as separate color planes.
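Splitting an RGB image into its three color planes and recombining them, as described above, can be sketched as follows (a toy representation with each pixel stored as an (r, g, b) tuple; names are ours):

```python
def split_planes(rgb_image):
    """Split an H x W image of (r, g, b) tuples into three grayscale planes."""
    red   = [[px[0] for px in row] for row in rgb_image]
    green = [[px[1] for px in row] for row in rgb_image]
    blue  = [[px[2] for px in row] for row in rgb_image]
    return red, green, blue

def merge_planes(red, green, blue):
    """Recombine three planes back into an image of (r, g, b) tuples."""
    return [[(r, g, b) for r, g, b in zip(rr, gr, br)]
            for rr, gr, br in zip(red, green, blue)]
```

Splitting followed by merging is lossless, which is what lets the planes be processed independently and then recombined for display.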
The actual grayscale or color component intensities for each pixel may not
actually be stored explicitly. Often, all that is stored for each pixel is an index into
a colour map in which the actual intensity or colors can be looked up.
Although simple 8-bit integers or vectors of 8-bit integers are the most
common sorts of pixel values used, some image formats support different types of
value, for instance 32-bit signed integers or floating point values. Such values are
extremely useful in image processing as they allow processing to be carried out on
the image where the resulting pixel values are not necessarily 8-bit integers. If this
approach is used, then it is usually necessary to set up a color map which relates
particular ranges of pixel values to particular displayed colors.
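Indexed color as described above amounts to a table lookup. A minimal sketch (the palette entries here are hypothetical):

```python
# Each pixel of an indexed image stores an index into a palette
# (colour map) rather than the colour itself.
palette = {0: (0, 0, 0),        # black
           1: (255, 255, 255),  # white
           2: (255, 0, 0)}      # red

def apply_colormap(indexed, palette):
    """Resolve an indexed image into explicit RGB pixel values."""
    return [[palette[i] for i in row] for row in indexed]
```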
RGB
The RGB color model is an additive color model in which red, green, and
blue light are added together in various ways to reproduce a broad array of colors.
RGB uses additive color mixing and is the basic color model used in television or
any other medium that projects color with light. It is the basic color model used in
computers and for graphics, but it cannot be used for print production.
The secondary colors of RGB – cyan, magenta, and yellow – are formed by
mixing two of the primary colors (red, green or blue) and excluding the third
color. Red and green combine to make yellow, green and blue to make cyan, and
blue and red form magenta. The combination of red, green, and blue in full
intensity makes white.
Figure. 1.6 The additive model of RGB. Red, green, and blue are the primary
stimuli for human color perception and are the primary additive colours.
1. Medicine: inspection and interpretation of images obtained from X-rays, MRI
or CAT scans.
2. Agriculture
3. Industry
4. Law enforcement: fingerprint analysis.
Examples include: highlighting edges, removing noise.
These classes are not disjoint; a given algorithm may be used for both image
enhancement and image restoration. However, we should be able to decide what
it is that we are trying to do with our image: simply make it look better
(enhancement), or remove damage (restoration).
We will look in some detail at a particular real-world task, and see how the
above classes may be used to describe the various stages in performing this task.
The job is to obtain, by an automatic process, the postcodes from envelopes. Here
is how this may be accomplished:
Acquiring the image: First we need to produce a digital image from a
paper envelope. This can be done using either a CCD camera or a scanner.
Preprocessing: This is the step taken before the major image processing
task. The problem here is to perform some basic tasks in order to render the
resulting image more suitable for the job to follow. In this case it may
involve enhancing the contrast, removing noise, or identifying regions likely
to contain the postcode.
Segmentation: Here is where we actually get the postcode; in other words,
we extract from the image that part of it which contains just the postcode.
The existing approach reviews related methods for age and gender classification
and provides a cursory overview of deep convolutional networks. In age
classification, the problem of automatically extracting age-related attributes from
facial images has received increasing attention in recent years and many methods
have been put forth. We note that despite our focus here on age group
classification rather than precise age estimation (i.e., age regression), the survey
below includes methods designed for either task. Early methods for age
estimation are based on calculating ratios between different measurements of facial
features. Once facial features (e.g. eyes, nose, mouth) are localized and their sizes
and distances measured, ratios between them are calculated and used for
classifying the face into different age categories according to hand-crafted rules. In
gender classification, a detailed survey of gender classification methods can be
found in the literature. Here we quickly survey relevant methods. One of the
early methods for gender classification used a neural network trained on a small
set of near-frontal face images. One of the first applications of convolutional
neural networks (CNN) is perhaps the LeNet-5 network described for optical
character recognition. Compared to modern deep CNN, their network was
relatively modest due to the limited computational resources of the time and the
algorithmic challenges of training bigger networks. Deep CNN have additionally
been successfully applied to applications including human pose estimation, face
parsing, facial keypoint detection, speech recognition and action classification.
Disadvantages:
The computational complexity of feature extraction is high.
Less accuracy.
CHAPTER 2
LITERATURE REVIEW
We attempt to close the gap between automatic face recognition capabilities
and those of age and gender estimation methods. To this end, we follow the
successful example laid down by recent face recognition systems: face recognition
techniques described in the last few years have shown that tremendous progress can
be made by the use of deep convolutional neural networks (CNN). We demonstrate
similar gains with a simple network architecture, designed by considering the rather
limited availability of accurate age and gender labels in existing face data sets.
We test our network on the newly released Adience benchmark for age and
gender classification of unfiltered face images. We show that despite the very
challenging nature of the images in the Adience set and the simplicity of our network
design, our method outperforms existing state of the art by substantial margins.
Although these results provide a remarkable baseline for deep-learning-based
approaches, they leave room for improvements by more elaborate system designs,
suggesting that the problem of accurately estimating age and gender in
unconstrained settings, as reflected by the Adience images, remains unsolved. In
order to provide a foothold for the development of more effective future methods, we
make our trained models and classification system publicly available.
Gathering a large, labeled image training set for age and gender estimation
from social image repositories requires either access to personal information on the
subjects appearing in the images (their birth date and gender), which is often private,
or tedious and time-consuming manual labeling.
We trained a large, deep convolutional neural network to classify the 1.2
million high-resolution images in the ImageNet LSVRC-2010 contest into 1000
different classes. On the test data, we achieved top-1 and top-5 error rates of
37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The
neural network, which has 60 million parameters and 650,000 neurons, consists of
five convolutional layers, some of which are followed by max-pooling layers, and
three fully-connected layers with a final 1000-way softmax. To make training faster,
we used non-saturating neurons and a very efficient GPU implementation of the
convolution operation. To reduce overfitting in the fully-connected layers we
employed a recently-developed regularization method called “dropout” that proved
to be very effective. We also entered a variant of this model in the ILSVRC-2012
competition and achieved a winning top-5 test error rate of 15.3%, compared to
26.2% achieved by the second-best entry.
To learn about thousands of objects from millions of images, we need a model
with a large learning capacity. However, the immense complexity of the object
recognition task means that this problem cannot be specified even by a dataset as
large as ImageNet, so our model should also have lots of prior knowledge to
compensate for all the data we don’t have. Convolutional neural networks (CNNs)
constitute one such class of models. Their capacity can be controlled by varying
their depth and breadth, and they also make strong and mostly correct assumptions
about the nature of images (namely, stationarity of statistics and locality of pixel
dependencies).
Despite the attractive qualities of CNNs, and despite the relative efficiency of
their local architecture, they have still been prohibitively expensive to apply in large
scale to high-resolution images. Luckily, current GPUs, paired with a highly-
optimized implementation of 2D convolution, are powerful enough to facilitate the
training of interestingly-large CNNs, and recent datasets such as ImageNet contain
enough labeled examples to train such models without severe overfitting.
or other meaningful quantity (e.g., the location of the endocardial wall in the problem
C).
topology nor to the parameters generated by the training procedure of the neural
network.
The convolutional layers are the core of any CNN. A convolutional layer
consists of several two-dimensional planes of neurons, the so-called feature maps.
Each neuron of a feature map is connected to a small subset of neurons inside the
feature maps of the previous layer, the so-called receptive fields. The receptive fields
of neighboring neurons overlap and the weights of these receptive fields are shared
through all the neurons of the same feature map. The feature maps of a convolutional
layer and its preceding layer are either fully or partially connected (either in a
predefined way or in a randomized manner).
The relatively low amount of data to transfer to the GPU for every pattern and
the big matrices that have to be handled inside the network seem to be appropriate
for GPGPU processing. Furthermore, our experiments showed that the GPU
implementation scales much better than the CPU implementations with increasing
network size.
Carefully designed GPU code for image classification can be up to two orders
of magnitude faster than its CPU counterpart. Hence, to train huge DNN in hours or
days, we implement them on GPU, building upon the work. The training algorithm is
fully online, i.e. weight updates occur after each error back-propagation step. We
will show that properly trained wide and deep DNNs can outperform all previous
methods, and demonstrate that unsupervised initialization/pretraining is not necessary
(although we do not deny that it might help sometimes, especially for datasets with
few samples per class). We also show how combining several DNN columns into a
Multi-column DNN (MCDNN) further decreases the error rate by 30-40%.
CHAPTER 3
METHODOLOGY
Advantages:
Classification using ResNet
Performance analysis
CHAPTER 4
TESTING OF PRODUCT
Testing is vital to the success of the system. System testing makes a logical
assumption that if all parts of the system are correct, the goal will be successfully
achieved. The candidate system is subjected to a variety of tests: on-line response,
volume, stress, recovery and security, and usability tests. A series of tests are
performed before the system is ready for user acceptance testing. Any
engineered product can be tested in one of the following ways. Knowing the
specified function that a product has been designed to perform, tests can be conducted
to demonstrate that each function is fully operational. Knowing the internal working of
a product, tests can be conducted to ensure that “all gears mesh”, that is, the internal
operation of the product performs according to the specification and all internal
components have been adequately exercised.
Unit testing is the testing of each module, after which the integration of the
overall system is done. Unit testing focuses verification efforts on the smallest unit
of software design: the module. This is also known as ‘module testing’. The modules
of the system are tested separately. This testing is carried out during the
programming itself. In this testing step, each module is found to be working
satisfactorily as regards the expected output from the module. There are some
validation checks for the fields. For example, a validation check is done to
verify the data given by the user, where both the format and the validity of the data
entered are checked. It is very easy to find errors and debug the system.
Data can be lost across an interface; one module can have an adverse effect
on another; and sub-functions, when combined, may not produce the desired major
function. Integration testing is systematic testing that can be done with sample data.
The need for integration testing is to find the overall system performance. There are
two types of integration testing. They are:
White box testing is a test case design method that uses the control
structure of the procedural design to derive test cases. Using white box testing
methods, we derived test cases that guarantee that all independent paths within a
module have been exercised at least once.
User acceptance of the system is the key factor for the success of the system.
The system under consideration is tested for user acceptance by constantly keeping in
touch with prospective system users at the time of development and making changes
whenever required.
After performing the validation testing, the next step is output testing of the
proposed system, asking the user about the format required, since no system could
be useful if it does not produce the required output in the specified format. The
output displayed or generated by the system under consideration is examined here.
The output format is considered in two ways: one is the screen and the other is the
printed format. The output format on the screen is found to be correct, as the format
was designed in the system phase according to the user needs. For the hard copy
also, the output comes out as per the requirements specified by the user. Hence the
output testing does not result in any correction in the system.
The active user must be aware of the benefits of using the system.
Their confidence in the software is built up.
Proper guidance is imparted to the user so that he is comfortable in using
the application.
Before going ahead and viewing the system, the user must know that for
viewing the result, the server program should be running in the server. If the server
object is not running on the server, the actual processes will not take place.
To achieve the objectives and benefits expected from the proposed system it
is essential for the people who will be involved to be confident of their role in the
new system. As the system becomes more complex, the need for education and
training becomes more and more important.
performed. This activity accounts for the majority of all efforts expended on software
maintenance.
CHAPTER 5
MODULES DESCRIPTION
MODULES:
Input Image
Face Region Detection
Key-point Features
Classification
Performance Analysis
The first stage of any vision system is the image acquisition stage. Image
acquisition is the digitization and storage of an image. After the image has been
obtained, various methods of processing can be applied to the image to perform the
many different vision tasks required today. First, the input image is captured from a
source file using the uigetfile and imread functions. However, if the image has not
been acquired satisfactorily then the intended tasks may not be achievable, even with
the aid of some form of image enhancement.
The detector applies simple rectangular feature filters, so it does not use
multiplications. The efficiency of the Viola-Jones algorithm can be significantly
increased by first generating the integral image.
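The integral image mentioned here can be computed in a single pass over the pixels; the sum over any rectangle of pixels then requires only four table lookups, which is what makes rectangular features cheap to evaluate. A plain-Python sketch (function names are ours):

```python
def integral_image(gray):
    """Summed-area table: ii[r][c] = sum of all pixels above and to the
    left of (r, c), inclusive."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * w for _ in range(h)]
    for r in range(h):
        run = 0  # running sum of the current row
        for c in range(w):
            run += gray[r][c]
            ii[r][c] = run + (ii[r - 1][c] if r > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of gray[top..bottom][left..right] via four table lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total
```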
1. Set the minimum window size, and the sliding step corresponding to that size.
2. For the chosen window size, slide the window vertically and horizontally with
the same step. At each step, a set of N face recognition filters is applied. If one
filter gives a positive answer, a face is detected in the current window.
3. If the size of the window is the maximum size, stop the procedure. Otherwise
increase the size of the window and the corresponding sliding step to the next
chosen size and go to step 2.
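The three steps above can be sketched as a multi-scale sliding-window loop. This is an illustrative scaffold, not the actual Viola-Jones cascade: the `classify` callback stands in for the set of filters, and the scale factor and step fraction are assumed values:

```python
def detect_faces(width, height, classify, min_size=24, scale=1.25,
                 step_frac=0.1):
    """Multi-scale sliding-window scan.

    `classify(x, y, size)` is a hypothetical hook standing in for the
    filter cascade; it should return True when a face fills that window.
    """
    detections = []
    size = min_size
    while size <= min(width, height):          # step 3: grow the window
        step = max(1, int(size * step_frac))   # sliding step tracks size
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):   # step 2: slide
                if classify(x, y, size):
                    detections.append((x, y, size))
        size = int(size * scale)
    return detections
```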
Image features are small patches that are useful to compute similarities
between images. An image feature is usually composed of a feature keypoint and a
feature descriptor. The keypoint usually contains the patch’s 2D position and other
information, if available, such as the scale and orientation of the image feature. The
descriptor contains the visual description of the patch and is used to compare the
similarity between image features.
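Comparing descriptors, as described, reduces to a distance computation; binary descriptors (e.g. BRIEF/ORB-style) are typically compared with the Hamming distance. A minimal sketch using bit strings (a representation chosen purely for illustration):

```python
def hamming(d1, d2):
    """Number of differing bits between two equal-length binary
    descriptors given as bit strings."""
    assert len(d1) == len(d2)
    return sum(a != b for a, b in zip(d1, d2))

def best_match(query, candidates):
    """Index of the candidate descriptor most similar to `query`
    (smallest Hamming distance)."""
    return min(range(len(candidates)),
               key=lambda i: hamming(query, candidates[i]))
```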
5.4 Classification
To retrain the network for a new classification task, follow the steps of Train Deep
Learning Network to Classify New Images and load VGG-16 instead of GoogLeNet.
net = vgg16 returns a VGG-16 network trained on the ImageNet data set.
This function requires the Deep Learning Toolbox™ Model for VGG-16 Network
support package. If this support package is not installed, then the function provides a
download link.
Resnet
Deeper neural networks are more difficult to train. We present a residual
learning framework to ease the training of networks that are substantially deeper than
those used previously. We explicitly reformulate the layers as learning residual
functions with reference to the layer inputs, instead of learning unreferenced
functions. We provide comprehensive empirical evidence showing that these
residual networks are easier to optimize, and can gain accuracy from considerably
increased depth. On the ImageNet dataset we evaluate residual nets with a depth of
up to 152 layers (8× deeper than VGG nets) but still having lower complexity.
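The residual reformulation described above, y = F(x) + x, can be illustrated with a toy numeric sketch (this is not a trained network, just the skip-connection arithmetic):

```python
def residual_block(x, f):
    """y = F(x) + x: the block learns the residual F instead of the
    full mapping, so an identity mapping only needs F(x) = 0."""
    return [fi + xi for fi, xi in zip(f(x), x)]

# If the residual function outputs zeros, the block is exactly the
# identity - this is why very deep stacks of such blocks remain
# easy to optimize.
identity = residual_block([1.0, 2.0], lambda v: [0.0] * len(v))
```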
Recall: It is the fraction of the total amount of relevant instances that were
actually retrieved.
Sensitivity (also called the true positive rate, the recall, or probability of
detection in some fields) measures the proportion of positives that are
correctly identified as such (e.g., the percentage of sick people who are
correctly identified as having the condition).
Specificity (also called the true negative rate) measures the proportion of
negatives that are correctly identified as such (e.g., the percentage of
healthy people who are correctly identified as not having the condition).
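These measures all follow directly from the confusion-matrix counts (true/false positives and negatives). A small sketch:

```python
def metrics(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts."""
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall / true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision":   tp / (tp + fp),
    }
```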
CHAPTER 6
IMPLEMENTATION OF SYSTEM
Language : Python.
Mouse : Logitech.
Ram : 4GB
Python
It is used for:
Python can be used on a server to create web applications. Python can be used
alongside software to create workflows. Python can connect to database systems. It
can also read and modify files. Python can be used to handle big data and perform
complex mathematics. Python can be used for rapid prototyping, or for production-
ready software development.
Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
Python has a simple syntax similar to the English language. Python has syntax that
allows developers to write programs with fewer lines than some other programming
languages.
Python runs on an interpreter system, meaning that code can be executed as soon
as it is written. This means that prototyping can be very quick. Python can be treated
in a procedural way, an object-orientated way or a functional way.
Python uses English keywords frequently, whereas other languages use punctuation,
and it has fewer syntactical constructions than other languages.
Python is Interactive − you can actually sit at a Python prompt and interact with
the interpreter directly to write your programs.
Python was developed by Guido van Rossum in the late eighties and early
nineties at the National Research Institute for Mathematics and Computer Science in
the Netherlands.
Python is copyrighted. Like Perl, Python source code is now available under the
GNU General Public License (GPL).
Apart from the above-mentioned features, Python has a big list of good features;
a few are listed below:
6.3 ANACONDA
With over 6 million users, the open source Anaconda Distribution is the
fastest and easiest way to do Python and R data science and machine learning on
Linux, Windows, and Mac OS X. It's the industry standard for developing, testing,
and training on a single machine.
Install and launch applications and editors including Jupyter, RStudio, Visual
Studio Code, and Spyder
Manage your local environments and data science projects from a graphical
interface
Connect to Anaconda Cloud or Anaconda Enterprise
Access the latest learning and community resources
6.6 Spyder
Beyond its many built-in features, its abilities can be extended even further
via its plugin system and API. Furthermore, Spyder can also be used as a PyQt5
extension library, allowing developers to build upon its functionality and embed its
components, such as the interactive console, in their own PyQt software.
IPython Console
Harness the power of as many IPython consoles as you like within the flexibility
of a full GUI interface; run your code by line, cell, or file; and render plots right
inline.
Variable Explorer
Interact with and modify variables on the fly: plot a histogram or time series,
edit a DataFrame or NumPy array, sort a collection, dig into nested objects, and more!
Profiler
Debugger
CHAPTER 7
7.1 RESULTS
Performance Measures | Accuracy | Sensitivity | Specificity | Precision | Recall
Value in %           | 93.20    | 99.9        | 76.9        | 89.26     | 99.9
The Table above shows the results obtained for a sample query image. These
performance measures along with NPV, FPR, FNR, FDR are shown in the graph.
Figure 7.1 Bar graph of the performance measures: accuracy, sensitivity,
specificity, precision, recall and prediction value.
7.2 SCREENSHOTS
Resnet
Step 4:
Step 5:
Figure 7.6 The face images converted into layers and trained for the accuracy
value.
52
CHAPTER 8
CONCLUSION
In this work, the problem of face detection for age and gender classification
is addressed using a deep convolutional neural network.
From the results obtained it can be concluded that, first, CNN can be used to
provide improved age and gender classification results, even considering the much
smaller size of contemporary unconstrained image sets labeled for age and gender.
Second, the simplicity of the model implies that more elaborate systems using
more training data may well be capable of substantially improving results beyond
those obtained.
For future work, a deeper CNN architecture and a more robust image
processing algorithm for exact age estimation can be considered. Also, the apparent
feature estimation of the human face will be an interesting research direction to
investigate in the future. The network can be modified to add faces in different
postures and also to improve the accuracy of the system.