06 Abstract

ABSTRACT

Accurate and efficient Object Detection is an important topic in the
advancement of Computer Vision systems. With the rise of autonomous vehicles, smart
video surveillance, facial detection and people counting applications, fast and accurate
Object Detection systems are rising in demand. With the advent of Deep Learning
techniques, the accuracy of Object Detection has increased drastically. These systems
involve the Recognition, Classification and Localization of each object in the image.
This makes Object Detection a significantly harder task than its traditional Computer
Vision predecessor, Image Classification. The present work discusses the recent
advances in Object Detection and Recognition using Deep Learning methods, which
have achieved great success in the field of computer vision and image processing. This
thesis aims to design methods that perform Object Detection with much higher
accuracy than the existing techniques while facilitating acceptable real-time
performance.

One of the major problems in Object Detection is Image Classification,
which is defined as predicting the class of an image. A slightly more complicated
problem is Image Localization, where the image contains a collection of objects. Here,
the system should predict the location and the class of the objects in the image. The
most complicated problem, Object Detection, involves the combination of both
Classification and Localization. A large number of Deep Learning models that perform
the task of Object Detection are available in the literature; of these, the present
research work is confined to the VGG-16, VGG-19, InceptionV3 and Xception
models. The accuracy with which these models detect objects, as reported in the existing
literature, needs to be further improved for higher-end applications. Hence, the present
research work concentrates on improving Object Detection Accuracy beyond the
existing values.

The work in this thesis begins with the basic concepts of Deep Learning and
further explores various Deep Learning architectures. The present research work
focuses on a collection of four Hyper-Parameters that impact the Object Detection
Accuracy: ‘Learning Rate’, ‘Epoch Rate’, ‘Training Batch Size’ and ‘Testing
Batch Size’. Learning Rate is the step size with which the model updates the weights of
the Neural Network during training. Epoch Rate is the number of times the entire
training dataset is passed through the Deep Learning Model. Training Batch Size is the
number of samples of the training dataset that are passed through the Deep Learning
Model in a single iteration. Testing Batch Size is the number of samples of the testing
dataset that are passed through the Deep Learning Model in a single iteration.
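
The following minimal sketch shows how these Hyper-Parameters relate to one another arithmetically. It is an illustration and is not taken from the thesis: the function names and the sample numbers are hypothetical, and the weight update assumes plain gradient descent, which the abstract does not specify.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to pass every sample through the model once."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    # The last batch may hold fewer than batch_size samples, hence the ceiling.
    return math.ceil(num_samples / batch_size)


def total_training_iterations(num_train_samples: int,
                              train_batch_size: int,
                              num_epochs: int) -> int:
    """Total number of training iterations over all epochs."""
    return iterations_per_epoch(num_train_samples, train_batch_size) * num_epochs


def gradient_descent_step(weight: float, gradient: float,
                          learning_rate: float) -> float:
    """One weight update; the learning rate scales the step size.

    Assumption: plain gradient descent. Real optimizers add momentum,
    adaptive scaling and similar refinements.
    """
    return weight - learning_rate * gradient


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(iterations_per_epoch(num_samples=50_000, batch_size=32))
    print(total_training_iterations(50_000, 32, num_epochs=10))
    print(gradient_descent_step(weight=1.0, gradient=0.5, learning_rate=0.1))
```

Under these assumptions, a larger Training Batch Size means fewer iterations per epoch, while a larger number of epochs multiplies the total number of weight updates.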

Thus, determining the optimum values of these Hyper-Parameters is the
core idea of this research work, and it is carried out by a process called Fine-Tuning.
Fine-Tuning is a method of taking the weights of a pre-trained Neural Network and
training the top layers on a new dataset while tuning the Hyper-Parameters. In this
thesis, Fine-Tuning is applied by changing the “Number of Epochs”, “Learning Rate”,
“Training Batch Size” and “Testing Batch Size”. The Prediction Accuracies before and
after Fine-Tuning are analyzed with respect to each Hyper-Parameter, and the optimum
values of these Hyper-Parameters for achieving maximum Object Detection Accuracy
are identified. The results are verified on standard datasets and are found to be better
than the accuracies reported in the existing literature. In addition, various relationships
are formulated between the considered Hyper-Parameters and the dataset sizes for
achieving maximum Object Detection Accuracy.
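
The search for optimum Hyper-Parameter values described above can be sketched as a simple sweep over candidate values. This is a minimal illustration under stated assumptions, not the procedure used in the thesis: the exhaustive grid search, the candidate values, and the names `sweep`, `train_and_evaluate` and `toy_accuracy` are all hypothetical. In practice, `train_and_evaluate` would fine-tune the top layers of a pre-trained network and return its measured accuracy; here a toy scoring function stands in for it so that the sketch runs on its own.

```python
import math
from itertools import product


def sweep(grid, train_and_evaluate):
    """Try every combination in `grid` and return the best one.

    `grid` maps a Hyper-Parameter name to a list of candidate values.
    `train_and_evaluate` takes one configuration (a dict) and returns
    an accuracy score, where higher is better.
    """
    names = list(grid)
    best_config, best_accuracy = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        accuracy = train_and_evaluate(config)
        if accuracy > best_accuracy:
            best_config, best_accuracy = config, accuracy
    return best_config, best_accuracy


def toy_accuracy(config):
    """Placeholder score, NOT a result from the thesis.

    It peaks at an arbitrarily chosen configuration so that the sweep
    has something to find. Replace it with real fine-tuning code.
    """
    penalty = (
        0.1 * abs(math.log10(config["learning_rate"]) + 3)
        + 0.01 * abs(config["num_epochs"] - 20)
        + 0.001 * abs(config["train_batch_size"] - 32)
        + 0.001 * abs(config["test_batch_size"] - 32)
    )
    return 1.0 - penalty


if __name__ == "__main__":
    # Hypothetical candidate values, chosen only for illustration.
    grid = {
        "learning_rate": [1e-2, 1e-3, 1e-4],
        "num_epochs": [10, 20, 30],
        "train_batch_size": [16, 32, 64],
        "test_batch_size": [16, 32, 64],
    }
    best_config, best_accuracy = sweep(grid, toy_accuracy)
    print(best_config, best_accuracy)
```

An exhaustive grid grows multiplicatively with each added Hyper-Parameter, so varying one Hyper-Parameter at a time while holding the others fixed is a cheaper alternative when each training run is expensive.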

The dissertation concludes by training a Deep Learning Model on a new
standard dataset using all the identified Hyper-Parameter values. A significant
improvement in Object Prediction Accuracy is observed after training on the new
dataset with these tuned Hyper-Parameters. Hence, these tuned Hyper-Parameter
values are generalized so that they can be directly applied to any kind of dataset to
obtain the highest accuracy, and can be used in future research works with
minimum time and optimum computational requirements.

The present research work can be utilized in a wide variety of
applications, including National Security Systems, Automated Vehicle Systems, Device
Inspection, Iris Recognition, Face Detection, Disease Identification, Medical
Research and many others.
