
Convolutional Neural Network with an Optimized Backpropagation Technique

A. Agnes Lydia
Department of Computer Science and Engineering
Pondicherry Engineering College
Pondicherry
agneslydia@pec.edu

F. Sagayaraj Francis
Department of Computer Science and Engineering
Pondicherry Engineering College
Pondicherry
fsfrancis@pec.edu

Abstract—This paper presents an object recognition technique using the Convolutional Neural Network in Deep Learning. Backpropagation is a widely used method to calculate the gradient of a curve. The gradient, in turn, is involved in the weight updates while training the deep neural network. Being a repeatedly rediscovered algorithm, Backpropagation still stands out by giving better results with its various optimization techniques. An object recognition technique in deep learning, using Backpropagation optimized with a heuristic optimization technique, is implemented and evaluated.

Index Terms—Backpropagation, Computer Vision, Gradient Methods, Image Processing, Image Retrieval, Neural Network, Optimization, Supervised Learning.

I. INTRODUCTION

Object Recognition is a convergence of Robotics, Machine Learning, Neural Networks and Artificial Intelligence. Object Recognition trains artificial intelligence implementations to recognize various forms of objects [1]. The difficulty in identifying an object arises due to deformations, viewpoint variations, illumination conditions, background clutter and occlusions [2]. To overcome these obstacles, features are extracted from the images and processed in multiple layers of abstraction [3]. Although the existing machine learning algorithms perform well, their lack of multiple layers of processing and of deep analysis of every feature in the image has led to Deep Learning algorithms [4]. Deep Learning is a multi-layer neural network architecture that incorporates different learning algorithms to train the network, based on the nature of the dataset [5].

Deep Learning architectures are classified into two groups: Supervised and Unsupervised Learning networks. The architectures that perform Supervised Learning are the Artificial Neural Network (ANN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN). Unsupervised Learning architectures are Self-Organizing Maps (SOM), Deep Boltzmann Machines (DBM), and Autoencoders [6] [7]. For object recognition in images, the Convolutional Neural Network is widely implemented, as it extracts relevant information at low computational cost without human intervention [8]. The commonly used training algorithm for the Convolutional Neural Network is Backpropagation.

Backpropagation trains neural networks in conjunction with several optimization techniques [9]. These optimization techniques are categorized into two groups: (i) Heuristic Techniques and (ii) Numerical Techniques. The choice of optimization technique varies with the nature of the problem and the dataset.

II. IMAGE RETRIEVAL SYSTEM

An Image Retrieval System is a tool to search, browse, pattern match and retrieve images [10] [11]. This tool works in two ways, based either on text or on content. In a text-based system, the images have text descriptions annotated manually [12]. This reduces the accuracy of the retrieved results. In a content-based system, the contents of the images, like color, texture, and shapes, are indexed [13] [14]. In a Content-Based Image Retrieval (CBIR) system, even the low-level features of the images are automatically extracted, using various techniques from computer vision [15] [16].

A. Deep Learning

Deep Learning, a sub-field of machine learning, marks a permanent change in the behavior of the neural network. Unlike the existing machine learning algorithms, Deep Learning consists of a set of algorithms and network architectures which work together at multiple levels to extract prominent features automatically, without human intervention. Deep Learning architectures can be trained in both a supervised and an unsupervised manner, depending on the problem to be solved [17].

B. Convolutional Neural Network

The Convolutional Neural Network (CNN) is a supervised deep learning architecture. CNN outperforms all the existing image retrieval algorithms [18] [19]. Every CNN consists of four layers: (i) Convolution, (ii) Max Pooling, (iii) Flattening and (iv) Full Connection. These four layers preprocess the input images before they are fed into the first layer of the neural network [20].

Fig. 1. Layers of Convolutional Neural Network

1) Convolution Layer: A 3x3 feature vector matrix (a filter) convolves over the input images, removes the least important pixel values and stores the rest in a feature map. This layer also applies the rectifier function to handle the non-linearity in the images.
2) Max Pooling Layer: Extracts the features in the images and stores them in a pooled feature map.
3) Flattening Layer: Converts the pooled feature map into a single-dimensional vector.
4) Full Connection Layer: An activation function is defined for every layer, based on the nature of the classification problem.
Convolutional Neural Networks are proven to perform well when trained with Backpropagation algorithms, as in the sketch below.
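As an illustration of this four-layer pipeline, the following is a minimal Keras sketch. The filter count, the dense layer width and the 128x128 input size are illustrative assumptions, not the configuration used in the paper.

```python
# A minimal sketch of the four CNN layers described above, in Keras.
# Filter count, dense width and input size are assumptions for illustration.
from tensorflow.keras import layers, models

model = models.Sequential([
    # (i) Convolution: 3x3 filters slide over the image, with the
    # rectifier (ReLU) as the activation
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),
    # (ii) Max Pooling: keeps the strongest response in each 2x2 window
    layers.MaxPooling2D((2, 2)),
    # (iii) Flattening: pooled feature maps -> one-dimensional vector
    layers.Flatten(),
    # (iv) Full Connection: dense layer, then a 100-way Softmax output,
    # one unit per COIL-100 object class
    layers.Dense(128, activation="relu"),
    layers.Dense(100, activation="softmax"),
])
model.summary()
```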
C. Backpropagation

Backpropagation trains the neural network in conjunction with an optimization technique, usually Gradient Descent [21]. Backpropagation calculates the negative of the gradient at the current point. The disadvantage of regular Gradient Descent is that, if the data are non-uniformly distributed, it fails to find the global minimum.

Fig. 2. Stochastic Gradient Descent

To overcome this, Stochastic Gradient Descent (SGD) is implemented. In a neural network, every neuron is assigned random weights, which are used to calculate the gradient. Backpropagation updates these weights to make a prediction closest to the expected output.

Fig. 3. Sample Weighted Neural Network

Steps involved in Backpropagation:
Step 1: Calculate the total net input for each hidden neuron.
net_h1 = (w1 * i1) + (w2 * i2) + b1
net_h2 = (w3 * i1) + (w4 * i2) + b1
Step 2: Apply the activation function and pass the result on to the next layer.
Step 3: Repeat steps (1) and (2) and calculate the total net output for each output neuron, o1 and o2.
Step 4: Calculate the total error of the output layer:
E_total = Σ ½ (target − output)²
Step 5: Calculate the change in each weight that affects the total error.
Step 6: To decrease the error, subtract the value obtained in step (5), multiplied by a learning rate (usually μ = 0.5), from the current weight.
These steps are repeated for each iteration, and the weights of the neurons are updated accordingly.
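To make Steps 1-6 concrete, here is a NumPy sketch of one forward and backward pass on the small two-input, two-hidden, two-output network of Fig. 3. The weights, inputs, targets and the sigmoid activation are illustrative values borrowed from a common worked example, not values taken from this paper.

```python
# A NumPy sketch of Steps 1-6 on the 2-2-2 network of Fig. 3.
# All numeric values here are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

i = np.array([0.05, 0.10])            # inputs i1, i2
t = np.array([0.01, 0.99])            # targets for outputs o1, o2
W1 = np.array([[0.15, 0.20],          # w1, w2 (into h1)
               [0.25, 0.30]])         # w3, w4 (into h2)
W2 = np.array([[0.40, 0.45],          # w5, w6 (into o1)
               [0.50, 0.55]])         # w7, w8 (into o2)
b1, b2, mu = 0.35, 0.60, 0.5          # biases and learning rate

# Steps 1-3: forward pass, e.g. net_h1 = w1*i1 + w2*i2 + b1
net_h = W1 @ i + b1
out_h = sigmoid(net_h)
net_o = W2 @ out_h + b2
out_o = sigmoid(net_o)

# Step 4: total error E_total = sum of 1/2 * (target - output)^2
E_total = np.sum(0.5 * (t - out_o) ** 2)

# Step 5: the chain rule gives the change in error per weight
delta_o = (out_o - t) * out_o * (1.0 - out_o)
dW2 = np.outer(delta_o, out_h)
delta_h = (W2.T @ delta_o) * out_h * (1.0 - out_h)
dW1 = np.outer(delta_h, i)

# Step 6: move each weight against its gradient, scaled by mu
W2 -= mu * dW2
W1 -= mu * dW1
print(f"E_total = {E_total:.6f}")     # ~0.2984 for these values
```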

D. Gradient Descent Optimization Techniques

Gradient Descent optimization techniques are broadly classified as Heuristic and Numerical Optimization techniques [22]. The Heuristic Optimization techniques were developed from an analysis of the performance of the standard steepest descent algorithm. In the regular steepest descent algorithm, the learning rate parameter is fixed, and finding its optimal value is complex. In Heuristic Optimization techniques, the learning rate is adaptively changed during the training process as the algorithm moves across the descending surface, which improves the performance of the algorithm.
In Numerical Optimization techniques, a search is made along conjugate directions, which makes the convergence faster in comparison with the steepest descent algorithms. The numerical techniques therefore tend to outperform the heuristic techniques, although the reverse can also hold, depending on the distribution of the data points.

Fig. 4. Gradient Descent Optimization Techniques
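As a sketch of how a heuristic technique adapts the learning rate per parameter, the following NumPy fragment implements an Adagrad-style update on a toy quadratic loss. Adagrad is one member of the adaptive family evaluated later; the base rate and values here are illustrative.

```python
# Adagrad-style adaptive update on a toy loss, as a sketch only.
import numpy as np

def adagrad_step(w, grad, cache, base_lr=0.1, eps=1e-8):
    # Accumulate squared gradients; the effective step size shrinks for
    # parameters whose gradients have historically been large.
    cache += grad ** 2
    w -= base_lr * grad / (np.sqrt(cache) + eps)
    return w, cache

w = np.array([0.5, -0.3])
cache = np.zeros_like(w)
for step in range(5):
    grad = 2 * w                      # gradient of the toy loss f(w) = w . w
    w, cache = adagrad_step(w, grad, cache)
    print(step, w)
```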
E. Activation Function

The goal of every activation function is to convert the given input into an acceptable output, as defined by the neural network. Without an activation function, a neural network would be limited to solving only linear problems and would fail to process complicated data.
Deep Learning, therefore, involves activation functions to handle complicated, non-linear, voluminous datasets with multiple hidden layers and to extract the features efficiently.
Different activation functions can be defined for every layer, based on the constraints set on every neuron. Though several activation functions are available, the image retrieval system performs well using the ReLU (Rectified Linear Unit) and Softmax activation functions [23].
ReLU is proven to improve convergence better than other activation functions. It also overcomes the vanishing gradient problem in deep learning networks. Softmax works well with multiclass classification problems, with high precision.
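For reference, minimal NumPy versions of these two activation functions, written here as a sketch:

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: passes positive values, zeroes out the rest.
    return np.maximum(0.0, x)

def softmax(x):
    # Normalizes a score vector into class probabilities (stable form).
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([-1.0, 2.0, 0.5])
print(relu(scores))      # [0.  2.  0.5]
print(softmax(scores))   # probabilities that sum to 1
```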
III. DATASET

The dataset used to train the proposed Convolutional Neural Network is the Columbia University Image Library (COIL-100) [24]. COIL-100 has 7,200 color images of 100 different objects, corresponding to 72 different orientations of each object. This dataset is commonly used for object recognition experiments.
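A sketch of loading COIL-100 with the 70:30 split used in Section IV is given below. The directory layout (one sub-folder per object) and the path "coil-100/" are assumptions, not the authors' setup.

```python
# Sketch: load COIL-100 and split 70:30 into training and validation sets.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.30)

train_gen = datagen.flow_from_directory(
    "coil-100/", target_size=(128, 128), class_mode="categorical",
    subset="training")
val_gen = datagen.flow_from_directory(
    "coil-100/", target_size=(128, 128), class_mode="categorical",
    subset="validation")
```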

Fig. 5. COIL-100 dataset

IV. RESULTS

This paper experiments with various optimization techniques to train a Convolutional Neural Network on the COIL-100 image dataset. The dataset is divided into a training and a validation dataset in the ratio 70:30. The network is intended to predict the object in a given image. The activation function used in the input layer is ReLU, and the activation function used in the output layer is Softmax, since Softmax can make a categorical prediction over many objects. That the training is done efficiently can be seen from the similarity between the accuracies on the training and the validation data. The number of iterations for training is increased accordingly. After training the network, a query image is given as input, and the result is returned in one-hot encoded form. The images are automatically indexed in the order in which they are fed into the neural network (Table I).

TABLE I
INDEX VALUES OF THE OBJECTS

Index   Object
0       Alien
1       Brown Mug
2       Cat
3       Cup
4       Deo
5       Floss
6       Lotion
7       Onion

TABLE II
TRAINING RESULTS OF ADAGRAD OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9698              0.9897
2            1.0000              0.9897
3            1.0000              0.9897
4            1.0000              0.9897
5            1.0000              0.9897

TABLE III
TRAINING RESULTS OF ADADELTA OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9510              0.9917
2            0.9994              0.9938
3            0.9999              0.9959
4            1.0000              0.9959
5            1.0000              0.9959

TABLE IV
TRAINING RESULTS OF ADAM OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9641              0.9897
2            0.9999              0.9876
3            0.9963              0.9876
4            1.0000              0.9876
5            1.0000              0.9876

TABLE V
TRAINING RESULTS OF ADAMAX OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9658              0.9814
2            0.9993              0.9855
3            0.9994              0.9855
4            0.9998              0.9897
5            1.0000              0.9855

TABLE VI
TRAINING RESULTS OF NADAM OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9463              0.9649
2            0.9959              0.9773
3            0.9996              0.9731
4            0.9941              0.9814
5            1.0000              0.9835
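The experiment behind Tables II-VIII can be sketched as a loop over Keras optimizer names. Here, build_model stands for the CNN sketched in Section II-B, and train_gen/val_gen for the generators above; all three are assumptions for this sketch, not the authors' code.

```python
# Sketch of the optimizer comparison behind Tables II-VIII.
# `build_model`, `train_gen` and `val_gen` are assumed from earlier sketches.
results = {}
for name in ["adagrad", "adadelta", "adam", "adamax", "nadam", "sgd", "rmsprop"]:
    model = build_model()             # fresh, randomly initialized weights
    model.compile(optimizer=name, loss="categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(train_gen, validation_data=val_gen,
                        epochs=5, verbose=0)
    results[name] = (history.history["accuracy"][-1],
                     history.history["val_accuracy"][-1])
print(results)
```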

TABLE VII
TRAINING RESULTS OF SGD OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.8065              0.9442
2            0.9930              0.9525
3            0.9978              0.9504
4            0.9986              0.9566
5            1.0000              0.9566

TABLE VIII
TRAINING RESULTS OF RMSPROP OPTIMIZER

Iterations   Training Accuracy   Validation Accuracy
1            0.9581              0.9690
2            0.9978              0.9793
3            0.9988              0.9793
4            0.9988              0.9669
5            0.9993              0.9814

TABLE IX
ONE-HOT ENCODED FORM OF THE RESULT

Class        Matching Vector
Alien        0
Brown Mug    0
Cat          0
Cup          0
Deo          0
Floss        0
Lotion       0
Onion        0
Pink Mug     0
Red Truck    0
Soda Can     0
Squeezer     0
Tablet       0
Tomato       1
Toy Car      0
White Mug    0
Wood         0
Yellow Cat   0
Yellow Toy   0
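Decoding the one-hot style result of Table IX amounts to an argmax over the Softmax outputs. A sketch, with model, class_names and query_image assumed from the earlier sketches:

```python
import numpy as np

# `model`, `class_names` and `query_image` are assumed from earlier sketches.
probs = model.predict(query_image[np.newaxis, ...])[0]  # class probabilities
one_hot = (probs == probs.max()).astype(int)            # vector as in Table IX
print(class_names[int(np.argmax(probs))])               # e.g. "Tomato"
```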
As the difference between the training accuracy and the validation accuracy reduces, the performance of the overall network increases.

Fig. 6. Query Image

Fig. 7. Query Image

The optimized neural network is able to recognize the object accurately, irrespective of the background color or similar variations. The experiment is repeated with various learning rates and with the different optimizers listed in Fig. 4. Accurate results were obtained much faster, in fewer iterations, when using the Adagrad optimizer.

V. CONCLUSION

The voluminous growth of data and the need for efficient retrieval techniques have led to numerous algorithms and optimization techniques. With Deep Learning being the state-of-the-art architecture, this paper concentrates on an object recognition technique in Deep Learning. The deep neural network is trained using the Backpropagation algorithm and optimized using different Heuristic techniques. The results have proven that the Adadelta optimizer provides better results on image datasets.
REFERENCES

[1] S. Akcay, M. E. Kundegorski, C. G. Willcocks, and T. P. Breckon, “Using deep convolutional neural network architectures for object classification and detection within x-ray baggage security imagery,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 9, pp. 2203–2215, 2018.
[2] A. K. Jain and A. Vailaya, “Image retrieval using color and shape,” Pattern Recognition, vol. 29, no. 8, pp. 1233–1244, 1996.
[3] A. Gordo, J. Almazán, J. Revaud, and D. Larlus, “Deep image retrieval: Learning global representations for image search,” in European Conference on Computer Vision. Springer, 2016, pp. 241–257.
[4] P. Druzhkov and V. Kustikova, “A survey of deep learning methods and software tools for image classification and object detection,” Pattern Recognition and Image Analysis, vol. 26, no. 1, pp. 9–15, 2016.
[5] D. Mo, “A survey on deep learning: one small step toward AI,” Dept. Computer Science, Univ. of New Mexico, USA, 2012.
[6] A. Krizhevsky and G. E. Hinton, “Using very deep autoencoders for content-based image retrieval,” in ESANN, 2011.
[7] L. Deng, “Three classes of deep learning architectures and their applications: a tutorial survey,” APSIPA Transactions on Signal and Information Processing, 2012.
[8] C.-H. Kuo, Y.-H. Chou, and P.-C. Chang, “Using deep convolutional neural networks for image retrieval,” Electronic Imaging, vol. 2016, no. 2, pp. 1–6, 2016.
[9] S. Lahmiri, “A comparative study of backpropagation algorithms in financial prediction,” International Journal of Computer Science, Engineering and Applications (IJCSEA), vol. 1, no. 4, pp. 15–21, 2011.
[10] B. S. Manjunath and W.-Y. Ma, “Texture features for browsing and retrieval of image data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 837–842, 1996.
[11] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 248–255.
[12] Y. Liu, D. Zhang, and G. Lu, “Region-based image retrieval with high-level semantics using decision tree learning,” Pattern Recognition, vol. 41, no. 8, pp. 2554–2570, 2008.
[13] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, “A survey of content-based image retrieval with high-level semantics,” Pattern Recognition, vol. 40, no. 1, pp. 262–282, 2007.

[14] M. S. Lew, N. Sebe, C. Djeraba, and R. Jain, “Content-based multimedia information retrieval: State of the art and challenges,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 2, no. 1, pp. 1–19, 2006.
[15] N. Upadhyaya and M. Dixit, “A review: Relating low level features to high level semantics in CBIR,” International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 9, no. 3, pp. 433–444, 2016.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[17] L. Deng, “A tutorial survey of architectures, algorithms, and applications for deep learning,” APSIPA Transactions on Signal and Information Processing, vol. 3, 2014.
[18] J. Schlemper, J. Caballero, J. V. Hajnal, A. N. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for dynamic MR image reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 2, pp. 491–503, 2018.
[19] S.-J. Lee, T. Chen, L. Yu, and C.-H. Lai, “Image classification based on the boost convolutional neural network,” IEEE Access, vol. 6, pp. 12755–12768, 2018.
[20] J. E. Sklan, A. J. Plassard, D. Fabbri, and B. A. Landman, “Toward content-based image retrieval with deep convolutional neural networks,” in Medical Imaging 2015: Biomedical Applications in Molecular, Structural, and Functional Imaging, vol. 9417. International Society for Optics and Photonics, 2015, p. 94172C.
[21] S. Tiwari, R. Naresh, and R. Jha, “Comparative study of backpropagation algorithms in neural network based identification of power system,” International Journal of Computer Science & Information Technology, vol. 5, no. 4, p. 93, 2013.
[22] S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv:1609.04747, 2016.
[23] B. Sharma and K. Venugopalan, “Comparison of neural network training functions for hematoma classification in brain CT images,” IOSR Journal of Computer Engineering (IOSR-JCE), vol. 16, no. 1, pp. 31–35, 2014.
[24] E. A. Khalid, Y. Chawki, B. Aksasse, and M. Ouanan, “A new color descriptor for content-based image retrieval: Application to COIL-100,” Journal of Digital Information Management, vol. 13, no. 6, p. 473, 2015.
