Comparative Study of Optimization Algorithm in Deep CNN-Based Model For Sign Language Recognition - SpringerLink


9/15/21, 12:09 PM Comparative Study of Optimization Algorithm in Deep CNN-Based Model for Sign Language Recognition | SpringerLink

Comparative Study of Optimization Algorithm in Deep CNN-Based Model for Sign Language Recognition
Computer Networks and Inventive Communication Technologies, pp. 463–471
Conference paper | First Online: 14 September 2021
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 75)

Abstract

The learning rate is fundamental to training a neural network, and the learning process itself is steered by optimization algorithms, or optimizers. An optimizer improves the model by updating learnable parameters such as weights and biases, i.e., it minimizes (or maximizes) the error function. In this paper, we examine how an end-to-end CNN model named ASLNET recognizes the alphabets of American Sign Language under various optimizers: Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMSProp), Adaptive Gradient Algorithm (Adagrad), Adaptive Delta (Adadelta), Adaptive Moment Estimation (Adam), Adam with Nesterov Momentum (Nadam), LookAhead, and Rectified Adam (RAdam); of these, LookAhead and RAdam are the most recently developed. To avoid overfitting, traditional data augmentation techniques are applied, and the model is compared with and without augmentation under each optimizer. The experiments are conducted on two NVIDIA Tesla P100 GPUs with a batch size of 64, and the investigation is based on the benchmark ASL Finger Spelling dataset.
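The optimizer comparison described in the abstract hinges on the different parameter-update rules each algorithm applies. As a minimal illustration (not the authors' ASLNET code; the toy loss, step counts, and hyperparameters below are illustrative assumptions), the sketch implements two of the compared optimizers, plain SGD and Adam, minimizing a one-parameter quadratic loss:

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3
    return 2.0 * (w - 3.0)

def sgd(w, lr=0.1, steps=200):
    # Plain gradient descent: w <- w - lr * g
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    # Adam keeps exponential moving averages of the gradient (m)
    # and the squared gradient (v), with bias correction, so the
    # effective step size adapts per parameter.
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

print(sgd(0.0), adam(0.0))  # both values approach the minimizer w = 3
```

The other optimizers the paper compares (RMSProp, Adagrad, Adadelta, Nadam, LookAhead, RAdam) are variations on the same loop, differing in how the step size and momentum terms are accumulated.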

Keywords
Optimization algorithms · Deep CNN · Finger Spelling dataset · Sign language recognition

https://link.springer.com/chapter/10.1007/978-981-16-3728-5_35 1/5



Copyright information
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

About this paper

Cite this paper as:
Rajan R.G., Rajendran P.S. (2022) Comparative Study of Optimization Algorithm in Deep CNN-Based Model for Sign Language Recognition. In: Smys S.,
Bestak R., Palanisamy R., Kotuliak I. (eds) Computer Networks and Inventive Communication Technologies. Lecture Notes on Data Engineering and
Communications Technologies, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-16-3728-5_35

First Online
14 September 2021
DOI
https://doi.org/10.1007/978-981-16-3728-5_35
Publisher Name
Springer, Singapore
Print ISBN
978-981-16-3727-8


Online ISBN
978-981-16-3728-5
eBook Packages
Engineering
Engineering (R0)
