Applied Information Processing Systems 2022
Brijesh Iyer
Debashis Ghosh
Valentina Emilia Balas
Editors
Applied Information Processing Systems
Proceedings of ICCET 2021
Advances in Intelligent Systems and Computing
Volume 1354
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland
Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de
Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen , Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft computing
including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuroscience,
artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning paradigms,
machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
Indexed by DBLP, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and
Technology Agency (JST).
All books published in the series are submitted for consideration in Web of Science.
Editors
Brijesh Iyer
Department of Electronics and Telecommunications Engineering
Dr. Babasaheb Ambedkar Technological University
Lonere, India

Debashis Ghosh
Department of Electronics and Computer Engineering
Indian Institute of Technology Roorkee
Roorkee, Uttarakhand, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Dr. Brijesh Iyer received his Ph.D. degree in Electronics and Telecommunication
Engineering from the Indian Institute of Technology, Roorkee, India, in 2015. He is an
Associate Professor in the University Department of E&TC Engineering at Dr. Babasaheb
Ambedkar Technological University, Lonere (a State Technological University). He
is a recipient of the INAE research fellowship in the field of Engineering. He has two
patents to his credit and has authored over 50 research publications in peer-reviewed
journals and conference proceedings. He has also authored five books on curricula as
well as cutting-edge technologies such as sensors and healthcare technology.
He has served as a Program Committee Member of various international conferences
and as a Reviewer for various international journals. His research interests include RF
front-end design for 5G and beyond, IoT, and biomedical image/signal processing.
Dr. Debashis Ghosh is presently working as Full Professor and Head of the Department
of E&CE Engineering at IIT Roorkee. He earned his B.E., M.Sc. (Engineering), and
Ph.D. degrees in 1993, 1996, and 2000, respectively, from MNIT Jaipur and IISc
Bangalore, with E&CE Engineering as his major. He has worked as a Visiting Professor
at many reputed overseas technological institutes and universities. He received the
“Excellence in Teaching” award of Multimedia University, Malaysia, in 2007. He has
vast experience in handling research and consultancy projects at IIT Guwahati and
IIT Roorkee. He has published several research papers in journals and conferences of
international and national repute. His areas of interest include communication systems
and signal processing, cognitive radio and sensor networks, image and video processing,
and computer vision and pattern recognition.
CNN Parameter Adjustment for Brain Tumor Classification
Abstract Brain tumors are among the most prominent and detrimental neurological
disorders, so identifying the category of a brain tumor as early as possible is
imperative for patients. At present, however, determining the tumor type relies
heavily on human factors. To address this issue and improve classification
performance in deep learning, this paper proposes several methods combined with
Convolutional Neural Networks, namely transfer learning, data augmentation, and the
arrangement of Batch Normalization and Dropout. Experimental results show that the
proposed approaches outperform state-of-the-art papers on the benchmark brain tumor
dataset. For each Convolutional Neural Network, the proposed architecture yields
better results than the original method with default parameters. The highest
accuracy obtained in the experiments is 98.8%.
1 Introduction
A brain tumor is one of the most prominent and detrimental neurological disorders,
alongside dementia, stroke, and Parkinson’s disease. A brain tumor, also known as an
intracranial tumor, is an abnormal mass or growth of tissue in which cells grow and
multiply out of control, seemingly unchecked by the mechanisms that control normal
cells. More than 150 different brain tumors have been documented, but they fall into
two main groups, termed primary and metastatic. Approximately 700,000 people from all
walks of life in the United States and other regions of the world have to deal with
brain tumor symptoms every year. Moreover, an estimated 87,000 or more people will
receive a primary brain tumor diagnosis in the year 2020, while the average survival
rate for all malignant brain tumor patients is only 36%. Survival rates also vary by
age and by tumor type or grade, broadly decreasing with age. Therefore, it is
imperative to identify the type of brain tumor as soon as possible after symptoms
appear, so that suitable treatment can be carried out for the patient.
T. P. Ho · V. T. Hoang (B)
Faculty of Information Technology, Ho Chi Minh City Open University,
Ho Chi Minh City, Vietnam
e-mail: vinh.th@ou.edu.vn
T. P. Ho
e-mail: 1751010162toan@ou.edu.vn
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_1
Brain tumor classification has been studied widely in the computer vision
literature, and researchers around the world continue to propose increasingly robust
classification models with high performance and diverse classifiers. Jun Cheng et
al. [7] proposed tumor region augmentation and partition to enhance classification
performance, and applied three feature extraction methods (intensity histogram,
GLCM, and a raw patch-based BoW model) to verify the effectiveness of the proposed
method. The results were favorable, at over 90% for each type of brain tumor.
J. Seetha and S. Selvakumar Raja [19] proposed automatic brain tumor detection based
on Convolutional Neural Networks (CNN), with a deeper architecture design built from
small kernels. The proposed models improved validation accuracy compared with other
algorithms such as SVM and DNN. Parnian Afshar et al. [2] proposed a Capsule Network
model, which can preserve spatial relations thanks to its Routing by Agreement
process. The goal was to classify three groups of brain tumors: Meningioma,
Pituitary, and Glioma. Javaria Amin et al. [3] proposed a fusion process that
combines the structural and texture information of four MRI sequences (T1C, T1,
Flair, and T2) for the detection of brain tumors. A Discrete Wavelet Transform (DWT)
with the Daubechies wavelet kernel was used for the fusion process, which provides a
more informative tumor region than any single MRI sequence. Overall, the success
rates of these studies range from 87% to 98.7% across diverse models and methods.
In 2019, Arshia Rehman et al. [18] used CNN (Convolutional Neural Network) models
such as AlexNet, Inception, and VGG16 with different improvement techniques to
classify three types of brain tumor images (Meningioma, Glioma, and Pituitary).
Specifically, training parameters were adjusted and the models were fine-tuned,
achieving accuracies of 98.6% with VGG16, 98.04% with Inception, and 97.3% with
AlexNet. Deepak and Ameer [8] also used the Inception model, but only for the
feature extraction stage of their two proposed models. For the classification step,
SVM (Support Vector Machine) and KNN (K-Nearest Neighbors) were chosen to diagnose
MRI images, achieving 97.8% and 98.0%, respectively. Also in 2019, Zar Nawab Khan
Swati et al. [20] proposed using VGG19 pre-trained on the ImageNet database,
combined with block-wise fine-tuning from the first block to the sixth block, for
brain tumor classification. The highest accuracy of the transfer learning VGG19
model was 96.13%, obtained when fine-tuning all six blocks. In this paper, we
present several methods that allow base CNN architectures to achieve better
performance in brain tumor classification than their original form.
Nyoman Abiwinanda et al. [1] constructed five simple Convolutional Neural Network
architectures for brain tumor classification in order to show that simple models
could achieve higher results than many complicated models. With only two 2D
convolution layers, two ReLU activations, and two max-pooling layers, they obtained
98.51% for training and 84.19% for validation. In 2020, Zhiguan Huang et al. [14]
constructed and tested a modified CNNBCN (Convolutional Neural Network Based on
Complex Networks) using three algorithms for randomly generating graphs, namely
Erdos–Renyi (ER), Watts–Strogatz (WS), and Barabasi–Albert (BA). It was more
effective than the original CNNBCN in diagnosing brain tumor types from MRI images.
The highest accuracy among the obtained results belonged to CNNBCN-ER, at 95.49%.
Also in 2020, Yakub Bhanothu et al. [5] presented a Faster R-CNN architecture with
the VGG16 model as its base network. Faster R-CNN consists of three primary blocks,
namely RPN, Region of Interest (RoI), and Region-based CNN (R-CNN), for object
classification. The final results for the brain tumor classes Glioma, Meningioma,
and Pituitary were 75.18%, 89.45%, and 68.18%, respectively. Likewise, Kazihise
Ntikurako Guy-Fernand et al. [11] introduced a two-stage model in 2020. Input images
first pass through a visual attention mechanism for training. The acquired knowledge
is then transferred to the proposed architecture as a feature selector that mainly
uses staples of CNNs such as convolutional layers and Batch Normalization layers.
The results were good, at roughly 96%. In February 2020, Milica M. Badza and Marko
C. Barjaktarovic [4] presented a new CNN-based model for brain tumor classification.
MR images are first preprocessed (normalized, resized) and augmented (rotated,
flipped vertically) to enlarge the training image database, and are then passed
through an architecture with two proposed feature extraction blocks that arrange
different kinds of layers. The highest accuracy obtained was around 95%. In early
2020, Preethi Kurian and Vijay Jeyakumar [16] experimented with the CBIR task on
seven separate databases with fifteen classes, implementing the LeNet and AlexNet
architectures and comparing validation accuracy against the number of epochs. Their
results indicated that the more epochs the models are trained for, the more accurate
they become.
This paper is organized as follows. Section 2 introduces the proposed approach for
enhancing different CNN frameworks. Section 3 presents the experimental results, and
Sect. 4 concludes the paper.
2 Proposed Approach
Several versions of popular CNN models, namely AlexNet [21], VGG [22], Inception
[12], MobileNet [13], ResNet [9], and DenseNet [17], are trained with the proposed
methods in order to compare the original results with the proposed ones. Several
major parameter values can be considered as follows:
• Original set values and techniques for original CNNs
– Optimizer: Stochastic Gradient Descent (SGD).
– Activations: ReLU for hidden layers and Softmax for FC layers.
– Regularizations: ModelCheckpoint and EarlyStopping.
Following example settings taken from research papers or websites that explain how
to implement each CNN model, input images are first read and resized to 224×224 for
almost all algorithms; the exception is the Inception architecture (299×299). The
Adam optimizer is applied universally, with a learning rate of 1e-3, instead of SGD.
The settings for both the original and proposed methods differ for each CNN
framework; they are listed in Table 1. For the AlexNet, VGG, and Inception
architectures, training with an Adam learning rate (LR) of 1e-4 is more suitable.
Data augmentation (rotation) is also preferable, not only for these models but also
for DenseNet, to achieve better results. For MobileNet, DenseNet, and ResNet, the
Adam LR can be set slightly higher, at 1e-3, and trained with ReduceLROnPlateau
with a minimum LR of 1e-4. We found these settings to give the best results for the
given deep learning algorithms. Figure 1 illustrates the enhanced architectures of
ResNet, DenseNet, and MobileNet, respectively.
[Fig. 1. Panel title: Proposed Methods. Layers shown in the panels: Batch
Normalization, ReLU, Dropout (0.5), FC layer (512 filters), FC layer (3 filters),
and Softmax, with outputs Glioma, Meningioma, or Pituitary.]
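The learning rate schedule described above can be illustrated with a minimal sketch in plain Python. It mimics the general reduce-on-plateau idea (lower the rate when the validation loss stops improving, but never below a floor) using the starting LR of 1e-3 and the minimum LR of 1e-4 mentioned in the text. The reduction factor, the patience, and the loss values are assumptions chosen for illustration; the text does not report them, and this is not the Keras implementation itself.

```python
def reduce_lr_on_plateau(val_losses, initial_lr=1e-3, min_lr=1e-4,
                         factor=0.5, patience=2):
    """Return the learning rate in effect at each epoch.

    The rate is multiplied by `factor` (assumed value) whenever the
    validation loss has not improved for `patience` (assumed value)
    consecutive epochs, and it never drops below `min_lr`.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)              # rate used during this epoch
        if loss < best:                  # improvement: reset the counter
            best = loss
            wait = 0
        else:                            # no improvement this epoch
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule


# Hypothetical validation losses that stall after a few epochs.
losses = [0.90, 0.70, 0.71, 0.72, 0.69, 0.70, 0.70,
          0.70, 0.70, 0.70, 0.70, 0.70, 0.70]
schedule = reduce_lr_on_plateau(losses)
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

With a stalled loss, the rate steps down from 1e-3 toward the floor of 1e-4 and then stays there, which is the behavior the minimum LR setting is meant to guarantee.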
The proposed approach is evaluated on a benchmark brain tumor database [6], which
consists of 3,064 T1-weighted contrast-enhanced images covering three brain tumor
categories (Glioma, Meningioma, and Pituitary). Table 2 summarizes the
characteristics of this database, and several example images are shown in Fig. 2.
3.2 Results
Table 3 presents the classification performance of the original and proposed
methods, so that their accuracies can be compared to find the most effective
algorithms for classifying the three types of brain tumors. Roughly speaking, almost
all the CNN algorithms show an increase in testing accuracy when the proposed
methods are applied, from around 1% at the least to approximately 12% at the most.
The proposed VGG16 model achieves the highest accuracy, while the proposed MobileNet
architecture shows the least growth. For AlexNet and VGG, the accuracy rises
slightly, by at least 1%, when the parameter values are modified, whereas for
Inception it rises dramatically. Specifically, the proposed Inception architectures
reach 95.48% for Inception v3 and 95.67% for Inception ResNet v2, as opposed to
91.86% for the original Inception v3 and 88.63% for the original Inception ResNet v2
(gains of 3.62 and 7.04 percentage points, respectively). For MobileNet, applying
the proposed methods, which change the architecture, gives a better result of
92.76%, compared with 91.34% for the original.
For the DenseNet architecture, DenseNet 169 shows the largest increase of all
models, almost 12% in accuracy, when the proposed arrangement of Batch Normalization
and Dropout is applied. The accuracies of the other two DenseNet versions also
increase. Furthermore, the accuracy of every ResNet version shows an upward trend
when the proposed arrangement is applied in the FC layer. ResNet 50 achieves the
highest accuracy among them, at 95.88%, compared with 95.65% for ResNet 50 v2,
94.86% for ResNet 101, 95.39% for ResNet 101 v2, 94.04% for ResNet 152, and 95.73%
for ResNet 152 v2.
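To make the idea of arranging Batch Normalization and Dropout in the classification head concrete, the following is a small sketch in plain Python. The ordering used here (Batch Normalization, then ReLU, then Dropout with rate 0.5, then an FC layer with three outputs and Softmax) is an assumption for illustration, as are the toy feature values and weights; the arrangement actually used for each network is the one given in the paper's figure, not this code.

```python
import math
import random


def batch_norm(batch, eps=1e-5):
    """Normalize every feature to zero mean and unit variance over the batch.

    This sketch always uses batch statistics; real frameworks switch to
    running averages at inference time.
    """
    n, dims = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps)
             for j in range(dims)] for row in batch]


def relu(batch):
    return [[max(0.0, v) for v in row] for row in batch]


def dropout(batch, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors; do nothing at inference time."""
    if not training:
        return batch
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row]
            for row in batch]


def dense(batch, weights, bias):
    """Fully connected layer; `weights` holds one weight vector per output."""
    return [[sum(x * w for x, w in zip(row, unit)) + b
             for unit, b in zip(weights, bias)] for row in batch]


def softmax(batch):
    out = []
    for row in batch:
        m = max(row)                      # subtract max for stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def classification_head(features, weights, bias, rng, training):
    """Assumed order: BatchNorm -> ReLU -> Dropout(0.5) -> FC(3) -> Softmax."""
    x = batch_norm(features)
    x = relu(x)
    x = dropout(x, 0.5, rng, training)
    x = dense(x, weights, bias)
    return softmax(x)


# Toy batch of 4 feature vectors (hypothetical values) and a 3-class FC layer.
features = [[0.2, 1.5, -0.3, 0.8],
            [1.1, 0.4, 0.9, -0.2],
            [-0.5, 0.7, 1.2, 0.1],
            [0.6, -1.0, 0.3, 1.4]]
weights = [[0.5, -0.2, 0.1, 0.3],     # Glioma
           [-0.3, 0.4, 0.2, -0.1],    # Meningioma
           [0.1, 0.1, -0.4, 0.2]]     # Pituitary
bias = [0.0, 0.0, 0.0]

probs = classification_head(features, weights, bias,
                            random.Random(0), training=False)
for row in probs:
    print([round(p, 3) for p in row])
```

Placing Dropout after the normalization step means the statistics computed by Batch Normalization are not disturbed by randomly zeroed units, which is one common way to avoid the two techniques working against each other.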
Table 4 compares our results with those of other papers. On the same brain tumor
database, our proposed methods achieve an accuracy of 98.8% solely by modifying
parameter values (Learning Rate, Batch Size, Epochs, K-Fold) rationally and taking
advantage of other techniques, namely the ReduceLROnPlateau class, Data
Augmentation, Batch Normalization, and Dropout.
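Since K-Fold appears among the adjusted parameters, a brief sketch of how a K-fold split partitions a dataset may help. The example below uses plain Python and the 3,064 images of the benchmark database; the value K = 5, the shuffling seed, and the function name `k_fold_indices` are assumptions for illustration, as the text here does not state the value of K that was used.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k (train, test) pairs.

    Every sample lands in exactly one test fold; fold sizes differ by at
    most one when n_samples is not divisible by k.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# 3,064 images as in the benchmark database; K = 5 is an assumed value.
folds = k_fold_indices(3064, 5)
for i, (train, test) in enumerate(folds):
    print(f"fold {i}: {len(train)} training images, {len(test)} test images")
```

Each model is then trained K times, once per fold, and the reported accuracy is typically the average over the K test folds.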
By applying the proposed adjusted parameters and the suggested architecture for
each Convolutional Neural Network, the accuracy has been improved significantly
compared not only with the unmodified algorithms but also with the results of other
papers. Nevertheless, the achieved accuracy does not appear to have reached its
peak. Fine-tuning the proposed models could be an effective way to improve the
accuracy further.
4 Conclusion
This paper proposed methods for enhancing CNN architectures to improve brain tumor
classification performance with a low proportion of training data relative to
testing data. Simply by choosing CNN parameter values that suit the database and the
models, arranging Batch Normalization and Dropout suitably to avoid conflicts, and
applying other useful techniques such as the ReduceLROnPlateau class and Data
Augmentation, all the CNN-based deep learning algorithms in our experiments could
achieve notably better performance than the results reported in other papers.
These experiments show that Batch Normalization and Dropout still work well
together if they are arranged wisely. Moreover, the learning rate of the optimizer
plays a vital role in determining a model's final accuracy across different
databases and deep learning architectures. Nevertheless, our proposed architectures
struggle to reach flawless accuracy for all algorithms, especially Inception,
MobileNet, ResNet, and DenseNet, which remain at around 95%. Future work will strive
to implement the fine-tuning technique in our proposed models and to increase the
volume of the training database so that the models have enough data to learn from,
which could lead to better performance.
References
1. Abiwinanda, N., Hanif, M., Hesaputra, S.T., Handayani, A., Mengko, T.R.: Brain tumor classification using convolutional neural network. In: Lhotska, L., Sukupova, L., Lacković, I., Ibbott, G.S. (eds.) World Congress on Medical Physics and Biomedical Engineering 2018, pp. 183–189. Springer Singapore, Singapore (2019)
2. Afshar, P., Plataniotis, K.N., Mohammadi, A.: Capsule networks for brain tumor classification based on MRI images and coarse tumor boundaries. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1368–1372 (2019)
3. Amin, J., Sharif, M., Gul, N., Yasmin, M., Shad, S.A.: Brain tumor classification based on DWT fusion of MRI sequences using convolutional neural network. Pattern Recognit. Lett. 129, 115–122 (2020)
4. Badza, M., Barjaktarovic, M.: Classification of brain tumors from MRI images using a convolutional neural network. Appl. Sci. 10(03), 1999 (2020)
5. Bhanothu, Y., Kamalakannan, A., Rajamanickam, G.: Detection and classification of brain tumor in MRI images using deep convolutional network. In: 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), pp. 248–252 (2020)
6. Cheng, J.: Brain tumor dataset (2017). https://figshare.com/articles/dataset/brain_tumor_dataset/1512427
7. Cheng, J., Huang, W., Cao, S., Yang, R., Yang, W., Yun, Z., Wang, Z., Feng, Q.: Enhanced performance of brain tumor classification via tumor region augmentation and partition. PLOS ONE 10(10), 1–13 (2015)
8. Deepak, S., Ameer, P.: Brain tumor classification using deep CNN features via transfer learning. Comput. Biol. Med. 111, 103345 (2019)
9. Gong, T., Niu, H.: An implementation of ResNet on the classification of RGB-D images. In: Gao, W., Zhan, J., Fox, G., Lu, X., Stanzione, D. (eds.) Benchmarking, Measuring, and Optimizing, pp. 149–155. Springer International Publishing, Cham (2020)
10. Gumaei, A., Hassan, M.M., Hassan, M.R., Alelaiwi, A., Fortino, G.: A hybrid feature extraction method with regularized extreme learning machine for brain tumor classification. IEEE Access 7, 36266–36273 (2019)
11. Guy-Fernand, K.N., Zhao, J., Sabuni, F.M., Wang, J.: Classification of brain tumor leveraging goal-driven visual attention with the support of transfer learning. In: 2020 Information Communication Technologies Conference (ICTC), pp. 328–332 (2020)
12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
13. Howard, A., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017)
14. Huang, Z., Du, X., Chen, L., Li, Y., Liu, M., Chou, Y., Jin, L.: Convolutional neural network based on complex networks for brain tumor image classification with a modified activation function. IEEE Access 8, 89281–89290 (2020)
15. Kaplan, K., Kaya, Y., Kuncan, M., Ertunç, H.M.: Brain tumor classification using modified local binary patterns (LBP) feature extraction methods. Med. Hypotheses 139, 109696 (2020)
16. Kurian, P., Jeyakumar, V.: 3 - Multimodality medical image retrieval using convolutional neural network. In: Agarwal, B., Balas, V.E., Jain, L.C., Poonia, R.C., Manisha (eds.) Deep Learning Techniques for Biomedical and Health Informatics, pp. 53–95. Academic Press (2020)
17. Liu, Q., Xiang, X., Qin, J., Tan, Y., Tan, J., Luo, Y.: Coverless steganography based on image retrieval of DenseNet features and DWT sequence mapping. Knowl.-Based Syst. 192, 105375 (2020)
18. Rehman, A., Naz, S., Razzak, M., Akram, F., Imran, M.: A deep learning-based framework for automatic brain tumors classification using transfer learning. Circuits, Syst., Signal Process. 39 (2019)
19. Seetha, J., Raja, S.S.: Brain tumor classification using convolutional neural networks. Biomed. Pharmacol. J. 11(3), 1457–1461 (2018)
20. Swati, Z.N.K., Zhao, Q., Kabir, M., Ali, F., Ali, Z., Ahmed, S., Lu, J.: Content-based brain tumor retrieval for MR images using transfer learning. IEEE Access 7, 17809–17822 (2019)
21. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
22. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Advance Fuzzy Radial Basis Function
Neural Network
1 Introduction
Data mining and knowledge discovery from databases (KDD) involve classification
and clustering approaches. The different approaches of artificial intelligence are
shown in Fig. 1. Categorizing the dataset is one of the most important steps in KDD.
Data patterns grouped together on the basis of common characteristics are known as
clusters [1]. Classification methods are used for labeled data, and clustering
methods are used for unlabeled data. A lot of research has been carried out in recent
years on the use of RBFNN in designing clustering algorithms. Many clustering
techniques have been developed for pattern analysis, grouping, decision making,
document retrieval, image segmentation, and data mining; yet many significant
challenges remain in forming the clusters correctly. Density-based, partitional,
and hierarchical are three broad categories of clustering approaches [2]. Artificial
neural networks (ANN) are widely used in machine learning (ML) for solving
classification and clustering problems. An ANN is a three-layer architecture (input,
hidden, and output layers). Two broad categories of ANN are feedforward and
backpropagation. Many researchers have performed clustering and classification using
hyperspheres [3–8]. Determining the number of hidden layers is the most researched
area in ANN. Researchers have proposed many clustering algorithms, including
K-means [9], enhanced K-means [10], subtractive [11], ART [12], fuzzy [13], scatter
[14], output-constricted [15], ant colony [16], artificial fish swarm [17], particle
swarm optimization [18], and genetic algorithm [19]. Recently, Kulkarni et al. [20]
proposed a novel approach in which fuzzy clustering is used with RBFNN.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_2
B. S. Shetty et al.
The major issue in any machine learning algorithm is the creation of nonlinear
boundaries for pattern classification and recognition. Generally, classifiers group
the patterns of the same class into the respective class cluster and perform
machine learning. Pattern classification accuracy increases with the proper
construction of nonlinear boundaries. The proposed approach is an extension of the
algorithm described in [20], with improved classification accuracy. The creation of
optimum linear and nonlinear boundaries is a major objective of the proposed
approach. Additionally, it removes overlap between different classes. The designed
classifier combines fuzzy clustering and RBFNN and has the following advantages over
earlier RBFNNs and fuzzy neural network classifiers.
1. The proposed approach does not make use of any tuning parameter.
2. The Gaussian neurons in the hidden layer of the RBFNN are replaced by fuzzy
neurons. The fuzzy neurons are characterized by the fuzzy membership function
due to which the clustered patterns give 100% training accuracy for any dataset.
3. The learning between the hidden layer and output layer uses optimum spread
fuzzy clustering algorithm instead of traditional least mean square algorithm.
Thus, the proposed AFRBFNN classifier overcomes most of the drawbacks of earlier
RBFNNs and results in a precise classifier for pattern recognition.
The paper is organized as follows. Section 2 briefly describes the basics of RBFNN.
Section 3 elaborates the proposed AFRBFNN classifier architecture. Section 4
describes the AFRBFNN learning and recall algorithms. In Sect. 5, the AFRBFNN is
evaluated using various classifiers and datasets. Conclusions and future work are
stated in Sect. 6. The notation used in this paper is kept consistent with
FRBFNN [20] as far as possible, for reference and comparison purposes.
Radial basis function neural network (RBFNN) is a special type of feedforward net-
work which uses exactly one hidden layer. RBFNNs are widely used in classification
and regression problems. The role of RBF is the transformation of data from nonlin-
ear to linear format before performing the classification. RBF increases dimension
of feature vector and performs classification by transforming d-dimensional feature
vectors to f -dimensional feature vectors where f > d. The RBFNN is a modified
ANN which uses a radial basis function in the hidden layer [21]. Neurons, i.e., clusters,
in the hidden layer are formed during the learning phase and are characterized
by the RBF as their activation function. Quadratic, inverse, and Gaussian are types
of RBF. The core concern over the use of RBFNN is the determination of the
centroids and the widths of the clusters in the hidden layer. The architecture of RBFNN is
composed of input, hidden, and output layers. RBFNN has only one hidden layer
and it is also referred to as a feature vector. For hidden layer formation, the radial
basis function is used. The main property of a radial function is that the membership
value of a pattern changes monotonically (increases or decreases) with its
distance from a centroid. One of the features of RBF is that it gives high accuracy
with quick convergence for dense data. Radial basis functions can be used in
linear and nonlinear models. The three-layer architecture of RBFNN is shown in Fig. 2.
Clusters are formed in the hidden layer and are sometimes referred to as nodes
or neurons. The cluster formation demonstrated in Fig. 10 follows Kulkarni et al. [20].
The clusters formed are represented by H1, H2, …, Hj. Every
14 B. S. Shetty et al.
$$y_i = f\left(\sum_{j=0}^{J} W_{ij}\,\psi_j\right) \qquad (2)$$
where i = 1, 2, . . . , m.
AFRBFNN has three layers namely the input, hidden, and output depicted as FI ,
FH , and FC , respectively, in Fig. 3.
FI = (X 1 , X 2 , X 3 , . . . , X n ). Input Layer
FH = (H11 , H12 , . . . , H3z ). Hidden Layer
FC = (C1 , C2 , C3 , . . . , Cn ). Class Layer
The role of the input layer FI is to accept n-dimensional input as a feature vec-
tor and forward it to the middle layer FH . It does not perform any operation. The
proposed algorithm is applied to input data and nonlinear data gets transformed into
a linear format in the hidden layer. Input data is mapped into unbounded hyperboxes
format in the hidden layer by learning algorithm given in Sect. 4. Hidden layer FH
output is fuzzy hyperboxes (FHBs). If the pattern lies inside the hyperbox region,
the membership value is 1. As the input moves away from the hyperbox region,
the membership value of the input pattern gradually decreases. The formula for
determining the membership function is as shown in Eq. 3.
where l is the Euclidean distance between Xh and Cpj. The weights between FI
and FH are stored in the matrix C, which gives the center points of the FHBs. FC
represents the class layer, and each of the k nodes in this layer represents one class.
Let D be the input training set with P patterns, where the hth input is
denoted by {Xh, dh}, Xh = (xh1, xh2, ..., xhn), and dh is one of the K class labels.
For every class index k where k = 1, 2, ..., K, follow steps from 1 to 6.
Step 1: Let $M^k$ and $O^k$ hold the distances between input patterns of the same class and of
different classes, respectively, and let $\beta_k$ denote the number of input patterns in class $C_k$.
$$M^k = \left[\lVert X_i - X_j \rVert\right]_{\beta_k \times \beta_k}, \quad i, j = 1, 2, \ldots, \beta_k, \quad X_i, X_j \in C_k$$
$$O^k = \left[\lVert X_i - X_j \rVert\right]_{\beta_k \times (p-\beta_k)}, \quad i = 1, 2, \ldots, \beta_k, \; j = 1, 2, \ldots, p-\beta_k, \quad X_i \in C_k, \; X_j \notin C_k$$
$$B^k = \min(O^k)$$
$$\text{radius}^k = \max(B^k)$$
Step 4: (Temporary SET creation) Now collect all the patterns falling under
the cluster created in step 3. Using pattern x kj, the membership function stated in Eq. 3,
and the initial radius radius k, input patterns of class k are collected and stored in the set
SET for cluster formation.
Step 5: (Unbounded Hyperbox Node Creation in Fh Layer) Call Algorithm 1,
OSFC_CR(SET, n), passing the SET derived in the previous step and n, the number
of features of the input pattern. Here, new nodes are created in the Fh layer and
the unbounded hyperboxes are formed.
Step 6: (Update βk count) As βk is the number of patterns of class Ck,
update βk:
n j = COUNT(SET), where the COUNT function returns the number of patterns in SET
βk = βk − nj
where n j is the number of patterns included in the cluster.
Step 7: (Check whether patterns remain in class Ck) If βk > 0, go to step 1.
Step 8: (Class Node Creation in Fc Layer) Creation of a class node in output
layer with label k.
Step 9: (Fh to Fc Link Creation) Make connection between output layer with
respective FHB belonging to class k.
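The outer loop of the steps above (pick a centroid, collect the patterns it covers, decrement the count, repeat while patterns remain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cover_class` is hypothetical, a pattern is treated as covered when its Euclidean distance to the centroid does not exceed the radius (the region where the membership value is 1), and the refinement performed by OSFC_CR in step 5 is not reproduced because Algorithm 1 is not given here.

```python
import math

def cover_class(patterns, labels, k):
    """Greedy cover of class k by clusters (steps 1-4, 6 and 7 only).

    Hypothetical helper: the OSFC_CR refinement of step 5 is omitted.
    Assumes at least one pattern belongs to a class other than k.
    """
    same = [p for p, c in zip(patterns, labels) if c == k]
    other = [p for p, c in zip(patterns, labels) if c != k]
    remaining = list(same)                # beta_k is len(remaining)
    clusters = []
    while remaining:                      # step 7: repeat while beta_k > 0
        # Steps 1-2: distance from each remaining pattern to its nearest
        # pattern of any other class.
        b = [min(math.dist(p, q) for q in other) for p in remaining]
        # Step 3: the pattern with the largest such distance is the centroid,
        # and that distance is the initial radius.
        j = max(range(len(b)), key=b.__getitem__)
        centroid, radius = remaining[j], b[j]
        # Step 4: collect the class-k patterns that fall inside the cluster.
        covered = [p for p in remaining if math.dist(p, centroid) <= radius]
        # Step 6: n_j = COUNT(SET); beta_k shrinks by n_j.
        clusters.append((centroid, radius, len(covered)))
        remaining = [p for p in remaining if math.dist(p, centroid) > radius]
    return clusters
```

The centroid always covers itself, so each pass removes at least one pattern and the loop terminates.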
To evaluate the performance of AFRBFNN classifier, two case studies along with
obtained results have been discussed in the following sub-sections. The learning
algorithm is implemented in MATLAB 2019a.
Figure 6a, b shows the final clusters formed by the MSFC and OSFC algorithms,
respectively. When the above experiment is compared with [20], it is
evident that the radii are reduced by 3.03 and 2.33 in class 1 and class 2,
respectively. The details of class 1 are shown in Fig. 7a and those of class 2 in Fig. 7b. This
figure shows the redundant area occupied by the MSFC method. It also
compares the MSFC method with the OSFC method classwise. When
the proposed algorithm is applied to a larger dataset, it is expected to reduce
the misclassification rate significantly, increase the classification accuracy, and improve decision
making.
Initially, as per the steps in the algorithm, cluster formation can be initiated for any one of the k
classes. In this example, let class 1 start first: the intra-class
M k and inter-class O k distances for all βk patterns of class 1 are calculated using step
1. In steps 2 and 3, the pattern (12, 12) of class 1, taken as centroid with initial radius 2.13,
forms a cluster of class 1 patterns using the membership function stated in Eq. 3. In
the same step, set S is created with all class 1 patterns included in the cluster. The cluster
with optimum centroid and radius is then calculated from the patterns in set S using step
5 of Algorithm 1. The cluster formed before the calculation of the optimum centroid and
radius is like the cluster formed by the MSFC algorithm shown in Fig. 4a. The cluster
formed for class 1 is shown in Fig. 4b. Once the possible clusters for one class
along with their connections with class nodes are done, then as per step 8, the same
procedure is repeated for class 2. As per the MSFC method in [20], the class 2 pattern (17,
17) is selected as centroid with an initial radius of 2.83, as shown in Fig. 5a.
As per the proposed algorithm, the optimum radius 0.5 and centroid (16.5, 16.5)
are calculated using Algorithm 1, and the respective cluster is shown in Fig. 5b. Here, the
architecture diagram consists of 2 input nodes for 2 features, 2 hidden layer nodes
for 2 clusters, and 2 output nodes for 2 classes.
Example 2: In this example, 3 classes and 24 patterns are considered. In order to
compare the proposed OSFC algorithm with the MSFC algorithm [20], the input is taken
from the latter [20]. The patterns with their features and class labels are given in Table 2.
Class 1, 2, and 3 patterns are shown in green, red, and blue, respectively, in
Fig. 8.
After applying Algorithm 1, four clusters are created with optimum radii for
all classes as shown in Fig. 9b. Comparison of MSFC with the proposed OSFC
algorithm is as shown in Fig. 9a, b. Centroid and radius values of all the classes
are given in Table 3. The AFRBFNN architecture is shown in Fig. 10 and it consists
of an input layer with 2 nodes representing 2 features. Four nodes in the hidden
layer represent 4 clusters of three classes. Three nodes in the output layer represent
3 classes, respectively. From both the above examples, we can conclude that the
overlapping region between inter-class clusters is decreased in OSFC algorithm due
to optimum centroid and radius calculation. Reduced overlapping region improves
prediction accuracy and decreases misclassification of patterns. To prove this, the
performance of the same algorithm is evaluated with other classifiers in the next
section.
Fig. 9 Case study 2—Example 2—cluster formation by MSFC and OSFC approach
in [20]. The analysis of the results shows that the classification accuracy of AFRBFNN
is higher than that of all other classifiers listed. The OSFC algorithm improves classification
accuracy more than any of the classifiers in the comparison.
We have proposed a novel precise clustering algorithm AFRBFNN for the clas-
sification of patterns. The proposed algorithm improves the training accuracy by
5–10% approximately over other RBF-based classifiers. Because of optimum clus-
ter centroid calculation and its respective radii value, the number of clusters created
is the same as that of FRBFNN algorithm. This method assures linear separability
of nonlinear input in a more efficient way leading to better classification accuracy.
Unnecessary overlap between inter-class clusters is removed because of optimum
cluster radii value. Removal of overlap leads to an increase in pattern recognition
rate and a decrease in misclassification rate. The improved algorithm gives the opti-
mized classification with better fidelity and accuracy. The proposed algorithm is not
sensitive to any tuning parameter. The number of clusters formed in the hidden layer
is independent of the input order of training data.
The proposed algorithm can be further enhanced by using the feature selection and
dimension reduction method. Also, pattern classification accuracy can be increased
by the creation of k more clusters using the k-means algorithm.
References
1. Bindra, K., Mishra, A.: “A detailed study of clustering algorithms”. In: International Confer-
ence on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future
Directions), Sep. 20–22: AIIT. Amity University Uttar Pradesh, Noida, India (2017)
2. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. 31(3),
264–323 (1999)
3. Kulkarni, U.V., Sontakke, T.R.: Fuzzy hypersphere neural network classifier. In: 10th Interna-
tional Conference on Fuzzy Systems, Melbourne, Victoria, Australia, pp. 1559–1562 (2001)
4. Kulkarni, U.V., Doye, D.D., Sontakke, T.R.: General fuzzy hypersphere neural network. In:
Proceedings of the 2002 International Joint Conference on Neural Network, Honolulu, HI,
USA, vol. 3, pp. 2369–2374 (2002)
5. Doye, D.D., Kulkarni, U.V., Sontakke, T.R.: Speech recognition using modified fuzzy hyper-
sphere neural network. In: Proceedings of the International Joint Conference on Neural Net-
works (IJCNN02), Honolulu, Hawaii, vol. 1, pp. 65–68 (2002)
6. Patil, P.M., Kulkarni, U.V., Sontakke T.R.: Modular fuzzy hypersphere neural network. In:
The 12th IEEE International Conference on Fuzzy Systems, St Louis, MO, USA, vol. 1, pp.
232–236 (2003)
7. Patil, P.M., Kulkarni, S.N., Patil, A.J., Doye, D.D., Kulkarni, U.V.: Modular general fuzzy
hypersphere neural network. In: 17th IEEE International Conference on Tools with Artificial
Intelligence, Hong Kong, China, vol. 4, pp. 211–216 (2005)
8. Sonar, D.N., Kulkarni, U.V.: Pruned fuzzy hypersphere neural network (PFHSNN) for lung
cancer classification. Int. J. Comput. Appl. 157, 36–39 (2017)
9. Moody, J., Darken, C.J.: Fast learning in networks of locally-tuned processing units. Neural
Comput. 1, 281–294 (1989)
10. Chen, S.: Nonlinear time series modeling and prediction using Gaussian RBF networks with
enhanced clustering and RLS learning. Electron. Lett. 3, 117–118 (1995)
11. Sarimveis, H., Alexandridis, A., Bafas, G.: A fast training algorithm for RBF networks based
on subtractive clustering. Neurocomputing 501–505 (2003)
12. Shie-Jue, L., Chun-Liang, H.: An ART-based construction of RBF networks. IEEE Trans.
Neural Netw. 13, 1308–1321 (2002)
13. Tsekouras, G.E., Tsimikas, J.: On training RBF neural networks using input-output fuzzy
clustering and particle swarm optimization. Fuzzy Sets Syst. 221, 65–89 (2013)
14. Sohn, I., Ansari, N.: Configuring RBF neural networks. Electron. Lett. 34, 684–685 (1998)
15. Wang, D., Zeng, X.J., Keane, J.A.: A clustering algorithm for radial basis function neural
network initialization. Neurocomputing 77, 144–155 (2012)
16. Li, J., Liu, X.: Melt index prediction by RBF neural network optimized with an adaptive new
ant colony optimization algorithm. J. Appl. Polym. Sci. 119, 3093–3100 (2011)
17. Shen, W., Guo, X., Wu, C., Wu, D.: Forecasting stock indices using radial basis function neural
networks optimized by artificial fish swarm algorithm. Knowl. Based Syst. 24, 378–385 (2011)
18. Feng, H.M.: Self-generation RBFNs using evolutional PSO learning. Neurocomputing 70,
241–251 (2006)
19. Billings, S.A., Zheng, G.L.: Radial basis function network configuration using genetic
algorithms. Neural Netw. 8, 877–890 (1995)
20. Kulkarni, A., Bonde, S., Kulkarni, U.: A Novel fuzzy clustering algorithm for radial basis
function neural network. Int. J. Fut. Revolut. Comput. Sci. Commun. Eng. 4(4), 751–756
(2018). ISSN: 2454-4248
21. Raitoharju, J., Kiranyaz, S., Gabbouj, M.: Training radial basis function neural networks for
classification via class specific clustering. IEEE Trans. Neural Netw. Learn. Syst. 27, 2458–
2471
Unbounded Fuzzy Radial Basis Function
Neural Network Classifier
Abstract The area of pattern recognition deals with the recognition of patterns by using
different machine learning algorithms without human intervention. Many different
data mining algorithms are used in pattern recognition. The selection of an appropriate
and precise algorithm is crucial. An imprecise algorithm may lead to
a wrong decision. Recognition can be supervised or unsupervised. This paper presents
a novel unbounded fuzzy radial basis function neural network (UFRBFNN) classifier
model to perform the supervised classification. This classifier is constructed using
fuzzy clustering and further clusters are converted into fuzzy hyperboxes. Fuzzy set
hyperboxes (FHBs) represent the neurons in the hidden layer. The creation of these
FHBs is based on the unbounded spread from inter-class information and intra-class
fuzzy membership function. The proposed approach is faster and independent of the
tuning parameters. The output is determined by the union operation of the FHBs
outputs which are connected to the class nodes in the output layer. Using K-fold
cross-validation, the UFRBFNN model is verified by applying 7 different standard
datasets from (UCI) machine learning repository and further by comparing results
with well-known radial basis function neural network (RBFNN) variants. The analysis
of the results shows that the proposed model provides 5–10% better training
accuracy than previous radial basis function classifiers.
Keywords Fuzzy set · Fuzzy neuron · Fuzzy clustering · Fuzzy hyperbox · Radial
basis function neural network
1 Introduction
Pattern classification is the area devoted to the study of techniques designed to categorize
data into distinct classes. It has been under extensive study for a long time because it
is a task that the human brain performs incredibly well but that is
difficult for computers to perform.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 25
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_3
However, the fuzzy min–max neural network (FMMN) is one of the efficient and
powerful models [1] for pattern classification. The FHSNN [2] classifier is one of
the variants, proposed by U. V. Kulkarni in 2001. A weighted FMMN is proposed
by Kim and Yang [3]. In this model, a hyperbox can be expanded without
considering the hyperbox contraction process or the overlap test. During
the training of patterns, the feature distribution information is utilized to avoid the
hyperbox distortion that may be caused by eliminating the overlapping
area of hyperboxes in the contraction process. Ma, Liu, and Wang proposed a novel
FMMN-based algorithm for pattern classification [4]. In this model, a new membership
function of the hyperbox is defined in which the characteristics are considered.
Additionally, it does not use a contraction process but needs only expansion, and
no additional neurons are used for the overlapped area. An enhanced FMMN
(EFMM) is proposed by Mohammed and Lim [5]. In EFMM, three heuristic
rules are proposed to eliminate the overlapping problem and to discover and resolve
possible overlapping cases.
In recent years, the RBFNNs [6] have become popular pattern classifiers which
have been applied in several engineering applications. The key issue with the RBFNN
is the determination of centroids and radii of the radial basis functions along with
the number of hidden nodes in the hidden layer [7]. A lot of research has been
carried out in recent years over the use of RBFNN in designing clustering algorithms.
Many clustering techniques have been developed for pattern analysis, grouping,
decision making, document retrieval, image segmentation, and data mining, yet
many significant challenges remain in forming the clusters correctly.
Density-based, partitional, and hierarchical are three broad categories of clustering
approaches [8]. The popular approaches proposed by researchers to create hidden
layer nodes are provided in various research papers [6, 9–21]. Similarly, FMMN
model has several limitations. Over the period, many variants [3, 4, 22] are proposed
to overcome these limitations. In this paper, Unbounded Fuzzy Radial Basis Function
Neural Network Algorithm is proposed. The designed classifier is a combination of
RBF-based and fuzzy neural network classifiers.
The proposed approach has the following features:
1. The proposed model is independent of the tuning parameter.
2. The hidden layer Gaussian neurons of the RBFNN are replaced by fuzzy neurons.
These neurons are characterized by the fuzzy membership function due to which
the trained network gives 99% training accuracy for any standard dataset.
3. The learning between the hidden layer and output layer uses unbounded spread
fuzzy hyperbox algorithm (USFH) instead of traditional least mean square algo-
rithm.
4. The connection between the hidden layer and output layer, i.e., the creation of
class nodes, is done concurrently with the creation of fuzzy set hyperboxes
(FHBs) in the hidden layer. The output of the class nodes is determined by the
union operation of the respective outputs of the FHBs.
Hence, the proposed UFRBFNN classifier overcomes most of the drawbacks of ear-
lier RBFNNs and Fuzzy neural network classifiers. The rest of the paper is organized
The simplest form of the radial basis function network is a three-layer feed-forward
neural network. Training input is provided in the first layer of the network. The second
layer is a hidden layer with multiple RBF nonlinear activation functions. The last
layer corresponds to the class layer and gives the final output of the network. The radial
basis function is applied in the hidden layer, which maps the m-dimensional n patterns of
the training dataset [m ∗ n] to m1-dimensional n patterns [m1 ∗ n], where m1 > m, by
adding more dimensions to the input. The hidden layer thus transforms nonlinear
m-dimensional patterns into linear m1-dimensional patterns. The number of hidden nodes
is always less than the number of input patterns. Hidden nodes may be represented
by shapes such as a circle, box, cluster, hypersphere, hyperbox,
or hyperline. RBFNN classifiers are useful for solving regression, prediction, and
classification problems. RBF is a special type of multi-layer perceptron (MLP) with
a single middle layer. The main property of a radial function is that the membership value
of a pattern changes monotonically (increases or decreases)
with its distance from the centroid. The three-layer architecture of RBFNN is shown in
Fig. 1. The clusters are formed in the hidden layer of RBFNN. The clusters
formed are represented by [H1, H2, …, Hj]. Every cluster represents a subset of
the respective class data. The kth class cluster Hk is represented as Hk = [ck1,
ck2, …, ckn].
The following steps demonstrate cluster formation in RBFNN:
– Step 1: First layer receives n-dimensional input X = [x1 , x2 , ..., xn ] and forwards
it to middle layer (hidden). Number of nodes in input layer is equal to number of
features in dataset. Creation of input layer occurs here.
– Step 2: Any radial basis function can be used for the creation of the middle layer. In this
example, the Gaussian function is used, as shown in Eq. 1, where σ is
the cluster width.
$$\psi_j = \exp\left(-\frac{\sum_{i=1}^{n} (x_i - c_{ij})^2}{2\sigma^2}\right) \qquad (1)$$
– Step 4: The output layer assigns the input to a particular cluster. In the output layer of RBFNN,
the output of the ith node for m classes is given by Eq. 2.
$$y_i = f\left(\sum_{j=0}^{J} W_{ij}\,\psi_j\right) \qquad (2)$$
where i = 1, 2, ..., m.
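As a rough illustration of Eqs. 1 and 2, the following sketch evaluates a Gaussian hidden layer and a weighted output layer. The function names, the identity default for f, and the omission of a bias term for j = 0 are assumptions made for the example; the centres, width, and weights would normally come from training.

```python
import math

def gaussian_rbf(x, centre, sigma):
    """Gaussian activation of one hidden node (Eq. 1)."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def rbfnn_output(x, centres, sigma, weights, f=lambda v: v):
    """Outputs y_i = f(sum_j W_ij * psi_j) for every output node i (Eq. 2).

    `weights` has one row per output node and one column per hidden node.
    The activation f defaults to the identity, which is an assumption.
    """
    psi = [gaussian_rbf(x, c, sigma) for c in centres]
    return [f(sum(w * p for w, p in zip(row, psi))) for row in weights]

# Two hidden nodes, two output nodes, identity weight matrix.
y = rbfnn_output([0.0, 0.0], [[0.0, 0.0], [1.0, 1.0]], 1.0, [[1, 0], [0, 1]])
predicted_class = max(range(len(y)), key=y.__getitem__)
```

With an identity weight matrix, each output simply echoes one hidden activation, so the input is assigned to the class whose centre is nearest.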
The UFRBFNN consists of three layers, represented as the input layer FI, hidden layer
FH, and output layer FC, as shown in Fig. 2.
FI = (X 1 , X 2 , X 3 , ..., X n )
FH = (H11 , H12 , ..., H3z )
FC = (C1 , C2 , C3 , ..., Cn )
The role of the input layer FI is to accept n-dimensional input as a feature vector
and forward it to the middle layer FH . It does not perform any operation. The pro-
posed algorithm is applied to input data and nonlinear data gets transformed into a
linear format in the hidden layer. Input data is mapped into unbounded hyperboxes
format in the hidden layer by learning algorithm given in Sect. 4. Hidden layer FH
output is fuzzy hyperboxes (FHBs). If the pattern lies under the hyperbox region, the
membership value is 1. As input moves away from the hyperbox region, the mem-
bership value of the input pattern gradually decreases. The formula for determining
the membership function is as shown in Eq. 3.
where Xh = (x h1 , x h2 , ..., x hn ) is hth given input to be trained and Cpj , r j are the
center points and radius of cluster. The function f () is defined as
1 if l ≤ r j
f (l, r j ) = (4)
rj < l otherwise
where l is the Euclidean distance between Xh and Cpj. The weights between FI
and FH are stored in the matrix C, which gives the center points of the FHBs. FC
represents the class layer, and each of the k nodes in this layer represents one class.
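The behaviour described above (membership 1 inside the region of radius r_j, then a gradual decrease with distance) can be sketched as below. The decay rule `radius / l` outside the region is purely an assumption chosen for illustration, since the exact expression of the second branch is not recoverable from the text; the function name is likewise hypothetical.

```python
import math

def membership(x, centre, radius):
    """Membership of pattern x in a cluster with the given centre and radius.

    Returns 1 when the Euclidean distance l satisfies l <= radius.
    Outside the region the value radius / l is an ASSUMED decay rule,
    used only to show a membership that falls off with distance.
    """
    l = math.dist(x, centre)
    if l <= radius:
        return 1.0
    return radius / l

inside = membership((0.5, 0.0), (0.0, 0.0), 1.0)   # pattern within the radius
outside = membership((2.0, 0.0), (0.0, 0.0), 1.0)  # pattern beyond the radius
```

Any other monotonically decreasing rule could be substituted in the final return statement without changing the rest of the sketch.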
Step 1: Let $M^k$ and $O^k$ hold the distances between patterns of the same class and of
different classes, respectively, and let $\beta_k$ denote the number of patterns in class $C_k$.
$$M^k = \left[\lVert X_i - X_j \rVert\right]_{\beta_k \times \beta_k}, \quad i, j = 1, 2, \ldots, \beta_k, \quad X_i, X_j \in C_k$$
$$O^k = \left[\lVert X_i - X_j \rVert\right]_{\beta_k \times (p-\beta_k)}, \quad i = 1, 2, \ldots, \beta_k, \; j = 1, 2, \ldots, p-\beta_k, \quad X_i \in C_k, \; X_j \notin C_k$$
Step 2: Calculate the smallest distance of each pattern $X_i \in C_k$ from the patterns
of other classes $X_j \notin C_k$ using $O^k$.
$$B^k = \min(O^k)$$
Step 3: (Cluster creation) From $B^k$, select the pattern $x_j^k$
having the maximum distance, take it as the centroid of the cluster, and set the radius
equal to that maximum distance:
$$\text{radius}^k = \max(B^k)$$
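Steps 1 to 3 can be sketched in a few lines. This is an illustrative reading of the steps rather than the authors' code: the function name `initial_cluster` is hypothetical, the minimum in step 2 is taken row-wise over the inter-class distance matrix, and the data in the usage lines are made up.

```python
import math

def initial_cluster(patterns, labels, k):
    """Return (centroid, radius, M, O) for class k following steps 1-3.

    M holds the intra-class distances, O the inter-class distances.
    Assumes at least one pattern belongs to a class other than k.
    """
    same = [p for p, c in zip(patterns, labels) if c == k]
    other = [p for p, c in zip(patterns, labels) if c != k]
    # Step 1: beta_k x beta_k and beta_k x (p - beta_k) distance matrices.
    M = [[math.dist(a, b) for b in same] for a in same]
    O = [[math.dist(a, b) for b in other] for a in same]
    # Step 2: nearest other-class distance for each class-k pattern.
    B = [min(row) for row in O]
    # Step 3: the pattern with the largest entry of B becomes the centroid
    # and that entry becomes the radius.
    j = max(range(len(B)), key=B.__getitem__)
    return same[j], B[j], M, O

centroid, radius, M, O = initial_cluster(
    [(0, 0), (3, 0), (10, 0)], [1, 1, 2], 1
)
```

Because the radius equals the distance to the nearest pattern of another class, the initial cluster never extends past that pattern.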
βk = βk − nj
To evaluate the performance of UFRBFNN classifier, two case studies along with
obtained results have been discussed in the following sub-sections. The learning
algorithm is implemented in MATLAB 2019a.
5.1.1 Example 1
FHB formation. The class 1 FHB is shown in Fig. 3b. In step 6, all the patterns of class
1 are compiled. If one or more patterns are left outside the FHB, steps 1 to 6 are
repeated. In our example, no class 1 pattern is left out. Hence, we continue
with steps 8 and 9 to make the connection between the FHB and the respective
class node in the output layer. After all the possible FHBs for one class along with their
connections with class nodes are done in the last step, the same procedure is repeated
for the other classes. In this example, the cluster and FHB for class 2 are shown in Fig. 4a
and b, respectively. Finally, two FHBs, one each for class 1 and class 2, are created
as shown in Fig. 5b.
Figure 6a and b, for class 1 and class 2, respectively, clearly indicate the removal
of excess space after conversion of the cluster into an FHB. Removal of excess space
reduces the overlap between different classes, which reduces the overall
misclassification rate and in turn improves the training accuracy.
5.1.2 Example 2
In this example, 3 classes and 24 patterns were considered. The pattern features and
class labels are taken from [23] as stated in Table 2. Input patterns and class labels
are kept the same in order to compare the proposed USFH algorithm with maximum
spread fuzzy clustering (MSFC) algorithm [23].
Class 1, class 2, and class 3 patterns are as shown by green, red, and blue colors,
respectively, and its scatter plot is as shown in Fig. 7.
Final results computed by both MSFC and USFH algorithms are given in Table 3.
In addition to it, final clusters formed by MSFC are as shown in Fig. 8a, and FHBs
created by USFH are as shown in Fig. 8b.
Table 3 Case study 2: centroid and radii calculated by MSFC and FHBs dimension calculated by
USFH
From the above comparison, we can conclude that the overlapping region between
inter-class clusters is decreased in the USFH algorithm due to lower and upper
bound calculation for FHB formation. Reduced overlapping region improves training
accuracy of USFH algorithm over MSFC algorithm. To prove this, the performance
of the USFH algorithm is evaluated with other classifiers in the following section.
Fig. 8 Case study 2: cluster formation by MSFC approach and FHBs by USFH approach
References
1. Simpson, P.K.: Fuzzy min-max neural network Part I: Classification. IEEE Trans. Neural Netw.
3, 776–786 (1992)
2. Kulkarni, U.V., Sontakke, T.R.: Fuzzy hypersphere neural network classifier. In: 10th Interna-
tional Conference on Fuzzy Systems, Melbourne, Victoria, Australia, pp. 1559–1562 (2001)
3. Kim, H.J., Ryu, T.W., Nguyen, T.T., Lim, J.S., Gupta, S.: A weighted fuzzy min-max neural
network for pattern classification and feature extraction. In: Lagana, A., Gavrilova, M.L.,
Kumar, V., Mun, Y., Tan, C.J.K., Gervasi, O. (eds.) Computational Science and Its Applications
ICCSA 2004 (2005)
4. Ma, D., Liu, J., Wang, Z.: The pattern classification based on fuzzy min-max neural network
with new algorithm. In: Wang, J., Yen, G.G., Polycarpou, M.M. (eds.) Advances in Neural Net-
works ISNN 2012. Lecture Notes in Computer Science, vol. 7368. Springer, Berlin, Heidelberg
(2012)
5. Mohammed, M.F., Lim, C.P.: An enhanced fuzzy min-max neural network for pattern classi-
fication. IEEE Trans. Neural Netw. Learn. Syst. 26(3), 417–429 (2015)
6. Chen, S.: Nonlinear time series modeling and prediction using Gaussian RBF networks with
enhanced clustering and RLS learning. Electron. Lett. 3, 117–118 (1995)
7. Carse, B., Pipe, A.G., Fogarty, T.C., Hill, T.: Evolving radial basis function neural networks
using a genetic algorithm. In: IEEE International Conference on Evolutionary Computation,
Perth, December 1995, pp. 300–305 (1995)
8. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. 31(3),
264–323 (1999)
9. Moody, J., Darken, C.J.: Fast learning in networks of locally-tuned processing units. Neural
Comput. 1, 281–294 (1989)
10. Sarimveis, H., Alexandridis, A., Bafas, G.: A fast training algorithm for RBF networks based
on subtractive clustering. Neurocomputing 501–505 (2003)
11. Tsekouras, G.E., Tsimikas, J.: On training RBF neural networks using input-output fuzzy
clustering and particle swarm optimization. Fuzzy Sets Syst. 221, 65–89 (2013)
12. Shie-Jue, L., Chun-Liang, H.: An ART-based construction of RBF networks. IEEE Trans.
Neural Netw. 13, 1308–1321 (2002)
13. Sohn, I., Ansari, N.: Configuring RBF neural networks. Electron. Lett. 34, 684–685 (1998)
14. Wang, D., Zeng, X.J., Keane, J.A.: A clustering algorithm for radial basis function neural
network initialization. Neurocomputing 77, 144–155 (2012)
15. Niros, A.D., Tsekouras, G.E.: A novel training algorithm for RBF neural network
using a hybrid fuzzy clustering approach. Fuzzy Sets Syst. 193, 62–84 (2012)
16. Feng, H.M.: Self-generation RBFNs using evolutional PSO learning. Neurocomputing 70,
241–251 (2006)
17. Billings, S.A., Zheng, G.L.: Radial basis function network configuration using genetic algo-
rithms. Neural Netw. 8, 877–890 (1995)
18. Li, J., Liu, X.: Melt index prediction by RBF neural network optimized with an adaptive new
ant colony optimization algorithm. J. Appl. Polym. Sci. 119, 3093–3100 (2011)
19. Shen, W., Guo, X., Wu, C., Wu, D.: Forecasting stock indices using radial basis function neural
networks optimized by artificial fish swarm algorithm. Knowl. Based Syst. 24, 378–385 (2011)
20. Rouhani, M., Javan, D.S.: Two fast and accurate heuristic RBF learning rules for data classifi-
cation. Neural Netw. 75, 150–161 (2016)
21. Liu, Y., Huang, H., Huang, T.W., Qian, X.: An improved maximum spread algorithm with
application to complex-valued RBF neural networks. Neural Netw. Neurocomput. 216, 261–
267 (2016)
22. Kumar, A., Sai Prasad, P.S.V.S.: Hybridization of fuzzy min-max neural networks with kNN for
enhanced pattern classification. In: Singh, M., Gupta, P., Tyagi, V., Flusser, J., Oren, T., Kashyap,
R. (eds.) Advances in Computing and Data Sciences. ICACDS 2019. Communications in
Computer and Information Science, vol. 1045. Springer, Singapore (2019)
23. Kulkarni, A., Bonde, S., Kulkarni, U.: A novel fuzzy clustering algorithm for radial basis
function neural network. Int. J. Future Revol. Comput. Sci. Commun. Eng. 4(4), 751–756
(2018). ISSN: 2454-4248
24. Frank, A., Asuncion, A.: UCI Machine Learning Repository. University of California, School
of Information and Computer Science, Irvine, CA. Neural Network (2010). http://archive.ics.
uci.edu/ml
A Study on the Adaptability of Deep
Learning-Based Polar-Coded NOMA in
Ultra-Reliable Low-Latency
Communications
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 39
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_4
40 N. Iswarya et al.
Polar codes are a subset of linear block error correction codes that can achieve Shan-
non’s channel capacity limit with decreased encoding and decoding complexity. Polar
coding is a process of recursive concatenations of a kernel matrix that transforms
the physical channel into virtual channels. As the iterations are repeated, the virtual
channels become either extremely reliable or unreliable, and hence the data bits are
allocated to the most reliable channels. The transformation of physical channels into
virtual channels is called channel polarization. Channel polarization consists of two
phases: (i) a channel combining phase and (ii) a channel splitting phase. After polarization,
reliable channels will have a maximum data rate, and unreliable
channels will have a reduced data rate. The codeword length
of a polar code is denoted N, which is a power of 2, i.e., N = 2^n, with the K
most reliable channels being assigned to data bits.
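To make the polarization effect concrete, the sketch below tracks erasure probabilities for the special case of a binary erasure channel (BEC), which is an assumption made here because it admits a closed-form recursion: one combining and splitting step turns a channel with erasure probability e into a worse channel with 2e − e² and a better one with e². The function names are hypothetical, and how these indices line up with encoder inputs depends on the bit-ordering convention in use.

```python
def polarize_bec(eps, n):
    """Erasure probabilities of the N = 2**n virtual channels of a BEC(eps)."""
    z = [eps]
    for _ in range(n):
        nxt = []
        for e in z:
            nxt.append(2 * e - e * e)  # degraded ("minus") channel
            nxt.append(e * e)          # upgraded ("plus") channel
        z = nxt
    return z

def information_set(eps, n, K):
    """Indices of the K most reliable virtual channels (lowest erasure)."""
    z = polarize_bec(eps, n)
    best = sorted(range(len(z)), key=lambda i: z[i])[:K]
    return sorted(best)

z = polarize_bec(0.5, 2)           # four virtual channels
data_positions = information_set(0.5, 2, 2)
```

The remaining N − K positions would carry frozen bits. As n grows, the values in `z` drift toward 0 or 1, which is the polarization the paragraph describes.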
Figure 1 shows the construction of a polar code over two channels, with inputs $u_1$ and
$u_2$, to generate the codeword transmitted on a channel. The input vector consists of
information bits (I) and frozen bits (F), and a generator matrix ($G_N$) maps it to the
codeword. $G_N$ is obtained as the nth Kronecker power of the kernel, $G_N = G_2^{\otimes n}$, where
$$G_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$
$$C_N = u_i G_N \qquad (1)$$
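The encoding rule of Eq. (1) can be illustrated with a minimal sketch in plain Python. It assumes the generator matrix is the n-fold Kronecker power of the 2 × 2 kernel without bit-reversal permutation. The helper names (`kronecker`, `generator_matrix`, `place_bits`, `polar_encode`) and the information positions used in the usage note are hypothetical and chosen only for illustration.

```python
# Kernel matrix of the polar transform.
G2 = [[1, 0], [1, 1]]


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def generator_matrix(n):
    """n-fold Kronecker power of G2, giving a 2^n x 2^n binary matrix."""
    g = [[1]]
    for _ in range(n):
        g = kronecker(G2, g)
    return g


def place_bits(data_bits, info_positions, n_len):
    """Build the input vector: data bits on the information positions, zeros on frozen ones."""
    u = [0] * n_len
    for bit, pos in zip(data_bits, sorted(info_positions)):
        u[pos] = bit
    return u


def polar_encode(u):
    """Compute the codeword C_N = u * G_N over GF(2)."""
    n_len = len(u)
    n = n_len.bit_length() - 1
    if n_len == 0 or (1 << n) != n_len:
        raise ValueError("input length must be a power of 2")
    g = generator_matrix(n)
    return [sum(u[i] * g[i][j] for i in range(n_len)) % 2 for j in range(n_len)]
```

For example, `polar_encode(place_bits([1, 0], [1, 3], 4))` encodes two data bits into a length-4 codeword. In practice, the information positions would come from a reliability ordering of the virtual channels rather than being picked by hand.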
Arikan [2] described polar encoding and decoding. The codeword $C_N$ is generated by
multiplying the $u_i$ vector with the $G_N$ matrix as in Eq. (1). The generated codeword
is then transmitted over the channel. At the receiver end, $y_N$ is received and decoded. In
[2, 9], the Successive Cancellation (SC) decoding method is used. Log-Likelihood
Ratio (LLR) values are calculated, and the output vector ($\hat{u}$) is generated. LLR
soft decisions are computed successively from $\hat{u}_1$ to $\hat{u}_N$, and hard decisions are
then applied to decide between 0 and 1. SC decoding performs well as the codeword
length N tends to infinity, with a complexity of order O(N log N). However, although
the SC approach achieves channel capacity asymptotically, its error-correction
capability deteriorates at short codeword lengths. In [10], to overcome this drawback
of SC decoding, the authors designed Successive Cancellation List (SCL) decoding,
which tracks L candidate paths. Consequently, the system complexity increases to
the order of O(LN log N). SCL performs well with limited code lengths but fails
when the correct codeword is not among the retained paths. From the simulation
results of [10], it is evident that the BLER decreases as the list size increases, but
the complexity keeps increasing. The authors in [11] proposed the Successive
Cancellation Flip (SC-Flip) decoding method to enhance SC by improving its error
correction efficiency. However, the selection and sorting process in SC-Flip
increases the execution complexity. Therefore, to make better use of SC-Flip
algorithms at low code rates, two techniques were suggested in [12]: the Fixed
Index Selection (FIS) scheme and the Enhanced Index Selection (EIS) scheme, which
reduce the execution cost and boost the error correction capability. Adaptive SCL
decoding in [13] aims at improving the throughput of the decoder, thereby reducing
the system complexity. Adaptive SCL is a combination of SC, SCL, and SCL-CRC
(Cyclic Redundancy Check). Notable decoding methods have thus been reported
in the literature to enhance the channel coding of polar codes, targeting low latency
and reduced hardware complexity. Machine learning (ML), an Artificial Intelligence
application, is booming in the data science field. Utilizing ML algorithms in
channel coding for wireless communications appears to be an attractive solution to
the channel coding problem.
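The SC procedure described above (LLRs computed stage by stage, followed by hard decisions on each bit in turn) can be sketched recursively. This is a minimal illustration, not the decoder of [2] or [9]. It assumes the non-bit-reversed Kronecker construction, the min-sum approximation for the check-node update, and the LLR convention that a positive value favours bit 0. The function names and the frozen set in the usage note are hypothetical.

```python
import math


def f_minsum(a, b):
    """Check-node update (min-sum approximation): LLR of the XOR of two bits."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))


def g_update(a, b, partial_sum):
    """LLR of the second bit once the re-encoded partial sum of the first branch is known."""
    return b + (1 - 2 * partial_sum) * a


def polar_encode(u):
    """Recursive form of x = u * G_N with G_N the Kronecker power of [[1, 0], [1, 1]]."""
    if len(u) == 1:
        return list(u)
    half = len(u) // 2
    x_a, x_b = polar_encode(u[:half]), polar_encode(u[half:])
    return [p ^ q for p, q in zip(x_a, x_b)] + x_b


def sc_decode(llr, frozen):
    """Successive cancellation decoding.

    llr    : channel LLRs, one per codeword bit (positive favours bit 0)
    frozen : booleans, True where the input bit is frozen to 0
    Returns (u_hat, x_hat): estimated input bits and their re-encoded codeword.
    """
    n = len(llr)
    if n == 1:
        # Hard decision: frozen bits are known to be 0, others follow the LLR sign.
        u = 0 if frozen[0] or llr[0] >= 0 else 1
        return [u], [u]
    half = n // 2
    left, right = llr[:half], llr[half:]
    # First half of the input bits: combine the two branches through the XOR.
    u_a, x_a = sc_decode([f_minsum(a, b) for a, b in zip(left, right)], frozen[:half])
    # Second half: reuse the decisions already made (the "cancellation" step).
    u_b, x_b = sc_decode(
        [g_update(a, b, s) for a, b, s in zip(left, right, x_a)], frozen[half:]
    )
    return u_a + u_b, [p ^ q for p, q in zip(x_a, x_b)] + x_b
```

As a usage example, for N = 8 with the assumed frozen pattern `[True, True, True, False, True, False, False, False]`, encoding an input vector, mapping each code bit to an LLR of +4 or −4, and calling `sc_decode` returns the original input in the noiseless case. The recursion visits each of the log N stages once per bit, which matches the O(N log N) complexity quoted above.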
The input and output vectors of a deep neural network (DNN) correspond naturally
to those of a channel decoder, and this correspondence has motivated the use of
DNNs in channel decoding [14]. Channel decoding based on deep learning is
non-iterative and has low latency, but it is bounded by scalability limitations with
respect to block length, known as the “curse of dimensionality.” Since shorter
codeword lengths can be trained efficiently, the polar encoding graph in [15] is
partitioned into multiple sub-graphs so that ML algorithms can take over the
decoding of the sub-blocks. Analogous to this work, the authors in [16] constructed
a Neural Successive Cancellation (NSC) decoder comprising several neural network
decoders (NNDs) combined with SC. A modified neural belief propagation method is
proposed in [17], and estimates of decoding reliability are presented to accomplish better decoding
outcomes. Most of the literature has considered the noise environment to be additive
white Gaussian noise, whereas practical communication systems exhibit noise
correlation due to filtering and oversampling. However, adopting conventional
methods to process the colored noise leads to high computational complexity. In [18],
an iterative belief propagation (BP) algorithm is integrated with a convolutional
neural network (CNN) for an LDPC decoder under a colored noise model; the authors
showed the CNN to be a potential solution by extracting the noise correlation as a
feature. A similar combination of CNN and BP, accounting for BER and latency, is
proposed in [19]. The DL schemes for channel coding of polar codes have attracted
attention in communication systems, though they fall short in two aspects: they can
be applied solely to short codes and to the BP decoding scheme [20–24]. Artificial
Neural Networks (ANNs) for channel coding have started emerging [22]. Rather than
considering error detection alone, the capability of error correction is considered by
designing a table-based error correction decoding framework in [25, 26]. An
optimization technique for the BP algorithm's weighting scheme that accelerates the
training convergence is proposed in [27]. In [28], a neural offset Min-Sum (MS)
decoder is suggested, which uses machine learning techniques and offers better
performance than other BP decoders applied to BCH codes. Accordingly, the authors
of [29] considered MS decoders for polar codes that exploit both scaling and offset.
Advancements in ML have motivated a Multi-Flips SC decoding scheme in [30],
which achieves lower latency than previous flip-based successive cancellation
decoding methods. Free Space Optical (FSO) communications combined with polar
codes are considered for DL techniques in [31, 32], resulting in a higher convergence
rate and better BER performance than the standard LLR input. The authors in [33]
proposed sparse training of neural networks applied to polar codes to improve BER
performance. Although the authors utilize the partitionable property of polar codes,
the NND itself lacks generalization capability, which remains an unsolved problem.
The 5th generation (5G) networks are intended to deliver essential services by
providing drastically increased capacity, enabling the Internet of Things (IoT) by
connecting an abundance of intelligent devices, and supporting highly reliable,
mission-critical applications, as portrayed in Fig. 2.
Advancements in Artificial Intelligence have led authors to employ deep learning
schemes that benefit NOMA techniques [34]. High-performance 5G systems can be
realized by combining NOMA with MIMO [35], mMIMO [36], and mmWave [37, 38]
technologies. A communication architecture for a Tactile Internet application based
on NOMA, allowing non-orthogonal resource sharing, is discussed in [39]. The role of
these 5G generic services is analyzed, and the suitability of NOMA-based Tactile
Internet is explored for applications.
Table 1 A table on deep learning techniques applied in PC-NOMA for URLLC scenarios
Topic | References | Inference
Channel encoding and decoding of polar codes | [9–13, 15, 16] | Channel decoding techniques like SC, SCL, and SCL-CRC decoding techniques are discussed. Latency and hardware complexity remain a constraint for the design of polar codes for 5G-NR
Deep learning algorithms for decoding of polar codes | [17–22] | Deep learning techniques are applied in polar decoding to reduce the decoder complexity and enhance latency
DL-based error-correcting codes | [23, 24] | Error correction codes are considered based on DL algorithms
Convolutional and deep neural network for polar decoding | [18, 19, 24, 31, 32] | CNN, belief propagation (BP-CNN), and DNN schemes are applied for polar decoding
Polar-coded non-orthogonal multiple access (PC-NOMA) | [5] | PC-NOMA framework is designed to jointly optimize polar coding, modulation, and transmission
DL based on NOMA and its advancements of MIMO, mMIMO, mmWave | [6, 34–38] | Deep learning techniques for NOMA schemes are proposed for efficient channel estimation in imperfect CSI and uncertainties
Use cases of 5G NR with NOMA | [7, 39] | The main objectives of 5G NR are to reduce the latency and to accommodate billions of smart devices. NOMA is employed for Tactile Internet and URLLC applications for 5G radio access
Performance of NOMA when accompanied by URLLC | [41, 45] | NOMA applied for URLLC use cases with index modulation schemes in a cooperative environment are explained
Enabling Artificial Intelligence for 5G and beyond requires highly robust,
well-performing, and low-complexity techniques [46]. PD-NOMA, a variant of NOMA,
is shown in Fig. 3 for a two-user scenario in the URLLC environment, a critical use
case of 5G, where the aim is to reduce latency and improve reliability in the
communication system. Employing PC-NOMA in URLLC applications remains a
challenging problem. Applying deep learning to channel estimation for NOMA in
URLLC applications appears to be a promising solution to the reliability and latency
problems.
8 Conclusion
The machine learning techniques for polar decoding, aimed at channel coding in the
5G-NR network, are surveyed. Deep learning techniques involving neural schemes
appear to be beneficial in channel decoding of polar codes. Applying machine
learning algorithms to channel estimation for Non-Orthogonal Multiple Access
schemes, as proposed in a few reported works, seems to be an encouraging solution
under channel uncertainties. Involving deep learning algorithms in NOMA also
enhances the reliability and reduces the latency for critical use cases of 5G.
Accounting for those significant use cases of 5G, low-latency deep learning-based
NOMA schemes can be incorporated to implement the applications mentioned above.
The suitability of employing deep learning-based polar codes for NOMA is addressed
as a prominent solution for 5G and beyond use cases.
References
1. Shannon, C.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–
656. Math. Rev. (MathSciNet) MR10, 133e (1948)
2. Arikan, E.: Channel polarization: a method for constructing capacity-achieving codes for sym-
metric binary-input memoryless channels. IEEE Trans. Inf. Theory 55(7), 3051–3073 (2009)
3. Bioglio, V., Condo, C., Land, I.: Design of polar codes in 5g new radio. arXiv preprint (2018).
arXiv:1804.04389
4. Babar, Z., et al.: Polar codes and their quantum-domain counterparts. In: IEEE Commun. Surv.
Tutor. 22(1), 123–155. Firstquarter (2020). https://doi.org/10.1109/COMST.2019.2937923
5. Dai, J., Niu, K., Si, Z., Dong, C., Lin, J.: Polar-coded non-orthogonal multiple access. IEEE
Trans. Signal Process. 66(5), 1374–1389 (2018). https://doi.org/10.1109/TSP.2017.2786273
6. Gui, G., Huang, H., Song, Y., Sari, H.: Deep learning for an effective nonorthogonal multiple
access scheme. IEEE Trans. Veh. Technol. 67(9), 8440–8450 (2018). https://doi.org/10.1109/
TVT.2018.2848294
7. Sutton, G.J., et al.: Enabling technologies for ultra-reliable and low latency communications:
from PHY and MAC layer perspectives. IEEE Commun. Surv. Tutor. 21(3), 2488–2524.
Thirdquarter (2019). https://doi.org/10.1109/COMST.2019.2897800
8. Zhang, M., Lou, M., Zhou, H., Zhang, Y., Liu, M., Zhong, Z.: Non-orthogonal coded access
based uplink grant-free transmission for URLLC. In: IEEE/CIC International Conference on
Communications in China (ICCC), Changchun, China, pp. 624–629 (2019). https://doi.org/
10.1109/ICCChina.2019.885590
9. Alamdar-Yazdi, A., Kschischang, F.R.: A simplified successive-cancellation decoder for polar
codes. IEEE Commun. Lett. 15(12), 1378–1380 (2011). https://doi.org/10.1109/LCOMM.
2011.101811.111480
10. Tal, I., Vardy, A.: List decoding of polar codes. IEEE Trans. Inf. Theory 61(5), 2213–2226
(2015)
11. Afisiadis, O., Balatsoukas-Stimming, A., Burg, A.: A low-complexity improved successive
cancellation decoder for polar codes. In: Asilomar Conference on Signals, Systems and Com-
puters, pp. 2116–2120 (2014)
12. Condo, C., Ercan, F., Gross, W.J.: Improved successive cancellation flip decoding of polar
codes based on error distribution. arXiv preprint (2017). arXiv:1711.11096
13. Li, B., Shen, H., Tse, D.: An adaptive successive cancellation list decoder for polar codes with
cyclic redundancy check. IEEE Commun. Lett. 16(12), 2044–2047 (2012)
14. Xu, S., Luo, F.-L.: Machine Learning for Future Wireless Communications, 1st edn. Wiley
(2020)
15. Cammerer, S., Gruber, T., Hoydis, J., ten Brink, S.: Scaling deep learning-based decoding of
polar codes via partitioning. In: IEEE Global Communications Conference, Singapore (2017).
https://doi.org/10.1109/GLOCOM.2017.8254811
16. Doan, N., Ali Hashemi, S., Gross, W.J.: Neural successive cancellation decoding of polar
codes. In: IEEE 19th International Workshop on Signal Processing Advances in Wireless
Communications (SPAWC), Kalamata, pp. 1–5 (2018). https://doi.org/10.1109/SPAWC.2018.
8445986
17. Yuan, C., Wu, C., Cheng, D., Yang, Y.: Deep learning in encoding and decoding of polar codes.
J. Phys. Conf. Ser. 1060(1), 012021. IOP Publishing (2018)
18. Liang, F., Shen, C., Wu, F.: An iterative BP-CNN architecture for channel decoding. IEEE J.
Sel. Top. Sig. Process. 12(1), 144–159 (2018)
19. Wen, C., Xiong, J., Gui, L., et al.: A novel decoding scheme for polar code using convolutional
neural network. In: 2019 IEEE International Symposium on Broadband Multimedia Systems
and Broadcasting (BMSB). IEEE, Jeju, Korea (South), pp. 1–5 (2019)
20. Gruber, T., Cammerer, S., Hoydis, J., ten Brink, S.: On deep learning-based channel decoding.
In: Annual Conference on Information Sciences and Systems (CISS), pp. 1–6 (2017)
21. Nachmani, E., et al.: Learning to decode linear codes using deep learning. In: 54th Annual
Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL,
pp. 341–346 (2016)
22. Song, X., Zhang, Z., Wang, J., Qin, K.: A graph-neural-network decoder with MLP-based
processing cells for polar codes. In: 2019 11th International Conference on Wireless Commu-
nications and Signal Processing (WCSP). IEEE, Xi’an, China, pp. 1–6 (2019)
23. Xu, W., Wu, Z., Ueng, Y.-L., You, X., Zhang, C.: Improved polar decoder based on deep
learning. In: IEEE International Workshop on Signal Processing Systems (SiPS), pp. 1–6
(2017)
24. Lyu, W., Zhang, Z., Jiao, C., Qin, K., Zhang, H.: Performance evaluation of channel decoding
with deep neural networks. IEEE International Conference on Communication (ICC), pp. 1–6
(2018)
25. Liu, X., Wu, S., Wang, Y., et al.: Exploiting error-correction-CRC for polar SCL decoding: a
deep learning-based approach. IEEE Trans. Cogn. Commun. Netw. 6, 817–828 (2020). https://
doi.org/10.1109/TCCN.2019.2946358
26. Wang, J., Li, J., Huang, H., Wang, H.: Fine-grained recognition of error correcting codes based
on 1-D convolutional neural network. Dig. Sig. Process. 99, 102668 (2020). https://doi.org/10.
1016/j.dsp.2020.102668
27. Gao, J., Niu, K., Dong, C.: Exploiting error-correction-CRC for polar SCL decoding. IEEE
Access 8, 27210–27217 (2020)
28. Lugosch, L., Gross, W.J.: Neural offset min-sum decoding. In: 2017 IEEE International
Symposium on Information Theory (ISIT) (2017)
29. Dai, B., Liu, R., Yan, Z.: New min-sum decoders based on deep learning for polar codes. In:
IEEE International Workshop on Signal Processing Systems (SiPS), Cape Town, pp. 252–257
(2018). https://doi.org/10.1109/SiPS.2018.8598384
30. He, B., Wu, S., Deng, Y., Yin, H., Jiao, J., Zhang, Q.: A machine learning based multi-flips
successive cancellation decoding scheme of polar codes. In: IEEE 91st Vehicular Technology
Conference (VTC2020-Spring) 2020, pp. 1–5 (2020)
31. Fang, J., Bi, M., Xiao, S., et al.: Neural network decoder of polar codes with tanh-based modified
LLR over FSO turbulence channel. Opt. Express 28, 1679 (2020). https://doi.org/10.1364/OE.
384572
32. Fang, J., et al.: Neural successive cancellation polar decoder with Tanh-based modified LLR
over FSO turbulence channel. IEEE Photon. J. 12(6), 1–10. Art no. 7906110 (2020). https://
doi.org/10.1109/JPHOT.2020.3030618
33. Xu, W., You, X., Zhang, C., Be’ery, Y.: Polar decoding on sparse graphs with deep learning.
In: 52nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA,
pp. 599–603 (2018). https://doi.org/10.1109/ACSSC.2018.8645372
34. Narengerile, Thompson, J.: Deep learning for signal detection in non-orthogonal multiple
access wireless systems. UK/China Emerging Technologies (UCET), Glasgow, United King-
dom, pp. 1–4 (2019). https://doi.org/10.1109/UCET.2019.8881888
35. Kang, J.-M., Kim, I.-M., Chun, C.-J.: Deep learning-based MIMO-NOMA with imperfect
SIC decoding. IEEE Syst. J. 14(3), 3414–3417 (2020). https://doi.org/10.1109/JSYST.2019.
2937463
36. Boloursaz Mashhadi, M., Gündüz, D.: Deep learning for massive MIMO channel state acquisi-
tion and feedback. J. Indian Inst. Sci. 100(2), 369–382 (2020). https://doi.org/10.1007/s41745-
020-00169-2
37. Cui, J., Ding, Z., Fan, P.: The application of machine learning in mmWave-NOMA systems.
In: 2018 IEEE 87th Vehicular Technology Conference (VTC Spring). IEEE, Porto, pp. 1–6
(2018)
38. Cui, J., Ding, Z., Fan, P., Al-Dhahir, N.: Unsupervised machine learning-based user clustering
in millimeter-wave-NOMA systems. IEEE Trans. Wirel. Commun. 17(11), 7425–7440 (2018).
https://doi.org/10.1109/TWC.2018.2867180
39. Budhiraja, I., Tyagi, S., Tanwar, S., Kumar, N., Rodrigues, J.J.P.C.: Tactile internet for smart
communities in 5G: an insight for NOMA-based solutions. IEEE Trans. Ind. Inform. 15(5),
3104–3112 (2019). https://doi.org/10.1109/TII.2019.2892763
40. Ahmad Khan Beigi, N., Soleymani, M.R.: Ultra-reliable energy-efficient cooperative scheme
in asynchronous NOMA with correlated sources. IEEE Internet Things J. 6(5), 7849–7863
(2019). https://doi.org/10.1109/JIOT.2019.2911434
41. Wang, Z., Lv, T., Lin, Z., Zeng, J., Mathiopoulos, P.T.: Outage performance of URLLC NOMA
systems with wireless power transfer. IEEE Wirel. Commun. Lett. 9(3), 380–384 (2020). https://
doi.org/10.1109/LWC.2019.2956536
42. Chen, X., Cheng, J., Zhang, Z., Wu, L., Dang, J., Wang, J.: Data-rate driven transmission
strategies for deep learning-based communication systems. IEEE Trans. Commun. 68(4), 2129–
2142 (2020). https://doi.org/10.1109/TCOMM.2020.2968314
43. Shlezinger, N., Farsad, N., Eldar, Y.C., Goldsmith, A.J.: ViterbiNet: a deep learning based
Viterbi algorithm for symbol detection. IEEE Trans. Wirel. Commun. 19(5), 3319–3331 (2020).
https://doi.org/10.1109/TWC.2020.2972352
44. Besser, K.-L., Lin, P.-H., Janda, C.R., Jorswieck, E.A.: Wiretap code design by neural network
autoencoders. IEEE Trans. Inf. Forensic Secur. 15, 3374–3386 (2020). https://doi.org/10.1109/
TIFS.2019.2945619
45. Doğan, S., Tusha, A., Arslan, H.: NOMA with index modulation for uplink URLLC through
grant-free access. IEEE J. Sel. Top. Sig. Process. 13(6), 1249–1257 (2019). https://doi.org/10.
1109/JSTSP.2019.2913981
46. Shafin, R., Liu, L., Chandrasekhar, V., Chen, H., Reed, J., Zhang, J.C.: Artificial intelligence-
enabled cellular networks: a critical path to beyond-5G and 6G. IEEE Wirel. Commun. 27(2),
212–217 (2020). https://doi.org/10.1109/MWC.001.1900323
Heart Rate Variability-Based Mental
Stress Detection Using Deep Learning
Approach
Abstract Health problems are rising with today’s stressful life, as it promotes car-
diac diseases, depression, violence, and may provoke suicide. Hence, it is essential
to develop a computer-aided diagnosis system to identify relaxed versus stressed
individuals and their correct classification. Heart rate variability (HRV) based on
RR interval is a well-proven clinical and diagnostic tool strongly associated with
the autonomic nervous system (ANS). In this study, a conventional method was
compared with a deep learning-based method. In the conventional method, features
were extracted from various domains, and these features were fed to a classifier to
detect stressed states. However, this method uses hand-crafted features, and hence
it may miss high-potential features that could maximize the classifier's
generalization performance. This work presents a new approach motivated by the
long short-term memory (LSTM) network in sequence learning to generate a concrete
decision about the signal category. We propose a deep learning-based
Inception-LSTM network to improve performance and to reduce
computational cost. Two different stress datasets, viz., self-generated stress data and
Physionet driver stress data were used to perform the proposed method’s perfor-
mance analysis. The presented Inception-LSTM architecture outperforms existing
literature methods, achieving an accuracy of 93% for self-generated stress data and
97.19% for driver stress data.
1 Introduction
Humans face various health problems out of which stress is a primary and critical
issue in day-to-day life. Stress can be studied as a physiological and psychological
response of the body, and it occurs due to workload, social media, some personal
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 51
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_5
52 R. B. Ramteke and V. R. Thool
problems, etc. According to the Centers for Disease Control (National Institute for
Occupational Safety and Health), the workplace is the primary cause of life stress.
They also reported that 110 million people die every year because of stress
(7 persons every 2 s) [18]. Stress is categorized as acute (short term) or chronic
(long term). Chronic stress releases the stress hormone cortisol, which may give
rise to bad habits such as unhealthy eating, smoking, and drug-taking. It also
increases health risks by reducing healing power and immunity, raising blood
pressure (BP), and affecting the brain, and it can contribute to heart attack, stroke,
violence, suicide, and even cancer [18]. Even though acute stress is treatable, its
frequent occurrence may lead to chronic stress, as chronic stress develops
gradually. In general, stress is a prevalent and severe problem in the twenty-first
century, and thus early detection and accurate classification have become a
necessity. This motivates us to focus on the detection of stress in everyday
human life.
Motivated by the significance of mental stress, numerous studies
have been reported in this field. Existing methods for recognizing stress are based
on the ECG-derived study of heart rate variability (HRV) [10, 13, 16]. Delaney et al.
[3] investigated HRV-based cardiovascular reactivity to short-term psychological
stress. The stress dataset was created using a 5 min Stroop Word Color Conflict
Test. The analysis was performed in the time domain and the frequency domain.
A significant reduction was found in the standard deviation of the RR intervals and
the high-frequency HRV component, whereas the low-frequency HRV component
increased. Anne et al. [12] found that HRV is a reliable index for assessing long-term
stress effects during surgery, helping surgeons to distinguish high stress-inducing
operating techniques. In [17], the authors implemented a k-nearest neighbor (KNN)
classifier for stress detection with an HRV feature-based transformation algorithm. The
algorithm involves feature generation, selection, and dimension reduction for robust
feature generation. They used the Physionet driver database to conduct the analysis.
Notable literature is available on stress assessment based on manual feature
generation and traditional classification approaches. HRV-based stress detection
adopts a considerable window length of the signal, usually in minutes [2]. However,
acute mental stress detection requires a decision to be made within a short
window [5]. The performance of conventional HRV-based methods can be constrained
by a decreased window length. Deep learning may be a potential solution: it has
emerged swiftly and achieved impressive performance on image classification and
sequence learning tasks. A few researchers have already formulated ECG-based
stress detection as a deep learning problem [5, 8, 14]. Rastgoo et al. [11] proposed a
multimodal fusion CNN-LSTM driver stress detection network that integrates vehicle
data, contextual data, and ECG signal data to enhance the classification results,
with a remarkable accuracy of 92.8%. In [4], the authors suggested a method similar
to that of [11] to automatically recognize diversified pilot mental states. Recent deep
learning studies on stress detection use multimodal input and adopt fusion-based
models with a large number of network parameters, which are computationally
complex.
This work presents an approach to utilize the HRV parameter for the evaluation of
mental stress. The conventional method and deep learning method were employed
for the performance comparison. The conventional method offers prominent features
that allow the classifier to generate optimal results. The proposed deep learning study
combines the Inception module and the LSTM network with a minimum number of
learnable parameters. To the best of our knowledge, this is the first attempt to use an
LSTM-based Inception network for stress recognition. The contributions of the
proposed work are as follows: (1) Generation of a database with 180 recordings of a
stressed category (data were collected during the viva-voce engineering examination
of students) and 179 recordings of a normal category (data were collected during the
regular college routine); (2) Filters were designed for noise removal from the
generated ECG data; (3) Stress was evaluated using HRV parameters in different
domains, and eight highly effective features were selected; (4) SVM and ANN
classifiers were implemented with hyperparameter values chosen by trial and error
so as to obtain the most favorable classification rule; (5) The Inception-LSTM
architecture was designed to enhance the classification results.
The rest of the paper is organized into three major sections. Section 2 briefs about
the proposed methodology in which database and methods are explained, Sect. 3
presents the analysis of experimental results and discussions, and Sect. 4 concludes
the proposed approach.
2 Proposed Methodology
The proposed method detects mental stress and classifies whether
the subject is stressed or relaxed. In this study, the conventional method and the
deep learning-based method have been implemented as shown in Fig. 1. Firstly,
an electrocardiogram (ECG) signal database was acquired and preprocessed using
the designed filters explained in the preprocessing part of Sect. 2.2. The RR interval
data were extracted from the preprocessed ECG signal. The conventional method
follows the HRV analysis, in which features were extracted in various domains and
then fed as an input to the classifier. The deep learning-based method is a fusion of
a single Inception module [15], inspired by GoogLeNet and chosen for its robust
performance, and a bidirectional LSTM network [6], which learns the RR interval
sequence step-by-step in a forward and backward manner to detect the stressed state.
In this paper, two databases were used for stress detection using HRV. The first dataset
is self-generated ECG data at Shri Guru Gobind Singhji Institute of Engineering and
Technology, Nanded, Maharashtra, India. It contains 359 recordings sampled at a
frequency of 360 Hz. The data were acquired from 180 students during the practical
viva-voce semester examination as stressed, and during the regular college routine
as normal. Out of 359 recordings, 180 are stressed, and 179 are normal ones. The
short segment ECG of 3 min has been acquired with the help of surface electrodes
with lead-I configuration using the BIOPAC system (MP150) from the Biomedical
Instrumentation Lab. of the Instrumentation Department.
The publicly available Physionet driver stress ECG database (drivedb) and the
normal sinus rhythm ECG database (nsrdb) are the second dataset used in this analysis
[19]. Both Physionet datasets consist of 18 recordings of 1 h each, so there were only
a total of 36 samples. Hence to increase the dataset, we segmented the 1 h recording
into 3 min. There are 1920 samples, i.e., 1080 are stressed (drivedb) samples, and
840 are relaxed (nsrdb) samples after segmentation.
HRV signal is obtained from the RR interval, where each RR interval represents a
point in a graph known as a tachogram [1, 2]. R peak detection is the first step of
acquiring the HRV signal.
2.2.1 Preprocessing
The sampling frequencies of the two standard Physionet datasets were different. Hence,
both datasets were resampled at a rate of 360 Hz. Afterward, these
samples were filtered using the same filters as mentioned above.
The Pan-Tompkins algorithm [9] was used to find R peaks. A bandpass filter
with cut-off frequencies of 5 and 11 Hz was used to detect the QRS complex.
The algorithm then applies differentiation and squaring, which make all signal data
positive and amplify the high-frequency content, followed by a sliding window
integration with a window size of 150 ms. Two thresholds (a high threshold and a
low threshold) are adjusted to make the decision. A peak is labeled as a signal peak
if it crosses the high threshold (or the low threshold in the case of a lost peak);
otherwise, the peak is labeled as a noise peak. This algorithm provides good
efficiency for the detection of R peaks. The RR intervals were estimated as the
time duration between two adjacent R peaks. Figure 2 shows the filtered ECG and
HRV signal.
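The differentiation, squaring, and window integration stages described above, together with the RR interval computation, can be sketched as follows. This is a simplified illustration and not the full Pan-Tompkins detector: a first difference stands in for the original derivative filter, and the bandpass filter and adaptive thresholds are omitted. All function names are hypothetical.

```python
def derivative(signal):
    """First difference, used here as a simple stand-in for the derivative filter."""
    return [0.0] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]


def squared(signal):
    """Point-wise squaring: makes every sample non-negative and emphasises large slopes."""
    return [value * value for value in signal]


def moving_window_integration(signal, fs, window_ms=150):
    """Average over a sliding window of window_ms milliseconds (54 samples at 360 Hz)."""
    width = max(1, round(fs * window_ms / 1000))
    return [
        sum(signal[max(0, i - width + 1): i + 1]) / width
        for i in range(len(signal))
    ]


def rr_intervals_ms(r_peak_indices, fs):
    """RR intervals in milliseconds from the sample indices of successive R peaks."""
    return [
        (later - earlier) * 1000.0 / fs
        for earlier, later in zip(r_peak_indices, r_peak_indices[1:])
    ]
```

With a sampling frequency of 360 Hz, R peaks located 360 samples apart give RR intervals of 1000 ms, i.e., a heart rate of 60 beats per minute.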
Feature extraction improves the performance of the classifier. Two methods have
been used for feature extraction to recognize mental stress.
I. Conventional Method
II. Deep Learning-based Method.
In this work, statistical analysis (time domain) [2], Fourier analysis (frequency
domain) [2], and nonlinear analysis (Poincare plot) [7] were carried out to extract
features from the RR intervals. The extracted features are shown in Table 1.
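The time-domain and Poincare plot features can be computed directly from the RR interval series, as the sketch below illustrates. It assumes the commonly used relations SD1² = ½·Var(ΔRR) and SD2² = 2·SDRR² − ½·Var(ΔRR), together with sample (n − 1) variances; the source does not state which estimator was used. The frequency-domain features (LF, HF) are omitted because they require a spectral estimate. The function name `hrv_features` is hypothetical.

```python
import math
import statistics


def hrv_features(rr_ms):
    """Time-domain and Poincare features from a list of RR intervals in milliseconds."""
    if len(rr_ms) < 3:
        raise ValueError("at least three RR intervals are required")
    # Differences between adjacent RR intervals.
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    sdrr = statistics.stdev(rr_ms)
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    var_diff = statistics.variance(diffs)
    sd1 = math.sqrt(0.5 * var_diff)
    # Clamp at zero to guard against a negative value on very short, jittery series.
    sd2 = math.sqrt(max(2.0 * sdrr ** 2 - 0.5 * var_diff, 0.0))
    return {
        "sdrr": sdrr,
        "rmssd": rmssd,
        "sd1": sd1,
        "sd2": sd2,
        "sd1_sd2": sd1 / sd2 if sd2 > 0 else float("inf"),
    }
```

Calling `hrv_features` on the RR series of a 3 min segment yields five of the eight quantities listed in Table 1; the remaining three (LF, HF, and their ratio) would come from the Fourier analysis.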
Two classifiers were used, i.e., support vector machine (SVM) and artificial neural
network (ANN), to predict stressed and normal states. In this study, the classification
task has been accomplished using 8 potential features.
Table 1 Extracted feature values from different domains for the relaxed and stressed state
Variable | Description | Relaxed values (mean ± SD) | Stressed values (mean ± SD)
Time domain features
sdrr (ms) | The standard deviation of RR interval | 129 ± 20 | 98 ± 13
rmssd (ms) | RMS value of the difference between adjacent RR intervals | 58 ± 12 | 23 ± 9
Frequency domain features
LF (ms²) | Power in the low frequency band [0.003–0.04 Hz] | 202 ± 49 | 400 ± 134
HF (ms²) | Power in the high frequency band [0.04–0.15 Hz] | 328 ± 82 | 200 ± 60
LF/HF ratio | | 0.6 ± 1.5 | 2 ± 3.2
Poincare plot features
SD1 (ms) | The standard deviation of short-term variability of RR intervals | 54 ± 8 | 39 ± 6
SD2 (ms) | The standard deviation of long-term variability of RR intervals | 118 ± 11 | 90 ± 9
SD1/SD2 ratio | | 0.4 ± 0.2 | 0.2 ± 0.2
SVM: The performance of SVM depends on regularization parameter C and
kernel parameters. The radial basis function (RBF) kernel was used, and it is defined
as
$$f(z, z') = \exp\left(-\alpha \lVert z - z' \rVert^{2}\right) \qquad (1)$$
$$dw = m \times dw_{prev} + lr \times m \times \frac{d(CE)}{dw} \qquad (2)$$
The previous change in weight or bias is denoted by dw pr ev . The term C E is the
binary cross-entropy used to estimate the performance of the network. The hyper-
parameter values are chosen such that the network produces the least error. For the
proposed work, the learning rate (lr ) is 0.001 with a 0.89 momentum value (m).
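Equations (1) and (2) can be illustrated with a short sketch. It assumes that Eq. (1) is the usual RBF kernel exp(−α‖z − z'‖²) and that Eq. (2) reads dw = m · dw_prev + lr · m · d(CE)/dw, with the learning rate and momentum values quoted above as defaults. The function names are hypothetical, and the sign convention of the weight change follows Eq. (2) as written.

```python
import math


def rbf_kernel(z, z_prime, alpha):
    """RBF kernel of Eq. (1): exp(-alpha * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(z, z_prime))
    return math.exp(-alpha * squared_distance)


def momentum_step(dw_prev, grad_ce, lr=0.001, m=0.89):
    """Weight change of Eq. (2) from the previous change and the loss gradient.

    dw_prev : previous change in the weight or bias
    grad_ce : derivative of the binary cross-entropy with respect to the weight
    """
    return m * dw_prev + lr * m * grad_ce
```

Identical inputs give a kernel value of 1, and the value decays towards 0 as the two feature vectors move apart, at a rate controlled by α. In the momentum update, a fraction m of the previous change is carried over, which smooths successive weight changes.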
In this work, an LSTM network is combined with the Inception module,
as shown in Fig. 1. LSTM networks have achieved remarkable results in the
prediction of time-series signals such as RR interval signals. A single Inception-LSTM
module was used, and the detailed structure of the proposed network is shown in
Fig. 1 (see the zoomed portion of the Inception-LSTM module, marked by the ellipse). The LSTM
acts as a feature extractor with a many-to-many structure. The proposed Inception-LSTM-based
approach minimizes the binary cross-entropy loss with the Adam optimizer
for classification. If t is the actual output and q is the predicted output, then the loss
function is defined as
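In the notation just introduced, the textbook per-sample binary cross-entropy takes the form below. It is given as a reference sketch of the standard expression; the natural logarithm and the averaging over N training samples are assumptions, since the text does not state them.

```latex
L(t, q) = -\bigl[\, t \log q + (1 - t) \log (1 - q) \,\bigr],
\qquad
CE = \frac{1}{N} \sum_{i=1}^{N} L(t_i, q_i)
```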
The number of hidden units of the bidirectional LSTM was set to 5. To prevent overfitting,
a dropout layer with a drop rate of 0.4 was used. There are only 39,942 learnable
parameters, and hence, this network is computationally cheap. Training
terminates after 25 epochs with a fixed global learning rate of 0.01 and a batch
size of 150. The learning rates of the fully connected layer’s weights and biases were kept at 5
times the global learning rate. All the experiments were run on a system
with a 2 GB NVIDIA GeForce MX230 GPU using MATLAB
R2020a.
58 R. B. Ramteke and V. R. Thool
The ultimate objective of the research work is to detect severe stress. Each dataset
was split in a 65:15:20 ratio for training, validation, and testing of the proposed
network to achieve the objective.
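A 65:15:20 split of this kind can be sketched as follows. The helper name `split_dataset`, the fixed seed, and the random shuffling are assumptions for illustration; the text does not say whether the split was random or made per subject.

```python
import random


def split_dataset(samples, ratios=(0.65, 0.15, 0.20), seed=0):
    """Shuffle the samples and cut them into train/validation/test parts."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

If several segments come from the same person, splitting by subject rather than by segment keeps one individual's data from appearing in both training and test sets.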
The overall results comprise the classification of the stressed and relaxed conditions
of humans. Initially, HRV-based conventional methods were implemented for
classification. The time-domain features, frequency-domain features, and Poincaré
plot features were computed for HRV analysis (see Table 1). Afterward, SVM and
ANN classifiers were trained on the extracted features of each dataset to detect mental
stress. The ANN classifier achieves better accuracy than the SVM classifier, as shown in
Tables 2 and 3, because the momentum parameter used in the ANN helps the classifier
move faster toward the minimum, and the adaptive learning rate helps the
optimization process converge.
The deep learning-based method was used to improve the classification accuracy
further. To evaluate the performance of the deep learning-based model, two
approaches were considered. In the first approach, only the Inception module was trained,
while in the second approach, the Inception-LSTM network was used, which considerably
improved the performance. The results show a substantial improvement
in classification with the deep learning-based method over the conventional
method. The overall classification results improved further when the Inception
module was replaced with the Inception-LSTM network, which learns the sequential features in
both forward and backward directions; this signifies the importance of learning
long-term dependencies in time-series data.
4 Conclusions
References
1. Acharya, U.R., Joseph, et al.: Heart rate variability: a review. Med. Biol. Eng. Comput. 44(12),
1031–1051 (2006)
2. Camm, A.J., Malik, M., et al.: Heart rate variability: standards of measurement, physiological
interpretation and clinical use. In: Task Force of the European Society of Cardiology and the
North American Society of Pacing and Electrophysiology, pp. 1043–1065 (1996)
3. Delaney, J.P.A., et al.: Effects of short-term psychological stress on the time and frequency
domains of heart-rate variability. Percept. Motor Skills 91(2), 515–524 (2000)
4. Han, S.-Y., Kwak, N.-S., et al.: Classification of pilots’ mental states using a multimodal deep
learning network. Biocybern. Biomed. Eng. 40(1), 324–336 (2020)
5. He, J., Li, K., Liao, X., Zhang, P., Jiang, N.: Real-time detection of acute cognitive stress using a
convolutional neural network from electrocardiographic signal. IEEE Access 7, 42710–42717
(2019)
6. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780
(1997)
7. Hoshi, R.A., Pastre, C.M., et al.: Poincaré plot indexes of heart rate variability: relationships
with other nonlinear variables. Auton. Neurosci. 177(2), 271–274 (2013)
8. Oskooei, A., Chau, S.M., et al.: DeStress: Deep Learning for Unsupervised Identification
of Mental Stress in Firefighters from Heart-rate Variability (HRV) Data. arXiv preprint
arXiv:1911.13213 (2019)
9. Pan, J., Tompkins, W.J.: A real-time QRS detection algorithm. IEEE Trans. Biomed. Eng.
32(3), 230–236 (1985)
10. Ramteke, R., Thool, V.R.: Stress detection of students at academic level from heart rate variability.
In: 2017 International Conference on Energy, Communication, Data Analytics and Soft
Computing (ICECDS), pp. 2154–2157. IEEE (2017)
11. Rastgoo, M.N., et al.: Automatic driver stress level classification using multimodal deep learning.
Expert Syst. Appl. 138, 112793 (2019)
12. Reijmerink, I., et al.: Heart rate variability as a measure of mental stress in surgery: a systematic
review. Int. Arch. Occup. Environ. Health 1–17 (2020)
13. Rigas, G., et al.: Real-time driver’s stress event detection. IEEE Trans. Intell. Transp. Syst.
13(1), 221–234 (2011)
14. Seo, W., Kim, N., Kim, S., et al.: Deep ECG-respiration network (DeepER net) for recognizing
mental stress. Sensors 19(13), 3021 (2019)
15. Szegedy, C., Vanhoucke, et al.: Rethinking the inception architecture for computer vision.
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.
2818–2826 (2016)
16. Tanev, G., et al.: Classification of acute stress using linear and non-linear heart rate variability
analysis derived from sternal ECG. In: 2014 36th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society, pp. 3386–3389. IEEE (2014)
17. Wang, J.-S., et al.: A k-nearest-neighbor classifier with heart rate variability feature-based
transformation algorithm for driving stress recognition. Neurocomputing 116, 136–143 (2013)
18. The Science of Stress. https://www.slma.cc/the-science-of-stress/. Last accessed 20 Aug 2020
19. PhysioBank Databases. https://archive.physionet.org/physiobank/database/. Last accessed 15
Nov 2020
Product-Based Market Analysis Using
Deep Learning
Abstract Product Market Analysis studies how the market reacts to a product
manufactured by a company. In this paper, a deep-learning-based model is created
that can infer how a customer feels about a particular product. The
dataset used is “fer2013” (Ref. Kaggle Dataset), which is widely used for
facial-expression-based “Sentiment Analysis.” The model developed is a self-made model giving a training
accuracy of 68.61% and a test accuracy of 65.92%. The self-made model is a 27-layer deep
convolutional neural network that includes 8 convolutional layers, three max-pooling
layers, and two fully connected layers. The model is created using Keras, which is a
framework built on TensorFlow, a machine learning library. The proposed model has a total of 427,319
parameters, of which 426,839 are
trainable and 480 are non-trainable.
1 Introduction
Product Market Analysis is the process of assessing the market or the public to
fully understand what they require or how they react to a particular
product. It studies demand for economic purposes and establishes what the
end-user wants or requires. Based on a market survey, the developers of a particular
product can fix the issues and plan a release strategy to succeed. Market analysis
also informs the expected profit margin, because mostly positive reviews
can yield a more significant profit margin.
Every company that plans to launch a product or feature has to go through
Market Analysis. There are mainly two types of Market Analysis, as shown in Fig. 1.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 63
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_6
64 A. Kumaria et al.
1. Phase 1:
a. The target market is identified, and requirements are understood.
b. Requirements and Features are gathered from the target audience, and a
fair price point is decided based on the complexity.
c. Based on the requirements, a product is developed, keeping the target
audience in mind.
2. Phase 2:
a. The product is sent out to various users in the form of a “testing” phase.
This is known as the Alpha Test and Beta Test.
b. The testers review the product and give feedback to the developers and
notify them about bugs and issues. They also mention what further
improvements could be made.
To date, the second phase has been a strenuous process. Most software
has an automatic bug-capturing feature, but it never gives the full story. In addition,
the user experience has to be expressed to the
developers manually. For products like video games, the testers need to call and
inform the developers about bugs and improvements manually. Most Android
phones on the market have an automatic bug-capturing feature, but it cannot capture the user’s
opinion of the interface or whether the user is happy with the product.
This software is made keeping in mind the second kind of analysis. Let us take
an example of an application. The application is a video game created for mobile
devices. In the eye of the creators and the company, the video game is already perfect.
But when you send this game out to the public, issues may arise, such as unnoticed
bugs, hard-to-understand user interface, and so on. Hence beta and alpha testing is a
necessary process. The application is sent out to a handful of registered users around
the world. They use the app and try finding bugs and report this back to the company
that works on eliminating them. In some cases, the testers are also supposed to say
how they feel about the application.
Due to the rise of Artificial Intelligence and Deep Learning, Convolutional Neural
Networks have been able to find a way to record human sentiments or the emotions a
person is displaying based on something as simple as an image. While the sentiment
analysis feature can be used for Market Analysis, it has never really been used much.
Hidden layers in a ConvNet can automatically identify features in a person’s face.
The earlier layers might identify lines, and it builds up to identify smiles, eyebrows,
etc. The hidden layers do all this, so there is no need for another human being to
manually deduce a person’s emotions.
One of the most widely known datasets for performing sentiment analysis is the
“fer2013” dataset, consisting of 35,887 rows. Each row is a unique image depicting
a particular emotion. There are three columns in the dataset: one holding the pixels
that make up the image, another holding the emotion id, and the last one indicating
whether the image is part of the training or the testing data. There are 28,709 images for training
the model and 7178 images for testing it. The dataset covers seven different
emotions: Happy, Sad, Surprised, Angry, Disgust, Neutral, and
Fear. Each image shows a face centered within a 48 × 48 grayscale frame.
On this dataset, the proposed model reaches an accuracy of 68.61% on the training
data and 65.92% on the testing data.
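The three-column layout described above can be read with the standard library alone. In this sketch the column names `emotion`, `pixels`, and `Usage`, and the space-separated pixel encoding, are assumptions based on the commonly distributed CSV release of the dataset; the helper names are hypothetical.

```python
import csv
import io


def parse_fer_row(row, side=48):
    """Turn one CSV row into (emotion_id, usage, image as side x side ints)."""
    values = [int(v) for v in row["pixels"].split()]
    if len(values) != side * side:
        raise ValueError("unexpected number of pixel values")
    # Rebuild the flat pixel list as a list of image rows.
    image = [values[i * side:(i + 1) * side] for i in range(side)]
    return int(row["emotion"]), row["Usage"], image


def read_fer_csv(text):
    """Parse CSV text with the assumed header into a list of parsed rows."""
    return [parse_fer_row(r) for r in csv.DictReader(io.StringIO(text))]
```

The `Usage` field is what separates the 28,709 training rows from the 7178 test rows.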
In Sect. 2, a discussion on related work is provided. The proposed system is
explained in Sect. 3. Then, in Sect. 4, results are discussed and analyzed. Our proposed
work is concluded in Sect. 5.
2 Related Work
Minaee and Abdolrashidi [1] identify facial expressions using convolutional
neural networks and report an accuracy of 70.02%. Pramerdorfer and Kampel [2] applied
CNNs to facial expression recognition using
deep Convolutional Neural Network architectures such as VGG, Res-Net, and Inception.
The VGG Net of depth 10 achieved an accuracy of 72.7%, the Inception model
of depth 16 achieved 71.6%, and the Res-Net model of depth 33 achieved
72.4%.
Badjatiya et al. [3] proposed a model that identifies whether
a particular tweet is hurtful to any community in any way. It uses a deep learning
model to classify the nature of a specific tweet. Poria et al. [4] presented a multimodal
dataset for emotion recognition. The model analyzes a particular person’s emotion
after training on a dataset created from the TV series Friends. Hence, along with
facial expressions, it can use tone and speech to understand the sentiment. A
review paper by Mäntylä et al. [5] discusses the rise of sentiment analysis through
the years and how it is related to customer feedback. It shows that the
customer’s sentiment indicates how the product performs in the market. Goodfellow
et al. [6] took part in a competition where they implemented facial expression
The system flow of the proposed software, as shown below in Fig. 2, would have the
following actors:
1. The Reviewer
a. would be able to see the available products and review them.
b. Once a particular user reviews a product, it cannot be reviewed again by
the same user hence keeping the reviews authentic and unbiased.
2. The Admin can
a. add or delete a product.
b. see the reactions of people to their products.
The system starts the user’s front camera and, using cv2’s Haar cascade,
tries to find a face in every frame. This face is passed into the Deep Learning model,
where the facial expression is recognized. The system repeats this process
in a loop till the user decides to stop the review, after which the most shown facial
expression is extracted, and the category is updated in the database. The Admin can
see this category alongside the respective product to know how users feel about it.
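The last step of that loop, picking the most shown expression once the review ends, can be sketched as a simple majority vote over the per-frame predictions. The function name is hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption, since the text does not specify one.

```python
from collections import Counter


def most_shown_expression(frame_labels):
    """Return the label predicted most often across the frames of one review."""
    if not frame_labels:
        return None  # no face was detected during the review
    return Counter(frame_labels).most_common(1)[0][0]
```

For example, `most_shown_expression(["Happy", "Neutral", "Happy"])` yields `"Happy"`, which is the category that would be stored next to the product.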
The extracted face is preprocessed before it is passed into the proposed Convolutional
Neural Network model: it is converted into grayscale and then resized to 48
× 48 dimensions. Pixel values range from 0 to 255, so to reduce the computation, each
pixel is divided by 32 and stored as a “float32” value. The same preprocessing
is applied to each image in the dataset. Because the data the proposed Convolutional
Neural Network model has been trained on and the data passed in for prediction
are preprocessed in the same way, the margin of error decreases significantly. In addition,
a manual data augmentation technique is applied in which the image is flipped
vertically, which provides more data for training and testing. It was found
that this augmentation increased the accuracy by approximately 6 percentage points,
from 62 to 68%. Out of the various optimization techniques available,
such as Gradient Descent, RMSProp, and Momentum, the Adaptive Moment
Estimation (Adam) optimizer was chosen, since it was one of the best-performing
optimizers and combines the RMSProp and Momentum update rules
into one high-performance optimizer.
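The scaling and flipping steps can be illustrated without any imaging library. This is a dependency-free sketch on nested lists with hypothetical function names; grayscale conversion, resizing, and the float32 cast are assumed to have been done elsewhere (in practice with OpenCV and NumPy).

```python
def scale_pixels(image, divisor=32.0):
    """Divide every grayscale pixel (0-255) by `divisor`, returning floats."""
    return [[pixel / divisor for pixel in row] for row in image]


def flip_vertical(image):
    """Flip an image top to bottom by reversing the order of its rows."""
    return [list(row) for row in reversed(image)]


def augment(images):
    """Return the original images followed by their vertically flipped copies."""
    return list(images) + [flip_vertical(img) for img in images]
```

Dividing by 32 maps the 0 to 255 range onto roughly 0 to 8, and the augmentation step doubles the number of available images.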
The architecture of the proposed Convolutional Neural Network Model is
illustrated in Fig. 3.
The proposed model was trained on an ASUS ROG Strix G notebook with the
following specifications:
• Intel i7 9th Generation Processor,
• 8 GB DDR4 RAM,
• 512 GB SSD,
• NVIDIA GeForce GTX 1650 4 GB Graphics Card, and
• Windows 10 Home Operating System.
The following libraries are used to run the proposed software:
• numpy (1.16.4),
• matplotlib (3.1.3),
• pandas (1.0.1),
• opencv_python (3.4.2.17),
• Keras (2.3.1), and
• Pillow (8.0.1).
The proposed model was trained for 200 epochs with a batch size of 64. The loss at
each epoch was plotted to assess the performance of the ConvNet. The
initial loss was approximately 1.8, but within 25–30 epochs, it fell
to approximately 1.0. The training loss continued to fall to approximately 0.8, whereas
the test loss remained at approximately 1.0. The graph is illustrated in Fig. 4.
The classification report for the proposed model is shown in Fig. 5 (Table 1).
Table 1 Comparison between the proposed model and other well-known models
Model name                                        Accuracy   Extra training data
Ensemble ResMaskingNet with six other CNNs [6]    76.82%     ✔
Residual Masking Network [6]                      74.14%     ✔
VGG [2]                                           72.7%      ✘
Res-Net [2]                                       72.4%      ✘
Inception [2]                                     71.6%      ✘
DeepEmotion [1]                                   70.02%     ✘
Proposed model                                    68.61%     ✘
The proposed model can predict the seven facial emotion categories,
such as Happy, Sad, Surprised, and Neutral. Figure 6 shows a
snapshot of live feed prediction where the “Happy” emotion is successfully detected.
5 Conclusions
References
1. Minaee, S., Abdolrashidi, A.: Deep-emotion: facial expression recognition using attentional
convolutional network (2019)
2. Pramerdorfer, C., Kampel, M.: Facial expression recognition using convolutional neural
networks: state of the art (2016)
3. Badjatiya, P., Gupta, S., Gupta, M., Varma, V.: Deep learning for hate speech detection in
tweets (2017). https://doi.org/10.1145/3041021.3054223
4. Poria, S., Hazarika, D., Majumder, N., Naik, G., Cambria, E., Mihalcea, R.: MELD: A multimodal
multi-party dataset for emotion recognition in conversations, pp. 527–536 (2019).
https://doi.org/10.18653/v1/P19-1050
5. Mäntylä, M., Graziotin, D., Kuutila, M.: The evolution of sentiment analysis—a review of
research topics, venues, and top cited papers. Comput. Sci. Rev. 27 (2016).
https://doi.org/10.1016/j.cosrev.2017.10.002
6. Goodfellow, I., Erhan, D., Carrier, P., Courville, A., Mirza, M., Hamner, B., Cukierski, W., Tang,
Y., Thaler, D., Lee, D.-H., Zhou, Y., Ramaiah, C., Feng, F., Li, R., Wang, X., Athanasakis, D.,
Shawe-Taylor, J., Milakov, M., Park, J., Bengio, Y.: Challenges in representation learning: a
1 Introduction
The number of vehicles on the road is increasing day by day, and road accidents have
become common in most parts of the country and a leading cause of death. The
person behind the steering wheel is responsible for road traffic safety, both for
himself and for the passengers in the vehicle. Drowsiness is a human trait that
many people ignore when it comes to their safety. If it is not recognized and acted
upon, it can endanger the driver and the passengers and may lead to a fatal
accident. Driver drowsiness is therefore a pressing issue that
needs to be addressed to improve road traffic safety. Driver drowsiness detection
is an essential component of modern driver monitoring systems because too
many traffic accidents happen worldwide due to drivers’ drowsiness. There
are various attempts in the literature to detect the drowsiness of the driver. We have
studied some approaches, as given in Table 1.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 73
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_7
74 A. Rajkar et al.
This paper is organized further as follows: In Sect. 1, a discussion on related work
is provided. The proposed system is explained in Sect. 2. Then, in Sect. 3, results are
discussed and analyzed. Our proposed work is concluded in Sect. 4.
2 Proposed Approach
This section details the proposed approach to detecting driver drowsiness, which works
on two parameters: the state of the eyes and yawning. The process starts by capturing the
live video stream from the camera, which is processed and sent to the model to predict drowsiness. Using the
OpenCV library, each frame of the video stream is cropped to the face and the eye region. Each
frame is then checked to determine whether the eyes are open or closed. If the
eyes remain closed for longer than a specific time set in the system, drowsiness
is detected, and the system alerts the driver and the passengers with an alarm. The
same process is followed to detect whether the driver is yawning. The subsequent
sections detail the working of each module. Figure 1 shows the flow of the proposed
approach.
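The time-threshold rule described above can be sketched as a small state tracker that is fed one classification result per frame. The class name `ClosedStateTimer` and the use of seconds as the unit are assumptions; the same tracker could be reused for the yawning state.

```python
class ClosedStateTimer:
    """Signal an alert when a monitored state (eyes closed, yawning) persists."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s
        self._since = None  # timestamp at which the current closed run began

    def update(self, is_closed, timestamp_s):
        """Feed one frame; return True when the alarm should sound."""
        if not is_closed:
            self._since = None  # eyes reopened, so the run is over
            return False
        if self._since is None:
            self._since = timestamp_s
        return timestamp_s - self._since >= self.threshold_s
```

A short blink resets the timer before the threshold is reached, so only sustained eye closure triggers the alarm.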
Driver Drowsiness Detection Using Deep Learning 75
2.1 Datasets
There are some standard datasets available for drowsiness detection. In this paper,
the following datasets are used.
a. YawDD VIDEO DATASET [6]
The Closed Eyes in the Wild (CEW) dataset [7] contains 2423 subjects: 1192 people with
closed eyes and 1231 people with open eyes. The images of open eyes are taken from the
Labeled Faces in the Wild dataset. Some examples from the CEW dataset are given in Fig. 3.
Fig. 3 Closed eyes in the wild (Ref. Closed Eyes In The Wild [7])
In this paper, the driver’s face in real-time videos is detected using the Haar cascade
classifier from the OpenCV library. OpenCV is an open-source library that is primarily used
for computer vision, and it is also used for image processing and machine learning.
It supports many programming languages, such as Python, Java, and C++, and can
process images to identify faces, objects, and more. OpenCV’s
built-in Haar feature-based cascade classifiers are used
to detect the face and the eye region in the input. The cascade is pre-trained on many positive
and negative images and can then detect objects in other images, which makes this
a machine learning-based approach. Detecting the face and eye regions
is a crucial step in determining drowsiness. The detection of face and eye regions is shown in
Fig. 4.
To use the CNN model on the YawDD dataset, the videos need to be converted into images,
which are then resized to 24 × 24 resolution. The face was then located using the
OpenCV library, and the image was converted into grayscale. Each image is labeled
according to whether the mouth is open or closed: the open mouth state
is “1,” and the closed mouth state is “0,” and the labels are saved into a CSV file. The CEW
dataset was available in the form of cropped eyes at 24 × 24 resolution and in
grayscale. The open eye state is labeled “1,” and the closed eye state is labeled
“0,” and the labels are saved into a CSV file. The data is split into 80% for training and 20%
for validation when training the model. Each pixel is later divided by 32 and saved
as a float32 value.
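One way to picture the CSV step is a row that stores the label first and the 576 raw pixel values after it, with the division by 32 applied when the row is loaded. The row layout and the helper names below are assumptions for illustration; the text only states that labels go into a CSV file and that pixels are scaled later.

```python
import csv
import io


def to_csv_row(label, image):
    """Flatten a grayscale image into one row: label first, then raw pixels."""
    if label not in (0, 1):
        raise ValueError("label must be 0 (closed) or 1 (open)")
    return [label] + [pixel for row in image for pixel in row]


def write_rows(rows):
    """Serialize rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def load_row(fields, divisor=32.0):
    """Parse one CSV row back into (label, scaled pixel list)."""
    return int(fields[0]), [int(v) / divisor for v in fields[1:]]


def train_val_split(rows, train_fraction=0.8):
    """Keep the first 80% of rows for training and the rest for validation."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

In practice the rows would be shuffled before the 80/20 cut so that both parts contain open and closed examples.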
The system starts the user camera and, using OpenCV’s Haar cascade, detects the user’s face
and eye regions frame by frame. These frames are then forwarded
to the trained CNN model, which outputs whether the eyes are open or
closed and, similarly, whether the driver is yawning. If the eyes remain closed
for the given time threshold, the system raises a drowsiness alert. Likewise, if
the user yawns repeatedly for the given time threshold, the system raises a
drowsiness alert. The time threshold is configurable and can be set as required.
The proposed deep learning model is a convolutional neural network. After many
trials, we settled on a model with four conv2d layers and four max-pooling layers,
whose output is flattened and followed by two dense layers. To prevent overfitting, four
dropout layers were used.
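To see how four convolution and pooling stages interact with a 24 × 24 input, the spatial size can be traced layer by layer. The 3 × 3 kernels, stride 1, 2 × 2 pooling, and the choice between "same" and "valid" padding are all assumptions, since the text gives none of these values; the sketch only illustrates the arithmetic.

```python
def conv_out(size, kernel=3, stride=1, padding="same"):
    """Output width of a square convolution for 'same' or 'valid' padding."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    if padding == "valid":
        if size < kernel:
            raise ValueError("feature map is smaller than the kernel")
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling."""
    return size // pool


def trace_sizes(size, blocks=4, padding="same"):
    """Widths after each conv + pool block, starting with the input width."""
    sizes = [size]
    for _ in range(blocks):
        size = pool_out(conv_out(size, padding=padding))
        sizes.append(size)
    return sizes
```

Under these assumptions, `trace_sizes(24)` gives `[24, 12, 6, 3, 1]`, while "valid" padding runs out of pixels before the fourth block, so a stack this deep on such a small input would need padded convolutions.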
Two datasets (the Yawning Detection Dataset and CEW) are used for training and testing the CNN
model. With the help of OpenCV’s Haar cascade algorithm,
the face region and eye region are determined. Since two features, i.e.,
eyes and yawn, were to be learned, two CNN models were used. Adam’s optimization
algorithm was used to train the proposed model since it was preferred over the plain stochastic
gradient descent procedure. After a couple of tries, a model with three convolutional
layers was selected, as it gave the best accuracy. To improve the performance, the
original photos from the YawDD dataset were resized to 24 × 24 resolution.
Categorical cross-entropy loss, which is also called softmax
loss, was used to calculate the loss. Table 2 shows the training and validation losses for the Closed Eyes in the Wild
dataset, and Table 3 shows the training and validation losses for the yawning
detection dataset. Figure 5a, b shows the loss values at epochs 1, 10, 20, 30,
40, and 50. The results indicate that the loss decreases as the number of
epochs increases, which suggests that the proposed model trained successfully. The training,
validation, and average accuracies are shown in Table 4. The comparison of average
accuracy is given in Table 5.
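The categorical cross-entropy (softmax) loss mentioned above can be sketched for a single sample as follows. The function names and the small `eps` guard against `log(0)` are illustrative choices, not details from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))
```

With two classes and equal scores, the predicted probabilities are 0.5 each and the loss equals ln 2, which is the value an untrained binary classifier typically starts from.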
Driving after alcohol consumption is another serious problem among drivers.
Notable research has been reported on the early detection and prediction of the effects of
alcoholism [9–11]. The proposed work can be extended in this direction. Further, Internet
of Things (IoT)-based systems have also become popular because of their location-independent services
[12–14], and the reported work may be extended in this direction as well. An IoT-enabled
system can provide an early alarm to the traffic control unit to avoid accidents.
4 Conclusions
This paper aims to detect the drowsiness of drivers using a deep learning approach.
CNN models are utilized to detect driver drowsiness in real time. OpenCV’s Haar
cascade algorithm is used to detect the driver’s face and eye regions. Then, the system is
trained with the proposed convolutional neural network for the detection of drowsiness.
The driver drowsiness system works
successfully in real time, with an average accuracy of 96%. In future work, performance
could be improved by using a larger dataset. Face recognition could be
added as a further feature to prevent vehicle theft. The system could also be converted into a mobile
application for easier use.
References
1. Gwak, J., Hirao, A., Shino, M.: An investigation of early detection of driver drowsiness using
ensemble machine learning based on hybrid sensing. Appl. Sci. 10(8), 2890 (2020).
https://doi.org/10.3390/app10082890
2. Kepesiova, Z., Ciganek, J., Kozak, S.: Driver drowsiness detection using convolutional neural
networks. In: 2020 Cybernetics & Informatics (K&I) (2020).
https://doi.org/10.1109/ki48306.2020.9039851
3. You, F., Li, X., Gong, Y., Wang, H., Li, H.: A real-time driving drowsiness detection algorithm
with individual differences consideration. IEEE Access 7, 179396–179408 (2019).
https://doi.org/10.1109/access.2019.2958667
4. Mehta, S., Dadhich, S., Gumber, S., Bhatt, A.J.: Real-time driver drowsiness detection system
using eye aspect ratio and eye closure ratio. SSRN Electron. J. (2019).
https://doi.org/10.2139/ssrn.3356401
5. Sathasivam, S., Mahamad, A.K., Saon, S., Sidek, A., Som, M.M., Ameen, H.A.: Drowsiness
detection system using eye aspect ratio technique. In: 2020 IEEE Student Conference on
Research and Development (SCOReD) (2020).
https://doi.org/10.1109/scored50371.2020.9251035
6. Abtahi, S., Omidyeganeh, M., Shirmohammadi, S., Hariri, B.: YawDD: yawning detection
dataset. IEEE Dataport (2020). https://doi.org/10.21227/e1qm-hb90
7. Song, F., Tan, X., Liu, X., Chen, S.: Eyes closeness detection from still images with multi-scale
histograms of principal oriented gradients. Pattern Recogn. (2014)
8. Savas, B.K., Becerikli, Y.: Real time driver fatigue detection system based on multi-task
ConNN. IEEE Access 8, 12491–12498 (2020). https://doi.org/10.1109/access.2020.2963960
9. Bavkar, S., Iyer, B., Deosarkar, S.: Rapid screening of alcoholism: an EEG based optimal
channel selection approach. IEEE Access 7, 99670–99682 (2019). https://doi.org/10.1109/
ACCESS.2019.2927267
10. Bavkar, S., Iyer, B., Deosarkar, S.: BPSO based method for screening of alcoholism. In: Kumar,
A., Mozar, S. (eds.) ICCCE 2019. Lecture Notes in Electrical Engineering, vol. 570, pp. 47–53.
Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-8715-9_6
11. Bavkar, S., Iyer, B., Deosarkar, S.: Optimal EEG channels selection for alcoholism screening
using EMD domain statistical features and harmony search algorithm. Biocybern. Biomed.
Eng. 41(1), 83–96 (2021)
12. Deshpande, P., Iyer, B.: Research directions in the internet of every things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, pp. 1353–1357 (2017). https://doi.org/10.1109/CCAA.2017.8230008
13. Deshmukh, D., Iyer, B.: Design of IPSec virtual private network for remote access. In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, pp. 716–719 (2017). https://doi.org/10.1109/CCAA.2017.8229894
14. Iyer, B., Patil, N.: IoT enabled tracking and monitoring sensor for military applications. Int. J.
Syst. Assur. Eng. Manag. 9, 1294–1301 (2018). https://doi.org/10.1007/s13198-018-0727-8
Emotion Detection from Social Media
Using Machine Learning Techniques:
A Survey
Abstract The work carried out in this paper is to overview and compare various
sentiment analysis methodologies and approaches in detail with Sentiment Emotion
Detection (SED) and also to discuss the limitations of existing work and future
directions for sentiment analysis methodologies on SED. The main goal of sentiment
analysis for market prediction is to recognize the
customer’s opinion about the available products. It can pave the way for improvement
and prevent future defects and flaws. Tools for identifying and classifying
the opinion communicated in a piece of text, in sound, or in video indicate whether
the creator’s mood toward a specific issue, thread, item, and so on is positive, negative,
or neutral. Human emotions are not limited to being positive or negative; they span
more categories, like happiness, sadness, joy, disgust, surprise, depression, frustration,
anger, fear, confidence, trust, anticipation, shame, kindness, love, friendship, faith,
and wonder. Analyzing people’s comments and emotions is essential for countries,
businesses, and individuals to sustain themselves, which motivates research on
sentiment analysis for emotion detection.
V. Ahire (B)
RCPET’s Institute of Management Research and Development, Shirpur, India
S. Borse
SSVPSs. Late Karmveer Dr. P. R. Ghogrey Science College, Dhule, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 83
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_8
1 Introduction
Every minute of the day, a tremendous amount of data is generated by social media
networks. Social media like YouTube, Facebook, Twitter, LinkedIn, WhatsApp,
Reddit, or any product website are available online around the globe. It is where
people spend a lot of time sharing their thoughts, views, and opinions across the
world [1, 2]. When people share their thoughts through social media, they express
their emotions directly or indirectly. The process of analyzing this expression is
called Sentiment Analysis.
Sentiment Analysis (SA) is part of social network analysis (SNA). With the help
of social media, various known or unknown entities form structures in social
networks. These structures link a user with family members, groups, colleagues, and peers, and may
also connect users for commercial purposes [3, 4]. The interconnected
individuals in a social media network are called groups or communities. People get
connected on social media platforms through attributes like their relationships and the similarity
of their interests or habits. It is found that people place more confidence in
those they trust and follow them to find solutions
to their real-life problems. These networks generate a large amount of data,
which is highly important for understanding the user’s thoughts
accurately. This paper provides a detailed overview and survey of emotion detection
from social media using machine learning techniques. Section 1 contains
the introduction to SNA; Sect. 2 contains the background of SNA, covering data
collection/acquisition, data cleaning, clustering or community detection, sentiment
analysis levels, and some sentiment analysis approaches. Section 3 contains related
work on sentiment detection, with a table that reviews emotion
detection in sentiment analysis and briefly describes which dataset the researchers
used and which approach they implemented to achieve their results. Section 4 contains
a discussion and future research directions, followed by a conclusion in Sect. 5.
2 The Background
spaces where people comment in their choice of social network. Here, analyzing
people’s comments is essential for producers, countries, and other entities to offer
people or consumers the best services. Hence, it is crucial to recognize the emotions
that regular users of products or services express. This information collected from
customers or individuals encodes their feelings regarding their purchases. This
analysis is essential for any organization’s decision-making process, covering what
people say, how they say it, and what they mean, to ensure its growth.
The following steps are used for sentiment analysis.
The data cleaning steps turn the dataset into a far more informative format. Data
cleansing means finding and then modifying or erasing corrupt or inappropriate records
in the collected data, and it prepares the data for classification. Data cleaning can
also be done with recent machine learning techniques.
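The cleaning step can be illustrated with a minimal rule-based sketch. The survey does not prescribe specific rules, so the rules below (lowercasing, removing URLs and user mentions, keeping hashtag words, dropping empty and duplicate posts) and the function names `clean_post` and `clean_dataset` are assumptions chosen only for illustration.

```python
import re

# Assumed cleaning rules; the paper does not specify them.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")


def clean_post(text):
    """Normalize one raw social media post into plain lowercase text."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove links
    text = MENTION_RE.sub(" ", text)   # remove @user mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    text = NON_TEXT_RE.sub(" ", text)  # remove punctuation, emoji, other symbols
    return " ".join(text.split())      # collapse repeated whitespace


def clean_dataset(posts):
    """Clean every post, then drop empty and duplicate records."""
    seen = set()
    cleaned = []
    for post in posts:
        text = clean_post(post)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

For example, `clean_dataset(["Good!", "good", "   ", "@user"])` keeps a single record, `"good"`, because the other posts become duplicates or empty strings after cleaning.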
After collecting data from different social media networks, the next step is to
form groups. Groups can be formed from many features, such as likes, dislikes,
behavior, culture, and emotion. The process of finding the interrelated groups in the
networks is called community detection. Many algorithms have been developed for
community detection in social media network data. These algorithms are classified
into approaches based on clustering, graph partition, genetic algorithms, label
propagation, etc. [6].
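Of the families listed above, label propagation is the simplest to sketch. The version below is a simplified, deterministic variant written for illustration: nodes are visited in sorted order and ties are broken by taking the largest label, whereas standard label propagation uses random order and random tie-breaking. The function name and the graph representation are assumptions.

```python
from collections import Counter


def label_propagation(adjacency, max_rounds=100):
    """Assign a community label to each node of an undirected graph.

    adjacency maps every node to the set of its neighbours.
    """
    labels = {node: node for node in adjacency}  # each node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):
            neighbours = adjacency[node]
            if not neighbours:
                continue
            counts = Counter(labels[n] for n in neighbours)
            best = max(counts.values())
            # Assumed tie-break: largest label among the most frequent ones.
            new_label = max(lab for lab, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:  # stop once no label changes in a full pass
            break
    return labels


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
communities = label_propagation(graph)
```

On this small graph the two triangles end up with different labels, which corresponds to two communities connected by one bridging edge.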
Sentiment polarity classification operates at three analysis levels, depending on the
given data [7, 8].
Document level: This sentiment classification level considers the opinion of the
entire document and predicts the document’s view as either positive or negative.
Sentence level: This sentiment classification level considers the opinion of a
sentence and predicts the sentence’s view as positive, negative, or neutral. The
sentence can be subjective with a positive, negative, or neutral state.
Aspect level: This level of sentiment classification classifies the sentiment
concerning specific aspects of entities. Instead of looking at language constructs,
the aspect level directly looks at the opinion itself. Each of these levels of
sentiment analysis conveys the views or emotions at that level.
Sentiment analysis tasks use different strategies, which are classified mainly into
three types of approaches:
Lexicon-based approach: It is an unsupervised mechanism. It works on the polarity of
the sentence and measures positive, negative, or neutral forms. It has two basic
variants: the dictionary-based and the corpus-based approach.
Machine learning approach: It is mainly classified into two procedures, supervised
learning and unsupervised learning. Supervised learning techniques predict the
polarity of the target or test data based on a training dataset with a finite set of
classes, such as positive and negative. In contrast, an unsupervised learning
technique is used when no prior training dataset can be provided for the data. More
broadly, there are four approaches to machine learning:
a. Supervised.
b. Unsupervised.
c. Semi-supervised.
d. Reinforcement learning.
Hybrid approach: It combines the above two strategies.
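The dictionary variant of the lexicon-based approach can be sketched in a few lines. The word list, the scores, and the single negation rule below are hypothetical and far smaller than any real sentiment lexicon; they only show how polarity is read off a dictionary without any training data.

```python
# Toy lexicon: hypothetical words and scores chosen only for illustration.
LEXICON = {
    "good": 1, "great": 2, "love": 2, "happy": 1,
    "bad": -1, "terrible": -2, "hate": -2, "sad": -1,
}
NEGATIONS = {"not", "no", "never"}


def polarity(sentence):
    """Classify a sentence as positive, negative, or neutral from the lexicon."""
    score = 0
    negate = False
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATIONS:
            negate = True  # flips the next lexicon word that appears
            continue
        value = LEXICON.get(word, 0)
        if value:
            score += -value if negate else value
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A supervised or hybrid system would replace or complement the fixed dictionary with a classifier trained on labeled examples.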
Emotion Detection from Social Media Using Machine Learning … 87
Much work has been performed using machine learning techniques, but it has limitations
such as disregarding a word’s contextual meaning, a high number of misclassifications,
a limited number of classes, and weak extraction of context information. To overcome
these shortcomings, some researchers recommend deep learning techniques for improved
performance.
Table 1 summarizes existing work on emotion detection [11–23], along with its
limitations and future work.
Some common challenges in emotion and sentiment analysis are the following:
1. Text in messages is frequently noisy, with incorrect syntax.
2. In many languages, a single word may have several meanings; hence, polarity
depends on the context.
3. The vocabulary is not limited. New words may be introduced by named entities as
well as by user errors and deliberate misspellings.
4. Some sentences might be mocking (sarcastic), thus producing an inappropriate
result.
5. Many times, sentiments are vague because multiple opinions are mentioned about the
same subject.
Many researchers report that sentiment analysis with emotion detection is carried out
with lexicon-based and machine learning techniques using a limited set of algorithms.
Further, EEG analysis and the Bigdata approach can also be used for emotion
detection [24].
5 Conclusions
Table 1 (continued)
Columns: Ref No.; Dataset; Approach; Outcome; Limitation; Future Work

[17]
Dataset: ISEAR dataset
Approach: Machine learning (Multinomial NB, SVM, DTC and KNN)
Outcome: Multinomial NB gave good results
Limitation: Complex emotion can’t be predicted accurately
Future Work: Complex emotion can be solved after adding features or rules-based approaches

[18]
Dataset: AWS dataset
Approach: Deep learning (CNN and Bi-LSTM)
Outcome: Result of long short-term memory analysis upgraded as related to the basic neural network model
Limitation: In future, we can consider other axioms of sentiment to understand emotion more accurately for the specific domain
Future Work: To get higher performance, use convolution deep learning algorithms

[19]
Dataset: ISEAR
Approach: Machine learning (SVM)
Outcome: It shows enhancement in routine associated with baseline
Limitation: Discounted relation between features
Future Work: Improve performance with a hybrid approach

[20]
Dataset: EmotionLine
Approach: Machine learning
Outcome: F1-score for Friends 0.815 and EmotionPush 0.885
Limitation: Amount of data insufficient
Future Work: To get higher performance, use a hybrid approach

[21]
Dataset: SemEval
Approach: Machine learning
Outcome: LSTM F1 score is 0.5861 for four classes
Limitation: The only restricted number of groups produced
Future Work: Using Bi-LSTMs can improve the performance

[22]
Dataset: Tencent Weibo (2013)
Approach: Lexicon based
Outcome: Got 84.3% accuracy
Limitation: It works on Chinese blogs only
Future Work: Improve accuracy with machine learning

[23]
Dataset: Facebook multilingual texts
Approach: Machine learning
Outcome: Hindi obtained an F1-score of 0.4521, and English got 0.5520
Limitation: It contains non-English text, which reduces the system’s performance
Future Work: Using the multilingual dataset and hybrid approach can improve performance
References
1. Chakraborty, K., Bhattacharyya, S., Bag, R.: A Survey of sentiment analysis from social media
data. IEEE Trans. Comput. Soc. Syst. 7(2), 450–464
2. Pokhun, L., Yasser Chuttur, M.: Emotions in texts. Bull. Soc. Inf. Theory Appl. 4(2), 59–69
(2020)
3. Leskovec, J.: Social media analytics: tracking, modeling, and predicting the flow of information
through networks. In: Proceedings of 20th International Conference Companion World Wide
Web, pp. 277–278 (2011)
4. Acheampong, F., Wenyu, C., Nunoo-Mensah, H.: Text-Based Emotion Detection: Advances,
Challenges, and Opportunities (2020)
5. Canali, C., Colajanni, M., Lancellotti, R.: Data acquisition in social networks: issues and
proposals. In: Proceedings of International Workshop Services Open Sources (SOS), pp. 1–12.
ISSN 0167-739X (2011)
6. Flake, G. W., Lawrence, S., Giles, C.L.: Efficient identification of Web communities, KDD
150–160 (2000)
7. Ray, P., Chakrabarti, A.: A mixed approach of deep learning method and rule-based method to
improve aspect level sentiment analysis. Appl. Comput. Inf. ahead-of-print No. ahead-of-print
(2020)
8. Jain, A., Pal Nandi, B., Gupta, C., et al. Senti-NSetPSO: large-sized document-level sentiment
analysis using Neutrosophic Set and particle swarm optimization. Soft Comput. 24, 3–15
9. Gunes, H., Schuller, B., Pantic, M., Cowie, R.: Emotion representation, analysis and synthesis
in continuous space: a survey. In: Paper Presented at: Proceedings of the Face and Gesture,
pp. 827–834. IEEE (2011)
10. Brusco, M., Doreian, P., Steinley, D.: Deterministic block modelling of signed and two mode
networks: a tutorial with software and psychological examples. Br. J. Math. Stat. Psychol.
(2019)
11. Pradeepth, N.: Deep Learning Based Sentiment Analysis for Recommender System, Annals.
Comput. Sci. Ser., 16th Tome 2nd Fasc-2018, 155–160 (2018)
12. Ahmad, Z., Jindal, R., Ekbal, A., Bhattacharyya, P.: Borrow from rich cousin: transfer learning
for emotion detection using cross lingual embedding. Expert Syst. Appl. 139, 112851 (2020)
13. Dahiya, S., Mohta, A., Jain, A.: Text Classification based Behavioural Analysis of WhatsApp
Chats, pp. 717–724 (2020). https://doi.org/10.1109/ICCES48766.2020.9137911
14. Suhasini, M., Srinivasu, B.: Emotion detection framework for twitter data using supervised
classifiers. New York, NY: Springer 2020, 565–576 (2020)
15. Seal, D., Roy, U.K., Basak, R.: Sentence-level emotion detection from text based on semantic
rules. In: Paper Presented at: Proceedings of the Information and Communication Technology
for Sustainable Development, pp. 423–430. Springer (2020)
16. Joshi, A.: Sentiment Analysis and Opinion Mining from Noisy Social Media Content.
International Institute of Information Technology, Hyderabad (2020)
17. Nasir, A.F.A., Nee, E.S., Choong, C.S., Ghani, A.S.A., Abdul Majeed, A.P.P. Adam, A., Furqan,
M.: Text-based emotion prediction system using machine learning approach. In: The 6th Inter-
national Conference on Software Engineering & Computer Systems; IOP Conference Series:
Materials Science and Engineering 769, 012022 (2020)
18. Goud, G., Garg, B.: Sentiment analysis using long short-term memory model in deep learning.
In: 2nd EAI International Conference on Big Data Innovation for Sustainable Cognitive
Computing, pp. 25–33 (2019)
19. Singh, L., Singh, S., Aggarwal, N.: Two-stage text feature selection method for human emotion
recognition. In: Paper Presented at: Proceedings of the 2nd International Conference on
Communication, Computing and Networking, pp. 531–538; Springer (2019)
20. Huang Y-H, Lee S-R, Ma M-Y, Chen Y-H, Yu Y-W, Chen Y-S. EmotionX-IDEA: emotion
BERT–an affectional model for conversation; arXiv preprint arXiv:1908.06264 (2019)
21. Chatterjee, A., Narahari, K.N., Joshi, M., Agrawal, P.: SemEval-2019 task 3: EmoCon-
text contextual emotion detection in text. In: Paper Presented at: Proceedings of the 13th
International Workshop on Semantic Evaluation, pp. 39–48 (2019)
22. Ma, J., Xu, W., Sun, Y.H., Turban, E., Wang, S., Liu, O.: An ontology-based text-mining method
to cluster proposals for research project selection. IEEE Trans. Syst. Man Cybern. Part A Syst.
Hum. 42, 784–790 (2012)
23. Malte, A., Ratadiya, P.: Multilingual cyber abuse detection using advanced transformer archi-
tecture. In: Paper Presented at: Proceedings of the TENCON 2019–2019 IEEE Region 10
Conference, pp. 784–789. IEEE (2019)
24. Kamthekar, S., Deshpande, P., Iyer, B.: Cognitive analytics for rapid stress relief in humans
using EEG based analysis of Tratak Sadhana (Meditation): a Bigdata approach. Int. J. Inf. Retr.
Res. (IJIRR) 10(4), 1–20 (2020)
Deep Age Estimation Using Sclera
Images in Multiple Environment
Abstract Human age estimation from images using machine learning techniques
is a challenging task. Due to the physical aging process, the color and texture of the
sclera, a protective outer layer of the human eye, change. In this work, we present an
exploratory study of the effectiveness of using the sclera region of eye images for
age estimation. It employs a modified form of the deep neural network model VGG-16.
The model is trained and tested on the SBVPI dataset, whose images are acquired
with high-end cameras. The model is also tested using images acquired by a mobile
camera fitted with a macro lens. The work gives a best mean-absolute-error of 0.06,
and the encouraging results lead us to conclude that sclera images can be used as an
effective modality for human age estimation. It is a pioneering work in the sense that
the idea of using the sclera for this purpose has not been explored before.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 93
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_9
94 S. Das et al.
Fig. 1 Examples of multi-gaze (front, up, left, and right) eye images. Images a–d are taken from
the SBVPI dataset. Images e–h are captured using a mobile handset camera. The age of subjects is
annotated below each image
2 Literature Review
A brief review of current research on age estimation is given now.
In 2012, Guo proposed age estimation and sex classification using color images
acquired with cameras installed in public places [8]. The focus is on face-based
features, and numerous feature extraction techniques are described. In 2015, Jana
et al. proposed a method using skin wrinkle features extracted from face images
and experimented with images of Asian subjects [10]. In 2017, Lin et al. studied
age estimation for the same subject at multiple ages and proposed a dimension
reduction scheme for face images employing neural networks [12]. In the same year,
Hu et al. used the Kullback–Leibler divergence to estimate the age difference
between subjects and also contributed a large face image dataset [9]. In 2020,
Agbo-Ajala and Viriri showed that age prediction and gender classification using face
images work better with a deep convolutional neural network [2]. In the same year,
the DeepUAge model was proposed by Anda et al. to assist in combating child sexual
exploitation material (CSEM), where the aim is to classify child age groups for
restricting access to specific contents [3]. Apart from face images, other methods
have been developed in the curvelet domain for extracting features from fingerprints
for age estimation [16]. Eye tracking has proved effective in the case of toddlers [6].
In recent years, sclera images obtained in visible light have been used in biometric
recognition systems, giving rise to two essential categories of work: sclera
segmentation and recognition. Worldwide competitions are organized to explore the
effectiveness of sclera segmentation [7, 19]. The next step after sclera segmentation
is sclera recognition for use in biometric recognition systems [14]. Since sclera
segmentation is a separate research topic, we used already segmented sclera images
for our work. Novel datasets named MASD, MSD, and SBVPI have been proposed for
these works. The SBVPI dataset, which we used for this work, provides age meta-data
for its subjects. To our knowledge, this paper is the first to present age prediction
using the advantageous characteristics of sclera images.
3 Proposed Method
The basic assumption of our work is ‘human sclera color changes with age’. Medical
evidence for this is reported in [4, 5]. Sclera stiffness also changes with age
because of variation of the underlying choroidal thickness [20]. The color change is
also discussed in detail from an image processing point of view in [15]. Recently,
sclera images have been explored for suitability in biometric recognition systems,
which gives rise to two essential research fields, namely, sclera segmentation and
recognition [14, 19]. Because sclera segmentation is a separate problem, the
ground-truth images provided in the SBVPI dataset are used for this work. For images
acquired by a mobile camera, we perform the sclera segmentation ourselves [7]. The RGB
sclera region obtained from the eye image is illustrated in Fig. 2. The figure also
shows the change of colors in the sclera region due to age, as evidenced by images
acquired by mobile cameras from four subjects.
At first, square-shaped patches of size 300 × 300 are cut from the sclera area. The
patches are then fed into a deep neural network to estimate the age of the subject.
The network produces a single floating-point number as output, which indicates the
age. The model of the deep network is similar to VGG-16 with variations [18]. The
network has four convolution layers with three max pooling layers in between, to
ensure that colors and patterns all over the patch have sufficient variations in
features and interconnections. The output of the final convolution layer is flattened
and fed to a network of three dense layers. The final dense layer has only one node
with a ‘sigmoid’ activation function to ensure that the output is a single positive
floating-point value. The ‘Adam’ optimizer is used with a learning rate of
Fig. 2 Ground-truth image superimposed on original RGB image gives sclera-segmented RGB
image. A sclera patch is then sliced from it. On the right-hand side, four patches obtained from
sclera images (acquired by a mobile camera) belonging to four subjects of different ages are shown
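The single sigmoid output node described in this section produces a value strictly between 0 and 1, so an age in years has to be mapped into that range for training and mapped back for reporting. The paper does not state how this is done; the sketch below assumes a simple division by a constant `MAX_AGE`, which is a hypothetical choice made only for illustration.

```python
import math

MAX_AGE = 100.0  # assumed normalization constant, not stated in the paper


def sigmoid(x):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def age_to_target(age_years):
    """Scale an age in years to a training target in [0, 1]."""
    return age_years / MAX_AGE


def output_to_age(pre_activation):
    """Turn the last node's pre-activation value into an age in years."""
    return sigmoid(pre_activation) * MAX_AGE
```

Under this assumption, an error measured on the normalized output corresponds to that error multiplied by `MAX_AGE` when expressed in years.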
eye images within 2 to 10 cm from the lens with a clear focus. We use Samsung Galaxy
J4 and Yunicorn 5530 mobiles for image acquisition. The images are captured under
varying lighting conditions and at varying distances from the lens; some are in good
focus and clearly show the sclera vessels, while others are slightly blurred or
distorted due to motion. Every individual is asked to look to the left, right, upward,
and toward the camera lens for capturing images of multiple gazes. The image-capturing
device was slightly rotated or tilted to get a variety of images. So essentially, the
image dataset made by us has variations in gaze direction, position, capturing
distance, illumination, and blur due to motion and focus change. It also contains
sclera-segmented ground-truth images prepared by us.
We used 70% of the images from both datasets for training; the remaining images are
used for testing. To reduce over-fitting during training, we select two patches of
300 × 300 randomly from each image instead of a single patch, which doubles the
training and test data size. The model is trained with a batch size of 16. Training
converges within approximately 300 epochs. The average execution time per batch is
approximately 50 ms.
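The data preparation described above (a 70/30 split and two random 300 × 300 patches per image) can be sketched as follows. This is a minimal illustration using plain Python lists as images; the function names, the uniform choice of patch position, and the fixed shuffling seed are assumptions, and the sketch ignores whether a patch lies fully inside the segmented sclera region.

```python
import random

PATCH = 300  # patch side length in pixels, as stated in the text


def random_patches(image, count=2, size=PATCH, rng=random):
    """Cut `count` random size x size patches from an image given as rows of pixels."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the patch size")
    patches = []
    for _ in range(count):
        top = rng.randint(0, height - size)
        left = rng.randint(0, width - size)
        patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle the items and split them into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Taking two patches per image doubles the number of samples without collecting new images, which is the effect the text relies on to reduce over-fitting.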
Mean-absolute-error (MAE) is calculated as the mean of the absolute differences
between the predicted age and the given age. Using all images of the SBVPI dataset
and the mobile handset images, we obtain an MAE of approx. ±12 and ±9 years,
respectively. The error distribution over all images is shown in Fig. 4 for the SBVPI
dataset and in Fig. 5 for the mobile handset images. The graphs show that most images
are predicted with low error, which supports the model’s usability for prediction. A
few images have high error, which increases the overall MAE. To further analyze the
variation of subject ages used in training, Fig. 6 shows the number of images versus
subject age. The figure shows far fewer subject images for children and the elderly
than for middle-aged subjects. This has led to a higher prediction error for children
and the elderly, as the graph in Fig. 7 depicts. The higher number of training images
for middle-aged subjects gives good results for those subjects. Table 1 gives the
results separately for each dataset, along with imaging constraints and sex of
subjects. As there are significantly fewer subjects above age 50 and below age 16, we
experimented by removing the eye images of these 9 subjects for further evaluation.
This reduces the overall MAE to ±8 and ±6 years for SBVPI and mobile handset
images, respectively, represented by MAE-R in the table. We conclude that more
subjects in each age category are essential for unbiased training across all age
groups. We observed no significant impact from the use of multiple mobile handsets
for image acquisition or from the sex of the subjects. Further, results on our mobile
handset images are better than those on the standard dataset prepared in a
constrained environment. Our work suggests that the sclera is an essential feature
for predicting age, alongside the face images widely used for this purpose.
Fig. 4 Graph depicting the distribution of error (MAE) versus its frequency for SBVPI dataset
Fig. 5 Graph depicting the distribution of error (MAE) versus its frequency for mobile handset
images
Fig. 6 Graph depicting the distribution of subject ages used in training the model
Fig. 7 Graph depicting the error for subjects with variation in age
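The MAE and MAE-R measures used in this section can be written down directly. The sketch below is a minimal illustration; the function names are assumptions, and the inclusive 16 to 50 age bounds are one reading of the text, which removes subjects below 16 and above 50.

```python
def mean_absolute_error(predicted, actual):
    """Mean of the absolute differences between predicted and given ages."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mae_restricted(predicted, actual, min_age=16, max_age=50):
    """MAE-R: the same measure over subjects aged min_age to max_age only."""
    kept = [(p, a) for p, a in zip(predicted, actual) if min_age <= a <= max_age]
    return mean_absolute_error([p for p, _ in kept], [a for _, a in kept])


# Hypothetical ages, for illustration only (not data from the paper).
actual_ages = [10, 30, 40, 70]
predicted_ages = [22, 33, 38, 55]
overall = mean_absolute_error(predicted_ages, actual_ages)
restricted = mae_restricted(predicted_ages, actual_ages)
```

In this made-up example the largest errors come from the youngest and oldest subjects, so restricting the age range lowers the error, which mirrors the effect reported for MAE-R.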
5 Conclusion
Acknowledgements We express our gratitude to Dr. Matej Vitek of the University of Ljubljana
and his team members for allowing us to use the SBVPI dataset.
References
1. Abbasi, A., Khan, M.: Iris-pupil thickness based method for determining age group of a person.
Int. Arab J. Inf. Technol. 13(6) (2016)
2. Agbo-Ajala, O., Viriri, S.: Deeply learned classifiers for age and gender predictions of unfiltered
faces. Sci. World J. (2020). https://doi.org/10.1155/2020/1289408
3. Anda, F., Le-Khac, N.A., Scanlon, M.: DeepUAge: improving underage age estimation accu-
racy to aid CSEM investigation. Forensic Sci. Int. Digit. Investig. 32, (2020). https://doi.org/
10.1016/j.fsidi.2020.300921
4. Beattie, J.R., Pawlak, A.M., McGarvey, J.J., Stitt, A.W.: Sclera as a surrogate marker for deter-
mining AGE-modifications in Bruch’s membrane using a Raman spectroscopy-based index of
aging. Investig. Ophthalmol. Vis. Sci. 52(3), 1593–1598 (2011). https://doi.org/10.1167/iovs.
10-6554
5. Coudrillier, B., Tian, J., Alexander, S., Myers, K.M., Quigley, H.A., Nguyen, T.D.: Biomechan-
ics of the human posterior sclera: age and glaucoma-related changes measured using inflation
testing. Investig. Ophthalmol. Vis. Sci. 53(4), 1714–1728 (2012)
6. Dalrymple, K.A., Jiang, M., Zhao, Q., Elison, J.T.: Machine learning accurately classifies age
of toddlers based on eye tracking. Sci. Rep. 9, 6255 (2019). https://doi.org/10.1038/s41598-
019-42764-z
7. Das, S., Ghosh, I.D., Chattopadhyay, A.: An efficient deep learning strategy: its application
in sclera segmentation. In: 2020 IEEE Applied Signal Processing Conference (ASPCON), pp.
232–236. Kolkata (2020)
8. Guo, G.: Human age estimation and sex classification. In: Video Analytics for Business Intel-
ligence, vol. 409, pp. 101–131. Springer, Berlin, Heidelberg (2012)
9. Hu, Z., Wen, Y., Wang, J., Wang, M., Hong, R., Yan, S.: Facial age estimation with age
difference. IEEE Trans. Image Process. 26(7), 3087–3097 (2017). https://doi.org/10.1109/TIP.
2016.2633868
10. Jana, R., Datta, D., Saha, R.: Age estimation from face image using wrinkle features. Procedia
Comput. Sci. 46, 1754–1761 (2015). https://doi.org/10.1016/j.procs.2015.02.126
11. Levi, G., Hassner, T.: Age and gender classification using convolutional neural networks. In:
IEEE Conference on Computer Vision and Pattern recognition (CVPR) Workshop on AMFG.
Boston (2015)
12. Lin, C.T., Li, D.L., Lai, J.H., Han, M.F., Chang, J.Y.: Automatic age estimation system for face
images. Int. J. Adv. Robot. Syst. 9(5), 626–635 (2017). https://doi.org/10.5772/52862
13. Rot, P., Emeršič, Ž., Štruc, V., Peer, P.: Deep multi-class eye segmentation for ocular biometrics.
In: 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), pp.
1–8 (2018). https://doi.org/10.1109/IWOBI.2018.8464133
14. Rot, P., Vitek, M., Grm, K., Emeršič, Ž., Peer, P., Štruc, V.: Deep sclera segmentation and recog-
nition. In: A. Uhl, C. Busch, S. Marcel, R. Veldhuis (eds.) Handbook of Vascular Biometrics,
pp. 395–432. Springer (2020). https://doi.org/10.1007/978-3-030-27731-4_13
15. Russell, R., Sweda, J.R., Porcheron, A., Mauger, E.: Sclera color changes with age and is a cue
for perceiving age, health, and beauty. Psychol. Aging 29, 626–635 (2014). https://doi.org/10.
1037/a0036142
16. Saxena, A.K., Chaurasiya, V.K.: Multi-resolution texture analysis for fingerprint based age-
group estimation. Multimed. Tools Appl. 76(5), 3087–3097 (2017). https://doi.org/10.1007/
s11042-017-4516-1
17. Saxena, A.K., Sharma, S., Chaurasiya, V.K.: Neural network based human age-group estimation
in curvelet domain. In: Eleventh International Multi-Conference on Information Processing-
2015 (IMCIP-2015), pp. 781–789 (2015)
18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recog-
nition (2014). arXiv:1409.1556
19. Vitek, M., Das, A., Pourcenoux, Y., Missler, A., Paumier, C., Das, S., Ghosh, I.D., et al.:
SSBC 2020: Sclera segmentation benchmarking competition in the mobile environment. In:
International Joint Conference on Biometrics (IJCB 2020) (2020)
20. Zhou, H., Dai, Y., Shi, Y., Russell, J.F., Lyu, C., Noorikolouri, J., Feuer, W.J., Chu, Z., Zhang,
Q., de Sisternes, L., Durbin, M.K., Gregori, G., Rosenfeld, P.J., Wang, R.K.: Age-related
changes in choroidal thickness and the volume of vessels and stroma using swept-source oct
and fully automated algorithms. Ophthalmol. Retin. 4(2), 204–215 (2020). https://doi.org/10.
1016/j.oret.2019.09.012
Data Handling Approach for Machine
Learning in Wireless Communication:
A Survey
1 Introduction
Recent technological development has increased the demand for wireless communication,
and a population of 5.7 billion, i.e., 71%, will shift to the wireless domain. One of
the reasons for this shift is the increase in mobile and broadband data rates, to
43.9 Mbps and 110.4 Mbps, respectively. Every year, a number of applications come into
the market, such as Machine to Machine (M2M) communication and smartphones, and
recently the Internet of Things (IoT) has been reshaping the whole market scenario
with increased wireless capabilities [1]. Considering this situation, the
next-generation network has to meet more advanced requirements in terms of computational
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 103
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_10
104 N. S. Kulkarni et al.
wireless context such as Multi-Input Multi-Output (MIMO), Smart Grid (SG), Cognitive
Radio (CR), HetNets, Small Cells, and Device to Device (D2D) networks is
outlined.
ML has previously been used in various WC areas. An extensive survey covering
different network technologies is carried out in [7]. That work gives insights into
the ML used in various wireless domains. In [8], the gap between DL and Wireless
Communication Network (WCN) is reduced by mapping various platforms and ML techniques
to simplify the effective deployment of DL onto WC. With a high-level introduction to
supervised and unsupervised learning, applications to communication networks at
different layers of the protocol stack, with emphasis on the physical layer, are
presented in [9].
ML algorithms can learn and adapt to a changing environment in wireless
networks. From [4–9], we can identify the capacity of ML to recognize complex
patterns, and with DL, intelligent management of a complex radio environment and
large-scale topology can be designed.
In WC, massive data related to various aspects of the network is generated, and
using such Wireless Big Data (WBD) and AI-driven intelligence, the network can
be managed intelligently. In this approach, DL, with its brain-like feature
extraction capacity, plays an essential role in analyzing the complex relationships
and capturing the network’s real-time dynamics.
ML techniques can realize human-like prediction or decision-making processes. The
most substantial advantage of adopting ML techniques is that they can learn
continuously from the data over time. Even during the system’s operation, a model can
be continually updated from newly observed and produced data.
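The idea of updating a model from each newly observed sample, without retraining from scratch, can be illustrated with a small online learner. The survey does not name a specific algorithm; the least-mean-squares (LMS) rule is used here only as a familiar example, and the synthetic data stream, the step size, and the function name are assumptions.

```python
def lms_update(weights, features, target, step=0.1):
    """One online update: move the weights along the current prediction error."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = target - prediction
    new_weights = [w + step * error * x for w, x in zip(weights, features)]
    return new_weights, error


# Synthetic stream (hypothetical): the target follows y = 0.5 + 2.0 * u.
weights = [0.0, 0.0]
for t in range(5000):
    u = (t % 10) / 10.0
    x = [1.0, u]                 # constant bias input plus one measured feature
    y = 0.5 + 2.0 * u
    weights, _ = lms_update(weights, x, y)  # the model improves sample by sample
```

After the stream has been processed, the weights are close to the generating values 0.5 and 2.0, even though the learner only ever saw one sample at a time.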
This quality makes AI a vital driving force in the next-generation network,
where the network becomes autonomous. To fetch the critical information from
WCN, AI capabilities are subdivided into four types of analytics: descriptive,
diagnostic, predictive, and prescriptive analytics [10]. In descriptive analytics,
important information about the network is collected, and using diagnostic tools,
network performance analysis is carried out. Recently, predicting the important
network parameters with the help of ML and DL has seized network engineers’
attention. However, based on these three analytics, predicting future network
impairments and finding the most probable solution is the challenge for prescriptive
analytics in the next-generation network. A few of the tasks under various analytics
are identified in Table 1.
Presently, three types of analytics are used for network management, and decisions
are based on human interaction. In a future network, such manual interventions
can lead to operational delay and performance degradation. Hence, based on
this analytics information, an application-oriented prescriptive decision-making
model is proposed in Sect. 3. In Sect. 2, we study various wireless communication
scenarios where AI has had a significant role.
To develop the decision model based on the wireless communication channel variability,
various signal detection and classification methods are the first step in drawing
meaningful information from the wireless network. After fetching the channel
information, user-centric data such as mobility and context awareness play a useful
role in tracking user movement in a particular area. Once sufficient relevant
information related to both channel and user is acquired, network performance
optimization can be done. Following this flow, the recent literature related to signal
detection, classification, mobility, context awareness, traffic prediction, and
network performance optimization is surveyed to identify key design parameters for
next-generation data handling mechanisms.
Table 2
Columns: Paper No.; Parameter targeted; Name of ML/DL method; Parameter compared; O/P compared with; Results

[10]
Parameter targeted: Signal detection and recovery of the transmitted data
ML/DL method: Deep learning
Parameter compared: Bit error rate (BER) with and without cyclic prefix (CP)
O/P compared with: LS, MMSE
Results: With limited data, channel characteristic can be learned with DL

[11]
Parameter targeted: Signal detection
ML/DL method: Deep learning
Parameter compared: Robustness (MSE and BER) and complexity (floating-point multiplication (FLOPs), memory usage, computational intensity, and time consumption)
O/P compared with: LMMSE-MMSE, FC-DNN, ComNet-BiLSTM
Results: DL-based intelligent extraction and with efficient performance can be seen in signal detection

[12]
Parameter targeted: Signal detection
ML/DL method: DL and CNN
Parameter compared: Precision, recall, F1 score, accuracy
O/P compared with: SVM, PSVMHSVM, DNN, SAE
Results: Excellent ability to capture distinct information with high accuracy can be seen

[13]
Parameter targeted: Signal classification and interference detection
ML/DL method: DNN
Parameter compared: Classification accuracy, precision, recall, and F1 score
O/P compared with: Results are compared for the IQ-A-Ø and FFT vector with varying SNR
Results:
• For accurate decision, a balanced trade-off between efficiency and complexity must be considered
• Time-varying multipath and channel impairments must be dealt with appropriately

[14]
Parameter targeted: Discovery of decoding algorithm
ML/DL method: RNN
Parameter compared: Robustness, adaptivity
O/P compared with: Neural decoder, Turbo decoder
Results: New codes can be learned on the AWGN channel

[15]
Parameter targeted: Demodulation
ML/DL method: CNN, DBN, and Adaboost
Parameter compared: Accuracy
O/P compared with: Changing the distance and modulation type
Results: Accuracy is inversely proportional to the transmission distance, and high order of modulation is preferred for more accuracy

[16]
Parameter targeted: Signal demodulation
ML/DL method: DL
Parameter compared: Changing the SNR and different modulation mode, accuracy and training period is compared
O/P compared with: DBN, SVM, MLD
Results:
• Different modulation models need different training periods
• Higher modulation order needs longer training signal periods
• With an increase in modulation order, demodulation accuracy decreases

[17]
Parameter targeted: Radio monitoring
ML/DL method: Deep CNN
Parameter compared: Throughput, latency
O/P compared with: With different time scenarios
Results: Context awareness is one of the criteria for resource optimization

[18]
Parameter targeted: Time-varying underwater acoustic with severe Doppler effect
ML/DL method: DL
Parameter compared: BER
O/P compared with: BER changing with SNR
Results: With less training, better results can be achieved

[19]
Parameter targeted: Modulation classification
ML/DL method: Ensemble voting classifier
Parameter compared: SNR versus accuracy
O/P compared with: KNN, SVC, AdaBoost, DT, BAG, RFC, GB, LR, XGB
Results: More accurate results can be observed by using DL
Table 2 (continued)
110
Paper No. Parameter targeted Name ML/DL method Parameter compared O/P compared with Results
[20] Wireless technology FNN, CNN, decision tree, Accuracy, generalizability, FNN manual feature, As automatic feature
classification RForest robustness, complexity CNN-RSSI, image, IQ extraction outperforms
based, random forest manual feature extraction in
all except complexity, the
proper trade-off between
manual and automated
feature extraction methods
needs to be investigated
In [31], various Wireless Sensor Network fault detection mechanisms are developed
with ML, and one such mechanism is enhanced. The results are tested on a
real medical dataset and are more accurate than those of the existing mechanism.
A method for learning data flow rates in a wireless network to improve its quality of
service is presented in [32]. An appropriate neighboring node for packet forwarding
is selected by learning the environment with the help of Reinforcement Learning
(RL). A hierarchical decision technique is used to improve the learning capacity
of nodes. The decision is applied at each layer, and the nodes that hold
more information about the environment are selected. This information is
shared with less informed nodes, which improves their learning capacity.
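The idea of learning a next hop and then passing knowledge from better-informed nodes to less-informed ones can be illustrated with a minimal sketch. It is not the algorithm of [32]: the tabular Q-learning update, the four-node topology, the reward value, and the function names `q_update`, `choose_next_hop`, and `share_knowledge` are all assumptions made for illustration.

```python
import random

def q_update(q, node, next_hop, reward, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step for forwarding from `node` via `next_hop`."""
    # Best value the next hop itself expects; 0.0 if it has no neighbours (sink).
    best_future = max(q.get(next_hop, {}).values(), default=0.0)
    old = q[node][next_hop]
    q[node][next_hop] = old + alpha * (reward + gamma * best_future - old)

def choose_next_hop(q, node, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among the neighbours of `node`."""
    neighbours = list(q[node])
    if rng.random() < epsilon:
        return rng.choice(neighbours)
    return max(neighbours, key=lambda n: q[node][n])

def share_knowledge(q, expert, novice):
    """The novice adopts the expert's estimates for the neighbours they share."""
    for neighbour, value in q[expert].items():
        if neighbour in q[novice]:
            q[novice][neighbour] = value

# Hypothetical topology: A and B can forward to C or D; D is the sink.
q = {"A": {"C": 0.0, "D": 0.0}, "B": {"C": 0.0, "D": 0.0},
     "C": {"D": 0.0}, "D": {}}
q_update(q, "A", "D", reward=1.0)           # A delivered a packet to the sink
share_knowledge(q, expert="A", novice="B")  # B learns from A without exploring
print(choose_next_hop(q, "B", epsilon=0.0))
```

In this toy setting, node B prefers the sink after a single knowledge transfer, which is the kind of speed-up that sharing between nodes is meant to provide.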
A deep learning framework consisting of a binary measurement matrix,
a non-uniform quantizer, and a non-iterative recovery solver is presented in [33]. These
parts are jointly optimized by training the network. The results on synthetic and
real datasets reveal a drastic reduction in the number of transmitted bits.
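Where the bit reduction comes from can be seen in a small encoder-side sketch. It assumes a random 0/1 measurement matrix and a fixed mu-law quantizer as stand-ins; in [33] these parts are learned jointly, and the recovery solver is omitted here. The sizes `n`, `m`, `raw_bits`, and `q_bits` are hypothetical.

```python
import math
import random

def binary_measurement_matrix(m, n, seed=0):
    """Random 0/1 matrix: each measurement sums a subset of the samples."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]

def measure(phi, x):
    """Compute y = phi * x with plain Python lists."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

def mu_law_quantize(y, y_max, bits, mu=255.0):
    """Non-uniform quantizer: mu-law compression, then uniform rounding."""
    levels = 2 ** bits - 1
    codes = []
    for v in y:
        scaled = min(abs(v) / y_max, 1.0)
        compressed = math.copysign(math.log1p(mu * scaled) / math.log1p(mu), v)
        codes.append(round((compressed + 1.0) / 2.0 * levels))
    return codes

n, m, raw_bits, q_bits = 64, 16, 16, 4      # hypothetical sizes
x = [math.sin(2 * math.pi * k / n) for k in range(n)]
phi = binary_measurement_matrix(m, n)
codes = mu_law_quantize(measure(phi, x), y_max=float(n), bits=q_bits)

# Bits sent without compression versus bits sent after measurement and quantization.
compression = (n * raw_bits) / (m * q_bits)
print(len(codes), compression)
```

With these assumed sizes, 64 samples at 16 bits each are replaced by 16 measurements at 4 bits each, a 16-fold reduction before any recovery error is considered.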
A jointly optimized extreme learning machine (JOELM) approach, which combines
intelligent optimization with target repair, is proposed for the short-term prediction of
fading channels [34]. Here, the firefly algorithm is used to intelligently optimize
the traditional extreme learning algorithm.
To optimize spectrum and energy usage, the symbiotic relationship between cellular and IoT
networks is exploited in a centralized and a decentralized manner [35]. With DL's help, channel
estimation at a global and a local level, based on different frames, is used to define the user association
policy at the BS. Based on a distributed DRL algorithm, users are managed at the
central and distributed levels using historical channel and interference information.
The data handling approach in [36] does not consider accuracy and computational efficiency
together. Efficient and fast processing of data
queue management in the data processing layer needs to be investigated from an
overhead point of view. In [37], a connection is established between the model-driven
and deep learning approaches by examining the model-driven and data-driven
approaches. In the wireless communication scenario, a purely data-driven
approach is insufficient; however, a theoretical mathematical model used as the primary
information decider can efficiently balance resource management accuracy and flexibility.
Several issues of the model-driven approach in the context of receiver design and
channel information accuracy are discussed in [38]. With a specialized and accurate selection
of models, a model-driven approach can
significantly reduce the computation time compared to Monte Carlo simulation.
Various opportunities are listed in Table 3.
By decomposing the data into in-tower and inter-tower components, various characteristics and
root causes of dynamic channels are studied in [43]. For the first time, DL is used for
predicting the individual tower traffic based on spatial dependency. Due to heterogeneous
devices, uneven bursty traffic reaches switches and may lead to congestion.
A Deep CNN-based intelligent Partially Overlapping Channel allocation strategy is
proposed in [44], which predicts future traffic and assigns the channels while reducing the
convergence time. With dynamic user movement, data access and processing is a
challenging task. Various issues such as privacy and security need to be considered
while mapping the user data and satisfaction level [45]. Multiple opportunities are
listed in Table 4.
In the above section, we have identified many essential points that need to be considered
to develop ML-based WCN. In [12, 15, 16, 19, 22], novel model-driven and data-driven
training approaches are discussed, and alternatively, the online–offline mode
is used for manual and automatic feature extraction. DL is used efficiently for channel
estimation, and narrow features are extracted from a wideband channel [11, 13], which
motivates us to use ML in WCN.
To enhance the performance, various new approaches are proposed in [34, 35]
for intelligent optimization. Traffic prediction is the first step in managing the
next-generation network; hence, different traffic prediction approaches are studied
[39–44].
Though the past developments have improved performance, limitations with the
existing approach constrain its high-performance utilization.
From the above section, we have observed a few issues with current deep machine
learning in wireless communication, which are as follows:
1. ML's complexity depends upon the size and quality of the data and on the performance
objective, and efficient learning and updating need to be ensured during data
exchange. Traditional complexity evaluation metrics are not sufficient, as they
cannot capture the dynamic requirements of future data handling networks.
2. The application of the proposed approach in interfacing distributed devices
using interfacing frameworks [46] has been limited to defined protocols, and
dynamic updating leads to operational instability in such conditions.
3. To deal with future network demand, several ML methods need to train the network
several times, which creates processing overhead and leads to considerable
latency in the network. Such delays reduce the network throughput, data
integrity, and network lifetime by increasing the burden on the allotted resources.
4 Conclusions
In this work, various performance optimization and traffic prediction techniques are
reviewed. Based on the channel and user information, various research opportunities
are identified, and a novel data handling approach is proposed. The proposed method
suggests a new algorithm with lower overhead, lower computational complexity,
and more extensive network integrity. This research solution finds scope to improve
the network performance under dynamic time-variant channel conditions in wireless
communication. This solution can minimize the constraint of dynamic variations,
making communication more robust to variations in data exchange over a wireless
medium. This provides an enormous scope in offering higher service compatibility
and resource utilization for next-generation wireless communication.
Deep machine learning algorithms work very well for network management,
network optimization, signal management, channel assignment, network security,
route selection, etc. Deep reinforcement learning and deep Q-routing are the main
learning techniques that are most useful for network operations. However, it
is difficult to obtain training data that covers various scenarios. Due to the dynamic
behavior of wireless networks, it is challenging to create datasets for training,
and due to the dynamicity and unpredictability of wireless channels, it is hard to
find any regular pattern in previously experienced data. Learning and updating
observations in a run-time environment is a highly complex and resource-consuming
process. In addition, the volume and the integration of the network
add processing complexity and overhead to the network. These constraints limit the
usage and performance of ML in wireless communication. In this research
work, the focus is on developing a low-complexity, fast adaptive approach
for wireless communication to improve the network's overall performance.
References
1. Deshpande, P., Iyer, B.: Research directions in the internet of every things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, 2017, pp. 1353–1357. https://doi.org/10.1109/CCAA.2017.8230008
2. Khaled, B.L., Wei, C., Yuanming, S., Jun, Z., Ying-Jun, A.Z.: The roadmap to 6G: AI
empowered wireless networks. IEEE Commun. Mag. 84–90 (2019)
3. Shi, Y., Zhang, J., Letaief, K.B., Bai, B., Chen, W.: Large-scale convex optimization for ultra-
dense cloud-RAN. IEEE Wirel. Commun. 22(3), 84–91 (2015)
4. Dai, H.-N.: Big data analytics for large-scale wireless networks: challenges and opportunities.
ACM Comput. Surv. 52(5), 1–35. Article 99. Publication date: September 2019
5. Aguilar Igartua, M., Almenares Mendoza, F.: INRISCO: INcident monitoRing in Smart
Communities. IEEE Access 8, 72435–72460 (2020)
6. Wang, J., Jiang, C.: Machine learning paradigms in wireless network association. Encyclopedia
of Wireless Networks, pp. 1–9 (2018)
7. Boutaba, R., Salahuddin, M.A., Limam, N., Ayoubi, S., Shahriar, N., Estrada-Solano, F.,
Caicedo, O.M.: A comprehensive survey on machine learning for networking: evolution,
applications and research opportunities. J. Internet Serv. Appl. 1–99 (2018)
8. Zhang, C., Patras, P.: Deep learning in mobile and wireless networking: a survey. IEEE
Commun. Surv. Tutor. 1–67 (2018)
9. Kadam, K., Srivastava, N.: Application of machine learning (Reinforcement Learning) for
routing in wireless sensor networks (WSNs). In: Proceedings of the 2012 1st International
Symposium on Physics and Technology of Sensors, pp. 349–352 (2012)
10. Kibria, M.G., Nguyen, K., Villardi, G.P., Zhao, O., Ishizu, K., Kojima, F.: Big data analytics,
machine learning, and artificial intelligence in next-generation wireless networks. IEEE Access
6, 32328–32338 (2018)
11. Ye, H., Li, G.Y., Juang, B.-H.: Power of deep learning for channel estimation and signal
detection in OFDM systems. IEEE Wirel. Commun. Lett. 7(1), 114–117 (2018)
12. Gao, X., Jin, S., Wen, C.-K., Li, G.Y.: ComNet: combination of deep learning and expert
knowledge in OFDM receivers. IEEE Commun. Lett. 22(12), 2627–2630 (2018)
13. Yuan, Y., Sun, Z., Wei, Z., Jia, K.: DeepMorse: a deep convolutional learning method for blind
Morse signal detection in wideband wireless spectrum. IEEE Access 7, 80577–80587 (2019)
14. Kulin, M., Kazaz, T., Moerman, I., De Poorter, E.: End-to-end learning from spectrum data: a
deep learning approach for wireless signal identification in spectrum monitoring applications.
IEEE Access 6, 18484–18501 (2018)
15. Kim, H., Jiangy, Y., Rana, R., Kannany, S., Oh, S., Viswanath, P.: Communication algorithms
via deep learning. In: Proceedings of ICLR 2018, pp. 1–17 (2018)
16. Ma, S., Dai, J., Lu, S., Li, H., Zhang, H., Du, C., Shiyin, L.: Signal demodulation with machine
learning methods for physical layer visible light communications: prototype platform, open
dataset, and algorithms. IEEE Access 7, 30588–30598 (2019)
17. El Khayat, I., Geurts, P., Leduc, G.: Improving TCP in wireless networks with an adap-
tive machine-learnt classifier of packet loss causes. International Federation for Information
Processing, pp. 549–560 (2005)
18. Liu, W., Santos, J.F., Jiao, X., Paisana, F., DaSilva, L.A., Moerman, I.: Using deep learning and
radio virtualization for efficient spectrum sharing among coexisting networks. In: 13th EAI
International Conference, CROWNCOM, pp. 1–10 (2018)
19. Zhang, Y., Li, J., Zakharov, Y.V., Li, J., Li, Y., Lin, C., Li, X.: Deep learning based single carrier
communications over time-varying underwater acoustic channel. IEEE Access 7, 38420–38430
(2019)
20. Mahabub, A., Sultan bin Habib, A.-Z.: A voting approach of modulation classification for
wireless network. In: Proceedings of the 6th International Conference on Networking, Systems
and Security, pp. 133–138 (2019)
21. Wang, H., Wu, Z., Ma, S., Lu, S., Zhang, H., Ding, G., Li, S.: Deep learning for signal demodula-
tion in physical layer wireless communications: prototype platform, open dataset, and analytics.
IEEE Access 7, 30792–30801 (2019)
22. Fontainea, J., Fonseca, E., Shahida, A., Kist, M., DaSilva, L.A., Moermana, I., De Poortera,
E.: Towards low-complexity wireless technology classification across multiple environments.
Ad Hoc Netw. 91, 101881, 1–12 (2019)
23. Zhang, R., Cui, Y., Claussen, H., Haas, H., Hanzo, L.: Anticipatory association for indoor visible
light communications: light, follow me! IEEE Trans. Wirel. Commun. 17(4), 2499–2510 (2018)
24. Qin, Z., Ye, H., Li, G.Y., Fred Juang, B.-H.: Deep learning in physical layer communications.
IEEE Wirel. Commun. 26(2), 93–99 (2019)
25. Sun, Y., Peng, M., Zhou, Y., Huang, Y., Mao, S: Application of machine learning in wireless
networks: key techniques and open issues. IEEE Commun. Surv. Tutor. 1–37 (2019)
26. Alkhateeb, A., Alex, S., Varkey, P., Li, Y., Qu, Q., Tujkovic, D.: Deep learning coordinated
beamforming for highly-mobile millimeter wave systems. IEEE Access 6, 37328–37348 (2018)
27. Chen, B., Yang, C.: Caching policy for cache-enabled D2D communications by learning user
preference. IEEE Trans. Commun. 66(12), 6586–6601 (2018)
28. Nishio, T., Yonetani, R.: Client selection for federated learning with heterogeneous resources
in mobile edge. In: IEEE International Conference on Communications (ICC), pp. 1–7 (2019)
29. Han, P., Zhou, Z., Wang, Z.: User association for load balance in heterogeneous networks with
limited CSI feedback. IEEE Commun. Lett. 24(5), 1095–1099 (2020)
30. Liu, R., Lee, M., Yu, G., Li, G.Y.: User association for millimeter-wave networks: a machine
learning approach. IEEE Trans. Commun. 68(7), 4162–4174 (2020)
31. Pachauri, G., Sharma, S.: Anomaly detection in medical wireless sensor networks using
machine learning algorithms. Proc. Comput. Sci. 70, 325–333 (2015)
32. Sruthi, S.S., Varghese, A.: Enhance QoS by learning data flow rates in wireless networks using
hierarchical docition. ICECCS 708–714 (2015)
33. Sun, B., Feng, H., Chen, K., Zhu, X.: A deep learning framework of quantized compressed
sensing for wireless neural recording. IEEE Access 4, 5169–5178 (2016)
34. Sui, Y., Yu, W., Luo, Q.: Jointly optimized extreme learning machine for short-term prediction
of fading channel. IEEE Access 6, 49029–49039 (2018)
35. Zhang, Q., Liang, Y.-C., Vincent Poor, H.: Intelligent user association for symbiotic radio
networks using deep reinforcement learning. In: IEEE Global Communications Conference
(GLOBECOM), pp. 1–12 (2019)
36. Jan, B., Farman, H., Khan, M.: Designing a smart transportation system: an internet of things
and big data approach. IEEE Wirel. Commun. 73–79 (2019)
37. Zappone, A., Di Renzo, M., Debbah, M.: Wireless networks design in the era of deep learning:
model-based, AI-based, or both? (2019). arXiv:1902.02647
38. He, H., Jin, S., Wen, C.-K., Gao, F., Li, G.Y., Xu, Z.: Model-driven deep learning for physical
layer communications. IEEE Wirel. Commun. (2019)
39. Zhang, C., Zhang, H., Yuan, D., Zhang, M.: Citywide cellular traffic prediction based on densely
connected convolutional neural networks. IEEE Commun. Lett. 22(8), 1656–1659 (2018)
40. Liang, D., Zhang, J., Jiang, S., Zhang, X., Wu, J., Sun, Q.: Mobile traffic prediction based
on densely connected CNN for cellular networks in highway scenarios. In: 11th International
Conference on Wireless Communications and Signal Processing (WCSP), pp. 1–5 (2019)
41. Zhang, C., Zhang, H., Qiao, J., Yuan, D., Zhang, M.: Deep transfer learning for intelligent
cellular traffic prediction based on cross-domain big data. IEEE J. Sel. Areas Commun. 37(6),
1389–1401 (2019)
42. Paul, U., Liu, J., Troia, S., Falowo, O., Maier, G.: Traffic-profile and machine learning based
regional data center design and operation for 5G network. J. Commun. Netw. 21(6), 569–583
(2019)
43. Wang, X., Zhou, Z., Xiao, F., Xing, K., Yang, Z., Liu, Y., Peng, C.: Spatio-temporal analysis and
prediction of cellular traffic in metropolis. In: IEEE 25th International Conference on Network
Protocols (ICNP), pp. 1–14 (2018)
44. Tang, F., Mao, B., Md. Fadlullah, Z., Kato, N.: On a novel deep-learning-based intelligent
partially overlapping channel assignment in SDN-IoT. IEEE Commun. Mag. 80–86 (2018)
45. Alkurd, R., Abualhaol, I.: Big-data-driven and AI-based framework to enable personalization
in wireless networks. IEEE Commun. Mag. 18–24 (2020)
46. Simeone, O.: A very brief introduction to machine learning with applications to communication
systems. IEEE Trans. Cogn. Commun. Netw. 4(4), 648–664 (2018)
Breast Cancer Detection in
Mammograms Using Deep Learning
Abstract Breast cancer is the most lethal cancer among women. Early-stage diagnosis
may reduce the mortality associated with breast cancer. Diagnosis
can be made with screening mammography. The main challenge of screening mammography
is its high risk of false positives and false negatives. This paper presents
the detection of breast cancer in mammograms using the VGG16 deep
learning model. The VGG16 model is trained and tested on 322 images from
the MIAS dataset. It performs better than the AlexNet, EfficientNet, and
GoogleNet models. Classification of mammograms will make their screening more
efficient and will serve as a support system for radiologists.
1 Introduction
Breast cancer has the second-highest death rate in women [11].
According to the global cancer statistics, the number of new cancer cases in 2018 was estimated
to be 18,078,957, with 9,555,027 deaths (52.85%) globally [3]. Breast cancer
cases amount to 2,088,849 (11.55% of all new cases), and the deaths are estimated to be 626,679
(6.56% of all cancer deaths). Sixty percent of the deaths occur in low-income developing countries
like Ethiopia, as noted in [5]. Early detection of the cancer increases the patient's expected
survival and decreases the mortality rate. Many presentations, such as
masses, areas of asymmetry and distortion, and micro-calcifications, may reveal breast
cancer. The most common and representative indication is masses, which may not
be detected due to overlapping breast tissues. Masses can be of two types, namely
undetected and misidentified. Undetected masses are false negative cases,
in which the delayed diagnosis can cost the patient's survival. Misidentified masses
cause unwanted anxiety and pain to patients, along with the additional burden of
re-screening and biopsy [7]. Many mammographic density ratings, ranging from
manual classification (e.g. BI-RADS) to automatic scores, have been suggested.
In the early years, radiologists classified the mammograms visually by a series of
intuitive yet poorly defined breast tissue patterns. Manual classification is a low-cost
solution but carries a considerable risk of misclassification. Also, mammogram
interpretation is challenging, and a tired radiologist or inexperienced
personnel may miss an abnormality. Therefore, an
efficient, inexpensive, robust, and accurate non-invasive system or tool for breast
cancer detection using mammograms is desirable. This paper presents breast cancer detection
using mammograms in Cranio-Caudal (CC) and Medio-Lateral Oblique (MLO)
views using a convolutional neural network, i.e. VGG16.
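The percentages quoted with the 2018 statistics in this section use different denominators, which a short check makes explicit. The sketch below is only an arithmetic illustration of the counts given in the text; the variable names and the helper `pct` are hypothetical.

```python
# Counts for 2018 as quoted in the text above.
all_cases, all_deaths = 18_078_957, 9_555_027
breast_cases, breast_deaths = 2_088_849, 626_679

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)

deaths_vs_cases = pct(all_deaths, all_cases)        # all cancers
breast_case_share = pct(breast_cases, all_cases)    # share of new cases
breast_death_share = pct(breast_deaths, all_deaths) # share of cancer deaths
print(deaths_vs_cases, breast_case_share, breast_death_share)
```

The breast cancer death figure is therefore a share of all cancer deaths, not a fatality rate among breast cancer cases.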
The paper is organized as follows: Sect. 2 discusses the earlier work in breast
cancer detection. Section 3 explains the presented VGG16 framework for the classi-
fication of mammograms. Section 4 provides the experimental findings followed by
the conclusions in Sect. 5.
2 Related Work
3 Methodology
This paper presents the detection of abnormal mammograms using
the VGG16 deep learning network. The network is trained with supervised learning.
The block schematic of the detection of abnormal mammograms is
shown in Fig. 1. The dataset used for the experiments is the MIAS dataset, which
consists of 322 images. The mammograms in this dataset are categorized into a normal
class and an abnormal class: 208 images are normal and 114 are abnormal
(63 benign and 51 malignant). The scans are standardized
to a size of 1024 × 1024 pixels [4]. Figure 2 shows sample images of cancerous
mammograms in CC and MLO views from the MIAS dataset.
3.1 Preprocessing
The images from the MIAS dataset carry a lot of background noise. The presence
of the pectoral muscles and the outer region makes it a challenging dataset for classification
and segmentation. In this work, the annotations are removed from all the images,
and the pectoral muscles are cropped out. This step reduces errors and
increases the accuracy of classifying the mammograms. After preprocessing,
we obtain the final cropped breast region for the classification task.
Fig. 1 The block schematic of classification approach for mammograms during breast cancer
screening
124 A. Pillai et al.
Fig. 2 Sample images of cancerous mammograms from the MIAS dataset: a and b MLO views, c
and d CC views
CNN models trained on smaller datasets, like the MIAS dataset, suffer from
over-fitting. To mitigate over-fitting, data augmentation is preferred
[16]. Data augmentation methods like rotation, scaling, horizontal flipping, resizing
of the images, shearing, etc. are used in this work. A total of 2600 images are generated
from the 322 images of the MIAS dataset using data augmentation. The percentages
of mammograms used for training, validation, and testing are 70%, 15%, and 15%,
respectively.
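The 70/15/15 partition of the 2600 augmented images can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the shuffling seed and the function name `split_indices` are assumptions, and the source does not state whether the split was made before or after augmentation.

```python
import random

def split_indices(n_items, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle item indices and cut them into train, validation, and test sets."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    n_train = round(fractions[0] * n_items)
    n_val = round(fractions[1] * n_items)
    # The test set takes whatever remains, so no image is dropped.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Class counts reported for MIAS: 208 normal, 63 benign, 51 malignant.
assert 208 + 63 + 51 == 322

train, val, test = split_indices(2600)
print(len(train), len(val), len(test))
```

When augmented copies are derived from the same original mammogram, keeping all copies of one original in the same subset avoids leakage between training and test data; the sketch above does not enforce this.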
3.3 VGG16
The VGG16 model [12] is selected for the work presented on breast cancer detection.
The framework of this CNN model is displayed in Fig. 3.
It consists of 16 weight layers: 13 convolutional layers arranged in five blocks for feature
extraction from mammograms, followed by three fully connected layers. Each convolutional
layer is followed by a ReLU activation, and each block ends with a max-pooling layer, supporting
the extraction of varied and in-depth information. The combination of these five
blocks (as shown in Fig. 3) results in better characterization of mammograms. This
leads to improved classification accuracy. The 1 × 1 convolution layers [14] support
We trained the VGG16 classification model on the dataset, along with AlexNet [8],
GoogleNet, and EfficientNet. Dropout and batch normalization layers were also used
with each model to reduce over-fitting. While implementing the models, we found
that AlexNet and EfficientNet were very slow and less efficient than the other models.
AlexNet, which consists of 8 layers, gave an accuracy of 69.64%, while
GoogleNet, which has 22 layers, provided an accuracy of 71.67%. EfficientNet resulted in
an accuracy of 72.29%. The best results
(the highest accuracy of 75.46%) were achieved using VGG16, which comprises
16 layers. VGG16 also shows the lowest losses. A comparative analysis of these four models is presented in
Table 1. The network is trained with a learning rate of 0.001 for 250 epochs.
Table 1 Comparison of various networks for classification of mammograms on the MIAS dataset
Name of the model | Number of layers | Accuracy (%) | Loss | Validation loss | Parameters (Million)
AlexNet | 8 | 69.64 | 1.84 | 1.94 | 49.0
EfficientNet | 17 | 72.29 | 0.49 | 1.53 | 5.3
GoogleNet | 22 | 71.67 | 0.31 | 0.63 | 22.2
VGG16 | 16 | 75.46 | 0.31 | 0.44 | 138.0
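The parameter count listed for VGG16 in Table 1 can be reproduced by counting weights and biases layer by layer. The sketch below assumes the standard configuration from [12] with a 224 × 224 RGB input and a 1000-class output layer; the input size and classifier head actually used in this work are not stated, so those values and the function name `count_vgg16_params` are assumptions.

```python
# Standard VGG16 layout: numbers are output channels of 3x3 convolutions,
# "M" marks a 2x2 max-pooling layer that halves the spatial size.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def count_vgg16_params(num_classes=1000, in_channels=3, input_size=224):
    """Count trainable weights and biases of the assumed VGG16 layout."""
    total, channels, size = 0, in_channels, input_size
    for item in VGG16_CFG:
        if item == "M":
            size //= 2                      # pooling has no parameters
        else:
            total += 3 * 3 * channels * item + item   # kernel weights + biases
            channels = item
    features = channels * size * size       # flattened input to the classifier
    for out in (4096, 4096, num_classes):   # three fully connected layers
        total += features * out + out
        features = out
    return total

n_conv_layers = sum(1 for item in VGG16_CFG if item != "M")
print(n_conv_layers, count_vgg16_params())
```

Under these assumptions the count is about 138 million, consistent with Table 1, and most of it sits in the first fully connected layer rather than in the 13 convolutional layers. Replacing the output layer with a two-class head, as a normal/abnormal classifier would need, changes the total only slightly.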
Even though the VGG16 has fewer layers than some other models, it achieved the
highest accuracy on the MIAS dataset. The other three models have limited accuracy
compared to VGG16. However, it is still a challenge to achieve good performance
with deep learning approaches to classify mammograms from the MIAS dataset.
5 Conclusion
Breast cancer detection in mammograms using VGG16 is presented in this work. The
system identifies the given image as a normal or abnormal mammogram. The pre-
sented methodology includes image preprocessing, data augmentation, and predict-
ing the outcome of new data provided to the trained model. The average classification
accuracy of 75.46% is achieved for the MIAS dataset with VGG16. It is the highest
accuracy compared to AlexNet, GoogleNet, and EfficientNet. However, the number
of trainable parameters is huge in the VGG16. This classification approach may help
in the early diagnosis of breast cancer during the screening of mammograms. It will
be helpful to radiologists for prioritizing mammograms for abnormality during the
screening programs.
References
1. Abbas, Q.: Deepcad: a computer-aided diagnosis system for mammographic masses using deep
invariant features. Computers 5(4), 28 (2016)
2. Arevalo, J., González, F.A., Ramos-Pollán, R., Oliveira, J.L., Lopez, M.A.G.: Representation
learning for mammography mass lesion classification with convolutional neural networks.
Comput. Methods Programs Biomed. 127, 248–257 (2016)
3. Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R.L., Torre, L.A., Jemal, A.: Global cancer
statistics 2018: Globocan estimates of incidence and mortality worldwide for 36 cancers in
185 countries. CA: Cancer J. Clinic. 68(6), 394–424 (2018)
4. Brzakovic, D., Neskovic. M.: Mammogram screening using multiresolution-based image seg-
mentation. In: Series in Machine Perception and Artificial Intelligence. World Scientific, pp
103–127 (1994). https://doi.org/10.1142/97898127978340006
5. Hadgu, E., Seifu, D., Tigneh, W., Bokretsion, Y., Bekele, A., Abebe, M., Sollie, T., Merajver,
S.D., Karlsson, C., Karlsson, M.G.: Breast cancer in ethiopia: evidence for geographic dif-
ference in the distribution of molecular subtypes in africa. BMC Women’s Health 18(1), 1–8
(2018)
6. Hadush, S., Girmay, Y., Sinamo, A., Hagos, G.: Breast cancer detection using convolutional
neural networks (2020). arXiv:2003.07911
7. Hamed, G., Marey, M., Amin, S., Tolba, M.: Deep learning in breast cancer detection and
classification, pp. 322–333 (2020). https://doi.org/10.1007/978-3-030-44289-7-30
8. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional
neural networks. Commun. ACM 60(6), 84–90 (2017)
9. Li, B., Ge, Y., Zhao, Y., Guan, E., Yan, W.: Benign and malignant mammographic image clas-
sification based on convolutional neural networks. In: Proceedings of 10th International Con-
ference on Machine Learning and Computing, ACM (2018). https://doi.org/10.1145/3195106.
3195163
10. Petrosian, A., Chan, H.P., Helvie, M.A., Goodsitt, M.M., Adler, D.D.: Computer-aided diagno-
sis in mammography: classification of mass and normal tissue by texture analysis. Phys. Med.
Biol. 39(12), 2273–2288 (1994). https://doi.org/10.1088/0031-9155/39/12/010
11. Selvathi, D., Poornila, A.A.: Deep learning techniques for breast cancer detection using medical
image analysis. In: Biologically Rationalized Computing Techniques for Image Processing
Applications. Springer, pp 159–186 (2018)
12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition (2015). arXiv:1409.1556
13. Singh, D., Singh, A.K.: Role of image thermography in early breast cancer detection-past,
present and future. Comput. Methods Programs Biomed. 183(105), 074 (2020)
14. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke,
V., Rabinovich, A.: Going deeper with convolutions. In: 2015 IEEE Conference on Com-
puter Vision and Pattern Recognition (CVPR), IEEE (2015). https://doi.org/10.1109/cvpr.2015.
7298594
15. Wang, J., Yang, X., Cai, H., Tan, W., Jin, C., Li, L.: Discrimination of breast cancer with
microcalcifications on mammography by deep learning. Scienti. Rep. 6(1), 1–9 (2016)
16. Wang, J., Perez, L., et al.: The effectiveness of data augmentation in image classification using
deep learning. Convolut. Neural Netw. Vis. Recognit. 11 (2017)
17. Yang, Z., Dong, M., Guo, Y., Gao, X., Wang, K., Shi, B., Ma, Y.: A new method of micro-
calcifications detection in digitized mammograms based on improved simplified pcnn. Neuro-
comput 218(C), 79–90 (2016). https://doi.org/10.1016/j.neucom.2016.08.068
Deep Learning-Based Parameterized
Framework to Investigate the Influence
of Pedagogical Innovations
in Engineering Courses
1 Introduction
M. Ashok (B)
Rajalakshmi Institute of Technology, Chennai, TN, India
e-mail: ashok.m@ritchennai.edu.in
K. Ramasamy
Dhirajlal Gandhi College of Technology, Salem, TN, India
U. Ashok
SRM Valliammai Engineering College, Chennai, TN, India
R. Pandian
Velammal Engineering College, Chennai, TN, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 129
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_12
130 M. Ashok et al.
2 Literature Survey
Butler-Henderson and Crawford [1] explored the parameters for students’ evalu-
ation in physical, logical, and pedagogical levels. Tan and Matsuda [2] conveyed
the impact of teaching practices in the agenda of pedagogy implementations. Hill
and France [3] identified the factors which influence the technology-based virtual
learning environment. Kiernan [4] commented on the best teaching practices
adopted during the COVID-19 pandemic outbreak. Grimal et al. [5] elaborated on
implementing pedagogies for professional training in Universities for Engineering
curriculum. van Twillert et al. [6] narrated teachers’ and students’ behavior analysis
during collaborative learning using ICT tools. Punithavathi and Geetha [7] analyzed
the role of mobile technologies in undergraduate engineering education. Nancy et al.
[8] depicted the role of ICT tools in hybrid teaching and its instances. Özgür [9]
listed the stress management parameters for teachers involved in gadget-based
3 Methodology
The parameters responsible for pedagogical innovations were the Virtual Learning
Environment tool, Social Media Influence, Digital Professional Platform, and
resources available for the teaching–learning process. To create the framework, the
parameters were treated as Usage Quotients (UQ). Every parameter was described
as the ratio of two or more numerical attributes, which makes it easy to perform
computations or predictions using the data sets.
3 represents the count of exams, viz., Unit Test-1 (First two and a half Units),
Unit Test- 2 (Second two and a half Units), and Model Exam (complete five units)
as prescribed by the University.
Influence factor (IF) was defined as the variable for storing the influence of (1), (2),
(3), and (4).
5 Conclusions
References
1 Introduction
More than a million people have succumbed to COVID-19, and more than 75 million
people have had COVID-19. Mass testing is essential for isolating infected individ-
uals and slowing the spread. India currently has the second most COVID-19 cases
worldwide. The ICMR has recommended faster COVID-19 tests in containment
zones, with results in 30 min, costing Rs 450. With 220 million Indians sustained on an
expenditure level of less than Rs 32/day, and India going through the ‘unlock’ phase,
reverse migration of workers, and reopening of offices and some educational insti-
tutes, the need for a quick, accurate, and inexpensive COVID-19 test could not be
more significant. Even with the advent of the vaccine, there is no information about
its longevity and durability. Hence, there is an added need for people to be tested
regularly.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 137
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_13
138 S. Dhekane et al.
Dry cough is one of the most common symptoms of COVID-19. Because COVID-19 is
a respiratory disease, the lungs are weakened by the virus even if the person is
asymptomatic, and hence his/her forced cough is discernible from a healthy person’s
cough [1]. While these slight cough differences are not decipherable to the human
ear, they can be picked up by an Artificial Intelligence system.
The Aarogya Setu app by the Indian government has a national reach, and inte-
gration with our model can help people get a preliminary test of COVID-19. The
economy needs to get back on track with the unlock phase. For this to happen,
regular screening on a large scale is of paramount importance. Our model architec-
ture (Fig. 1) enables a nearly cost-free, real-time solution for COVID-19 testing. The
model uses forced cough recordings to recognize whether the patient is COVID-19
positive or not using Artificial Intelligence [2]. Disparate Machine Learning [3] and
Deep Learning [4] methods were implemented, and the results were observed.
Techniques that were employed are briefly described in Sect. 2 Methodology.
Section 3 shows the results of all the methods adopted. It also explains the results
and their implications. Section 4 is a synopsis of the overall methodology and the
results obtained.
2 Methodology
The coughing sound recorded during data collection was resampled to 16 kHz. The
Librosa library was used for audio processing. The leading and trailing silence was
removed from the resampled audio. Mel-Frequency Cepstral Coefficients (MFCCs) [6]
were then extracted from the silence-removed audio and stored in the form of Mel
spectrogram images. As mentioned in the dataset section, the Cambridge dataset was
skewed in favor of healthy cough recordings; that is, it contained fewer COVID-19
positive samples than healthy samples. Data augmentation was therefore used to
increase the number of COVID-19 positive samples and balance the dataset to get
improved results.
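The silence-removal step can be illustrated with a deliberately simplified sketch. It assumes a plain amplitude threshold on raw samples rather than the energy-based trimming a library such as Librosa performs; the function name `trim_silence` and the threshold value are hypothetical and are not taken from the paper.

```python
def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose magnitude is below threshold.

    Simplified stand-in for energy-based trimming; the threshold is an
    assumed value, not one reported by the authors.
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    return samples[loud[0]:loud[-1] + 1]


# Hypothetical clip: quiet samples at both ends, a cough burst in the middle.
clip = [0.0, 0.001, 0.3, -0.01, -0.4, 0.005, 0.0]
trimmed = trim_silence(clip)
print(trimmed)
```

Quiet samples inside the burst are kept on purpose, since only the leading and trailing silence is removed.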
After Mel spectrogram images were extracted from audios, these images were used
for data augmentation. To augment the Mel images, image pixel data was scaled by
using different normalization techniques. All Mel images from COVID-19 positive
samples were processed by applying these techniques, and the new augmented images
were stored in the dataset for further training.
The techniques used were Pixel value normalization, Centering pixel values, and
Standardizing pixel values.
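A minimal sketch of the three scaling techniques, assuming their usual definitions: normalization divides 8-bit values by 255, centering subtracts the mean, and standardization also divides by the standard deviation. The paper does not state whether the statistics were computed per image or per dataset, so the per-row computation and the sample values below are assumptions for illustration only.

```python
from statistics import mean, pstdev


def normalize(pixels):
    """Scale 8-bit pixel values into the range [0, 1]."""
    return [p / 255.0 for p in pixels]


def center(pixels):
    """Subtract the mean so the values are centered on zero."""
    mu = mean(pixels)
    return [p - mu for p in pixels]


def standardize(pixels):
    """Center the values and divide by their standard deviation.

    Assumes the values are not all identical (non-zero deviation).
    """
    mu, sigma = mean(pixels), pstdev(pixels)
    return [(p - mu) / sigma for p in pixels]


row = [0, 51, 102, 255]  # one hypothetical row of 8-bit pixel values
print(normalize(row))
print(center(row))
print(standardize(row))
```

Each function returns a new version of the same image data, which is how one positive sample can yield several augmented images.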
To train the model so that it learns to conduct the COVID-19 test, the forced cough
recordings were obtained from the dataset (Sect. 2.1). The recordings were then
processed to extract features. These features were fed to disparate Machine Learning
and Deep Learning models (Fig. 2). Each model was tested to calculate the accuracies,
and a comparative study was performed to conclude which model is best in terms of
reliability and accuracy.
Machine Learning Models:
For training the machine learning models [3], Mel-frequency cepstral coefficients
(MFCCs) were extracted for all 724 audio samples available and stored in our dataset.
The audios were divided into frames of 25 ms duration with an overlap of 10 ms, and
12 MFCC coefficients were extracted for each frame. 75% of the available dataset
was used for training and 25% for testing. The models performed binary classification
into COVID-19 positive and healthy. Support vector machine classifier, decision tree
classifier, random forest classifier, and k-nearest neighbor classifier were trained to
obtain the results.
[Block diagram: forced cough recording → Mel-frequency cepstral coefficients → ML/DL model → classification as COVID-19 positive or negative]
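The framing arithmetic can be sketched as follows. The sketch assumes that "overlap of 10 ms" means consecutive 25 ms frames share 10 ms, giving a 15 ms hop; if the authors meant a 10 ms hop instead, the frame count would differ. The function name `frame_count` is hypothetical.

```python
def frame_count(n_samples, sample_rate=16000, frame_ms=25, overlap_ms=10):
    """Number of full frames that fit in a clip of n_samples samples."""
    frame_len = sample_rate * frame_ms // 1000           # 400 samples at 16 kHz
    hop = sample_rate * (frame_ms - overlap_ms) // 1000  # 240 samples at 16 kHz
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop


n_frames = frame_count(16000)    # one second of audio at 16 kHz
feature_shape = (n_frames, 12)   # 12 MFCC coefficients per frame
print(feature_shape)             # (66, 12)
```

Under these assumptions, a one-second recording yields a 66 x 12 feature matrix before any pooling or flattening for the classifier.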
Deep Learning Models:
Suitable CNNs [7] were tested for classifying the Mel spectrogram images into binary
classes: COVID-19 positive and healthy. As mentioned above, a Mel-frequency
spectrogram was plotted for each audio. Then, because the dataset is skewed, data
augmentation was applied to balance the data, and hence a total of 1147 images were
subjected to the CNN under test. This set of 1147 images contained 564 images that
represented COVID-19 positive cough audios. A custom-made CNN model and a
few other transfer learning [8]-based CNN models were tested to determine which
one was the best at correctly classifying the maximum number of Mel images.
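The reported image counts are consistent with one simple reading of the augmentation step, namely that every COVID-19 positive image is kept and gains one augmented copy per scaling technique. That reading is an assumption; the paper gives only the totals.

```python
total_audio = 724       # audio samples reported in the text
positive_audio = 141    # COVID-19 positive samples reported in the text
healthy_audio = total_audio - positive_audio

techniques = 3          # normalization, centering, standardization
# Assumption: each positive image is kept and gains one copy per technique.
positive_images = positive_audio * (1 + techniques)
total_images = healthy_audio + positive_images

print(healthy_audio, positive_images, total_images)  # 583 564 1147
```

The result matches the 564 positive images and 1147 total images stated above, leaving the two classes nearly balanced (564 versus 583).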
Custom-made CNN: A custom CNN was designed by building a sequential model
comprising a series of 2D convolution layers, pooling layers, dense layers, and
dropout layers along with regularization to avoid overfitting. The architecture of this
custom-made CNN is shown below (Fig. 3).
Transfer Learning and Fine-Tuning Pre-trained CNNs: Pre-trained CNNs (pre-trained
on the Imagenet dataset, which classifies a plethora of images into many classes)
were tweaked and used for classifying Mel spectrogram images. These deep neural
networks are loaded with Imagenet weights, and then the top softmax layer that
classifies images into a thousand classes is removed. A few top layers are set to
trainable, and the bottom layers are frozen on Imagenet weights. Dense and dropout
layers are added at the top, including a last dense layer containing two nodes that clas-
sify the Mel images into two classes: COVID-19 positive and healthy. The different
pre-trained CNNs used are enumerated as follows:
• ResNet50: Out of the 176 layers of this network, 171 were frozen at Imagenet weights. The top
softmax classifier layer was replaced by three layers: a dense layer containing 100
nodes with an L2 regularizer, a dropout layer, and a final dense layer containing
two nodes with softmax activation. The summary of this network architecture can
be found in Table 1.
• Xception: Similar to the procedure followed in ResNet50, after removing the top
softmax classifier layer of the Xception model, 129 out of the total 133 layers
were frozen on Imagenet weights. The top softmax classifier was replaced with a
128-node dense layer, a dropout layer, and a final softmax dense binary classifier
layer. The summary of this network architecture can be found in Table 1.
• VGG16: This model, too, was fine-tuned for training Mel spectrogram images
by removing the top softmax classifier layer, freezing the bottom 17 layers (out
of a total of 20 layers), and adding a 100-node dense layer, dropout layer, and a
softmax binary classifier layer.
The summary of all network architectures pertaining to the above-mentioned pre-trained
CNNs can be found in Table 1.
Modern Transfer Learning-Based Preliminary Diagnosis of COVID-19 … 141
This section contains the results of all the models that were adopted to test COVID-19
using cough recordings. The overall accuracy and the categorical accuracy of each
class (COVID-19 positive and healthy) are mentioned for each model. The graphs
depicting training and validation accuracies and losses are also plotted for each Deep
Learning model. Two types of models were used: Machine Learning based and Deep
Learning based.
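The two kinds of accuracy can be computed as in the sketch below. It assumes that "categorical accuracy" means the fraction of recordings of a given class that are predicted correctly, which matches the wording used for the per-model results; the labels and predictions are toy values, not data from the paper.

```python
def accuracies(y_true, y_pred, classes=("healthy", "positive")):
    """Return overall accuracy and per-class (categorical) accuracy."""
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in classes:
        # Indices of the recordings whose true label is class c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return overall, per_class


# Toy labels for illustration only.
y_true = ["healthy", "healthy", "healthy", "healthy",
          "positive", "positive", "positive", "positive"]
y_pred = ["healthy", "healthy", "healthy", "healthy",
          "positive", "positive", "healthy", "healthy"]

overall, per_class = accuracies(y_true, y_pred)
print(overall, per_class)
```

Reporting both figures matters on a skewed dataset, because a high overall accuracy can hide a weak positive-class accuracy.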
See Table 2.
Custom CNN:
Categorical accuracies:
Accuracy of predicting healthy cough recording correctly: 98.59%, Accuracy of
predicting COVID-19 positive cough recording correctly: 78.72% (Fig. 4)
ResNet50:
Categorical accuracies:
Accuracy of predicting healthy cough recording correctly: 94.37%, Accuracy of
predicting COVID-19 positive cough recording correctly: 86.52% (Fig. 5)
Xception:
Categorical accuracies:
Fig. 4 Losses and accuracies versus epoch for training and testing of Custom CNN
Fig. 5 Losses and accuracies versus epoch for training and testing of ResNet50 using transfer
learning and fine-tuning
Fig. 6 Losses and accuracies versus epoch for training and testing of Xception using transfer
learning and fine-tuning
Fig. 7 Losses and accuracies versus epoch for training and testing of VGG16 using transfer learning
and fine-tuning
models were trained on a total of 724 audios, out of which only 141 were COVID-19
positive and others healthy. Table 3 summarizes the accuracies of DL models.
Deep Learning models perform better than ML models here, as Mel spectrogram
images were augmented to balance the dataset. A total of 724 audio samples present in
the dataset were processed to extract Mel images, and these were then supplemented
to obtain 1147 Mel images. As previously mentioned, these Mel spectrogram images
were then fed to various CNNs. Among the multiple CNNs tried and tested, it is
evident that applying transfer learning on VGG16 and then fine-tuning gave the
best accuracy. This can be attributed to the fact that a smaller model works better
on a small dataset, and VGG16 has the smallest architecture among the
pre-trained CNNs.
Table 4 Comparison of the present method with a similar method on a similar dataset
Model             Overall accuracy (%)   Number of COVID-19 positive samples   Total samples
Our model         92.19                  141                                   724
Cambridge model   80                     141                                   599
We obtained the dataset from a research group at the University of Cambridge.
They developed a mobile application and website named COVID-19 Sounds App to
collect cough recordings and conducted research on the collected dataset [5]. Their
research includes a total of 599 sounds from different users, which contained cough
as well as breathing sounds. Classifiers such as logistic regression, gradient boosting
trees, and support vector machines were tested on features that combine handcrafted
features and features obtained through transfer learning. Our model was trained to
identify COVID-19 positive or negative based on only cough recordings, whereas
the above model uses breathing recordings. Comparing the accuracies of our model
with the above model, our model performs better (Table 4).
4 Conclusion
Acknowledgements COVID-19 Sounds App’s [5] reliable data has helped us build this model that
can play an essential role in recovering from the pandemic.
Conflict of Interest None of the authors has any conflict of interest with any individuals,
agencies, or institutes.
References
1. Shi, Y., Liu, H., Wang, Y., Cai, M., Xu, W.: Theory and application of audio-based assessment
of cough. J. Sens. 2018. Article ID 9845321 (2018)
2. Imran, A., Posokhova, I., Qureshi, H.N., Masood, U., Riaz, M.S., Ali, K., John, C.N., Iftikhar
Hussain, M.D., Nabeel, M.: AI4COVID-19: AI enabled preliminary diagnosis for COVID-19
from cough samples via an app. Inf. Med. Unlocked (2020)
3. Alpaydın, E.: Introduction to Machine Learning, Second Edition (Adaptive Computation and
Machine Learning), 2nd edn. The MIT Press Cambridge, Massachusetts, London, England
(2010)
4. Moolayil, J.J.: Learn Keras for Deep Neural Networks: A Fast-Track Approach to Modern Deep
Learning with Python, 1st edn. Apress, New York (2019)
5. Brown, C., Chauhan, J., Grammenos, A., Han, J., Hasthanasombat, A., Spathis, D., Xia, T.,
Cicuta, P., Mascolo, C.: Exploring Automatic Diagnosis of COVID-19 from Crowdsourced
Respiratory Sound Data. In: KDD’20 (Health Day), San Diego, CA, USA (virtual event) (2020)
6. Rabiner, L.R., Schafer, R.W.: Digital Processing of Speech Signals, 4th edn. AT&T, Prentice-
Hall, Inc., Englewood Cliffs, New Jersey
7. Laguarta, J., Hueto, F., Subirana, B.: COVID-19 artificial intelligence diagnosis using only cough
recordings. IEEE Open J. Eng. Med. Biol. (2020)
8. Hussain, M., Bird, J.J., Faria, D.R.: A study on CNN transfer learning for image classification.
In: 18th Annual UK Workshop on Computational Intelligence, Nottingham (2018)
Biomedical Text Summarization:
A Graph-Based Ranking Approach
Abstract The latest and precise information regarding the biomedical and health-
care domain is required in the current pandemic situation. The world has turned
into a small place where everyone wants quick and relevant medical data to prevent
contagious diseases. Doctors, nursing staff, medical practitioners, frontline Covid19
epidemic fighters, and even the common man require updates and summarized
biomedical statistics. A study on Graph-based biomedical text summarization with
different similarity measures and ranking of sentence embeddings is presented in
this paper. Cosine and Dice similarities and the pre-trained BERT model providing
context via sentence embeddings are combined with TextRank and PageRank algo-
rithms resulting in an opulent extractive text summarization of biomedical Cord19
Pubmed articles. Rouge-1 and Rouge-L scores are empirically calculated, providing
a comparison between the average F-score, precision, and recall values for various
graph-based sentence extraction methods. It has been observed that Cosine similarity
and BERT sentence embeddings are equally effective when used with graph-based
ranking algorithms. The significant contribution is the proposed TextRank with the
BERT embedding model, which is evaluated as the preferred choice for short biomedical
document summarization. For large documents, however, the BERT model is heavy
and causes latency in execution, whereas LexRank with the Cosine measure
still works efficiently for mid-size document summarization.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 147
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_14
148 S. Gupta et al.
1 Introduction
The enormous volume of textual data in the biomedical area is a persistent challenge
that drives analysts to create new domain-specific text processing methods.
Biomedical summarization techniques have been broadly examined in the past to
support clinicians and specialists. Biomedical data is accessible as various kinds of
documents. Biomedical reviews give clinicians and specialists vital information to
evaluate the most recent advances in a specific field of study, create and validate
new theories and hypotheses, carry out experimental analysis, and interpret the
outcomes. To comprehend and assess a biomedical article, readers need to combine
the knowledge accumulated from their own experience with the new knowledge it
presents [1]. A new researcher often cannot tell which sentences are significant and
which only carry the preliminary information the authors need to introduce.
Numerous sources in the existing literature can be accessed over the Web, including
the Cord19 and Pubmed repositories, which openly archive scholarly documents in
the healthcare and biomedical fields.
The motivation is to retrieve informative sentences from the bulk of biomedical
articles, which is a significant challenge. Text summarization condenses a large
amount of data to a fixed length while preserving its informative content.
The proposed methodology uses the TextRank graph-based approach to extract
top-ranked sentences and summarize biomedical text articles (Cord19, Pubmed).
Automatic text summarization (ATS) is difficult because human summarizers draw on
knowledge and language capabilities that do not map directly onto computer logic.
Broadly, ATS approaches are either extractive or abstractive [2]. In extractive
summarization, significant sentences from the text are identified and selected for
the shortened summary; extraction relies only on text strings taken from the
original content. In contrast, abstractive summarization techniques aim to
re-express the significant content in a new form, as humans do.
A brief history of work done in this field is presented in Sect. 2. Section 3 contains
the proposed methodology, system architecture, working approach, and algorithm.
Section 4 highlights the experimental setup, analysis, and results for the work done.
Finally, Sect. 5 contains the conclusion.
2 Literature Review
repeated words deciding the significant parameter of a sentence. WordNet [4] is a
well-known general-purpose ontology. Nagwani et al. presented Dice and Cosine
similarities for sentence similarity measurement [5].
Chen and Verma [6] developed a medical text summarization system that used
keywords from the original document as a query. The set of keywords was
expanded with matching concepts from the UMLS knowledge base. Sentences are
scored with 1 point when an original keyword is present in a sentence and
0.5 point when an expanded keyword is present. Finally, the top k sentences with
the highest scores are chosen for the summary. Likewise, Sharaff and Nagwani [7]
identify email threads and evaluate the result with precision, recall, and F-score.
Mihalcea et al. [8] introduced TextRank, which likewise uses PageRank to rank the
sentences but builds the graph from a co-occurrence relation derived from the
similarity between sentences. The text processing for SMS threads is reported
in [9]. The authors split the document into tokenized sentences and mapped the
text strings onto an ontology, producing an ontology-based sentence tree with
assigned scores. Higher-ranked nodes represented a better score.
Zahir et al. [10] summarized generic documents using a graph-based method. First,
they constructed vertices and edges from the sentences with a word frequency of
more than one. The scores are then expressed in the form of a symmetric matrix.
This work did not consider the relative relationships within a sentence to reduce
ambiguity and relied on the matrix alone. It also lacked an evaluation of cosine
similarity. Moradi and Ghadiri [11, 12] introduced summarization based on Bayesian
principles using data from the Pubmed repository, integrated with the UMLS
ontology; the sentence vectors are then processed for distance estimation and
concept retrieval. Mohamed and Oussalah [13] used Semantic Role Labeling (SRL) to
parse and assign weights to individual sentences, and afterward used PageRank to
score the sentences.
Sharaff and Nagwani [14] performed document summarization with an agglomerative
method. Several measures are used to assess text summarization performance,
namely F-measure, Recall, Precision, and ROUGE values [15]. ROUGE values
depend on the N-gram overlap between the system-generated summary and the
reference summary (also called the gold standard). The gold-standard summary may
be a human-made summary or the abstract of the original text. ROUGE scores range
from 0 to 1, and a higher score indicates better performance. Our paper introduces
an automatic extraction-based summarization system focusing on biomedical
survey and research papers via a graph structure. The PageRank algorithm combined
with BERT word embedding and Cosine and Dice similarities is proposed. The proposed
technique is evaluated by calculating ROUGE measurements using multiple
similarity measures and comparing average F-score, precision, and recall values.
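The N-gram overlap behind ROUGE can be illustrated for the unigram case (ROUGE-1). This is a minimal sketch that assumes whitespace tokenization and lowercasing, with no stemming or stop-word handling; it is not the ROUGE package used in the experiments, and the two example sentences are invented.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, F-score) based on clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


p, r, f = rouge_1("the cat sat on the mat", "the cat is on the mat")
print(round(p, 3), round(r, 3), round(f, 3))
```

Precision divides the overlap by the length of the generated summary and recall by the length of the reference, so all three values stay between 0 and 1.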
3 Proposed Methodology
After initial preprocessing, a combination of the PageRank algorithm with various
syntactic and word embedding models is proposed for extractive text summarization
over biomedical datasets. Similarity matrices are generated using Cosine, Dice, and
the combination of Cosine and Dice similarities. Separately, a pre-trained
BERT [16, 17] model is used to produce word embeddings that provide semantic and
contextual representation. A sentence graph model is then created via the TextRank
algorithm, which is based on the PageRank algorithm [18]. The top-ranked sentences
are selected to produce the summary.
The TextRank algorithm is based upon the PageRank algorithm. In the proposed
architecture, sentences are represented as nodes, and edges are identified
from the scores of the different similarity measures. Biomedical articles
are used for extractive graph-based text summarization with a ranked approach. The
Cord19 dataset contains Covid19-related articles, and the Pubmed dataset contains
research articles and literature in the biomedical domain.
Figure 1 highlights the basic building blocks of the developed system and model
for extractive summarization of biomedical textual data. Biomedical documents
are fed to Json [19] and XML parsers (Pubmed Parser) [20] to gather text data,
which is convenient for text processing. Documents are tokenized and preprocessed;
sentences are converted into vectors, and words are weighted using the cosine and
Dice weighting schemes. In addition to the weighting mechanism, BERT is used for
word embedding, which provides the similarity scores between two sentences with
the help of the knowledgebase. According to similarity matrices, nodes, edges, and
linkage are identified for creating a text graph. The different sentences are ranked,
and then top-ranked sentences are picked to form the summary document.
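The two syntactic similarity measures can be sketched on tokenized sentences as follows. The sketch assumes term-frequency vectors for Cosine and word sets for Dice, which are the common definitions; the paper does not spell out its exact weighting or how the two scores are combined, so no combination is shown. The example sentences are invented.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def dice_similarity(tokens_a, tokens_b):
    """Dice coefficient on the sets of distinct words."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


s1 = "covid vaccine trial reports immune response".split()
s2 = "vaccine trial shows strong immune response".split()
print(cosine_similarity(s1, s2), dice_similarity(s1, s2))
```

Computing one of these scores for every sentence pair fills the similarity matrix whose entries become the edge weights of the text graph.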
PageRank
PageRank (Brin and Page, 1998) is perhaps one of the most popular ranking
algorithms and was designed as a method for Web link analysis. Unlike other
ranking algorithms, PageRank integrates the impact of both incoming and
outgoing links into a single model and therefore produces only one set of
scores:
$$PR(V_m) = (1 - d) + d \sum_{V_n \in In(V_m)} \frac{PR(V_n)}{|Out(V_n)|}$$
Here, d is a damping parameter, which is set between 0 and 1. Starting from
arbitrary values assigned to each node in the graph, the computation iterates
until convergence below a given threshold is achieved. After running the
algorithm, a score is associated with each vertex, which represents the
“importance” or “power” of that vertex within the graph. The choice of initial
values does not affect the final values; only the number of iterations to
convergence may differ.
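The iteration described above can be sketched for a small directed graph. The damping factor, threshold, and iteration cap reuse the values listed in the paper's algorithm (0.85, 1e-5, 100); the function name, the graph, and the requirement that every vertex appears as a key are assumptions made for this illustration.

```python
def pagerank(out_links, d=0.85, tol=1e-5, max_iter=100):
    """Iterate PR(v) = (1 - d) + d * sum(PR(u) / |Out(u)|) over u in In(v).

    out_links maps each vertex to the list of vertices it points to;
    every vertex must appear as a key (an assumption of this sketch).
    """
    nodes = list(out_links)
    in_links = {v: [] for v in nodes}
    for src, targets in out_links.items():
        for dst in targets:
            in_links[dst].append(src)

    scores = {v: 1.0 for v in nodes}  # arbitrary starting values
    for _ in range(max_iter):
        new = {v: (1 - d) + d * sum(scores[u] / len(out_links[u])
                                    for u in in_links[v])
               for v in nodes}
        delta = max(abs(new[v] - scores[v]) for v in nodes)
        scores = new
        if delta < tol:  # converged below the threshold
            break
    return scores


graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # hypothetical graph
scores = pagerank(graph)
print(scores)
```

Vertex C receives links from both other vertices and ends up with the highest score, while B, which is reached only through half of A's outgoing weight, ranks lowest.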
word markings, stop words/noise (resembling symbolic words) removal, and
similar-stemmed word removal.
3.3 Sentence-Scoring
In Figure 3, the sentences are marked as nodes and their interconnectivity as edges.
The nodes are derived from the sentences, while the edges are the links between
any two nodes. The main contributions are (1) understanding the semantic context
of the document via the BERT model, resulting in the extraction of critical
biomedical concepts, and (2) evaluation and comparison of the standard weighting
mechanisms over the Cord19 and Pubmed datasets.
3.4 Sentence-Selection
The top-ranked sentences are selected via the TextRank algorithm based on different
similarity measures. The stepwise sequence of processing text data and producing
summarized content based on various similarity measures and word embeddings is
depicted in Algorithm 1.
Algorithm 1
Parse Cord19 dataset and Pubmed dataset with Json parser and an XML parser
1: for each document D do
2:   Identify and segment sentences via tokenization
3:   for each sentence Sn do
4:     Execute preprocessing steps (tokenize words, filter stop words, convert sentence to lowercase, lemmatize)
5:     Attain filtered sentence Sn
6:   end for
7:   Set damping factor to 0.85
8:   Set convergence threshold to 1e-5
9:   Set maximum iterations to 100
10:  Convert text to vectors
11:  Evaluate Cosine measures
12:  Evaluate Dice measures
13:  Evaluate and combine Cosine and Dice measures
14:  Evaluate sentence embeddings with BERT
15:  Construct similarity matrix graph with sentences as nodes and similarity scores as edges
16:  Execute PageRank algorithm
17:  Return top-ranked sentences
18:  Generate summary with the given number of top-ranked sentences
19: end for
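The graph construction, ranking, and selection steps of the algorithm can be sketched on a precomputed similarity matrix. The sketch uses the weighted form of the ranking formula from the original TextRank formulation, in which each neighbor's score is shared in proportion to edge weight. The similarity values, the function names, and the choice to return the selected sentences in document order are assumptions, not details given in the paper.

```python
def textrank(similarity, d=0.85, tol=1e-5, max_iter=100):
    """Rank sentences from a symmetric similarity matrix (weighted PageRank)."""
    n = len(similarity)
    # Total edge weight leaving each sentence; self-similarity is ignored.
    out_weight = [sum(similarity[j][k] for k in range(n) if k != j)
                  for j in range(n)]
    scores = [1.0] * n
    for _ in range(max_iter):
        new = []
        for i in range(n):
            rank = sum(similarity[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if j != i and out_weight[j] > 0)
            new.append((1 - d) + d * rank)
        delta = max(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


def top_sentences(sentences, scores, n):
    """Pick the n best-scored sentences and return them in document order."""
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]


sentences = ["S0", "S1", "S2", "S3"]  # placeholders for real sentences
sim = [[0.0, 0.6, 0.5, 0.1],          # hypothetical similarity scores
       [0.6, 0.0, 0.4, 0.0],
       [0.5, 0.4, 0.0, 0.2],
       [0.1, 0.0, 0.2, 0.0]]
scores = textrank(sim)
summary = top_sentences(sentences, scores, 2)
print(summary)
```

Swapping in a matrix built from Cosine, Dice, or BERT embedding similarities changes only the input; the ranking and selection code stays the same.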
The environment setup consisted of an open-source Python notebook run on
Google Colaboratory with libraries such as nltk, BERT transformers, and numpy.
The Pubmed and Cord19 biomedical text repositories are parsed with an XML
parser [20] and a Json parser [21] to obtain plain text documents and abstracts.
ROUGE scores are evaluated by comparing the summary generated by the proposed
system with a gold summary created from the abstracts of the original biomedical
text articles. We used Python ROUGE [22] to score the generated summaries. The
TextRank cosine summarizer is taken from Sumy [23]. The remaining summarization
tasks (TextRank with BERT, LexRank cosine, Dice, and Cosine with Dice) are our
own implementations.
Table 1 shows average F-scores, precision, and recall values from ROUGE execu-
tion over the proposed methods. It has been observed that our proposed technique
performed well with highlighted scores.
5 Conclusion
This paper presents results for extraction-based biomedical text summarization
computed with different similarity measures, using the TextRank algorithm to
extract top-ranked sentences for the summary. The proposed method contributes to
extracting top-ranked sentences with various syntactic and word embedding methods
for graph-based matrix generation. The PageRank algorithm is used for ranking the
highest-scoring sentences, and the BERT- and Cosine-based TextRank algorithms
perform well compared to the baseline. BERT-based ranking is efficient for short
documents but is time-consuming for huge text corpora. The Cosine-based LexRank
performs efficiently for short and mid-length document summarization. Graph-based
approaches for finding specific significant vertices can be enhanced in future work,
and the proposed method can be further integrated with different knowledge bases
like UMLS for biomedical datasets (Pubmed, Cord19).
References
1. Mishra, R., Weir, C.R., Bian, J., Jonnalagadda, S., Fiszman, M., Mostafa, J., Del Fiol, G.: Text
summarization in the biomedical domain: a systematic review of recent research. J. Biomed.
Inform. 52, 457–67 (2014)
2. Allahyari, M., Trippe, E.D., Pouriyeh, S., Safaei, S., Assefi, M., Kochut, K., Gutierrez, J.B.:
Text summarization techniques: a brief survey (2017). arXiv preprint arXiv:1707.02268
3. Sharaff, A., Roy, S.R.: Comparative analysis of temperature prediction using regression
methods and back propagation neural network. In: ICOEI (2018).
4. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
5. Nagwani, N.K., Singh, P.: Weight similarity measurement model based object oriented
approach for bug databases mining to detect similar and duplicate bugs. In: International
Conference on Advances in Computing, Communication and Control (ICAC3’09) (2009)
6. Chen, P., Verma, R.: A query-based medical information summarization system using ontology
knowledge. pp. 37–42.
7. Sharaff, A., Nagwani, N.K.: Email thread identification using latent Dirichlet allocation and
non-negative matrix factorization based clustering techniques. J. Inf. Sci. 42(2), 200–212 (2016)
8. Mihalcea, R., Tarau, P.: TextRank: bringing order into text. Int. J. Public Adm. 42(7), 596–615
9. Sharaff, A., Shrawgi, H., Arora, P., Verma, A.: Document summarization by agglomerative
nested clustering approach. In: IEEE International Conference on Advances in Electronics,
Communication and Computer Technology (2016)
10. Zahir, S., Cenek, M., Fatima, Q.: New graph-based text summarization method. pp. 396–401
(2015)
11. Moradi, M., Ghadiri, N.: Different approaches for identifying important concepts in proba-
bilistic biomedical text summarization. Artif. Intell. Med. (2017)
12. Moradi, M., Ghadiri, N.: Quantifying the informativeness for biomedical literature summa-
rization: an itemset mining method. Comput. Methods Progr. Biomed. 146, 77–89 (2017)
13. Mohamed, M., Oussalah, M.: An iterative graph-based generic single and multi document
summarization approach using semantic role labeling and wikipedia concepts. pp. 117–120
14. Sharaff, A., Nagwani, N.K.: SMS spam filtering and thread identification using bi-level text
classification and clustering techniques. J. Inf. Sci. 1–13 (2015)
15. Lin, C.: Rouge: a package for automatic evaluation of summaries. In: Text Summarization
Branches Out (2004)
16. Beltagy, I., Cohan, A., Lo, K.: SciBERT: a pretrained language model for scientific text. In:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing,
pp. 3615–3620. Association for Computational Linguistics, Hong Kong, China (2019)
17. Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-
networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language
Processing, pp. 3982–3992. Association for Computational Linguistics, Hong Kong, China
(2019)
18. Page, L., Winograd, T., Motwani, R., Brin, S.: The PageRank citation ranking: bringing order
to the web. Stanford InfoLab (1999)
19. Moen, S., Ananiadou, T.S.S.: Distributional semantics resources for biomedical text processing.
pp. 39–43 (2013)
20. Achakulvisut, T., Acuna, D.E.: Pubmed Parser. (2015). https://doi.org/10.5281/zenodo.159504
21. Inc. GitHub. Open source data. https://github.com/deepset-ai/COVID-QA (2020). Stephan
Tulkens, “humumls”. GitHub repository. Retrieved from https://github.com/clips/humumls
(2018)
22. Yuya Taguchi, “pythonrouge”. GitHub repository. Retrieved from
https://github.com/tagucci/pythonrouge (2018)
23. Mišo Belica, “sumy”. GitHub repository. Retrieved from https://github.com/miso-belica/sumy
(2018)
EEG-Based Diagnosis of Alzheimer’s
Disease Using Kolmogorov Complexity
1 Introduction
Alzheimer’s disease (AD) is the most common and significant public health issue
worldwide. The impact of AD on the aging population is growing at an alarming rate.
At present, the number of people suffering from AD and its cognitive impairments is
estimated to be more than 50 million, and it is predicted that it will double by 2030
and triple by 2050 [1]. AD is a chronic neurological disorder that reduces the number
of synapses due to the deposition of tau protein neurofibrillary tangles and amyloid
plaque, and eventually, the death of neurons will occur [2]. In recent years, several
research groups have investigated the potential of EEG for the diagnosis of AD over
other brain imaging techniques such as fMRI, SPECT, and PET [3]. Since EEG
recording systems are non-invasive, of low cost, mobile, and provide high temporal
resolution, EEG can be used as a tool to screen people for the risk of AD. Many
researchers have shown that AD has three significant effects in EEG: slowing, loss
of complexity, and perturbation in EEG [4].
D. Puri (B)
Department of Electronics and Telecommunication, R.A.I.T, Mumbai, India
S. Nalbalwar · A. Nandgaonkar
Department of Electronics and Telecommunication, Dr. B. A.T. University, Lonere, India
A. Wagh
Directorate of Technical Education, Maharashtra State, Mumbai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 157
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_15
158 D. Puri et al.
The study in [5] used relative power to investigate the slowing effect and
Lempel–Ziv complexity to measure irregularity in the EEG of AD patients, reaching a
final accuracy of 85%. The study in [6] used approximate entropy and auto-mutual
information to distinguish AD patients from NC subjects and reached an accuracy of up to
90%. In this paper, we use Spectral Entropy (SE) and Kolmogorov Complexity (KC)
to investigate the slowing effect and the regularity present in the EEG of AD patients.
These features are fed to various classifiers to compare their performance. SVM
performed better than the other classifiers evaluated.
The rest of the paper is organized as follows: Sect. 2 provides the details of the EEG
recordings and describes the SE and KC features. Section 3 presents the analysis
and discussion of the results, followed by the Conclusion in Sect. 4.
The EEG datasets used in this experiment consist of two groups: AD patients and Normal
Control (NC) subjects. EEG signals were captured from 11 NC subjects
(4 men and 7 women) and 12 AD patients (5 men and 7 women) aged 72.8 ±
8.0 (mean ± standard deviation) years. The AD patients were recruited from the
Alzheimer’s Patients’ Relatives Association of Valladolid (AFAVA) and
fulfilled the criteria of probable AD. To assess cognitive ability, all participants
underwent clinical evaluation, including physical and neurological examination,
brain scanning, and, most significantly, the Mini-Mental State Examination
(MMSE) [7]. The mean MMSE score was 13.1 ± 5.9 (mean ± SD)
for AD patients. Five of the 12 AD patients had MMSE scores below 12, which
denotes severe dementia. All EEG signals were recorded at the University
Hospital of Valladolid (Spain). The NC group contains 11 subjects (4 men and 7
women) and an age group of 72.5 ± 6.1 (mean ± SD), having no present symp-
toms and history of dementia or any neurological disorder. This NC group had more
than 30 MMSE values. All participants willingly participated in the EEG recording
activity, and written informed consent was obtained from the NC subjects and from the
caregivers of the demented patients. The local ethics committee of the Hospital Clínico
Universitario de Valladolid approved this acquisition process.
Fig. 1 Sample EEG signals from three electrodes (P3, P4, and O2) of (a) a control subject and (b) an AD patient
More than 5 min of EEG signals were recorded from each subject using a Profile
Study Room 2.3.411 EEG system (Oxford Instruments) at electrodes O1, O2, Fz, Cz,
Pz, F3, F4, F7, F8, Fp1, Fp2, C3, C4, P3, P4, T3, T4, T5, and T6 of the International 10–20
electrode placement system with linked earlobe reference points [6, 7]. During the
recording process, all participants were at rest, awake, and with eyes closed under
vigilance control. A 12-bit A-D converter was used to sample the EEG signals at a
256 Hz sampling frequency. A specialist physician checked the EEG segments for EMG
activity, eye movement, and other artifacts.
Thus, only EEG segments free from electro-oculographic and movement artifacts and
with minimal EMG activity were selected for nonlinear analysis. Afterward, the EEG
data were arranged in 5 s artifact-free epochs (1280 points). The average number of
epochs selected was 28.8 ± 15.5 (mean ± SD) per electrode per subject. Figure 1
shows sample EEG signals of three electrodes (P3, P4, and O2) for AD patients and
NC subjects.
$$P(k) = \frac{S_p(k)}{\sum_i S_p(i)} \tag{1}$$
In this work, SE has been applied to all the EEG data of the AD patients and NC
subjects.
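As an illustration of how Eq. (1) feeds a spectral entropy value, the following is a minimal sketch and not the authors' implementation. It assumes that the power spectrum Sp(k) is a plain periodogram, that the DC bin is skipped, and that SE is the Shannon entropy of P(k) normalized by the logarithm of the number of frequency bins. The function names `power_spectrum` and `spectral_entropy` are hypothetical, and a naive DFT is used only to keep the sketch free of dependencies; an FFT routine would normally be used instead.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Return a one-sided periodogram of a real signal via a naive DFT."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2 + 1):  # skip the DC bin k = 0 (assumption)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy of P(k) from Eq. (1), normalized to lie in [0, 1]."""
    sp = power_spectrum(signal)
    total = sum(sp)
    p = [s / total for s in sp]                       # Eq. (1)
    h = -sum(pk * math.log(pk) for pk in p if pk > 0)
    return h / math.log(len(p))                       # assumed normalization


rng = random.Random(0)
n = 256
# A pure tone concentrates its power in one bin; white noise spreads it out.
tone = [math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

se_tone = spectral_entropy(tone)
se_noise = spectral_entropy(noise)
print(f"tone: {se_tone:.3f}, noise: {se_noise:.3f}")
```

Under these assumptions, a signal whose power sits in a single frequency bin gives a value near 0, while a broadband signal gives a value close to 1.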
Table 1 Average spectral entropies (mean ± SD) of AD patients and control subjects for each
electrode, with the corresponding p-values
Electrode   AD patients             Control subjects        p
F3          0.574504193 ± 0.1121    0.579422229 ± 0.1122    0.1114
T5          0.575884202 ± 0.1123    0.581471674 ± 0.1123    0.0192
F7          0.574658548 ± 0.1137    0.578638296 ± 0.1137    0.6354
Fp1         0.574861246 ± 0.1127    0.581081475 ± 0.1127    0.0632
Fp2         0.577233983 ± 0.1141    0.580057193 ± 0.1141    0.1244
T3          0.575677362 ± 0.1141    0.579304739 ± 0.1141    0.7663
F4          0.574568333 ± 0.1122    0.578557822 ± 0.1122    0.8242
T4          0.577530426 ± 0.1134    0.582928486 ± 0.1134    0.9701
C3          0.575470097 ± 0.1142    0.579007141 ± 0.1142    0.1819
T6          0.576247037 ± 0.1135    0.58030197 ± 0.1134     0.0322
F8          0.576602097 ± 0.1146    0.578504554 ± 0.1146    0.4426
C4          0.575781958 ± 0.1123    0.579298901 ± 0.1122    0.3199
P3          0.575143866 ± 0.1132    0.582246802 ± 0.1132    0.0014
O1          0.576824677 ± 0.1138    0.579224846 ± 0.1138    0.0027
O2          0.582924038 ± 0.1114    0.588796275 ± 0.1113    0.0086
P4          0.57875928 ± 0.1156     0.580618155 ± 0.1156    0.0031
Fig. 2 Comparison of spectral entropies of AD and NC classes for all 16 EEG electrodes (x-axis: EEG electrodes)
Fig. 3 Comparison of Kolmogorov complexity of AD and NC classes for all 16 EEG electrodes (x-axis: EEG electrodes; y-axis: Kolmogorov complexity; legend: NC subject, AD patient)
differences were found using Area Under Curve (AUC) of the ROC plot. The other
measures, namely precision, recall, F1-score, and accuracy, were computed for each
classifier for three different feature sets: (a) SE only, (b) KC only, and (c) SE and KC
combined.
Firstly, we evaluated the performance parameters of six different classifiers
using only the SE feature sets, as shown in Table 2. SVM and KNN provide the
maximum accuracies of 90.8% and 90.6%, whereas the other classifiers, RF, MLPNN,
Naive Bayes (NB), and AdaBoost, provide 89.7%, 90.5%, 89.7%, and 82.6%, respectively.
Secondly, we applied the KC feature sets to the same classifiers that were used for the
SE feature sets; the SVM classifier again performed strongly, with an accuracy of 92.9%,
and the other classifiers also performed well. The accuracy, F1-score, precision,
recall, and AUC for all classifiers with KC as feature sets are provided in Table 3.
Thirdly, we applied the combination of SE and KC feature sets to the same classifiers;
the performance parameters of most classifiers improved, as shown in Table 4. SVM
provides the maximum classification rate of 95.6%. In all three experiments, SVM
performed well compared to the other classifiers used to evaluate the feature sets. The
classification accuracies of all classifiers are compared in the bar chart in Fig. 4; the
combination of SE and KC feature sets clearly performs better than either feature set
alone. Performance estimation is done by a tenfold cross-validation method in all
scenarios. From these evaluations, it has been observed that the EEG of AD patients is
more regular than that of the NC subjects; this is captured by the KC feature
sets. The spectral entropy values and KC values are significant biomarkers that can
distinguish an AD patient from NC subjects. A comparison of the present method
with other similar approaches is shown in Table 5.
Table 2 Performance parameters of various classifiers using only spectral entropy feature sets with
tenfold cross-validation technique
Classifier   Accuracy   F1-score   Precision   Recall   AUC
SVM          90.8       90.7       91          90.8     95.1
RF           89.7       89.6       89.9        89.7     94.9
MLPNN        90.6       90.6       90.6        90.6     95.4
NB           89.7       86.8       86.9        86.7     93.3
KNN          90.6       90.5       90.7        90.6     95.3
AdaBoost     82.6       82.5       82.6        82.6     81.7
Table 3 Performance parameters of various classifiers using only Kolmogorov complexity feature
sets with tenfold cross-validation technique
Classifier   Accuracy   F1-score   Precision   Recall   AUC
SVM          92.9       92.8       93          92.9     96.6
RF           91.2       91.1       91.6        91.2     96.5
MLPNN        93.1       93         93.2        93.1     96.9
NB           87.3       87.3       87.5        87.3     93.7
KNN          92.1       92.1       92.3        92.1     96.8
AdaBoost     84.9       84.9       84.9        84.9     84.1
Table 4 Performance parameters of various classifiers using spectral entropy (SE) and Kolmogorov
complexity feature sets with tenfold cross-validation technique
Classifier   Accuracy   F1-score   Precision   Recall   AUC
SVM          95.6       95.1       95.2        95.2     98.3
RF           90.5       90.4       90.5        90.5     96.1
MLPNN        94.1       94.1       94.1        94.1     97.7
NB           79         79         79          79       86.8
KNN          95.2       95.6       95.6        95.6     98.5
AdaBoost     88.1       88.1       88.1        88.1     87.5
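The tenfold cross-validation and the reported measures can be sketched as follows. This is a hedged illustration only: the data are synthetic, the nearest-mean classifier is a toy stand-in for the six classifiers in the paper, and the helper names (`kfold_indices`, `nearest_mean_predict`, `binary_metrics`) are hypothetical. The sketch pools the held-out predictions of all folds before computing the measures for one positive class; the paper does not state how its measures were averaged.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_mean_predict(train_x, train_y, x):
    """Toy stand-in classifier: assign x to the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [f for f, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for one positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Synthetic two-feature data (hypothetical stand-ins for SE and KC values).
rng = random.Random(1)
features = ([[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
            + [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(100)])
labels = [0] * 100 + [1] * 100  # 0 = NC, 1 = AD (illustrative labels)

folds = kfold_indices(len(labels), k=10)
y_true, y_pred = [], []
for test_idx in folds:
    held_out = set(test_idx)
    train_x = [features[i] for i in range(len(labels)) if i not in held_out]
    train_y = [labels[i] for i in range(len(labels)) if i not in held_out]
    for i in test_idx:  # every sample is predicted exactly once
        y_true.append(labels[i])
        y_pred.append(nearest_mean_predict(train_x, train_y, features[i]))

accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Each sample is held out exactly once, so the measures are always computed on data the classifier has not seen during training.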
Fig. 4 Performance evaluation of various classifiers (SVM, RF, MLPNN, NB, KNN, AdaBoost) for three different input feature sets: (a) SE only, (b) KC only, (c) SE and KC
4 Conclusion
In our framework, AD patients are distinguished from NC subjects on the basis of
Spectral Entropy and Kolmogorov Complexity measures. We found that SE values are
significantly lower in the EEG of AD patients than in that of NC subjects.
The EEG of AD patients is also more regular, as captured by the KC values.
This work has some limitations. Firstly, the available dataset was small.
To use this technique as a tool for diagnosing AD, it must be extended
to larger AD patient samples. In our next work, we will concentrate on
studying EEG synchrony with various entropies and complexity measures to distinguish
AD patients from mild cognitive impairment patients and healthy control subjects
of the same age group. We will also apply the method described in this work to other
EEG data collected at various hospitals to verify its correctness.
References
1. Alzheimer’s disease facts and figures. The Journal of the Alzheimer’s Association, Chicago,
vol. 13 (2020)
2. Lopez-Martin, M., Nevado, A., Carro, B.: Detection of early stages of Alzheimer’s disease
based on MEG activity with a randomized convolutional neural network. Artif. Intell. Med.
107 (2020). ISSN 0933-3657, https://doi.org/10.1016/j.artmed.2020.101924
3. Puri, D., Ingle, R., Kachare, P., Awale, R.: Wavelet packet sub-band based classification of
alcoholic and controlled state EEG signals. In: International Conference on Communication
and Signal Processing (ICCASP), Atlantis Press, pp. 562–567 (2016).
https://doi.org/10.2991/iccasp-16.2017.82
4. Fiscon, G., Weitschek, E., Cialini, A., Felici, G., Bertolazzi, P., De Salvo, S., Bramanti, A.,
Bramanti, P., De Cola, M.C.: Combining EEG signal processing with supervised methods
for Alzheimer’s patients classification. BMC Med. Inform. Decis. Mak. 18(35) (2018).
https://doi.org/10.1186/s12911-018-0613-y
5. Dauwels, J., Srinivasan, K., Ramasubba Reddy, M., Musha, T., Vialatte, F.-B., Latchoumane,
C., Jeong, J., Cichocki, A.: Slowing and loss of complexity in Alzheimer’s EEG: two sides of
the same coin? Int. J. Alzheimer’s Dis. 539621 (2011). https://doi.org/10.4061/2011/539621
6. Abasolo, D., Hornero, R., Escudero, J., Gomez, C., Garcia, M., Lopez, M.: Approximate
entropy and mutual information analysis of the electroencephalogram in Alzheimer’s disease
patients. In: IET 3rd International Conference on Advances in Medical, Signal and Information
Processing (MEDSIP), pp. 1–4 (2006). https://doi.org/10.1049/cp:20060347
7. Folstein, M.F., Folstein, S.E., McHugh, P.R.: Mini-mental state: a practical method for grading
the cognitive state of patients for the clinician. J. Psychiatr. Res. 12(3), 189–198 (1975).
https://doi.org/10.1016/0022-3956(75)90026-6
8. Vakkuri, A., Yli-Hankala, A., Talja, P., Mustola, S., Tolvanen-Laakso, H., Sampson, T.,
Viertiö-Oja, H.: Time-frequency balanced spectral entropy as a measure of anesthetic drug
effect in central nervous system during sevoflurane, propofol, and thiopental anesthesia.
Acta Anaesthesiol. Scand. 48(2), 145–153 (2004)
9. Petrosian, A.: Kolmogorov complexity of finite sequences and recognition of different preictal
EEG patterns. In: Proceedings Eighth IEEE Symposium on Computer-Based Medical Systems,
Lubbock, TX, USA, pp. 212–217 (1995). https://doi.org/10.1109/CBMS.1995.465426
10. Latchoumane, C.F.V., Vialatte, F.B., Jeong, J., Cichocki, A.: EEG classification of mild and
severe Alzheimer’s disease using parallel factor analysis method. In: Ao, S.I., Gelman, L. (eds.)
Advances in Electrical Engineering and Computational Science. Lecture Notes in Electrical
Engineering, vol. 39. Springer, Dordrecht (2009).
https://doi.org/10.1007/978-90-481-2311-7_60
11. Datta, A., Chatterjee, R.: Comparative study of different ensemble compositions in EEG signal
classification problem. In: Abraham, A., Dutta, P., Mandal, J., Bhattacharya, A., Dutta, S. (eds.)
Emerging Technologies in Data Mining and Information Security. Advances in Intelligent
Systems and Computing, vol. 813. Springer, Singapore (2019).
https://doi.org/10.1007/978-981-13-1498-8_13
12. De Bock, T.J., et al.: Early detection of Alzheimer’s disease using nonlinear analysis of EEG via
Tsallis entropy. In: Biomedical Sciences and Engineering Conference, Oak Ridge, TN, pp. 1–4
(2010). https://doi.org/10.1109/BSEC.2010.5510813
Quantification of Streaking Effect Using
Percentage Streak Area
1 Introduction
S. Ahmed (B)
Baba Ghulam Shah Badshah University, Rajouri, India
e-mail: sajjad@bgsbu.ac.in
S. Islam
ZHCET, Aligarh Muslim University, Aligarh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 167
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_16
pression [3]. To cover such forgeries, an operation such as median filtering
or contrast enhancement is usually applied. This is done to hide evidence that may be
visible to the naked eye or to sophisticated tools used for the purpose.
To remove the evidence left by image manipulation techniques, the median filter
(MF) is the tool of choice. The filter is effective against those techniques that rely
on the supposition that neighboring pixels are linearly correlated.
The median filter is widely used for removing impulse noise from digital images. As it
is a nonlinear filter, it efficiently eliminates traces left by other methods. Thus,
detecting the application of a median filter to an image raises suspicion about the
authenticity of the image. Numerous methods have been proposed for median filter detection.
Many of them use the streaking effect as an artifact left by the median filter. The
reliability of such work rests on the fact that median filtered images contain a
streaking effect.
The purpose of this work is to investigate the streaking effect in natural images
from standard datasets that are used in digital image forensics to evaluate the
performance of such detectors. The work provides an in-depth analysis of
how much streaking is present in natural images and what happens to streaks when
an original, unaltered, and uncompressed image is median filtered with different
window sizes. The work also studies the amount of streaking introduced by
other operators such as the Gaussian filter, the average filter, and unsharp masking. To
quantify the streaking effect in an image, the percentage streak area (PSA) proposed
in [4] is employed.
The rest of the paper is organized as follows: Sect. 2 covers background
and prior work, Sect. 3.1 describes the experimental setup, Sect. 3.2 presents
and analyzes the results, and the conclusion is presented in Sect. 4.
When a median filter is applied to one-dimensional data, each output value is one of
the input values, and no new value is created. Thus, the same value may be chosen
as output over several shifts of the filter window, producing streaks or blotches having no
visual correlation. Such artifacts are called streaks, and the effect is called the streaking
effect, first mentioned in [5]. Applying the median filter to an image produces
regions with the same or almost the same gray level. Such regions exhibit different
shapes, such as star-shaped or square-shaped, depending on the size and type of the
filter applied. These regions, which are introduced into the image, may take the
form of a streak or a two-dimensional rough patch. The streaking effect for images
was analyzed by Bovik in [6].
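The two properties described above, that a median filter only outputs values already present in the input and that neighboring outputs often coincide, can be seen on a toy signal. The sketch below is illustrative only: the signal is made up, edge handling by replication is an assumption, and the helper names `median_filter_1d` and `equal_runs` are hypothetical.

```python
from statistics import median


def median_filter_1d(values, window=3):
    """Sliding-window median with edge replication; window must be odd."""
    half = window // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(values))]


def equal_runs(values):
    """Return (value, length) for every run of two or more equal samples."""
    runs, start = [], 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] != values[start]:
            if i - start >= 2:
                runs.append((values[start], i - start))
            start = i
    return runs


signal = [12, 80, 15, 14, 90, 13, 16, 11, 70, 10]
filtered = median_filter_1d(signal, window=3)

print(filtered)                # [12, 15, 15, 15, 14, 16, 13, 16, 11, 10]
print(equal_runs(signal))      # [] (no run in the input)
print(equal_runs(filtered))    # [(15, 3)] (one streak after filtering)
```

The input contains no repeated neighbors, yet a single pass with a window of 3 produces a run of three equal samples, and every output value already occurs in the input.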
Among the various methods used to detect median filtering of digital images,
the most frequently used are based on the streaking effect, first mentioned in [5].
Bovik in [6] performed a probability-based analysis of
the effect in median filtered signals, and the results showed that a median filter with a
square window produces blotches in two-dimensional signals. The streaking effect is an
undesirable consequence of applying a median filter to an image: the filter produces
streaks or blotches, which are runs of equal or near-equal values, and thereby
introduces artifacts such as false lines and contours. The streaking effect becomes
useful when testing whether an image has been median filtered, and many researchers
have used it for this purpose. In [7], Kirchner et al. measured the streaking effect by
taking the first-order difference of the image. They observed that the ratio of the
number of zeros to the number of ones might be used as a feature. In the same paper,
the authors also explored the application of SPAM [8] to detect median filtering. They
successfully used a second-order SPAM feature vector of 686 dimensions to detect median
filtering in high-quality JPEG compressed images. The work by Cao et al. [9] is also based
on streaking artifacts. They quantified the effect by measuring the number of zeros
in the image’s textured region after taking the first-order difference image. The work
considered both horizontal and vertical streaks introduced by the median filter. The
work by Yuan [10] is based on the fact that the median filter does not
introduce new values; only a redistribution of values occurs. It presented the median
filtering forensics (MFF) feature, a combination of five subfeatures
that are calculated using order statistics to measure the local dependence introduced
by the median filter. Kang et al. [11] used an autoregressive model for median
filter detection. Li et al. [12] proposed a single-dimension feature based on the
observation that the frequency residual obtained from an image that is median filtered
again and again decreases monotonically. Ahmed and Islam [4] introduced the
percentage streak area to measure the streaking effect characteristic of the median
filter. Further, they proposed a median filter detector based on machine learning
methods. This method uses the percentage streak area as a quantitative measure to
construct a feature vector based on the increase in the percentage of pixels involved
in streaks after median filtering an image.
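The zero-to-one ratio attributed to [7] above can be illustrated with a short sketch. This is one plausible reading of that description rather than the exact feature from [7]: it assumes horizontal differences only and counts absolute differences, and the function name `zero_to_one_ratio` and both toy images are hypothetical.

```python
def zero_to_one_ratio(image):
    """Ratio of 0-valued to 1-valued absolute horizontal first-order differences."""
    zeros = ones = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            diff = abs(right - left)
            if diff == 0:
                zeros += 1
            elif diff == 1:
                ones += 1
    return zeros / ones if ones else float("inf")


# Hand-made stand-ins: a textured patch and a streak-rich patch.
textured = [
    [10, 11, 13, 12],
    [11, 12, 12, 14],
    [9, 10, 11, 13],
]
streaky = [
    [11, 11, 12, 12],
    [11, 12, 12, 12],
    [10, 10, 11, 11],
]

print(zero_to_one_ratio(textured))  # 0.2
print(zero_to_one_ratio(streaky))   # 2.0
```

A patch with many runs of equal neighbors has many zero differences, so the ratio rises, which is the behavior the streaking-based detectors rely on.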
The percentage streak area (PSA) is a metric proposed in [4] to quantify the streaking
effect. The PSA is the percentage of pixels involved in streaks in an image. We denote
a gray-level image of M × N dimensions as I. We will assume that
$$\overrightarrow{psa}(I) = \overleftarrow{psa}(I)$$
and
$${\downarrow}psa(I) = {\uparrow}psa(I).$$
The PSA is calculated in the horizontal direction, $\overrightarrow{psa}$, and in the vertical
direction, ${\downarrow}psa$, and the two are merged to create $psa(I)$ by taking the weighted
mean of $\overrightarrow{psa}(I)$ and ${\downarrow}psa(I)$ as follows:
$$psa(I) = \frac{\overrightarrow{psa}(I)\sqrt{2} + {\downarrow}psa(I)\sqrt{2}}{2} \tag{2}$$
The above equation may also be written as
$$psa(I) = \left(\overrightarrow{StreakArea}(I) + {\downarrow}StreakArea(I)\right) \times \frac{100}{MN\sqrt{2}} \tag{3}$$
where $\overrightarrow{StreakArea}(I)$ is the total number of pixels in row-wise streaks in image
I, and ${\downarrow}StreakArea(I)$ represents the total number of pixels involved in column-wise
streaks in image I. For an image $I_w$ which is median filtered with a square
window of size w, $psa(I_w)$ may be calculated using Eq. 3 as follows:
$$psa(I_w) = \left(\overrightarrow{StreakArea}(I_w) + {\downarrow}StreakArea(I_w)\right) \times \frac{100}{MN\sqrt{2}} \tag{4}$$
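The step from Eq. (2) to Eq. (3) can be made explicit. The short derivation below assumes that each directional PSA is the corresponding directional streak area expressed as a percentage of the MN pixels of the image; this reading is an assumption, since the text does not state the directional definitions separately.

```latex
% Assumed directional definitions (not stated explicitly in the text):
\overrightarrow{psa}(I) = \frac{100}{MN}\,\overrightarrow{StreakArea}(I),
\qquad
{\downarrow}psa(I) = \frac{100}{MN}\,{\downarrow}StreakArea(I)

% Substituting into Eq. (2) and using \sqrt{2}/2 = 1/\sqrt{2}:
psa(I)
  = \frac{\sqrt{2}}{2}\left(\overrightarrow{psa}(I) + {\downarrow}psa(I)\right)
  = \frac{\sqrt{2}}{2}\cdot\frac{100}{MN}
    \left(\overrightarrow{StreakArea}(I) + {\downarrow}StreakArea(I)\right)
  = \left(\overrightarrow{StreakArea}(I) + {\downarrow}StreakArea(I)\right)
    \times \frac{100}{MN\sqrt{2}}
```

Under this assumption the two forms agree term by term, and Eq. (4) follows by replacing I with the filtered image.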
To analyze the presence of streaks and how they are affected by the median filter, a study
of streaks in original images and in median filtered images was conducted.
For this purpose, a total of 12,826 digital images from the standard image datasets
UCID [13], BOSS [14], and Dresden [15] were taken to construct a dataset of
natural images DS = {UCID, BOSS, Dresden}. DS consists of 1338 images
from UCID, 10,000 images from BOSS, and 1488 images from Dresden.
The median filtered dataset $DS_w$ was constructed by median filtering the original
dataset DS with different filter window sizes w = {3, 5, 7, 9, 11}. To study the effect
of average filtering, DS was filtered using the same window sizes, w, generating
$DS_{avg_w}$. Similarly, the datasets $DS_{gf}$ and $DS_{usm}$ were generated from DS by
Gaussian filtering and unsharp masking, respectively. We employed the
percentage streak area (PSA) described in [4] as a quantification parameter to
measure the streaking effect. In this study, we define a streak as a run of pixels with
the same or almost the same intensity value. A streak of length one
consists of 2 pixels of the same or almost the same intensity. The quantity psa(I) measures
the percentage of pixels of an image involved in streaks and may be applied to measure
the streaking effect.
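Eq. 3 and the streak definition above can be turned into a small sketch. It is not the implementation from [4]: the tolerance `tol` for "almost the same" intensity, the rule that a run is extended while neighboring pixels differ by at most `tol`, and the names `streak_pixels` and `psa` are all assumptions made for illustration.

```python
import math


def streak_pixels(line, tol=0):
    """Count pixels lying in runs of two or more neighbors that differ by at most tol."""
    count, run = 0, 1
    for prev, cur in zip(line, line[1:]):
        if abs(cur - prev) <= tol:
            run += 1
        else:
            if run >= 2:
                count += run
            run = 1
    if run >= 2:  # close a run that reaches the end of the line
        count += run
    return count


def psa(image, tol=0):
    """Percentage streak area of a gray-level image, following Eq. 3."""
    m, n = len(image), len(image[0])
    row_area = sum(streak_pixels(row, tol) for row in image)        # row-wise streaks
    col_area = sum(streak_pixels(col, tol) for col in zip(*image))  # column-wise streaks
    return (row_area + col_area) * 100 / (m * n * math.sqrt(2))


image = [
    [5, 5, 9, 3],
    [5, 7, 9, 9],
    [1, 7, 2, 9],
]
print(round(psa(image), 2))
```

In this toy image, 4 pixels lie in row-wise streaks and 8 in column-wise streaks, so Eq. 3 gives 12 × 100 / (12 × √2), roughly 70.71.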
To study the streaks in original and unaltered images, the dataset DS was processed
to extract streaks and PSA using Eq. 3. As an example, Fig. 1 highlights the horizontal
streaks in the original and median filtered versions of UCID000107.tif from
the UCID dataset. Figure 1a shows the original image, and Fig. 1b shows the median
filtered image. Figure 1c shows horizontal streaks in the original image, and Fig. 1d
shows streaks in the median filtered image. The visual difference between Fig. 1c
and d clearly shows an increase in streaking after applying the median filter. Table 1
shows statistics related to streaks in the image UCID000107.tif. Similar statistics
were obtained for every image in dataset DS, indicating an increase in streak area
after median filtering for all but a very small number of images. Table 2 shows the
number of such images from each dataset. Upon investigation,
it was found that these images contain highly saturated regions such as a black sky
with a moon in the foreground. Figure 3 shows the mean percentage streak area
for the dataset DS. The mean percentage streak area for natural images is small
compared to that for median filtered images of any window
size. Also, the percentage streak area for images filtered with a larger window
is higher than that of images filtered with a smaller window.
It can be inferred from Fig. 3 that significant streaks are present in original
and unaltered images, but the increase in pixels involved in streaks is also significant
after median filtering. The dataset DS was also studied for the
effect on PSA of repeated application of the median filter with a window size of 3 × 3.
The first five differences in PSA between consecutive repetitions of the median filter
for all images in the UCID dataset are plotted in Fig. 2. diff(1) is the difference
between the PSA of the 1-time median filtered version of an image and the PSA of the
original image. Similarly, diff(2) is the difference in PSA between the 2-times and
1-time median filtered versions of the image, and so on. All images from the datasets
show the same trend.
For unfiltered and original images, diff(1) is very large compared to diff(2),
diff(3), and so on. Similar results are obtained for the Dresden
dataset and the BOSS dataset. For all three datasets, the increase in percentage streak
area (PSA) is monotonic. The rise in PSA is large when an image is median filtered
for the first time, but on further application of the median filter, the PSA increases slowly.
Table 2 lists the number of images that do not follow the monotonic behavior when median
filtered with window sizes of 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11. All
other images in the dataset considered for the study show a monotonic increase in
the streaking effect on repeated median filtering.
When an image is filtered with other popular image processing filters such as the
average filter, Gaussian filter, and unsharp masking filter,
the streaking also changes. Figure 4 shows the mean percentage
streak area for the UCID dataset when filtered using the median filter, the average filter,
the Gaussian filter, and the unsharp masking filter. The results clearly show that the
mean percentage streak area increases significantly more for the median filter than
for the average filter and Gaussian filter. The increase is small for the Gaussian filter,
and the trend is reversed for unsharp masking, for which the percentage streak area
decreases. The percentage streak area is highest for the median filter and is a
characteristic feature for detecting the application of a median filter. Percentage streak
area as a quantification measure of streaking needs further evaluation, and a qualitative
and quantitative comparison is required.
Table 2 Outliers
Datasets   No. of images   3 × 3   5 × 5   7 × 7   9 × 9   11 × 11
BOSS       10,000          2       0       3       5       12
UCID       1338            0       0       2       0       0
Dresden    1488            0       0       0       0       0
Total      12,826          2       0       5       5       12
Fig. 2 Difference in percentage streak area
Fig. 4 Average percentage streak area after application of various operations on dataset
4 Conclusions
The paper investigates the streaking effect in natural images. The percentage streak
area (PSA) has been used as a metric to quantify streaking in an image. The standard
image datasets UCID, BOSS, and Dresden have been used in the study. Our
work shows that significant streaking is present in original and unaltered images as
well. The investigation indicates that although authentic images contain a considerable
amount of streaking, median filtering increases the streaking significantly,
and this increase can be detected by the percentage streak area (PSA). The percentage
streak area also increases on application of the average filter and Gaussian filter, but
the increase is significantly smaller than when the same image is
median filtered, and for the unsharp masking filter the percentage streak area decreases.
In conclusion, the streaking effect can be quantified using the percentage
streak area, which is a promising feature for future studies in the area.
References
1. Ferrara, P., Bianchi, T., Rosa, A.D., Piva, A.: Image forgery localization via fine-grained anal-
ysis of CFA artifacts. IEEE Trans. Inf. Forensics Secur. 7(5), 1566–1577 (2012)
2. Cao, G., Zhao, Y., Ni, R.: Forensic identification of resampling operators: a semi non-intrusive
approach. Forensic Sci. Int. 216(1), 29–36 (2012)
3. Neelamani, R., De Queiroz, R., Fan, Z., Dash, S., Baraniuk, R.G.: Jpeg compression history
estimation for color images. IEEE Trans. Image Process. 15(6), 1365–1378 (2006)
4. Ahmed, S., Islam, S.: Median filter detection through streak area analysis. Digit. Invest. 26, 100–
106 (2018). [Online]. https://www.sciencedirect.com/science/article/pii/S1742287617303109
5. Justusson, B.I.: Median Filtering: Statistical Properties, pp. 161–196. Springer Berlin Heidel-
berg, Berlin, Heidelberg (1981). [Online]. http://dx.doi.org/10.1007/BFb0057597
6. Bovik, A.C.: Streaking in median filtered images. IEEE Trans. Acoust. Speech Signal Process.
ASSP-35(4), 181–194 (1987)
7. Kirchner, M., Fridrich, J.: On detection of median filtering in digital images. IS&T/SPIE
Electron. Imaging 110–754 (2010)
8. Pevny, T., Bas, P., Fridrich, J.J.: Steganalysis by subtractive pixel adjacency matrix. IEEE Trans.
Inf. Forensics Secur. 5(2), 215–224 (2010)
9. Cao, G., Zhao, Y., Ni, R., Yu, L., Tian, H.: Forensic detection of median filtering in digital
images. In: IEEE International Conference on Multimedia and Expo (ICME), pp. 89–94 (2010)
10. Yuan, H.-D.: Blind forensics of median filtering in digital images. IEEE Trans. Inf. Forensics
and Secur. 6(4), 1335–1345 (2011)
11. Kang, X., Stamm, M.C., Peng, A., Liu, K.J.R.: Robust median filtering forensics based on
the autoregressive model of median filtered residual. In: Proceedings of the 2012 Asia Pacific
Signal and Information Processing Association Annual Summit and Conference, Dec 2012,
pp. 1–9 (2012)
12. Li, W., Ni, R., Li, X., Zhao, Y.: Robust median filtering detection based on the difference of
frequency residuals. In: Multimedia Tools and Applications, pp. 1–19 (2018)
13. Schaefer, G., Stich, M.: Ucid: an uncompressed color image database. Electron. Imaging 2004,
472–480 (2003)
14. Bas, P., Filler, T., Pevny, T.: Break our steganographic system: the ins and outs of organizing
boss. In: International Workshop on Information Hiding, pp. 59–70 (2011)
15. Gloe, T., Böhme, R.: The Dresden image database for benchmarking digital image forensics.
J. Digit. Forensic Pract. 3(2–4), 150–159 (2010)
Improving Topographic Features
of DEM Using Cartosat-1 Stereo Data
Abstract In the current study, we have generated a Digital Elevation Model (DEM)
using high spatial resolution stereo images (2.5 m spatial resolution) of the Cartosat-1
satellite and examined the terrain’s quantitative topographic features. Firstly,
topographic features such as elevation, slope gradient, aspect, hill shade, and contour
map are derived from the generated DEM. We then performed a comparative evaluation of
the accuracy of the topographic features of the DEM generated through stereo images and
of the freely accessible Cartosat-1 DEM data (30 m spatial resolution) against reference
DEMs such as the Shuttle Radar Topography Mission (SRTM) DEM and ALOS global
DSM (AW3D30). The visual analysis of all the DEMs is done through a surface profile
map. The surface profile map of the DEM generated through stereo images shows a good
correlation with the reference DEMs in all regions of the profile map. This study reveals
that the Cartosat-1 DEM generated through stereo images gives better accuracy than
the freely accessible Cartosat-1 DEM.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 177
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_17
there is an immense requirement for precise and accurate DEMs covering the world’s
entire surface.
DEMs can be extracted from numerous data sources and techniques such as
the Interferometric Synthetic Aperture Radar (InSAR) methods, photogrammetric
methods, aerial laser scanning, and ground surveying methods [4, 5]. In scientific
journals, a DEM is also referred to as a Digital Terrain Model (DTM) or a Digital Surface
Model (DSM). A DSM characterizes the Earth’s terrain together with
all objects on it, such as the ground surface, vegetation, and buildings. A DSM
is very useful for landscape modeling: land use and land cover planning, environmental
applications, construction, and many others [6]. Existing DEMs
such as the Cartosat DEM, the Shuttle Radar Topography Mission (SRTM) DEM, and the ALOS
global DSM (AW3D30) are used for obtaining topographic feature maps.
This paper generates a high Spatial Resolution (SR) Cartosat-1 DEM (10-m
SR) from stereo pair images of the Cartosat-1 satellite. The DEM generated through
stereo images and the freely accessible Cartosat-1 DEM are compared in detail
against reference DEMs, namely the SRTM DEM and the ALOS global DSM (AW3D30),
to estimate the quality and accuracy of the generated DEM [7, 8]. Freely available Cartosat-1
DEMs show considerable drawbacks in consistency, availability, degree of resolution,
and coverage. This study shows that the DEM generated through stereo
pair images is closer to the reference DEMs than the freely accessible Cartosat-1 DEM (30-m
resolution).
The rest of the paper is organized as follows: Sect. 2 describes
the study region and the data-sets used. Section 3
presents the methodology used in this work, whereas Sect. 4 covers a discussion on
the results. The paper finishes with Sect. 5, which is the Conclusion. Henceforth, the
DEM extracted through stereo pair images and freely accessible Cartosat-1 DEM
will be designated as DEM-CART and DEM-CART1, respectively.
2 The Background
In the current study, DEM comparison has been performed for the Ambegaon taluka
of Pune district of Maharashtra, India. The northwest part of the study area contains
rugged mountains, undulating regions, and a high slope gradient. The region of
interest is about 1039 km². The study area lies between latitude
19° 3′ 0″ N and 19° 6′ 0″ N and longitude 73° 24′ 0″ E and 73° 27′ 0″ E, as shown in
Fig. 1.
The data-sets were downloaded from the following websites:
(1) DEM-CART1 {Bhuvan—geoportal of Indian Space Research Organization
(ISRO)}
(2) SRTM-DEM {www.usgs.org}
(3) ALOS World 3D global DSM {https://www.eorc.jaxa.jp/ALOS/en/aw3d30}.
Improving Topographic Features of DEM Using … 179
Cartosat-1 stereo data with a 7.5 m vertical resolution and 2.5 m SR are used to
extract the DEM. The Rational Polynomial Coefficient (RPC) file and the geometric model
are the essential components required for extracting a DEM from stereo pair images
of the Cartosat-1 satellite. The Cartosat-1 satellite was launched by ISRO on May 5, 2005.
It has two panchromatic cameras for stereo image viewing and a
Global Positioning System (GPS) receiver to position the areas under consideration.
The two panchromatic cameras are tilted 5° (backward-viewing) and 26°
(forward-viewing) from nadir, respectively. The time difference between
capturing the two stereo images is 52 s. The Cartosat-1 satellite has along-track stereo viewing
capabilities with a swath width of 30 km.
Stereo images of the Ambegaon area from the Cartosat-1 sensors were acquired from the
National Remote Sensing Centre (NRSC), Hyderabad, India. The DEM extracted
using the stereo-pair images is shown in Fig. 2a. DEM-CART1 of the study region
downloaded from the Bhuvan website is shown in Fig. 2b.
The SRTM was the first mission to use a space-borne single-pass InSAR
instrument to create a worldwide DEM of the Earth’s terrestrial surface, with moderate
horizontal and vertical accuracies of ± 30 m and ± 16 m, respectively. The SRTM mission
was a revolution in Remote Sensing (RS) of geography, creating the most
comprehensive high-resolution DEM worldwide. The mission, a collaboration
between the NGA and NASA, gathered interferometric sensor data
that the Jet Propulsion Laboratory (JPL) used to produce a near-worldwide DEM (80%
of Earth’s land) for latitudes below 60°. The SRTM DEM of the area under
consideration is shown in Fig. 2c.
Fig. 2 Study area: a Cartosat-1 DEM generated through stereo pair images (DEM-CART),
b Cartosat-1 DEM (DEM-CART1) downloaded from the Bhuvan web portal, c SRTM DEM, and d ALOS
World 3D global DSM
Since 2014, the Japan Aerospace Exploration Agency (JAXA) has conducted a
project to produce the accurate worldwide digital 3D model “ALOS World 3D”
(AW3D), covering the global land regions, using 3 million
scene records obtained by the PRISM panchromatic optical sensor on the Advanced
Land Observing Satellite “DAICHI” (ALOS).
The AW3D digital 3D model contains a DEM or DSM that characterizes
land topography at approximately 5 m SR, together with orthorectified PRISM nadir-viewing
images. The AW3D DSM data-set was further processed, and the “ALOS
World 3D-30 m” (AW3D30) DSM data-set was released, which has approximately
30 m SR. The digital 3D model has been used in various applications like damage
estimation after natural calamities, map development, infrastructure planning, and water
resource studies. The ALOS DSM of the area under consideration is shown in Fig. 2d.
3 Methodology
of the Cartosat-1 DEM. Conventional noise reduction filters such as the weighted
average filter, median filter, sharpen filter, Lee sigma filter, and local sigma filter are
used to remove noise from the stereo images in the preprocessing step. After that, the
DEM is created from the stereo images. The process of extracting DEM-CART in
LPS software includes multiple steps: generating a block file, adding and
editing the frame, providing the RPC (exterior and interior orientation), automated
tie point generation, block triangulation, and generation of the DEM.
The DEM extraction procedure in LPS software, using stereo images, starts
with generating a block file and describing the geometric model, after which a raster image
is added. The new raster image is corrected by providing its RPC file; each Cartosat-1
stereo image has a corresponding RPC file. The rational polynomial sensor model
relates the image space (row and column) to the latitude, longitude, and altitude of
the terrain. We used automatic tie point collection to select the tie points
on the stereo pair: the LPS software selects a point in one image and finds the
corresponding point in the second image.
The block triangulation step of DEM extraction defines the mathematical relation
between the images, the sensor model, and the ground. Block triangulation of the frame
is done using the automatic tie points of the images. The DEM-CART is
extracted after processing the block file.
In this work, the DEM is extracted using the automatic tie point selection method only.
This method produces a
good-quality DEM from the stereo images without gathering any extra information
(like Ground Control Points (GCPs) of the region) for the area under study.
Slope map, aspect map, hill shade map, and contour map are the crucial topographic
features that effectively represent the landscape structure and relief of the terrestrial
surface. Topographic features have an enormous influence on the accuracy and quality
of the DEMs. The effects of the spatial distribution of vertical errors on the topographic
feature maps were examined for the resultant DEM-CART by comparing
elevation inconsistencies, mean value, and standard deviation (SD) of the DEMs
with respect to topographic features obtained from the reference DEM. Topographic
feature maps for all the DEMs were generated using the Spatial Analyst toolbox
of ArcGIS software.
The quality and accuracy of DEM-CART, DEM-CART1, SRTM DEM, and ALOS
World 3D global DSM are compared through elevation value and topographic
features produced in ArcGIS 10.8 software, as shown in Fig. 3.
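As an illustration of how slope, aspect, and hill shade follow from an elevation grid, the following minimal sketch uses central differences on a toy grid. It is not the ArcGIS implementation used in this work; the function names, the toy grid, and the default sun position (azimuth 315°, altitude 45°) are assumptions made for this example.

```python
import math

def slope_aspect(dem, cell_size):
    """Slope and aspect (degrees) at the interior cells of a row-major grid.

    Assumes row 0 is the northern edge and columns increase eastward.
    Border cells stay None because central differences need both neighbours.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    aspect = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)  # eastward
            dz_dy = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell_size)  # northward
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
            # Compass bearing of the downslope direction, clockwise from north
            aspect[r][c] = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

def hillshade(slope_deg, aspect_deg, azimuth=315.0, altitude=45.0):
    """Illumination (0-255) of one cell for a sun at the given azimuth/altitude."""
    zenith = math.radians(90.0 - altitude)
    slope = math.radians(slope_deg)
    shade = (math.cos(zenith) * math.cos(slope)
             + math.sin(zenith) * math.sin(slope)
             * math.cos(math.radians(azimuth - aspect_deg)))
    return 255.0 * max(shade, 0.0)

# Toy grid with 10 m cells that rises 10 m per cell towards the east
dem = [[10.0 * c for c in range(5)] for _ in range(5)]
slope, aspect = slope_aspect(dem, cell_size=10.0)
print(slope[2][2], aspect[2][2], hillshade(slope[2][2], aspect[2][2]))
```

On this toy grid the surface faces west, so the aspect of the interior cells is a westward bearing. Production tools typically use a 3 × 3 weighted kernel rather than plain central differences, so absolute values can differ slightly from this sketch.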
Fig. 3 Topographic features maps of all the DEMs a slope map, b aspect map, and c hill shade
map
For comparison, statistical parameters are generated for each DEM.
The elevation values are compared through the minimum elevation, maximum
elevation, mean, and SD of each DEM, as given in Table 1. The elevation
statistics of DEM-CART are closer to those of the reference DEMs (SRTM DEM and ALOS
DSM) than those of DEM-CART1.
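The comparison of summary statistics can be sketched as follows. The elevation samples are hypothetical and are not the values of Table 1; the single "gap" score (sum of absolute differences of the four statistics) is an assumption introduced only to make the notion of "closer to the reference" concrete, since the comparison in this work is made by inspecting the table.

```python
import statistics

def elevation_stats(elevations):
    """Minimum, maximum, mean, and population SD of elevations in metres."""
    return {
        "min": min(elevations),
        "max": max(elevations),
        "mean": statistics.fmean(elevations),
        "sd": statistics.pstdev(elevations),
    }

def stats_gap(candidate, reference):
    """Sum of absolute differences between two statistic dictionaries (metres)."""
    return sum(abs(candidate[key] - reference[key]) for key in reference)

# Hypothetical elevation samples in metres (NOT taken from Table 1)
reference = [610.0, 640.0, 700.0, 820.0, 905.0]
candidate_a = [612.0, 638.0, 703.0, 818.0, 909.0]
candidate_b = [590.0, 655.0, 730.0, 790.0, 940.0]

ref_stats = elevation_stats(reference)
for name, dem in (("A", candidate_a), ("B", candidate_b)):
    print(name, round(stats_gap(elevation_stats(dem), ref_stats), 2))
```

A smaller gap means the candidate's statistics sit closer to the reference. Whether the population or the sample SD matches the software output is not stated in the text, so the choice of `pstdev` here is also an assumption.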
To compare the quality and accuracy of DEM-CART, DEM-CART1, SRTM
DEM, and ALOS World 3D global DSM through topographic features, the slope
map, aspect map, and hill shade map were created for each DEM individually in ArcGIS
10.8 software. The Surface tools of the Spatial Analyst toolbox are used for
generating the topographic feature maps of all the DEMs. A comparison of statistical
parameters for the topographic features is given in Table 1. The statistical values of
the topographic features for DEM-CART are closer to the reference DEM values
than those of DEM-CART1.
The visual comparison of DEM-CART, DEM-CART1, SRTM DEM, and
ALOS World 3D global DSM was done with the help of the contour map and surface
profile map of each DEM. Contour maps of the study area with a 50 m contour interval are
generated using ArcGIS software for all the DEMs, as shown in Fig. 4. The contours
of DEM-CART were comparatively similar to those of the reference DEMs. The counts of the
different contour lines also support the accuracy of DEM-CART (Table 2).
Fig. 4 Contour map of Cartosat-1 DEM (DEM-CART1), DEM-CART, SRTM DEM, and ALOS
World 3D global DSM
Table 2 Statistical values of different contour lines for all the DEMs

| Contours (m) | DEM-CART1 | DEM-CART | SRTM DEM | ALOS DSM |
|---|---|---|---|---|
| 200–550 | 596 | 378 | 147 | 112 |
| 600–800 | 1396 | 677 | 555 | 587 |
| 850–1200 | 508 | 329 | 431 | 373 |
| Total counts | 2500 | 1384 | 1133 | 1072 |
Fig. 5 Surface profile maps of Cartosat-1 DEM (DEM-CART1), SRTM DEM, ALOS World 3D
global DSM, and generated Cartosat-1 DEM (DEM-CART), respectively
The surface profile maps of the study region are generated using ArcGIS software for
all the DEMs, as shown in Fig. 5. The surface profiles were plotted (elevation value
versus distance in km) to check the height variation along the profile lines. The
profiles show that some regions (red circles) of DEM-CART1 deviate more
from the reference DEMs than the elevation profile of DEM-CART does.
The elevation profile of DEM-CART shows a good correlation with the reference
DEMs in all regions of the profile map.
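The profile comparison above is visual. If a numerical check were wanted, one possible sketch is to sample elevations at the same distances along the profile line and compute the Pearson correlation and the root-mean-square error against the reference. The profile values below are hypothetical, and both helper functions are illustrative rather than part of the workflow described in this work.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rmse(a, b):
    """Root-mean-square elevation difference (metres) between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical elevations (metres) sampled at the same distances along a profile
reference_profile = [620.0, 650.0, 700.0, 760.0, 800.0, 780.0, 720.0]
candidate_profile = [622.0, 648.0, 705.0, 757.0, 803.0, 776.0, 723.0]

print(round(pearson(reference_profile, candidate_profile), 4))
print(round(rmse(reference_profile, candidate_profile), 2))
```

Correlation alone ignores a constant vertical offset, which is why the sketch reports the RMSE alongside it.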
5 Conclusion
In this study, the topographic features of the DEM extracted through Cartosat-1 stereo images
(DEM-CART) and of DEM-CART1 are compared and validated against a reference
DEM for the area under consideration. The statistics of the terrain-related attributes
are calculated with the help of the ArcGIS software. It is observed that the elevation
values and topographic feature attribute values of DEM-CART are closer to the
reference DEM. The main reasons are (i) DEM-CART is generated through high-resolution
Cartosat-1 stereo images, (ii) a sufficient number of tie points are used,
and (iii) noise reduction filters are used for removal of the noise in the preprocessing
steps.
DEM-CART1 shows a larger deviation from the reference DEM’s elevation
values and topographic feature attribute values. DEM-CART provided useful
and realistic information about the area’s topography, virtually the same
as that of the reference DEM. The visual analysis of the DEMs also confirms the
quality and accuracy of DEM-CART. The contour map of DEM-CART with a 50 m
contour interval is similar to that of the reference DEM.
The surface profile graph of DEM-CART1 shows a larger difference
in elevation values along the profile distance compared to the reference DEM. The
surface profile map of DEM-CART shows a good correlation with the reference DEM
in all regions of the profile map. The DEMs’ topographic feature maps are useful for
several studies like hydrology, drainage network, groundwater mapping, landslide
hazards mapping, runoff modeling, and watershed analysis.
References
1. Yin, Z.Y., Wang, X.: A cross-scale comparison of drainage basin characteristics derived from
digital elevation models. Earth Surf Process. Landf. 24, 557–562 (1999)
2. Bhatt, S., Ahmed, S.A.: Morphometric analysis to determine floods in the Upper Krishna basin
using Cartosat DEM. Geocarto. Int. 29, 878–894 (2014)
3. Gopinath, G., Swetha, T.V., Ashitha, M.K.: Automated extraction of watershed boundary and
drainage network from SRTM and comparison with Survey of India toposheet. Arab. J. Geosci.
7, 2625–2632 (2014)
4. Giribabu, D., Kumar, P., Mathew, J., Sharma, K.P., Krishna Murthy, Y.V.N.: DEM generation
using Cartosat-1 stereo data: issues and complexities in Himalayan terrain. Eur. J. Remote Sens.
46, 431–443 (2013)
5. Singh, V.K., Ray, P.K.C., Jeyaseelan, A.P.T.: Orthorectification and digital elevation model
(DEM) generation using Cartosat-1 satellite stereo pair in Himalayan Terrain. J. Geogr. Inf.
Syst. 2, 85–92 (2010)
6. Hobi, M.L., Ginzler, C.: Accuracy assessment of digital surface models based on WorldView-2
and ADS80 stereo remote sensing data. Sensors 12, 6347–6368 (2012)
7. Pakoksung, K., Takagi, M.: Assessment and comparison of digital elevation model (DEM)
products in varying topographic, land cover regions and its attribute: a case study in Shikoku
Island, Japan. Model Earth Syst. Environ. (2020)
8. Agarwal, R., Sur, K., Rajawat, A.S.: Accuracy assessment of the CARTOSAT DEM using robust
statistical measures. Model Earth Syst. Environ. 6, 471–478 (2020)
Active Noise Cancellation System
in Automobile Cabins Using
an Optimized Adaptive Step-Size FxLMS
Algorithm
1 Introduction
The noise generated by an Internal Combustion (IC) engine comprises several components
from various sources. Under normal operating conditions, the noise generated by
combustion lies mostly in the 100–1000 Hz range, which falls in the narrowband spectrum.
Hence, this justifies applying a single-channel FxLMS algorithm to minimize the
complexity of computations and installation costs. Another critical issue is related to
specific nonlinear characteristics of the noise generated by the IC Engine system. The
origin of the primary nonlinearity effects can be from the following three sources:
the primary source of noise, actuators and system-based sensors, and the paths of
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 187
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_18
188 A. Bisht and H. Y. Patil
acoustic propagation. Given all these factors and implementation considerations, such
as the physical constraints, complexity of the system design, and cost reduction, we
have decided to use an upgraded Adaptive Step-Size FxLMS algorithm. The
FxLMS algorithm is perhaps the most popular of all adaptive
algorithms that can be used to update the controller weights. The fixed step-size
version of this algorithm provides satisfactory performance, but at the expense of
higher computational complexity. To obtain an algorithm
that assures a high convergence rate in both dynamic and stationary environments,
using an adaptive step-size function in the original algorithm seems more appropriate
and attractive [1]. To apply the algorithm, an estimate of the
secondary propagation path is obtained and then used to filter the reference noise
signal. Following this, it is used in a generalized secondary propagation path and
stationary non-deterministic noise field. These effects have been analyzed only under the
assumption that the secondary path model is perfect [2, 3]. The complexity introduced
by the x-filtering blocks poses a challenging problem and disrupts
the implementation of the system.
The above problem has provided ample motivation to seek a new structure
for the system in which these disruptions and discrepancies are
mitigated or eliminated [4]. Active Noise Cancellation (ANC) of narrowband white
noise is instrumental in reducing noise signals with discrete frequency
characteristics. Narrowband white noise, which ranges
from a few hundred Hz to less than a few thousand Hz, is difficult to eliminate
by passive methods alone, but it can be removed more
effectively by state-of-the-art ANC techniques based on destructive interference
[5]. An algorithm with a fixed step size might fail to respond optimally
to the parameters of a time-variant channel, which may result in poor performance.
To overcome this limitation, variable step-size methods have
been developed, and many adaptive step-size functions have been proposed to
overcome the limitations of existing structures. For this reason, a small step
size is usually maintained in a bid to ensure better convergence
[6]. Here, we develop and implement an
algorithm for an optimized variable step-size and tap-length active noise cancellation
system using FxLMS for randomly generated narrowband internal combustion
engine noise. Using this approach, we aim to achieve a better convergence
rate than previous system structures [7, 8] and an increase in overall performance
while also respecting the power and computational cost constraints. Section 2 reviews
the design features of a standard feedforward ANC system. Section 3 discusses
existing methodologies involving the FxLMS algorithm, while Sect. 4
describes the design parameters that we have taken into consideration
while constructing our proposed design. In Sect. 5, we analyze the simulation
results, and Sect. 6 draws concluding remarks
concerning the proposed system, which reaffirm the feasibility and
functionality of the design.
Active Noise Cancellation System … 189
The transfer function of the secondary propagation path plays a critical role in
generating anti-noise in ANC applications, since
it is nonlinear. It introduces a delay, which can lead to instability in
the LMS algorithm. This problem is addressed by the robust and
efficient FxLMS algorithm [9], which takes an estimate of the secondary
propagation path into account. The algorithm is also flexible, since it can
be used in both feedback and feedforward structures. In a feedforward
ANC system, P(z) denotes the primary propagation path, i.e., the acoustic
response from the reference noise source to the error sensor, and S(z)
represents the secondary propagation path. Since the effect of the secondary
propagation path needs to be canceled out, we need to measure the secondary
impulse response denoted by S(z). The secondary signal y(i) is represented by (1)
[9].
where the coefficient vector w(i) and the signal vector x(i) have length L, and W(z) is
the FIR filter at discrete time i. In the FxLMS algorithm, these coefficients
are updated in the following manner:
$$w_l(i+1) = w_l(i) + \mu\, x'(i-l)\, e(i), \quad l = 0, 1, \ldots, L-1, \quad \mu > 0 \tag{2}$$
where $x'(i)$ is the reference signal filtered through the estimate of the secondary path.
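The coefficient update of Eq. (2) can be illustrated with a small single-channel simulation. This is a minimal sketch, not the system evaluated in this paper: the path coefficients, the tone frequency, the filter length, and the step size are hypothetical, the secondary path estimate is assumed to be perfect, and the residual is taken as e(i) = d(i) − s(i) ∗ y(i).

```python
import math

def fir(coeffs, history):
    """Dot product of FIR coefficients with a most-recent-first history."""
    return sum(c * h for c, h in zip(coeffs, history))

def fxlms(x, p, s, s_hat, taps=8, mu=0.01):
    """Run single-channel FxLMS over reference x and return the residual e(i)."""
    w = [0.0] * taps                               # controller W(z)
    x_hist = [0.0] * max(len(p), taps, len(s_hat))  # x(i), x(i-1), ...
    y_hist = [0.0] * len(s)                        # controller outputs y(i), ...
    xf_hist = [0.0] * taps                         # filtered reference x'(i), ...
    errors = []
    for sample in x:
        x_hist = [sample] + x_hist[:-1]
        d = fir(p, x_hist)                 # noise reaching the error sensor
        y = fir(w, x_hist)                 # secondary signal from the controller
        y_hist = [y] + y_hist[:-1]
        e = d - fir(s, y_hist)             # residual after the secondary path
        xf_hist = [fir(s_hat, x_hist)] + xf_hist[:-1]
        # Eq. (2): w_l(i+1) = w_l(i) + mu * x'(i-l) * e(i)
        w = [wl + mu * xfl * e for wl, xfl in zip(w, xf_hist)]
        errors.append(e)
    return errors

def power(segment):
    return sum(v * v for v in segment) / len(segment)

freq = 0.05                                # cycles per sample (hypothetical tone)
x = [math.sin(2 * math.pi * freq * n) for n in range(3000)]
p = [0.0, 0.0, 0.8, 0.4, 0.2]              # hypothetical primary path P(z)
s = [0.0, 0.5, 0.25]                       # hypothetical secondary path S(z)
errors = fxlms(x, p, s, s_hat=s)           # assumes a perfect estimate of S(z)
print(power(errors[:200]), power(errors[-200:]))
```

Comparing the residual power at the start and at the end of the run gives a rough picture of convergence for this toy tone. With an imperfect estimate or a larger step size the behavior can differ considerably.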
3 Existing Methodologies
This method of online secondary propagation path modeling was proposed in [10].
This basic method generates a random noise signal for training purposes. In this
paper, we use an adaptive filter to generate a secondary impulse response that will
model S(z) while the ANC system is in operation. As per this algorithm, the signal
e(i), which denotes the noise residue, is expressed as
where v(i) is the random AWGN signal generated internally and then injected at the
output of the control filter W(z). Here, a finite impulse response
(FIR) filter $\hat{S}(z)$ models the secondary impulse response. The error signals for the
modeling FIR filter $\hat{S}(z)$ and the controller W(z) are represented,
respectively, as follows:
$$f(i) = [d(i) - y'(i)] + [v'(i) - \hat{v}'(i)] \tag{7}$$
$$g(i) = [d(i) - y'(i)] + v'(i) \tag{8}$$
The modeling FIR filter $\hat{S}(z)$ has its coefficients updated as follows:
where $\mu_e(i)$ is the step-size parameter. Following this, the controller
(W(z)) coefficients are updated in the given manner:
The reference signal is filtered through $\hat{S}(z)$ to derive the LMS algorithm’s input
$$x'(i) = \hat{s}^T(i) * x_K(i) \tag{11}$$
where
Several other techniques have since been presented in a bid to outperform Eriksson’s
method [10–15]. Among these, the method presented in [12] shows
great promise. Akhtar’s approach can be described as a worthy upgrade over
Eriksson’s system [10]. This method models the filter using the VSS-LMS
algorithm and uses f(i) as the error signal for both
W(z) and $\hat{S}(z)$. The coefficients of the modeling filter $\hat{S}(z)$
are updated with the step-size parameter $\mu_e(i)$,
which is calculated using the following
steps:
• In the beginning, the power computation of error signals f(i) and e(i) is performed:
4 Proposed Methodology
which unfortunately gets offset by the high computational costs, which is where the
FxLMS algorithm comes into the picture. It is widely preferred since it
finds use in a range of industrial and commercial
applications. In a real-time environment, there is a secondary propagation path between
the secondary loudspeaker and the error microphone. The error microphone can
be in either static mode or dynamic mode. This secondary propagation
path introduces an unwanted delay in the error signal, which leads to a
synchronization error between the reference signal and the error signal.
The FxLMS algorithm has been developed specifically to compensate for this delay and
nullify its effect on the reference signal. Once compensated,
the reference signal is used as the adaptive filter input.
Feedforward ANC is most useful when a reference signal with uncorrelated disturbance
is available. Such a reference signal is not directly available in feedback
ANC, so it has to be generated internally to be used as the
adaptive filter input. The FxLMS algorithm also has the advantage of faster convergence,
which it owes to the output signal’s prefiltering before it is sent through the secondary
propagation path. With a static step size, however, this algorithm imposes
a heavy burden in the form of increased computational complexity. Hence, to
obtain an algorithm that offers a fast convergence rate in both static
and dynamic environments, we incorporate an
adaptive step-size function in the proposed design.
Here, we use FIR filters to model the Primary Propagation Path P(z),
the Controller C(z), the Secondary Propagation Path S(z),
and the Estimated Secondary Propagation Path $\hat{S}(z)$. In the
system given in Fig. 1, the source noise signal x(k)
propagates to the sensor through the primary propagation path, which is a fluid
medium represented by P(z). The arriving noise signal is measured by the
sensor as $y_p(k)$. To cancel out this noise,
another ‘noise’ signal $y_\omega(k)$ is generated using the controller C(z) and the noise signal
x(k). In other words, we need to model the Controller on the lines of the primary
propagation path denoted by P(z).
A least mean square adaptive step-size algorithm is applied to adjust the controller
coefficients dynamically. However, another fluid propagation medium,
represented by S(z), lies between the sensor and the actuator. This is the
secondary propagation path described earlier. The main objective is to ensure that the newly
generated noise signal destructively interferes with the original noise signal x(k).
Therefore, to obtain a practical solution and ensure efficient cancellation
of narrowband noise, we compensate the controller coefficient
adjustment using $\hat{S}(z)$, a modeling FIR filter that provides an estimate of S(z).
• In Fig. 1, which illustrates the design of the proposed feedforward active
noise cancellation system, P(z) represents the primary propagation path, which
models the acoustic response between the reference and error microphones, and S(z)
represents the secondary propagation path.
• The Controller function represented by C(z) is convolved with S(z) in the
secondary propagation path to eliminate d(k). The objective of C(z) is to reduce the mean
square error of e(k), which is a significant determinant of the algorithm’s
accuracy.
• Background noise x(k), which in this instance is additive white Gaussian noise
(AWGN), is random and passes through the primary propagation path P(z). Since
its characteristics are uncorrelated with those of the unwanted signals in the channel
medium, the signal $y_p(k)$ is generated at the other end of P(z), from which we can
acquire the residual error signal e(k) as per Eqs. (17, 18):
$$e(k) = y_p(k) - y_s(k) \tag{17}$$
$$y_s(k) = s(k) * y_\omega(k) \tag{18}$$
• The acoustic path passing through S(z) and C(z) is estimated using a suitable
adaptive filter by injecting the same white Gaussian noise x(k) at the input of the control
filter C(z). In Fig. 1, the modeling FIR filter $\hat{S}(z)$, which has length K,
generates $x_s(k)$ as per Eq. (19):
$$x_s(k) = \hat{s}^T(k) * x_{sN}(k) \tag{19}$$
Here, $x_s(k)$ generates a response error signal $y_\omega(k)$ after convolving with the
error residue e(k) in the FxLMS filter, after which it is sent to controller C(z).
where $\mu_s(k)$ is the step-size parameter of the modeling process. Finally, we update the
coefficients of the controller C(z) in the manner represented by Eqs. (21, 22):
$$C(k+1) = C(k) + \mu_s(k)\, f(k)\, x'(k) \tag{21}$$
(22)
The reference signal is filtered through $\hat{S}(z)$ to derive the LMS algorithm
input, which is given by Eqs. (23, 24):
$$x'(k) = \hat{s}^T(k) * x_N(k) \tag{23}$$
where
Here, harmonic sources are controlled by the system through adaptive filtering
of a non-synthesized reference signal, using an adaptive step size with
parameters α and β: α(k) controls the speed and shape of the adaptive step-size
algorithm, and β(k) controls the range of values of the functional response of S(z).
If the tap length, i.e., the adaptive filter length, is denoted by K, we refer to Eq. (24).
Then we update the modeling filter using the step-size parameter $\mu_s(k)$,
which is calculated using the
following steps:
• In the beginning, the power computation of error signals f(i) and e(i) is executed:
0.9 < ∝ < 1
• Following this, the estimated power ratio is acquired by Eq. (15) and corre-
sponding derived results for n = 0 and n = ∞.
• And finally, the calculation of the step size is performed as given below in Eq. (27):
where the values of μs(0) and ∝ are experimentally determined, and μs(0) denotes
the initial step size at the beginning of the path modeling process, in which
the step size varies over the discrete time index k. These
values are specifically chosen to ensure that the process of adaptation doesn’t slow
down or become unstable. The initial value of μs(n) is set to
μs(0), and the estimators given in Eqs. (20, 21) are initialized with
identical values, which for convenience are taken as unity, i.e.,
Pe(0) = Pf(0) = 1. It is also recommended that the same value of ∝ be used in
both estimators. Using the proposed Optimized Adaptive Step-Size
FxLMS (OASSFxLMS) algorithm increases the accuracy of modeling, which
is instrumental in improving the system’s performance.
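The step-size schedule described in the steps above can be sketched as follows. The power-estimator and step-size equations are not reproduced in the text, so this sketch assumes one commonly used form of variable step-size LMS: exponentially weighted power estimates of e and f, their ratio ρ = P_f / P_e, and a step size interpolated between a minimum and a maximum value. The function name, the parameter values, and the clamping of ρ to at most 1 are assumptions of this example and may differ from Eq. (27).

```python
def vss_step(p_e, p_f, e, f, forgetting=0.99, mu_min=1e-4, mu_max=1e-2):
    """One update of the power estimates and the resulting step size.

    forgetting plays the role of the factor written as ∝ in the text
    (0.9 < ∝ < 1); mu_min and mu_max are hypothetical bounds.
    """
    p_e = forgetting * p_e + (1.0 - forgetting) * e * e   # power of e
    p_f = forgetting * p_f + (1.0 - forgetting) * f * f   # power of f
    rho = min(p_f / p_e, 1.0) if p_e > 0.0 else 1.0       # assumed clamp
    mu = rho * mu_min + (1.0 - rho) * mu_max
    return p_e, p_f, mu

# Both estimators start at unity, as recommended in the text
p_e, p_f = 1.0, 1.0
first = vss_step(p_e, p_f, e=1.0, f=1.0)
print(first[2])        # ratio is 1 at the start, so the step size is mu_min

# As the modeling error f shrinks relative to e, the step size grows
for _ in range(2000):
    p_e, p_f, mu = vss_step(p_e, p_f, e=1.0, f=0.0)
print(mu)
```

Under these assumptions the step size starts small, when the modeling error is dominated by the disturbance, and grows towards the upper bound as the ratio of the two power estimates falls.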
The newly designed ANC system generates anti-noise with an amplitude equal to and
a phase opposite to that of the unwanted noise, which it cancels out after traveling
through the secondary propagation path. The convergence rate and
the magnitude of noise reduction are the critical measures for demonstrating
better performance than conventional algorithms. By using a feedforward system, we
can avoid undesirable acoustic feedback. The
traditional ANC algorithms involving FxLMS, which use a fixed tap length, usually
need a control filter with a predetermined long tap length to cope with different environments.
As a result, the convergence rate slows down because the maximum value of the step
size is limited.
The proposed algorithm is designed to self-adjust the required tap length to adapt
to the environment so that the noise cancellation system can achieve faster conver-
gence than conventional methods. For applications involving ANC, primary and
secondary propagation paths have asymmetric impulse responses, which help attain
the desired output response for the sake of canceling out undesired noise. Thus,
the new OASSFxLMS algorithm has been developed with a generalized dynamic
step-size function for a response model, which promotes the exponential decay of
noise residue by optimizing the filter’s coefficients. Specific issues concerning the
application of the proposed algorithm have also been addressed. Hence, we expect
the proposed OASSFxLMS algorithm to offer better performance and faster conver-
gence when compared to its conventional variable step size and other fixed step-size
counterparts.
the causality constraint, and each harmonic can be independently managed by the
reference signal generated internally.
The simulation results for the proposed algorithm are given below. The input
signal is composed of narrowband random additive white Gaussian noise. One can
observe in the figures below that the Active Noise Cancellation process manages to
generate anti-noise, which has an amplitude equal to and a phase opposite to that
of the unwanted noise, which it successfully cancels out while traveling through
the secondary propagation path source, which acts as the control signal as shown in
Fig. 3.
The noise residue left after destructive interference with
the variable step OASSFxLMS algorithm is significantly smaller than that of the conventional
ASSFxLMS algorithm applied to the same narrowband white noise. In this
instance, the noise is passed through a finite impulse response (FIR) filter to obtain the
best fit to the internal combustion engine noise that we want to cancel out.
The proposed OASSFxLMS algorithm slightly increases the noise reduction
by varying the step size and tap length of the secondary path coefficients.
The convergence speed of OASSFxLMS is also observed to be better than
that of the conventional ASSFxLMS, as can be observed in
Figs. 2 and 4. The System Identification Error parameter shown in Fig. 4 represents
the accuracy of the secondary impulse response by comparing its characteristics with
those of the input noise signal. This figure again shows that our
proposed OASSFxLMS algorithm outperforms the conventional ASSFxLMS
algorithm by significantly reducing the system identification error. Figure 5 shows
the amplitude levels of filter taps of the secondary and secondary impulse path coef-
ficients, which have been compared to show variation in terms of amplitude. As
observed in Fig. 5, the OASSFxLMS algorithm initially approaches the modeled
secondary impulse response with a larger step size and smaller tap length, which we
have obtained by reducing the MSD of the coefficients of the filter taps. Following
this, we have used varied filter taps to acquire the MSD for all the algorithms while
adjusting the step size according to Eq. (27). In this process, we have developed a
recursive algorithm for optimizing the estimation of step size and tap length, which
helps keep the algorithm’s computational complexity in check. Since the MSD of
the two-sided exponential decay model can be proved as a convex function of the
tap lengths and step sizes, the new algorithm has the property of global optimality.
As mentioned earlier, our experimental analysis involves evaluating MSD and the
Noise Reduction Ratio (NRR) comparison for the chosen ANC algorithms.
The simulation results in Figs. 6 and 7 show that the proposed OASSFxLMS algorithm
converges substantially faster and reduces noise better than the conventional and
existing algorithms, even assuming that the tap length and step size are known a
priori. The table illustrates the same: OASSFxLMS has a comparatively higher NRR
than its counterparts while maintaining a lower MSD, which is highly desirable for
accurate estimation of the step size and an optimized convergence rate. Thus, the
proposed OASSFxLMS algorithm, which has been developed with a generalized form of
variable step sizes for a secondary impulse response model, achieved the minimum MSD
in the computer simulations performed for optimal filter tap coefficients and offers
the highest noise reduction ratio compared with the conventional ASSFxLMS, FxLMS,
and FxRLS algorithms (Table 1).
Fig. 6 Comparison of the MSD performance for different ANC algorithms in the case of
narrowband IC engine noise
Fig. 7 Comparison of the NRR performance for different ANC algorithms in the case of
narrowband IC engine noise
6 Conclusion
In this paper, a modified variable step-size and tap-length FxLMS algorithm has
been proposed. The proposed algorithm has an advantage over others because of its
simplicity and robust performance, making it a good candidate for practical
applications. The computational complexity of an algorithm is usually determined
by the number of multiplication operations required per iteration (27). After
performing the relevant computer simulations, we obtained a computational
complexity of O(N log N), which is feasible to implement within existing industrial
norms. The existing methodologies generally involve three adaptive filters and have
the same level of computational complexity. Although the step-size update in the
proposed method has linear computational complexity, whereas the update in Akhtar's
method has constant complexity, the higher convergence rate and noise reduction
levels more than compensate for this. This has been demonstrated with computer
simulations, which show that the proposed method offers much faster convergence,
good stability, and robustness for ANC of narrowband internal combustion engine
noise. Hence, we have developed a high-performance feedforward ANC design that
promises low power consumption due to its low computational complexity and would be
viable for reducing internal combustion engine noise in automobiles. Experimental
results show that the proposed design can attenuate most of the narrowband
combustion noise between 100 and 1000 Hz.
References
1. Huang, B., Xiao, Y., Sun, J., Wei, G.: A variable step-size FXLMS algorithm for narrowband
active noise control. IEEE Trans. (2013)
200 A. Bisht and H. Y. Patil
2. Ardekani, I.T., Abdulla, W.H.: Effects of imperfect secondary path modelling on
adaptive active noise control systems. IEEE Trans. Control Syst. Technol. 20(5),
1252–1262 (2012)
3. Akhtar, M.T., Mitsuhashi, W.: Improving performance of hybrid active noise control
systems for uncorrelated narrowband disturbances. IEEE Trans. Audio Speech Lang.
Process. 19(7), 2058–2066 (2011)
4. Prof. Mrs. Pathak, B., Ms. Hirave, P.P.: FXLMS algorithm for feed forward active noise
cancellation. In: Universal Association of Computer and Electronics Engineers IEEE 978-
1-46730136-71111$26.00, pp. 18–22. IEEE (2011)
5. Meller, M., Niedzwiecki, M.: Multi-channel self-optimizing narrowband interference canceller.
Signal Process. 98 Elsevier 396–409 (2013)
6. Ang, W.P., Farhang-Boroujeny, B.: A new class of gradient adaptive step-size LMS
algorithms. IEEE Trans. Signal Process. 49(4), 805–810 (2001)
7. Manzano, E.A., Tafur, J.: Optimal step size for a delayed FxLMS algorithm applied in a
prototype of active noise control system. In: 2018 IEEE 14th International Conference on
Control and Automation (ICCA)
8. Chang, D.C., Chu, F.T.: Feedforward active noise control with a new variable tap-length and
step-size filtered-X LMS algorithm. IEEE/ACM Trans. Audio, Speech, Lang. Process. 22(2)
(2014)
9. Kuo, S.M., Morgan, D.R.: Active noise control: a tutorial review. Proc. IEEE 87(6),
943–973 (1999)
10. Eriksson, L.J., Allie, M.C.: Use of random noise for online transducer modeling in an adaptive
active attenuation system. J. Acoust. Soc. Am. 85(2), 797–802 (1989)
11. Kuo, S.M., Vijayan, D.: A secondary path modeling technique for active noise control systems.
IEEE Trans. Speech Audio Process. 5(4), 374–377 (1997)
12. Akhtar, M.T., Abe, M., Kawamata, M.: A method for online secondary path modeling in active
noise control systems. In: Proceedings of the IEEE 2005 International Symposium Circuits
Systems (ISCAS2005), 23–26, pp. I-264–I-267 (2005)
13. Akhtar, M.T., Abe, M., Kawamata, M.: Modified filtered-x LMS algorithm based active noise
control system with improved online secondary path modeling. In: Proceedings of the IEEE
2004 International Midwest Symposium Circuits Systems, 25–28, pp. I-13–I-16 (2004)
14. Kuo, S.M., Vijayan, D.: Optimized secondary path modeling technique for active noise control
systems. In Proceedings of the IEEE Asia-Pacific Conference on Circuits and Systems, pp. 370–
375. Taipei, Taiwan (1994)
15. Zhang, M., Lan, H., Ser, W.: Cross-updated active noise control system with online secondary
path modelling. IEEE Trans. Speech, Audio Proc., 9(5) (2001)
FFT-Based Robust Video Steganography
over Non-dynamic Region in Compressed
Domain
Abstract This work presents a novel data hiding method for video steganography in
the compressed domain. In this method, randomly numbered secret frames are selected
from the RGB cover video sequence. Using a specific host to conceal the confidential
data increases the complexity level of the video steganography. The method extracts
a specific non-dynamic region from each secret frame and transforms its pixel values
to the frequency domain using the Fast Fourier Transform (FFT). Using random Least
Significant Bits (LSBs) of the real part of the FFT as the carrier object leads to
good video quality and secret data-carrying capacity. The secure compressed stego
video is then reconstructed using the H.264 video compression technique. The method
is tested on well-known video datasets, with RGB images of different resolutions as
secret messages. Its efficiency is evaluated in terms of imperceptibility,
robustness, and embedding capacity, and the results are compared with reported
methodologies. The results show a significant improvement in imperceptibility, with
the Peak Signal-to-Noise Ratio (PSNR) reaching infinity (Inf) in some cases. At the
same time, the similarity between the embedded and extracted messages is close to 1,
the Bit Error Rate (BER) is negligible (less than 0.1%), and the embedding capacity
is greater than 0.5% in all cases, which indicates that a large amount of
confidential data can be carried.
R. Patel (B)
Computer Engineering Department, CGPIT, UTU, Bardoli 394350, Gujarat, India
e-mail: rachu.cuty@gmail.com
K. Lad
SRIMCA, Uka Tarsadia University, Bardoli 394350, Gujarat, India
M. Patel
Department of Mathematics, Uka Tarsadia University, Bardoli 394350, Gujarat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 201
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_19
1 Introduction
2 Literature Review
Liu and Xu [9] introduced a robust steganography method for HEVC based on secret
sharing, in which the secret message is encoded by threshold secret sharing and
embedded into 4 × 4 luminance DST blocks. The average PSNR obtained was between
34.4 and 46.38 dB, with an average Bit Error Rate (BER) of 20.41%–22.59%. The
method achieves high visual quality and robustness for HEVC.
Yang and Li [10] proposed a video steganography method for high-efficiency video
coding (HEVC) based on motion vector space encoding. In this method, the motion
vector components are selected from the N/2 prediction units (PUs) with smaller
sizes in a coding tree unit (CTU) as the carrier object for the secret information.
The embedding capacity is higher than that of LSB under similar motion vectors and
lower than that of LSB under identical carriers. The empirical results show that
the PSNR varies from 30 to 41.50 dB.
Despite these video steganography methods, it is still necessary to improve the PSNR
value and robustness by selecting well-secured carrier objects. In this regard, the
characteristics of the transform coefficient (FFT) components in the temporal domain
play a vital role. Furthermore, selecting a secret frame and using a specific region
of that frame as the carrier object to conceal confidential data enhances the
robustness of video steganography.
3 Proposed Methodology
[Figure: block diagram of the embedding process. Recoverable labels: Original Video,
Original Frame, Non-dynamic Region, ADM, R, G, B, FFT, Compressed Secret Message,
Stego key, Embed Key, IFFT, Stego Non-dynamic Region, Stego Frame, H.264 Encoder,
Compressed Stego Video]
B component secret message is extracted from the random LSBs of the real part of
the obtained FFT components. Finally, the secret message is reconstructed by
combining the R, G, and B components. The extraction process is briefly described
in Algorithm 2.
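The keyed LSB step described above can be illustrated with a minimal sketch. It is not the authors' Algorithm 1 or 2: the function names, the 4 × 4 sample block, the key value, and the choice to round the real part to an integer before touching its LSB are all assumptions made for illustration. The sketch stays in the coefficient domain and does not model the IFFT or H.264 stages. A direct DFT is used so that only the Python standard library is needed; an FFT routine would return the same coefficients.

```python
import cmath
import random

def dft2(block):
    """Direct 2-D discrete Fourier transform of a small block (list of rows)."""
    m, n = len(block), len(block[0])
    return [[sum(block[x][y] * cmath.exp(-2j * cmath.pi * (u * x / m + v * y / n))
                 for x in range(m) for y in range(n))
             for v in range(n)]
            for u in range(m)]

def key_positions(key, rows, cols, count):
    """Pseudo-random coefficient positions derived from a shared key (assumed scheme)."""
    rng = random.Random(key)
    return rng.sample([(u, v) for u in range(rows) for v in range(cols)], count)

def embed_bits(coeffs, bits, key):
    """Write each bit into the LSB of the rounded real part at a keyed position."""
    stego = [row[:] for row in coeffs]
    positions = key_positions(key, len(coeffs), len(coeffs[0]), len(bits))
    for (u, v), bit in zip(positions, bits):
        real = (round(stego[u][v].real) & ~1) | bit
        stego[u][v] = complex(real, stego[u][v].imag)
    return stego

def extract_bits(stego, n_bits, key):
    """Read the LSBs back from the same keyed positions."""
    positions = key_positions(key, len(stego), len(stego[0]), n_bits)
    return [round(stego[u][v].real) & 1 for u, v in positions]

# Hypothetical 4 x 4 region of one colour channel and an 8-bit secret.
block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
secret = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_bits(dft2(block), secret, key=2021)
recovered = extract_bits(stego, len(secret), key=2021)
```

Because the sender and receiver seed the generator with the same key, they visit the same coefficient positions, which is what allows the bits to be read back without transmitting the positions themselves.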
The proposed method has been tested on different cover videos with different sizes
(number of frames), resolutions (dimension of frames), and frame rates (frames per
second). The quality of video steganography is measured using the following quality
assessment parameters.
4.1 Imperceptibility
Imperceptibility measures the variation between the original and stego data. It is
measured by two parameters: PSNR and MSE. The lower the MSE, the higher the PSNR
and the higher the level of imperceptibility. The MSE and PSNR are calculated using
Eqs. (1) and (2), respectively.
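$$\mathrm{MSE} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{h}\left[F(i,j,k) - S(i,j,k)\right]^{2}}{m \times n \times h} \tag{1}$$

$$\mathrm{PSNR} = 10 \times \log_{10}\left(\frac{\mathrm{MAX}_F^{2}}{\mathrm{MSE}}\right)\ \text{(dB)} \tag{2}$$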
where F is the original cover frame and S is the stego frame. m × n is the frame's
dimension, and h denotes the number of RGB components of the frame (k = 1, 2, and 3).
The highest pixel value of the frame F is denoted by MAX_F.
4.2 Robustness
The robustness of video steganography is assessed using two parameters: (i) Sim, the
similarity between the embedded and extracted data, and (ii) BER, the error between
the bit positions of the original and stego objects. Sim and BER are calculated
using Eqs. (3) and (4), respectively.
x y
where N and N are the concealed and extracted hidden data, and “x” and “y” are
the sizes of the hidden data.
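Equations (3) and (4) are not legible in this copy, so the sketch below rests on assumed definitions that are common in the steganography literature: Sim as the normalized cross-correlation between the embedded and extracted messages, and BER as the percentage of bit positions that differ. The function names and sample bit strings are hypothetical, and the messages are treated as flattened bit sequences.

```python
import math

def similarity(embedded, extracted):
    """Normalized cross-correlation between two messages (assumed form of Sim)."""
    num = sum(a * b for a, b in zip(embedded, extracted))
    den = math.sqrt(sum(a * a for a in embedded) * sum(b * b for b in extracted))
    return num / den if den else 0.0

def bit_error_rate(embedded, extracted):
    """Percentage of bit positions that differ (assumed form of BER)."""
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return 100.0 * errors / len(embedded)

sent = [1, 0, 1, 1, 0, 1, 0, 0]
received = [1, 0, 1, 1, 0, 1, 0, 1]   # last bit flipped
sim_value = similarity(sent, received)
ber_value = bit_error_rate(sent, received)
```

Under these definitions a perfect extraction gives Sim = 1 and BER = 0%, which matches how the two measures are interpreted in the results.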
The capacity of video steganography to hide the maximum amount of data in a cover
object is known as the embedding payload or hiding capacity. It is measured by the
Hiding Ratio (HR), which is calculated using Eq. (5) [1, 2].
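Equation (5) is not reproduced in this copy. As a worked check, the sketch below assumes that HR is the secret message size divided by the total size of the selected cover frames, expressed as a percentage. This assumed form reproduces the HR values reported in Table 1 for the rows tried here; the function name is hypothetical.

```python
def hiding_ratio(secret_h, secret_w, cover_h, cover_w, frames_selected):
    """Assumed HR: secret message pixels over selected cover-frame pixels, in percent."""
    return 100.0 * (secret_h * secret_w) / (cover_h * cover_w * frames_selected)

# Values taken from the Traffic and Cactus rows of Table 1.
hr_traffic = hiding_ratio(362, 224, 800, 1280, 3)    # reported HR: 2.63958
hr_cactus = hiding_ratio(260, 420, 1080, 1920, 6)    # reported HR: 0.87770
```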
Table 1
| Sr. no. | Cover video name | Cover video size (Height × Width) | Frame rate | Total no. of frames | No. of frames selected from cover video | Secret message (SM) size (Height × Width) | Proposed method AMSE | Proposed method APSNR (dB) | Method [9] APSNR (dB) | Method [10] APSNR (dB) | Proposed method Sim | Proposed method BER (%) | Proposed method HR (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Basketball drive | 1080 × 1920 | 50 | 108 | 3 | 242 × 150 | 0.0000000 | Inf | – | 42.30 | 0.9772 | 0.0034 | 0.58353 |
| 3 | Cactus | 1080 × 1920 | 50 | 132 | 6 | 260 × 420 | 0.0000595 | 90.387 | – | 40.00 | 0.8627 | 0.0567 | 0.87770 |
| 4 | Kristen and Sara | 720 × 1280 | 60 | 165 | 6 | 203 × 328 | 0.0000007 | 109.773 | 46.38 | – | 0.9418 | 0.0414 | 1.20414 |
| 5 | Slide editing | 720 × 1280 | 30 | 167 | 9 | 236 × 381 | 0.0005159 | 81.005 | – | – | 0.8207 | 0.0766 | 1.08406 |
| 6 | Traffic | 800 × 1280 | 30 | 109 | 3 | 362 × 224 | 0.0000017 | 105.771 | – | 40.16 | 1.0000 | 0.0007 | 2.63958 |
| 7 | Party scene | 480 × 832 | 50 | 228 | 6 | 323 × 200 | 0.0000008 | 109.338 | 40.65 | 36.05 | 0.8632 | 0.0588 | 2.69598 |
| 8 | Basketball drill | 480 × 832 | 50 | 197 | 3 | 129 × 208 | 0.0000506 | 91.089 | 41.61 | – | 0.9799 | 0.0028 | 2.23958 |
| 9 | People on street | 800 × 1280 | 30 | 150 | 3 | 139 × 224 | 0.0000000 | Inf | – | 41.96 | 0.9668 | 0.0031 | 1.01354 |
| 10 | China speed | 768 × 1024 | 30 | 233 | 3 | 125 × 202 | 0.0000000 | Inf | – | 40.44 | 0.9938 | 0.0014 | 1.07023 |

Inf: There is no significant difference between the original video and the stego video
contrary, with the increment of the number of cover frames, the increment in the
secret message’s size improves video steganography’s hiding capacity.
References
1. Mstafa, R.J., Elleithy, K.M.: Compressed and raw video steganography techniques: a compre-
hensive survey and analysis. Multimed. Tools Appl. 76(20), 21749–21786 (2017). https://doi.
org/10.1007/s11042-016-4055-1 (Springer)
2. Mstafa, R.J., Elleithy, K.M., Abdelfattah, E.: Video steganography techniques: taxonomy, chal-
lenges, and future directions. In: Applications and Technology Conference (LISAT), 2017.
IEEE Long Island, pp. 1–6, IEEE (2017). https://doi.org/10.1109/LISAT.2017.8001965
3. Balu, S., Babu, C.N.K., Amudha, K.: Secure and efficient data transmission by video steganog-
raphy in medical imaging system. Cluster Comput. 4057–4063 (2018). https://doi.org/10.1007/
s10586-018-2639-4 (Springer)
4. Li, G., Ito, Y., Yu, X., Nitta, N., Babaguchi, N.: Recoverable privacy protection for video
content distribution. EURASIP J. Inf. Secur. 1–11 (2010). https://doi.org/10.1155/2009/293031
(Springer)
5. Liu, Y.X., Li, Z., Ma, X., Liu, J.: A novel data hiding scheme for H.264/AVC video streams
without intra-frame distortion drift. In: IEEE 14th International Conference on Communication
Technology, pp. 824–828. IEEE (2012). https://doi.org/10.1109/ICCT.2012.6511318
6. Liu, Y., Li, Z., Ma, X., Liu, J.: A robust data hiding algorithm for H.264/AVC video streams.
J. Syst. Soft. 86(8), 2174–2183 (2013). https://doi.org/10.1016/j.jss.2013.03.101
7. Mstafa, R.J., Elleithy, K.M.: A DCT-based robust video steganographic method using bch
error correcting codes. In: 2016 IEEE Long Island Systems, Applications and Technology
Conference (LISAT). IEEE (2016). https://doi.org/10.1109/LISAT.2016.7494111
8. Mstafa, R.J., Elleithy, K.M., Abdelfattah, E.: A robust and secure video steganography method
in DWT-DCT domains based on multiple object tracking and ECC. IEEE Access, vol. 5.
IEEE—Institute of Electrical Electronics Engineers, Inc., ISSN No.: 2169-3536. https://doi.
org/10.1109/ACCESS.2017.2691581, pp 5354–5365, 6th April 2017
9. Liu, S., Xu, D.: A robust steganography method for HEVC based on secret sharing. Cognitive
Syst. Res. 59, 207–220 (2020). https://doi.org/10.1016/j.cogsys.2019.09.008 (Elsevier)
10. Yang, J., Li, S.: An efficient information hiding method based on motion vector space encoding
for HEVC. Multimed. Tools Appl. 77(10), 11979–12001 (2017). https://doi.org/10.1007/s11
042-017-4844-1 (Springer)
11. Khan, A., Sarfaraz, A.: FFT-ETM based distortion less and high payload image steganog-
raphy. Multimed. Tools Appl. 25999–26022 (2019). https://doi.org/10.1007/s11042-019-
7664-7 (Springer)
12. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 3rd ed. ISBN: 0-13-168728-x 978-0-
13-168728-8. Pearson Education (2008)
13. Tutatchikov, V.S.: Two-dimensional fast Fourier transform Batterfly in analog of Cooley-Tukey
algorithm. In: 11th International Forum on Strategic Technology (IFOST), pp. 495–498, IEEE
(2016). https://doi.org/10.1109/IFOST.2016.7884163
14. Hussein, A.A., Al-Thahab, O.Q.J.: Design and simulation a video steganography system by
using FFT-turbo code methods for copyrights application. Eastern-Eur. J. Enterprise Technol.
2(9), 43–55 (2020). https://doi.org/10.15587/1729-4061.2020.201010
15. https://www.elecard.com/videos. Video compression Guru, Elecard Video, June 2019
16. https://github.com/remega/video_database/tree/master/videos. Remega video database,
November 8, 2017
An Improved Approach for Devanagari
Handwritten Characters Recognition
System
Rajdeep Singh, Arvind Kumar Shukla, Rahul Kumar Mishra, and S. S. Bedi
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 217
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_20
Fig. 1 Swars-Devanagri
Fig. 2 Vyanjan-Devanagri
Since ancient times, people have written notes, documents, and spiritual texts on
paper and sent handwritten cards to their relatives for information storage and
communication. Historical scripts, books, and other old documents have likewise been
written in many different styles. Although computers are now used to write many
types of documents, paper remains a preferred choice for writing letters. Searching
the data in old handwritten scripts is difficult, whereas data in computerized form
is easily accessible to the user. Therefore, we have to find the best way to convert
handwritten documents into a form that a computer can recognize easily. This
improves the efficiency with which people can understand handwritten data with the
help of computers. Handwritten data is difficult to understand because every person
has a unique way of writing. For the same reason, the variety of writing styles
makes Devanagari more difficult to recognize than English [14–17].
This research focuses on solving some of the issues related to machine learning and
image processing so that handwritten characters and numeric digits can be recognized
in various areas of life. Specifically, the research objectives are as follows.
• Designing a human-friendly handwritten character recognition system that helps
classify the information using machine learning techniques.
• Recommending an appropriate solution for the Devanagari character recognition
system.
• Publishing the research work to disseminate information to computer researchers
and different kinds of organizations.
The handwritten character recognition method is divided into the following steps,
as shown in Fig. 6.
Digital images are acquired as input with the help of electronic devices. Such
devices operate automatically. The images were obtained by writing words on a
computer, by creating images of words in many styles, and from scanned data [18, 19].
3.2 Pre-processing
Pre-processing consists of simple phases: size normalization and centering,
interpolation of missing points, smoothing, slant correction, and resampling of
points [20].
3.3 Segmentation
This phase separates the individual characters in an image. The document is
processed in a well-defined order. First, text lines are separated using a row
histogram. From each line, words are extracted using a column histogram, and
finally, characters are separated from the words.
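The histogram-based splitting described above can be sketched as follows. This is a minimal illustration, assuming a binarized image stored as a list of rows in which 1 marks ink and 0 marks background; the function names and the toy image are hypothetical, and real pages would need thresholds rather than exact zeros.

```python
def runs(profile):
    """Return (start, end) index pairs of consecutive non-zero histogram entries."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment_lines(image):
    """Split a page into text lines using the row histogram (ink count per row)."""
    return runs([sum(row) for row in image])

def segment_columns(image, top, bottom):
    """Split one text line into column runs using the column histogram."""
    width = len(image[0])
    col_hist = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(col_hist)

page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)
first_line_parts = segment_columns(page, *lines[0])
second_line_parts = segment_columns(page, *lines[1])
```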
In this phase, the representation of the image data is transformed by a feature
extraction technique into a form that is more relevant for classification and
grouping of the data. Methods such as binary, linear, and graph-based analysis are
used to extract distinctive character features. The choice of scheme depends on the
nature of the input data.
3.5 Classification
When an input image is fed to the proposed handwritten character recognition
system, the extracted features are passed as input to a trained classifier such as
an SVM or ANN. The classifier compares the input features with the stored patterns
and determines the best-matching class for the input data.
This phase removes ambiguous outputs by using semantic information. It is the
procedure that refines the result obtained from shape recognition of the handwritten
data. For Devanagari (Lipi) records and images, it improves on the accuracy obtained
from shape recognition alone. For handwritten input, some shape recognizers yield a
single string of characters, whereas others return several alternatives for each
character, typically with a measure of confidence [21, 22].
The following model illustrates the machine learning process for character
recognition. The phases of the machine learning classifiers are shown in Fig. 7.
Fig. 9 Experimental analysis and discussion (vertical axis: 0 to 90; horizontal axis:
150, 750, 1500, 3000, 4500; series: K-Mean, SVM, Proposed Model)
4.1 Dataset
4.2 Augmentation
After collecting the dataset, a Convolutional Neural Network (CNN), a supervised
learning technique, is used for feature extraction. For this purpose, a considerable
amount of data is used, because a large amount of data allows the CNN to learn
extensive and accurate feature attributes. The dataset is divided into its different
data categories. The network consists of three convolutional blocks with max-pooling
layers in between. First, an image is taken as input. The input is given to the
first convolutional block, which filters the image with 36 kernels of size 3.5 × 3.5.
After the first max-pooling layer, the output of the first convolutional block is
given to the second convolutional block as input. In the second convolutional block,
the image is filtered with 64 kernels of size 4 × 4. This output is given to the
second max-pooling layer, whose output is passed to the final convolutional block.
The final block filters the data with 128 kernels of size 1 × 1 and passes its output
to a fully connected layer of 512 neurons, whose result is given to the softmax
function. The softmax function provides a probability distribution over the four
output categories. The last layer is connected to an MLP. The output of every
convolutional layer and of the fully connected layers is passed through the ReLU
activation function. The system is trained using Adam with a batch size of 100 for
1000 epochs. Thus, we collect the features of the image dataset using the CNN
algorithm [23].
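The final softmax step mentioned above can be sketched in a few lines. The four raw scores below are hypothetical values standing in for the network's outputs for the four categories; subtracting the maximum score before exponentiating is a standard numerical safeguard and not something stated in the text.

```python
import math

def softmax(logits):
    """Turn raw class scores into a probability distribution."""
    shift = max(logits)                      # avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the four output categories.
probs = softmax([2.0, 1.0, 0.5, -1.0])
predicted = probs.index(max(probs))
```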
Clustering is one of the data organization techniques used in data analysis and
database management. A large amount of data can be divided into subgroups so that
data of the same type is placed in the same group. The task is to find homogeneous
subgroups within the data points. Similarity is measured with a Euclidean-based or
correlation-based distance; the choice is application-specific. Clustering analysis
forms the subgroups based on the features [2].
Clustering is an unsupervised machine learning method, and it can be performed in
different ways. Here, the features obtained from the Convolutional Neural Network
(CNN) are partitioned into non-overlapping clusters, where every feature point
belongs to only one cluster. First, the total number of clusters is decided, and the
centroids are initialized with randomly selected data points. Each data point is
then assigned to the cluster of its nearest centroid, the centroids are recomputed,
and the iteration is repeated until the centroids no longer change. After that, the
sum of the squared distances between the data points and their cluster centroids is
calculated (Fig. 8).
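The clustering loop described above corresponds to standard k-means, which can be sketched as follows. This is an illustrative implementation rather than the authors' code: the function names, the seed, the iteration cap, and the six two-dimensional sample points (standing in for CNN feature vectors) are assumptions.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, seed=0, max_iter=100):
    """Standard k-means: random initial centroids, then assign and update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
        if updated == centroids:             # centroids unchanged: converged
            break
        centroids = updated
    # Sum of squared distances between points and their cluster centroids.
    sse = sum(squared_distance(p, centroids[j])
              for j, cluster in enumerate(clusters) for p in cluster)
    return centroids, clusters, sse

features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters, sse = k_means(features, k=2)
```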
After that, the SVM algorithm is called to evaluate the k clusters. The sort number
is denoted as T. A condition is created under which every value can be evaluated as
a newly generated solution. This gives a kSVM solution:
ksvm-model = {(c1, Lsvm1), (c2, Lsvm2), …, (ck, Lsvmk)}
where k = the number of clusters (local models);
y = the hyperparameter of the RBF kernel function;
c = the error rate of the SVM.
Finally, the global best solution is returned. The procedure is repeated until all
such clusters are pruned, which gives the final classification. The characters and
numeric values can then be classified and identified efficiently, and the accuracy
of character recognition can be calculated [8].
In machine learning, k-means is perhaps the best-known and most studied method for
clustering analysis [8, 9].
k-means is a clustering method that helps to group newly scanned data or images into
the required blocks for handwritten character categories. Using a desktop GUI form
or web portal, a client can request and visualize the historical images from the
data previously gathered on the server.
6 Conclusion
The proposed method can give better performance. The k-means algorithm does not
work well with global clusters, nor with clusters of different sizes and different
densities. Therefore, passing the clusters to multiple SVM classifiers after
clustering provides better classification. With this method, large datasets can be
easily trained and tested to recognize different handwritten characters. This kind
of approach is beneficial in daily life. Future work can develop the algorithm with
better segmentation techniques, so there is scope for improvement in the methods.
References
1. Pal, U., Chaudhuri, B.B.: Indian script character recognition: a survey. Pattern Recognit. 37,
1887–1899 (2004)
2. Singh, H., Sharma, R.K.: Moment in online handwritten character recognition. In: National
Conference on Challenges & Opportunities in Information Technology (COIT- 2017,
Gobindgarh. March 23 (2007)
3. Hanmandlu, M., Ramana Murthy, O.V.: Fuzzy model-based recognition of handwritten
numerals. Pattern Recognit. 40, 1840–1854 (2007)
4. Arora, S.: Combining multiple feature extraction techniques for handwritten devnagari char-
acter recognition. In: IEEE Region 10 Colloquium and the Third ICIIS Kharagpur, India
(2008)
5. Arica, N.: An overview of character recognition focused on offline handwriting, C99-06-C-203,
IEEE (2000)
6. Cardona, G., Jain, D.: The Indo-Aryan Languages. Routledge, pp. 68–69 (2003). ISBN 978-
0415772945
7. Ramteke, R.J., Mehrotra, S.C.: Recognition of handwritten devnagari numerals. Int. J. Comput.
Process. Orient. Lang. (2008)
8. Das, N., Sarkar, R., Basu, S., Saha, P.K., Kundu, M., Nasipuri, M.: Handwritten bangla character
recognition using a soft computing paradigm embedded in two pass approach. Pattern Recogn.
48, 2054–2071 (2015)
9. Sarkhel, R., Das, N., Saha, A.K., Nasipuri, M.: A multi-objective approach towards cost effec-
tive isolated handwritten bangla character and digit recognition. Pattern Recogn. 58, 172–189
(2016)
10. Indian, A.: A survey of offline handwritten hindi character recognition. IEEE (2017). 978-
15090-6403-8/17
11. Shamim, S.M., Neural, D.: Glob. J. Comput. Sci. Technol., Artif. Intell. 18(1) Version 1.0 Year
2018, Type: Double Blind. Peer Rev. Int. Res. J., Publisher: Global Journals, Online ISSN:
0975-4172 & Print ISSN: 0975-4350 (2018)
12. Saha, M.: Int. J. Adv. Sci. Technol. 29(9), 2900–2910 (2020)
13. Shukla, A.K.: Patient diabetes forecasting based on machine learning approach. In: Pant, M.,
Kumar Sharma, T., Arya, R., Sahana, B., Zolfagharinia, H. (eds.) Soft Computing: Theories and
Applications. Advances in Intelligent Systems and Computing, vol. 1154.Springer, Singapore
(2020). https://doi.org/10.1007/978-981-15-4032-5_91
14. Das, N., Basu, S., Sarkar, R., Kundu, M., Nasipuri, M., Basu, D.: Handwritten bangla compound
character recognition: potential challenges and probable solution. In: IICAI, pp. 1901–1913
(2009)
15. Das, N., Das, B., Sarkar, R., Basu, S.: Handwritten banglabasic and compound character
recognition using MLP and SVM classifier. J. Comput. 2(2), 109–115 (2010)
16. Bag, S., Bhowmick, P., Harit, G.: Recognition of bengali handwritten characters using skeletal
convexity and dynamic programming in emerging applications of information technology
(EAIT). In: Second International Conference , pp. 265–268 (2011)
17. Aggarwal, A., Rani, R., Dhir, R.: Handwritten Devanagari character recognition using
gradient features. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2(5), 85–90 (2012)
18. Pradeep, J., Srinivasan, E., Himavathi, S.: Neural network based recognition system
integrating feature extraction and classification for english handwritten. Int. J. Eng.
25(2), 99–106 (2012)
19. Aggarwal, A., Rani, R., Dhir, R.: Handwritten devanagari character recognition using gradient
features. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2(5), 85–90 (2012). (ISSN: 2277-128X)
20. Pathan, I.K., Ali, A.A., Ramteke, R.J.: Recognition of offline handwritten isolated Urdu
character. Int. J. Adv. Comput. Res. 4(1), 117–121 (2012)
21. Vaidya, S.A., Bombade, B.R.: A novel approach of handwritten character recognition using
positional feature extraction. Int. J. Comput. Sci. Mob. Comput. 2(6), 179–186 (2013)
22. Wu, X., Tang, Y., Bu, W.: Offline text-independent writer identification based on scale invariant
feature transform. IEEE Trans. Inf. Forensics Secur. 9, 526–536 (2014)
23. Dholakia, K.: A survey on handwritten character recognition techniques for various indian
languages. Int. J. Comput. Appl. 115(1), 17–21 (2015)
PSO-WT-Based Regression Model
for Time Series Forecasting
1 Introduction
P. S. Rao (B)
Department of CSE, Acharya Nagarjuna University, Guntur, India
G. P. Varma
Department of CSE, Chaitanya Bharathi Institute of Technology, Hyderabad, India
Ch. D. Prasad
Department of EEE, SRKR Engineering College, Bhimavaram, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 227
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_21
segmented linear regression modeling is proposed [4]. One of the critical
applications of regression models in the present smart cities concept is forecasting
load and renewable generation inputs. An MLR approach for short-term load
forecasting that supports better power system planning is available in [5]. In [6],
wind forecasting is performed with a deep learning model because of the volatile
nature of wind speed. Recently, such prediction studies have also emerged in civil
engineering. Several optimized MLR models are presented in [7–10] to identify
unknown quantities. However, these MLR models produce large errors when the data is
more volatile. Therefore, nonlinear regression models were developed for accurate
prediction studies. Artificial neural networks (ANN) [11, 12], fuzzy logic [13],
support vector machines (SVM) [14, 15], and deep neural networks (DNN) [16] fall
under intelligent studies. However, these models are complex and require heavy
preprocessing.
In this paper, the accuracy of the prediction results is enhanced with wavelet
transform (WT). Using WT, the detailed and approximate frequency components of
the time series are extracted. These coefficients are used in a PSO-assisted MLR model
for best fit. This approach predicts future data more accurately than standard
MLR models. The complete procedure of algorithm implementation and results are
presented in the following sections: Sect. 2 includes applying wavelet coefficients
in the linear regression model with PSO assistance. Results and discussions are
presented in Sect. 3, and conclusions are reported in Sect. 4.
2 Proposed Method
\[ y = \sum_{i=1}^{n} \beta_i x_i + \beta_0 \tag{1} \]
In Eq. (1), y is the output variable and x_i (i = 1, 2, ..., n) are the input variables.
The coefficients of the MLR model are identified to fit the data; in this paper, the
PSO algorithm is used to find their optimal values. Such models are not well suited to
complex processes in which the variables that the output depends on are difficult to
identify from the available data. Therefore, more features are required for close
prediction. For this purpose, the input time series is processed through WT to extract
the approximation and detailed coefficients, which are then used as input variables in
the MLR model. WT is used to extract multiple features of time data and to build
accurate prediction models in engineering applications [10, 11]. By using WT, a time
series x(t) can be analyzed at various levels of decomposed coefficients using Eq. (2)
\[ w_f(a, b) = \int_{-\infty}^{\infty} x(t) \, \phi_{a,b}(t) \, dt \tag{2} \]
In Eq. (2), ϕ_{a,b}(t) is the mother wavelet function. Several mother wavelets are
available in the literature; among them, the Daubechies wavelets are used here for
feature extraction. The WT analysis decomposes the given signal into low- and
high-frequency components, known as the approximation coefficients a_l and the
detailed coefficients d_l. When these coefficients, obtained from a single input
series, are substituted in Eq. (1) to obtain better forecasting outputs, the modified
expression is given by
\[ y = \sum_{i=1}^{k} \beta_i a_{l,i} + \sum_{i=1}^{k} \gamma_i d_{l,i} + \beta_0 \tag{3} \]
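As a minimal sketch of how wavelet coefficients can feed the regression in Eq. (3), the following example applies one level of the Haar transform, which is the simplest member of the Daubechies family (db1). The choice of Haar, the window values, and the function names haar_dwt and predict are assumptions made for illustration; the wavelet order and decomposition level used in the paper are not stated here.

```python
import math


def haar_dwt(signal):
    """One level of the Haar (db1) discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def predict(approx, detail, beta, gamma, beta0):
    """Evaluate Eq. (3): sum(beta_i * a_i) + sum(gamma_i * d_i) + beta0."""
    return (sum(b * a for b, a in zip(beta, approx))
            + sum(g * d for g, d in zip(gamma, detail))
            + beta0)


# Hypothetical window of past samples (not data from the paper).
window = [4.0, 6.0, 10.0, 12.0]
a, d = haar_dwt(window)
```

The approximation list carries the smoothed trend of the window and the detail list carries its local fluctuations; both enter Eq. (3) as separate regressors with their own coefficients.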
The optimal values of the coefficients in Eq. (3) are also identified using PSO [18].
Since the choice of coefficients influences the prediction, population-based search
techniques are suitable for finding the best-fit values. Among these techniques, PSO
is a simple population-based search algorithm introduced in 1995 by Kennedy and
Eberhart. It is used in this paper to find the best-fit regression coefficients. The
mechanism of PSO is based on two update equations, for position (p) and velocity (v).
Candidate solutions of the problem (the MLR coefficients), called positions, are
randomly generated within the limits of the variables. The fitness of each particle is
calculated from these initial values, and the particles are then updated, based on the
best-fitness solutions, using Eqs. (4) and (5) given by
\[ v_n^{i+1} = \omega v_n^{i} + a_1 r_1 \left( \mathrm{pbest}^{i} - p_n^{i} \right) + a_2 r_2 \left( \mathrm{gbest}^{i} - p_n^{i} \right) \tag{4} \]
In Eqs. (4) and (5), pbest and gbest are the individual and group best positions.
The rest of the variables are the control and standard parameters of the PSO. The
objective function used for the PSO-WT-MLR approach is given by
\[ \text{Fitness function} = \left( y_{\mathrm{actual}} - y_{\mathrm{predicted}} \right)^{2} \tag{6} \]
This fitness value is calculated for every solution of PSO to find the best optimal
coefficients of MLR.
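The update and fitness steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the position update uses the standard PSO form p = p + v (assumed, since Eq. (5) is not reproduced in the text), the squared error of Eq. (6) is summed over all samples, and the swarm size, inertia weight, acceleration constants, bounds, and toy data are hypothetical.

```python
import random


def fitness(coeffs, xs, ys):
    """Squared error of y = b1*x + b0, summed over the samples (assumption)."""
    b1, b0 = coeffs
    return sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))


def pso(xs, ys, n_particles=30, n_iter=300, w=0.7, a1=1.5, a2=1.5,
        lo=-10.0, hi=10.0, seed=0):
    rng = random.Random(seed)
    dim = 2
    # Random initial positions within the variable limits, zero velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, xs, ys) for p in pos]
    g = min(range(n_particles), key=lambda n: pbest_fit[n])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for n in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, Eq. (4).
                vel[n][k] = (w * vel[n][k]
                             + a1 * r1 * (pbest[n][k] - pos[n][k])
                             + a2 * r2 * (gbest[k] - pos[n][k]))
                # Standard position update (assumed form of Eq. (5)).
                pos[n][k] += vel[n][k]
            f = fitness(pos[n], xs, ys)
            if f < pbest_fit[n]:
                pbest[n], pbest_fit[n] = pos[n][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[n][:], f
    return gbest, gbest_fit


# Toy data generated from y = 2x + 1 (not the paper's series).
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
best, best_fit = pso(xs, ys)
```

For the WT-assisted model, the same loop would apply with the coefficient vector extended to hold every β_i, γ_i, and β_0 of Eq. (3).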
Initially, an LR model has been considered in which the output depends on the single
input. The corresponding data of financial series is fit with linear expression given
by
y = β1 x1 + β0 (7)
Figure 1 shows the time series data for the implementation of regression models.
This data is divided into three segments known as pre-data, training data, and testing
data [17, 18]. The total samples in the data set are 768, out of which 1–10 samples
are used as pre-data samples, 11–720 samples for training, and 721–768 samples for
testing. For the data shown in Fig. 1, the optimal LR model obtained by PSO is given
by
In Eq. (8), x1 is the one-sample-delayed value of y. Similarly, the MLR model
for the data using two inputs identified by PSO is given by
The regression model coefficients are identified using PSO, with fitness function
values of 24.8059, 24.6779, and 15.9623. The absolute errors of the PSO regression
models on the testing data are shown in Fig. 3. This result shows the improvement
achieved by the proposed PSO-WT-MLR approach. The R^2 values for the various regression
models obtained by PSO are reported in Table 1. These results show that the
WT-assisted, PSO-optimized MLR approach improves the prediction of future data
samples.
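The data partition and the R^2 measure used in this section can be illustrated with a short sketch. The index ranges follow the 1-based sample numbers given in the text; the placeholder series and the function names split_series and r_squared are assumptions for illustration only.

```python
def split_series(series):
    """Split a 768-sample series using the 1-based ranges given in the text."""
    pre = series[0:10]       # samples 1-10 (pre-data)
    train = series[10:720]   # samples 11-720 (training)
    test = series[720:768]   # samples 721-768 (testing)
    return pre, train, test


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Placeholder data, not the financial series of Fig. 1.
series = [float(i % 7) for i in range(768)]
pre, train, test = split_series(series)
```

An R^2 of 1 indicates a perfect fit, while a model that always predicts the mean of the actual values scores 0.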
4 Conclusions
In this paper, the PSO-WT-MLR approach is presented for time series forecasting.
The regular LR and MLR models obtained using PSO produce large forecasting errors.
Therefore, the prediction results are improved by using wavelet coefficients: the
coefficients extracted from the data provide better prediction models. The statistical
measures and percentage errors on the test data demonstrate the efficacy of the
proposed method.
References
1. Hurvich, C.M., Tsai, C.L.: Regression and time series model selection in small samples.
Biometrika 76(2), 297–307 (1989)
2. Ng, K.Y., Awang, N.: Multiple linear regression and regression with time series error models
in forecasting PM 10 concentrations in Peninsular Malaysia. Environ. Monit. Assess. 190(2),
63 (2018)
3. Alaeddini, A., Alemzadeh, S., Mesbahi, A., Mesbahi, M.: Linear model regression on time-
series data: non-asymptotic error bounds and applications. In: 2018 IEEE Conference on
Decision and Control (CDC), pp. 2259–2264. IEEE (2018)
4. Valsamis, E.M., Husband, H., Chan, G.K.: Segmented linear regression modelling of time-
series of binary variables in healthcare. Comput. Math. Methods Med. (2019)
5. Amral, N., Ozveren, C.S., King, D.: Short term load forecasting using multiple linear regression.
In: 2007 42nd International Universities Power Engineering Conference, pp. 1192–1198. IEEE
(2007)
6. Liu, H., Mi, X., Li, Y.: Smart multi-step deep learning model for wind speed forecasting based
on variational mode decomposition, singular spectrum analysis, LSTM network and ELM.
Energy Convers. Manage. 159, 54–64 (2018)
7. Egbe, J.G., Ewa, D.E., Ubi, S.E., Ikwa, G.B., Tumenayo, O.O.: Application of multilinear
regression analysis in modeling of soil properties for geotechnical civil engineering works in
Calabar South. Niger. J. Technol. 36(4), 1059–1065 (2017)
8. Nagaraju, T.V., Prasad, C.D., Murthy, N.G.: Invasive weed optimization algorithm for predic-
tion of compression index of lime-treated expansive clays. In: Soft Computing for Problem
Solving, pp. 317–324. Springer, Singapore (2020)
9. Nagaraju, T.V., Prasad, C.D.: Swarm-assisted multiple linear regression models for compres-
sion index (Cc) estimation of blended expansive clays. Arab. J. Geosci. 13(9) (2020)
10. Shen, Y.X., Yang, J.G.: Temperature measuring point optimization and thermal error modeling
for NC machine tool based on ridge regression. Mach. Tool Hydraulic. 40(5), 1–3 (2012)
11. Pradeep Kumar, D., Ravi, V.: Forecasting financial time series volatility using particle swarm
optimization trained quantile regression neural network. Appl. Soft Comput. 58, 35–52 (2017)
12. Ghazvinian, H., Bahrami, H., Ghazvinian, H., Heddam, S.: Simulation of monthly precipitation
in semnan city using ANN artificial Intelligence model. J. Soft Comput. Civil Eng. 4(4), 36–46
(2020)
13. Yuan, K., Liu, J., Yang, S., Wu, K., Shen, F.: Time series forecasting based on kernel mapping
and high-order fuzzy cognitive maps. Knowl.-Based Syst. 206, 106359 (2020)
14. Singh, V., Poonia, R.C., Kumar, S., Dass, P., Agarwal, P., Bhatnagar, V., Raja, L.: Prediction
of COVID-19 corona virus pandemic based on time series data using support vector machine.
J. Discrete Math. Sci. Cryptograp. 1–5 (2020)
15. Sahoo, B.B., Jha, R., Singh, A., Kumar, D.: Application of support vector regression for
modeling low flow time series. KSCE J. Civil Eng. 23(2), 923–934 (2019)
16. Vidal, A., Kristjanpoller, W.: Gold volatility prediction using a CNN-LSTM approach. Expert
Syst. Appl. 113481 (2020)
17. Gupta, D., Pratama, M., Ma, Z., Li, J., Prasad, M.: Financial time series forecasting using twin
support vector regression. PLoS ONE 14(3), e0211402 (2019)
18. Rao, P.S., Varma, G.P., Prasad, C.D.: Identification of linear and nonlinear curve fitting models
using particle swarm optimization algorithm. In: AIP Conference Proceedings, vol. 2269, no.
1, p. 030040. AIP Publishing LLC (2020)
Leaf Diagnosis Using Transfer Learning
Abstract Machine learning is used in agriculture to improve plant yield and crop
quality. The key challenge farmers face is damage from bacterial infections, fungal
and viral infections, and worm attacks, which leads to leaf decay when left unchecked.
Transfer learning and fine-tuning of state-of-the-art models can be used to diagnose
plant and crop diseases. Here, a hybrid method based on Faster RCNN and the single shot
detector (SSD) is proposed for detecting plant diseases. This hybrid method detects the
leaves of the plant and determines the affected area. Experimental studies indicate
that healthy and unhealthy plant leaves are classified accurately.
1 Introduction
The biggest problem farmers face is low and poor-quality yields due to insects and
pests. Insects, rodents, fungi, and weeds decrease yields by 20% or more from the
early and mid-season stages through post-harvest. Farmers can use pesticides to keep
the insect and rodent populations under control. Lack of quality control, high prices,
noise, untimely availability, lack of education, and the use of defective machinery by
an untrained labor force are the main causes of pesticide inefficiency. Cotton plays a
crucial role in the Indian economy, as its textile industry is primarily cotton-based.
The Indian textile industry contributes about 5% to the gross domestic product (GDP),
14% to industrial production, and 11% to overall export revenues [1]. The
non-application of new technology leads to low yields relative to the global average.
P. Udawant (B)
Assistant professor SVKM’s NMIMS MPSTME, Shirpur, India
e-mail: Prashant.udawant@nmims.edu
P. Srinath
Associate professor SVKM’s NMIMS MPSTME, Mumbai, India
e-mail: pravin.srinath@nmims.edu
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 235
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_22
2 Literature Survey
Deep learning models such as convolutional neural networks (CNNs) [5] have been used
to detect and diagnose diseases in plants from leaf images of diseased and healthy
plants. In previous works, deep learning models for the identification and diagnosis
of plant diseases were validated with five basic CNN architectures: AlexNet,
AlexNetOWTBn, GoogLeNet, Overfeat, and VGG [6–9].
These models have been trained, tested, and implemented with the aid of the
Torch7 framework [10]. As shown by the findings, the highest performance rates
were achieved by VGG and AlexNetOWTBn architectures. These two models were
later trained and tested on the original pictures. When trained with the actual pictures,
VGG showed a maximum success rate of 99.53 percent.
A deep learning approach was used to build an image-based system to detect plant
diseases [11]. The proposed method used a publicly available dataset containing 54,306
images of both diseased and healthy plant leaves. The dataset included 14 species of
plants and 26 diseases. Three experiments were performed with different versions of
the dataset: the first version consisted of the original color images; the second
consisted of grayscale images; and the third consisted of leaves segmented from the
images, thus eliminating the extra background from the photos. The performance of two
architectures, AlexNet [7] and GoogLeNet [8], on the PlantVillage dataset [2] was
studied by training the models in two separate settings. In the first, the model was
trained from scratch; in the second, a transfer learning approach was used in which
pre-trained models were adapted. The average accuracy was 85.53% for the model trained
from scratch and 99.34% for the model trained with transfer learning.
A comparative analysis of machine learning algorithms was undertaken to classify
healthy and unhealthy plant leaves [12]. Three kinds of plants were selected for this
analysis, namely cabbage, sorghum, and citrus. Three types of features were used:
color-based features such as pixel values; descriptors such as the histogram of
oriented gradients (HOG) [13]; and statistical features such as mean, standard
deviation, minimum, and maximum. The dataset used to train the models consisted of 382
cabbage images, 262 sorghum images, and 539 citrus images. Three machine learning
techniques, namely support vector machine (SVM) [14], supervised ANN [15], and Random
Forest [16], were used to classify these images. For training, 60% of the dataset
images were used, while the remaining 40% were used for testing. The performance of
the three models was compared using the F1 score [17]. According to the findings, SVM
obtained the highest F1 score for damaged sorghum but did not perform well in
detecting damaged citrus, while Random Forest achieved an average F1 score of 0.954.
An automatic in-field diagnostic system for the detection of wheat diseases has been
proposed [18]. Based on deep multiple-instance learning, the suggested approach detects
wheat diseases and maps disease regions using only image-level annotations for the
training images. A dataset containing in-field images of the wheat crop was compiled
and used for verification. The wheat disease dataset consisted of 9,230 images in 7
classes, one of which was the healthy class. A fully convolutional network (FCN) [19]
is used for local feature extraction and disease prediction. This network generates
spatial disease score maps in which each score corresponds to a local window of the
raw image. These score maps are then aggregated in a multiple-instance learning (MIL)
framework. Bounding box approximation (BBA) is performed to better localize disease
positions. Two models, VGG-CNN-S and VGG-CNN-VD16, which are considered the basic
models, were trained for 60 epochs with an initial learning rate of 0.0001 and a batch
size of 45 instances, while the advanced models VGG-FCN-VD16 and VGG-FCN-S were trained
for around 20 epochs with an initial learning rate of 0.00005 and a batch size of 2
instances. Three aggregation functions of VGG-FCN-S and VGG-FCN-VD16 were used to
compare the constructed models with the standard convolutional neural network. The
findings reveal that VGG-FCN-S outperforms VGG-CNN-S and VGG-FCN-VD16 outperforms
VGG-CNN-VD16 in all categories except Black Chaff.
3 Proposed Method
This research aims to detect the plant’s diseased region and provide useful informa-
tion to the farmers. A lot of research is going on to prevent pest attacks and identify
the disease early to avoid any further losses.
More than 1500 images were collected directly from fields in Maharashtra and Gujarat,
varying in disease, health, color, etc. The images were captured using current
high-quality smartphone cameras, and the collected dataset was verified by experts.
Because the images of plant leaves are acquired in the field, they may contain dust,
water spots, or noise. In addition, the dataset consists of images taken with
different cameras, which results in differences in the exact pixel values. The sole
purpose of pre-processing is to reduce noise and other irrelevant content so that the
data is consistent throughout the project. Each image was manually inspected for
defects, differences in leaf color, blurriness, shadows, etc. (Fig. 1)
The processed images were fewer than needed for the project, so to obtain more data we
augmented the collected dataset with random transformations such as rotation, scaling,
skewing, shearing, and blurring, growing the dataset from 1500+ images to more than
10,000 images. To train the object detection model, the data must be labeled or
annotated. The dataset images were labeled as “Healthy” and “Unhealthy” for the first
detection. The image annotation took place before the augmentation process; once the
annotations were done, the annotations and images were augmented together, producing
more labeled data. The annotation was done using the LabelImg software. This software
helps create and label bounding boxes and stores them in Pascal VOC XML format, which
can be read when training object detection models with transfer learning.
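Because annotation precedes augmentation, each geometric transform applied to an image must also be applied to its bounding boxes. A minimal sketch for two of the listed transforms (scaling and a 90-degree clockwise rotation) is given here; the function names, the box layout (xmin, ymin, xmax, ymax), and the example values are assumptions for illustration, not the augmentation code used in this work.

```python
def scale_box(box, sx, sy):
    """Scale a box (xmin, ymin, xmax, ymax) by factors sx and sy."""
    xmin, ymin, xmax, ymax = box
    return (round(xmin * sx), round(ymin * sy),
            round(xmax * sx), round(ymax * sy))


def rotate_box_90_cw(box, image_height):
    """Rotate a box with its image by 90 degrees clockwise.

    A point (x, y) in the original image maps to (image_height - y, x)
    in the rotated image, whose width equals the original height.
    """
    xmin, ymin, xmax, ymax = box
    return (image_height - ymax, xmin, image_height - ymin, xmax)


# Hypothetical "Unhealthy" box in a 400 x 300 (width x height) image.
box = (10, 20, 110, 220)
half_size = scale_box(box, 0.5, 0.5)
rotated = rotate_box_90_cw(box, image_height=300)
```

Transforms such as blurring change only pixel values, so the boxes can be copied unchanged for those cases.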
Image segmentation is the next step in the system. It represents the image in a form
that is more meaningful and easier to analyze. In the segmentation process, a digital
image is subdivided into multiple segments. The main objective of segmentation is to
extract meaningful information from the digital image. The techniques that can be used
for segmentation are region-based, edge-based, threshold-based, feature-based
clustering, and color-based.
Once the labeled data was available in Pascal VOC XML format, we corrected any errors
in the labels and then converted the annotations into a single CSV file containing the
bounding box values (X-min, Y-min, X-max, Y-max) and the labels. The CSV file was then
converted into TFRecords, which package the bounding box values, labels, and image
pixel values together. TFRecords are a file type read by the TensorFlow Python library,
which can unpack the underlying values during training.
The object detection model used to detect the “Healthy” and “Unhealthy” regions is the
Single Shot Detector. Using transfer learning, the MobileNet Single Shot Detector model
was trained on our dataset for more than 30,000 epochs and reached a testing accuracy
of approximately 91%. Once the model was trained, its weights were frozen and saved for
further use. The frozen model can be tested on new data, validated, and compared with
other models. If any changes are required, training continues from the last checkpoint
and the model is validated again.
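The conversion from Pascal VOC XML to CSV rows described above can be sketched with the Python standard library. The sample annotation, the file name leaf_001.jpg, and the column order are hypothetical; the TFRecord step is omitted because it depends on the TensorFlow version and feature schema in use.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical annotation in the layout that LabelImg writes for Pascal VOC.
VOC_XML = """<annotation>
  <filename>leaf_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>Healthy</name>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>210</xmax><ymax>300</ymax></bndbox>
  </object>
  <object>
    <name>Unhealthy</name>
    <bndbox><xmin>250</xmin><ymin>80</ymin><xmax>400</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_rows(xml_text):
    """Return one row per annotated object in a VOC XML document."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append([filename, width, height, obj.findtext("name"),
                     int(box.findtext("xmin")), int(box.findtext("ymin")),
                     int(box.findtext("xmax")), int(box.findtext("ymax"))])
    return rows


def rows_to_csv(rows):
    """Serialize the rows, with a header line, to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["filename", "width", "height", "class",
                     "xmin", "ymin", "xmax", "ymax"])
    writer.writerows(rows)
    return buf.getvalue()


rows = voc_to_rows(VOC_XML)
csv_text = rows_to_csv(rows)
```

In practice the same function would be applied to every XML file in the annotation folder and the rows concatenated into a single CSV file.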
4 Architectural Design
Five main steps are used to diagnose diseases in plants. The processing scheme consists
of image acquisition, image preprocessing such as image enhancement and noise
reduction, image annotation and segmentation in which the affected and useful areas
are segmented, feature extraction, and classification. In the end, the presence of
diseases on plants can be detected and recorded. RGB photos of leaf samples were
obtained in the initial stage (Fig. 2).
The Single Shot Detector consists of two parts: extracting feature maps and applying
convolution filters to detect objects. SSD uses VGG16 to extract feature maps and then
detects objects using the Conv4_3 layer. It uses multiple layers (multi-scale feature
maps) to detect objects independently. As the CNN gradually reduces the spatial
dimension, the resolution of the feature maps also decreases. SSD uses lower-resolution
layers to detect larger objects. Six additional auxiliary convolution layers are added
after VGG16, five of which are used for object detection. In three of these layers, six
predictions are made per location instead of four. Each added feature layer (or an
existing feature layer from the base network) produces a fixed set of detection
predictions using a set of convolutional filters [20] (Fig. 3).
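To make the multi-scale prediction scheme concrete, the sketch below counts the detection predictions produced per class by the six feature layers. The feature-map sizes and boxes per location are those of the VGG16-based SSD300 configuration reported in [20]; they are used here as an illustrative assumption, and the MobileNet-based model trained in this work may use different values.

```python
# (feature map side length, predictions per location) for SSD300 in [20].
# Three layers make six predictions per location; the other three make four.
FEATURE_MAPS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def predictions_per_layer(feature_maps):
    """Number of predictions contributed by each feature layer."""
    return [side * side * per_location for side, per_location in feature_maps]


def count_predictions(feature_maps):
    """Total number of predictions per class over all feature layers."""
    return sum(predictions_per_layer(feature_maps))


per_layer = predictions_per_layer(FEATURE_MAPS)
total = count_predictions(FEATURE_MAPS)
```

The first, highest-resolution layer contributes most of the predictions and targets the smallest objects, while the last 1 x 1 layer covers objects that fill most of the image.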
The Lecture Notes in Computer Science volumes are sent to ISI for inclusion in their
Science Citation Index Expanded (Table 1).
The software used for labeling the images in this project is LabelImg. LabelImg helped
us annotate images by drawing bounding boxes over the regions of interest and storing
them in Pascal VOC XML format. Figure 5 shows the procedure for labeling images with
the tool. It provides a convenient GUI for managing labels, finding errors, color
coding, etc. The color codes distinguish the different labels, and the right panel
shows the total annotations done. The left panel is used for I/O as well as for
configuring the XML files.
The model used in the proposed method is an object detection model, the Single Shot
Detector with MobileNet, configured to detect whether a cotton leaf in an image is
diseased or healthy. The Single Shot Detector model was trained for more than 30,000
epochs on Google’s Colaboratory server with an Nvidia Tesla K80 GPU, resulting in a
final loss close to 0.00.
All the images in Fig. 6 are the predicted results for the test images used for testing
the trained model. They show the percentage of the healthy and unhealthy portions of
the cotton plant leaves.
6 Conclusion
Identifying the diseased region with the help of machine learning was a big challenge.
Transfer learning is used to reuse the knowledge of a model previously trained on
millions of images. With transfer learning, an object detection model was developed
that returns the diseased region quickly and precisely. The results show that healthy
and unhealthy cotton plant leaves in the testing image dataset were identified
accurately.
References
1. Dixit, P., Lal, R.C.: A critical analysis of indian textile industry: an insight into inclusive growth
and social responsibility. Russ. J. Agric. Socio-Econ. Sci. 88(4) (2019)
2. Khan, A., Sohail, A., Zahoora, U., Qureshi, A.S.: A survey of the recent architectures of deep
convolutional neural networks. Artif. Intell. Rev. 53(8), 5455–5516 (2020)
3. Hughes, D., Salathé, M.: An open access repository of images on plant health to enable the
development of mobile disease diagnostics (2015). arXiv preprint arXiv:15110.080
4. Mohanty, S.P., Hughes, D., Salathé, M.: Using deep learning for image-based plant
disease detection. arXiv, 1604, 25 April 2016
5. Xin, M., Wang, Y.: Research on image classification model based on deep convolution neural
network. EURASIP J. Image Video Process. 2019(1), 40 (2019)
6. Ferentinos, K.P.: Deep learning models for plant disease detection and diagnosis. Comput.
Electron. Agric. 145, 311–318 (2018)
7. Alom, M.Z., Taha, T.M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M.S., Van Esesn, B.C.,
Awwal, A.A.S., Asari, V.K.: The history began from alexnet: a comprehensive survey on deep
learning approaches. arXiv preprint arXiv:1803.01164 (2018)
8. Jasitha, P., Dileep, M.R., Divya, M.: Venation based plant leaves classification using
GoogLeNet and VGG. In: 2019 4th International Conference on Recent Trends on Electronics,
Information, Communication & Technology (RTEICT), pp. 715–719. IEEE (2019)
9. Wulandhari, L.A., Gunawan, A.A.S., Qurania, A., Harsani, P., Tarawan, T.F., Hermawan, R.F.:
Plant nutrient deficiency detection using deep convolutional neural network. ICIC Express Lett.
13(10), 971–977 (2019)
10. Nguyen, G., Dlugolinsky, S., Bobák, M., Tran, V., García, A.L., Heredia, I., Malík: Machine
Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey.
Artif. Intell. Rev. 52(1), 77–124 (2019)
11. Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease
detection. Front. Plant Sci. 7, Article 1419 (2016)
12. Rahman, H., Jabbar Ch, N., Manzoor, S., Najeeb, F., Siddique, M.Y., Khan, R.A.: A comparative
analysis of machine learning approaches for plant disease identification. Adv. Life Sci. 4(4)
(2017)
13. Islam, M.A., Yousuf, M.S.I., Billah, M.M.: Automatic plant detection using HOG and LBP
features with SVM. Int. J. Comput. (IJC) 33(1), 26–38 (2019)
14. Cervantes, J., Garcia-Lamont, F., Rodríguez-Mazahua, L., Lopez, A.: A comprehensive
survey on support vector machine classification: applications, challenges and trends.
Neurocomput. 408, 189–215 (2020)
15. Walczak, S.: Artificial neural networks. In: Advanced Methodologies and Technologies in
Artificial Intelligence, Computer Simulation, and Human-Computer Interaction, pp. 40–53.
IGI Global
16. Herrera, V.M., Khoshgoftaar, T.M., Villanustre, F., Furht, B.: Random forest implementation
and optimization for Big Data analytics on LexisNexis’s high performance computing cluster
platform. J. Big Data 6(1), 68 (2019)
17. Vakili, M., Ghamsari, M., Rezaei, M.: Performance Analysis and Comparison of Machine
and Deep Learning Algorithms for IoT Data Classification (2020). arXiv preprint arXiv:2001.
09636
18. Lu, J., Hu, J., Zhao, G., Mei, F., Zhang, C.: An in-field automatic wheat disease diagnosis
system. arXiv:1710.08299v1, 26 Sep 2017
19. Chen, G., Zhang, X., Wang, Q., Dai, F., Gong, Y., Zhu, K.: Symmetrical dense-shortcut deep
fully convolutional networks for semantic segmentation of very high-resolution remote sensing
images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11(5), 1633–1644 (2018)
20. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: Single shot
multibox detector. In European Conference on Computer Vision, pp. 21–37. Springer, Cham
(2016)
Attendance System Using Face
Recognition Library
Abstract Among the several methods used for monitoring student attendance, facial
recognition is not yet widely adopted. Despite its numerous benefits, this emerging
image processing technology is not a prevailing part of regular attendance monitoring
systems. To eliminate manual data handling, an intelligent system is required that
detects a student’s face and verifies it against the database. This paper proposes a
system that uses TensorFlow for face identification and verification and displays
students’ attendance on a web-based/local GUI. The system is capable of generating
real-time output from a video feed obtained from the classroom. The output is labeled
with the name of the student as entered in the database. The system runs on the Google
Colab platform on Graphics Processing Units (GPUs). In its preliminary stage, a local
dataset of a student under diverse light conditions was used to study the behavior of
the face recognition algorithm under varying illumination. The results suggest that the
algorithm is effective under low light conditions as well. This paper primarily
illustrates the use of a facial recognition library for image processing, highlighting
Machine Learning applications in everyday circumstances.
Abbreviations
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 247
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_23
1 Introduction
performed in this field in Sect. 2, the proposed working system in Sect. 3 along
with an experiment, its results, and discussions in Sect. 4, and the conclusion of the
paper in Sect. 5. The study considers multiple factors that together justify the
selection of the components needed to design such a system. The areas of consideration
are the Machine Learning platform, Algorithm, Image Processing Method, Dataset, and
Camera.
2 Related Work
Compared with manual attendance tracking practices, face recognition is not widely
used. Machine Learning tools are proposed for improved results [3, 5, 6]. The overall
system enables the algorithm to make decisions with minimal human intervention. Image
processing techniques are used to enhance facial recognition and produce high accuracy.
Digital image processing comprises numerous methods to detect faces in an image, which
is composed of a finite number of elements referred to as picture elements (pixels).
Generally, image processing treats images as two-dimensional signals and applies
established signal processing methods to them [6]. HOG feature detection [7, 8], Haar
cascades using Eigenfaces [9], Local Binary Patterns [6, 10], Principal Component
Analysis [11], and Linear Discriminant Analysis are some approaches used to detect
faces in an image and extract the requisite features. In some cases, a combination of
techniques has also been used for enhanced results [11, 12].
A prototype system is proposed in [13] in which feature extraction is performed through
128-d facial embeddings from FaceNet, implemented with the OpenCV and dlib libraries.
Another system used the Viola–Jones algorithm and HOG features along with an SVM
classifier to perform a quantitative analysis of recorded images based on PSNR values
[8]. The computed results suggest that the higher the PSNR, the better the quality of
the compressed or reconstructed image.
The TensorFlow platform will be used in the proposed system because it gives the
best result and accuracy for large datasets [8, 9]. Several image processing methods
can be used for face detection and verification, but based on primary research, the
most suitable is HOG feature extraction. Along with flexible programming, Tensor-
Flow allows easy data representation in terms of dataflow graphs that can be used
for the computation of accurate results. The datasets to be used are created from
students’ images in varied poses, with accessories such as spectacles and masks,
and in varying light conditions.
In the HOG feature extraction method [7], the images are converted from RGB to
grayscale. The grayscale images are divided into cells, and gradients are computed
within each cell to capture the orientation of the edges, weighted by gradient
magnitude. The cells are grouped into overlapping spatial blocks, over which the
histograms are normalized to form the descriptor. The descriptor thus counts
occurrences of gradient orientations in localized parts of an image, and Support Vector
Machines (SVMs) are used to classify the resulting features. The images are used to
train and test such binary classifiers to determine a person’s attendance.
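The cell-level part of the HOG computation can be sketched in plain Python. This is a simplified illustration under stated assumptions: nine unsigned orientation bins, central-difference gradients on interior pixels only, and hard assignment of each pixel to a single bin. Library implementations such as the one in dlib interpolate votes between bins and normalize over blocks of several cells.

```python
import math


def cell_histogram(cell, n_bins=9):
    """Orientation histogram of one grayscale cell (a list of pixel rows).

    Orientations are unsigned (0 to 180 degrees) and each interior pixel
    votes with its gradient magnitude into a single bin.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


def l2_normalize(vector, eps=1e-6):
    """L2 normalization, as applied to the block descriptor."""
    norm = math.sqrt(sum(v * v for v in vector) + eps ** 2)
    return [v / norm for v in vector]


# Horizontal intensity ramp: every gradient points along x (0 degrees).
ramp = [[float(x) for x in range(8)] for _ in range(8)]
hist = cell_histogram(ramp)
descriptor = l2_normalize(hist)
```

For the ramp, all of the gradient energy falls into the first bin, which is what the descriptor of a cell crossed by a single vertical edge would look like. The normalized vectors of all blocks are concatenated and passed to the SVM.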
3 Proposed System
The proposed system incorporates a database of labeled images with students’ names
in a classroom environment, as shown in Fig. 1. This data is passed through a neural
network for training the system to identify facial features from the image. Later, the
camera input is provided to the face recognition algorithm in the form of a video
feed to obtain the total number of students and their name labels as real-time output.
Further, the attendance is registered on a web-based/local GUI, which presents it to
the administrator in tabular form. This system ensures the effortless storage of
attendance data in a classroom.
The working of the proposed design is as follows.
1. Training the Model: A database of labeled students’ images is created. Some of
the images are sent to the Face Recognition Model as training data. This data
must include images in varied poses and gestures.
2. Camera Input: After the model is trained, testing occurs on video data obtained
from the classroom.
3. Face Recognition Model: Support Vector Machines are trained on the images; they
analyze the data and detect the face by processing the oriented gradient
features. This block of the procedure determines students’ presence by
comparing the camera input with the accumulated database.
4. Real-Time Output: The output obtained labels the images and stores the record
simultaneously.
5. Attendance Database Management: The appearances of the students are
recorded in a tabular format. An SQLite database is used for database
management.
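Step 5 above can be sketched with Python’s built-in sqlite3 module. The table name,
column names, and helper functions below are hypothetical and chosen only for
illustration; the schema actually used in the experiment is not given in the text.

```python
import sqlite3

def open_attendance_db(path=":memory:"):
    """Open (or create) a database with one row per student per session."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS attendance (
               student_name TEXT NOT NULL,
               session_date TEXT NOT NULL,
               first_seen   TEXT NOT NULL,
               PRIMARY KEY (student_name, session_date)
           )"""
    )
    return conn

def mark_present(conn, student_name, session_date, seen_at):
    """Record a recognized student; repeated sightings in a session are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO attendance VALUES (?, ?, ?)",
        (student_name, session_date, seen_at),
    )
    conn.commit()

def present_students(conn, session_date):
    """Return the names recorded for a session, sorted alphabetically."""
    rows = conn.execute(
        "SELECT student_name FROM attendance "
        "WHERE session_date = ? ORDER BY student_name",
        (session_date,),
    )
    return [name for (name,) in rows]
```

Because a student is usually recognized in many consecutive video frames, the
composite primary key together with INSERT OR IGNORE keeps a single row per student
and session.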
The model verifies faces by comparing the test data against the original database
and displays the recorded attendance. HOG features computed on a dense overlapping
grid give very good results for person detection [4] and are therefore used in the
Face Recognition Model.
3.1 Experiment
5. Testing: The real-time images of persons verified against the stored database are
labeled with names after completing the procedure. A new image can be
distinguished by computing its face features. After the face embeddings are
computed, they are compared to identify the person in the image.
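The comparison of face embeddings in the testing step can be sketched as a
nearest-neighbour search. The function names are hypothetical, the toy vectors are
three-dimensional for readability, and the tolerance of 0.6 is an assumed value that
is commonly used with dlib face descriptors rather than a figure from this experiment.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(embedding, known, tolerance=0.6):
    """Return the enrolled name closest to `embedding`, or None if too far.

    `known` maps a student name to that student's reference embedding.
    """
    best_name, best_distance = None, float("inf")
    for name, reference in known.items():
        distance = euclidean(embedding, reference)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= tolerance else None

# Toy enrolment database (real embeddings would be much longer vectors).
known = {"student_a": [0.0, 0.0, 0.0], "student_b": [1.0, 0.0, 0.0]}
```

A face whose nearest enrolled embedding is farther away than the tolerance is
reported as unknown instead of being forced onto the closest name.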
HOG features are used because of advantages such as fine orientation binning,
fine-scale gradients, relatively coarse spatial binning, and high-quality local
contrast normalization, all of which are essential for good performance [4].
The experimental findings show that the HOG feature extraction method gives
satisfactory face identification and verification. The features and landmarks were
detected successfully. The hardware used in this experiment was of low resolution,
but the system still worked. The images tested were shot under high, medium, and low
light conditions. The dlib Face Recognition Library identified faces in all of these
light conditions, as shown in Fig. 2. The model’s working remains the same, but the
interpretation changes with the dataset and the image classification [9]. It is
inferred that using normalized HOG features yields good results for face
identification. The presented framework functions under low light conditions and
produces real-time output.
From this experiment, we also conclude that the use of GPUs has reduced the scope
of some common error factors, such as false acceptance of data, low lighting, and
occlusion, which tend to reduce the system’s reliability. Using video as an input
source has been advantageous because a video occupies less storage than the same
number of frames stored as separate images. This paper indicates the relevance of
advanced concepts like Machine Learning in everyday situations to improve
supervision and overcome the drawbacks of traditional methods.
The experiment has been carried out through the TensorFlow platform because it
holds the following advantages (Table 1).
The complete study of the proposed system puts forth a framework that provides an
accessible platform for attendance monitoring with real-time output. It is suggested
that this structured system be implemented on TensorFlow using datasets of students
in diverse poses and gestures. For larger datasets, the Viola–Jones algorithm can be
used in parallel with GPU processing of the data on TensorFlow. A high-quality
laptop camera has been used for experimentation, which can be used to execute the
proposed system. Building on the output of this experimentation, the performance of
the proposed methodology can be improved under unfavorable light conditions, and the
accuracy of the system can thus be enhanced.
Fig. 2 The output of face recognition library for high, medium, and low light conditions
Cloud and the Internet of Things (IoT) systems have become very popular due to their
location-independent services, seamless connectivity and scalability, and portability
with significantly less energy consumption [16–18].
5 Conclusion
Under unfavorable light conditions, this algorithm has delivered accurate output.
Therefore, attendance tracking during classroom sessions can be effectively executed
through this system. As discussed, the use of a CNN extends the working environment
to ensure attendance monitoring in low light conditions. Face recognition technology
appears to be highly effective in everyday situations, and it removes subsidiary
changes in the domain that affect the system’s reliability.
This experimentation will be expanded in the Machine Learning arena in the future
with the introduction of novel algorithms that can increase the system’s learning rates
and efficiency. With the TensorFlow platform, an advanced system will be designed
that can handle larger datasets. Furthermore, the system’s data can be made available
through a live feed, thus increasing real-time output. A database management system
can be designed to maintain attendance logs, which can be accessed when desired.
This experiment studies the effect of illumination using photo and video feeds.
Further research can be conducted on pose and gesture alterations and angle
variations. With these advancements, an innovative system can be developed that can
be used effortlessly in everyday situations.
254 B. Patel et al.
References
1. Varadharajan, E., Dharani, R., Jeevitha, S., Kavinmathi, B., Hemalatha, S.: Automatic
attendance management system using face detection. In: Coimbatore, 2016 Online
International Conference on Green Engineering and Technologies (IC-GET)
2. Hoo, S., Ibrahim, H.: Biometric-based attendance tracking system for education sectors: a
literature survey on hardware requirements. J. Sens. (2019)
3. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multi-task
cascaded convolutional networks. IEEE Signal Process. Lett.
4. Kazemi, V., Sullivan, J.: One-millisecond face alignment with an ensemble of regression
trees. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
pp. 1867–1874 (2014)
5. Khan, S., Akram, A., Usman, N.: Real-Time Automatic Attendance System for Face
Recognition Using Face API and OpenCV 2020. Springer Science+Business Media, LLC,
part of Springer Nature 2020
6. Chintalapati, S., Raghunadh, M.V.: Automated attendance management system based on face
recognition algorithms. In: 2013 IEEE International Conference on Computational Intelligence
and Computing Research
7. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: 2005 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’05), vol.
1, pp. 886–893. IEEE (2005, June)
8. Rathod, H., Ware, Y., Sane, S., Raulo, S., Pakhare, V., Rizvi, I.A.: Automated attendance system
using machine learning approach. In: Navi Mumbai, 2017 International Conference on Nascent
Technologies in Engineering (ICNTE)
9. Sripathi, V., Savakhande, N., Pote, K., Shinde, P., Mahajan, J.: Face recognition based
attendance system. 2020 Int. Res. J. Eng. Technol. (IRJET)
10. Salim, O.A.R., Olanrewaju, R.F., Balogun, W.A.: Class attendance management system using
face recognition. In: Kuala Lumpur, 2018 7th International Conference on Computer and
Communication Engineering (ICCCE)
11. Patil, M.N., Iyer, B., Arya, R.: Performance evaluation of PCA and ICA algorithm for facial
expression recognition application. In: Pant, M., Deep, K., Bansal, J., Nagar, A., Das, K.
(eds.) Proceedings of Fifth International Conference on Soft Computing for Problem Solving.
Advances in Intelligent Systems and Computing, vol. 436, pp. 965–976. Springer, Singapore
(2016). https://doi.org/10.1007/978-981-10-0448-3_81
12. Borkar, N.R., Kuwelkar, S.: Real-time implementation of the face recognition system. In: Erode,
2017 International Conference on Computing Methodologies and Communication (ICCMC)
13. Handaga, B., Murtiyasa, B., Wantoro, J.: Attendance system based on deep learning face
recognition without queue. In: Semarang, Indonesia, 2019 Fourth International Conference on
Informatics and Computing (ICIC)
14. Apoorva, P., Impana, H.C., Siri, S.L., Varshitha, M.R., Ramesh, B.: Automated criminal
identification by face recognition using open computer vision classifiers. In: Erode,
India, 2019 3rd International Conference on Computing Methodologies and
Communication (ICCMC)
15. Khan, S., Akram, A., Usman, N.: Real time automatic attendance system for face recognition
using face API and OpenCV. Wireless Pers. Commun. 113, 469–480 (2020)
16. Deshpande, P., Iyer, B.: Research directions in the Internet of Every Things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, 2017, pp 1353–1357. https://doi.org/10.1109/CCAA.2017.8230008
17. Iyer, B., Patil, N.: IoT enabled tracking and monitoring sensor for military applications. Int. J.
Syst. Assur. Eng. Manag. 9, 1294–1301 (2018). https://doi.org/10.1007/s13198-018-0727-8
18. Deshpande, P.: Cloud of everything (CLeT): the next-generation computing paradigm. In:
Advances in Intelligent Systems and Computing, vol. 1025, pp. 207–214, Springer, Singapore
(2020). https://doi.org/10.1007/978-981-32-9515-5_20
Studies on Performance of Image
Splicing Techniques Using Learned
Self-Consistency
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 255
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_24
256 B. K. Priya et al.
been used to produce realistic combined images, substitute significant image regions,
and generate unreal images.
Most of the existing techniques have used standard supervised learning approaches
for image detection problems. Although these approaches are efficient for many types
of detection problems, they are not well suited to image splice detection. The domain
of manipulated images is so broad that it is nearly impossible to collect enough
manipulated training data for a supervised method to work reliably. Detecting visual
manipulation can instead be pictured as an anomaly detection problem, that is, we
intend to mark something that seems different and “out of normal”. For the image
splice detection problem, no suitable solution has been found in the literature
[4–7]. Hence, a method is proposed that does not need any manipulated training data
and can function in a self-supervised setting. The proposed method uses the image
EXIF metadata as a comprehensive source of information. EXIF metadata tags are
specifications of the camera that are digitally inscribed in the image during capture
and are pervasively available. In Fig. 1, the image might appear original at first
glance, but detailed observation shows that the person on the left (Modi) has been
spliced into the image. The content of the spliced region comes from a separate
image, shown next to it. This type of manipulation is known as a spliced image, and
it is a common way of producing fake images. Given access to the source images, their
respective EXIF metadata would reveal numerous differences between the imaging
pipelines. Hence, the model uses the automatically recorded Exchangeable Image File
Format (EXIF) metadata as the supervisory signal for training the model to verify
whether an image is self-consistent, i.e., whether its content has been generated by
a single imaging pipeline.
Figure 1 shows the anatomy of a spliced image. A typical way of producing forged
images is to splice content from two separate authentic source images. The proposed
model builds on the observation that the patches of a spliced image are generated
with different EXIF metadata attributes, that is, by different imaging pipelines.
When an image is tested for authenticity on the model, however, the EXIF metadata
values of the two source images are not available.
2 Related Work
Salloum et al. [8] proposed a technique that learns to detect spliced regions by
training a fully convolutional network on labeled training data. Such techniques
have mainly addressed the problem of identifying specific manipulation cues, like
double JPEG compression and contrast amplification. Mayer et al. [9] introduced a
model that uses a Siamese network to establish whether pairs of image patches come
from the same camera model, which is a specific case of our EXIF data consistency
model, though the results are very preliminary. Related methods also estimate whether
a photo’s semantic content matches its metadata. Agarwal and Farid [10] proposed a
methodology that exploits subtle differences between imaging pipelines to detect
image splices, in particular the way in which different cameras truncate numbers
during JPEG quantization. These approaches are helpful because they are easy to
interpret. Bondi et al. [11] define an algorithm for tampering identification and
localization that uses the characteristic footprints left on images by different
camera models. In a pristine image, all pixels should be detected as having been shot
with a single device. Conversely, traces of multiple devices are often seen if an
image is obtained through image composition. Their algorithm uses a CNN to extract
camera model characteristics from image patches. The features are then evaluated
using iterative clustering to verify whether a picture has been forged and to
localize the tampered region. Doersch et al. [12] introduce a model that is trained
to predict the relative positions of pairs of patches from a picture. It was found to
pick up very subtle artifacts, such as chromatic lens aberration, which act as noise
in that setting. A similar idea is used in our algorithm to interpret the imaging
pipeline’s properties while passing over semantics.
3 Proposed Architecture
Figure 2 shows the flow diagram for splice detection and localization in a spliced
image. Two random patches are taken from an image, and the model predicts whether
they have consistent metadata. Each metadata attribute provides a consistency metric
at training and test time.
A Siamese network is used for training the proposed model. Within the Siamese
network, ResNet50 is used as the subnetwork that extracts features from the image
patches; ResNet50 has a depth of 50 layers and an image input size of 224 × 224. In
ResNet50, the image patches are passed through several convolution layers and other
operations such as batch normalization. ReLU is applied to the resulting block to
obtain the feature vectors of both image patches. The feature vectors are passed
through a three-layer MLP (Multilayer Perceptron), which responds to different kinds
of object and edge cues, and then through the sigmoid function. The sigmoid output
estimates whether the various features of the two input patches are the same, and
from these estimates a similarity score matrix is generated that shows whether the
two patches have the same or different EXIF attributes. The proposed model is trained
as a Siamese neural network with the ResNet50 subnetwork, as shown in Fig. 3.
Given two image patches of size 128 × 128, the model predicts the probability that
the two patches have the same value for each EXIF metadata attribute. Flickr images
are used, and only the EXIF attributes that occur most often (in more than 50,000
images) are considered. For each EXIF attribute, values that do not occur more than
100 times are discarded. During training, the Siamese network uses ResNet50 as a
subnetwork that produces a 4096-dimensional feature vector for each patch. The two
feature vectors are concatenated and passed through a three-layer MLP with 4096,
2048, and 1024 units, followed by a final output layer that generates the similarity
score. In this way, for a particular EXIF attribute, the model predicts whether two
patches of an image have the same value or not.
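The training targets implied by this description can be sketched as one binary label
per EXIF attribute for every pair of patches. The function name, the example
attribute names, and the choice to treat a missing attribute as “not the same” are
assumptions made for illustration only.

```python
def exif_consistency_labels(exif_a, exif_b, attributes):
    """Return one 0/1 label per attribute: 1 if both images share its value.

    `exif_a` and `exif_b` are dictionaries of EXIF attributes for the two
    source images of a patch pair. An attribute missing from either image
    is labeled 0 in this sketch.
    """
    labels = []
    for attribute in attributes:
        same = (
            attribute in exif_a
            and attribute in exif_b
            and exif_a[attribute] == exif_b[attribute]
        )
        labels.append(1 if same else 0)
    return labels

# Hypothetical metadata for two photographs.
photo_1 = {"Make": "CameraA", "FocalLength": "35.0", "Flash": "Off"}
photo_2 = {"Make": "CameraA", "FocalLength": "50.0"}
labels = exif_consistency_labels(photo_1, photo_2, ["Make", "FocalLength", "Flash"])
```

Two patches cropped from the same photograph share every attribute, so their label
vector consists entirely of ones.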
This technique faces two main challenges: some EXIF attributes are too challenging to
learn because they are rare, and randomly selected pairs seldom have the same values
for the EXIF attributes unless they are from the same image.
These issues can be overcome by unary and pairwise re-balancing. During unary
re-balancing, the rare EXIF attributes are merged and a minibatch is constructed for
each attribute. During pairwise re-balancing, the minibatch is completed so that
fifty percent of the pairs in the batch have the same value for the attribute and the
remaining pairs do not.
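Pairwise re-balancing can be sketched as follows. The function name, the data layout,
and the strict alternation between matching and non-matching pairs are assumptions
chosen to keep the example short; they are not details taken from the experiment.

```python
import random

def balanced_pairs(images, attribute, batch_size, rng):
    """Build a minibatch in which half of the pairs share `attribute`.

    `images` is a list of (image_id, exif_dict) tuples. Each returned item is
    (image_id_a, image_id_b, label), where label 1 means the same value.
    """
    by_value = {}
    for image_id, exif in images:
        if attribute in exif:
            by_value.setdefault(exif[attribute], []).append(image_id)
    values = list(by_value)
    if len(values) < 2:
        raise ValueError("need at least two distinct values to build pairs")

    batch = []
    for index in range(batch_size):
        if index % 2 == 0:
            # Matching pair: both images drawn from the same value group.
            value = rng.choice(values)
            batch.append((rng.choice(by_value[value]), rng.choice(by_value[value]), 1))
        else:
            # Non-matching pair: images drawn from two different value groups.
            value_a, value_b = rng.sample(values, 2)
            batch.append((rng.choice(by_value[value_a]), rng.choice(by_value[value_b]), 0))
    rng.shuffle(batch)
    return batch

# Hypothetical image collection with a single EXIF attribute.
images = [
    ("img1", {"Make": "CameraA"}), ("img2", {"Make": "CameraA"}),
    ("img3", {"Make": "CameraB"}), ("img4", {"Make": "CameraC"}),
]
batch = balanced_pairs(images, "Make", batch_size=8, rng=random.Random(0))
```

Passing an explicit random.Random instance keeps the sampling reproducible, which
helps when comparing training runs.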
Three operations are performed in this phase: random image resizing, re-reading the
image, and Gaussian blurring. For each operation, the parameters are chosen randomly
from a discrete set of numbers. These three operations are introduced so that the
model also checks whether two image patches have the same parameters for a particular
augmentation operation. The post-processing makes it possible to assess the image’s
consistency even when the manipulated region has the same metadata attributes as the
image into which it was inserted. Adding these three operations to the 80 EXIF
attributes gives 83 (80 + 3) binary attributes. The next step is to combine the
consistency predictions into the overall consistency of the image, as shown in
Fig. 4.
Self-supervised training of the images: At training time, the model predicts whether
two random image patches have consistent metadata or not.
The next step is the aggregation process, in which the pairwise consistency
probabilities (obtained in the third phase) are combined into a global
self-consistency score for the entire image. For a particular image, patches are
sampled in a grid. This striding results in a maximum of 625 patches (for the standard
4:3 aspect ratio, we sample 25 × 18 = 450 patches). For a given patch, a response map
can be pictured that records its consistency with every other patch in the image.
Overlapping patch predictions are averaged to increase the spatial resolution of each
response map. If the image is manipulated, most of the patches from the non-spliced
area of the image will ideally have a low consistency with the patches from the
spliced portion. To generate a final response map for an input image, the most
consistent mode among all patch response maps is found with the mean shift technique.
The resulting map naturally segments the image into two parts, a consistent region
and an inconsistent region. This aggregated response map is our consistency map.
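The patch grid and a heavily simplified aggregation can be sketched as follows. This
is a hedged illustration: the function names and the floor rounding of the grid are
assumptions, and plain averaging of each patch’s consistency with all other patches
is used here in place of the mean shift step purely to keep the example short.

```python
def grid_shape(height, width, longest=25):
    """Rows and columns of the patch grid when the longer side gets `longest` patches."""
    if width >= height:
        return (longest * height) // width, longest
    return longest, (longest * width) // height

def mean_consistency(consistency):
    """Average consistency of every patch with all other patches.

    `consistency[i][j]` is the predicted probability that patches i and j
    come from the same imaging pipeline; row i is the response map of patch i.
    """
    count = len(consistency)
    return [
        sum(row[j] for j in range(count) if j != i) / (count - 1)
        for i, row in enumerate(consistency)
    ]

# Toy example: patches 0 to 2 agree with each other, patch 3 is the outlier.
consistency = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
scores = mean_consistency(consistency)
```

In this toy case the lowest score marks the patch that disagrees with the rest of the
image, which is the role the inconsistent region plays in the consistency map.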
4.1 Dataset
Three different datasets have been used for testing and evaluating:
• Columbia dataset: This consists of 180 relatively simple images.
• Realistic Tampering (RT) dataset: This contains 220 images with a combination
of splicing and post-processing operations. Some other manipulations such as
copy-move are also included.
• In-the-wild dataset: It contains 201 images, which are extracted from THE ONION,
a news website, and REDDIT PHOTOSHOP BATTLES. As ground truth labels for
internet splices are not available, to get the approximate ground truth, annotation
of images is done manually.
Figure 5 shows some spliced images and the EXIF consistency outputs produced by
our model. Figure 6 shows authentic images and the EXIF consistency outputs
obtained from the proposed model. The proposed model correctly indicates that
these images are consistent, as they are original, authentic images.
5 Conclusion
Images play a significant role in how facts are interpreted and in the number of
fraud incidents happening in society. To identify image splices, a learning-based
algorithm has been proposed that recognizes many visually spliced images and
manipulations in the image. The proposed model shows a clear performance improvement,
and hence it can be integrated into various sectors to curb the spread of fake content
and avoid fraud incidents. The model also localizes the spliced region with an
accuracy of 86%, a precision of 87%, and a specificity of 97%. With additional
copy-move forgery detection, it can detect even sophisticated manipulations in visual
fakes.
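The three figures quoted in the conclusion are standard confusion-matrix ratios. The
sketch below shows how they are computed; the function name and the example counts
are illustrative assumptions and are not the counts behind the reported results.

```python
def localization_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and specificity from confusion-matrix counts.

    tp/fp: spliced pixels (or images) predicted correctly / incorrectly;
    tn/fn: authentic ones predicted correctly / spliced ones that were missed.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }

# Illustrative counts only.
metrics = localization_metrics(tp=8, fp=2, tn=18, fn=2)
```

Specificity is high whenever few authentic regions are flagged as spliced, so it can
exceed accuracy and precision by a wide margin when most of an image is untouched.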
References
1. Zhou, P., Han, X., Morariu, V.I., Davis, L.S.: Two-stream neural networks for tampered face
detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops
(CVPRW), pp. 1831–1839. IEEE (2017). https://doi.org/10.1109/CVPRW.2017.229
2. Ghosh, P., Morariu, V., Larry Davis B.-C.I.S.: Detection of metadata tampering through
discrepancy between image content and metadata using multi-task deep learning. In:
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops, pp. 60–68 (2017). https://doi.org/10.1109/CVPRW.2017.234
3. Kniaz, V.V., Knyaz, V., Remondino, F.: The point where reality meets fantasy: Mixed adversarial
generators for image splice detection. In: Advances in Neural Information Processing Systems,
pp. 215–226 (2019)
4. de Sa, V.R.: Learning classification with unlabeled data. In: Advances in Neural Information
Processing Systems, pp. 112–119 (1994). https://doi.org/10.5555/2987189.2987204
5. Ferrara, P., Bianchi, T., De Rosa, A., Piva, A.: Image forgery localization via fine-grained
analysis of CFA artifacts. IEEE Trans. Inf. Forensics Secur. 7(5), 1566–1577 (2012).
https://doi.org/10.1109/TIFS.2012.2202227
6. Ye, S., Sun Q., Chang, E.-C.: Detecting digital image forgeries by measuring inconsistencies
of blocking artifact. In: 2007 IEEE International Conference on Multimedia and Expo, pp.
12–15. IEEE (2007). https://doi.org/10.1109/ICME.2007.4284574
7. Cun, X., Pun, C.-M.: Image splicing localization via semi-global network and fully connected
conditional random fields. In: Proceedings of the European Conference on Computer Vision
(ECCV) (2018)
8. Salloum, R., Ren, Y., Kuo, C.-C.J.: Image splicing localization using a multi-task
fully convolutional network (MFCN). J. Vis. Commun. Image Represent. 51, 201–209 (2018).
https://doi.org/10.1016/j.jvcir.2018.01.010
9. Mayer, O., Stamm, M.C.: Learned forensic source similarity for unknown camera models. In:
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
pp. 2012–2016. IEEE (2018). https://doi.org/10.1109/ICASSP.2018.8462585
10. Agarwal, S., Farid, H.: Photo forensics from JPEG dimples. In: 2017 IEEE Workshop on
Information Forensics and Security (WIFS), pp. 1–6. IEEE (2017).
https://doi.org/10.1109/WIFS.2017.8267641
11. Bondi, L., Lameri, S., Güera, D., Bestagini, P., Delp, E.J., Tubaro, S.: Tampering detection
and localization through clustering of camera-based CNN features. In: 2017 IEEE Conference
on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1855–1864. IEEE
(2017). https://doi.org/10.1109/CVPRW.2017.232
12. Doersch, C., Gupta, A., Efros, A.A.: Unsupervised visual representation learning by context
prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp.
1422–1430 (2015). https://doi.org/10.1109/ICCV.2015.167
Random Forest and Gabor Filter Bank
Based Segmentation Approach for Infant
Brain MRI
Abstract A precise study of infant brain development during the first year of infancy
is crucial in the initial research of rapid neurological growth. Non-invasive
neuroimaging techniques, such as MRI, are essential for connecting the brain to
behavioral changes in neonates and infants. Magnetic resonance (MR) images of the
developing brain offer a description of the developmental process following
gestation. It is challenging to define normality, since the appearance of the normal
brain varies nearly every week. In addition, infant MR images generally show
decreased tissue contrast compared with adult images. Therefore, the current
computing techniques, which are generally designed for adult brains, are not
appropriate for processing neonatal MR images. A few analytical tools for
neuroimaging of the infant brain have been suggested to overcome these problems.
This article presents state-of-the-art empirical approaches for MRI diagnosis and the
study of infant brains that help us comprehend neonatal neurodevelopment. We employ
the BM3D image denoising method in the preprocessing stage. A hybrid combination of
32 Gabor filter banks and Canny edge, Sobel, Prewitt, Scharr, Gaussian (with σ = 3
and 7), Median (with σ = 3 and 7), and Roberts operators is used for effective feature
extraction. Lastly, the Random Forest classifier is utilized for tissue segmentation
and classification.
Keywords Random forest · Infant brain MRI · Gabor filter · BM3D · Neonatal
phase · Segmentation
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 265
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_25
266 V. R. Patil and T. H. Jaware
in the neonatal phase to examine brain development and growth. Brain segmentation
is a fundamental requirement for evaluating the tissue structure of the brain
through computational MRI. However, manual analysis of MR images takes a very long
time and is a tedious task. Moreover, manual marking is vulnerable to intra- and
inter-rater heterogeneity, which decreases its efficacy. These shortcomings of the
manual method make it challenging to process the more extensive samples of subjects
needed for large population assessment. Thus, specific automated approaches are
required to delineate the brain tissues. Automatic segmentation of newborn and infant
brains is much more challenging than that of the adult human brain. Infant brain
MRI has a relatively low contrast-to-noise ratio (CNR) and signal-to-noise ratio
(SNR) because of the small head size. In addition, the images are often affected by
significant motion artifacts during acquisition. The primary objective of this
research is to build an accurate, automated infant brain MRI segmentation framework.
Challenges
Automatic infant brain MRI segmentation remains difficult despite progress in the
acquisition of MR images. MR images have critical limitations that adversely affect
segmentation regardless of how well developed the conceptual framework is. The
intensity of a given tissue type is non-uniform and varies gradually across the
input image. The RF coils and magnetic interference produce intensity inhomogeneity
in the input image. A higher applied magnetic field strength of the scanner
contributes to more substantial variations in intensity. The mix of various tissue
types in a single voxel, known as the partial volume (PV) effect, presents inherent
problems for the accurate description of tissue borders [3]. Because the resulting
image has low resolution, voxels that contain several tissues show a mixed tissue
intensity. Besides, flickering is always visible in the image, mainly due to electric
noise inside the body and subtle irregularities in the receiving devices [4, 5].
Automated tissue segmentation of infant brain MRI therefore becomes much harder than
that of adult brain MRI.
The paper is structured as follows: Sect. 2 focuses on related work, Sect. 3
illustrates the proposed hybrid segmentation method, Sect. 4 presents simulated
findings and discussion, and finally the conclusion of the research work is given in
Sect. 5.
2 Related Work
In this section, a brief overview of segmentation methods utilized for infant brain
MRI is presented. The methods are classified into brain and non-brain extraction,
tissue segmentation, and more detailed structure delineation, as depicted in Fig. 1.
Research Gaps
An infant’s brain development in the first two years is complex and highly relevant
to the advancement of neural skills and lifelong habits and to the likelihood of
neurodevelopmental disease. It is therefore essential to recognize and measure the
normal developmental brain patterns as early and as specifically as possible, so that
deviating progression, which points to developmental issues and disturbance, can be
identified [7]. The prevailing methodological constraints on the comprehensive and
accurate measurement of the structural and functional infant brain by MRI are partly
responsible for this critical research gap [8]. No adequate framework is currently
available for accurate and automatic (and hence time-saving) segmentation of early
brain MRIs, although such a method is necessary for any comprehensive study of early
brain MRIs. Unless infant brain segmentation is precise and automatic, the MRI study
is labor-intensive, which reduces sample sizes, durability, and robustness. The
initiatives towards early detection of defects in growth or developmental
disabilities, and towards tracking the impact of interventions, would substantially
raise the demands on the segmentation of first-year infant brains [9].
3 Proposed Method
The T1- and T2-weighted brain MRI images of infants are used as the input images.
The images used in this work are taken from a standard database of infant brain
MRI [10]. The input images are acquired using an MRI scanner with different
scanning times, protocols, and sequences. The input images have low contrast and are
affected by noise due to the RF coils and magnets. For accurate segmentation of brain
tissues, it is necessary to preprocess these images and improve their quality
for further analysis. In healthcare, particularly in medical image analysis, edges are
of great importance. Preservation of edges and improvement of image quality are
significant concerns for accurate diagnosis.
BM3D is a denoising approach that relies on the assumption that an image has a
locally sparse representation in the transform domain. This sparsity is enhanced by
grouping similar 2D image patches into 3D groups [11]. BM3D is an advanced technique.
Because block matching is more precise around strong edges, the denoising results
there tend to be better than in smoother or weaker-edge areas. Image denoising can be
improved further by using adjustable block sizes in different image areas. The BM3D
grouping and filtering process is known as collaborative filtering. It is performed
in four different phases [12].
• Find the image patches that are similar to a given image patch and organize them
in a 3D block
• Perform a 3D linear transformation on the block
• Shrink the transform coefficients
• Perform the inverse 3D linear transformation.
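The last three phases listed above (transform, shrink, inverse transform) can be
sketched on a group of already matched patches. This is a strongly simplified
illustration and not BM3D itself: the block-matching phase is omitted, a 1-D Haar
transform across the group stands in for the full 3D transform, and the function
names and the threshold value are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar(values):
    """Orthonormal 1-D Haar transform; the length must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        tmp = out[:n]
        for i in range(half):
            out[i] = (tmp[2 * i] + tmp[2 * i + 1]) / SQRT2
            out[half + i] = (tmp[2 * i] - tmp[2 * i + 1]) / SQRT2
        n = half
    return out

def inverse_haar(coeffs):
    """Inverse of `haar`."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        tmp = out[:2 * n]
        for i in range(n):
            out[2 * i] = (tmp[i] + tmp[n + i]) / SQRT2
            out[2 * i + 1] = (tmp[i] - tmp[n + i]) / SQRT2
        n *= 2
    return out

def collaborative_shrink(group, threshold):
    """Hard-threshold the transform coefficients taken across similar patches.

    `group` is a list of similar patches, each flattened to a list of equal
    length; the number of patches must be a power of two.
    """
    denoised = [list(patch) for patch in group]
    for k in range(len(group[0])):
        coeffs = haar([patch[k] for patch in group])
        coeffs = [c if abs(c) >= threshold else 0.0 for c in coeffs]
        for patch, value in zip(denoised, inverse_haar(coeffs)):
            patch[k] = value
    return denoised

# Four similar two-pixel patches; the first pixel carries small noise.
group = [[10.0, 5.0], [10.2, 5.0], [9.8, 5.0], [10.0, 5.0]]
smoothed = collaborative_shrink(group, threshold=0.5)
```

Because the patches in a group are similar, most of their energy ends up in a few
large coefficients, and the small coefficients removed by the threshold mostly
represent noise.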
In medical image analysis, features play a crucial role. Different image
preprocessing operations are performed on the input brain MR image before the
features are obtained [13, 14]. Afterward, feature extraction methods are used to
acquire the features that are essential for medical image segmentation. Here, 32
Gabor filter banks and Canny edge, Sobel, Scharr, Gaussian (with σ = 3 and 7), Median
(with σ = 3 and 7), and Roberts operators are used for effective feature detection and
edge preservation. The expression used for the Gabor filter bank is given as follows.
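$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \exp\left(i\left(2\pi \frac{x'}{\lambda} + \psi\right)\right) \tag{1}$$
where $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$.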
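The Gabor expression can be evaluated directly, assuming the standard complex form
with a Gaussian envelope, a complex sinusoidal carrier, and coordinates rotated by θ.
The function names are hypothetical, and the split of the 32 filters into 8
orientations and 4 wavelengths is an assumption for illustration, since the text does
not state how the bank is parameterized.

```python
import cmath
import math

def gabor(x, y, lam, theta, psi, sigma, gamma):
    """Complex Gabor response at (x, y) for wavelength lam and orientation theta."""
    x_rot = x * math.cos(theta) + y * math.sin(theta)
    y_rot = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(x_rot ** 2 + gamma ** 2 * y_rot ** 2) / (2.0 * sigma ** 2))
    carrier = cmath.exp(1j * (2.0 * math.pi * x_rot / lam + psi))
    return envelope * carrier

def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Square kernel of odd `size`, sampled on integer offsets around the centre."""
    half = size // 2
    return [
        [gabor(x, y, lam, theta, psi, sigma, gamma) for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]

def gabor_bank_parameters(orientations=8, wavelengths=(2.0, 4.0, 8.0, 16.0)):
    """Hypothetical 8 x 4 grid of (theta, lam) settings giving 32 filters."""
    return [
        {"theta": k * math.pi / orientations, "lam": lam}
        for k in range(orientations)
        for lam in wavelengths
    ]

kernel = gabor_kernel(5, lam=4.0, theta=0.0, psi=0.0, sigma=2.0, gamma=0.5)
bank = gabor_bank_parameters()
```

The real part of each kernel responds to symmetric structures such as ridges and the
imaginary part to antisymmetric structures such as step edges, so both parts can
serve as per-pixel features.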
Amid the significant increase in medical research, the hybrid combination of all the
above methods is the novelty of this research work. Optimum features are selected to
improve segmentation accuracy.
Random Forest and Gabor Filter Bank … 269
Recently, random forests (RFs) have attracted growing interest in infant brain MR image analysis [15–17]. RFs have proved to be accurate and stable for several brain tissue segmentation challenges and can handle a significant volume of high-dimensional multiclass data. The proposed segmentation framework is depicted in Fig. 2, whereas Fig. 3 represents the random forest approach for brain MR image segmentation.
Figure 4 represents the simulation results of the proposed method. To evaluate each segmentation approach, the Dice similarity coefficient (DSC) and accuracy are computed by comparing the results with manual segmentation. Higher DSC and accuracy values indicate superior performance. With manual interpretations taken as the 'gold standard,' the performance of the segmentation approaches is investigated systematically. Performance is usually estimated with overlap measures or object displacement measures. The Dice similarity coefficient is the most common overlap measure used for infant brain MRI, and it is given by the following expression
$$\mathrm{DSC} = \frac{2\,|X \cap Y|}{|X| + |Y|} \quad (2)$$
270 V. R. Patil and T. H. Jaware
where X represents the segmentation obtained using the proposed algorithm and Y represents the manual segmentation. The measure takes a value of 1 when the two segmentations agree completely and 0 when there is no overlap.
Table 1 reports the values of the performance metrics, which include DSC and accuracy. Table 2 depicts the performance comparison with existing methods.
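The DSC of Eq. (2) can be computed per tissue label as in the minimal Python sketch below. It assumes that both segmentations are given as flat, equal-length sequences of integer labels; the function name and the convention of returning 1.0 when the label is absent from both segmentations are illustrative choices, not taken from the chapter.

```python
def dice_coefficient(pred, truth, label):
    """Dice similarity coefficient of one tissue label, following Eq. (2)."""
    if len(pred) != len(truth):
        raise ValueError("segmentations must have the same number of voxels")
    size_x = sum(1 for p in pred if p == label)    # |X|
    size_y = sum(1 for t in truth if t == label)   # |Y|
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if size_x + size_y == 0:
        return 1.0  # label absent in both: treated as full agreement (a convention)
    return 2.0 * overlap / (size_x + size_y)
```

For example, with `pred = [1, 1, 0, 2, 2, 0]` and `truth = [1, 0, 0, 2, 2, 2]`, label 2 has |X| = 2, |Y| = 3, and an overlap of 2, giving a DSC of 0.8.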
5 Conclusions
Infant brain MRI is becoming more attractive as higher quality images are obtained and as newborn growth increasingly comes into focus. In this research paper, we proposed a system for infant brain MRI segmentation. Initially, segmentation challenges and research gaps are discussed, followed by a short overview of the preprocessing methods used to minimize image artifacts related to movement, noise, and volume. Following BM3D image denoising and hybrid feature extraction, the Random Forest classifier is employed for accurate brain tissue segmentation. The recent iSeg2017 segmentation challenge database is used for the segmentation of neonatal brain tissues. The method achieves a higher DSC than the existing methods it is compared with.
Table 2 Performance comparison
Method for segmentation     DSC
Proposed                    0.902
Moeskops et al. [18]        0.857
Choi et al. [19]            0.820
References
1. Hack, M., Fanaroff, A.A.: Outcomes of children of extremely low birth weight and gestational
age in the 1990s. Seminars Neonatol. 5(2), 89–106 (2000)
2. Marlow, N., Wolke, D., Bracewell, M.A., Samara, M.: Neurologic and developmental disability
at six years of age after extremely preterm birth. N. Engl. J. Med. 352(1), 9–19 (2005)
3. Makropoulos, A., Gousias, I., Ledig, C., Aljabar, P., Serag, A., Hajnal, J., Edwards, A., Counsell,
S., Rueckert, D.: Automatic whole brain MRI segmentation of the developing neonatal brain.
IEEE Trans. Med. Imaging 33(9), 1818–1831 (2014)
4. Belaroussi, B., Milles, J., Carme, S., Zhu, Y.M., Benoit-Cattin, H.: Intensity nonuniformity
correction in MRI: existing methods and their validation. Med. Image Anal. 10(2), 234–246
(2006)
5. Tofts, P.: Quantitative MRI of the Brain: Measuring Changes Caused by Disease. Wiley (2003)
6. Weishaupt, D., Froehlich, J.M., Nanz, D., Kochli, V.D., Pruessmann, K.P., Marincek, B.:
How Does MRI Work?: An Introduction to the Physics and Function of Magnetic Resonance
Imaging. Springer (2008)
7. Xue, H., Srinivasan, L., Jiang, S., Rutherford, M., Edwards, A.D., Rueckert, D., Hajnal, J.V.:
Automatic segmentation and reconstruction of the cortex from neonatal MRI. NeuroImage
38(3), 461–477 (2007)
8. Rutherford, MA: MRI of the Neonatal Brain. W.B. Saunders (2002)
9. Prastawa, M., Gilmore, J.H., Lin, W., Gerig, G.: Automatic segmentation of MR images of the
developing newborn brain. Med. Image Anal. 9(5), 457–466 (2005)
10. Makropoulos, A., Counsell, S.J., Rueckert, D.: A review on automatic fetal and neonatal brain
MRI segmentation. Neuroimage 170, 231–248 (2018)
11. Gilmore, J.H.: Understanding what causes schizophrenia: a developmental perspective. Am. J.
Psychiatry 167(1), 8–10 (2010)
12. Wang, Y., Haghpanah, F., Aw, N., Laine, A., Posner, J.: A transfer-learning approach for first-year developmental infant brain segmentation using deep neural networks (2020). https://doi.org/10.1101/2020.05.22.110619
13. Wang, L., et al.: Benchmark on automatic six-month-old infant brain segmentation algorithms:
the iSeg-2017 challenge. IEEE Trans. Med. Imaging 38(9), 2219–2230 (2019)
14. Smith, S.M.: Fast robust automated brain extraction. Hum. Brain Mapp. 17(3), 143–155 (2002)
15. Shattuck, D.W., Sandor-Leahy, S.R., Schaper, K.A., Rottenberg, D.A., Leahy, R.M.: Magnetic
resonance image tissue classification using a partial volume model. Neuroimage 13(5), 856–876
(2001)
16. Magar, V.M., Christy, T.B.: Gabor filter based classification of mammography images using
LS-SVM and random forest classifier. In: 2nd International Conference on Recent Trends in
Image Processing and Pattern Recognition, pp. 69–83. Springer, India (2018)
17. Mahsa, D.D., Louis, C.: BISON: brain tissue segmentation pipeline using T1-weighted
magnetic resonance images and a random forest classifier. Magn. Reson. Imaging 85(4),
1881–1894 (2021)
18. Moeskops, P., Viergever, M.A., Benders, M.J., Isgum, I.: Evaluation of an automatic brain
segmentation method developed for neonates on adult MR brain images. In: Proceedings of
SPIE Medical Imaging, vol. 9413 (2015)
19. Choi, U.S., Kawaguchi, H., Matsuoka, Y., Kober, T., Kida, I.: Brain tissue segmentation based
on MP2RAGE multi-contrast images in 7 T MRI. PLoS ONE 14(2) (2019)
Sensory-Motor Cortex Signal
Classification for Rehabilitation Using
EEG Signal
1 Introduction
resources and cannot control their limbs or any body parts. A BCI empowers these subjects to control a PC with their imagination for communication, mobility, and other purposes [2]. In practice, movement-related signals are mostly used for BCI applications since they are well defined and unbiased with respect to the subject [3]. Movement-based BCIs enable the subject to indicate a decision by performing, or just thinking of, a few predefined movements. For practical BCI use, noninvasive methods that convert the rhythmic signals related to human sensorimotor processes have been studied intensively. In this field, researchers have mostly concentrated on power changes in the alpha (α) and beta (β) bands [4]. The benefit of the BCI procedure is the feasibility of predicting imagined or actual movement and finding which EEG signals would be best for controlling such an instrument.
In the standard BCI model, Motor Execution (ME) or Motor Imagery (MI) is a distinct mental state. During such a state, the subject mentally rehearses a sensory-motor activity with or without motor execution. Hence, most of the time, motor imagery is used to convey the disabled person's intention [5]. For signals related to a particular movement, a decrease or an increase of the power in a specific band is known as Event-Related Desynchronization (ERD) or Event-Related Synchronization (ERS), respectively [6]. Actual movement of the upper limbs induces changes in the EEG signals over the sensory-motor area in specific frequency ranges, for example, the beta band and the alpha band.
To distinguish this task-induced EEG activity, many feature extraction methods have been used to improve MI and ME BCIs. For instance, Power Spectral Density (PSD) or AR modeling algorithms have been utilized to describe ERD and ERS [7, 8]. More recently, spatial filters such as CSP have been shown to increase the contrast between the features of actual movements, which makes these classes quicker to classify [9]. After feature extraction, it is important to classify the signal. Many recent machine learning algorithms, for example, Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM), have been utilized to identify the target of ME and to interpret the motor execution from the output [10].
Figure 1 gives the block diagram of the proposed system for classifying the sensory-motor signal while the subject performs the suggested task. The paper is arranged as follows. Section 2 gives a detailed explanation of the materials and methods used in this work. Section 3 gives the results obtained during experimentation. Finally, Sect. 4 concludes the work presented in the paper.
The participants were instructed to sit in a chair, and their right hand was fixed in an ARMEO Spring rehabilitation device (Hocoma, Switzerland). The ARMEO Spring is an exoskeleton that supports the subject's arm against gravity to prevent muscle fatigue.
Motor activity is best reflected over the central, parietal, and frontal areas, most dominantly over the central region. EEG data were therefore recorded from 20 dry scalp electrodes (CZ, P4, F8, Pz, P3, P8, O1, O2, T8, P7, C4, F4, Fp2, Fz, C3, F3, Fp1, T7, F7, Oz, i.e., EXT) attached to an EEG cap according to the internationally accepted 10–20 system, as shown in Fig. 2a (top view) and b (side view).
Figure 3 shows the experimental setup for the designed protocol, in which the subject plays the High Flyer game while brain signals are simultaneously recorded by the Enobio-20 device designed by Neuro-Electrics. The advantage of the Enobio device is that it works in both wireless and wired modes, and it has a sampling rate of 500 Hz.
The experimental setup and the experimental paradigm are shown in Figs. 3 and 4, respectively. A rest period of between 0 and 4 s is given at the start and at the end of each run. A 1-s gap for mental preparation precedes the actual start of a run. Each trial ended 1 s after the success cue. Every run was followed by a 5-s break. Each run consisted of 30 trials (15 trials for each target, randomly distributed). Six runs were recorded, i.e., three for the left hand and three for the right hand. In addition, four different weight conditions were applied to the forearms, labeled A, B, C, and D in increasing order of weight. Two trials of right-hand movement were recorded with load A and load B. Similarly, two trials of left-hand movement were recorded with load C and load D. Hence, we obtain data for actual movements of both the right and the left hand.
276 V. Kulkarni et al.
The EEG signal is very noisy, as it contains artifacts caused by muscle movements of different body parts, 50 Hz line noise, etc. Hence, preprocessing of the recorded raw EEG data is an important step. To remove the noise present in the raw EEG signal, a bandpass filter from EEGLAB is used with the following specifications. The FIR filter has a transition width of 0.5 Hz, passband edges between range Hz, and cutoff frequencies of [7.75 30.25] Hz. Independent Component Analysis (ICA) is then used to identify the independent components (ICs) that carry noise in the recorded data. Clean EEG data were generated by removing such ICs. For example, a sample EEG channel before and after artifact removal is shown in Fig. 5.
The ARMA model is the combination of the Autoregressive (AR) model and the Moving Average (MA) model. ARMA modeling is among the most prominent parametric techniques [11]. It predicts the current output from a linear combination of previous EEG samples plus an independent error term. This yields clean EEG data. The forward prediction of the filtered EEG data was carried out using the following relation

$$p[k] = \sum_{n=1}^{m} a_n\, p[k-n] + q[k] \quad (1)$$

where q[k] is the prediction error (the new information contained in the present EEG sample), $a_n$ are the model parameters, m is the model order, and p[k] is the time series fed to the model.
For the evaluation, the recorded EEG data from 20 electrodes are used. Each elbow movement trial lasts 1 s, and the filtered EEG data are sampled at a rate of 500 Hz.
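The AR relation in Eq. (1) can be illustrated with a short Python sketch that forms one-step predictions from the previous m samples and returns the prediction error q[k]. The coefficients are assumed to be already known here; how they are estimated (for example by least squares or the Yule-Walker equations) is not specified in the text, and the function names are hypothetical.

```python
def ar_predict(samples, coeffs):
    """One-step predictions sum_{n=1..m} a_n * p[k-n] for k = m .. len(samples)-1."""
    m = len(coeffs)
    return [
        sum(coeffs[n - 1] * samples[k - n] for n in range(1, m + 1))
        for k in range(m, len(samples))
    ]


def prediction_errors(samples, coeffs):
    """Prediction error q[k] = p[k] - sum_n a_n * p[k-n], as in Eq. (1)."""
    m = len(coeffs)
    predictions = ar_predict(samples, coeffs)
    return [samples[k] - predictions[k - m] for k in range(m, len(samples))]
```

With a 1-s trial sampled at 500 Hz, `samples` would hold 500 values per electrode, and the fitted coefficients (or the error energy) could serve as features for that electrode.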
The ARMA(p, q) model is

$$y_t = -\sum_{c=1}^{p} a_t^{(c)}\, y_{t-c} + \sum_{d=1}^{q} b_t^{(d)}\, e_{t-d} + e_t \quad (2)$$
$$\mathrm{ERD/ERS} = \frac{P_j - P_R}{P_R} \quad (3)$$
Figure 6 shows the ERS and ERD patterns due to left movement at the C3 and C4 electrodes.
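Equation (3) can be sketched in Python as follows. The symbols are not defined in the surviving text, so the sketch assumes the common reading in which $P_j$ is the band power in the activity interval and $P_R$ is the band power in a reference (rest) interval, with power taken as the mean squared amplitude of band-filtered samples; both helper names are hypothetical.

```python
def band_power(samples):
    """Mean squared amplitude of band-filtered EEG samples (assumed power estimate)."""
    return sum(s * s for s in samples) / len(samples)


def erd_ers(activity, reference):
    """Relative power change (P_j - P_R) / P_R of Eq. (3).

    Negative values indicate a power decrease (ERD),
    positive values indicate a power increase (ERS).
    """
    p_j = band_power(activity)
    p_r = band_power(reference)
    return (p_j - p_r) / p_r
```

A value of -0.75, for instance, would mean that the band power during movement dropped to a quarter of its resting level.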
Common spatial pattern (CSP) [13] finds spatial filters that maximize the variance of one class while minimizing that of the other. Suppose the matrix $X \in \mathbb{R}^{E \times P}$ holds one trial of filtered EEG data recorded during an actual movement, where E is the number of electrodes and P is the number of data points in a single run. If there are $N_j$ trials in the training set $\Omega_j$ for class j, the mean covariance matrix $C_j$ for that class is calculated from the trial covariance matrices M as

$$C_j = \frac{1}{N_j} \sum_{M \in \Omega_j} M \quad (4)$$

The spatial filter $w \in \mathbb{R}^{E \times 1}$ that increases the variance of class 1 while decreasing the variance of class 2 is found by solving

$$\max_{w} J(w) = \frac{w^{T} C_1 w}{w^{T} (C_1 + C_2)\, w} \quad (5)$$

The solution of Eq. (5) is obtained from the generalized eigenvalue problem $C_1 w = \lambda (C_1 + C_2) w$, where w is an eigenvector and λ is a generalized eigenvalue. This gives E eigenvectors $w_k$ and eigenvalues $\lambda_k$, k = 1, …, E. Hence, each designed spatial filter gives the output

$$y_k = w_k^{T} X \quad (6)$$

The vectors $w_k$ are the CSPs, which serve as the basis for time-invariant feature vectors. The features used for classification are obtained by filtering the EEG with Eq. (6).
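To make Eqs. (4)–(6) concrete, the sketch below solves the CSP problem for the simplest possible case of two channels in plain Python, using the closed-form eigen-decomposition of a 2 × 2 matrix. This is a toy illustration under stated assumptions: the trial covariance is taken as $XX^{T}/P$, only two electrodes are handled, and all function names are hypothetical. A real 20-electrode analysis would use a numerical generalized eigenvalue solver instead.

```python
import math


def trial_covariance(x):
    """Covariance-like matrix X X^T / P for one trial given as two channel lists."""
    p = len(x[0])
    return [[sum(a * b for a, b in zip(x[i], x[j])) / p for j in range(2)]
            for i in range(2)]


def mean_covariance(trials):
    """Eq. (4): average of the trial covariance matrices of one class."""
    covs = [trial_covariance(x) for x in trials]
    return [[sum(c[i][j] for c in covs) / len(covs) for j in range(2)]
            for i in range(2)]


def csp_filters_2ch(c1, c2):
    """Solve C1 w = lambda (C1 + C2) w; returns [(lambda, w), ...], largest lambda first."""
    s = [[c1[i][j] + c2[i][j] for j in range(2)] for i in range(2)]
    det_s = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det_s == 0:
        raise ValueError("C1 + C2 must be invertible")
    inv = [[s[1][1] / det_s, -s[0][1] / det_s],
           [-s[1][0] / det_s, s[0][0] / det_s]]
    # A = (C1 + C2)^-1 C1 has the same eigenpairs as the generalized problem.
    a = [[sum(inv[i][k] * c1[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = a[0][0] + a[1][1]
    det_a = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = math.sqrt(max(trace * trace - 4.0 * det_a, 0.0))
    pairs = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        if abs(a[0][1]) > 1e-12:
            w = [a[0][1], lam - a[0][0]]
        elif abs(a[1][0]) > 1e-12:
            w = [lam - a[1][1], a[1][0]]
        else:  # A is diagonal: pick the axis whose diagonal entry matches lam
            w = [1.0, 0.0] if abs(a[0][0] - lam) <= abs(a[1][1] - lam) else [0.0, 1.0]
        norm = math.hypot(w[0], w[1])
        pairs.append((lam, [w[0] / norm, w[1] / norm]))
    return pairs


def apply_filter(w, x):
    """Eq. (6): y = w^T X for one two-channel trial."""
    return [w[0] * a + w[1] * b for a, b in zip(x[0], x[1])]
```

The eigenvalue λ of each filter equals the value of J(w) in Eq. (5), so the filter with the largest λ emphasizes class 1 and the one with the smallest λ emphasizes class 2. The variance (or log-variance) of the filtered output is a typical choice of feature.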
In this work, one experiment was conducted three times at different speeds (slow, medium, and fast), and every classification algorithm was tested on it. Every subject performed six trials of each elbow movement at each speed level. The results obtained with the different approaches are compared in the following section.
To test the classification results on features extracted from the filtered data, SVM and KNN classifiers with different kernels from Matlab 2018b are applied. After applying all classifiers to the filtered EEG data, Q-SVM [14] was found to be the best classifier for this EEG-based BCI data. In the holdout validation method, 30% of the total data is used for testing, and the remaining 70% is kept to train the classifier. Each time the classifier is run, the 30% testing data is selected randomly.
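The 70/30 holdout scheme described above can be sketched in Python as follows. The authors used Matlab 2018b, so this is only an equivalent illustration; the function name, the `seed` parameter, and the list-based data layout are assumptions made for the example.

```python
import random


def holdout_split(features, labels, test_fraction=0.3, seed=None):
    """Randomly hold out a fraction of the samples for testing.

    Returns ((train_x, train_y), (test_x, test_y)). Leaving seed as None
    gives a different random test set on every call.
    """
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([features[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([features[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test
```

Because the test set changes on every run, reported accuracies are usually averaged over several repetitions of the split.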
This section describes the quality of the collected EEG dataset, and the performance of the different feature extraction methods is summarized in Table 1. This comparison also shows the importance of the frequency bands while actual movements are performed, as observed at the sensory-motor cortex area. Among the three methodologies (ARMA, CSP, and ERD/ERS), ERD/ERS gives the best results, with an average classification accuracy of 97.26 ± 0.3%. From Table 1, it can be seen that with ERD/ERS features, hand movements are better captured in the alpha band (8–12 Hz) than in the beta band (13–30 Hz), whereas ARMA and CSP score higher in the beta band. The performance of every subject is studied in three different experiments, as shown in Table 1. The bold values in Table 1 show the best classification results among all the trials. The classification accuracy for subject 3 is the best among all trials from all subjects.
The average classification accuracy of the proposed methodology is compared with state-of-the-art performances, as shown in Table 2. Nicolas-Alonso et al. [15] focused on the classifier and used Regularized Linear Discriminant Analysis (RLDA). The authors obtained a lower accuracy (75%) because only bandpass filtering was performed in the preprocessing stage. To overcome this, ICA is included in the preprocessing stage of the present work. Ghaemi and Rashedi [16] reported 76.02% by implementing Blind Source Separation (BSS) to remove artifacts, which is not very effective if the data is too noisy. ICA cleans the EEG data irrespective of the noise level of the collected EEG data. Tang et al. [17] achieved a classification accuracy of 87.37% using ERD/ERS as features and LDA as the classifier. In 2019, Sun et al. [18] worked on variations based on segmented bispectrum (VBSB) as a feature and achieved an average classification accuracy (CA) of 93.10%. VBSB works on an averaging technique.
Table 1 Comparison of feature extraction methods within different frequency bands using Q-SVM classifier (classification accuracy in %)
                     Alpha                                      Beta
Subject  Exp.        ARMA          CSP           ERD/ERS        ARMA          CSP           ERD/ERS
SUB 1    Exp1        90.9          82.9          96.1           94.2          89.7          96.5
         Exp2        93.2          84.0          97.6           95.4          91.6          96.2
         Exp3        91.5          82.9          97.1           94.4          89.1          96.6
SUB 2    Exp1        92.9          86.8          97.0           94.2          89.7          96.3
         Exp2        88.9          80.0          95.8           92.1          87.5          96.2
         Exp3        94.6          87.7          97.0           96.2          92.9          97.2
SUB 3    Exp1        95.4          88.3          98.7           96.6          94.1          96.6
         Exp2        93.4          83.8          97.4           95.6          91.7          97.7
         Exp3        94.1          85.1          98.6           96.6          93.4          97.2
Mean ± SD            92.77 ± 0.6   84.61 ± 0.8   97.26 ± 0.3    95.03 ± 0.4   91.08 ± 0.6   96.72 ± 0.1
This paper aimed to develop a robust and generalized algorithm to classify left- and right-hand movement without subject-specific models or tuning. It implements three different approaches to achieve higher accuracy than other methodologies.
4 Conclusions
This paper proposes different ways to deal with non-stationary EEG time series data. The work used recorded and filtered EEG to classify the movements of the upper limb. The separability of the EEG features of the complete BCI data was assessed after the recording. Raw EEG data were collected using a purpose-designed, efficient protocol. The results shown above clearly indicate that the ERD/ERS algorithm performs much better than the other techniques, CSP and ARMA. The ERD/ERS marking shows contralateral as well as ipsilateral reflection at the sensory-motor cortex area when the volunteer is performing actual movement tasks. The effect of hand movements is better captured in the alpha or mu frequency bands. The proposed framework could contribute to the advancement of real-world BCI technology and to assistance for healthy and motor-disabled persons in everyday life.
References
1. Clerc, M., Bougrain, L., Lotte, F.: Brain-Computer Interfaces, vol. 1. Wiley-ISTE (2016)
2. Fok, S., Schwartz, R., et al.: An EEG-based brain computer interface for rehabilitation and
restoration. In: 2011 Annual International Conference of the IEEE Engineering in Medicine
and Biology Society, pp. 6277–6280. IEEE (2011)
3. Vidaurre, C., Ramos-Murguialday, A., Haufe, S., Gómez-Fernández, M., Müller, K.-R.,
Nikulin, V.V.: Enhancing sensorimotor BCI performance with assistive afferent activity: an
online evaluation. NeuroImage (2019)
4. Pfurtscheller, G., Brunner, C., Schlögl, A., Da Silva, F.H.L.: Mu rhythm (de) synchronization
and EEG single-trial classification of different motor imagery tasks. NeuroImage 31(1), 153–
159 (2006)
5. Liang, S., Choi, K.S.: Improving the discrimination of hand motor imagery via virtual reality
based visual guidance. Comput. Methods Programs Biomed. 132, 63–74 (2016)
6. Pfurtscheller, G.: Induced oscillations in the alpha band: functional meaning. Epilepsia 44, 2–8
(2003)
7. Penny, W.D., Roberts, S.J., Curran, E.A., Stokes, M.J.: EEG-based communication: a pattern
recognition approach. IEEE Trans. Rehabil. Eng. 8(2), 214–215 (2000)
8. Shibata, E., Kaneko, F.: Event-related desynchronization possibly discriminates the kinesthetic
illusion induced by visual stimulation from movement observation. Exp. Brain Res. 237(12),
3233–3240 (2019)
9. Song, X., Yoon, S.-C.: Improving brain-computer interface classification using adaptive com-
mon spatial patterns. Comput. Biol. Med. 61, 150–160 (2015)
10. Guger, C., Edlinger, G., Harkam, W., Niedermayer, I., Pfurtscheller, G.: How many people
are able to operate an EEG-based brain-computer interface (BCI). IEEE Trans. Neural Syst.
Rehabil. Eng. 11(2), 145–147 (2003)
11. Tseng, S.-Y., Chen, R.-C., Chong, F.-C., Kuo, T.-S.: Evaluation of parametric methods in EEG
signal analysis. Med. Eng. Phys. 17(1), 71–78 (1995)
12. Nagai, H., Tanaka, T.: Action observation of own hand movement enhances event-related
desynchronization. IEEE Trans. Neural Syst. Rehabil. Eng. (2019)
13. Feng, J.K., Jin, J.: An optimized channel selection method based on multi frequency CSP-rank
for motor imagery-based BCI system. Comput. Intell. Neurosci. 2019 (2019)
14. Hortal, E., Planelles, D.: SVM-based brain-machine interface for controlling a robot arm
through four mental tasks. Neurocomputing 151, 116–121 (2015)
15. Nicolas-Alonso, L.F.: Adaptive stacked generalization for multiclass motor imagery-based
brain computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 23(4), 702–712 (2015)
16. Ghaemi, A., Rashedi, E.: Automatic channel selection in EEG signals for classification of left
or right hand movement in brain computer interfaces using improved binary gravitation search
algorithm. Biomed. Signal Process. Control. 33, 109–118 (2017)
17. Tang, Z., Sun, S., Zhang, S., Chen, Y., Li, C., Chen, S.: A brain-machine interface based on
ERD/ERS for an upper-limb exoskeleton control. Sensors 16(12), 2050 (2016)
18. Sun, L., Feng, Z., Lu, N., Wang, B., Zhang, W.: An advanced bispectrum features for EEG-based
motor imagery classification. Expert Syst. Appl. 131, 9–19 (2019)
D-CNN and Image Processing Based
Approach for Diabetic Retinopathy
Classification
Abstract People with diabetes risk developing an eye disease called diabetic retinopathy. It happens when high blood glucose levels cause damage to blood vessels within the retina. These blood vessels may swell, leak, or close, stopping blood from passing through. Sometimes new blood vessels may grow on the retina. All of these changes can cause loss of vision. Generally, skilled professionals must diagnose and detect this disease using images of the patient's retina. But due to recent developments and improvements in deep learning, this task can be done very efficiently and easily using advanced deep learning techniques. We have implemented multiple state-of-the-art DNN architectures such as InceptionV3, VGG net, and ResNet with transfer learning. We have used Gaussian blur with some filters to preprocess the images and found that it gives better results. This also helped to remove unwanted noise from the images. In this work, a dataset containing images of five different D.R. classes (No D.R., Mild, Moderate, Proliferate D.R., Severe) is used. After training multiple models, InceptionV3 had the best result, with an accuracy of 81.2% on training data and 79.4% on testing data, so we chose it.
1 Introduction
Diabetic retinopathy usually has no early warning signs. It can cause rapid vision loss. In general, however, a person with this disease is likely to have blurred vision, making it very difficult to do things like read or drive. In some cases, the vision can get better or worse throughout the day. It is the leading cause of blindness in adults and the most common cause of blindness among people with diabetes. It can arise due to the high blood sugar levels that diabetes causes. Having too much sugar in the blood can damage blood vessels throughout the body, including the retina. If sugar blocks the tiny blood vessels connected to the retina, it can cause them to leak or bleed. As a result, the eye may grow new blood vessels that are much weaker and leak or bleed more easily than normal ones. If the eye starts to grow new blood vessels, this is known as proliferative D.R., which experts consider a more advanced stage. The first stage is non-proliferative diabetic retinopathy. The eye may accumulate fluid during long periods of high blood sugar. This fluid accumulation changes the shape and curve of the lens, causing vision changes. Due to this, the person may become blind.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 283
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_27
284 A. Khan et al.
More than 20% of people with diabetes suffer from diabetic retinopathy. India is considered the diabetes capital of the world, and it is projected that India will have 69.9 million cases of diabetes by 2025 and 80 million by 2030, which is around a 266% increase. Diabetes is now the fifth major cause of blindness around the world. It is the primary cause of blindness among diabetic patients worldwide [1].
This work aims to make it easier for professionals and patients to check the severity of diabetic retinopathy and obtain a second opinion. To achieve this, we have built a D-CNN model that can predict the level of D.R. from an image of the patient's retina. The rest of this paper is organized as follows: Sect. 2 discusses the method used and the proposed system. Experimental results are discussed and analyzed, and future work is explained, in Sect. 3. Then, in Sect. 4, the proposed work is concluded (Fig. 1 and Table 1).
Fig. 1 A comparison is shown between the vision of an average person and a person with diabetic
retinopathy (Ref. [2])
2 Proposed Method
2.1 System
For diabetic retinopathy, the manual detection system is preferred (Fig. 2). Due to
the recent development in CNN, its efficiency and accuracy have been significantly
The dataset used for this model (Ref. [12]) has 3700 retinal images labeled in 5 different classes, namely 1-No D.R., 2-Mild, 3-Moderate, 4-Proliferate D.R., and 5-Severe D.R. The 5 images below represent each class of D.R. (Fig. 4).
As severity increases and D.R. goes to level 4, white spots are formed on the surface of the retina. The objective of our model is to detect these spots and classify the image into the different classes of D.R.
For preprocessing of the images, Gaussian blur with some filters is used to remove the unwanted noise from the image, which makes the images easier to classify and our model more accurate. Cropping is also used to align the retina to the center of the image so that the essential features of the images are in the same spot.
Fig. 4 Retinal images of different stages or levels of DR (Ref. [13]). a Level 0: no D.R., b level 1: mild, c level 2: moderate, d level 3: severe, e level 4: proliferate D.R.
Fig. 5 Effect of applying Gaussian blur on the retinal image with some filters
Gaussian blur is the result of blurring an image by convolving it with a Gaussian function. The effect of applying Gaussian blur to the retinal image can be seen in Fig. 5. As is visible there, the image features become much clearer for feature extraction. It is a low-pass filter that removes the high-frequency components of an image, and the formula for Gaussian blur in 2-D is

$$G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-(x^{2}+y^{2})/(2\sigma^{2})} \quad (1)$$
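A minimal Python sketch of Eq. (1) is given below: it samples G(x, y) on a small grid, normalizes the weights so that they sum to 1, and convolves the kernel with a grayscale image stored as a list of rows. The kernel radius, the edge-replication border handling, and the function names are assumptions made for illustration; the chapter does not state which implementation or parameters were used.

```python
import math


def gaussian_kernel(radius, sigma):
    """Sample G(x, y) of Eq. (1) on a (2*radius+1)^2 grid, normalized to sum to 1."""
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
         for x in range(-radius, radius + 1)]
        for y in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def gaussian_blur(image, radius, sigma):
    """Blur a grayscale image (list of rows), replicating edge pixels at the border."""
    kernel = gaussian_kernel(radius, sigma)
    height, width = len(image), len(image[0])
    blurred = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    rr = min(max(r + dy, 0), height - 1)
                    cc = min(max(c + dx, 0), width - 1)
                    acc += kernel[dy + radius][dx + radius] * image[rr][cc]
            row.append(acc)
        blurred.append(row)
    return blurred
```

Because the kernel is normalized, flat regions keep their intensity while isolated bright or dark pixels (high-frequency noise) are spread out and attenuated.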
The first step in our model is image preprocessing for the training and test images. The preprocessed images are then fed into the InceptionV3 model. With the help of transfer learning, the Inception model parameters are trained, and the features from InceptionV3's last layer are fed to a fully connected dense layer followed by a dropout layer to reduce any overfitting that may occur. Finally, the SoftMax layer takes the previous layer's output and converts it to a vector of class probabilities that can then be interpreted for the prediction. A diagrammatic representation of our model's architecture is shown in Fig. 6.
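The final SoftMax step can be sketched in plain Python as follows. The scores passed in stand for the outputs of the last dense layer, and the class names follow the levels listed in the caption of Fig. 4; the function names and the example scores are hypothetical and only illustrate how the probability vector is turned into a predicted D.R. level.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1 (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores, class_names):
    """Return the most probable class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


DR_LEVELS = ["No D.R.", "Mild", "Moderate", "Severe", "Proliferate D.R."]
```

Calling `predict_class([1.0, 2.0, 5.0, 0.5, 0.1], DR_LEVELS)` would select "Moderate", since the third score is the largest.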
2.5 InceptionV3
InceptionV3 is a state-of-the-art convolutional neural network that is 48 layers deep. It consists of multiple inception blocks stacked on top of each other. With the help of transfer learning, weights trained on more than 1 million images from 1000 different categories can be loaded. This architecture has 11 inception blocks, 5 convolutional layers, two max-pooling layers, one average pooling layer, and one fully connected layer. The idea behind transfer learning is to reuse the weights of a network that has previously been trained on millions of images and may therefore have captured features that our model could not learn on its own (Fig. 7).
A performance metric measures how well our model performs its intended task. We have used accuracy as our performance metric, which calculates the percentage of examples classified correctly.

$$\mathrm{Accuracy} = \frac{1}{m} \sum_{i=1}^{m} X_i \quad (2)$$

where $X_i = 1$ if and only if the predicted label $\hat{y}^{(i)}$ is equal to the true label $y^{(i)}$; otherwise $X_i = 0$, and m is the number of examples. The training accuracy of the system is 81.2%, and the test accuracy is 79.4%.
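Equation (2) amounts to counting the matching labels and dividing by the number of examples, as in this short Python sketch (the function name and the integer label encoding are illustrative assumptions):

```python
def accuracy(predicted, actual):
    """Fraction of examples whose predicted label equals the true label, as in Eq. (2)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for p, t in zip(predicted, actual) if p == t)  # sum of X_i
    return correct / len(actual)
```

For example, if four out of five retinal images are assigned the correct D.R. level, the accuracy is 0.8, i.e., 80%.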
Table 2 compares our system with some existing systems. It can be seen that [8] achieved 94.25% accuracy, which is considerably higher than the proposed approach. The reason for our system's lower accuracy may be that the dataset was smaller. Other than that, our method performed reasonably well compared to the other approaches listed in Table 2. As future work, the performance can be improved by training the model with more data and advanced algorithms. An application can be built using this model so that any professional can predict D.R. with ease. In the future, the medical data collected may be of enormous volume and turn into a big data management problem. Prescriptive and predictive analytics can be the best candidates for dealing with massive data management [15] and will help generate user-specific information (Fig. 8).
4 Conclusion
Multiple state-of-the-art DNN architectures such as InceptionV3, VGG net, and ResNet are implemented with transfer learning in this paper. Gaussian blur is used for preprocessing, which helps improve the performance of the system. The proposed system reached 81.2% accuracy on training data and 79.4% on test data. The system is compared with other state-of-the-art systems, as shown in Table 1. It can be seen that the proposed system performed well, but compared to [8], our system achieved lower accuracy. The reason could be the size of the dataset and the image size. As future work, the performance can be improved by training the model with more data and advanced algorithms.
References
1. Pandey, S., Sharma, V.: World diabetes day 2018: Battling the emerging epidemic of diabetic
retinopathy. Ind. J. Ophthalmol. 66(11), 1652 (2018)
2. topconhealth.com: Diabetic retinopathy: an eye disease with 4 stages. https://www.topconhealth.com/diabetic-retinopathy-an-eye-disease-with-4-stages/
3. Dutta, S., Manideep, B.C., Basha, S.M., Caytiles, R.D., Iyengar, N.C.: Classification of diabetic
retinopathy images by using deep learning models. Int. J. Grid Distrib. Comput. 11(1), 99–106
(2018). https://doi.org/10.14257/ijgdc.2018.11.1.09
4. Arcadu, F., Benmansour, F., Maunz, A., Willis, J., Haskova, Z., Prunotto, M.: Deep learning
algorithm predicts diabetic retinopathy progression in individual patients. Npj Digit. Med. 2(1)
(2019). https://doi.org/10.1038/s41746-019-0172-3
5. Chakrabarty, N.: A deep learning method for the detection of diabetic retinopathy. In: 2018 5th
IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer
Engineering (UPCON) (2018). https://doi.org/10.1109/upcon.2018.8596839
6. Rajalakshmi, R., Subashini, R., Anjana, R.M., Mohan, V.: Automated diabetic retinopathy
detection in smartphone-based fundus photography using artificial intelligence. Eye 32(6),
1138–1144 (2018). https://doi.org/10.1038/s41433-018-0064-9
7. Verma, K., Deep, P., Ramakrishnan, A.G.: Detection and classification of diabetic retinopathy
using retinal images. In: 2011 Annual IEEE India Conference (2011). https://doi.org/10.1109/indcon.2011.6139346
8. Li, F., Liu, Z., Chen, H., Jiang, M., Zhang, X., Wu, Z.: Automatic detection of diabetic
retinopathy in retinal fundus photographs based on deep learning algorithm. Transl. Vis. Sci.
Technol. 8(6), 4 (2019). https://doi.org/10.1167/tvst.8.6.4
9. Athira, T.R., Sivadas, A., George, A., Paul, A., Gopan, N.R.: Automatic detection of diabetic
retinopathy using R-CNN. Int. Res. J. Eng. Technol. (IRJET) 5595–5600
10. Jiang, H., Yang, K., Gao, M., Zhang, D., Ma, H., Qian, W.: An interpretable ensemble deep
learning model for diabetic retinopathy disease classification. In: International Conference of
IEEE Engineering in Medicine and Biology Society (EMBC) (2019)
11. Bajaj, R., Kulkarni, N., Garg, S.: Diabetic retinopathy stage classification. SSRN Electron. J.
(2020). https://doi.org/10.2139/ssrn.3645460
12. Diabetic retinopathy resized from: https://www.kaggle.com/sohaibanwaar1203/diabetic-retino
pathy-full
13. Wang, X., Lu, Y., Wang, Y., Chen, W.: Diabetic retinopathy stage classification using convo-
lutional neural networks. In: 2018 IEEE International Conference on Information Reuse and
Integration (IRI) (2018). https://doi.org/10.1109/iri.2018.00074
14. Milton-Barker.: Inception-v3-deep-convolutional-architecture-for-classifying-acute-
myeloidlymphoblastic. (2019)
15. Deshpande, P.S., Sharma, S.C., Peddoju, S.K.: Predictive and prescriptive analytics in big-data
era. In: Security and Data Storage Aspect in Cloud Computing. Studies in Big Data, vol. 52,
pp. 71–81. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-6089-3_5
16. Alban M., Gilligan T.: Automated detection of diabetic retinopathy using fluorescein angiog-
raphy photographs. Stanford Tech. Rep. (2016)
Pothole Detection Using YOLOv2 Object
Detection Network and Convolutional
Neural Network
Abstract Bad road conditions, such as cracks and potholes, can cause passenger
discomfort, vehicle damage, and accidents. Road conditions also indirectly affect the
growth of a country. Hence, there is a need for a system that can detect potholes.
Such a system would allow vehicles to issue pothole alerts so that drivers can reduce
speed, avoid the potholes, and keep the ride smooth. Many researchers have developed
algorithms to detect potholes on roads. In this paper, the proposed
system detects potholes using You Only Look Once version 2 (YOLOv2) and a
convolutional neural network (CNN). The pre-trained CNN, namely resnet50, is used
to extract the features of the training and testing images. A Kaggle dataset is used
to evaluate the proposed algorithm. The experimental results are evaluated in terms
of precision rate and recall rate. The proposed approach achieves a precision rate of
94.04% on test images.
1 Introduction
In the rainy season, roads are often flooded, so it is not easy to identify
potholes under water. Potholes on roads are a significant cause of
road accidents. Installing a pothole detection system in vehicles can reduce
accidents and improve driver safety. Detecting potholes is more difficult
than detecting objects such as signboards, cars, and pedestrians because
potholes have a wide range of geometries. In our daily life, manual detection of
R. Sumalatha (B)
Vardhaman College of Engineering, Hyderabad, Telangana, India
e-mail: r.sumalatha@vardhaman.org
R. V. Rao
St. Peters Engineering College, Hyderabad, Telangana, India
S. M. R. Devi
G. Narayanamma Institute of Technology & Science for Women, Hyderabad, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 293
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_28
potholes is a significant problem, and it is also time-consuming. This paper proposes
a pothole detection system using YOLOv2 and a convolutional neural network.
Youngtae Jo et al. suggested a new pothole detection method using a black-box
camera [1]. Artis Mednis et al. discussed accelerometer-based pothole detection
using smartphones [2]. Kulwant Singh et al. discussed real-time pothole detection
using image processing techniques [3]. Rui Fan et al. contributed a disparity
transformation algorithm for pothole detection [4]. Hsiu-Wen Wang et al. proposed
a real-time pothole detection method based on mobile sensing techniques [5]. Lim
Kuoy Suong et al. proposed a method for detecting potholes using a CNN [6]. Kashish
Bansal et al. proposed a machine learning-based pothole detection system to pinpoint
potholes on roads [7]. Kiran Kumar et al. developed laser-based pothole detection and
depth estimation for intelligent vehicle systems [8]. Hadistian Muhammad
Hanif et al. proposed a proximity sensor-based pothole detection system that helps
vehicles avoid accidents on roads [9]. Aditya Anand et al. proposed a safe-driving
system that sets a maximum speed limit, detects potholes, and sends the information to
drivers using GPS technology [10]. Zhaojian Li et al. described pothole detection
based on a multiphase model [11].
The rest of this paper is organized as follows: Sect. 2 explains the proposed model,
Sect. 3 discusses the experimental results, and Sect. 4 concludes the work presented
in this paper.
2 Proposed Model
2.1 Database
The pothole dataset consists of two folders, namely normal and potholes. ‘Normal’
contains 352 images of smooth roads from different angles, and ‘Potholes’ includes
329 images of streets with potholes in them. In this paper, only pothole images are
used for pothole detection [12]. Figure 2 shows sample pothole images.
2.2 YOLOv2
YOLOv2 is more accurate and faster than YOLO [13]. The YOLOv2 architecture is
shown in Fig. 3. It uses batch normalization and anchor boxes.
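To illustrate how anchor boxes are used, the following is a minimal sketch of the box decoding rule given in the YOLO9000 paper [13]. The function name, the cell position, and the anchor size are hypothetical illustration values, not taken from the proposed system.

```python
import math


def sigmoid(t):
    """Logistic function; squashes the raw offset into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(raw, cell, anchor):
    """Decode one raw prediction (tx, ty, tw, th) into a box in grid-cell units.

    cell   = (cx, cy): top-left corner of the grid cell that made the prediction
    anchor = (pw, ph): width and height of the anchor box (prior)
    """
    tx, ty, tw, th = raw
    cx, cy = cell
    pw, ph = anchor
    bx = cx + sigmoid(tx)      # box center stays inside the predicting cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)     # anchor size is rescaled, never negative
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# Hypothetical example: an all-zero prediction returns the anchor itself,
# centered in the middle of cell (3, 4).
box = decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 4), anchor=(1.5, 2.0))
```

With all raw outputs at zero the decoded box is simply the anchor placed at the cell center, which is why well-chosen anchors give the network a good starting point.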
2.3 Resnet50
In this paper, the Resnet50 CNN model is used to detect the potholes on the road. The
architecture of Resnet50 CNN consists of Convolutional, pooling, Rectified Linear
Unit (ReLU), and Fully Connected layers. The Resnet50 architecture is shown in
Fig. 4.
The input color images are fed to the convolutional layer. This layer extracts the
features from input images. The first convolutional layer extracts the low-level
features, and the next layers extract the middle and high-level features from the
images. The first layer performs a convolution operation between the input image
and filter to produce a feature map. This feature map is given as input for the next
layer.
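As a small illustration of how a filter produces a feature map, the sketch below slides a 3 × 3 filter over a toy single-channel image. It assumes no padding and a stride of 1, and the image and filter values are made up for the example. Like most CNN frameworks, it computes a cross-correlation (the filter is not flipped).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise product of the kernel and the image patch, then sum.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        feature_map.append(row)
    return feature_map


# Toy 4 x 5 image: bright region on the left, dark region on the right.
image = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
]
# Hypothetical low-level filter that responds to vertical edges.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = conv2d(image, kernel)
```

The feature map is zero over the uniform region and non-zero where the window covers the bright-to-dark transition, which is the kind of low-level feature the first layer picks up.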
The pooling layer reduces the dimensionality of the feature map. There are two
pooling options:
1. Max pooling
2. Average pooling.
Max pooling takes the maximum value from the part of the image enclosed by the
filter. Average pooling computes the average of all the values from the part of the
image enclosed by the filter.
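The two pooling options can be sketched as follows. This is a minimal illustration that assumes non-overlapping square windows; the feature map values are invented for the example.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window of the feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(reduce_fn(window))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 6],
]
max_pooled = pool2d(feature_map, 2, max)    # keeps the strongest response per window
avg_pooled = pool2d(feature_map, 2, mean)   # keeps the average response per window
```

Both variants reduce the 4 × 4 map to 2 × 2; they differ only in the function applied to each window.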
ReLU is a non-linear operation. This layer replaces every negative value in the
filtered images with zero:
$$f(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases} \tag{1}$$
The Fully Connected (FC) layer connects every neuron in the preceding
layer to every neuron in the subsequent layer. The FC layer passes its output to the
softmax activation function for classification.
The softmax activation function converts this output into the probabilities of the
object in the input image belonging to each of the different classes.
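The ReLU and softmax operations described above can be sketched in a few lines. The input scores are hypothetical, and subtracting the maximum score inside the softmax is a standard numerical-stability step that does not change the result.

```python
import math


def relu(values):
    """Replace every negative value with zero; keep positive values unchanged."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)                       # avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


activations = relu([-2.0, 0.0, 3.5])
probabilities = softmax([2.0, 1.0, 0.1])     # hypothetical scores for three classes
predicted_class = probabilities.index(max(probabilities))
```

The class with the largest score always receives the largest probability, so the predicted class is the index of the maximum.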
3 Experimental Results
In this paper, 329 pothole images are used to evaluate the performance of the
proposed method in terms of precision rate and recall rate. To train the YOLOv2
pothole detector network, a mini-batch size of 16, an initial learning
rate of 0.001, and a maximum of 20 epochs are chosen. Figure 5
shows the experimental results of the proposed method. Table 1 provides a qualitative
comparative analysis of the proposed method. The proposed method achieves a precision
rate of 94.04% and a recall rate of 0.1181. The precision rate and recall rate formulas
are given below.
$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{2}$$
$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{3}$$
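As a worked illustration of Eqs. (2) and (3), the sketch below computes both rates from detection counts. The counts are hypothetical and are not the results reported in this paper.

```python
def precision_recall(true_positive, false_positive, false_negative):
    """Return (precision, recall) from detection counts, guarding against empty denominators."""
    detected = true_positive + false_positive
    actual = true_positive + false_negative
    precision = true_positive / detected if detected else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 45 correct detections, 5 false alarms, 15 missed potholes.
precision, recall = precision_recall(true_positive=45, false_positive=5, false_negative=15)
```

A detector can have high precision and low recall at the same time: it rarely raises false alarms but misses many potholes.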
4 Conclusions
In this paper, we propose a pothole detection system using YOLOv2 and CNN.
The proposed approach achieves a 94.04% precision rate on test data. The pre-trained
resnet50 network is used to extract features from images. The proposed
system efficiently detects potholes using the resnet50 CNN model.
References
1. Jo, Y., Ryu, S.: Pothole detection system using A black-box camera. Sensors 15, 29316–29331
(2015)
2. Mednis, A., Strazdins, G., Zviedris, R., Kanonirs, G., Selavo, L.: Real time pothole detection
using android smartphones with accelerometers. IEEE Conf. (2011)
3. Singh, K., Hazra, S., Chandramukherjee, S.G., Gowda, S.: IoT based real time potholes detection
system using image processing techniques. Int. J. Sci. Technol. Res. 9(02), 785–789 (2020).
ISSN 2277–8616
4. Fan, R., Ozgunalp, U., Hosking, B., Pitas, M.L.: Pothole detection based on disparity
transformation and road surface modeling. IEEE Trans. Image Process. 1–12 (2019)
5. Wang, H.-W., Chen, C.-H., Cheng, D.-Y., Lin, C.-H., Lo, C.-C.: A real-time pothole detection
approach for intelligent transportation system. Math. Prob. Eng. 2015, 1–7
6. Suong, L.K., Jangwoo, K.: Detection of potholes using a deep convolutional neural network. J.
Univ. Comput. Sci. 24(9), 1244–1257
7. Bansal, K., Mittal, K., Ahuja, G., Singh, A., Gill, S.S.: Deepbus: machine learning based
real time pothole detection system for smart transportation using Iot. Internet Technol. Lett.
3(E156), 1–6 (2020)
8. Vupparaboina, K.K., Tamboli, R.R., Shenu, P.M., Jana, S.: Laser-based Detection and Depth
Estimation of Dry and Water-Filled Potholes: A Geometric Approach. IEEE (2015)
9. Hanif, H.M., Lie, Z.S., Astuti, W., Tan, S.: Pothole detection system design with proximity
sensor to provide motorcycle with warning system and increase road safety driving. In: The
3rd International Conference on Eco Engineering Development, IOP Conference Series: Earth
and Environmental Science, vol. 426, pp. 1–9 (2020). IOP Publishing
10. Anand, A., Gawande, R., Jadhav, P., Shahapurkar, R., Devi, A., Kumar, N.: Intelligent vehicle
speed controlling and pothole detection system. In: E3S Web of Conferences 170, EVF’2019,
pp. 1–5 (2020)
11. Li, Z., Kolmanovsky, I., Atkins, E., Jianbo, L., Filev, D.: Road anomaly estimation: model
based pothole detection. In: American Control Conference Palmer House Hilton, 1–3 July
2015, Chicago, IL, USA, pp. 1315–320
12. https://www.Kaggle.Com/Atulyakumar98/Pothole-Detection-Dataset
13. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
A New Machine Learning Approach
for Malware Classification
Abstract Cloud computing involves a large amount of data that is shared between the
cloud service provider and the client and must be secured. The cloud environment is
accessed through the Internet. Any data leak or poor network configuration can let
malware into the cloud computing environment, so malware can reach the cloud
environment from both outside and inside. In this paper,
a new machine learning approach is used in which emphasis is given
to data analysis, feature engineering, and modeling. It helps us quickly
differentiate a legitimate file from malware, and identify the malware type, based on
the file characteristics before the file enters the cloud environment. The designed
system is tested, and the results are tabulated with performance metrics. Our system
gives its best results with a multi-class log loss of 0.03 and 97% accuracy, which is
acceptable for a malware classification system.
1 Introduction
Cloud computing involves hosting services over the internet. The services are
categorized into three types: Platform as a Service (PaaS), Infrastructure as a Service
(IaaS), and Software as a Service (SaaS). IaaS provides on-demand virtual computing
and storage from providers such as Amazon Web Services and Google, for example,
Simple Storage Service (S3) and Elastic Compute Cloud (EC2). Applications can be
developed on a PaaS, which offers a scalable environment with additional networking,
storage, caching, and content delivery. Security in cloud computing can be classified
into two categories: security
G. Shruthi (B)
Department of Computer Science and Engineering, Siddaganga Institute of Technology,
Tumakuru, Karnataka, India
P. Shrinivasacharya
Department of Information Science and Engineering, Siddaganga Institute of Technology,
Tumakuru, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 301
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_29
issues faced by cloud providers and security issues faced by their customers. Insecure
interfaces and APIs, data loss and leakage, and hardware failure are the top three
threats in the cloud, accounting for 29%, 25%, and 10% of all cloud security outages,
respectively [1, 2].
Hackers try to gain access to the cloud and control a large amount of information
through a single attack. This is popularly called “hyperjacking”. Examples of
hyperjacking are the iCloud 2014 leak and the Dropbox security breach.
Hackers can breach this information, steal millions of passwords and private data,
and make money through bitcoin [3]. According to a recent study, the insider attack is
the biggest threat in the cloud computing environment. Cloud service providers should
monitor physical access to the servers in the data center to prevent any suspicious
activity. Proper data isolation, logical data segregation, and virtualization are
needed to protect against malware, data leakage, and exploited vulnerabilities that
cause cloud outages. The paper is organized as follows. Section 2 summarizes
various methods and work done on the classification of malware. In Sect. 3, the
distribution of malware classes in the data set is analyzed. In Sect. 4, the technique
used in classifying malware is discussed. In Sect. 5, experimentation on the datasets
is conducted, and the result is analyzed. Finally, a conclusion is drawn in Sect. 6.
2 Background
This section reviews research on the classification and detection of malware.
The authors of [4, 5] discuss static and dynamic approaches to malware
analysis. Static and string analysis is used to develop a detection system with
interpretable and semantic strings extracted from API execution calls [6]. The authors
also use a Support Vector Machine ensemble approach to construct detectors and
discuss a way of classifying malware with text categorization techniques.
Text categorization procedures for malware classification are also
discussed in [7], where the authors extract all n-grams from the training
data set. The 55,500 features with the highest frequency scores are kept, and a
selection method such as the “Fisher score” is applied to them. Various
machine learning algorithms are then tried on the selected features. The most commonly
used algorithms are Support Vector Machines, Bayes algorithms, decision trees, and
artificial neural networks. The paper “Detection of Malicious Code by Applying Machine
Learning Classifiers on Static Features” by Shabtai [8] focuses on techniques such as
“Fisher score”, “document frequency”, “gain ratio”, “hierarchical feature selection”,
and “feature classification algorithms”. In [9], a
different n-gram model using classifiers such as Bayes, IB, decision trees,
and random forests is proposed.
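The n-gram approach surveyed above can be sketched as follows. This is only an illustration under simplifying assumptions: it keeps the k most frequent byte n-grams and does not implement the Fisher score or the other selection methods named in the cited papers. The sample byte strings and function names are hypothetical.

```python
from collections import Counter


def byte_ngrams(data, n):
    """Return every contiguous n-byte sequence in the sample."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def top_ngrams(samples, n, k):
    """Select the k most frequent n-grams across all training samples."""
    counts = Counter()
    for data in samples:
        counts.update(byte_ngrams(data, n))
    return [gram for gram, _ in counts.most_common(k)]


def to_vector(data, vocabulary, n):
    """Represent one sample as the count of each selected n-gram."""
    counts = Counter(byte_ngrams(data, n))
    return [counts[gram] for gram in vocabulary]


# Two hypothetical byte sequences standing in for training files.
samples = [bytes.fromhex("558bec8b45"), bytes.fromhex("558bec5dc3")]
vocabulary = top_ngrams(samples, n=2, k=3)
vector = to_vector(samples[0], vocabulary, n=2)
```

The resulting count vectors are what a classifier such as a decision tree or an SVM would be trained on.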
To reduce the feature space, class-wise document frequency is also used when the
number of extracted n-grams is enormous. N-gram features were extracted for n ranging
from one to eight. KNN with the Euclidean distance as the metric is also reported as a
linear clustering algorithm for detecting malware [10]. GIST features were extracted from
3 Methodology
Our method is evaluated on the data set of the malware classification challenge posted
on the Kaggle website by Microsoft [1–3, 14]. To classify malware samples,
the system uses only the .asm files. It also extracts as many categories of
features as possible from the extensive dataset collection [15] (21,736 file types).
The data set contains known malware files representing a mixture of nine different
families. Each malware file has an identifier (Id), a 20-character hash value
that uniquely identifies the file, and a class, an integer representing one of nine
family names to which the malware may belong. The nine families of malware are as
follows: 1. Lollipop 2. Ramnit 3. Vundo 4. Simda 5. Obfuscator.ACY 6. Kelihos_ver3
7. Tracur 8. Kelihos_ver1 9. Gatak.
The performance metrics are the logarithmic loss and the confusion
matrix (accuracy). For every file, we generate a set of predicted probabilities (one
for each class), and the log loss of the model is as follows:
$$\text{logloss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\log p_{ij}, \tag{1}$$
where N is the number of files in the test set, M is the number of class labels, log is
the natural logarithm, y_ij is 1 if observation i is in class j and 0 otherwise, and
p_ij is the predicted probability that observation i is in class j [14].
The confusion matrix, or error matrix, summarizes the performance of a classifier on
given test data. It visualizes the performance measures and counts the correct and
incorrect predictions for every class.
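Both metrics can be computed directly from the predicted probabilities, as in the minimal sketch below. The labels and probabilities are made-up values for a three-class toy problem with classes numbered from 0 (the data set itself uses nine classes). Because y_ij is 1 only for the true class, the inner sum in the log loss reduces to the log of the probability assigned to the true class. Clipping with `eps` is an added safeguard against log(0).

```python
import math


def log_loss(true_labels, probabilities, eps=1e-15):
    """Multi-class logarithmic loss: mean negative log-probability of the true class."""
    total = 0.0
    for label, probs in zip(true_labels, probabilities):
        p = min(max(probs[label], eps), 1.0 - eps)   # clip to avoid log(0)
        total += math.log(p)
    return -total / len(true_labels)


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


# Hypothetical predictions for four files and three classes.
y_true = [0, 1, 2, 1]
y_prob = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
y_pred = [row.index(max(row)) for row in y_prob]       # most probable class per file
loss = log_loss(y_true, y_prob)
matrix = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = sum(matrix[c][c] for c in range(3)) / len(y_true)
```

The diagonal of the confusion matrix counts correct predictions, so accuracy is the diagonal sum divided by the number of files.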
4 Data Preparation
5 Data Preprocessing
Each malware file contains assembly-level code, and 52 crucial features were
extracted with parallel processing from all the .asm files. Features of the
assembly-level files were extracted by preprocessing the header and opcode fields of
the code section. To use this data in the classification process, we need to create
numerical feature vectors out of the existing structure. In this work, the extraction
process was run on the .asm files after randomly distributing the files into separate
folders. Different static properties such as opcodes, bytes, segments, prefixes,
function calls, and malware API were collected. The .asm files contain many
assembly-level instructions, but only the occurrences of the vital prefixes, keywords,
and opcodes
were counted with the “bag of words” model and thread processing. These item
counts are important because they characterize the behavior of the malware.
Counting the occurrences of the following segment prefixes in the .asm files gave the
best results: HEADER, text, idata, pav, data, bss, e-data, r-data. Counting the
occurrences of the following opcodes gave the best results:
jmp, mov, retf, push, xor, pop, nop, retn, sub, or, ror, rol, jnb, rt.
Similarly, the registers edx, eip, esp, esi, eax, ebx, ecx, edi, and ebp are the best
for counting. The extracted feature is available as binary data. The machine learning
model accepts numeric data, so we converted the extracted feature data
into a vector representation of each malware file. The algorithm shows the iterative
method of extracting the features from the .asm files. This experiment was run on an
Intel 2 GHz processor with 8 GB RAM to extract the features mentioned earlier from the
.asm files. Extracting the unigram features from 150 GB of .asm files took 48 h.
Thread processing was applied by splitting the .asm files into five folders and
assigning the files of each folder to one thread. Each thread counts the occurrences
of the prefixes in its .asm files.
Algorithm 1 shows how each thread collects features from each .asm file.
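The per-thread counting described above might look like the following sketch. It is not the paper's Algorithm 1: the exact prefix spellings (for example `.text:`), the shortened vocabulary, and the in-memory strings that stand in for folders of .asm files are all assumptions made so that the example is self-contained.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Assumed spellings; the vocabulary is shortened for the example.
PREFIXES = ["HEADER:", ".text:", ".idata:", ".data:", ".bss:"]
OPCODES = ["jmp", "mov", "push", "pop", "xor", "nop", "retn", "sub"]
REGISTERS = ["eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp", "eip"]
VOCABULARY = PREFIXES + OPCODES + REGISTERS
TOKENS = set(OPCODES) | set(REGISTERS)


def count_features(asm_text):
    """Bag-of-words vector: occurrences of each prefix, opcode, and register."""
    counts = Counter()
    for line in asm_text.splitlines():
        for prefix in PREFIXES:
            if line.startswith(prefix):
                counts[prefix] += 1
        for token in re.findall(r"[a-z]+", line.lower()):
            if token in TOKENS:
                counts[token] += 1
    return [counts[name] for name in VOCABULARY]


def process_folder(asm_texts):
    """Work done by one thread: convert every file of one folder into a vector."""
    return [count_features(text) for text in asm_texts]


# Two hypothetical "folders", each holding the text of one .asm file.
folders = [
    [".text:00401000 push ebp\n.text:00401001 mov ebp, esp"],
    [".data:00402000 xor eax, eax\n.text:00401003 retn"],
]
with ThreadPoolExecutor(max_workers=5) as pool:
    feature_rows = [row for rows in pool.map(process_folder, folders) for row in rows]
```

Each row of `feature_rows` is the numeric vector for one file, in the order given by `VOCABULARY`. In a real run each folder entry would be read from disk, where threads mainly help by overlapping file I/O.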
The resulting data frame is as follows (Fig. 2):
6.1 Classification
The first machine learning model used for classification is the k-nearest neighbor
(KNN) algorithm. In this model, X is the training data, and Y is the target value.
It predicts the class labels for the provided data and returns the probability
estimates for the test data X. It gave a log loss of 0.137 on the test data, and the
number of misclassified points was 2.25. The second model is logistic regression, which
produces results in a binary format and predicts the outcome, which is a discrete
variable. It gave a log loss of 0.69 on the test data, and the number of misclassified
points was 14.2. The last algorithm used is the random forest classifier, which is a
multi-object classifier. Its accuracy is high, and its training time is short. It
maintains its accuracy even when a large portion of the data is missing. It gave a log
loss of 0.034 on the test data, and the number of misclassified points was 0.597. Each
model is trained on the training data, its hyperparameters are optimized and tuned on
the cross-validation data, and its log loss and accuracy are finally computed on the
test data. The effectiveness of the approach is measured with the results in the table
given below (Table 1).
The graphs show the results obtained when the three trained models, KNN, logistic
regression, and random forest, are used to predict malware
classes (Figs. 3, 4).
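The train, cross-validate, and test procedure can be sketched with a small pure-Python KNN. This is an illustration, not the implementation used in the experiments: the two-dimensional points, the candidate values of k, and all function names are hypothetical.

```python
import math
from collections import Counter


def knn_predict_proba(train_x, train_y, query, k, num_classes):
    """Class probability = fraction of the k nearest training points in that class."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return [votes[c] / k for c in range(num_classes)]


def log_loss(labels, probabilities, eps=1e-15):
    clipped = [min(max(p[y], eps), 1.0 - eps) for y, p in zip(labels, probabilities)]
    return -sum(math.log(p) for p in clipped) / len(labels)


def evaluate(train, split, k, num_classes):
    """Return (log loss, percentage of misclassified points) on one data split."""
    train_x, train_y = train
    xs, ys = split
    probs = [knn_predict_proba(train_x, train_y, x, k, num_classes) for x in xs]
    wrong = sum(1 for y, p in zip(ys, probs) if p.index(max(p)) != y)
    return log_loss(ys, probs), 100.0 * wrong / len(ys)


# Hypothetical toy data: class 0 near the origin, class 1 near (5, 5).
train = ([(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)], [0, 0, 0, 1, 1, 1])
validation = ([(1, 1), (4, 5)], [0, 1])
test = ([(0.5, 0.5), (5.5, 5.5), (2, 2)], [0, 1, 0])

# Tune the hyperparameter k on the cross-validation split, then score on the test split.
best_k = min((1, 3, 5), key=lambda k: evaluate(train, validation, k, 2)[0])
test_loss, misclassified_pct = evaluate(train, test, best_k, 2)
```

The same loop applies to any model that returns class probabilities; only the model and the hyperparameter being tuned change.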
7 Conclusion
References
1. Deshpande, P., Sharma, S.C., Sateesh Kumar, P.: Security threats in cloud computing. In:
International Conference on Computing, Communication & Automation, pp. 632–636 (2015)
2. Deshpande, P., Sharma, S.C., Peddoju, S.K., et al.: Security and service assurance issues in
Cloud environment. Int. J. Syst. Assur. Eng. Manag. 9, 194–207 (2018). https://doi.org/10.
1007/s13198-016-0525-0
3. Deshpande, P., Sharma S.C., Peddoju S.K.: Data storage security in cloud paradigm. In: Pant,
M., Deep, K., Bansal, J., Nagar, A., Das, K. (eds) Proceedings of Fifth International Conference
on Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol.
436, pp.247–259 (2016). https://doi.org/10.1007/978-981-10-0448-3_20
4. Gibert, D., Mateu, C., Planes, J.: The rise of machine learning for detection and classification
of malware: research developments, trends and challenges. J. Netw. Comput. Appl. 153 (2020)
5. Ahmadi, M., Ulyanov, D., Semenov, S., Trofimov, M., Giacinto, G.: Novel feature extraction,
selection and fusion for effective malware family classification. In: Proceedings of the Sixth
ACM Conference on Data and Application Security and Privacy, pp.183–194 (2016)
6. Ye, Y., Chen, L., Wang, D., Li, T., Jiang, Q., Zhao, M.: SBMDS: an interpretable string based
malware detection system using SVM ensemble with bagging. J. Comput. Virol. (2), 283 (2009)
7. Shabtai, A., Moskovitch, R., Elovici, Y., Glezer, C.: Detection of malicious code by applying
machine learning classifiers on static features: a state-of-the-art survey. Information Security
Technical Report 14, no. 1, pp. 16–29 (2009)
8. Shabtai, A., Moskovitch, R., Feher, C., Dolev, S., Elovici, Y.: Detecting unknown malicious
code by applying classification techniques on opcode patterns. Secur. Inform. 1(1), (2012)
9. Jain, S., Meena, Y.K.: Byte level n–gram analysis for malware detection. In: International
Conference on Information Processing, Springer, Berlin, Heidelberg, pp. 51–59 (2011)
10. Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B.S.: Malware images: visualization and
automatic classification. In: Proceedings of the 8th International Symposium on Visualization
for Cyber Security, pp. 1–7 (2011)
11. Naderi, H., Vinod, P., Conti, M., Parsa, S., HadiAlaeiyan, M.: Malware signature generation
using locality sensitive hashing. In: International Conference on Security & Privacy, pp.115–
124. Springer, Singapore (2019)
12. Ye, Y., Li, T., Adjeroh, D., Iyengar, S.S.: A survey on malware detection using data mining
techniques. ACM Comput. Surv. (CSUR) 50(3), 1–40 (2017)
13. Grini, S., Shalaginov, A., Franke, K.: Study of soft computing methods for large-scale multi-
nomial malware types and families detection. In: Proceedings of the 6th World Conference on
Soft Computing (2016)
14. Microsoft malware classification challenge (big 2015) https://www.kaggle.com/c/malware-cla
ssification (2017). Accessed 30 Sept 2019.
15. Bazrafshan, Z., Hashemi, H., HazratiFard, S.M., Hamzeh, A.: A survey on heuristic malware
detection techniques. In: The 5th Conference on Information and Knowledge Technology,
pp. 113–120. IEEE (2013)
Analysis of Feature Selection Techniques
to Detect DoS Attacks Using Rule-Based
Classifiers
Abstract Denial of Service (DoS) attacks are emerging as a security threat that,
when ignored, may result in enormous losses for organizations. Such attacks
make the services provided by organizations unavailable to legitimate
users. Detecting such attacks with low computation and few
errors is an ongoing research area. This paper analyzes different feature
selection methods for the detection of DoS attacks. The analysis
separates relevant from noisy feature subsets based on
the score obtained by each method. The obtained relevant feature subset is tested on
the CICIDS-2017 DoS dataset and achieves a high accuracy of 99.9591% with the
PART classifier.
1 Introduction
As technology moves forward at a fast rate, many devices are being connected to the
internet. The dependence on remote servers is increasing substantially. It is becoming
crucial that these servers are available on demand. Even a short downtime
may cause huge losses to organizations.
Denial of Service (DoS) is one type of attack [1] that might result in the
unavailability of services provided by organizations when performed on an unsecured
network. DoS attacks leverage resource handling vulnerabilities caused by
logical and programming errors in handling network packets. A robust system
is needed to detect and prevent such attacks and ensure that these services remain
available to the end-users.
Recently, DoS attacks have evolved into a massive threat
to large businesses and governments. Botnets or IoT devices act as one common
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 311
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_30
platform for DoS attacks. Recently, a DoS attack was performed on Amazon Web
Services (AWS) [2], a multi-purpose cloud computing platform, using a
technique called Connectionless Lightweight Directory Access Protocol (CLDAP)
reflection. The network traffic was generated from vulnerable CLDAP servers,
amplified by a factor of 56–70, and sustained for over three days.
Feature selection plays a dominant role in intrusion detection by drastically
decreasing the build-up time of a machine learning model. It helps in removing
unnecessary features and reduces the complexity of the model. Reduced features
result in a simpler model, thereby decreasing the computational time required to
detect DoS attacks. Feature selection algorithms are mainly divided into three
types, namely filter, wrapper, and embedded. Filter methods return a subset of
features based on the data’s generic characteristics, with lower computational
overhead, but are comparatively inaccurate. Wrapper-based methods use the accuracy of
a predetermined learning algorithm to evaluate the quality of the selected features.
Information Gain (IG) is a filter method [3] that measures the significance of each
dataset attribute and is generally used to select features. Gain Ratio (GR)
is an extension of the information gain algorithm [4] that resolves its bias
towards features with a broader set of values. The Correlation Feature Selection
method (CFS) [5] implements a search algorithm and a function to evaluate the merit
of feature subsets. Mutual Information Feature Selection (MIFS) [6] uses mutual
information, a symmetric measure of the dependence between two random variables.
ReliefF is an extension of the Relief algorithm that
overcomes the drawbacks of the original algorithm, which cannot deal with
incomplete data and is limited to two-class problems.
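The difference between IG and GR can be seen on a toy example. The sketch below uses the standard entropy-based definitions for discrete features; the feature names and values are hypothetical and are not taken from CICIDS-2017, whose continuous features would first need to be discretized.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [label for f, label in zip(feature, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info else 0.0


labels = ["dos", "dos", "benign", "benign"]
tcp_flag = ["syn", "syn", "ack", "ack"]   # two values, separates the classes
flow_id = ["a", "b", "c", "d"]            # unique per record, also "separates" them

ig_flag, ig_id = information_gain(tcp_flag, labels), information_gain(flow_id, labels)
gr_flag, gr_id = gain_ratio(tcp_flag, labels), gain_ratio(flow_id, labels)
```

Both features obtain the same information gain, but the gain ratio of the identifier-like feature is halved because its many distinct values are penalized. This is the bias that GR is designed to correct.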
Multiple types of classifiers exist [7], and each type is suited to a specific kind
of problem. The contribution of our
work is as follows:
1. This work analyses different feature selection methods for the detection of DoS
attacks at the application layer.
2. This work obtains relevant feature subsets based on the particular feature
selection method’s score and is tested on the CICIDS-2017 DoS dataset.
Section 2 of the paper gives an overview of the literature. Section 3 describes the
proposed methodology. Section 4 talks about implementing the proposed model and
its results, and finally, Sect. 5 concludes the paper.
2 Related Work
Obaid et al. [8] described the DoS attacks at different layers of the open systems
interconnection (OSI) model in detail. According to the authors, HTTP GET and
POST attacks are popular at the application layer of the OSI model. DoS attacks at the
presentation layer and session layer include Secure Sockets Layer (SSL) misuse and
telnet DoS attacks. Transport layer DoS attacks target transport layer protocols.
Attacks in this layer can be further classified as flooding and desynchronization
attacks. Ping of
death, smurf, internet control message protocol (ICMP) flood are common attacks in
the Network Layer. Unfairness, collision, and exhaustion are DoS attacks occurring
in the data link layer.
Mohammadi et al. [9] proposed a novel multivariate-mutual-information-based
feature selection method to select essential intrusion detection features. The selected
features are used to build a model using the least-square-SVM algorithm. The
proposed model outperforms other approaches that use MIFS, modified MIFS, and
linear-LCFS techniques. The analysis is carried out on the KDD Cup ‘99,
Kyoto 2006+, and NSL-KDD datasets.
Dua et al. [10] proposed an ensemble classifier approach for the detection of intru-
sions. The model proposed by the authors is a combination of a random tree (RT),
REP Tree, k-nearest neighbors (KNN), J48 Graft, and random forest (RF) classi-
fiers. The model is validated on NSL-KDD. 99.72% accuracy is achieved for binary
classification, and 99.68% accuracy is achieved for multiple-class classification.
Umar et al. [11] proposed a hybrid IDS model that uses a wrapper-based
feature selection algorithm with a decision tree as the feature evaluator. ANN,
SVM, KNN, RF, and NB models are built using the selected features. These models
are compared with baseline models, which are built using all features. The proposed
model achieves a detection rate of 97.05% when built with RF.
Ployphan et al. [12] proposed a hybrid intrusion detection model using the Adaptive
AdaBoost classifier. It uses a correlation feature reduction technique and a
hybrid classifier that combines MLP, KNN, C4.5, linear discriminant analysis (LDA),
and support vector machines (SVM) using the concept of adaptive boosting. The model
is tested on the UNSW-NB15, NSL-KDD, and KDD Cup ‘99 datasets and achieves an
accuracy of 99.96%.
Azar et al. [13] proposed an intrusion detection model based on a combination of
feature selection methods: GR, CFS, and IG. The target system uses KNN, naive
Bayes (NB), and multilayer perceptron (MLP) for intrusion detection. The proposed
model is validated on the KDD Cup ‘99 dataset and achieves a detection rate of
98.9%.
Taha et al. [14] proposed a lightweight Intrusion Detection System (IDS) for DoS
attacks, which uses IG and CFS for feature selection and employs the C4.5, NB, RF,
and REP Tree classifiers. The features for detecting DoS attacks are reduced from
41 to 9. The proposed model has been validated on the KDD Cup ‘99 dataset and
achieves a 99.6% detection rate.
Pattawaro and Polprasert [15] proposed a network IDS using attribute
ratio (AR) as a feature reduction strategy with a threshold value of 0.01. The reduced
feature subset is given to k-means clustering and the XGBoost classification
algorithm. The model is validated on the NSL-KDD dataset. The accuracy and true
positive rate are 84.41% and 86.36%, respectively. However, the false positive rate
(FPR) is 18.20%, and the area under the curve is 0.84.
Aljawarneh et al. [16] proposed a hybrid IDS using feature reduction analysis. IG
and a voting scheme are used for feature selection. It uses a hybrid classifier
that combines RT, REPTree, J48, AdaBoostM1, NB, decision stump, and
314 A. Vaidya and D. Kshirsagar
Meta-Paging. The model achieves 99.9% accuracy on NSL-KDD for DoS attack
detection.
Pullagura et al. [17] proposed an IDS with a robust feature reduction technique.
The system combines feature selection techniques based on Euclidean
distance, CFS, and chi-square. SVM is used to train the model. The model is
validated on the KDD Cup ‘99 dataset. The proposed method reduces the features from
41 to 5. Further, the system has an accuracy of 96.25%, precision of 80.20%, and
recall of 78.96%.
Most of the research is carried out on packet-based datasets, namely KDD Cup ‘99
and NSL-KDD, which contain network and transport layer attacks. However, the
CICIDS-2017 DoS dataset contains Layer-7 (Application Layer) attacks. This
encourages us to apply different feature selection techniques in an effort to reduce
the number of features required to build models without reducing the accuracy.
3 Proposed Model
library is used for replacing the infinity and N/A values with appropriately large values
and zeros, respectively. A duplicate column named "Fwd Header Length" is removed.
Weka is used to apply the IG, GR, CFS, SU, and ReliefF algorithms on the
preprocessed dataset. The scikit-learn Python framework is used exclusively for
calculating the mutual information of the features. The feature selection techniques
are applied to the dataset, which consists of 77 features. Each technique
assigns a score to each feature. The range of the scores assigned by each
feature selection algorithm is shown in Table 1.
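The cleaning steps described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the helper names `clean_value` and `drop_duplicate_columns`, the sample rows, and the use of the largest finite float as the "large value" are all hypothetical choices.

```python
import math
import sys

# Assumption: the largest finite float stands in for an "appropriately large value".
LARGE_VALUE = sys.float_info.max


def clean_value(x):
    """Map NaN to 0 and +/-infinity to a large finite value of the same sign."""
    if math.isnan(x):
        return 0.0
    if math.isinf(x):
        return math.copysign(LARGE_VALUE, x)
    return x


def drop_duplicate_columns(header, rows):
    """Keep only the first occurrence of each column name."""
    seen = set()
    keep = []
    for index, name in enumerate(header):
        if name not in seen:
            seen.add(name)
            keep.append(index)
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


# Hypothetical sample with a duplicated column and non-finite values.
header = ["Flow Bytes/s", "Fwd Header Length", "Fwd Header Length"]
rows = [[float("inf"), 40.0, 40.0], [float("nan"), 20.0, 20.0]]

header, rows = drop_duplicate_columns(header, rows)
rows = [[clean_value(v) for v in row] for row in rows]
```

Doing the column removal before the value cleaning keeps the two steps independent, so either can be swapped for a library routine without affecting the other.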
Some features are assigned a score of zero by every feature selection
algorithm. These features are listed in Table 2. They are termed noisy features
and can be discarded for DoS attack detection.
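The rule for identifying noisy features can be sketched as follows. The feature names and scores are made up for illustration, and `noisy_features` is a hypothetical helper, not part of Weka or scikit-learn.

```python
def noisy_features(scores_by_algorithm):
    """Return the features whose score is zero under every algorithm."""
    all_scores = list(scores_by_algorithm.values())
    features = set(all_scores[0])
    return sorted(
        f for f in features
        if all(scores[f] == 0 for scores in all_scores)
    )


# Hypothetical scores: one dictionary of feature -> score per algorithm.
scores_by_algorithm = {
    "IG": {"f1": 0.42, "f2": 0.0, "f3": 0.0},
    "GR": {"f1": 0.10, "f2": 0.0, "f3": 0.05},
    "SU": {"f1": 0.21, "f2": 0.0, "f3": 0.0},
}

noisy = noisy_features(scores_by_algorithm)
```

In this toy input only `f2` is zero everywhere, so `f3`, which GR scores above zero, is kept.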
Rule-based classifiers will be applied for the evaluation of the model. The
following performance metrics will be used for assessment:
$$\text{Accuracy} = \frac{TN + TP}{TN + TP + FP + FN} \times 100 \quad (1)$$
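As a small worked example of Eq. (1), the following sketch computes accuracy from the four confusion-matrix counts. The counts are invented for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy in percent from confusion-matrix counts, as in Eq. (1)."""
    total = tn + tp + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tn + tp) / total * 100


# Hypothetical counts: 185 of 200 instances are classified correctly.
acc = accuracy(tp=90, tn=95, fp=5, fn=10)
```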
classifiers after removing the noisy features. It can again be observed that PART
performs better than the other rule-based classifiers in terms of accuracy.
Table 5 shows a brief comparison with existing intrusion detection systems.
The works [9, 10] are carried out on the KDD Cup ‘99 and NSL-KDD datasets, which
consist of layer 3 (Network layer) and layer 4 (Transport layer) attacks. The work [11]
is carried out on the UNSW-NB15 dataset, which consists of layer 4 DoS attacks.
However, the types of DoS attacks are not identified in this dataset. The work
presented in this paper is carried out on CICIDS-2017, a reasonably new dataset that
consists of modern layer 7 (Application layer) attacks. The results reported in [9]
and [10] achieve accuracies of 94.31% and 99.72%, respectively, in detecting DoS
attacks at the network and transport layers. The work presented in this paper
achieves a higher accuracy of 99.95% for the detection of DoS attacks at the
application layer compared to the studies presented in [9], [10], and [11].
5 Conclusions
References
1. Wankhede, S., Kshirsagar, D.: Dos attack detection using machine learning and neural
network. In: 2018 Fourth International Conference on Computing Communication Control
and Automation (ICCUBEA), pp. 1–5. IEEE (2018)
2. Nicholson, P.: 5 most famous dos attacks (2020). https://www.a10networks.com/blog/5-most-famous-ddos-attacks/
3. Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R.P., Tang, J., Liu, H.: Feature selection: a
data perspective. ACM Comput. Surv. (CSUR) 50(6), 1–45 (2017)
4. Ibrahim, H.E., Badr, S.M., Shaheen, M.A.: Adaptive layered approach using machine learning
techniques with gain ratio for intrusion detection systems (2012). arXiv:1210.7650
5. Deshpande, P., Aggarwal, A., Sharma, S.C., Kumar, P.S., Abraham, A.: Distributed port-scan
attack in cloud environment. In: Fifth International Conference on Computational Aspects of
Social Networks (CASoN), pp. 27–31 (2013)
6. Ambusaidi, M.A., He, X., Nanda, P., Tan, Z.: Building an intrusion detection system using a
filter-based feature selection algorithm. IEEE Trans. Comput. 65(10), 2986–2998 (2016)
7. Kshirsagar, D., Kumar, S.: An ensemble feature reduction method for web attack detection. J.
Discret. Math. Sci. Cryptogr. 23(1), 283–291 (2020)
8. Pandey, V.C., Peddoju, S.K., Deshpande, P.S.: A statistical and distributed packet filter against
DDoS attacks in cloud environment. Sādhanā 43, 32 (2018). https://doi.org/10.1007/s12046-018-0800-7
9. Mohammadi, S., Desai, V., Karimipour, H.: Multivariate mutual information based feature
selection for cyber intrusion detection. In: 2018 IEEE Electrical Power and Energy Conference
(EPEC), pp. 1–6. IEEE (2018)
10. Dua, M., et al.: Attribute selection and ensemble classifier based novel approach to intrusion
detection system. Procedia Comput. Sci. 167, 2191–2199 (2020)
11. Umar, M.A., Zhanfang, C., Liu, Y.: Network intrusion detection using wrapper-based decision
tree for feature selection. In: Proceedings of the 2020 International Conference on Internet
Computing for Science and Engineering, pp. 5–13 (2020)
Analysis of Feature Selection Techniques … 319
12. Sornsuwit, P., Jaiyen, S.: A new hybrid machine learning for cybersecurity threat detection
based on adaptive boosting. Appl. Artif. Intell. 33(5), 462–482 (2019)
13. Salih, A.A., Abdulrazaq, M.B.: Combining best features selection using three classifiers
in intrusion detection system. In: 2019 International Conference on Advanced Science and
Engineering (ICOASE), pp. 94–99. IEEE (2019)
14. Tchakoucht, T.A.I.T., Mostafa Ezziyyani, M.: Building a fast intrusion detection system for
high-speed-networks: probe and dos attacks detection. Procedia Comput. Sci. 127, 521–530
(2018)
15. Pattawaro, A., Polprasert, C.: Anomaly-based network intrusion detection system through
feature selection and hybrid machine learning technique. In: 2018 16th International Confer-
ence on ICT and Knowledge Engineering (ICT&KE), pp. 1–6. IEEE (2018)
16. Aljawarneh, S., Aldwairi, M., Yassein, M.B.: Anomaly-based intrusion detection system
through feature selection analysis and building hybrid efficient model. J. Comput. Sci. 25,
152–160 (2018)
17. Priyadarsini, P.I., Sai, M.S.S., Suneetha, A., Santhi, M.V.B.T.: Robust feature selection
technique for intrusion detection system. Inter. J. Control Autom. 11(2), 33–44 (2018)
18. Kshirsagar, D., Kumar, S.: Identifying reduced features based on ig-threshold for dos attack
detection using part. In: International Conference on Distributed Computing and Internet
Technology, pp. 411–419. Springer (2020)
19. Shaikh, J.M., Kshirsagar, D.: Feature reduction-based dos attack detection system. In: Next
Generation Information Processing System, pp. 170–177. Springer, Berlin
Botnet Detection Using Bayes Classifier
Abstract In today’s connected world, the risk of being attacked over the internet has
increased, and this plays a major role in the infection of internet-connected devices.
The internet is flooded with different kinds of malware, but we focus on the harmful
effects of botnets. A botnet is a group of devices controlled by a single device to attack
and infect other devices over the internet. The controlled devices are called bots and
can be any internet-connected device; the single device controlling them is called
the botmaster or bot driver. It is crucial to detect botnets quickly since they can
perform various malicious activities. We performed different experiments to detect
botnets. For experimentation, we used the CICIDS2017 dataset and different machine
learning algorithms from Weka. With these ML algorithms, we achieved the highest
accuracy of 98.9416% with the NaiveBayesMultinomialText algorithm.
1 Introduction
attacker through an established [2] command and control (C&C) channel. Using an
established C&C channel, the botmaster can easily manipulate the bot’s behavior on
the victims’ machines for his own interests. A botnet is controlled and supervised by
the botmaster and has become a distributed platform for performing malicious and
illegal activities on victims’ machines, such as sending spam emails, distributing
malware, stealing identities, and attacking organizational networks and critical
infrastructure. Botnets allow attackers to perform different attacks such as phishing,
DDoS, cryptojacking, snooping, bricking, and spam-bots. Botnets can be categorized [3]
as centralized and decentralized.
It can be used to remove unwanted or irrelevant attributes from the dataset. It makes
it easier for ML algorithms to analyze and understand the data in a huge dataset.
Feature selection methods are classified into filter and wrapper methods [4] according
to the feature evaluation measures. The filter methods directly select a feature subset
according to the characteristics of the data in the dataset and then apply different
classification algorithms to evaluate the selected subset. The wrapper methods use
predefined learning algorithms to choose a feature subset for evaluation. Wrapper
methods require more computation and are hence more expensive and complicated than
the filter-based approach. So, when there is a substantial amount of data to process,
filter-based methods are the preferred choice. There are different types of feature
selection methods:
1. Correlation: In this method, the algorithm selects a feature subset that is highly
correlated with the output class while its features are not much correlated with
one another [5]. The correlation can be calculated as

$$a_{xy} = \frac{m\, a_{xe}}{\sqrt{m + m(m - 1)\, a_{ee}}} \quad (1)$$
where a(xy) is the correlation (dependence) between the features and the class
variable, m is the number of features, a(xe) is the average feature-class
correlation, and a(ee) is the average inter-correlation between features.
2. Information Gain: It is a feature selection technique in which a feature is selected
based on the amount of information it provides about the output class. It
is based on entropy and is calculated as the reduction in entropy. The value of IG
lies between 0 and 1, where 0 represents no information and 1 represents maximum
information. A feature that is more relevant to the output class has a higher IG
value and gets selected. Information gain and entropy [6] are inversely related: the
value of IG increases as the entropy decreases. The value of IG is
used to split the dataset in ID3 while building the decision tree. We can calculate
the value of IG for a single variable as follows:

$$IG(D, z) = H(D) - H(D \mid z) \quad (2)$$

where IG(D, z) represents the information gain for the dataset D, H(D) represents the
entropy of dataset D, and H(D|z) represents the conditional entropy of the dataset
D given variable z.
3. Gain Ratio: The gain ratio is a refinement of information gain that overcomes a
drawback of IG [7]: IG tends to favor attributes with many values, which may not be
useful for the decision. The gain ratio helps in selecting the features that
are essential for the outcome in the decision tree. The gain ratio can be calculated
as

$$\mathrm{GainRatio}(a) = \frac{\mathrm{InformationGain}(a)}{\mathrm{ImportantValue}(a)} \quad (3)$$

where

$$\mathrm{ImportantValue}_a(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|} \quad (4)$$

Equation (4) gives the value obtained by dividing the training dataset D into n
parts D_1, ..., D_n according to attribute a.
4. Symmetrical uncertainty: IG depends on the context in which attributes are
selected, which is a drawback of IG. Symmetrical uncertainty (SU) overcomes this
drawback [8] by normalization. SU is restricted to the range 0 to 1: a value of 0
means there is no relation between the two attributes, and a value of 1 means the
two attributes are dependent on each other and can be taken into consideration.
It can be calculated as

$$SU = 2 \times \frac{IG}{H(a) + H(b)} \quad (5)$$
324 P. Kolpe and D. Kshirsagar
where O(f) is the observed frequency, that is, the number of observations in the class,
and E(f) is the expected frequency [10], that is, the number of observations expected
in the class when there is no connection between the feature and the target.
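The measures listed above can be illustrated with a short, self-contained sketch for discrete features. It is not the Weka implementation: the function names and the toy data are hypothetical, the correlation measure follows the merit formula of [5] with a square root in the denominator, and the chi-square function assumes the standard Pearson form, sum of (O(f) - E(f))^2 / E(f).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X), in bits, of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature): label entropy per feature value, weighted by group size."""
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(labels, feature):
    """IG = H(D) - H(D | z)."""
    return entropy(labels) - conditional_entropy(labels, feature)


def gain_ratio(labels, feature):
    """IG divided by the split information of the feature (its own entropy)."""
    split_info = entropy(feature)
    return information_gain(labels, feature) / split_info if split_info > 0 else 0.0


def symmetrical_uncertainty(labels, feature):
    """SU = 2 * IG / (H(a) + H(b))."""
    denom = entropy(labels) + entropy(feature)
    return 2 * information_gain(labels, feature) / denom if denom > 0 else 0.0


def cfs_merit(m, avg_feature_class_corr, avg_feature_feature_corr):
    """Merit of a subset of m features from its average correlations."""
    return m * avg_feature_class_corr / math.sqrt(
        m + m * (m - 1) * avg_feature_feature_corr
    )


def chi_square(observed, expected):
    """Pearson chi-square statistic (assumed form)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Toy data: one feature that determines the class and one that does not.
labels = ["bot", "bot", "benign", "benign"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

ig_good = information_gain(labels, informative)
ig_bad = information_gain(labels, uninformative)
su_good = symmetrical_uncertainty(labels, informative)
gr_good = gain_ratio(labels, informative)
```

On this toy data the informative feature reaches the maximum value for IG, gain ratio, and SU, while the uninformative feature has an IG of zero, which matches the interpretation of the 0 to 1 range given above.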
The rest of our paper is organized into four sections. First, we perform a literature
survey, which gave us direction in selecting a classifier for botnet detection. The
next section describes our proposed model for the detection of botnets. The results
are presented and discussed in the section after that.
2 Related Work
Narang et al. [14] proposed a conversation-based mechanism for the detection
of botnets. They used Discrete Fourier Transforms (DFTs) and information
entropy as measures, and the Correlation-based Feature Selection (CFS) algorithm and
the Consistency-based Subset Evaluation (CSE) search algorithm for feature selection.
The classifiers they used are random forests, reduced error pruning (REP) trees,
a Naive Bayes and decision tree hybrid classifier, namely the Naive Bayes tree (NB tree),
the k-nearest neighbors algorithm, Support Vector Machines (SVM), and the 'stacking'
ensemble learning technique (also known as 'stacked generalization'). They selected
a total of 23 features.
Kirubavathi and Anitha [15] proposed an approach for botnet detection based on
network traffic performance analysis. The classifiers they used are a boosted
decision tree (AdaBoostM1+J48) ensemble classifier, a Naive Bayesian (NB) statistical
classifier, and a support vector machine (SVM) discriminative classifier. They
used publicly available datasets: the Conficker dataset from CAIDA, the
ISOT Botnet dataset from the University of Victoria, a dataset from the University of
Georgia, four different datasets from CVUT University, three different IRC botnet
datasets from Centro University, Argentina, and the Citadel botnet and Alexa benign
datasets from Dalhousie University. Of the classifiers used, the Naive Bayesian
classifier gives the highest accuracy of 99% with a 0.02% false positive rate.
Haq and Singh [16] proposed a hybrid approach for detection on the basis
of the false positive rates of signature-based and anomaly-based detection. The hybrid
approach combines a classification technique and a clustering technique. They
used the Weka tool for pre-processing, training, testing, and cross-validation of
features. The classifiers used are the Naive Bayes classifier, the IBk classifier, Rule
Decision Table, Trees, and the J48 classifier. They also used k-means
clustering. They used the publicly available ISOT, ISCX, and CTU-13 datasets.
According to the analysis of the results obtained with the various classifiers,
the J48 tree algorithm gave the highest accuracy of 90.2723%.
Garre et al. [17] proposed a system that uses honeypots with SSH sensors to capture
the data coming through the network and a machine learning technique to detect
attacks. They selected features based on the commands captured during the
SSH session. The classifiers used are decision tree, random forest, SVM, and Naive
Bayes. They created their own dataset with a total of 93 features: 72 commands, 7
session states, and 14 statistics. The Random Forest classifier achieved a precision of
95.7% and a recall of 93.9%, which indicates that the random forest gives fewer false
negatives. For this reason, they chose the Random Forest classifier. They tested
their model on 20% of the training dataset, achieving the highest accuracy of 99.59%,
a precision of 96.87%, and a recall of 100%, that is, zero false negatives.
Narang et al. [18] studied feature selection techniques for peer-to-peer botnet
traffic. They used the publicly available ISOT dataset and a dataset they created
themselves. There were a total of 23 features in their dataset. They used three
feature selection techniques: CFS (Correlation-based Feature Selection),
CSE (Consistency-based Subset Evaluation), and PCA (Principal Component Analysis).
For the experiments, three classifiers are used: C4.5, Naive Bayes,
and Bayes Network. They obtained 5 features from CFS, 8 features from CSE, and
12 features from PCA.
Al Janabi et al. [8] used a real-time dataset with 18266 records and 5 features to
determine which technique is best. Feature selection techniques are of two types,
the wrapper model and the filter model, which are further classified into
different types. They performed a comparative study of these types of feature
selection techniques.
3 Proposed Model
3.1 Dataset
In our proposed model, there are different components, as shown in Fig. 1. The first
component of our model is the dataset; we used the dataset [19] named CICIDS
2017 Friday Morning Botnet. This dataset is made publicly available by the University
of New Brunswick (UNB). The dataset has 184045 instances,
out of which 182096 are benign and 1948 are bots. Since the dataset is a collection
of raw data, it needs to be pre-processed. The pre-processing of the data is performed
by the next component of our model, as shown in Fig. 1.
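Because the two classes are heavily imbalanced, accuracy figures are easier to interpret alongside the share of each class. The following sketch uses only the two class counts reported above; the variable names are illustrative, and the total is taken as the sum of the two class counts.

```python
benign = 182096
bots = 1948
total = benign + bots  # sum of the two reported class counts

# Share of bot instances in the dataset, in percent.
bot_share = bots / total * 100

# Accuracy, in percent, of a trivial classifier that always predicts
# the majority class.
majority_baseline_accuracy = max(benign, bots) / total * 100
```

A classifier that always predicts the majority class already reaches this baseline, so precision and recall on the bot class are a useful complement to accuracy on data of this kind.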
Feature selection is a technique for selecting distinct features from a dataset. We
used the open-source software Weka to select useful features by applying
different machine learning feature selection algorithms. The dataset is then classified
on the basis of the selected features, a task performed by the next component,
called the classifier. We use filter-based feature selection algorithms for selecting
distinct features.
3.4 Classifier
There are different ML classifiers that are used for the classification of data. In
classification, the class is predicted on the basis of the input data. The class is also
called the target output. Of the different types of classifiers, we
use a binary classifier, which gives the class as bot or benign. Of the
machine learning-based classifiers available in Weka, we use the
Bayes classifiers.
In our experimentation, we used a labeled dataset collected from [19], which
consists of 78 features including 1 label. This dataset is pre-processed
with the help of Python code. The pre-processed data is used for feature selection.
Features are chosen using filter-based feature selection algorithms such as
Information Gain, Correlation, Gain Ratio, Symmetrical Uncertainty, Chi-Square, and
ReliefF. As shown in Fig. 1, with the help of the ML feature selection algorithms,
features are listed based on their weights and according to their ranks. We observed that
the selected feature list has zero and non-zero weighted feature subsets. The
zero weighted features do not have any significance, so we applied different
classification algorithms to the non-zero weighted feature subsets. The
non-zero weighted feature subsets are shown in Table 1, and the zero weighted
feature subsets are shown in Table 2.
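The ranking and splitting step for a single feature selection algorithm can be sketched as follows. The feature names and weights are invented, and `split_by_weight` is a hypothetical helper rather than a Weka function.

```python
def split_by_weight(weights):
    """Rank features by weight (highest first) and split zero from non-zero."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    non_zero = [name for name, weight in ranked if weight != 0]
    zero = [name for name, weight in ranked if weight == 0]
    return non_zero, zero


# Hypothetical weights produced by one feature selection algorithm.
weights = {"feat_a": 0.07, "feat_b": 0.0, "feat_c": 0.31}

non_zero_subset, zero_subset = split_by_weight(weights)
```

Only the non-zero subset would then be passed on to the classifier.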
We applied the Naive Bayes classifiers to the selected features with tenfold
cross-validation. The results using the Naive Bayes classifiers are shown in Table 3. From
Table 3, we observed that the Naive Bayes Multinomial Text classifier gives the
highest accuracy of 98.9416%, so we selected this classifier for our further
experimentation. We applied the Naive Bayes Multinomial Text classifier to the
non-zero weighted feature subsets from Table 1. The results after applying the Naive
Bayes Multinomial Text classifier to the lists of features selected by the different
feature selection algorithms are shown in Table 4. It is observed that Correlation
and ReliefF selected the same list of features. Similarly, Gain Ratio, Chi-Square,
Information Gain, and Symmetrical Uncertainty selected the same features.
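To make the tenfold cross-validation procedure concrete, the following sketch generates the train and test index sets for k folds. It is a simplification and not Weka's implementation: it neither shuffles nor stratifies the data, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [
        n_samples // k + (1 if fold < n_samples % k else 0) for fold in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Each instance appears in exactly one test fold and in the other nine training sets.
folds = list(k_fold_indices(25, k=10))
```

The reported accuracy under this scheme is the result aggregated over all ten test folds, so every instance is used for testing exactly once.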
5 Conclusions
We proposed a model for botnet detection using filter-based feature selection
algorithms and Naive Bayes classifiers. Bots are detected with the highest accuracy
of 98.9416% using the Naive Bayes Multinomial Text classifier on the
CICIDS 2017 dataset. In the future, we will try to reduce the features by using different
feature reduction techniques to obtain high accuracy. We will also test our model
against different datasets.
References
1. https://www.akamai.com/uk/en/resources/what-is-a-botnet.jsp
2. Chen, C.-M., Lin, H.-C.: Detecting botnet by anomalous traffic. J. Inf. Secur. Appl. 21, 42–51
(2015)
3. https://www.crowdstrike.com/epp-101/botnets/
4. Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng.
40(1), 16–28 (2014)
5. Hall, M.A.: Correlation-based feature selection for machine learning (1999)
6. https://machinelearningmastery.com/information-gain-and-mutual-information/#:~:
text=Information
7. Trabelsi, M., Meddouri, N., Maddouri, M.: A new feature selection method for nominal clas-
sifier based on formal concept analysis. Proc. Comput. Sci. 112, 186–194 (2017)
8. Al Janabi, K.B.S., Kadhim, R.: Data reduction techniques: a comparative study for attribute
selection methods. IJACST 2249–3123
9. Pupo, O.G.R., Morell, C., Soto, S.V.: ReliefF-ML: an extension of ReliefF algorithm to multi-
label learning. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds.) Progress in Pattern Recognition,
Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer
Science, vol. 8259. Springer, Berlin, Heidelberg (2013). https://doi.org/10.1007/978-3-642-
41827-36
10. Deshpande, P., Sharma, S.C., Peddoju, S.K., et al.: Security and service assurance issues in
cloud environment. Int. J. Syst. Assur. Eng. Manag. 9, 194–207 (2018). https://doi.org/10.1007/s13198-016-0525-0
11. Mukherjee, S., Sharma, N.: Intrusion detection using naive Bayes classifier with feature reduc-
tion. Proc. Technol. 4, 119–128 (2012)
12. Ugochukwu, C.J., Bennett, E.O.: An intrusion detection system using machine learning algo-
rithm. Int. J. Comput. Sci. Math. Theory 4(1), 39–47 (2018)
13. https://scikit-learn.org/stable/modules/naive_bayes.html
14. Narang, P., Hota, C., Sencar, H.T.: Noise-resistant mechanisms for the detection of stealthy
peer-to-peer botnets. Comput. Commun. 96, 29–42 (2016)
15. Kirubavathi, G., Anitha, R.: Botnet detection via mining of traffic flow characteristics. Comput.
Electr. Eng. 50, 91–101 (2016)
16. Haq, S., Singh, Y.: Botnet detection using machine learning. In: 2018 Fifth International Con-
ference on Parallel, Distributed and Grid Computing (PDGC), pp. 240–245. IEEE (2018)
17. Garre, J.T.M., Pérez, M.G., Ruiz-Martínez, A.: A novel machine learning-based approach for
the detection of SSH botnet infection. Future Gen. Comput. Syst. 115, 387–396
18. Narang, P., Reddy, J.M., Hota, C.: Feature selection for detection of peer-to-peer botnet traffic.
In: Proceedings of the 6th ACM India Computing Convention, pp. 1–9 (2013)
19. Canadian Institute of Cybersecurity. https://www.unb.ca/cic/datasets/ids-2018.html
Insider Attack Prevention using
Multifactor Authentication Protocols - A
Survey
Abstract Technological progress brings what we need to our doorstep by letting us
access applications through a palmtop or computer system. Users
share their credentials with the application servers to access web applications. It
is mandatory to secure the user credentials from illegal access or usage. Multifactor
authentication protocols are designed so that web applications can be used securely.
At the same time, a study of these proposed protocols is necessary to assess their
strength in preventing several Security attacks, especially Insider attacks, which are
crucial. This paper presents a comparative study of different Security protocols based on
their strength in preventing Security attacks, their performance, and the other
parameters involved in the protocol. It is inferred from the study that Insider attacks
can be prevented only when the Security protocol is free from various other attacks.
The comparative study of performance reveals that the computation cost increases
with the strength of the cryptosystem used in designing the protocol.
1 Introduction
The growth of the digital world demands well-built web applications with high-level
Security. Authentication of the parties involved in communication is predominant in
deciding the level of Security. There are several applications on the web that are most
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 331
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_32
332 S. Rajamanickam et al.
widely used by many web users. In this paper, a few of the multifactor authentication
protocols suitable for applications such as single-server and distributed-server
environments, remote user access, e-healthcare systems, and the Internet of Things are
discussed. Users prefer to use smart cards, for which numerous authentication protocols
for a single-server environment have been proposed. This idea was then extended to
multi-server environments, where protocols are designed so that users can access the
services of all servers using a single identity and password pair. The increasing
spread of diseases and the high population dictate the necessity of Telecare Medical
Information Systems (TMIS). Such a system creates a comfortable communication
environment that links remote patients with medical practitioners and gives instant
access to the required data. Advances in technology demand wireless sensor networks
in medical applications, which are used to improve the care provided to people of
different categories. The dependence of consumer and business applications on cloud
and IoT platforms has increased, leading to many technologies [9]. However, the
applications mentioned above are fruitful only when users feel secure while accessing
them. Generally, users prefer to use applications frequently only when the applications
use strong authentication protocols for communication. These protocols are recommended
for use only when they are free from several Security attacks. Researchers have
proposed several multifactor authentication protocols that claim to prevent
Security attacks and preserve the Security properties discussed in the forthcoming
section.
1. Password guessing attacks: A brute-force attack in which an attacker tries
to obtain the password by trying different possible combinations of letters
or symbols. It is classified into offline and online guessing attacks.
a. Offline password guessing attacks: The attacker tries to recover the original
text from a hash value he has obtained, using a list of candidate passwords and
their corresponding hash values stored for this purpose.
b. Online password guessing attacks: The attacker attempts to guess the correct
password by trying several username and password pairs against the user login
portal.
2. Man in the middle attack: The attacker takes up a position between the
communicating parties and eavesdrops on or impersonates them, while the
communication between the entities appears to be regular.
3. Impersonation attacks: On obtaining the identity of a legitimate user or server, the
attacker masquerades as that party to communicate with the other parties in the protocol.
4. Replay attack: A malicious attacker retransmits a previously transmitted
message, causing a delay or an attack.
5. Session key disclosure attack: The attacker obtains the credentials from messages
collected while the session keys are transmitted during the key agreement phase.
6. Denial of Service attack: Legitimate users are denied access to services
because of traffic flooding caused by the attackers.
7. Server spoofing attack: On obtaining the server’s credentials, a malicious user
poses as a legitimate server and communicates with the user.
8. Stolen verifier attack: The attacker obtains the users’ credential information,
which is stored in the database maintained by the server.
9. Cookie theft attack: The attack is carried out by obtaining the data from the cookie
in a smart device.
10. Modification attack: An illegitimate user alters the valid data
transmitted between the communicating parties.
11. Smart card stolen attack: The attacker retrieves the valid credentials stored
in the smart card during the registration process by executing a power analysis
attack and uses the data for illegal access to web applications.
12. Insider attack: An insider attack is predominant over all other attacks, since access
to the user’s significant credentials is easy for Insiders, who may be current
employees or former employees of the organization.
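The offline password guessing attack in item 1a can be illustrated with a short sketch that shows why low-entropy passwords are exposed once a hash is obtained. The setting is an assumption for illustration only: the passwords are hashed with unsalted SHA-256, and the candidate list and function names are hypothetical.

```python
import hashlib


def sha256_hex(password):
    """Hex digest of the unsalted SHA-256 hash of a password (assumed scheme)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute a hash -> password table from a list of candidate passwords."""
    return {sha256_hex(p): p for p in candidates}


def offline_guess(obtained_hash, lookup):
    """Return the matching candidate password, or None if it is not in the table."""
    return lookup.get(obtained_hash)


# Hypothetical candidate list of commonly used passwords.
lookup = build_lookup(["123456", "password", "letmein"])

weak = offline_guess(sha256_hex("letmein"), lookup)
strong = offline_guess(sha256_hex("v7#Qm2!xLr9$Tz"), lookup)
```

The weak password is recovered by a single table lookup, whereas the high-entropy password is not in the table, which is the reason protocols are expected to resist offline guessing even when hash values leak.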
Insider attacks may occur in one of the following cases:
Case 1: Insiders obtain the user credentials from the user while he/she registers with
the service organization.
Case 2: Insiders retrieve the user credentials from the messages shared with the
service provider.
Case 3: Insiders have a chance to correctly guess a valid username and
password if they are of low entropy.
Case 4: Insider attacks become possible when careless users use a single
username and the same password to access all web applications.
Insider attacks are to be treated seriously since they cause financial loss and
reputational damage to the organization. Moreover, it is challenging to detect
Insider attacks. The better option for organizations is to use Security protocols that
are free from Insider attacks. Apart from being free from security attacks, the security
protocols must guarantee that the following properties are preserved:
1. User/Device anonymity: User or device credentials should be confidential and
unknown to illegitimate users.
2. Forward secrecy: Session keys must remain protected even when the long-term
confidential information is compromised during the key exchange.
3. Backward secrecy: An attacker who knows a contiguous subset of the group keys
must be unable to discover the preceding group keys.
4. Key independence: A passive adversary who knows any proper subset of the group
keys must be unable to obtain the other group keys.
2 Literature Survey
The user authentication protocol proposed in [8] for medical applications, which uses
smart cards and passwords as the significant authentication parameters, suffers from
Insider and several other attacks, as shown in [3]. A key agreement scheme for
authentication, designed specifically for SIP, is proposed in [2]; the cryptanalysis in
[7] proves that it suffers from other Security attacks, including Insider attacks. A
biometric-based protocol for authenticating a user to multiple servers, in which the
user's password and biometric information are stored in a smart card using ECC, is
proposed in [10]; it has various drawbacks, as stated in [16]. An authentication scheme
that lets remote users access multiple servers using a key agreement scheme
is proposed in [15] and is found to be insecure against several Security attacks, as
proved in [12]. Kalra and Sood proposed an authentication protocol using ECC for
embedded devices, HTTP clients, in [5], which is found to be insecure against
Security attacks in [6]. Users who prefer roaming services in global mobility
networks can use the anonymous authentication scheme proposed by Mun et al. in
[9], which is cryptanalyzed in [13] and found to have various Security drawbacks. A
three-factor key agreement scheme for Telecare Medicine Information Systems,
proposed by Nikooghadam et al. in [1], is found to be insecure, as stated in [4]. In
common, all these protocols are insecure against Insider attacks. The vulnerabilities
causing Insider attacks are discussed in the
Insider Attack Prevention using Multifactor Authentication Protocols - A Survey 335
Based on the security loopholes, the causes of Insider attacks are categorized into the
following cases:
1. Case 1: Direct sharing of credentials
Liu and Chung scheme: As mentioned in [3], the Liu and Chung scheme [8] cannot
prevent Insider attacks. The user's identity and password are shared with the trusted
authority to personalize the smart card. An Insider can therefore misuse the
credentials to mount attacks.
2. Case 2: Credentials obtained by offline guessing attacks
Wang et al. scheme: As cryptanalyzed in [12], the Wang et al. scheme [15] suffers
from Insider attacks. Insiders obtain the credentials through offline guessing attacks.
3. Case 3: Weakness in the server verification message
Arshad et al. scheme: In [7], the authors cryptanalyzed the scheme proposed by Arshad
et al. [2] and showed that an Insider can pose as a legal client: after selecting a
random number $d_c$, the Insider obtains the message
$V_c = h(ID_i \parallel Q_s \parallel realm \parallel d_c Q_s \parallel V_i)$, where
$V_i = h(ID_i \parallel PW_i \parallel N_c)$. Insiders can thus mount a forgery attack.
4. Case 4: Intentional action of the attacker
Odelu et al. scheme: The weakness of the Odelu et al. scheme [10] is revealed in [16].
The user shares $ID_i$ with the organization, and the hashed value $h(ID_i \parallel k)$
and $r_i$ are stored in a table. An attacker can intentionally delete this entry,
register again with the same $ID_i$, and mount a forgery attack.
5. Case 5: Credentials obtained by guessing and stolen verifier attack
Mun et al. scheme: In [13], the Mun et al. scheme [9] is cryptanalyzed, and it is found
that Insiders can execute a stolen verifier attack: they obtain information
from the database, perform a guessing attack to recover the user's credentials, and
misuse them in several ways.
6. Case 6: Credentials generated by Insider
Kalra and Sood scheme: In [5], the cloud server CS itself generates a unique
password for every device registered with it. Any Insider can misuse this
information to mount several attacks, as proved in [6].
7. Case 7: Credentials obtained by power analysis attack and smart card stolen
attack
Arshad and Nikooghadam scheme: The cryptanalysis in [4] proves that the Arshad
and Nikooghadam scheme [1] is prone to an Insider attack if the Insider steals
the smart card and then extracts the credentials through a power analysis
attack.
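Cases 2 and 5 both rest on offline guessing against a stored or intercepted hash. The sketch below illustrates the idea for a verifier of the form h(ID, PW, N), assuming SHA-256 as the hash and a "||" separator for concatenation. Both choices, and every name in the code, are illustrative assumptions and not details taken from the surveyed schemes.

```python
import hashlib


def h(*parts: str) -> str:
    """One-way hash over concatenated fields (SHA-256 is an assumption here)."""
    return hashlib.sha256("||".join(parts).encode()).hexdigest()


def offline_guess(user_id, nonce, stolen_verifier, candidates):
    """Try each candidate password against the stolen verifier, with no server contact."""
    for password in candidates:
        if h(user_id, password, nonce) == stolen_verifier:
            return password
    return None


# The Insider reads the verifier from the database and knows the ID and nonce.
verifier = h("alice", "sunshine", "n-001")
dictionary = ["123456", "password", "sunshine", "qwerty"]
recovered = offline_guess("alice", "n-001", verifier, dictionary)
print(recovered)
```

Because every guess is checked locally, rate limiting at the server does not help. The defence has to come from the protocol design, for example by never storing or transmitting a value that lets a guess be verified.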
Table 1 Characteristic features of the security protocols
| Refs. | Application that can use the protocol for communication | Factors used for authentication | Participants involved in the communication | Merits | Cryptographic algorithm and operations used | Tool used for formal security analysis |
|---|---|---|---|---|---|---|
| [8] | Wireless health care sensor networks | User id, password | Client, trusting authority or server, sensor node | Instant data access for the users; low computation cost | Bilinear pairing, hashing | Not done |
| [2] | IP based telephony networks | User id, password | Client, server | Avoids illegal usage of Voice over Internet protocol | ECC, hashing | Not done |
| [10] | Battery limited mobile devices | User id, password, biometric | User, server, registration center | Provides strong user anonymity | ECC, hashing | BAN logic, AVISPA tool |
| [15] | Remote distributed networks | User id, password, biometric | User, server, registration center | Easy re-registration procedure for the user; preserves biometric information | Hashing, Key Exchange Protocol (IKEv2) | Oracle reveal |
| [5] | Internet of Things cloud server | Device id, password | Embedded device, server | Expands the coverage of capabilities offered by IoT, making them reliable | ECC, hashing | AVISPA |
| [9] | Global mobility networks | User id | Mobile user, home agent, foreign agent | Free from synchronization problem | Elliptic Curve Diffie–Hellman (ECDH), hashing | Not done |
| [1] | Telecare Medical Information system | User id, password, biometric | User, telecare server | Provides strong user anonymity | ECC, hashing | Not done |
| [11] | Suitable for all web applications | User id, password | User, service providing server, password management server | Free from inside attacks | ECC, hashing | Scyther |
4 Security Analysis
All the protocols mentioned above claim to be free from these attacks on the basis of
informal security analysis, in which the authors cryptanalyze their own proposed
protocol; this analysis appears to be incomplete. Formal security analysis is also
carried out using tools such as BAN logic and AVISPA.
5 Comparative Study
A comparative study of all these protocols is presented in Table 1, which sets out
their characteristic features. A survey of how the protocols behave under different
attacks is presented in Fig. 1. The performance of the protocols is also analyzed,
with particular attention to the computation cost incurred in the login and
authentication phase and in the key agreement and generation phase; the results are
presented in Table 3. The notations used for the performance analysis are described in
Table 2.
Table 3 The computation time of the security protocols for login and authentication and key
generation phases
| Scheme | Computation cost calculation for login and authentication phase | Computation time (in ms) |
|---|---|---|
| Liu chung | 1Th | 0.023 |
| Arshad et al. | 8Th + 4TPM + 1TINV + 1TM + 3TR | 10.86 |
| Odelu et al. | 24Th + 6TPM + 6TK | 13.43 |
| Wang et al. | 4Th | 0.032 |
| Kalra & Sood's | 9Th + 7TPM | 15.602 |
| Mun et al. | 10Th + 4TM + 2TK | 0.042 |
| Arshad & Nikooghadam's | 15Th + 4TPM + 1TINV + 2TM | 9.258 |
| Siranjeevi Rajamanickam et al. | 8Th + 4TPA + 1TK | 0.265 |
6 Conclusion
References
1. Arshad, H., Nikooghadam, M.: Three-factor anonymous authentication and key agreement
scheme for telecare medicine information systems. J. Med. Syst. 38(12), 136 (2014)
2. Arshad, H., Nikooghadam, M.: An efficient and secure authentication and key agreement
scheme for session initiation protocol using ECC. Multimed. Tools Appl. 75(1), 181–197
(2016)
3. Challa, S., Das, A.K., Odelu, V., Kumar, N., Kumari, S., Khan, M.K., Vasilakos, A.V.: An
efficient ecc-based provably secure three-factor user authentication and key agreement protocol
for wireless healthcare sensor networks. Comput. Electr. Eng. 69, 534–554 (2018)
4. Das, A.K.: A secure user anonymity-preserving three-factor remote user authentication scheme
for the telecare medicine information systems. J. Med. Syst. 39(3), 30 (2015)
5. Kalra, S., Sood, S.K.: Secure authentication scheme for iot and cloud servers. Pervasive Mob.
Comput. 24, 210–223 (2015)
6. Kumari, S., Karuppiah, M., Das, A.K., Li, X., Wu, F., Kumar, N.: A secure authentication
scheme based on elliptic curve cryptography for iot and cloud servers. J. Supercomput. 74(12),
6428–6453 (2018)
7. Lin, H., Wen, F., Chunxia, D.: An anonymous and secure authentication and key agreement
scheme for session initiation protocol. Multimed. Tools Appl. 76(2), 2315–2329 (2017)
8. Liu, C.-H., Chung, Y.-F.: Secure user authentication scheme for wireless healthcare sensor
networks. Comput. Electr. Eng. 59, 250–261 (2017)
9. Mun, H., Han, K., Lee, Y.S., Yeun, C.Y., Choi, H.H.: Enhanced secure anonymous authentica-
tion scheme for roaming service in global mobility networks. Math. Comput. Model. 55(1–2),
214–222 (2012)
10. Odelu, V., Das, A.K., Goswami, A.: A secure biometrics-based multi-server authentication
protocol using smart cards. IEEE Trans. Inf. Forensics Secur. 10(9), 1953–1966 (2015)
11. Rajamanickam, S., Vollala, S., Amin, R., Ramasubramanian, N.: Insider attack protection:
lightweight password-based authentication techniques using ECC. IEEE Syst. J. (2019)
12. Reddy, A.G., Yoon, E.J., Das, A.K., Odelu, V., Yoo, K.Y.: Design of mutually authenticated
key agreement protocol resistant to impersonation attacks for multi-server environment. IEEE
Access 5, 3622–3639 (2017)
13. Reddy, A.G., Yoon, E.J., Das, A.K, Yoo, K.Y.: Lightweight authentication with key-agreement
protocol for mobile network environment using smart cards. IET Inf. Secur. 10(5), 272–282
(2016)
14. Shamshad, S., Mahmood, K., Kumari, S., Khan, M.K.: Comments on “insider attack protection:
lightweight password-based authentication techniques using ECC”. IEEE Syst. J. (2020)
15. Wang, C., Zhang, X., Zheng, Z.: Cryptanalysis and improvement of a biometric-based
multi-server authentication and key agreement scheme. PLoS ONE 11(2) (2016)
16. Zhang, M., Zhang, J., Tan, W.: Remote three-factor authentication protocol with strong robust-
ness for multi-server environment. China Commun. 14(6), 126–136 (2017)
Link Scheduling in Wireless Mesh
Network Using Ant Colony Optimization
Abstract Wireless communication technologies have grown rapidly in the last few
years. When a network is established across a given geographical distance, the
biggest issue is how to share it effectively among all the stakeholders. Today, the
Internet spans the globe, so the network must be shared among nations without loss
to anyone operating it. Several greedy network sharing approaches address this
problem; they compute the busyness of the network along with the routing
information, and together these computations affect the throughput. To avoid this,
we present a novel approach that applies Genetic Algorithm-based Ant Colony
Optimization to heavily trafficked networks to improve network performance. We
further predict future network traffic and schedule the network routing mechanism
accordingly. Our experiments on the NS2 simulator show that the results are better
than those of the other greedy-based methods presented.
1 Introduction
Most readers know of DARPA's first experiment in the history of computer
networks. The rapid development that followed created immense pressure to share
resources such as network bandwidth and the hardware available for routing [1].
This led to the now prevalent network traffic scheduling methods based on greedy
techniques [2]. Although greedy approaches give good results, the network is still
not utilized to its full extent. Hence, we propose a novel approach that uses Genetic
Algorithms together with predictive analysis techniques. Genetic Algorithms, which
are modeled on human genes, have been incorporated successfully in system design and
development [3]. The Genetic Algorithm is widely used as an optimization technique.
We use it in this context and further improve network performance through
predictive analysis [4]. A wireless mesh network is complex and has multiple devices
of hybrid categories attached to its service points. Earlier studies and experiments
with greedy approaches by different researchers improved network performance but
left some lacunae. We therefore focus on better scheduling of network traffic and
use the Ant Colony Optimization technique in our proposed method. An ant uses a
special chemical to mark its path. When an ant finds food in the vicinity, it marches
back to its colony and, on the way, lays down a chemical called a pheromone. A
pheromone is a secreted chemical that triggers a predefined social response among
members of the same species. Various pheromones exist, e.g., alarm pheromones,
food-trail pheromones, and escape pheromones. Here, we use the food-trail pheromone
concept [5].
M. D. Wangikar
School of Computational Sciences, Swami Ramanand Teerth Marathwada University, Nanded
431605, India
B. R. Bombade
Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded 431605, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_33
With food-trail pheromones, when an ant finds food, it lays down pheromone, and
the resulting social response leads other ants to trace the path back to the food.
Different ants try different paths to reach the same food. However, food-trail
pheromones evaporate due to atmospheric conditions such as heat, humidity, and wind
flow, as well as the activity of other species. As a result, the chemical trail
with the shortest distance to the food is always the strongest, because it is
reinforced most often before it fades. Although every ant tries its own path, the
group eventually chooses only the shortest route, and all other available paths are
discarded [6], as shown in Fig. 1. A similar concept can be applied to network
traffic scheduling [7]. The network may be a LAN, WAN, or MAN, wired or wireless,
terrestrial or extraterrestrial, of a single kind or hybrid. The proposed method
uses G.A. to avoid delay in case of congestion irrespective of the type of network.
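The pheromone mechanism described here can be illustrated with a small deterministic model. This is a sketch under stated assumptions, not the authors' implementation: ants split across paths in proportion to the current pheromone, each path receives a deposit inversely proportional to its length, and a fraction rho of the pheromone evaporates every round. The function name and parameter values are hypothetical.

```python
def simulate_trails(lengths, rho=0.5, rounds=30):
    """Evolve pheromone levels on parallel paths of the given lengths.

    Each round, a fraction `rho` evaporates, and ants (modelled by their
    expected share per path) deposit pheromone inversely proportional to
    the path length.
    """
    tau = [1.0] * len(lengths)  # equal pheromone on every path at the start
    for _ in range(rounds):
        total = sum(tau)
        share = [t / total for t in tau]  # fraction of ants taking each path
        tau = [
            (1 - rho) * t + s / length
            for t, s, length in zip(tau, share, lengths)
        ]
    return tau


# Two routes to the same food source: one of length 2, one of length 5.
tau = simulate_trails([2.0, 5.0])
best = max(range(len(tau)), key=tau.__getitem__)
print(f"pheromone levels: {tau}, preferred path index: {best}")
```

In this model the ratio of pheromone on the short path to that on the long path grows every round, so the colony converges on the shorter route. With equal path lengths, neither path gains an advantage.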
Further, a predictive technique is applied to avoid delays by marking the
intermediate nodes used as routers and ranking them according to their busy/available
bandwidth. This helps predict the availability of bandwidth in advance. The
predicted values may not always match the actual ones, but they are fair estimates
on which routing and scheduling of packets can be based effectively [8].
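The text does not say which predictor is used. As one possible illustration, the sketch below forecasts each router's available bandwidth with an exponentially weighted moving average and ranks the routers by that forecast. The predictor, the smoothing factor alpha, the router names, and the sample values are all assumptions made for this example.

```python
def ewma_forecast(samples, alpha=0.5):
    """Forecast the next value as an exponentially weighted moving average."""
    estimate = samples[0]
    for sample in samples[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def rank_routers(history):
    """Order router ids from most to least forecast available bandwidth."""
    forecasts = {router: ewma_forecast(values) for router, values in history.items()}
    return sorted(forecasts, key=forecasts.get, reverse=True)


# Hypothetical available-bandwidth samples (in Mbps) for three routers.
history = {
    "R1": [8.0, 6.0, 4.0],
    "R2": [2.0, 4.0, 6.0],
    "R3": [5.0, 5.0, 5.0],
}
ranking = rank_routers(history)
print(ranking)
```

A larger alpha makes the forecast follow recent samples more closely, while a smaller alpha smooths out short bursts. Either way, the forecast is only an estimate, as the text notes.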
2 Proposed Methodology
Network scheduling is required when more than two users use a network, and this is
where resource sharing algorithms come in. To date, many researchers have
successfully applied various algorithms to resource sharing problems. But as
networks have grown across the globe, resolving contention over shared network
resources has become complex, and an optimal network resource sharing policy is
needed across the world.
Fig. 1 Ants colony optimization technique for finding the shortest path using pheromones
A brief comparison of the scheduling mechanism with other similar methods is given
in Table 1. The scheduling methods considered fall into two main types: node
scheduling and link scheduling. The comparison also covers whether the network is
centralized or distributed in nature, along with the threshold, the medium of
access, and whether it is single-channel or multi-channel (Table 1).
Selecting a Network: First, we chose a small network of 110 computers within our
institute. We then experimented with networks ranging from 200 to 100,000 nodes,
simulated on the NS2 simulator [13]. Later, we also ran the proposed algorithm on a
real-time network. The results differed between the simulator and the real-time
environment because of clock synchronization issues: the global and local
timestamps on the packets varied widely during network testing. The experiment also
showed that several real-time problems are not captured in the simulated
environment, e.g., loss of connectivity during transmission, loss of bandwidth due
to physical infrastructure failure, and other conditions that occur only in
real-time systems and cannot be reproduced in simulation.
A few parameters defined for the NS2 simulator are shown in Table 2.
Creating Test Packet Acting Like Pheromone: The pheromone packet is a test packet
that moves through the network at a predefined interval and finds the busiest
routes in the network [14]. Such a packet is analogous to the chemical pheromone
used by ants. Its timestamp acts as a pheromone clock, which reveals the exact
condition of the network links and their availability at a given time. The data
received helps schedule the wireless mesh network better.
| Parameter | Symbol | Value |
|---|---|---|
| Interference threshold | γi | 10 dB |
| Frame duration | Tf | 10 ms |
| Path loss exponent | β | 4 |
| Area covered | R*R | 886*886 m² |
Proposed Algorithm:
Steps Description
1 Fix the network parameters, such as network type, B.W., data intervals, test
packets, and test packet intervals:
Δ → n1, n2, n3, n5, …, nm (No. of Nodes)
δ → 5 (Routers/Gateways)
S(n) → S{Ø} = S1, S2, S3, …, Sn (n ≥ 25)
G(V, E) → E(G) = (Δ1, Δ2, Δ3, …, Δn)
Pheromone Packets (P) (P = 1, 2, 3, …, n)
2 Sort G(V, E) in ascending or descending order.
3 Create traffic {S1, Δ, δ1, δ2, δ3, δ4, δ5, P = Δ2}.
4 Deploy the Pheromone Packet and check the link status periodically.
5 Compute and find the busiest route and the available route for the given time
slot value of S.
6 Rank the routers and links according to their busyness and availability at the
given time.
7 /* Ranking provides predictive awareness and a self-decision-making
system to route/reroute the packets for better bandwidth management. Hence,
effective scheduling is achieved [15]. */
8 Repeat and compute for all G(V, E) across all Δ nodes. Increment S = S + 1,
Δ = Δ + 1 for all values of S.
9 End
The pheromone packet is a test packet that moves through the network at a
predefined interval, finds the busiest routes in the network, and acts like the
pheromone of ants.
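Steps 4 to 6 of the algorithm can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: each pheromone packet records the route it travelled and its send and receive timestamps, and a route's busyness is taken to be the mean delay of the probes that used it. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PheromonePacket:
    route: tuple        # sequence of node ids the probe travelled
    sent_at: float      # timestamp at the source, in seconds
    received_at: float  # timestamp at the destination, in seconds

    @property
    def delay(self) -> float:
        return self.received_at - self.sent_at


def rank_routes(probes):
    """Return routes ordered from least busy to busiest by mean probe delay."""
    totals = {}
    for probe in probes:
        delay_sum, count = totals.get(probe.route, (0.0, 0))
        totals[probe.route] = (delay_sum + probe.delay, count + 1)
    mean_delay = {route: s / n for route, (s, n) in totals.items()}
    return sorted(mean_delay, key=mean_delay.get)


# Hypothetical probe results for two routes between source S and destination D.
probes = [
    PheromonePacket(("S", "A", "D"), sent_at=0.0, received_at=0.25),
    PheromonePacket(("S", "A", "D"), sent_at=1.0, received_at=1.75),
    PheromonePacket(("S", "B", "D"), sent_at=0.0, received_at=0.125),
]
ranked = rank_routes(probes)
print("least busy route:", ranked[0])
print("busiest route:", ranked[-1])
```

The first entry of the ranking is the candidate for scheduling new traffic, and the last entry marks the route to avoid during the current time slot.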
Once Conditions A and B are satisfied, we start a small program that transmits
packets on the available network. Once a batch has been transmitted successfully
from more than five nodes, the mean time required for a packet to reach its
destination is calculated. We then varied the batches of data sent and tested the
result. After every batch of packets, a pheromone packet is transmitted to check
network availability. The results are tabulated in Table 3. They show that the
pheromone packet provides insight into the network traffic, from which the busyness
of the network is obtained. Once the traffic is analyzed, it becomes easier to rank
the intermediate routers and the available free paths. For a given packet size,
data size, available bandwidth, number of nodes, and number of routers, scheduling
can be done using pheromone packets.
In the above technique, we fixed the bandwidth at 10 MHz for all types of networks.
This lets us define a uniform solution to network scheduling problems. For a more
meaningful analysis and comparison, the experiment was run on 110 nodes in NS2. A
minimum signal-to-noise ratio is expected for any given test, because interference
may reduce the output below what is expected [16].
We deployed pheromone packets across the network in proportion to the network's
size, as shown in Table 4. These packets play the role of ants laying pheromone
across the network space and searching for the shortest route for better scheduling
and faster access [13, 17]. The time the pheromone packets take to reach their
destinations gives insight that helps tune the scheduling of network traffic
effectively. In the first experiment, the NS2 simulator was used with 25, 50, 100,
110, 200, 50,000, and 100,000 nodes. No pheromone packets were used in this
experiment, so that it serves as a baseline for comparison. The resulting throughput
is shown in Fig. 3. The second experiment used a LAN. Here, only two configurations,
with 110 and 200 nodes, were tested due to infrastructure constraints. The number of
pheromone packets is 10% of the total number of nodes, and the throughput is shown
in Fig. 3.
Similarly, in the experiment with the WAN-based system, the numbers of nodes were
50,000 and 100,000, and the pheromone packets numbered 10% of the node count. The
resulting throughput is shown in Fig. 2.
Fig. 2 Comparison of throughput value of the proposed method with existing algorithms
Based on the time the pheromone packets take from source to destination, the system
auto-tunes to the shortest path, i.e., the path with the lowest congestion or the
highest available bandwidth [18]. Similarly, from Fig. 3 we conclude that the
scheduling performance of the proposed algorithm is better than that of the
previously used algorithms, owing to the pheromone-based Ant Colony Optimization
algorithm [5]. Scheduling performance improves compared with greedy algorithms, and
the system continues to perform better as the number of nodes increases. The
proposed algorithm is also found to have good time complexity [19].
Fig. 3 Comparative scheduling performance of the proposed algorithm with existing algorithms
4 Conclusion
References
1. Simaribba, O., et al.: Robust STDMA, scheduling in multi-hop wireless networks for single
node position perturbation, pp. 566–571. IEEE (2009)
2. Brar, G., et al.: Computationally efficient scheduling with the physical interference model for
throughput improvement in wireless mesh networks. In: Proceeding MobiCom’06 Proceedings
of the 12th Annual International Conference on Mobile Computing and Networking, pp. 2–13
(2006)
3. Mang, K.F., et al.: Genetic algorithms, concept and application in engineering design. IEEE
Trans. Ind. Eng. 1, 519–534 (1996)
4. Martins, D., et al.: Classification with ant colony optimization. IEEE Trans. Evol. Comput. 11,
651–665 (2007)
5. Goyal, M., Agrawal, M.: Optimize workflow scheduling using hybrid ant colony optimization
and particle swarm optimization algorithm in cloud environment. Int. J. Adv. Res. Ideas Innov.
Technol. (IJARIIT) 181–189 (2017)
6. Nayyar, A. et al.: Ant colony optimization—computational swarm intelligence technique. In:
3rd International Conference on Computing for Sustainable Global Development, pp. 392–398
(2016)
7. Bruno, R., et al.: Mesh networks: commodity multihop adhoc networks. IEEE Commun. Mag.
123–131 (2005)
8. Nayyar, A., et al.: Ant colony optimization—computational swarm intelligence technique. In:
3rd International Conference on Computing for Sustainable Global Development, pp. 392–398
(2016)
9. Koutsonikolas, D., Das, S.M., Hu, Y.C.: An interference-aware fair scheduling for multicast in
wireless mesh networks. J. Parallel Distrib. Comput. 68, 372–386 (2008)
10. Wang, K., Chiasserini, C.F., Rao, R.R., Proakis, J.G.: A distributed joint scheduling and power
control algorithm for multicasting in wireless ad hoc networks. In: Proc. of IEEE Int. Conf. on
Communications, pp. 725–731 (2003)
11. Tran, N.H., Hong, C.S.: Fair scheduling for throughput improvement in wireless mesh networks.
pp. 1310–1312
12. Salem, N.B., Hubaux, J.-P.: A fair scheduling for wireless mesh networks. In: Proc. of 1st IEEE
Workshop on Wireless Mesh Networks (WiMesh) (2005)
13. Manickavasagan, V., et al.: Online resource scheduling using ants colony optimization for cloud
computing. Int. J. Eng. Sci. Comput. (IJESC) 5430–5432 (2017)
14. Hasio, Y.T., Chaung, C.L.: Ant colony optimization for best path planning. In: IEEE Inter-
national Symposium on Communication and Information Technology, ISCIT, pp. 668–678
(2004)
15. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 623–656
(1948)
16. Jain, K., et al.: Impact of interference on multi-hop wireless network performance. In: ACM
Proceedings on Network, pp. 66–80 (2003)
17. Adubi, S.A., et al.: A comparative study on the ant colony optimization. In: 2014 IEEE 11th
International Conference on Electronics, Computer and Computation, (ICECCO), pp. 215–228
(2014)
18. Luo, W., Lin, D., Feng, X.: An improved ant colony optimization and its application on TSP
problem. In: IEEE International Conference on Internet of Things and IEEE Green computing
and Communications (greencom) and IEEE Cyber, Physical And Social Computing (cpscom)
and IEEE Smart Data (smart data), pp. 175–188 (2016)
19. Gore, A.D., et al.: Link scheduling algorithms for wireless mesh network. IEEE Commun.
Surv. Tutorials 13(2), 258–273 (2011)
Development of an Integrated Security
Model for Wireless Body Area Networks
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 351
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_34
352 K. R. Siva Bharathi and R. Venkateswari
2 Proposed Methodology
WBANs are usually deployed in a limited area to collect sensitive data that includes
patient identity and health status and transmit it over an open network. This sensi-
tive information is captured the same way from numerous patients and directed to
a centralized data center. This increases the network burden along with innumer-
able security threats in the open network. In this paper, a practical methodology is
proposed to reduce network traffic and fortify network security.
A random WBAN network is generated initially, and a clustering approach is then
applied to the sensor nodes in the network. Each node in a cluster transmits its
data to the cluster head, usually a border router, which forwards the data to the
centralized node where it is analyzed. Clustering thus prevents every node in the
network from communicating directly with the centralized node, which reduces
network traffic. Once this is done, an integrated hybrid security model is applied
to the clustered architecture to provide a high level of security to the network.
The steps involved in generating an integrated secure WBAN model are illustrated in
Fig. 2.
Table 1 Features of a WBAN network
| WBAN features | Capacity |
|---|---|
| Individual WBAN size | 10 (max) |
| Total no. of WBAN nodes | 100 |
| Network topology | Random |
| Total WBAN energy | Sum of all body sensors |
within the coverage region. The outstanding features of a WBAN network are listed
under Table 1.
Clustering plays a vital role in reducing the network load. In general, WBANs are
resource-constrained networks with limited node energy and sensing range.
Clustering therefore helps improve the network lifetime and thus the stability of
the network. Each patient is considered a miniature WBAN network, or a single
cluster. Every patient has a border router sensor that collects information from
all the nodes on his or her body and communicates it to the centralized node. The
network is generated, and each node is checked for energy and node probability. A
node is selected as the cluster head if its energy and node probability satisfy the
thresholds. The nodes that fall within the selected cluster head's coverage become
the cluster members. Communication is then initiated from the nodes in the cluster
to the border router for reliable data delivery to the centralized node.
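The cluster head selection described above can be sketched as follows. The text does not give the exact threshold test, so this example assumes that a node qualifies when both its residual energy and its node probability are at or above minimum values, and that each remaining node joins the nearest head within coverage. All names, coordinates, and threshold values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float       # residual energy (arbitrary units)
    probability: float  # node probability used for head election


def select_cluster_heads(nodes, energy_min, prob_min):
    """Pick nodes whose energy and probability both meet the thresholds."""
    return [n for n in nodes if n.energy >= energy_min and n.probability >= prob_min]


def assign_members(nodes, heads, coverage):
    """Attach each non-head node to the nearest head within the coverage radius."""
    clusters = {h.node_id: [] for h in heads}
    for n in nodes:
        if n.node_id in clusters:
            continue  # heads are not members of another cluster
        in_range = [h for h in heads if math.dist((n.x, n.y), (h.x, h.y)) <= coverage]
        if in_range:
            nearest = min(in_range, key=lambda h: math.dist((n.x, n.y), (h.x, h.y)))
            clusters[nearest.node_id].append(n.node_id)
    return clusters


nodes = [
    Node(1, 0.0, 0.0, energy=0.9, probability=0.8),
    Node(2, 1.0, 0.0, energy=0.4, probability=0.9),
    Node(3, 10.0, 10.0, energy=0.95, probability=0.7),
    Node(4, 9.0, 10.0, energy=0.5, probability=0.2),
    Node(5, 50.0, 50.0, energy=0.3, probability=0.1),
]
heads = select_cluster_heads(nodes, energy_min=0.8, prob_min=0.5)
clusters = assign_members(nodes, heads, coverage=2.0)
print(clusters)
```

Node 5 lies outside every head's coverage and stays unassigned in this sketch. A real deployment would need a rule for such nodes, which the text does not specify.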
where
(p,e) is the public key pair, Txt is the original message block, and E_Txt is the
generated encoded message.
The WBAN node transmits this E_Txt to the cluster head, which decodes the message
with the private key it possesses. This can be expressed as
where
(p,x) is the private key pair, E_Txt is the Encoded Message, and Txt is the Retrieved
Message block.
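The encoding and decoding steps can be illustrated with textbook RSA on a toy key. This is a sketch under assumptions, not the paper's implementation: the first element of each key pair is treated as the RSA modulus, e and x as the public and private exponents, and the primes are far too small for real use. Production systems also apply a padding scheme such as OAEP, which is omitted here. The function names are hypothetical, and the modular inverse via `pow` needs Python 3.8 or later.

```python
def make_toy_keys():
    """Build a toy RSA key pair from two small primes (illustration only)."""
    prime_a, prime_b = 61, 53
    modulus = prime_a * prime_b              # 3233
    phi = (prime_a - 1) * (prime_b - 1)      # 3120
    e = 17                                   # public exponent, coprime to phi
    x = pow(e, -1, phi)                      # private exponent (modular inverse)
    return (modulus, e), (modulus, x)


def encode(txt_block: int, public_key) -> int:
    """E_Txt = Txt^e mod modulus."""
    modulus, e = public_key
    return pow(txt_block, e, modulus)


def decode(e_txt: int, private_key) -> int:
    """Txt = E_Txt^x mod modulus."""
    modulus, x = private_key
    return pow(e_txt, x, modulus)


public_key, private_key = make_toy_keys()
e_txt = encode(65, public_key)      # message block must be smaller than the modulus
txt = decode(e_txt, private_key)
print(e_txt, txt)
```

Only the holder of the private exponent can recover the message block, which is what lets the cluster head authenticate and read data sent by its member nodes.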
To add the third level of security to our network, we use the Hash-based Message
Authentication Code (HMAC) with the SHA-256 cryptographic hash algorithm. The
message block data is hashed using the Secure Hash Algorithm (SHA-256), which
yields a high level of security for the data.
HMAC is a message authentication code that uses a cryptographic hash function and a
secret key. HMAC does not encrypt the message; instead, the message is transmitted
alongside its hash. A node that holds the secret key recomputes the hash of the
received message, and if the message is authentic, the two hashes match. The inner
and outer paddings are fixed constants, one hash block long, that are combined with
the block-sized key. The HMAC function is defined [22] as
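$$\mathrm{HMAC}(x) = F\big((k' \oplus \mathrm{opad}) \parallel F((k' \oplus \mathrm{ipad}) \parallel x)\big) \qquad (3)$$
where
$F$ is the cryptographic hash function, $k$ is the secret key,
$k'$ is the block-sized key derived from the secret key, $x$ is the message to be authenticated,
$\oplus$ is the bit-wise XOR operation, $\parallel$ denotes concatenation, opad is the outer padding, and ipad is the inner padding.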
With HMAC, only communicating nodes that hold the secret key can produce the
matching hash and thereby access the required data. The methods described above are
applied to the balanced clustered network to achieve a high level of security in
the network. The simulation environment and the results obtained are discussed in
the next section.
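The HMAC construction in Eq. (3) can be written out directly with SHA-256 as the hash function F. The sketch below follows the standard HMAC definition (64-byte block, ipad byte 0x36, opad byte 0x5C) and checks the result against Python's standard `hmac` module. The key and message values are made up for illustration.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """Compute F((k' XOR opad) || F((k' XOR ipad) || x)) with F = SHA-256."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()    # long keys are hashed first
    k_block = key.ljust(BLOCK_SIZE, b"\x00")  # k': key padded to one block
    ipad = bytes(b ^ 0x36 for b in k_block)
    opad = bytes(b ^ 0x5C for b in k_block)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(hmac_sha256(key, message), tag)


key = b"shared-secret"
msg = b"heart_rate=72"
tag = hmac_sha256(key, msg)

print(tag.hex())
print(verify(key, msg, tag))
```

In practice, calling `hmac.new(key, msg, hashlib.sha256)` is preferable to a hand-written version. The explicit form is shown only to mirror the terms of Eq. (3).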
The proposed integrated security model is defined to improve the performance and
stability of the clustered WBAN. It also provides authenticated and secure commu-
nication. The network’s performance is analyzed utilizing Deceased node analysis,
Packet communication analysis, Energy residue analysis, and Network scalability
analysis. Biometric security is used to provide overall security, RSA is the asym-
metric cryptographic method for node-level authentication, and HMAC-SHA256
is used to attain safe symmetric communication over the network. Transmission is
made for 500, 1000, 1500, and 2000 rounds at different periods for different models,
and the obtained results are illustrated and discussed below. Network reliability is
analyzed based on the number of nodes effectively performing communication in
the network. Figure 3 shows the comparative performance in terms of the existence
of dead nodes in the network.
Improving the transmission rate is the ultimate aim of any network. Communica-
tion failures are widespread in the clustered network as intermediate malicious nodes
occur in the network. Figure 4 shows the comparative performance of the network
in terms of packet communication.
The above graphs and the tabulation in Table 4 show that the proposed integrated
security model effectively enhances network performance and scalability. The
comparison of the proposed method with similar models is summarized in Table 5.
Table 5 Comparison of the performance metrics of the proposed model with existing methods
| Metric | SC_WBAN | SC_WBAN_RSA | Proposed model |
|---|---|---|---|
| Deceased nodes | Highest | Medium | Lowest |
| Packet transmission | Minimum | Medium | Maximum |
| Residual energy | Minimum; drops to 0 when no. of rounds = 2000 | Average | Maximum |
4 Conclusions
WBANs suffer from numerous security attacks, both external and internal. Since
these networks deal with life-critical sensitive data, they require strict security
principles. In this paper, a three-level integrated security model is proposed and
analyzed. Biometric security, the RSA cryptographic algorithm, and the HMAC-SHA
algorithm are used to achieve security in this model at the identity, node, and
data levels, respectively. The network is simulated for 500, 1000, 1500, and 2000
cycles to evaluate network performance and reliability. Our future work focuses on
extending the model to satisfy the network's demanding security requirements when
it is used for heterogeneous applications.
Development of an Integrated Security Model for Wireless … 359
References
1. Toorani, M.: Security analysis if the IEEE 802.15.6 standard. Int. J. Commun. Syst. 29(17),
2471–2489 (2016)
2. Kwak, K.S., Ullah, S., Ullah, N.: An overview of IEEE 802.15.6 standard. Appl. Sci. Biomed.
Commun. Technol. (ISABEL) (2010)
3. Niksaz, P., Branch, M.: Wireless body area networks: attacks and countermeasures. Int. J. Sci.
Eng. Res. 6(9), 556–568 (2015)
4. Li, M., Lou, W., Ren, K.: Data security and privacy in wireless body area networks. IEEE
Wirel. Commun. 17(1) (2010)
5. Liu, J., Zhang, L., Sun, R.: 1-raap: an efficient 1-round anonymous authentication protocol for
wireless body area networks. Sensors 16(5), 728 (2016)
6. Liu, J., Zhang, Z., Chen, X., Kwak, K.S.: Certificateless remote anonymous authentication
schemes for wireless body area networks. IEEE Trans. Parallel Distrib. Syst. 25(2), 332–342
(2014)
7. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public
key cryptosystems. Commun. ACM 21(2), 120–126 (1978)
8. Choi, K.Y., Hwang, J.Y., Lee, D.H., Seo, I.S.: ID based authenticated key agreement for low
power mobile devices, pp. 494–505. Springer (2005)
9. Al-Riyami, S.S., Paterson, K.G.: Certificateless public key cryptography, pp. 452–473. Springer
(2003)
10. Masdari, M., Ahmedzadeh, S.: Comprehensive analysis of authentication methods in wireless
body area networks. Secur. Commun. Netw. 9(17), 4777–4803 (2016)
11. He, D., Zeadally, S., Kumar, N., Lee, J.H.: Anonymous authentication for wireless body area
networks with provable security. IEEE Syst. J. 11(4), 2590–2601 (2017)
12. Yeh, C.K., Chen, H.M., Lo, J.W.: An authentication protocol for ubiquitous health monitoring
systems. J. Med. Biol. Eng. 33(4), 415–419 (2013)
13. Koya, A.M., Deepthi, P.: Anonymous hybrid mutual authentication and key agreement scheme
for wireless body area networks. Comput. Netw. 140, 138–151 (2018)
14. Zaho, Z.: An efficient anonymous authentication scheme for wireless body area networks using
elliptic curve cryptosystems. J. Med. Syst. 38(2), 13 (2014)
15. Shankar, S.K., Tomar, A.S., Tak, G.K.: Secure medical data transmission by using ECC with
mutual authentication in WSNs. Procedia Comput. 70, 455–461 (2015)
16. Li, T., Zheng, Y., Zhou, T.: Efficient, anonymous, authenticated key agreement scheme for
wireless body area networks. Secur. Commun. Netw. (2017)
17. Li, X., Ibrahim, M.H., Kumari, S., Sangaiah, A.K., Gupta, V., Choo, K.K.R.: Anonymous
mutual authentication and key agreement scheme for wearable sensors in wireless body area
networks. Comput. Netw. 129, 429–443 (2017)
18. Saba, T., Haseeb, K., Ahmed, I., Rehman, A.: Secure and energy efficient framework using
internet of medical things for e-healthcare. J. Infect. Public Health 13(10), 1567–1575 (2020)
19. Jain, A.K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Trans.
Circuits Syst. Video Technol. 14(1) (2014)
20. Delac, K., Grgic, M.: A survey of biometric recognition methods. in: ELMAR, pp. 16–18.
(2004)
21. Bhattacharya, D., Ranjan, R., Alisherov, S., Choi, M.: Biometric authentication: a review. Int.
J. u- e-Service 2(3) (2009)
22. Canale, M.: Comparison of authentication schemes for wireless sensor networks as applied to
secure data aggregation. (2010)
An Improved Node Mobility Pattern
in Wireless Ad Hoc Network
Manish Ranjan Pandey, Rahul Kumar Mishra, and Arvind Kumar Shukla
Abstract This paper reports an improved, optimized node management model for
mobile wireless ad hoc networks. The improved technique is based on optimized
route maintenance in the network. The proposed method aims to overcome the
problems that arise when nodes move during the routing process. The performance of
the mobility models is evaluated with NS-3.0 using parameters such as Packet
Delivery Ratio (PDR), average latency, and throughput.
1 Introduction
The main concern in an ad hoc wireless network is routing, because of the
network's ad hoc nature: a dynamic (frequently changing) topology, a shared medium
with limited bandwidth, multimode characteristics, and so on. There is therefore a
need for an efficient mobility management scheme. Node mobility models are
frequently used in simulations when new communication or routing methods are
evaluated. Node mobility means that nodes are free to move in any direction. This
free movement can cause the links between nodes to change quite frequently, so the
topology is dynamic and irregular. Access to data on freely moving nodes is
essential for the normal operation of an ad hoc wireless network. Establishing and
maintaining links between nodes is a demanding task and a hot research topic in ad
hoc networks, so an improved node management scheme is needed.
In mobility management, there are two directions of research. One approach is to
design a new mobility model that anticipates new forms of mobility. The other is
to improve performance under mobility by tuning routing protocol parameters such
as delay, jitter, and throughput. Network routing protocols are affected by node
movement, link failure, bit error rate degradation, increased routing overhead,
etc. When the speed of mobile nodes increases, the number of mobile nodes within
any transmission range decreases [1–3].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_35
This paper focuses on enhanced node mobility patterns in wireless ad hoc networks.
It aims to define an effective node movement pattern based on Random Waypoint, one
of the efficient node mobility management models for wireless ad hoc networks. The
ultimate reason for designing a mobility model is to describe the movement
patterns of users and how their speed, location, and acceleration change over
time. Ideally, a mobility model should reflect the movement pattern of the
targeted real-world application in a realistic way. Movement patterns play a vital
role in determining protocol performance. When evaluating a wireless ad hoc
network protocol, it is essential to choose an appropriate underlying mobility
management model [4, 5]. For instance, a node in the Random Waypoint model behaves
quite differently from a node moving in a cluster or group, so it is not
appropriate to use the Random Waypoint Model to evaluate applications in which
nodes tend to move together. Therefore, a method is needed to improve the
understanding of mobility management models and their effect on protocol
performance.
This paper is organized as follows. The problem statement is discussed in Sect. 2.
The proposed model is reported in Sect. 3, while Sect. 4 reports the adopted
methodology. The results are presented and discussed in Sect. 5, and Sect. 6
concludes the paper.
2 Problem Statement
The node movement pattern is a central problem in wireless ad hoc networking and
plays an essential role in the achievable throughput, PDR, and Quality of Service
(QoS). The purpose of a mobility model is to describe the typical movement pattern
of mobile nodes so that protocols can be analyzed with respect to that model. Node
mobility therefore plays a vital role in the overall performance evaluation of ad
hoc wireless networks. The most frequently used mobility model is the Random
Waypoint mobility model, which is explained next. It has been shown that this
model is not appropriate for modeling the movement of people or of means of
transportation. Therefore, new mobility models are very much needed.
The Random Waypoint Mobility Model (RWP) includes pause times between changes in
direction and speed. A node begins by staying in one location for a certain period
of time (the pause time). Once this time expires, the node chooses a random
destination within the simulation area and a speed that is uniformly distributed
between a minimum and a maximum speed.
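The pause-then-move cycle described above can be sketched as follows. This is a minimal illustration for a single node, not the NS-3.0 or BonnMotion implementation; the function name, the fixed pause time, the sampling step dt, and the seed are assumptions, and the area, speed bounds, and duration are borrowed from Table 1.

```python
import math
import random


def random_waypoint(area, v_min, v_max, pause, duration, dt=1.0, seed=0):
    """Return (t, x, y) samples of one node following Random Waypoint."""
    rng = random.Random(seed)
    width, height = area
    x, y = rng.uniform(0, width), rng.uniform(0, height)
    t, trace = 0.0, []
    while t < duration:
        # Pause phase: the node stays where it is for the pause time.
        pause_end = min(t + pause, duration)
        while t < pause_end:
            trace.append((t, x, y))
            t += dt
        # Pick a random destination and a uniformly distributed speed.
        dest_x, dest_y = rng.uniform(0, width), rng.uniform(0, height)
        speed = rng.uniform(v_min, v_max)
        # Move phase: travel in a straight line until the destination is reached.
        while t < duration:
            trace.append((t, x, y))
            dx, dy = dest_x - x, dest_y - y
            dist = math.hypot(dx, dy)
            step = speed * dt
            t += dt
            if dist <= step:
                x, y = dest_x, dest_y
                break
            x += dx / dist * step
            y += dy / dist * step
    return trace


# Illustrative parameters: 600 m x 600 m area, speeds 0.5 to 30, 15 s pause, 300 s run.
trace = random_waypoint((600, 600), 0.5, 30.0, pause=15.0, duration=300.0)
```

Each node would be given its own seed; because every node draws its destination independently, the sketch also shows why this model cannot reproduce nodes that move together.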
3 Proposed Model
Improving the mobility model addresses the problem of obtaining more realistic
mobile node behavior. The mathematical formulation and a complete analysis of this
model are presented in light of the limitations and requirements described above.
We aim to provide a solution to the random movement of nodes, which may cause
links to break. We propose a node movement method for realistic scenarios such as
university sites and shopping malls. The main objectives of this research are to
improve on Random Waypoint performance in terms of delay, latency, throughput, and
reliability, and to reduce overhead by finding the best path for transferring
packets to their destination.
The mobility model plays a significant role in the assessment of wireless network
protocols. Mobility models for ad hoc wireless networks differ from those of other
existing networks. The connectivity and capacity of the network depend heavily on
the mobility behavior of the nodes. Unlike models for other existing networks,
which rely on Base Stations (BSs), ad hoc wireless mobility models require
cooperation between two or more communicating nodes [5, 7]. Although separate
models exist for these two categories, there are some resemblances between them.
The Random Waypoint Model is one of the most extensively used models for ad hoc
wireless simulation and has been implemented in many network simulators. In many
mobility models the movements of nodes are independent of one another, as
described in several earlier research papers. However, the movements of nodes
depend on one another in group mobility models.
A route is a set of location points that form adjoining segments such that no
segment overlaps with an obstacle in the environment. The following algorithm
constructs such a route.
(a) Initialize the source and destination points, then draw a tracing line from
the source to the destination.
(b) Find the first object struck by the tracing line. If an object is struck, draw
another tracing line from the hit position to the destination.
(c) Otherwise, add the destination to the path and stop the procedure.
(d) Check which edge of the obstacle is struck first.
(e) If an edge is struck, add the hit position to the path.
(f) Otherwise, look again for the first object struck.
Initially, the starting position and the current position are the same. We cast a
ray from the source to the destination and look for the first obstacle struck by
this ray. We then add the first hit point to the path and try to go around this
obstacle. To do this, we look for the first edge of the obstacle that is hit. If
an edge is struck, the node moves to the intersection point on this edge. We
choose the nearest end of the struck edge so as to minimize the final path length.
We repeat until the obstacle is cleared, which means that the ray from the node's
position to the destination no longer strikes the boundary of this obstacle.
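The procedure above can be sketched in a few lines of Python. This is one possible reading of the steps, under assumptions that are not stated in the text: obstacles are convex polygons given as vertex lists, they do not overlap or touch, and neither endpoint lies inside an obstacle. All function names are hypothetical.

```python
import math

EPS = 1e-9


def cross(ax, ay, bx, by):
    return ax * by - ay * bx


def hit_fraction(p, q, a, b):
    """Fraction t in (EPS, 1] at which segment p->q crosses edge a-b, else None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = cross(rx, ry, sx, sy)
    if abs(denom) < EPS:  # parallel segments are treated as not hitting
        return None
    t = cross(a[0] - p[0], a[1] - p[1], sx, sy) / denom
    u = cross(a[0] - p[0], a[1] - p[1], rx, ry) / denom
    return t if EPS < t <= 1 and 0 <= u <= 1 else None


def first_hit(p, q, obstacles):
    """Nearest (t, obstacle index, edge index) struck by the ray p->q, or None."""
    best = None
    for oi, poly in enumerate(obstacles):
        for ei in range(len(poly)):
            t = hit_fraction(p, q, poly[ei], poly[(ei + 1) % len(poly)])
            if t is not None and (best is None or t < best[0]):
                best = (t, oi, ei)
    return best


def find_path(source, dest, obstacles, max_steps=100):
    path, pos = [source], source
    for _ in range(max_steps):
        hit = first_hit(pos, dest, obstacles)
        if hit is None:  # nothing in the way: finish at the destination
            path.append(dest)
            return path
        t, oi, ei = hit
        poly = obstacles[oi]
        hit_pt = (pos[0] + t * (dest[0] - pos[0]), pos[1] + t * (dest[1] - pos[1]))
        path.append(hit_pt)
        # Head for the nearer end of the struck edge, then keep that direction.
        a_i, b_i = ei, (ei + 1) % len(poly)
        if math.dist(hit_pt, poly[a_i]) <= math.dist(hit_pt, poly[b_i]):
            vi, step = a_i, -1
        else:
            vi, step = b_i, 1
        # Follow the boundary until the ray to the destination clears this obstacle.
        for _ in range(len(poly)):
            pos = poly[vi]
            path.append(pos)
            if first_hit(pos, dest, [poly]) is None:
                break
            vi = (vi + step) % len(poly)
    raise RuntimeError("no path found within max_steps")


# One square obstacle between source (0, 0) and destination (6, 0).
square = [(2, -1), (4, -1), (4, 1), (2, 1)]
route = find_path((0, 0), (6, 0), [square])
```

In this example the route runs from the source to the hit point on the left edge, along two corners of the square, and then straight to the destination. Choosing the nearer end of the struck edge is a local heuristic and does not guarantee the globally shortest path.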
The NS-3.0 simulator [1] is used for the simulation and analysis of the proposed
algorithm. Ubuntu 14.04 LTS is the base platform and operating system used in the
simulation work. The simulation configuration is described in Table 1. BonnMotion
2.0 is the mobility scenario generation tool used [11, 13, 14]. For the results
given below, we produced mobility scenarios for RWP and for the enhanced mobility
model for integration into the NS-3.0 TCL scripts. Random CBR traffic connections
can be set up between mobile nodes using a traffic scenario generator script. Our
study used the random waypoint model and the enhanced mobility model with a pause
time of 15 ± 3 s and speed varying between 0 and 100 m/s, with a minimum speed of
5 m/s and a maximum speed of 20 m/s, for a simulation time of 300 s. Table 1 lists
the performance parameters.
Table 1 Performance parameters
Constraints                  | Value
Type of channel              | Wireless channel
Simulator                    | NS 3.0 (Version 3.0)
Protocols                    | DSR routing protocol
Time duration for simulation | 300 s
Number of nodes              | 20, 30, 40, 50
Range of transmission        | 250 m
Movement management model    | Random-waypoint
MAC layer protocol           | 802.11
Break time                   | 15 ± 3 s
Utmost speed                 | 30
Least speed                  | 0.5
Packet rate                  | Four packets
Type of traffic              | CBR (Constant Bit Rate)
Data payload                 | 512 bytes/packet
Max of CBR connections       | 10, 20, 40, 60, 80
Size of environment          | 600 m × 600 m
For each simulation run, the nodes' positions, their movements, and the traffic
between them are chosen randomly. BonnMotion 2.0 is responsible for the random
properties of the nodes' locations and movements; for the traffic, NS-3.0 random
variables are used. Setting the random variables properly is important, as
otherwise many simulations may be run without producing any meaningful results.
(a) Packet Delivery Ratio (PDR): PDR is the ratio of the data packets delivered
to the destination to those generated by the sources. It is calculated by
dividing the number of packets received by the destination by the number of
packets originated by the source [11, 13].
(c) Average End-to-End Delay: This includes every possible delay caused by
buffering during route discovery latency, queuing at the interface queue,
retransmission delays at the MAC layer, and propagation and transfer times. It is
defined as the time taken for a data packet to be transmitted across the ad hoc
network from source to destination [11, 13, 14].
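As a small worked example of these metrics, the sketch below computes PDR, average end-to-end delay, and throughput (delivered bits per second of simulated time) from a list of per-packet records. The record layout and the function name are assumptions made for illustration; a real NS-3.0 run would extract these values from its trace files.

```python
def summarize(packets, duration_s):
    """packets: list of (send_time_s, recv_time_s or None, size_bytes)."""
    delivered = [(s, r, n) for s, r, n in packets if r is not None]
    # PDR: packets received at the destination / packets originated by the source.
    pdr = len(delivered) / len(packets) if packets else 0.0
    # Average end-to-end delay over delivered packets only.
    avg_delay = (
        sum(r - s for s, r, _ in delivered) / len(delivered) if delivered else 0.0
    )
    # Throughput: delivered bits per second of simulation time.
    throughput_bps = 8 * sum(n for _, _, n in delivered) / duration_s
    return pdr, avg_delay, throughput_bps


# Four hypothetical 512-byte packets, one of which is lost.
packets = [(0.0, 0.20, 512), (1.0, 1.30, 512), (2.0, None, 512), (3.0, 3.10, 512)]
pdr, delay, throughput = summarize(packets, duration_s=4.0)
```

Here three of four packets arrive, so the PDR is 0.75, the mean delay is about 0.2 s, and the throughput is 3072 bits/s.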
Dynamic Source Routing (DSR) is an autonomous routing protocol for wireless
networks. It uses source routing instead of relying on a routing table at every
intermediate device: each source determines the complete route over which its
packets are transmitted to the chosen destination. The protocol has two main
components, known as route discovery and route maintenance.
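The two components can be illustrated with a heavily simplified sketch. It models route discovery as a flooded request that accumulates a route record hop by hop, and route maintenance as rediscovery when a link on the stored route has disappeared. Route caching, route replies, and route error messages are omitted; the topology and function names are hypothetical.

```python
from collections import deque


def discover_route(links, source, dest):
    """Flood a route request; return the first route record that reaches dest."""
    queue = deque([[source]])  # each entry is the route record accumulated so far
    seen = {source}            # nodes that have already forwarded the request
    while queue:
        route = queue.popleft()
        node = route[-1]
        if node == dest:
            return route
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(route + [neighbour])
    return None


def maintain_route(links, route):
    """If any hop of the stored source route is broken, rediscover from the source."""
    for a, b in zip(route, route[1:]):
        if b not in links.get(a, ()):
            return discover_route(links, route[0], route[-1])
    return route


# Hypothetical four-node topology with two disjoint paths from A to D.
links = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
route = discover_route(links, "A", "D")
links["B"].discard("D")  # node movement breaks the B-D link
links["D"].discard("B")
repaired = maintain_route(links, route)
```

The source first learns the route A, B, D; after the link between B and D breaks, maintenance falls back to A, C, D. This is the situation that node mobility triggers repeatedly during a simulation.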
The PDR is the ratio of the number of packets successfully delivered to the
destination (sink) node to the total number of data packets transmitted by the
different nodes. In Fig. 2, at 20 nodes the PDR is 0.18 for RWP and 0.21 for the
proposed mobility model; at 30 nodes the PDR is 0.27 for RWP and 0.30 for the
proposed mobility model, so at 30 nodes the PDR is still increasing. However, at
50 nodes the PDR is 0.37 for RWP and 0.22 for the proposed mobility model, i.e.,
the PDR is lower than that of RWP. Between 20 and 40 nodes, the PDR of the
proposed model is therefore higher than that of the Random Waypoint Model. At 50
nodes it decreases somewhat, but it is expected to increase again as the number of
nodes grows further.
5.3 Throughput
Throughput is the amount of data successfully delivered per unit time, measured in
bits/s. Higher throughput values indicate better performance. In Fig. 3, at 20
nodes the throughput is 0.15 bits/s for RWP and 0.3 bits/s for the proposed
mobility model, and at 30 nodes it is 0.22 bits/s for RWP and 0.33 bits/s for the
proposed mobility model. The throughput of the improved model thus remains higher
than that of the Random Waypoint Model as the number of nodes increases.
End-to-end delay is the amount of time a packet requires to arrive at its
destination after leaving its source. In Fig. 4, at 20 nodes the end-to-end delay
is 0.220 for RWP and 0.219 for the proposed mobility model, and at 30 nodes it is
0.178 for RWP and 0.169 for the proposed mobility model. The time taken by a
packet to travel from source to destination is thus equal or slightly lower for
the proposed model.
6 Conclusion
In this paper, the wireless ad hoc routing protocol considered is DSR, and the RWP
and proposed mobility models are compared. We observed that the performance of
mobility models can change drastically for different ad hoc protocols. Our
experimental results highlight the performance of the ad hoc routing protocol
under different mobility models. According to our results, the performance of the
protocol is affected by the mobility model. The performance of mobility models
should be evaluated with the wireless ad hoc network protocol (the DSR routing
protocol in our experiments) in the setting that most closely matches the expected
real-world scenario. We compared three parameters in this paper: end-to-end delay,
throughput, and PDR.
The proposed mobility model achieved better results than the random waypoint
mobility model on the chosen metrics, namely packet delivery ratio, end-to-end
delay, and throughput, for the node movement pattern. Based on the comparison
between the two models, the throughput and PDR of our proposed model are better at
20, 30, 40, and 50 nodes. In the case of end-to-end delay, the performance of our
pattern is equal or slightly better. Based on these performance parameters, we can
say that our model will give better results when applied to small-organization
scenarios. The results also illustrate that investigating a wireless ad hoc
network's performance with one particular mobility model is not sufficient before
it is set up in a real-life scenario. The choice of mobility pattern has a
significant impact on performance.
References
3. Soltani, M.D., Purwita, A.A., Zeng, Z., Chen, C., Haas, H., Safari, M.: An orientation-
based random waypoint model for user mobility in wireless networks. In: IEEE International
Conference on Communications Workshops (ICC), Dublin, Ireland (2020)
4. Bhusal, N.: A review on impact of mobility model of routing protocols in ad-hoc network. ISTP
J. Res. Electr. Electron. Eng. (ISTP-JREEE). In: 1st International Conference on Research in
Science, Engineering & Management (IOCRSEM ) (2014)
5. Manzoor, A., Sharma, V.: A survey of routing and mobility models for wireless ad hoc network.
SSRG Int. J. Comput. Sci. Eng. 46–50 (2015)
6. Ribeiro, A., Sofia, R.C.: A survey on mobility models for wireless networks. SITI Technical
Report SITI-TR-11-01, February (2011)
7. Pullin, A.: Techniques for Building Realistic Simulation Models for Mobile Ad Hoc Network
Research. Ph.D. thesis, Leeds Beckett University, Leeds, UK (2014)
8. Shukla, A.K., Jha, C.K., Arya, R.: A simulation study with mobility models based on routing
protocol. In: Proceedings of Fifth International Conference on Soft Computing for Problem
Solving, pp. 867–875 (2016)
9. Bai, F., Helmy, A.: A survey of mobility models. In: Wireless Ad-hoc Networks, pp. 1–30
(2004)
10. Agashe, A.A., Bodhe, S.K.: Performance evaluation of mobility models for wireless ad hoc
networks. In: Proceedings of the IEEE First International Conference on Emerging Trends in
Engineering and Technology, pp. 172–175 (2008)
11. Carofiglio, G., Chiasserini, C.F., Garetto, M., Leonardi, E.: Route stability in MANETs under
the random direction mobility model. IEEE Trans. Mobile Comput. 8(9), 1167–1179 (2009)
12. Shukla, A.K., Kapil, M., Garg, S.: Int. J. Eng. Res. Ind. Appl. (IJERIA). 5(III), 1–10 (2012).
ISSN 0974-1518
13. Vetrivelan, N., Reddy: Impact and performance of analysis of mobility models on stressful
mobile WiMax environments. Int. J. Comput. Netw. Secur. (IJCNS) 2 (2010)
14. Gerharz, M., de Waal, C.: BonnMotion—A Mobility Scenario Generation Tool. University of
Bonn [Online]. www.cs.uni-bonn.de/IV/BonnMotion/
15. Bekmezci, I., Sahingoz, O.K., Temel, S.: Flying Ad-Hoc Networks (FANETs): a survey. Ad Hoc
Netw. 11(3), 1254–1270 (2013)
IGAN: Intrusion Detection Using
Anomaly-Based Generative Adversarial
Network
1 Introduction
ual system by inspecting networks and logging events. Based on these approaches,
they are classified as either signature-based or anomaly-based IDSs. Conventionally,
signature-based techniques were used to detect known patterns, which restricts the
ability to detect new attacks in a changing environment. Therefore, there has been
a recent shift to anomaly-based detection, in which the machine is trained to
learn the entity's normal behavior so that it can detect any behavior that
deviates from normal without having to be trained on unknown attacks in advance.
Anomaly detection, as an unsupervised approach to learning, does not rely on
narrowly labeled datasets. This makes it well suited to an IDS, since existing
network traffic datasets do not cover all kinds of attacks and many are outdated.
Additionally, anomaly-based detection is not limited to intrusion detection. It is
also used for disease detection, event identification in sensor networks, device
control, fault and fraud detection, ecosystem disturbance detection, and medical
imaging.
Generative Adversarial Networks (GANs) [1] have gained immense prominence and have
reached nearly all fields, including anomaly detection. The use of GANs for
anomaly detection is still largely unexplored, though. Schlegl et al. [2] proposed
AnoGAN, which focuses on medical imaging. AnoGAN is noted to be computationally
expensive. The work in [3] further developed the architecture and training based
on the original AnoGAN. These approaches, however, have some significant
drawbacks. Both train on the 10% KDD dataset, which does not represent the larger
picture and can easily yield misleading results. Furthermore, the KDD-99 dataset
has redundant entries and shares entries between the train and test sets, which
adds a large bias to the system. They also train the network on normal data
samples treated as anomalies, ignoring the concept of anomaly-based detection,
whose sole aim is to train on normal data in order to detect any new anomalies. In
[4], an encoder-based adversarial training scheme maps data into a Gaussian
distribution space using an encoder and uses a discriminator, trained on the
encoder's latent space, to test whether an input comes from the normal latent
space or is an anomaly. However, it is questionable whether the Gaussian space can
distinguish between the normal samples seen in training and the anomalous samples
that could be encountered during testing. Subsequently, the authors of [2]
proposed an optimized AnoGAN framework for faster computation. In [5], an
additional encoder is used to minimize the distance between the images during
training and the latent variables.
In this paper, we present a novel adversarial generative training architecture,
termed as IGAN, for a NIDS that minimizes the mapping error without the use
of an external encoder, and constitutes the following salient points:
• IGAN is trained on normal samples only, so that the system learns normal network
traffic and can thus identify any anomalies.
• IGAN uses the NSL-KDD dataset to keep the system free of bias and to prevent the
results from being misleading.
2 Related Work
2.2 F-AnoGAN
In [2], the proposed anomaly detection method for medical images was much faster
than the previously proposed AnoGAN [2]. It comprises two training steps, namely
GAN training and encoder training. The GAN is trained on the latent representation
using a generator and a discriminator. After this stage, encoder training is
carried out, which maps the normal version of the input image to the latent space
variables fed as input to the generator. It is built on the assumption that, for
normal image samples, the conversion to the latent representation by the encoder
and the subsequent mapping back to the image space via the generator should be an
identity transform, and the degree of deviation is used for anomaly scoring. The
W-GAN [11, 12] architecture, which is specific to image input, was used in the
generator.
Fig. 1 Generative Adversarial Network
374 J. Shah and M. Das
However, training the GAN involves initializing the latent representation by
sampling from noise. This introduces a potential flaw in the method. The mapping
between the input images and the latent space variables could be better achieved
if the initial latent variables were close to optimal rather than being generated
from random noise. Additionally, the two-step training method does not guarantee
an overall identity transformation since the mapping will not be linear.
2.3 Ganomaly
The Ganomaly architecture was proposed in [5]; it builds on the original idea of
AnoGAN by adding an encoder. This semi-supervised learning technique uses the same
anomaly-based approach and uses contextual, encoder, and adversarial losses to
train the entire system of two encoders, one decoder, and a discriminator. The
main focus of this method is the new encoder loss, which minimizes the distance
between the bottleneck features of the input, $z$, and the encoded features of the
generated image, $\hat{z}$, using the $L_2$ norm. The second encoder has the same
architecture as the first encoder unit but a different parametrization. Therefore,
the normal reconstructed image passed to the second encoder cannot be expected to
yield an encoding identical to the initial latent encoding. Using two different
encoders to achieve a perfect reconstruction of both the generated image and the
encoding adds to the non-linear complexity that each neural network brings with
it, and consequently the $L_2$ norm will not be able to discard the random
non-linear noise that is added. Furthermore, all the units use the DCGAN [20]
architecture. The generator also has convolutional transpose layers, ReLU
activations, and batch normalization, with a tanh layer at the end. This structure
is suited only to image datasets, and the results have been tested on MNIST [21],
CIFAR [22], and X-ray security screening datasets (Fig. 2).
\[ L_{r} = \sum_{i=1}^{n} \lvert x_i - \hat{x}_i \rvert \tag{1} \]
The Latent reconstruction Loss: This loss is used to increase the autoencoder
effectiveness. It is built on the logic that if the reconstructed data is the same as the
input passed in the autoencoder, then, when it is passed again into the encoder, the
latent variable generated should be the same as before.
\[ L_{lr} = \sum_{i=1}^{n} \lvert z_i - \hat{z}_i \rvert \tag{2} \]
\[ L_{fe} = \sum_{i=1}^{k} \lvert f(x_i) - f(\hat{x}_i) \rvert \tag{3} \]
Once the training is done, we test the data after passing it through the autoencoder
only and determine its anomaly score using the following formula.
\[ \text{Anomaly\_score} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2 \tag{5} \]
A threshold is determined using the optimal metrics on the test set so that all input
instances that cross the threshold are classified as anomalies.
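The scoring rule in Eq. (5) can be illustrated with a short sketch, assuming the input and its reconstruction are available as plain lists of feature values. The function names, the example vectors, and the threshold of 0.1 are hypothetical; in the paper the threshold is chosen from the test metrics.

```python
def anomaly_score(x, x_hat):
    """Mean squared error between an input x and its reconstruction x_hat (Eq. 5)."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def classify(x, x_hat, threshold):
    """Return 1 (anomaly) if the score crosses the threshold, else 0 (normal)."""
    return 1 if anomaly_score(x, x_hat) > threshold else 0


x = [1.0, 0.0, 0.5, 0.0]
well_reconstructed = [0.9, 0.1, 0.5, 0.0]    # small error, as expected for normal traffic
poorly_reconstructed = [0.0, 1.0, 0.5, 0.0]  # large error, as expected for an anomaly
```

Because the autoencoder is trained on normal traffic only, normal inputs are expected to reconstruct well and score low, while unseen attack patterns score high.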
Objective function: The overall loss function is the summation of all these 4 losses.
We have used the NSL-KDD dataset for evaluating the proposed model, which has
been derived from the original KDD [10] dataset by [7]:
• Removing redundant entries in train-set.
• Keeping no duplicate records in train and test set.
• Selecting records of each difficulty group in inverse proportion to their percentage
of records.
The number of records in the dataset is reasonable, so it is feasible to run the
method on the entire set without randomly selecting a smaller subset; the results
obtained are therefore consistent and comparable across all algorithms. The
dataset contains 41 features of the network traffic, of which 34 are continuous
and 7 are symbolic. We one-hot encode the symbolic features, which gives a final
input dimension of 122. The type of attack is listed in a one-word format in the
last column of the dataset. It is transformed into a binary label: 0 for a normal
sample and 1 otherwise. This step makes the output one-dimensional. We use the CSV
files KDDTrain+ and KDDTest+ for implementation. KDDTrain+ has 67,343 normal
entries, which are used for training. KDDTest+ has 9,711 normal entries. We
therefore have a total of 71,463 anomalous entries, comprising 58,630 from
KDDTrain+ and 12,833 from KDDTest+. As a result, we test on 81,174 entries in
total.
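The preprocessing described above can be sketched on a toy example. The three records below are made up and carry only a few columns; the helper names are hypothetical, and a real pipeline would read the KDDTrain+ and KDDTest+ CSV files instead.

```python
def one_hot_encode(rows, symbolic):
    """One-hot encode the symbolic columns; pass the other columns through as floats."""
    categories = {c: sorted({r[c] for r in rows}) for c in symbolic}
    encoded = []
    for r in rows:
        vec = []
        for col, value in r.items():
            if col == "label":
                continue  # the label is handled separately
            if col in symbolic:
                vec.extend(1.0 if value == cat else 0.0 for cat in categories[col])
            else:
                vec.append(float(value))
        encoded.append(vec)
    return encoded


def binarize(label):
    """Map the one-word attack type to 0 (normal) or 1 (any attack)."""
    return 0 if label == "normal" else 1


# Toy records for illustration only.
rows = [
    {"duration": 0, "protocol_type": "tcp", "flag": "SF", "label": "normal"},
    {"duration": 2, "protocol_type": "udp", "flag": "SF", "label": "normal"},
    {"duration": 0, "protocol_type": "tcp", "flag": "S0", "label": "neptune"},
]
X = one_hot_encode(rows, symbolic={"protocol_type", "flag"})
y = [binarize(r["label"]) for r in rows]

# Entry counts quoted in the text.
anomalous_total = 58630 + 12833
test_total = anomalous_total + 9711
```

Each symbolic column expands into one indicator per distinct value, which is how the 41 raw features grow to the 122-dimensional input.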
The proposed IGAN architecture comprises an encoder, a decoder, and a
discriminator. The encoder and the decoder each have four hidden layers with the
ReLU activation function. The encoder has hidden layers of dimensions 512, 256,
128, and 64, with input and output dimensions of 122 and 32. The decoder has
hidden layers of dimensions 64, 128, 256, and 512, with input and output
dimensions of 32 and 122, respectively. The discriminator has four hidden layers
of dimensions 512, 256, 128, and 64, with input and output dimensions of 122 and
1. Its first hidden layer uses the Leaky ReLU activation function, the next three
layers use ReLU, and the output layer uses the sigmoid activation function. The
IGAN is trained for 100 epochs with a learning rate of 0.0001, and the parameters
are optimized using the Adam optimizer [8].
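The layer dimensions above can be written down as a small configuration sketch. It assumes that every layer is a plain fully connected layer with a bias term, which the text does not state explicitly; the names and the parameter counts derived here are illustrative and are not reported in the paper.

```python
def dense_stack(sizes):
    """Return the (inputs, outputs) shape of each consecutive fully connected layer."""
    return list(zip(sizes, sizes[1:]))


def parameter_count(sizes):
    """Weights plus biases, assuming plain fully connected layers."""
    return sum((n_in + 1) * n_out for n_in, n_out in dense_stack(sizes))


# Input, four hidden layers, output, as listed in the text.
ENCODER = [122, 512, 256, 128, 64, 32]
DECODER = [32, 64, 128, 256, 512, 122]
DISCRIMINATOR = [122, 512, 256, 128, 64, 1]

sizes = {
    name: parameter_count(spec)
    for name, spec in [
        ("encoder", ENCODER),
        ("decoder", DECODER),
        ("discriminator", DISCRIMINATOR),
    ]
}
```

Under this assumption the encoder output (32) matches the decoder input, and the decoder output (122) matches the data dimension, so the two together form the autoencoder whose reconstruction error is scored.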
We compare the proposed IGAN with the 4 other architectures, which we discuss
below.
The training of IGAN results in reducing the loss of the autoencoder, as can be
seen in Fig. 4. The continuous and gradual decreases imply the model’s effective
training, which learns to mimic the normal traffic distribution.
The effectiveness of anomaly-based intrusion detection was analyzed using the
metrics of AUC, precision, recall, F-score, and accuracy [17–19]. Because our test
set is highly imbalanced, accuracy is not the best measure of performance. There
are far more anomalous samples than normal samples, so accuracy gives a biased
result: even a model that predicts every sample to be an anomaly achieves high
accuracy, because normal samples are so few. The best measure for our binary
classification problem (anomaly or normal) is the AUC, the area under the ROC
curve. A ROC curve is generated by plotting the True Positive Rate (TPR) against
the False Positive Rate (FPR) at various threshold settings [9]. We determine the
best threshold for classifying anomalies from the intersection of the two curves
TPR and 1 − FPR. This is the point of optimal trade-off between the true positive
rate and the false positive rate.
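The threshold selection described above can be sketched in plain Python. The sketch builds the ROC points, integrates them with the trapezoidal rule, and picks the threshold at which TPR and 1 − FPR are closest; the function names and the four example scores are hypothetical.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr, threshold) triples, from the strictest threshold down."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0, float("inf"))]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos, thr))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0, _), (x1, y1, _) in zip(points, points[1:])
    )


def best_threshold(points):
    """Threshold where the TPR and 1 - FPR curves are closest to intersecting."""
    return min(points[1:], key=lambda p: abs(p[1] - (1 - p[0])))[2]


# Hypothetical anomaly scores; label 1 marks an anomaly.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
points = roc_points(scores, labels)
```

For these four samples the AUC is 0.75 and the selected threshold is 0.4, where both TPR and 1 − FPR equal 0.5.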
The AUC score is the highest for the proposed IGAN, as shown in Table 1. Even a
small percentage increase makes a significant difference, since real-world network
traffic is large in volume and we want the rates of false positives and false
negatives to be as low as possible. The AUC is better than that of f-AnoGAN by
1.09692% and that of Ganomaly by 0.13764%. The model thus outperforms the other
variants.
We have also analyzed the effect of using the discriminator as a Feature Extractor
(FE) rather than just as a simple classifier. The results showed a significant
increase in performance for all baseline and proposed architectures, as can be
seen in Table 2. The results without FE training were obtained by using the
discriminator only as a classifier for the adversarial training of the Generative
Adversarial Network. For IGAN, training with the FE loss led to an increase in AUC
score of 1.15942%.
5 Conclusion
References
1. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural
Information Processing Systems, pp. 2672–2680 (2014)
2. Schlegl, T., Seeböck, P., Waldstein, S.M., Langs, G., Schmidt-Erfurth, U.: f-AnoGAN: fast
unsupervised anomaly detection with generative adversarial networks. Med. Image Anal. 54, 30–44 (2019)
3. Zenati, H., Romain, M., Foo, C.S., Lecouat, B., Chandrasekhar, V.R.: Adversarially learned
anomaly detection. In: Proceedings of IEEE International Conference on Data Mining, pp.
727–736 (2018)
4. Gherbi, E., Hanczar, B., Janodet, J.C., Klaudel, W.: An encoding adversarial network for
anomaly detection. In: Proceedings of Asian Conference on Machine Learning, pp. 188–203
(2019)
5. Akcay, S., Atapour-Abarghouei, A., Breckon, T.P.: Ganomaly: semi-supervised anomaly detec-
tion via adversarial training. In: Proceedings of Asian Conference on Computer Vision, pp.
622–637 (2018)
6. Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., Frey, B.: Adversarial autoencoders (2015).
arXiv:1511.05644
7. Dhanabal, L., Shantharajah, S.P.: A study on NSL-KDD dataset for intrusion detection system
based on classification algorithms. Int. J. Adv. Res. Comput. Commun. Eng. 4(6), 446–452
(2015)
8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization (2014). arXiv:1412.6980
9. Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating char-
acteristic curve. Radiology 143(1), 29–36 (1982)
10. Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.: A Detailed Analysis of the KDD CUP 99
data set. In: Proceedings of IEEE Symposium on Computational Intelligence for Security and
Defense Applications (2009)
11. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN (2017). arXiv:1701.07875
12. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of
wasserstein GANs. In: Proceedings of Advances in Neural Information Processing Systems,
pp. 5767–5777 (2017)
13. Yi, X., Walia, E., Babyn, P.: Generative adversarial network in medical imaging: a review. Med.
Image Anal. 58 (2019)
14. Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved
Techniques for Training GANs. In: Proceedings of Advances in Neural Information Processing
Systems, pp. 2234–2242 (2016)
15. Bojanowski, P., Joulin, A., Lopez-Paz, D., Szlam, A.: Optimizing the latent space of generative
networks (2017). arXiv:1707.05776
16. Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., Frey, B.: Adversarial autoencoders (2015).
arXiv:1511.05644
17. Davis, J., Goadrich, M.: The Relationship between Precision-Recall and ROC curves. In: Pro-
ceedings of the international Conference on Machine learning, pp. 233–240 (2006)
18. Goutte, C., Gaussier, E.: A probabilistic interpretation of precision, recall and f-score, with
implication for evaluation. In: Proceedings of European Conference on Information Retrieval,
pp. 345–359 (2005)
19. Hanley, J., McNeil, B.J.: The meaning and use of the area under a receiver operating charac-
teristic (ROC) curve. Radiology 143(1), 29–36 (1982)
20. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolu-
tional generative adversarial networks (2015). arXiv:1511.06434
21. LeCun, Y., Cortes, C.: MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/
22. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. https://
www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf
CodeScan: A Supervised
Machine Learning Approach to Open
Source Code Bot Detection
Abstract Enhancing software productivity helps companies cut their costs and
increase profits. Software metrics rely heavily on the personal experience and
skill of managers in pattern recognition and rewards. Differentiating between
actual human effort and machine-generated code can inform an organization’s
decisions about rewarding its employees. It also gives managers an assistive
tool for effective monitoring without micromanagement, with wide application in
managing work from home and other virtual environments. The paper explores the
quality of machine-generated bot code compared to actual human coding efforts.
It uses machine learning techniques to identify patterns and gives intelligent
insights that can be used as a performance metric for versioning systems and
business intelligence. We successfully distinguished between bot-written and
human-written code with an F1-score of 0.945 using the Light Gradient Boosting
Machine (LightGBM).
1 Introduction
Programming productivity has been an extensive subject of study for software
engineers and product managers. Collaboration through versioning systems has
become essential in modern software development. These systems come with their
own set of new challenges, including machine-generated bot code, which leads to
code quality issues and causes memory complexity problems. Machine-generated
code can replicate human coding efforts to a certain degree. Still, it often
creates merge conflicts, and some developers use it to inflate their contribution
to the project, which can lead to incorrectly rewarding a developer who did not
contribute as
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 381
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_37
382 V. Gaurav et al.
much as another team member. Bot-generated code can come in many forms, ranging
from database updates, pull request evaluation automation, and certifications to
many other code snippets used by developers. The paper first reviews previous
work on identifying bot automation for coding, maintenance, and fault testing.
It then describes the methodology, from data acquisition, business
understanding, and data exploration to the development of new relevant features
that identify bot characteristics, feature engineering, and modeling. It ends
with a brief discussion of the evaluation of each model and of whether bot code
can replace actual human coding efforts.
2 Literature Survey
Software that enables developers to interact and stay aware of what their
colleagues are doing has been built successfully [1], and such awareness can be
achieved using versioning systems. Software quality prediction models developed
on open-source projects may become the state of the art for detecting defects in
programs [2]. Software faults and failures lead to customer dissatisfaction, and
traditional software metrics need to be adapted to versioning systems to speed
up the detection of bugs [3]. A source file is more fault-prone when the
developers’ contributions to it are more imbalanced (lower entropy); this
measure can be useful for predicting fault-prone programs, which are generally
characterized by a large amount of machine-generated code [4]. Kollanyi [5]
studied the nature of bot-generated code and its influence on GitHub
repositories, and identified characteristics that distinguish it from human
coding effort. Nagappan et al. [6] found that failure-prone software entities
are statistically correlated with code complexity measures, and automation makes
maintenance work easier at the cost of introducing unnecessary code. Idreos and
Callaghan [7] designed key-value storage engines and studied design
characteristics that are useful in understanding the nature of bots in complex
pipelines. Bavota et al. [8] studied the specific cases in which refactoring
tends to introduce bugs in project repositories, which automated bots do not
account for.
3 Methodology
The choice of software metrics without hypothesis testing may prove to be a setback
for product teams, and the failure history of similar projects can give validation and
help extract business value. To resolve this, we studied metrics with business value
such as team velocity, which determines the team’s collaboration quality, set by the
deliverables promised by them in every sprint cycle [9]. It can be quantified as the
quotient of the number of relevant commits and the time of commit for the
project. We combined it with Halstead complexity [10] to create a
business-driven metric that incorporates the program size and errors reported
with the number of features delivered in a sprint cycle. Escape defects refer to
the number of bugs present in the project before the first release, which the
managers are supposed to track and resolve until the project is consistent with
the customer requirements [11]. We combined this with cyclomatic complexity [12]
to capture the number of errors and to study the modularity and control flow of
the programs committed to the project repository. Thus, we can formalize an
intelligent system that identifies actual human coding efforts with a mix of
software metrics and business value. We scraped 1.3 million different project
repositories with the help of Selenium and Git and wrote programs to calculate
metrics such as cyclomatic complexity, Halstead complexity, fan-out complexity
and their change per commit, and other parameters such as data abstraction
coupling [13], number of methods, time of commit, filetype, filesize, and number
of comments. The data collected using the GitHub API had a mix of human-coded
and bot-coded files. A few repositories had proper bot labeling in the code,
while the others were manually labeled (1: Bot, 0: Human), turning the task into
a binary classification problem: identifying the actual human coding efforts
from the data. The details of all the features collected are provided as follows
(Tables 1 and 2; Fig. 1).
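The paper does not describe the programs used to compute these features, so the following is only a minimal sketch of how a few of them could be extracted from a single file. It is written for Python source and uses the standard `ast` and `tokenize` modules, whereas the repositories studied here contain mostly Java, JavaScript, and XML. The function name `file_metrics`, the feature keys, and the sample file are hypothetical, and cyclomatic complexity is approximated as the number of decision points plus one.

```python
import ast
import io
import tokenize

# Node types counted as one decision point each (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def file_metrics(source: str) -> dict:
    """Compute a few per-file features from Python source code."""
    nodes = list(ast.walk(ast.parse(source)))

    # Number of methods: every (async) function definition in the file.
    n_methods = sum(
        isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in nodes
    )

    # Decision points: branching statements, plus one per extra and/or operand.
    decisions = sum(isinstance(n, _BRANCH_NODES) for n in nodes)
    decisions += sum(len(n.values) - 1 for n in nodes if isinstance(n, ast.BoolOp))

    # Number of comments: COMMENT tokens reported by the tokenizer.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    n_comments = sum(tok.type == tokenize.COMMENT for tok in tokens)

    return {
        "num_methods": n_methods,
        "num_comments": n_comments,
        "cyclomatic_complexity": decisions + 1,
        "num_lines": len(source.splitlines()),
        "filesize": len(source.encode("utf-8")),  # size in bytes
    }

# Hypothetical sample file used only to exercise the function.
SAMPLE = '''# toy module
def sign(x):
    # classify a number
    if x > 0 and x < 10:
        return 1
    for _ in range(2):
        pass
    return 0
'''

metrics = file_metrics(SAMPLE)
```

For the sample file, the sketch counts one method, two comments, and three decision points (the `if`, the `and`, and the `for`), giving an approximate cyclomatic complexity of 4.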
The data collected had a higher percentage of missing entries for bot code than
for human-coded files. Bots do not populate many of the features, which produces
more missing values; we identify this as a major characteristic differentiating
bot code from human code. The number of code lines written by humans was about
twice that of a bot, and the filesize showed a similar trend, as it was directly
proportional to the number of code lines. Most of the bots were programmed for
file types and languages such as pom.xml (used in Apache Maven), JavaScript,
Java, XML, and SQL. On the other hand, humans were more active in high-level
object-oriented programming languages such as Java, JavaScript, and Python
(Fig. 2).
The number of methods was higher for humans than for bots, implying that human
code is characterized by higher modularity. Human-written code also had a higher
number of comments than bot code, indicating better readability, more efficient
debugging, and lower escape defects, thus raising team velocity and accelerating
the development process. Total data abstraction coupling (DAC) is a metric that
measures the number of instantiations of other classes within a given class and
is concerned with the degree of reuse [13]. Human code had a higher DAC value,
signifying substantial complexity and requiring more significant maintenance
and testing efforts. Likewise, reusability and understandability are negatively
influenced by coupling, which means more money and time would be invested in
debugging, which may frustrate the customer, affect deadlines, and impede the
team velocity. Bots have the edge over humans in this case, as their code is more
reusable and less complicated, making it easier to debug and maintain. A
significant drawback of high DAC is reduced stability: as coupling makes the
different objects and classes interrelated, a faulty component can lead to
instability, resulting in intangible losses for the organization. Fan-out
complexity [14] is defined as the number of functions called by a given
function; its value is lower for code written by bots than by humans, and it has
a high correlation with DAC. Halstead complexity is significantly higher for
humans than for bots, signifying more logical and mathematical content in
human-written code than in bot code (Fig. 3).
The data collected was preprocessed by Single Centered Imputation using Multiple
Chained Equation (SICE) [15] to handle missing values. SICE randomly imputes
continuous two-level data and maintains consistency between imputations through
passive imputation; it was applied only to columns with more than 20% null
values, and the median was used for the remaining columns. Z-score normalization
[16] was applied to rescale each feature to zero mean and unit variance, making
the features comparable. This step standardizes the range of the continuous
initial variables so that each of them contributes equally to the analysis. We
add our own set of new features, combined as described above and defined as
follows:
Cyclomatic Escape Defect Complexity:
M = E − N + 2P + Q (1)
where
E = the number of edges in the control flow graph
N = the number of nodes in the control flow graph
P = the number of connected components
Q = the number of relevant issues reported for the repository
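As a worked illustration of Eq. (1), the sketch below evaluates M for a small control flow graph. The graph (a single if/else branch), the issue count Q = 3, and the helper names are assumptions chosen for the example; they do not come from the dataset.

```python
def connected_components(nodes, edges):
    """Count connected components of the graph, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})

def cyclomatic_escape_defect_complexity(nodes, edges, issues):
    """Evaluate M = E - N + 2P + Q from Eq. (1)."""
    e = len(edges)                           # E: edges in the control flow graph
    n = len(nodes)                           # N: nodes in the control flow graph
    p = connected_components(nodes, edges)   # P: connected components
    return e - n + 2 * p + issues            # Q: relevant issues reported

# Control flow graph of one if/else statement (hypothetical example).
nodes = ["entry", "cond", "then", "else", "exit"]
edges = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

m = cyclomatic_escape_defect_complexity(nodes, edges, issues=3)
```

Here E = 5, N = 5, and P = 1, so the plain cyclomatic complexity is 2; adding Q = 3 reported issues gives M = 5.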
Equation (1) thus combines cyclomatic complexity with the number of issues
reported for merge conflicts and bugs in the repository. We combine Halstead
complexity with the team’s velocity to get an idea of the change in the
program’s size and its increased complexity. Mathematically, we can define it as
follows:
Halstead Team Velocity Program Length:
Team Velocity = (Number of commits + Number of employees) / Time of commit (4)
model. We observed that bot code suffers from code quality issues and lesser logic;
however, it is a good alternative for maintenance work, and actual human effort is
highly modular and logical (Table 4).
5 Conclusions
This paper provides insight into machine-generated code, which has a long way to
go before it can replace human coding efforts in software development and, for
now, can only assist managers. It is more suitable for performing maintenance
work and, if overused, can lead to more problems than benefits. Identifying
human coding effort will serve as a cutting-edge metric to evaluate an
employee’s performance, giving the manager a better basis for incentive
distribution and for assessing team quality. With automated bots for maintenance
and testing, developers can shift their focus, to a limited degree, to more
pleasing aspects of product design.
References
1. Treude, C., Storey, M.: Awareness 2.0: staying aware of projects, developers and tasks using
dashboards and feeds. In: IEEE International Conference on Software Engineering (2010)
2. Canaparo, M., Ronchieri, E.: Data mining techniques for software quality prediction in open
source software: an initial assessment. In: European Physical Journal Conference (2019)
3. Punitha, K., Chitra, S.: Software defect prediction using software metrics: a survey. In:
International Conference on Information Communication and Embedded Systems (2013)
4. Yamauchi, K., Aman, H., Amasaki, S., Yokogawa, T., Kawahara, M.: An entropy-based metric
of developer contribution in open source development and its application to fault-prone program
analysis. Int. J. Netw. Distrib. Comput. 6(3) (2018)
5. Kollanyi, B.: Where do bots come from? An analysis of bot codes shared on GitHub. Int. J.
Commun. (2016)
6. Nagappan, N., Ball, T., Zeller, A.: Mining metrics to predict component failures. In: International
Conference on Software Engineering (2006)
7. Idreos, S., Callaghan, M.: Key-value storage engines. In: ACM SIGMOD International
Conference on Management of Data (2020)
8. Bavota, G., De Carluccio, B., De Lucia, A., Di Penta, M., Oliveto, R., Strollo, O.: When does
a refactoring introduce bugs? An empirical study. In: IEEE International Workshop on Source
Code Analysis & Manipulation (2012)
9. Abouelela, M., Benedicenti, L.: Bayesian network based XP process modelling. Int. J. Softw.
Eng. Appl. (2010)
10. Chang, Z., Son, R.G., Sun, Y.: Validating halstead metrics for scratch program using process
data. In: IEEE International Conference on Consumer Electronics (2018)
11. Kapur, R., Sodhi, B.: A defect estimator for source code: linking defect reports with
programming constructs usage metrics. In: ACM Transactions in Software Engineering and
Methodology (2020)
12. Misra, S., Fernandez-Sanz, L., Adewumi, A., Crawford, B., Soto, R.: Applicability of cyclo-
matic complexity on WSDL. In: International Conference on Soft Computing, Intelligent
Systems, and Information Technology (2015)
13. Arora, R., Kumar, M.: Dynamic coupling metrics for object oriented software. Int. J. Res. Anal.
Rev. 5(2) (2018)
14. Murgia, A., Tonelli, R., Marchesi, M., Concas, G., Counsell, S., McFall, J., Swift, S.: Refac-
toring and its relationship with fan-in and fan-out: an empirical study. In: IEEE European
Conference on Software Maintenance and Engineering (2012)
15. Khan, S.I., Latiful Hoque, A.S.Md.: SICE: an improved missing data imputation technique. J.
Big Data (2020)
16. Mohsin, M.F.M., Hamdan, A.R., Bakar, A.A.: The effect of normalization for real value negative
selection algorithm. In: International Multi-Conference on Artificial Intelligence Technology
(2013)
17. Fernández, A., García, S., Herrera, F., Chawla, N.V.: SMOTE for learning from imbalanced
data: progress and challenges, marking the 15-year anniversary. J. Artif. Intell. Res. (2018)
18. Yadav, S., Shukla, S.: Analysis of k-fold cross-validation over hold-out validation on colossal
datasets for quality classification. In: IEEE International Conference on Advanced Computing
(2016)
19. Sehgal, S., Singh, H., Agarwal, M., Bhasker, V., Shantanu: Data analysis using principal
component analysis. In: IEEE International Conference on Medical Imaging, m-Health and Emerging
Communication Systems (MedCom) (2014)
20. Lv, C., Chen, D.-R.: Interpretable functional logistic regression. In: International Conference
on Computer Science and Application Engineering (2018)
21. Zhang, Y.: Support vector machine classification algorithm and its application. In: International
Conference on Information Computing and Applications (2012)
22. Zhang, L., Suganthan, P.N.: Benchmarking ensemble classifiers with novel co-trained kernel
ridge regression and random vector functional link ensembles. IEEE Comput. Intell. Mag.
(2017)
23. Zhong, Y.: The analysis of cases based on decision tree. In: IEEE International Conference on
Software Engineering and Service Science, Beijing (2016)
24. Biau, G.: Analysis of a random forests model. J. Mach. Learn. Res. (2012)
25. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: ACM SIGKDD
International Conference (2016)
26. Priyadarshini, R.K., Banu, A.B., Nagamani, T.: Gradient boosted decision tree based classifica-
tion for recognizing human behavior. In: International Conference on Advances in Computing
and Communication Engineering (2019)
27. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.-Y.: LightGBM: a highly
efficient gradient boosting decision tree. In: International Conference on Neural Information
Processing Systems (NeurIPS) (2017)
28. Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn.
Res. (2012)
29. Lipton, Z.C., Elkan, C., Narayanaswamy, B.: Optimal thresholding classifiers to maximize F1
measure. In: Joint European Conference on Machine Learning and Knowledge Discovery in
Databases (2014)
Green Internet of Things: The Next
Generation Energy Efficient Internet
of Things
Abstract The Internet of Things (IoT) is seen as a novel technical paradigm aimed
at enabling connectivity between billions of interconnected devices all around
the world. IoT is applied in various domains, such as smart healthcare, traffic
surveillance, smart homes, smart cities, and various industries. IoT’s main
functionality includes sensing the surrounding environment, collecting data from
the surroundings, and transmitting that data to remote data centers or the
cloud. This sharing of vast volumes of data between billions of IoT devices
generates a large energy demand and increases energy wastage in the form of
heat. Green IoT envisages reducing the energy consumption of IoT devices and
keeping the environment safe and clean. Motivated by the goal of a sustainable
next-generation IoT ecosystem and a healthy green planet, we first offer an
overview of Green IoT (GIoT) and then present the challenges and future
directions of GIoT.
1 Introduction
Day by day, IoT-related technologies are entering our lives in various forms.
It is believed that IoT will become a revolutionary technology that can change
the face of our world [1–7]. IoT is capable of facilitating the connection of
billions of digital devices. It can also be regarded as the advanced version of Machine
N. N. Thilakarathne (B)
Department of ICT, Faculty of Technology, University of Colombo, Colombo, Sri Lanka
e-mail: navod.neranjan@ict.cmb.ac.lk
M. K. Kagita
School of Computing and Mathematics, Charles Sturt University, Melbourne, Australia
W. D. M. Priyashan
Department of Mechanical and Manufacturing Engineering, Faculty of Engineering, University of
Ruhuna, Galle, Sri Lanka
e-mail: madhuka.p@mme.ruh.ac.lk
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 391
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_38
392 N. N. Thilakarathne et al.
Many researchers have proposed different IoT architectures, but no single IoT
architecture is generally agreed upon [2, 14, 15]. The most well-known IoT
architecture comprises three layers: the perception, network, and application
layers [16–20], as depicted in Fig. 1.
• The perception layer comprises physical IoT devices that consist of various
actuators and sensors for sensing the environment and collecting information.
• The network layer supports the transmission and the processing of sensor data
gathered by the perception layer. It is mainly used for connecting to other smart
things, network devices, and servers.
• The application layer holds the responsibility for supplying the user with
application-specific services. It consists of various applications that facilitate
smart cities, smart homes, smart healthcare, and other IoT domains.
IoT’s functionality comprises several key stages: identification, sensing,
communication, computing, services, and semantics [15, 21]. The identification
stage ensures that the required information or service reaches the correct
address. Sensing deals with collecting data from different sources and sending
it to the data centers or the cloud. IoT sensing devices can measure various
attributes such as air quality, humidity, and temperature. IoT communication
enables IoT devices to provide specific services for users, and most of the time
it is carried out over wireless media such as Bluetooth, Bluetooth Low Energy
(BLE), and Wi-Fi. Computation is performed by various microcontrollers,
microprocessors, and software applications. Services vary with the context and
domain in which the IoT devices reside and provide different functions to
end-users. Finally, semantics deals with gathering intelligent knowledge to make
quality decisions.
Although the IoT has many problems, such as security, privacy, and
interoperability issues [4–6, 17], energy usage will be the most critical
obstacle in implementing the IoT. As the number of IoT devices connected to the
Internet, such as RFIDs, sensors, actuators, and mobile devices, has risen
rapidly [30], energy needs will also grow. If billions of IoT devices are
constantly working, they will require massive amounts of
energy daily and generate a large volume of data that will magnify the energy
consumption. Transporting and storing this data also increases the energy
requirement. The fact that traditional energy sources are running short deepens
the crisis. A side effect of this massive energy consumption is an uncontrolled
increase in carbon dioxide (CO2) emissions to the environment. To solve these
problems, GIoT has been proposed [15, 31, 32]. It can be described as the set of
energy-efficient IoT procedures (hardware, software, and policy-based) that
reduce the greenhouse effect [2, 15]. Figure 2 showcases the ecosystem of GIoT.
Before moving on to the GIoT approaches, readers need to understand what is
meant by Green Computing (GC), as GIoT is fundamentally based on GC techniques.
GC, more popularly known as Green IT (GIT), is the study and practice of
environmentally sustainable computing or IT. It encompasses the research and
practice of designing, manufacturing, using, and disposing of computing
components efficiently and effectively with minimal or no effect on the
environment, and it consists of four phases that assist in adopting green
practices.
• Green Use: This focuses on reducing energy consumption and promotes the
sustainable use of computers and other information systems.
• Green Disposal: This focuses on the refurbishing and re-use of obsolete computers
and recycling unwanted computer items.
• Green Design: This focuses on designing computer components that are energy
efficient and environmentally friendly.
• Green Manufacturing: This focuses on developing electronic parts, digital devices
with low impact or no impact on the environment.
It is noted that eco-friendliness and energy efficiency are the two
distinguishing features of GIoT. These characteristics are accomplished by
incorporating hardware, software, and policy-based energy-efficient procedures
and techniques to minimize energy consumption, CO2 emissions, and the greenhouse
effect [1, 3, 8]. Most IoT devices are not optimized for energy efficiency.
Hence, they waste energy by staying active even when they are not required to be
active all the time. To counter this massive energy consumption and wastage,
GIoT ensures that an IoT device is ON only when required and idle or OFF when
not required. GIoT focuses on the smart operation of devices with a decrease in
energy waste. Proper energy-efficient ventilation for the heat generated by
servers and data centers and intelligent energy-conserving techniques are among
the strategies for conserving energy when implementing GIoT. Several key green
technologies, such as green RFID, green sensing networks, and green cloud
computing, have been implemented to achieve GIoT. An RFID system is a small,
compact electronic system comprising a variety of RFID tags and small tag
readers [8, 9]. The tags store data about the objects to which they are
attached. The transmission range of RFID systems is, in general, a few meters.
There are two types of RFID tags, known as passive and active tags. Active tags
have batteries with which they continuously transmit their signal, while passive
tags have no onboard battery and must harvest energy instead. Another leading
technology for enabling GIoT is the green Wireless Sensor Network (WSN). A WSN
uses many sensor nodes with minimal power and storage space [1, 8, 9]. Cloud
computing is fundamentally based on virtualization, which aims to reduce energy
consumption compared to running multiple servers in the data centers. Green
cloud computing encompasses various policies for making the cloud more energy
efficient. In the following, we discuss the critical GIoT techniques [23, 32,
33].
• Green Internet Technology
Green Internet Technologies require special hardware and software designed to
consume less energy without reducing performance. This includes gateways,
routing devices, and communication protocols.
• Green RFID Tags
Active RFID tags have built-in batteries for continuously transmitting their signal,
while passive RFID tags don’t have an operational battery source. Reducing the
size of an RFID tag can reduce the amount of non-degradable material, and
various strategies have been proposed to reduce the energy consumption of
RFID tags. Interested readers are encouraged to refer to [1, 23, 33] to understand
Green RFID and associated technologies better.
• Green Wireless Sensor Network
Green WSN can be achieved by green energy conservation techniques, radio
optimization techniques, and green routing techniques, which leads to a reduction
This subsection gives readers a brief overview of recent models and techniques
developed and proposed for achieving GIoT. We have categorized them by the GIoT
approach they adopt: Hardware-Based (HB), Software-Based (SB), or Policy-Based
(PB) (Table 1).
Several problems are associated with the transformation from IoT to GIoT. They
can be grouped by different parameters, such as hardware-based, software-based,
routing algorithm-based, and policy-based. Hardware-based parameters include
processors, sensors,
servers, ICs, RFID devices, etc. Software-based parameters include cloud-based
systems, virtualization, data centers, etc. Policy-based parameters include
smart metering systems, prediction of energy usage, and so on [3, 14, 15]. GIoT
technology is currently in its infancy, but intensive research is underway to
achieve green technology and keep the environment safe. As of now, many
difficulties and issues have to be tackled urgently. In the following, we
discuss the key challenges that are blocking the way toward achieving GIoT.
• Universal GIoT Architecture for IoT: Various vendors and standardization
organizations are trying to link heterogeneous networks and a huge variety of
IoT devices in order to introduce an energy-efficient architecture that can be
applied universally across pervasive IoT ecosystems. However, the heterogeneity
of devices and networks has made this a tedious task.
There is no doubt that GIoT can enhance the quality of life and the environment
by making the related technologies and infrastructure more environmentally
friendly. Recent GIoT research has mainly focused on Green IoT applications and
services, devising advanced energy-efficient RFIDs, energy-efficient models and
planning, and localizing GIoT devices [1, 9, 42]. It is also expected that most
IoT devices will be made recyclable, so that they can be reused again and again,
reducing the toxic and hazardous materials emitted into the environment.
Besides, we can expect enabling technologies like cloud computing [43, 44], fog
computing, edge computing, and blockchain to become more common in GIoT
solutions, as those technologies are good at providing scalability, security,
and high performance for the underlying IoT ecosystem [2, 11, 15, 45, 46].
5 Conclusion
Motivated by the goal of a sustainable green smart world, our study provides an
overview of GIoT, its integrated technologies, and its challenges. The future
research directions and open problems regarding GIoT have also been presented.
Based on our review, we noted that GIoT could offer many advantages, such as
environmental sustainability and protection, end-user satisfaction in different
IoT domains, and reduced harmful effects on the environment and human health. We
also noted that even though GIoT is currently in its infancy, many GIoT-based
research activities are being conducted to keep the environment safe and reduce
the harmful effects of using IoT. As IoT constitutes the main part of digital
infrastructure globally, the benefits we can gain from adopting green practices
will be immense. We believe this study will help researchers, academics,
students, and other key stakeholders interested in making a safer green world.
References
1. Albreem, M.A., El-Saleh, A.A., Isa, M., Salah, W., Jusoh, M., Azizan, M.M., Ali, A.: Green
internet of things (IoT): an overview. In: 2017 IEEE 4th International Conference on Smart
Instrumentation, Measurement and Application (ICSIMA), pp. 1–6 (2017)
2. Shaikh, F.K., Zeadally, S., Exposito, E.: Enabling technologies for green internet of things.
IEEE Syst. J. 11(2), 983–994 (2015)
3. Ahmad, R., Asim, M.A., Khan, S.Z., Singh, B.: Green IoT—issues and challenges. In: Proceed-
ings of 2nd International Conference on Advanced Computing and Software Engineering
(ICACSE) (2019)
4. Kagita, M.K., Thilakarathne, N., Gadekallu, T.R., Maddikunta, P.K.R.: A review on security
and privacy of internet of medical things (2020). arXiv:2009.05394
5. Kagita, M.K., Thilakarathne, N., Gadekallu, T.R., Maddikunta, P.K.R., Singh, S.: A review on
cyber crimes on the internet of things (2020). arXiv:2009.05708
6. Kagita, M.K., Thilakarathne, N., Rajput, D.S., Lanka, D.S.: A detail study of security and
privacy issues of internet of things (2020). arXiv:2009.06341
7. Al-Turjman, F., Kamal, A., Husain Rehmani, M., Radwan, A., Khan Pathan, A.S.: The green
internet of things (G-IoT) (2019)
8. Prasad, S.S., Kumar, C.: A green and reliable internet of things. Commun. Netw. 5(1), 44–48
(2013)
9. Huang, J., Meng, Y., Gong, X., Liu, Y., Duan, Q.: A novel deployment scheme for green internet
of things. IEEE Internet Things J. 1(2), 196–205 (2014)
10. Green IoT.: https://www.telekom.com/en/company/topic-specials/internet-of-things/green-iot.
Last accessed 07 Nov 2020
11. Dogan, O., Gurcan, O.F.: Applications of big data and green IoT-enabling technologies for
smart cities. In: Handbook of Research on Big Data and the IoT, pp. 22–41 (2019)
12. Varjovi, A.E., Babaie, S.: Green internet of things (GIoT): vision, applications and research
challenges. Sustain. Comput.: Inform. Syst. 28, 100448 (2020)
13. Green Power for Mobile.: The global telecom tower ESCO market, Technical Report (2015)
14. Khan, N., Sajak, A., Alam.: Analysis of green IoT (2020)
15. Arshad, R., Zahoor, S., Shah, M.A., Wahid, A., Yu, H.: Green IoT: an investigation on energy
saving practices for 2020 and beyond. IEEE Access 5, 15667–15681 (2017)
16. Thilakarathne, N.N., Kagita, M.K., Gadekallu, D.T.R.: The role of the internet of things in health
care: a systematic and comprehensive study. Int. J. Eng. Manag. Res. 10(4), 145–159
(2020)
17. Thilakarathne, N.N.: Security and privacy issues in IoT environment. Int. J. Eng. Manag. Res.
10 (2020)
18. Thilakarathne, N.N., Kagita, M.K., Lanka, D., Ahmad, H.: Smart grid: a survey of architectural
elements, machine learning and deep learning applications and future directions (2020). arXiv:
2010.08094
19. Thilakarathne, N.N., Kagita, M.K., Gadekallu, T.R., Maddikunta, P.K.R.: The adoption of ICT
powered healthcare technologies towards managing global pandemics (2020). arXiv:2009.
05716
20. Bashar, D.A.: Review on sustainable green internet of things and its application. J. Sustain.
Wireless Syst. 1(4), 256–264 (2020)
21. Green IoT way to save the environment.: https://www.techiexpert.com/green-iot-way-to-save-
the-environment/. Last accessed 04 Nov 2020
22. Maksimović, M.: Transforming educational environment through green internet of things (G-
IoT). Trend 2017(23), 32–35 (2017)
23. The Internet of Things: Green Living.: https://en.reset.org/knowledge/internet-things-050
32017. Last accessed 07 Nov 2020
24. Green IoT.: How the internet of things is improving the environment. https://banyanhills.com/
green-iot-how-the-internet-of-things-is-improving-the-environment/. Last accessed 07 Nov
2020
25. Zhu, C., Leung, V.C., Shu, L., Ngai, E.C.H.: Green internet of things for smart world. IEEE
Access 3, 2151–2162 (2015)
26. Li, J., Liu, Y., Zhang, Z., Ren, J., Zhao, N.: Towards green IoT networking: performance
optimization of network coding based communication and reliable storage. IEEE Access 5,
8780–8791 (2017)
27. Smart Car.: https://en.wikipedia.org/wiki/Smart_car. Last accessed 07 Nov 2020
28. Nandyala, C.S., Kim, H.K.: Green IoT agriculture and healthcare application (GAHA). Int. J.
Smart Home 10(4), 289–300 (2016)
29. Ferrag, M.A., Shu, L., Yang, X., Derhab, A., Maglaras, L.: Security and privacy for green IoT-
based agriculture: review, blockchain solutions, and challenges. IEEE Access 8, 32031–32053
(2020)
30. Alsamhi, S.H., Ma, O., Ansari, M.S., Meng, Q.: Greening internet of things for greener and
smarter cities: a survey and future prospects. Telecommun. Syst. 72(4), 609–632 (2019)
31. Maksimović, M., Omanović-Mikličanin, E.: Green internet of things and green nanotechnology
role in realizing smart and sustainable agriculture (2017)
32. Solanki, A., Nayyar, A.: Green internet of things (G-IoT): ICT technologies, principles, applica-
tions, projects, and challenges. In: Handbook of Research on Big Data and the IoT, pp. 379–405
(2019)
33. Gapchup, A., Wani, A., Wadghule, A., Jadhav, S.: Emerging trends of green IoT for smart
world. Int. J. Innov. Res. Comput. Commun. Eng. 5(2), 2139–2148 (2017)
34. Abedin, S.F., Alam, M.G.R., Haw, R., Hong, C.S.: A system model for energy efficient green-
IoT network. In: 2015 International Conference on Information Networking (ICOIN), pp. 177–
182 (2015)
35. Said, O., Al-Makhadmeh, Z., Tolba, A.: EMS: an energy management scheme for green IoT
environments. IEEE Access 8, 44983–44998 (2020)
36. Lenka, R.K., Rath, A.K., Sharma, S.: Building reliable routing infrastructure for green IoT
network. IEEE Access 7, 129892–129909 (2019)
37. Al-Azez, Z.T., Lawey, A.Q., El-Gorashi, T.E., Elmirghani, J.M.: Virtualization framework
for energy efficient IoT networks. In: 2015 IEEE 4th International Conference on Cloud
Networking (CloudNet), pp. 74–77 (2015)
38. Vatari, S., Bakshi, A., Thakur, T.: Green house by using IOT and cloud computing.
In: 2016 IEEE International Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT), pp. 246–250 (2016)
39. Peoples, C., Parr, G., McClean, S., Scotney, B., Morrow, P.: Performance evaluation of green
data centre management supporting sustainable growth of the internet of things. Simul. Model.
Pract. Theory 34, 221–242 (2013)
40. Zamora-Izquierdo, M.A., Santa, J., Gómez-Skarmeta, A.F.: An integral and networked home
automation solution for indoor ambient intelligence. IEEE Pervasive Comput. 9(4), 66–77
(2010)
41. Eteläperä, M., Vecchio, M., Giaffreda, R.: Improving energy efficiency in IoT with re-
configurable virtual objects. In: 2014 IEEE World Forum on Internet of Things (WF-IoT),
pp. 520–525 (2014)
42. Yaacoub, E., Kadri, A., Abu-Dayya, A.: Cooperative wireless sensor networks for green internet
of things. In: Proceedings of the 8th ACM Symposium on QoS and Security for Wireless and
Mobile Networks, pp. 79–80 (2012)
43. Maksimovic, M.: Greening the future: green internet of things (G-IoT) as a key technological
enabler of sustainable development. In: Internet of Things and Big Data Analytics Toward
Next-Generation Intelligence, pp. 283–313 (2018)
44. Thilakarathne, N.N., Wickramaaarachchi, D.: Improved hierarchical role based access control
model for cloud computing (2020). arXiv:2011.07764
45. Jalali, F., Khodadustan, S., Gray, C., Hinton, K., Suits, F.: Greening IoT with fog: a survey. In:
2017 IEEE International Conference on Edge Computing (EDGE), pp. 25–31 (2017)
46. Sharma, P.K., Kumar, N., Park, J.H.: Blockchain technology toward green IoT: opportunities
and challenges. IEEE Netw. (2020)
iGarbage: IoT-Based Smart Garbage
Collection System
Abstract Rapid population growth is increasing the amount of garbage and waste
in urban areas. Garbage collection faces numerous problems, mostly in
metropolitan cities. In this paper, we propose an IoT-based Smart Garbage
Collection System. The designed system addresses these problems and saves effort
for both the garbage collectors and the residents of the houses. The practical
implementation of the developed system is efficient and accurate in its operation,
and the accuracy results achieved in real-time operation are very encouraging.
1 Introduction
It is generally accepted that the amount of waste and garbage is growing quickly
today. The Government spends an enormous sum of money just on the collection,
transportation, and management of garbage, and still, it isn't sufficient [1].
Garbage collection and disposal raise environmental issues in modern cities [2].
Hence, smart waste management systems have become essential for cities that aim to
reduce cost and manage resources and time [3]. A typical dustbin can only store
garbage. Whether the dustbin is full or not is left to the users to
decide. Garbage maintenance in dustbins is an issue in many homes, especially in
metropolitan cities, mainly because of the busy schedules of city
life [4].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 403
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_39
404 Z. Noorain et al.
The project “IoT-based Smart Garbage Collection System” presents a system that
can be integrated into a dustbin, allowing the dustbin to detect the level of
garbage in it and send the data to the cloud for continuous monitoring. The system
uses “ThingSpeak,” a cloud platform acting as both server and controller.
In this model, an Arduino is used as the microcontroller, an IR sensor is used
to detect the level of the garbage in the dustbin, and an ESP8266 Wi-Fi module is
integrated to send the collected data to the “ThingSpeak” cloud for data
monitoring and processing. After data processing, based on the triggers and the
associated actions configured by the administrator, an SMS is sent to the concerned
personnel. The concerned person then collects the garbage from the home
identified by the dustbin's ID number.
The structure of the paper has been organized as follows. Section 2 presents a brief
description of the previous works. Section 3 presents the proposed architecture of
the iGarbage system. Section 4 discusses the results to analyze the system. Section 5
concludes the paper with suggestions for future enhancements.
2 Related Work
The authors in [1] found that 85% of the total municipal solid waste management
budget is spent on garbage collection and transportation, so they implemented an
IoT-based Smart Bin that detects the level of garbage in the bin and then sends
the data to a database; the relevant information is forwarded to the concerned
authority so that appropriate action can be taken.
The authors in [4] examined the number of bins and the population in an area
and implemented a smart bin built on a microcontroller-based platform, an Arduino
Uno board interfaced with a GSM system and ultrasonic sensors. It can notify the
municipal authority of the filling status of the waste in the dustbin.
The authors in [5] attempted to reduce the amount of solid waste by implementing
a Smart Dustbin that detects the threshold level of the garbage in a bin and
then uses a blower to reduce the amount of garbage in the bin. When the
garbage cannot be compressed further, the concerned authority takes action to
empty the bin.
The authors in [6] observed that people today have little sense of responsibility
for neatness and cleanliness, mostly in urban society. They therefore attempted
to minimize the overflow of dustbins by developing a system that detects the
level of garbage in the dustbin and offers rewards to users if they throw their
garbage into an empty or partially filled dustbin.
In [7], the authors developed a system that detects the level of garbage in
the dustbin and sends a signal if 70% of the dustbin is filled. After the signal is
sent, a blower is used to compress the garbage. A GSM module identifies the
location of the dustbin.
The authors in [8] built a framework in which sensor nodes are connected with
an Arduino board control station that sends sensor information in SMS using GSM
module to the trash collecting vehicle and the server. The sensor nodes utilize ultra-
sonic sensors to detect the level of waste against the previously set threshold level.
Also, a GPS module is integrated to get the exact location of the bin. The Amica
R2 NodeMCU microcontroller acts as a controller for GPS modules and ultrasonic
sensors. This board has a built-in Wi-Fi module that is used to send data to the server.
The authors in [9] built a system in which an accelerometer sensor is integrated to
detect the opening and closing of the dustbin. Also, a temperature and humidity
sensor is used to check the waste material's temperature and humidity. An ultrasonic
sensor is used to check the level of garbage in the dustbin. A Zigbee Pro
microcontroller is used to control the sensors. The microcontroller board has a
built-in Wi-Fi module that sends the sensor data to the gateway. The connection to
the server is over GPRS. The database management system used is Caspio. Many more
similar works have been presented in [10–13].
– SMS API
An SMS API is a dedicated interface that allows existing platforms to integrate the
Short Message Service (SMS). Here, the Twilio SMS API is used to send
messages. It has been configured with the ThingSpeak cloud.
Sensing is performed by an IR sensor integrated into the dustbin. The
sensed data is continuously transferred to the ThingSpeak cloud over the
Internet. If the data at ThingSpeak reaches a specific defined value, an
SMS is sent to the garbage collector. Figure 2 shows the working of the proposed
model.
The sensing module consists of a sensor integrated with the dustbin. This module
is responsible for detecting garbage within the threshold limit by emitting infrared
radiation and receiving its reflection.
The networking module is a layer that establishes a connection between the sensing
module and the communication module. It consists of a Wi-Fi ESP8266 module and
Internet.
The communication module consists of ThingSpeak Cloud and SMS alerts with SMS
API integration.
The status of garbage in the dustbin is continuously observed. On reaching
the threshold level, the data is sent to the ThingSpeak cloud. Further, a message
with the dustbin ID is sent to the staff to collect the garbage from that particular
dustbin. After the waste is collected, the status of the dustbin is updated. The
flowchart for the proposed model is presented in Fig. 3.
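The flow described above can be sketched as a small simulation. This is only an illustrative sketch, not the authors' firmware: the `Dustbin` class, the `process_reading` helper, the ID format, and the message text are hypothetical, and the choice to alert only on the transition from "not full" to "full" is an assumption made here to avoid repeated messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dustbin:
    bin_id: str          # hypothetical ID format, e.g. "BIN-001"
    full: bool = False   # last known status of the dustbin


def process_reading(dustbin: Dustbin, ir_value: int) -> Optional[str]:
    """Update the dustbin status from one IR reading (1 = threshold reached).

    Returns an alert text only when the status changes from not full to full
    (an assumption of this sketch), so a bin that stays full does not
    produce a new message on every reading.
    """
    now_full = ir_value == 1
    alert = None
    if now_full and not dustbin.full:
        alert = f"Dustbin {dustbin.bin_id} is full. Please collect the garbage."
    # A reading of 0 after collection resets the status to "not full".
    dustbin.full = now_full
    return alert


if __name__ == "__main__":
    bin_a = Dustbin("BIN-001")
    for reading in [0, 0, 1, 1, 0]:
        print(reading, bin_a.full, process_reading(bin_a, reading))
```

In the deployed system the same decision is taken in the cloud rather than on the device; the sketch only makes the status transitions explicit.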
The proposed system was implemented for a colony with a few residential flats to
obtain results and check the proposed model's performance.
An IR sensor is connected to the Arduino board to detect the level of the garbage.
To make the IR sensor functional, digital pin 12 of the Arduino board is
connected to the IR sensor's output pin. The IR sensor receives a 5 V supply
through the Arduino board, which is powered over a USB cable.
Whenever any object comes around at approximately 6 cm, the IR sensor will
detect it and send this signal to the microcontroller.
To upload this data to the ThingSpeak cloud server, the ESP8266 Wi-Fi module is
integrated. To make the Wi-Fi module functional, all 6 pins are used.
The 0 (RX) pin of the Arduino board is connected to the Wi-Fi module's RX pin,
the digital pin 1 (TX) of the Arduino board is connected to the TX pin, the 3.3 V pin
of the Arduino is connected to the power pin (3.3 V) and the EN pin of the Wi-Fi
module, and the remaining ground pin is connected to ground. When the
microcontroller reads the IR sensor's signal, the data is uploaded to the ThingSpeak
server. As soon as the garbage level reaches the threshold limit, the connected LED
turns from green to red. The LED is integrated into the dustbin. To make it work,
2 pins of the RGB LED are used. One is connected to digital pin 13 of the Arduino
board as the output pin, and the other pin is connected to ground. The circuit
design for iGarbage is presented in Fig. 4.
The SMS API is used to send messages to individual staff whenever the dustbin
reaches the threshold value. Here, the Twilio SMS API is used to send messages. It
has been configured with the ThingSpeak cloud. The official website of the SMS API
used here is https://www.twilio.com.
The ThingSpeak cloud platform is used to track the real-time garbage level of the
dustbin. This platform is also used to integrate the SMS API via the ThingHTTP and
React applications for sending an SMS according to the prescribed condition. As soon
as the prescribed condition is met, an SMS is delivered to the specified number.
A field chart, presented in Fig. 5, is used to display the real-time data. A Write
API key and a Read API key are provided; the Write API key is used to send data to
the ThingSpeak server from a microcontroller. The figure shows the Read and Write
API keys as well as the URLs to write and read data.
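As an illustration of how a Write API key is used, the following sketch builds the kind of update request that ThingSpeak accepts for writing a value to Field 1 of a channel. It only constructs the URL and does not contact the server; the helper name `build_update_url` and the placeholder key are assumptions of this sketch, and in the system described here the request is issued by the microcontroller through the Wi-Fi module rather than by Python.

```python
from urllib.parse import urlencode

# Public ThingSpeak endpoint for writing channel data.
THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key: str, ir_value: int) -> str:
    """Return the GET URL that writes one IR reading to Field 1."""
    query = urlencode({"api_key": write_api_key, "field1": ir_value})
    return f"{THINGSPEAK_UPDATE_URL}?{query}"


if __name__ == "__main__":
    # "YOUR_WRITE_API_KEY" is a placeholder, not a real key.
    print(build_update_url("YOUR_WRITE_API_KEY", 1))
```

Keeping the URL construction in one function makes it easy to check the request format before any hardware is involved.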
4.1.3 ThingHTTP
4.1.4 REACT
React is an application that is used to define the trigger that automates SMS
sending. Here, the condition is that if the IR sensor's value (the value in Field 1)
reaches 1, the action is triggered, and with the help of ThingHTTP, an SMS is sent
to the respective staff.
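The trigger and the resulting SMS request can be sketched as follows. This is a hedged illustration under stated assumptions: the function names, phone numbers, account identifier, and message text are hypothetical, and the request is only described as a dictionary instead of being sent. The URL follows the general shape of Twilio's REST endpoint for creating a message, which expects `To`, `From`, and `Body` form fields and HTTP Basic authentication.

```python
from typing import Dict

TWILIO_API_BASE = "https://api.twilio.com/2010-04-01"


def react_condition_met(field1_value: int, trigger_value: int = 1) -> bool:
    """Mimic the React rule: fire when Field 1 equals the trigger value."""
    return field1_value == trigger_value


def build_sms_request(account_sid: str, to_number: str,
                      from_number: str, bin_id: str) -> Dict[str, object]:
    """Describe the HTTP POST that would ask Twilio to send the alert SMS.

    Authentication (account SID and auth token) is deliberately left out
    of this sketch; nothing is transmitted.
    """
    return {
        "method": "POST",
        "url": f"{TWILIO_API_BASE}/Accounts/{account_sid}/Messages.json",
        "form": {
            "To": to_number,
            "From": from_number,
            "Body": f"Dustbin {bin_id} has reached its threshold level.",
        },
    }


if __name__ == "__main__":
    if react_condition_met(1):
        # All identifiers below are placeholders.
        print(build_sms_request("ACxxxx", "+10000000000", "+10000000001", "BIN-001"))
```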
A website is used to view the details of the dustbin. The site has three sections,
Guest, Staff, and Admin, as shown in Fig. 6. A guest can see the details of a dustbin
by giving the dustbin ID and password. A staff member can list all the dustbins in a
specific state or city. The administrator has full control over the site and can
modify the dustbin data.
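The three access levels can be sketched with a small in-memory model. This is an assumption-laden illustration, not the site's actual implementation: the class and method names, the record fields, and the sample values are hypothetical, and passwords are kept in plain text only to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DustbinRecord:
    bin_id: str
    password: str   # plain text only for brevity; not suitable for real use
    city: str
    status: str     # e.g. "full" or "empty"


class DustbinSite:
    """Toy model of the Guest, Staff, and Admin sections."""

    def __init__(self, records: List[DustbinRecord]) -> None:
        self._records: Dict[str, DustbinRecord] = {r.bin_id: r for r in records}

    def guest_view(self, bin_id: str, password: str) -> Optional[DustbinRecord]:
        """A guest sees one dustbin after giving its ID and password."""
        record = self._records.get(bin_id)
        if record is not None and record.password == password:
            return record
        return None

    def staff_list(self, city: str) -> List[DustbinRecord]:
        """A staff member lists all dustbins in a given city."""
        return [r for r in self._records.values() if r.city == city]

    def admin_update_status(self, bin_id: str, status: str) -> None:
        """The administrator can change the stored dustbin data."""
        self._records[bin_id].status = status
```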
Figure 9 shows the proposed model's accuracy when compared with the real-time
data received from the dustbin. The data values reported by the model are very
close to the real-time data received from the dustbin, which indicates that the
developed model is quite efficient. A brief comparison of some essential features
incorporated in our model is presented in Table 1.
The system named “IoT-based Smart Garbage Collection System” has been
experimentally shown to work satisfactorily by integrating various components
controlled by the microcontroller. The IR sensor was tested on numerous occasions by
placing garbage at various levels. The results obtained were quite encouraging. The
efficiency and accuracy achieved are outstanding.
Future enhancements for the proposed system can be as follows:
– Including an AI-enabled system that automatically separates dry and wet
waste.
– Adding a feature that opens the dustbin automatically through a voice
recognition control system.
– The proposed iGarbage system is battery operated and has a relatively high
maintenance cost. One future direction is to improve the battery life of the
iGarbage system.
References
1. Zeb, A., Ali, Q., Saleem, M.Q., Awan, K.M., Alowayr, A.S., Uddin, J., Iqbal, S., Bashir, F.:
A proposed IoT-enabled smart waste bin management system and efficient route selection.
Hindawi J. Comput. Netw. Commun. 2019, 1–9 (2019)
2. Chowdhury, B., Chowdhury, M.U.: RFID-based real-time smart waste management system. In:
Australasian Telecommunication Networks and Applications Conference, pp. 175–180 (2007)
3. Zanella, A., Bui, N., Castellani, A., Vangelista, L., Zorzi, M.: Internet of Things for smart
cities. IEEE Internet Things J. 1(1), 22–32 (2014)
4. Sinha, T., Kumar, K.M., Saisharan, P.: Smart dustbin. Int. J. Ind. Electron. Electr. Eng. 03(05)
(2017)
5. Nagaraju, U., Mishra, R., Kumar, C., Rajkumar: Smart dustbin for economic growth. Project
report
6. Parikh, P.A., Vasani, R., Raval, A.: Smart dustbin—an intelligent approach to fulfill Swatchh
Bharat Mission. Int. J. Eng. Res. Electron. Commun. Eng. 4(10) (2017)
7. Thorat, S., Kanase, S., Bhingardeve, P.: Smart dustbin container using IoT notification. Int.
Res. J. Eng. Technol. 6(4) (2019)
8. Omar, M.F., Termizi, A.A.A., Zainal, D., Wahap, N.A., Ismail, N.M., Ahmad, N.: Implemen-
tation of spatial smart waste management system in Malaysia. IOP Conf. Ser.: Earth Environ.
Sci. 37 (2016)
9. Longhi, S., Marzioni, D., Alidori, E., Buo, G.D., Prist, M., Grisostomi, M., Pirro, M.: Solid
waste management architecture using wireless sensor network technology. In: 5th International
Conference on New Technologies, Mobility and Security (NTMS), pp. 1–5 (2012)
10. Ahmad, T., Abbas, A.M.: EEAC: an energy efficient adaptive cluster based target tracking in
wireless sensor networks. J. Interdiscip. Math. 23(2), 379–392 (2020)
11. Murugaanandam, S., Ganapathy, V., Balaji, R.: Efficient IOT based smart bin for clean envi-
ronment. In: International Conference on Communication and Signal Processing (ICCSP), pp.
0715–0720 (2018)
12. Ahmad, T., Haque, M., Khan, A.M.: An energy-efficient cluster head selection using artifi-
cial bees colony optimization for wireless sensor networks. In: Advances in Nature-Inspired
Computing and Applications. EAI/Springer Innovations in Communication and Computing
(2019)
13. Nehete, P., Jangam, D., Barne, N., Bhoite, P., Jadhav, S.: IoT based garbage monitoring system.
In: Second International Conference on Electronics, Communication and Aerospace Technol-
ogy (ICECA), pp. 1454–1458 (2018)
14. Zouai, M., Kazar, O., Bellot, G.O., Haba, B., Kabachi, N., Krishnamurhty, M.: Ambiance
intelligence approach using IoT and multi-agent system. Int. J. Distrib. Syst. Technol. 10(1)
(2019)
15. Louis, L.: Working principle of Arduino and using it as a tool for study and research. Int. J.
Control Autom. Commun. Syst. 1(2) (2016)
16. Karim, A., Andersson, J.Y.: Infrared detectors: advances, challenges and new technologies.
IOP Conf. Ser.: Mater. Sci. Eng. 51 (2013)
IoT-Based Smart Home Surveillance
System
Abstract Most surveillance system prototypes that have been developed to date
utilize sensors for motion detection and require a memory card for storing data.
Issues such as the price of the cameras, flexibility, and user-friendliness need
to be addressed for a surveillance system. This paper aims to develop an affordable
surveillance system that doesn't need external memory devices, by deploying cloud
storage. It uses computer vision for motion detection and also allows for facial
recognition based on preloaded data. The final model incorporating all these
features has been developed successfully and verified through multiple testing
processes.
1 Introduction
In the face of the increasing number of crimes, it has become essential to ensure
the safety of one's home by continually monitoring and remaining alert about any
trespassers that try to enter the premises illegitimately. Even though the market
is swamped with different kinds of surveillance systems, there is still scope for
enhancing these systems in terms of user features.
Many surveillance systems today depend on a physical memory component like
an external SD card to store data from the surveillance video stream [1]. This
increases cost for the user in the long term. Further, hardly any CCTV cameras are
equipped for motion detection [2], and those that are can be more expensive than
the existing CCTV cameras in use. Facial recognition is not yet prevalent or widely
seen in the CCTV systems available in the market [3]. This surveillance system model
addresses the above difficulties by integrating cloud storage, allowing motion
detection as well as facial recognition during the video stream, and keeping the
cost low. The objectives of this research are
1. To design an intelligent surveillance system using Raspberry Pi
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 417
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_40
418 S. Dash and P. Choudekar
2 Literature Review
3 Proposed Methodology
The methodology used for this paper is object-oriented analysis and design,
or the OOAD methodology. The first step in this process is to demarcate the system's
different modules, as done already, and then design the system required. Raspberry Pi
is the heart of this surveillance system and is the primary microcomputer used to run
all programs. The official RPi camera module, PiCamera, has been used to capture
images and stream video. Setting up this surveillance system requires executing a
simple Python script that gives a continuous live feed of the surroundings. The setup
consists only of the small RPi board, with its camera module fixed on top of
the board, connected to the nearest power supply.
Figure 1 represents the components of this surveillance system and their
connections. The central element, Raspberry Pi, is a small hand-sized microcomputer
that acts as the brain of this surveillance system. RPi is easy to get and offers a
very intuitive programming environment thanks to its Linux distribution, the
Raspbian OS [12, 13]. Further, it has its own camera module, PiCamera, and can also
work with a USB camera bought separately. This offers ease of setup to the
programmer and user. The other interconnected components, Dropbox, the host
computer, and computer vision, all work together in sync to capture and store images
once motion is detected. Dropbox is a popular cloud-based storage service used here
to store snapshots of detected motion in this surveillance system. Other popular
services, such as Amazon Web Services or Google Cloud, could be used for this
purpose. Dropbox makes the prototype cost-effective and is a suitable replacement for
traditional memory cards. It also makes it easier to send updates to the user in case
of any motion detection. Motion detection and facial recognition are accomplished by
using Python programming to implement computer vision. The computer vision
component is performed by OpenCV, which is a dedicated library primarily employed
for image processing tasks [14, 15]. For motion detection, each frame recorded
through the video stream is processed, and when there is a change between successive
frames, motion is detected by the RPi. If a face is perceptible by the camera for a
certain period, facial recognition is also accomplished. For facial recognition to
work, the pictures of known people must first be loaded into the RPi's internal
memory (Fig. 2).
Fig. 2 Flowchart of
surveillance system
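The idea of detecting motion from a change between successive frames can be sketched with simple frame differencing. This is a minimal, dependency-free illustration and not the authors' code: frames are modeled as nested lists of grayscale values, and the function names and both thresholds are assumptions of this sketch. With OpenCV, the per-pixel difference and thresholding would typically be done with functions such as `cv2.absdiff` and `cv2.threshold` on image arrays instead.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def changed_pixel_count(previous: Frame, current: Frame,
                        pixel_threshold: int = 25) -> int:
    """Count pixels whose brightness changed by more than pixel_threshold."""
    return sum(
        1
        for prev_row, cur_row in zip(previous, current)
        for prev_px, cur_px in zip(prev_row, cur_row)
        if abs(cur_px - prev_px) > pixel_threshold
    )


def motion_detected(previous: Frame, current: Frame,
                    pixel_threshold: int = 25,
                    min_changed_pixels: int = 3) -> bool:
    """Report motion when enough pixels differ between successive frames."""
    return changed_pixel_count(previous, current, pixel_threshold) >= min_changed_pixels


if __name__ == "__main__":
    empty_room = [[10] * 4 for _ in range(4)]
    person_enters = [row[:] for row in empty_room]
    for r in range(1, 3):
        for c in range(1, 3):
            person_enters[r][c] = 200  # a bright 2x2 region appears
    print(motion_detected(empty_room, empty_room))     # identical frames
    print(motion_detected(empty_room, person_enters))  # 4 pixels changed
```

Requiring a minimum number of changed pixels, rather than any single changed pixel, is one simple way to keep sensor noise from being reported as motion.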
The entire setup can be divided into distinct modules, each module working
separately for live video feed, motion detection, facial recognition, storage, and
user notification. As discussed, this project follows an object-oriented analysis and
design methodology, or OOAD: analyzing system requirements, implementing
the design to satisfy these requirements, and finally, testing to ensure proper
working. The requirement analysis can be summarized as below (Table 3).
Further, the functionality of the prototype is also tested as has been summarized
below (Table 4).
The final step in the model development procedure was to conduct both unit and
system tests. It is necessary to test the prototype to catch any functioning errors and
ensure working is as envisioned during the conception stage. For testing, the entire
hardware setup consisting of Raspberry Pi and PiCamera is connected to a 5-A power
supply and connected to the same internet network as the host computer. Through
this, RPi’s graphical user interface can be seen and operated via the host computer
screen.
Figure 3 shows the physical arrangement of the surveillance system model. The
initial step was to perform unit functionality tests for each module to check various
aspects like camera setting, the video display on the host screen, motion detection and
facial recognition by RPi, and storage of PiCamera photographs in Dropbox.
Python code for all these tasks is first written separately and tested one by one.
Once it is confirmed that all the above features are working, the code for these
individual tasks is combined to create the final surveillance system program. This
comes under system testing. The camera can capture images, and these images are
stored successfully on Dropbox.
Figure 4 shows the successful setup of the PiCamera module. Room status is
unoccupied when no motion is detected.
Figure 5 shows the successful setup of the video stream of surroundings. As soon
as RPi detects motion, the image is captured and uploaded to Dropbox, as shown
here. Room status changes to occupied.
Figure 6 shows that motion detection is successfully enabled. All snapshots are
available in the Dropbox account, along with timestamp and date.
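The behavior described for Figs. 4 to 6 (room status, snapshot upload, timestamped files) can be sketched as below. This is an illustrative sketch under assumptions: the folder name, the file-name format, and the function names are hypothetical, and the actual upload is passed in as a callable so that the example runs without network access. With the Dropbox Python SDK, that callable would presumably wrap an upload call such as `files_upload`.

```python
from datetime import datetime
from typing import Callable


def snapshot_path(captured_at: datetime, folder: str = "/surveillance") -> str:
    """Build a cloud path that encodes the capture date and time."""
    return f"{folder}/{captured_at.strftime('%Y-%m-%d_%H-%M-%S')}.jpg"


def room_status(motion: bool) -> str:
    """Room is 'Occupied' when motion is detected, otherwise 'Unoccupied'."""
    return "Occupied" if motion else "Unoccupied"


def handle_frame(motion: bool, image_bytes: bytes, captured_at: datetime,
                 upload: Callable[[str, bytes], None]) -> str:
    """Upload a snapshot when motion is seen and return the room status."""
    if motion:
        upload(snapshot_path(captured_at), image_bytes)
    return room_status(motion)


if __name__ == "__main__":
    def fake_upload(path: str, data: bytes) -> None:
        # Stand-in for a real cloud upload; only prints what would be sent.
        print(f"would upload {len(data)} bytes to {path}")

    print(handle_frame(True, b"jpeg-bytes", datetime.now(), fake_upload))
```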
Figure 7 shows that the surveillance system has been successfully connected to the
cloud. Facial recognition is seen to be working correctly as well, using the preloaded
images on RPi memory.
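Recognition against preloaded images can be illustrated with a nearest-match comparison of feature vectors. The paper does not state which recognition algorithm is used, so everything here is an assumption: faces are represented by short hypothetical feature vectors, the names and the tolerance value are made up for the example, and Euclidean distance is just one common way to compare such vectors.

```python
import math
from typing import Dict, List


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(face: List[float], known: Dict[str, List[float]],
             tolerance: float = 0.6) -> str:
    """Return the closest known name within tolerance, else 'Unknown'."""
    best_name, best_distance = "Unknown", tolerance
    for name, reference in known.items():
        distance = euclidean(face, reference)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name


if __name__ == "__main__":
    # Hypothetical preloaded feature vectors for two known people.
    known_people = {"resident_a": [0.0, 0.0], "resident_b": [1.0, 1.0]}
    print(identify([0.1, 0.0], known_people))
    print(identify([5.0, 5.0], known_people))
```

A face that is not close to any preloaded vector is reported as unknown, which corresponds to the case where the person's picture is not in the RPi database.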
Figure 8 shows that the home surveillance system is successful in identifying
faces. For the final system test, the hardware setup was connected to a power supply
and stationed in front of the entrance of a home to alert the user of any intrusion. As
already established through unit testing, the snapshots were uploaded to Dropbox,
and faces were recognized for the individuals whose pictures existed in the
database of RPi. The proposed work can be upgraded further in several ways. The
camera module can be substituted by the infrared PiCamera module,
through which RPi can capture pictures and identify faces in dark settings as well.
The face recognition feature can be changed to distinguish recurrently seen faces by
itself instead of relying on a preloaded database. The video feed can be streamed to
a website accessible only to the user, and an alarm or beeper can be added to form
an improved alert system. All these modifications have been summarized in Table 5.
Recently, Internet of Everything (IoET) and Cloud-based computing systems have become
very popular due to their inherent and location-independent operation, low-power
requirements, portability, and high scalability [16, 17]. The proposed work can also
be expanded using these technologies.
5 Conclusion
This surveillance system prototype was planned keeping in mind the requirement for
a small form factor, low cost, ease of use in terms of position-setting, and flexibility
for the user. Raspberry Pi is the microcomputer used here. Since it runs a Linux
distribution, the Raspbian OS, it is easily modifiable. As it is readily available to
people, either in physical markets or online, the surveillance system also proves
cost-effective. Features that had not been seen earlier in other surveillance systems
have also been incorporated, like using a cloud storage client, Dropbox in this case,
using computer vision for motion detection and facial recognition, and optimizing the
time delay in sending user updates and the live video stream.
References
1. Hou, J., Wu, C., Yuan, Z., Tan, J., Wang, Q., Zhou, Y.: Research of intelligent home secu-
rity surveillance system based on ZigBee. In: 2008 International Symposium on Intelligent
Information Technology Application Workshops (2008)
2. Keat, L.H., Wen, C.C.: Smart indoor home surveillance monitoring system using Raspberry
Pi, vol 2. International Journal on Informatics Visualization (2018)
3. Pi, R.: Raspberry pi. Raspberry Pi 1, 1 (2013)
4. Richardson, M., Wallace, S.: Getting started with raspberry PI. O’Reilly Media, Inc. (2012)
5. Upton, E., Halfacree, G.: Raspberry Pi User Guide. John Wiley & Sons (2014)
6. Keval, H.: CCTV control room collaboration and communication: does it work? In: Proceedings
of Human-Centered Technology Workshop, pp. 11–12 (2006)
7. Iyer, B., Pathak, N.P., Ghosh, D.: RF sensor for smart home application. Int. J. Syst. Assur.
Eng. Manag. 9, 52–57 (2018). https://doi.org/10.1007/s13198-016-0468-5
8. Poole, N.R., Zhou, Q., Abatis, P.: Analysis of CCTV digital video recorder hard disk storage
system. Digit. Invest. 5(3), 85–92
9. Gerrard, G., Parkins, G., Cunningham, I., Jones, W., Hill, S., Douglas, S.: National CCTV
strategy. Home Office, London (2007)
10. Boghossian, B.A., Velastin, S.A.: Motion-based machine vision techniques for the manage-
ment of large crowds. In: Electronics, Circuits and Systems, 1999. The 6th IEEE International
Conference on Proceedings of ICECS’99, vol. 2, pp. 961–964. IEEE
11. Quadri, S.A.I., Sathish, P.: IoT based home automation and surveillance system. In: 2017
International Conference on Intelligent Computing and Control Systems (ICICCS) (2017).
12. Sruthy, S., George, S.N.: Wi-Fi enabled home security surveillance system using Raspberry Pi
and IoT module. In: 2017 IEEE International Conference on Signal Processing, Informatics,
Communication and Energy Systems (SPICES) (2017)
13. Noble, F.K.: Comparison of OpenCV’s feature detectors and feature matchers. In: 2016 23rd
International Conference on Mechatronics and Machine Vision in Practice (M2VIP) (2016)
14. Lin, C., Tang, Y.: Research and design of the intelligent surveillance system based on
DirectShow and OpenCV. In: 2011 International Conference on Consumer Electronics,
Communications, and Networks (CECNet) (2011)
15. Ahmad Razimi, U.N., Alkawaz, M.H., Segar, S.D.: Indoor intrusion detection and filtering
system using raspberry Pi. In: 2020 16th IEEE International Colloquium on Signal Processing
& Its Applications (CSPA) (2020)
16. Deshpande, P., Iyer, B.: Research directions in the Internet of Every Things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, 2017, pp. 1353–1357. https://doi.org/10.1109/CCAA.2017.8230008
17. Deshpande, P., Sharma, S.C., Peddoju, S.K., Abhrahm, A.: Efficient multimedia data storage
in cloud environment. Inf. Int. J. Compu. Inform. 39(4), 431–442 (2015)
Optimized Neural Network for Big Data
Classification Using MapReduce
Approach
Abstract This paper proposes a big data classification approach based
on the MapReduce framework. In the mapper phase, feature selection is carried out
using Principal Component Analysis (PCA). Once feature
selection is made, the selected features are passed to the reducer phase, where the
proposed Rider Neural Network (RideNN) categorizes the
data into two classes, normal and abnormal. The proposed RideNN classifier
achieves a high accuracy of 0.932, maximal sensitivity of 0.831, and maximal
specificity of 0.958 based on the Cleveland dataset.
1 Introduction
S. Gujjeti (B)
Computer Science & Engineering, Kakatiya Institute of Technology & Science, Bheemaram,
Hanamkonda 506015, India
S. Pabboju
Information Technology, Chaitanya Bharathi Institute of Technology, Gandipet, Hyderabad
500075, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 429
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_41
Several techniques have been developed in the literature for big data classification [15,
16]. Gerardo et al. [17] developed two hybrid neural architectures that combine
perceptrons and morphological neurons. Hassib et al. [3] presented a machine
learning approach for classifying imbalanced datasets. Lin et al. [18] developed an
improved Cat Swarm Optimization algorithm for feature selection to address
big data classification issues. Elkano et al. [19] modeled a distributed learning
method, termed CFM-BD, for constructing accurate fuzzy rule-enabled classification
systems for Big Data.
Proposed RideNN for disease prediction over big data: The proposed RideNN
is introduced for big data classification in the reducer phase in which training is
carried out based on Rider Optimization Algorithm (ROA) such that the classification
accuracy is enhanced.
The rest of the paper is organized as follows: Sect. 2 describes the proposed
MapReduce-based Rider Neural Network (RideNN) for big data classification.
Section 3 provides the results and discussion. The conclusion is provided in Sect. 4.
Figure 1 illustrates the disease prediction based on RideNN. The developed model
operates in two stages: feature selection and classification.
Let us consider the big input data G with different attributes, expressed as
$G = \{d_{uv}\}; \; (1 \le u \le P); \; (1 \le v \le H)$ (1)
where the uth data point in the vth attribute is denoted as $d_{uv}$, P represents the total
number of data points, and H indicates the total number of attributes of every data point.
In the mapper phase, the dimension of the features is reduced based on PCA. The big data
is initially partitioned into data subsets, expressed by
$G_j = r_j; \; (1 \le j \le M)$ (2)
where the total number of data subsets is denoted by M. The data subsets are then fed
to the mapper phase, expressed by
$N = \{N_1, N_2, \ldots, N_j, \ldots, N_M\}$ (3)
The PCA output is denoted as D j , which is passed to the reducer phase for big
data classification.
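The mapper-side step described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names `pca_reduce` and `mapper_phase`, the synthetic data, and the choice of two components are assumptions made only for the example.

```python
import numpy as np

def pca_reduce(subset, n_components):
    """Project one data subset onto its top principal components."""
    centered = subset - subset.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def mapper_phase(data, n_subsets, n_components):
    """Split the data into M subsets and apply PCA to each (the outputs D_j)."""
    subsets = np.array_split(data, n_subsets, axis=0)
    return [pca_reduce(s, n_components) for s in subsets]

# Synthetic stand-in for G: P = 100 data points, H = 6 attributes.
rng = np.random.default_rng(0)
G = rng.normal(size=(100, 6))

D = mapper_phase(G, n_subsets=4, n_components=2)
B = np.vstack(D)  # concatenated reduced features handed to the reducer
```

Note that each subset is projected onto its own principal axes here; a shared projection fitted on all the data would be an alternative design.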
Once the features are selected based on PCA, the concatenated feature is forwarded
to the NN [20] for classification and is given by
$B = \{D_1, D_2, \ldots, D_j, \ldots, D_M\}$ (5)
where the total input neurons are denoted as $B_x$. The hidden layer representation is
given by
$K = \{k_1, k_2, \ldots, k_t, \ldots, k_q\}$ (8)
where q indicates the total number of hidden neurons in the NN and $k_t$ denotes the tth hidden
neuron, whose output is calculated as
$k_t = \frac{1}{s} \sum_{e=1}^{s} R_s B_s$ (9)
The term $R_s$ refers to the weight between the input and hidden neurons, and the total
number of weights is denoted as s. The NN output is calculated using the equation below:
$T_a = \sum_{a=1}^{x} k_a B_a$ (10)
When the output layer is rewritten, the solution becomes $Y_i = H(B_a, B)$, which
means the output layer is a function of the input layer along with the weights. The expression
of weights in NN is given by
where the count of riders is denoted as E and the total number of coordinates is indicated as
J . Z g (y, z) refers to the location of the rider y at a time interval z. The bypass rider,
overtaker, follower, and attacker are indicated as F, A, L, and I , respectively. The
coordinate, steering, and location angles of the vehicle of rider y are denoted as φ, S of,w ,
and αi , respectively. The gear, accelerator, and brake of the rider y are indicated as
vs , ls , and z s , respectively. The gear value ranges between 0 and 4; meanwhile, the
accelerator and brake take values between 0 and 1. Here, Z ∈ W .
where training samples are represented as n. Ta and Rtarget are the estimated and the
target output of the classifier.
Step 3: Update the leading rider's location: The fitness is computed for all the
riders, and the rider with the maximal fitness is considered the leader, since the leading rider is
nearest to the target. If the leading rider is not fixed, the update is carried out
at the end of the iteration based on the fitness rate.
Step 4: Update the overtaker's location: The overtaker updates its position
using a direction indicator, a coordinate selector, and the success rate. The update of
the overtaker position is given by
Z gA (y, z) = Z g (y, z) + OZ g (y) ∗ Z L (L , z) (14)
where the term Z gA (y, z) refers to the position of yth rider at zth coordinate and the
direction indicator of yth rider is denoted as Z g (y).
Step 5: Re-compute the fitness rate: Once the rider locations are updated, every
rider's fitness rate is re-computed. The rider with the maximal fitness rate is then chosen
as the leading rider.
Step 6: Update rider parameters: The steering angle, gear, accelerator, ride
off-time, and brake, along with the activity counter, are updated.
Step 7: End: Steps 1–6 are repeated until the final iteration. Finally, the optimization
derives the best solution (weights and biases) for tuning the RideNN classifier. Thus,
the RideNN classifier determines the classes as normal or abnormal.
The developed model is implemented in Java on a Windows 10 machine
with 4 GB RAM and an Intel i3 processor. The experimentation is carried out
on two datasets, the Cleveland dataset [21] and the Diabetic dataset [22].
The metrics utilized for the analysis are accuracy, sensitivity, and specificity.
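For reference, the three metrics can be computed from the confusion-matrix counts as sketched below. This is an illustrative Python helper, not code from the paper; the function name `classification_metrics` and the encoding of the abnormal class as label 1 are assumptions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity

# Toy labels (1 = abnormal, 0 = normal), purely illustrative.
acc, sens, spec = classification_metrics(
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
)
```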
The comparative analysis is performed with the existing methods, such as NN [20],
Support Vector Neural Network (SVNN) [23], Support Vector Machine (SVM) [24],
and RideNN.
Comparative analysis based on the Diabetic dataset: Figure 2 portrays the analysis
using the Diabetic dataset. Figure 2a shows the analysis of accuracy.
When 90% of training data is considered, the accuracy values measured by NN, SVNN,
SVM, and RideNN are 0.685, 0.827, 0.840, and 0.841, respectively. Figure 2b shows
the analysis of sensitivity. For 90% training data, the sensitivity values obtained
by NN, SVNN, SVM, and RideNN are 0.599, 0.698, 0.717, and 0.718, respectively.
Figure 2c shows the analysis of specificity. When the training data
percentage is 90, the corresponding specificity values obtained by NN, SVNN, SVM,
and RideNN are 0.595, 0.885, 0.897, and 0.899, respectively.
Comparative analysis using the Cleveland dataset: Figure 3 illustrates the analysis
of methods based on the Cleveland dataset. Figure 3a represents the analysis based on
accuracy. When training data is 90%, the accuracy values obtained by NN, SVNN,
SVM, and RideNN are 0.735, 0.867, 0.917, and 0.932, respectively. The analysis
of the sensitivity parameter is depicted in Fig. 3b. When 90% of training data is
considered, the sensitivity obtained by NN, SVNN, SVM, and RideNN are 0.337,
0.679, 0.794, and 0.830, respectively. The analysis based on the specificity parameter
is illustrated in Fig. 3c.
Fig. 2 Analysis based on Diabetic dataset by changing training data percentage a accuracy
b sensitivity, c specificity
Fig. 3 Analysis of methods by varying the training data percentage based on Cleveland dataset
a accuracy b sensitivity, c specificity
The comparative discussion of the proposed method with the existing methods based
on the best performance is provided in Table 1. From the analysis, it is evident that
the proposed RideNN performs the big data classification more effectively.
Table 1 Comparative discussion
Method              Accuracy    Sensitivity    Specificity
Diabetic dataset
NN                  0.686       0.599          0.595
SVNN                0.828       0.699          0.886
SVM                 0.902       0.821          0.938
Proposed RideNN     0.931       0.827          0.957
Cleveland dataset
NN                  0.743       0.609          0.839
SVNN                0.867       0.679          0.911
SVM                 0.918       0.795          0.949
Proposed RideNN     0.932       0.831          0.958
4 Conclusion
In this paper, an effective data classification method based on the MapReduce
framework is presented. The proposed technique involves two steps: feature
selection and classification. Feature selection is carried out in the mapper
function of the MapReduce framework using PCA, and classification is performed in
the reducer using RideNN. The developed model is evaluated
on two datasets, the Cleveland and Diabetic datasets. The proposed model
achieves a maximal accuracy of 0.932, maximal sensitivity of 0.831, and maximal
specificity of 0.958 based on the Cleveland dataset. In the future, the method will be
extended by performing additional analysis using different datasets.
References
1. Storey, V.C., Song, I-Y.: Big data technologies and management: what conceptual modeling
can do. Data Knowl. Eng. 108, 50–67 (2017)
2. Sivarajah, U., Kamal, M.M., Irani, Z., Weerakkody, V.: Critical analysis of Big Data challenges
and analytical methods. J. Bus. Res. 70, 263–286 (2017)
3. Hassib, E.M., El-Desouky, A.I., Labib, L.M., El-kenawy, E-S.M.: WOA+ BRNN: an imbal-
anced big data classification framework using Whale optimization and deep neural network.
Soft Comput. 1–20 (2019)
4. Gupta, B.B.: Computer and Cyber Security: Principles, Algorithm, Applications, and Perspec-
tives. CRC Press (2018)
5. Manogaran, G., Thota, C., Lopez, D.: Human-computer interaction with big data analytics. In:
HCI Challenges and Privacy Preservation in Big Data Security IGI Global, pp. 1–22 (2018)
6. Triguero, I., Peralta, D., Bacardit, J., García, S., Herrera, F.: MRPR: a MapReduce solution for
prototype reduction in big data classification. Neurocomputing 150, 331–345 (2015)
7. Banchhor, C., Srinivasu, N.: Integrating cuckoo search-grey wolf optimization and correlative
Naive Bayes classifier with map reduce model for big data classification. Data Knowl. Eng.
101788 (2019)
8. Tsai, C.-F., Lin, W.-C., Ke, S.-W.: Big data mining with parallel computing: a comparison of
distributed and MapReduce methodologies. J. Syst. Softw. 122, 3–92 (2016)
9. Dean, J., Ghemawat, S.: MapReduce: a flexible data processing tool. Commun. ACM 53(1),
72–77 (2010)
10. Din, S., Paul, A., Ahmad, A., Gupta, B.B., Rho, S.: Service orchestration of optimizing contin-
uous features in industrial surveillance using big data based fog-enabled internet of things.
IEEE Access 6, 21582–21591 (2018)
11. Hinton, G.E., Osindero, S., Teh, Y.-W.: A fast learning algorithm for deep belief nets. Neural
Comput. 18(7), 1527–1554 (2006)
12. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep
networks. In: Advances in Neural Information Processing Systems, pp. 153–160 (2007)
13. Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R., Muharemagc, E.:
Deep learning techniques in big data analytics. In: Big Data Technologies and Applications.
Springer, pp.133–156 (2016)
14. Zhou, L., Pan, S., Wang, J., Vasilakos, A.V.: Machine learning on big data: opportunities and
challenges. Neurocomputing 237, 350–361 (2017)
15. Hassib, E.M., El-Desouky, A.I., Labib, L.M., El-kenawy, E.-S.M.: WOA + BRNN: an imbal-
anced big data classification framework using Whale optimization and deep neural network.
Soft Comput. 24, 5573–5592 (2020)
16. Hernández, G., Zamora, E., Sossa, H., Téllez, G., Furlán, F.: Hybrid neural networks for big
data classification. Neurocomputing 390, 327–340 (2020)
17. Gerardo, H., Zamora, E., Sossa, H., Téllez, G., Furlán, F.: Hybrid neural networks for big data
classification. Neurocomputing (2019)
18. Lin, K.-C., Zhang, K.-Y., Huang, Y.-H., Hung, J.C., Yen, N.: Feature selection based on an
improved cat swarm optimization algorithm for big data classification. J. Supercomput. 72(8),
3210–3221 (2016)
19. Elkano, M., Sanz, J.A.A., Barrenechea, E., Bustince, H., Galar, M.: CFM-BD: a distributed rule
induction algorithm for building compact fuzzy models in Big Data classification problems.
IEEE Trans. Fuzzy Syst. (2019)
20. Binu, D., Kariyappa, B.S.: RideNN: a new rider optimization algorithm-based neural network
for fault diagnosis in analog circuits. IEEE Trans. Instrum. Meas. 68(1), 2–26 (2018)
21. Cleveland dataset taken from https://archive.ics.uci.edu/ml/datasets/Heart+Disease. Accessed
March 2020
22. Diabetic dataset taken from https://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008.
Accessed March 2020
23. Mukkamala, S., Janoski, G., Sung, A.: Intrusion detection using neural networks and support
vector machines. In: Proceedings of International Joint Conference on Neural Networks, vol.
2, pp. 1702–1707 (2002)
24. Demidova, L., Nikulchev, E., Sokolova, Y.: Big Data classification using the SVM classifiers
with the modified particle swarm optimization and the SVM ensembles. Int. J. Adv. Comput.
Sci. Appl. 7(5) (2016)
Impact of Deployment Schemes
on Localization Techniques in Wireless
Sensor Networks
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 439
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_42
440 Prateek et al.
techniques. In this case, the beacon deployment was shown to improve localiza-
tion performance. A spatially circular deployment configuration was presented in
work by [4] to address the tracking method and the trajectory position algorithm
for localization of autonomous underwater vehicles (AUVs) in the marine environ-
ment [5]. Authors in [6] formulated a novel technique based on underestimating
non-convex Maximum Likelihood (ML) function to achieve appreciable execution
time and convergence rate compared to Barzilai–Borwein and Nesterov’s optimal
method. Some challenges faced by the works mentioned above are poor localization
accuracy due to unfavorable node deployment patterns, low localization ratio in case
of sparse anchor nodes, and high RMS errors associated with obstructed target nodes.
The significant contributions of the work presented in this paper are as follows:
• A square random sensor node deployment is carried out for a set of 300 nodes,
of which 60 are anchor nodes. The nodes have a communication radius of 200 m,
and either anchor or non-anchor nodes may occupy the vertices.
• A “C” shaped random node deployment is also carried out where the sequence of
anchor node and sensor node is not fixed.
• A square-shaped regular deployment pattern is carried out, which differs from
random square deployment in the sense that the position of sensor nodes and
anchor nodes follows a specified sequence. Vertices are occupied with anchor
nodes.
• A “C” shaped regular deployment pattern is also done in which the anchor nodes
and sensor nodes are evenly distributed.
• An “O” shaped random deployment scenario is performed in which nodes are
distributed evenly in a circular fashion.
• Parabola-shaped node deployment enables us to approach position coordinates
analytically, wherein the parabolic equation shall govern the deployment pattern.
• A circumcircle is also tried as a node deployment scenario. It is constructed
by considering triplets of nodes at a time, with their centroid as the center. The rest
of the nodes are deployed on its circumference.
The rest of the paper is organized as follows: Sect. 2 describes the signal and
the system's localization model. The derivation of the Cramér–Rao lower bound is
discussed in Sect. 3. Simulations, numerical computations, and an explanation of the
numerical findings are detailed in Sect. 4. The paper is concluded
in Sect. 5.
2 Localization Model
The method for the PIT test and the consequent APIT algorithm is discussed in
[7]. Based on the centroid formula, the localization depends upon N anchor node
information (X i , Yi ) to estimate coordinates of the target (X est , Yest ), as shown in
Eq. (1)
$[X_{est}, Y_{est}] = \left( \frac{X_1 + X_2 + \cdots + X_N}{N}, \; \frac{Y_1 + Y_2 + \cdots + Y_N}{N} \right)$ (1)
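The centroid formula amounts to averaging the anchor coordinates, as in the short Python sketch below. The function name `centroid_estimate` and the sample anchor positions are illustrative assumptions.

```python
def centroid_estimate(anchors):
    """Estimate the target position as the mean of the N anchor positions.

    anchors: list of (X_i, Y_i) tuples for the anchors heard by the target.
    """
    n = len(anchors)
    x_est = sum(x for x, _ in anchors) / n
    y_est = sum(y for _, y in anchors) / n
    return x_est, y_est

# Three hypothetical anchors within range of the target.
x_est, y_est = centroid_estimate([(0.0, 0.0), (4.0, 0.0), (2.0, 6.0)])
```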
The centroid algorithm is preferred for its simplicity. The DV-Hop mechanism is
another range-free technique. The average single-hop distance
is estimated by the ith anchor using the following formula:
$\gamma_k = \frac{\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}}{h_{pki} + h_{pkj}}$ (2)
Here, h denotes the hop count, with the subscripts
i, j, and k denoting the ith and jth anchor nodes and the kth target node, respectively. This
hop count information is propagated to nearby nodes and indicates the
approximate position of the target node. The RSSI-based [8] cooperative localization
technique, which uses a log distance path loss model, is governed by the equation:
The power received [9] at the sensor node of interest from the transmitting node
is determined by the path loss, which depends primarily on the distance of
separation $d_{ij}$ between the source and destination sensor nodes. $n_{ij}$ is the additive
noise, which is Gaussian distributed with zero mean and variance $\sigma^2$.
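A minimal Python sketch of a log-distance path loss model is given below. It assumes the commonly used form P(d) = P0 − 10·η·log10(d/d0) + n, which may differ in detail from the exact expression used in this work; the reference power `p0_dbm`, reference distance `d0`, and path loss exponent `path_loss_exp` are illustrative values, not parameters taken from the paper.

```python
import math
import random

def rssi_dbm(d, p0_dbm=-40.0, d0=1.0, path_loss_exp=3.0, sigma=0.0, rng=None):
    """Received power (dBm) at distance d under a log-distance model.

    sigma is the standard deviation of the zero-mean Gaussian noise term.
    """
    noise = (rng or random).gauss(0.0, sigma) if sigma > 0 else 0.0
    return p0_dbm - 10.0 * path_loss_exp * math.log10(d / d0) + noise

def distance_from_rssi(rssi, p0_dbm=-40.0, d0=1.0, path_loss_exp=3.0):
    """Invert the noise-free model to estimate distance from a reading."""
    return d0 * 10 ** ((p0_dbm - rssi) / (10.0 * path_loss_exp))

reading = rssi_dbm(10.0)               # noise-free reading at 10 m
estimate = distance_from_rssi(reading)  # recovers the distance
```

With noise enabled (`sigma > 0`), the inverted distance becomes a noisy estimate, which is why RSSI-based ranging degrades when the channel is obstructed.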
The localization scheme involves the deployment of nodes using one of the deter-
ministic ways, such that we get the coordinates based on the said scheme. Then
the stated localization techniques, namely APIT, DV-hop, centroid-based, and RSSI-
based techniques, are implemented to estimate the node coordinates. Subsequently,
localization error is computed for each of the methods. The further analysis involves
the computation of neighbor anchor nodes and average connectivity of the nodes for
different configurations.
li = deb − ai (4)
The presence of debris in and around the sensor node is common in industrial,
military, and arid areas, to name a few [10]. Due to debris, the RF signal undergoes
scattering when the signal wavelength matches the physical dimensions of the debris. For the
current work, we assume that while locating a piece of debris, the neighboring debris
particles act as noise sources by restricting signal or altering its natural form. Let the
signal between the ith anchor node and the target debris be scattered by an angle of
αi . Then, αi can be expressed as
$\alpha_i = \tan^{-1}\left( \frac{\sqrt{(x - x_i)^2 + (y - y_i)^2}}{z - z_i} \right)$ (5)
Let the angle made by the signal with respect to the horizontal axis be given by
θi , computed as
$\theta_i = \tan^{-1}\left( \frac{z + z_i}{\sqrt{(x - x_i)^2 + (y - y_i)^2}} \right)$ (6)
Based on the traveling speed of the signal, the actual time required for a signal to
propagate from the ith anchor node to the target debris is
$t_i = \ln\left( \frac{1 + \sin(\theta_i + \alpha_i)}{\cos(\theta_i + \alpha_i)} \right) - \ln\left( \frac{1 + \sin(\theta_i - \alpha_i)}{\cos(\theta_i - \alpha_i)} \right)$ (7)
Due to the presence of debris, the estimated time for a signal to propagate from
the ith anchor node to the target debris is
$\hat{t}_i = t_i + n_i$ (8)
$\hat{\mathbf{t}} = \mathbf{t}(deb) + \mathbf{n} = [t_1, t_2, \ldots, t_N]^T + [n_1, n_2, \ldots, n_N]^T$ (9)
The Fisher information matrix (FIM) is thus calculated with the help of the Jacobian
matrix J, whose general expression is
$J = \left[ \frac{\partial t_i}{\partial x} \;\; \frac{\partial t_i}{\partial y} \;\; \frac{\partial t_i}{\partial z} \right], \; \forall\, i = \{1, 2, \ldots, N\}$ (11)
$FIM = J^T \, \mathrm{Cov}(n)^{-1} \, J$ (12)
Computing the inverse of the Fisher information matrix yields the Cramér–Rao
lower bound, which indicates the minimum achievable variance of the
estimation error for the various sensor node configurations and
localization techniques.
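The computation in Eq. (12) and the subsequent inversion can be sketched numerically as follows. This is a minimal Python/NumPy illustration: the Jacobian entries and the noise variance are hypothetical placeholder values, not derivatives of the time model above.

```python
import numpy as np

def crlb(jacobian, noise_cov):
    """Cramér–Rao lower bound as the inverse of FIM = J^T Cov(n)^-1 J."""
    fim = jacobian.T @ np.linalg.inv(noise_cov) @ jacobian
    return np.linalg.inv(fim)

# Hypothetical Jacobian for N = 4 anchors: row i holds
# [dt_i/dx, dt_i/dy, dt_i/dz] evaluated at the target position.
J = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])
sigma2 = 0.01                     # assumed measurement noise variance
cov = sigma2 * np.eye(4)          # independent, identically distributed noise

bound = crlb(J, cov)
# The diagonal gives the minimum variance of any unbiased estimate of x, y, z;
# the square root of the trace is a common scalar summary of the bound.
position_rmse_bound = float(np.sqrt(np.trace(bound)))
```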
4 Numerical Findings
The localization problem is solved using four range-free techniques: DV-Hop
localization, the APIT technique, RSSI-based localization, and centroid-based
localization. The performance of these methods is compared by considering fully
connected networks in a 1000 × 1000 m² area with 300 sensor nodes and 60 anchor nodes.
The communication radius of each anchor node is taken to be 200 m. The
communication model used here is the Regular Channel model, which consists of the log
distance path loss model with debris and dust factors taken into account. The
simulation parameters are summarized in Table 1. Figure 1 represents the positioning
error of the deployment schemes.
The localization error associated with each sensor node is the average positioning
error, which is the ratio of Euclidean distance from the estimated position to the sensor
node’s actual position and the sensor node’s communication radius. Because of the
effect of dust and debris, the hardest hit method is RSSI-based localization [11]. DV-
Hop method also faces the issue of poor performance under a random deployment
scheme. APIT has the advantage of triangulating the target precisely, but the best is
the centroid method for the said configuration (Fig. 1).
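The error measure defined above can be written compactly as in the following Python sketch; the function names and the sample coordinates are illustrative assumptions.

```python
import math

def normalized_error(estimated, actual, comm_radius):
    """Euclidean position error divided by the communication radius."""
    return math.dist(estimated, actual) / comm_radius

def average_positioning_error(estimates, actuals, comm_radius):
    """Mean normalized error over all localized sensor nodes."""
    errors = [normalized_error(e, a, comm_radius)
              for e, a in zip(estimates, actuals)]
    return sum(errors) / len(errors)

# Two hypothetical nodes with a 200 m communication radius.
avg_err = average_positioning_error(
    [(30.0, 40.0), (100.0, 100.0)],
    [(0.0, 0.0), (100.0, 100.0)],
    comm_radius=200.0,
)
```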
Fig. 2 Comparison of
average connectivity of
sensor nodes of different
deployment schemes
The mean value of the count of anchor nodes surrounding the target node is found
with the help of the ratio of sensor nodes, which can sense nearby anchor nodes, to
the total number of nodes [12]:
$A_{nbr} = \frac{S_{nbr}}{S}$ (13)
Snbr is the number of sensor nodes that can communicate with anchor nodes
and S is the total number of sensor nodes. For the said configuration, the average
connectivity is close to 300 for parabolic-shaped nodes, whereas all other deployment
shapes have average connectivity below 50 nodes (Fig. 2).
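The ratio in Eq. (13) can be evaluated from node coordinates as sketched below. This Python example assumes that a sensor node can communicate with an anchor when their Euclidean distance does not exceed the communication radius; the function name and the coordinates are hypothetical.

```python
import math

def anchor_neighbor_ratio(sensors, anchors, comm_radius):
    """Fraction of sensor nodes with at least one anchor within range."""
    s_nbr = sum(
        1 for s in sensors
        if any(math.dist(s, a) <= comm_radius for a in anchors)
    )
    return s_nbr / len(sensors)

# Three hypothetical sensor nodes and one anchor, 200 m radius.
ratio = anchor_neighbor_ratio(
    sensors=[(0.0, 0.0), (500.0, 500.0), (150.0, 0.0)],
    anchors=[(100.0, 0.0)],
    comm_radius=200.0,
)
```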
It is essential to judge the localization algorithm’s network connectivity in sensor
nodes communicating with other sensors. The requisite expression is given by [13]
$S_{cnv} = \frac{S_{s\_cnv}}{S}$ (14)
where $S_{s\_cnv}$ denotes the count of sensor nodes able to communicate with
other sensor nodes, and S is the overall count of sensor nodes. The mean count of neighbor
anchor nodes is the maximum for the parabola shape, whereas all other deployment
shapes have average neighbor anchors below 20, as shown in Fig. 3. A tabular
Fig. 3 Comparison of … (x-axis: different localisation techniques: DVHop, APIT, RSSI, Centroid)
5 Conclusion
The present work aimed to achieve a rigorous comparison between different methods
of sensor node deployment. Simulations were carried out to determine the perfor-
mance of techniques such as DV-Hop algorithm, APIT, RSSI-based localization, and
centroid-based localization. While the RSSI method would be highly dependent on
the received signal strength, it was erroneous whenever the communication range
was obstructed by dust or debris. Though inferior to APIT, the DV-Hop algorithm
is better than the RSSI method because the sensor hop count is more related to
node connectivity than signal strength variations. This work’s future scope would
include specific terrains (such as underwater sensor networks) and the possibility of
using advanced computational techniques to address more targeted sensor network
localization aspects.
References
1. Zhang, H., Liu, Y., Lei, H.: Localization from incomplete euclidean distance matrix: perfor-
mance analysis for the SVD-MDS approach. IEEE Trans. Signal Process. 67, 2196–2209
(2019). https://doi.org/10.1109/TSP.2019.2904022
2. Won, J., Bertino, E.: Robust sensor localization against known sensor position attacks. IEEE
Trans. Mob. Comput. 18, 2954–2967 (2019). https://doi.org/10.1109/TMC.2018.2883578
3. Dai, L., Wang, B., Yang, L.T., Deng, X., Yi, L.: A nature-inspired node deployment strategy
for connected confident information coverage in industrial internet of things. IEEE Internet
Things J. 6, 9217–9225 (2019). https://doi.org/10.1109/JIOT.2019.2896581
4. Han, G., Zhang, C., Shu, L., Rodrigues, J.J.P.C.: Impacts of deployment strategies on localization
performance in underwater acoustic sensor networks. IEEE Trans. Ind. Electron. 62, 1725–1733
(2015). https://doi.org/10.1109/TIE.2014.2362731
5. Li, Y., Cai, K., Zhang, Y., Tang, Z., Jiang, T.: Localization and tracking for AUVs in marine
information networks: research directions, recent advances, and challenges. IEEE Netw. 78–85
(2019). https://doi.org/10.1109/MNET.2019.1800406
6. Zhang, Y., Li, Y., Zhang, Y., Jiang, T.: Underwater anchor-AUV localization geometries with
an isogradient sound speed profile: a CRLB-based optimality analysis. IEEE Trans. Wirel.
Commun. 17, 8228–8238 (2018). https://doi.org/10.1109/TWC.2018.2875432
7. He, T., Huang, C., Blum, B.M., Stankovic, J.A., Abdelzaher, T.: Range-free localization
schemes for large scale sensor networks. In: Proceedings of the Annual International Conference
on Mobile Computing and Networking, MOBICOM (2003). https://doi.org/10.1145/938985.938995
8. Poduri, S., Sukhatme, G.S.: Constrained coverage for mobile sensor networks. In: Proceedings,
IEEE International Conference on Robotics and Automation (2004). https://doi.org/10.1109/robot.2004.1307146
9. Hou, Y.T., Shi, Y., Sherali, H.D., Midkiff, S.F.: On energy provisioning and relay node placement
for wireless sensor networks. IEEE Trans. Wirel. Commun. (2005). https://doi.org/10.1109/TWC.2005.853969
10. Kim, H.S., Abdelzaher, T.F., Kwon, W.H.: Minimum-energy asynchronous dissemination to
mobile sinks in wireless sensor networks. In: SenSys'03: Proceedings of the First International
Conference on Embedded Networked Sensor Systems (2003). https://doi.org/10.1145/958491.958515
11. Bulusu, N., Heidemann, J., Estrin, D.: GPS-less low-cost outdoor localization for very small
devices. IEEE Pers. Commun. (2000). https://doi.org/10.1109/98.878533
12. Cho, H., Lee, J., Kim, D., Kim, S.W.: Observability-based selection criterion for anchor nodes
in multiple-cell localization. IEEE Trans. Ind. Electron. (2013). https://doi.org/10.1109/TIE.2012.2213557
13. Al-Turjman, F.M., Hassanein, H.S., Ibnkahla, M.: Quantifying connectivity in wireless sensor
networks with grid-based deployments. J. Netw. Comput. Appl. (2013). https://doi.org/10.1016/j.jnca.2012.05.006
14. Deshpande, P., Iyer, B.: Research directions in the internet of every things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, 2017, pp. 1353–1357. https://doi.org/10.1109/CCAA.2017.8230008
A Survey on 5G Architecture
and Security Scopes in SDN and NFV
J. Hasneen (B)
Institute of Information and Communication Technology (IICT), Bangladesh University of
Engineering and Technology (BUET), Dhaka 1000, Bangladesh
K. M. Sadique
Department of Computer and Systems Sciences, Stockholm University, Borgarfjordsgatan 8, 164
07 Kista, Sweden
e-mail: sadique@dsv.su.se
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 447
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_43
1 Introduction
The 4G era has seen the technology bloom to its fullest. The convenience of 4G and its
technical capacity made today's high-speed communication possible. The outreach
of 4G enabled the development of sprawling remote sensor networks and
generated the idea of IoT [4]. However, the extensive engagement of IoT devices cannot
be met with present 4G capabilities. 4G technology has limitations such as user
privacy leakage, weak home network control, architectural limitations, and the risk of
radio interfaces, among others [4, 5]. All these hinder the proper deployment of IoT and
other low-latency services, which gives rise to the demand for 5G technology. 5G
technology has to evolve to render a more scalable, adaptive, flexible, agile, secure,
and trustworthy programmable network platform on which different applications and
services with varying needs can be installed and operated according to preset
standardization [6]. With the proliferation of data volume and diversified service
requirements, 5G is expected to handle massive data, massive control, enormous resilience,
and massive IoT connectivity [7]. All these 5G capabilities are discussed in the
following section.
2.1 5G Capabilities
The minimum technical requirements for 5G were approved in [11], which also lists the
key performance parameters and their values for the use cases in tabular form. The
microwave bands currently in use lie below 6 GHz and offer limited capacity because
of the high traffic caused by heavy usage. To meet the growth, 5G is planned to be
introduced in different frequency bands [4]. Several candidate frequency bands in the
mm-wave range from 24 to 100 GHz have been approved for investigation by WRC-15.
Additionally, the spectrum in the unlicensed 60-GHz bands can also be used for study
purposes [12]. When all the above bands are combined, lower frequency bands can be
chosen for wide-area network coverage, while high-range mm-wave bands, together with
the comparably shorter range links of the unlicensed mm-wave spectrum, will be
selected for LAN and personal area communications [2, 12, 13]. The subsequent
sections give a precise idea of SDN and NFV, the two critical technological enablers
of the 5G core network. The network slicing concept is also studied and described in
detail.
SDN and NFV are the two leading technologies required to implement the services and
applications supported by 5G. Both will operate on network slicing operations. Table
1 lists the 5G technological enablers identified by different papers. As illustrated
in Table 1, cloud computing is also recognized as a technology enabler for 5G.
However, we discuss only SDN, NFV, and network slicing in detail because SDN and
NFV are part of the cloud computing paradigm.

Table 1 5G technology enablers
Technology enablers for 5G    Reference papers
Cloud computing               [4, 16, 17, 16–20]
SDN                           [16, 21, 16–18, 20–26]
NFV                           [16, 17, 12, 16–18, 27, 22–27]
Network slicing               [16, 12, 22, 3, 22–25, 25]
Software-Defined Networks (SDN): Networking devices usually handle both control
plane management and the data plane, which forwards network traffic. SDN is a
technology that separates control plane management from the data plane of network
devices [25, 26, 29, 30]. This separation enables more automation and policy-based
governance [8, 29, 31]. An SDN controller controls information flow within a data
center [3, 12, 31]. It identifies larger data flows, prioritizes them, and thereby
optimizes data flow. SDN controllers are set to track frequent and infrequent traffic
patterns and optimize them according to demand [17, 31]. Since its inception, SDN has
provided a programmable networking protocol that allows the system to manage and
propagate network traffic among routing and switching devices independently of their
vendors [32].
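The separation of control plane and data plane can be illustrated with a minimal sketch. It is not taken from any SDN product or from the surveyed papers; the class names `Switch` and `Controller`, the method names, and the address and port values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Data plane: forwards packets using only its local flow table."""
    name: str
    flow_table: dict = field(default_factory=dict)  # destination -> output port

    def forward(self, dst):
        # A table miss returns None; a real switch would ask the controller.
        return self.flow_table.get(dst)


class Controller:
    """Control plane: holds the network-wide policy and programs switches."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def install_rule(self, switch_name, dst, out_port):
        # Policy decisions are made here, not on the forwarding device.
        self.switches[switch_name].flow_table[dst] = out_port


controller = Controller()
s1 = Switch("s1")
controller.register(s1)
controller.install_rule("s1", "10.0.0.2", 3)  # hypothetical address and port
```

In this sketch the switch never decides where traffic goes; it only looks up rules that the controller installed, which is the property that makes centralized, policy-based governance possible.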
SDN and NFV introduce some extra network components, namely the SDN controller,
orchestrator, hypervisor, Security Function Virtualization, and a few more [8, 25].
These extra components bring new security issues and unknown risks [8]. Besides the
SDN/NFV components, some network functions such as cloud radio access network
(Cloud RAN/C-RAN/centralized RAN), Mobile Edge Cloud (MEC), and Network Slicing are
obligatory to enable resource sharing optimization and to support real-time
(low-latency) services [8, 25]. These new functions also make the system more prone
to security risks. These security issues are described in detail in later sections
of this paper.
Network Functions Virtualization (NFV): Conventional network functions are
embedded in hardware appliances. NFV instead shares common physical resources among
network functions through virtualization software or virtual machines. NFV implies
that network functions will run on cloud computing infrastructure located in a data
center [5, 17]. However, NFV infrastructure will not be analogous to a commercial or
enterprise cloud [5]. NFV will ensure the highest possible use of enterprise cloud
resources, and those will not be interchangeable [12, 26]. Usually, various
networking functions run on a range of industry-standard hardware. The idea of NFV in
5G technology emerged to consolidate several network functions onto software
appliances. NFV decouples software from hardware, reducing operating cost,
increasing scalability, and making network services more resilient [24, 26].
Network Slicing: Network slicing is introduced to accommodate multiple logical
self-contained networks [25]. It allows customized services to be orchestrated in
different ways for different users [7]. The new core network architecture defined for
5G enables network operators to specify network slices [2, 25, 28]. Network operators
associate these slices with particular service-level agreements; each slice can be
regarded as a Logically Isolated Network Partition (LINP) [2]. A Network Slice
Selection Function (NSSF) is assumed to be placed in the Radio Access Network (RAN)
link to regulate the network slices properly. Slicing a network into chunks allows
diversified users with varying requirements to run on parallel self-contained logical
networks [3, 6, 12]. Once a slice is tuned, the user attains full control over all the
vertical layers of the 5G infrastructure, namely the physical layer, the
virtualization layer, and the service layer [6, 22]. Network slicing can pose trust
and security threats, which are discussed in a later section of this paper.
Many recent studies have addressed 5G technology and its security issues.
In [2], Shafi et al. provided a tutorial overview of 5G technology requirements,
5G use cases, and 5G enabling technologies such as network slicing, SDN, and NFV.
The authors discussed 5G use cases in detail. Dutta et al. [8] identified 5G
supporting technologies, their potential security issues, opportunities, and
suggested solutions. Li et al. [38] focused on using SDN techniques to enhance the
inclusion of IoT in 5G technology. In [6], Yousaf et al. presented an overview of SDN
and NFV as 5G network enablers. The preceding papers and a few more surveys discussed
the 5G architecture, 5G enabling software and technologies, their limitations,
potential challenges, and mitigation countermeasures. Some surveys presented overall
pictures, and some emphasized particular areas. This paper tries to give a bird's-eye
view from the 5G background to 5G requirements and use cases, through the
technological innovations included in 5G technology, to their potential threats and
probable mitigation measures.
3 5G Security Threats
While setting up a plan of action for a new technology, some hindrances come to
light. As development progresses, some impeding factors are resolved, and a few more
are added. This section describes various security threats and potential risks that
come with 5G supporting technologies such as SDN, NFV, and network slicing; their
mitigation techniques are explained later.
SDN controllers have a vital role in SDN security and threat management, as they
control the data flow through SDN devices [12, 16, 32]. Malicious attackers can
therefore easily take hold of precious and sensitive data if the SDN controller is
compromised. SDN controllers can be exploited by various security threats, namely
API flood attack,
• If different network slices are not well isolated, attackers may misuse one
slice's capacity flexibility to grab resources from another slice. This could put
the victim slice out of service.
• Attackers may eavesdrop on security or privacy information exchanged between the
UE and the network during network slice selection. Attackers could forge this
information to gain illegal access to network resources.
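The first threat above, one slice grabbing resources from another, can be countered by enforcing a hard cap per slice. The following is a minimal sketch under stated assumptions: the class `SliceManager`, the slice names, and the capacity figures are hypothetical, and a real slice orchestrator would enforce isolation across compute, radio, and transport resources rather than a single counter.

```python
class SliceManager:
    """Tracks per-slice resource use against a hard per-slice cap."""

    def __init__(self, caps):
        self.caps = dict(caps)                  # slice name -> max resource units
        self.used = {name: 0 for name in caps}  # slice name -> units in use

    def allocate(self, slice_name, units):
        # Grant the request only if the slice stays within its own cap,
        # so scaling one slice can never consume another slice's share.
        if units <= 0 or self.used[slice_name] + units > self.caps[slice_name]:
            return False
        self.used[slice_name] += units
        return True


# Hypothetical slices and capacities, for illustration only.
manager = SliceManager({"embb": 60, "urllc": 40})
granted = manager.allocate("embb", 60)      # fills the first slice
denied = manager.allocate("embb", 1)        # rejected: would exceed its cap
victim_ok = manager.allocate("urllc", 40)   # the other slice is unaffected
```

Because each request is checked only against the requesting slice's own cap, an attacker who drives one slice to its limit cannot push the other slice out of service.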
Some potential threats and possible risk factors in 5G supporting technologies were
discussed above. Several threat mitigation techniques are summarized below.
Several mitigation measures for NFV security risks are suggested in [5, 7,
16, 20, 24] and are listed below:
NFV MANO risk mitigation: A single point of failure can be prevented by separating
the administration and introducing distributed control. Security monitoring can
detect DoS attacks. Entry by malicious insiders can be controlled with fine-grained
access control.
Interface risk mitigation: Improved confidentiality management and integrity
protection will eliminate sensitive data leakage. Network topology validation can
prevent malicious routing loop attacks.
VNF security risk mitigation: Cryptographically signing a VNF image can prevent VNF
images from being infected. To stop a malicious VNF, hypervisor introspection and
abnormality detection can be used. A flexible VNF deployment and scaling strategy
can prevent DDoS attacks.
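The idea of checking a VNF image before it is launched can be sketched as follows. This is a simplified stand-in, not the method of the cited works: production systems would normally use public-key signatures, whereas this sketch uses an HMAC-SHA256 tag with a shared key because it needs only the Python standard library. The function names, the key, and the image bytes are hypothetical.

```python
import hashlib
import hmac


def sign_image(image_bytes, key):
    """Return a hex tag that binds the image contents to the key."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()


def verify_image(image_bytes, key, expected_tag):
    """Accept the image only if its tag matches (constant-time comparison)."""
    return hmac.compare_digest(sign_image(image_bytes, key), expected_tag)


key = b"demo-key-not-for-production"      # hypothetical shared key
image = b"vnf-image-contents"             # hypothetical image bytes
tag = sign_image(image, key)

ok = verify_image(image, key, tag)                     # untouched image
tampered = verify_image(image + b"malware", key, tag)  # modified image
```

Any change to the image bytes changes the tag, so an infected image fails verification and can be rejected before deployment.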
5G technology is now in its pre-commercial trial phase, and the process is
accelerating each day in the race for early deployment. This situation compels us to
address 5G technological challenges and security threats urgently. In this survey,
we tried to address those security challenges and developed the following table.
Table 2 presents our findings on several security threats to the new technology
enablers (presented in Table 1). The security threats in SDN, NFV, and network
slicing were identified from [2, 7, 16, 17, 20, 23–28, 30, 32, 38, 40]. We also list
the relevant target points for solutions in Table 2, which opens new sets of research
opportunities. Further study of target network elements such as centralized control
points, SDN controllers and switches, the SDN controller hypervisor, SDN
controller–switch communication, SDN controller communication, SDN virtual switches
and routers, shared cloud resources, and data centers in the cloud will shed light on
solutions to the threats targeting those elements.
More practical threats will emerge once 5G is implemented and many more end-user,
service-specific applications start using it. Automation of defense mechanisms,
threat monitoring, and mitigation processes can be used to address this issue. In the
future, anomaly detection machine learning algorithms can be employed in the SDN
control plane for trait analysis and pattern recognition in the SDN architecture
[3, 16, 30]. Efficient assignment of network slices over NFV and their management is
a vast area for further research [12]. URLLC use cases require highly reliable,
ultra-low latency real-time interaction; the latency has to be less than 10 ms, which
is a critical challenge to solve in future wireless networks [30].
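As a rough illustration of anomaly detection on control-plane traffic statistics, the sketch below flags a flow rate that deviates strongly from recent history. It uses a plain z-score test, which is a simple statistical stand-in rather than a trained machine learning model; the function name, the threshold of 3, and the traffic values are assumptions made for this example.

```python
from statistics import mean, stdev


def is_anomalous(history, new_rate, threshold=3.0):
    """Flag new_rate if it lies more than `threshold` sample standard
    deviations from the mean of the recent history (z-score test)."""
    if len(history) < 2:
        return False                 # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_rate != mu        # flat history: any change is unusual
    return abs(new_rate - mu) / sigma > threshold


baseline = [100, 102, 98, 101, 99, 100, 103, 97]  # packets/s, made-up values
normal = is_anomalous(baseline, 104)   # within two standard deviations
flood = is_anomalous(baseline, 900)    # far outside the usual range
```

A controller could run such a check per flow or per switch port and trigger an automated mitigation step when a rate is flagged.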
6 Conclusions
This paper surveyed and summarized key points of 5G technology, 5G network
requirements and analysis, 5G network architecture, and the new components required
to realize 5G technology. We also focused on some open research challenges in the
5G architecture reinforced by SDN and NFV. A shortlist of potential research issues
and their future research directions is given in this paper. The ideas behind SDN and
NFV and their possible implementations are discussed in some detail. The network
slicing technique on which SDN and NFV build is also explained. Potential trust
issues, security risks, and threats are identified, and possible mitigation
techniques are discussed. 5G use cases, their extensions, and potential future
applications of 5G are also identified, followed by the 5G capabilities. All possible
risks related to trust and security handlers are expected to be addressed during the
design phase while setting up the 5G network architecture and security model. These
security issues are crucial and are being addressed by standardization authorities
such as 3GPP, IEEE, and ETSI.
References
1. Pirinen, P.: A brief overview of 5G research activities. In: Proc. 2014 1st Int. Conf. 5G Ubiq-
uitous Connect. 5GU 2014, vol. 5, pp. 17–22 (2014). https://doi.org/10.4108/icst.5gu.2014.
258061
2. Shafi, M., Molisch, A.F., Smith, P.J., Haustein, T., Zhu, P., Silva, P.D.,
Tufvesson, F., Benjebbour, A.: 5G: a tutorial overview of standards, trials,
challenges, deployment, and practice. IEEE J. Sel. Areas Commun. 35, 1201–1221 (2017)
3. Fourati, H., Maaloul, R., Chaari, L.: A survey of 5G network systems: challenges and machine
learning approaches. Springer Berlin Heidelberg (2020).
https://doi.org/10.1007/s13042-020-01178-4
4. Gupta, A., Jha, R.K.: A survey of 5G network: architecture and emerging technologies. IEEE
Access 3, 1206–1232 (2015). https://doi.org/10.1109/ACCESS.2015.2461602
5. Andrews, J.G., Buzzi, S., Choi, W., Hanly, S.V., Lozano, A., Soong, A.C.K., Zhang, J.C.: What
will 5G be? IEEE J. Sel. Areas Commun. 32, 1065–1082 (2014). https://doi.org/10.1109/JSAC.
2014.2328098
6. Yousaf, F.Z., Bredel, M., Schaller, S., Schneider, F.: NFV and SDN-key technology enablers
for 5G networks. IEEE J. Sel. Areas Commun. 35, 2468–2478 (2017). https://doi.org/10.1109/
JSAC.2017.2760418
7. Zhang, S., Wang, Y., Zhou, W.: Towards secure 5G networks: a survey. Comput. Netw. 162
(2019). https://doi.org/10.1016/j.comnet.2019.106871
8. Dutta, A., Hammad, E.: 5G security challenges and opportunities: a system approach. In: 2020
IEEE 3rd 5G World Forum, 5GWF 2020—Conf. Proc. pp. 109–114 (2020). https://doi.org/10.
1109/5GWF49715.2020.9221122
9. Wen, F., Wymeersch, H., Peng, B., Tay, W.P., So, H.C., Yang, D.: A survey on 5G massive
MIMO localization. 94, 21–28 (2019)
10. Li, S., Xu, L.D., Zhao, S.: 5G internet of things: a survey. J. Ind. Inf. Integr. 10, 1–9 (2018).
https://doi.org/10.1016/j.jii.2018.01.005
11. Mohyeldin, E.: Minimum requirements related to technical performance for IMT-2020
radio interface(s), document ITU-R M. [IMT-2020. TECH PERF REQ]. https://www.itu.
int/en/ITU-R/study-groups/rsg5/rwp5d/imt-2020/Documents/S01-1_Requirements%20for%
20IMT-2020_Rev.pdf (2020). Last accessed 5 Dec 2020
12. Morgado, A., Huq, K.M.S., Mumtaz, S., Rodriguez, J.: A survey of 5G technologies: regulatory,
standardization and industrial perspectives. Digit. Commun. Netw. 4, 87–97 (2018). https://
doi.org/10.1016/j.dcan.2017.09.010
13. Hansen, C.: WIGIG: multi-gigabit wireless communications in the 60 GHZ band. 60–61 (2011)
14. Nguyen, T.: Small cell networks and the evolution of 5G (Part 1). https://www.qorvo.com/des
ign-hub/blog/small-cell-networks-and-the-evolution-of-5g. Last accessed 4 Jan 2020
15. Liu, F., Peng, J., Zuo, M.: Toward a secure access to 5G network. In: Proc.—17th IEEE Int.
Conf. Trust. Secur. Priv. Comput. Commun. 12th IEEE Int. Conf. Big Data Sci. Eng. Trust,
pp. 1121–1128 (2018). https://doi.org/10.1109/TrustCom/BigDataSE.2018.00156
16. Ahmad, I., Kumar, T., Liyanage, M., Okwuibe, J., Ylianttila, M., Gurtov, A.: 5G security:
analysis of threats and solutions. In: 2017 IEEE Conf. Stand. Commun. Networking, CSCN
2017, pp. 193–199 (2017). https://doi.org/10.1109/CSCN.2017.8088621
17. Neves, P., Calé, R., Costa, M., Gaspar, G., Alcaraz-Calero, J., Wang, Q., Nightingale, J., Bernini,
G., Carrozzo, G., Valdivieso, Á., Villalba, L.J.G., Barros, M., Gravas, A., Santos, J., Maia, R.,
Preto, R.: Future mode of operations for 5G—the SELFNET approach enabled by SDN/NFV.
Comput. Stand. Interfaces 54, 229–246 (2017). https://doi.org/10.1016/j.csi.2016.12.008
18. Panwar, N., Sharma, S., Singh, A.K.: A survey on 5G: the next generation of mobile
communication. Phys. Commun. 18, 64–84 (2016). https://doi.org/10.1016/j.phycom.2015.
10.006
19. Singh, S., Saxena, N., Roy, A., Kim, H.S.: A survey on 5G network technologies from social
perspective. IETE Tech. Rev. (Institution Electron. Telecommun. Eng. India) 34, 30–39 (2017).
https://doi.org/10.1080/02564602.2016.1141077
20. Ahmad, I., Kumar, T., Liyanage, M., Okwuibe, J., Ylianttila, M., Gurtov, A.: Overview of 5G
security challenges and solutions. IEEE Commun. Stand. Mag. 2, 36–43 (2018). https://doi.
org/10.1109/MCOMSTD.2018.1700063
21. Krishnan, P., Najeem, J.S.: A review of security, threats and mitigation approaches for SDN
architecture. Int. J. Innov. Technol. Explor. Eng. 8, 389–393 (2019)
22. Gohil, A., Modi, H., Patel, S.K.: 5G technology of mobile communication: a survey. In: 2013
Int. Conf. Intell. Syst. Signal Process. ISSP 2013, pp. 288–292 (2013). https://doi.org/10.1109/
ISSP.2013.6526920
23. Khettab, Y., Bagaa, M., Dutra, D.L.C., Taleb, T., Toumi, N.: Virtual security as a service for
5G verticals. In: IEEE Wirel. Commun. Netw. Conf. WCNC (2018). https://doi.org/10.1109/
WCNC.2018.8377298.
24. Ji, X., Huang, K., Jin, L., Tang, H., Liu, C., Zhong, Z., You, W., Xu, X., Zhao, H., Wu, J.,
Yi, M.: Overview of 5G security technology. Sci. China Inf. Sci. 61 (2018). https://doi.org/10.
1007/s11432-017-9426-4
25. Blanco, B., Fajardo, J.O., Giannoulakis, I., Kafetzakis, E., Peng, S., Pérez-Romero, J.,
Trajkovska, I., Khodashenas, P.S., Goratti, L., Paolino, M., Sfakianakis, E., Liberal, F., Xilouris,
G.: Technology pillars in the architecture of future 5G mobile networks: NFV MEC and SDN.
Comput. Stand. Interfaces 54, 216–228 (2017). https://doi.org/10.1016/j.csi.2016.12.007
26. Akpakwu, G.A., Silva, B.J., Hancke, G.P., Abu-Mahfouz, A.M.: A survey on 5G networks for
the internet of things: communication technologies and challenges. IEEE Access 6, 3619–3647
(2017). https://doi.org/10.1109/ACCESS.2017.2779844
27. Lal, S., Taleb, T., Dutta, A.: NFV: security threats and best practices. IEEE Commun. Mag. 55,
211–217 (2017). https://doi.org/10.1109/MCOM.2017.1600899
28. Cunha, V.A., da Silva, E., de Carvalho, M.B., Corujo, D., Barraca, J.P., Gomes, D., Granville,
L.Z., Aguiar, R.L.: Network slicing security: challenges and directions. Internet Technol. Lett.
2, e125 (2019). https://doi.org/10.1002/itl2.125
29. Thomasm, M.: 24 top internet-of-things (IOT) examples you should know. https://builtin.com/
internet-things/iot-examples. Last accessed 5 Dec 2020
30. Agiwal, M., Roy, A., Saxena, N.: Next generation 5G wireless networks: a comprehen-
sive survey. IEEE Commun. Surv. Tutorials 18, 1617–1655 (2016). https://doi.org/10.1109/
COMST.2016.2532458
31. Alqarni, M.A.: Benefits of SDN for big data applications. In: 2017 14th Int. Conf. Smart Cities
Improv. Qual. Life Using ICT IoT, HONET-ICT 2017, pp. 74–77 (2017). https://doi.org/10.
1109/HONET.2017.8102206
32. Zhong, H., Fang, Y., Cui, J.: LBBSRT: an efficient SDN load balancing scheme based on server
response time. Futur. Gener. Comput. Syst. 68, 183–190 (2017). https://doi.org/10.1016/j.fut
ure.2016.10.001
33. Ullah, H., Gopalakrishnan Nair, N., Moore, A., Nugent, C., Muschamp, P., Cuevas, M.: 5G
communication: an overview of vehicle-to-everything, drones, and healthcare use-cases. IEEE
Access 7, 37251–37268 (2019). https://doi.org/10.1109/ACCESS.2019.2905347
34. Storck, C.R., Duarte-Figueiredo, F.: A survey of 5G technology evolution, standards, and
infrastructure associated with vehicle-to-everything communications by internet of vehicles.
IEEE Access 8, 117593–117614 (2020). https://doi.org/10.1109/ACCESS.2020.3004779
35. Jahng, J.H., Park, S.K.: Simulation-based prediction for 5G mobile adoption. ICT Express 6,
109–112 (2020). https://doi.org/10.1016/j.icte.2019.10.002
36. Wang, C.X., Bian, J., Sun, J., Zhang, W., Zhang, M.: A survey of 5g channel measurements
and models. IEEE Commun. Surv. Tutorials 20, 3142–3168 (2018). https://doi.org/10.1109/
COMST.2018.2862141
37. Barakabitze, A.A., Ahmad, A., Mijumbi, R., Hines, A.: 5G network slicing using SDN and
NFV: a survey of taxonomy, architectures, and future challenges. Comput. Netw. 167 (2020).
https://doi.org/10.1016/j.comnet.2019.106984
38. Li, Y., Su, X., Ding, A.Y., Lindgren, A., Liu, X., Prehofer, C., Riekki, J., Rahmani, R.,
Tarkoma, S., Hui, P.: Enhancing the internet of things with knowledge-driven software-defined
networking technology: future perspectives. (2020)
39. Sattar, D., Matrawy, A.: Towards secure slicing: using slice isolation to mitigate DDoS attacks
on 5G core network slices. arXiv. pp. 82–90 (2019)
40. Cao, J., Ma, M., Li, H., Ma, R., Sun, Y., Yu, P., Xiong, L.: A survey on security aspects for
3GPP 5G networks. IEEE Commun. Surv. Tutorials 22, 170–195 (2020). https://doi.org/10.
1109/COMST.2019.2951818
Study and Analysis of Hierarchical
Routing Protocols in Wireless Sensor
Networks
1 Introduction
Wireless Sensor Networks (WSNs) have proven their mettle and have gained enormous
popularity over time. Their practical application areas include, but are not limited
to, environmental monitoring [1], traffic control [2], medical health care [3], home
automation [4], field monitoring, military applications and border surveillance [5],
and other fields [6]. WSNs are deployable in both friendly and hostile environments [7,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 461
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_44
462 A. Choudhary et al.
8]. Deploying these networks in hostile environments has always been a challenge
and remains a concern. As the sensors accumulate the readings from the
deployed region and forward them to the central location, commonly known as a sink,
a significant amount of energy is expended in collection and transmission. The
overall setup will work as long as there is sufficient energy left for the process
[9–12]. The energy dissipation depends mostly on the routing protocol used: the more
adept the routing protocol, the better the efficiency, and thus the longer the
network lifetime [13]. Prolonging the battery life and improving efficiency
nevertheless remain a challenge. Different routing protocols have been designed to
prolong the network life of WSNs. Depending on how the sensors are interconnected and
the route they follow to communicate the sensed data toward the base station, the
routing protocols are generally classified as classical Flat, Location-Based, and
Hierarchical routing protocols.
Flat Routing Protocols: the sensors are deployed uniformly, every node is a peer of
every other node, and there is no organization or segmentation structure between the
nodes. Based on the routing technique they implement, these protocols can be further
categorized into Proactive and Reactive routing protocols [14], e.g., DSDV, AODV,
FSR, etc.
Location-Based Routing Protocols: as the name suggests, the sensors are categorized
based on their location in the network. Here, received signal strength is the
basis for determining the distance among the sensors. The greater the signal
strength, the closer the sensors, and vice versa [15], e.g., GAF, MECN, GEAR, GPSR, etc.
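The relation between signal strength and distance can be sketched with the log-distance path-loss model. This model is an assumption brought in for illustration and is not specified in the text above; the function name, the reference power of −40 dBm at 1 m, and the path-loss exponent of 2 are hypothetical values.

```python
def estimate_distance(rssi_dbm, rssi_at_ref_dbm=-40.0,
                      path_loss_exp=2.0, ref_dist_m=1.0):
    """Log-distance path-loss model: a stronger signal gives a shorter distance.

    d = d0 * 10 ** ((P0 - RSSI) / (10 * n))
    """
    exponent = (rssi_at_ref_dbm - rssi_dbm) / (10.0 * path_loss_exp)
    return ref_dist_m * 10.0 ** exponent


near = estimate_distance(-40.0)   # signal equal to the reference power
far = estimate_distance(-60.0)    # signal 20 dB weaker than the reference
```

With these assumed parameters, a reading 20 dB below the reference power corresponds to a tenfold increase in estimated distance.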
Hierarchical Routing Protocols: deployed sensors are arranged into groups. Each
group or cluster is governed by an elected sensor commonly known as the Cluster
Head (generally the node with maximum energy). The node with maximum energy is
preferred as cluster head because its workload is higher: it acts as a normal data
collecting node, coordinates with the rest of the nodes in the cluster, and
communicates with other C.H.s or the B.S. (depending on the algorithm used).
The Cluster Head (C.H.) generally receives the environmental values from the sensors
deployed in its cluster, removes redundancy from the received data, and forwards
it to another C.H. or the Base Station [16, 17], e.g., LEACH, PEGASIS, SEP, EAP,
REAP, TEEN, APTEEN, etc. Figure 1 depicts the broad categorization of routing
protocols, and Fig. 2 illustrates the primary clustered sensor network.
This paper discusses the hierarchical routing protocols and aims to benefit
researchers who intend to start work on cluster-based routing protocols. The
2 Literature Review
Various quality surveys are available on improving the network lifetime
[18], optimization techniques [19, 20], congestion control [21], and other domains.
However, this work focuses on hierarchical routing protocols, and the following
literature is reviewed. Akkaya and Younis [22] studied various routing protocols and
categorized them into location-based, hierarchical, and data-centric
routing protocols. The paper also discusses quality of service modeling methods.
Deosarkar et al. [23] evaluated the cluster head selection mechanism in their
survey. The survey classified cluster head selection mechanisms into different
categories: deterministic, adaptive, hybrid, and combined metric clustering.
Ramesh and Somasundaram [24] presented a survey on clustering techniques and
discussed and compared different cluster head selection methodologies. Liu [25]
surveyed clustering routing protocols and categorized clustering attributes into
clustering process, characteristics of the cluster, cluster head, and total
proceeding of the technique used. The paper also presents their goals and
capabilities. Sha et al. [26] discussed multipath routing techniques based on the
design structure of wireless sensor networks. The paper categorizes these techniques
into infrastructure, non-infrastructure, and coding-based methods, with a discussion
under each category. The paper also compares the approaches of the different
categories in terms of load balancing, energy efficiency, route setup, and
reliability. Guo and Zhang [27] surveyed intelligent routing protocols. This paper
categorized algorithms into Neural Networks (NN), Genetic Algorithms (GA), Ant Colony
Optimization (ACO), Reinforcement Learning (RL), and Fuzzy Logic (FL). The survey by
Afsar and Tayarani-Najaran [28] categorizes the clustering methods into equal- and
unequal-sized clustering algorithms. This paper also compares the methods of each
category in terms of cluster size, cluster count, mobility, etc. Singh and Sharma
[29] surveyed cluster-based routing protocols. The paper focuses on three
classifications: block cluster, chain cluster, and grid cluster-based
classification. This paper also evaluates the methods on various parameters such as
stability, efficiency, scalability, etc. Arora et al. [30] surveyed LEACH and its
variants, covering C-LEACH, MODLEACH, Heterogeneous LEACH, Two-Level LEACH, Multi-hop
LEACH, Vice LEACH, and other hierarchical routing protocols, including PEGASIS. They
also presented modifications of hierarchical routing protocols.
Shokouhi et al. [31] categorized the clustering methods as homogeneous and
heterogeneous. This survey also compares various methods according to different
features such as cluster head count, cluster count, intercluster communication, etc.
Fanian and Rafsanjani [32] discussed various cluster-based routing protocols from a
methodology perspective. This survey classifies the methods by the methodology used
into classical approaches, metaheuristic-based strategies, fuzzy-based
approaches, and hybrid metaheuristic-based approaches.
The sensor network is grouped into clusters, and an elected node (the cluster head)
in each cluster, preferably a node with more energy, is responsible for periodically
communicating with the member nodes to collect the data, perform some
local computations, remove redundancies from the collected data, and communicate
it to the central location [33]. Clustered routing protocols avoid long-distance data
transmission between C.H.s and B.S. The normal neighbor sensing nodes are at the
bottom level, above which are the elected cluster heads of the different clusters,
responsible for aggregating the data collected from the bottom-level sensors.
Eventually, the aggregated data is communicated to the centralized station, where it
can be analyzed for decision-making purposes. This makes it a two-level hierarchical
routing protocol. Similarly, there are three-level hierarchical routing protocols,
in which another level of C.H. nodes sits above the second-level C.H.s. These
top-level C.H. nodes aggregate the data received from the second-level C.H.s, which
is then finally communicated to the centralized station.
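The two-level flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not any specific protocol: the cluster head is simply the node with the highest residual energy, redundancy removal is reduced to dropping duplicate readings, and all node names, energies, and readings are made up.

```python
def elect_cluster_head(cluster):
    """Pick the node with the most residual energy (node id -> joules)."""
    return max(cluster, key=cluster.get)


def aggregate(readings):
    """Drop duplicate readings and return them in sorted order."""
    return sorted(set(readings))


# Hypothetical clusters: node id -> residual energy in joules.
clusters = {
    "A": {"n1": 0.8, "n2": 1.0, "n3": 0.5},
    "B": {"n4": 0.9, "n5": 0.7},
}
# Hypothetical sensed values reported by the members of each cluster.
readings = {"A": [21.5, 21.5, 22.0], "B": [19.0, 19.0]}

# Level 1 -> level 2: each cluster head aggregates its members' data.
per_cluster = {c: (elect_cluster_head(nodes), aggregate(readings[c]))
               for c, nodes in clusters.items()}
# Level 2 -> base station: the sink receives one message per cluster.
base_station_inbox = [data for _, data in per_cluster.values()]
```

Five raw readings shrink to three values in two messages, which is where the energy saving of clustered routing comes from.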
This is a protocol developed for reactive networks. At every cluster change, the
current C.H. broadcasts a Hard Threshold (H.T.) and a Soft Threshold (S.T.) along
with the other attributes [40], which helps control the number of data transmissions.
The nodes sense the attributes continuously but transmit them to the cluster head only
when the sensed value exceeds the H.T. value. The first time a parameter from
the attribute set reaches its hard threshold value, the sensed attribute is stored in
an internal variable called the sensed value (S.V.). Thereafter, the nodes transmit
only when both of the following conditions hold: the value of the sensed attribute is
greater than the hard threshold value, and the value of the sensed attribute differs
from S.V. by an amount equal to or greater than the S.T.
This helps reduce the data transmission frequency by ignoring small changes in the
sensed attribute. The S.T. value is set according to the application requirement.
The smaller the S.T. value, the more accurate the network, but at the cost of
increased energy consumption. Thus, there is a trade-off between accuracy and energy
consumption.
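The transmit rule with hard and soft thresholds can be written as a small decision function. The sketch follows the two conditions stated above; the class name `ThresholdNode`, the threshold values, and the sample readings are hypothetical, and updating S.V. on every transmission is an assumption of this sketch.

```python
class ThresholdNode:
    """Decides whether a sensed value should be sent to the cluster head."""

    def __init__(self, hard_threshold, soft_threshold):
        self.ht = hard_threshold
        self.st = soft_threshold
        self.sv = None   # last transmitted value (S.V.); None before first send

    def should_transmit(self, value):
        if value <= self.ht:
            return False             # not above the hard threshold: stay silent
        if self.sv is None or abs(value - self.sv) >= self.st:
            self.sv = value          # remember what was sent
            return True
        return False                 # change smaller than S.T.: suppress


node = ThresholdNode(hard_threshold=50.0, soft_threshold=2.0)
samples = [45.0, 51.0, 52.0, 53.5, 49.0, 53.6]   # made-up sensor readings
decisions = [node.should_transmit(v) for v in samples]
```

Of the six samples only two are transmitted: the first value above the hard threshold and the later value that differs from it by at least the soft threshold. Lowering `soft_threshold` would send more samples and use more energy.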
Energy consumption occurs in the transmission and reception of data. The energy
requirements are calculated using the formula below [16]:
In this paper, a 100 m × 100 m area is considered for the deployment of 100 nodes,
and the simulation is carried out in MATLAB using the parameters shown in Table 1
(Figs. 5, 6, 7 and 8).
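To give a feel for the magnitudes involved, the per-packet energy can be sketched with the first-order radio model that is commonly used in LEACH-style analyses. Treating this as the energy model is an assumption of this sketch and may differ from the exact formula of [16]; the constants are taken from Table 1, while the function names and the 50 m example distance are hypothetical.

```python
E_ELEC = 50e-9       # J/bit, electronics energy for transmit or receive (Table 1)
EPS_AMP = 100e-12    # J/bit/m^2, power amplifier energy (Table 1)
E_DA = 5e-9          # J/bit, data aggregation energy (Table 1)
PACKET_BITS = 4000   # data packet size in bits (Table 1)


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m`, assuming d^2 path loss."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2


def rx_energy(bits):
    """Energy to receive `bits`."""
    return E_ELEC * bits


def aggregation_energy(bits):
    """Energy a cluster head spends aggregating `bits`."""
    return E_DA * bits


e_tx = tx_energy(PACKET_BITS, 50.0)   # one packet sent over 50 m
e_rx = rx_energy(PACKET_BITS)         # one packet received
```

Under these assumptions, sending one packet over 50 m costs about 1.2 mJ, six times the 0.2 mJ needed to receive it, which is why shortening transmission distances dominates the energy budget of a 1 J or 2 J node.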
A comparative study of the two algorithms, LEACH and PEGASIS, based on the
overall network lifetime was carried out, and the results are shown in Table 2. The
initial energy was set to 1 J/node and 2 J/node for the analysis, with all other
parameters unchanged. The simulation recorded the number of rounds it took for 1, 10,
20, 50, 70, and 100% of the nodes to go down under each algorithm. It showed that
both LEACH and PEGASIS almost doubled their rounds when the energy per node was
doubled while the remaining parameters were kept the same.
In some simulations the first node was exhausted earlier in PEGASIS than in LEACH,
but the overall network lifetime of PEGASIS was much better than that of
LEACH for the same initial energy per node. This indicates that the algorithms'
performance depends largely on how efficiently each algorithm handles the random
deployment in each run.
As the network size increased, the advantage of PEGASIS over LEACH grew for the
same parameters. PEGASIS is more energy efficient because of its better energy
dissipation distribution and stability in WSN deployments.
Table 1 Simulation parameters
Parameters                                    Values
Area                                          100 m × 100 m
Number of nodes                               100
Sink location                                 (50, 150)
CH probability                                5%
Initial energy of node                        1 and 2 J
Data packet size                              4000 bits
Energy consumption: power amplifier (εamp)    100 × 10^−12 J/bit/m^2
Energy consumption: transmission (ETX)        50 × 10^−9 J/bit
Energy consumption: receiving (ERX)           50 × 10^−9 J/bit
Energy consumption: aggregation (EDA)         5 × 10^−9 J/bit
5 Conclusion
The paper discussed some of the existing surveys on clustered routing protocols.
Energy consumption is one of the significant constraints in WSNs, and different
protocols have been studied for decades to address the issue. Hierarchical routing
protocols have shown excellent results in prolonging network lifetime through
efficient energy distribution. The paper also discussed the well-known LEACH protocol
along with its prominent variants. Other hierarchical routing protocols such as
PEGASIS and TEEN have also been discussed. A simulation-based comparison between
LEACH and PEGASIS is also presented to understand the two routing protocols better.
Study and Analysis of Hierarchical Routing Protocols … 471
We have seen that PEGASIS outperforms LEACH in terms of
network lifetime due to better energy distribution.
However, PEGASIS also induces an additional delay because it creates a chain in the
network. Moreover, we should not forget that LEACH remains the basis for the newer
algorithms. LEACH was a turning point: it outperformed the clustering protocols of
its time by introducing adaptive clusters and changing cluster heads after each round,
which led to a better distribution of energy dissipation across the network. However,
random cluster head positioning was one of the restricting issues in LEACH, which
has been addressed in many of the LEACH versions that followed over time.
References
1. Xu, G., Shen, W., Wang, X.: Applications of wireless sensor networks in marine environment
monitoring: a survey. Sensors 14(9), 16932–16954 (2014)
2. Mini, S., Udgata, S.K., Sabat, S.L.: Sensor deployment and scheduling for target coverage
problem in wireless sensor networks. IEEE Sens. J. 14(3), 636–644 (2014)
3. Wu, F., Li, X., Sangaiah, A.K., Xu, L., Kumari, S., Wu, L., Shen, J.: A lightweight and robust
two-factor authentication scheme for personalized healthcare systems using wireless medical
sensor networks. Futur. Gener. Comput. Syst. 82, 727–737 (2018)
4. Fathany, M.Y., Adiono, T.: Wireless protocol design for smart home on mesh wireless sensor
network. In: Proceedings of the International Symposium on Intelligent Signal Processing and
Communication Systems (ISPACS), Nusa Dua, Indonesia, pp. 462–467, 9–12 November 2015
(2015)
5. Butun, I., Morgera, S.D., Sankar, R.: A survey of intrusion detection systems in wireless sensor
networks. IEEE Commun. Surv. Tutor. 16(1), 266–282 (2014)
6. Mohamed, R.E., Saleh, A.I., Abdelrazzak, M., Samra, A.S.: Survey on wireless sensor network
applications and energy efficient routing protocols. Wirel. Pers. Commun. 101(2), 1019–1055
(2018)
7. Raghavendra, C.S., Sivalingam, K.M., (eds.): Wireless Sensor Networks. Kluwer Academic,
New York (2004)
8. Znati, T., Raghavendra, C., Sivalingam, K.: Special issue on wireless sensor networks, guest
editorial. Mob. Netw. Appl. 8 (2003)
9. Avci, B., Trajcevski, G., Tamassia, R., Scheuermann, P., Zhou, F.: Efficient detection of motion-
trend predicates in wireless sensor networks. Comput. Commun. 101, 26–43 (2017)
10. Khelladi, L., Djenouri, D., Rossi, M., Badache, N.: Efficient on-demand multi-node charging
techniques for wireless sensor networks. Comput. Commun. 101, 44–56 (2017)
11. Rashid, B., Rehmani, M.H.: Applications of wireless sensor networks for urban areas: a survey.
J. Netw. Comput. Appl. 60, 192–219 (2016)
12. Wang, D., Lin, L., Xu, L.: A study of subdividing hexagon-clustered WSN for power saving:
analysis and simulation. Ad Hoc Netw. 9(7), 1302–1311 (2011)
13. Gnanambigai, J., Rengarajan, N., Anbukkarasi, K.: Leach and its descendant protocols: a
survey. Int. J. Commun. Comput. Technol. 1(3)(2), 15–21 (2012)
14. Arce, J., Pajares, A., Lazaro, O.: Performance evaluation of video streaming over Ad hoc
networks of sensors using FLAT and hierarchical routing protocols. Mobile Netw. Appl. 13(3–
4), 324–336 (2008)
15. Savvides, A., Han, C.-C., Srivastava, M.B.: Dynamic fine-grained localization in Ad-Hoc
networks of sensors. In: Proceedings of the Seventh ACM Annual International Conference on
Mobile Computing and Networking (MobiCom), pp. 166–179, July 2001 (2001)
16. Heinzelman, W.R., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication
protocol for wireless microsensor networks. In: Proceedings of the 33rd Hawaii International
Conference on System Sciences (ICSS), Washington, USA, vol. 2, pp. 1–10, 04–07 Jan 2000
(2000)
17. Handy, M.J., Haase, M., Timmermann, D.: Low energy adaptive clustering hierarchy with
deterministic cluster-head selection. In: Proceedings of 4th IEEE Conference on Mobile and
Wireless Communications Networks, Stockholm, vol. 1, pp. 368–372, Sep. 9–11, 2002 (2002)
18. Jung, J.W., Weitnauer, M.A.: On using cooperative routing for lifetime optimization of multi-
hop wireless sensor networks: analysis and guidelines. IEEE Trans. Commun. 61(8), 3413–
3423 (2013)
19. Curry, R.M., Smith, J.C.: A survey of optimization algorithms for wireless sensor network
lifetime maximization. Comput. Ind. Eng. 101, 145–166 (2016)
20. Fei, Z., Li, B., Yang, S., Xing, C., Chen, H., Hanzo, L.: A survey of multi-objective optimization
in wireless sensor networks: metrics, algorithms, and open problems. IEEE Commun. Surv.
Tutor. 19(1), 550–586 (2017)
21. Sergiou, C., Antoniou, P., Vassiliou, V.: A comprehensive survey of congestion control protocols
in wireless sensor networks. IEEE Commun. Surv. Tutor. 16(4), 1839–1859 (2014)
22. Akkaya, K., Younis, M.: A survey on routing protocols for wireless sensor networks. Ad Hoc
Netw. 3(3), 325–349 (2005)
23. Deosarkar, B.P., Yadav, N.S., Yadav, R.: Cluster head selection in clustering algorithms for wire-
less sensor networks: a survey. In: Proceedings of the International Conference on Computing,
Communication and Networking, (ICCCN), VI, USA, pp. 1–8, 18–20 December 2008 (2008)
24. Ramesh, K., Somasundaram, D.K.: A comparative study of cluster head selection algorithms
in wireless sensor networks. Int. J. Comput. Sci. Eng. Surv. 2(4), 153–164 (2011)
25. Liu, X.: A survey on clustering routing protocols in wireless sensor networks. Sensors 12(8),
11113–11153 (2012)
26. Sha, K., Gehlot, J., Greve, R.: Multipath routing techniques in wireless sensor networks: a
survey. Wirel. Pers. Commun. 70(2), 807–829 (2013)
27. Guo, W., Zhang, W.: A survey on intelligent routing protocols in wireless sensor networks. J.
Netw. Comput. Appl. 38(1), 185–201 (2014)
28. Afsar, M.M., Tayarani-Najaran, M.H.: Clustering in sensor networks: a literature survey. J.
Netw. Comput. Appl. 46, 198–226 (2014)
29. Singh, S.P., Sharma, S.: A survey on cluster based routing protocols in wireless sensor networks.
Procedia Comput. Sci. 45, 687–695 (2015)
30. Arora, V.K., Sharma, V., Sachdeva, M.: A survey on LEACH and other’s routing protocols in
wireless sensor network. Optik-Int. J. Light Electron. Opt. 127(16), 6590–6600 (2016)
31. Shokouhi Rostami, A., Badkoobe, M., Mohanna, F., Hosseinabadi, A.A.R., Sangaiah, A.K.:
Survey on clustering in heterogeneous and homogeneous wireless sensor networks. J.
Supercomput. 74(1), 277–323 (2018)
32. Fanian, F., Rafsanjani, M.K.: Cluster-based routing protocols in wireless sensor networks: a
survey based on methodology. J. Netw. Comput. Appl. 142, 111–142 (2019)
33. Kaur, R., Sharma, D., Kaur, N.: Comparative analysis of leach and its descendant protocols in
wireless sensor network. Int. J. P2P Netw. Trends Technol. 3(1), 22–27 (2013)
34. Heinzelman, W.B., Chandrakasan, A.P., Balakrishnan, H.: An application-specific protocol
architecture for wireless microsensor networks. IEEE Trans. Wirel. Commun. 1(4), 660–670
(2002)
35. Mahmood, D., Javaid, N., Mahmood, S., Qureshi, S., Memon, A.M., Zaman, T.: MODLEACH:
a variant of LEACH for WSNs. In Proceedings of the International Broadband and Wireless
Computing, Communication and Applications (BWCCA), Compiegne, France, pp. 158–163,
Oct 28–30 2013 (2013)
36. Neto, A.S., Cardoso, A.R., Celestino, J.: MH-LEACH: a distributed algorithm for multi-hop
communication in wireless sensor networks. In: ICN, The Thirteenth International Conference
on Networks, pp. 55–61, 23–27 February 2014 (2014)
37. Peng, H., Dong, H., Li, H.: LEACH protocol based two-level clustering algorithm. Int. J. Hybrid
Inf. Technol. 8(10), 15–26 (2015)
38. Fu, C., Jiang, Z., Wei, W., Wei, A.: An energy balanced algorithm of LEACH protocol in WSN.
Int. J. Comput. Sci. 10(1), 354–359 (2013)
39. Lindsey, S., Raghavendra, C.S.: PEGASIS: power-efficient gathering in sensor information
systems. In: Proceedings of the IEEE Aerospace Conference Proceedings, vol. 3, pp. 1125–
1130, Big Sky, Mont, USA, 9–16 March 2002 (2002)
40. Manjeshwar, A., Agrawal, D.P.: TEEN: a routing protocol for enhanced efficiency in wireless
sensor networks. In: Proceedings of the 15th International Parallel and Distributed Processing
Symposium (IPDPS), pp. 2009–2015, San Francisco, CA, USA, 23–27 April 2001 (2001)
Circularly Polarized 1 × 4 Antenna
Array with Improved Isolation
for Massive MIMO Base Station
1 Introduction
In 5G, there is a need for a 10-Gbps data rate, 1 ms of latency, and more than
101 devices connected to the base station compared to 4G. Massive MIMO will
ensure maximum coverage and low power consumption of the devices [1]. The
R. S. Bakale (B)
Department of Electronics and Telecommunication Engineering, College of Engineering,
Ambajogai, Beed, Maharashtra, India
A. B. Nandgaonkar · S. B. Deosarkar
Department of Electronics and Telecommunication Engineering, DBATU Technological
University, Lonere, Raigad, India
e-mail: sbdeosarkar@dbatu.ac.in
R. Bhadade
MIT College of Engineering Pune, Pune, India
e-mail: raghunath.bhadade@mitpune.edu.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 475
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_45
476 R. S. Bakale et al.
terms of simulated gain, bandwidth, isolation between the ports and array elements,
and axial ratio. The fourth section deals with linear and planar antenna array geometry
for a massive MIMO base station. The final section presents the conclusion on a linear
antenna array of 1 × 4 size for massive MIMO base station application.
The authors' contribution is the design and development of a circularly polarized
hexagonal microstrip antenna for massive MIMO base station applications with a gain
of 4.97 dB per port. The gain of the 1 × 4 antenna array is 11.37 dB for an element
spacing of 0.55λ. An impedance bandwidth of 160 MHz and an axial ratio of less than
3 dB are achieved for the given antenna at 3.7 GHz.
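$$a = \frac{1.8412\,c}{2\pi f_r \sqrt{\varepsilon_r}} \tag{1}$$

$$S = \frac{2c}{n f_r \sqrt{\varepsilon_r}} \tag{2}$$

$$\text{Area} = \frac{S^2 n}{4 \tan\left(\frac{180^\circ}{n}\right)} \tag{3}$$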
A hexagonal microstrip antenna is designed at 3.7 GHz. The side length of the
antenna is calculated using Eq. (2), and the final dimensions of the antenna are
given in Table 1.
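As a rough numerical illustration of Eqs. (1) to (3), the sketch below evaluates the equivalent circular patch radius, the polygon side length, and the polygon area. The equations are read here as a = 1.8412c/(2π fr √εr), S = 2c/(n fr √εr), and Area = S²n/(4 tan(180°/n)). The substrate permittivity is not stated in this part of the text, so the value 2.2 used in the example call is purely a hypothetical placeholder, and the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def circular_patch_radius(f_r: float, eps_r: float) -> float:
    """Eq. (1): radius a (m) of the equivalent circular patch."""
    return 1.8412 * C / (2 * math.pi * f_r * math.sqrt(eps_r))


def polygon_side_length(f_r: float, eps_r: float, n: int) -> float:
    """Eq. (2): side length S (m) of a regular n-sided patch."""
    return 2 * C / (n * f_r * math.sqrt(eps_r))


def polygon_area(side: float, n: int) -> float:
    """Eq. (3): area of a regular n-sided polygon with the given side length."""
    return side ** 2 * n / (4 * math.tan(math.pi / n))


if __name__ == "__main__":
    EPS_R_ASSUMED = 2.2  # hypothetical substrate permittivity, not from the text
    s = polygon_side_length(3.7e9, EPS_R_ASSUMED, n=6)
    print(f"side = {s * 1e3:.2f} mm, area = {polygon_area(s, 6) * 1e6:.1f} mm^2")
```

The actual dimensions used by the authors are those reported in their Table 1; the numbers printed by this sketch depend entirely on the assumed permittivity.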
Circular polarization is achieved using a dual feed in which the applied magnitudes
are equal and the phases are in quadrature. It improves the performance against
multipath fading. A circularly polarized antenna in a massive MIMO BS can serve many
tens of terminals in the same time–frequency resource. The performance of CP
antennas is measured in terms of the axial ratio, which should be less than 3 dB over
the operating frequency range. A Right-Hand Circularly Polarized (RHCP) hexagonal
microstrip antenna with dual feed is shown in Fig. 1.
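The link between the dual-feed condition and the 3 dB axial ratio criterion can be illustrated with the standard polarization-ellipse relations for two orthogonal field components. The sketch below is a generic textbook calculation, not the authors' simulation; the function name and the example amplitude and phase values are assumptions chosen for illustration.

```python
import math


def axial_ratio_db(e_x: float, e_y: float, phase_deg: float) -> float:
    """Axial ratio (dB) of a wave with orthogonal amplitudes e_x, e_y and
    phase difference phase_deg, from the polarization-ellipse axes."""
    delta = math.radians(phase_deg)
    total = e_x ** 2 + e_y ** 2
    inner = e_x ** 4 + e_y ** 4 + 2 * e_x ** 2 * e_y ** 2 * math.cos(2 * delta)
    root = math.sqrt(max(0.0, inner))
    major = math.sqrt((total + root) / 2)
    minor = math.sqrt(max(0.0, (total - root) / 2))
    if minor == 0.0:
        return math.inf  # linear polarization
    return 20 * math.log10(major / minor)


if __name__ == "__main__":
    # Ideal dual feed (equal amplitude, 90 degree phase) and a 10 degree phase error.
    print(axial_ratio_db(1.0, 1.0, 90.0), axial_ratio_db(1.0, 1.0, 80.0))
```

Equal amplitudes with an exact 90 degree phase shift give an axial ratio of 0 dB (perfect circular polarization), and small amplitude or phase errors raise it, which is why the criterion is stated as a bound over the operating band.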
FP1 and FP2 are fed with equal amplitude and a phase shift of 90 degrees
between them. The antenna is simulated over a 3.0–4.5 GHz frequency
range with 3.7 GHz as the center frequency. An S11 below −25 dB is
achieved at 3.7 GHz, and the isolation between the ports (S12) is more than 36 dB.
The impedance bandwidth at −10 dB is approximately 160 MHz. Results are shown
in Fig. 2.
The simulated radiation pattern shows a maximum gain of 4.97 dB per port. The
E-plane and H-plane radiation patterns obtained are shown in
Fig. 3. A hexagonal microstrip antenna is used to design a 1 × 4 antenna array for
massive MIMO base station applications. The simulated results show that an impedance
bandwidth of 160 MHz and an axial ratio of <3 dB are achieved. The 1 × 4 antenna array
is fabricated and tested using a vector network analyzer. Axial ratio is a quality metric
used in circular polarization. The axial ratio value is 0.27 at the center frequency of
3.7 GHz, as shown in Fig. 4.
Fig. 2 S11 and S12 parameters of simulated Hexagonal microstrip antenna at 3.7 GHz
Fig. 3 The radiation pattern of simulated Hexagonal microstrip antenna with E plane and H plane
Fig. 4 The axial ratio of simulated Hexagonal microstrip antenna at 3.7 GHz
The 1 × 4 antenna array is designed using a hexagonal microstrip antenna with a
center-to-center spacing of 0.5λ, where λ is the wavelength of the electromagnetic
wave at 3.7 GHz. The antenna is simulated using HFSS v13.0. The isolation between
elements is less than 20 dB. The maximum achievable gain of the 1 × 4 antenna
array is 11.10 dB. The size of the 1 × 4 antenna array is 40.5 mm × 162 mm × 1.57 mm.
The mutual coupling values S13, S15, and S17 for the array are shown in Fig. 5.
A hexagonal microstrip antenna is also used to design a 1 × 4 antenna array with a
center-to-center spacing of 0.6λ, where λ is the wavelength of the electromagnetic
wave at 3.7 GHz. The antenna is simulated using HFSS v13.0. The isolation between
elements is much improved compared to the result with 0.5λ spacing. The maximum
achievable gain of the 1 × 4 antenna array is 11.43 dB. Gain and isolation are
improved, but the size of the antenna is increased. The size of the antenna array is
40.5 mm × 186.3 mm × 1.57 mm. The mutual coupling values S13, S15, and S17 for the
array are shown in Fig. 6.
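The reported array sizes can be cross-checked against the element spacing with a short calculation. The sketch below assumes that the array length equals one element footprint (taken here as 40.5 mm, the reported array width) plus three center-to-center spacings; this length model and the function names are assumptions made for illustration, not a statement of how the authors laid out the board.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_mm(freq_hz: float) -> float:
    """Free-space wavelength in millimetres."""
    return C / freq_hz * 1e3


def array_length_mm(element_mm: float, spacing_wavelengths: float,
                    n_elements: int, freq_hz: float) -> float:
    """Assumed length model: one element footprint plus (N - 1) spacings."""
    spacing_mm = spacing_wavelengths * wavelength_mm(freq_hz)
    return element_mm + (n_elements - 1) * spacing_mm


if __name__ == "__main__":
    for d in (0.5, 0.6):
        length = array_length_mm(40.5, d, 4, 3.7e9)
        print(f"d = {d} wavelengths -> length of about {length:.1f} mm")
```

Under this assumption the computed lengths come out close to the reported 162 mm and 186.3 mm, which suggests that the size increase at 0.6λ is simply the three additional 0.1λ gaps.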
Fig. 5 Mutual coupling between antenna elements of the 1 × 4 antenna array with an element
spacing of 0.5λ
Fig. 6 Mutual coupling between antenna elements of the 1 × 4 antenna array with an element
spacing of 0.6λ
Circularly Polarized 1 × 4 Antenna Array … 481
The antenna array element spacing decides the performance of MIMO and
massive MIMO for base station application. Transmission and reception of independent
signals are analyzed in a rich scattering environment. In a linear array geometry,
radiating elements are placed along one axis; in a planar array geometry,
elements are placed along both axes. The performance of the linear array is evaluated
using software such as HFSS and SystemVue in terms of maximum directivity
(D0), half-power beamwidth (HPBW), and side lobe level (SLL). The HPBW (θh) and D0
of a linear array are given by Eqs. (4) and (5), respectively.
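$$\theta_h = 2\left[\frac{\pi}{2} - \cos^{-1}\left(\frac{1.391\,\lambda}{\pi N d}\right)\right] \tag{4}$$

$$D_0 = 2N\left(\frac{d}{\lambda}\right) \tag{5}$$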
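Equations (4) and (5) can be evaluated directly for the spacings studied here. The sketch below reads Eq. (4) as θh = 2[π/2 − cos⁻¹(1.391λ/(πNd))] and Eq. (5) as D0 = 2N(d/λ), the usual approximations for a uniform broadside linear array of isotropic elements. The function names are illustrative, and the results are array-factor estimates, so they are not expected to match the simulated gains of the patch array.

```python
import math


def broadside_hpbw_deg(n: int, d_over_lambda: float) -> float:
    """Eq. (4): half-power beamwidth (degrees) of an N-element broadside array."""
    arg = 1.391 / (math.pi * n * d_over_lambda)
    if arg > 1.0:
        raise ValueError("N*d is too small for this approximation")
    return math.degrees(2 * (math.pi / 2 - math.acos(arg)))


def broadside_directivity(n: int, d_over_lambda: float) -> float:
    """Eq. (5): maximum directivity (dimensionless) of the array factor."""
    return 2 * n * d_over_lambda


if __name__ == "__main__":
    for d in (0.50, 0.55, 0.60):
        d0 = broadside_directivity(4, d)
        print(d, round(broadside_hpbw_deg(4, d), 2), round(10 * math.log10(d0), 2))
```

Both expressions show the same trend as the simulations: a larger spacing narrows the beam and raises the directivity, at the cost of a longer array.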
Table 2 Analysis of mutual coupling at the different spacing between array elements
Distance between antennas in the array    Mutual coupling (dB)
                                          S13       S15       S17
d = 0.50λ                                 −19.0     −36.2     −53.3
d = 0.55λ                                 −22.3     −42       −63.8
d = 0.60λ                                 −25       −50.4     −75
(a) Top View of designed antenna array (b) Bottom view of the designed antenna array
Table 4 Benchmarking results of antenna element with the literature
Reference elements   Frequency (GHz)   Bandwidth (MHz)   Gain (dB)
[6]                  3.6               230               5.4/port (2 × 2 antenna array)
[7]                  2.45              171.9             6.17/port (No antenna array)
[10]                 5.8               200               13/port (1 × 4 antenna array)
[11]                 2.596             194               10/port (1 × 4 antenna array)
Proposed element     3.696             160               4.97/port (1 × 4 antenna array)
Recently, concurrent multiband systems have become very popular [18–21]. The
proposed antenna prototype can be extended in this direction. This approach
will reduce the dimensions of the prototype and support multiple bands of
operation simultaneously.
5 Conclusions
This paper proposes a massive MIMO antenna for mobile base station applications
designed at a frequency of 3.7 GHz. The antenna S-parameters, such as S11 and
S21, are measured and found to be close to the simulated ones. The 1 × 4 antenna array
has a simulated gain of 11.37 dB and an impedance bandwidth of 160 MHz at a spacing
of 0.55λ. At 0.60λ spacing, isolation is improved, but at the cost of increased antenna
array size, and at 0.50λ spacing, isolation is less than 20 dB. For the proposed design,
0.55λ spacing is selected for higher isolation. The antenna array is designed with
circular polarization by feeding two ports with equal amplitude and quadrature phase
to achieve an axial ratio of less than 3 dB.
References
1. Shaikh, A., Kaur, M.J.: Comprehensive survey of massive MIMO for 5G communications. In:
2019 Advances in Science and Engineering Technology International Conferences (ASET),
Dubai, United Arab Emirates, pp. 1–5 (2019). https://doi.org/10.1109/ICASET.2019.8714426
2. Artiga, X., Devillers, B., Perruisseau-Carrier, J.: Mutual coupling effects in multi-user massive
MIMO base stations. In: Proceedings of the 2012 IEEE International Symposium on Antennas
and Propagation, Chicago, IL, pp. 1–2 (2012). https://doi.org/10.1109/APS.2012.6349354
3. Gampala, G., Reddy, C.J.: Massive MIMO—beyond 4G and a basis for 5G. In: 2018 Inter-
national Applied Computational Electromagnetics Society Symposium (ACES), Denver, CO,
pp. 1–2 (2018). https://doi.org/10.23919/ROPACES.2018.8364192
4. Manteuffel, D.: Compact multi-port multi element antenna for Massive MIMO. In: 2016 IEEE
International Symposium on Antennas and Propagation (APSURSI), Fajardo, pp. 11–12 (2016).
https://doi.org/10.1109/APS.2016.7695714
5. Li, Y., Zou, H., Peng, M., Wang, M., Yang, G.: Hybrid 12-antenna array for quad-band
5G/Sub-6GHz MIMO in micro wireless access points. In: 2018 International Conference on
Microwave and Millimeter Wave Technology (ICMMT), Chengdu, pp. 1–3 (2018).
https://doi.org/10.1109/ICMMT.2018.8563780
6. Al-Tarifi, M.A., Faouri, Y.S., Sahrawi, M.S.: A printed 16 ports massive MIMO antenna system
with directive port beams. In: 2016 IEEE 5th Asia-Pacific Conference on Antennas and
Propagation (APCAP), Kaohsiung, pp. 125–126 (2016).
https://doi.org/10.1109/APCAP.2016.7843130
7. Bhadade, R., Mahajan, S.: High gain circularly polarized pentagonal microstrip for massive
MIMO base station. AEM 8(3), 83–91 (2019)
8. Vieira, J., et al.: A flexible 100-antenna testbed for Massive MIMO. In: 2014 IEEE Globecom
Workshops (GC Wkshps), Austin, TX, pp. 287–293 (2014).
https://doi.org/10.1109/GLOCOMW.2014.7063446
9. Li, Y., Sim, C., Luo, Y., Yang, G.: 12-Port 5G massive MIMO antenna array in sub-6GHz
mobile handset for LTE bands 42/43/46 applications. IEEE Access 6, 344–354 (2018).
https://doi.org/10.1109/ACCESS.2017.2763161
10. Xingdong, P., Wei, H., Tianyang, Y., Linsheng, L.: Design and implementation of an active
multi-beam antenna system with 64 RF channels and 256 antenna elements for massive MIMO
application in 5G wireless communications. China Commun. 11(11), 16–23 (2014).
https://doi.org/10.1109/CC.2014.7004520
11. Kim, Y., et al.: Full dimension mimo (FD-MIMO): the next evolution of MIMO in LTE systems.
IEEE Wirel. Commun. 21(2), 26–33 (2014). https://doi.org/10.1109/MWC.2014.6812288
12. Payami, S., Tufvesson, F.: Channel measurements and analysis for very large array systems at
2.6 GHz. In: 2012 6th European Conference on Antennas and Propagation (EUCAP), Prague,
pp. 433–437 (2012). https://doi.org/10.1109/EuCAP.2012.6206345
13. Yuan, H., Wang, C., Li, Y., Liu, N., Cui, G.: The design of array antennas used for Massive
MIMO system in the fifth generation mobile communication. In: 2016 11th International
Symposium on Antennas, Propagation and EM Theory (ISAPE), Guilin, pp. 75–78 (2016).
https://doi.org/10.1109/ISAPE.2016.7833881
14. Geraci, G., Garcia-Rodriguez, A., Galati Giordano, L., López-Pérez, D., Björnson, E.:
Understanding UAV cellular communications: from existing networks to massive MIMO. IEEE
Access 6, 67853–67865 (2018). https://doi.org/10.1109/ACCESS.2018.2876700
15. Huang, H., Yang, J., Huang, H., Song, Y., Gui, G.: Deep learning for super-resolution channel
estimation and DOA estimation based massive MIMO system. IEEE Trans. Vehicular Technol.
67(9), 8549–8560 (2018). https://doi.org/10.1109/TVT.2018.2851783
16. Larsson, E.G., Edfors, O., Tufvesson, F., Marzetta, T.L.: Massive MIMO for next generation
wireless systems. IEEE Commun. Mag. 52(2), 186–195 (2014).
https://doi.org/10.1109/MCOM.2014.6736761
17. Lu, L., Li, G.Y., Swindlehurst, A.L., Ashikhmin, A., Zhang, R.: An overview of massive MIMO:
benefits and challenges. IEEE J. Select. Top. Signal Process. 8(5), 742–758 (2014).
https://doi.org/10.1109/JSTSP.2014.2317671
18. Iyer, B., Pathak, N.P., Ghosh, D.: Dual-input dual-output RF sensor for indoor human
occupancy and position monitoring. IEEE Sens. J. 15(7), 3959–3966 (2015).
https://doi.org/10.1109/JSEN.2015.2404437
19. Iyer, B., Pathak, N.P., Ghosh, D.: Concurrent dualband patch antenna array for non-invasive
human vital sign detection application. In: 2014 IEEE Asia-Pacific Conference on Applied
Electromagnetics (APACE), Johor Bahru, pp. 150–153 (2014).
https://doi.org/10.1109/APACE.2014.7043765
20. Iyer, B., Pathak, N.P., Ghosh, D.: Reconfigurable multiband concurrent RF system for
non-invasive human vital sign detection. In: 2014 IEEE Region 10 Humanitarian Technology
Conference (R10 HTC), Chennai, pp. 111–116 (2014).
https://doi.org/10.1109/R10-HTC.2014.7026309
21. Rathod, B., Iyer, B.: Concurrent triband filtenna design for WLAN and WiMAX applications.
In: Hitendra Sarma, T., Sankar, V., Shaik, R. (eds.) Emerging Trends in Electrical, Commu-
nications, and Information Technologies. Lecture Notes in Electrical Engineering, vol. 569,
pp. 775–784. https://doi.org/10.1007/978-981-13-8942-9_66
Analysis of Rectangular Microstrip
Array Antenna Fed Through Microstrip
Lines with Change in Width
Tarun Kumar Kanade, Alok Rastogi, Sunil Mishra, and Vijay D. Chaudhari
Abstract This paper presents a detailed investigation of a microstrip array antenna
with step discontinuities in its feed line. In the proposed configuration,
antenna arrays at 2.45 GHz are designed, simulated, and fabricated to demonstrate
the concept of step discontinuities in the feed lines. A four-element rectangular
patch array is fully characterized, and its performance is critically assessed for
no-step, single-step, and double-step microstrip feed lines. The return loss S11 [dB] is
better for microstrip array antennas with double-step feed lines than for array antennas
with no-step and single-step feed lines. Appropriate impedance matching and higher
isolation between the patches and feed lines were obtained using step discontinuities
in the feed lines. FR4 substrates were used to design, simulate, and fabricate the
microstrip array antennas. The simulated S11 [dB] values for rectangular microstrip
array antennas with no-step, single-step, and double-step feed lines are −8.78 dB,
−16.48 dB, and −17.15 dB, respectively. Prototypes of these antennas are then fabricated
and measured to validate the analysis and design experimentally. The simulated
and measured results agree with each other.
T. K. Kanade (B)
Assistant Professor, Department of Science, The Bhopal School of Social Science, Bhopal, MP,
India
A. Rastogi · S. Mishra
Professor, Department of Physics & Electronics, Institute for Excellence in Higher Education,
Bhopal, MP, India
e-mail: akrastogi_bpl@yahoo.co
V. D. Chaudhari
Assistant Professor, E & TC Engineering Department, G.F.’s Godavari College of Engineering,
Jalgaon, MS, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 487
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_46
488 T. K. Kanade et al.
1 Introduction
Printed antennas are promising candidates for microwave and millimeter-wave
communications, where the dimensions of the antenna should be kept to a
minimum. In the twenty-first century, planar antennas have found applications
in cellular communication systems, digital communication systems, wireless LAN,
and personal communication systems. In modern wireless devices, microstrip
patch antennas are increasingly in demand because of their good performance,
low profile, light weight, ease of construction, and conformability in microwave and
millimeter-wave circuits. The microstrip antenna has some limitations, such as narrow
bandwidth and somewhat lower gain. A microstrip antenna consists of three parts: a metal
layer or patch, a dielectric substrate, and a ground metal layer, with the substrate
sandwiched between the metal layer and the ground metal layer. Both the single patch
antenna and an array of microstrip patch antennas have their benefits in their respective
domains. A microstrip array antenna consists of microstrip patch antenna elements,
interconnected and fed using microstrip transmission lines. Array configurations are
extensively used in microwave and millimeter-wave communication systems where a
narrow beam is required. The commonly used feeding techniques in microstrip array
antennas are parallel and series feeding. In a parallel feed network, all the patches are
coupled by single transmission lines, while in a series feed network, the radiating
elements are arranged in a line and connected to a planar transmission line. The feed
networks must be designed carefully to curtail any adverse effects on array
performance. As the feed line itself radiates, it must be properly optimized to obtain the
appropriate return loss, gain, and directivity [1–4]. Section 2 describes the antenna
array design and fabrication, followed by Sect. 3, which deals with simulation and
measurement results. Conclusions are drawn in Sect. 4.
The design and fabrication of microstrip patch antennas require empirical
formulas and parameters such as the dielectric constant (εr) and height of the
substrate material and the required frequency (fr). The microstrip patch antenna's
width and length are determined by the empirical formulae [3–7]. A single-element
microstrip patch antenna is designed for a fixed frequency and gain, and its radiation
pattern is relatively wide with low directivity or gain. To meet the needs of
long-distance communication in various applications, it is essential to design antennas
with specific directive features or large gain. The directivity and gain may be increased
by increasing the antenna's electrical size, but enlarging a single element does not
by itself fulfill the desired requirements. Another technique to increase the antenna's
dimensions without increasing the size of the individual patch elements is to form an
assembly of radiating patch elements in an electrical and structural configuration.
Thus, the array antenna is formed by combining more than one patch element [8–10].
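The empirical width and length formulae referred to above are not reproduced in the text. As an illustration only, the sketch below uses the widely used transmission-line model for a rectangular patch (as given in standard antenna textbooks such as [3]); whether the authors used exactly these expressions is an assumption. The FR4 values εr = 4.4 and h = 1.6 mm are typical figures chosen for the example, not values taken from the paper.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def patch_dimensions(f_r: float, eps_r: float, h: float) -> tuple[float, float]:
    """Return (width, length) in metres from the transmission-line model."""
    width = C / (2 * f_r) * math.sqrt(2 / (eps_r + 1))
    # Effective permittivity accounts for fringing fields in air and substrate.
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h / width)
    # Length extension caused by fringing at each radiating edge.
    delta_l = 0.412 * h * ((eps_eff + 0.3) * (width / h + 0.264)) / (
        (eps_eff - 0.258) * (width / h + 0.8))
    length = C / (2 * f_r * math.sqrt(eps_eff)) - 2 * delta_l
    return width, length


if __name__ == "__main__":
    # Hypothetical FR4 parameters: eps_r = 4.4, h = 1.6 mm (assumed, not from the text).
    w, l = patch_dimensions(2.45e9, 4.4, 1.6e-3)
    print(f"W = {w * 1e3:.2f} mm, L = {l * 1e3:.2f} mm")
```

In practice these closed-form values serve as a starting point that is then tuned in a full-wave simulator.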
Analysis of Rectangular Microstrip Array Antenna Fed … 489
Fig. 1 Structure of rectangular microstrip patch array with no-step feed line
Fig. 2 Structure of rectangular microstrip patch array with a single-step feed line
Fig. 3 Structure of rectangular microstrip patch array with a double-step feed line
Printed antennas are favorable candidates for microwave and millimeter-wave
communications, where the dimensions of the antenna should be kept to a minimum.
The microstrip patch array antenna is designed and simulated using the FEM-based
HFSS software and fabricated on an FR4 substrate. The fabricated microstrip patch
array antennas with three different feed lines are shown in Figs. 4, 5, and 6. The
resulting parameters, such as return loss, VSWR, and radiation patterns, were analyzed.
Figures 7, 8, and 9 present the simulated reflection coefficient versus frequency. The
graphical analysis shows that S11 [dB] is improved for a microstrip array antenna
with a double-step feed line compared to an array antenna with single-step and
no-step feed lines. For an array of microstrip patch antennas, the simulated S11 is
−8.79 dB at 2.45 GHz for the no-step feed line, −16.48 dB at 2.55 GHz for the single-step
feed line, and −17.15 dB at 2.45 GHz for the double-step feed line.
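Since both return loss and VSWR are analyzed, it may help to see how the two are related. The sketch below applies the standard conversion |Γ| = 10^(S11/20) and VSWR = (1 + |Γ|)/(1 − |Γ|) to the simulated S11 values, taken as negative dB figures. The function names are illustrative, and the VSWR values it prints are derived here rather than reported by the authors.

```python
def reflection_magnitude(s11_db: float) -> float:
    """Magnitude of the reflection coefficient from S11 in dB (negative value)."""
    return 10 ** (s11_db / 20)


def vswr(s11_db: float) -> float:
    """Voltage standing wave ratio corresponding to S11 in dB."""
    gamma = reflection_magnitude(s11_db)
    if gamma >= 1.0:
        raise ValueError("S11 must be below 0 dB for a passive, matched port")
    return (1 + gamma) / (1 - gamma)


if __name__ == "__main__":
    for label, s11 in (("no-step", -8.79), ("single-step", -16.48),
                       ("double-step", -17.15)):
        print(label, round(vswr(s11), 2))
```

A more negative S11 maps to a VSWR closer to 1, so the ordering of the three feed lines is the same under either metric.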
The measured reflection coefficients versus frequency for the three
microstrip patch arrays with no-step, single-step, and double-step feed lines are
shown in Figs. 10, 11, and 12, respectively. The measured S11 [dB] for a microstrip
patch array with a no-step feed line is −12.756 dB at 2.51 GHz, with a single-step feed
line is −15.199 dB at 2.52 GHz, and with a double-step feed line is −15.207 dB at
2.48 GHz. The microstrip patch array with a double-step feed line is
resonant at 2.48 GHz, near the required frequency. The simulated and
measured results nearly agree with each other, with a variance between them
of the order of 2.0 dB. A slight deviation is also observed
between the measured and simulated operating frequencies due to inaccuracies
in the fabrication process and measurement errors. Table 1 compares our implemented
array patch with the earlier implemented single patch [14].
Fig. 6 Fabricated PCB of the rectangular microstrip patch array—double-step feed line
Fig. 7 S11 [dB] of the rectangular microstrip patch array—no-step feed line
Fig. 8 S11 [dB] of the rectangular microstrip patch array—single-step feed line
Fig. 9 S11 [dB] of the rectangular microstrip patch array—double-step feed line
Recently, concurrent multiband systems have become very popular [15–20]. The
proposed antenna prototype can be extended in this direction. This approach
will reduce the prototype's dimensions and support multiple operation bands
simultaneously with significantly lower power requirements.
Fig. 10 Measured S11 [dB] of the rectangular microstrip patch array—no-step feed line
Fig. 11 Measured S11 [dB] of the rectangular microstrip patch array—single-step feed line
Fig. 12 Measured S11 [dB] of the rectangular microstrip patch array—double-step feed line
Table 1 Comparisons of single patch and array patch simulated and experimental results at
2.45 GHz
               No-step feed                 Single-step feed             Double-step feed
               Single patch  Array patch    Single patch  Array patch    Single patch  Array patch
               [dB]          [dB]           [dB]          [dB]           [dB]          [dB]
Simulation     −11.91        −8.79          −14.32        −16.47         −15.91        −17.15
Experimental   −11.77        −12.76         −10.44        −15.20         −19.96        −15.21
4 Conclusions
In this paper, four-element microstrip patch array antennas with three different feed
lines have been presented for wireless devices operating at 2.45 GHz. A new strategy
was proposed and analyzed through simulation, fabrication, and measurement to
investigate the role of step discontinuities in a feed line. The simulated and measured
results for the microstrip patch array antennas reveal that the array antenna with
double-step feed lines performs better than the array antennas with single-step and
no-step feed lines.
References
1. Lamminen, A., Säily, J., Ala-Laurinaho, J., de Cos, J., Ermolov, V.: Patch antenna and antenna
array on multilayer high-frequency PCB for D-band. IEEE Open J. Ant. Propagat. 1, 396–403
(2020)
2. Wang, L., En, Y.-F.: A wideband circularly polarized microstrip antenna with multiple modes.
IEEE Open J. Ant. Propagat. 1, 413–418 (2020)
3. Balanis, C.A.: Antenna Theory: Analysis and Design, 3rd edn. Wiley, New York (1997)
4. Waterhouse, R.B.: Microstrip Patch Antennas: A Designer’s Guide, 1st edn. Springer Science
+ Business Media, New York (2003)
5. Abohmra, A., Abbas, H., Al-Hasan, M., Mabrouk, I.B., Alomainy, A., Imran, M.A., Abbasi,
Q.H.: Terahertz antenna array based on a hybrid perovskite structure. IEEE Open J. Ant.
Propagat. 1, 464–471 (2020)
6. Chiu, C.-Y., Lau, B.K., Murch, R.: Bandwidth enhancement technique for broadside tri-modal
patch antenna. IEEE Open J. Ant. Propagat. 1, 524–533 (2020)
7. Gupta, C., Gopinath, A.: Equivalent circuit capacitance of microstrip step change in width.
IEEE Trans. Microwave Theory Tech. MTT-25, 819–822 (1977)
8. Easter, B.: The equivalent circuit of some microstrip discontinuities. IEEE Trans. Microwave
Theory Tech. MTT-23, 655–660 (1975)
9. Horton, R.: Equivalent representation of an abrupt impedance step in microstrip line. IEEE
Trans. Microwave Theory Tech. MTT-21, 562–564 (1973)
10. Thompson, F., Gopinath, A.: Calculation of microstrip discontinuity inductances. IEEE Trans.
Microwave Theory Tech. MTT-23, 648–655 (1975)
11. Krage, M.K., Haddad, G.I.: Frequency dependent characteristics of microstrip transmission
lines. IEEE Trans. Microwave Theory Tech. MTT-20, 678–688 (1975)
12. Raicu, D.: Universal taper for compensation of step discontinuities in microstrip lines. IEEE
Microwave Guided Lett. 1, 249–251 (1991)
13. Koster, N.H.L., Jansen, R.H.: The microstrip step discontinuity: a revised description. IEEE
Trans. Microwave Theory Tech. MTT-34, 213–223 (1986)
14. Kanade, T.K., Rastogi, A.K., Mishra, S.: Design simulation and experimental investigations of
microstrip patch antennas and its feed line. Int. J. Eng. Res. Technol. 4, 25–28 (2015)
15. Iyer, B., Pathak, N.P., Ghosh, D.: Dual-input dual-output RF sensor for indoor human occu-
pancy and position monitoring. IEEE Sens. J. 15(7), 3959–3966 (2015). https://doi.org/10.
1109/JSEN.2015.2404437
16. Iyer, B., Pathak, N.P., Ghosh, D.: Concurrent dualband patch antenna array for non-invasive
human vital sign detection application. In: 2014 IEEE Asia-Pacific Conference on Applied
Electromagnetics (APACE), Johor Bahru, 2014, pp. 150–153. https://doi.org/10.1109/APACE.
2014.7043765
17. Iyer, B., Pathak, N.P., Ghosh, D.: Reconfigurable multiband concurrent R.F. system for non-
invasive human vital sign detection. In: 2014 IEEE Region 10 Humanitarian Technology
Conference (R10 HTC), Chennai, 2014, pp. 111–116. https://doi.org/10.1109/R10-HTC.2014.
7026309
18. Rathod, B., Iyer, B.: Concurrent triband filtenna design for WLAN and WiMAX applications.
In: Hitendra Sarma, T., Sankar, V., Shaik, R. (eds.) Emerging Trends in Electrical, Commu-
nications, and Information Technologies. Lecture Notes in Electrical Engineering, vol. 569,
pp. 775–784 (2020). https://doi.org/10.1007/978-981-13-8942-9_66
19. Iyer, B.: Characterisation of concurrent multiband RF transceiver for WLAN applications. Adv.
Intell. Syst. Res. 834–846 (2016). https://doi.org/10.2991/iccasp-16.2017.112
20. Iyer, B., Garg, M., Pathak, N., Ghosh, D.: Contactless detection and analysis of human vital
signs using concurrent dual-band R.F. system. Procedia Eng. 64, 185–194 (2013)
Parametric Study of Electromagnetic
Coupled MSA Array for PAN Devices
with RF Survey
1 Introduction
Array antennas have become an essential part of today’s digital (wireless
communication) world. In wireless communication, connectivity and bandwidth are
significant factors. Microstrip antenna (MSA) arrays are very popular, with gains of
up to 30 dB. They are available in two forms, linear and planar. Though both have
their merits and demerits, a planar array antenna occupies comparatively less space.
Feeding techniques play an essential role in the radiation characteristics of the array
antenna. The literature shows that the aperture coupled microstrip array antenna has
developed rapidly within MSA technology [1], as it gives high gain and low sidelobe
levels [3].
The only difference between aperture coupled and electromagnetic coupled array
antennas is that in aperture coupling the top and base layers couple through a slot in
the base layer, where the feeding network resides, whereas in electromagnetic
coupling the top and base layers are coupled without a slot. For bandwidth
enhancement, broadband proximity fed gap coupled RMSAs can be used [4]. A series
fed network for a microstrip array antenna minimizes feedline length and radiation
from the feedline [5, 6]. A corporate feed network is preferred when space is limited.
It also has advantages such as equal power to all elements, larger bandwidth, and
modular nature [7–11]. The microstrip-line feed is easy to fabricate, and if the inset
position is selected correctly, impedance matching becomes easy [12].
S. Nandedkar (B)
Maharashtra Institute of Technology, Aurangabad, MS 431005, India
S. Nawale
N B Navale Sinhgad College of Engineering, Solapur, MS 413255, India
A. Kulkarni
Team Leader Cyronics Instruments Pvt Ltd, Pune, MS 411009, India
e-mail: anirudha@cyronics.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 497
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_47
Wireless PAN and its applications are in growing demand because of advantages such
as a high data rate over a small coverage area of up to 10 m. Devices operating with
Wi-Fi include desktop computers, laptops, smartphones, printers, smartwatches,
personal digital assistants, etc. The interconnection of these devices may include
Personal Area Network (PAN), Bluetooth, and Ethernet [13]. To achieve a high data
rate within a small area, antennas with different structures are used, such as
ultra-wideband and multiple-input multiple-output (MIMO) antennas, planar rather
than linear configurations, and so on [14]. For handheld devices in PAN
characterization, the MIMO channel has been proposed [15–17]. As this is one of the
requirements of PAN, planar array antennas are proposed in this work. The array
antenna is simulated using CST Microwave Studio. The antenna is fabricated on an
FR4 substrate, and its operating parameters are measured with a network analyzer.
Simulated and measured results are compared for the analysis of the antenna.
The remaining sections of this article are organized as follows. Section 2 gives the
single patch design procedure, the array antenna with the base layer (with microstrip
line feed and coaxial feed), the top layer, and their parametric studies. The fabricated
antenna and measured results, along with testing for PAN applications, are presented
in Sect. 3. Finally, a comparison of measured and simulated results and the
conclusions of the paper are given in Sect. 4.
A single patch antenna is designed with a resonant frequency of 2.4 GHz on an FR-4
substrate with a dielectric constant of 4.3 and a thickness of 1.6 mm, as shown in
Fig. 1. The dimensions of the patch and microstrip line are also shown. The patch
dimensions calculated for this design are 38.0 mm by 29 mm. A microstrip line of
23.35 mm with an inset feed depth of 8.85 mm is used. The gap between the patch and
the inset feed (Gpf) is taken as 1 mm. The calculated width of the microstrip line is
3.14 mm. The ground plane and patch are copper with a thickness of 0.035 mm.
The array antenna’s base layer is designed on an FR-4 substrate with a dielectric
constant of 4.3 and a thickness of 1.6 mm. All patches are placed in a planar
configuration to reduce the antenna’s size and make it more compact. Initially, a
two-by-two (2 × 2) array with microstrip line feed is designed. To achieve impedance
matching between the line feed and each individual patch, a quarter wave transformer
is used [16]. The input impedance (Zi) is 50 Ω. The line impedance Zc is calculated
using the quarter wave transmission line impedance equation
$$Z_c = \sqrt{Z_1 Z_2} \qquad (1)$$
where Z_1 and Z_2 are the impedances on either side of the transformer.
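As a quick numerical illustration of the quarter wave transformer relation, the sketch below computes the geometric mean of the two impedances being matched. The 50 Ω value comes from the text; the 100 Ω value on the other side is a hypothetical figure chosen only to show the arithmetic and is not taken from the paper.

```python
import math


def quarter_wave_impedance(z1_ohm, z2_ohm):
    """Characteristic impedance of a quarter wave matching section."""
    return math.sqrt(z1_ohm * z2_ohm)


# 50 ohm is the input impedance given in the text;
# 100 ohm is an assumed, illustrative impedance on the other side.
zc = quarter_wave_impedance(50.0, 100.0)
print(f"Zc = {zc:.2f} ohm")
```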
The array antenna’s top layer is designed with a similar substrate material and has a
thickness of 0.11 mm. This array is without a feed line and ground plane. Spacing
between two layers of the array is varied to get optimum bandwidth. Figure 3 shows
the top layer of the array antenna.
Fig. 2 a Array with a microstrip line feed. b Array with coaxial feed
The array antenna’s base layer is simulated with microstrip line feed. The results
obtained are a main lobe magnitude of 4 dBi, a main lobe direction of 11°, a
half-power beamwidth (angular width) of 58.7°, and a sidelobe level of −2.6 dB. A
base layer antenna with coaxial feed is also simulated. Similarly, an electromagnetic
coupled array (with both base and top layers) is simulated. These results are
compared and presented in Sect. 4. Figures 4 and 5 show the far-field directivity and
return loss curves for the electromagnetic coupled array antenna.
The fabricated antenna (base layer with microstrip line) is shown in Fig. 6.
Parameters such as return loss at the resonant frequency, bandwidth, VSWR, and
impedance are measured for the base antenna with a network analyzer. The response
shows two peaks, one at 5.5 GHz with a return loss of −21.19 dB and another at
5.09 GHz with a return loss of −21.39 dB. Bandwidth is measured as 124.9 and
203 MHz at the resonant frequency of 5.09 GHz, shown in Fig. 7. Similarly, VSWR is
1.20 at 5.5 GHz and 1.19 at 5.09 GHz. The measured impedance is 47.729 Ω.
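The reported return loss and VSWR values can be checked against each other, because both follow from the magnitude of the reflection coefficient. The sketch below uses the standard relations |Γ| = 10^(−RL/20) and VSWR = (1 + |Γ|)/(1 − |Γ|); the function name is illustrative. For the two measured return losses it gives a VSWR of about 1.19 in both cases, which agrees with the measured 1.20 and 1.19 to within about 0.01.

```python
def vswr_from_return_loss(return_loss_db):
    """Convert a return loss (sign ignored, in dB) to VSWR."""
    # Magnitude of the reflection coefficient
    gamma = 10.0 ** (-abs(return_loss_db) / 20.0)
    return (1.0 + gamma) / (1.0 - gamma)


for freq_ghz, s11_db in ((5.5, -21.19), (5.09, -21.39)):
    print(f"{freq_ghz} GHz: VSWR = {vswr_from_return_loss(s11_db):.3f}")
```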
The radiating microstrip patch elements (four elements) are etched on the antenna’s
top layer and the base layer with a coaxial feed line. The thicknesses of these two
substrates are chosen independently to optimize the distinct electrical functions of
radiation and circuitry [4]. The electromagnetically coupled array antenna, with a top
layer without feed and a base layer with coaxial feed, is shown in Fig. 8.
Table 1 compares the measured results of two different slave devices, one operating at
2.4 GHz and another at 865 MHz. The parameters compared are return loss,
bandwidth, and measured impedance. These slave results are compared with the
standard test RF survey. The master transmits −90 dB at 1 m, Slave 1 receives
−89 dB at 3 m, and Slave 2 receives −80 dB at 10 m. Table 3 shows a comparison of
measured and simulated results for the electromagnetic coupled array antenna with
various parameters.
Recently, concurrent multiband systems have become very popular [18–23]. The
proposed antenna prototype can be extended in this direction. This approach will
reduce the dimensions of the prototype and support multiple bands of operation
simultaneously.
4 Conclusions
A parametric study has been carried out for an electromagnetic coupled array antenna
for PAN devices. The proposed antenna is placed at the transmitter side with a
2.4 GHz input, and the received signal is observed for different devices at distances
from 60 cm to 10 m. The proposed antenna is also tested with a vector network
analyzer, and the following conclusions are drawn. It gives wider bandwidth as well
as reasonable return loss. When the distance between the top and base antennas is
less than 1 mm, it provides more bandwidth, increasing to 84%. When the top layer is
placed horizontally, it gives wider bandwidth than vertical placement. Bandwidth
increases from 279 MHz to 440 MHz, i.e., by approximately 57.7%, and the return
loss changes from −16 to −33 dB. After the master–slave study at two different
ranges, the received power is observed, which shows good bandwidth response for
PAN applications, and the range is observed and verified with the standard procedure
of the open RF test survey.
References
1. Pozar, D.M.: A review of aperture coupled microstrip antennas: history, operation, develop-
ment, and applications. University of Massachusetts at Amherst
2. Poduval, D., Ali, M.: Wideband aperture coupled patch array antennas high gain, low side lobe
design. Prog. Electromagnet. Res. 160, 71–87 (2017)
3. Amita, A., Ray, K.P.: Proximity fed gap-coupled half E-shaped microstrip antenna array.
Sadhana Acad. Proc. Eng. Sci. 40, 75–87 (2015)
4. Wu, K.L., Spenuk, M., Litva, J., Fang, D.G.: Theoretical and experimental study of feed network
effects on the radiation pattern of series-fed microstrip antenna arrays. IEE Proc. H Microw.,
Antennas Propag. 138, 238–242 (1991)
5. Honari, M.M., Abdipour, A., Moradi, G., Mirzavand, R., Mousavi, P.: Design and analysis of
a series-fed aperture-coupled antenna array with wideband and high-efficient characteristics.
IEEE Access 6, 22655–22663 (2018)
6. Sahu, A.K., Das, M.R.: 4×4 rectangular patch array antenna for bore sight application of
conical scan S-band tracking radar. In: 2011 IEEE Indian Antenna Week—Work Advanced
Antenna Technology IAW (2011). https://doi.org/10.1109/IndianAW.2011.6264931
7. Alam, M.M., Sonchoy, M.R., Goni, O.: Design and performance analysis of microstrip array
antenna. Prog. Electromagn. Res. Symp. Proc. 1837–1842 (2019)
8. Nataraj, A.N., Sujatha, M.N.: Analysis and design of microstrip antenna array for S-band
applications. Int. Conf. Commun. Signal Process. ICCSP 2016, 2023–2027 (2016)
9. Gunasekaran, T., Veluthambi, N., Ganeshkumar, P., Kumar, K.R.S.: Design of edge fed
microstrip patch array antenna configurations for WiMAX. In: 2013 IEEE International
Conference on Computer Computational Intelligence Research, IEEE ICCI, pp. 1–4 (2013)
10. Hadzic, H., Verzotti, W., Blazevic, Z., Skiljo, M.: 2.4 GHz microstrip patch antenna array with
suppressed sidelobes. In: 2015 23rd International Conference Software Telecommunication
Comput Networks, SoftCOM, pp. 96–100 (2015)
11. Balanis, C.A.: Antenna Theory: Analysis and Design, 3rd edn.
12. Seol, K., Choi, S.: A study on design of antenna for PAN application. In: Proceedings of the 18th
International Zurich Symposium on Electromagnetic Compatibility, EMC, vol. 4, pp. 221–223
(2007)
13. Mallahzadeh, A.R., Es’haghi, S., Alipour, A.: Design of an E-shaped MIMO antenna using
IWO algorithm for wireless application at 5.8 GHz. Prog. Electromagn. Res. 90, 187–203
(2009)
14. Aredal, J., Johansson, A.J., Tufvesson, F., Molisch, A.F.: Characterization of MIMO channels
for handheld devices in personal area networks at 5 GHz. Eur. Signal Process. Conf. (2006)
15. Sahoo, R., Vakula, D.: Gain enhancement of conformal wideband antenna with parasitic
elements and low index metamaterial for WiMAX application. AEU 105, 24–35 (2019).
https://doi.org/10.1016/j.aeue.2019.03.014
16. Farserotu, J., Hutter, A., Platbrood, F., Ayadi, J., Gerrits, J., Pollini, A.: UWB transmission and
MIMO antenna systems for nomadic users and mobile PANs. Wirel Pers Commun 22, 297–317
(2002)
17. Sipal, D., Abegaonkar, M.P., Koul, S.K.: UWB MIMO USB dongle antenna for personal area
network applications. Asia-Pacific Microw. Conf. APMC (2016)
18. Iyer, B., Pathak, N.P., Ghosh, D.: Dual-input dual-output RF sensor for indoor human occupancy
and position monitoring. IEEE Sens. J. 15(7), 3959–3966 (July 2015). https://doi.org/10.1109/
JSEN.2015.2404437
19. Iyer, B., Pathak, N.P., Ghosh, D.: Concurrent dualband patch antenna array for non-invasive
human vital sign detection application. In: 2014 IEEE Asia-Pacific Conference on Applied
Electromagnetics (APACE), Johor Bahru, 2014, pp. 150–153.
https://doi.org/10.1109/APACE.2014.7043765
20. Iyer, B., Pathak, N.P., Ghosh, D.: Reconfigurable multiband concurrent RF system for
non-invasive human vital sign detection. In: 2014 IEEE Region 10 Humanitarian Technology
Conference (R10 HTC), Chennai, 2014, pp. 111–116.
https://doi.org/10.1109/R10-HTC.2014.7026309
21. Rathod, B., Iyer, B.: Concurrent triband filtenna design for WLAN and WiMAX applications.
In: Hitendra Sarma, T., Sankar, V., Shaik, R. (eds.), Emerging Trends in Electrical, Commu-
nications, and Information Technologies. Lecture Notes in Electrical Engineering, vol. 569,
pp. 775–784. https://doi.org/10.1007/978-981-13-8942-9_66
22. Iyer, B.: Characterisation of concurrent multiband RF transceiver for WLAN applications. Adv.
Intell. Syst. Res. 834–846 (2016). https://doi.org/10.2991/iccasp-16.2017.112
23. Iyer, B., Garg, M., Pathak, N., Ghosh, D.: Contactless detection and analysis of human vital
signs using concurrent dual-band RF system. Procedia Eng. 64, 185–194 (2013)
Fractal Tree Microstrip Antenna Using
Aperture Coupled Ground
Abstract A fractal tree microstrip antenna is proposed with a non-contacting slotted
ground, a feeding arrangement known as aperture coupling. This design increases the
resonant frequency bandwidth and the gain. The proposed antenna is designed to
cover the complete UWB band with a margin of about 1 GHz on either side (for
example, 2.7–12.5 GHz). The antenna is printed on an epoxy substrate with a relative
permittivity of 4.4 and a thickness of 1.6 mm, with a fixed size of 49 × 65.7 mm. The
antenna is simulated in HFSS, fabricated in a university lab, and tested using a vector
network analyzer.
1 Introduction
The microstrip antenna (MSA) is used in various applications because of essential
characteristics such as light weight, compact size, and ease of fabrication,
reproduction, and testing. However, it suffers from low gain and narrow bandwidth.
To overcome these disadvantages, the aperture coupled technique is used in this
paper. In an aperture coupled feed tree-shaped microstrip antenna, separate dielectric
substrates are used for the feed and the patch. The essential distinction is that the two
substrates are separated by a ground plane, which contains a coupling aperture (slot)
between the feed and the patch. The proposed antenna design is shown in Fig. 1. The
arrangement of the layers and the correct choice of aperture size and position are
fundamental in controlling the antenna impedance. The
common presence of holes between the dielectric substrate layers can change the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 507
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_48
508 S. Khobragade et al.
The tree-shaped MSA is shown in Fig. 1, and the details of the proposed design are
given in Table 1. The tree-shaped MSA is designed to resonate at multiple frequency
bands using this structure. The design involves two layers of FR4 substrate, which has
a dielectric constant of 4.4 and a dielectric loss tangent (tan δ) of 0.019; each
substrate layer is 1.6 mm thick.
The design discussed in this paper is developed in three stages. The first stage uses a
simple ground with a size of 49 × 65.7 mm. The substrate is FR4 with a dielectric
constant of 4.4, a loss tangent of 0.019, and a height of 1.6 mm. The second stage
uses a ground with a mirror-image slot whose size replicates the main patch. For
proper feeding, an additional patch of size 12 mm × 6 mm is inserted. In the third
stage, the design is modified by inserting one more substrate alongside the previous
one. Both substrates have the same size, as given in Table 1.
2 Antenna Design
As per the specification provided in Table 1, the antenna is simulated in three stages.
All three stages are shown in Fig. 2. Figure 2a shows the proposed design with a
single substrate and complete ground. The ground’s size is 49 × 65.7 mm as per the
specification, and the substrate is FR4 with a height of 1.6 mm carrying the
symmetrical fractal tree structure. The tree is iterated up to the fifth stage.
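The self-similar construction of a symmetric fractal tree can be sketched recursively: each branch ends in two smaller branches turned by equal and opposite angles. The example below is only an illustration of that idea. The trunk length, branch angle, and scale factor are hypothetical values, since the paper gives the actual geometry only in Table 1, and counting the trunk as iteration zero is also an assumption.

```python
import math


def fractal_tree(x, y, length, heading_deg, iterations,
                 branch_deg=30.0, scale=0.6):
    """Return line segments ((x0, y0), (x1, y1)) of a symmetric binary tree.

    heading_deg is measured from the +y axis; branch_deg and scale are
    illustrative assumptions, not dimensions from the paper.
    """
    x1 = x + length * math.sin(math.radians(heading_deg))
    y1 = y + length * math.cos(math.radians(heading_deg))
    segments = [((x, y), (x1, y1))]
    if iterations > 0:
        # Two child branches, mirrored about the parent direction
        for turn in (-branch_deg, branch_deg):
            segments += fractal_tree(x1, y1, length * scale,
                                     heading_deg + turn, iterations - 1,
                                     branch_deg, scale)
    return segments


# Trunk plus five iterations of branching (hypothetical 20 mm trunk)
segments = fractal_tree(0.0, 0.0, 20.0, 0.0, iterations=5)
print(len(segments), "segments")
```

Because the two child branches are mirror images, the resulting layout is symmetric about the trunk axis, which is the property the text relies on when discussing the balanced radiation pattern.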
The stage-two antenna in Fig. 2 has the basic fractal tree shape with a ground
containing a mirror-image slot, which exactly replicates the patch’s main design. One
more patch is added to the design so that a smooth ground is obtained. The size of this
patch is 12 × 6 mm. The substrate used for the design is of the same size with a
thickness of 1.6 mm. Similarly, the last figure is the final design with two substrates
with the same details provided in Table 1. The design is arranged so that the middle
slot, upper substrate, and patch form an aperture coupled feed structure. All three
stage designs are discussed in the results and discussion. The detailed specification of
all three stages is discussed in the introduction and provided in Table 1.
Fig. 2 Proposed design of aperture coupling MSA using mirror slotted ground with stages 1, 2,
and 3
The proposed antenna is simulated for stage one, as discussed in the earlier section.
The results are given in Table 2.
Five resonant bands are obtained, at 7.2533, 9.68, 10.9244, 11.6089, and
12.6356 GHz. These bands show excellent VSWR and return loss (S11). The resonant
frequency of 9.68 GHz gives the best result in terms of both VSWR bandwidth and
return loss bandwidth. Figure 3 shows the directive gain of stage one. The results
show that the lobes of the directive pattern are inclined toward 30° and −30°. The
proposed design is perfectly balanced and symmetrical, and this is reflected in the
directive gain pattern.
The proposed antenna is simulated for stage two, as discussed in the earlier section.
The results are given in Table 3.
Nine resonant frequency bands are obtained in the range of 1.9–12.5 GHz, at 1.9365,
2.9666, 3.3411, 5.214, 6.8060, 8.0234, 9.1003, 10.0368, and 12.5184 GHz. The best
VSWR bandwidth is 18.9%, which is almost 3.9 times the bandwidth in stage one.
Similarly, the S11 bandwidth increases to 3.69 times that of stage one. The directive
gain of this stage is shown in Fig. 4.
The directive gain shown in Fig. 4 is distributed in all directions. The best results are
at the resonant frequencies of 8.0234 and 9.1003 GHz, where the directive gain
pattern is perfectly directive.
The proposed antenna is simulated for stage three, as discussed in the earlier section.
The results are given in Table 4. The feed is placed with two patches with R1 =
22.31 mm and R2 = 22.1266 mm, as shown in Table 1.
Seven frequency bands are obtained in the range of 2.732–12.05 GHz, at 2.7324,
3.9967, 2.514, 8.0234, 9.2876, 10.1773, and 12.0502 GHz. The VSWR bandwidth is
9.33%, which is nearly double that of stage one. Similar changes occur for the S11
bandwidth. The stage three directive gain is shown in Fig. 5.
The directive pattern for stage three shows that the lobe is distributed in all
directions. For a well-formed directive pattern, the resonant frequency of 2.7334 GHz
is the most suitable. For the remaining frequencies, the lobe is inclined according to
the geometric structure.
The current distribution for all stages is shown in Fig. 6, which shows that the
distribution is balanced in nature. In the last iteration, because of the limitation of
geometry, the current intensity is significantly lower. We are working on this
limitation.
We designed, simulated, and tested the antenna in the university lab. Testing results
are plotted in Fig. 7.
We compared both VSWR and S11 for the proposed design for simulation as well as
experimental results. Figure 7b, c show that the simulation and experimental results
are well matched in all three stages. Results are tabulated in Tables 2, 3, and 4. The
highest VSWR bandwidth is 4.85% for stage one, 18.9% for stage two, and 9.33% for
stage three. Similarly, the S11 bandwidths for the respective stages are 450 MHz,
1.66 GHz, and 930 MHz. It is concluded that stages two and three provide an increase
in VSWR and S11 bandwidth, which is verified by the testing results.
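The improvement factors quoted for stages two and three follow directly from these bandwidth figures. The short check below divides each stage's bandwidth by the stage one value; it reads the stage two S11 bandwidth as 1.66 GHz (1660 MHz), as stated in the paper's conclusion, and the dictionary and function names are illustrative.

```python
# Bandwidths reported in the text, keyed by stage number
vswr_bw_percent = {1: 4.85, 2: 18.9, 3: 9.33}
s11_bw_mhz = {1: 450.0, 2: 1660.0, 3: 930.0}


def ratio_to_stage_one(values):
    """Express every stage's value as a multiple of the stage one value."""
    return {stage: value / values[1] for stage, value in values.items()}


vswr_ratio = ratio_to_stage_one(vswr_bw_percent)
s11_ratio = ratio_to_stage_one(s11_bw_mhz)

for stage in (2, 3):
    print(f"stage {stage}: VSWR bandwidth x{vswr_ratio[stage]:.2f}, "
          f"S11 bandwidth x{s11_ratio[stage]:.2f}")
```

The VSWR ratios reproduce the 3.9 and 1.92 factors and the stage two S11 ratio reproduces the 3.69 factor given in the text.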
4 Conclusion
The most powerful characteristic of fractal geometry is self-similarity. The authors
took advantage of this property to obtain multiband behavior. The tree-shaped fractal
is one of the simplest geometries in terms of its algorithm among all the fractals
available in nature. We implemented this architecture using the aperture coupling
feeding method to enhance bandwidth. The results validate the enhancement in both
bandwidths. The proposed antenna covers the range from 2.7 to 12.05 GHz. Aperture
coupling improves the bandwidth because of its non-contacting nature. The results
show that the VSWR bandwidth increases by up to 3.9 times for the second stage and
1.92 times for the third stage. Corresponding changes occur for the S11 bandwidth,
which is 1.66 GHz for stage two and 930 MHz for stage three compared to 450 MHz
for stage one.
Fig. 7 Proposed design with VSWR and S11 Comparison for all stages
References
1. Liu, L., Lu, Q., Ghassemlooy, Z., Korolkiewicz, E.: Investigation of transformer turn ratio and
design procedure for an aperture coupled slot antenna. IET J. 61–65 (2011)
2. Feresidis, A.P., Konstantinos, K., Lancaster, M.J., Peter, S.: Waveguide fed high gain antenna at
submillimeter wave frequencies. IET J.
3. Kirov, G.S., Mihaylova, D.P.: Circularly polarized aperture coupled microstrip antenna with
resonant slot and screen smith. Radio Eng. 19(1) (2010)
4. Lai, C.H., Han, T.Y., Chen, T.R.: Broadband aperture coupled microstrip antenna with low cross
polarization and back radiation. Prog. Electromagn. Res. Lett. 5, 187–197 (2008)
5. Kumar, G., Ray, K.P: Broadband Microstrip Antennas. Antennas and Propagation, pp. 1–167.
Artech House, Boston London (2003)
6. Ansoft HFSS12.1 Simulation software
7. Satyanarayana, D.S.S., Bathula, A.: Aperture coupled microstrip antenna design and analysis
using MATLAB. Int. J. Eng. Res. Technol. (IJERT) 8(06) (2019). ISSN: 2278-0181
8. Sujatha, C.N., Murti Sarma, N.S.: Design of aperture coupled microstrip planar array. IJIREEICE
5(6) (2017). ISSN: 2321-2004 (online)
Wind Speed at Hub Height (Using
Dynamic Wind Shear) and Wind Power
Prediction
Abstract Predicting hub-height wind speed from the ground-level (10 m) wind speed
is difficult because the wind is chaotic. Several forecasters provide wind speed
forecasts, but because hub heights vary, converting these forecasts to a hub-height
wind speed is challenging. At present, much research is under way to predict wind
speed using mathematical formulae and statistics, and biologically inspired
computing has also been used to predict the wind speed at a particular height.
Weather parameters affect the accuracy and widen the error band. To address this
issue, models have been created based on the Decision Tree Regressor and Keras
Neural Network ML techniques, which use the weather parameters and ground-level
wind speed to predict the wind shear. These attributes help in predicting the wind
speed at a particular hub height at least 1.5–3 h ahead. In addition, there are two
power forecast models (Decision Tree Regressor/Keras Neural Network ML) which
take the hub-height wind speed and weather parameters as input and forecast the
power generation for the given power plant. The paper also provides brief
information about the power-law method for calculating the wind shear coefficient.
This model will help many wind power plants understand the capabilities of present
wind prediction models; it will also allow them to predict the hub-height wind speed
and power generation for their specific wind farms.
Keywords Decision regressor tree (DRT) · Keras neural network · Wind shear
coefficient · Recursive and non-recursive modeling · MAE (mean absolute error) ·
MAPE (mean absolute percentage error) · WS (wind shear)
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 519
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_49
520 R. Kumbhare et al.
1 Introduction
The wind has always been dynamic; it does not follow any pattern for a long time, so
good forecasts are unavailable, and predicting wind is therefore a challenging task.
Predicting wind in a particular region is very beneficial; it can help wind farms, the
aviation industry, etc. [1, 7, 9]. Natural phenomenon rules can determine wind speed
by describing the wind shear, but wind speed is unstable and fluctuates randomly.
Wind speeds differ across zones, latitudes, longitudes, and continents; even the same
zone or place may have different wind speeds at the same time, and wind speeds can
also differ with height and area [13]. Features such as humidity, temperature, air
density, pressure, seasonality, and various other parameters need to be checked, as
these features change the wind shear [15]. Hence, we still have significant difficulties
in forecasting wind speed, particularly at hub height.
Wind shear is a microscale, unpredictable meteorological phenomenon that is very
useful for predicting hub-height wind speed from the ground-level wind speed [2, 6].
Even after 30–40 years of research, there is no dynamic technique that solves the
forecasting problem, owing to the instability of weather phenomena and complex
terrains. Recent research in wind prediction is mostly focused on short-term wind
predictions ranging from minutes to a few days [14]. The forecast should be accurate
and up to date for wind energy production. The power plant owner needs to plan and
schedule grids to initiate power generation [8, 11, 14]. Hence, a model is required to
forecast the wind speed and power generation at least 1.5–3 h ahead.
Generally, to obtain a wind speed forecast, meteorologists use formulae for wind
extrapolation such as the Hellmann coefficient equation [12], the logarithmic
equation [16], etc. Various methods, such as machine learning models, have been
introduced to estimate wind speed using historical data, but these techniques require
different weather parameters at equal time intervals [10]. The data required are the
wind direction, atmospheric pressure, temperature, or derived parameters such as the
wind shear exponent. In most cases, however, the wind shear is taken as constant
because other weather parameters are unavailable. To predict dynamic wind shear at
each 15 min interval, we have implemented a decision tree regressor/Keras neural
network that takes some weather parameters and the forecasted wind speed at ground
level and predicts the hub-height wind speed 1.5–3 h ahead.
The calculated wind shear is multiplied by the ground-level wind speed to generate
the hub-height wind speed. This hub-height wind speed and other weather parameters
are taken as input for the power forecast model (based on Decision Tree
Regressor/Keras Neural Network), which predicts the power generation 1.5–3 h
ahead.
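The power-law (Hellmann) relation mentioned in the abstract links wind speeds at two heights through a shear exponent, and the multiplier applied to the ground-level speed is then the height ratio raised to that exponent. The sketch below illustrates this under stated assumptions: the 10 m ground level comes from the abstract, while the 80 m hub height and all wind speeds are hypothetical values chosen for the example, and the function names are not from the paper.

```python
import math


def shear_exponent(v_ground, v_hub, h_ground_m, h_hub_m):
    """Power-law shear exponent from paired speeds at two heights."""
    return math.log(v_hub / v_ground) / math.log(h_hub_m / h_ground_m)


def hub_height_speed(v_ground, alpha, h_ground_m, h_hub_m):
    """Extrapolate a ground-level speed to hub height with the power law."""
    multiplier = (h_hub_m / h_ground_m) ** alpha
    return v_ground * multiplier


# Illustrative observation: 5 m/s at 10 m and 7 m/s at an assumed 80 m hub
alpha = shear_exponent(5.0, 7.0, 10.0, 80.0)

# Apply the same exponent to a new ground-level forecast of 6 m/s
v_hub = hub_height_speed(6.0, alpha, 10.0, 80.0)
print(f"alpha = {alpha:.3f}, hub-height speed = {v_hub:.2f} m/s")
```

In the dynamic approach described here, the exponent (or the multiplier itself) would be predicted for each 15 min interval rather than held constant.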
The decision tree regressor is used because it is fast to implement and groups most of
the predictions under the same group (parent node), resulting in a similar wind shear
value. The Keras neural network is used for its accuracy, as it performs better than
other machine learning models and predicts more accurately by adjusting the weights
of the features.
The methodology presented in this work describes how the hub-height wind speed and
power generation are predicted using the dynamic wind shear, weather parameters,
and the available historical data. The Keras neural network and the decision tree
regressor are used as the machine learning models. These models use recursive and
non-recursive modeling for prediction based on recent historical data. The
comparison between constant wind shear and dynamic wind shear for wind speed and
power prediction is analyzed and shown in the results and discussion.
2 Proposed Methodology
The machine learning model is created as per the requirements for predicting wind
shear and power generation. Figure 1 gives the architecture of the proposed
methodology. It describes the exact execution of the problem addressed in this
application. First, the data is fetched and preprocessed and taken as input for the
model. The hub-height (turbine-level) wind speed and the ground-level wind speed
are used to calculate the wind shear, which is taken as the target for the training
phase. The features are the wind speed at ground level and other weather parameters,
and the same features are used for the prediction (testing) phase. Along with this,
feature engineering is also done for the power forecasting model.
Input: The input data contains the actual ground-level wind speed, hub-height
wind speed, wind gust, wind bearing, power generation, and weather parameters
such as humidity, air pressure, and temperature.
Data Preprocessing: The data coming from the forecasters is hourly, so it is
resampled into 15 min intervals. The data also contains many outliers that must be
removed, such as exponential values, infinite values, and NaN values.
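The outlier removal described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `drop_invalid_rows`, the dictionary-based row layout, and the `max_abs` limit used to catch spurious exponential values are all assumptions.

```python
import math

def drop_invalid_rows(rows, max_abs=1e6):
    """Keep only rows whose numeric values are finite and within a plausible range.

    rows: list of dicts mapping a parameter name to a float (hypothetical layout).
    max_abs: assumed magnitude limit used to reject spurious exponential values.
    """
    cleaned = []
    for row in rows:
        # math.isfinite is False for NaN and for +/- infinity.
        if all(math.isfinite(v) and abs(v) <= max_abs for v in row.values()):
            cleaned.append(row)
    return cleaned

sample = [
    {"wind_speed": 4.2, "humidity": 61.0},
    {"wind_speed": float("nan"), "humidity": 60.0},
    {"wind_speed": float("inf"), "humidity": 59.0},
    {"wind_speed": 3.1e12, "humidity": 58.0},
    {"wind_speed": 5.0, "humidity": 57.0},
]
cleaned_sample = drop_invalid_rows(sample)
```

Dropping whole rows is only one option; replacing the invalid entries by interpolated values would be an equally reasonable choice when gaps in the 15 min series must be avoided.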
Feature Engineering: Two datasets are created, the first to predict wind shear and
the second to predict wind power generation. For the wind shear features, we use the
wind speed lags, wind bearing, wind chill, pressure, humidity, wind change rate, and
wind speed day lags (if required). For the power generation features, we use the wind
speed (at hub height), wind bearing, wind chill, pressure, humidity, power lags, and
power change rate.
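As an illustration of the lag and change-rate features, the sketch below builds them from a plain list of values. The function name `add_lag_features`, the row layout, and the definition of the change rate as the difference between the two most recent values are assumptions made for this example.

```python
def add_lag_features(series, n_lags=2):
    """Build rows of lag features and a change rate from a list of values.

    For index t the row holds series[t-1] ... series[t-n_lags] and the
    change rate series[t-1] - series[t-2] (an assumed definition).
    """
    if n_lags < 2:
        raise ValueError("n_lags must be at least 2 to compute a change rate")
    rows = []
    for t in range(n_lags, len(series)):
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        change_rate = series[t - 1] - series[t - 2]
        rows.append({"lags": lags, "change_rate": change_rate, "target": series[t]})
    return rows

wind_speed = [4.0, 4.5, 5.5, 5.0, 6.0]
feature_rows = add_lag_features(wind_speed, n_lags=2)
```

The same helper could be applied to the power series to obtain power lags and the power change rate.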
Modeling: This is the primary phase, in which the prediction is made. Two decision
tree regressor/Keras neural network models are run, one for wind shear and one for
power generation. The hyperparameters of the decision tree regressor/Keras neural
network also need to be tuned.
Decision Tree Regressor: This is a machine learning model that builds a tree based
on the features and splits the dataset into subsets of similar samples. The most
important feature is selected at the top of the tree, and as the tree grows, the number
of branches and the depth increase with it. In this model, we use the standard
deviation reduction technique: first the standard deviation of the output (target) is
calculated, and then the standard deviation of the target after splitting on each
feature is estimated (known as the standard deviation for target and predictor). The
standard deviation for target and predictor is subtracted from the standard deviation
of the output. The result is known as the standard deviation reduction.
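The standard deviation reduction for one candidate split can be sketched as below. The function name `std_reduction` and the toy wind shear values are illustrative assumptions; the population standard deviation is used, and the standard deviation after the split is weighted by the size of each subset.

```python
from statistics import pstdev

def std_reduction(targets, groups):
    """Standard deviation reduction for one candidate split.

    targets: target values at the node.
    groups: list of lists, the same targets partitioned by the candidate feature.
    """
    total = len(targets)
    # Weighted standard deviation of the target after the split.
    weighted = sum(len(g) / total * pstdev(g) for g in groups if g)
    return pstdev(targets) - weighted

shear_values = [1.0, 1.0, 3.0, 3.0]
good_split = std_reduction(shear_values, [[1.0, 1.0], [3.0, 3.0]])
poor_split = std_reduction(shear_values, [[1.0, 3.0], [1.0, 3.0]])
```

The feature whose split gives the largest reduction is the one placed at the node; in the toy data the first split separates the targets perfectly and the second does not separate them at all.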
Keras Neural Network: Keras is a high-level API that runs on TensorFlow 2.0. It
provides easy abstractions and essential building blocks for developing and shipping
machine learning models with high iteration velocity. In this model, we use the
basic Keras sequential model.
Parameter Tuning: The machine learning model contains many hyperparameters
that can be adjusted to the dataset to increase accuracy. The simplest method is a
grid search, which selects parameters by evaluating the model's performance for each
combination of candidate parameter values.
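A grid search of this kind can be sketched in a few lines. The helper `grid_search`, the toy error function, and the candidate values are hypothetical; in practice the scoring function would train the model and return a validation error.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every parameter combination and return the best one.

    param_grid: dict mapping a parameter name to a list of candidate values.
    score_fn: callable taking a dict of parameters and returning an error
              to minimise (for example a validation error).
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy error surface standing in for a real validation run (hypothetical).
def toy_error(params):
    return (params["max_depth"] - 6) ** 2 + (params["min_samples_leaf"] - 4) ** 2

grid = {"max_depth": [2, 4, 6, 8], "min_samples_leaf": [1, 2, 4]}
best_params, best_score = grid_search(grid, toy_error)
```

The cost grows with the product of the list lengths, which is why a grid search is usually restricted to a few hyperparameters with a handful of values each.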
Recursive Model: In this type of modeling, the wind lags/power lags are included
in the features, and the model is executed one row at a time, treating each row as
one input of features. The predicted value is then taken as a feature for the next
iteration of the model. This type of modeling can be
beneficial for short-term forecasting.
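The feedback of predictions into the lag features can be sketched as follows. The function `recursive_forecast` and the stand-in `toy_model` are assumptions for illustration; any trained regressor that maps recent values to the next value could take the place of `toy_model`.

```python
def recursive_forecast(model, history, steps, n_lags=2):
    """Forecast `steps` values, feeding each prediction back in as a lag.

    model: callable mapping a list of the n_lags most recent values
           (most recent first) to the next value.
    history: list of observed values, oldest first.
    """
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(steps):
        lags = window[::-1]                 # most recent value first
        next_value = model(lags)
        predictions.append(next_value)
        window = window[1:] + [next_value]  # the prediction becomes a lag
    return predictions

# Stand-in model: linear extrapolation from the last two values (hypothetical).
def toy_model(lags):
    return lags[0] + (lags[0] - lags[1])

forecast = recursive_forecast(toy_model, [4.0, 5.0, 6.0], steps=3)
```

At a 15 min resolution, a 1.5–3 h horizon corresponds to 6 to 12 such steps, so prediction errors can accumulate over the horizon.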
Non-recursive Model: In this type of modeling, no lags are taken as input; the
model takes the whole dataset at once and predicts all rows. It is mostly used for
day-ahead forecasting.
Predicted Wind Shear: The output of the wind shear model is then used in the
logarithmic formula, together with the wind speed at ground level, to calculate the
hub-height wind speed.
Calculating Wind Speed: The hub-height wind speed is calculated using the
predicted wind shear and the wind speed at ground level.
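Elsewhere the chapter states that the predicted wind shear is multiplied by the ground-level wind speed to obtain the hub-height wind speed. A minimal sketch of that relationship follows. Treating the training target as the ratio of the two speeds is an assumption, the function names are hypothetical, and the logarithmic formula mentioned above is not reproduced because its exact form is not given here.

```python
def wind_shear_target(hub_speed, ground_speed):
    """Training target: ratio of hub-height to ground-level wind speed (assumed form)."""
    if ground_speed <= 0:
        raise ValueError("ground-level wind speed must be positive")
    return hub_speed / ground_speed

def hub_height_speed(predicted_shear, ground_speed):
    """Hub-height wind speed as predicted shear times ground-level wind speed."""
    return predicted_shear * ground_speed

shear_ratio = wind_shear_target(hub_speed=9.0, ground_speed=6.0)
estimate = hub_height_speed(predicted_shear=shear_ratio, ground_speed=4.0)
```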
Predicting Power Generation: It is the output coming from the power forecast
model.
Data Used: The weather data used in the model comes from an external weather
forecaster for a city in Tamil Nadu near a wind power plant. The data contains
weather parameters such as pressure, temperature, humidity, ground-level wind
speed, wind gust, wind bearing, and wind chill. The data also contains the
hub-height wind speed collected from a wind power plant located at the same place
in Tamil Nadu.
The data provided by the external weather forecaster is hourly, and the
data provided by the wind power plant is in 15 min intervals. To match the two, we
interpolate the external weather forecaster data to 15 min intervals.
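The interpolation step can be sketched as below. Linear interpolation is an assumption, since the chapter does not name the interpolation method, and the function name `interpolate_to_15_min` is hypothetical.

```python
def interpolate_to_15_min(hourly_values):
    """Linearly interpolate hourly values onto a 15 min grid.

    hourly_values: list of floats, one per hour, oldest first.
    Returns 4 * (len - 1) + 1 values, one per 15 min step.
    """
    if len(hourly_values) < 2:
        return list(hourly_values)
    result = []
    for start, end in zip(hourly_values, hourly_values[1:]):
        for quarter in range(4):
            fraction = quarter / 4
            result.append(start + fraction * (end - start))
    result.append(hourly_values[-1])
    return result

quarter_hourly = interpolate_to_15_min([4.0, 6.0, 5.0])
```

Each hourly interval contributes four points, so the interpolated series lines up with the 15 min records from the wind power plant.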
4 Conclusion
This work presents a machine learning technique that predicts the wind shear
coefficient and the power generation every 15 min for 1.5–3 h ahead, as a 1.5–3 h
ahead forecast of wind and power is essential for wind farms and aviation. The
technique uses various features, including weather parameters such as humidity,
temperature, pressure, and ground-level wind speed, as well as power lags and wind lags.
The result is a dynamic wind shear coefficient that is used to obtain the hub-height
wind speed (up to a particular range) and to predict power generation. The model has
increased accuracy compared to the power-law and the Panofsky and Dutton models. The
technique can be beneficial in terrain where the climate is dynamic and the wind
usually changes within a short period.
References
1. Albani, A., Ibrahim, M.Z., Yong, K.H.: Wind shear data at two different terrain types. Data
Brief 25, 104306 (2019)
2. Ambach D., Vetter, P.: Wind speed and power forecasting-a review and incorporating asym-
metric loss. In: 2016 Second International Symposium on Stochastic Models in Reliability
Engineering, Life Science and Operations Management (SMRLO), pp. 115–123. IEEE (2016)
3. Deshpande, P., Sharma, S.C., Peddoju, S.K. et al.: Security and service assurance issues in
Cloud environment. Int. J. Syst. Assur. Eng. Manag. 9, 194–207 (2018). doi:https://doi.org/10.
1007/s13198-016-0525-0
4. Deshpande P.S., Sharma S.C., Peddoju S.K.: Predictive and prescriptive analytics in big-data
era. In: Security and Data Storage Aspect in Cloud Computing. Studies in Big Data, vol. 52,
pp. 71–81. Springer, Singapore (2019). doi:https://doi.org/10.1007/978-981-13-6089-3_5
5. Deshpande, P., Sharma, S.C., Sateesh Kumar, P.: Security threats in cloud computing. In:
International Conference on Computing, Communication & Automation, pp. 632–636. (2015)
6. Gao, J., Zhao, Y.: Simulation research on wind shear prediction of airborne weather radar. In:
2014 International Conference on Virtual Reality and Visualization, pp. 435–438. IEEE (2014)
7. Gualtieri, G.: Atmospheric stability varying wind shear coefficients to improve wind resource
extrapolation: a temporal analysis. Renew. Energy 87, 376–390 (2016)
8. Huang, C.-J., Kuo, P.-H.: A short-term wind speed forecasting model by using artificial neural
networks with stochastic optimization for renewable energy systems. Energies 11(10), 2777
(2018)
9. Jiang, Z., Jia, Q.-S., Guan, X.: Review of wind power forecasting methods: from multi-spatial
and temporal perspective. In: 2017 36th Chinese Control Conference (CCC), pp. 10576–10583.
IEEE (2017)
10. Kulkarni, M.A., Patil, S., Rama, G.V., Sen, P.N.: Wind speed prediction using statistical
regression and neural network. J. Earth Syst. Sci. 117(4), 457–463 (2008)
11. Kumar, T.B., Sekhar, O.C., Ramamoorty, M., Rao, S.K., Rao, D.V.B.: Comparitive study on
wind forecasting models for day ahead power markets. In: 2017 IEEE International Conference
on Signal Processing, Informatics, Communication and Energy Systems (SPICES), pp. 1–5.
IEEE (2017)
12. Li, J., Wang, X., Yu, X.B.: Use of spatio-temporal calibrated wind shear model to improve
accuracy of wind resource assessment. Appl. Energy 213, 469–485 (2018)
13. Qawasmi, A., Kiwan, S.: Effect Weibull distribution parameters calculating methods on energy
output of a wind turbine: a study case. Int. J. Thermal Environ. Eng. 14(2), 163–173 (2017)
14. Singh, A., Gurtej, K., Jain, G., Nayyar, F., Tripathi, M.M.: Short term wind speed and power
forecasting in Indian and UK wind power farms. In: 2016 IEEE 7th Power India International
Conference (PIICON), pp. 1–5. IEEE (2016)
15. Tizgui, I., Bouzahir, H., El Guezar, F., Benaid, B.: Wind speed extrapolation and wind power
assessment at different heights. In: 2017 International Conference on Electrical and Information
Technologies (ICEIT), pp. 1–4. IEEE (2017)
16. Werapun, W., Tirawanichakul, Y., Waewsak, J.: Wind shear coefficients and their effect on
energy production. Energy Procedia 138, 1061–1066 (2017)
Modeling and Simulation of Microgrid
with P-Q Control of Grid-Connected
Inverter
1 Introduction
The advent of distributed generators (DGs) has revolutionized microgrids. A microgrid
consists of interconnected loads and various energy sources such as wind and solar,
operated in conjunction with the main grid to share the connected loads. This combined
operation can increase the reliability of the system [1, 2]. The overall system can be
operated in grid-connected mode, where the load is shared among the DGs and the main
grid, or in islanded mode, where the main grid is turned off and the supply is provided
by the DGs. The changeover between grid-linked and islanded mode requires a proper
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 529
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_50
530 N. U. I. Wani et al.
control scheme. When operating in grid-linked mode, the microgrid sources are used
to provide active (P) and reactive power (Q) control, and in islanded mode, the
sources are used to provide voltage (V) and frequency (f) control.
Different types of sources may be used in the microgrid, such as converter-based
sources and rotating machine-based sources. The different kinds of sources may
lead to various control problems: converter-based sources respond quickly, whereas
rotating machine-based sources may respond too slowly. In Sect. 1, an introduction
to the microgrid is provided. Section 2 describes the microgrid model and its modules.
In Sect. 3, the various operating modes of a microgrid are described, and a way of
detecting the operating mode is provided. The importance of control strategies is
discussed, and some of them are presented, in Sect. 4. Section 5 describes the
simulation model and the inverter control algorithm. The results are discussed and
analyzed in Sect. 6. Section 7 gives the conclusion of the paper.
2 Microgrid Model
of solar and wind, the storage device in the form of the fuel cell or batteries, and the
central generating unit called the utility grid all connected [2].
3 Modes of Operation
The microgrid is linked to the network at a point of common coupling (PCC). The
flow of P and Q at the PCC indicates the mode of operation. When the P and Q flow at
the PCC is zero, the power is balanced, and no power is traded between the DGs
and the network. These operating conditions are considered the most suitable and
most economical for the microgrid. Any imbalance in P and Q corresponds to trading
of power between the DGs and the network, which is not the preferred operating
condition. Figure 3 shows the different modes of operation.
In this mode of operation, the primary grid and all DGs are connected to the
microgrid. The microgrid therefore supplies and draws power according to the
generation and load demand. The primary grid maintains voltage and frequency
control [1]. The distributed generators are deactivated for supplying P and Q. The
main aim of this mode of operation is to maximize efficiency and increase the overall
utilization of renewable sources. In this mode, the microgrid keeps the grid voltage,
power factor, and bus voltage within permissible operating limits. The primary grid is
linked to the distribution system at a point known as the PCC. A microgrid in
grid-connected mode should operate in constant P-Q mode, and this is ensured only when
the inverter is governed in the continuous current approach [6].
In this mode of operation, the primary grid is disconnected; hence it is also called
isolated mode. The DGs supply the loads independently of the grid and regulate the
voltage (V) and frequency (f). In islanded mode, any increase or decrease in V and f
is handled by generator tripping and load shedding, respectively, to keep them within
the permissible operating region. The principal grid is decoupled from the
distribution network [4, 7].
4 Control Strategies
The microgrid has an advantage over other distribution networks in terms of better
controllability. Microgrid control is required mainly for:
(a) The upstream network interface, to check whether the microgrid works in
grid-linked mode or isolated mode.
(b) Supervision and security.
(c) Local protection.
Microgrid control can be operated in a centralized mode, where the main focus is
on optimizing the microgrid, or in a decentralized mode, where the main focus is on
maximizing power production and selling the additional generated power.
The control strategies in a microgrid depend on the mode of operation
[9, 10].
The primary goal of this strategy is to maintain the active and reactive power when
V and f variations occur due to changing load. The active power controller keeps the
active power constant, and the reactive power controller holds the reactive power at
the given reference [1].
The primary grid maintains V and f. In this control scheme, there are two loops: an
inner current loop and an outer power loop. The inner current loop responds to
disturbances caused by voltage variations.
In this control strategy, the three-phase voltages and currents at the grid side
are converted into rotating-frame components by the Park transformation.
Let I be the output current of the inverter, Id the d-axis component of the current,
and Iq the q-axis component of the current; then
$$I_d = \frac{P}{U} \tag{1}$$
$$I_q = \frac{Q}{U} \tag{2}$$
where
P is the reference active power.
Q is the reference reactive power.
A synchronous reference frame phase-locked loop is used to find the voltage
phase; the values of Vd and Vq are determined by a simple PI controller.
After that, the dq-abc transformation is used to find the inverter voltage in the abc domain.
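The current references of Eqs. (1) and (2) and the two frame transformations can be sketched as below. The amplitude-invariant form of the Park transformation is an assumed convention, the function names and numerical values are hypothetical, and U is taken as the voltage quantity appearing in Eqs. (1) and (2), which the text does not define further.

```python
import math

TWO_PI_3 = 2 * math.pi / 3

def current_references(p_ref, q_ref, u):
    """d- and q-axis current references from Eqs. (1) and (2)."""
    return p_ref / u, q_ref / u

def abc_to_dq(a, b, c, theta):
    """Amplitude-invariant Park transformation (assumed convention)."""
    d = (2 / 3) * (a * math.cos(theta)
                   + b * math.cos(theta - TWO_PI_3)
                   + c * math.cos(theta + TWO_PI_3))
    q = -(2 / 3) * (a * math.sin(theta)
                    + b * math.sin(theta - TWO_PI_3)
                    + c * math.sin(theta + TWO_PI_3))
    return d, q

def dq_to_abc(d, q, theta):
    """Inverse of abc_to_dq for a balanced system (zero-sequence omitted)."""
    a = d * math.cos(theta) - q * math.sin(theta)
    b = d * math.cos(theta - TWO_PI_3) - q * math.sin(theta - TWO_PI_3)
    c = d * math.cos(theta + TWO_PI_3) - q * math.sin(theta + TWO_PI_3)
    return a, b, c

theta = 0.7  # phase angle from the PLL, in radians (arbitrary test value)
phase = (math.cos(theta), math.cos(theta - TWO_PI_3), math.cos(theta + TWO_PI_3))
d_axis, q_axis = abc_to_dq(*phase, theta)
i_d_ref, i_q_ref = current_references(p_ref=10_000.0, q_ref=2_000.0, u=400.0)
```

A balanced three-phase signal aligned with the PLL angle maps to a constant d-axis value and a zero q-axis value, which is what allows simple PI controllers to be used in the rotating frame.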
The primary goal of this control strategy is to restore the voltage (V) and frequency (f)
to their nominal values regardless of active and reactive power variations. The
frequency is stabilized by the frequency controller, which acts on the active power, and
the voltage controller holds the voltage steady at the given reference. Disconnection
from the primary grid leads to active and reactive power imbalances at the load
terminal. The mismatch between generation and demand causes fluctuations of voltage
and frequency, and hence the frequency settles at a different value. Thus, to restore
the voltage and frequency, the DGs need to increase their supply. In this control
scheme, the outer loop is responsible for maintaining the voltage, while the inner
current loop regulates the current by acting as a servo. The dual-loop structure offers
a high level of dynamic precision [1, 2].
The V/f control is necessary for the smooth operation of the various sources, as it is
responsible for maintaining constant flux.
The droop characteristics may be P-Q control characteristics or V-f control
characteristics. Droop control includes active power control and voltage control.
Figure 4 shows these relations. It does not require any connections, so decentralized
control can be achieved. However, the voltage distortion is very high, and
synchronization with the primary grid is not maintained.
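The chapter does not state the droop equations. As a hedged illustration only, the sketch below uses the commonly quoted linear P-f and Q-V droop relations; the function names, droop coefficients, and set points are all hypothetical and are not taken from the simulated model.

```python
def droop_frequency(p, p_ref=0.0, f_nom=50.0, m=1e-4):
    """Linear P-f droop: frequency falls as active power rises above p_ref."""
    return f_nom - m * (p - p_ref)

def droop_voltage(q, q_ref=0.0, v_nom=400.0, n=1e-3):
    """Linear Q-V droop: voltage falls as reactive power rises above q_ref."""
    return v_nom - n * (q - q_ref)

frequency = droop_frequency(p=5_000.0)
voltage = droop_voltage(q=2_000.0)
```

Because each source computes its own set points from locally measured power, no exchange of signals between the sources is needed, which matches the decentralized character described above.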
The simulated model of the microgrid consists of two DGs, as shown in Fig. 5. The DGs
are converter based and thus require an inverter. The inverter is built from a
universal bridge. Since the topology connects the inverter directly to the PV cell,
we use the P-Q control strategy of the grid-connected inverter in the
microgrid [11–14]. In the P-Q control of the inverter, the inverter output current is
compared with the reference current. The reference current is generated by feeding the
voltage and current of the PV array to an MPPT algorithm. The currents are compared
using controllers; because tuning three separate controllers is difficult, we use the
abc-dq transformation to obtain the currents in the d-axis and q-axis. The currents are
then transformed back using the dq-abc transformation, and from these currents, we
generate the gate pulses for the inverter using a PWM generator.
5.1 PV Array
The inputs to the PV array are given by a signal that specifies the irradiance and
the temperature. A signal builder generates the signal, and the variation of
irradiance and temperature is shown in the figure. It can be seen from Fig. 6 that
the irradiance increases as the temperature increases.
The inverter is built from IGBTs. Since the topology connects the inverter directly
to the PV cell, we use the P-Q control strategy of the grid-connected inverter in the
microgrid. The RC block is used to match the load line at the PV terminal so as to draw
maximum power from the PV array. In this scheme, the terminal current and voltage
of the PV array are given to an MPPT algorithm. The current from the inverter side and
the voltage from the grid side are transformed using the Park transformation. The
currents are compared using controllers; because tuning three separate controllers is
difficult, we use the abc-dq transformation to obtain the currents in the d-axis
and q-axis. The transformed voltages and currents in the d-q frame are compared using
a PI controller. The d-q components of the voltage are combined, and by applying the
inverse Park transformation, the reference voltage Vabcref is generated. This is given
to a PWM generator to provide the necessary PWM signals to the inverter, as in Fig. 8.
The low-frequency transformer, as used in Fig. 7, eliminates the harmonics caused by
the switching of the inverter.
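The PI comparison in the d-q frame can be illustrated with a small discrete-time controller. The class name, the gains, the sampling time, and the forward-Euler integration are assumptions for this sketch and are not the settings of the simulated model.

```python
class PIController:
    """Discrete-time PI controller (forward-Euler integration)."""

    def __init__(self, kp, ki, dt):
        self.kp = kp
        self.ki = ki
        self.dt = dt
        self.integral = 0.0

    def update(self, reference, measured):
        """Return the control output for one sampling step."""
        error = reference - measured
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

# Hypothetical gains and sampling time; one controller would be used per axis.
d_axis_pi = PIController(kp=2.0, ki=10.0, dt=0.1)
first_output = d_axis_pi.update(reference=1.0, measured=0.0)
second_output = d_axis_pi.update(reference=1.0, measured=0.5)
```

In the rotating frame the references are constant in steady state, so two such controllers (d-axis and q-axis) replace the three per-phase controllers that would otherwise have to be tuned.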
[Figure: PV array characteristic curves plotted against voltage (V), including power (W) versus voltage (V), at irradiance levels of 1, 0.5, and 0.01 kW/m^2]
the load. At high irradiance, the output power of the solar PV increases, and thus
most of the load demand is met by the solar PV. The variation of the irradiance
value affects the active and reactive power at the PCC or the bus.
The simulation model with the converter-based source has been built. The
inverter has been designed, and P-Q control in the DC grid model has also been simulated.
Simulation of various control strategies and control algorithms in grid-connected
mode and islanded mode remains future work. Further, Internet of Things and
cloud-based technologies [15, 16] can also improve the utilization of the proposed
circuits.
7 Conclusion
primary source, which is responsible for supplying the main power. Thus, the
microgrid has the main grid and other DGs connected to it, which gives the microgrid
various modes of operation, such as grid-connected mode, islanded mode, and dual
mode. Each mode of operation of a microgrid has its benefits and flaws. Moreover,
the microgrid can be switched between these modes, and a proper control pattern must
be followed while switching.
References
1. Haider, S., Li, G., Wang, K.: A dual control strategy for power sharing improvement in islanded
mode of AC microgrid. Prot. Control. Mod. Power Syst. 3(10) (2018)
2. Das, D., Gurrala, G., Shenoy, U.: Transition between grid-connected mode and islanded mode
in VSI-fed microgrids. Indian Acad. Sci. 42(8), 1239–1250 (2017)
3. Vignesh, S.S., Sundaramoorthy, R.S., Megallan, A.: The combined V-F, P-Q and droop control
of PV in microgrid. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 4(III) (2016)
4. Adhikari, S., Li, F.: Coordinated V-f and P-Q control of solar photovoltaic generators with
MPPT and battery storage in microgrids. IEEE Trans. Smart Grid 5(3), 1270–1281 (2014)
5. Lasseter, R.H.: Micro grids. In: Proceedings of IEEE Power Engineering Society Winter
Meeting, vol. 1, pp. 305–308 (2002)
6. Chandorkar, M., Divan, D., Adapa, R.: Control of parallel connected inverters in standalone ac
supply systems. IEEE Trans. Ind. Appl. 29(1), 136–143 (1993)
7. Najy, W., Zeineldin, H., Woon, W.: Optimal protection coordination for micro grids with
grid-connected and islanded capability. IEEE Trans. Ind. Electron. 60(4), 1668–1677 (2013)
8. Wasynczuk, O., Anwah, N.A.: Modeling and dynamic performance of a self-commutated
photovoltaic inverter system. IEEE Trans. Energy Convers. 4, 322–328 (1989)
9. Hatziargyriou, N., Asano, H., Iravani, R., Marney, C.: Microgrids. IEEE Power Energy Mag.
5, 78–94 (2007)
10. Loix, T., Wijnhoven, T., Deconinck, G.: Protection of microgrids with a high penetration of
inverter-coupled energy sources. In: Proceedings of 2009 CIGRE/IEEE PES Joint Symposium:
Integration of Wide-Scale Renewable Resources into the Power Delivery System, July 2009
11. Bose, B., Tayal, V.K., Moulik, B.: Solar-based electric vehicle charging infrastructure with grid
integration and transient overvoltage protection. Bentham Science Publishers (2020)
12. Sahu, A.R., Bose, B., Kumar, S., Tayal, V.K.: A review of various power management schemes
in HEV. In: 2020 8th International Conference on Reliability, Infocom Technologies and
Optimization (Trends and Future Directions) (ICRITO), pp. 1296–1300. IEEE (2020)
13. Bose, B.: Modelling of microinverter and pushpull flyback converter for SPV application.
In: 2020 8th International Conference on Reliability, Infocom Technologies and Optimization
(Trends and Future Directions) (ICRITO), pp. 458–462. IEEE (2020)
14. Bose, B., Kumar, S.: Design of push-pull flyback converter interfaced with solar PV system. In:
2020 First International Conference on Power, Control and Computing Technologies (ICPC2T),
pp. 117–121. IEEE (2020)
15. Deshpande, P., Iyer, B.: Research directions in the internet of every things (IoET). In: 2017
International Conference on Computing, Communication and Automation (ICCCA), Greater
Noida, pp. 1353–1357 (2017). https://doi.org/10.1109/CCAA.2017.8230008
16. Deshpande, P., Sharma, S.C., Peddoju, S.K., Abraham, A.: Efficient multimedia data storage
in cloud environment. Inform. Int. J. Comput. Inform. 39(4), 431–442 (2015)
Smart Student Assessment System
for Online Classes Participation
Abstract During the COVID-19 pandemic, students could not attend classes
regularly in person. Online courses have come to the students’ rescue
by bringing the technology into their homes. In certain instances, students
misuse this option and attend classes merely to be counted as present. Generally,
the institution allocates certain internal marks for student attendance. In this paper,
we assess students’ participation through various parameters such as the duration
for which the student is present in class, the number of polls responded to, the number
of chats and talks, and the number of times the student raised doubts. Students’
participation is categorized into four levels: active, average, poor, and very poor.
When a student’s participation is repeatedly very poor, the student is marked absent
from that class. Student participation is assessed with an adaptive neuro-fuzzy
inference system (ANFIS) using training and test data, with satisfactory results.
1 Introduction
S. K. Nagothu (B)
RVR & JC College of Engineering, Chowdavram, Guntur 522019, Andhra Pradesh, India
e-mail: nsudheerkumar@rvrjc.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 541
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_51
course. Various parameters such as the number of forum questions read and posted,
the number of chat sessions participated in, and the number of chat messages submitted
are considered to assess students’ participation [1]. A certain weightage is
allocated to every field, and participation is recognized when the result is above a
specific value [2]. Thorough investigations have proposed methods to encourage student
participation and achieve academic excellence in large classes [3].
Research has been done to measure students’ physical presence in the classroom
using a GPS sensor [4, 5]. The physical location will be sent to the server, and it will
be checked with the student’s predefined area [6, 7]. When the student is present at
the predefined location at the scheduled time, attendance will be marked [8]. That
research concentrated only on the attendance of the student, not on participation
[9]. In this paper, we propose to measure the participation index of the
student by considering various parameters [10, 11].
Figure 9 shows the visualization of rules for input parameters to measure the output
of student assessment. The rule bar for each input can be slid from one end to the
other to observe the output parameters. Using training and testing data sets, the
system becomes robust and adaptive. After training, the probability of student
participation assessment for various input parameters can be seen in Fig. 10.
Figure 10 shows that the average testing error is low, confirming that the
current model is well suited to assessing student participation. For testing, a data set
of 108 samples is used. The data has been collected from various faculty members who
handle the subject at multiple time intervals and days. The measure of student
participation varies from 0 to 1. Students’ involvement is categorized into
four levels: active, normal, poor, and very poor. The range of data for each label
is given in Table 2. When the student’s participation is very poor, and his
participation in previous classes was poor or very poor, he will be marked absent. If
the student’s participation is poor or very poor, but his participation in previous
classes was normal or active, the student will be alerted.
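The labeling and the absent/alert rules can be sketched as follows. The thresholds follow the ranges of Table 2; how scores falling between the listed ranges (for example between 0.19 and 0.2) are handled, the reduction of "previous classes" to a single previous score, the "no action" outcome, and the function names are assumptions made for this example.

```python
def participation_category(score):
    """Map a participation score in [0, 1] to the Table 2 label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score < 0.2:
        return "very poor"
    if score < 0.5:
        return "poor"
    if score < 0.7:
        return "normal"
    return "active"

def class_action(current_score, previous_score):
    """Decide the action for one class from the current and previous labels."""
    current = participation_category(current_score)
    previous = participation_category(previous_score)
    if current == "very poor" and previous in ("poor", "very poor"):
        return "mark absent"
    if current in ("poor", "very poor") and previous in ("normal", "active"):
        return "alert"
    return "no action"
```

For instance, `class_action(0.1, 0.3)` corresponds to a very poor class after a poor one, and `class_action(0.3, 0.8)` to a poor class after an active one.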
The membership function between input and output parameters is analyzed in detail
using 3D graphs. Figures 11, 12, 13, and 14 show how student participation varies with
changes in pairs of input parameters, such as poll votes and total duration, chats
and total duration, chats and poll votes, and talks and poll votes. Figure 11 shows that
A smart and intelligent student participation assessment system has been proposed
in this paper. The proposed model concentrates not on the physical presence of
the student but on his participation. The proposed system was implemented using
ANFIS, which makes it reliable and robust. The system alerts students when their
participation is poor or very poor so that they can improve their class participation.
With the proposed model, the student becomes an active participant rather than a
passive listener, as his participation is evaluated, and marks are awarded only
when his participation is satisfactory.
Table 2 Student participation assessment ranges with a label
SL. No.   Participation assessment range   Category
1         0–0.19                           Very poor
2         0.2–0.49                         Poor
3         0.5–0.69                         Normal
4         0.7–1                            Active
References
1. Chan, A.Y.K., Chow, P.K., Cheung, K.S.: Student participation index: student assessment in
online courses. In: Liu, W., Shi, Y., Li, Q. (eds.) Advances in Web-Based Learning—ICWL
2004. ICWL 2004. Lecture Notes in Computer Science, vol. 3143. Springer, Berlin, Heidelberg
(2004). https://doi.org/10.1007/978-3-540-27859-7_58
2. Bergmark, U., Westman, S.: Student participation within teacher education: emphasising demo-
cratic values, engagement and learning for a future profession. Higher Educ. Res. Dev. 37(7),
1352–1365 (2018). https://doi.org/10.1080/07294360.2018.1484708
3. Kumaraswamy, S.: Promotion of students participation and academic achievement in large
classes: an action research report. Int. J. Instruct. 12(2), 369–382 (2019). https://doi.org/10.
29333/iji.2019.12224a
4. Nagothu, S.K., Kumar, O.P., Anitha, G.: Autonomous monitoring and attendance system
using inertial navigation system and GPRS in predefined locations. In: 2014 3rd International
Conference on Eco-friendly Computing and Communication Systems, Mangalore, pp. 261–265
(2014). https://doi.org/10.1109/Eco-friendly.2014.60
5. Nagothu, S.K., Anitha, G., Annapantula, S.: Navigation aid for people (joggers and runners)
in the unfamiliar urban environment using inertial navigation. In: 2014 Sixth International
Conference on Advanced Computing (ICOAC), Chennai, pp. 216–219 (2014). https://doi.org/
10.1109/icoac.2014.7229713
6. Nagothu, S.K., Kumar, O.P., Anitha, G.: GPS aided autonomous monitoring and attendance
system. Procedia Comput. Sci. 87, pp. 99–104 (2016). https://doi.org/10.1016/j.procs.2016.
05.133. ISSN 1877-0509
7. Nagothu, S.K.: Automated toll collection system using GPS and GPRS. In: 2016 International
Conference on Communication and Signal Processing (ICCSP), Melmaruvathur, Tamilnadu,
India, pp. 0651–0653 (2016). https://doi.org/10.1109/ICCSP.2016.7754222
8. Nagothu, S.K., Anitha, G.: INS-GPS integrated aid to partially vision impaired people using
Doppler sensor. In: 2016 3rd International Conference on Advanced Computing and Commu-
nication Systems (ICACCS), Coimbatore, pp. 1–4 (2016). https://doi.org/10.1109/ICACCS.
2016.7586386
9. Nagothu, S.K., Anitha, G.: INS-GPS enabled driving aid using Doppler sensor. In: 2015 Inter-
national Conference on Smart Sensors and Systems (IC-SSS), Bangalore, pp. 1–4 (2015).
https://doi.org/10.1109/SMARTSENS.2015.7873619
10. Nagothu, S.K., Anitha, G.: Low-cost smart watering system in multi-soil and multi-crop envi-
ronment using GPS and GPRS. In: Proceedings of the First International Conference on
Computational Intelligence and Informatics, vol. 507. Advances in Intelligent Systems and
Computing, pp. 637–643. https://doi.org/10.1007/978-981-10-2471-9_61
11. Nagothu, S.K.: Weather based smart watering system using soil sensor and GSM. In: 2016
World Conference on Futuristic Trends in Research and Innovation for Social Welfare (Startup
Conclave), Coimbatore, pp. 1–3 (2016). https://doi.org/10.1109/STARTUP.2016.7583991
Recommendation System
for Location-Based Services
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 553
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_52
554 R. Gupta et al.
users, the recommendation will include that place. In the content-based filtering
approach [6, 7], suggestions are similar to the user’s past preferences. This method
can give personalized recommendations that could be more useful to the user.
For example, if a user likes to visit temples, then his/her recommendations
will include a temple. Bao et al. [8] propose a location-based recommendation
system that uses selectivity to provide users with local recommendations based on
geo-location according to the user’s observed behavioral pattern.
There are some research works suggesting geo-fencing [3, 4, 8, 9] and tracking in
area-based recommendation programs. Huming [10] proposes a method for analyzing
user location data to find ways to use it. This kind of information is useful for building
a personalized advertising plan. Hlaing [11] has researched mobile devices being
part of user data and proposes a customized recommendation system developed on
a map showing user preferences. Babur et al. [12] proposed a data mining technique
to extract latent data patterns, which is of utmost importance when making
decisions.
As found from the literature, the existing services give recommendations primarily
based on reviews, ratings, and vicinity from the user. A static record of location
timelines rather than a dynamic record is used to track user location history. Thus,
the objective of this research was to create dynamic recommendations through our
location-based recommender. There is also a need for more personalized and user-
centric recommendations, which can be achieved by focusing on the previously clas-
sified categories’ sub-classifications. The proposed models make recommendations
that are more personalized after deeply studying patterns in user location history.
This will be useful to users moving to a new location. It will help the user to see
all the places of interest in his/her vicinity. Tagging the locations into several cate-
gories, such as banks and temples, and analyzing each category’s frequency will help
identify the user’s top places of interest and thus provide a better user experience by
drilling more in the previously classified categories (Fig. 1).
The sections to follow will discuss in detail the proposed recommendation
systems. Section 2 deals with the steps involved in personalized and generalized
location recommendation. Data processing and timestamp processing are some of the
few steps required before applying the algorithms. Sections 3 and 4 deal with the
This paper proposes two location-based recommender system models. The first
model is based on content filtering techniques, and the second model is based on
the collaborative filtering technique.
In this model, the behavioral pattern is extracted from the user’s location history,
and personalized recommendations are then provided based on the extracted patterns. The
complete methodology is as follows:
Step 1: Data Collection and pre-processing
The data related to the travel history of a user is collected for building the model.
The data is then pre-processed to remove duplicate locations and then annotate the
data into different categories like hotels, banks, temples, etc.
Step 2: Data Reduction
In this step, clustering is used to group nearby locations into a set of representative
stay points.
Step 3: Processing time stamp information
Timestamp information is processed to add new properties like weekday and period
of the day.
Step 4: Behavioral pattern extraction using Association rule mining
The user’s behavioral patterns are extracted by applying association rule mining
techniques. The rules are like the following: (day, time) -> Category.
Step 5: Knowledge base creation
The interesting association rules that satisfy the specified support and confidence
thresholds are selected to create the knowledge base.
Step 6: Providing recommendation to the user
In the last step, based on the current day and time, the model will provide
recommendations of categories in the current location.
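Steps 3 to 5 above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it counts fixed-shape rules (weekday, period) -> category directly rather than running a full Apriori search, and the period boundaries, threshold values, and the names `period_of_day` and `mine_rules` are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def period_of_day(hour):
    """Map an hour (0-23) to a coarse period label (hypothetical boundaries)."""
    if 5 <= hour < 11:
        return "Morning"
    if 11 <= hour < 15:
        return "Noon"
    if 15 <= hour < 20:
        return "Evening"
    return "Night"

def mine_rules(visits, min_support=0.25, min_confidence=0.5):
    """Return rules (weekday, period) -> category that meet both thresholds.

    visits: list of (ISO timestamp string, category) pairs.
    Each rule is (context, category, support, confidence).
    """
    records = []
    for stamp, category in visits:
        t = datetime.fromisoformat(stamp)
        context = (DAYS[t.weekday()], period_of_day(t.hour))
        records.append((context, category))
    n = len(records)
    context_counts = Counter(ctx for ctx, _ in records)
    pair_counts = Counter(records)
    rules = []
    for (ctx, category), count in pair_counts.items():
        support = count / n                      # share of all visits
        confidence = count / context_counts[ctx]  # share within this context
        if support >= min_support and confidence >= min_confidence:
            rules.append((ctx, category, support, confidence))
    # Strongest rules first: by confidence, then by support
    return sorted(rules, key=lambda r: (-r[3], -r[2]))

visits = [
    ("2021-03-04T12:30:00", "restaurant"),
    ("2021-03-11T12:10:00", "restaurant"),
    ("2021-03-18T13:00:00", "bank"),
    ("2021-03-06T18:00:00", "temple"),
]
rules = mine_rules(visits)
```

The resulting list plays the role of the knowledge base in Step 5; at recommendation time (Step 6) the current weekday and period would be looked up among the rule contexts.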
In this model, hotels are grouped into clusters using a clustering technique,
and then, based on the group to which the user belongs, the model recommends the top 5
hotels from that cluster. The complete methodology is as follows:
Step 1: Finding the optimal value of cluster number
In the first step, the optimal value of cluster number is selected based on graphical
results and plots.
Step 2: Sorting of the dataset
The dataset is sorted in descending order based on some useful properties of a hotel.
Step 3: Cluster creation and assignment to hotels
In this step, clusters of hotels are created and assigned to the hotels.
Step 4: Selection of a cluster
Based on a user’s input, an appropriate cluster is selected.
Step 5: Providing recommendation to the user
The model will provide the top 5 hotels’ recommendations in the current location
based on the selected cluster.
We have used our Google location history data for building this model. The raw
location history data, available in JSON format, is cleaned and converted to data frame
format. The duplicate latitude–longitude pairs are then removed from the data as a
part of pre-processing. Semantic annotation of the data is done using a reverse geocoding
API, which converts geo-coordinates to human-readable addresses.
In the next step, the data points are further reduced by clustering the nearby
points. The DBSCAN clustering algorithm [13] is used to convert the nearby points
to spatially representative points, or stay points. In DBSCAN, clustering is done based
on the distance between the points and the minimum cluster size. In this implementation,
a point is assigned to a cluster if the physical distance is less than 100 m, and the minimum
cluster size is 1. We have used the haversine metric [14] to calculate pointwise
great-circle distances and the ball tree algorithm to find the nearest neighbors of points
in DBSCAN. The clustering
output is shown in Fig. 2.
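The data reduction step can be illustrated with a small standard-library sketch. It relies on one observation: when the minimum cluster size is 1, every point is a core point, so DBSCAN reduces to grouping points that are connected through chains of neighbours within the distance threshold. The brute-force neighbour search below stands in for the ball tree, and the Earth radius constant, the function names, and the use of a simple coordinate mean as the representative point are assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres (assumed constant)

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def stay_points(points, eps_m=100.0):
    """Label points by eps-connected group and return one centroid per group."""
    labels = [-1] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:  # flood-fill over neighbours within eps_m
            j = stack.pop()
            for k in range(len(points)):
                if labels[k] == -1 and haversine_m(points[j], points[k]) <= eps_m:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    centroids = []
    for c in range(cluster):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroids.append((sum(p[0] for p in members) / len(members),
                          sum(p[1] for p in members) / len(members)))
    return labels, centroids

points = [(28.6047, 77.2233), (28.6050, 77.2235), (28.7000, 77.1000)]
labels, centroids = stay_points(points)
```

The quadratic neighbour search is adequate for a sketch; a spatial index such as a ball tree matters once the location history grows to many thousands of points.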
After that, the timestamp information is integrated into the data by introducing
new columns like period and weekday. The Apriori algorithm [15] is then used
to extract user-specific behavioral patterns based on time zone (period), weekday,
and location type, and a rule base is created. When latitude, longitude, timestamp
information is provided to the recommender system, it will first find the user location
types as per the association rules/patterns in the knowledge base. Then, the Google
Places API is used to suggest suitable locations based on the current latitude and
longitude information.
If a weekday is Thursday and the time zone is Noon, then the top three location type
retrieved from the Knowledge Base is
Applying the Google Places API, the recommendations provided are shown in Fig. 3.
We have used a dataset of hotels provided by GoIbibo. This is pre-built data, taken
as the most extensive dataset (over 33,344 hotels) generated by data extraction on
goibibo.com, a leading travel site from India.
The K-means clustering technique [15] has been used to assign clusters to the hotels
based on user rating and image count. The elbow method, along with the silhouette
metric [16], is used to find the optimal value for the cluster number K. In the elbow
method, the within-cluster sum of squared errors (WSS) is calculated for a range of values
of K, and the value of K at which the decrease in WSS first starts to diminish is selected.
WSS is defined as the sum of the squared errors for all the points. The distance metric used
for clustering is Euclidean distance. WSS (distortion) values for cluster counts 1–19
are shown in Fig. 4. An elbow in the curve represents the optimal value for K. In the
silhouette metric, the silhouette value measures how similar a point is to its own cluster,
and a large value indicates a successful clustering. The silhouette value ranges between
−1 and +1. Silhouette values for cluster counts 2–19 are shown in Fig. 4. A peak in the
curve represents the globally optimal value for K. As per the graphical results from
both methods, the optimal value for K (cluster count) is 4.
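The elbow computation can be sketched as follows. This is an illustrative standard-library version on toy data, not the authors' code: it uses plain Lloyd's K-means with a deterministic farthest-point initialisation (an assumption, chosen so the result is repeatable), and it picks the elbow as the K with the largest drop-off in WSS improvement, which is one simple heuristic among several. The silhouette check is omitted.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Lloyd's K-means; returns (centroids, within-cluster sum of squares)."""
    centroids = [points[0]]
    while len(centroids) < k:  # farthest-point initialisation (assumption)
        centroids.append(max(points,
                             key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[idx].append(p)
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
               for i, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    wss = sum(min(dist2(p, c) for c in centroids) for p in points)
    return centroids, wss

def elbow(wss):
    """Return the K (1-based) where the WSS improvement drops off most sharply."""
    best_k, best_gain = None, float("-inf")
    for i in range(1, len(wss) - 1):
        gain = (wss[i - 1] - wss[i]) - (wss[i] - wss[i + 1])
        if gain > best_gain:
            best_k, best_gain = i + 1, gain
    return best_k

# Toy data: three well-separated unit squares
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 1), (20, 2), (21, 1), (21, 2)]
wss_values = [kmeans(points, k)[1] for k in range(1, 6)]
best_k = elbow(wss_values)
```

On real hotel data the two features (user rating and image count) would normally be scaled before clustering, since Euclidean distance is otherwise dominated by the feature with the larger range.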
We have added the cluster number as a feature in our dataset. We have sorted the hotels
in descending order based on the image_count and hotel_star_rating
characteristics. Image count represents the number of images uploaded by the visitors
of a specific hotel. Hotel star rating means the star rating that a customer gives to
a hotel based on satisfaction with its service. Using the K-means clustering technique,
clusters of hotels are created based on the Euclidean distance metric, and each hotel
is assigned to its cluster. When latitude and longitude information is provided to
the recommender system, it first predicts the cluster number to which the user
belongs and then recommends the top 5 hotels.
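The final selection step can be sketched as a filter followed by a sort. The field names `cluster`, `image_count`, and `hotel_star_rating` follow the text; the dictionary layout, the function name `top_hotels`, and the sample rows are hypothetical, and how the cluster is predicted from the user's coordinates is not reproduced here.

```python
def top_hotels(hotels, cluster_id, n=5):
    """Return the n best hotels of one cluster, sorted in descending order
    by image_count and then hotel_star_rating."""
    members = [h for h in hotels if h["cluster"] == cluster_id]
    ranked = sorted(members,
                    key=lambda h: (h["image_count"], h["hotel_star_rating"]),
                    reverse=True)
    return ranked[:n]

# Hypothetical rows: (name, cluster, image_count, hotel_star_rating)
rows = [("A", 1, 10, 3), ("B", 1, 40, 4), ("C", 2, 99, 5), ("D", 1, 40, 5),
        ("E", 1, 5, 2), ("F", 1, 7, 1), ("G", 1, 8, 3)]
hotels = [{"name": n, "cluster": c, "image_count": i, "hotel_star_rating": s}
          for n, c, i, s in rows]
names = [h["name"] for h in top_hotels(hotels, cluster_id=1)]
```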
If latitude and longitude are given as 28.604700 and 77.223300, the recommendations
are shown in Table 1.
The above results outline the top 5 hotels that are recommended based on location-
based vicinity. The result also features the facilities, address, and geo-location
information in the form of latitude and longitude.
4 Conclusions
References
1. Sahoo, S.: Location-based personalized recommendation systems for the tourists in India. Int.
J. Res. Appl. Sci. Eng. Technol. 1167–1177 (2017)
2. Bao, J., Zheng, Y., Wilkie, D., Mokbel, M.F.: A survey on recommendations in location-based
social networks. ACM Trans. Intell. Syst. Technol. 1–30 (2013)
3. Cumbreras, M.Á. Ráez, A.M. Díaz-Galiano, M.C.: Pessimists and optimists: improving
collaborative filtering through sentiment analysis. Expert Syst. Appl. 40, 6758–6765 (2013)
4. Fenza, G., Fischetti, E., Furno, D., Loia, V.: A hybrid context aware system for tourist guidance
based on collaborative filtering. In: 2011 IEEE International Conference on Fuzzy Systems
(FUZZ-IEEE 2011), pp. 131–138. IEEE (2011)
5. Sarwar, B.: Item-based collaborative filtering recommendation algorithms. (2001)
6. Liu, S., Meng, X.: A location-based business information recommendation algorithm. Math.
Probl. Eng. 2015
7. Tung, H., Soo, V.: A personalized restaurant recommender agent for mobile e-service. In: IEEE
International Conference on e-Technology, e-Commerce and e-Service. (2004)
8. Bao, J., Zheng, Y., Mokbel, M.F.: Location-based and preference-aware recommendation using
sparse geo-social networking data. In: Proceedings of the 20th International Conference on
Advances in Geographic Information Systems, pp. 199–208 (2012)
9. Mavalankar, A., Gupta, A., Gandotra, C., Misra, R.: Hotel recommendation system (2019).
arXiv:1908.07498
10. Huming, G., Weili, L.: A hotel recommendation system based on collaborative filtering and
rankboost algorithm. In: 2010 Second International Conference on Multimedia and Information
Technology, vol. 1, pp. 317–320. IEEE (2010)
11. Hlaing, H.H., Ko, K.T.: Location-based recommender system for mobile devices on University
campus. In: Proceedings of 2015 International Conference on Future Computational Technolo-
gies (ICFCT’2015); International Conference on Advances in Chemical, Biological & Envi-
ronmental Engineering (ACBEE) and International Conference on Urban Planning, Transport
and Construction Engineering (ICUPTCE’15), p. 7. (2015)
12. Babur, I.H., Ahmad, J., Ahmad, B., Habib, M.: Analysis of dbscan clustering technique on
different datasets using weka tool. Sci. Int. 27, 5087–5090 (2015)
13. Wang, F., Franco-Penya, H.H., Kelleher, J.D., Pugh, J., Ross, R.: An analysis of the application
of simplified silhouette to the evaluation of k-means clustering validity. In: International Confer-
ence on Machine Learning and Data Mining in Pattern Recognition, pp. 291–305. Springer,
Cham (2017)
14. Swara, G.Y.: Implementation of Haversine formula and best first search method in searching
of tsunami evacuation route. In: E&ES, vol. 97, no. 1 p. 012004. (2017)
15. Yuan, C., Yang, H.: Research on K-value selection method of K-means clustering algorithm.
Multidiscip. Sci. J. 2(2), 226–235 (2019)
16. Yabing, J.: Research of an improved apriori algorithm in data mining association rules. Int. J.
Comput. Commun. Eng. 2(1), 25 (2013)
17. Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters
in large spatial databases with noise. In: Kdd, vol. 96, no. 34, pp. 226–231. (1996).
Optimal and Higher Order Sliding Mode
Control for Systems with Disturbance
Rejection
Abstract This paper presents higher order sliding mode control for a typical
unstable process to maintain the system’s stability with disturbance rejection. The
control of uncertainty and disturbance rejection is a difficult task in control engineering
applications. According to the literature, non-linear uncertain systems have been studied
by different researchers in the control engineering field. In this paper, a second-order
integral sliding mode control (SMC) surface is chosen to derive the value of the switching
surface control. The proposed controller design depends on the calculation of the poles
of the system, irrespective of whether they are stable or unstable, and gives a practical
value for the control input signal; it is implemented for the system’s nominal model. In
the optimal controller, the gain values computed from the system’s poles are used to
derive the SMC law. The proposed technique’s
significant advantages include disturbance rejection, insensitivity to variation in plant
variables, and fewer implementation issues. The simulation results show an advantage over
the designed SMC approach in stabilizing the system and its output responses.
I. S. Jadhav (B)
Department of Electronics & Telecomm Engineering, Godavari Foundation’s Godavari
College of Engineering, Jalgaon, India
G. M. Malwatkar
Department of Instrumentation Engineering, Government College of Engineering,
Jalgaon, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 563
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_53
1 Introduction
The literature shows that the SMC approach has gained more focus in the
last few decades for controlling certain and uncertain types of systems [1, 2].
Sliding mode control was first studied by Utkin [2], where the sliding approach is
achieved by changing the controller’s structure. The sliding mode control concept is
mainly an extension of the variable structure systems (VSS) based control strategy, in
which the control input is switched between two control signals. By generating a proper
VSS control signal, the system state trajectory is driven onto a selected surface in the
state space called the sliding manifold. SMC is a powerful control tool for designing a
very robust and stable system. It eliminates external disturbances that satisfy the
appropriate matching conditions [3]. SMC has been developed in different forms, such as
dynamic SMC, higher order SMC, and optimal SMC. These methods emphasize SMC’s primary
advantages and focus on accuracy, robustness, and specific performances [4, 5]. This
paper addresses an approach to designing the sliding surface using the eigenvalues of the
input state variable matrix, together with some guidelines for the tuning parameters [6].
Due to their robustness and excellent invariance properties, VSS concepts
have been applied in real time, mainly to the control of servo motors [7], robotic
manipulators [8], permanent magnet synchronous servo motors, induction motors, aircraft,
spacecraft, and flexible space structures [9]. These experimental examples confirm the
theoretical results regarding the robustness of VSS with sliding modes. However,
researchers have found that the resulting control strategy is discontinuous, and the
resulting chattering phenomenon leads to lower accuracy in control applications [10].
This problem can be overcome by replacing the discontinuous sign function in the control
input computation with a continuous control. A larger error generally occurs
due to the discontinuous function in the controller [11]. It is also observed that, in
small error regions, the system behaves as a high-gain system, just as with the
discontinuous control strategy. Hence, the high gain effect of VSS-based SMC
tolerates the uncertainties that arise from parameter variations, external
disturbances, and load changes [12]. In the literature, various VSS approaches,
such as those in [13–15], and optimal approaches, such as those in [16, 17], have been
developed for acceptable and better performance in robotics and industrial applications.
The rest of the paper is organized as follows. In Sect. 2, the problem
statement is discussed. In Sect. 3, controller design approaches in state-space
form are emphasized. Section 4 describes the stabilizing control design in the SMC
techniques with switching control and equivalent control. The stabilization concept
with a magnetic levitation (maglev) system application and concluding remarks are
included in Sects. 5 and 6.
2 Problem Formulation
$$\frac{dx(t)}{dt} = [A + \Delta A(t)]\,x(t) + [B + \Delta B(t)]\,u(t) + \delta(t) \tag{1}$$
In the above equation, $x(t) \in \mathbb{R}^{n \times 1}$ denotes the state vector, $u(t) \in \mathbb{R}$ is the system input
control signal, and $\delta(t) \in \mathbb{R}^{n}$ is the external disturbance vector. The matrices A, B, and C are
real constant matrices of proper dimensions, while $\Delta A(t)$ and $\Delta B(t)$ represent the
parametric uncertainties present in the system. Let us consider the system uncertainty
and disturbance to be unknown but bounded, along with their derivatives.
These uncertainties, along with the disturbances present in the system, fulfill the
matching condition written as
$$\frac{dx(t)}{dt} = Ax(t) + Bu(t) + d(t), \qquad y(t) = Cx(t) \tag{3}$$
where d(t) represents the disturbance that arises in the system (1). The main
objective of designing a robust controller is to reject the disturbance, track the
uncertainty in plant parameters, and stabilize the process. To achieve this objective,
SMC is combined with an optimal controller. In SMC, it is well known that the combined
controller signal u(t) is
$$u(t) = u_1(t) + u_2(t) \tag{4}$$
In Eq. (4), $u_1(t)$ is the equivalent control used to bring the state of the system
onto the sliding manifold (surface), and $u_2(t)$ is the control input that keeps the
system state on the sliding manifold once it is reached.
As per the linear transformation concept, the system disturbance and
uncertainties still fulfill the matching conditions after the transformation. Hence,
mathematically, it is written as
$$\Delta\hat{A}(t)z(t) + \Delta\hat{B}(t)u(t) + \hat{\delta}(t) = T^{-1}[\Delta A(t)x(t) + \Delta B(t)u(t) + \delta(t)] = T^{-1}Bd(t) = \hat{B}d(t) \tag{6}$$
$$\hat{A} = T^{-1}AT = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & 1 \\ a_1 & a_2 & a_3 & \cdots & a_n \end{bmatrix}; \quad \hat{B} = T^{-1}B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ b_i \end{bmatrix}; \quad \text{and } \hat{C} = CT$$
The robust optimal controller converts the trajectory tracking
problem into a regulation problem by calculating the error in the
given system. Hence, the optimal controller is useful for eliminating the effects of
system uncertainties and for disturbance rejection, provided the matching condition
is satisfied in both cases. With minimum control
input, the tracking error is minimized so that the system state $x_1(t)$ tracks a known
signal having the desired state. Let $x_d(t)$ be the desired trajectory, expressed in
the transformed domain as $z_d(t)$. In this case, the error e(t) can be represented as
follows
where $z_d^{(1)}(t), \ldots, z_d^{(n-1)}(t)$ denote the first (n − 1) derivatives of $z_d(t)$. Now write Eq. (1)
in transformed form and represent it in terms of the error coordinates of
the system
$$\frac{de(t)}{dt} = (\hat{A} + \Delta\hat{A})\,e(t) + (\hat{B} + \Delta\hat{B})\,u(t) + \delta(t) + A_d(t) \tag{8}$$
Now write the expression in terms of the system’s external disturbances and
uncertainties
where the unknown function φ(t) fulfills the matching condition, as do its time
derivatives. From Eqs. (8) and (9), with $\frac{de(t)}{dt} = \dot{e}(t)$,
The main focus is to derive the regulating control input $u_1$. Under the nominal
condition, ignoring the uncertain part, Eq. (10) becomes
K = [k1, k2, . . . , kn] is the gain matrix, whose entries are the tuning gains for
stabilization and robust performance of the system. The gain K is a function of the poles
of the system and can be related as
$$K \in \operatorname{eig}(A) \in (P_i + jQ_i) \tag{13}$$
$$K_i \in (P_i^2 + Q_i^2) \tag{14}$$
In this paper, the tuning parameter $\lambda_i \in (0.01, 0.99)$ is used for the system’s
smoothness and robustness without compromising its performance. The various tuning
parameters can be designed and calculated as
$$K_1 = \frac{P_1^2 + Q_1^2}{\lambda_1}, \quad K_2 = \frac{\sqrt{P_1^2 + Q_1^2}}{\lambda_2}, \quad K_3 \in \text{unstable poles and } K_3 = \frac{K_1}{\lambda_3}.$$
The parameters obtained using the poles’ location are the optimal values of the gains,
and these gains are used to get the desired performance of the systems.
The proposed controller with SMC applies minimum control effort to track the
uncertainty in the system. The above optimal controller strategy can be combined with
the proposed sliding-surface-based SMC as follows. Assume an integral
sliding surface s(t) written as
$$s(t) = e(t) - e_0 - \int_0^t \dot{\phi}(\tau)\,d\tau \tag{15}$$
where $\dot{\phi}(t) = \hat{A}e(t) + \hat{B}u_1(t)$ and the design parameter is selected in such a
fashion that the inverse of matrix $\hat{B}$ is non-singular. Let $e_0$ be the initial error,
which is constant; therefore
$$\dot{s}(t) = \dot{e}(t) - \dot{\phi}(t). \tag{16}$$
It can be represented as
$$\dot{s}(t) = \hat{A}e(t) + \hat{B}(u_1(t) + u_2(t)) + \hat{B}\phi(t) - \hat{A}e(t) - \hat{B}u_1(t) \tag{17}$$
or
$$\dot{s}(t) = \hat{B}u_2(t) + \hat{B}\phi(t). \tag{18}$$
In this integral sliding mode control (ISMC), the reaching phase is
removed, and the system states reach the sliding manifold within a short time
interval. In ISMC, the controller design based on achieving the reaching condition
is written as
where the constant is chosen as ρ > 0 and sgn(s(t)) = 1, −1, 0 for s(t) >
0, s(t) < 0, s(t) = 0, respectively. So, from the above three equations,
The above equation shows that, due to the sign function, the switching control input
$u_2(t)$ becomes oscillatory. This is known as the controller’s
chattering. To remove this chattering effect in ISMC, a second-order
sliding manifold must be designed. Two steps are required to design the second-order
SMC manifold, as follows.
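Before turning to the second-order manifold, the chattering described above can be reproduced numerically. The sketch below is a scalar illustration under stated assumptions: it takes the sliding dynamics in the form $\dot{s} = \hat{B}u_2 + \hat{B}\phi$ with $\hat{B} = 1$, assumes the common switching form $u_2 = -\rho\,\mathrm{sgn}(s)$ (the source equation for $u_2$ is not legible in this copy), and uses an arbitrary bounded disturbance $\phi(t) = 0.5\sin t$ with $\rho = 1$. Forward Euler integration is used for simplicity.

```python
import math

def sgn(v):
    """Sign function returning 1, -1, or 0."""
    return (v > 0) - (v < 0)

def simulate_reaching(s0=1.0, rho=1.0, b=1.0, dt=0.001, t_end=5.0):
    """Euler-integrate s' = b*u2 + b*phi(t) with u2 = -rho*sgn(s).

    Returns the history of s and the number of times u2 changed value.
    """
    steps = int(round(t_end / dt))
    s, history, switches, prev_u = s0, [], 0, None
    for k in range(steps):
        t = k * dt
        phi = 0.5 * math.sin(t)   # assumed bounded disturbance, |phi| < rho
        u2 = -rho * sgn(s)        # discontinuous switching control
        if prev_u is not None and u2 != prev_u:
            switches += 1
        prev_u = u2
        s += dt * b * (u2 + phi)
        history.append(s)
    return history, switches

history, switches = simulate_reaching()
tail = history[3000:]  # samples after t = 3 s, well past the reaching time
```

Once s reaches zero it stays inside a band whose width is proportional to the step size, while $u_2$ keeps flipping between $+\rho$ and $-\rho$; that rapid switching is the chattering that the second-order design is meant to remove.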
$$s(t) = e(t) - \int_0^t \dot{\phi}(\tau)\,d\tau \tag{21}$$
and it is written as
$$\dot{s}(t) = \hat{B}u_2(t) + \hat{B}\phi(t). \tag{22}$$
The main advantage is that the design of the above sliding surface does not require
initial conditions. Hence, bringing all system states onto the sliding surface
s(t) requires a non-singular terminal sliding surface. In this case, the non-singular
terminal sliding surface is written as
$$\delta > 0 \tag{24}$$
and the terms α, β are chosen to fulfill the following condition:
$$\alpha, \beta \in [2n + 1] \tag{25}$$
The second-order SMC combines the linear sliding surface with the non-linear terminal
sliding manifold σ(t). The constant plus proportional reaching law can be defined
as
To demonstrate the effectiveness of the proposed controller and compare it with that of
Das and Mahanta [3], simulation studies are conducted for vertical displacement tracking
of the maglev model. MathWorks MATLAB R2019b and Simulink are used to
implement the sliding mode controller for the magnetic levitation (maglev) system.
The position control of a maglev vehicle model is extensively studied
in [3, 16]. The displacement in the vertical direction of the maglev system is shown in Fig. 1.
The aim of this simulation is to control the position of the ball in the vertical direction x(t)
$$\frac{dx(t)}{dt} = [A + \Delta A(t)]\,x(t) + [B + \Delta B(t)]\,u(t) \tag{28}$$
where
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 57000 & 1938 & -16 \end{bmatrix} \tag{30}$$
$$\Delta A(t) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 57000L_r(t) & 1624L_r(t) & 16L_r(t) \end{bmatrix} \tag{31}$$
and
$$x(t) = \begin{bmatrix} x_1(t) \\ \dot{x}_1(t) \\ \ddot{x}_1(t) \end{bmatrix} \tag{32}$$
with
$$B = \begin{bmatrix} 0 \\ 0 \\ 14.25 \end{bmatrix} \tag{33}$$
$$\Delta B(t) = \begin{bmatrix} 0 \\ 0 \\ 14.25L_r(t) \end{bmatrix} \tag{34}$$
$$C = [1, 0, 0] \tag{35}$$
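Because the proposed gains are tied to the poles of the system, it is useful to see the poles of the nominal matrix A in Eq. (30). The sketch below is an illustration, not the authors' procedure: since A is in companion form, its characteristic polynomial is $s^3 + 16s^2 - 1938s - 57000$, and the code finds the real root by bisection and the remaining pair by deflation. The bracketing interval and the function names are assumptions.

```python
import cmath

def char_poly(s):
    """det(sI - A) for the companion matrix A with last row [57000, 1938, -16]."""
    return s**3 + 16 * s**2 - 1938 * s - 57000

def real_root(lo=0.0, hi=100.0, tol=1e-10):
    """Bisection on [lo, hi], where char_poly changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if char_poly(lo) * char_poly(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def poles():
    """Return the real pole and the two remaining poles of A."""
    r = real_root()
    # Deflation: s^3 + 16 s^2 - 1938 s - 57000 = (s - r)(s^2 + p s + q)
    p = 16 + r
    q = 57000 / r
    disc = cmath.sqrt(p * p - 4 * q)
    return r, (-p + disc) / 2, (-p - disc) / 2

r, p1, p2 = poles()
```

The real pole is positive (about 48.39), which is the unstable pole, and the other two form a stable complex pair $P \pm jQ$ with $P^2 + Q^2 = 57000/r$, roughly 1177.9. The λ values used by the authors to scale these quantities into the gains are not stated in this copy, so the gains themselves are not reproduced here.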
and $L_r(t) = 0.5\sin t$ is the uncertainty considered. The problem is to track $x_1(t)$, which
is expected to follow $x_d(t)$, that is
$$s(t) = e(t) - e_0 - \int_0^t \dot{\phi}(\tau)\,d\tau \tag{36}$$
In this simulation, $x_d(t) = \cos(t)$ is the desired trajectory. The proposed method
is applied with this trajectory and results are obtained. During the simulation, the
required parameters are selected as α = 7, β = 5, σ = 0.25, η = 29300, = 0.3. The
gain matrix is tuned using the proposed guideline, giving
$K = [2.945 \times 10^4, 171.604, 6.9563]$. The results are shown in Figs. 2 and 3, and it is clear that
the proposed method tracks the desired position, while the method given by Das
and Mahanta [3] produces an offset. Systems like maglev are very sensitive to
disturbances; therefore, any offset in tracking leads to instability of the system. The
proposed method appears to be effective, as any change in the desired position is
precisely tracked. The precise tracking is due to the optimal control law
designed using stable and unstable poles.
[Plot: vertical position (mm) versus time (sec.), 0–10; curves: Desired, Actual-SO-SMC, Actual-Proposed]
[Plot: control input (N-m, ×10^4) versus time (sec.), 0–10; curves: SO-SMC, Proposed]
Fig. 3 Case 1: Total control signal of maglev. SO-SMC = second-order SMC [3]
[Plot: vertical position (mm) versus time (sec.), 0–10; curves: Desired, Actual-SO-SMC, Actual-Proposed]
[Plot: control input (N-m, ×10^4) versus time (sec.), 0–10; curves: SO-SMC, Proposed]
Fig. 5 Case 2: Total control signal of maglev. SO-SMC = second-order SMC [3]
6 Conclusions
References
1. Utkin, V.I.: Variable structure systems with sliding modes. IEEE Trans. Autom. Control 22(2),
212–222 (1977)
2. Utkin, V.I.: Sliding mode control design principles and applications to electric drives. IEEE
Trans. Ind. Electron. 40(1), 23–36 (1993)
3. Das, M., Mahanta, C.: Optimal second order sliding mode control for linear uncertain systems.
ISA Trans. 53(6), 1807–1815 (2014)
4. Emel’yanov, S.V.: Variable-Structure Control Systems. Nauka, Moscow (1967)
5. Khandekar, A.A., Malwatkar, G.M., Kumbhar, S.A., Patre, B.M.: Continuous and discrete
sliding mode control for systems with parametric uncertainty using delay ahead prediction. In:
Twelfth IEEE Workshop on Variable Structure Systems. Mumbai, India (2012)
6. Gao, W.: Variable structure control of non-linear systems: a new approach. IEEE Trans. Indus.
Electron. 40 (1993)
7. Wai, R.J., Lin, F.J.: Adaptive recurrent neural network control for linear induction motor. IEEE
Trans. Aerosp. Electron. Syst. 37(4) (2001)
8. Slotine, J.J., Sastry, S.S.: Tracking control of non-linear systems using sliding surfaces with
applications to robot manipulators. Int. J. Control 38, 465–492 (1983)
9. Takahashi, I., Koganezawa, T., Su, G., Ohyama, K.: A super high speed PM motor drive system
by a quasi-current source inverter. IEEE Trans. Ind. Appl. 30, 683–690 (1994)
10. Jezernik, K., Curk, B., Harnik, J.: Discrete-time chattering free sliding mode control. In: Pro-
ceedings of the Workshop on Robust Control via Variable Structure & Lyapunov Techniques,
pp. 319–324. Benevento (1994)
11. Roh, Y.H., Oh, J.H.: Sliding mode control with uncertainty adaptation for uncertain input-delay
systems. Int. J. Control (2000)
12. Bianchi, N., Bolognani, S., Jang, J.H., Sul, S.K.: Comparison of PM motor structures and sen-
sorless control techniques for zero-speed rotor position detection. IEEE Trans. Power Electron.
22, 2466–2475 (2007)
13. Gao, W.B., Wang, Y., Homaifa, A.: Discrete-time variable structure control systems. IEEE
Trans. Ind. Electron. 42(2), 117–122 (1995)
14. Liu, Z.Z., Chen, W., Lu, J., Wang, H., Wang, J.: Formation control of mobile robots using
distributed controller with sampled-data and communication delays. IEEE Trans. Control Syst.
Technol. 24(6), 2125–2132 (2016)
15. Ding, S., Park, J.H., Chen, C.-C.: Second-order sliding mode controller design with output
constraint. Automatica 112, 108704 (2020). ISSN: 0005-1098 (1995)
16. Shieh, N., Liang, K., Mao, C.: Robust output tracking control of an uncertain linear system via
a modified optimal linear-quadratic method. J. Optim. Theory 117(3), 649–59 (2003)
17. Malwatkar, G.M., Khandekar, A.A., Nikam, S.D.: PID controllers for higher order systems
based on maximum sensitivity function. In: 3rd International Conference on Electronics, vol.
1, pp. 259–263 (2011)
Synchronization and Secure
Communication of Chaotic Systems
Ajit K. Singh
1 Introduction
A. K. Singh (B)
Department of Mathematics, Amity University Maharashtra, Mumbai 410206, India
e-mail: ajit.brs@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 575
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_54
matter, a chaotic system is studied under drive-response systems. The main aim is
to synchronize the response system to the drive system by adding a controller with
a signal [4, 5].
The rest of the article is organized in the following way. A literature review is
presented in Sect. 2. In Sect. 3, the proposed methodology deals with the mathemat-
ical model and synchronization of the model in general. It also includes the circuit
description as well as application of synchronization to secure communication. Syn-
chronization of Chua’s circuit and numerical simulation is described under the results
and discussion in Sect. 4. Finally, conclusions are drawn in Sect. 5.
2 Literature Review
$$\text{Transmitter:} \quad \frac{dX_1(t)}{dt} = BX_1(t) + f_1(X_1(t)), \tag{1}$$
and a chaotic response system as
$$\text{Receiver:} \quad \frac{dX_2(t)}{dt} = CX_2(t) + f_2(X_2(t)) + u(t), \tag{2}$$
Controller: u(t),
3.2 Synchronization
$$\frac{de(t)}{dt} = Ce(t) + F(X_1(t), X_2(t)) + u(t), \tag{3}$$
where $F(X_1(t), X_2(t)) = f_2(X_2(t)) - f_1(X_1(t)) + (C - B)X_1(t)$. To stabilize
the error system (3), an appropriate control function is chosen by the synchronization
method.
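The form of the error system (3) follows from Eqs. (1) and (2) in one step. The short derivation below assumes the error is defined as $e(t) = X_2(t) - X_1(t)$ (response minus drive); that definition is not legible in this copy, but it is the one consistent with the stated expression for F.

```latex
\begin{aligned}
\frac{de(t)}{dt} &= \frac{dX_2(t)}{dt} - \frac{dX_1(t)}{dt} \\
  &= C X_2(t) + f_2(X_2(t)) + u(t) - B X_1(t) - f_1(X_1(t)) \\
  &= C\,[X_2(t) - X_1(t)] + (C - B)X_1(t) + f_2(X_2(t)) - f_1(X_1(t)) + u(t) \\
  &= C e(t) + F(X_1(t), X_2(t)) + u(t).
\end{aligned}
```

The third line adds and subtracts $C X_1(t)$, which is what produces the $(C - B)X_1(t)$ term inside F.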
Both transmitter and receiver are uniform circuits involving the integrator-formed
second-order R.C. resonance loop, a correctness, a complete OR gate, and the input to
external source and buffer to keep away from overloading the XOR gate. If someone
is willing to put into code a random message, then it is the order of square impulse of
interval or even additional complicated signal. The transmitter and receiver system
is displayed in Fig. 1.
The entire structure’s reliability depends on the frameworks and chaotic systems
applied to transmitter and receiver systems. The formation of chaotic systems could
be altered by filling in distinct circuits to attain military exercises’ tremendous secu-
4.1 Synchronization
Chua’s circuit [19, 20] is made up of two capacitors, one inductor, one
piecewise-linear nonlinear resistor, and one linear resistor. The mathematical
description of the circuit is given as
$$\begin{aligned} \frac{dx}{dt} &= \alpha\,(y - x - f(x)) \\ \frac{dy}{dt} &= x - y + z \\ \frac{dz}{dt} &= -\beta y \end{aligned} \tag{4}$$
where $f(x) = bx + 0.5\,(a - b)\,[|x + 1| - |x - 1|]$. The voltages across the two
capacitors are represented by the variables x and y, and the current through the inductor
is represented by the variable z. For the parameter values a = −8/7, b = −5/7, α = 9,
and β = 100/7, system (4) exhibits chaotic behavior.
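System (4) translates directly into code. The sketch below is a minimal standard-library version of the vector field with the parameter values quoted above; the function and constant names are chosen here for illustration. As a sanity check it evaluates the field at the point (1.5, 0, −1.5), which solving $f(x) = -x$ in the outer region $x > 1$ gives as an equilibrium, $x = (b - a)/(b + 1) = 1.5$.

```python
ALPHA, BETA = 9.0, 100.0 / 7.0
A_SLOPE, B_SLOPE = -8.0 / 7.0, -5.0 / 7.0  # parameters a and b of f(x)

def f(x):
    """Piecewise-linear characteristic of the nonlinear resistor."""
    return B_SLOPE * x + 0.5 * (A_SLOPE - B_SLOPE) * (abs(x + 1) - abs(x - 1))

def chua(state):
    """Right-hand side of system (4) for state = (x, y, z)."""
    x, y, z = state
    return (ALPHA * (y - x - f(x)), x - y + z, -BETA * y)

equilibrium = (1.5, 0.0, -1.5)
derivative_at_equilibrium = chua(equilibrium)
```

By the odd symmetry of f, the mirrored point (−1.5, 0, 1.5) and the origin are equilibria as well.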
Consider two Chua’s circuit chaotic systems: a drive system, whose three state
variables are denoted by the subscript d, and a response system with the same equations,
denoted by the subscript r. The initial condition of the drive system, $x_d(0) = 0.5$,
$y_d(0) = 0.5$, and $z_d(0) = -0.5$, is different from the initial condition of the response
system, $x_r(0) = 1$, $y_r(0) = 1$, $z_r(0) = -0.2$. The two Chua’s circuit systems are
represented, respectively, by the equations
$$\begin{aligned} \frac{dx_d}{dt} &= \alpha\,(y_d - x_d - f(x_d)) \\ \frac{dy_d}{dt} &= x_d - y_d + z_d \\ \frac{dz_d}{dt} &= -\beta y_d \\ f(x_d) &= bx_d + 0.5\,(a - b)\,[|x_d + 1| - |x_d - 1|] \end{aligned} \tag{5}$$
and
$$\begin{aligned} \frac{dx_r}{dt} &= \alpha\,(y_r - x_r - f(x_r)) + u_1 \\ \frac{dy_r}{dt} &= x_r - y_r + z_r + u_2 \\ \frac{dz_r}{dt} &= -\beta y_r + u_3 \\ f(x_r) &= bx_r + 0.5\,(a - b)\,[|x_r + 1| - |x_r - 1|]. \end{aligned} \tag{6}$$
Three control functions u 1 , u 2 , and u 3 in system (6) are inserted. Phase portraits of
the chaotic system are depicted in Fig. 2.
The chaotic systems are solved using the fourth-order Runge-Kutta method with a
time step size of 0.001. The system parameters are taken as a = −8/7, b = −5/7,
α = 9, and β = 100/7 in the simulations. As a result, the Chua’s circuit
system shows chaotic behavior in the absence of control functions. The initial condition
of the drive system is x_d(0) = 1, y_d(0) = 4, z_d(0) = −4, and that of the response
system is x_r(0) = 4, y_r(0) = 1, z_r(0) = 4. Hence, the initial error is e_1(0) = −3,
e_2(0) = 3, e_3(0) = −8. Figure 3 shows that the Chua’s circuit systems are
asymptotically synchronized. The error plot of the drive and response systems is presented
in Fig. 4.
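The simulation setup described above can be sketched in a few lines of Python. The control functions are not defined in this section, so the sketch assumes a simple active control law (u_1, u_2, u_3 below) chosen only so that the error e = drive − response obeys the linear equations de_1/dt = −α e_1, de_2/dt = −e_2, de_3/dt = −e_3; it is an illustration and not necessarily the controller used by the author. The function names are hypothetical.

```python
# Parameters and initial conditions as stated in the text.
ALPHA, BETA = 9.0, 100.0 / 7.0
A, B = -8.0 / 7.0, -5.0 / 7.0


def f(x):
    """Piecewise-linear characteristic of Chua's nonlinear resistor."""
    return B * x + 0.5 * (A - B) * (abs(x + 1.0) - abs(x - 1.0))


def rhs(s):
    """Right-hand side of the coupled drive (5) and response (6) systems."""
    xd, yd, zd, xr, yr, zr = s
    e1, e2, e3 = xd - xr, yd - yr, zd - zr
    # ASSUMED active control law (not taken from the paper): cancels the
    # coupling terms so that each error component decays exponentially.
    u1 = ALPHA * (e2 - (f(xd) - f(xr)))
    u2 = e1 + e3
    u3 = -BETA * e2 + e3
    return (
        ALPHA * (yd - xd - f(xd)),
        xd - yd + zd,
        -BETA * yd,
        ALPHA * (yr - xr - f(xr)) + u1,
        xr - yr + zr + u2,
        -BETA * yr + u3,
    )


def rk4_step(s, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(s)
    k2 = rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k1)))
    k3 = rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k2)))
    k4 = rhs(tuple(si + h * ki for si, ki in zip(s, k3)))
    return tuple(
        si + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for si, p, q, r, w in zip(s, k1, k2, k3, k4)
    )


def sync_errors(drive0, response0, t_end=10.0, h=0.001):
    """Return the error vector (drive - response) at t = 0 and at t = t_end."""
    s = tuple(drive0) + tuple(response0)
    e_start = tuple(d - r for d, r in zip(s[:3], s[3:]))
    for _ in range(int(round(t_end / h))):
        s = rk4_step(s, h)
    e_end = tuple(d - r for d, r in zip(s[:3], s[3:]))
    return e_start, e_end


if __name__ == "__main__":
    e0, e_final = sync_errors((1.0, 4.0, -4.0), (4.0, 1.0, 4.0))
    print("initial error:", e0)
    print("error at t = 10:", e_final)
```

With the drive and response initial conditions from the text, the initial error printed by the script is (−3, 3, −8), and the final error should be close to zero under the assumed control law.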
5 Conclusions
[Figs. 2 to 4, plots not recoverable from the extracted text: phase portraits in x(t), y(t), z(t) and time series of x(t), y(t), z(t) versus t; drive and response states x1, x2; y1, y2; z1, z2 versus t; and the error plot versus t.]
Acknowledgements The author, Dr. A. K. Singh, extends his gratitude to his Ph.D. thesis
supervisor, Prof. S. Das, Department of Mathematical Sciences, Indian Institute of Technology
(BHU), Varanasi-221005, India, for his continuous guidance.
References
1. Runzi, L., Yinglan, W.: Finite-time stochastic combination synchronization of three different
chaotic systems and its application in secure communication. Chaos: An Interdisc. J. Nonlinear
Sci. 22(2), 023109 (2012)
2. Singh, A.K., Yadav, V.K., Das, S.: Synchronization of time-delay chaotic systems with uncer-
tainties and external disturbances. Discontin. Nonlinear. Complex. 8(1), 13–21 (2019)
3. Miliou, A.N., Antoniades, I.P., Stavrinides, S.G., Anagnostopoulos, A.N.: Secure communi-
cation by chaotic synchronization: Robustness under noisy conditions. Nonlinear Anal. Real
World Appl. 8(3), 1003–1012 (2007)
4. Singh, A.K., Yadav, V.K., Das, S.: Dual combination synchronization of the fractional order
complex chaotic systems. J. Comput. Nonlinear Dyn. 12(1), 011017 (2017)
5. Dasgupta, T., Paral, P., Bhattacharya, S.: Fractional order sliding mode control based chaos
synchronization and secure communication. In: 2015 International Conference on Computer
Communication and Informatics (ICCCI). pp. 1–6. IEEE (2015)
6. Martínez-Guerra, R., García, J.J.M., Prieto, S.M.D.: Secure communications via synchroniza-
tion of Liouvillian chaotic systems. J. Franklin Inst. 353(17), 4384–4399 (2016)
7. Singh, A.K., Yadav, V.K., Das, S.: Synchronization between fractional order complex chaotic
systems with uncertainty. Optik 133, 98–107 (2017)
8. Naderi, B., Kheiri, H.: Exponential synchronization of chaotic system and application in secure
communication. Optik 127(5), 2407–2412 (2016)
9. Kwon, O., Park, J.H., Lee, S.: Secure communication based on chaotic synchronization via
interval time-varying delay feedback control. Nonlinear Dyn. 63(1–2), 239–252 (2011)
10. Yang, T., Chua, L.O.: Impulsive stabilization for control and synchronization of chaotic sys-
tems: theory and application to secure communication. IEEE Trans. Circuits Syst. I: Fundam.
Theory Appl. 44(10), 976–988 (1997)
11. Gao, X., Hu, H.: Adaptive-impulsive synchronization and parameters estimation of chaotic
systems with unknown parameters by using discontinuous drive signals. Appl. Math. Model.
39(14), 3980–3989 (2015)
12. Yang, J., Chen, Y., Zhu, F.: Associated observer-based synchronization for uncertain chaotic
systems subject to channel noise and chaos-based secure communication. Neurocomputing
167, 587–595 (2015)
13. Singh, A.K., Yadav, V.K., Das, S.: Synchronization between fractional order complex chaotic
systems. Int. J. Dyn. Control 5(3), 756–770 (2017)
14. Al-Hussaibi, W.: Effect of filtering on the synchronization and performance of chaos-based
secure communication over rayleigh fading channel. Commun. Nonlinear Sci. Numer. Simul.
26(1–3), 87–97 (2015)
15. Singh, A.K., Yadav, V.K., Das, S.: Nonlinear control technique for dual combination synchro-
nization of complex chaotic systems. J. Appl. Nonlinear Dyn. 8(2), 261–277 (2019)
16. Tsimring, L.S., Sushchik, M.M.: Multiplexing chaotic signals using synchronization. Phys.
Lett. A 213(3–4), 155–166 (1996)
17. Martinez-Guerra, R., Yu, W.: Chaotic synchronization and secure communication via sliding-
mode observer. Int. J. Bifurcat. Chaos 18(01), 235–243 (2008)
18. Lian, K.Y., Chiang, T.S., Chiu, C.S., Liu, P.: Synthesis of fuzzy model-based designs to syn-
chronization and secure communications for chaotic systems. IEEE Trans. Syst. Man, Cybern.
Part B (Cybernetics) 31(1), 66–83 (2001)
19. Chua, L.O., Itoh, M., Kocarev, L., Eckert, K.: Chaos synchronization in Chua’s circuit. J.
Circuits, Syst. Comput. 3(01), 93–108 (1993)
20. Murali, K., Lakshmanan, M.: Chaotic dynamics of the driven Chua’s circuit. IEEE Trans.
Circuits Syst. I: Fundam. Theory Appl. 40(11), 836–840 (1993)
Improvement in Ranking Relevancy
of Retrieved Results from Google Search
Using Feature Score Computation
Algorithm
Abstract A higher position in search engine results directly and positively affects
the number of visitors to a website. Search engine optimization (SEO) has become a
growing business that attempts to improve websites’ rankings. Sometimes, search
engine results contain undeserving websites at top ranks because SEO techniques
were used in an unethical way. Such techniques mislead the search engine and thereby
raise the page rank of unfit websites. These results degrade the performance of
search engines and frustrate users. Irrelevant pages must be moved down in the
search results to improve search engine quality. This paper analyzes Google results
and proposes a novel approach to move down top-ranking irrelevant Google search
results. A ‘Feature Score Computation’ algorithm is presented that computes scores
based on features found in pages; using these scores, the pages are re-ranked to move
down irrelevant results and lift the relevant ones. The relevancy of the corpus results
was 88%, and after applying the algorithm, it improved to 99%. This work improved
the ranking of relevant results efficiently.
S. Borse
S.S.V.P.S’s L.K. Dr. P.R. Ghogrey Science College, Dhule, India
B. V. Pawar
School of Computer Sciences, North Maharashtra University, Jalgaon, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
B. Iyer et al. (eds.), Applied Information Processing Systems, Advances in Intelligent
Systems and Computing 1354, https://doi.org/10.1007/978-981-16-2008-9_55
1 Introduction
The motivation behind designing a search engine is to return relevant search results
to a user. Generally, the user enters a query into a search engine and expects a list of
the most relevant websites. To determine which pages are most appropriate, search
engines match the search keywords against their database, select exact or partial
keyword matches, and display the search results in ranked order. Users are
interested in top-ranking results. If a website gets a place among the top-ranking
results, it can gain more visitors, and more visitors mean business growth. Website
owners compete for higher rankings in search engines so that more people will visit
the website, and more revenue will be generated. That is the reason why search
engines keep their ranking algorithms secret. To promote a website to the top of
the search results, website owners use other means. The most prominent method of
promoting a website in the search engine result list is using Search Engine
Optimization (SEO) techniques [1]. SEO is the process of increasing visitors to the website
from the search engine’s listing for selected keywords [2]. SEO engineers try
to find innovative ways to control search rankings. For this purpose, they use ethical
(white hat SEO) tactics or sometimes even unethical (black hat SEO) techniques.
Manipulation using unethical SEO tactics can be done with irrelevant content on a
page, excessive and unnecessary links, redirection, cloaking, click fraud, and
manipulation of tag contents. Search engines emphasize several top-ranking
factors, and web page features are among them. Specific properties of a web page that are
used to mislead a search engine into pushing undeserving sites to the top of the result page
are called features. They include the number of links pointing to other pages, the frequency
or location of keywords, and the presence of keywords in the title tag, meta description
tag, H1 tag, anchor text, etc. [3]. Such manipulated sites do not benefit the user but
instead lead to the problem of web spam.
Although search engines continuously invest a lot of money and effort to
fight against this, search results still contain irrelevant links. These links waste user
time, lower the quality of search results, and unnecessarily increase the load of
traffic.
It is therefore necessary to find irrelevant sites in the top search results and move
them down so that relevant results get the desired position. For this purpose,
we proposed a ‘Feature Score Computation’ algorithm [4]. This algorithm reads the
source code of web pages, checks each feature’s presence, assigns weights to the
features according to specific parameters, and computes a feature score accordingly. Further,
a total score is calculated for each page by adding all feature scores. This procedure
was repeated for the top 20 search results. The web pages were then rearranged in
descending order of total score, and a predicted ranking was assigned to each page.
Here, Google search results were collected as the corpus. The actual Google
results were compared with the results after re-ranking. The relevancy of
the Google results improved from 88 to 99%.
2 Related Work
Search engines continually attract many visitors for searching for information. A
search engine is also used as an essential website promotion method for commercial
sites. Kehoe and Pitkow report that 80% of people used a search engine as a starting
point [5]. Among all search engines on the web, Google is widely favored [6]. Several
researchers have evaluated search engine performance relating to parameters like
precision, recall, relevance, duplication, degree of overlap, and others. Heting and
Rosenthal compared the search engines Alta Vista, Excite, and Lycos.
Ten queries were used, and the top 10 search results of each were classified as
irrelevant, somewhat relevant, or relevant. The authors report that Alta Vista
performs better than Excite and Lycos
with respect to precision and search facilities [7]. Gordon and Pathak evaluated
eight search engines using thirty-three queries for the top two hundred search links.
The researchers have examined results and categorized them as highly irrelevant,
somewhat irrelevant, somewhat relevant, and highly relevant. Among the selected
eight search engines, Open Text, Lycos, and Alta Vista performed best with higher
precision and higher recall [8]. Shang and Longzhuang evaluated six leading search
engines with 300 queries. They computed the relevance score of hits from search
engines and the ranking of search engines based on the statistical comparison of
relevance score. This paper reports that Google performs the best among six search
engines [6]. Su evaluated four search engines over the top 20 links and proposed 16
performance measures covering evaluation parameters such as efficiency, user
satisfaction, relevance, utility, and connectivity. Alta Vista had high precision among
the four search engines [9]. Griesbaum evaluated three search engines, Google, Lycos,
and Alta Vista (German), with respect to accuracy for the top 20 results based on 50
randomly selected queries. The results of Google were significantly better than those
of Alta Vista, but Google and Lycos showed no significant difference [10]. Vaughan
and Thelwall compared three search engines with four queries for the first 20 links.
Human evaluation was used for checking ranking quality. Top-ranked pages were
retrieved, and these results were stable over ten weeks. These evaluations used
multiple search engines for comparison. The relevancy of search results was determined
using various methods to examine the usability and precision of various
search engines [11]. Olakekan evaluated five popular search
engines’ performance based on the quantity of document retrieval, response time,
accuracy, and advert content. Google, MSN, and Yahoo retrieved high document
quality retrieval capacity with low response time, precision value, and suitability for
advert content. Alta Vista is excellent in both relevance and advertisement [12]. Singh
and Sharan compared search performance, measured by the precision ratio,
of keyword-based search engines (Yahoo, Google) and semantic-based search
engines (DuckDuckGo, Bing, Hakia). The paper classified the top 20 documents
as relevant or non-relevant for ten selected queries. Among all selected search
engines, Bing retrieved more relevant results. For one of the queries, Hakia and
Google performed better [13].
All of the above papers compare search engines with respect to relevance.
No work could be located that deals explicitly with finding non-relevant
results in the top-ranked list caused by the unethical use of SEO techniques. No previous
work has been found that moves irrelevant results down from the top of the search
engine’s result list to improve search engine performance. As search engines grow
exponentially, much research is still required to return relevant and accurate results
in the top-ranking list.
3 Methodology
Each website was studied manually, and 52 features were found that play a
significant role in ranking. From these features, the ten most important features [3]
were selected. SEO techniques help websites improve their ranking in search results
through keyword formatting, high-quality backlinks, and quality content. The major
features used in keyword formatting were selected: keyword presence in the H1 tag,
title tag, meta tag, meta description tag, anchor text, URL path, and domain name;
keyword position in the title tag; and keyword density in the title tag; together with
the outgoing link count and the total link count. Precise values were defined for each
feature, and a weight was assigned to each feature according to its location and
occurrence.
Then the total score of each page was calculated using the scores of the features present
on that page. The web pages were then arranged in descending order of
total score. The predicted ranking of each web page was compared with the original
Google ranking to carry out the analysis. The criteria for assigning precise weights
to each feature are shown in Table 3.
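The scoring and re-ranking steps described above can be illustrated with a minimal Python sketch. The weights in `WEIGHTS` are hypothetical placeholders, since the actual values of Table 3 are not reproduced here, and the sketch checks only keyword presence in a handful of features; weighting by keyword position or density, which the method also uses, is omitted. All function and class names are assumptions made for illustration.

```python
from html.parser import HTMLParser

# HYPOTHETICAL weights for illustration only; the paper's values are in its Table 3.
WEIGHTS = {"title": 3.0, "h1": 2.0, "meta_description": 2.0, "anchor": 1.0, "url": 1.0}


class FeatureExtractor(HTMLParser):
    """Collects the text of the tags in which a keyword may appear."""

    def __init__(self):
        super().__init__()
        self.text = {"title": [], "h1": [], "a": []}
        self.meta_description = ""
        self._open = []  # tags of interest that are currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag in self.text:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            self._open.remove(tag)

    def handle_data(self, data):
        for tag in self._open:
            self.text[tag].append(data)


def feature_scores(keyword, url, html):
    """Return a score per feature: its weight if the keyword is present, else 0."""
    kw = keyword.lower()
    parser = FeatureExtractor()
    parser.feed(html)
    present = {
        "title": kw in " ".join(parser.text["title"]).lower(),
        "h1": kw in " ".join(parser.text["h1"]).lower(),
        "meta_description": kw in parser.meta_description.lower(),
        "anchor": kw in " ".join(parser.text["a"]).lower(),
        "url": kw.replace(" ", "") in url.lower().replace("-", ""),
    }
    return {name: WEIGHTS[name] if found else 0.0 for name, found in present.items()}


def total_score(keyword, url, html):
    """Total page score: the sum of all feature scores."""
    return sum(feature_scores(keyword, url, html).values())


def rerank(keyword, pages):
    """pages: list of (url, html) in original rank order.

    Returns the URLs in descending order of total score; ties keep the
    original order.
    """
    scored = [
        (total_score(keyword, url, html), rank, url)
        for rank, (url, html) in enumerate(pages)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, _, url in scored]
```

Breaking ties by the original rank is a design choice of this sketch, so that pages with equal scores stay in the order in which the search engine returned them.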
The precision value was compared for the search results of selected queries. Precision
is computed as the fraction of retrieved documents that are relevant according to the
user’s perception.
Table 4 shows the total number of irrelevant results found and the precision value
computed for the top 20 results of each query.
The precision value is 1 for the queries “house plan” and “insurance” because all of
the first 20 results returned for these queries were relevant. The query “PSP” has the
lowest precision value of all, 0.75. A graphical representation of the number of
irrelevant results and
their precision value is shown in Fig. 1.
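As a small worked example of the precision computation, the following Python sketch reproduces the two values quoted in the text: 20 relevant results out of 20 give a precision of 1, and 5 irrelevant results out of 20 (the “PSP” query) give 0.75. The function name `precision` is an assumption made for illustration.

```python
def precision(num_relevant, num_retrieved):
    """Fraction of retrieved results that are relevant."""
    if num_retrieved <= 0:
        raise ValueError("num_retrieved must be positive")
    return num_relevant / num_retrieved


# Top 20 results with no irrelevant result ("house plan", "insurance").
print(precision(20, 20))
# Top 20 results with 5 irrelevant results ("PSP").
print(precision(20 - 5, 20))
```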
According to the above figure, Google returned the highest number of irrelevant
results for the query “PSP.” Among the top twenty results, five were irrelevant,
which gives the query “PSP” the lowest precision value, 0.75.
When each page returned for the query “PSP” was examined, it became clear that
the query is ambiguous. “PSP” is an abbreviation with multiple meanings.
The query “PSP” was intended to mean “PlayStation Portable.” However, the results
included web pages about “Personal Software Process,” “Progressive Supranuclear
Palsy,” “Dynamic PSP” technology for Oracle 11g, and “PSP video express,” video
converter software for the PSP. Such results
lower the performance of search engines.
Search engine performance thus goes down when web pages contain natural language
words that are ambiguous, so that the same query has multiple meanings.
Search results can be improved if the web page contains
precise semantic annotations [14] or if the search engine generates a list of the multiple
meanings of a submitted query. From this list, the user can select the appropriate
meaning for the query, and the search engine can show the correct results without
ambiguity.
The result analysis showed that the top search results of other queries also
contained irrelevant results. The next step was to develop a system that
removes irrelevant results from the top of the list. The system takes two inputs: the
search keyword and the retrieved web pages. For the given search keyword,
the system identifies features in the retrieved web pages. For each feature, a score is
computed using the precise weights. A web page’s total score is calculated by
adding the weights of each feature found on the page. The resulting pages are re-ranked
using this total score. For the search keyword “3d TV”, the system returns the actual
results and the results after re-ranking; a snapshot is shown in Fig. 2. The positions
of the actual search results were compared with those of the predicted results. For
most of the queries, the results that are not relevant to the query move down
after re-ranking, as given in Table 5.
For the search keyword “3d TV”, the irrelevant results at ranking positions
2 and 7 move down to positions 18 and 20, respectively. In this way, relevant results
are placed at the top. For the query “iPod,” the irrelevant result at position 2 also
moves down to position 20, and the same happens for other queries. Table 5 shows the
positions of irrelevant results in Google’s original results and in the results after
processing by the system; a graphical representation is given in Fig. 3. Almost all the
irrelevant results are placed lower, and relevant results move to the top, because
the system computes a high score for relevant pages and a low score for
non-relevant pages.
The sample result for the query “beauty” after computation of the total score is shown in
Table 6.
Table 6 contains the search results returned by Google for the query “beauty.” The sites
at ranking positions 3, 8, and 9 are not relevant to the query. The results are
re-ranked in descending order of web page score. After re-ranking,
the results at positions 3, 8, and 9 move down to positions
17, 11, and 10, respectively, as shown in Table 7. Google thus placed three non-relevant
sites in the top 10 search results, whereas after executing the system only one
non-relevant site remains in the top ten search results.

Table 5 Position of irrelevant results after re-ranking
Sr. No  Query                       Original results  Result after re-ranking
1       Beauty                      03                17
                                    08                11
                                    09                10
2       3d TV                       02                18
                                    07                20
3       Deluxe room                 04                11
                                    10                18
                                    16                15
4       Graphic design              09                17
                                    15                16
5       Pasta pizza                 03                18
                                    19                17
                                    20                16
6       IPod                        02                20
7       PSP                         04                16
                                    05                19
                                    16                15
                                    19                12
                                    20                17
8       Search engine optimization  13                18
The original top 10 results and the search results after processing and re-ranking are
compared for selected queries, as shown in Table 8. The number of relevant results
among the top results is higher after re-ranking than in the actual results. For the
queries “beauty”, “3d TV”, “deluxe room”, and “PSP”, the performance increased by up
to 20%. For the queries “graphic design”, “iPod”, and “Pasta Pizza”, the performance
increased by up to 10% over the original. The comparative performance of relevant results
is shown graphically in Fig. 4.
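The gains quoted above can be checked against the positions listed in Table 5. The following Python sketch assumes that “performance” means the share of relevant results among the top 10; this interpretation is an assumption, although it reproduces the 20% and 10% figures stated in the text. The names `TABLE_5`, `relevant_in_top`, and `top_k_gain` are hypothetical.

```python
# Positions of irrelevant results copied from Table 5:
# query -> list of (original position, position after re-ranking).
TABLE_5 = {
    "beauty": [(3, 17), (8, 11), (9, 10)],
    "3d TV": [(2, 18), (7, 20)],
    "deluxe room": [(4, 11), (10, 18), (16, 15)],
    "graphic design": [(9, 17), (15, 16)],
    "pasta pizza": [(3, 18), (19, 17), (20, 16)],
    "iPod": [(2, 20)],
    "PSP": [(4, 16), (5, 19), (16, 15), (19, 12), (20, 17)],
    "search engine optimization": [(13, 18)],
}


def relevant_in_top(irrelevant_positions, k=10):
    """Number of relevant results among the top k, given irrelevant positions."""
    return k - sum(1 for pos in irrelevant_positions if pos <= k)


def top_k_gain(moves, k=10):
    """Gain in relevant top-k results after re-ranking, in percentage points."""
    before = relevant_in_top([orig for orig, _ in moves], k)
    after = relevant_in_top([new for _, new in moves], k)
    return 100 * (after - before) // k


if __name__ == "__main__":
    for query, moves in TABLE_5.items():
        print(f"{query}: +{top_k_gain(moves)} percentage points in the top 10")
```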
5 Conclusion
A system was designed and developed that extracts features from the retrieved search
results and assigns a precise weight to each feature. The features present on each
page are identified, and each page’s score is computed from their precise weights.
Using these calculated scores, the search result pages are re-ranked in descending
order of score. The ranking positions of Google’s original results were compared
with those of the predicted results, and an improvement over the actual results
was observed. For the corpus prepared and compiled, the relevancy of Google search
results improved from 88 to 99%.
References
1. Patil, S.P., Pawar, B.V., Patil, A.S.: Search engine optimization: a study. Res. J. Comput. Inf.
Technol. Sci. 1(1), 10–13 (2013)
2. Patil Swati, P., Pawar, B.V.: Study of website promotion techniques and role of SEO in search
engine results. Int. J. Recent Innov. Trends Comput. Commun. 3(11), 6229–6234 (2015)
3. Pawar, B.V., Patil Swati, P.: System for identification of ranking terms from retrieved results
of major search engines. Int. J. Inf. Retri. 8(2), 201–207 (2015)
4. Patil Swati, P., Pawar, B.V.: Removing non-relevant links from top search results using feature
score computation. Bull. Pure Appl. Sci. 37E(2), 311–320 (2018)
5. Pitkow, J.E., Kehoe, C.M.: Emerging trends in the WWW user. Commun. ACM 39(6), 106–108
(1996)
6. Shang, Y., Longzhuang, L.: Precision evaluation of search engines. World Wide Web 5(2),
159–179 (2002)
7. Heting, C., Rosenthal, M.: Search engines for the World Wide Web: a comparative study and
evaluation methodology. Proc. ASIS Ann. Meet. 33, 27–35 (1996)
8. Gordon, M., Pathak, P.: Finding information on the World Wide Web: the retrieval effectiveness
of search engines. Inf. Process. Manage. 35(2), 141–180 (1999)
9. Su, L.T.: A comprehensive and systematic model of user evaluation of Web search engines: I.
Theory and background. J. Am. Soc. Inf. Sci. Technol. 54(13), 1175–1192 (2003)
10. Griesbaum, J.: Evaluation of three German search engines: Altavista.de, Google.de, and
Lycos.de. Inf. Res. Int. Electr. J. 9(4) (2004)
11. Vaughan, L., Thelwall, M.: Search engine coverage bias: evidence and possible causes. Inf.
Process. Manage. 40(4), 693–707 (2004)
12. Olakekan, A.: Comparative study of some popular web search engines. Afr. J. Comp. Sci. ICT
3(1), 3–20 (2010)
13. Singh, J., Sharan, A.: A comparative study between keyword and semantic-based search
engines. In: International Conference on Cloud, Big Data and Trust, pp 130–134 (2013)
14. Inkpen, D.: Information retrieval on the internet. Ph.D. thesis, University of Toronto (2006)