Ashish Khanna
Awadhesh Kumar Singh
Abhishek Swaroop Editors
Recent Studies
on Computational
Intelligence
Doctoral Symposium on Computational
Intelligence (DoSCI 2020)
Studies in Computational Intelligence
Volume 921
Series Editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
The series “Studies in Computational Intelligence” (SCI) publishes new developments
and advances in the various areas of computational intelligence, quickly and
with high quality. The intent is to cover the theory, applications, and design
methods of computational intelligence, as embedded in the fields of engineering,
computer science, physics and life sciences, as well as the methodologies behind
them. The series contains monographs, lecture notes and edited volumes in
computational intelligence spanning the areas of neural networks, connectionist
systems, genetic algorithms, evolutionary computation, artificial intelligence,
cellular automata, self-organizing systems, soft computing, fuzzy systems, and
hybrid intelligent systems. Of particular value to both the contributors and the
readership are the short publication timeframe and the world-wide distribution,
which enable both wide and rapid dissemination of research output.
The books of this series are submitted to indexing to Web of Science,
EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink.
Editors
Ashish Khanna
Department of Computer Science and Engineering
Maharaja Agrasen Institute of Technology
New Delhi, India
Awadhesh Kumar Singh
Department of Computer Engineering
NIT Kurukshetra
Kurukshetra, India
Abhishek Swaroop
Department of Computer Science Engineering
Bhagwan Parushram Institute of Technology
New Delhi, India
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore
Pte Ltd. 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Preface
DoSCI 2020 invited six keynote speakers, eminent researchers in the field of
computer science and engineering, from different parts of the world. In addition
to the plenary sessions on each day of the conference, 15 concurrent technical
sessions were held every day to ensure the oral presentation of around nine
accepted papers. The keynote speakers and session chairs for each session are
leading researchers from the thematic area of the session.
Holding DoSCI 2020 at such magnitude and releasing its proceedings with Springer
has been the remarkable outcome of the untiring efforts of the entire organizing
team. The success of an event undoubtedly involves the painstaking efforts of
several contributors at different stages, driven by their devotion and sincerity.
Fortunately, since the beginning of its journey, DoSCI 2020 has received support
and contributions from every corner. We thank all who have wished the best for
DoSCI 2020 and contributed by any means towards its success. The edited
proceedings volume by Springer would not have been possible without the
perseverance of all the steering, advisory and technical program committee members.
The organizers of DoSCI 2020 thank all the contributing authors for their
interest and exceptional articles. We would also like to thank the authors of the
papers for adhering to the time schedule and for incorporating the review
comments. We wish to extend our heartfelt acknowledgment to the authors, peer
reviewers, committee members and production staff whose diligent work gave shape
to the DoSCI 2020 proceedings. We especially want to thank our dedicated team of
peer reviewers who volunteered for the arduous and tedious step of quality
checking and critique of the submitted manuscripts. The management, faculty,
administrative and support staff of the college have always extended their
services whenever needed, for which we remain thankful to them.
Lastly, we would like to thank Springer for accepting our proposal to publish
the DoSCI 2020 proceedings. The help received from Mr. Aninda Bose, the
acquisition senior editor, in the process has been very useful.
Editors and Contributors
Prof. (Dr.) Abhishek Swaroop completed his B.Tech. (CSE) from GBP
University of Agriculture & Technology, M.Tech. from Punjabi University Patiala
and Ph.D. from NIT Kurukshetra. He has 28 years of teaching and industrial
experience. He has served in reputed educational institutions such as Jaypee
Institute of Information Technology, Noida, Sharda University, Greater Noida, and
Galgotias University, Greater Noida. He is actively engaged in research. One of his
Ph.D. scholars has completed his Ph.D. from NIT Kurukshetra, and he is currently
supervising 4 Ph.D. students. He has also guided 10 M.Tech. dissertations. He has
authored 3 books and 5 book chapters. Seven of his papers are indexed in DBLP and
6 papers are SCI indexed. He has been part of the organizing committee of three IEEE
conferences (ICCCA-2015, ICCCA-2016, ICCCA-2017) and one Springer conference
(ICICC-2018) as Technical Program Chair. He is a member of various professional
societies such as CSI and ACM and of the editorial boards of various reputed journals.
Contributors
Onto-Semantic Indian Tourism Information Retrieval System
Abstract Tourism is among the world’s most promising growth sectors. The amount
of tourism information on the Web has grown enormously. Unfortunately, despite
this overload of information, we mostly fail to find the information that is
relevant. This is due to the absence of semantic identification of the user query
when fetching the required results. Motivated by these limitations, a framework is
proposed called “Design and Implementation of Semantically Enhanced Information
Retrieval using Ontology.” The objective of the paper is to present a semantic
Indian tourism search system that improves India’s ranking as a global travel
destination, so that India can make use of its natural resources and thereby
increase tourist arrivals and tourism revenue. The proposed technique uses an
ontology constructed for the tourism of India for precise retrieval. The system is
evaluated against keyword-based Web search engines to assess the effectiveness of
semantic search over commonly used approaches, with precision and execution time
as the evaluation parameters. The results show a substantial improvement in
information retrieval using this approach.
1 Introduction
In this world of technology, life without the Web is unimaginable. The Web, which
connects billions of people all around the world, is the fastest, simplest and
easiest medium of communication. The Internet is the largest repository of
information, and it is
S. S. Laddha (B)
Government College of Engineering, Aurangabad 431001, India
e-mail: kabrageca@gmail.com
P. M. Jawandhiya
PL Institute of Technology and Management, Buldana 443001, India
e-mail: pmjawandhiya@gmail.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_1
2 Literature Review
Since the beginning of written language, people have been developing methods for
rapidly indexing and retrieving information. Information retrieval has a variety of
paradigms. It is defined as the act of storing, seeking and fetching information
that matches a user’s demand [8].
Until the 1950s, information retrieval was mostly a library science. In 1945,
Vannevar Bush set out his vision of a future in which machines would give easy
access to the libraries of the world [9]. In the 1950s, the first electronic
retrieval systems were built using punch cards. A lack of computing power limited
the usefulness of these systems. During the 1970s, computers began to have enough
processing power to handle information retrieval with near-instant results. With
the growth of the Web, information retrieval became increasingly important and
widely researched. Today, most people use some kind of modern information
retrieval system every day, such as Google or a specially built system for libraries.
The volume of information available on the Web makes it difficult to find relevant
information, so a suitable method for organizing information becomes vital.
Keyword search is not suited to finding the relevant information for a particular
concept. In a typical keyword-based Web search engine, the query terms are
matched against the terms in an inverted index comprising all the document terms
of a text corpus [10]. Only the matched documents are fetched and shown to the end
user. The study in [6] discusses the main reasons why a purely text-based search
fails to find some of the relevant documents, namely the ambiguity of natural
language and the lack of semantic relations.
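The keyword matching described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular search engine: the toy corpus and the function names `build_inverted_index` and `keyword_search` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents containing every query term (pure term matching)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical toy corpus, for illustration only.
documents = {
    "d1": "book a train from Mumbai to Delhi",
    "d2": "train your staff in customer service",
    "d3": "railway timetable for Mumbai",
}
index = build_inverted_index(documents)
matches = keyword_search(index, "train")
print(sorted(matches))
```

On this corpus the query "train" returns the staff-training document d2 alongside d1 and misses the railway timetable d3, which is the kind of ambiguity and missing semantic relation the paragraph refers to.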
Textual information retrieval depends on keywords extracted from documents and used
as the building blocks of both document and query representations. However, a
keyword may have several senses. For example, the term “train” in the tourism
domain refers to a “vehicle for transportation,” whereas the same term “train” in
the education domain means “to teach.” Current keyword-based Web search engines
match the term in the query with the terms in the documents and return all the
documents containing this term, irrespective of the semantics. Therefore, efforts
are required to devise semantic information retrieval methods that return relevant
documents based on meaning rather than keywords. The main idea underlying semantic
information retrieval is that the meaning of a text depends on conceptual
relationships to objects rather than on the linguistic relations found in the text.
In the area of tourism, Tomai et al. [11] introduced an ontology which supported
decision making in trip planning. They proposed the use of two separate
ontologies, one for tourism information and the other for user profiles
[12]. Jakkilinki et al. in 2005 presented an ontology-driven intelligent
tourism information system using a tourism domain ontology [13].
Lam et al. in 2006 introduced an ontology-driven agent system for
semantic Web services, “OntiaiJADE,” and an upper-level ontology built using
structural information from different Web sites related and relevant to Hong Kong.
3 Challenges/Research Gap
Today, life without the Web, and without a Web search engine in particular, is hard
to imagine. Searching the net has become part of everyday life, covering everything
from looking for a suitable book to following recent developments in advanced
technologies. Search engines have changed the way people access and find
information, making information about almost any topic easily and quickly
available. Current information retrieval techniques work on keyword matching: if
the keywords match the available data, the page is returned; otherwise it is
rejected. These approaches return a comparatively large number of results, and we
have to browse the pages to find the required information. Such systems cannot
give an exact answer to a given question. Because these techniques do not consider
the semantics carried by the query terms, they do not try to understand what the
user wants to ask, which results in low precision and relevancy. The basic issues
[20] include:
• Fetching and displaying irrelevant results.
• A large volume of results, which makes it hard for the user to locate the
relevant information.
• The user is unaware of the logic used to fetch the results for the query, which
makes it hard to interpret the results properly.
• Query execution takes time, and precision is low.
The above issues are common to keyword-based search engines. The present Web
is a collection of a wide variety of information, and search engines are expected
to return information according to the user’s query. Moreover, users often do not
know the exact term to search for, so if the exact query term does not match, the
result may not be very precise. Search engines must not restrict themselves to
keyword matching alone. The semantics of the words should also be considered, and
the matching logic should be fuzzy. What is required is an information retrieval
interface that returns precise and effective search results in relatively less time.
4 Objective
Given the vast amount of unstructured data on the Web, traditional search engines
are unable to return relevant, precise and efficient information from it. The
primary goal of this research is to enhance the precision and efficiency of
information retrieval semantically using an ontology, so as to satisfy the user
query and attain user satisfaction with the search results. This semantic
information retrieval is evaluated against commonly used conventional search
engines, viz. Google, Yahoo and Bing, and the improvement is demonstrated in terms
of the efficiency and precision of the search results of the resulting
application. The results show that an information retrieval system using a domain
ontology can achieve better results than keyword-based information retrieval
systems.
5 Hypothesis
This research work is an attempt to address a few of the problems mentioned under
the research issues. The proposed system aims at innovations in the design of an
enhanced semantic information retrieval system on the Web, general in purpose but
applied to a specific domain, and implements a Web-based interface that accepts a
query from the user and provides the result using the ontology-based enhanced
semantic retrieval system. This interface allows multiple users to access the same
application remotely through a Web browser.
6 System Architecture/Methodology
When the end user gives a query pertaining to Indian tourism such as “traveler
place in India,” “places of interest in India,” “India the travel industry,”
“investigate vacationer goal and improvement of India,” “incredible India” and so
forth, the basic query mapper is invoked and the relevant results are shown to the
user along with meta-data and the time taken for processing [21].
The query prototype mapper [7] is a novel idea introduced in this study, with
which one query prototype can handle many user queries. A query prototype
contains (i) simple tokens, (ii) template tokens, (iii) ontological tokens and
(iv) stopwords.
For example, (flight) from [from-city] to [to-city]. Query prototypes are defined
for the 17 services identified for the Indian tourism domain. This module works
when the user query matches exactly with one of the query prototypes defined for
the identified services [7].
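One way to picture exact prototype matching is to translate each prototype into a regular expression. The sketch below is an assumption-laden illustration, not the mapper described in [7]: the prototype table, the service names and the helper names are hypothetical, and the ontological token "(flight)" is treated as a literal word here.

```python
import re

# Hypothetical prototype table: prototype string -> service name.
PROTOTYPES = {
    "(flight) from [from-city] to [to-city]": "flight_service",
    "distance from [from-city] to [to-city]": "distance_service",
}


def prototype_to_regex(prototype):
    """Translate a prototype into a regex with one named group per template token."""
    parts = []
    for token in prototype.split():
        if token.startswith("[") and token.endswith("]"):
            # Template token such as [from-city]: capture one or more words.
            name = token[1:-1].replace("-", "_")
            parts.append(rf"(?P<{name}>[A-Za-z ]+?)")
        elif token.startswith("(") and token.endswith(")"):
            # Ontological token such as (flight): matched literally in this sketch.
            parts.append(re.escape(token[1:-1]))
        else:
            # Simple token: matched literally.
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


COMPILED = [(prototype_to_regex(p), service) for p, service in PROTOTYPES.items()]


def match_query(query):
    """Return (service, slots) for the first prototype the whole query matches, else None."""
    for regex, service in COMPILED:
        found = regex.fullmatch(query.strip())
        if found:
            return service, found.groupdict()
    return None


print(match_query("flight from New Delhi to Mumbai"))
```

A query that fits no prototype, such as "cheap hotels", returns None, which is the case the later mappers are meant to handle.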
If the user query does not match exactly with any of the query prototypes
defined for the identified services, it is quite likely that the sequence of words
in the user query differs from the sequence of words in the defined query
prototypes. To handle such queries and locate the service for execution, the query
word order mapper is invoked.
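A simple way to tolerate a changed word sequence is to compare the fixed words of each prototype with the query as sets. The source does not spell out how the query word order mapper works, so the following is only a sketch under that assumption; `PROTOTYPE_KEYWORDS` and the service names are hypothetical.

```python
# Hypothetical fixed (non-template) words of each prototype.
PROTOTYPE_KEYWORDS = {
    "flight_service": {"flight", "from", "to"},
    "distance_service": {"distance", "from", "to"},
}


def match_ignoring_order(query):
    """Return the service whose fixed words all occur in the query, in any order."""
    words = set(query.lower().split())
    best_service, best_size = None, 0
    for service, keywords in PROTOTYPE_KEYWORDS.items():
        # Prefer the prototype that accounts for the most query words.
        if keywords <= words and len(keywords) > best_size:
            best_service, best_size = service, len(keywords)
    return best_service


print(match_ignoring_order("to Mumbai from Delhi flight"))
```

Set comparison discards which city follows "from" and which follows "to", so a fuller version would still need to recover the slots after the service has been identified.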
Another possibility is that the user mistakenly enters a misspelled state or city
name in the query. To deal with this, a list of valid Indian city and state names
is kept. This module uses the list to replace the misspelled term with the nearest
matching term, reframes the query and forwards it to the query prototype mapper
[22].
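Replacing a misspelled name with the nearest entry in a stored list can be sketched with Python's standard `difflib` module. The similarity measure used in [22] is not stated here, so `difflib` is a stand-in; the short place list, the cutoff of 0.75 and the function name are assumptions for illustration.

```python
import difflib

# Hypothetical excerpt of the stored list of valid city and state names.
VALID_PLACES = ["Mumbai", "Delhi", "Chennai", "Kolkata", "Jaipur", "Goa", "Kerala"]


def correct_place_names(query, cutoff=0.75):
    """Replace each word that closely resembles a known place with that place."""
    lookup = {place.lower(): place for place in VALID_PLACES}
    corrected = []
    for word in query.split():
        key = word.lower()
        if key in lookup:
            corrected.append(lookup[key])
            continue
        # get_close_matches returns the best candidates at or above the cutoff.
        close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        corrected.append(lookup[close[0]] if close else word)
    return " ".join(corrected)


print(correct_place_names("flight from Mumbay to Chenai"))
```

This sketch checks every word, so a short ordinary word that happens to resemble a place name could be rewritten; restricting the correction to the city and state positions of the query would avoid that.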
It is possible that the user query uses a different term from the ontological
token used in defining the query prototypes. To match this kind of query, the
ontological mapper is used, which internally makes use of an ontology constructed
using WordNet. This improves the performance of the system considerably by
handling almost every query related to the Indian tourism domain. This research
puts forward a semantic information retrieval technique based on an ontology
created using a clustering technique. A clustering algorithm is designed and
implemented which creates a cluster for each of the ontological tokens, called
cluster heads, defined in the query prototypes. The cluster elements are fetched
using the Java WordNet Library (JWNL), the relationship is assigned, and the score
is calculated for each ontological token with respect to its cluster head. This
process creates an ontology that is stored in memory to shorten the retrieval
time. The ontology representation for the ontological cluster head “Train” is
shown in Fig. 2.
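The idea of mapping a query term to its cluster head can be sketched with a small in-memory dictionary. The clusters below are hand-written stand-ins for the elements the paper fetches from WordNet through JWNL, and the relationship labels and scores are invented for illustration; the actual scoring scheme is not given in this text.

```python
# Hypothetical clusters: cluster head -> {element: (relationship, score)}.
ONTOLOGY = {
    "train": {
        "railway": ("synonym", 1.0),
        "rail": ("synonym", 1.0),
        "locomotive": ("related", 0.5),
    },
    "flight": {
        "airline": ("related", 0.5),
        "plane": ("related", 0.5),
        "airplane": ("related", 0.5),
    },
}


def nearest_cluster_head(term):
    """Return the cluster head for a term, or None if no cluster contains it."""
    term = term.lower()
    if term in ONTOLOGY:
        return term
    best_head, best_score = None, 0.0
    for head, members in ONTOLOGY.items():
        if term in members and members[term][1] > best_score:
            best_head, best_score = head, members[term][1]
    return best_head


def rewrite_query(query):
    """Replace every word that belongs to a cluster with its cluster head."""
    return " ".join(nearest_cluster_head(word) or word for word in query.split())


print(rewrite_query("railway from Pune to Nagpur"))
```

After rewriting, the query uses the ontological token "train" and can be handed back to prototype matching.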
The job of the defined ontology is to characterize the relations among the terms
relevant to the Indian tourism domain. When the user enters a query, the query is
interpreted through the related terms defined in the ontology to improve the
performance of the semantic search. The specific tourism service corresponding to
the user query is located semantically and executed.
Another possibility is that the user provides just the name of a city or state. To
deal with such queries, this mapper is used to invoke the “About service” for that
particular state or city, as described in the performance analysis section of [22].
Often, the user enters a query that matches neither a city or state name nor any
defined query prototype. If none of the previously mentioned mappers can handle
the query, the keyword mapper tries to match the keywords in the user query
against a keyword list collected from the result Web pages of earlier queries
given by end users. To illustrate, if the user enters the query “about Mumbai”,
the system shows information about Mumbai and, at the same time, fetches every
keyword from the result URLs in the background and saves them in keyword.dat.
Later, if the user gives any of the relevant keywords, such as “Gateway of India”,
the system can return the Web page with the relevant information. In this way, the
system gradually becomes smarter as it processes more and more queries of
different types.
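The learning behaviour of the keyword mapper can be sketched as a small persistent store. The file name keyword.dat comes from the text, but its format is not described, so JSON is assumed here; the class name, method names and example URL are hypothetical.

```python
import json
import os
import tempfile


class KeywordStore:
    """Remembers which result URLs each harvested keyword came from."""

    def __init__(self, path):
        self.path = path
        self.keywords = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                self.keywords = json.load(handle)

    def learn(self, url, page_keywords):
        """Record keywords harvested from a result page and persist them."""
        for keyword in page_keywords:
            urls = self.keywords.setdefault(keyword.lower(), [])
            if url not in urls:
                urls.append(url)
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.keywords, handle)

    def lookup(self, query):
        """Return the URLs previously associated with this keyword, if any."""
        return self.keywords.get(query.lower().strip(), [])


# Store the file in a temporary directory so the example leaves no residue.
path = os.path.join(tempfile.mkdtemp(), "keyword.dat")
store = KeywordStore(path)
store.learn("https://example.org/mumbai", ["Gateway of India", "Marine Drive"])
print(store.lookup("gateway of india"))
```

A later session that opens the same file sees the keywords learned earlier, which mirrors the claim that the system improves as more queries are processed.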
In this study, a meta-processor is designed which provides meta-data such as the
title, time and a brief description for the URLs relevant to the user’s request.
When the user enters a query for the first time, only the Web links are shown, to
give quick results. At the same time, a thread is spawned by the meta-processor,
which fetches the meta-data in the background and stores it on the server.
Processing these URLs for meta-data and titles at run time takes additional time,
as it requires connecting to numerous servers. On the next run of the same query,
the user gets the relevant links along with the meta-data. Preparing the meta-data
is a background process performed by the meta-processor. This novel meta-processor
enhances the performance of the system.
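The behaviour described above, plain links on the first run and enriched links on later runs, can be sketched with a background thread and a cache. This is an illustrative sketch rather than the authors' meta-processor: the class, the `fake_fetch_meta` stub (used instead of real network requests) and the example URL are assumptions.

```python
import threading


class MetaProcessor:
    """Returns bare links at once and fills in meta-data in the background."""

    def __init__(self, fetch_meta):
        self._fetch_meta = fetch_meta  # callable: url -> dict of meta-data
        self._cache = {}
        self._lock = threading.Lock()

    def search(self, query, urls):
        """Return (results, worker); worker is None when the cache already had the query."""
        with self._lock:
            cached = self._cache.get(query)
        if cached is not None:
            return cached, None
        worker = threading.Thread(target=self._fill_cache, args=(query, urls))
        worker.start()
        return [{"url": url} for url in urls], worker

    def _fill_cache(self, query, urls):
        enriched = [{"url": url, **self._fetch_meta(url)} for url in urls]
        with self._lock:
            self._cache[query] = enriched


def fake_fetch_meta(url):
    """Stand-in for a slow request to the remote server."""
    return {"title": "Title of " + url, "description": "Summary of " + url}


processor = MetaProcessor(fake_fetch_meta)
first, worker = processor.search("about mumbai", ["https://example.org/mumbai"])
worker.join()  # wait here only so the example is deterministic
second, _ = processor.search("about mumbai", ["https://example.org/mumbai"])
print(first)
print(second)
```

The first call returns only the URL, while the second call for the same query returns the cached title and description.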
A few services in the tourism domain, such as the “About city” service and the
“Best time to visit” service, require information as a line or paragraph rather
than as URL links. For this, a template manager is designed, which is invoked in
the background. Templates are site specific. To add a new URL, the template of the
respective site must be added, as different sites use different structures and
layouts to display and store the data. This novel template manager helps in
fetching such data.
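A site-specific template can be as simple as a pair of markers that delimit the wanted passage on pages of that site. The sketch below assumes this marker-pair form, which the source does not confirm; the host names, markers and function name are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical templates: host name -> (start marker, end marker).
TEMPLATES = {
    "example.org": ('<p class="summary">', "</p>"),
    "example.com": ('<div id="about">', "</div>"),
}


def extract_snippet(url, html):
    """Return the text between the markers registered for the URL's site, or None."""
    template = TEMPLATES.get(urlparse(url).netloc)
    if template is None:
        return None  # no template registered for this site yet
    start_marker, end_marker = template
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


page = '<html><p class="summary"> Goa is a coastal state. </p></html>'
print(extract_snippet("https://example.org/goa", page))
```

Supporting a new site then means adding one entry to the template table, in line with the statement that a template must be added for each new URL.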
In this way, the modules described above are invoked according to the pattern of
the query. As shown in Fig. 1 (Part 1 and Part 2), the user query is first matched
against the defined query prototypes. If an exact match is found, the query
prototype mapper identifies the service to be executed. If the user query does not
match any of the defined query prototypes, the query word order mapper checks for
a change in word sequence and determines the service to be invoked. If this mapper
fails to invoke a service, a check is made for a misspelled city or state name,
the correction is made by the spelling correction module, and the query is sent
back to the query prototype mapper to identify the service. If a matching query
prototype is still not found, the query tokens are matched with the nearest
ontological cluster head, as explained in [23], to invoke the appropriate service.
The user may also enter just a city or state name, in which case the “About”
service is invoked for the respective state or city. If the user enters a very
basic query related to the domain, the basic query mapper is invoked.
The user may also request information for which no query prototype is defined in
the system. The system handles such a query by invoking the keyword mapper. In
this way, the identified service type is invoked and the relevant Web links are
retrieved semantically. The process begins when the user enters the query in the
semantic search interface shown in Fig. 3.
The user enters the query in the search box, and on clicking the search button,
the relevant links along with the meta-information and the time required for
processing are displayed to the end user as the search result, as shown in Fig. 4.
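The fallback order described in this section amounts to trying each mapper in turn until one identifies a service. The sketch below shows only that control flow: the four toy mappers are simplified stand-ins with hypothetical names and rules, and the spelling, ontological and basic query stages are omitted for brevity.

```python
def prototype_mapper(query):
    """Toy exact match: the query must start in prototype order."""
    return "flight_service" if query.lower().startswith("flight from ") else None


def word_order_mapper(query):
    """Toy order-insensitive match on the prototype's fixed words."""
    words = set(query.lower().split())
    return "flight_service" if {"flight", "from", "to"} <= words else None


def city_state_mapper(query):
    """Toy check for a bare city or state name."""
    return "about_service" if query.lower().strip() in {"mumbai", "goa"} else None


def keyword_mapper(query):
    """Last resort in this sketch: always hands the query to keyword lookup."""
    return "keyword_service"


# Mappers are tried strictly in this order.
MAPPERS = [
    ("prototype_mapper", prototype_mapper),
    ("word_order_mapper", word_order_mapper),
    ("city_state_mapper", city_state_mapper),
    ("keyword_mapper", keyword_mapper),
]


def resolve_service(query, mappers=MAPPERS):
    """Return (mapper name, service) from the first mapper that recognizes the query."""
    for name, mapper in mappers:
        service = mapper(query)
        if service is not None:
            return name, service
    return None, None


for example in ["flight from Pune to Goa", "to Goa from Pune flight", "Goa", "sunset cruise"]:
    print(example, "->", resolve_service(example))
```

Keeping the mappers in an ordered list makes the precedence explicit and lets a new stage be inserted without touching the others.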
7 Performance Analysis
A Web application is developed to implement the framework. The baseline results
using the basic model are discussed in detail in [24, 25]. The enhanced system
takes the user query as input through the user interface shown in Fig. 3, and the
results are obtained semantically using the ontology, based on the terms in the
query, as shown in Fig. 4. The system performance is determined by computing the
precision and the efficiency in terms of query execution time. Precision is used
to measure the accuracy of the system. The performance of the semantic search
interface is evaluated by preparing a wide variety of queries for each of the
services identified for the tourism domain of India, as shown in Table 1, and
testing each identified service. For each type of service, diversified queries
were tested. The complete test results of the major services are described in
detail as follows.
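The text does not state the formula used for precision, so the sketch below assumes the standard definition, the fraction of retrieved results that are relevant, and averages it over the queries of a service as a percentage. The function names and the toy result lists are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant (0.0 for an empty result list)."""
    if not retrieved:
        return 0.0
    hits = sum(1 for item in retrieved if item in relevant)
    return hits / len(retrieved)


def mean_precision_percent(per_query):
    """Average the per-query precision over (retrieved, relevant) pairs, as a percentage."""
    scores = [precision(retrieved, relevant) for retrieved, relevant in per_query]
    return 100.0 * sum(scores) / len(scores)


# Two toy queries for one service: the first returns 1 relevant link out of 2,
# the second returns 1 relevant link out of 1.
service_queries = [
    (["u1", "u2"], {"u1"}),
    (["u3"], {"u3"}),
]
print(mean_precision_percent(service_queries))
```

Query execution time would be averaged in the same way, one measurement per query.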
Each query is given for processing to the onto-semantic search interface and to
conventional keyword-driven Web search engines, namely Bing, Google and Yahoo, and
the results retrieved are analyzed for each query. The system performance is
determined in terms of average precision and processing time, as shown in the
following table.
Different users may request information from various perspectives. Based on the
user request, the system identifies the service and then renders the result.
Comparing the results returned by all of these Web search engines, the individual
processing time of each query and the precision, we observed that the
onto-semantic search engine performs far better than the commonly used
conventional search engines, viz. Google, Bing and Yahoo.
Table 2 Comparative average precision and average processing time analysis for all services
Service name                Id  No. of   Semantic search         Google search           Bing        Yahoo
                                unique   Average     Average     Average     Average     Average     Average
                                queries  precision   processing  precision   processing  precision   precision
                                                     time                    time
About city service          1   19       92.5        0.21        56          0.52        54.73       51.59
Distance service            2   12       100         0.37        65.83       0.57        61.92       53.32
Best time to visit service  3   10       79.86       0.2         66.4        0.56        53.11       38.31
How to reach service        4   17       100         0.24        82.57       0.63        89.03       73.43
Things to do service        5   10       74.07       0.25        100         0.8         91.89       71.85
Hotel service               6   13       96.61       0.18        89.23       0.69        88.07       85.56
Hotel type service          7   10       98.84       0.24        99.29       0.75        100         95.56
Hotel rating service        8   10       92.84       0.18        88          0.71        90.96       84.65
Flight service              9   19       97.74       0.28        97.44       0.72        93.77       90.43
Tourist places service      10  11       100         0.28        94.95       0.72        80.69       66.83
Train service               11  10       94          0.23        69          0.57        62.6        40.04
Weather service             12  10       75.6        0.19        96          0.49        88.53       63.95
Bus service                 13  11       97.56       0.19        64.55       0.63        64.2        62.25
India place service         14  10       100         0           86          0.86        82.88       81.95
Keyword base service        15  22       100         0.11        84.8        0.79        83.07       70.12
City state service          16  20       100         0.23        67.73       0.7         61.71       54.37
State service               17  10       100         0.21        65.1        0.79        54.66       59.25
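The figures in Table 2 can be compared service by service with a short script. The rows below copy the semantic search and Google columns from the table; the variable names are hypothetical and the script only counts comparisons, it does not add any result that is not already in the table.

```python
# (service, semantic precision, semantic time, Google precision, Google time), from Table 2.
ROWS = [
    ("About city", 92.5, 0.21, 56.0, 0.52),
    ("Distance", 100.0, 0.37, 65.83, 0.57),
    ("Best time to visit", 79.86, 0.2, 66.4, 0.56),
    ("How to reach", 100.0, 0.24, 82.57, 0.63),
    ("Things to do", 74.07, 0.25, 100.0, 0.8),
    ("Hotel", 96.61, 0.18, 89.23, 0.69),
    ("Hotel type", 98.84, 0.24, 99.29, 0.75),
    ("Hotel rating", 92.84, 0.18, 88.0, 0.71),
    ("Flight", 97.74, 0.28, 97.44, 0.72),
    ("Tourist places", 100.0, 0.28, 94.95, 0.72),
    ("Train", 94.0, 0.23, 69.0, 0.57),
    ("Weather", 75.6, 0.19, 96.0, 0.49),
    ("Bus", 97.56, 0.19, 64.55, 0.63),
    ("India place", 100.0, 0.0, 86.0, 0.86),
    ("Keyword base", 100.0, 0.11, 84.8, 0.79),
    ("City state", 100.0, 0.23, 67.73, 0.7),
    ("State", 100.0, 0.21, 65.1, 0.79),
]

# Services where semantic search has the higher (or lower) average precision.
higher_precision = [name for name, sp, st, gp, gt in ROWS if sp > gp]
lower_precision = [name for name, sp, st, gp, gt in ROWS if sp < gp]
# True when semantic search has the lower average processing time for every service.
always_faster = all(st < gt for name, sp, st, gp, gt in ROWS)

print(len(higher_precision), "of", len(ROWS), "services: higher precision than Google")
print("Google higher for:", lower_precision)
print("Lower processing time for every service:", always_faster)
```

On these figures, semantic search has the higher average precision for 14 of the 17 services and the lower average processing time for all 17, while Google has the higher precision for the Things to do, Hotel type and Weather services.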
Graph 2 Comparative average processing time analysis of semantic and Google search engines
for all services
References
1. Buhalis, D., & Law, R. (2008). Progress in information technology and tourism manage-
ment: 20 years on and 10 years after the internet—The state of eTourism research. Tourism
Management, 29(4), 609–623.
2. Hall, C. M. (2010).Crisis events in tourism: Subjects of crisis in tourism. Current issues in
Tourism, 13(5), 401–417.
3. Hauben, J. R. (2005). Vannevar Bush and JRC Licklider: Libraries of the future 1945–1965.
The Amateur computerist (p. 36).
4. Jakkilinki, R., Sharda, N., & Ahmad, I. (2005). Ontology-based intelligent tourism information
systems: An overview of development methodology and applications. In Proceeding of TES.
5. Laddha S. S., & Jawandhiya P. M. (2018) Onto semantic tourism information retrieval. Inter-
national Journal of Engineering & Technology (UAE), 7(4.7), 148–151. ISSN 2227-524X,
Onto-Semantic Indian Tourism Information Retrieval System 17
https://doi.org/10.14419/ijet.v7i4.7.20532.
6. Laddha S.S., Koli N.A., & Jawandhiya P. M. (2018). Indian tourism information retrieval
system: An onto-semantic approach. Procedia Computer Science, 132, 1363–1374. ISSN 1877-
0509, https://doi.org/10.1016/j.procs.2018.05.051.
7. Laddha S. S., & Jawandhiya P. M. (2020). Novel concept of spelling correction for semantic
tourism search interface. In: Tuba M., Akashe S., Joshi A. (eds) Information and Communica-
tion Technology for Sustainable Development. Advances in Intelligent Systems and Computing,
Vol. 933. Springer, Singapore. https://doi.org/10.1007/978-981-13-7166-0_2 ISBN: 978-981-
13-7166-0, ISSN: 2194-5357, Pages 13–21.
8. Laddha S. S., & Jawandhiya P. M. (2018) Novel concept of query-prototype and query-
similarity for semantic search. In: Deshpande A. et al. (eds) Smart Trends in Information Tech-
nology and Computer Communications. SmartCom 2017. Communications in Computer and
Information Science, Vol. 876. Springer, Singapore.Online ISBN 978-981-13-1423-0 ISSN:
1865-0929.
9. Kanellopoulos, D. N. (2008). An ontology-based system for intelligent matching of trav-
ellers’ needs for Group Package Tours. International Journal of Digital Culture and Electronic
Tourism, 1(1), 76–99.
10. Aslandogan, Y. A., & Clement T. Y. (1999). Techniques and systems for image and video
retrieval. IEEE Transactions on Knowledge and Data Engineering, 11(1), 56–63.
11. Tomai, E., Spanaki, M., Prastacos, P., & Kavouras, M. (2005). Ontology assisted deci-
sion making–a case study in trip planning for tourism. In OTM Confederated International
Conferences on the Move to Meaningful Internet Systems (pp. 1137–1146). Berlin: Springer.
12. Vinayek, P. R., Bhatia, A., & Malhotra, N. E. E. (2013). Competitiveness of Indian tourism
in global scenario. ACADEMICIA: An International Multidisciplinary Research Journal, 3(1),
168–179.
13. Kathuria, M., Nagpal, C. K., & Duhan, N. (2016). A survey of semantic similarity measuring
techniques for information retrieval. In 2016 3rd International Conference on Computing for
Sustainable Global Development (INDIACom) (pp. 3435–3440). IEEE.
14. Laddha S. S., & Jawandhiya P. M. (2017). Semantic tourism information retrieval inter-
face. In 2017 International Conference on Advances in Computing, Communications and
Informatics(ICACCI), Udupi, (pp. 694–697). https://doi.org/10.1109/icacci.2017.8125922.
15. Wang, W., Zeng, G., Zhang, D., Huang, Y., Qiu, Y., & Wang, X. (2008). An intelligent ontology
and Bayesian network based semantic mash up for tourism. In IEEE Congress on Services-Part
I (pp. 128–135). IEEE.
16. Laddha S. S., Laddha, A. R., & Jawandhiya P. M. (2015). New paradigm to keyword search:
A survey. In IEEE Xplore digital library (pp. 920–923). https://doi.org/10.1109/icgciot.2015.
7380594. IEEE Part Number: CFP15C35-USB, IEEE ISBN: 978-1-4673-7909-0.
17. Song, T. -W., & Chen, S. -P. (2008). Establishing an ontology-based intelligent agent system
with analytic hierarchy process to support tour package selection service. In International
Conference on Business and Information, South Korea.
18. Chiu, D. K. W, Yueh, Y. T. F., Leung, H., & Hung, P. C. K. (2009). Towards ubiquitous tourist
service coordination and process integration: A collaborative travel agent system architecture
with semantic web services. Information Systems Frontiers, 11, 3, 241–256.
19. Laddha S. S., & Jawandhiya P. M. (2017). An exploratory study of keyword based search
results. Indian Journal of Scientific Research, 14(2), 39–45. ISSN: 2250-0138 (Online).
20. Laddha S. S., & Jawandhiya P. M. (2019) Novel concept of query-similarity and meta-
processor for semantic search. In: Bhatia S., Tiwari S., Mishra K., & Trivedi M. (eds) Advances
in Computer Communication and Computational Sciences. Advances in Intelligent Systems
and Computing, Vol. 760. Springer, Singapore. https://doi.org/10.1007/978-981-13-0344-9_9.
Online ISBN 978-981-13-0344-9.
21. Laddha S. S., & Jawandhiya P. M. (2017). Semantic search engine. Indian Journal of Science
and Technology, 10(23), 01–06. https://dx.doi.org/10.17485/ijst/2017/v10i23/115568. Online
ISSN : 0974-5645.
22. Park, H., Yoon, A., & Kwon, H. C. (2012). Task model and task ontology for intelligent tourist
information service. International Journal of u-and e-Service, Science and Technology, 5(2),
43–58.
23. Lee, R. S. T. (Ed). (2007). Computational intelligence for agent-based systems (Vol. 72).
Springer Science & Business Media.
24. Pan, B., Xiang, Z., Law, Rob, & Fesenmaier, D. R. (2011). The dynamics of search engine
marketing for tourist destinations. Journal of Travel Research, 50(4), 365–377.
25. Korfhage, R. R. (1997). Information storage and retrieval.
An Efficient Link Prediction Model
Using Supervised Machine Learning
1 Introduction
Online Social Network (OSN) has established an era where the life of humans is
being highly influenced by the trends and activities prevailing across these social
networks. The power of the social network could be understood in a way that
majority of the market and business trends have been decided and set on social
networks, even governments are relying upon these mediums to implement their part
© The Editor(s) (if applicable) and The Author(s), under exclusive license 19
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_2
and parcel of policy. Social networks now offer new avenues for building friendships,
associations and relationships, whether in business life or in social life [1]. Facebook,
Twitter, LinkedIn and Flickr are a few popular online social network sites that have
become an integral part of our daily life. The rapid, exponential growth of social
networks has attracted the research community to explore their evolution in order to
investigate and understand their nature and structure. Researchers study these networks
using various mathematical models and techniques. Online social networks are represented
as graphs (Fig. 1 shows a sample social network graph), in which nodes represent users
and links represent the relationships or associations between them. Researchers crawl
social networks for further examination, and the information collected at a particular
instant is usually only a partial download in which certain links between nodes may be
missing. Such missing links hinder understanding of the current structure of the
network. Apart from this, using the structural attributes of the network to estimate
new links between nodes is another interesting challenge.
The formal definition given by Liben-Nowell and Kleinberg [2] is as follows. A social
network is represented by a graph G(V, E), where each edge e = (u, v) ∈ E is a link
between the vertices (endpoints) u and v at a specific timestamp t(e). Multiple
interactions between the same pair of vertices are represented as parallel links. For
times t ≤ t′, let G[t, t′] denote the subgraph of G restricted to the edges with
timestamps between t and t′. Under the supervised training methodology, we choose a
training interval [t0, t0′] and a test interval [t1, t1′], where t0′ < t1. The link
prediction task then outputs a list of links that are not present in G[t0, t0′] but
are predicted to appear in G[t1, t1′].
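The interval-based definition can be illustrated with a minimal Python sketch. It assumes that edges are given as (u, v, timestamp) triples and that links are undirected; the function names and the restriction to nodes already seen in the training interval are assumptions made for illustration, not part of the chapter.

```python
def links_in_interval(edges, start, end):
    """Undirected links whose timestamp lies in the closed interval [start, end]."""
    return {frozenset((u, v)) for u, v, t in edges if start <= t <= end}


def new_links(edges, train_interval, test_interval):
    """Links absent from the training subgraph that appear in the test subgraph.

    Only links between nodes already present in the training subgraph are
    kept (an assumption of this sketch), since a predictor cannot score
    nodes it has never observed.
    """
    train_links = links_in_interval(edges, *train_interval)
    test_links = links_in_interval(edges, *test_interval)
    train_nodes = set().union(*train_links)
    return {link for link in test_links - train_links if link <= train_nodes}


# Hypothetical timestamped edge list.
edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 6)]
ground_truth = new_links(edges, train_interval=(0, 3), test_interval=(4, 7))
```

Here `ground_truth` is the set of links a predictor trained on the first interval would ideally output; the link to the unseen node "d" is excluded.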
Identifying the information that helps determine such missing and new links is useful
for the friendship recommendation mechanisms commonly used in online social networks.
Several algorithmic techniques described by Chen et al. [3] were introduced by IBM in
its internal, private online social network, established so that its employees could
connect with each other digitally. The prediction of existing but hidden links, or of
new links, from existing social network data is termed the link prediction problem.
The applications of link prediction include bioinformatics, for finding protein–protein
interactions [4]; recommendation systems for e-commerce Web sites, built on predicted
links between nodes across a network [5]; and security systems, for detecting and
tracking hidden terrorist groups or terrorist networks [6]. To address link prediction
in these different scenarios, many algorithms and procedures have been proposed, and
the majority of them belong to machine learning approaches.
The applications mentioned above are not limited to social networks; many other
networks, such as information networks, Web links, bioinformatics networks,
communication networks and road networks, can also be processed in this way.
Processing a large and complex social network in a single pass is possible but
inefficient. The task can be divided into subtasks that are handled separately. The
basic building blocks of any social network are nodes, edges, node degrees and local
neighborhoods. Potential links are predicted by exploiting global and local topological
features of the network. Local neighborhood features are used to measure the similarity
between nodes. The features estimated with neighborhood techniques such as Jaccard’s
coefficient (JC), Adamic/Adar (AA) and preferential attachment (PA) are then used to
build a classifier model for link prediction.
The paper addresses the problem of link prediction with a machine learning classifier
that is trained on similarity features extracted from the network topology. The
proposed classifier is evaluated experimentally on social networking datasets (Facebook
and Wikipedia). The paper also introduces the state-of-the-art similarity techniques,
including Adamic/Adar, Jaccard’s coefficient and preferential attachment, which are
used as feature extraction techniques. The objective of the paper is to explore:
• Online social networks and their evolution, along with an appropriate representation
using graphs.
• The problem of link prediction and its evolution.
• How link prediction problems can be comprehended and addressed.
• The techniques employed for link prediction for establishing relationships
between nodes across the online social network.
• The contribution of machine learning to link prediction between nodes in an
online social network.
• Accordingly, a proposed model for effective and efficient link prediction between
nodes in an online social network.
The advent of online social networks has drawn researchers to analyze these growing,
increasingly complex networks in order to extract knowledge for predictions and
recommendations. Various techniques and predictive models have been proposed for
analyzing online social networks; these methods are classified by the way they exploit
the data, and local, global and machine learning-based methods are commonly employed.
A survey of the distinct methods can be found in [7].
Common Neighborhood (CN). According to Newman [8], the CN measure is obtained by
counting the common neighbors of the two nodes between which a future link is to be
predicted. It is thus a similarity score given by the size of the intersection of the
two nodes’ neighbor sets, and a higher score suggests a potential link between them.
The set of neighbors of x is denoted by Γ(x) and that of y by Γ(y), so that
CN(x, y) = |Γ(x) ∩ Γ(y)|.
$$JC(x, y) = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}$$
Preferential Attachment. In the same fashion, Kunegis et al. [10] have proposed
another approach for measuring the similarity between nodes, which identifies
potential nodes across the social network. It rests on the observation that new links
tend to attach to the nodes of highest degree in the network.
Mathematical representation of the measure is:
$$S_{xy} = \frac{2\,|\Gamma(x) \cap \Gamma(y)|}{k(x) + k(y)}$$

$$S_{xy} = \frac{2\,|\Gamma(x) \cap \Gamma(y)|}{\min\{k(x), k(y)\}}$$
Hub Depressed Index (HDI). The technique is similar to HPI; the only difference is
that the denominator is the larger of the two degrees k(x) and k(y), that is, the
degree of whichever node of the pair (x or y) has more neighbors; mathematically, it is

$$S_{xy} = \frac{2\,|\Gamma(x) \cap \Gamma(y)|}{\max\{k(x), k(y)\}}$$
$$S_{xy} = \frac{2\,|\Gamma(x) \cap \Gamma(y)|}{k(x)\,k(y)}$$
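The neighborhood-based scores above can be computed directly from an adjacency structure. The following Python sketch assumes an undirected graph stored as a dictionary mapping each node to its set of neighbors. The function names are illustrative, the preferential attachment score uses the commonly cited degree product k(x)·k(y) (an assumption, since that formula is not printed above), and `degree_normalized` mirrors the printed S_xy expressions with their factor of 2.

```python
def common_neighbors(adj, x, y):
    """CN(x, y) = |Γ(x) ∩ Γ(y)|."""
    return len(adj[x] & adj[y])


def jaccard(adj, x, y):
    """JC(x, y) = |Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)| (0.0 if both sets are empty)."""
    union = adj[x] | adj[y]
    return len(adj[x] & adj[y]) / len(union) if union else 0.0


def preferential_attachment(adj, x, y):
    """Degree product k(x) * k(y); the standard PA score, assumed here."""
    return len(adj[x]) * len(adj[y])


def degree_normalized(adj, x, y, combine):
    """S_xy = 2 * |Γ(x) ∩ Γ(y)| / combine(k(x), k(y)), as printed above.

    `combine` is one of: lambda a, b: a + b, min, max, lambda a, b: a * b.
    """
    denominator = combine(len(adj[x]), len(adj[y]))
    return 2 * len(adj[x] & adj[y]) / denominator if denominator else 0.0


# Small hypothetical undirected graph.
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
features = [
    common_neighbors(adj, 1, 2),
    jaccard(adj, 1, 2),
    preferential_attachment(adj, 1, 2),
    degree_normalized(adj, 1, 2, max),
]
```

A feature vector such as `features` is the kind of input a classifier for link prediction could be trained on, one vector per candidate node pair.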
Path Distance. This is a global method in which the global structure of the network
is exploited to produce a measure on which link prediction is based. It is the
distance measured between two nodes, used to identify how close they are. Dijkstra’s
algorithm could be applied to retrieve the shortest path, but it would be inefficient
for large and complex social networks. The measure is also known as the geodesic
distance between two nodes.
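For an unweighted graph, the geodesic distance can be obtained with breadth-first search instead of Dijkstra’s algorithm. The sketch below is a minimal illustration under that assumption; the adjacency dictionary and the function name are hypothetical.

```python
from collections import deque


def geodesic_distance(adj, source, target):
    """Number of edges on a shortest path, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical graph: a chain 1-2-3-4 plus an isolated node 5.
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: set()}
```

A smaller distance indicates a closer pair, so a link predictor would typically use the negated distance as the similarity score.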
Katz. It considers all the paths between two nodes and gives the shortest paths the
highest weight. The contribution of a path decreases exponentially with its length, so
that longer paths receive lower values; mathematically, it is represented as

$$\mathrm{Katz}(x, y) = \sum_{l=0}^{n} \beta^{l}\,\bigl|\mathrm{path}^{\langle l \rangle}(x, y)\bigr|$$

where |path⟨l⟩(x, y)| is the number of paths of length l between x and y, and β
controls how strongly path length is penalized, that is, how much the longer paths
should be considered.
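A truncated version of the Katz sum can be sketched in Python as follows. The sketch assumes the graph is given as a 0/1 adjacency matrix (a list of lists) and counts walks of each length through repeated matrix multiplication, which is the usual way the measure is computed; the default values of `beta` and `max_len` are arbitrary illustrative choices. The l = 0 term is omitted because it contributes nothing for x ≠ y.

```python
def katz_score(adj_matrix, x, y, beta=0.1, max_len=4):
    """Sum of beta**l * (number of walks of length l from x to y), l = 1..max_len."""
    n = len(adj_matrix)
    power = [row[:] for row in adj_matrix]  # holds A**l, starting with l = 1
    score = 0.0
    for length in range(1, max_len + 1):
        score += beta ** length * power[x][y]
        # Advance to A**(l + 1) with a plain matrix product.
        power = [
            [sum(power[i][k] * adj_matrix[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return score


# Hypothetical path graph 0 - 1 - 2.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
```

With a smaller `beta`, the score is dominated by the shortest connections between the two nodes, which matches the weighting described above.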
The literature review shows that social networks are exploited on the basis of their
graph structures. Various methods have been used to compute links between nodes, and
these methods vary with the nature of the network (information network, business
network, friendship network and so on); therefore, no single method can effectively
address the problem of link prediction. To simplify the problem, it is solved in two
phases. In the first phase, the local structure of the network is exploited and a new
resultant network is formed with additional features. In the second phase, the new
network with its additional features is processed with machine learning techniques to
build a classifier for link prediction in the social network. A naïve Bayes network
has been used for the experimental analysis.
Bayes Theorem. The theorem is used to find the probability of an event A given that
an event B has occurred. Here B is designated as the evidence and A as the hypothesis.
The attributes or predictors are assumed to be independent, that is, the presence of
one attribute does not affect another; this is why the method is called naive. The
theorem is expressed as

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
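As a small illustration of how the theorem and the independence assumption combine, the Python sketch below computes class posteriors from priors and per-feature likelihoods. The class labels, the feature name and all probabilities are invented for the example and do not come from the chapter’s experiments.

```python
def bayes_posterior(prior, likelihood, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


def naive_bayes_posteriors(priors, likelihoods, observed):
    """Posterior of each class given observed features, assuming independence.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    iterable of feature names that were observed
    """
    joint = {}
    for cls, prior in priors.items():
        probability = prior
        for feature in observed:
            probability *= likelihoods[cls][feature]  # independence assumption
        joint[cls] = probability
    evidence = sum(joint.values())  # P(B), obtained by summing over the classes
    return {cls: p / evidence for cls, p in joint.items()}


# Hypothetical numbers: a high Jaccard score is more likely for linked pairs.
priors = {"link": 0.5, "no_link": 0.5}
likelihoods = {"link": {"high_jc": 0.8}, "no_link": {"high_jc": 0.2}}
posteriors = naive_bayes_posteriors(priors, likelihoods, ["high_jc"])
```

The class with the largest posterior is the predicted label for the node pair.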
The experimental study is carried out on the Wikipedia network, the dataset for which
is downloaded from the SNAP Web site. The performance parameters used for the analysis
are precision, recall, F1 score and accuracy. Link prediction is a binary
classification problem in which a positive instance denotes the presence of a link
between two nodes and a negative instance denotes the absence of a potential link.
Precision is obtained by dividing the number of true positives by the sum of true
positives and false positives. Sensitivity, or recall, is obtained by dividing the
number of true positives by the sum of true positives and false negatives. The
equations for performance evaluation are as follows.
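$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}$$

$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$

$$\text{F1 score} = \frac{2 \cdot \text{True Positive}}{2 \cdot \text{True Positive} + \text{False Negative} + \text{False Positive}}$$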
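These measures can be computed from the four confusion-matrix counts, as in the following Python sketch. The accuracy formula, which the text names but does not print, is taken here as the standard (TP + TN) / total; the counts in the example are invented.

```python
def precision(tp, fp):
    """True positives over all predicted positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """True positives over all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in count form."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Correctly classified instances over all instances (standard definition)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical confusion-matrix counts for a link / no-link classifier.
tp, tn, fp, fn = 8, 8, 2, 2
report = {
    "precision": precision(tp, fp),
    "recall": recall(tp, fn),
    "f1": f1_score(tp, fp, fn),
    "accuracy": accuracy(tp, tn, fp, fn),
}
```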
4 Experimental Study
The dataset used for the experimental analysis is Wikipedia voting history data. It
includes around 2794 nodes and around 103,747 votes cast among 7118 users, including
existing admins (either voting or being voted on). About half of the votes in the
dataset are cast by existing admins, while the other half come from ordinary Wikipedia
users. The dataset is downloaded from https://snap.stanford.edu/data/wiki-Vote.html.
The network nodes are users, and a directed edge from node i to node j indicates that
user i has voted on user j.
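A directed edge list of this kind can be read into an adjacency structure with a few lines of Python. The sketch assumes the plain-text layout used by SNAP edge lists, in which comment lines start with `#` and each remaining line holds a source and a target node id separated by whitespace; the sample lines below are invented and do not reproduce the actual file.

```python
def parse_edge_list(lines):
    """Return (source, target) integer pairs, skipping comments and blank lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        edges.append((int(source), int(target)))
    return edges


def build_out_neighbors(edges):
    """Map each user to the set of users he or she has voted on."""
    out_neighbors = {}
    for source, target in edges:
        out_neighbors.setdefault(source, set()).add(target)
        out_neighbors.setdefault(target, set())  # keep users who only receive votes
    return out_neighbors


# Invented sample in the assumed format.
sample = [
    "# FromNodeId\tToNodeId",
    "30\t1412",
    "30\t3352",
    "3\t28",
]
votes = build_out_neighbors(parse_edge_list(sample))
```

For a local copy of the dataset, the same functions can be applied to the lines of the opened file.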
The naive Bayes network classification technique is used to create a classifier model
for link prediction in a social network. The model is created using stratified tenfold
cross-validation. Table 1 shows that the classifier predicted around 90.37% of the
instances correctly, leaving around 9.62% of the instances incorrectly classified. The
total time taken to build the model is 0.03 s. This is small, although the selected
network is fairly small and the time may increase on larger real data in future;
nevertheless, it is reasonably fair.
Naive Bayes combined with Jaccard’s coefficient produced significantly better results,
with accuracy improved to 99.12%. The classifier model is built in negligible time. It
correctly classified around 340 instances, against three incorrectly classified
instances. Figures 2 and 3 represent the classification of true instances and false
instances of the network.
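The reported accuracy can be checked against the reported instance counts with a one-line calculation. The sketch below only restates the arithmetic, using the rounded counts given in the text (around 340 correct and three incorrect).

```python
def accuracy_percent(correct, incorrect):
    """Share of correctly classified instances, as a percentage."""
    return 100 * correct / (correct + incorrect)


# Counts reported for naive Bayes combined with Jaccard's coefficient.
jaccard_accuracy = accuracy_percent(correct=340, incorrect=3)
```

The result is approximately 99.125%, in line with the reported 99.12%.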
In this paper, online social networks are studied from the point of view of link
prediction between nodes in a large-scale online social network. In the process, we
have introduced various classical local and global techniques that produce a measure
used to identify a potential link between nodes. These dyadic structural techniques
have been studied together with supervised machine learning techniques. Adamic/Adar
and Jaccard’s coefficient are each combined with the naive Bayes classification
technique to build a classifier. The experimental analysis shows that Jaccard’s
coefficient with naive Bayes produces more accurate results than the previous
approach. Although these results may indicate over-fitting compared with the previous
approach, which is itself reasonably fair, the latter approach is superior in
accuracy. The model is trained and tested on only one type of social network. Applying
it to other types of social network may produce significant results and help
generalize the model to other online social networks.
References
1. Liben-Nowell, D., & Kleinberg, J. (2007). The link prediction problem for social networks.
Journal of the American Society for Information Science and Technology, 58(7), 1019–1031.
2. Kautz, H., Selman, B., & Shah, M. (1997). Referral web: Combining social networks and
collaborative filtering. Communications of the ACM, 40(3), 63.
3. Chen, J., Geyer, W., Dugan, C., Muller, M., & Guy, I. (2009). Make new friends, but keep the old:
Recommending people on social networking sites. In Proceedings of the 27th International
Conference on Human Factors in Computing Systems, ser. CHI’09 (pp. 201–210). New York:
ACM. https://doi.acm.org/10.1145/1518701.1518735.
4. Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2006). Mixed membership stochastic
block models for relational data with application to protein-protein interactions. In Proceedings
of International Biometric Society-ENAR Annual Meetings.
5. Huang, Z., Li, X., & Chen, H. (2005). Link prediction approach to collaborative filtering. In
Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries.
6. Hasan, M. A., Chaoji, V., Salem, S., & Zaki, M. (2006). Link prediction using supervised
learning. Counter terrorism and Security: SDM Workshop of Link Analysis.
7. Pandey, B., Bhanodia, P. K., Khamparia, A., & Pandey, D. K. (2019). A comprehensive survey
of edge prediction in social networks: Techniques, parameters and challenges. Expert Systems
with Applications, Elsevier. https://doi.org/10.1016/j.eswa.2019.01.040.
8. Newman, M. E. J. (2001). Clustering and preferential attachment in growing networks. Physical
Review E.
9. Adamic, L. A., & Adar, E. (2003). Friends and neighbors on the web. Social Networks, Elsevier,
25(3), 211.
10. Kunegis, J., Blattner, M., & Moser, C. (2013). Preferential attachment in online networks:
Measurement and explanations. In Proceedings of the 5th Annual ACM Web Science Conference
(WebSci’13) (pp. 205–214). New York: ACM.
11. Zhou, T., Lü, L., & Zhang, Y. C. (2009). Predicting missing links via local information.
The European Physical Journal B, 71, 623. https://doi.org/10.1140/epjb/e2009-00335-8.
Optimizing Cost and Maximizing Profit
for Multi-Cloud-Based Big Data
Computing by Deadline-Aware Optimize
Resource Allocation
Abstract Cloud computing is among the most powerful and in-demand technologies for
businesses in this decade. That “data is the future oil” can be shown in many ways, as
most businesses and corporate giants are deeply concerned about their business data.
Accommodating and processing this data requires an expensive platform that works
efficiently. Researchers and professionals have proposed and standardized several
cloud computing standards, but big data processing in multi-cloud infrastructure still
needs further research. Relying on a single cloud provider is challenging with respect
to latency, QoS and the monetary cost to application providers. We propose an
effective deadline-aware resource management scheme built on novel algorithms, namely
job tracking, resource estimation and resource allocation. In this paper, we discuss
two algorithms in detail and experiment in a multi-cloud environment. First, we
examine the job tracking algorithm, and then the job estimation algorithm. Using
multiple cloud service providers is a promising way to obtain an affordable class of
services and QoS.
1 Introduction
The last decade was a “data decade.” Many multinational companies changed their
modes of operation based on data analysis. Big data and data analysis are essential
and mandatory for every industry. Companies like Amazon, Google and Microsoft
offer data processing platforms based entirely on the cloud [1]; in other
© The Editor(s) (if applicable) and The Author(s), under exclusive license 29
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_3
sense, all social media companies are also targeting the cloud as a prominent solution.
Netflix and YouTube have already started using the cloud [2]. Cloud computing has
proved to be a very effective and reliable solution for huge multivariate data. Still,
researchers and professionals are working to extract more possibilities from the
existing cloud structure. One of the major and critical tasks is resource provisioning
in a multi-cloud architecture. We try to solve some of the issues in multi-cloud
architecture by implementing a prominent algorithm in such an architecture.
Cloud computing is available in three types [3]. The first is the Public Cloud
Platform, in which third-party providers are responsible for providing services on the
public cloud. In most cases, these services are free or sold on demand by service
providers; sometimes customers pay only per usage for the CPU cycles, storage or
bandwidth their applications consume [4–6]. The second is the Private Cloud Platform,
in which the entire infrastructure is privately owned by the organization and is
maintained and managed entirely with internal resources. When an organization finds it
too difficult to maintain and manage the entire infrastructure, it can use a VPC
(Virtual Private Cloud), where the infrastructure is owned by a third-party cloud
provider but used on the organization’s premises [7–9]. The third is the Hybrid Cloud
Platform; as the name indicates, it mixes computing resources from public and private
services. This platform is rapidly being adopted as a cost-saving option that is
readily available on demand for fast-moving digital business transformation.
Cloud providers have distributed their infrastructure by expanding data centers into
different geographical regions worldwide [4–6]. Google itself operates 13 data centers
around the globe. Managing distributed data centers while maximizing profit is a
current problem. Ultimately, the customer is affected by the high cost and maintenance
charges of these data centers. This cost is bound by four principal factors for
applications that serve big data. Numerous cost-effective and time-effective parallel
tools, each with its own programming paradigm, are available for big data processing.
The key component of these tools, and of every big data application, is resource
management, which uses the available resources and manages the trade-off between cost
and result. Complexity, scale, heterogeneity and hard predictability are the key
factors of these big data platforms. Complexity lies inside the architecture and
covers proper scheduling of resources, power management, storage systems and more.
Scale depends on the target problem: data dimensions and parallelism under tight
deadlines [10]. Heterogeneity is a technological need that arises from the
maintainability and evolving nature of the hardware. Hard predictability results from
the interplay of the three factors explained above, combined with the effect of
hardware trade-offs.
In 2014, Inacio and Dantas presented a characterization [11] of optimization problems
related to large datasets and noted that scale exacerbates them. A variety of aspects
affect the performance of scheduling policies, such as data volume (storage), data
variety, data velocity, security and privacy, cost, connectivity and data sharing
[12, 13]. The resource manager can be organized in a two-layer architecture, as shown
in Fig. 1. The job scheduler [12] is responsible for allocating resources to manage
the execution of the different jobs running at the same time.
Figure 1 represents resource management at the local execution level, where scale
exacerbates known management and dimensioning problems, in relation to both
architecture and resource allocation and coordination [14, 15]. The task-level
scheduler, on the other hand, decides how to assign the tasks of each job to multiple
task executors [10, 16]. The cluster scheduler treats each job as a black box and
applies a general policy and strategy. Our aim is that, by exploiting
application-specific features, we optimize resource scheduling decisions and achieve
better performance for advanced data analytics [17].
Figure 2 shows various open-source big data resource management frameworks [18]. The
literature shows that most of the available big data processing frameworks are open
source. Some proprietary frameworks carry license fees and require specialized
high-end infrastructure. Open-source frameworks, by contrast, run on commodity
hardware with only marginal variation in requirements. Spark is a mainstream data
streaming framework that is popular in industry and can be extended and used for data
analysis in various IoT-based applications. YARN is the heart of Hadoop and handles
global resource management (ResourceManager) and per-application management
(ApplicationMaster) [19–21].
Some observations on research gap identification and problem formulation are
mentioned.
3 Problem Formulation
Missing a deadline disrupts an entire large, data-intensive processing run; it leads
to under-utilized resources, incurs cloud usage costs for both the cloud service
provider and the user, and leads to poor decision making [22, 23]. To address this
issue, we designed a framework that is framework-agnostic and does not rely on job
repetition or pre-emption support. Instead, this work focuses on using job histories
and statistics to control job admission. In place of traditional fair-share resource
utilization, we design a deadline-aware optimized resource allocation policy by
implementing two algorithms: one is job tracking, and the other is resource
estimation and resource allocation [8, 24]. Consideration of the second algorithm
Algorithm 1 Job_Tracking
Initiation of Asp
Function Job_Track(C_time, ReqTask, D, N_cpuAllo, R)
    CPU_Deadline = C_time / D
    M_Cpus = min(ReqTask, CC)
    ReqMinRate = CPU_Deadline / M_Cpus
    ReqMinList.add(ReqMinRate)
    CPU_Frac_Min = min(ReqMinList)
    CPU_Frac_Max = max(ReqMinList)
    CPU_Frac_Last = N_cpuAllo / CPU_Frac_Max
    Success_Last = Success
Function Ends
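The steps of Algorithm 1 can be sketched in Python as follows. The interpretation of the symbols is an assumption: C_time is read as the total computation time of the job, D as its deadline, ReqTask as the number of requested tasks, CC as the cluster CPU capacity and N_cpuAllo as the number of CPUs actually allocated. The class and attribute names are hypothetical.

```python
class JobTracker:
    """Keeps per-cluster statistics that are updated each time a job completes."""

    def __init__(self, cluster_capacity):
        self.cluster_capacity = cluster_capacity  # CC (assumed meaning)
        self.req_min_rates = []                   # ReqMinList
        self.cpu_frac_min = None
        self.cpu_frac_max = None
        self.cpu_frac_last = None
        self.success_last = None

    def track(self, compute_time, requested_tasks, deadline, cpus_allocated, success):
        # CPUs needed to finish the work exactly at the deadline.
        cpu_deadline = compute_time / deadline
        # The job cannot use more CPUs than it has tasks or the cluster has cores.
        max_cpus = min(requested_tasks, self.cluster_capacity)
        # Minimum fraction of the usable CPUs required to meet the deadline.
        req_min_rate = cpu_deadline / max_cpus
        self.req_min_rates.append(req_min_rate)
        self.cpu_frac_min = min(self.req_min_rates)
        self.cpu_frac_max = max(self.req_min_rates)
        self.cpu_frac_last = cpus_allocated / self.cpu_frac_max
        self.success_last = success


# Hypothetical job: 200 CPU-seconds of work, 50 tasks, 10 s deadline, 25 CPUs granted.
tracker = JobTracker(cluster_capacity=100)
tracker.track(compute_time=200.0, requested_tasks=50, deadline=10.0,
              cpus_allocated=25, success=True)
```

After each call, the tracker holds the smallest and largest required rates seen so far, which a resource estimator could use as bounds for the next allocation.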
5 Experimental Setup
The existing fair-share resource allocator does not take into account the deadline of
each individual job. The general assumption in this kind of resource allocation is
that every job can run indefinitely and that there is no limit on the turnaround time
a job’s owner is willing to tolerate. The proposed algorithm will be implemented on
top of the basic allocator by considering job deadlines in a resource-constrained
environment. A trace-based simulation developed in Python and Java, which applies
admission control when a job is submitted, is expected to give the desired result. We
are in the process of implementing this for different resource-constrained settings
with a variety of hardware specifications. For the entire experimental setup, nodes
run Ubuntu 12.04 Linux with the Hadoop MapReduce stack.
The proposed algorithm tracks the success of its allocation decisions and improves its
future allocations accordingly. Every time a job completes, it updates a cluster-wide
model that includes information about the duration, size, maximum parallelization,
deadline and provided resources for each job. If the job is successful, the proposed
algorithm becomes more optimistic and provides the jobs that follow with fewer
resources, in the expectation that they will still meet their deadlines. Next, we
compare the results of the proposed algorithm with the existing methodology in the big
data analytics framework. The novelty of the proposed algorithm is that, if a job is
unsuccessful, Justice provides more conservative allocations to make sure that no more
jobs miss their deadlines.
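The feedback behavior described above can be illustrated with a deliberately simple Python sketch. It is not the authors’ allocator: the class, the linear adjustment rule and the `step`, `floor` and `ceiling` parameters are all hypothetical, chosen only to show an allocation that shrinks after met deadlines and grows after missed ones.

```python
import math


class AdaptiveAllocator:
    """Scales a job's estimated CPU need by a fraction that adapts to outcomes."""

    def __init__(self, fraction=1.0, step=0.25, floor=0.25, ceiling=1.0):
        self.fraction = fraction  # share of the estimate that is granted
        self.step = step          # hypothetical adjustment size
        self.floor = floor
        self.ceiling = ceiling

    def allocate(self, estimated_cpus):
        """Grant at least one CPU, otherwise the scaled estimate rounded up."""
        return max(1, math.ceil(estimated_cpus * self.fraction))

    def record(self, met_deadline):
        if met_deadline:
            # Optimistic: later jobs receive fewer resources.
            self.fraction = max(self.floor, self.fraction - self.step)
        else:
            # Conservative: back off quickly so that no more deadlines are missed.
            self.fraction = min(self.ceiling, self.fraction + 2 * self.step)


allocator = AdaptiveAllocator()
first = allocator.allocate(8)    # full estimate while no history exists
allocator.record(met_deadline=True)
second = allocator.allocate(8)   # smaller grant after a success
allocator.record(met_deadline=False)
third = allocator.allocate(8)    # grant restored after a missed deadline
```

In this sketch a miss is penalized twice as strongly as a success is rewarded, which is one possible way to express a preference for meeting deadlines over saving resources.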
Our research aims to satisfy deadlines and preserve fairness to enable reliable
use of multi-analytic systems in resource-constrained clusters. It achieves this in
a framework-agnostic way by utilizing admission control and predicting resource
requirements without exploiting job repetitions. A key point of our research is its
applicability without costly modifications and maintenance in existing popular open-
source systems like Apache Mesos and YARN. Thus it requires minimal effort to
integrate with the resource manager without the need to adapt to API or structural
changes of the processing engines.
References
1. Gera, P., et al. (2016). A recent study of emerging tools and technologies boosting big data
analytics.
2. Shvachko, K., et al. (2010). The Hadoop distributed file system. In Proceedings of the 26th
IEEE Symposium on Mass Storage Systems and Technologies (MSST), Washington, DC, USA.
3. George, L. (2011). HBase: The definitive guide: Random access to your planet-size data.
O’Reilly Media, Inc.
4. Ghemawat, S., & Dean, J. (2008). MapReduce: Simplified data processing on large clusters.
Communications of the ACM.
5. Malik, P., & Lakshman, A. (2010). Cassandra: A decentralized structured storage system. ACM
SIGOPS OS Review.
6. Zaharia, M., et al. (2012). Resilient distributed datasets: A fault-tolerant abstraction for in-
memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked
Systems Design and Implementation, San Jose, CA.
7. Vavilapalli, V. K., et al. (2013). Apache Hadoop yarn: Yet another resource negotiator. In
Proceedings of the 4th ACM Annual Symposium on Cloud Computing, Santa Clara, California.
8. Hu, M., et al. (2015). Deadline-oriented task scheduling for MapReduce environments. In
International Conference on Algorithms and Architectures for Parallel Processing (pp. 359–
372). Berlin: Springer.
9. Golab, W., et al. (2018). OptEx: Deadline-aware cost optimization for spark. Available
at https://github.com/ssidhanta/OptEx/blob/master/optex_technical.pdf, Technical Report, 01
2018.
10. Hindman, B., et al. (2011). Mesos: A platform for fine-grained resource sharing in the data
center. In NSDI (pp. 22–22).
11. Siddiqi, F. (2018). Netflix at Spark+AI Summit 2018.
12. Laney, D., et al. (2001). 3D data management: Controlling data volume, velocity, and variety.
13. Hindman, B., et al. (2011). Mesos: A platform for fine-grained resource sharing in the data
center. In Proceedings of the 8th USENIX Conference on Networked Systems Design and
Implementation, Boston, MA, USA.
14. Ghemawat, S., et al. (2004). MapReduce: Simplified data processing on large clusters.
In Proceedings of the 6th Conference on Symposium on Operating Systems Design &
Implementation (Vol. 6 of OSDI’04, pp. 10–10).
15. Pradeepini, G., et al. (2016). Cloud-based big data analytics a review. In Proceedings—2015
International Conference on Computational Intelligence and Communication Networks, CICN
IEEE 2016 (pp. 785–788).
16. Misra, V., et al. (2007). PBS: A unified priority-based scheduler. In ACM SIGMETRICS
Performance Evaluation Review (Vol. 35. 1, pp. 203–214). ACM.
17. Zaharia, M., Das, T., & Armbrust, M., et al. (2016). Apache spark: A unified engine for big
data processing. Communications of the ACM.
18. Dimopoulos, S., & Krintz, C., et al. (2017). Justice: A deadline-aware, fair-share resource
allocator for implementing multi-analytics. In 2017 IEEE International Conference on Cluster
Computing (CLUSTER) (pp. 233–244).
19. Jette, M. A., et al. (2003). Slurm: Simple Linux utility for resource management. In Workshop
on Job Scheduling Strategies for Parallel Processing (pp. 44–60). Berlin: Springer.
20. Pradeepini, G., et al. Experimenting cloud infrastructure for tomorrows big data analytics.
International Journal of Innovative Technology and Exploring Engineering, 8(5), 885–890.
21. Cheng, S., et al. (2016). Evolutionary computation and big data: Key challenges and future
directions. In Proceedings of the Data Mining and Big Data, First International Conference,
DMBD 2016, Bali, Indonesia, (pp. 3–14).
22. Singer, G., et al. (2010). Towards a model for cloud computing cost estimation with reserved
instances. CloudComp.
23. Xiong, N., et al. (2015). A walk into metaheuristics for engineering optimization: Principles,
methods, and recent trends. International Journal of Computational Intelligence Systems, 8,
606–636.
24. https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarnsite/FairScheduler.html. for
YARN Fair Scheduler.
25. Chestna, T., & Imai, S., et al. Accurate resource prediction for hybrid IAAS clouds using
workload-tailored elastic compute units. ser. UCC’13.
26. Pradeepini, G., et al. (2017). Opportunity and challenges for migrating big data analytics in
cloud. In IOP Conference Series: Materials Science and Engineering.
A Comprehensive Survey on Passive
Video Forgery Detection Techniques
V. Kumar · M. Gaur
Department of Computer Science and Engineering, Centre for Advanced Studies,
Dr. A.P.J Abdul Kalam Technical University, Lucknow, India
e-mail: vinay.kumar@cas.res.in
M. Gaur
e-mail: director@cas.res.in
A. Singh (B) · V. Kansal
Department of Computer Science and Engineering, Institute of Engineering and Technology
Lucknow, Dr. A.P.J Abdul Kalam Technical University, Lucknow, India
e-mail: 2216@ietlucknow.ac.in
V. Kansal
e-mail: vineetkansal@ietlucknow.ac.in
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_4
1 Introduction
Video forgery demands significant attention in the modern era. The prime reason
is that multimedia is a preferred medium for transmitting information because of
its low encryption cost. Information within multimedia is processed by reading
frames. Because this mechanism is used so widely in transmission, it is maliciously
attacked by hackers, and frames are altered. To this end, researchers use distinct
mechanisms to perform encryption and to detect forgery, if any, within video
frames.
Digital video tampering is the modification of the contents of a video to produce a
doctored or fake video [1]. Attacks that change video can be divided into three
domains: spatial, temporal and spatial–temporal. Tampering can be done using
various techniques [2]. The following types of tampering are applied to videos:
• Shot-level tampering: A scene is detected in the video, and then the scene is
copied to another place or manipulated. This tampering is applied at the temporal
or spatial level.
• Frame-level tampering: The frames are first extracted from the video, and
tampering is then done on these frames. The forger may remove, add or copy
frames to change the contents of the video. It is a temporal tampering mechanism.
• Block-level tampering: It is applied to blocks of a video, i.e., any specified area
of the video frames. Blocks are cropped and replaced in the video. It is spatial
tampering performed at block level.
• Pixel-level tampering: Video frames are changed at pixel level; pixels are
modified, copied or replaced [3]. Spatial attacks are performed at pixel level.
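The frame-level operations listed above (removing, adding or copying frames) can be illustrated with a minimal sketch. It assumes a toy video in which each frame is a short byte string; the helper names `make_video`, `delete_frames`, `insert_frames` and `duplicate_frames` are hypothetical and are not taken from the surveyed papers.

```python
def make_video(n_frames, pixels_per_frame=16):
    """Toy video (assumption): frame i is a bytes object whose pixels all equal i."""
    return [bytes([i] * pixels_per_frame) for i in range(n_frames)]


def delete_frames(frames, start, count):
    """Frame deletion: drop `count` frames beginning at index `start`."""
    return frames[:start] + frames[start + count:]


def insert_frames(frames, position, new_frames):
    """Frame insertion: splice foreign frames in before index `position`."""
    return frames[:position] + list(new_frames) + frames[position:]


def duplicate_frames(frames, start, count, position):
    """Frame duplication: copy frames[start:start+count] and paste them at `position`."""
    return insert_frames(frames, position, frames[start:start + count])


def labels(frames):
    """Read back the per-frame label (the value of the first pixel)."""
    return [frame[0] for frame in frames]


video = make_video(10)
deleted = delete_frames(video, start=3, count=2)
duplicated = duplicate_frames(video, start=0, count=3, position=6)

print(labels(deleted))     # [0, 1, 2, 5, 6, 7, 8, 9]
print(labels(duplicated))  # [0, 1, 2, 3, 4, 5, 0, 1, 2, 6, 7, 8, 9]
```

All three operations change only the temporal order or length of the sequence, which is why frame-level tampering is classed as temporal tampering.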
Over the last decade, video forensics has become an important field of study. As
shown in Fig. 1, it is divided into three categories [4–6].
[Fig. 1: diagram. Video forensic branches into: source video; differentiate original & edited video; forgery detection]
The categories are source recognition, distinguishing computer-generated from actual
video, and forgery detection. The first category focuses on describing the source of
a digital product, such as a mobile phone, camcorder or camera. The second aims to
differentiate between real and edited videos. The third, forgery detection, aims to
find proof of tampering in digital video data.
Digital video forensics is concerned with three main tasks, as shown in Fig. 2. To
tackle the challenge of digital content authentication, the video media forensics area
provides a set of tools and techniques known collectively as tamper or forgery
detection techniques. Even minute adjustments to digital video or image content can
cause real societal and legal problems. Altered recordings may be used to produce
misleading news accounts or to mislead individuals. There are many networks that
manipulate media data on social networking sites such as Yahoo, Twitter, TikTok,
Instagram, Facebook and YouTube.
This paper is organized as follows. Section 2 details the video forgery detection
methods used to counter the above tampering methods, together with a qualitative
analysis of passive video forgery detection. Section 3 presents a comparative
analysis of different techniques. Section 4 presents highlights and issues in video
forgery detection. Section 5 concludes and presents the future scope of this work.
Video forgery detection aims to establish the authenticity of a video and to expose
the potential modifications and forgeries that the video might have undergone [7].
Undesired post-processing operations or forgeries are generally irreversible and leave
some digital footprints. Video forgery detection techniques scrutinize these footprints
in order to differentiate between original and forged videos. When a video is
forged, some of its fundamental properties change, and video forgery detection
techniques are used to detect these changes. There are two fundamental approaches
to video forgery detection: the active approach and the passive approach, as
shown in Fig. 3.
[Fig. 3: diagram. Objective video forensic branches into: passive approach; active approach]
Active forgery detection includes techniques like digital watermarking and digital
signatures, which help to authenticate content ownership and detect copyright
violations [8]. Though the basic application of watermarking and signatures is
copyright protection, they can also be used for fingerprinting, forgery detection,
error concealment, etc. The active approach has several drawbacks, as it requires a
signature or watermark to be embedded during the acquisition phase, at the time the
footage is recorded, or by an individual after the acquisition phase. This restricts
the application of the active approach because it needs distinctive hardware such as
specially equipped cameras. The robustness of watermarks and signatures is also
affected by factors like compression, scaling and noise.
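The signature-based side of the active approach can be sketched as follows. This is an illustration under stated assumptions, not a method from the surveyed papers: each frame is a byte string, the capture device holds a secret key (`SECRET_KEY` is a made-up value), and a keyed hash (HMAC-SHA256 from the Python standard library) stands in for the digital signature.

```python
import hashlib
import hmac

# Hypothetical key that a specially equipped camera would hold at acquisition time.
SECRET_KEY = b"demo-key"


def sign_frames(frames, key=SECRET_KEY):
    """Compute one keyed digest per frame when the footage is acquired."""
    return [hmac.new(key, frame, hashlib.sha256).digest() for frame in frames]


def find_tampered(frames, signatures, key=SECRET_KEY):
    """Return the indices of frames whose content no longer matches its digest."""
    tampered = []
    for index, (frame, signature) in enumerate(zip(frames, signatures)):
        expected = hmac.new(key, frame, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, signature):
            tampered.append(index)
    return tampered


frames = [bytes([i] * 16) for i in range(5)]   # toy 16-pixel frames
signatures = sign_frames(frames)

forged = list(frames)
forged[2] = bytes([99] * 16)                   # alter the content of frame 2

print(find_tampered(frames, signatures))   # []
print(find_tampered(forged, signatures))   # [2]
```

The sketch also shows the drawbacks noted above: verification is impossible unless the digests were created at acquisition, and any re-encoding that changes the frame bytes (compression, scaling, noise) invalidates an exact-match digest.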
Forgery is sometimes not identified because of a defect in the software system; such
defects can be removed by predicting software defects early [9, 10].
Various researchers have used different types of descriptive features to accomplish
the task of forgery detection [11–16]. Figure 4 presents the features used for video
forgery detection. To overcome the inefficiency encountered in the active approach,
the passive approach can be used for video forgery detection. The passive approach
proves better than the active one because it works on first-hand information without
the need for extra information bits or hardware. It relies entirely on the available
forged video data and its intrinsic features and properties, without the need for the
original video data.
To be specific, active techniques include motion detection mechanisms, and passive
techniques include static mechanisms. Forgery under static mechanisms falls into
inter-frame, intra-frame and compression-based mechanisms.
Intra-frame forgery detection uses the gaps between the frames to detect forgery,
if any, between the video frames. These mechanisms include copy-move forgery,
splicing, etc. The image frames within videos are altered by the use of these mechanisms.
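As a toy illustration of copy-move forgery within a single frame, the sketch below looks for exact repeated blocks of pixels. It is a simplification under stated assumptions: the frame is a small list of lists of integers, the function name `find_copied_blocks` is hypothetical, and exact matching is used in place of the robust feature matching (for example AKAZE or SURF) that practical detectors rely on.

```python
from collections import defaultdict


def find_copied_blocks(frame, block=2):
    """Return pairs of top-left (row, col) positions whose block x block pixels match exactly."""
    height, width = len(frame), len(frame[0])
    seen = defaultdict(list)
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            key = tuple(tuple(frame[y + dy][x:x + block]) for dy in range(block))
            seen[key].append((y, x))

    pairs = []
    for positions in seen.values():
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                (y1, x1), (y2, x2) = positions[i], positions[j]
                # Keep only non-overlapping blocks; overlapping matches are trivial.
                if abs(y1 - y2) >= block or abs(x1 - x2) >= block:
                    pairs.append((positions[i], positions[j]))
    return pairs


# 6 x 6 frame in which every pixel value is unique.
frame = [[y * 6 + x for x in range(6)] for y in range(6)]
print(find_copied_blocks(frame))   # []

# Copy-move: paste the 2 x 2 block at (0, 0) onto position (3, 3).
for dy in range(2):
    for dx in range(2):
        frame[3 + dy][3 + dx] = frame[dy][dx]

print(find_copied_blocks(frame))   # [((0, 0), (3, 3))]
```

Exact matching breaks as soon as the pasted region is compressed, blurred or rotated, which is one reason the techniques surveyed here use descriptors that tolerate such changes.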
Table 1 Inter-frame forgery based on frame deletion, insertion & duplication techniques

Paper reference: Detection of Inter-frame Forgeries in Digital Videos [18]
Year: 2018
Description: Proposes a new forensic footprint based on the variation of the macro-block prediction types (VPF) in the P-frames, and also estimates the size of a GOP
Advantages: It gives reliable results, and the detection rate is improved
Limitation: It is not efficient due to using more than one for compressed video
Results: Detection rate is better and utilizes CBR and VBR

Paper reference: Inter-frame Passive-Blind Forgery Detection for Video Shot Based on Similarity Analysis [19]
Year: 2018
Description: A passive-blind video shot forensics scheme that finds inter-frame forgeries. The method consists of two parts: hue-saturation-value (HSV) colour histogram comparison, and speeded-up robust features (SURF) extraction along with fast library for approximate nearest neighbours (FLANN) matching for double-check
Advantages: Motion-based detection is done with accuracy using a tangent-based approach
Limitation: Motion within the video can be further detected accurately using a noise handling procedure
Results: Accuracy of the scheme is 99.01%

Paper reference: Inter-frame Forgery Detection in H.264 Videos Using Motion and Brightness Gradients [20]
Year: 2017
Description: Proposes a methodology that uses inconsistencies in residual and optical flow estimation to detect frame insertion, duplication and removal in videos encoded in MPEG-2 and H.264. It is used for detecting forgeries in videos exhibiting object motion
Advantages: It reduces conflicting results. It gives precise localization of forgery
Limitation: Performance of the system suffers when high illumination videos are used
Results: It has an average detection accuracy of around 83%
Table 2 Inter-frame forgery based on copy frame analysis techniques

Paper reference: Video Inter-frame Forgery Detection Approach for Surveillance and Mobile-Recorded Videos [21]
Year: 2017
Description: Proposes a hybrid mechanism that uses motion and gradient features to extract variation between various frames. Forensic artefacts are analysed using an objective methodology
Advantages: The defects are automatically detected using spikes count
Limitation: It is unable to detect forged frames in slow motion videos
Results: It detects max. and min. number of frames forged, which is 60 and 10

Paper reference: A New Copy Move Forgery Detection Method Resistant to Object Removal of Uniform Background Forgery [22]
Year: 2016
Description: Proposes a method for detecting copy-move forgery in videos. It utilizes a hybrid methodology of AKAZE features and RANSAC for detection of copied frames and for elimination of false matches. It detects forgery of objects
Advantages: This method is efficient & suitable for removing/inserting frames
Limitation: The feature detection is slower than the ORB feature
Results: The detection is better even if the forged image has been rotated or blurred, 98.7%

Paper reference: Detection of Re-compression, Transcoding and Frame Deletion for Digital Video Authentication [24]
Year: 2016
Description: Proposes a forensic technique that identifies recompressed or transcoded videos by inspecting the optical flow of the videos. Its detection accuracy is better because it is not limited by the number of post-production compressions
Advantages: It is inexpensive and independent of heuristically computed thresholds
Limitation: Frame addition is not considered in this approach
Results: Frame removal detection technique achieved an average accuracy of 99.3%

Paper reference: Chroma Key Background Detection for Digital Video Using Statistical Correlation of Blurring Artifact [25]
Year: 2016
Description: Proposes a blurring artefact-based technique for detecting features in video along with chroma key. It first extracts the frames that have a blurring effect, which are then further analysed for forged regions
Advantages: Gives better recall rate with efficiency
Limitation: It does not handle background colour
Results: Method achieves detection accuracy of 91.12%
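One of the inter-frame techniques in Table 1 compares colour histograms of frames to find duplicated content. The sketch below illustrates that idea in a much reduced form, under loudly stated assumptions: frames are byte strings of 8-bit grey values rather than HSV images, similarity is plain histogram intersection, and the names `histogram`, `similarity` and `find_duplicate_frames`, together with the threshold values, are hypothetical. The SURF and FLANN double-check of [19] is not reproduced.

```python
def histogram(frame, bins=16):
    """Count 8-bit grey values into `bins` equal-width bins."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    return counts


def similarity(h1, h2):
    """Histogram intersection normalised to the range [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / sum(h1)


def find_duplicate_frames(frames, threshold=0.99, min_gap=2):
    """Flag pairs (i, j), at least `min_gap` frames apart, with near-identical histograms."""
    hists = [histogram(frame) for frame in frames]
    suspects = []
    for i in range(len(frames)):
        for j in range(i + min_gap, len(frames)):
            if similarity(hists[i], hists[j]) >= threshold:
                suspects.append((i, j))
    return suspects


# Toy video: frame i is uniformly grey with value 16 * i, so every histogram differs.
video = [bytes([16 * i] * 64) for i in range(6)]
# Frame duplication forgery: copy frames 1 and 2 and paste them before frame 5.
forged = video[:5] + video[1:3] + video[5:]

print(find_duplicate_frames(video))    # []
print(find_duplicate_frames(forged))   # [(1, 5), (2, 6)]
```

Histograms ignore where pixels sit in the frame, so real footage with static scenes would produce many false matches; that is presumably why the scheme in [19] adds a feature-matching step as a second check.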
To detect such forgery, boundary colours and the distinctions between frames are
analysed. Results obtained with these mechanisms are expressed in terms of bit error
rate. Table 3 describes the different intra-frame forgery detection techniques used in
video forgery detection, with their advantages, disadvantages and accuracy results.
This section presents a comparative analysis of various video forgery detection
techniques. Earlier papers appraise only a few forensic recording techniques, and many
noteworthy and recent achievements were not examined or analysed [35–38]. This
section analyses the performance of copy-paste forgery detection techniques using the
motion-residue-based approach [39], the object-based approach [40] and the
optical-flow-based approach [41]. Figure 5 presents the comparative outcomes of these
approaches based on the quality factors.
Figure 6 analyses the performance of inter-frame forgery detection using the
noise-based approach [42], the optical-flow-based approach [43] and the pixel-based
approach [44], and presents a comparative overview of the findings as a function of
quality factor percentage and the number of inserted, deleted or duplicated frames.
The analysis suggests that motion-based forgery detection mechanisms are uncommon
and that such forgery is hard to detect. In category 1 (inter-frame forgery), the
mechanisms of the research papers are analysed, and the major part of the research
focuses on parameters such as mean square error and peak signal-to-noise ratio. In
category 2 (intra-frame forgery), the noise handling procedures used in the papers
allow the peak signal-to-noise ratio to improve. In category 3 (compression-based
mechanisms), the video forgery detection mechanism causes the frame rate to decrease,
and hence noise within the frame increases. Sometimes video forgery cannot be
detected because of software failure, which alters the peak signal-to-noise ratio
value [45–47]. These detection mechanisms allow parameters like PSNR and MSE to be
optimized, and the accuracy obtained with these mechanisms is shown in Fig. 7.
Generally, more than 100 videos were tested and used during the comparative analysis.
These videos show both basic and complex lifelike scenarios, depicting scenes both
indoor and outdoor. All of the forgeries were created plausibly to simulate practical
forensic scenarios.
Table 3 Intra-frame forgery detection techniques
Fig. 5 Comparative outcomes of copy-paste forgery at different bit rate and quality factors [bar chart; series: motion based, object based, optical based; x axis: bit-rates (3, 6, 9); y axis: quality factor]
[Fig. 6: bar chart; series: noise based, optical flow based, pixel correction; x axis: number of frames (30, 60, 100); y axis: quality factor]
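The two parameters named above have standard definitions. For two frames x and y with N pixels each, MSE = (1/N) Σ (x_i − y_i)^2 and PSNR = 10 log10(MAX^2 / MSE), where MAX is the largest possible pixel value (255 for 8-bit video). A minimal sketch follows, assuming frames are byte strings of 8-bit grey values; the function names are illustrative only.

```python
import math


def mse(original, distorted):
    """Mean square error between two equally sized frames."""
    if len(original) != len(distorted):
        raise ValueError("frames must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)


def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in decibels; infinite for identical frames."""
    error = mse(original, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


original = bytes([100] * 16)
distorted = bytes([100] * 15 + [110])   # one pixel changed by 10 grey levels

print(mse(original, distorted))              # 6.25
print(round(psnr(original, distorted), 2))   # 40.17
```

A lower MSE and a higher PSNR both mean that the compared frames are closer to each other, which is why noise added to a frame shows up as a drop in PSNR.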
[Fig. 7: bar chart; series: inter-frame forgery (category 1), intra-frame forgery (category 2), compression-based forgery (category 3); x axis: video forgery detection techniques; y axis: accuracy percentage; bar labels: 96.73, 93.42, 98.62]
The domain of video forensics and video anti-forensics is explored. The results cover
the majority of the passive video methodologies discussed in this survey. Most of the
techniques use the GOP structure [48–53] because it is easier to understand and has a
fixed number of frames. The survey covers the types of tampering a video can suffer
and the various passive techniques used to detect an attack. The major highlights for
detecting video forgery are the following.
• In inter-frame techniques, forgery is detected by taking one frame at a single
instance.
• In intra-frame techniques, forgery is detected by establishing the relationship
between two adjacent frames.
• Various techniques detect forgery by detecting double compression.
• Motion and brightness feature-based techniques detect inter-frame forgery.
• Pixel-level analysis-based techniques detect pixel similarities in forged video.
• Copy-paste forgery detection techniques look for similarities or correlation
between regions.
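The role of the fixed-length GOP in double-compression analysis can be pictured with a simplified sketch. It rests on several assumptions that are not taken from the surveyed papers: every GOP starts with an I-frame and contains only P-frames otherwise (B-frames are ignored), the forger deletes a run of frames and re-encodes with the same GOP size, and the function names are hypothetical.

```python
def frame_types(n_frames, gop_size):
    """Frame type per index: the first frame of each GOP is 'I', the rest are 'P'."""
    return ["I" if i % gop_size == 0 else "P" for i in range(n_frames)]


def former_i_frames_after_deletion(n_frames, gop_size, delete_start, delete_count):
    """Indices, in the re-encoded video, of frames that were I-frames in the first encoding."""
    first_pass = frame_types(n_frames, gop_size)
    kept = [i for i in range(n_frames)
            if not (delete_start <= i < delete_start + delete_count)]
    return [new_index for new_index, old_index in enumerate(kept)
            if first_pass[old_index] == "I"]


# 24 frames, GOP size 6: the first encoding places I-frames at 0, 6, 12 and 18.
positions = former_i_frames_after_deletion(24, 6, delete_start=7, delete_count=2)
second_pass = frame_types(24 - 2, 6)

print(positions)                             # [0, 6, 10, 16]
print([second_pass[p] for p in positions])   # ['I', 'I', 'P', 'P']
```

After the deletion point, frames that were intra-coded in the first pass are re-encoded as predicted frames at a fixed offset inside each new GOP. A regular misalignment of this kind is the sort of trace that double-compression detectors try to expose; how it is measured differs from paper to paper.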
Digital video forensics is still in its rudimentary stages. Identifying digital
forgery is a very difficult activity, and the lack of a widely available solution
exacerbates the situation. The issues in video forgery detection techniques
identified during this survey are the following.
• A significant shortcoming is that the techniques lack sufficient validation on
realistically manipulated video. Manually producing fake videos is very
time-consuming, so most authors performed research on synthetically doctored
sequences [54–56].
• Digital video forensics is still in its rudimentary stages. Identifying digital
forgery is a rather complex job, and the lack of a widely applicable solution
exacerbates the situation [57–59].
• Video forensics detects frame manipulation through double compression; if the
forger directly modifies the encoded video, anti-forensic and counter anti-forensic
strategies are insufficient [60, 61].
• Better video forgery detection requires a huge database of tampered video
[62–65].
For video forgery detection, we analysed the performance of various techniques,
namely optical-based, motion-based, object-based, noise-based, pixel correction-based
and copy-paste detection techniques. The major findings obtained from this analysis
are the following.
• The reliability factors in video forgery detection need to be understood much
better. Video forgery involves issues related to multimedia heterogeneity, editing
software and video content, all of which affect reliability.
• In future, active and passive techniques can be combined to obtain better
accuracy in the quality factors of forged video.
• Integrating fields like artificial intelligence, machine learning, signal
processing, computer vision and deep learning with the discussed techniques
can also produce more accurate results.
detection mechanism like tangent-based strategy that can be used to enhance for
better encryption and decryption of video frames along with splicing techniques for
enhancement.
References
1. Saranya, R., Saranya, S., & Cristin, R. (2017). Exposing video forgery detection using intrinsic
fingerprint traces. IEEE Access, 73–76.
2. Kelly, F. (2006). Fast probabilistic inference and GPU video processing [thesis], Trinity College
(Dublin, Ireland). Department of Electronic & Electrical Engineering, p. 178.
3. Li, J., He, H., Man, H., & Desai, S. (2009). A general-purpose FPGA-based reconfigurable
platform for video and image processing. In W. Yu, H. He, & N. Zhang (Eds.), Advances in
neural networks - ISNN 2009. Lecture notes in computer science (Vol. 5553). Berlin:
Springer.
4. Sencar, H. T., & Memon, N. (2008). Overview of state-of-the-art in digital image forensics.
Statistical Science and Interdisciplinary Research, 325–347.
5. Asok, D., Himanshu, C., & Sparsh, G. (2006). Detection of forgery in digital video. In 10th
World Multi Conference on Systemics, Cybernetics and Informatics, pp. 16–19, Orlando. USA.
6. Su, L., Huang, T., & Yang, J. (2014). A video forgery detection algorithm based on compressive
sensing. Springer Science and Business Media New York.
7. Shanableh, T. (2013). Detection of frame deletion for digital video forensics. Digital
Investigation, 10(4), 350–360.
8. Hsu, C. C., Hung, T. Y., Lin, C. W., & Hsu, C. T. (2008). Video forgery detection using
correlation of noise residue. In 2008 IEEE 10th workshop on multimedia signal processing,
pp. 170–174.
9. Ghosh, S., Rana, A., & Kansal, V. (2019). Evaluating the impact of sampling-based nonlinear
manifold detection model on software defect prediction problem. In S. Satapathy, V. Bhateja, J.
Mohanty, & S. Udgata (Eds.), Smart intelligent computing and applications. Smart innovation,
systems and technologies (Vol. 159), pp. 141–152.
10. Ghosh, S., Rana, A., & Kansal, V. (2017). Predicting defect of software system. In Proceedings
of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Appli-
cations (FICTA-2016), Advances in Intelligent Systems and Computing (AISC), pp. 55–67,
2017.
11. Kurosawa, K., Kuroki, K., & Saitoh, N. (1999). CCD fingerprint method identification of a
video camera from videotaped images. In Proceedings of IEEE International Conference on
Image Processing, Kobe, Japan, pp. 537–540.
12. Lukáš, J., Fridrich, J., & Goljan, M. (2006). Digital camera identification from sensor pattern
noise. IEEE Transactions on Information Forensics and Security, 1(2), 205–214.
13. Goljan, M., Chen, M., Comesaña, P., & Fridrich, J. (2016). Effect of compression on sensor-
fingerprint based camera identification. Electronic Imaging, 1–10.
14. Mondaini, N., Caldelli, R., Piva, A., Barni, M., & Cappellini, V. (2007). Detection of malev-
olent changes in digital video for forensic applications. In E. J. Delp, & P. W. Wong (Eds.),
Proceedings of SPIE Conference on Security, Steganography and Watermarking of Multimedia
Contents (Vol. 6505, No. 1).
15. Wang, W., & Farid, H. (2007). Exposing digital forgeries in interlaced and deinterlaced video.
IEEE Transactions on Information Forensics and Security, 2(3), 438–449.
16. Wang, W., & Farid, H. (2006). Exposing digital forgeries in video by detecting double MPEG
compression. In: S. Voloshynovskiy, J. Dittmann, & J. J. Fridrich (Eds.), Proceedings of
8th Workshop on Multimedia and Security (MM&Sec’06) (pp. 37–47). ACM Press, New York.
17. Hsia, S. C., Hsu, W. C., & Tsai, C. L. (2015). High-efficiency TV video noise reduction
through adaptive spatial–temporal frame filtering. Journal of Real-Time Image Processing,
10(3), 561–572.
18. Sitara, K., & Mehtre, B. M. (2018). Detection of inter-frame forgeries in digital videos. Forensic
Science International, 289, 186–206.
19. Zhao, D. N., Wang, R. K., & Lu, Z. M. (2018). Inter-frame passive-blind forgery detection for
video shot based on similarity analysis. Multimedia Tools and Applications, 77(19), 25389–
25408.
20. Kingra, S., Aggarwal, N., & Singh, R. D. (2017). Inter-frame forgery detection in H.264
videos using motion and brightness gradients. Multimedia Tools and Applications, 76(24),
25767–25786.
21. Kingra, S., Aggarwal, N., & Singh, R. D. (2017). Video inter-frame forgery detection approach
for surveillance and mobile recorded videos. International Journal of Electrical & Computer
Engineering, 7(2), 831–841.
22. Ulutas, G., & Muzaffer, G. (2016). A new copy move forgery detection method resistant to
object removal with uniform background forgery. Mathematical Problems in Engineering,
2016.
23. Abbasi Aghamaleki, J., Behrad, A. (2016). Inter-frame video forgery detection and localization
using intrinsic effects of double compression on quantization errors of video coding. Signal
Processing: Image Communication, 47, 289–302.
24. Singh, R. D., & Aggarwal, N. (2016). Detection of re-compression, transcoding and frame-
deletion for digital video authentication. In 2015 2nd International Conference on Recent
Advances in Engineering & Computational Sciences RAECS.
25. Bagiwa, M. A., Wahab, A. W. A., Idris, M. Y. I., Khan, S., & Choo, K. K. R. (2016). Chroma key
background detection for digital video using statistical correlation of blurring artifact. Digital
Investigation, 19, 29–43.
26. Baradel, F., Neverova, N., Wolf, C., Mille, J., & Mori, G. (2018). Object level visual reasoning
in videos. In Lecture Notes in Computer Science (including Subseries Lecture Notes Artificial
Intelligence, Lecture Notes Bioinformatics) (Vol. 11217, pp. 106–122). LNCS.
27. Afchar, D., Nozick, V., Yamagishi, J., & Echizen, I. (2018). MesoNet: A compact facial video
forgery detection network. In 2018 IEEE International Workshop on Information Forensics
and Security (WIFS).
28. Jia, S., Xu, Z., Wang, H., Feng, C., & Wang, T. (2018). Coarse-to-fine copy-move forgery
detection for video forensics. IEEE Access, 6(c), 25323–25335.
29. Kaur Saini, G., & Mahajan, M. (2016). Improvement in copy—Move forgery detection using
hybrid approach. International Journal of Modern Education and Computer Science, 8(12),
56–63.
30. Kaur, G., & Kaur, R. (2016). A video forgery detection using discrete wavelet transform and
scale invarient feature transform techniques. 5(11), 1618–1623.
31. Chen, K., Wang, J., Yang, S., Zhang, X., Xiong, Y., Loy, C. C., & Lin, D. (2018). Optimizing
video object detection via a scale-time lattice. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (pp. 7814–7823).
32. He, P., Jiang, X., Sun, T., Wang, S., Li, B., & Dong, Y. (2017). Frame-wise detection of relocated
I-frames in double compressed H.264 videos based on convolutional neural network. Journal
of Visual Communication and Image Representation, 48, 149–158.
33. Yao, H., Song, S., Qin, C., Tang, Z., & Liu, X. (2017). Detection of double-compressed
H.264/AVC video incorporating the features of the string of data bits and skip macroblocks.
Symmetry (Basel), 9(12), 1–17.
34. Rocha, A., Scheirer, W., Boult, T., & Goldenstein, S. (2011). Vision of the unseen: Current
trends and challenges in digital image and video forensics. ACM Computing Surveys, 43(4),
26.
35. Milani, S., Fontani, M., Bestagini, P., Barni, M., Piva, A., Tagliasacchi, M., & Tubaro, S. (2012).
An overview on video forensics. APSIPA Transactions on Signal and Information Processing,
1(1), 1–18.
36. Wahab, A. W. A., Bagiwa, M. A., Idris, M. Y. I., Khan, S., Razak, Z., & Ariffin, M. R. K. Passive
video forgery detection techniques: A survey. In Proceedings of 10th International Conference
on Information Assurance and Security, Okinawa, Japan, pp. 29–34.
37. Joshi, V., & Jain, S. (2015). Tampering detection in digital video: A review of temporal
fingerprints based techniques. In Proceedings of 2nd International Conference on Computing
for Sustainable Global Development, New Delhi, India, pp. 1121–1124.
38. Bestagini, P., Milani, S., Tagliasacchi, M., & Tubaro, S. (2013). Local tampering detection in
video sequences. In Proceedings of 15th IEEE International Workshop on Multimedia Signal
Processing. Pula, pp. 488–493.
39. Zhang, J., Su, Y., & Zhang, M. (2009). Exposing digital video forgery by ghost shadow artifact.
In Proceedings of 1st ACM Workshop on Multimedia in Forensics (MiFor’09) (pp. 49–54).
New York: ACM Press.
40. Bidokhti, A., & Ghaemmaghami, S. (2015). Detection of regional copy/move forgery in MPEG
videos using optical flow. In International Symposium on Artificial Intelligence and Signal
Processing (AISP), Mashhad, Iran, pp. 13–17.
41. De, A., Chadha, H., & Gupta, S. (2006). Detection of forgery in digital video. In Proceedings
of 10th World Multi Conference on Systems, Cybernetics and Informatics (pp. 229–233).
42. Wang, W., Jiang, X., Wang, S., & Meng, W. (2014). Identifying video forgery process using
optical flow. In Digital forensics and watermarking (pp. 244–257). Berlin: Springer.
43. Lin, G. -S., Chang, J. -F., Chuang, F. -H. (2011). Detecting frame duplication based on spatial
and temporal analyses. In Proceedings of 6th IEEE International Conference on Computer
Science and Education (ICCSE’11), SuperStar Virgo, Singapore, pp. 1396–1399.
44. Ghosh, S., Rana, A., & Kansal, V. (2017). Software defect prediction system based on linear and
nonlinear manifold detection. In Proceedings of the 11th INDIACom; INDIACom-2017; IEEE
Conference ID: 40353, 4th International Conference on Computing for Sustainable Global
Development (INDIACom 2017) (pp. 5714–5719). ISSN 0973–7529; ISBN
978–93–80544–24–3.
45. Ghosh, S., Rana, A., & Kansal, V. (2018). A nonlinear manifold detection based model for
software defect prediction. International Conference on Computational Intelligence and Data
Science; Procedia Computer Science, 132(8), 581–594.
46. Ghosh, S., Rana, A., & Kansal, V. (2019). Statistical assessment of nonlinear manifold detection
based software defect prediction techniques. International Journal of Intelligent Systems
Technologies and Applications, Inderscience, Scopus Indexed, 18(6), 579–605.
https://doi.org/10.1504/IJISTA.2019.102667.
47. Luo, W., Wu, M., & Huang, J. (2008). MPEG recompression detection based on block artifacts.
In E. J. Delp, P. W. Wong, J. Dittmann, N. D. Memon, (Eds.), Proceedings of SPIE Security,
Forensics, Steganography, and Watermarking of Multimedia Contents X (Vol. 6819), San Jose,
CA.
48. Su, Y., Nie, W., & Zhang, C. (2011). A frame tampering detection algorithm for MPEG
videos. In Proceedings of 6th IEEE Joint International Information Technology and Artificial
Intelligence Conference, Vol. 2, pp. 461–464. Chongqing, China.
49. Vázquez-Padín, D., Fontani, M., Bianchi, T., Comesana, P., Piva, A., & Barni, M. (2012). Detec-
tion of video double encoding with GOP size estimation. In Proceedings on IEEE International
Workshop on Information Forensics and Security, Tenerife, Spain, Vol. 151.
50. Su, Y., Zhang, J., & Liu, J. (2009). Exposing digital video forgery by detecting motion-
compensated edge artifact. In Proceedings of International Conference on Computational
Intelligence and Software Engineering (Vol. 1, no. 4, pp. 11–13). Wuhan, China.
51. Dong, Q., Yang, G., & Zhu, N. (2012). A MCEA based passive forensics scheme for detecting
framebased video tampering. Digital Investigation, 9(2), 151–159.
52. Kancherla, K., & Mukkamala, S. (2012). Novel blind video forgery detection using Markov
models on motion residue. Intelligent Information and Database Systems, 7198, 308–315.
53. Fontani, M., Bianchi, T., De Rosa, A., Piva, A., & Barni, M. (2011). A Dempster-Shafer
framework for decision fusion in image forensics. In Proceedings of IEEE International Workshop
on Information Forensics and Security (WIFS’11) (pp. 1–6), Iguacu Falls, SA.
https://doi.org/10.1109/WIFS.2011.6123156.
54. Fontani, M., Bianchi, T., De Rosa, A., Piva, A., & Barni, M. (2013). A framework for decision
fusion in image forensics based on Dempster-Shafer theory of Evidence. IEEE Transactions
on Information Forensics and Security, 8(4), 593–607.
https://doi.org/10.1109/TIFS.2013.2248727.
55. Fontani, M., Bonchi, A., Piva, A., & Barni, M. (2014). Countering antiforensics by means
of data fusion. In Proceedings of SPIE Conference on Media Watermarking, Security, and
Forensics. https://doi.org/10.1117/12.2039569.
56. Stamm, M. C., & Liu, K. J. R. (2011). Anti-forensics for frame deletion/addition in mpeg video.
In Proceedings of IEEE International Conference on Acoustics Speech and Signal Processing
(ICASSP’11) (pp. 1876–1879), Prague, Czech Republic.
57. Stamm, M. C., Lin, W. S., & Liu, K. J. R. (2012). Temporal forensics and anti-forensics for
motion compensated video. IEEE Transactions on Information Forensics and Security, 7(4),
1315–1329.
58. Liu, J., & Kang, X. (2016). Anti-forensics of video frame deletion. [Online].
https://www.paper.edu.cn/download/downPaper/201407-346. Accessed 9 July 2016.
59. Fan, W., Wang, K., & Cayre, F., et al. (2013). A variational approach to JPEG anti-forensics.
In Proceedings of IEEE 38th International Conference on Acoustics, Speech, and Signal
Processing (ICASSP’13) (pp. 3058–3062), Vancouver, Canada.
60. Bian, S., Luo, W., & Huang, J. (2013). Exposing fake bitrate video and its original bitrate. In
Proceeding of IEEE International Conference on Image Processing (pp. 4492–4496).
61. CASIA Tampered Image Detection Evaluation Database. [Online]. https://forensics.idealtest.
org:8080. Accessed 30 Mar (2016).
62. Tralic, D., Zupancic, I., Grgic, S., & Grgic, M. CoMoFoD: New database for copy-move
forgery detection. In Proceedings of 55th International Symposium ELMAR, Zadar,
Croatia (pp. 49–54). [Online]. https://www.vcl.fer.hr/comofod/download.html. Accessed 18
July 2016.
64. CFReDS—Computer Forensic Reference Data Sets, [Online]. https://www.cfreds.nist.gov/.
Accessed 17 May (2016).
65. Kwatra, V., Schödl, A., Essa, I., Turk, G., & Bobick, A. F. (2003). Graphcut textures: Image
and video synthesis using graph cuts. ACM Transactions on Graphics, 22(3), 277–286.
66. Pérez, P., Gangnet, M., & Blake, A. (2003). Poisson image editing. ACM Transactions on
Graphics (SIGGRAPH’03), 22(3), 313–318.
67. Criminisi, A., Pérez, P., & Toyama, K. (2004). Region filling and object removal by
exemplar-based image inpainting. IEEE Transactions on Image Processing, 13(9), 1200–1212.
68. Shen, Y., Lu, F., Cao, X., & Foroosh, H. (2006). Video completion for perspective camera
under constrained motion. In Proceedings of 18th IEEE International Conference on Pattern
Recognition (ICPR’06) (pp. 63–66). Hong Kong, China.
69. Komodakis, N., & Tziritas, G. (2007). Image completion using efficient belief propagation via
priority scheduling and dynamic pruning. IEEE Transactions on Image Processing, 16(11),
2649–2661.
70. Patwardhan, K. A., Sapiro, J., & Bertalmio, M. (2007). Video inpainting under constrained
camera motion. IEEE Transactions on Image Processing, 16(2), 545–553.
71. Hays, J., & Efros, A. A. (2007). Scene completion using millions of photographs. ACM
Transactions on Graph (SIGGRAPH’07), 26(3), 1–7.
72. Columbia Image Splicing Detection Evaluation Dataset. [Online]. https://www.ee.col
umbia.edu/ln/dmvv/downloads/AuthSplicedDataSet/AuthSplicedDataSet.htm. Accessed 3
June (2016).
DDOS Detection Using Machine
Learning Technique
© The Editor(s) (if applicable) and The Author(s), under exclusive license 59
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_5
60 S. Pande et al.
1 Introduction
With the ongoing convergence of information technology (IT), information devices are
becoming enormously complex. Connected to one another, they continue to create and
store significant amounts of digital data, ushering in the era of big data. However,
the probability is high that they expose sensitive data, because they transmit large
volumes of it through constant communication with one another. A system becomes more
vulnerable as more digital devices are connected to it. Hackers may target it to steal
data, personal information and industrial secrets and leak them for unlawful gain [1].
Given these conditions, an attack detection system (ADS) must be smarter and more
effective than before to counter attacks from hackers, which are continuously
evolving. Confidentiality, integrity and availability can be considered the main
pillars of security [2, 3]. These pillars are discussed below.
1.1 Confidentiality
Confidentiality is also called secrecy. The motive behind secrecy is to keep sensitive
information away from illegitimate users while providing access to legitimate users.
In addition, restricted access to the information must be assured.
1.2 Integrity
Integrity means preserving the data as they are, without any modification. Data must
be received at the receiver end exactly as they were sent. To provide integrity, file
permissions and user access controls can be used. A variety of techniques has been
designed to provide integrity, for example checksums and encryption.
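As a minimal sketch of the checksum idea mentioned above, the following Python snippet computes a SHA-256 digest at the sender and recomputes it at the receiver; a mismatch indicates that the data were modified. The function names and the sample payload are illustrative assumptions and are not taken from the chapter.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(received: bytes, expected_digest: str) -> bool:
    """Check whether the received data still match the sender's digest."""
    return checksum(received) == expected_digest


# Hypothetical payload, used only for illustration.
sent = b"transfer 100 units to account 42"
digest = checksum(sent)

print(is_intact(sent, digest))                                 # unmodified data
print(is_intact(b"transfer 900 units to account 42", digest))  # tampered data
```

A plain digest only detects accidental or unsophisticated changes. If an attacker can replace the digest as well, a keyed construction such as an HMAC would be needed.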
1.3 Availability
Availability means that services and data remain accessible to legitimate users, and
it is the pillar that DDoS attacks target directly. According to a Kaspersky report
[4], DDoS attacks grew in both frequency and size in 2018. One of the largest DDoS
attacks was launched against GitHub in February 2018 and involved 1.3 Tbps of traffic
[5].
Various tools are freely available for performing DDoS attacks; some of them are
listed below [6]:
• HULK (HTTP Unbearable Load King)
• GoldenEye HTTP DoS tool
• Tor’s Hammer
• DAVOSET
• PyLoris
• LOIC (Low Orbit Ion Cannon)
• XOIC
• OWASP DoS HTTP Post
• TFN (Tribe Flood Network)
• Trinoo.
2 Related Work
Many researchers are working on the detection of DDoS attacks, which have a large
impact in the area of social networking, by using deep learning and machine learning
techniques. Some of the recent work done in this area is discussed below.
Hariharan et al. [7] used the C5.0 machine learning algorithm and compared the
obtained results with those of other machine learning algorithms, such as the Naïve
Bayes classifier and the C4.5 decision tree classifier. The authors worked mainly in
offline mode.
Bhuvaneswari Amma N. G. et al. [8] implemented a technique called deep intelligence.
The authors extracted the intelligence from radial basis functions at various levels
of abstraction. The experiment was carried out on the well-known NSL-KDD and UNSW-NB15
datasets, where 27 features were considered. The authors claimed better accuracy than
other existing techniques.
Muhammad Aamir et al. [9] implemented a feature selection method based on a clustering
approach. The approach was compared across five different ML algorithms. Random forest
(RF) and support vector machine (SVM) were used for training. RF achieved the highest
accuracy, around 96%.
Dayanandam et al. [10] performed classification based on features of the packets.
The prevention technique analyzes IP addresses by verifying the IP header, and these
IP addresses are used to differentiate spoofed addresses from normal ones. Firewalls
do not provide an efficient solution when the attack size increases.
Narasimha et al. [11] used anomaly detection along with machine learning algorithms to
separate normal traffic from attack traffic. Real-time datasets were used for the
experiment. The naive Bayes ML algorithm was used for classification, and the results
were compared with existing algorithms such as J48 and random forest (RF).
J. Cui et al. [12] used cognitive-inspired computing along with an entropy technique.
A support vector machine was used for classification. Details were extracted from the
flow table of the switch. The obtained results were good in terms of detection
accuracy.
Omar E. Elejla et al. [13] implemented an algorithm for detecting DDoS attacks in IPv6
based on a classification technique. The authors compared the obtained results across
five well-known machine learning algorithms and reported that KNN obtained good
precision, around 85%.
Mohamed Idhammad et al. [14] designed an entropy-based semi-supervised approach using
ML techniques. The implementation consists of unsupervised and supervised components:
the unsupervised component gives good accuracy with a low false-positive rate, while
the supervised component reduces the false-positive rate. Recent datasets were used
for this experiment.
Nathan Shone et al. [15] implemented a deep learning algorithm for the classification
of attacks. It uses a nonsymmetric deep autoencoder (NDAE) for unsupervised feature
learning. The proposed algorithm was implemented on a graphics processing unit (GPU)
using TensorFlow and evaluated on the well-known KDD Cup 99 and NSL-KDD datasets. The
authors claimed to obtain more accurate detection results.
Olivier Brun et al. [16] worked in the area of the Internet of Things (IoT) to detect
DDoS attacks. The authors implemented a deep learning technique, the random neural
network (RNN), for the detection of network attacks. This deep-learning-based
technique generated more promising results than existing methods.
Before performing a ping of death attack, the network information needs to be
gathered; the ipconfig command can be used for this. Figure 1 shows the detailed
network information returned by the ipconfig command. Once the network information is
gathered, the ping of death attack can be launched against the target IP address.
Enter the following command to start the attack:
Figure 2 shows the packet information after performing the ping of death attack; the
attack continues until the resources of the target are exhausted. The primary goal of
this type of DDoS attack is to consume all the CPU and memory of the target. Figure 3
clearly shows that the performance graph was flat before the attack started and that
spikes became visible as soon as the attack began. Figure 4 shows that the CPU is
utilized as much as possible, and this continues until the complete network is
exhausted. Details of the memory consumption, CPU utilization, uptime, etc., can be
seen in Figs. 3, 4 and 5.
Random forest (RF), developed by Leo Breiman [3], is one of the popular machine
learning techniques used for classification. A random forest builds many decision
trees. Each tree is built from a different bootstrap sample of the original data
using a tree classification algorithm. The NSL-KDD dataset was used for this
experiment [16]. The experiment was performed on a laptop with the Windows 10 64-bit
operating system, an Intel(R) Core(TM) i5-2450M CPU @ 2.50 GHz and 8.00 GB of RAM. A
total of 22,544 instances were used for training, and the dataset consists of 42
attributes. Random forest was used for training the model. The model took 8.71 s to
build and 1.28 s to test. The experiment was carried out using the Weka 3.8 tool.
Table 1 summarizes the instances after classification using random forest. Table 2
shows the performance evaluation using various parameters. Table 3 contains the
confusion matrix for the normal and attack classes.
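The experiment itself was run in Weka. Purely to illustrate the two ideas named above, bootstrap sampling and voting over many trees, the following pure-Python sketch trains a small forest of one-split trees (decision stumps, a deliberate simplification of full decision trees). The toy records, the feature meanings and all function names are hypothetical and are not NSL-KDD data.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-split tree on one randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    values = sorted({row[feature] for row in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0  # midpoint between neighbouring values
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # feature is constant in this sample: predict the majority
        majority = Counter(labels).most_common(1)[0][0]
        return feature, float("inf"), majority, majority
    return feature, best[1], best[2], best[3]


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Classify one record by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy records: [packets per second, mean packet size in bytes].
rows = [[10, 500], [12, 480], [9, 520], [11, 510],
        [900, 60], [950, 64], [1000, 58], [980, 62]]
labels = ["normal"] * 4 + ["attack"] * 4

forest = train_forest(rows, labels)
print(predict(forest, [15, 490]), predict(forest, [940, 61]))
```

A full random forest grows deep trees and draws a random subset of features at every split; the stump version keeps only the bootstrap-and-vote structure.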
Here TP, TN, FP and FN denote the numbers of true positives, true negatives, false
positives and false negatives, respectively.
• Accuracy: It measures the fraction of instances of both classes (normal and attack)
that are correctly identified.
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$$
• Precision: It is the ratio of the number of correctly identified attacks to the
total number of instances identified as attacks, whether correctly or incorrectly. It
is also known as the positive predictive value.
$$\text{Precision} = \frac{TP}{TP + FP}$$
• Recall: It is the ratio of the number of correctly identified attacks to the total
number of actual attacks. It is also known as sensitivity.
$$\text{Recall} = \frac{TP}{TP + FN}$$
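The three measures can be computed directly from the entries of a confusion matrix. The sketch below assumes a two-class (attack versus normal) matrix; the counts are hypothetical and are not the values reported in the chapter's tables.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Hypothetical counts, used only for illustration.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```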
5 Conclusion
In this paper, several current detection techniques for DDoS attacks, especially
those using machine learning, are discussed, along with a list of freely available
DDoS tools. A command-based ping of death technique was used to perform a DDoS attack.
The random forest algorithm was used to train the model, which resulted in 99.76% of
instances being correctly classified. In future, we will try to implement deep
learning techniques for the classification of the instances.
References
1. Ganorkar, S. S., Vishwakarma, S. U., & Pande, S. D. (2014). An information security scheme
for cloud based environment using 3DES encryption algorithm. International Journal of Recent
Development in Engineering and Technology, 2(4).
2. Pande, S., & Gadicha, A. B. (2015). Prevention mechanism on DDOS attacks by using multi-
level filtering of distributed firewalls. International Journal on Recent and Innovation Trends
in Computing and Communication, 3(3), 1005–1008. ISSN: 2321–8169.
3. Khamparia, A., Pande, S., Gupta, D., Khanna, A., & Sangaiah, A. K. (2020). Multi-level
framework for anomaly detection in social networking, Library Hi Tech, 2020. https://doi.org/
10.1108/LHT-01-2019-0023.
4. https://www.calyptix.com/top-threats/ddos-attacks-101-types-targets-motivations/.
5. https://www.foxnews.com/tech/biggest-ddos-attack-on-record-hits-github.
6. Fenil, E., & Mohan Kumar, P. (2019). Survey on DDoS defense mechanisms. John Wiley &
Sons, Ltd. https://doi.org/10.1002/cpe.5114.
7. Hariharan, M., Abhishek, H. K., & Prasad, B. G. (2019). DDoS attack detection using C5.0
machine learning algorithm. I.J. Wireless and Microwave Technologies, 1, 52–59 Published
Online January 2019 in MECS. https://doi.org/10.5815/ijwmt.2019.01.06.
8. NG, B. A., & Selvakumar, S. (2019). Deep radial intelligence with cumulative incarnation
approach for detecting denial of service attacks. Neurocomputing. https://doi.org/10.1016/j.
neucom.2019.02.047.
9. Aamir, M., & Zaidi, S. M. A. (2019). Clustering based semi-supervised machine learning
for DDoS attack classification. Journal of King Saud University - Computer and
Information Sciences. https://doi.org/10.1016/j.jksuci.2019.02.003.
10. Dayanandam, G., Rao, T. V., BujjiBabu, D., & NaliniDurga, N. (2019). DDoS attacks—analysis
and prevention. In H. S. Saini, et al. (Eds.), Innovations in computer science and engineering,
Lecture notes in networks and systems 32. Springer Nature Singapore Pte Ltd.
https://doi.org/10.1007/978-981-10-8201-6_1.
11. NarasimhaMallikarjunan, K., Bhuvaneshwaran, A., Sundarakantham, K., & Mercy Shalinie, S.
(2019). DDAM: Detecting DDoS attacks using machine learning approach. In N. K. Verma & A.
K. Ghosh (Eds.), Computational Intelligence: Theories, Applications and Future Directions—
Volume I, Advances in Intelligent Systems and Computing, 798, https://doi.org/10.1007/978-
981-13-1132-1_21.
12. Cui, J., Wang, M., & Luo, Y., et al. (2019). DDoS detection and defense mechanism based on
cognitive-inspired computing in SDN. Future Generation Computer Systems. https://doi.org/
10.1016/j.future.2019.02.037.
13. Elejla, O. E., Belaton, B., Anbar, M., Alabsi, B., & Al-Ani, A. K. (2019). Comparison of
classification algorithms on ICMPv6 based DDoS attacks detection. In R. Alfred et al. (Eds.),
Computational Science and Technology, Lecture Notes in Electrical Engineering 481.
Springer Nature Singapore Pte Ltd. https://doi.org/10.1007/978-981-13-2622-6_34.
14. Idhammad, M., Afdel, K., & Belouch, M. (2018). Semi-supervised machine learning approach
for DDoS detection. Applied Intelligence. Springer Science+Business Media, LLC, part of
Springer Nature. https://doi.org/10.1007/s10489-018-1141-2.
15. Shone, N., Ngoc, T. N., Phai, V. D., & Shi, Q. (2018). A deep learning approach to network
intrusion detection. IEEE Transactions on Emerging Topics in Computational Intelligence,
2(1).
16. Brun, O., Yin, Y., & Gelenbe, E. (2018). Deep learning with dense random neural network for
detecting attacks against IoT-connected home environments. Procedia Computer Science, 134,
458–463, Published by Elsevier Ltd.
Enhancements in Performance
of Reduced Order Modelling
of Large-Scale Control Systems
1 Introduction
Designing higher-order linear dynamic systems is difficult because of problems in
implementation and computation, and such designs are too tedious to be employed in
practice. Model order reduction is a technique for simplifying high-order linear
dynamic systems that are described by differential equations. The main
A. Gupta (B)
Department of Electronics and Communication Engineering, Maharaja Ranjit Singh Punjab
Technical University, Bathinda, Punjab, India
e-mail: ankurgarg2711@gmail.com
A. K. Manocha
Department of Electrical Engineering, Maharaja Ranjit Singh Punjab Technical University,
Bathinda, Punjab, India
e-mail: akmanochagzsccet@gmail.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license 69
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_6
motive of model order reduction (MOR) is to replace a high-order system with a system
of comparatively lower order while keeping its essential properties intact.
The purpose of this simplification is to obtain a reduced order system whose response
and other physical characteristics closely match those of the higher-order system.
Numerous studies have been carried out, and various techniques have been suggested
for the reduction of high-order transfer functions [1, 2]. These techniques include
Hankel-norm approximation [3], the projection technique [4], Schur decomposition [5],
continued fraction expansion approximation [6], Pade or moment approximation [7], the
stability-equation method [8] and the factor division method [9]. Each technique has
its own benefits and drawbacks; the most important limitations are the difficulty of
the computational procedure and of maintaining stability in the reduced model. Many
other approaches based on mixed methods and various evolutionary techniques have also
been developed [10–18].
In this paper, the timeline of advancements in techniques for reducing a high-order
system to a low-order system is described in detail, and illustrative examples are
used to support the discussion. The major driving force behind this study was to get
a comprehensive view of these advancements, so that the technique that best minimizes
the drawbacks and retains the benefits of the numerous techniques developed to date
can be used in further work.
The first part of this paper defines the problem of model order reduction and then
presents the methods, with a detailed survey of each. This is followed by a test
example that compares the different techniques by their step responses, and finally
the derived result is stated in the conclusion.
Consider a linear dynamic system described by the transfer function [11, 12, 19, 20]
$$G(s) = \frac{N(s)}{D(s)} = \frac{a_m s^m + a_{m-1} s^{m-1} + \cdots + a_1 s + a_0}{b_n s^n + b_{n-1} s^{n-1} + \cdots + b_1 s + b_0} \qquad (1)$$
where m < n. The number of poles is n, and m is the number of zeros. The zeros and
poles may be real, complex, or a combination of both. Complex poles, if present,
occur in conjugate pairs.
The reduced rth-order system is given by
$$G_r(s) = \frac{N_{r-1}(s)}{D_r(s)} = \frac{c_{r-1} s^{r-1} + c_{r-2} s^{r-2} + \cdots + c_1 s + c_0}{d_r s^r + d_{r-1} s^{r-1} + \cdots + d_1 s + d_0} \qquad (2)$$
where r − 1 is the number of zeros and r is the number of poles of the reduced order
model G_r(s). The zeros and poles may be real, complex, or a combination of both. The
aim of model order reduction is to decrease the order of the linear system while
maintaining system stability and producing a response with minimum error.
3 Description of Methods
The balanced truncation method [2] is the fundamental technique for model order
reduction, as most reduction methods depend on it to bring the system into balanced
form. In this method, the state matrices A, B, C and D are transformed into a
balanced system by a non-singular matrix T such that
$$(A', B', C', D') = (T^{-1} A T,\; T^{-1} B,\; C T,\; D) \qquad (3)$$
The balanced system (A', B', C', D') is thus obtained, from which the reduced order
approximation of (A, B, C, D) is formed.
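Eq. (3) is a similarity transformation, so it changes the state coordinates but not the transfer function. The following pure-Python sketch checks this numerically for a hypothetical two-state system. The matrix T used here is an arbitrary non-singular matrix chosen for illustration, not the balancing transformation, whose computation from the system Gramians is not shown.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def inv2(m):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def transfer(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^-1 B + D for a two-state SISO system."""
    sI_minus_A = [[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]]
    return matmul(matmul(C, inv2(sI_minus_A)), B)[0][0] + D


# Hypothetical system with G(s) = 1 / (s^2 + 3 s + 2).
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]
D = 0.0
T = [[1.0, 1.0], [0.0, 2.0]]  # arbitrary non-singular matrix (assumption)

T_inv = inv2(T)
A_t = matmul(matmul(T_inv, A), T)  # T^-1 A T
B_t = matmul(T_inv, B)             # T^-1 B
C_t = matmul(C, T)                 # C T

s = 1.0 + 2.0j
difference = abs(transfer(A, B, C, D, s) - transfer(A_t, B_t, C_t, D, s))
print(difference)  # close to zero
```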
Mukherjee (2005) presented the response matching algorithm (dominant pole retention)
to obtain a reduced order system (ROS) from the original high-order system (OHOS)
[21].
The roots of the denominator polynomial (poles) of the OHOS in Eq. (1) can take
various forms, viz. distinct or repeated, real or complex conjugate. Depending on the
type of poles, a third-order ROS is assumed to have (A) all poles repeated, (B) one
distinct pole and two repeated poles, (C) one pair of complex poles and one real
pole, or (D) all poles real and distinct. The four forms of the three-pole ROS are:
$$G_{1r}(s) = \frac{a_1 s^2 + b_1 s + c_1}{(s + d_1)^3} \qquad (4)$$
$$G_{2r}(s) = \frac{a_2 s^2 + b_2 s + c_2}{(s + d_2)(s + e_2)^2} \qquad (5)$$
$$G_{3r}(s) = \frac{a_3 s^2 + b_3 s + c_3}{(s + \gamma)(s + \delta + j\beta)(s + \delta - j\beta)} \qquad (6)$$
$$G_{4r}(s) = \frac{a_4 s^2 + b_4 s + c_4}{(s + d_4)(s + e_4)(s + f_4)} \qquad (7)$$
Philip (2010) described a procedure to estimate the reduced order polynomial by the
dominant pole retention technique, using several algorithms to estimate the dominant
poles of the OHOS given by Eq. (1) [8]. These algorithms are described below.
A. Dominant pole estimation using reciprocal transformation
The reciprocal transformation of the transfer function G(s) in Eq. (1) is
$$\tilde{G}(s) = \frac{1}{s}\, G\!\left(\frac{1}{s}\right) = \frac{a_{n-1} + a_{n-2} s + \cdots + a_0 s^{n-1}}{b_n + b_{n-1} s + b_{n-2} s^2 + \cdots + b_0 s^n} \qquad (8)$$
$$r_d = \frac{b_0\, n}{b_1} \qquad (10)$$
The calculation follows from the classical algebraic result that, for a monic
polynomial of degree n, the negative of the sum of the roots (only the real parts
contribute) equals the coefficient of the term of degree n − 1. Dividing this
coefficient by the degree of the polynomial therefore gives the average value of the
roots. Because the polynomial here is the reciprocated one, this average approximates
the inverse of the dominant root, and its reciprocal is the approximate dominant root
of the original system.
B. Estimation of dominant pole by principal pseudo-break frequency
The next approximation of the dominant pole of the system is the principal
pseudo-break frequency of the characteristic polynomial [19]. From the denominator
polynomial of Eq. (1), the estimated dominant pole is
$$r_p = \frac{b_0}{\sqrt{b_1^2 - 2 b_2 b_0}} \qquad (11)$$
With the knowledge of r1 and r2, the denominator polynomial of the reduced order
system can be determined.
C. Model of reduced order dependent on frequency
Recent MOR research has used response matching at a user-specified frequency.
Following the same path, the proposed technique is extended so that the reduced order
model can match the frequency response at various frequencies. The technique used to
estimate the real pole of lowest magnitude can also be used to estimate the real pole
of highest magnitude, which is given by
$$r_h = \frac{b_{n-1}}{n} \qquad (12)$$
Taking the average thus gives one more estimate of a pole.
The three estimates above help to improve the approximation of the reduced order
model in the low-, medium- and high-frequency regions, i.e., over the entire
frequency range.
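A minimal sketch of the three estimates is given below. It assumes the readings r_d = n b_0 / b_1 for Eq. (10), r_p = b_0 / sqrt(b_1^2 − 2 b_2 b_0) for Eq. (11) and r_h = b_{n−1} / n for Eq. (12), with a monic denominator (b_n = 1) in the last case. The function name and the example polynomial are hypothetical.

```python
import math


def dominant_pole_estimates(b):
    """Estimate pole magnitudes from denominator coefficients.

    b[i] is the coefficient of s**i, so b[0] is the constant term and
    b[n] the leading coefficient (assumed to be 1 for the r_h estimate).
    """
    n = len(b) - 1
    r_d = n * b[0] / b[1]                                  # reciprocal-transformation estimate
    r_p = b[0] / math.sqrt(b[1] ** 2 - 2.0 * b[2] * b[0])  # principal pseudo-break frequency
    r_h = b[n - 1] / n                                     # average pole magnitude
    return r_d, r_p, r_h


# Hypothetical example: D(s) = (s + 1)(s + 2)(s + 10) = s^3 + 13 s^2 + 32 s + 20
r_d, r_p, r_h = dominant_pole_estimates([20.0, 32.0, 13.0, 1.0])
print(r_d, r_p, r_h)
```

For this example the true pole magnitudes are 1, 2 and 10, so the estimates fall in the low-, low- and mid-magnitude ranges respectively; they are rough indicators rather than exact pole locations.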
Desai and Prasad (2013) reduced the model order by combining two techniques [22, 23].
The denominator coefficients of the ROS are obtained by the Routh approximation
method, which yields a stable model. In this technique, the denominator of the
original high-order system is first reciprocated to get
$$\tilde{D}_n(s) = s^n D_n\!\left(\frac{1}{s}\right) \qquad (13)$$
Then, the α array is formed from the coefficients of the polynomial obtained from
Eq. (13), and the values of the parameters α_1, α_2, α_3, …, α_n are obtained.
The reduced (rth)-order denominator polynomial is obtained using
$$\tilde{D}_r(s) = \alpha_r s \tilde{D}_{r-1}(s) + \tilde{D}_{r-2}(s), \quad r = 1, 2, \ldots, \quad \tilde{D}_{-1}(s) = \tilde{D}_0(s) = 1 \qquad (14)$$
Then, the reciprocal transformation is applied again to obtain the denominator of the
reduced order system as
$$D_r(s) = s^r \tilde{D}_r\!\left(\frac{1}{s}\right) \qquad (15)$$
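The recursion of Eq. (14) and the reciprocal transformation of Eqs. (13) and (15) can be sketched with polynomials stored as coefficient lists. The sketch assumes that the α values are already known; forming the α array is not shown, and the numerical values used here are hypothetical.

```python
def reciprocal(poly):
    """Return s^n * P(1/s); coefficients are listed from s^0 upward."""
    return poly[::-1]


def routh_denominator(alphas):
    """Build the reciprocated denominator of Eq. (14) from alpha_1..alpha_r."""
    prev2, prev1 = [1.0], [1.0]  # polynomials of index -1 and 0 are both 1
    for alpha in alphas:
        shifted = [0.0] + [alpha * c for c in prev1]  # alpha_r * s * previous polynomial
        padded = prev2 + [0.0] * (len(shifted) - len(prev2))
        current = [a + b for a, b in zip(shifted, padded)]
        prev2, prev1 = prev1, current
    return prev1


# Hypothetical alpha values, for illustration only.
d_tilde = routh_denominator([2.0, 3.0])  # 1 + 3 s + 6 s^2
d_reduced = reciprocal(d_tilde)          # 6 + 3 s + s^2
print(d_tilde, d_reduced)
```

Storing coefficients from the constant term upward makes the reciprocal transformation a simple reversal of the list.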
Using the Big Bang Big Crunch (BBBC) algorithm, the numerator coefficients are
obtained by minimizing the objective function F, the integral square error (ISE)
between the transient responses of the OHOS and the ROS. Big Bang Big Crunch is an
optimization algorithm, similar in spirit to the genetic algorithm, that is inspired
by a theory of the formation of the universe.
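The following sketch shows one common formulation of Big Bang Big Crunch optimization, not necessarily the exact variant used in [22]: candidates are contracted to a fitness-weighted centre of mass (Big Crunch) and then scattered around it with a spread that shrinks with the iteration count (Big Bang). A toy quadratic objective stands in for the ISE, and all names, bounds and parameter values are assumptions.

```python
import random


def bbbc_minimize(objective, lower, upper, n_candidates=40, n_iterations=100, seed=1):
    """Minimize a non-negative objective over a box with a BB-BC style search."""
    rng = random.Random(seed)
    dim = len(lower)
    population = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
                  for _ in range(n_candidates)]
    best = min(population, key=objective)
    for k in range(1, n_iterations + 1):
        # Big Crunch: centre of mass, each candidate weighted by 1 / cost.
        weights = [1.0 / (objective(x) + 1e-12) for x in population]
        total = sum(weights)
        centre = [sum(w * x[d] for w, x in zip(weights, population)) / total
                  for d in range(dim)]
        if objective(centre) < objective(best):
            best = centre
        # Big Bang: new candidates around the centre; spread shrinks as 1 / k.
        population = [
            [min(upper[d], max(lower[d],
                 centre[d] + (upper[d] - lower[d]) * rng.gauss(0.0, 1.0) / k))
             for d in range(dim)]
            for _ in range(n_candidates)
        ]
        candidate = min(population, key=objective)
        if objective(candidate) < objective(best):
            best = candidate
    return best


def toy_objective(c):
    """Stand-in for the ISE: minimum 0 at c = (1, -2)."""
    return (c[0] - 1.0) ** 2 + (c[1] + 2.0) ** 2


best = bbbc_minimize(toy_objective, lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best, toy_objective(best))
```

In the setting of the text, the candidate vector would hold the numerator coefficients of the ROS and the objective would be the ISE between the two step responses.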
Tiwari (2019) carried out the reduction by separating the OHOS into two parts, the
denominator and the numerator, while preserving the stability of the system [24]. The
denominator is reduced using the dominant pole retention technique with the
additional concept of clustering. In this algorithm, a quantitative analysis of the
dominant poles of the OHOS is carried out, and the dominance of each pole is
established using the MDI. The pole with the highest MDI value has high
controllability and observability. Then, a cluster of dominant poles is formed, and a
cluster center is found by applying Eq. (16)
⎡ k−1
⎤
k
1
+ 1
⎢ |λ1 | i=2 |λi | ⎥
λc = ⎣ 2 ⎦ (16)
k
1
|λ1 | + i=2
1
|λi |
where λc is the cluster center obtained from the k poles (λ1, λ2, λ3, …, λk) of the
cluster. The number of poles of the ROS is equal to the number of clusters.
The reduced order numerator is found using a popular technique known as Pade
approximation. The approximant is a rational function N(s)/D(s) with numerator degree
m and denominator degree n.
4 Calculative Experiments
The performance of all the MOR methods discussed in Sect. 3 is compared through
numerical experiments on the basis of overshoot, integral square error (ISE),
settling time and rise time of the OHOS and of the ROS obtained after applying each
MOR technique.
The integral square error measures the quality of the obtained reduced order system:
$$\text{ISE} = \int_0^{\infty} \left[ y(t) - y_r(t) \right]^2 dt \qquad (17)$$
where y(t) is the response of the OHOS and y_r(t) is the response of the obtained ROS.
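In practice the integral in Eq. (17) is evaluated numerically over a finite horizon. The sketch below uses the trapezoidal rule; the two first-order step responses, the horizon and the step size are hypothetical choices made only so that the result can be compared with a known analytic value.

```python
import math


def integral_square_error(y, y_r, t_end=20.0, dt=0.001):
    """Approximate the ISE on [0, t_end] with the trapezoidal rule."""
    n = int(round(t_end / dt))
    total = 0.0
    for i in range(n):
        e0 = (y(i * dt) - y_r(i * dt)) ** 2
        e1 = (y((i + 1) * dt) - y_r((i + 1) * dt)) ** 2
        total += 0.5 * (e0 + e1) * dt
    return total


def y(t):
    """Hypothetical 'original' step response."""
    return 1.0 - math.exp(-t)


def y_r(t):
    """Hypothetical 'reduced' step response."""
    return 1.0 - math.exp(-2.0 * t)


ise = integral_square_error(y, y_r)
print(ise)  # the analytic value for this pair is 1/12
```

The horizon must be long enough for both responses to settle; otherwise the truncated tail of the integral is no longer negligible.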
Test Example: Consider the linear dynamic system of order nine used in [6, 9, 10, 20,
22], given by the following transfer function:
Fig. 1 Step response of original and reduced order models for test example (amplitude
versus time in seconds; curves: Original System, Balanced Truncation, Mukherjee,
Philip & Pal, Desai & Prasad, Tiwari & Kaur)
Table 1 Comparison between various reduced order models for test example
Method of order reduction   ISE            Steady-state value   Rise time (s)   Overshoot (%)   Settling time (s)
Original                    –              1                    2.85            –               8.72
Balanced truncation [2]     High           1.09                 2.92            –               –
Mukherjee [21]              8.77 × 10^-2   1                    4.67            0               12.9
Philip [8]                  2.82 × 10^-2   1                    2.99            0               7.6
Desai [23]                  2.52 × 10^-2   1                    3.43            1.96            10.6
Tiwari [24]                 1.74 × 10^-2   1                    2.92            0               6.91
The step responses of all the MOR techniques are plotted in Fig. 1. A quantitative
comparison of all the methods is given in Table 1 on the basis of integral square
error, peak overshoot, rise time and settling time, along with the achieved
steady-state value.
The comparative analysis shows that the initially developed balanced truncation
technique gives a reasonable approximation of the large-scale system, but with a
significant amount of error. The more recently developed techniques decrease the
error, both the steady-state error and the integral square error.
5 Conclusion
This paper reviews the progress made in the area of model order reduction. The
initially developed balanced truncation technique gives a reduced order system with
high error, which motivated the development of improved techniques. Subsequently,
Mukherjee (2005) developed a more accurate system with the help of response matching,
which stimulated interest in model order reduction, and more advanced techniques were
then developed, as described by Philip, Desai and Tiwari. These techniques improved
the agreement between the original and reduced order systems and reduced the error
between them. The present work shows that the integral square error has decreased as
MOR techniques have developed, but the error needs to be reduced further to
approximate the original higher-order system more closely. This limitation can be
addressed by designing a more advanced technique that further reduces the error
between the original higher-order system and the reduced order system. Future
development in the area of model order reduction can therefore yield more advanced
techniques that make the reduced order system more accurate, so that the study of
large-scale systems becomes easier.
References
1. Antoulas, A. C., Sorensen, D. C., & Gugercin, S. (2006). A survey of model reduction methods
for large-scale systems. Math: Contemp.
2. Moore, B. C. (1981). Principal component analysis in linear systems: Controllability,
observability and model reduction. IEEE Transactions on Automatic Control, AC-26(1), 17–32.
3. Villemagne, C., & Skelton, R. E. (1987). Model reduction using a projection formulation. In
26th IEEE Conference on Decision and Control (pp. 461–466).
4. Safonov, M. G., & Chiang, R. Y. (1989). A Schur method for balanced-truncation model
reduction. IEEE Transactions on Automatic Control, 34(7), 729–733.
5. Shamash, Y. (1974). Continued fraction methods for the reduction of discrete-time dynamic
systems. International Journal of Control, 20(2), 267–275.
6. Shamash, Y. (1975). Linear system reduction using pade approximation to allow retention of
dominant modes. International Journal of Control, 21(2), 257–272.
7. Chen, T. C., & Chang, C. Y. (1979). Reduction of transfer functions by the stability-equation
method. Journal of the Franklin Institute, 308(4), 389–404.
8. Philip, B., & Pal, J. (2010). An evolutionary computation based approach for reduced order
modeling of linear systems. IEEE International Conference on Computational Intelligence and
Computing Research, Coimbatore, pp. 1–8.
9. Lucas, T. N. (1983). Factor division: A useful algorithm in model reduction. IEE Proceedings,
130(6), 362–364.
10. Sikander, A., & Prasad, R. (2015). Linear time invariant system reduction using mixed method
approach. Applied Mathematical Modelling.
11. Tiwari, S. K., & Kaur, G. (2016). An improved method using factor division algorithm for
reducing the order of linear dynamical system. Sadhana, 41(6), 589–595.
12. Glover, K. (1984). All optimal hankel-norm approximations of linear multivariable systems
and their L∞ -Error Bounds. International Journal of Control, 39(6), 1115–1193.
13. Le Mehaute, A., & Grepy G. (1983). Introduction to transfer and motion in fractal media: The
geometry of kinetics. Solid State Ionics, 9 & 10, Part 1, 17–30.
14. Vishakarma, C. B., & Prasad, R. (2009). MIMO system reduction using modified pole clustering
and genetic algorithm. Modelling and Simulation in Engineering.
15. Narwal, A., & Prasad, R. (2015). A novel order reduction approach for LTI systems
using cuckoo search and Routh approximation. In IEEE International Advance Computing
Conference (IACC), Bangalore, pp. 564–569.
16. Narwal, A., & Prasad, R. (2016). Optimization of LTI systems using modified clustering
algorithm. IETE Technical Review.
17. Sikander, A., & Prasad, R. (2017). A new technique for reduced-order modelling of linear
time-invariant system. IETE Journal of Research.
18. Parmar, G., Mukherjee, S., & Prasad, R. (2007). System reduction using factor division algo-
rithm and eigen spectrum analysis. International Journal of Applied Mathematical Modelling,
31, 2542–2552.
19. Cheng, X., & Scherpen, J. (2018). Clustering approach to model order reduction of power
networks with distributed controllers. Advances in Computational Mathematics.
20. Alsmadi, O., Abo-Hammour, Z., Abu-Al-Nadi, D., & Saraireh, S. (2015). Soft computing
techniques for reduced order modelling: Review and application. Intelligent Automation & Soft
Computing.
21. Mukherjee, S., & Satakshi, M. R. C. (2005). Model order reduction using response matching
technique. Journal of the Franklin Institute, 342, 503–519.
22. Desai, S. R., & Prasad, R. (2013). A novel order diminution of LTI systems using big bang
big crunch optimization and routh approximation. Applied Mathematical Modelling, 37, 8016–
8028.
23. Desai, U. B., & Pal, D. (1984). A transformation approach to stochastic model reduction. IEEE
Transactions on Automatic Control, AC-29(12), 1097–1100.
24. Tiwari, S. K., & Kaur, G. (2019). Enhanced accuracy in reduced order modeling for linear
stable/unstable system. International Journal of Dynamics and Control.
Solution to Unit Commitment Problem:
Modified hGADE Algorithm
Abstract This research paper proposes a hybrid approach that extends the hGADE
algorithm, which combines differential evolution and a genetic algorithm, and aims at
solving the mixed-integer optimization problem known as the unit commitment
scheduling problem. The ramp-up and ramp-down constraints are included in the
calculation of the total operating cost of power system operation. The proposed
approach is easy to implement and understand. The technique has been tested on a
six-unit system, taking into consideration various system and unit constraints for
solving the unit commitment problem. Hybridization of the genetic algorithm and
differential evolution has produced a significant improvement in the overall results.
1 Introduction
A power system may consist of thermal plants only, of a combination of hydro and
thermal plants, or of a combination of thermal, hydro, and nuclear plants. As far as
modeling is concerned, a nuclear plant is the same as a thermal plant; in fact, it is
also called a thermal plant. So, a hydro, thermal, and nuclear system is modeled in
the same way as a hydro and thermal system [1]. The problem of power system operation
is hierarchical, or multilevel. It starts with load forecasting, which is a very
important problem in control systems as well as in energy systems. The load has to be
ascertained first, i.e., forecasted well in advance. One can have short-term or very
short-term load forecasting, for example how the load is going to change in the next
10 min. At the other extreme, if one needs a power plant in 2025, one has to start
planning right now, because the gestation period of a hydropower plant is 7–8 years.
And even
© The Editor(s) (if applicable) and The Author(s), under exclusive license 79
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_7
80 A. Singh and A. Khamparia
in a thermal power plant the gestation period is some 5–6 years. The gestation period is the time that elapses from when a power plant is conceived and planned until it produces its first megawatt. Very long term planning is therefore needed to decide when to initiate the installation of new power plants, because one has to decide the place, the site, the fuel, and the source of the resources. That is why load forecasting is needed. The major outcomes of the research are given as follows:
• This research aims at solving the unit commitment (UC) problem, which is one of the biggest concerns for power companies.
• A hybrid approach is proposed that extends the hGADE algorithm.
• The ramp up and down constraints are included in the calculation of the total operating cost of power system operation.
2 Unit Commitment
In power system operation, the load changes from hour to hour, day to day, and week to week. So there cannot be a permanent solution for which units should be on and which units should be off. For a given load, one has to find out which units of the power station should be on to take up that load and which units are not required and should be switched off. The status of each unit can be represented by a binary number, 1 or 0: available or not available, working or not working [2]. Finding this binary schedule is called the unit commitment problem. The generic unit commitment problem can be formulated mathematically as given in Eq. 1.
The constraints which play an important role in unit commitment are as follows:
i. Maximum generating capacity
ii. Minimum stable generation
iii. Minimum up time
iv. Minimum down time
v. Ramp rate
vi. Load generation constraint.
Notations
Minimum up time:
Once a unit in a power system is running, it may not be shut down immediately. The
mathematical representation of minimum up time is given as Eq. 2:
$$\text{If } u(i,t) = 1 \text{ and } t_i^{\mathrm{up}} < t_i^{\mathrm{up,min}} \text{ then } u(i,t+1) = 1 \tag{2}$$
$$\sum_{i=1}^{N} u(i,t)\,x(i,t) = L(t) \tag{6}$$
$$\sum_{i=1}^{N} u(i,t)\,P_i^{\max} \ge L(t) + R(t) \tag{7}$$
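As a rough illustration, the minimum up time rule (Eq. 2) and the load and reserve conditions (Eqs. 6 and 7) can be checked for a candidate schedule as sketched below. The function names, the list layout (one row per unit, one column per hour), the tolerance, and the sample values are assumptions made for this sketch and are not taken from the chapter; x(i, t), L(t), and R(t) are read here as unit output, load, and reserve.

```python
def load_balance_ok(u, x, load, tol=1e-6):
    """Eq. 6: the output of the committed units equals the load in every hour."""
    return all(
        abs(sum(u[i][t] * x[i][t] for i in range(len(u))) - load[t]) <= tol
        for t in range(len(load))
    )


def reserve_ok(u, p_max, load, reserve):
    """Eq. 7: committed capacity covers load plus reserve in every hour."""
    return all(
        sum(u[i][t] * p_max[i] for i in range(len(u))) >= load[t] + reserve[t]
        for t in range(len(load))
    )


def min_up_time_ok(u_row, min_up):
    """Eq. 2: a unit that has been on for fewer than min_up hours must stay on."""
    run = 0  # consecutive hours the unit has been on
    for on in u_row:
        if on:
            run += 1
        else:
            if 0 < run < min_up:
                return False  # shut down before the minimum up time elapsed
            run = 0
    return True


# Hypothetical two-unit, three-hour schedule (illustrative values only).
u = [[1, 1, 0], [0, 1, 1]]        # on/off status u(i, t)
x = [[100, 80, 0], [0, 40, 60]]   # output x(i, t) in MW
load = [100, 120, 60]             # L(t) in MW
reserve = [10, 10, 10]            # R(t) in MW
p_max = [200, 80]                 # P_i^max in MW

print(load_balance_ok(u, x, load))
print(reserve_ok(u, p_max, load, reserve))
print([min_up_time_ok(row, 2) for row in u])
```

The sketch treats the first hour of the horizon as a fresh start and does not penalize a unit that is still running when the horizon ends; a full implementation would also need the initial state of each unit.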
Conventional Techniques
Conventional techniques include dynamic programming [10], branch and bound
[11], tabu search, Lagrangian relaxation [12], integer programming, interior point
optimization, simulated annealing, etc.
Hobbs et al. [10] have presented a field-proven dynamic programming formulation of unit commitment. Equation 8 gives the mathematical representation of the dynamic programming algorithm for unit commitment.
$$F_{\mathrm{cost}}(M,N) = \min_{\{L\}} \left[ P_{\mathrm{cost}}(M,N) + S_{\mathrm{cost}}(M-1,L:M,N) + F_{\mathrm{cost}}(M-1,L) \right] \tag{8}$$
where
$F_{\mathrm{cost}}(M,N)$ least cost to arrive at state (M, N)
$P_{\mathrm{cost}}(M,N)$ production cost for state (M, N)
$S_{\mathrm{cost}}(M-1,L:M,N)$ transition cost from state (M − 1, L) to state (M, N).
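The recursion in Eq. 8 can be sketched as a forward pass over the hours followed by a backward trace of the chosen states. The code below is a minimal sketch under stated assumptions: the function name `forward_dp`, the callback signatures, and the toy two-state system (state "A" with one unit on, state "AB" with two units on, with made-up capacities and costs) are hypothetical and only serve to exercise the recursion.

```python
INF = float("inf")


def forward_dp(n_hours, states, p_cost, s_cost, initial_state):
    """Apply Eq. 8 hour by hour and return (least total cost, state path)."""
    # F[N]: least cost to arrive at state N in the current hour
    F = {N: p_cost(0, N) + s_cost(initial_state, N) for N in states}
    back = []  # back[M - 1][N]: best predecessor L of state N in hour M
    for M in range(1, n_hours):
        new_F, choice = {}, {}
        for N in states:
            # minimise S_cost(M-1, L : M, N) + F_cost(M-1, L) over L
            best_L = min(states, key=lambda L: s_cost(L, N) + F[L])
            new_F[N] = p_cost(M, N) + s_cost(best_L, N) + F[best_L]
            choice[N] = best_L
        F = new_F
        back.append(choice)
    last = min(states, key=lambda N: F[N])
    path = [last]
    for choice in reversed(back):
        path.append(choice[path[-1]])
    path.reverse()
    return F[last], path


# Hypothetical data: hourly load in MW, capacity and no-load cost per state.
load = [100, 250, 100]
capacity = {"A": 150, "AB": 300}
no_load = {"A": 0, "AB": 50}


def p_cost(hour, state):
    """Production cost; infinite if the state cannot cover the load."""
    if capacity[state] < load[hour]:
        return INF
    return 2 * load[hour] + no_load[state]


def s_cost(prev_state, state):
    """Transition cost: a start-up cost when the second unit is switched on."""
    return 100 if (prev_state == "A" and state == "AB") else 0


cost, path = forward_dp(len(load), ["A", "AB"], p_cost, s_cost, "A")
print(cost, path)
```

Infeasible states are handled by returning an infinite production cost, so they are never selected as long as a feasible path exists.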
Arthur I. Cohen has presented an algorithm based on the branch and bound technique [11]. It differs from other techniques in that it assumes no priority ordering, whereas most early techniques relied on a priority list of units that defines the order in which units start up or shut down.
Non-Conventional Techniques
Evolutionary Algorithms
In the last few years, global optimization has received a lot of attention from authors worldwide. The reason may be that optimization can play a role in every area, from engineering to finance. Inspired by Darwin’s theory of evolution, evolutionary algorithms can also be used to solve problems that humans do not really know how to solve.
Differential Evolution
Differential evolution (DE) works through the same steps as other evolutionary algorithms. DE was developed by Storn and Price in 1995 [13]. DE aims to find the global optimum and is designed to reduce the risk of becoming trapped in local optima, although this is not guaranteed. Compared to other algorithms, the space complexity of DE is quite low [14].
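To make the DE steps concrete, the following is a minimal sketch of the widely used DE/rand/1/bin scheme (difference-vector mutation, binomial crossover, greedy selection). It is a generic textbook variant, not the implementation used in this chapter; the function names, the parameter values F = 0.5 and CR = 0.9, the in-place population update, and the sphere test function are assumptions chosen for illustration.

```python
import random


def de_trial_vector(population, target_idx, f=0.5, cr=0.9, rng=random):
    """Build one trial vector with DE/rand/1 mutation and binomial crossover."""
    others = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(others, 3)  # needs at least four individuals
    target = population[target_idx]
    dim = len(target)
    # Mutation: base vector plus a scaled difference of two other vectors
    mutant = [
        population[r1][d] + f * (population[r2][d] - population[r3][d])
        for d in range(dim)
    ]
    # Crossover: take each gene from the mutant with probability cr;
    # position j_rand always comes from the mutant
    j_rand = rng.randrange(dim)
    return [
        mutant[d] if (rng.random() < cr or d == j_rand) else target[d]
        for d in range(dim)
    ]


def de_minimize(objective, bounds, pop_size=20, generations=100,
                f=0.5, cr=0.9, seed=0):
    """Minimise objective over box bounds; returns (best vector, best cost)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(v) for v in pop]
    for _ in range(generations):
        for i in range(pop_size):
            trial = de_trial_vector(pop, i, f, cr, rng)
            # Clip the trial vector back into the bounds
            trial = [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]
            c = objective(trial)
            if c <= cost[i]:  # greedy selection: keep the better vector
                pop[i], cost[i] = trial, c
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(v):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(t * t for t in v)


best, best_cost = de_minimize(sphere, [(-5.0, 5.0)] * 3, generations=200)
print(best, best_cost)
```

Only the current population and its costs are stored, which is consistent with the low memory footprint noted above.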
Genetic Algorithm
The genetic algorithm (GA) belongs to the category of evolutionary algorithms. It is widely used to find the optimal solution to complex problems [2]. The mathematical representation of the UC problem formulation using the genetic algorithm (Eq. 9) is given as follows:
$$\sum_{j=1}^{T}\sum_{i=1}^{N}\left[\left(a_i + b_i P_{ij} + c_i P_{ij}^{2}\right) \cdot u_{ij}\right] + \sum_{j=1}^{T}\sum_{i=1}^{N}\left[\left(\sigma_i + \delta_i\left(1 - e^{-T_{ij}^{\mathrm{off}}/T_i}\right)\right) \cdot u_{ij}\left(1 - u_{i,j-1}\right)\right] \tag{9}$$
Subject to
$$\sum_{i=1}^{N} P_{ij} \cdot u_{ij} - PD_j = 0$$
$$T_{ij}^{\mathrm{ON}} > MUT_i$$
$$T_{ij}^{\mathrm{OFF}} > MDT_i$$
where
$N$ number of units
$T$ scheduling interval
$P_{ij}$ unit i’s generation for hour j
$a_i, b_i, c_i$ coefficients of fuel cost
$\sigma$ coefficient of start-up
$PD_j$ demand for hour j
$T_{ij}^{\mathrm{ON}}$ ON time for unit i for hour j
$MUT$ Min. up time
$MDT$ Min. down time
$P_i^{\max}$ Max. generation of unit i
$PR_j$ Spinning reserve for hour j
$u_{ij}$ ON/OFF status for unit i at hour j.
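The objective in Eq. 9 can be evaluated for a given schedule as in the sketch below, which adds the fuel cost of every committed unit and an exponential start-up cost whenever a unit switches from off to on. The function name, the argument layout, the `initial_off` argument (hours each unit has been off before the first hour), and the name `tau` for the time constant in the exponent are assumptions made for this sketch; the numeric values are illustrative only.

```python
import math


def total_operating_cost(u, p, a, b, c, sigma, delta, tau, initial_off):
    """Evaluate Eq. 9: fuel cost plus start-up cost over all units and hours."""
    total = 0.0
    for i in range(len(u)):
        off_hours = initial_off[i]           # hours unit i has been off so far
        prev_on = 1 if off_hours == 0 else 0
        for j in range(len(u[i])):
            if u[i][j]:
                # Fuel cost term (a_i + b_i * P_ij + c_i * P_ij^2) * u_ij
                total += a[i] + b[i] * p[i][j] + c[i] * p[i][j] ** 2
                if not prev_on:
                    # Start-up term, active only when u_ij = 1 and u_i,j-1 = 0
                    total += sigma[i] + delta[i] * (
                        1 - math.exp(-off_hours / tau[i])
                    )
                off_hours = 0
            else:
                off_hours += 1
            prev_on = u[i][j]
    return total


# Hypothetical single-unit example: off for two hours before the horizon,
# off in hour 0, then on for two hours at 100 MW.
cost = total_operating_cost(
    u=[[0, 1, 1]], p=[[0, 100, 100]],
    a=[10.0], b=[2.0], c=[0.01],
    sigma=[50.0], delta=[20.0], tau=[3.0],
    initial_off=[2],
)
print(round(cost, 4))
```

Here the constant coefficient is `a` and the quadratic coefficient is `c`, following the order of the terms in Eq. 9.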
Hybrid Techniques
Numerous optimization algorithms have been devised in the past to address optimal power flow. Examples of such algorithms are the gray wolf optimizer [7], the dragonfly algorithm, artificial bee colony, ant colony optimization, and so on. Anand et al. have presented a technique to solve the profit-based unit commitment (PBUC) problem [15]. Trivedi et al. [16] have presented an approach for solving the power system optimization problem popularly known as the unit commitment (UC) scheduling problem. The authors named their algorithm hybridizing genetic algorithm and differential evolution (hGADE). The genetic algorithm (GA) works well with binary variables, while DE works well with continuous variables, and the authors took advantage of this to solve the UC problem.
The constraints involved in UC are spinning reserve, minimum up time, minimum down time, start-up cost, hydro constraints, generator ‘must run’ constraints, ramp rate, and fuel constraints. The authors of [16] mentioned in their future work that the ramp up/down constraint was neglected and not taken into consideration in solving the unit commitment problem. This motivated us to work further and include the ramp up and down constraints in the calculation of cost. A fitness function has been designed for the same.
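The division of labor described above, with GA operators acting on the binary (on/off) part of a solution, can be illustrated with two common binary operators. This is a hedged sketch: uniform crossover and bit-flip mutation are chosen here as typical examples, and the source does not state which specific GA operators hGADE uses; the function names and sample bit strings are hypothetical.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Swap each bit between the two parents with probability 0.5."""
    child_a, child_b = [], []
    for bit_a, bit_b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            bit_a, bit_b = bit_b, bit_a
        child_a.append(bit_a)
        child_b.append(bit_b)
    return child_a, child_b


def bit_flip_mutation(bits, rate, rng):
    """Flip each on/off bit independently with probability rate."""
    return [1 - b if rng.random() < rate else b for b in bits]


rng = random.Random(1)
status_a = [1, 1, 0, 0, 1]  # hypothetical on/off status of five units
status_b = [0, 1, 1, 0, 0]
child_a, child_b = uniform_crossover(status_a, status_b, rng)
mutated = bit_flip_mutation(child_a, 0.1, rng)
print(child_a, child_b, mutated)
```

The continuous part of a solution (the power output of each committed unit) would be handled by DE mutation and crossover instead.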
5 Proposed Approach
Figure 1 describes the working of the already implemented hGADE algorithm. The ramp up and down rates are considered in this research. In addition, new fitness functions for differential evolution [17] and the genetic algorithm have been formulated. Table 1 presents the input data for the six-unit system, which consist of the cost coefficients, capacity limits, minimum up/down times, start-up costs, and ramp rates. Table 2 shows the load pattern (in MW) at intervals of 1 h. Table 3 shows the mutation and crossover rates defined for this research.

Table 1 Input data for six-unit system
Unit  a ($/MW2 hr)  b ($/MW hr)  c ($/hr)  P max (MW)  P min (MW)  Min. up (hours)  Min. down (hours)  Start-up cost  Ramp up (MW/hr)  Ramp down (MW/hr)
1     0.00375       2            200       200         50          3                1                  176            130              130
2     0.0175        1.75         257       80          20          2                2                  187            130              130
3     0.0625        1            300       40          15          3                1                  113            90               90
4     0.00834       3.25         400       35          10          3                2                  267            60               60
5     0.025         3            515       30          10          2                1                  180            60               60
6     0.05          3            515       25          12          3                1                  113            40               40
Fitness Function:
The ramp rate is considered in the calculation of the overall cost of production of the power plant. The fitness function used by the genetic algorithm is given in Eq. 10. Here, Fs is the average cost per generation and Ft is the mean of Fs.
$$f = \begin{cases} 1 & \text{if } \left(1 - e/\max(F_s)\right) \cdot \mathrm{ramprate} < F_t \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
The fitness function used by the differential evolution algorithm is given in Eq. 11.
$$f = \begin{cases} 1 & \text{if } F_s \cdot \mathrm{ramprate} < F_t/e \\ 0 & \text{otherwise} \end{cases} \tag{11}$$
only for the first iteration. There are six units which are taken for power system
analysis. These six units have to satisfy the load with minimum cost. The load is
distributed with one hour interval.
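The two threshold tests in Eqs. 10 and 11 can be written down directly, as in the sketch below. Several points are assumptions because the text does not define them: `e` is taken to be Euler's number, Fs is treated as a list of per-generation average costs in Eq. 10 and as a single value in Eq. 11, and the function names and sample numbers are hypothetical.

```python
import math


def ga_fitness(fs, ft, ramp_rate, e=math.e):
    """Eq. 10: return 1 if (1 - e / max(Fs)) * ramprate < Ft, else 0."""
    return 1 if (1 - e / max(fs)) * ramp_rate < ft else 0


def de_fitness(fs_value, ft, ramp_rate, e=math.e):
    """Eq. 11: return 1 if Fs * ramprate < Ft / e, else 0."""
    return 1 if fs_value * ramp_rate < ft / e else 0


# Illustrative values only: average costs per generation and their mean.
fs = [100.0, 200.0]
ft = sum(fs) / len(fs)

print(ga_fitness(fs, ft, ramp_rate=130))
print(de_fitness(0.1, ft, ramp_rate=130))
print(de_fitness(100.0, ft, ramp_rate=130))
```

Both functions return a binary flag rather than a graded score, matching the piecewise form of the equations.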
The operation cost is calculated as follows (Eq. 12):
$$oc = a \cdot P_t^{2} + b \cdot P_t + c \tag{12}$$
Here,
oc Operation cost
a, b, c Cost coefficients
P Maximum power.
Then, the priority of the power units is determined as per Eq. 13:
$$\mathrm{Priority} = \frac{oc}{P_{\max}} \tag{13}$$
The priorities of all units are calculated and sorted in ascending order, which means that the unit with the higher priority (the lower the number, the higher the priority) takes the load first. In hGADE, the genetic algorithm works on the binary component and the differential evolution algorithm works on the continuous component of the chromosomes (Fig. 1).
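As a worked example, Eqs. 12 and 13 can be applied to the six units of Table 1. The sketch assumes that the operation cost is evaluated at each unit's maximum power, that `a` is the quadratic coefficient and `c` the constant term as written in Eq. 12, and that the larger of the two capacity values listed for each unit in Table 1 is its maximum power; the variable and function names are hypothetical.

```python
# unit: (a, b, c, p_max) with values read from Table 1
units = {
    1: (0.00375, 2.0, 200.0, 200.0),
    2: (0.0175, 1.75, 257.0, 80.0),
    3: (0.0625, 1.0, 300.0, 40.0),
    4: (0.00834, 3.25, 400.0, 35.0),
    5: (0.025, 3.0, 515.0, 30.0),
    6: (0.05, 3.0, 515.0, 25.0),
}


def operation_cost(a, b, c, p):
    """Eq. 12: oc = a * P^2 + b * P + c."""
    return a * p ** 2 + b * p + c


def priority_order(unit_data):
    """Eq. 13: priority = oc / P_max; a lower value means the unit loads first."""
    priority = {
        unit: operation_cost(a, b, c, p_max) / p_max
        for unit, (a, b, c, p_max) in unit_data.items()
    }
    order = sorted(priority, key=priority.get)  # ascending priority value
    return order, priority


order, priority = priority_order(units)
for unit in order:
    print(unit, round(priority[unit], 4))
```

Under these assumptions the units are loaded in the order 1 to 6, with unit 1 having the lowest cost per MW of capacity.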
Results
The research has been carried out in MATLAB 2016b. It has been found that optimization made a difference in the cost of operation. The results obtained under the specified conditions are given in Tables 4 and 5.
The results obtained are promising. Figure 2 shows the average cost of the generators (units) over generations (iterations). Table 4 shows the unit commitment schedule of the six units over ten generations. Here, “on” indicates that a unit is active in a particular iteration and “off” indicates that the unit is not taken into consideration in the calculation of total operating cost.
The average cost of operation is 142,814.9603 $, and it is reduced to 142,809.8944 $ after applying optimization (hGADE), a reduction of about 5.07 $, as shown in Table 5. Case 1 represents the total operation cost without any optimization. Case 2 shows the results with optimization. The comparative analysis shows an improvement with respect to cost when the proposed approach is applied.
6 Conclusion
The research is carried out using hybridization of the genetic and differential evolution algorithms with consideration of ramp up and down rates. Fitness functions have been designed for each algorithm. It has been observed that the hybridization of evolutionary algorithms with ramp rates and the newly designed fitness functions showed significant improvement with respect to total operation cost.

Fig. 1 Flowchart of the hGADE algorithm: start; initialize population; fitness evaluation of parent population; if the optimal solution condition is satisfied, output the result; otherwise perform stochastic uniform selection, GA crossover and GA mutation on the binary components, DE mutation and DE crossover on the continuous components, and fitness evaluation again

Table 4 UC schedule
               Unit 1  Unit 2  Unit 3  Unit 4  Unit 5  Unit 6
Generation 1   Off     On      Off     Off     On      Off
Generation 2   On      On      Off     Off     On      On
Generation 3   On      Off     On      On      Off     On
Generation 4   On      On      On      Off     On      On
Generation 5   On      Off     Off     Off     On      Off
Generation 6   On      Off     Off     Off     Off     Off
Generation 7   Off     On      On      Off     On      Off
Generation 8   On      On      Off     Off     Off     Off
Generation 9   On      Off     Off     On      Off     Off
Generation 10  On      On      On      Off     Off     On
References
1. Wood, A. J., & Wollenberg, B. F. (2007). Power generation, operation & control, 2nd edn. New York: John Wiley & Sons.
2. Håberg, M. (2019). Fundamentals and recent developments in stochastic unit commitment. International Journal of Electrical Power & Energy Systems. https://doi.org/10.1016/j.ijepes.2019.01.037
3. Deka, D., & Datta, D. (2019). Optimization of unit commitment problem with ramp-rate constraint and wrap-around scheduling. Electric Power Systems Research. https://doi.org/10.1016/j.epsr.2019.105948
4. Wang, M. Q., Yang, M., Liu, Y., Han, X. S., & Wu, Q. (2019). Optimizing probabilistic spinning reserve by an umbrella contingencies constrained unit commitment. International Journal of Electrical Power & Energy Systems. https://doi.org/10.1016/j.ijepes.2019.01.034
5. Zhou, M., Wang, B., Li, T., & Watada, J. (2018). A data-driven approach for multi-objective unit commitment under hybrid uncertainties. Energy. https://doi.org/10.1016/j.energy.2018.09.008
6. Park, H., Jin, Y. G., & Park, J.-K. (2018). Stochastic security-constrained unit commitment with wind power generation based on dynamic line rating. International Journal of Electrical Power & Energy Systems. https://doi.org/10.1016/j.ijepes.2018.04.026
7. Panwar, L. K., Reddy, S. K., Verma, A., Panigrahi, B. K., & Kumar, R. (2018). Binary grey wolf optimizer for large scale unit commitment problem. Swarm and Evolutionary Computation. https://doi.org/10.1016/j.swevo.2017.08.002
8. Tovar-Ramírez, C. A., Fuerte-Esquivel, C. R., Martínez Mares, A., & Sánchez-Garduño, J. L. (2019). A generalized short-term unit commitment approach for analyzing electric power and natural gas integrated systems. Electric Power Systems Research. https://doi.org/10.1016/j.epsr.2019.03.005
9. Zhou, B., Ai, X., Fang, J., Yao, W., Zuo, W., Chen, Z., & Wen, J. (2019). Data-adaptive robust unit commitment in the hybrid AC/DC power system. Applied Energy. https://doi.org/10.1016/j.apenergy.2019.113784
10. Hobbs, W. J., Hermon, G., Warner, S., & Shelbe, G. B. (1988). An enhanced dynamic programming approach for unit commitment. IEEE Transactions on Power Systems.
11. Cohen, A. I., & Yoshimura, M. (1983). A branch-and-bound algorithm for unit commitment. IEEE Transactions on Power Apparatus and Systems.
12. Yu, X., & Zhang, X. (2014). Unit commitment using Lagrangian relaxation and particle swarm optimization. International Journal of Electrical Power & Energy Systems.
13. Price, K. V., & Storn, R. (1997). Differential evolution: A simple evolution strategy for fast optimization. Dr. Dobb’s Journal, 22(4), 18–24.
14. Singh, A., & Kumar, S. (2016). Differential evolution: An overview. Advances in Intelligent Systems and Computing. https://doi.org/10.1007/978-981-10-0448-3_17
15. Anand, H., Narang, N., & Dhillon, J. S. (2018). Profit based unit commitment using hybrid optimization technique. Energy. https://doi.org/10.1016/j.energy.2018.01.138
16. Trivedi, A., Srinivasan, D., Biswas, S., & Reindl, T. (2015). Hybridizing genetic algorithm with differential evolution for solving the unit commitment scheduling problem. Swarm and Evolutionary Computation. https://doi.org/10.1016/j.swevo.2015.04.001
17. Dhaliwal, J. S., & Dhillon, J. S. (2019). Unit commitment using memetic binary differential evolution algorithm. Applied Soft Computing. https://doi.org/10.1016/j.asoc.2019.105502
In Silico Modeling and Screening Studies
of Pf RAMA Protein: Implications
in Malaria
Abstract Malaria is a major parasitic disease that affects a large human population, especially in tropical and sub-tropical countries. The treatment of malaria is becoming extremely difficult due to the emergence of drug-resistant parasites. To address this problem, many newer drug target proteins are being identified in Plasmodium falciparum, the major causal organism of malaria in humans. Rhoptry proteins participate in the invasion of red blood cells by the merozoites of the malarial parasite. Interference with rhoptry protein function has been shown to prevent invasion of the erythrocytes by the parasite. As the crystal structure of the RAMA protein of Plasmodium falciparum (PfRAMA) is not yet available, the three-dimensional structure of the protein was predicted using comparative modeling methods. The structural quality of the generated model was validated using Procheck, which is based on the parameters of the Ramachandran plot. The Procheck results showed that 92.7% of backbone angles were in the allowed region and 0.4% in the disallowed region. This structure was studied for interaction with the entire library of compounds in the ZINC database of natural compounds. The binding site of the protein was predicted using Sitemap, and the entire library was screened against the target. A total of 189,530 compounds were used as input to HTVS for the first level of screening. The docking scores of the compounds were further calculated using the “Extra Precision” (XP) algorithm of Glide. On the basis of docking scores, 54 compounds were selected for further analysis. The binding affinity was further calculated using the MMGBSA method. The interaction studies using molecular docking and MMGBSA revealed appreciable docking scores and ΔGbind. Ten compounds were selected as promising leads with appreciable docking scores in the range of −17.891 to −5.328 kcal/mole. Our data provide evidence that the screened compounds indicate a potential binding to the target and
© The Editor(s) (if applicable) and The Author(s), under exclusive license 91
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_8
92 S. Srivastava and P. Mathur
need further evaluation. Also, the analysis of the interactions of these compounds can be exploited for better and more efficient design of novel drugs against the said target.
1 Introduction
Malaria is a major tropical parasitic disease and affects a large population in the countries located in tropical regions [1]. According to the WHO World Malaria Report 2015, malaria resulted in about 438,000 deaths globally [2]. Although major steps have been taken to reduce the burden of malaria, the ultimate aim of Roll Back Malaria (RBM), which is zero deaths, is still far from achieved. Various classes of antimalarial drugs such as quinoline derivatives, artemisinin derivatives, antifolates, antimicrobials, and primaquine are known. However, due to the increasing incidence of resistance and the adverse effects of existing drugs, there is a growing need for the discovery and development of new antimalarials.
Malaria is caused by Plasmodium species. These are apicomplexan parasites that contain secretory organelles such as rhoptries, micronemes, and dense granules. Rhoptries of Plasmodium are joined club-shaped organelles situated at the apical end of the parasite merozoites. When these merozoites attach to the surface of human erythrocytes, the rhoptries discharge their contents onto the membrane [3]. The merozoites internalize and the rhoptries disappear. Very little is known about rhoptry biogenesis due to the lack of biomarkers of organelle generation. Microscopic examination reveals that rhoptry organelles are formed by continuous fusion of post-Golgi vesicles [3, 4]. Rhoptries are composed of proteins and lipids. Some rhoptry proteins have been analyzed at the molecular level, while others have been identified using immunological reagents [5, 6]. In the present work, a rhoptry protein of Plasmodium falciparum, namely rhoptry-associated membrane antigen (PfRAMA), which appears to have a role in both rhoptry biogenesis and erythrocyte invasion, has been studied. It has been suggested that RhopH3 and RAP1 show protein–protein interaction with RAMA. Recently, it has been shown that certain proteins like sortilin are involved in escorting RAMA from the Golgi apparatus to the rhoptries [7]. Considering the importance of RAMA and its crucial role in forming the apical complex in Plasmodium, the protein appears to be an interesting drug target. Threading-based modeling of PfRAMA (PF3D7_0707300), prediction of the binding site, and virtual screening using biogenic compounds from the ZINC database have been performed. The compounds showing promising docking scores were selected. Molecular dynamics simulation of the protein was performed separately to understand its stability.
As the binding site of the protein was not known, it was predicted using Sitemap (version 3.6, Schrödinger). Molecular docking calculations for all the compounds, to determine their binding affinity for the PfRAMA protein binding site, were performed using Glide (version 6.8, Schrödinger) [14]. After protein and ligand preparation, the receptor-grid file was generated using the receptor-grid generation program, with the grid placed at the centroid of the predicted binding site. A cube of size 10 Å × 10 Å × 10 Å was defined at the center of the binding site for the binding of the docked ligand, and one more enclosing box of 12 Å × 12 Å × 12 Å was defined to contain all the atoms of the docked poses.
The structure was studied for interaction with the entire library of biogenic compounds selected from the ZINC database. A total of 276,784 compounds were screened using different filters (Qikprop, reactive, Lipinski’s rule of five), and the selected compounds were used as input for high throughput virtual screening (HTVS). The screened compounds were subjected to the next level of molecular docking calculations using the “Standard Precision” (SP) algorithm. Compounds selected after SP docking were further refined by the “Extra Precision” (XP) method of Glide. On the basis of XP docking scores, 10 compounds were selected for further analysis. The binding affinity was calculated based on molecular mechanics generalized Born surface area (MMGBSA) using Prime (version 4.1, Schrödinger), and the interaction studies using molecular docking and MMGBSA [15] revealed appreciable docking scores and ΔGbind.
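One of the pre-screening filters mentioned above, Lipinski's rule of five, can be expressed as a simple descriptor check. The sketch below is generic and is not the Schrödinger workflow used in this study: the `Descriptors` class, the function names, the choice to tolerate one violation, and the descriptor values are all assumptions, and the descriptors themselves would have to be computed by a cheminformatics tool.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept a compound with at most max_violations violations."""
    return lipinski_violations(d) <= max_violations


# Illustrative descriptor values, not taken from the ZINC library.
small = Descriptors(mol_weight=350.0, log_p=2.5, h_donors=2, h_acceptors=5)
large = Descriptors(mol_weight=650.0, log_p=6.2, h_donors=2, h_acceptors=5)

print(passes_rule_of_five(small), passes_rule_of_five(large))
```

Setting `max_violations=0` gives a stricter filter that rejects any compound exceeding one of the four limits.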
A. Sequence analysis
The PfRAMA protein sequence of 861 amino acids was first analyzed using BLAST against the PDB database to find structurally characterized proteins with significant sequence similarity to the target protein. The search uses evolutionary information through profile-profile alignment and an assessment of the likelihood that two proteins are related to each other, as shown in Fig. 1. Sequence identity and query coverage for the templates available for PfRAMA were very low.
Fig. 1 Graphical results of BLAST query for PfRAMA; the regions of sequences from the target database that aligned with the query sequence
the protein as represented in Fig. 3. RMSD plot revealed that the structure was
stable after 35 ns.
Fig. 3 50 ns molecular dynamics simulation run of PfRAMA protein for refinement of structure: RMSD of heavy atoms and backbone atoms
residues in disallowed region (Fig. 4). Thus, a good quality structure was generated and the refined structure with minimum energy was further used to perform molecular docking studies.
Fig. 4 Validation of MD simulated structure of PfRAMA protein: Ramachandran plot which shows 92.7% residues were in allowed regions and 0.4% residues in disallowed region
Table 1 Glide energy, docking score and MMGBSA (ΔGbind) score of selected ligands
S. No.  ZINC ID       Glide energy  Docking score  MMGBSA
1       ZINC08623270  −48.440       −17.891        −91.547
2       ZINC03794794  −36.907       −11.029        −90.356
3       ZINC20503905  −35.241       −7.427         −85.930
4       ZINC67870780  −43.708       −5.547         −85.490
5       ZINC67870780  −47.18        −7.561         −85.490
6       ZINC22936347  −34.835       −6.461         −84.941
7       ZINC15672677  −34.502       −6.02          −84.845
8       ZINC09435873  −43.937       −6.53          −82.966
9       ZINC09435873  −37.125       −6.095         −82.966
10      ZINC20503855  −33.255       −5.328         −80.351
Fig. 7 Graph showing per residue energy: (a) E_vdw of ligand, (b) E_ele of ligand
4 Conclusion
studies were performed and the binding affinity of the ligands with the protein was evaluated. Out of ten molecules that showed appreciable docking scores and high affinity toward the binding site of the protein, ZINC08623270 was selected for further analysis. The name of the ligand is 1-(3-methylsulfanyl phenyl)-3-[[5-[(4-phenylpiperazin-1-yl)methyl]quinuclidin-2-yl]methyl]urea. The interaction between the protein and this ligand was stabilized by three hydrogen bonds as well as hydrophobic and ionic interactions. Our data provide evidence that the reported compounds indicate a potential binding to the target and need further experimental evaluation. It is therefore proposed that this study could be the basis for medicinal chemists to design better and more efficient compounds which may qualify as novel drugs against the said target of malaria, caused by Plasmodium falciparum.
References
1. Cowman, A. F., Healer, J., Marapana, D., & Marsh, K. (2016). Malaria: Biology and disease. Cell, 167, 610–624.
2. WHO. (2015). The World Malaria Report. http://www.who.int/malaria/publications/world-malaria-report-2015/report/en/. ISBN 978 92 4 156515 8.
3. Bannister, L. H., Mitchell, G. H., Butcher, G. A., & Dennis, E. D. (1986). Lamellar membranes associated with rhoptries in erythrocytic merozoites of Plasmodium knowlesi: A clue to the mechanism of invasion. Parasitology, 92(2), 291–303.
4. Jaikaria, N. S., Rozario, C., Ridley, R. G., & Perkins, M. E. (1993). Biogenesis of rhoptry organelles in Plasmodium falciparum. Molecular and Biochemical Parasitology, 57(2), 269–279.
5. Preiser, P., Kaviratne, M., Khan, S., Bannister, L., & Jarra, W. (2000). The apical organelles of malaria merozoites: Host cell selection, invasion, host immunity and immune evasion. Microbes and Infection, 2(12), 1461–1477.
6. Blackman, M. J., & Bannister, L. H. (2001). Apical organelles of Apicomplexa: Biology and isolation by subcellular fractionation. Molecular and Biochemical Parasitology, 117(1), 11–25.
7. Hallée, S., Boddey, J. A., Cowman, A. F., & Richard, D. (2018). Evidence that the Plasmodium falciparum protein sortilin potentially acts as an escorter for the trafficking of the rhoptry-associated membrane antigen to the rhoptries. mSphere, 3(1), e00551-17. https://doi.org/10.1128/mSphere.00551-17
8. Wu, S., & Zhang, Y. (2007). LOMETS: A local meta-threading-server for protein structure prediction. Nucleic Acids Research, 35(10), 3375–3382.
9. Laskowski, R. A., MacArthur, M. W., Moss, D. S., & Thornton, J. M. (1993). PROCHECK: A program to check the stereochemical quality of protein structures. Journal of Applied Crystallography, 26, 283–291.
10. Colovos, C., & Yeates, T. O. (1993). Verification of protein structures: Patterns of non-bonded atomic interactions. Protein Science, 2(9), 1511–1519.
11. Irwin, J. J., Sterling, T., Mysinger, M. M., Bolstad, E. S., & Coleman, R. G. (2005). ZINC: A free database of commercially available compounds for virtual screening. Journal of Chemical Information and Modeling, 45, 177–182.
12. Greenwood, J. R., Calkins, D., Sullivan, A. P., & Shelley, J. C. (2010). Towards the comprehensive, rapid, and accurate prediction of the favorable tautomeric states of drug-like molecules in aqueous solution. Journal of Computer-Aided Molecular Design, 24(6–7), 591–604.
13. Andrec, M., Harano, Y., Jacobson, M. P., Friesner, R. A., & Levy, R. M. (2002). Complete protein structure determination using backbone residual dipolar couplings and sidechain rotamer prediction. Journal of Structural and Functional Genomics, 2(2), 103–11.
14. Friesner, R. A., Banks, J. L., et al. (2004). Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. Journal of Medicinal Chemistry, 47(7), 1739–49.
15. Jacobson, M. P., Friesner, R. A., Xiang, Z., & Honig, B. (2002). On the role of the crystal environment in determining protein side-chain conformations. Journal of Molecular Biology, 320(3), 597–608.
IBRP: An Infrastructure-Based Routing
Protocol Using Static Clusters in Urban
VANETs
P. K. Pandey (B)
Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
e-mail: pavan.pandey.1312@gmail.com
V. Kansal
Institute of Engineering and Technology, Dr. A.P.J Abdul Kalam Technical University, Lucknow,
India
e-mail: vineetkansal@ietlucknow.ac.in
A. Swaroop
Bhagwan Parashuram Institute of Technology, New Delhi, India
e-mail: abhishekswaroop@bpitindia.com
© The Editor(s) (if applicable) and The Author(s), under exclusive license 103
to Springer Nature Singapore Pte Ltd. 2021
A. Khanna et al. (eds.), Recent Studies on Computational Intelligence,
Studies in Computational Intelligence 921,
https://doi.org/10.1007/978-981-15-8469-5_9
104 P. K. Pandey et al.
1 Introduction
Vehicular ad hoc networks (VANETs) [1] are a prominent subclass of mobile ad hoc networks (MANETs). They provide a special kind of framework to allow communication among vehicles. In VANETs, vehicles create a huge network (millions of vehicles running on roads) without any centralized authority, with each vehicle acting as a network node and as a router itself. VANETs [2] are a kind of infrastructure-less, self-organized network where nodes are completely mobile. Unique characteristics of VANETs such as dynamic topologies, limited bandwidth, and limited energy make this one of the most challenging network scenarios.
As shown in Fig. 1, every vehicle in a VANET is equipped [3] with wireless devices to communicate with other vehicles and roadside units (RSUs). Nodes in VANETs can communicate with each other through either single-hop or multi-hop connectivity. Each vehicle and RSU is part of the VANET as a node, as in any network. Vehicular communication can be further categorized into two kinds of communication [4]: vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. V2V supports communication among vehicles only, whereas V2I also supports the inclusion of other nodes in communication, such as RSUs and traffic authorities. Hence, vehicles can communicate directly if they are within communication range of each other, and communication beyond their range is possible through other infrastructure nodes.
VANETs are used to design intelligent transportation systems because of their useful applications [6] such as transportation safety, traffic efficiency, and traffic improvement. Transportation safety [7] includes the dissemination of messages about alerts and warnings such as accident alerts, traffic situation alerts, poor road condition alerts, lane change warnings, overtaking warnings, and collision warnings. Traffic efficiency and improvement applications focus on assisting and enhancing traffic flow according to the current traffic situation. They make driving more comfortable by sharing dynamic traffic information with vehicles running on the roads. These applications help to avoid road accidents, which directly reduces casualties on the road.
VANETs are a kind of network with many challenges that are yet to be addressed. These challenges include routing challenges, security challenges, reduced signal quality, degraded signal strength, and quality of communication. Additionally, the rapid change of vehicle positions makes it more difficult to set up, implement, and deploy a vehicular communication framework.
To increase cooperation between vehicles and other nodes, there should be a way to transmit a message from one node to another, which requires an effective routing procedure. Given the characteristics of VANETs, routing is the most important research challenge in VANETs. Therefore, designing an effective and efficient routing protocol is a key requirement for providing reliable communication in VANETs. Vehicles are not able to share messages among themselves without a well-defined routing mechanism. In order to address routing issues, several routing protocols have already been proposed for VANETs.
This paper also focuses on the problem of routing, and a new routing approach based on an efficient clustering technique is proposed. The major contributions of the present exposition are as follows.
(1) A new cluster-based routing protocol for urban VANETs has been proposed.
(2) A mechanism has been added to the proposed approach to avoid flooding of messages in the network.
(3) Cached routes are used to apply controlled broadcasting.
(4) Static performance analysis shows that the proposed approach is efficient and scalable.
The rest of the paper is organized as follows. Section 2 explains routing challenges
and discusses a few routing protocols. Section 3 presents the details of the new
routing approach proposed in this paper. Section 4 presents the performance
evaluation, which helps to assess the effectiveness and efficiency of the designed
routing approach, and Sect. 5 concludes the paper.
106 P. K. Pandey et al.
2 Related Work
The process of transmitting a message from one node to another is known as
routing. Numerous routing protocols have been proposed for different kinds of
network scenarios. Routing protocols designed for completely connected networks
such as MANETs cannot be used effectively and efficiently in VANETs. Therefore,
different kinds of routing procedures need to be designed that support dynamic
topology, frequent link breakage, and high mobility of nodes.
In VANETs, several routing protocols [8] have already been proposed to achieve
efficient routing between nodes. The efficiency and usability of a routing protocol
can be measured by different quality of service (QoS) parameters such as end-to-end
delay, round-trip delay, packet loss, jitter, and interference. Based on the routing
behavior used, routing approaches [9] can be further classified into different
categories. As shown in Fig. 2, the major routing protocols can be classified into
five categories according to the routing mechanism used. Each category of protocol
is discussed separately.
Topology-based routing protocols [11] are based on information about the complete
network topology. Each node in the network keeps track of every other node and
maintains a routing table that captures the best route to every other node in the
network. This mechanism can be further classified into three categories: proactive,
reactive, and hybrid routing protocols. Destination-sequenced distance vector
(DSDV) [12], ad hoc on-demand distance vector (AODV) [13], and zone routing
protocol (ZRP) [14] are a few popular routing protocols from this category.
Position-based routing protocols [15, 16] use the current position or location of
vehicles, obtained using a global positioning system (GPS) or any other related
technology.
VANETs are a collection of vehicles and roadside units (RSUs), where each vehicle
is supposed to be equipped with an onboard unit (OBU). OBUs help vehicles to
communicate with each other and with infrastructure nodes such as RSUs. Vehicles
in the network can communicate with other vehicles and RSUs only within a limited
distance. OBU-equipped vehicles are capable of maintaining and broadcasting
traffic-related information to the respective RSUs, such as the current position of
the vehicle, speed, direction, current time, and traffic events.
The infrastructure-based routing protocol (IBRP) is also based on the clustering
approach. IBRP therefore divides the complete network into several clusters. In this
approach, each vehicle repeatedly exchanges messages with the RSUs within its
range. Based on the messages exchanged with vehicles, each RSU forms a cluster
with all vehicles within its communication range. In every cluster, the RSU acts as
cluster head and the vehicles act as cluster members. Therefore, the RSU has to
maintain information about all vehicles in its cluster.
IBRP: An Infrastructure-Based Routing Protocol … 109
Data structure at vehicle: Data structure at vehicle V(i), for 0 < i < N, in a VANET
of N vehicles.
ID(i): Unique identification number of vehicle i.
S(i): Current state of vehicle i.
V(i): Current speed of vehicle i.
D(i): Direction of movement of vehicle i, where D(i) belongs to {−1, 0, 1}.
L(i): Current location of vehicle i.
R(i): Identification number of the RSU acting as cluster head of vehicle i.
Neighbors(i): Map of neighbor vehicles maintained on vehicle i. It is defined as
map <ID(i), Path(i)>, where Path(i) is the list of nodes to traverse to reach the
vehicle with identifier ID(i).
RSU(i): List of RSUs in communication range.
Data structure at RSU: Data structure at RSU R(j), for 0 < j < M, in a VANET
of M RSUs.
ID(j): Unique identification number of RSU j.
S(j): Current state of RSU j.
Members(j): Map of member vehicles maintained on RSU j. It is defined as
map <ID(i), Path(i)>, where ID(i) is the identification number of a vehicle and
Path(i) is the list of nodes to traverse to reach that vehicle.
Old_members(j): Map of old member vehicles maintained on RSU j. It is defined as
map <ID(i), R(x)>, where ID(i) is the identification number of a vehicle that left
recently and R(x) is the identification number of the current RSU of that vehicle.
RSU(j): List of RSUs directly connected to R(j).
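The two records listed above can be sketched as Python dataclasses. This is only an illustrative rendering: the class and field names, the 2-D tuple used for the location, and the string state codes are assumptions that do not appear in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VehicleRecord:
    """Per-vehicle data, mirroring ID(i), S(i), V(i), D(i), L(i), R(i)."""
    vehicle_id: int                              # ID(i)
    state: str = "IN"                            # S(i)
    speed: float = 0.0                           # V(i)
    direction: int = 0                           # D(i), one of -1, 0, 1
    location: Tuple[float, float] = (0.0, 0.0)   # L(i); 2-D point is an assumption
    cluster_head: Optional[int] = None           # R(i)
    # Neighbors(i): neighbor id -> list of nodes to traverse to reach it
    neighbors: Dict[int, List[int]] = field(default_factory=dict)
    rsus_in_range: List[int] = field(default_factory=list)  # RSU(i)

    def __post_init__(self):
        if self.direction not in (-1, 0, 1):
            raise ValueError("direction must be -1, 0, or 1")


@dataclass
class RsuRecord:
    """Per-RSU data, mirroring ID(j), S(j), Members(j), Old_members(j), RSU(j)."""
    rsu_id: int                                  # ID(j)
    state: str = "IN"                            # S(j)
    # Members(j): vehicle id -> list of nodes to traverse to reach it
    members: Dict[int, List[int]] = field(default_factory=dict)
    # Old_members(j): id of a vehicle that left -> id of its current RSU
    old_members: Dict[int, int] = field(default_factory=dict)
    connected_rsus: List[int] = field(default_factory=list)  # RSU(j)
```

Dictionaries are used for the neighbor and member maps so that a destination lookup does not require scanning a list.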
Message format: Several messages are used in this protocol; they are defined here
for a better understanding of the approach.
HELLO {ID(i), V(i), D(i), L(i)}: Message sent from a vehicle to an RSU while
joining a cluster.
HELLO_ACK {R(j), Neighbors(i)}: Message sent from an RSU to a vehicle in
response to HELLO.
BYE {R(i)}: Message sent by a vehicle to its RSU while leaving a cluster.
BYE_ACK {NONE}: Response from the RSU to the vehicle, in response to BYE.
MESSAGE {Vs(i), Vd(i), Path(Vd(i)), String}: Structure that holds the data to be
sent from one node to another, where Vs(i) is the source vehicle identification
number, Vd(i) is the destination vehicle identification number, Path(Vd(i)) is the
path traversed so far to reach the destination vehicle, and String is the data to be
sent.
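As a rough illustration of the message formats above, the five messages can be written as small Python dataclasses. The class and field names and the `add_hop` helper are hypothetical; the paper defines only the field lists.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hello:
    """HELLO {ID(i), V(i), D(i), L(i)}: vehicle -> RSU when joining."""
    vehicle_id: int
    speed: float
    direction: int
    location: Tuple[float, float]


@dataclass(frozen=True)
class HelloAck:
    """HELLO_ACK {R(j), Neighbors(i)}: RSU -> vehicle."""
    rsu_id: int
    neighbors: Dict[int, List[int]] = field(default_factory=dict)


@dataclass(frozen=True)
class Bye:
    """BYE {R(i)}: vehicle -> RSU when leaving a cluster."""
    rsu_id: int


@dataclass(frozen=True)
class ByeAck:
    """BYE_ACK {NONE}: carries no fields."""


@dataclass
class Message:
    """MESSAGE {Vs(i), Vd(i), Path(Vd(i)), String}."""
    source: int
    destination: int
    path: List[int] = field(default_factory=list)  # nodes traversed so far
    payload: str = ""

    def add_hop(self, node_id: int) -> None:
        # Hypothetical helper: each forwarding node appends its identifier.
        self.path.append(node_id)
```

For example, a message that passes through RSU 100 would carry `path == [100]` after one call to `add_hop(100)`.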
Some data structures are searched frequently, such as the list of neighbors
maintained on vehicles and the list of members maintained on RSUs. Therefore, a
map is used for these data structures to reduce the complexity of search operations.
Initialization of all data structures for both entities is explained below.
The cluster formation process starts as soon as the vehicle starts and is ready to
communicate with nearby RSUs. Once the OBU of the vehicle is ready with the
vehicle details, including movement details, it sends the “HELLO” message to all
nearest RSUs. The HELLO message consists of the identity of the vehicle (Id), the
current location of the vehicle (Lt), the speed of the vehicle (Vt), the direction of
movement (Dt), and the current timestamp (t). The RSU receives the message,
analyzes the details received, and updates its routing information with respect to
that vehicle. Based on the traffic information provided, the RSU fetches all
neighbors of the newly joined vehicle and publishes the neighbor list back to the
vehicle in the “HELLO_ACK” message, which is the response to the “HELLO”
message. The same “HELLO” message should be sent periodically by the vehicle
until it receives a response from an RSU or the vehicle crosses an intersection point.
After crossing an intersection point, there is a high probability of a change in
traffic parameters, such as direction after a turn or speed due to new road
conditions. Therefore, the OBU prepares a new set of data and starts sending a new
“HELLO” message to selected RSUs after crossing the intersection point. With this
way of communication, each RSU has routing information for all vehicles within its
range, and each vehicle has information about all neighbors that are directly
reachable. The RSU is known as the master vehicle here, as it maintains routing
information for all vehicles within its range and is responsible for keeping that
information updated.
Cluster state transition: In the proposed algorithms, at any moment, every vehicle
is marked as being in one of these five states: initial (IN), start election (SE), wait
response (WR), cluster member (CM), and isolated member (IM).
Initial (IN): The initial state is the state of a vehicle before it joins any cluster.
Any new vehicle is treated as being in this state for a certain period of time.
Start election (SE): In this state, vehicles try to join the relevant clusters. After
the initial timer expires, the vehicle changes state from IN to SE and starts sending
HELLO messages to all neighbors. An RSU in this state is ready to process
HELLO/BYE requests and respond accordingly.
Wait response (WR): After sending a HELLO message, the vehicle changes state
from SE to WR. In this state, the vehicle waits for a response from the respective
RSU in order to become a member of that cluster. If no response is received within
a certain period of time, the vehicle moves to the SE state again.
Cluster member (CM): After a successful exchange of HELLO and HELLO_ACK
messages with the RSU, the vehicle changes state to CM because the vehicle is now
part of a cluster. If the vehicle has to change cluster, its state changes to SE again
after BYE and BYE_ACK messages have been exchanged.
Cluster head (CH): This state applies to RSUs only. After responding to a HELLO
request with HELLO_ACK, the RSU is marked as being in the CH state instead of
IN. After cleaning up the complete cluster, the RSU changes its state from CH to
IN again.
Isolated member (IM): Vehicles that are not part of any cluster are marked as being
in the IM state. Vehicles that have either completed a trip or are changing cluster
move to the IM state.
To illustrate the state transitions properly, a few events are also defined.
INIT_T: This event corresponds to the expiry of an initial 30 s timer that lets the
OBU settle and get ready to exchange messages.
HELLO: The “HELLO” event signifies a message sent from a vehicle to an RSU
during cluster formation.
HELLO_ACK: The “HELLO_ACK” event is the response to the “HELLO” message,
sent from the RSU to the vehicle. It confirms that the cluster has been formed
properly and that the respective RSU is the cluster head.
BYE: The BYE event signifies that a vehicle is leaving a cluster; the vehicle sends
a BYE message to the RSU while leaving the cluster.
BYE_ACK: The “BYE_ACK” event is triggered by the RSU in response to “BYE”,
when a graceful exit is performed between vehicle and RSU.
WAIT_T: This event corresponds to a timer of 20 s. This is the time to wait for a
response to “HELLO” and “BYE” messages.
DROP_T: This event corresponds to a timer of 20 s, used to wait and determine
whether the request has been dropped by the RSU.
START: This event is triggered whenever the OBU is powered on as the vehicle
starts.
STOP: Like “START”, this event is triggered whenever the OBU is powered off as
the vehicle stops.
The state diagrams present the state transition flow based on triggered events.
They help to understand the complete flow of vehicles and RSUs. In this approach,
the state diagrams for vehicles and RSUs are captured separately to explain their
roles properly.
First, we discuss the state diagram of the vehicle presented in Fig. 3, where the
vehicle initially starts in the IN state. The first event, “INIT_T”, occurs on expiry
of the respective timer. This timer runs for a fixed period of 30 s to stabilize the
vehicle and let it gather proper data for joining a cluster. This event changes the
state from IN to SE. A vehicle in the SE state starts communicating with nearby
RSUs to join the correct cluster. In the SE state, the vehicle sends “HELLO”
messages to RSUs and moves to the WR state. The “HELLO_ACK” event occurs
when the respective RSU responds to the “HELLO” message received from the
vehicle. A vehicle in the WR state that receives HELLO_ACK changes its state to
CM. If no response is received within 20 s, the “WAIT_T” event is triggered and
the vehicle changes state back to SE. If the movement pattern of a vehicle forces it
to leave the cluster, the vehicle has to inform its RSU. For leaving or changing a
cluster, the “BYE” and “BYE_ACK” events are defined in the proposed approach.
The BYE event signifies the notification from the vehicle to the RSU before leaving
a cluster, but the vehicle remains in the CM state until the BYE_ACK event is
triggered. BYE_ACK is triggered when the BYE message has been properly
answered by the RSU. After receiving BYE_ACK, the state changes to SE again.
The last event is “STOP”, initiated when the OBU finds that the vehicle has shut
down after completing the current trip; the state then moves from CM to IM. From
the IM state, the vehicle state changes to IN again on the “START” event, which is
triggered when the OBU finds that the vehicle has started.
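The vehicle transitions described above can be summarized as a small lookup table. The following is a minimal sketch: the table and function names are hypothetical, and leaving the state unchanged for a (state, event) pair that the text does not mention is an assumption.

```python
# (current state, event) -> next state, as read from the description of Fig. 3.
VEHICLE_TRANSITIONS = {
    ("IN", "INIT_T"): "SE",     # 30 s initial timer expired
    ("SE", "HELLO"): "WR",      # HELLO sent, waiting for the RSU
    ("WR", "HELLO_ACK"): "CM",  # RSU accepted the vehicle
    ("WR", "WAIT_T"): "SE",     # no response within 20 s
    ("CM", "BYE"): "CM",        # still a member until BYE_ACK arrives
    ("CM", "BYE_ACK"): "SE",    # left the cluster, look for a new one
    ("CM", "STOP"): "IM",       # vehicle shut down
    ("IM", "START"): "IN",      # vehicle started again
}


def next_vehicle_state(state, event):
    """Return the next state; unknown pairs leave the state unchanged (assumption)."""
    return VEHICLE_TRANSITIONS.get((state, event), state)


def run_vehicle(events, state="IN"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_vehicle_state(state, event)
    return state
```

For instance, `run_vehicle(["INIT_T", "HELLO", "HELLO_ACK"])` walks a newly started vehicle from IN through SE and WR to CM.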
Next, we discuss the state diagram of the RSU presented in Fig. 4. The RSU also
starts from the IN state, ready to receive messages from vehicles. First, the HELLO
event occurs when the RSU receives a “HELLO” message from a vehicle. The RSU
then changes its state from IN to SE and starts analyzing the data received from the
vehicle. If the RSU finds that the vehicle is part of the cluster, it responds to the
vehicle with “HELLO_ACK” and changes its state to CH. For subsequent HELLO
and BYE messages, the RSU remains the cluster head; however, it changes its state
from CH to SE while processing each further request. If the RSU does not find a
request suitable to respond to for any reason, the DROP_T event occurs and the
RSU moves to the CH state again. After responding to a request with HELLO_ACK
or BYE_ACK, the RSU changes its state back to CH to process other requests. When
BYE_ACK is sent to, or STOP is received from, the last vehicle in a cluster, the RSU
changes its state from CH to IN again.
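One possible reading of the RSU behavior described above is sketched below. The class and method names are hypothetical. Two points are assumptions rather than statements from the paper: the RSU's return to CH or IN is decided by whether any members remain, and a dropped first request returns an empty RSU to IN.

```python
class RsuStateMachine:
    """Tracks the RSU state (IN, SE, CH) while requests are processed."""

    def __init__(self):
        self.state = "IN"
        self.members = set()

    def _settle(self):
        # Assumption: CH while the cluster has members, IN once it is empty.
        self.state = "CH" if self.members else "IN"

    def on_hello(self, vehicle_id, in_cluster):
        """Process a HELLO; return "HELLO_ACK" or None if the request is dropped."""
        self.state = "SE"              # analysing the request
        if in_cluster:
            self.members.add(vehicle_id)
            reply = "HELLO_ACK"
        else:
            reply = None               # corresponds to the DROP_T event
        self._settle()
        return reply

    def on_bye(self, vehicle_id):
        """Process a BYE; the RSU goes back to IN when the last member leaves."""
        self.state = "SE"
        self.members.discard(vehicle_id)
        self._settle()
        return "BYE_ACK"
```

The intermediate SE state is visible only while a request is being handled; after each call the RSU is either in CH or back in IN.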
(1) Clustering procedures: Cluster formation starts as soon as the OBU is ready. An
OBU-equipped vehicle is expected to prepare and send traffic-related information
to the RSU. For a detailed understanding of the approach, the algorithms and
pseudocode are given below.
Algorithms and Pseudocodes: The step-by-step procedure of cluster formation is
captured with the pseudocode of the respective algorithms, which gives detailed
insight into the idea proposed here.
Vehicle side:
1. The vehicle prepares and sends a HELLO message to the RSU, with all relevant
information.
2. Start timer T for a time period of 20 s.
3. On receiving HELLO_ACK from the RSU, update the routing table information.
4. If timer T expires before HELLO_ACK is received,
5. then repeat steps 1–3.
RSU side:
1. Receive the HELLO request and check the communication range of the vehicle.
2. If the vehicle is within the communication range of the RSU,
3. then add the vehicle entry on the RSU.
4. Prepare HELLO_ACK and send it back to the vehicle.
5. Otherwise, drop the HELLO message.
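The two procedures above can be sketched together in Python. This is a minimal sketch under stated assumptions: positions are 2-D points, range is Euclidean distance, a vehicle's neighbors are the members within `vehicle_range` of it, and the 20 s timer is modeled as a `None` reply rather than a real delay. All function and parameter names, including `max_attempts`, are hypothetical.

```python
import math


def handle_hello(members, rsu_id, rsu_pos, rsu_range, hello, vehicle_range):
    """RSU side: admit the vehicle if it is in range, else drop the HELLO.

    members maps vehicle id -> last known position and is updated in place.
    Returns a HELLO_ACK as a dict, or None when the request is dropped.
    """
    vid, pos = hello["id"], hello["location"]
    if math.dist(pos, rsu_pos) > rsu_range:
        return None                      # step 5: drop the HELLO message
    # Neighbors are the existing members close enough to the new vehicle.
    neighbors = [other for other, other_pos in members.items()
                 if other != vid and math.dist(pos, other_pos) <= vehicle_range]
    members[vid] = pos                   # step 3: add the vehicle entry
    return {"rsu": rsu_id, "neighbors": neighbors}


def join_cluster(send_hello, max_attempts=3):
    """Vehicle side: resend HELLO until an ACK arrives or attempts run out.

    send_hello() stands for "send HELLO and wait up to 20 s"; it returns the
    HELLO_ACK, or None if the timer expired first.
    """
    for _ in range(max_attempts):
        ack = send_hello()
        if ack is not None:
            return ack
    return None
```

Bounding the retries with `max_attempts` is a simplification; the text instead stops resending when a response arrives or the vehicle crosses an intersection point.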
3.4 Routing
This section covers the procedure for sending a message from source to destination
by taking advantage of the proposed clustering approach. The complete network is
now divided into several clusters, and each cluster is controlled by an RSU.
Additionally, every vehicle has a list of neighbors to which it can send messages
directly. Therefore, two kinds of communication are supported: one in which the
source and destination nodes lie in the communication range of the same RSU, and
another in which the source and destination belong to the communication ranges of
different RSUs.
(1) Intra-cluster routing: Intra-cluster routing describes the mechanism used when
the source and destination vehicles belong to the same cluster and communication
happens within that cluster. In this case, the source node first checks its list of
directly reachable neighbors. If the destination node belongs to that list, the
source node forwards the message [Si, NONE, M, Di] to the destination vehicle
directly, where Si is the unique identifier of the source vehicle, “NONE” indicates
that the destination is directly reachable from the source, Di is the unique
identifier of the destination vehicle, and M is the information to transmit.
If the destination does not belong to the list of neighbors, the source node
forwards the message [Si, NONE, M, Di] to the RSU. The RSU then checks its
routing table, finds the next node toward the destination node Di, and forwards
the message to that node after adding the RSU ID to the list of hops [Si, Ri, M, Di].
In the same way, each subsequent node forwards the message after adding its own
identifier, until the message reaches the destination node. The list of nodes
traversed is saved by the destination node and is used to back-trace the route if
an immediate reply is needed, instead of preparing a new route. The destination
keeps that path record for a certain time interval, after which the route data is
removed.
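The hop-list handling and the temporary reverse-route record described above can be sketched as follows. This is an illustrative sketch only: the message is modeled as the list [Si, hops, M, Di], the RSU routing table as a map from destination to relay nodes, and the names `forward_via_rsu`, `RouteCache`, and the `ttl` parameter are hypothetical. The paper does not state the length of the retention interval.

```python
def forward_via_rsu(message, rsu_id, members):
    """Forward [Si, hops, M, Di] through the RSU, recording every hop.

    members maps a destination id to the relay nodes used to reach it.
    Returns the message as it arrives at Di, or None if Di is not a member.
    """
    src, hops, payload, dst = message
    if dst not in members:
        return None
    hops = [] if hops is None else list(hops)  # None plays the role of "NONE"
    hops.append(rsu_id)                        # the RSU adds its own ID first
    hops.extend(members[dst])                  # then each relay adds its ID
    return [src, hops, payload, dst]


class RouteCache:
    """Destination-side record of traversed hops, kept for a limited time."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._routes = {}

    def store(self, peer, hops, now):
        # Save the path reversed so that it can be used for a reply.
        self._routes[peer] = (list(reversed(hops)), now)

    def lookup(self, peer, now):
        entry = self._routes.get(peer)
        if entry is None:
            return None
        hops, stored_at = entry
        if now - stored_at > self.ttl:
            del self._routes[peer]             # route data is removed on expiry
            return None
        return hops
```

Passing `now` explicitly keeps the cache independent of the system clock, which makes the expiry behavior easy to check.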
(2) Inter-cluster routing: Inter-cluster routing specifies the way of communication
between vehicles that belong to different clusters. In this case, the RSU does
not find the destination node in its routing table after receiving a message from
the source node. The RSU then first checks the list of vehicles that were
associated with it earlier. If the destination does not belong to that list either,
the RSU broadcasts the message to all other directly reachable RSUs after adding
its address to the message [Si, Ri, M, Di]. Each receiving RSU then checks its
routing table and its list of earlier associated vehicles. If the destination was
associated with this RSU earlier, the new RSU of the destination is tracked.
Otherwise, the message is broadcast to RSUs again after adding the identity of
the current RSU. If the destination vehicle is not found after broadcasting up to
two levels, the RSUs drop the message to avoid further network overhead.
While a vehicle changes cluster, the old RSU keeps information about the new RSU
for up to 60 min, on the assumption that the vehicle will be associated with the new
RSU for up to an hour. This helps to increase the message delivery percentage with
reduced network overhead. Therefore, while processing messages, an RSU first
checks the vehicles in its cluster and then checks the list of vehicles it maintained
earlier. If the destination vehicle belongs to that list, the RSU forwards the message
to the new RSU. This increases the probability of reaching the destination near the
new RSU elected for that node.
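The inter-cluster search described above, including the old-member lookup and the two-level broadcast limit, can be sketched as a bounded breadth-first search over RSUs. The dictionary layout, the function name, and the `visited` set used to suppress duplicate deliveries are assumptions made for this sketch; timestamps are plain seconds.

```python
from collections import deque

OLD_MEMBER_TTL = 60 * 60  # old RSU remembers the new RSU for up to 60 min


def route_inter_cluster(rsus, start_rsu, dst, now, max_levels=2):
    """Return the list of RSU ids traversed to reach dst's RSU, or None.

    rsus maps rsu id -> {"members": set, "old_members": {vid: (new_rsu, left_at)},
    "links": [directly reachable rsu ids]}.
    """
    queue = deque([(start_rsu, [start_rsu], 0)])
    visited = {start_rsu}
    while queue:
        rsu_id, path, level = queue.popleft()
        rsu = rsus[rsu_id]
        if dst in rsu["members"]:
            return path                          # destination is in this cluster
        record = rsu["old_members"].get(dst)
        if record is not None:
            new_rsu, left_at = record
            if now - left_at <= OLD_MEMBER_TTL and new_rsu not in visited:
                visited.add(new_rsu)             # forward only to the tracked RSU
                queue.append((new_rsu, path + [new_rsu], level))
                continue
        if level < max_levels:                   # broadcast at most two levels
            for nxt in rsu["links"]:
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, path + [nxt], level + 1))
    return None                                  # message is dropped
```

A directed forward to a tracked RSU is not counted as a broadcast level here, which is one interpretation of the text rather than something it states.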
End-to-end routing algorithm and pseudocode: The complete end-to-end routing
algorithm, including inter-cluster and intra-cluster routing, is given below.
/*****Vehicle Side*****/
if message M is valid message
    set Vs, Vd, M to MESSAGE;
    if Vd exists in neighbors (Vs)
        extract path (Vd) from neighbors (Vs);
        set path (Vd) to MESSAGE;
    else
        set Vs in path (Vd);
        set R(Vs) as next hop of MESSAGE;
    endif
    dispatch MESSAGE to send;
endif
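A runnable Python rendering of the vehicle-side step is sketched below. It is not the authors' code: the function name and the dictionary layout of the message are hypothetical, a dictionary lookup stands in for the scan over the neighbor list, and treating an empty payload as an invalid message is an assumption.

```python
def dispatch_from_vehicle(vs, vd, data, neighbors, cluster_head):
    """Decide where a vehicle sends a message and build the MESSAGE record.

    neighbors maps neighbor id -> path (list of nodes to traverse to reach it).
    Returns (next_hop, message), or None when the message is not valid.
    """
    if not data:                       # validity check is an assumption
        return None
    if vd in neighbors:
        # Destination is a known neighbor: use the stored path.
        path = list(neighbors[vd])
        next_hop = path[0] if path else vd
    else:
        # Otherwise hand the message to the cluster head R(Vs).
        path = [vs]
        next_hop = cluster_head
    message = {"Vs": vs, "Vd": vd, "path": path, "data": data}
    return next_hop, message
```

For example, a vehicle 1 whose neighbor map contains vehicle 2 sends to 2 directly, while a message for an unknown vehicle 3 goes to the cluster head with vehicle 1 recorded as the first hop.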
4 Performance Analysis
the next RSU which in turn will forward the message to the destination. Thus,
the message overhead will be three messages (Source → RSU, RSU → next
RSU, next RSU → destination) and the message delivery time will be 3T.
• The destination was not previously associated with the RSU: The current RSU
broadcasts the message to all neighboring RSUs. These neighboring RSUs
check their respective member lists, and if any of them finds the destination in
its member list, it forwards the message to the destination. In this case, n + 2
messages are required (Source → RSU, RSU → all neighboring RSUs, next
RSU → destination) and the message delivery time is 3T. However, if no RSU
contains the destination as a member but one has information about the next
RSU, the message is forwarded to that next RSU, which in turn forwards the
message to the destination. In this case, n + 3 messages are required and the
message delivery time is 4T.
If none of these cases is satisfied, the message is not delivered. However, for the
applications considered, it is highly probable that the destination is near the
source. Hence, it is highly unlikely that the destination is not covered even by the
RSUs two hops away from the current RSU. Thus, the probability of message loss is
very low.
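The three inter-cluster cases quoted in this section can be tabulated with a small helper. This is only a sketch: the case labels and the function name are hypothetical, n is read here as the number of neighboring RSUs that receive the broadcast, and T as the time taken by a single transmission; neither reading is confirmed by the text shown.

```python
def delivery_cost(case, n=0, T=1.0):
    """Return (number of messages, delivery time) for an inter-cluster case.

    case is one of:
      "old_member"              - current RSU knows the destination's new RSU
      "neighbor_rsu_member"     - a neighboring RSU has the destination as member
      "neighbor_rsu_old_member" - a neighboring RSU knows the destination's new RSU
    """
    costs = {
        "old_member": (3, 3),
        "neighbor_rsu_member": (n + 2, 3),
        "neighbor_rsu_old_member": (n + 3, 4),
    }
    messages, transmissions = costs[case]
    return messages, transmissions * T
```

With four neighboring RSUs and T = 2.0, the second case gives six messages and a delivery time of 6.0, so the message count grows with n while the delivery time does not.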
5 Conclusion
References
1. Basagni, S., Conti, M., & Giordano, S. (2013). Mobile Ad hoc networking: Cutting edge
directions (2nd ed.). Wiley-IEEE Press.
2. Moridi, E., & Hamid, B. (2017). RMRPTS: A reliable multi-level routing protocol with Tabu
search in VANET. Telecommunication Systems, 65(1), 127–137.
3. Kasana, R., & Sushil, K. (2015). Multimetric Next Hop Vehicle Selection for geocasting in
vehicular Ad-Hoc networks. In International Conference on Computational Intelligence &
Communication Technology (CICT) (pp. 400–405). IEEE.
4. Dua, A., Kumar, N., & Bawa, S. (2014). A systematic review on routing protocols for vehicular
Ad Hoc networks. Vehicular Communications, 1(1), 33–52.
5. Ahmad, I., Noor, R. M., Ahmedy, I., Shah, S. A. A., Yaqoob, I., Ahmed, E., & Imran, M.
(2018). VANET–LTE based heterogeneous vehicular clustering for driving assistance and route
planning applications. Elsevier Computer Networks, 145, 128–140.
6. Fekair, M., Lakas, A., & Korichi, A. (2016). CBQoS-VANET: Cluster-based artificial bee
colony algorithm for QoS routing protocol in VANET. In International Conference on Selected
Topics in Mobile & Wireless Networking (MoWNeT), (pp. 1–8).
7. Singh, S., & Agrawal, S. (2014). VANET routing protocols: Issues and challenges. In
Proceedings of RAECS-2014 UIET Panjab University Chandigarh (pp. 205–210).
8. Sharma, Y. M., & Mukherjee, S. (2012). A contemporary proportional exploration of numerous
routing protocol in VANET. International Journal of Computer Applications (0975–8887).
9. Singh, S., & Agrawal, S. (2014). VANET routing protocols: Issues and challenges. In
Proceedings of 2014 RAECS UIET Panjab University Chandigarh (pp. 205–210). IEEE.
10. Altayeb, M., & Mahgoub, I. (2013). A survey of vehicular Ad hoc networks routing protocols.
International Journal of Innovation and Applied Studies, 3(3), 829–846.
11. Singh, S., & Agrawal, S. (2014). VANET routing protocols: Issues and challenges. In
Proceedings of IEEE Recent Advances in Engineering and Computational Sciences (RAECS)
(pp. 1–5).
12. Dhankhar, S., & Agrawal, S. (2014). VANETs: A survey on routing protocols and issues.
International Journal of Innovative Research in Science, Engineering and Technology, 3(6),
13427–13435.
13. Perkins, C., Belding-Royer, E., & Das, S. (1997). Ad hoc on-demand distance vector (AODV)
routing. In Proceedings of 2nd IEEE WMCSA (pp. 90–100).
14. Haas, Z. J. (1997). The Zone Routing Protocol.
15. Kumar, S., & Verma, A. K. (2015). Position based routing protocols in VANET: A survey.
Wireless Personal Communications, 83(4), 2747–2772.
16. Liu, J., Wan, J., Wang, Q., Deng, P., Zhou, K., & Qiao, Y. (2016). A survey on position-based
routing for vehicular ad hoc networks. Telecommunication Systems, 62(1), 15–30.
17. Basagni, S., Chlamtac, I., Syrotiuk, V., & Woodward, B. (1998). A distance routing effect
algorithm for mobility (DREAM). In Proceedings of ACM International Conference on Mobile
Computing and Networking, Dallas, TX, pp. 76–84.
18. Karp, B., & Kung, H. (2000). Greedy perimeter stateless routing for wireless networks.
In Proceedings of ACM International Conference on Mobile Computing and Networking
(MobiCom 2000), Boston, MA, pp. 243–254.
19. Tonguz, O. K., Wisitpongphan, N., & Bai, F. DV-CAST: A distributed vehicular broadcast
protocol for vehicular ad hoc networks. IEEE Wireless Communication, 1.
20. Maia, G., André, L. L., Aquino, D., Viana, A. C., Boukerche, A., Loureiro, A. A. F. (2010).
HyDi: A hybrid data dissemination protocol for highway scenarios in vehicular ad hoc
networks, DIVANet@MSWiM, pp. 47–56.
21. Luo, Y., Zhang, W., & Hu, Y. (2010). A new cluster based routing protocol for VANET. In
Proceedings of the 2nd International Conference on Networks Security Wireless
Communications and Trusted Computing, IEEE Xplore Press, Wuhan, Hubei, China, pp. 176–180.
22. Zhang, Z., Boukerche, A., & Pazzi, R. (2011). A novel multi-hop clustering scheme for
vehicular ad-hoc networks. In Proceedings of the 9th ACM International Symposium on Mobility
Management and Wireless Access (pp. 19–26).
23. Ren, M., Khoukhi, L., Labiod, H., Zhang, J., & Vèque, V. (2017). A mobility-based scheme
for dynamic clustering in vehicular ad-hoc networks (VANETs). Vehicular Communications,
9, 233–241.
24. Ucar, S., Ergen, S. C., & Ozkasap, O. (2015). Multihop-cluster-based IEEE 802.11p and LTE
hybrid architecture for VANET safety message dissemination. IEEE Transactions on Vehicular
Technology, 65(4), 1–1.
25. Aravindan, K., Suresh, C., & Dhas, G. (2018). Destination-aware context-based routing
protocol with hybrid soft computing cluster algorithm for VANET. Journal of Soft Computing,
1–9.
26. Khan, Z., Fan, P., Fang, S., & Abbas, F. (2018). An unsupervised cluster-based vanet-oriented
evolving graph (CVoEG) model and associated reliable routing scheme. IEEE Transactions on
Intelligent Transportation Systems.
27. Lin, D., Kang, J., Squicciarini, A., et al. (2017). MoZo: A moving zone based routing protocol
using pure V2V communication in VANETs. IEEE Transactions on Mobile Computing, 16(5),
1357–1370.