

2017 International Conference on Computer Communication and Informatics (ICCCI -2017), Jan. 05 – 07, 2017, Coimbatore, INDIA

Modeling and Dynamics of Infectious Disease: Big Data Analytics
Chinmayee Mohapatra, Manjusha Pandey, Siddharth Swarup Rautray
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India

Abstract: The rapid increase in population makes it difficult for traditional database management systems to handle and analyze population data. Big data technologies emerged to address this issue. The main big data components used here are Hadoop and Map-Reduce. Big data is more efficient than the traditional database system because of its basic features: Velocity, Veracity, Volume, Variety and Value. An infectious disease is an illness resulting from infection, caused by infectious agents including viruses, prions, bacteria and nematodes. Population dynamics is a branch of life science that studies the size and age composition of populations as dynamic systems, together with the biological and environmental processes that drive them. This paper considers dengue fever as the infectious disease and divides the population into three parts: high vulnerable, mid vulnerable and low vulnerable to dengue. It also suggests a preventive measure for each part, using the benefits of big data: forced preventive measures for high vulnerable areas, efficient preventive measures for mid vulnerable areas and delayed preventive measures for low vulnerable areas.

Keywords: Infectious Disease; Population Dynamics; Hadoop; Map-Reduce

I. INTRODUCTION

Infectious diseases are caused by pathogenic micro-organisms such as bacteria, viruses, parasites and fungi [1]. They spread directly or indirectly from one person to another. Many types of infectious disease are seen all over the world. Dengue fever is one of them, and it can be fatal. Dengue fever is a mosquito-borne infection that causes a severe flu-like illness; it is also called break-bone fever. Dengue fever has become a major international public health concern over the last few years. The dengue virus is primarily transmitted to humans by female Aedes aegypti and Aedes albopictus mosquitoes [1]. A single mosquito bite can cause the disease, after which the dengue virus multiplies in the body.

Population dynamics is the study of the increase or decrease in the size and structure of a population over time. It depends mainly on the rates of reproduction, death and migration [2]. As the world and its technology change, big data is used to store and process the data in order to obtain more efficient and accurate results. Population also varies with geographic region; for example, population is lower in forests and hills than in other areas. The cause of infectious disease varies with the population, depending on the environment.

II. OVERVIEW OF INFECTIOUS DISEASE (DENGUE)

Dengue fever is commonly found in urban parts of subtropical and tropical areas. The main cause of dengue fever is a mosquito bite. Its seriousness varies from mild to severe. The severe forms of dengue include dengue shock syndrome (DSS) and dengue hemorrhagic fever (DHF). DHF is caused by the dengue virus (DV) serotypes DV-1, DV-2, DV-3 and DV-4 [3]. DV is a positive-stranded encapsulated RNA virus. Its genome contains three structural protein genes, which encode the nucleocapsid or core (C) protein, a membrane-associated (M) protein and an envelope (E) glycoprotein, and seven non-structural (NS) protein genes. It is transmitted mainly by the Aedes aegypti mosquito and also by Ae. albopictus [4]. Fig. 1 shows the life cycle, from egg to adult, of the mosquito that carries the dengue virus.

Fig 1. Life cycle of Dengue Virus

The symptoms of infection usually begin 4-7 days after the mosquito bite. Dengue can be transmitted from an infected person to an uninfected person through a mosquito. DSS is the worst form of dengue and can result in death. Fig. 2 shows how the infection spreads from one person to another through the mosquito. The symptoms of dengue fever are [5]:

• High fever
• Intense headache
• Pain behind the eyes
• Aching muscles and joints
• Vomiting and feeling nauseous

978-1-4673-8855-9/17/$31.00 ©2017 IEEE


• Bleeding from your mouth/gums, nosebleeds
• Internal bleeding, which can result in black vomit and feces
• Small blood spots under your skin
• Heavy bleeding
• Blood vessels leaking fluid
• Death

Fig 2. Dengue infection life cycle

Fig 3. Proposed Mechanism

III. POPULATION DYNAMICS

Population dynamics is the branch of life science that deals with the change in population over time [6]. Population is also affected by the environment and by disease. If a harmful disease appears in a particular location, the population of that area is affected. Dengue is one such harmful disease. Dengue fever is mostly seen in forested areas where a large number of trees are present. The dengue mosquito is also common in areas where water is stored without proper care.

IV. LITERATURE SURVEY OF BIG DATA IN DISEASE DIAGNOSIS

The work on big data analysis for a heart disease detection system using the map reduce technique [7] discusses the causes of cardiovascular disease, describes the basic challenges and scope of big data in the cardiovascular domain, and identifies the big data capabilities that support health care. Similarly, another paper reviews predictive methodologies to diagnose chronic kidney disease [8]. In many papers big data is used to help diagnose disease, to provide preventive measures and to recommend drugs for patients. One suggested model allows one to generate a "small" number of possible future scenarios and to determine the corresponding trajectories of the infected population in different regions. This information is then used to find an optimal distribution of bed capacities across countries/regions according to each scenario [9].

Since the data sets come from multiple sources and multiple contexts, data pre-processing plays a vital role in formalizing and formatting the data into a single context. The main part of the pre-processing is combining the data that correspond to the same geographical dynamics. The symptoms are collected from various sources, such as hospital databases and social media databases. Fig. 3 shows the proposed framework. An indication of the disease can also be obtained from the internet through search engine activity: the places from which more searches about dengue fever originate are assumed to be the places where people are more affected. Data can be provided to the system using tools for cleansing, abstraction and logical storage, so that it can be mined successfully in the later steps. It becomes difficult for a database to manipulate such a large volume of data arriving at high velocity, so Hadoop is used for efficient manipulation of the data set. Hadoop has two main components, HDFS and Map-Reduce. This paper uses the Map-Reduce methodology to save data in a globally classified manner according to the population dynamics. Hadoop is a recent tool for analyzing data whose volume is rapidly increasing through gigabytes, terabytes, petabytes, zettabytes and so on.

The proposed implementation uses existing tools and techniques of big data analysis, namely Hadoop and Map-Reduce. As the infection increases day by day and more cases of dengue are reported, the data set also grows rapidly, and it becomes difficult for a traditional database to handle the data. Hadoop consists of two main components: storage

and processing. The storage part is handled by HDFS, the Hadoop Distributed File System. HDFS uses a large block size and moves the computation to where the data is stored. A file can be replicated a number of times. HDFS breaks larger files into smaller blocks and stores them across a cluster. The processing part is done by the Map-Reduce framework. Map-Reduce programming helps to process massive amounts of data in parallel in an efficient manner. The data set will be taken from html .

1. Pre-analysis Phase

In this phase the input is taken into consideration.

Input → Set of attributes (S)
HF → High fever
IH → Intense headache
PE → Pain behind the eyes
V → Vomiting
BM → Bleeding from mouth/gums

X → Set of instances taken for consideration
X = {x1, x2, x3, ..., xn}
π X ∈ (S) → HDFS

D = Set of distributed memory
D = {D1, D2, D3, ..., Dn}
x1 → D1
x2 → D2
x3 → D3
:
xn → Dn

The instances are stored in the memory blocks D1, D2, D3, ..., Dn. The same thing will be applied for the different

2. Map-Reduce based Analysis Phase

In this phase the population is taken into consideration.

Let P be the total population, divided into different parts p1, p2, ...
Using the reduce phase we calculate the total number of people in whom the symptoms are found.

Now for population p1:

$$\%\ \text{of people affected in } p_1 = \frac{\text{Number of people with symptoms in } p_1}{\text{Total number of people in } p_1} \times 100$$

Similarly, the % will be calculated for every population.

0-30% → Low vulnerable → Delayed Preventive Measures
30-60% → Mid vulnerable → Efficient Preventive Measures
≥ 60% → High vulnerable → Forced Preventive Measures

The % is calculated for every population in the same way, and according to the % each area is declared a low, mid or high vulnerable area. The type of preventive measure needed is decided accordingly. If the % lies in the range 0-30%, the area is low vulnerable and delayed preventive measures can be taken. If it lies between 30% and 60%, the area is mid vulnerable and efficient preventive measures are applicable. If it is 60% or more, the area is highly vulnerable and needs forced preventive measures.

Map-Reduce works in two phases, the mapping phase and the reducing phase. Map-Reduce programming is used here to count the people affected by dengue fever according to the population dynamics. In Map-Reduce the data set is divided into small chunks, and those chunks are processed by the mapper function in parallel. The output of the mappers is automatically shuffled and sorted by the framework. The output is sorted on the basis of the key element, i.e. dengue. The data set is then divided into three parts: highly vulnerable, mid vulnerable and low vulnerable to dengue fever. Preventive measures are provided for each part respectively: forced preventive measures for the area or population that is highly affected by, or highly vulnerable to, dengue fever; efficient preventive measures for the mid vulnerable area; and delayed preventive measures for the area/population that is low vulnerable to dengue.

Fig. 4 gives an example of Map-Reduce for a better understanding of its working process. One file, which belongs to a particular area, is taken into consideration. There are two possible results, affected and not-affected: affected means the person is affected by dengue, and not-affected means the person is not affected by dengue. According to the result it can be decided whether the area is highly vulnerable, mid vulnerable or low vulnerable.

Fig 4. Working procedure of Map-Reduce
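The counting and classification steps described in this section can be illustrated with a small sketch. It is written in plain Python and imitates the map, shuffle and reduce stages in a single process; it does not use the Hadoop API. The function names (`mapper`, `shuffle`, `reducer`, `classify`, `analyse`), the record layout and the sample data are hypothetical. Two further assumptions are made: a person counts as affected when at least one symptom from the attribute set S is recorded, and a value of exactly 30% is treated as mid vulnerable while 60% or more is treated as high vulnerable.

```python
from collections import defaultdict

# Symptom attribute codes from the pre-analysis phase (set S).
SYMPTOMS = {"HF", "IH", "PE", "V", "BM"}


def mapper(record):
    """Map phase: emit (population, (affected_flag, 1)) for one person.

    Assumption: a person is "affected" if at least one symptom in S is present.
    """
    population, symptoms = record
    affected = 1 if SYMPTOMS & set(symptoms) else 0
    return population, (affected, 1)


def shuffle(pairs):
    """Shuffle phase: group the mapper output by key (the population)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(values):
    """Reduce phase: sum affected and total counts, return the percentage."""
    affected = sum(a for a, _ in values)
    total = sum(t for _, t in values)
    return 100.0 * affected / total


def classify(percent):
    """Return (vulnerability level, preventive measure) for a percentage.

    Assumption: 30% counts as mid vulnerable and 60% as high vulnerable.
    """
    if percent >= 60:
        return "High vulnerable", "Forced preventive measures"
    if percent >= 30:
        return "Mid vulnerable", "Efficient preventive measures"
    return "Low vulnerable", "Delayed preventive measures"


def analyse(records):
    """Run map, shuffle and reduce in-process, then classify each population."""
    grouped = shuffle(mapper(r) for r in records)
    result = {}
    for population, values in grouped.items():
        percent = reducer(values)
        result[population] = (percent, *classify(percent))
    return result


# Hypothetical sample: (population id, list of recorded symptom codes).
records = [
    ("p1", ["HF", "IH"]), ("p1", []), ("p1", ["V"]), ("p1", ["PE"]),
    ("p2", []), ("p2", []), ("p2", ["BM"]), ("p2", []), ("p2", []),
    ("p3", ["HF"]), ("p3", []),
]

results = analyse(records)
for population, (percent, level, measure) in sorted(results.items()):
    print(f"{population}: {percent:.1f}% -> {level} -> {measure}")
```

With this sample, three of the four people in p1 show symptoms (75%), one of five in p2 (20%) and one of two in p3 (50%), so the three populations fall into the high, low and mid vulnerable classes respectively. On a real cluster the grouping step would be carried out by the framework between the map and reduce phases rather than by an in-memory dictionary.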

VI. CONCLUSION

The combination of big data and health care improves the process of disease diagnosis and also provides more efficient preventive measures for the disease. The proposed system can be used by a health welfare organization for a particular population that is suspected of having dengue fever on the basis of common signs and symptoms, to obtain a result that classifies the population as highly vulnerable, mid vulnerable or low vulnerable to dengue. With this application of big data it is possible to identify dengue fever within a short period of time, so that steps can be taken to save lives by preventing the spread of the dengue virus. The same mechanism can be used for the diagnosis of other diseases, which will help to protect human life and animal life as well.

REFERENCES

[1] D. Saikia and J. C. Dutta, "Early diagnosis of dengue disease using fuzzy inference system," 2016 International Conference on Microelectronics, Computing and Communications (MicroCom), Durgapur, 2016, pp. 1-6.

[2] A. Zvoleff and J. Ahumada, "Understanding the link between population dynamics and biodiversity conservation through remote sensing and gridded population data integration," 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, 2015, pp. 2560-2563.

[3] N. Gupta et al., "Dengue in India," The Indian Journal of Medical Research, vol. 136, no. 3, p. 373, 2012.

[4] D. Guhar, G. Mustafa, S. F. Rehmani and R. Bilal, "Immunodiagnostic of dengue fever: Primary and secondary infections," 2016 13th International Bhurban Conference on Applied Sciences and Technology (IBCAST), Islamabad, 2016, pp. 75-76.

[5] W. T. Sesulihatien, S. Sasaki and Y. Kiyoki, "Ecological context-dependent analysis and prediction using MMM: A case of dengue fever disease," Electronics Symposium (IES), 2015 International, Surabaya, 2015, pp. 227-232.

[6] J. Barreiro-Gomez, N. Quijano and C. Ocampo-Martinez, "Distributed resource management by using population dynamics: Wastewater treatment application," Automatic Control (CCAC), 2015 IEEE 2nd Colombian Conference on, Manizales, 2015, pp. 1-6.

[7] G. Vaishali and V. Kalaivani, "Big data analysis for heart disease detection system using map reduce technique," 2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE'16), Kovilpatti, India, 2016, pp. 1-6.

[8] A. Batra, U. Batra and V. Singh, "A review to predictive methodology to diagnose chronic kidney disease," 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 2016, pp. 2760-2763.

[9] R. J. Evans and M. A. Mammadov, "Predicting and controlling the dynamics of infectious diseases," CDC, 2015.

[10] K. Kraft, "Robots against infectious diseases," 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Christchurch, 2016, pp. 627-628.

[11] S. Alaliyat and H. Yndestad, "An Agent-Based Model to Simulate Infectious Disease Dynamics in an Aquaculture Facility," 2015 17th UKSim-AMSS International Conference on Modelling and Simulation (UKSim), Cambridge, 2015, pp. 131-136.

[12] K. Bissett, J. Cadena, M. Khan, C. J. Kuhlman, B. Lewis and P. A. Telionis, "An integrated agent-based approach for modeling disease spread in large populations to support health informatics," 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), Las Vegas, NV, 2016, pp. 629-632.

[13] C. Pasupathi and V. Kalavakonda, "Evidence Based health care system using Big Data for disease diagnosis," 2016 2nd International Conference on Advances in Electrical, Electronics, Information,
