
EFFICIENCY OF HEART DISEASE PREDICTION USING GENETIC

ALGORITHM

A Project Report Submitted to


JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY, ANANTAPUR

Submitted By
B. SRI KAVYA (182U1A0510)
L. REKHA (182U1A0549)
K. CHANDANA (182U1A0545)
B. PRANAVI (182U1A0509)

Under the Esteemed Guidance Of


V.GAYATRI, M.E., (Ph.D) Associate Professor,
Department of Computer Science & Engineering.
Project report submitted in partial fulfillment of the Requirements
for the award of the degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

GEETHANJALI INSTITUTE OF SCIENCE AND TECHNOLOGY


A Unit of USHODAYA EDUCATIONAL SOCIETY
(Approved by AICTE, New Delhi & Permanently Affiliated to JNTUA, Anantapuramu)
NAAC ‘A’ Grade Accredited Institution, An ISO 9001:2015 certified Institution
Recognized under Sec. 2(f) & 12(B) of UGC Act, 1956
3rd Mile, Bombay Highway, Gangavaram (V), Kovur(M), SPSR Nellore (Dt), Andhra Pradesh, India-
524137 Ph. No. 08622-212769, E-Mail: geethanjali@gist.edu.in, Website: www.gist.edu.in
(2018-2022)

Website : www.gist.edu.in Ph : 0862-212781


Email: csehod@gist.edu.in Fax: 08622-212778

GEETHANJALI INSTITUTE OF SCIENCE AND TECHNOLOGY


A Unit of USHODAYA EDUCATIONAL SOCIETY
(Approved by AICTE, New Delhi & Permanently Affiliated to JNTUA, Anantapuramu)
NAAC ‘A’ Grade Accredited Institution, An ISO 9001:2015 certified
Institution Recognized under Sec. 2(f) & 12(B) of UGC Act, 1956
3rd Mile, Bombay Highway, Gangavaram (V), Kovur(M), SPSR Nellore (Dt), Andhra Pradesh, India- 524137

BONAFIDE CERTIFICATE

This is to certify that the project work entitled “EFFICIENCY OF HEART DISEASE
PREDICTION USING GENETIC ALGORITHM” is a bonafide record done by B. SRI KAVYA
(182U1A0510) , L. REKHA (182U1A0549), K. CHANDANA (182U1A0545), B. PRANAVI
(182U1A0509), in the department of Computer Science & Engineering, Geethanjali Institute of
Science and Technology, Nellore and is submitted to Jawaharlal Nehru Technological University,
Anantapur in the partial fulfillment for the award of B. Tech degree in Computer Science &
Engineering. This work has been carried out under my supervision.

V. GAYATRI Dr. V. SIREESHA


Associate Professor Professor & HoD
Department of CSE Department of CSE
GIST, NELLORE GIST, NELLORE

Submitted for the Viva-Voce Examination held on

Internal Examiner External Examiner

(2018-2022)
ACKNOWLEDGEMENTS

The satisfaction that accompanies the successful completion of the project would be
incomplete without the people who made it possible. Their constant guidance and
encouragement crowned the efforts with success.

We express our deepest sense of gratitude to Mr. N. SUDHAKAR REDDY, B. Tech,
Secretary and Correspondent, Geethanjali Institute of Science and Technology, Nellore and
other members of Management for providing all the facilities needed for this work.

We owe our gratitude to Dr. G. SUBBA RAO, M.Tech, Ph.D, MIE, LMISTE,
MSAE, PRINCIPAL, Geethanjali Institute of Science and Technology, Nellore, for his
consistent help and valuable suggestions.

Our special thanks to Dr. V. SIREESHA, M.E., Ph.D., Professor & Head of the
Department, Department of Computer Science & Engineering, Geethanjali Institute of Science
and Technology, Nellore, for her timely suggestions and help during the progress of project
work in spite of her busy schedule.

It is indeed our proud privilege to express our deep sense of gratitude and indebtedness
to our guide, V. GAYATRI, M.E., (Ph.D), Associate Professor, Computer Science &
Engineering, Geethanjali Institute of Science and Technology, Nellore, for her keen interest,
critical, constructive and skillful guidance, and constant encouragement throughout the course
and for the successful completion of the project.

During the entire course of dissertation work, we received valuable academic inputs
as well as moral support from other departments and from the teaching and non-teaching staff at
GEETHANJALI INSTITUTE OF SCIENCE AND TECHNOLOGY, Nellore. We
were motivated by the support and moral encouragement given to us by our beloved parents.
Finally, we wish to express our sincere thanks to all those who helped us directly or indirectly
to complete the work.

PROJECT ASSOCIATES

B. SRI KAVYA (182U1A0510)
L. REKHA (182U1A0549)
K. CHANDANA (182U1A0545)
B. PRANAVI (182U1A0509)
ABSTRACT

LIST OF FIGURES

LIST OF TABLES

LIST OF GRAPHS

LIST OF ABBREVIATIONS

CONTENTS

1. INTRODUCTION
1.1 Introduction
1.2 Introduction to Machine Learning
1.3 Types of Machine Learning
1.4 Bio-Inspired Algorithms
1.5 Data Mining
1.6 Overview
1.7 Objective of Project
1.8 Problem Statement
2. LITERATURE SURVEY
3. METHODOLOGY
3.1 System Architecture
3.2 Block Diagram of the Proposed Method
3.3 Dataset
3.4 Preprocessing
3.4.1 Training the model
3.4.2 Testing the model
3.5 Algorithms
3.5.1 Genetic Algorithm
3.5.2 BAT Algorithm
3.5.3 BEE Algorithm
4. SYSTEM ANALYSIS
4.1 Existing System
4.2 Disadvantages of Existing System
4.3 Proposed System
4.4 Advantages of Proposed System
4.5 System Study
4.5.1 Feasibility Study
4.5.2 Economical Feasibility
4.5.3 Technical Feasibility
4.5.4 Social Feasibility
4.6 System Specifications
4.6.1 Hardware Requirements
4.6.2 Software Requirements
5. SYSTEM DESIGN
5.1 Introduction
5.2 UML Diagrams
5.2.1 Use Case Diagram
5.2.2 Class Diagram
5.2.3 Sequence Diagram
6. TESTING AND VALIDATION
6.1 Testing Strategies
6.2 System Test
6.3 Types of Tests
6.3.1 Unit Testing
6.3.2 Integration Testing
6.3.3 Functional Testing
6.3.4 System Test
6.3.5 White Box Testing
6.3.6 Black Box Testing
7. IMPLEMENTATION AND RESULTS
7.1 Introduction To Python
7.2 Features
7.3 Advantages
7.4 Python OOPs Concepts
7.5 Operators
7.6 Results and Discussions
7.7 Experimental Setup
7.8 Performance Analysis
7.8.1 Correlation Matrix
7.8.2 Data Analysis
7.8.3 Accuracy Results
8. CONCLUSION
9. FUTURE ENHANCEMENT
10. BIBLIOGRAPHY
ABSTRACT

The human heart is a vital organ. It keeps the body functioning by pumping blood throughout
the body, which also helps carry away waste products. Heart disease or heart failure is
therefore a serious risk to human life. Machine learning is now widely used around the world,
and it is becoming essential in the healthcare sector, where it can help doctors speed up
diagnosis.

The objective of this project is to build a machine learning model for heart disease prediction
based on the related attributes. The heart disease dataset consists of 14 parameters related
to heart disease. The proposed work uses bio-inspired optimization algorithms. Four such
feature-optimization algorithms are considered: the Genetic Algorithm, the Bat algorithm, the
Bee algorithm and the ACO algorithm. Of these, we implement three: the Genetic, Bat and Bee
algorithms. Using these algorithms, we analyse the data and predict whether the patient has
heart disease or no disease, and the stage of the disease. The Genetic Algorithm gives higher
accuracy in less time for the prediction. Such prediction can make diagnosis faster and more
efficient in the healthcare sector. The model can help medical practitioners in their clinics
as a decision support system.

LIST OF FIGURES

SNO FIGURE NO FIGURE NAME

1. 1.1 Introduction To Machine Learning
2. 1.2 Types of Machine Learning
3. 3.2 Block Diagram of the Proposed System
4. 3.4.1 Genetic Algorithm
5. 3.4.2 BAT Algorithm
6. 5.2.1 Use Case Diagram
7. 5.2.2 Class Diagram
8. 5.2.3 Sequence Diagram
9. 7.1 Introduction To Python
10. 7.4 Python OOPs Concepts
11. 7.8.1 Correlation Matrix

LIST OF TABLES

SNO TABLE NO TABLE NAME

1. 3.3.1 Details of Attributes
2. 3.3.2 Datasets
3. 3.3.3 Attributes Description
4. 7.8.3 Accuracy Table

LIST OF GRAPHS

SNO GRAPH NO GRAPH NAME

1. 7.8.2 Data Analysis
2. 7.8.3 Accuracy Graph

LIST OF ABBREVIATIONS

SNO ACRONYM EXPANSION

1. ML Machine Learning
2. UML Unified Modeling Language
3. GUI Graphical User Interface

Chapter 1

INTRODUCTION


1. INTRODUCTION

1.1 Introduction

Health problems are increasing day by day due to lifestyle and hereditary factors. Among
them, heart disease has been the leading cause of death worldwide over the past few
years. The human heart is a vital organ: it keeps the body functioning by pumping blood
throughout the body, which also helps carry away waste products. Heart disease or heart
failure is therefore a serious risk to human life. According to the WHO, about 17.9 million
people die every year as a result of heart diseases. There is no single solution to the rising
burden of heart disease, given the massive transformation in ethnic as well as economic
environments. Heart failure prognosis has historically been an extremely challenging task
because of the high costs involved: a wide range of modern imaging and clinical
methodologies for the diagnosis of heart disease are prohibitively expensive. Common
symptoms of cardiac disease include chest discomfort, dyspnoea, fatigue, edema,
palpitations, and syncope, as well as cough, hemoptysis, and cyanosis.

A computer-based clinical decision support system can reduce medical errors, improve
patient safety, reduce unnecessary variation in practice, and improve prognosis by
integrating the patient's medical history. Machine learning is now widely used around the
world, and it is becoming essential in the healthcare sector, where it can help doctors speed
up diagnosis.

The main objective of this study is to develop a prototype heart disease forecasting system
using bio-inspired algorithms. Extensive knowledge and accurate data in this field not only
help users by enabling effective treatment, but also help to reduce the cost of treatment and
improve the visualization and ease of explanation. Four bio-inspired feature-optimization
algorithms are considered: the Genetic Algorithm, the Bat algorithm, the Bee algorithm and
the ACO algorithm. Of these, we implement three: the Genetic, Bat and Bee algorithms.
Using these algorithms, we analyse the data and predict whether the patient has heart
disease or no disease, and the stage of the disease. The Genetic Algorithm gives higher
accuracy in less time for the prediction.


1.2 Introduction To Machine Learning

Machine Learning is the field of study that gives computers the capability to learn
without being explicitly programmed. ML is one of the most exciting technologies that one
would have ever come across. As is evident from the name, it gives the computer the quality
that makes it more similar to humans: the ability to learn. Machine learning is actively being
used today, perhaps in many more places than one would expect.

Fig 1.2 Introduction To Machine Learning

Machine Learning (ML) has proven to be one of the most game-changing technological
advancements of the past decade. In the increasingly competitive corporate world, ML is
enabling companies to fast-track digital transformation and move into an age of automation.
ML is required to stay relevant in some verticals, such as digital payments and fraud
detection in banking or product recommendations. The eventual adoption of machine learning
algorithms and its pervasiveness in enterprises is also well-documented, with different
companies adopting machine learning at scale across verticals. Today, every other app and
software all over the Internet uses machine learning in some form or the other.

Machine learning is set apart from other approaches within artificial intelligence by its
capability to evolve. Using various programming techniques, machine learning algorithms are
able to process large amounts of data and extract useful information. In this way, they can
improve upon their previous iterations by learning from the data they are provided.


1.3 Types of Machine Learning

Machine learning is a subset of AI that enables a machine to learn automatically
from data, improve its performance from past experience, and make predictions. It
comprises a set of algorithms that work on large amounts of data. Data is fed to these
algorithms to train them, and on the basis of that training they build a model and perform a
specific task. ML algorithms help to solve problems such as regression, classification,
forecasting, clustering, and association.

Based on the method of learning, machine learning is mainly divided into four types:

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning

Fig.1.3 Types of Machine learning


1.4 Bio-Inspired Algorithms

Bio-inspired computation is a rapidly evolving field with many real-world
applications including business, management, and engineering. In addition, various
techniques and algorithms have been developed over the last several decades within
bio-inspired computation.

The bio-inspired feature-optimization algorithms considered here are:

• Genetic Algorithm
• BAT Algorithm
• BEE Algorithm
• ACO Algorithm

Solving an optimization problem by mathematical or traditional modelling methods can be
difficult or even impossible in practice, due to, for example, the existence of non-convex
functions and discrete variables. However, by applying bio-inspired methodologies, it is
possible to solve complex optimization problems with high-quality solutions. As with
problem-dependent improvement techniques, generating optimal solutions poses several
challenges, such as the design and analysis of bio-inspired algorithms. Recent studies have
shown that the performance of some bio-inspired algorithms depends on the special
characteristics of the optimization problem, so finding an optimal solution may not be easy.
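
As a rough illustration of how a genetic algorithm can search for a good feature subset, the
sketch below evolves bit masks over 13 input attributes using selection, crossover and
mutation. The fitness function is a hypothetical stand-in: it rewards an arbitrary set of
"useful" feature indices and penalises extra ones. A real run would presumably score each
mask by the accuracy of a classifier trained on the selected columns. All names and
parameter values are assumptions made for illustration, not the project's implementation.

```python
import random

N_FEATURES = 13                 # input attributes, excluding the class label
USEFUL = {2, 7, 8, 9, 11, 12}   # hypothetical "informative" indices (assumption)


def fitness(mask):
    """Toy fitness: reward selecting useful features, penalise extra ones.

    A real run would train a classifier on the selected columns and
    return its validation accuracy instead.
    """
    hits = sum(1 for i in USEFUL if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras


def tournament(population, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """Single-point crossover of two bit masks."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(mask, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def genetic_feature_selection(generations=40, pop_size=30, seed=0):
    """Evolve feature masks and return the best mask seen."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
        best = max(population + [best], key=fitness)  # keep the best seen so far
    return best


if __name__ == "__main__":
    best_mask = genetic_feature_selection()
    print("selected feature indices:", [i for i, b in enumerate(best_mask) if b])
```

Keeping the best mask across generations is a simple form of elitism; without it a good
solution can be lost to mutation.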

Bio-inspired algorithms include genetic algorithms, evolutionary strategies, particle
swarm optimization, ant colony optimization, and differential evolution. They have been
applied in fields such as natural language processing, financial engineering, healthcare,
time series analysis, recommender systems, and cybersecurity.


1.5 Data Mining

Data mining is the process of extracting and discovering patterns in large data
sets involving methods at the intersection of machine learning, statistics, and database
systems.

The actual data mining task is the semi-automatic or automatic analysis of large quantities of
data to extract previously unknown, interesting patterns such as groups of data records,
unusual records, and dependencies. This usually involves using database techniques such as
spatial indices. These patterns can then be seen as a kind of summary of the input data, and
may be used in further analysis or, for example, in machine learning and predictive analytics.
For example, the data mining step might identify multiple groups in the data, which can then
be used to obtain more accurate prediction results by a decision support system. Neither the
data collection, data preparation, nor result interpretation and reporting is part of the data
mining step, although they do belong to the overall KDD process as additional steps.

The term "data mining" is a misnomer because the goal is the extraction of patterns and
knowledge from large amounts of data, not the extraction of data itself. It also is
a buzzword and is frequently applied to any form of large-scale data or information
processing as well as any application of computer decision support system, including
artificial intelligence and business intelligence.

1.6 OVERVIEW

Machine learning is now widely used around the world, and it is becoming
essential in the healthcare sector, where it can help doctors speed up diagnosis.

The objective of this project is to build a machine learning model for heart disease prediction
based on the related attributes. The heart disease dataset consists of 14 parameters related
to heart disease. The raw data is a collection of historical records with 14 main attributes:
age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, and class.
This dataset then goes through data preprocessing.
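
A minimal sketch of how records with these 14 attributes might be loaded and cleaned is
given below. It assumes a comma-separated file in which missing values are written as "?",
and it assumes that class 0 means no disease while higher values denote disease stages.
The sample rows are made-up values for illustration only, not real patient records, and the
function names are hypothetical.

```python
import csv
import io

# The 14 attributes named in the text; the last one is the class label.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "class"]

# Illustrative rows only (made-up values, not real patient records).
SAMPLE = """\
54,1,3,130,240,0,0,155,0,1.0,2,0,3,0
61,0,4,140,260,0,2,120,1,2.5,2,1,7,2
47,1,2,125,210,0,0,165,0,0.5,1,?,3,0
"""


def load_rows(text, missing="?"):
    """Parse CSV text into dicts of floats, dropping rows with missing values."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != len(COLUMNS) or missing in record:
            continue  # simplest preprocessing choice: skip incomplete rows
        rows.append({name: float(value) for name, value in zip(COLUMNS, record)})
    return rows


def binarise_label(row):
    """Collapse disease stages (class > 0) into 1 and keep 0 for no disease."""
    return 1 if row["class"] > 0 else 0


rows = load_rows(SAMPLE)
print(len(rows), [binarise_label(r) for r in rows])  # 2 [0, 1]
```

Dropping incomplete rows is only one option; imputing a typical value is a common
alternative when the dataset is small.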

The proposed work uses bio-inspired optimization algorithms. Four such feature-optimization
algorithms are considered: the Genetic Algorithm, the Bat algorithm, the Bee algorithm and
the ACO algorithm. Of these, we implement three: the Genetic, Bat and Bee algorithms.
Using these algorithms, we analyse the data and predict whether the patient has heart
disease or no disease, and the stage of the disease. The result is displayed in the GUI. The
Genetic Algorithm gives higher accuracy in less time for the prediction. Such prediction can
make diagnosis faster and more efficient in the healthcare sector.

1.7 OBJECTIVE OF PROJECT

The objective of this project is to evaluate an imbalanced dataset with the help of
bio-inspired feature-optimization algorithms, namely the Genetic Algorithm, the Bat
algorithm, the Bee algorithm and the ACO algorithm, and to determine which of them is best
suited to heart disease prediction.

1.8 PROBLEM STATEMENT

The major challenge in heart disease is its detection. Instruments that can predict heart
disease are available, but they are either expensive or not efficient at estimating the
chance of heart disease in a person. Early detection of cardiac diseases can decrease the
mortality rate and overall complications. However, it is not possible to monitor every patient
accurately every day, and round-the-clock consultation with a doctor is not available, since
it requires considerable knowledge, time and expertise. Since a good amount of data is
available today, we can use various machine learning algorithms to analyse the data for
hidden patterns. These hidden patterns can be used for health diagnosis from medical data.


Chapter 2

LITERATURE SURVEY


2. LITERATURE SURVEY

2.1 Heart Disease Prediction using Machine Learning

According to a recent WHO study, heart-related diseases are increasing, and 17.9 million
people die every year because of them. With a growing population, it becomes more difficult
to diagnose and start treatment at an early stage. Recent advances in technology, however,
mean that machine learning techniques have accelerated research in the health sector. Thus,
the objective of this project is to build an ML model for heart disease prediction based on the
related parameters. We have used the benchmark UCI heart disease dataset for
this research work, which consists of 14 different parameters related to heart disease.
Machine learning algorithms such as Random Forest, Support Vector Machine (SVM),
Naive Bayes and Decision Tree have been used to develop the model. In our research
we have also tried to find the correlations between the different attributes available in the
dataset with the help of standard machine learning methods, and then to use them efficiently
in predicting the chances of heart disease. The results show that, compared to other ML
techniques, Random Forest gives more accuracy in less time for the prediction. This model
can be helpful to medical practitioners at their clinics as a decision support system.
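
The correlation between attributes mentioned above is usually measured with the Pearson
coefficient. The sketch below shows one way to compute it and to build a small correlation
matrix. The function names are hypothetical and the column values are toy numbers chosen
only to exercise the code, not data from the UCI dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """columns: dict mapping attribute name -> list of numeric values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy values for three attributes (illustrative only).
toy = {
    "age": [40, 50, 60, 70],
    "thalach": [180, 165, 150, 135],
    "chol": [200, 240, 210, 250],
}
matrix = correlation_matrix(toy)
print(round(matrix["age"]["thalach"], 3))
```

Values near +1 or -1 indicate a strong linear relationship between two attributes, while
values near 0 indicate little linear relationship.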

2.2 Heart Disease Diagnosis and Prediction Using Machine Learning and
Data Mining Techniques

Heart disease has been the main cause of death in the world over the last decade. In the
United States alone, almost one person dies of heart disease every minute. Researchers
have been using several data mining techniques to help health care professionals in the
diagnosis of heart disease. Using data mining techniques can also reduce the number of
tests that are required. In order to reduce the number of deaths from heart disease, there
has to be a quick and efficient detection technique. Decision Tree is one of the effective data
mining methods used. This research compares different Decision Tree classification
algorithms, seeking better performance in heart disease diagnosis using WEKA. The
algorithms tested are the J48 algorithm, the Logistic Model Tree algorithm and the Random
Forest algorithm. The existing dataset of heart disease patients from the Cleveland database
of the UCI repository is used to test and justify the performance of the decision tree
algorithms. This dataset consists of 303 instances and 76 attributes. Subsequently, the
classification algorithm that has optimal potential will be suggested for use on sizeable data.
The goal of this study is to extract hidden patterns that are relevant to heart disease by
applying data mining techniques, and to predict the presence of heart disease in patients,
where this presence is valued from no presence to likely presence.

2.3 International application of a new probability algorithm for the diagnosis
of coronary artery disease

A new discriminant function model for estimating probabilities of angiographic
coronary disease was tested for reliability and clinical utility in 3 patient test groups. This
model, derived from the clinical and noninvasive test results of 303 patients undergoing
angiography at the Cleveland Clinic in Cleveland, Ohio, was applied to a group of 425
patients undergoing angiography at the Hungarian Institute of Cardiology in Budapest,
Hungary (disease prevalence 38%); 200 patients undergoing angiography at the Veterans
Administration Medical Center in Long Beach, California (disease prevalence 75%); and 143
such patients from the University Hospitals in Zurich and Basel, Switzerland (disease
prevalence 84%). The probabilities that resulted from the application of the Cleveland
algorithm were compared with those derived by applying a Bayesian algorithm derived from
published medical studies called CADENZA to the same 3 patient test groups. Both
algorithms overpredicted the probability of disease at the Hungarian and American centers.
Overprediction was more pronounced with the use of CADENZA (average overestimation 16
vs 10% and 11 vs 5%, p less than 0.001). In the Swiss group, the discriminant function
underestimated (by 7%) and CADENZA slightly overestimated (by 2%) disease probability.
Clinical utility, assessed as the percentage of patients correctly classified, was modestly
superior for the new discriminant function as compared with CADENZA in the Hungarian
group and similar in the American and Swiss groups. It was concluded that coronary disease
probabilities derived from discriminant functions are reliable and clinically useful when
applied to patients with chest pain syndromes and intermediate disease prevalence.

2.4 Decision support system for heart disease based on support vector
machine and artificial neural network

The medical diagnosis process can be interpreted as a decision making process,
during which the physician induces the diagnosis of a new and unknown case from an
available set of clinical data and from his/her clinical experience. This process can be
computerized in order to present medical diagnostic procedures in a rational, objective,
accurate and fast way. This paper presents a decision support system for heart disease
classification based on support vector machine (SVM) and Artificial Neural Network (ANN).
A multilayer perceptron neural network (MLPNN) with three layers is employed to develop a
decision support system for the diagnosis of heart disease. The multilayer perceptron neural
network is trained by the back-propagation algorithm, which is a computationally efficient
method. The results obtained show that an MLPNN with back-propagation can be used more
successfully for diagnosing heart disease than a support vector machine.

2.5 Heart diseases diagnosis using neural networks arbitration

The death rate from heart diseases increases every year. One of the major
factors behind this increase is misdiagnosis on the part of medical doctors or ignorance on
the part of the patient. Heart diseases can be described as any kind of disorder that affects
the heart. In this research work, the causes of heart diseases, their complications and their
remedies have been considered. An intelligent system which can diagnose heart diseases
has been implemented. This system is intended to prevent misdiagnosis, which is the major
error that medical doctors may make. The Statlog heart disease dataset has been used to
carry out this experiment. The dataset comprises attributes of patients diagnosed for heart
diseases. The diagnosis was used to confirm whether heart disease is present or absent in
the patient. The dataset was obtained from the UCI Machine Learning Repository. It was
divided into training, validation and testing sets to be fed into the network. The intelligent
system was modeled on a feedforward multilayer perceptron and a support vector machine.
The recognition rates obtained from these models were then compared to ascertain the best
model for the intelligent system, given its significance in the medical field. The results
obtained are 85% and 87.5% for the feedforward multilayer perceptron and the support
vector machine, respectively. From this experiment we found that the support vector
machine is the best model for the diagnosis of heart disease.

2.6 Design And Implementation Heart Disease Prediction Using Naives
Bayesian

Data mining is a fast-developing technique that revolves around exploring and extracting
significant information from massive collections of data, which can then be used to
examine and draw out patterns for making business-related decisions. In the medical
domain, data mining can help discover and extract valuable patterns and information that
prove beneficial in performing clinical diagnosis. The research focuses on heart disease
diagnosis by considering previous data and information. To achieve this, SHDP (Smart
Heart Disease Prediction) is built using Naive Bayesian classification in order to predict risk
factors concerning heart disease. The speedy advancement of technology has led to a
remarkable rise in mobile health technology, including web applications. The required data
is assembled in a standardized form. For predicting the chances of heart disease in a
patient, attributes such as age, BP, cholesterol, sex and blood sugar are fetched from the
medical profiles. The collected attributes act as input to the Naive Bayesian classifier for
predicting heart disease. The dataset is split into two parts: 80% is used for training and the
remaining 20% for testing. The proposed approach includes the following stages: dataset
collection, user registration and login (application based), classification via Naive Bayesian,
prediction, and secure data transfer using AES (Advanced Encryption Standard), after which
the result is produced. The research elaborates and presents multiple knowledge abstraction
techniques that make use of data mining methods adopted for heart disease prediction. The
output reveals that the established diagnostic system effectively assists in predicting risk
factors concerning heart diseases.

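The 80%/20% split described in this study can be sketched as follows. This is a minimal
illustration under stated assumptions: the function name, the fixed seed and the 100-row
example are hypothetical, and the cited work may have split its data differently (for
example, with stratification by class).

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
print(len(train), len(test))  # 80 20
```

Shuffling before splitting avoids a biased test set when the records are stored in a
meaningful order.
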
2.7 Design of a hybrid system for the diabetes and heart diseases

Data can be classified according to their properties. Classification is implemented by
developing a model from existing records using sample data. One of the aims of
classification is to increase the reliability of the results obtained from the data. Fuzzy and
crisp values are used together in medical data. Accordingly, a new method is presented
in this study for classifying data from a medical database. A hybrid neural network
that includes an artificial neural network (ANN) and a fuzzy neural network (FNN) was
developed. Two real-world datasets were investigated to determine the applicability of
the proposed method. The data were obtained from the University of California at Irvine
(UCI) machine learning repository. The datasets are Pima Indians diabetes and Cleveland
heart disease. To evaluate the performance of the proposed method, accuracy,
sensitivity and specificity, performance measures commonly used in medical
classification studies, were used. The classification accuracies on these datasets were
obtained by k-fold cross-validation. The proposed method achieved accuracy values of
84.24% and 86.8% for the Pima Indians diabetes dataset and the Cleveland heart disease
dataset, respectively. These results are among the best compared with results obtained in
related previous studies and reported on the UCI web sites.
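
The three performance measures and the k-fold procedure named above can be sketched as
follows. The labels are toy values chosen for illustration, 1 is assumed to mean "disease
present", and the function names are hypothetical; the cited study does not describe its
implementation.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 means disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from true and predicted labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
    }


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Toy labels (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(classification_metrics(y_true, y_pred))
```

In k-fold cross-validation each record is used for testing exactly once, and the reported
accuracy is typically the average over the k folds.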

2.8 Intelligent heart disease prediction system using data mining techniques

The healthcare industry collects huge amounts of healthcare data which,
unfortunately, are not "mined" to discover hidden information for effective decision making.
Discovery of hidden patterns and relationships often goes unexploited. Advanced data mining
techniques can help remedy this situation. This research has developed a prototype Intelligent
Heart Disease Prediction System (IHDPS) using data mining techniques, namely, Decision
Trees, Naive Bayes and Neural Network. Results show that each technique has its unique
strength in realizing the objectives of the defined mining goals. IHDPS can answer complex
"what if" queries which traditional decision support systems cannot. Using medical profiles
such as age, sex, blood pressure and blood sugar, it can predict the likelihood of patients
getting heart disease. It enables significant knowledge, e.g. patterns and relationships
between medical factors related to heart disease, to be established. IHDPS is Web-based,
user-friendly, scalable, reliable and expandable. It is implemented on the .NET platform.

2.9 Classification of heart disease using artificial neural network and feature subset selection

Nowadays, artificial neural networks (ANNs) are widely used as a tool for
solving many decision modelling problems. A multilayer perceptron is a feed-forward ANN
model that is used extensively to solve a number of different problems. An ANN is a
simulation of the human brain. It is a supervised learning technique used for non-linear
classification. Coronary heart disease is a major epidemic in India, and Andhra Pradesh is at risk
of coronary heart disease. Clinical diagnosis relies mostly on the doctor’s expertise, and
patients are asked to take a number of diagnostic tests. However, not all of the tests contribute towards
effective diagnosis of the disease. Feature subset selection is a pre-processing step used to reduce
dimensionality and remove irrelevant data. In this paper we introduce a classification approach
which uses ANN and feature subset selection for the classification of heart disease. PCA is
used for pre-processing and to reduce the number of attributes, which indirectly reduces the number of
diagnostic tests that a patient needs to take. We applied our approach on the
Andhra Pradesh heart disease database. Our experimental results show that accuracy improved over traditional classification
techniques. This system is feasible, faster and more accurate for the diagnosis of heart disease.

2.10 Heart disease prediction using data mining with map-reduce algorithm

The World Health Organization (WHO) estimated that cardiovascular diseases (CVD)
are the major cause of mortality globally, as well as in India. They are caused by disorders of
the heart and blood vessels, and include coronary heart disease (heart attacks). Data mining
plays a major role in the construction of an intelligent prediction model for healthcare
systems to detect heart disease (HD) using patient datasets, which supports doctors in
reducing the mortality rate due to heart disease. Several studies have built
models using data mining, individually or in combination with computational
techniques involving Decision Tree (DT), Naïve Bayes (NB) along with meta-heuristic
approaches, trained Neural Network (NN), machine intelligence or AI, and
learning algorithms such as KNN and Support Vector Machine (SVM). In the proposed system,
a large set of medical instances is taken as input. From this medical dataset, the aim is to
extract the needed information from the records of heart patients using the Map-reduce technique.
The performance of the proposed Map-reduce algorithm’s implementation in parallel and
distributed systems was evaluated using the Cleveland dataset and compared with that of the
conventional ANN method. The experimental results show that the proposed method could achieve an
average prediction accuracy of 98%, which is greater than that of the conventional recurrent fuzzy
neural network. In addition, this Map-reduce technique also performed better than
previous methods that reported prediction accuracies in the range of 95–98%. These findings
suggest that the Map-reduce technique could be used to accurately predict HD risks in the
clinic. In 2019, Blue Eyes Intelligence Engineering and Sciences Publication. All rights
reserved.


Chapter 3

METHODOLOGY


3. METHODOLOGY

3.1 SYSTEM ARCHITECTURE

The architecture of the proposed system is displayed in the figure below. The major
components of the architecture are as follows: patient database, preprocessing, training the
model, testing the model, algorithms, and prediction of heart disease. Many disease
prediction systems do not use some of the risk factors such as age, sex, blood pressure,
cholesterol, chest pain, etc. Without these vital risk factors, the result will not be very
accurate. In this proposed work, 14 important attributes are used to predict heart disease
accurately. This makes data preparation the most important step in this process. Along
with the data, another important step is selecting the most suitable algorithms, here the
Genetic, Bat and Bee algorithms.

3.2 BLOCK DIAGRAM OF THE PROPOSED METHOD


The proposed technique flows through the following five phases.

1. Data Set: The dataset used was obtained from kaggle.com. The attributes in the
dataset are: age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang,
oldpeak, slope, ca, thal, class.

2. Data Preprocessing: The collected dataset is divided into two parts: a training part and
a testing part.

3. Bio-Inspired Algorithms: The algorithms used here to predict the required output are the
Genetic, Bat and Bee algorithms.

4. Evaluation: The performance of each algorithm is evaluated by comparison.

5. Prediction: This is the final step of the system; here the output is displayed in the GUI.


[Block diagram: Data Sets -> Data Retrieval -> Data Preprocessing -> Training / Testing -> Bio-Inspired Algorithms (Genetic Algorithm, Bat Algorithm, Bee Algorithm) -> Model Evaluation -> Heart Disease Prediction]

Fig.3.2 Block Diagram of the Proposed System


3.3 DATASET
The raw data for heart disease prediction is a collection of historical records that
includes a variety of important attributes: age, sex, cp, trestbps, chol, fbs, restecg,
thalach, exang, oldpeak, slope, ca, thal and class. These are the 14 main attributes taken in the
dataset. All attributes are numeric-valued. The data was collected at the following
location: Cleveland Clinic Foundation.

3.3.1 Details of Attributes


The 14 main attributes taken in the dataset are age, sex, cp, trestbps, chol, fbs, restecg,
thalach, exang, oldpeak, slope, ca, thal and class/num. The final predicted attribute is
specified in ‘class/num’. The dataset contains 303 records. Using this dataset,
the preprocessing of data, training and testing are done.

3.3.2 DataSets


The dataset summary shows the count, mean, standard deviation (std), minimum and maximum values, etc.
of each attribute. It is helpful for understanding the distribution of particular attributes.
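As a minimal sketch of how such a summary can be produced, the snippet below computes the count, mean, standard deviation, minimum and maximum of one attribute using only the Python standard library. The `summarize` helper is hypothetical and the `age` values are illustrative numbers, not rows quoted from the project dataset.

```python
import statistics

def summarize(values):
    """Return count, mean, sample standard deviation, min and max."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }

# Illustrative values only (assumption, not taken from the report).
age = [63, 37, 41, 56, 57]
summary = summarize(age)
print(summary)
```

Applying the same helper to every column gives one summary row per attribute, which is the kind of table this subsection describes.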

3.3.3 Attributes Description


3.4 Preprocessing
Data pre-processing is an important step in the creation of a machine learning model.
Initially, data may not be clean or in the format required by the model, which can cause
misleading outcomes. In pre-processing, we transform the data into the required format.
It is used to deal with noise, duplicates, and missing values in the dataset. Data
pre-processing includes activities such as importing datasets, splitting datasets, attribute scaling, etc.
Pre-processing is required for improving the accuracy of the model: to achieve
better results from the applied machine learning model, the data must be in a proper
format. Kaggle provides a preprocessed dataset; how such data is preprocessed is
discussed below. The main steps involved in data preprocessing are feature sampling and
encoding. Since raw data are incomplete, pre-processing should be done to make them complete.

Steps involved in Data Preprocessing:

• Data Quality Assessment
• Feature Aggregation
• Feature Sampling
• Feature Encoding

Data Quality Assessment

Because data is collected from multiple sources which are in different formats, more
than half of our time is spent dealing with data quality issues when working on a machine
learning problem. It is simply unrealistic to expect that the data will be perfect. There may be
problems due to human error, limitations of measuring devices, or flaws in the data collection
process. Some of the techniques for dealing with them are covered below.


a. Missing Values

It is common to have missing values in a dataset. They may have occurred at some stage
in data collection or perhaps due to some data validation rule; regardless, missing values
need to be taken into consideration.

Eliminate rows with missing data

This simple and sometimes effective strategy fails if many objects have missing values. If
a feature has mostly missing values, then that feature itself can also be eliminated.

Estimate Missing Values

If only a reasonable percentage of values are missing, then we can also run simple
interpolation methods to fill in those values. However, the most common method we have
used to deal with missing values is by filling them in with the mean, median or mode value of
the respective feature.
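The mean, median and mode filling described above can be sketched as follows. This is an illustrative example under stated assumptions: missing entries are represented by `None`, `impute` is a hypothetical helper, and the `chol` values are made up for the demonstration.

```python
import statistics

def impute(column, strategy="mean"):
    """Fill None entries with the mean, median or mode of the observed values."""
    observed = [v for v in column if v is not None]
    fill = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }[strategy](observed)
    return [fill if v is None else v for v in column]

# Illustrative column with two missing entries (assumption).
chol = [233, None, 250, 204, None, 250]
filled_mean = impute(chol, "mean")
filled_median = impute(chol, "median")
filled_mode = impute(chol, "mode")
print(filled_mean, filled_median, filled_mode, sep="\n")
```

The median is usually the safer choice when a feature contains outliers, while the mode suits categorical codes.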

b. Duplicate Values

A dataset may include data objects which are duplicates of one another. It may
happen when the same person submits a form more than once. The term deduplication is
often used to refer to the process of dealing with duplicates. In most cases, the duplicates are
removed so as to not give that particular data object an advantage or bias, when running
machine learning algorithms.
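A minimal deduplication sketch is shown below. It assumes that a duplicate means an exact match on every field and that the first occurrence should be kept; `deduplicate` and the sample records are hypothetical.

```python
def deduplicate(rows):
    """Drop exact duplicate rows, keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Illustrative (age, sex, trestbps) records; the third repeats the first.
records = [(63, 1, 145), (37, 1, 130), (63, 1, 145), (41, 0, 130)]
unique_records = deduplicate(records)
print(unique_records)
```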

Feature Aggregation

Feature Aggregations are performed so as to take the aggregated values in order to put
the data in a better perspective. This results in reduction of memory consumption and
processing time. Aggregations provide us with a high-level view of the data as the behavior
of groups or aggregates is more stable than individual data objects.
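As an illustration of aggregation, the sketch below replaces individual records with one mean value per group. The grouping key (`sex`), the aggregated field (`chol`) and the helper name `aggregate_mean` are assumptions chosen for the example, not the report's actual procedure.

```python
from collections import defaultdict
from statistics import mean

def aggregate_mean(rows, group_key, value_key):
    """Group rows by one field and return the mean of another field per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {key: mean(values) for key, values in groups.items()}

# Illustrative records (assumption).
rows = [
    {"sex": 1, "chol": 230},
    {"sex": 0, "chol": 200},
    {"sex": 1, "chol": 250},
    {"sex": 0, "chol": 220},
]
mean_chol_by_sex = aggregate_mean(rows, "sex", "chol")
print(mean_chol_by_sex)
```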

Feature Sampling

Sampling is a very common method for selecting a subset of the dataset that we are
analyzing. In most cases, working with the complete dataset can turn out to be too expensive
considering the memory and time constraints. Using a sampling algorithm can help us reduce
the size of the dataset to a point where we can use a better, but more expensive, machine
learning algorithm. The key principle here is that the sampling should be done in such a
manner that the sample generated should have approximately the same properties as the
original dataset, meaning that the sample is representative. This involves choosing the correct
sample size and sampling strategy.
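One sampling strategy that keeps the sample representative with respect to the class label is stratified sampling. The sketch below is an assumption-laden illustration: `stratified_sample` is a hypothetical helper, the 20% fraction is arbitrary, and the rows are synthetic `(id, label)` pairs.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, fraction, seed=0):
    """Sample the same fraction from every class so class proportions are kept."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for members in by_class.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Synthetic data: 60 rows of class 1 and 40 rows of class 0.
rows = [(i, 1) for i in range(60)] + [(i, 0) for i in range(60, 100)]
subset = stratified_sample(rows, label_of=lambda r: r[1], fraction=0.2)
positives = sum(1 for r in subset if r[1] == 1)
print(len(subset), positives)
```

The 60/40 class ratio of the full data is preserved in the sample, which is the "same properties" requirement stated above.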

Feature Encoding

The whole purpose of data preprocessing is to encode the data in order to bring it to
a state that the machine can understand. Feature encoding means performing
transformations on the data such that it can be easily accepted as input by machine learning
algorithms while still retaining its original meaning. Some general norms or rules
are followed when performing feature encoding. For categorical variables of the nominal type,
any one-to-one mapping that retains the meaning can be used, for instance a permutation
of values, as in one-hot encoding.
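A minimal one-hot encoding sketch follows. It assumes the nominal attribute is stored as integer codes (the `cp` values here are illustrative) and uses a hypothetical `one_hot` helper; each distinct code becomes its own 0/1 column.

```python
def one_hot(values):
    """Map each nominal value to a 0/1 vector with a single 1."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

# Illustrative chest-pain type codes (assumption).
cp = [0, 2, 1, 2, 3]
encoded, categories = one_hot(cp)
for code, vector in zip(cp, encoded):
    print(code, vector)
```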

3.4.1 Training the model

In the training part, the algorithms mentioned above are implemented; training helps in
finding a better set of outputs. Training is done on the basis of the dataset input to the system.
The performance of the system can improve with how many times the model is
trained, the number of iterations, etc. The whole dataset, which consists of 14
attributes and 303 rows, is used to train the model. Training can also be
implemented by splitting the data into partitions of the required size. In the user-interactive
GUI, when the user selects the train network option after entering the data, the
.csv file of the heart disease dataset is read at the backend and normalization is carried out
so that classifying the data into classes becomes easier. To generate a network, the train()
function is called with the inputs. This network is then stored.
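The normalization and data-splitting steps mentioned above can be sketched as follows. The report does not state which normalization or split ratio was used, so min-max scaling and an 80/20 split are assumptions; `min_max_scale` and `train_test_split` are hypothetical helpers, not the project's `train()` function.

```python
import random

def min_max_scale(column):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Illustrative resting blood pressure values (assumption).
trestbps = [145, 130, 120, 140, 170]
scaled = min_max_scale(trestbps)

# Row indices standing in for the 303 records of the dataset.
train, test = train_test_split(list(range(303)))
print(scaled, len(train), len(test))
```

Fixing the seed makes the split reproducible, which helps when the three algorithms are later compared on the same partitions.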

3.4.2 Testing the model:

Testing is conducted to determine whether the trained model
provides the desired output. When data is entered for testing, the .csv file is retrieved
for cross-checking and comparison, and the results for the newly entered data are generated.
Based on how the model was trained on the dataset, the user inputs values of
their choice for the specified attributes, and the result indicates whether there is a
risk of heart disease or not.


3.4 ALGORITHMS
3.4.1 Genetic Algorithm

The genetic algorithm is a method for solving both constrained and unconstrained
optimization problems that is based on natural selection, the process that drives biological
evolution. The genetic algorithm repeatedly modifies a population of individual solutions.
At each step, the genetic algorithm selects individuals from the current population to be
parents and uses them to produce the children for the next generation. Over successive
generations, the population "evolves" toward an optimal solution. You can apply the
genetic algorithm to solve a variety of optimization problems that are not well suited for
standard optimization algorithms, including problems in which the objective function is
discontinuous, nondifferentiable, stochastic, or highly nonlinear. The genetic algorithm can
address problems of mixed integer programming, where some components are restricted to
be integer-valued. It works on a population of individual input strings. First it selects
the input strings and assigns each a fitness value. Based on those fitness values, new offspring
are generated. The crossover process that follows produces a possibly fitter
string so as to obtain optimized weights. The new string generated at each stage is possibly
better than the previous one.

Fig 3.4.1: Genetic Algorithm
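The selection, crossover and mutation loop described in this section can be sketched as below for a feature-selection setting, where each individual is a bit string marking which of 13 input attributes are kept. This is a toy illustration, not the project's implementation: the `fitness` function is a stand-in for classifier accuracy, the "informative" feature indices are hypothetical, and the population size, mutation rate and use of elitism are assumptions.

```python
import random

def fitness(mask, informative=(0, 2, 5)):
    """Toy stand-in for classifier accuracy (assumption): reward hypothetical
    informative features and lightly penalise every selected feature."""
    hits = sum(mask[i] for i in informative)
    return hits - 0.1 * sum(mask)

def tournament(population, rng, k=3):
    """Selection: the fittest of k randomly drawn individuals becomes a parent."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """Single-point crossover of two parent bit strings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(mask, rng, rate=0.05):
    """Flip each bit with a small probability."""
    return [1 - g if rng.random() < rate else g for g in mask]

def genetic_search(n_features=13, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = genetic_search()
print(best, history[-1])
```

Because the elite individual is copied into every new generation, the best fitness never decreases from one generation to the next. In a real run the toy `fitness` would be replaced by the accuracy of a classifier trained on the selected attributes.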


3.4.2 BAT Algorithm

The Bat algorithm is a metaheuristic algorithm for global optimization. It was inspired
by the echolocation behaviour of microbats, with varying pulse rates of emission and
loudness. The Bat algorithm (BA) is widely used in various optimization problems because of its excellent
performance. It has the advantage of simplicity and flexibility.

The bat algorithm uses some idealized rules for simplicity.

(1) Bats use echolocation to sense prey, predators, or any barriers in the path, and to sense distance.

(2) Bats fly with a velocity and position. They have a frequency f and a loudness to reach their
prey.

The algorithm essentially uses a frequency-tuning technique to control the dynamic behaviour of a
swarm of bats, and the balance between exploration and exploitation can be controlled by
tuning the algorithm-dependent parameters.

Fig 3.4.2: BAT Algorithm
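The frequency-tuning idea can be illustrated with a small sketch that follows the commonly published form of the bat algorithm: each bat draws a random frequency, updates its velocity towards the current best position, occasionally performs a local random walk around the best solution, and lowers its loudness while raising its pulse rate when it improves. The objective (`sphere`), the search range and all parameter values are assumptions for the demonstration, and this is not the project's code.

```python
import math
import random

def sphere(x):
    """Toy objective to minimise (assumption): sum of squares."""
    return sum(v * v for v in x)

def bat_search(objective=sphere, dim=2, n_bats=15, iterations=100,
               f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    loud = [1.0] * n_bats   # loudness A_i
    rate = [r0] * n_bats    # pulse emission rate r_i
    cost = [objective(p) for p in pos]
    best_i = min(range(n_bats), key=cost.__getitem__)
    best, best_cost = pos[best_i][:], cost[best_i]
    history = [best_cost]
    for t in range(1, iterations + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Frequency tuning controls the step towards the best position.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]
            cand = [x + v for x, v in zip(pos[i], vel[i])]
            if rng.random() > rate[i]:
                # Local random walk around the current best solution.
                cand = [b + rng.uniform(-1, 1) * mean_loud for b in best]
            cand_cost = objective(cand)
            if rng.random() < loud[i] and cand_cost < cost[i]:
                pos[i], cost[i] = cand, cand_cost
                loud[i] *= alpha                                # quieter
                rate[i] = r0 * (1 - math.exp(-gamma * t))       # more pulses
            if cost[i] < best_cost:
                best, best_cost = pos[i][:], cost[i]
        history.append(best_cost)
    return best, best_cost, history

best, best_cost, history = bat_search()
print(best, best_cost)
```

Loudness and pulse rate are the algorithm-dependent parameters that shift the swarm from exploration towards exploitation as the search proceeds.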

3.4.3 BEE Algorithm

The Bees Algorithm is an optimisation algorithm inspired by the natural foraging
behaviour of honey bees. It mimics the foraging
strategy of honey bees to look for the best solution to an optimisation problem. Each
candidate solution is thought of as a food source (flower), and a population (colony) of n
agents (bees) is used to search the solution space. Each time an artificial bee visits a flower
(lands on a solution), it evaluates its profitability (fitness). Bee System is an improved
version of the Genetic Algorithm. The main purpose of the algorithm is to improve local
search while keeping the global search ability of GA. The bees algorithm consists of an
initialisation procedure and
a main search cycle which is iterated a given number T of times, or until a solution of
acceptable fitness is found. Each search cycle is composed of five procedures: recruitment,
local search, neighbourhood shrinking, site abandonment, and global search.
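The five procedures of the search cycle can be sketched as follows. This is a simplified illustration under stated assumptions: the objective (`sphere`) is a toy function to minimise, and the numbers of scouts, selected sites, elite sites and recruited foragers, the shrink factor and the stall limit are arbitrary values, not the project's settings.

```python
import random

def sphere(x):
    """Toy objective to minimise (assumption): sum of squares."""
    return sum(v * v for v in x)

def bees_search(objective=sphere, dim=2, n_scouts=10, n_best=4, n_elite=2,
                bees_elite=6, bees_other=3, ngh=1.0, shrink=0.8,
                stall_limit=5, cycles=50, low=-5.0, high=5.0, seed=0):
    rng = random.Random(seed)

    def scout():
        """A scout bee lands on a random flower (candidate solution)."""
        x = [rng.uniform(low, high) for _ in range(dim)]
        return {"x": x, "cost": objective(x), "ngh": ngh, "stall": 0}

    # Initialisation procedure.
    sites = sorted((scout() for _ in range(n_scouts)), key=lambda s: s["cost"])
    best = dict(sites[0])
    history = [best["cost"]]
    for _ in range(cycles):
        new_sites = []
        for rank, site in enumerate(sites[:n_best]):
            # Recruitment: elite sites receive more foragers.
            foragers = bees_elite if rank < n_elite else bees_other
            improved = False
            # Local search inside the site's neighbourhood.
            for _ in range(foragers):
                x = [min(high, max(low, v + rng.uniform(-site["ngh"], site["ngh"])))
                     for v in site["x"]]
                c = objective(x)
                if c < site["cost"]:
                    site = {"x": x, "cost": c, "ngh": site["ngh"], "stall": 0}
                    improved = True
            if not improved:
                # Neighbourhood shrinking when no forager found a better flower.
                site = dict(site, ngh=site["ngh"] * shrink,
                            stall=site["stall"] + 1)
            # Site abandonment after too many stalled cycles.
            new_sites.append(scout() if site["stall"] > stall_limit else site)
        # Global search: the remaining bees scout at random.
        new_sites += [scout() for _ in range(n_scouts - n_best)]
        sites = sorted(new_sites, key=lambda s: s["cost"])
        if sites[0]["cost"] < best["cost"]:
            best = dict(sites[0])
        history.append(best["cost"])
    return best["x"], best["cost"], history

best_x, best_cost, history = bees_search()
print(best_x, best_cost)
```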


Chapter 4

SYSTEM ANALYSIS


4. SYSTEM ANALYSIS
4.1 Existing System

In the current competitive world, we require an efficient technique to summarize,
analyze, present and maintain large datasets using data mining. This requires knowledge
of all data mining techniques in order to choose the best for the desired datasets; these data
mining techniques can answer questions that traditionally were too time-consuming to
resolve. Research has shown that data doubles every three years.

4.2 Disadvantages of Existing System

When data mining techniques such as the Bat, Bee and ACO algorithms were used, the performance,
evaluated in terms of accuracy, sensitivity and specificity and compared across other well-known
datasets, was low.

4.3 Proposed System

This project aims to detect heart disease from the dataset using bio-inspired feature-optimizing
algorithms. Four were considered: Genetic Algorithm, Bat, Bee and ACO. The ACO algorithm
available in Python is designed to solve the Travelling Salesman Problem (finding the shortest path) and
could not be applied to the heart disease dataset, so three algorithms are implemented: Genetic,
Bat and Bee.

4.4 Advantages of Proposed System

The performance was evaluated in terms of accuracy, sensitivity and specificity and
also compared with results on other well-known datasets. These results were observed to be among
the best when compared with the results obtained in related previous studies.
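For reference, the three measures named above can be computed from the confusion matrix of a binary classifier as sketched below. The labels are illustrative values (1 = disease, 0 = no disease), not results from this project, and `confusion_counts` and `evaluate` are hypothetical helpers.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels coded as 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: share of diseased patients correctly detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Specificity: share of healthy patients correctly cleared.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Illustrative labels only (assumption).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```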

4.5 System Study

4.5.1 Feasibility Study

The feasibility of the project is analyzed in this phase, and a business proposal is
put forth with a very general plan for the project and some cost estimates. During system
analysis, the feasibility study of the proposed system is carried out. This is to ensure that
the proposed system is not a burden to the company. For feasibility analysis, some
understanding of the major requirements for the system is essential.


Three key considerations involved in the feasibility analysis are:

• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY

4.5.2 Economical Feasibility

This study is carried out to check the economic impact that the system will have on
the organization. The amount of funds that the company can pour into the research and
development of the system is limited. The expenditures must be justified. The developed
system is well within the budget, and this was achieved because most of the technologies
used are freely available. Only the customized products had to be purchased.

4.5.3 Technical Feasibility

This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would lead to high demands being placed on the client.
The developed system must have modest requirements, as only minimal or no changes are required for
implementing this system.

4.5.4 Social Feasibility

This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, but must instead accept it as a necessity. The level of acceptance by the
users depends solely on the methods that are employed to educate the users about the system
and to make them familiar with it. Their level of confidence must be raised so that they are also able
to offer constructive criticism, which is welcomed, as they are the final users of the system.


4.6 SYSTEM SPECIFICATION

4.6.1 Hardware Requirements

• Hard Disk : 40 GB
• RAM : 512 MB
• Processor : i3

4.6.2 Software Requirements

• Operating System : Windows 7 Ultimate
• Programming Language : Python
• Database : MySQL


Chapter 5

SYSTEM DESIGN


5. SYSTEM DESIGN

5.1 INTRODUCTION

The most creative and challenging phase of the life cycle is system design. The term
design describes a final system and the process by which it is developed. It refers to the
technical specifications that will be applied in implementing the candidate system.
Design may be defined as “the process of applying various techniques and principles for the
purpose of defining a device, a process or a system in sufficient detail to permit its
physical realization”. The first goal of design is to determine how the output is to be produced and in what
format; samples of the output and input are also presented. Second, input data and database
files have to be designed to meet the requirements of the proposed output. The processing
phase is handled through program construction and testing. Finally, details related to
justification of the system and an estimate of the impact of the candidate system on the users
and the organization are documented and evaluated by management as a step toward
implementation. The importance of software design can be stated in a single word: “Quality”.
Design provides us with a representation of software that can be assessed for quality. Design is
the only way that we can accurately translate a customer’s requirements into a finished
software product or system. Without design we risk building an unstable system that might
fail if small changes are made, may be difficult to test, or whose quality cannot be tested.
So, it is an essential phase in the development of a software product.

5.2 UML Diagram

➢ UML stands for Unified Modeling Language. UML is a standardized general-purpose
modeling language in the field of object-oriented software engineering. The standard is
managed, and was created by, the Object Management Group.

➢ The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form UML comprises two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or
associated with, UML.


➢ The Unified Modeling Language is a standard language for specifying, visualizing,
constructing and documenting the artifacts of a software system, as well as for business
modeling and other non-software systems.

➢ The UML represents a collection of best engineering practices that have proven successful
in the modeling of large and complex systems.

➢ The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the
design of software projects.

GOALS:

The primary goals in the design of the UML are as follows:

1. Provide users a ready-to-use, expressive visual modeling language so that they can
develop and exchange meaningful models.

2. Provide extendibility and specialization mechanisms to extend the core concepts.

3. Be independent of particular programming languages and development processes.

4. Provide a formal basis for understanding the modeling language.

5. Encourage the growth of the OO tools market.

6. Support higher-level development concepts such as collaborations, frameworks, patterns and components.

Building blocks of the UML

The vocabulary of the UML encompasses three kinds of building blocks.

1. Things.

2. Relationships.

3. Diagrams.

Things in the UML

Things are the abstractions that are first-class citizens in a model. There are four kinds of things
in the UML.

1. Structural things.


2. Behavioral things.

3. Grouping things.

4. Annotational things.

These things are the basic object-oriented building blocks of the UML. You use them to write
well-formed models.

5.2.1 USECASE DIAGRAM

A use case diagram shows a set of use cases and actors (a special kind of class) and their
relationships. Use case diagrams address the static use case view of a system. These diagrams are
especially important in organizing and modeling the behaviors of a system. Both sequence
and collaboration diagrams are kinds of interaction diagrams.

[Use-case diagram: the actor "client" is linked to the use cases: upload heart disease, run genetic algorithm, run bat algorithm, run BEE algorithm, upload & predict test data, accuracy graph]

Fig 5.2.1 Use-case Diagram


5.2.2 CLASS DIAGRAM:

Class diagrams are the most common diagrams used in UML. A class
diagram consists of classes, interfaces, associations and collaborations. Class diagrams
primarily represent the object-oriented view of a system, which is static in nature. An active class
is used in a class diagram to represent the concurrency of the system. Fig 5.2.2
shows the class diagram for the heart disease prediction system.
A class diagram represents the object orientation of a system. Therefore, it is generally used for
development purposes. It is the most widely used diagram at the time of system
construction.

Fig 5.2.2 Class Diagram


5.2.3 SEQUENCE DIAGRAM
A sequence diagram in Unified Modelling Language (UML) is a kind of interaction diagram
that shows how processes operate with one another and in what order. It is a construct of a
Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event
scenarios, and timing diagrams. Fig 5.2.3 shows the sequence diagram for the
heart disease prediction system.


[Sequence diagram: messages between the lifelines "client" and "system", in order: upload heart disease, run genetic algorithm, run bat algorithm, run BEE algorithm, upload & predict test data, accuracy graph]

Fig 5.2.3 Sequence Diagram


Chapter 6

TESTING & VALIDATION


6. TESTING AND VALIDATION

Software testing is a critical element of software quality assurance and represents the
ultimate review of specification, design and coding. Testing presents an interesting
anomaly for the software engineer.

Testing objectives include:

Testing is a process of executing a program with the intent of finding an error. A good test
case is one that has a high probability of finding an as yet undiscovered error. A successful test is
one that uncovers an as yet undiscovered error.

Testing Principles

• All tests should be traceable to end user requirements.

• Tests should be planned long before testing begins.

• Testing should begin on a small scale and progress towards testing in the large.

• Exhaustive testing is not possible.

• To be most effective, testing should be conducted by an independent third party.

6.1 Testing Strategies

A strategy for software testing integrates software test cases into a series of well-planned
steps that result in the successful construction of software. Software testing is one element of a broader topic
that is referred to as Verification and Validation. Verification refers to the set of
activities that ensure that the software correctly implements a specific function. Validation
refers to the set of activities that ensure that the software that has been built is traceable to
the customer’s requirements.

6.2 SYSTEM TEST

The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality
of components, sub-assemblies, assemblies and/or a finished product. It is the process of
exercising software with the intent of ensuring that the software system meets its
requirements
and user expectations and does not fail in an unacceptable manner. There are various types of
tests. Each test type addresses a specific testing requirement.

6.3 TYPES OF TESTS

6.3.1 Unit testing

Unit testing involves the design of test cases that validate that the internal program logic is
functioning properly, and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at component level and test a specific business process, application,
and/or system configuration. Unit tests ensure that each unique path of a business process
performs accurately to the documented specifications and contains clearly defined inputs and
expected results.
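As a small illustration of a unit test with clearly defined inputs and expected results, the sketch below uses Python's built-in `unittest` module to check both decision branches of a hypothetical normalization helper. The function `min_max_scale` and the test values are assumptions for the example and are not taken from the project code.

```python
import unittest

def min_max_scale(column):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

class MinMaxScaleTest(unittest.TestCase):
    def test_valid_input_gives_expected_output(self):
        # Defined input and expected result for the normal branch.
        self.assertEqual(min_max_scale([120, 145, 170]), [0.0, 0.5, 1.0])

    def test_constant_column_branch(self):
        # Exercises the branch that guards against division by zero.
        self.assertEqual(min_max_scale([5, 5, 5]), [0.0, 0.0, 0.0])

# Build and run the suite explicitly so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MinMaxScaleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests passed:", result.wasSuccessful())
```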

6.3.2 Integration testing

Integration tests are designed to test integrated software components to determine if they
actually run as one program. Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that although the components were
individually satisfactory, as shown by successful unit testing, the combination of
components is correct and consistent. Integration testing is specifically aimed at exposing the
problems that arise from the combination of components.

6.3.3 Functional test

Functional tests provide systematic demonstrations that functions tested are available as
specified by the business and technical requirements, system documentation, and user
manuals.

Functional testing is centered on the following items:

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures : interfacing systems or procedures must be invoked.


Organization and preparation of functional tests is focused on requirements, key functions, or
special test cases. In addition, systematic coverage of identified business process
flows, data fields, predefined processes, and successive processes must be considered for
testing. Before functional testing is complete, additional tests are identified and the effective
value of current tests is determined.

6.3.4 System Test

System testing ensures that the entire integrated software system meets requirements. It tests
a configuration to ensure known and predictable results. An example of system testing is the
configuration oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.

6.3.5 White Box Testing

White Box Testing is testing in which the software tester has knowledge of the
inner workings, structure and language of the software, or at least its purpose. It
is used to test areas that cannot be reached from a black box level.

6.3.6 Black Box Testing


Black Box Testing is testing the software without any knowledge of the inner workings,
structure or language of the module being tested. Black box tests, like most other kinds of tests,
must be written from a definitive source document, such as a specification or requirements
document. It is testing in which the software under test is treated as a black box: you
cannot “see” into it. The test provides inputs and responds to outputs without considering
how the software works.
6.3.7 Unit Testing

Unit testing is usually conducted as part of a combined code and unit test phase of the
software lifecycle, although it is not uncommon for coding and unit testing to be conducted as
two distinct phases.

Integration Testing
Software integration testing is the incremental integration testing of two or
more integrated software components on a single platform to produce failures caused by
interface defects.


The task of the integration test is to check that components or software applications, e.g.
components in a software system or – one step up – software applications at the company level
– interact without error.

Test Results

All the test cases mentioned above passed successfully. No defects encountered.

Acceptance Testing

User Acceptance Testing is a critical phase of any project and requires significant participation
by the end user. It also ensures that the system meets the functional requirements.

Test Results:

All the test cases mentioned above passed successfully. No defects encountered.


Chapter 7

IMPLEMENTATION AND RESULTS


7. IMPLEMENTATION AND RESULTS

7.1 INTRODUCTION TO PYTHON

Python is an open source programming language. It was made to be easy-to-read and
powerful. A Dutch programmer named Guido van Rossum made Python in 1991. He named
it after the television program Monty Python's Flying Circus. Many Python examples and
tutorials include jokes from the show. Python is an interpreted language. Interpreted
languages do not need to be compiled to run. A program called an interpreter runs Python
code on almost any kind of computer. This means that a programmer can change the code and
quickly see the results. This also means Python is slower than a compiled language like C,
because it is not running machine code directly.

Python is a good programming language for beginners. It is a high-level language, which
means a programmer can focus on what the program should do without needing detailed
knowledge of the computer hardware. Writing programs in Python takes less time than in
some other languages.

Python drew inspiration from other programming languages like C, C++, Java, Perl, and Lisp.

Python's developers avoid changing the language until they have accumulated a substantial
set of changes. They also tend to reject patches to non-critical parts of the CPython reference
implementation that would offer only a small speed increase at the cost of clarity. When speed
is important, a Python programmer can move time-critical parts of the program into extension
modules written in languages such as C, or run the program on PyPy, an implementation with a
just-in-time compiler. Cython is another option: it translates a Python script into C and makes
direct C-level API calls into the Python interpreter.

Fig 7.1: Introduction To Python


Keeping Python fun to use is an important goal of Python’s developers. This is reflected in the
language's name, a tribute to the British comedy group Monty Python. Tutorials and reference
materials occasionally take a playful approach, such as referring to spam and eggs
instead of the standard foo and bar.

7.2 Features

Some of the major changes included for Python 3.0 were:

• Changing print so that it is a built-in function, not a statement. This made it easier
to change a module to use a different print function, as well as making the syntax
more regular. In Python 2.6 and 2.7 print() is available as a builtin but is masked
by the print statement syntax, which can be disabled by entering
from __future__ import print_function at the top of the file.
• Removal of the Python 2 input function, and the renaming of the raw_input function
to input. Python 3's input function behaves like Python 2's raw_input function, in
that the input is always returned as a string rather than being evaluated as an
expression.
• Moving reduce (but not map or filter) out of the built-in namespace and into
functools (the rationale being that code that uses reduce is less readable than code
that uses a for loop and accumulator variable).
• Adding support for optional function annotations that can be used for informal
type declarations or other purposes.
• Unifying the str / unicode types, representing text, and introducing a separate
immutable bytes type; and a mostly corresponding mutable bytearray type, both
of which represent arrays of bytes.
• Removing backward-compatibility features, including old-style classes, string
exceptions, and implicit relative imports.
• A change in integer division functionality: in Python 2, integer division always
returns an integer. For example, 5/2 is 2, whereas in Python 3, 5/2 is 2.5. (In
both Python 2, from 2.2 onwards, and Python 3, a separate operator, //, exists to
provide the old behavior.)
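
Several of these changes can be seen in a few lines of Python 3. The snippet below is only an illustrative sketch; the variable names and the bmi function are made up for the example, and input() is not called because it would wait for keyboard input.

```python
from functools import reduce  # in Python 3, reduce lives in functools

# print is an ordinary function, so it can be assigned and passed around.
emit = print
emit("spam", "eggs", sep=", ")

# reduce compared with an explicit for loop and accumulator variable.
values = [1, 2, 3, 4]
total_reduce = reduce(lambda acc, v: acc + v, values, 0)
total_loop = 0
for v in values:
    total_loop += v

# str holds text; bytes is immutable and bytearray is its mutable counterpart.
text = "heart"
raw = text.encode("utf-8")
buf = bytearray(raw)
buf[0] = ord("H")          # allowed: bytearray is mutable

# True division with /, floor division (the old integer behavior) with //.
true_div = 5 / 2
floor_div = 5 // 2


# Function annotations are optional and are not enforced at run time.
def bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / height_m ** 2
```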

Syntax and semantics

Python is meant to be an easily readable language. Its formatting is visually uncluttered and
often uses English keywords where other languages use punctuation. Unlike many other
languages, it does not use curly brackets to delimit blocks, and semicolons after statements
are allowed but rarely used. It has fewer syntactic exceptions and special cases than C or
Pascal.

Python Syntax compared to other programming languages

• Python was designed for readability, and has some similarities to the English
language with influence from mathematics.
• Python uses new lines to complete a command, as opposed to other programming
languages which often use semicolons or parentheses.
• Python relies on indentation, using whitespace, to define scope, such as the scope of
loops, functions and classes. Other programming languages often use curly-brackets
for this purpose.
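
The role of indentation can be seen in a small sketch. The function name count_at_risk, the threshold and the sample ages are all hypothetical and chosen only for illustration.

```python
def count_at_risk(ages, threshold=50):
    """Count the ages above a threshold; each block is marked by indentation alone."""
    count = 0
    for age in ages:            # the loop body is the indented block below
        if age > threshold:     # the if body is indented one level further
            count += 1
    return count                # back at function level: runs after the loop ends


at_risk = count_at_risk([34, 51, 67, 45])
```

No braces or semicolons are needed: each statement ends at the end of its line, and the nesting of the loop and the condition is expressed only by how far each line is indented.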

7.3 Advantages

1. Presence of third-party modules
2. Extensive support libraries (NumPy for numerical calculations, Pandas for data analytics,
etc.)
3. Open source and community development
4. Versatile, easy to read, learn and write
5. User-friendly data structures
6. High-level language
7. Dynamically typed language (there is no need to declare a data type; the type is taken
from the value assigned)
8. Object-oriented language
9. Portable and interactive


7.4 Python OOPs Concepts


In Python, object-oriented programming (OOPs) is a programming paradigm that uses
objects and classes. It aims to model real-world entities in code using concepts such as
inheritance, polymorphism, and encapsulation. The main idea of OOPs is to bind the data
and the functions that work on it together as a single unit, so that other parts of the code
do not access this data directly.

Main Concepts of Object-Oriented Programming (OOPs)

• Class
• Objects
• Polymorphism
• Encapsulation
• Inheritance

Fig 7.4: Python OOPs concepts

a) Class

A class is a collection of objects. A class contains the blueprints or the prototype from which
the objects are being created. It is a logical entity that contains some attributes and methods.

To understand the need for creating a class, consider an example. Suppose you wanted to
track a number of dogs that may have different attributes such as breed and age. If a list is used,
the first element could be the dog’s breed while the second element could represent its age.
Suppose there are 100 different dogs: how would you know which element is supposed to
be which? What if you wanted to add other properties to these dogs? This approach lacks
organization, and that is exactly the need that classes address.

Some points on Python class

• Classes are created by the keyword class.
• Attributes are the variables that belong to a class.
• Attributes are always public and can be accessed using the dot (.) operator, e.g.
Myclass.Myattribute

Class Definition Syntax:

class ClassName:
    # Statement-1

    # Statement-N
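
Following this syntax, the dog example described above could be sketched as a class. The class name Dog, its attributes and the describe method are illustrative assumptions, not part of the project code.

```python
class Dog:
    """A hypothetical Dog class; the attribute names are illustrative only."""

    species = "Canis familiaris"       # class attribute, shared by every Dog

    def __init__(self, name, breed, age):
        self.name = name               # instance attributes hold each object's state
        self.breed = breed
        self.age = age

    def describe(self):                # a method defines behavior
        return f"{self.name} is a {self.age}-year-old {self.breed}"


rex = Dog("Rex", "Beagle", 3)          # two separate objects built from one class
luna = Dog("Luna", "Poodle", 5)
summary = rex.describe()
```

Each attribute is reached by name with the dot operator (rex.breed, luna.age), so there is no need to remember which list position holds which property, and new properties can be added in one place.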

b) Objects

The object is an entity that has a state and behavior associated with it. It may be any real-
world object like a mouse, keyboard, chair, table, pen, etc. Integers, strings, floating-point
numbers, even arrays, and dictionaries, are all objects. More specifically, any single integer
or any single string is an object. The number 12 is an object, the string “Hello, world” is an
object, a list is an object that can hold other objects, and so on. You’ve been using objects all
along and may not even realize it.

An object consists of
• State: It is represented by the attributes of an object. It also reflects the properties of an
object.
• Behavior: It is represented by the methods of an object. It also reflects the response of
an object to other objects.
• Identity: It gives a unique name to an object and enables one object to interact with other
objects.
To understand the state, behavior, and identity let us take the example of the class dog
(explained above).

• The identity can be considered as the name of the dog.
• State or Attributes can be considered as the breed, age, or color of the dog.
• The behavior can be considered as to whether the dog is eating or sleeping.

c) Inheritance
Inheritance is the capability of one class to derive or inherit the properties from another class.
The class that derives properties is called the derived class or child class and the class from
which the properties are being derived is called the base class or parent class. The benefits of
inheritance are:

• It represents real-world relationships well.


• It provides the reusability of a code. We don’t have to write the same code again and
again. Also, it allows us to add more features to a class without modifying it.
• It is transitive in nature, which means that if class B inherits from another class A,
then all the subclasses of B would automatically inherit from class A.
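
These points can be illustrated with a short sketch. The classes Animal, Dog and Puppy are hypothetical examples chosen to show reuse, extension and the transitive nature of inheritance.

```python
class Animal:                          # base (parent) class
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):                     # derived (child) class: reuses __init__
    def speak(self):                   # and overrides one inherited method
        return f"{self.name} barks"


class Puppy(Dog):                      # inherits from Dog and, transitively, from Animal
    def play(self):                    # adds a feature without modifying Dog
        return f"{self.name} plays"


pup = Puppy("Milo")
```

Here Puppy never defines __init__ or speak, yet pup has both, because they are inherited through Dog from Animal.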
d) Polymorphism
Polymorphism simply means having many forms. For example, suppose we need to determine
whether a given species of bird can fly. Using polymorphism, we can do this with a single function.
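
The bird example can be sketched as follows. The class names and the flight_report function are assumptions made for illustration only.

```python
class Bird:
    def can_fly(self):
        return True                    # default behavior for birds


class Sparrow(Bird):
    pass                               # keeps the inherited behavior


class Ostrich(Bird):
    def can_fly(self):                 # same method name, different form
        return False


def flight_report(bird):
    """One function works for every kind of bird passed to it."""
    kind = type(bird).__name__
    return f"{kind} can fly" if bird.can_fly() else f"{kind} cannot fly"


reports = [flight_report(b) for b in (Sparrow(), Ostrich())]
```

The function never checks which species it received; it simply calls can_fly, and each object responds in its own way.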

e) Encapsulation

Encapsulation is one of the fundamental concepts in object-oriented programming (OOP). It
describes the idea of wrapping data and the methods that work on data within one unit. This
puts restrictions on accessing variables and methods directly and can prevent the accidental
modification of data. To prevent accidental change, an object’s variable can only be changed
by an object’s method. Those types of variables are known as private variables.

A class is an example of encapsulation, as it encapsulates all of its data, that is, member
functions, variables, etc.
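
A minimal sketch of a private variable is shown below. The class PatientRecord and its methods are hypothetical. Note that Python enforces privacy by convention and name mangling rather than by strict access control, so this guards against accidental changes rather than deliberate ones.

```python
class PatientRecord:
    def __init__(self, age):
        self.__age = age               # double underscore triggers name mangling

    def get_age(self):
        return self.__age

    def set_age(self, age):            # the only intended way to change the value
        if age < 0:
            raise ValueError("age must be non-negative")
        self.__age = age


record = PatientRecord(54)
record.set_age(55)

try:
    record.__age                       # outside the class this name does not exist
    direct_access_blocked = False
except AttributeError:
    direct_access_blocked = True
```

Because every change goes through set_age, the check on negative values cannot be skipped by accident.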

Data Types
a) Strings
A string is a sequence of characters. It can be declared in Python using single or double
quotes. Strings are immutable, i.e., they cannot be changed once created.

b) Lists
Lists are one of the most powerful tools in Python. They are similar to the arrays declared in
other languages, but a list need not be homogeneous: a single list can contain strings,
integers, as well as other objects. Lists can also be used for implementing stacks and queues.
Lists are mutable, i.e., they can be altered once declared.

c) Tuples
A tuple is an immutable sequence of Python objects. Tuples are just like lists, except
that tuples cannot be changed once declared. Tuples are usually faster than lists.

d) Iterations
Iteration or looping can be performed in Python with ‘for’ and ‘while’ loops. Apart from
looping on a particular condition, we can also iterate over strings, lists, and tuples.
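
The behavior of these types can be checked with a short sketch; all names and values below are made up for illustration.

```python
name = "heart"
upper_name = name.upper()        # strings are immutable: upper() returns a new string

mixed = ["age", 54, 1.5]         # a list may hold values of different types
mixed.append("chol")             # lists are mutable

stack = [1, 2]
stack.append(3)                  # push onto the end of the list
top = stack.pop()                # pop from the end: last in, first out

point = (120, 80)
try:
    point[0] = 130               # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

letters = []
for ch in name:                  # a for loop iterates directly over a string
    letters.append(ch)

countdown = []
n = 3
while n > 0:                     # a while loop repeats until its condition fails
    countdown.append(n)
    n -= 1
```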
7.5 Operators
Python operators are used to perform operations on values and variables. They are
standard symbols used for logical and arithmetic operations. This section looks at the
different types of Python operators.
a) Arithmetic Operators
Arithmetic operators are used to perform mathematical operations like addition,
subtraction, multiplication, and division.

b) Comparison Operators

Comparison (or relational) operators compare values. They return either True or False
according to the condition. For example:

<=    Less than or equal to: True if the left operand is less than or equal to the right (x <= y)

c) Logical Operators
Logical operators perform logical AND, logical OR, and logical NOT operations. They are
used to combine conditional statements.

d) Bitwise Operators

Bitwise operators act on bits and perform bit-by-bit operations. These are used to operate
on binary numbers.

e) Assignment Operators

Assignment operators are used to assign values to variables.


f) Identity Operators

is and is not are the identity operators. Both are used to check whether two values are located
at the same place in memory. Two variables that are equal are not necessarily identical.
is        True if the operands are identical
is not    True if the operands are not identical

g) Membership Operators

in and not in are the membership operators, used to test whether a value or variable is in a
sequence.
in        True if value is found in the sequence
not in    True if value is not found in the sequence
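
The operator families described in this section can be tried out in a few lines. The values below are arbitrary and chosen only for illustration.

```python
a, b = 7, 2

arithmetic = (a + b, a - b, a * b, a / b)      # addition, subtraction, multiplication, division
comparison = a <= b                            # relational operators return True or False
logical = (a > 0) and not (b > 5)              # and / or / not combine conditions
bitwise = (a & b, a | b, a ^ b, a << 1)        # operate on the binary digits of 7 (111) and 2 (010)

a += 1                                         # assignment operator: same as a = a + 1

x = [1, 2, 3]
y = [1, 2, 3]                                  # equal contents, but a separate object
z = x                                          # a second name for the same object

equal_but_not_identical = (x == y) and (x is not y)
identical = x is z

member = 2 in x                                # membership tests on a sequence
absent = 5 not in x
```

The pair x and y shows the difference between equality and identity: the two lists compare equal with ==, yet is reports that they are different objects in memory.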
7.6 RESULT AND DISCUSSIONS
This section presents the results of the proposed method of heart disease prediction using
bio-inspired algorithms. We applied machine learning algorithms to a heart disease dataset to
predict heart disease based on the attribute values of each patient. Our goal was to
compare different classification models and identify the most efficient one. For the comparison,
performance metrics obtained after feature selection, parameter tuning and calibration
are used, because this is a standard process for evaluating algorithms. We split the data into
a training set and a testing set, with 70% of the data used for training and 30% for testing.
We implemented three bio-inspired algorithms: the Genetic Algorithm, the Bat algorithm,
and the Bee algorithm. All of these algorithms are well suited to this system, and the highest
accuracy is given by the Genetic algorithm.
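
To make the idea of genetic-algorithm feature selection concrete, the sketch below evolves a 0/1 mask over features using tournament selection, single-point crossover, bit-flip mutation and elitism. It is not the project's implementation: the function names, parameter values and especially the toy fitness function are assumptions made for illustration. In a real run, the fitness of a mask would typically be the accuracy of a classifier trained on the selected features.

```python
import random


def tournament(population, fitness, rng):
    """Pick two random individuals and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def run_ga(fitness, n_features, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]                 # elitism: keep the best mask so far
        while len(next_population) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            if rng.random() < crossover_rate:       # single-point crossover
                cut = rng.randint(1, n_features - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


# Toy fitness (an assumption, not real data): pretend features 0, 3 and 5 are
# the useful ones, reward keeping them and penalize every extra feature kept.
USEFUL = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in USEFUL if mask[i] == 1)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


best_mask = run_ga(toy_fitness, n_features=8)
```

Because the best mask of each generation is copied unchanged into the next one, the best fitness can never decrease from one generation to the next.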

7.7 EXPERIMENTAL SETUP


The experimental results show the working model of the system in the GUI. In the GUI,
we create buttons for upload heart disease, run genetic algorithm, run bat algorithm,
run bee algorithm, upload & predict test data, and accuracy graph. Using these buttons we
get the required results.


To run this project, double-click the ‘run.bat’ file to get the screen below.

In the above screen, click on the ‘Upload Heart Disease’ button and upload the heart disease
dataset. See the screen below.


In the above screen the dataset file is being uploaded; after uploading, the screen below appears.

Now click on the ‘Run Genetic Algorithm’ button to run the genetic algorithm on the dataset and
get its accuracy details. While this algorithm is running, the feature selection process can be
followed in the black console window. The run also opens some empty windows; close all of
those empty windows except the current window.


In the above screen, the GA achieved 100% for accuracy, precision and recall. Now click on the
‘Run Bat’ algorithm button; its accuracy is 51%.

In the above screen, the Bee algorithm achieved 54% accuracy.


Now click on the ‘Upload & Predict Test Data’ button to upload test data and predict its class.

In the above screen, a test file containing test data without class labels is being uploaded; after
uploading the test data, the screen below appears.


In the above screen the application has predicted the disease stages. Now click on the ‘Accuracy
Graph’ button to view the accuracy of all algorithms in graph format.

7.8 Performance Analysis

7.8.1 Correlation Matrix: The correlation matrix in machine learning is used for feature
selection. It represents dependency between various attributes.

Fig 7.8.1: Correlation matrix
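
As a rough sketch of how such a matrix is computed, the code below calculates the Pearson correlation coefficient between every pair of columns. The column names and the numbers are made-up toy values, not taken from the project's dataset, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy data for illustration only.
data = {
    "age": [40, 50, 60, 70],
    "max_hr": [180, 170, 160, 150],
    "chol": [200, 240, 210, 260],
}
matrix = correlation_matrix(data)
```

Values close to 1 or -1 indicate strongly dependent attributes, while values near 0 indicate little linear dependency. In this toy data, max_hr falls in step with age, so that pair is perfectly negatively correlated.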


Precision: It is the ratio of correct positive results to the total number of positive results
predicted by the system.

Recall: It is the ratio of correct positive results to the total number of actual positive
samples.

F1-score: It is the harmonic mean of Precision and Recall. It measures the test accuracy. The
range of this metric is 0 to 1.
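
These metrics can be computed directly from counts of true positives (TP), false positives (FP) and false negatives (FN): precision = TP / (TP + FP), recall = TP / (TP + FN), and the F1-score is the harmonic mean 2 × precision × recall / (precision + recall). The sketch below uses made-up labels for illustration; the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels for illustration only: 1 = disease, 0 = no disease.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In this toy case there are 2 true positives, 1 false positive and 2 false negatives, so precision is 2/3 while recall is 2/4, which shows that the two metrics measure different things.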
7.8.2 Data Analysis

The histogram below shows the distribution of the different attributes in the dataset, which
helps to predict the required output.



7.8.3 Accuracy Results
The performance of our proposed method is measured by accuracy. Once the predictive model is
built, we can check how efficiently the bio-inspired algorithms work. For that, we
compare accuracy together with the precision, recall and f1-score values for the Genetic
algorithm, Bat algorithm, and Bee algorithm. The Genetic algorithm shows the best results
compared with the other algorithms.


The highest accuracy is given by the Genetic algorithm (100%); the remaining algorithms achieved
lower accuracy: Bat algorithm (51.61%) and Bee algorithm (54.83%). The results are shown below.

Algorithm    Genetic Algorithm    Bat Algorithm    Bee Algorithm
Accuracy     100%                 51.61%           54.83%

Table 7.8.3: Accuracy Table

The figure below shows the accuracy graph for the Genetic algorithm, Bat algorithm, and
Bee algorithm:

Fig 7.8.3: Accuracy Graph


Chapter 8

CONCLUSION


8. CONCLUSION

In the proposed heart disease prediction system, bio-inspired algorithms work well for
the model. Four bio-inspired feature-optimizing algorithms are considered: the Genetic
Algorithm, Bat algorithm, Bee algorithm and ACO algorithm. Of these, we implemented
three: the Genetic, Bat and Bee algorithms. We analyse and predict whether the patient has
heart disease or no disease, and the stage of the disease, using bio-inspired algorithms.
After a comparative analysis of the various machine learning models, we conclude that the
Genetic algorithm is the best approach for predicting heart disease. Among all the algorithms
used, the Genetic algorithm has the highest accuracy, about 100%. Hence, we conclude that
the Genetic algorithm is the most efficient model among the algorithms used. These results
show that the proposed system performs well in the category of optimization, and our project
provides an easy and efficient system for heart disease prediction.


Chapter 9

FUTURE ENHANCEMENT


9. FUTURE ENHANCEMENT

From a future perspective, it is necessary to form alliances and work together
with institutions at the forefront of knowledge, so that this work can be applied
to a real problem at the country level and contribute to our society. In
the future, the study is intended to estimate heart disease using deep learning (DL)
approaches and larger datasets. This work leaves an application that can be used to
support medical personnel in medical decision making, although it currently handles
only discrete data variables. In future work, it could also be extended to other chronic
diseases such as cancer and arthritis. As the developed system is generalized, it can be
used to analyze various datasets in the future. Deep learning algorithms can be used to
increase accuracy.


