
REVIEW PAPER

Heart Disease Prediction Using Machine Learning

Abstract-- According to the World Health Organization, heart-related
diseases are responsible for taking 17.7 million lives every year, 31% of
all global deaths. In India too, heart-related diseases have become the top
cause of death. Heart diseases killed 1.7 million Indians in 2016, according
to the 2016 Global Burden of Disease. We reviewed around 15-20 IEEE papers
and found that all of them use the same attributes. We therefore propose a
model, developed in consultation with a doctor, that uses new and easily
manageable attributes: gender, breathlessness during activity,
breathlessness at rest, waking at night due to breathlessness,
exercise-induced angina (chest pain after exercise), history of cyanosis
(bluish discoloration of the fingers or around the lips), diabetes, clubbing,
and blood pressure (whether more than 140/90). The main objective of our
project is to develop an intelligent system using machine learning
algorithms, namely Naive Bayes, KNN, Random Forest, Decision Tree, Logistic
Regression and SVM. Based on the obtained results, the system can predict
whether a person is likely to have heart disease or not. It is implemented
as a web-based application.

Keywords-- Naive Bayes, KNN, Random Forest, Decision Tree, LR, SVM

1. INTRODUCTION

The heart is a muscular organ in most animals, which pumps blood through the
blood vessels of the circulatory system. The pumped blood carries oxygen and
nutrients to the body, while carrying metabolic waste such as carbon dioxide
to the lungs. If it fails to function correctly, the brain and various other
organs stop functioning and the person dies within a few minutes. Most
Cardiovascular Disease (CVD) can be prevented by addressing behavioural risk
factors such as tobacco use, unhealthy diet and obesity, physical inactivity
and harmful use of alcohol, using population-wide strategies. People with
CVD, or who are at high cardiovascular risk (because of the presence of at
least one risk factor such as hypertension, diabetes, hyperlipidemia or
already established disease), need early detection and management using
counselling and medicines, as appropriate. In general, CVD is associated
with a build-up of fatty deposits inside the arteries (atherosclerosis) and
the formation of blood clots. It can also be associated with damage to
arteries in organs such as the brain, heart, kidneys and eyes.

Estimates made by WHO suggest that India has lost up to $237 billion to
cardiovascular disease in the last decade. Thus, reasonable and accurate
prediction of heart-related disease is the need of the hour.

This system is implemented as a web application in which the user answers
predefined questions. Data analytics is valuable for controlling,
contrasting and managing large data sets, and it can be applied with much
success to predict, prevent and manage cardiovascular disease.

2. LITERATURE SURVEY

This section reviews recent work on predicting chronic and infectious
diseases using machine learning classifiers. Yuanyuan [1] determines an
optimal value of K using the Silhouette method to form the clusters for
finding the anomalies. After that, they eliminate the identified anomalies
from the data and employ the five most popular machine learning
classification techniques. The work in [2] explored the Levy-based crow
search algorithm (LCSA). In this framework, predictions are made on heart
conditions using extremely non-linear, complex and dynamic
computational processes known as ANFIS. The learning parameters of ANFIS are
optimized using MSSO to provide better results. In the RFRF-ILM model given
in [3], dataset clustering is based on Decision Tree (DT) feature variables
and criteria. To estimate performance, the classifier is then applied to
each data set, and the well-performing models are identified by their low
error rates. The output is further optimized by taking the decision tree
cluster with a high error rate and removing its respective class-type
features. Chidambaram [4] first cleansed and processed the dataset using
preprocessing techniques such as data integration, data transformation,
data reduction and data cleaning with the pandas tool. Patient records were
visualized. Based on the split criterion, the cleansed data is split into
60% training and 40% test, and the dataset is then subjected to five machine
learning classifiers. Rony [5] works on the EDCNN model. This model is
focused on a deeper architecture which covers the multi-layer perceptron
model with regularization learning approaches. The UCI repository dataset
has been utilized for the diagnosis purpose, and a CNN classifier and a
multi-layer perceptron (MLP) module have been used. Santhana Krishnan [6]
uses the dataset taken from the UCI Machine Learning repository. Decision
Tree and Naive Bayes classifier algorithms are used: the two data mining
algorithms were applied to the dataset to predict the possibility of a
patient having heart disease, and the results were analyzed with a
classification model. An approach for the prediction of heart disease using
a hybrid machine learning methodology was given in [7]. The hybrid approach
is proposed for heart disease prediction using random forest classifiers and
simple k-means algorithms. The final outcomes were achieved through the
random forest classifier, and the corresponding confusion matrix
demonstrates the robustness of the approach. Nabaouia Louridi [8] worked on
the SVM model. SVM helps in analyzing data for classification and regression
analysis; the aim of SVM is to find a hyperplane in N-dimensional space that
distinctly separates the data. The work compares machine learning algorithms
using different performance metrics in order to improve their accuracy. ML
techniques were used in [9] to process raw data and provide a new direction
towards heart disease. The mortality rate of heart disease can be
drastically controlled if the disease is detected at an early stage and
preventative measures are adopted as soon as possible. In a Decision Tree,
the trees are constructed based on high-entropy inputs. Random Forest builds
several decision trees and combines them to get the best result. The
components of a neural network include inputs, hidden layers and an output.
Vijeta Sharma [10] examined the neural network and used neural networks as
classifiers to predict the diagnosis of cardiovascular heart disease. An
artificial neuron contains an activation function. The input of this
activation function is the pre-activation value obtained as the weighted sum
of all the neuron's inputs, and the output of the activation function decides
whether the neuron should fire or not. This research examined neural
networks and the different algorithms employed to improve their performance.

3. EXISTING SYSTEM

In this section we take a glance at the work that developers and researchers
have done in heart disease prediction. Almost all of them have used the heart
disease dataset obtained from the UCI (University of California at Irvine)
repository; the data set contained 10 attributes (age, sex, cp, trestbps,
chol, fbs, restecg, thalach, ca, and target) with 304 instances, as shown in
Table I. At the first level, the dataset is cleansed and processed using
data preprocessing techniques such as data integration, data transformation,
data reduction and data cleaning with the pandas tool. A total of 304
patient records were visualized. Data visualization techniques help the data
scientist to understand the feasibility of the dataset. The cleansed data is
split into 60% training and 40% test, and the dataset is then subjected to
five machine learning classifiers: Logistic Regression (LR), Support Vector
Machine (SVM), Decision Tree (DT), Random Forest (RF) and K-Nearest
Neighbors (KNN). The accuracy of the classifiers was calculated using the
confusion matrix. The classifier which achieves the highest accuracy can be
regarded as the best classifier.

4. EXPERIMENTAL ANALYSIS AND FINDINGS

This section contains the experimental analysis and findings of predicting
cardiovascular disease. A quad-core i5 system with 6 GB of RAM, pandas,
IPython, SciPy, StatsModels and Matplotlib was used in the Jupyter web
application environment. The
experimental analysis takes place in two levels: in the first level the
dataset is cleaned using the pandas tool, and in the second level the tidy
data are subjected to five machine learning classifiers for predicting
cardiovascular disease. Each classifier's performance, with its accuracy, is
shown in Fig.1 and Fig.2. The comparison of the accuracy obtained from
Logistic Regression (LR) and KNN is shown in Table.1. The graph showing the
K-neighbors classifier score for different K values is also included in
Fig.2.

5. PROPOSED SYSTEM

In this system, we are implementing an effective heart disease prediction
system using the machine learning architecture shown in Fig.3. The input is
given to the system as a CSV file, and the algorithms are then applied to
that input. After the data set is accessed, the operation is performed and
the resulting heart attack risk level is produced. We will add some more
parameters significant to heart disease, with their weight, age and priority
levels, by consulting expert doctors and medical experts. The heart attack
prediction system is designed to help identify different risk levels of
heart attack, such as "you have chances of getting the disease" or "you have
no dangerous symptoms of the disease", and also gives prescription details
related to the predicted result.
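The workflow described in Sections 3 to 5 (a 60%/40% train/test split, a classifier scored for several K values, and accuracy computed from the confusion matrix) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the records are synthetic, the helper names (`make_synthetic_records`, `split_records`, `knn_predict`, `confusion_counts`, `accuracy`) are hypothetical, and only KNN is shown, written without pandas or any other third-party library so that it runs on its own.

```python
import math
import random
from collections import Counter


def make_synthetic_records(n, seed=0):
    """Hypothetical stand-in data: 3 numeric features and a 0/1 label."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else 0.0  # the two classes are shifted apart
        features = [rng.gauss(center, 1.0) for _ in range(3)]
        records.append((features, label))
    return records


def split_records(records, train_fraction=0.6, seed=0):
    """Shuffle a copy of the records and split it 60% train / 40% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k):
    """Majority vote among the k training records closest to x."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, with 1 = disease."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Accuracy = correctly classified records / all records."""
    return (tp + tn) / (tp + fp + fn + tn)


records = make_synthetic_records(300)
train, test = split_records(records)
y_true = [label for _, label in test]

# Score KNN for several odd K values (odd K avoids tied votes).
scores = {}
for k in (1, 3, 5, 7):
    y_pred = [knn_predict(train, x, k) for x, _ in test]
    scores[k] = accuracy(*confusion_counts(y_true, y_pred))

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"K={k}: accuracy={score:.3f}")
print(f"best K: {best_k}")
```

The printed accuracies depend entirely on the synthetic data and are not comparable to the values reported in Table.1. With a real dataset, the records would instead be read from the CSV file, and the other classifiers would be scored through the same confusion-matrix step.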

Fig.1 Accuracy of Logistic Regression Model

Fig.2 Accuracy of KNN Model

Sr.No   Models                 Accuracy
1       Logistic Regression    0.819672
2       KNN                    0.844838

Table.1 Comparison of accuracy

Architecture:

Fig.3 Architecture of proposed system

Data Set Creation:

Parameters:
We referred to different IEEE papers and found that all of them use the same
attributes. We therefore propose a model, developed in consultation with an
expert, Dr. Harsh Dhondi, that uses new and easily manageable attributes.
All the parameters that we use are non-medical parameters, so anyone with
basic knowledge of health can fill in his/her personal data, and based on
the given input data the system will predict whether the person has heart
disease or not. Table.2 shows the parameters used in the proposed model.

Sr.No   Attribute Name                                               Range of values
1       Age                                                          Int (years)
2       Gender                                                       Categorical code
3       Breathlessness during activity                               Binary
4       Breathlessness at rest                                       Binary
5       Awake by breathlessness at night                             Binary
6       Exercise induced angina (chest pain after exercise)          Binary
7       History of cyanosis (bluish discoloration of fingers/lips)   Binary
8       Diabetes                                                     1. Normal, 2. Above normal, 3. Well above normal
9       Clubbing                                                     Binary
10      Blood pressure (if more than 140/90)                         Binary

Table.2 Feature information of dataset

6. CONCLUSION

In this application we will find the best optimized prediction models for
heart disease with simple and easily manageable parameters, so that the
system can identify heart disease at an early stage. This system can not
only answer complex queries for diagnosing heart disease but also assist
healthcare practitioners in making intelligent clinical decisions, which
traditional decision support systems or other existing systems cannot. By
supporting effective treatment, it also helps to reduce treatment cost.

7. FUTURE SCOPE

To determine the accuracy of heart disease prediction, classification
systems have been developed that analyze the minimum error rate. The
multi-layer perceptron algorithm is composed of artificial neurons,
including hidden layers, for binary classification problems. The experiments
were performed using the UCI repository dataset, and the proposed system
shows high performance in terms of precision and accuracy in detecting heart
disease. In the future, we can integrate this web application with an
Android app so that it is easily accessible to Android users. We can also
form a chain of heart specialist hospitals and provide them with this
system, so that patients can easily find out which hospitals are available
for treatment.

REFERENCES

1. Devansh Shah; Samir Patel; Santosh Kumar Bharti, “Heart Disease
Prediction using Machine Learning”.
https://doi.org/10.1007/s42979-020-00365-y

2. Archana Singh; Rakesh Kumar, “Heart Disease Prediction Using Machine
Learning Algorithms”. DOI: 10.1109/ICE348803.2020.9122958

3. Pranav Motarwar; Ankita Duraphe; G Suganya; M Premalatha, “Cognitive
Approach for Heart Disease Prediction using Machine Learning”.
DOI: 10.1109/ic-ETITE47903.2020.242

4. Harshit Jindal; Sarthak Agrawal; Rishabh Khera; Rachna Jain; Preeti
Nagrath, “Heart disease prediction using machine learning algorithms”.
DOI: 10.1088/1757-899X/1022/1/012072

5. R. Jane Preetha Princy; Saravanan Parthasarathy; P. Subha Hency Jose;
Arun Raj Lakshminarayanan; Selvaprabu Jeganathan, “Prediction of Cardiac
Disease using Supervised Machine Learning Algorithms”.
DOI: 10.1109/ICICCS48265.2020.9121169

6. Santhana Krishnan; Geetha S., “Prediction of Heart Disease Using Machine
Learning Algorithms”. DOI: 10.1109/ICIICT1.2019.8741465

7. Keerthi Samhitha; Sarika Priya; Jithina Jose, “Improving the Accuracy in
Prediction of Heart Disease using Machine Learning Algorithms”.
DOI: 10.1109/ICCSP48568.2020.9182303

8. Nabaouia Louridi; Meryem Amar; Bouabid El Ouahidi, “Identification of
Cardiovascular Disease Using Machine Learning”.
DOI: 10.1109/CMT.2019.8931411

9. SenthilKumar Mohan; Chandrasegar Thirumalai; Gautam Srivastava,
“Effective Heart Disease Prediction Using Hybrid Machine Learning
Techniques”. DOI: 10.1109/ACCESS.2019.2923707

10. Vijeta Sharma; Shrinkhala Yadav; Manjari Gupta, “Heart Disease
Prediction using ML Techniques”. DOI: 10.1109/ICACCCN51052.2020.9362842
