
PREDICTION AND DIAGNOSIS OF HEART DISEASE PATIENTS USING DATA MINING TECHNIQUES


Shaik Tarannum¹, Govardhan Reddy Kamatam², R. Praveen Sam³
¹PG Scholar, ²Associate Professor, ³Professor
DEPT OF CSE
G. Pulla Reddy Engineering College, Kurnool, Andhra Pradesh

Abstract: We live in a period in which daily routines are changing significantly, with both
beneficial and harmful effects on our health. As a consequence, the prevalence of numerous
illnesses has risen sharply. Heart disease, in particular, has become increasingly prevalent in
recent years and puts human life at risk. Variations in blood pressure, blood sugar, pulse rate
and similar factors may lead to heart conditions such as blockage or blood congestion. The
result may be cardiac failure, peripheral artery disease, cardiac episodes, stroke, or even a
sudden heart attack. The various types of cardiac disease are detected or diagnosed through
medical tests that take family medical history and other attributes into consideration.
However, predicting heart disease without any medical testing is very challenging. The goal
of this work is to detect cardiac problems as early as feasible, and at a reasonable cost, so that
the necessary actions can be taken to avoid them. For the prediction of cardiac disorders we
use a data mining methodology in which patient attributes are fed into SVM, Random Forest,
KNN, and ANN classification algorithms. The preliminary readings and results obtained with
this approach are used to determine the likelihood of detecting cardiac illnesses at an early
stage, when they may be entirely treatable with the correct diagnosis.

Key words: SVM, Random Forest, KNN, ANN, Data Mining.

1.0 INTRODUCTION

Heart disease is one of the many disorders that affect our lives. It is a major disease:
heart disease and related cardiac conditions are commonly reported as a leading cause of
death [1-3]. Most authors hold that most individuals who suffer a heart attack do not survive
it. Every day, many factors affect the human heart. Problems are emerging at a rapid pace,
and new cardiac disorders are being identified at an alarming rate. In today's stressful world,
the heart, as the essential organ that pumps blood through the body, must be kept healthy for
healthy living. The health of a human heart depends on personal and professional activities
and is shaped by a person's life experiences. There may also be factors, inherited down the
generations, that carry a type of heart disease. According to the World Health Organization,
more than 12 million people die every year as a consequence of various types of
cardiovascular disease (CVD). The phrase "cardiac disease" refers to a wide spectrum of
heart and artery problems. Even young adults in their 20s and 30s are susceptible to heart
disease. Obesity, poor nutrition, family history, high blood pressure, high blood cholesterol,
a sedentary lifestyle, and smoking may all increase young people's risk of heart disease.
Understanding cardiac ailments should always remain a priority. Our study examines
whether cardiac disease can be identified at an early stage [7-9]. With a good diagnosis, it
may be possible to cure the ailment completely. The condition of the heart is significant
because it is the main organ in the body; it acts as the operating system of our body. Irregular
cardiac function has a negative impact on other bodily functions [10-13]. Heart disease is
defined as any disorder of the heart. It differs from cardiovascular disease, which also covers
problems of the blood vessels and circulation. According to the World Health Organization,
cardiac diseases are the leading cause of death in the United Kingdom, the USA, Canada and
Australia. Heart disease includes conditions such as coronary artery disease, arrhythmia, and
myocardial infarction. Age, smoking, diabetes, obesity, genetics, depression, hypertension,
cholesterol, and other factors all play a role in heart disease [14-17].

Fig. 1. System Architecture

Figure 1 shows the block diagram of the data mining system. We selected a smaller set
of data, preprocessed it, and obtained knowledge through postprocessing.

2.0 LITERATURE SURVEY


"Disease predicting system utilizing data mining approaches," by M. A. Nishara Banu
and B. Gomathy.

The health care industry generates large amounts of information that are not mined
effectively to extract unknown information. Hidden patterns usually remain undiscovered
and underused. Heart disease is a broad term that covers various kinds of heart disorder. It
refers to unpredicted health issues that affect any of the heart's components. Various data
mining approaches, such as association rule mining and classification techniques, are used in
the health sector to predict cardiac diseases. Preprocessing the heart disease database
improves the efficiency of the mining process. The preprocessed data are first grouped with a
clustering algorithm such as K-means to extract the significant data. The Maximal Frequent
Itemset Algorithm (MAFIA) is then applied to find the most common patterns in the heart
disease database. Using the idea of information entropy, the C4.5 method is used as the
training algorithm to classify the frequent patterns. The results showed that the prediction
system created was able to evaluate high blood pressure successfully.

"Diagnosis of heart disease patients using fuzzy classification methodology," by V.
Krishnaiah.

Data mining techniques applied to historical medical data have proven significant in
medical research for predicting heart disease. It has been recognized that medical records are
unstructured, heterogeneous data, and that data with varied attributes should be analyzed to
predict and provide information for diagnosing a cardiac patient. Different data mining
approaches have been used to predict patients with heart disease. However, several authors
did not reduce the complexity involved in analyzing the available data. An effort was made
to reduce the uncertainty of the unstructured data by introducing fuzziness into the measured
data. An objective function and its measured value were devised and combined with the
fuzzified data to predict persons with heart disease while reducing ambiguity. Furthermore,
an attempt was made to classify patients using attributes collected from the medical field. A
fuzzy K-NN classifier based on minimum Euclidean distance was used to sort the training
and testing data into multiple categories. Compared with other parametric classifiers, the
fuzzy K-NN classifier performed well.

"Predictions in heart disease using techniques of data mining," by Monika Gandhi and
Shailendra Narayan Singh.

Predicting heart disease is regarded as one of the most difficult problems in medical
research. As a result, a decision support system for recognizing cardiac disease in patients is
required. The study proposes a method for predicting heart disease that combines an efficient
genetic algorithm with back propagation. Today's medical sector has come a long way in
treating people suffering from numerous ailments. Heart disease is among the most dangerous
because it cannot be observed with the naked eye and strikes suddenly once its limits are
reached. Poor clinical decisions can result in a patient's death, which no hospital can afford.
Good decision support systems can be built with computer technology to achieve accurate
and cost-effective treatment. Many hospitals use hospital information systems to manage
their healthcare or patient data. These systems generate massive volumes of information in
the form of images, text, charts, and numbers.

3.0 METHODOLOGY

The main goal of this research is to identify the illness at an early stage and at a low
cost. Data mining techniques make early identification possible, and with a correct diagnosis
the problem may be treated entirely. The health-care industry collects a large quantity of data
that is not being mined for hidden information. Data mining is a methodology that can
address this challenge: it examines vast amounts of data and extracts patterns that can be
transformed into usable knowledge. The data must be collected in a defined format. Data on
20 attributes are gathered from patients' medical profiles and directly from the hospital. These
variables are used to predict whether or not a patient may develop cardiovascular disease.
The main benefits of the study are early diagnosis of heart disease, accurate and timely
diagnosis, and therapy at a reasonable price. Here, four algorithms are used:
• Support Vector Machine (SVM)
• Random Forest
• K-Nearest Neighbors (KNN)
• Artificial Neural Network (ANN)
The four methods are trained and tested on the data to verify their accuracy, using OpenCV
with the Qt Designer system.
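
The train-and-test procedure described above can be illustrated with a short sketch. This is
not the implementation used in the study: the patient records, the 75/25 split ratio, the
threshold rule standing in for a classifier, and the helper names train_test_split and accuracy
are all assumptions made for illustration.

```python
import random

def train_test_split(records, test_ratio=0.25, seed=0):
    """Shuffle the records and split them into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical records: ([age, systolic blood pressure], label),
# where label 1 means heart disease and 0 means no heart disease.
records = [
    ([35, 115], 0), ([40, 120], 0), ([42, 118], 0), ([38, 122], 0),
    ([60, 150], 1), ([65, 160], 1), ([58, 155], 1), ([63, 148], 1),
]
train, test = train_test_split(records)

# Stand-in classifier (an assumption): flag a patient when pressure exceeds 135.
y_true = [label for _, label in test]
y_pred = [1 if features[1] > 135 else 0 for features, _ in test]
print("test accuracy:", accuracy(y_true, y_pred))
```

In practice each of the four algorithms would be fitted on the training part only and scored on
the held-out testing part, so that the reported accuracy reflects unseen patients.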

SVM: SVM stands for Support Vector Machine and is a supervised machine learning
method. It is mostly used to tackle classification problems. There is a training phase at the
beginning, in which data that have already been classified, that is, labelled, are passed to the
SVM algorithm as described in Fig. 2. Once this training phase, the most important part of
the process, is complete, new data are submitted to the algorithm, which classifies them with
the least amount of human intervention.
Fig. 2. SVM Process
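
The two phases described above, training on labelled data and then classifying new data, can
be sketched as follows. This is a minimal linear SVM trained by subgradient descent on the
regularized hinge loss, written without external libraries; it is not the implementation used in
the study. The feature values (imagined as two standardized measurements), the
hyperparameters, and the function names train_linear_svm and svm_predict are assumptions.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Training phase: fit weights w and bias b on labelled data.

    Labels must be +1 (disease) or -1 (no disease). Each update is a
    subgradient step on lam/2 * |w|^2 + max(0, 1 - y * (w.x + b)).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin or misclassified: move the boundary.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Correct with enough margin: only the regularizer acts.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def svm_predict(w, b, x):
    """Classification phase: return +1 or -1 for a new record."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical standardized features for six labelled patients.
X = [[-1.0, -0.8], [-0.9, -1.0], [-0.8, -0.9],
     [0.8, 1.0], [1.0, 0.9], [0.9, 0.8]]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
print(svm_predict(w, b, [0.7, 0.7]), svm_predict(w, b, [-0.7, -0.7]))
```

A kernel SVM from an established library would normally be preferred for real patient data;
the linear form is used here only because it keeps the training loop short enough to read.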
Random Forest: Random Forest is a decision support tool that builds on the decision tree
concept. It is essentially an ensemble of unpruned classification trees. It performs well in a
variety of real-world situations and works quickly. Combining many trees generally gives a
great improvement in performance over a single decision tree. There are three basic choices
to be made when constructing a random tree:
1. The method for splitting the leaves.
2. The type of predictor to use in each leaf.
3. The method for injecting randomness into the trees.
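
The three choices listed above can be made concrete with a deliberately simplified sketch. To
stay short it uses one-split trees (decision stumps) rather than the full unpruned trees of a real
Random Forest, so it illustrates the idea and is not the method used in the study. The split
rule, the patient data, and the names fit_stump, fit_forest and forest_predict are assumptions.

```python
import random
from collections import Counter

def majority(labels):
    """Choice 2, leaf predictor: the most common class label in the leaf."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, feature):
    """Choice 1, split method: threshold on one feature with fewest errors."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        if not left or not right:
            continue
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:
        # No valid split (all sampled values equal): use a single leaf.
        lab = majority(y)
        return (feature, float("inf"), lab, lab)
    return (feature, best[1], best[2], best[3])

def fit_forest(X, y, n_trees=25, seed=0):
    """Choice 3, randomness: bootstrap rows and pick a random feature per tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        feature = rng.randrange(len(X[0]))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def forest_predict(forest, x):
    """Each tree votes; the forest returns the majority vote."""
    return majority([l if x[f] <= t else r for f, t, l, r in forest])

# Hypothetical patients: [age, systolic blood pressure]; 1 = heart disease.
X = [[35, 115], [40, 120], [42, 118], [38, 122],
     [60, 150], [65, 160], [58, 155], [63, 148]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
print(forest_predict(forest, [30, 110]), forest_predict(forest, [70, 165]))
```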

KNN: K-Nearest Neighbors is a non-parametric approach. It is used for classification and
also for regression. It is an example of instance-based learning, in which the function is only
approximated locally and all computation is postponed until classification. It is one of the
most important known machine learning algorithms. A useful strategy is to weight the
contributions of the neighbors, so that closer neighbors contribute more to the average than
those further away.
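
The distance-weighting idea described above can be sketched as follows. This is an
illustrative example, not the implementation used in the study: the inverse-distance weight,
the value k=3, the patient data, and the function name knn_predict are assumptions. Real
attributes would normally be scaled first so that no single attribute dominates the distance.

```python
import math
from collections import defaultdict

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by a distance-weighted vote of its k nearest neighbors."""
    # No model is built in advance: all work happens at classification time.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        # Closer neighbors get larger weights; the small constant avoids
        # division by zero when the query coincides with a training point.
        votes[label] += 1.0 / (distance + 1e-9)
    return max(votes, key=votes.get)

# Hypothetical patients: [age, systolic blood pressure]; 1 = heart disease.
X = [[35, 115], [40, 120], [42, 118], [38, 122],
     [60, 150], [65, 160], [58, 155], [63, 148]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print(knn_predict(X, y, [61, 152]), knn_predict(X, y, [37, 117]))
```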

ANN: An Artificial Neural Network is a computing system patterned after the human brain
and nervous system. The neural network is trained on the relevant patterns identified in the
data in order to predict heart attacks accurately. The training algorithm is a Multilayer
Perceptron Neural Network with Back Propagation.

Fig. 3 KNN-Layers
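
A multilayer perceptron trained with back propagation can be sketched in a few lines. The
network below has one hidden layer and is only an illustration: the layer size, learning rate,
number of epochs, squared-error loss, the scaled feature values, and the names train_mlp and
forward are assumptions, not details taken from the study.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass: input -> hidden layer -> single output in (0, 1)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    out = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, out

def train_mlp(X, y, hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train by back propagation with per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x, W1, b1, W2, b2)
            # Error term at the output for the loss 0.5 * (out - t)^2.
            delta_out = (out - t) * out * (1 - out)
            # Propagate the error back to the hidden layer (old W2 values).
            delta_h = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                W2[j] -= lr * delta_out * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h[j] * x[i]
                b1[j] -= lr * delta_h[j]
            b2 -= lr * delta_out
    return W1, b1, W2, b2

# Hypothetical features scaled to [0, 1]; 1 = heart disease.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]

params = train_mlp(X, y)
for x, t in zip(X, y):
    print(t, round(forward(x, *params)[1], 3))
```

An output above 0.5 is read as a prediction of heart disease and an output below 0.5 as a
prediction of no heart disease.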

4.0 RESULT

In our study we used data mining techniques to identify and predict cardiac diseases; the
results are given in the table below. We employed four different algorithms: ANN, KNN,
SVM, and Random Forest.

COMPARISON OF RESULT
TABLE I
The major effect of data mining is that health issues can be predicted and prophylactic
measures can be taken.

Fig. 4. Using the SVM algorithm

Fig. 5. Using the Random Forest algorithm

Fig. 6. Using the KNN algorithm


Fig. 7. Using the ANN algorithm
The various accuracy levels obtained from the algorithms are shown in Fig. 8.

Fig. 8. Accuracy scales of the algorithms

5.0 CONCLUSION

The major goal of this research is to use data mining to give insight into diagnosing and
treating cardiac disease. The data for mining were acquired from the Thrissur Jubilee Mission
Hospital. Data were collected by interviewing patients one-on-one and recording their
answers, and also from the discharge summaries of the patients. In this way, a total of 20
attributes were gathered from roughly 2200 or more patients. The data were then sorted and
organized logically in an Excel spreadsheet, to which different data mining methods may be
applied. Age, gender, heart rate, and blood sugar are among the twenty attributes gathered
from medical profiles to estimate the chance of a patient developing heart disease. These
attributes are fed into classification algorithms such as SVM, Random Forest, KNN, and
ANN, which provide the best results and the maximum possible reliability in detecting
cardiac problems. The ANN algorithm achieves valid results, which may be further improved
by increasing the number of attributes.

REFERENCES

[1] Babu, Sarath, "Heart disease diagnosis using data mining technique." Electronics
Communication and Aerospace Technology (ICECA), 2017 International Conference of. Vol.
1. IEEE, 2017.
[2] Banu, MA Nishara, and B. Gomathy. "Disease forecasting system using data mining
methods." Intelligent Computing Applications (ICICA), 2014 International Conference on.
IEEE, 2014.
[3] Krishnaiah, V., "Diagnosis of heart disease patients using fuzzy classification technique."
Computer and Communications Technologies (ICCCT), 2014 International Conference on.
IEEE, 2014.
[4] Gandhi, Monika, and Shailendra Narayan Singh. "Predictions in heart disease using
techniques of data mining." Futuristic Trends on Computational Analysis and Knowledge
Management (ABLAZE), 2015 International Conference on. IEEE, 2015.
[5] Purusothaman, G., and P. Krishnakumari. "A survey of data mining techniques on risk
prediction: Heart disease." Indian Journal of Science and Technology 8.12 (2015).
[6] Thomas, J., and R. Theresa Princy. "Human heart disease prediction system using data
mining techniques." Circuit, Power and Computing Technologies (ICCPCT), 2016
International Conference on. IEEE, 2016.
[7] Banu, NK Salma, and Suma Swamy. "Prediction of heart disease at early stage using data
mining and big data analytics: A survey." Electrical, Electronics, Communication, Computer
and Optimization Techniques (ICEECCOT), 2016 International Conference on. IEEE, 2016.
[8] Thanigaivel, R., and K. Ramesh Kumar. "Boosted Apriori: an Effective Data Mining
Association Rules for Heart Disease Prediction System." Middle-East Journal of Scientific
Research 24.1 (2016): 192-200.
[9] Saboji, Rashmi G. "A scalable solution for heart disease prediction using classification
mining technique." 2017 International Conference on Energy, Communication, Data
Analytics and Soft Computing (ICECDS). IEEE, 2017.
[10] Sowmiya, C., and P. Sumitra. "Analytical study of heart disease diagnosis using
classification techniques." Intelligent Techniques in Control, Optimization and Signal
Processing (INCOS), 2017 IEEE International Conference. IEEE, 2017.
[11] Bahrami, Boshra, and Mirsaeid Hosseini Shirvani. "Prediction and Diagnosis of Heart
Disease by Data Mining Techniques." Journal of Multidisciplinary Engineering Science and
Technology (JMEST) 2.2 (2015): 164-168.
[12] Kolçe, Elma, and Neki Frasheri. "A literature review of data mining techniques used
in healthcare databases." ICT Innovations (2012).
[13] Anbarasi, M., E. Anupriya, and N. C. S. N. Iyengar. "Enhanced prediction of heart
disease with feature subset selection using genetic algorithm." International Journal of
Engineering Science and Technology 2.10 (2010): 5370-5376.
[14] Chandra, Priti, and B. L. Deekshatulu. "Prediction of risk score for heart disease using
associative classification and hybrid feature subset selection." Intelligent Systems Design and
Applications (ISDA), 2012 12th International Conference on. IEEE, 2012.
[15] Ranganatha, S., "Medical data mining and analysis for heart disease dataset using
classification techniques." (2013): 1-09.
[16] Shouman, Mai, Tim Turner, and Rob Stocker. "Using data mining techniques in heart
disease diagnosis and treatment." Electronics, Communications and Computers (JECECC),
2012 Japan-Egypt Conference on. IEEE, 2012.
[17] Sonet, KM Mehedi Hasan, "Analyzing patterns of numerously occurring heart diseases
using association rule mining." Digital Information Management (ICDIM), 2017 Twelfth
International Conference on. IEEE, 2017.
[18] Sudeshna, P., S. Bhanumathi, and MR Anish Hamlin. "Identifying symptoms and
treatment for heart disease from biomedical literature using text data mining." Computation
of Power, Energy Information and Communication (ICCPEIC), 2017 International
Conference on. IEEE, 2017.
[19] Soni, Jyoti, "Predictive data mining for medical diagnosis: An overview of heart disease
prediction." International Journal of Computer Applications 17.8 (2011): 43-48.
[20] Jabbar, M. Akhil, Bulusu Lakshmana Deekshatulu, and Priti Chandra. "Heart disease
prediction system using associative classification and genetic algorithm." arXiv preprint
arXiv:1303.5919 (2013).
[21] Anooj, P. K. "Clinical decision support system: Risk level prediction of heart disease
using weighted fuzzy rules." Journal of King Saud University-Computer and Information
Sciences 24.1 (2012): 27-40.
