

Heart Disease Prediction Algorithm Based on Ensemble Learning


2020 7th International Conference on Dependable Systems and Their Applications (DSA) | 978-0-7381-2422-3/20/$31.00 ©2020 IEEE | DOI: 10.1109/DSA51864.2020.00052

Ke Yuan
School of Computer and Information Engineering, Henan University
International Joint Research Laboratory for Cooperative Vehicular Networks of Henan
E-mail: yuanke_hhhh@163.com

Longwei Yang
School of Computer and Information Engineering, Henan University
E-mail: 291513440@qq.com

Yabing Huang
School of Computer and Information Engineering, Henan University
E-mail: hyabing@163.com

Zheng Li
School of Computer and Information Engineering, Henan University
E-mail: lizheng@henu.edu.cn

Abstract: Heart disease is one of the leading causes of human death. According to statistics, deaths caused by heart disease account for about one-third of all deaths in the world. With further research, the use of machine learning to predict heart disease has become an essential method to prevent and treat heart disease. In recent years, machine learning based on big data analysis has been widely used in various software applications, but it has not been used on a large scale in disease prediction. In this article, we propose a new algorithm named hybrid gradient boosting decision tree with logistic regression (HGBDTLR), based on ensemble learning, to improve the accuracy of machine learning in heart disease prediction. The experimental results show that the prediction accuracy of the HGBDTLR algorithm reaches 91.8% on the Cleveland heart disease data set.

Keywords: big data analysis; machine learning; heart disease prediction; HGBDTLR algorithm

I. INTRODUCTION

Heart disease causes severe damage to the heart and can be fatal. According to estimates, the number of people suffering from cardiovascular diseases in China is as high as 290 million, and cardiovascular disease ranks first among causes of death, accounting for more than 40% of residents' deaths. These alarming figures raise concern about heart disease and create a strong demand for a way to predict it effectively.

Researchers in related fields have used data analysis techniques, neural network techniques and machine learning to predict whether an individual has heart disease and how severe it is, using algorithms such as K-Nearest Neighbor (KNN), Decision Trees (DT), Genetic Algorithm (GA), Naive Bayes (NB) and many others. Researchers in many countries have done advanced work on heart disease prediction. T. Tantimongcolwat et al. used DK self-organizing map and BP neural network to identify ischemic heart disease in 2008 [1]. I. Maglogiannis and J. Vepa et al. used Support Vector Machines (SVM) to classify heart disease in 2009 [2], [3]. G. Parthiban and others applied the NB algorithm to the diagnosis of heart disease in 2011 and got a prediction accuracy of 74% [4]. H. Hugo et al. proposed an embedded recognition system for heart disease based on a fuzzy clustering algorithm in 2013 [5]. S. Kedar et al. used KNN to classify heart diseases in 2015 and got 75% accuracy [6]. Chinese researchers have also made significant contributions in this field. Wang Jie et al. used logistic regression (LR) in 2007 to analyze the factors of angina pectoris and coronary artery disease [7]. Chen Tianhua and others used neural networks to study heart disease in 2008 [8]. Shi Qi et al. used orthogonal partial least squares discriminant analysis and other methods to predict heart disease in 2012, 2013 and 2014 respectively [9], [10]. In 2015, W. Y. Dai et al. used classification models such as SVM, LR and NB to predict heart disease, and obtained a prediction accuracy of 82% [11]. Chen Xu et al. studied disease prediction on imbalanced medical data sets in 2019 and proposed a new ensemble classification method based on iterative boosting and under-sampling; the accuracy on the two selected data sets increased by 6.3% and 12.43% respectively [12]. In 2019, Li Xiaoqian from Northeast Forestry University proposed a heart disease prediction method based on a convolutional neural network [13]. At present, researchers in heart disease prediction tend to use new computing technologies such as data analysis and machine learning to build classification models for disease prediction.

Compared with the traditional single classifier, new classification algorithms obtained by combining machine learning with other technologies are better suited to the requirements and play a more significant role in disease prediction. Therefore, in this paper we propose the Hybrid Gradient Boosting Decision Tree with Logistic Regression (HGBDTLR) algorithm to improve the accuracy of heart disease prediction using ensemble learning methods. In many previous algorithms, restricted feature selection is one of the main factors limiting classifier performance, while the HGBDTLR algorithm uses the entire feature set and does not restrict the selection of features. The experimental results show that the hybrid


Authorized licensed use limited to: National University of Singapore. Downloaded on July 04,2021 at 06:55:45 UTC from IEEE Xplore. Restrictions apply.
algorithm we proposed can predict heart disease more accurately.

II. OVERVIEW OF METHOD

A. UCI Data Set
In this study, the Cleveland database of the heart disease data set in the UCI database was selected. After obtaining 303 records from the UCI data set, we first preprocess the data to obtain the experimental data, and then perform repeated random sub-sampling validation on it. In each round, 80% of the data (242 records) form the training set used to train HGBDTLR, and the remaining 20% (61 records) form the test set used for classification.

B. Ensemble Learning
Ensemble learning refers to the construction and combination of multiple learners to jointly complete a learning task, with algorithms such as the C4.5 decision tree, the BP neural network, the Random Forest mixed linear model and Adaboost. By combining multiple learners, ensemble learning can usually obtain an ensemble learner that performs significantly better than a single learner. Since the benefit of ensemble learning is particularly pronounced for "weak learners", theoretical research on ensemble learning is conducted on weak learners. The HGBDTLR algorithm proposed in this paper reflects the advantages of ensemble learning.

C. Algorithm Introduction
Recently, many data mining methods and prediction algorithms, such as KNN, LR, SVM and NB, have become very popular for identifying and predicting heart disease, and hybrid algorithms based on ensemble learning have attracted more and more attention from researchers. Farnaz Sabahi et al. proposed the BFAHP algorithm, which obtained 87.4% prediction accuracy on the UCI Cleveland dataset [14]. Amin et al. proposed introducing rough set technology into a hybrid method combining linear regression, multivariate adaptive regression splines and neural networks; the prediction accuracy reached 82.18%, 85.82% and 91.30% on the heart disease datasets of Cleveland, Switzerland and Hungary respectively [15]. In the latest research on ensemble learning, Mohan S, Thirumalai C and Srivastava G proposed a new algorithm called HRFLM, which combines two different methods to obtain a better prediction model and reaches up to 88.7% accuracy on the Cleveland data [16]. This paper sets up a comparative experiment in Section IV. Under the same data and experimental process, we compare the proposed HGBDTLR algorithm with seven classic classification algorithms and the latest HRFLM algorithm, and then compare the classification results to show the improvement achieved by the algorithm.

III. MODEL INTRODUCTION AND HGBDTLR ALGORITHM
The nine classification algorithms selected in this paper are Decision Tree, Random Forest, Logistic Regression, K-Nearest Neighbor, Support Vector Machine, Adaboost, Gradient Boosting Decision Tree, HRFLM and HGBDTLR. All algorithms are implemented with the Python sklearn library and tuned by adjusting their parameters.

A. Data source and preprocessing
The Cleveland Heart Disease Data contains 14 attributes, 13 of which are used as features to predict heart disease, and the remaining one is used as the sample label. Table I shows the details of the data set attributes. First, the raw data is preprocessed and the missing values in the data are filled in. Then the non-continuous numeric features are mapped to continuous numeric features. Finally, the Target attribute of each sample is processed: a non-zero Target value is set to 1, indicating that the sample has heart disease, while a Target value of 0 is left unchanged, indicating that the sample does not have heart disease.

Table I ATTRIBUTES OF UCI HEART DISEASE DATA SET
Attribute | Description | Type | Value range
Age | Patient's age in completed years | Numeric | 29 to 77
Sex | Patient's gender (male represented as 1 and female as 0) | Nominal | 0 or 1
Cp | Type of chest pain, categorized into 4 values: 0. typical angina, 1. atypical angina, 2. non-anginal pain and 3. asymptomatic | Nominal | 0 to 3
Trestbps | Resting blood pressure (in mm/Hg at the time of admission to the hospital) | Numeric | 94 to 200
Chol | Serum cholesterol in mg/dl | Numeric | 126 to 564
FBS | Fasting blood sugar > 120 mg/dl; represented as 1 if true and 0 if false | Nominal | 0 or 1
Restecg | Resting electrocardiogram result, represented in 3 distinct values: normal state as Value 0; abnormality in ST-T wave as Value 1 (which may include inversions of T-wave and/or depression or elevation of ST of > 0.05 mV); any probability or certainty of LV hypertrophy by Estes' criteria as Value 2 | Nominal | 0 to 2
Thalach | Maximum heart rate achieved | Numeric | 71 to 202
Exang | Angina induced by exercise; 0 represents false, 1 represents true | Nominal | 0 or 1
Oldpeak | Exercise-induced ST depression in comparison with the state of rest | Numeric | 0 to 6.2
Slope | Slope of the ST segment during peak exercise, depicted in three values: 1. upsloping, 2. flat and 3. downsloping | Nominal | 0 to 2
Ca | Number of major vessels colored by fluoroscopy, numbered from 0 to 3 | Numeric | 0 to 3
Thal | Status of the heart, illustrated through three distinctly numbered values: normal as 3, fixed defect as 6 and reversible defect as 7 | Nominal | 3, 6, 7
Target | The label column; samples with heart disease are represented by 1 and samples without heart disease by 0 | Nominal | 0 or 1
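The preprocessing of Section III-A and the 80/20 random sub-sampling of Section II-A can be illustrated with a minimal sketch. The paper does not say how missing values are filled, so the column median used here is an assumption, as are the "?" missing-value marker, the toy records and all function names.

```python
import random
from statistics import median

def fill_missing(rows, missing="?"):
    """Fill missing entries with the column median (an assumed strategy, not stated in the paper)."""
    filled = [list(r) for r in rows]
    for c in range(len(rows[0])):
        known = [float(r[c]) for r in rows if r[c] != missing]
        fallback = median(known)
        for r in filled:
            r[c] = fallback if r[c] == missing else float(r[c])
    return filled

def binarize_target(values):
    """Set every non-zero Target value to 1 (heart disease); 0 stays 0 (no heart disease)."""
    return [1 if v != 0 else 0 for v in values]

def random_split(n_samples, train_fraction=0.8, seed=0):
    """One round of random sub-sampling: shuffled index lists for training and test sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = int(n_samples * train_fraction)
    return idx[:n_train], idx[n_train:]

# Toy records with two columns (ca, thal); these are illustrative values, not real data.
rows = [["0", "3"], ["?", "7"], ["2", "?"], ["1", "6"]]
features = fill_missing(rows)
labels = binarize_target([0, 2, 1, 4])

# With 303 records the split gives the 242 / 61 sizes quoted in Section II-A.
train_idx, test_idx = random_split(303)
print(features, labels, len(train_idx), len(test_idx))
```

Repeating `random_split` with different seeds and averaging the scores would correspond to the repeated random sub-sampling validation mentioned in the text.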



B. Classification Modeling
1) Decision Tree: For the data in the data set, we create a decision tree based on those nodes with high information entropy. Pruning removes the branches and leaves that are not related to the data in the data set. Entropy is calculated as follows:
\[ \mathrm{Entropy} = -\sum_{i=1}^{m} p_i \log_2 p_i \tag{1} \]
where $p_i$ is the probability that the variable takes the value $i$.

2) Random Forest: The random forest algorithm is an ensemble learning algorithm that achieves its results by building multiple decision trees and combining them. The method combines Breiman's bootstrap aggregating idea and Ho's random subspace method to construct a collection of decision trees. For a given data set $X = \{x_1, x_2, x_3, \ldots, x_n\}$ with the corresponding responses $Y = \{y_1, y_2, y_3, \ldots, y_n\}$, the bagging procedure is repeated for $b = 1$ to $B$. The prediction for an unseen sample $x'$ is obtained by averaging the predictions $f_b(x')$ of the individual decision trees:
\[ \hat{f} = \frac{1}{B} \sum_{b=1}^{B} f_b(x') \tag{2} \]
The uncertainty of the prediction over these trees is measured by its standard deviation:
\[ \sigma = \sqrt{\frac{\sum_{b=1}^{B} \left( f_b(x') - \hat{f} \right)^2}{B-1}} \tag{3} \]

3) Logistic Regression: Logistic regression assumes that the data follows a Bernoulli distribution. Its parameters are estimated by maximum likelihood and solved with gradient descent in order to classify the data.

4) K-nearest Neighbor: The algorithm combines the standard Euclidean distance $d(x_i, x_j)$ between samples with the main KNN procedure, where:
\[ d(x_i, x_j) = \sqrt{(x_{i,1} - x_{j,1})^2 + \cdots + (x_{i,m} - x_{j,m})^2} \tag{4} \]

5) Support Vector Machines: We set the training samples in the data set as $\mathrm{Data} = \{y_i, x_i\}$, $i = 1, 2, \ldots, n$, where $x_i \in R^n$ represents the $i$-th vector and $y_i$ represents the target item. When $w$ is the coefficient vector and $b$ is the offset, the linear SVM model finds the best hyperplane $f(x) = w^T x + b$ by solving the following optimization problem:
\[ \min_{w, b, \xi_i} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i \tag{5} \]
\[ \text{s.t.} \quad y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad \forall i \in \{1, 2, \ldots, n\} \tag{6} \]

6) Adaboost: Adaboost is an ensemble learner that uses the idea of iteration. Only one weak classifier (learner) is trained in each iteration, and the trained weak classifier participates in the next iteration. That is to say, after the $n$-th iteration there are a total of $n$ weak classifiers, of which $n-1$ were trained previously and whose parameters no longer change. The classification performance of the $n$-th classifier is usually higher than that of the former $n-1$ weak classifiers.

7) Gradient Boosting Decision Tree: The algorithm builds an additive model of regression trees, each fitted to the negative gradient of the loss of the current model, as follows:
• Initialize the weak learner:
\[ f_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c) \tag{7} \]
• For $m = 1, 2, 3, \ldots, M$: for each sample $i = 1, 2, 3, \ldots, N$, calculate the negative gradient, i.e. the residual:
\[ r_{im} = -\left[ \frac{\partial L(y_i, f(x_i))}{\partial f(x_i)} \right]_{f(x) = f_{m-1}(x)} \tag{8} \]
Take the residual as the new true value of the sample, and use the data $(x_i, r_{im})$, $i = 1, 2, 3, \ldots, N$, as the training data of the next tree to obtain a new regression tree $f_m(x)$. The corresponding leaf node regions are $R_{jm}$, $j = 1, 2, 3, \ldots, J$, where $J$ is the number of leaf nodes of the regression tree. Calculate the best fit value for each leaf region $j = 1, 2, 3, \ldots, J$:
\[ \gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L(y_i, f_{m-1}(x_i) + \gamma) \tag{9} \]
• Update the strong learner:
\[ f_m(x) = f_{m-1}(x) + \sum_{j=1}^{J} \gamma_{jm} I(x \in R_{jm}) \tag{10} \]
• Get the final strong learner:
\[ f(x) = f_M(x) = f_0(x) + \sum_{m=1}^{M} \sum_{j=1}^{J} \gamma_{jm} I(x \in R_{jm}) \tag{11} \]

8) Hybrid Random Forest with Linear Model: HRFLM first uses a Decision Tree to partition the data, then applies a Linear Model to find the classifier with the lower error rate. After that, features are extracted using the lower-error classifier, and the classifier is applied to the extracted features to complete the classification.

C. HGBDTLR Algorithm
Generally speaking, ensemble learning can be divided into three categories: bagging, boosting and stacking. HGBDTLR is an algorithm based on stacking. Stacking is an ensemble learning technique that integrates multiple classification models (the GBDT algorithm in HGBDTLR) or regression models (the LR model in HGBDTLR) through a meta-classifier or meta-regressor. The base model is trained on the whole training set, and the meta-model is trained using the outputs of the base model as its input features. The process of the HGBDTLR algorithm is shown in Figure 1.
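To make the GBDT base model in this stack more concrete, the residual-fitting loop of Section III-B 7) can be traced with a small from-scratch sketch. It rests on several assumptions that are not in the paper: the squared loss L(y, f) = (y - f)^2 / 2 (so the negative gradient is simply y - f), a single numeric feature, and two-leaf decision stumps as regression trees. All function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a two-leaf regression tree on one feature (needs at least two distinct x values)."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        gl, gr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - gl) ** 2 for r in left) + sum((r - gr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, gl, gr)
    _, t, gl, gr = best
    # For squared loss the best leaf value is the mean residual inside the leaf.
    return t, gl, gr

def gradient_boost(xs, ys, n_trees=20):
    f0 = sum(ys) / len(ys)                      # initial constant minimizing the squared loss
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]   # negative gradient of the loss
        t, gl, gr = fit_stump(xs, residuals)             # leaf regions and leaf values
        stumps.append((t, gl, gr))
        # Update the strong learner by adding the leaf value of each sample's region.
        preds = [p + (gl if x <= t else gr) for x, p in zip(xs, preds)]
    return f0, stumps

def predict(model, x):
    """Final strong learner: initial constant plus the leaf values of all trees."""
    f0, stumps = model
    return f0 + sum(gl if x <= t else gr for t, gl, gr in stumps)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
model = gradient_boost(xs, ys)
print(predict(model, 2.5), predict(model, 5.5))
```

On this toy step function the first stump already removes every residual, so later trees contribute zero. A classifier would use a different loss, such as the log loss, with the same overall loop.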



Figure 1. The process of HGBDTLR algorithm for heart disease prediction
1) Step 1: Train the gradient boosting decision tree to get a strong learner.
a) Input: Data set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ containing features and labels; loss function $L(y, f_M(x))$.
Process:
• Initialize the base classifier.
• For $m = 1, 2, \ldots, M$ basis regression trees, calculate the negative gradient value $r_{im}$ of the loss function for $i = 1, 2, \ldots, N$. Fit the negative gradient values $r_{im}$ to learn a regression tree and get the current regression tree $T(x; \Theta_m)$. Update the current additive model:
\[ f_m(x) = f_{m-1}(x) + T(x; \Theta_m) \tag{12} \]
• Get the boosting tree for the regression problem:
\[ f_M(x) = \sum_{m=1}^{M} T(x; \Theta_m) \tag{13} \]
b) Output: the boosting tree $f_M(x)$, including the key features selected by the gradient boosting decision tree and the score of each feature.

2) Step 2: Feature Engineering.
a) Input: the boosting tree $f_M(x)$ and the important features selected by boosting tree training.
Process: First determine the coding object and the categorical variables, then perform integer coding on the features and arrange them in order.
b) Output: normalized features F(d1, d2, d3, ..., dn) with classification attributes.

3) Step 3: Ensemble learning to generate a strong classifier and get the classification result.
a) Input: normalized features F(d1, d2, d3, ..., dn).
Process: Apply a logistic regression classifier to classify the normalized features.
b) Output: Strong classifier.
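The three steps can be sketched with sklearn, the library the paper names in Section III. The paper does not spell out the encoding used in Step 2, so this sketch assumes one common way of building a GBDT-to-LR stack: each sample is described by the index of the leaf it reaches in every tree, and these indices are one-hot encoded before logistic regression. The data is a synthetic stand-in and the hyperparameters are arbitrary, so the score it prints is not a reproduction of Table II.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in with the same shape as the Cleveland data (303 samples, 13 features).
X, y = make_classification(n_samples=303, n_features=13, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Step 1: train the gradient boosting decision tree on the whole training set.
gbdt = GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0)
gbdt.fit(X_train, y_train)

# Step 2 (assumed encoding): leaf index per tree, then one-hot encoding.
# For binary problems apply() returns shape (n_samples, n_estimators, 1).
leaves_train = gbdt.apply(X_train)[:, :, 0]
leaves_test = gbdt.apply(X_test)[:, :, 0]
encoder = OneHotEncoder(handle_unknown="ignore")
F_train = encoder.fit_transform(leaves_train)
F_test = encoder.transform(leaves_test)

# Step 3: logistic regression as the meta-classifier on the encoded features.
meta = LogisticRegression(max_iter=1000)
meta.fit(F_train, y_train)
accuracy = meta.score(F_test, y_test)
print(f"stacked accuracy on synthetic data: {accuracy:.3f}")
```

Setting `handle_unknown="ignore"` keeps the encoder from failing if a test sample reaches a leaf that no training sample reached.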

IV. ENVIRONMENT AND RESULT EVALUATION

A. Dataset
The detailed information of each attribute in the data set is shown in Table I. Figure 2 is a heat map showing the relationship between the attributes. It can be seen from the heat map that the correlation between almost all the features given in the data set is very low. Figure 3 shows the distribution of the labels over each feature. Figure 4 shows the important features extracted in the first step of the HGBDTLR algorithm together with the score of each feature. It can be seen from the figure that the three features thal, ca and oldpeak are of high importance and contribute the most to classification, while restecg, fbs, chol and age are of low importance and contribute little to classification.
Figure 2. Heat map of 14 attribute relationships in Cleveland Heart Disease Data Set
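The quantities behind Figures 2 and 4 can be computed along the following lines. This is only a sketch under stated assumptions: the data is synthetic, so the numbers say nothing about the real data set, and taking sklearn's `feature_importances_` as the feature score is an assumption about how the scores in Figure 4 were obtained. The lowercase attribute names follow Table I.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

names = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
         "thalach", "exang", "oldpeak", "slope", "ca", "thal"]

# Synthetic stand-in data, not the Cleveland records.
X, y = make_classification(n_samples=303, n_features=len(names), random_state=0)

# Pearson correlation between all 14 attributes (13 features plus the label),
# which is the matrix a heat map like Figure 2 would display.
corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)

# Importance score of each feature from a fitted GBDT, sorted from high to low.
gbdt = GradientBoostingClassifier(random_state=0).fit(X, y)
ranking = sorted(zip(names, gbdt.feature_importances_), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name:10s} {score:.3f}")
```

The correlation matrix can be passed to any plotting routine to draw the heat map, and the sorted scores to a bar chart.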



Figure 3. Histogram of the distribution of target on each attribute
Figure 4. Histogram of feature importance scores

B. Evaluation Indicators
We use four classification performance indicators to compare the algorithms:
• Accuracy: the proportion of correctly classified samples among the total number of samples.
• Precision: the proportion of samples predicted to be positive by the model that are actually positive.
• Recall: the proportion of actually positive samples that are predicted to be positive.
• F1_score: the harmonic mean of precision and recall; its maximum is 1 and its minimum is 0.

C. Result Evaluation
We test eight different classification algorithms on the same data set and compare their classification results with those of our proposed HGBDTLR algorithm. The results are shown in Table II.

Table II NINE CLASSIFICATION ALGORITHMS' PREDICTION RESULTS
Classification | Accuracy | Precision | Recall_score | F1score
DT | 0.820 | 0.825 | 0.820 | 0.820
RF | 0.869 | 0.869 | 0.869 | 0.869
KNN | 0.639 | 0.643 | 0.639 | 0.639
Adaboost | 0.852 | 0.852 | 0.852 | 0.852
LR | 0.852 | 0.852 | 0.852 | 0.852
SVM | 0.820 | 0.821 | 0.820 | 0.820
GBDT | 0.803 | 0.803 | 0.803 | 0.803
HRFLM | 0.885 | 0.888 | 0.885 | 0.885
HGBDTLR | 0.918 | 0.919 | 0.918 | 0.918

Table II shows that in the binary classification task on the Cleveland Heart Disease Data Set, the KNN algorithm has the lowest classification accuracy, 63.9%. Random Forest and Adaboost, as ensemble learning algorithms, have higher classification accuracy, 86.9% and 85.2% respectively. The classification accuracy of the HGBDTLR algorithm based on ensemble learning reaches 91.8%, which is 11.5 percentage points higher than the 80.3% of the single GBDT algorithm. It is also higher than the 88.5% of the HRFLM algorithm, which is likewise based on ensemble learning.

Figure 5 shows the receiver operating characteristic (ROC) curves of the nine algorithms for predicting heart disease. The ROC curve reflects the relationship between sensitivity and specificity. The X-axis is 1 - specificity (the false positive rate); the closer to zero, the higher the accuracy. The Y-axis is sensitivity (the true positive rate); the closer to 1, the better the accuracy. It can be seen from Figure 5 that HGBDTLR has the highest prediction accuracy among the nine algorithms.
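The four indicators of Section IV-B and the points of a ROC curve can be computed from scratch as in the sketch below. It assumes binary labels with 1 as the positive class. The labels, predictions and scores are made-up illustrative values, and the function names are hypothetical. The figures in Table II may have been produced with a different averaging convention, which this sketch does not try to reproduce.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def indicators(y_true, y_pred):
    """Accuracy, precision, recall and F1 as defined in Section IV-B."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

def roc_points(y_true, scores):
    """(1 - specificity, sensitivity) pairs, one per distinct score used as a threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        pred = [1 if s >= thr else 0 for s in scores]
        tp, tn, fp, fn = confusion_counts(y_true, pred)
        points.append((fp / neg, tp / pos))
    return points

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(indicators(y_true, y_pred))
print(roc_points(y_true, scores))
```

A curve that rises towards sensitivity 1 while 1 - specificity is still near 0 corresponds to the better classifier described in the text.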



Figure 5. ROC curve of nine algorithms in heart disease prediction

In this section, we first preprocess the data in the dataset, then analyze the eight algorithms that have already been applied and the HGBDTLR algorithm. After that, we feed the data into the different algorithm models to obtain their four evaluation indicators and compare the HGBDTLR algorithm with the other eight algorithms. The four evaluation indicators obtained by the HGBDTLR hybrid algorithm are better than those of the other algorithms, and its prediction accuracy is as high as 91.8%, which shows that the HGBDTLR algorithm improves on the previous algorithms in all four indicators.

V. CONCLUSION

In the long run, identifying heart disease from original medical records with HGBDTLR will help save the lives of more heart disease patients and detect heart abnormalities early. If the disease can be detected and treated earlier, we can even prevent the occurrence of heart disease and reduce the mortality rate, which would undoubtedly be exciting news. In this work, we have used machine learning and data analysis technology and obtained good results. Future work can combine more machine learning technologies to achieve better prediction results, which means there is a lot of room for improvement in the application of ensemble learning in the medical field. On the other hand, new feature selection methods can be developed to obtain a broader view of salient features, thereby improving the performance of heart disease prediction. In addition, the UCI data set provides only a small number of data instances, and many patient attribute records are hidden for privacy reasons, which reduces the fit of the trained model. If more patient instances and more records of their attributes become available, we expect to achieve better accuracy and precision.

FUNDING
This research is supported by the National Natural Science Foundation of China, under the project called the research of nonlinear distance measurement learning algorithm based on supervision.

REFERENCES
[1] C. Isarankura-Na-Ayudhya, T. Tantimongcolwat, T. Naenna, Mark J. Embrechts, Virapong Prachayasittikul, "Identification of ischemic heart disease via machine learning analysis on magnetocardiograms," Computers in Biology and Medicine, 38(7):817–825, 2008.
[2] Ilias Maglogiannis, Euripidis Loukis, Elias Zafiropoulos and Antonis Stasis, "Support vectors machine-based identification of heart valve diseases using heart sounds," Comput Methods Programs Biomed, 95(1):47–61, 2009.
[3] Jithendra Vepa, "Classification of heart murmurs using cepstral features and support vector machines," Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference, 2009:2539–2542, 2009.
[4] G. Parthiban, A. Rajesh and S. K. Srivatsa, "Diagnosing vulnerability of diabetic patients to heart diseases using support vector machines," International Journal of Computer Applications, 48(2):45–49, 2012.
[5] Helton Hugo de Carvalho, Robson Luiz Moreno, Tales Cleber Pimenta, Paulo C. Crepaldi, Evaldo Cintra, "A heart disease recognition embedded system with fuzzy cluster algorithm," Computer Methods & Programs in Biomedicine, 110(3):447–454, 2013.
[6] Himansu Sekhar Behera and Durga Prasad Mohapatra, "Computational intelligence in data mining - volume 1: Proceedings of the international conference on CIDM, 5-6 December 2015," Advances in Intelligent Systems & Computing, pages 49–56, 2016.
[7] Li Jun, Wang Jie, Yao Kuiwu and Zhong Jingbai, "Logistic regression of coronary heart disease and angina pectoris syndrome elements and coronary artery lesions," Liaoning Journal of Traditional Chinese Medicine, 034(009):1209–1211, 2007.
[8] Zheng Yu, Chen Tianhua and Han Liqun, "Research on non-invasive diagnosis of coronary heart disease based on neural network," Aerospace Medicine and Medical Engineering, 21(006):513–517, 2008.
[9] Shi Qi, "Research on TCM Syndrome Recognition of Unstable Angina Pectoris of Coronary Heart Disease Based on Data Mining," PhD thesis, Beijing University of Chinese Medicine, 2012.
[10] Chen Jianxin, Shi Qi, Wang Wei, Zhao Huihui and Li Youlin, "Study on the recognition mode of phlegm and blood stasis mutual obstruction syndrome of coronary heart disease based on decision tree," Chinese Journal of Traditional Chinese Medicine, (12):3523–3526, 2013.
[11] Wuyang Dai, Theodora S. Brisimi, William G. Adams, Theofanie Mela, Venkatesh Saligrama, Ioannis Ch. Paschalidis, "Prediction of hospitalization due to heart disease by supervised learning methods," Future Generation Computer Systems, 84(3):189–197, 2015.
[12] Sun Yuzhong, Chen Xu, Liu Penghe, Shen Xi, Zhang Lei, Wang Xiaoqing, Sun Xiaoping and Cheng Wei, "Research on disease prediction model for unbalanced medical data set," Chinese Journal of Computers, 042(003):596–609, 2019.
[13] Wang Jian and Li Xiaoqian, "A new method of predicting heart disease based on feature combination and convolutional neural network," Journal of Natural Science of Heilongjiang University, 36(01):119–124, 2019.
[14] Sabahi Farnaz, "Bimodal fuzzy analytic hierarchy process (BFAHP) for coronary heart disease risk assessment," Journal of Biomedical Informatics, 83:204–216, 2018.
[15] Mohammad Shafenoor Amin, Yin Kia Chiam and Kasturi Dewi Varathan, "Identification of significant features and data mining techniques in predicting heart disease," Telematics & Informatics, 36(MAR.):82–93, 2019.
[16] Senthilkumar Mohan, C. Thirumalai and G. Srivastava, "Effective Heart Disease Prediction using Hybrid Machine Learning Techniques," IEEE Access, PP(99):1-1, 2019.


