

Cardiovascular Disease Prognosis Using Effective
Classification and Feature Selection Technique

Shahed Anzarus Sabab
Lecturer, Dept. of CSE
Northern University Bangladesh, Bangladesh
sabab.iutcse@gmail.com

Ahmed Iqbal Pritom
Lecturer, Dept. of CSE
Green University of Bangladesh, Bangladesh
iqbal.cse@green.edu.bd

Md. Ahadur Rahman Munshi
Lecturer, Dept. of CSE
Green University of Bangladesh, Bangladesh
ahad_1114095@live.com

Shihabuzzaman
Lecturer, Dept. of CSE
Green University of Bangladesh, Bangladesh
shihabuzzaman.cse@green.edu.bd

Abstract— Cardiovascular disease is a worldwide health problem; according to the American Heart Association (AHA), it causes approximately 17.3 million deaths each year. Early detection and treatment of asymptomatic cardiovascular disease can therefore significantly reduce the chances of death. An important part of such life-threatening disease prognosis is identifying the patient's physical state (healthy or sick) from the analysis of health checkup data. This paper aims at optimized cardiovascular disease prognosis using different data mining techniques. We also provide a technique to improve the accuracy of the proposed classifier models using feature selection. Patient data were collected from the Department of Computing of Goldsmiths, University of London. The dataset contains 14 attributes, to which we applied the SMO (SVM - Support Vector Machine), C4.5 (J48 - Decision Tree) and Naïve Bayes classification algorithms and calculated their prediction accuracy. An efficient feature selection algorithm helped us improve the accuracy of each model by removing some lower-ranked attributes, yielding accuracies of 87.8%, 86.80% and 79.9% for the SMO, Naïve Bayes and C4.5 Decision Tree algorithms respectively.

Keywords—Cardiovascular disease; SMO (SVM - Support Vector Machine); Decision tree; Naïve Bayes; Receiver Operating Characteristic curve (ROC); WEKA

I. INTRODUCTION

Cardiovascular diseases (CVD) are among the largest health problems worldwide and the number one cause of death globally [1]. Despite improvements over the last few decades, CVD remains one of the biggest burdens on the world economy. In 2010, coronary heart disease was responsible for $108.9 billion in costs in the USA, including the cost of health care services, medications, and lost productivity [2]. By addressing behavioral risk factors such as tobacco use, unhealthy diet, physical inactivity and alcohol use, most CVDs can be prevented [1]. Early detection and prediction through close observation of risk factors is considered the best way to fight this deadly disease. Most importantly, predicting the physical condition of a patient (healthy or sick) after a certain period of time, through regular checkups, has become a real-world medical problem.

Recently, scientists and researchers have been using data mining extensively for knowledge discovery and for extracting hidden patterns from large datasets. This includes the use of sophisticated data manipulation tools to search for previously unknown, valid patterns and relationships in a large dataset. We applied three strong data mining classification algorithms, i.e. Support Vector Machine (SVM), Naïve Bayes and C4.5 Decision Tree, to a medium-sized dataset containing 14 attributes and 303 cardiovascular patient records. We classified the instances on the basis of the 'class' attribute, which can take only two nominal values, 'Sick' and 'Healthy'. The results show that the Naïve Bayes classifier has the highest prediction accuracy, 83.17%, while SVM and C4.5 achieve 82.84% and 73.93% respectively.

Compared to the accuracy rates reported by several previous works [3, 6, 9, 11, 12] based on similar classification models, our result was relatively poor, and we tried to figure out the reasons behind this. One important factor is that our data had a larger attribute set and a smaller instance set. We found that some attributes contributed little while misguiding the algorithms during classification. So, we applied a feature selection algorithm to determine each attribute's contribution towards classification. For Naïve Bayes, we found that carefully selecting the 8 best attributes gives a 3.63% improvement over its previous prediction. For SVM, using the 9 heaviest-weighted attributes, accuracy improved by 4.95%. Decision Tree prediction accuracy improved by 5.94% after carefully selecting the upper-ranked attributes.

All the related works that we reviewed, including background information on cardiovascular disease research, risk and prognosis factors, uses of ranking algorithms, several classification models and a comparison of their accuracies, are discussed through the rest of the paper. This is followed by the method section, which explains our proposed classification techniques. Finally, the results section is followed by a conclusion. We tried to consider the merits and demerits of the existing experiments and targeted the issues that previous solutions did not address. We tried to come up with an optimized feature selection method that not only improves the classifications but is also a key point to remember when working with such heavily featured datasets.

Section II discusses the related works. Section III discusses the methodology and dataset. Section IV describes the evaluation method. Section V shows the experimental results. Section VI presents the future scope and conclusion.

II. RELATED WORKS
Hlaudi Daniel Masethe et al. [3] collected their data set from medical practitioners in South Africa and applied the J48, Naïve Bayes, REPTREE, CART, and Bayes Net data mining algorithms for predicting heart attacks, reporting a prediction accuracy of 99%. Beant Kaur et al. [4] found that smoking, cholesterol, high blood pressure, obesity and lack of physical exercise are the major factors behind heart disease in patients with a family history of heart disease. Paulo J. L. Adeodato et al. [5] used two classifiers to support the cardiologist's decision in two different scenarios: scenario one assessed the heart health of the patient on the basis of consultation, while scenario two assessed the cardiac health of the patient after the results of echocardiograms. The areas under the ROC curve (AUC) were 0.91 for scenario one and 0.94 for scenario two, both close to the maximum of 1.00. B. Anuradha et al. [6] showed that fuzzy-logic-based cardiac arrhythmia detection is promising, that performance can be increased further by adding more related input variables and more data to train the model, and that it is suitable for the realization of custom-made cardiac implants like cardiac pacemakers or implantable cardiac defibrillators (ICD). The main achievement of this work was a significant reduction of misclassified data, which led to a very healthy accuracy rate of 93.13%.

To diagnose heart disease in diabetic patients, a Naïve Bayes based method was proposed by G. Parthiban et al. [7]. Using that model they achieved an accuracy of 74% on 500 patient records. A Heart Disease Prediction System (HDPS) prototype was designed by Shadab Adam Pattekari et al. [8] using the data mining modeling techniques Decision Trees, Naïve Bayes and Neural Network. They concluded that the Naïve Bayes classifier is more efficient in such classifications, because a small amount of training data is sufficient to estimate the parameters (means and variances of the variables) for classification. K. Thenmozhi et al. [9] compared the prediction accuracies of Decision Tree, Naïve Bayes and Neural Network with both 13 and 15 attributes, but did not propose an efficient feature selection method. Neural Network performed best, with 100% prediction accuracy using 15 attributes; after attribute reduction, only Naïve Bayes improved its accuracy (by 3.7%), while the accuracy of the other two models decreased by a significant amount. Elma Kolçe et al. [10] identified and evaluated the data mining algorithms most commonly reported as well-performing on medical databases, based on recent studies. Their experiments provided strong evidence that Naïve Bayes and ANN give maximum accuracy in heart disease prognosis, whereas ANN and Decision Tree are best suited for cancer prediction. Using genetic algorithms, Naser Safdarian et al. [11] formulated a fuzzy classifier whose accuracy for normal sinus rhythm (NSR) was 91.66%, while the overall performance accuracy of the classifier was 93.33%.

Mai Shouman et al. [12] introduced a new classification model called the 'Nine Decision Tree' model, which gives better accuracy (84.1%) than the J4.8 Decision Tree (78.9%) and the Bagging algorithm (81.41%). Tina R. Patil et al. [13] focused on data classification and the performance of the J48 Decision Tree and Naïve Bayes based on TP and FP rates. Their experimental work showed that the correctly classified instances generated by J48 are 203 and by Naïve Bayes 184, after training and testing equally. They also noted that J48 is more cost efficient than the Naïve Bayes classifier when the data set contains few attributes and instances.

Huan Liu and Lei Yu [14] introduced active feature selection, which promotes the idea of actively selecting instances for feature selection. S. Vanaja et al. [15] showed that each feature selection methodology has its own advantages and disadvantages, and that the inclusion of more attributes can cause a reduction in accuracy. A combination of a decision tree based method, bagging, and a backward elimination strategy was proposed by Dong-Sheng Cao et al. [16] to find structure-activity relationships in chemometrics, which is also relevant to the pharmaceutical industry. YongSeog Kim et al. [17] investigated the importance of feature selection in both supervised and unsupervised learning.
III. METHODOLOGY AND DATASET

In data mining, the knowledge discovery process involves a collection of stages. Data is collected from various sources, preprocessed and converted into a suitable format. Data mining techniques are then applied to the processed data for classification and information extraction. Finally, the extracted information is analyzed for knowledge discovery. The system architecture used in this paper is shown in Figure 1.

Figure 1: System Architecture

Here, after preprocessing, three classification algorithms were applied to the data set and compared for the best performance.

A. Experimental Methods

In this paper, different classification techniques have been implemented on the collected data. These classifiers are mainly learning methods that adopt sets of rules. Three classifiers were chosen based on the attributes in the data set: Naïve Bayes, Support Vector Machine and Decision Tree.

Support Vector Machine is a machine learning tool based on statistical learning theory. It uses a nonlinear mapping to transform the data into a higher dimension [18]. The basic concept of SVM is decision planes, which define decision boundaries: a decision plane is a discrete hyperplane created in the descriptor space of the training data, and instances are classified based on the side of the hyperplane on which they are located. The SVM takes input data and predicts, for each input, which of two possible classes it belongs to, which makes SVM a non-probabilistic binary linear classifier. Training an SVM is extremely slow, but SVMs are extremely accurate in making predictions [18].
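The paper reports results obtained with Weka, and the same SMO model can be trained programmatically against Weka's Java API. A minimal sketch, assuming the ARFF file from [20] has been saved locally as cardiology.arff (the path and the use of the first instance are illustrative assumptions):

```java
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class SvmSketch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff"); // assumed local path
        data.setClassIndex(data.numAttributes() - 1);        // 'class' is the last attribute
        SMO svm = new SMO();            // sequential minimal optimization trainer
        svm.buildClassifier(data);      // learns the separating hyperplane
        double label = svm.classifyInstance(data.instance(0));
        System.out.println(data.classAttribute().value((int) label)); // 'Sick' or 'Healthy'
    }
}
```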
Naïve Bayes is a simple probabilistic machine learning classifier based on Bayes' theorem [19, 13], in which the variables are assumed to be independent of each other. It is not difficult to build and performs well on large datasets. Using a small amount of training data, it identifies the parameters that are important for classification. It predicts class membership probabilities, i.e. it finds the probability that a tuple belongs to a particular class.
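Concretely, for a record with attribute values x_1, ..., x_13, the classifier picks the class with the largest posterior score under the independence assumption (a standard statement of Bayes' rule, stated here for clarity rather than quoted from the paper):

\[
\hat{C} \;=\; \arg\max_{C \in \{\text{Sick},\,\text{Healthy}\}} \; P(C)\,\prod_{i=1}^{13} P(x_i \mid C)
\]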
As a machine learning tool, the Decision Tree is commonly used for classification [13]. Unlike the other models, it does not assume that the attributes are independent. It is a hierarchical breakdown of the data, with two attributes at each level of the hierarchy. It contains an inducer and a visualizer: the inducer identifies the most important attributes for classification, while the visualizer represents the resulting model graphically and summarizes the dataset with a table containing all of its attributes. By eliminating unnecessary attributes, the algorithm creates a smaller, condensed decision table.
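To make the inducer/visualizer distinction concrete, the illustrative sketch below builds Weka's J48 inducer and prints both the textual tree and the dot-format graph that Weka's tree visualizer renders (file path assumed as before):

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TreeSketch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff"); // assumed path
        data.setClassIndex(data.numAttributes() - 1);
        J48 tree = new J48();             // the inducer: learns the tree
        tree.buildClassifier(data);
        System.out.println(tree);         // textual breakdown of the hierarchy
        System.out.println(tree.graph()); // dot graph for the visualizer
    }
}
```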
B. Dataset and Attributes

The data used in this study were collected from the Department of Computing of Goldsmiths, University of London [20]. The dataset contains 303 cardiovascular patient records, one class attribute named 'class' with two possible values (Sick and Healthy), and 13 other attributes. There were no missing values in the dataset, and the class distribution is 138 Sick and 165 Healthy. Each record represents follow-up data for one cardiovascular case. Among the 14 attributes, 6 are numeric: 'age', 'blood pressure', 'cholesterol', 'maximum heart rate', 'peak' and '#colored vessels'. There are 6 nominal attributes: 'sex' (Male, Female), 'chest pain type' (Asymptomatic, Abnormal Angina, Angina, No Tang), 'resting ecg' (Hyp, Normal, Abnormal), 'slope' (Flat, Up, Down), 'thal' (Rev, Normal, Fix) and 'class' (Sick, Healthy). 'Fasting blood sugar <120' and 'angina' are two Boolean attributes. This summarizes all 14 attributes of our dataset.
IV. EVALUATION METHOD

In this paper we compare three data mining techniques, Naïve Bayes, Support Vector Machine and C4.5 Decision Tree, for predicting whether or not a patient is in danger of cardiovascular disease. We used the Weka machine learning tool [21] for all preprocessing and classification, since most machine learning techniques are already implemented there; version 3.6.9 was used throughout. To maintain a fair measure of classifier performance, we used 10-fold cross-validation for all three algorithms. In k-fold cross-validation, the original data set is partitioned into k equal-sized subsamples; a single subsample is used as the validation data for testing the model, and the remaining k-1 subsamples are used as training data. In addition, 66% of the dataset was used as training data and the remaining 34% as test data.

To improve the accuracy of these models, we used the Ranker algorithm for best feature selection and for the removal of redundant and irrelevant attributes. InfoGainAttributeEval was selected as the attribute evaluator for Naïve Bayes and C4.5 Decision Tree; it evaluates the worth of an attribute by measuring the information gain with respect to the class. For SVM, we chose SVMAttributeEval as the attribute evaluator, in which an SVM classifier evaluates the worth of each attribute. Attribute selection for multiclass problems is handled by ranking attributes for each class, separated using a one-vs.-all method, and then "dealing" from the top of each pile to give a final ranking [22].

Figure 2 shows the average rank of all 13 features (excluding 'class', which was set as the outcome attribute). This is a clear indication that heavily ranked attributes like 'thal', 'chest pain type', '#colored vessels' and 'angina' will contribute significantly to classification and will also control the accuracy of the model. On the other hand, features like 'blood pressure' and 'resting ecg' can be considered lightweight attributes that contribute very little to building these prediction models.

Figure 2: Average Rank of Attributes
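The ranking step described above can also be run programmatically. A minimal sketch using Weka's attribute selection API, where the file path and the choice to keep 8 attributes are illustrative assumptions (the paper tunes the number per classifier):

```java
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RankFeatures {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff");
        data.setClassIndex(data.numAttributes() - 1);

        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new InfoGainAttributeEval()); // information gain w.r.t. the class
        Ranker ranker = new Ranker();
        ranker.setNumToSelect(8);        // keep the 8 highest-ranked attributes (illustrative)
        selector.setSearch(ranker);
        selector.SelectAttributes(data); // note Weka's capitalized method name

        // Indices of the retained attributes (the class attribute is included).
        for (int idx : selector.selectedAttributes()) {
            System.out.println(data.attribute(idx).name());
        }
    }
}
```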
V. EXPERIMENTAL RESULT

First, we measured the accuracy of all three classification algorithms without applying any ranker. The best result was recorded for Naïve Bayes, which provided 83.17% accuracy; J48 (which generates a pruned or unpruned C4.5 decision tree) and SMO (John Platt's sequential minimal optimization algorithm for training a support vector classifier) gave 73.93% and 82.84% respectively. We believed that through proper attribute selection and the removal of redundant attributes, this result could be improved by a fair margin. So, after ranking all attributes, we selected the 'N best attributes' for all three classifiers. Our target was to find the value of N for which the N best features predict with maximum accuracy; N can be any integer from 1 to 13, since our dataset has 13 classifying features (all attributes except 'class'). The attributes were always taken in the order in which they appeared in the ranker list, since trying other combinations did not yield higher accuracy.
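These baseline numbers can be reproduced in outline with Weka's Evaluation class. A hedged sketch; the random seed is an assumption, so fold assignments (and thus exact accuracies) may differ slightly:

```java
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.SMO;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class BaselineCv {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff");
        data.setClassIndex(data.numAttributes() - 1);
        Classifier[] models = { new SMO(), new J48(), new NaiveBayes() };
        for (Classifier model : models) {
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(model, data, 10, new Random(1)); // 10-fold CV
            System.out.printf("%s: %.2f%% correct%n",
                    model.getClass().getSimpleName(), eval.pctCorrect());
        }
    }
}
```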
After proper feature selection, surprisingly, it was not Naïve Bayes but SMO that gave the maximum accuracy. For SMO, we carefully selected the best 9 features from our dataset, removed the remaining 5 less important attributes, and performed the SVM classification again. The new accuracy (87.79%) was an improvement of 4.95% over the previous result. Naïve Bayes and C4.5 Decision Tree gave improved maximum accuracies of 86.80% and 79.87% respectively, improvements of 3.63% and 5.94% compared to their previous results without any feature selection. We recorded this result for C4.5 Decision Tree using the 5 best attributes, while for Naïve Bayes it was achieved using the best 8 attributes. Table 1 summarizes our experimental results for this section; the 'With Ranker' columns report the results found using the N best attribute selection technique.
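The 'N best attributes' search itself reduces to a small loop. The sketch below shows it for Naïve Bayes with InfoGainAttributeEval (per the paper, SVM would use SVMAttributeEval instead); the seed and file path are assumed as before:

```java
import java.util.Random;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.attribute.AttributeSelection;

public class BestNSearch {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff");
        data.setClassIndex(data.numAttributes() - 1);
        double bestAcc = 0;
        int bestN = 0;
        for (int n = 1; n < data.numAttributes(); n++) { // n = 1..13
            AttributeSelection filter = new AttributeSelection(); // filter wrapper
            filter.setEvaluator(new InfoGainAttributeEval());
            Ranker ranker = new Ranker();
            ranker.setNumToSelect(n);       // keep the top-n ranked attributes
            filter.setSearch(ranker);
            filter.setInputFormat(data);
            Instances reduced = Filter.useFilter(data, filter); // top-n attrs + class
            reduced.setClassIndex(reduced.numAttributes() - 1);
            Evaluation eval = new Evaluation(reduced);
            eval.crossValidateModel(new NaiveBayes(), reduced, 10, new Random(1));
            if (eval.pctCorrect() > bestAcc) { bestAcc = eval.pctCorrect(); bestN = n; }
        }
        System.out.printf("best N = %d, accuracy = %.2f%%%n", bestN, bestAcc);
    }
}
```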
Table 1: Performance of the Classifier

                                      SMO                C4.5 Decision Tree       Naïve Bayes
Evaluation Criteria            Without    With         Without    With         Without    With
                               Ranker     Ranker       Ranker     Ranker       Ranker     Ranker
Time to build model (sec)      0.095      0.32         0.078      0.19         0.02       0.01
Correctly classified instances 251        266          224        242          252        263
Incorrectly classified inst.   52         37           79         61           51         40
Accuracy (%)                   82.84      87.8         73.93      79.9         83.2       86.80

Two different classifiers provided the maximum accuracy in two different situations. Without feature selection, Naïve Bayes provided the maximum accuracy, but after careful attribute selection, SVM showed better performance. Overall, much more accurate prediction was achieved by every classifier after applying the ranking algorithm. In Figure 3 we show that every evaluation criterion achieves a certain level of improvement when appropriate feature selection is performed. This is strong evidence that a medium-sized dataset with a large number of attributes can easily be misguided by additional features that contribute very little to classification.

Figure 3: Comparison between classifiers, showing different performance measurement criteria

We found that the error rate is reduced significantly for almost every evaluation criterion after applying the ranker algorithm. For SMO and the J48 Decision Tree, all five measures showed consistent improvement after applying the Ranker; for Naïve Bayes, all error rates (MAE [25], RMSE [25], RAE [26], RRSE) improved except the Kappa statistic [24]. Table 2 summarizes our experimental results for this section.

Table 2: Training and Simulation Error

                                      SMO                C4.5 Decision Tree       Naïve Bayes
Evaluation Criteria            Without    With         Without    With         Without    With
                               Ranker     Ranker       Ranker     Ranker       Ranker     Ranker
Kappa statistic (KS)           0.652      0.6914       0.4734     0.5939       0.6597     0.6391
Mean absolute error (MAE)      0.1716     0.1518       0.2956     0.2581       0.1839     0.1765
Root mean squared error (RMSE) 0.4143     0.3896       0.4727     0.3824       0.3628     0.3519
Relative absolute error (RAE)  34.59%     30.60%       59.58%     52.03%       37.06%     35.58%
Root relative squared error    83.18%     78.23%       94.90%     76.78%       72.85%     70.65%
(RRSE)

Figure 4 shows the statistical comparison among the three classification algorithms. It again shows the improved error rates gained using the 'N best attributes' selection technique. On every evaluation criterion, SMO shows by far the most improved (reduced) error rate after applying the ranker algorithm.

Figure 4: Comparison between Parameters
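For reference, the measures in Table 2 have standard definitions (standard formulas, stated here for the reader's convenience rather than quoted from the paper). With predictions \(p_i\), actual values \(a_i\), their mean \(\bar{a}\), observed agreement \(p_o\) and chance agreement \(p_e\):

\[
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert p_i - a_i\rvert,\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_i - a_i)^2},\qquad
\kappa = \frac{p_o - p_e}{1 - p_e},
\]
\[
\mathrm{RAE} = \frac{\sum_{i=1}^{n}\lvert p_i - a_i\rvert}{\sum_{i=1}^{n}\lvert a_i - \bar{a}\rvert},\qquad
\mathrm{RRSE} = \sqrt{\frac{\sum_{i=1}^{n}(p_i - a_i)^2}{\sum_{i=1}^{n}(a_i - \bar{a})^2}}.
\]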

Our final evaluation criterion was to find the true positive and false positive rates and to generate a Receiver Operating Characteristic (ROC) curve. A ROC curve [23] graphically represents the tradeoff between false negatives and false positives, and the area under the curve measures the accuracy of the classifier. With proper feature selection, the area under the ROC curve showed noticeable improvement after the reduction of features with low contributions.
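The rates reported below correspond to quantities Weka's Evaluation object exposes directly. A sketch for SMO, where class index 0 is assumed to be 'Sick' and the buildLogisticModels option is our addition so that SMO produces the probability estimates a ROC curve needs:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RocStats {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("cardiology.arff");
        data.setClassIndex(data.numAttributes() - 1);
        SMO svm = new SMO();
        svm.setBuildLogisticModels(true); // calibrated probabilities for the ROC curve
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(svm, data, 10, new Random(1));
        int sick = 0; // assumed index of the 'Sick' class value
        System.out.printf("TP rate: %.3f  FP rate: %.3f  AUC: %.3f%n",
                eval.truePositiveRate(sick),
                eval.falsePositiveRate(sick),
                eval.areaUnderROC(sick));
    }
}
```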
Table 3 summarizes our experimental results for this section.

Table 3: Comparison of Accuracy measure

                                      SMO                C4.5 Decision Tree       Naïve Bayes
Evaluation Criteria            Without    With         Without    With         Without    With
                               Ranker     Ranker       Ranker     Ranker       Ranker     Ranker
True Positive Rate             0.775      0.783        0.703      0.775        0.832      0.842
False Positive Rate            0.127      0.097        0.23       0.182        0.174      0.166
Area Under ROC Curve           0.824      0.843        0.738      0.849        0.906      0.909

The improvement of the ROC curves shows the importance of good feature selection for such datasets. Figures 5, 6 and 7 summarize our results for this section.

Figure 5: ROC using C4.5 Decision Tree without ranker (left), with ranker (right)

Figure 6: ROC using Naïve Bayes without ranker (left), with ranker (right)

Figure 7: ROC using SMO without ranker (left), with ranker (right)

VI. CONCLUSION

In this paper we focused on the importance of feature selection in cardiovascular disease prognosis using different data mining algorithms. With a proper attribute selection technique, any classification algorithm can be improved significantly. Attributes with little contribution to the dataset often mislead the classification model and result in poor prediction accuracy. In our work, we found that Naïve Bayes gave the best result before attribute selection, but after a controlled and careful feature selection, SVM turned out to be the best classifier. Area under the ROC curve analysis also showed results in our favor, with all three classifiers showing much better performance after the feature selection method was applied. In addition to this work, we will try to evaluate some newer algorithms with better feature selection techniques.
REFERENCES

[1] http://www.who.int/mediacentre/factsheets/fs317/en
[2] https://www.cardiosmart.org/Heart-Basics/CVD-Stats
[3] Masethe, Hlaudi Daniel, and Mosima Anna Masethe. "Prediction of heart disease using classification algorithms." Proceedings of the World Congress on Engineering and Computer Science. Vol. 2. 2014.
[4] Kaur, Beant, and Williamjeet Singh. "Review on heart disease prediction system using data mining techniques." International Journal on Recent and Innovation Trends in Computing and Communication, ISSN 2321-8169 (2014).
[5] Adeodato, Paulo J. L., Tarcisio B. Gurgel, and Sandra Mattos. "A Decision Support System Based on Data Mining for Pediatric Cardiology Diagnosis." DMIN. 2009.
[6] Anuradha, B., Veera Reddy, and C. Veera. "Cardiac arrhythmia classification using fuzzy classifiers." Journal of Theoretical and Applied Information Technology 4.4 (2008): 353-359.
[7] Parthiban, G., A. Rajesh, and S. K. Srivatsa. "Diagnosis of heart disease for diabetic patients using naive bayes method." International Journal of Computer Applications 24.3 (2011): 7-11.
[8] Pattekari, Shadab Adam, and Asma Parveen. "Prediction system for heart disease using Naïve Bayes." International Journal of Advanced Computer and Mathematical Sciences 3.3 (2012): 290-294.
[9] Thenmozhi, K., and P. Deepika. "Heart disease prediction using classification with different decision tree techniques." Int. J. Eng. Res. Gen. Sci. 2.6 (2014).
[10] Kolçe, Elma, and Neki Frasheri. "A literature review of data mining techniques used in healthcare databases." ICT Innovations (2012).
[11] Safdarian, Naser, Keivan Maghooli, and Nader Jafarnia Dabanloo. "Classification of Cardiac Arrhythmias with TSK Fuzzy System using Genetic Algorithm." International Journal of Signal Processing, Image Processing and Pattern Recognition 5.2 (2012): 89-100.
[12] Shouman, Mai, Tim Turner, and Rob Stocker. "Using decision tree for diagnosing heart disease patients." Proceedings of the Ninth Australasian Data Mining Conference, Volume 121. Australian Computer Society, Inc., 2011.
[13] Patil, Tina R., and S. S. Sherekar. "Performance analysis of Naive Bayes and J48 classification algorithm for data classification." International Journal of Computer Science and Applications 6.2 (2013): 256-261.
[14] Liu, Huan, and Lei Yu. "Toward integrating feature selection algorithms for classification and clustering." IEEE Transactions on Knowledge and Data Engineering 17.4 (2005): 491-502.
[15] Vanaja, S., and K. Ramesh Kumar. "Analysis of feature selection algorithms on classification: a survey." International Journal of Computer Applications 96.17 (2014).
[16] Cao, Dong-Sheng, et al. "Automatic feature subset selection for decision tree-based ensemble methods in the prediction of bioactivity." Chemometrics and Intelligent Laboratory Systems 103.2 (2010): 129-136.
[17] Kim, YongSeog, W. Nick Street, and Filippo Menczer. "Feature selection in data mining." Data Mining: Opportunities and Challenges 3.9 (2003): 80-105.
[18] Abdelaal, Medhat Mohamed Ahmed, et al. "Using data mining for assessing diagnosis of breast cancer." Proceedings of the 2010 International Multiconference on Computer Science and Information Technology (IMCSIT). IEEE, 2010.
[19] Kaur, Gurneet, and Er Neelam Oberai. "A review article on naive Bayes classifier with various smoothing techniques." (2014).
[20] http://www.doc.gold.ac.uk/~mas01ds/cis338/cardiology.arff
[21] Weka 3.5.6, an open source data mining software tool developed at the University of Waikato, New Zealand, http://www.cs.waikato.ac.nz/ml/weka/ 2009.
[22] http://weka.sourceforge.net/doc.stable/weka/attributeSelection/SVMAttributeEval.html
[23] Fawcett, Tom. "ROC graphs: Notes and practical considerations for researchers." Machine Learning 31.1 (2004): 1-38.
[24] Viera, Anthony J., and Joanne M. Garrett. "Understanding interobserver agreement: the kappa statistic." Family Medicine 37.5 (2005): 360-363.
[25] Willmott, Cort J., and Kenji Matsuura. "Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance." Climate Research 30.1 (2005): 79-82.
[26] Armstrong, J. Scott, and Fred Collopy. "Error measures for generalizing about forecasting methods: Empirical comparisons." International Journal of Forecasting 8.1 (1992): 69-80.
