Fetal Health Classification Using Supervised Learning Approach
2021 IEEE National Biomedical Engineering Conference (NBEC)
978-1-6654-3607-6/21/$31.00 ©2021 IEEE

Abstract— Fetal health monitoring is important to reduce or minimize the mortality of both mother and child. This paper presents a study on a dataset of 2126 records of features extracted from cardiotocography exams, with 21 attributes: baseline value, accelerations, fetal movement, uterine contractions, light, severe and prolonged decelerations, abnormal short-term variability, the mean value of short-term variability, percentage of time with abnormal long-term variability, the mean value of long-term variability, and histogram width, min, max, number of peaks, number of zeroes, mode, mean, median, variance, and tendency. This paper uses Supervised Machine Learning to classify the data set and to compare K-NN, Linear SVM, Naïve Bayes, Decision Tree (J48), Ada Boost, Bagging and Stacking. Lastly, Bayesian networks are developed and compared with the other classifiers. Of all the classifiers compared, Ada Boost with sub-model Random Forest has the highest accuracy, 94.7%, with k=10.

Keywords—Data Mining, Supervised Machine Learning, Fetal Health Classification

I. INTRODUCTION
Fetal health monitoring is important to reduce or minimize the mortality of both mother and child. According to the United Nations Sustainable Development Solutions Network (SDSN) indicator report, by 2030 all countries aim to reduce neonatal mortality to at least as low as 12 per 1,000 live births and under-5 mortality to at least as low as 25 per 1,000 live births [1]. One way to work towards this is to use a Cardiotocograph (CTG) for fetal health monitoring in hospitals or clinics. A Cardiotocograph is equipment that can be used before or during birth to measure fetal heartbeat, movement, contractions and more.

To make a diagnosis, artificial intelligence (AI) employs mathematical algorithms and data from the human body [2]. These models have been used to increase the accuracy of cancer recurrence and mortality prediction [3], cardiovascular risk prediction [4], and the diagnostic accuracy of radiological examinations such as computed tomography scans and magnetic resonance imaging [5]. Medical and engineering experts have been collaborating to automate CTG interpretation, reducing inconsistencies in outcome classification [6]. The present algorithms are very good at predicting the sick status of the foetus, but they are not very good at predicting suspicious states [7,8].

The goal of this study was to create a machine learning model that could accurately identify fetal health from the CTG readings as normal, suspect or pathological. The machine learning approach used is Supervised Learning.

A. Dataset Description
The dataset was obtained from the University of California Irvine Machine Learning Repository [9] and contains 2126 records of features extracted from Cardiotocography exams, with 21 attributes: baseline value, accelerations, fetal movement, uterine contractions, light, severe and prolonged decelerations, abnormal short-term variability, the mean value of short-term variability, percentage of time with abnormal long-term variability, mean value of long-term variability, and histogram width, min, max, number of peaks, number of zeroes, mode, mean, median, variance, and tendency [10]. Each record was then classified into 3 classes: 1 - Normal, 2 - Suspect, and 3 - Pathological.

B. Summary Statistic
The summary statistics of the 21 attributes in Table I provide simple descriptive statistics of the data. For this data set, the percentage of missing data is 0%, which means no data need to be removed or interpolated; missing data may affect the whole result. The distinct value represents the granularity of the distribution of the attribute, while the mean and standard deviation give an idea of how the attribute values are spread. All these values differ from attribute to attribute.

TABLE I. SUMMARY STATISTIC OF THE 21 ATTRIBUTES.

Attribute                                                  Mean      Standard Deviation   Minimum   Maximum
Baseline_Value                                             133.304   9.841                106       160
Accelerations                                              0.003     0.004                0         0.019
Fetal_Movement                                             0.009     0.047                0         0.481
Uterine_Contractions                                       0.004     0.003                0         0.015
Light_Decelerations                                        0.002     0.003                0         0.015
Severe_Decelerations                                       0.000     0.000                0         0.001
Prolongued_Decelerations                                   0.000     0.001                0         0.005
Abnormal_Short_Term_Variability                            46.990    17.193               12        87
Mean_Value_of_Short_Term_Variability                       1.333     0.883                0.2       7
Percentage_of_Time_With_Abnormal_Long_Term_Variability     9.847     18.397               0         91
Mean_Value_of_Long_Term_Variability                        8.188     5.628                0         50.7
Histogram_Width                                            70.446    38.956               3         180
Histogram_Min                                              93.579    29.560               50        159
Histogram_Max                                              164.025   17.944               122       238
Histogram_Number_of_Peaks                                  4.068     2.949                0         18
Histogram_Number_Of_Zeroes                                 0.324     0.706                0         10
Histogram_Mode                                             137.452   16.381               60        187
Histogram_Mean                                             134.611   15.594               73        182
Histogram_Median                                           138.090   14.467               77        186
Histogram_Variance                                         18.808    28.978               0         269
Histogram_Tendency                                         0.320     0.611                -1        1

II. SUPERVISED LEARNING TECHNIQUE
Supervised Learning Techniques, compared to Unsupervised Learning Techniques, use instances that are given with known labels [11]. This labelled data is then used with classification algorithms to accurately predict the result. In this study, Weka is used for the data mining task; this software is suitable for new users with no coding background. It contains tools for supervised and unsupervised learning and visualization [12].

A. Classifiers
For the supervised learning approach, 7 types of classifiers, namely K-Nearest Neighbour (KNN), Linear SVM (Support Vector Machine), Naïve Bayes, Decision Tree (J48), Ada Boost, Bagging and Stacking, were used on the data set in Weka to develop the confusion matrix for each classifier and to determine which technique has the highest accuracy.
• K-Nearest Neighbour
The K-NN method uses the idea of distance and proximity between points in the feature space. The best-known distance metric used to find the nearest neighbour is the Euclidean distance.
• Stacking
Stacking combines the predictions from different models to produce another model. It is then used to predict the test set [18].
• Bayes Network
This method visualizes the probabilistic model for a domain, reviews all of the relationships between the random variables, and reasons about causal probabilities for scenarios given the available evidence [19].

B. Methods
For this paper, all classifiers are evaluated with k = 10-, 5- and 3-fold cross-validation. The sizes of the training set and of the resampling subsets depend on the value of k [20].

The accuracy can be determined by the formula below [21].

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{2}$$

The higher the accuracy, the better the classifier fits the data set.

[Equation (3) is not recoverable from the source.]
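The accuracy formula, and the related precision, recall and F-measure reported in the result tables, can be sketched from the four confusion counts as follows. This is a minimal illustration, not the Weka implementation: the function names and the example counts are hypothetical, and returning `None` for an undefined ratio is an assumption made here to mirror a metric that cannot be calculated.

```python
from typing import Optional


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct: (TP + TN) / total."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


def precision(tp: int, fp: int) -> Optional[float]:
    """TP / (TP + FP); undefined (None) if the class is never predicted."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else None


def recall(tp: int, fn: int) -> Optional[float]:
    """TP / (TP + FN); undefined (None) if the class never occurs."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else None


def f_measure(p: Optional[float], r: Optional[float]) -> Optional[float]:
    """Harmonic mean of precision and recall; None if either is undefined."""
    if p is None or r is None or p + r == 0:
        return None
    return 2 * p * r / (p + r)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    tp, tn, fp, fn = 40, 45, 5, 10
    p, r = precision(tp, fp), recall(tp, fn)
    print("accuracy :", accuracy(tp, tn, fp, fn))
    print("precision:", p)
    print("recall   :", r)
    print("F-measure:", f_measure(p, r))
    # A classifier that never predicts the positive class has no precision.
    print("never predicted:", precision(0, 0))
```

A classifier that never predicts a given class has TP + FP = 0 for that class, so its precision and F-measure have no value.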
TN (True Negative) - a negative outcome is predicted and the actual outcome is negative.

FN (False Negative) - a negative outcome is predicted but the actual outcome is the opposite (positive). Known as a Type 2 error.

III. RESULT AND DISCUSSION

A. Classifiers Result and Accuracy
Tables II, III, IV, and V and Fig. 1 show the results and accuracy of each classifier used in this study. Based on the classifier accuracy with 10-, 5- and 3-fold cross-validation, Ada Boost with sub-model Random Forest has the highest accuracy, obtained with 10-fold cross-validation. Among the classifiers with complete results, Linear SVM has the lowest accuracy with 10-fold cross-validation. For Ada Boost with sub-model Decision Stump and Stacking with sub-model ZeroR, the result is negligible: both classifiers failed to produce a value for Precision and F-Measure (shown as "?" in the tables), which typically happens when a classifier never predicts one of the classes.

TABLE II. CLASSIFIER RESULTS WHEN K = 5

Classifier                      TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
K-Nearest Neighbour             0.913     0.157     0.911       0.913    0.912       0.886
Linear SVM                      0.794     0.710     0.798       0.794    0.720       0.542
Naïve Bayes                     0.822     0.076     0.876       0.822    0.838       0.934
J48                             0.930     0.128     0.929       0.930    0.930       0.915
Ada Boost + Decision Stump      0.778     0.778     ?           0.778    ?           0.873
Ada Boost + Random Forest       0.943     0.130     0.941       0.943    0.941       0.963
Bagging                         0.940     0.145     0.939       0.940    0.938       0.977
Stacking + ZeroR                0.778     0.778     ?           0.778    ?           0.499
Stacking + ZeroR NNGE MODLEM    0.933     0.122     0.931       0.933    0.932       0.918

CLASSIFIER RESULTS WHEN K = 3

Classifier                      TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
K-Nearest Neighbour             0.907     0.181     0.903       0.907    0.904       0.868
Linear SVM                      0.792     0.718     0.790       0.792    0.715       0.537
Naïve Bayes                     0.817     0.082     0.872       0.817    0.834       0.934
J48                             0.915     0.160     0.914       0.915    0.915       0.901
Ada Boost + Decision Stump      0.788     0.638     ?           0.788    ?           0.864
Ada Boost + Random Forest       0.935     0.133     0.933       0.935    0.933       0.970
Bagging                         0.926     0.171     0.924       0.926    0.924       0.969
Stacking + ZeroR                0.778     0.778     ?           0.778    ?           0.499
Stacking + ZeroR NNGE MODLEM    0.915     0.172     0.911       0.915    0.912       0.869

CLASSIFIER RESULTS WHEN K = 10

Classifier                      TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
K-Nearest Neighbour             0.920     0.140     0.918       0.920    0.918       0.894
Linear SVM                      0.798     0.696     0.806       0.798    0.728       0.551
Naïve Bayes                     0.821     0.077     0.876       0.821    0.837       0.935
J48                             0.929     0.124     0.929       0.929    0.929       0.918
Ada Boost + Decision Stump      0.778     0.778     ?           0.778    ?           0.878
Ada Boost + Random Forest       0.947     0.119     0.946       0.947    0.946       0.964
Bagging                         0.941     0.138     0.940       0.941    0.940       0.976
Stacking + ZeroR                0.778     0.778     ?           0.778    ?           0.497
Stacking + ZeroR NNGE MODLEM    0.932     0.127     0.930       0.932    0.930       0.920

3-, 5- and 10-fold cross-validation were chosen for the Supervised Learning models for the following reason. With 10-fold cross-validation, each run trains on 90% and tests on 10% of the data, and the calculation is repeated 10 times. With 5-fold cross-validation, each run trains on 80% and tests on 20% of the data, and the calculation is repeated 5 times. With 3-fold cross-validation, each run trains on 67% and tests on 33% of the data, and the calculation is repeated 3 times.
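The train/test proportions quoted for 3-, 5- and 10-fold cross-validation can be checked with a short sketch that splits the 2126 records into k nearly equal folds. It is illustrative only: the helper names are hypothetical, and any shuffling or class stratification that Weka applies when building the folds is not modelled here.

```python
from typing import List, Tuple

N_RECORDS = 2126  # number of records in the data set


def fold_sizes(n_records: int, k: int) -> List[int]:
    """Sizes of k folds that differ by at most one record."""
    base, extra = divmod(n_records, k)
    return [base + 1 if i < extra else base for i in range(k)]


def train_test_percent(n_records: int, k: int) -> Tuple[int, int]:
    """Rounded train/test percentages when the largest fold is held out."""
    test = max(fold_sizes(n_records, k))
    train = n_records - test
    return round(100 * train / n_records), round(100 * test / n_records)


if __name__ == "__main__":
    for k in (3, 5, 10):
        sizes = fold_sizes(N_RECORDS, k)
        train_pct, test_pct = train_test_percent(N_RECORDS, k)
        print(f"k={k}: fold sizes {sizes}, "
              f"train about {train_pct}%, test about {test_pct}%")
```

Each fold serves as the test set exactly once, so every record is used for testing once and for training k - 1 times.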
TABLE IV. CLASSIFIER ACCURACY WHEN K=3, 5, 10

Classifier                      K=3      K=5      K=10
K-Nearest Neighbour             90.69%   91.35%   91.96%
Linear SVM                      79.16%   79.40%   79.77%
Naïve Bayes                     81.75%   82.22%   82.08%
J48                             91.53%   92.99%   92.90%
Ada Boost + Decision Stump      78.79%   77.85%   77.85%
Ada Boost + Random Forest       93.46%   94.26%   94.73%
Bagging                         92.62%   94.03%   94.12%
Stacking + ZeroR                77.85%   77.85%   77.85%
Stacking + ZeroR NNGE MODLEM    91.49%   93.27%   93.18%

Histogram_Min, Histogram_Number_of_Peaks and Histogram_Number_Of_Zeroes are removed automatically by Weka while the other 14 attributes remain. A negative coefficient has a negative impact on fetal health and vice versa. Severe_Decelerations and Prolongued_Decelerations have the most impact on fetal health, as the linear regression model for these two attributes resulted in a positive coefficient.

C. Bayes Network
The Bayes network was built in Weka with the classifier BayesNet, the estimator SimpleEstimator and the searchAlgorithm K2. The resulting Directed Acyclic Graph (DAG) is shown in Fig. 2. The meaning of each node is listed in Table VII. The classifier results and accuracy are shown in Tables VIII, IX and X.
TABLE VII. ACCURACY BY CLASS WHEN K=10

Classifier   k=10
BayesNet     86.5475%

TABLE VIII. DETAILED ACCURACY BY CLASS WHEN K=10

Class           TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
Suspect         0.783     0.103     0.550       0.783    0.646       0.923
Normal          0.891     0.138     0.958       0.891    0.923       0.953
Pathological    0.761     0.016     0.807       0.761    0.784       0.967
Weighted Avg.   0.865     0.123     0.889       0.865    0.873       0.950

TABLE IX. CONFUSION MATRIX WHEN K=10

a     b      c     <-- classified as
231   59     5     a = Suspect
153   1475   27    b = Normal
36    6      134   c = Pathological

Experimenting with maxNrOfParents showed that a value of 4 gave the best results; the constructed DAG is shown in Fig. 3. The classifier results and accuracy are shown in Tables XI, XII and XIII.

TABLE XI. DETAILED ACCURACY BY CLASS WHEN K=10

Class           TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
Suspect         0.712     0.040     0.739       0.712    0.725       0.951
Normal          0.957     0.180     0.949       0.957    0.953       0.972
Pathological    0.864     0.011     0.879       0.864    0.871       0.994
Weighted Avg.   0.915     0.147     0.914       0.915    0.915       0.971

TABLE XII. CONFUSION MATRIX WHEN K=10

a     b      c     <-- classified as
210   76     9     a = Suspect
59    1584   12    b = Normal
15    9      152   c = Pathological

In the Weka Directed Acyclic Graph (DAG), the probability table of each node is displayed by clicking that node in the graph. The probability for Fetal Health is 0.139, 0.778 and 0.083 for Suspect, Normal and Pathological respectively.

In the constructed Directed Acyclic Graph (DAG), the nodes and links (edges) do not form a directed cycle, which means the graph is a proper DAG. Compared with the other classifiers when k=10, the BayesNet classifier gave the 6th highest accuracy.

Of all the classifiers compared, Ada Boost with sub-model Random Forest has the highest accuracy, 94.7%, with k=10, while Linear SVM has the lowest accuracy when k=10.
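The reported BayesNet accuracies and the Fetal Health class probabilities can be recomputed from the two confusion matrices (Tables IX and XII). The sketch below is a hedged illustration: it assumes Weka's usual layout with actual classes in rows and predicted classes in columns, the function names are hypothetical, and the class probabilities are raw frequencies, whereas Weka's estimator may apply smoothing before rounding.

```python
from typing import List

LABELS = ("Suspect", "Normal", "Pathological")

# Rows are actual classes, columns are predicted classes (assumed layout).
TABLE_IX = [[231, 59, 5], [153, 1475, 27], [36, 6, 134]]
TABLE_XII = [[210, 76, 9], [59, 1584, 12], [15, 9, 152]]

Matrix = List[List[int]]


def total(matrix: Matrix) -> int:
    return sum(sum(row) for row in matrix)


def overall_accuracy(matrix: Matrix) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total(matrix)


def class_frequencies(matrix: Matrix) -> List[float]:
    """Share of records whose actual class is each label (row sums)."""
    n = total(matrix)
    return [sum(row) / n for row in matrix]


def tp_rate(matrix: Matrix, i: int) -> float:
    """Recall of class i: correct predictions over actual members."""
    return matrix[i][i] / sum(matrix[i])


def precision(matrix: Matrix, i: int) -> float:
    """Correct predictions of class i over all predictions of class i."""
    return matrix[i][i] / sum(row[i] for row in matrix)


if __name__ == "__main__":
    for name, matrix in (("Table IX", TABLE_IX), ("Table XII", TABLE_XII)):
        print(f"{name}: accuracy = {100 * overall_accuracy(matrix):.4f}%")
        for i, label in enumerate(LABELS):
            print(f"  {label}: TP rate {tp_rate(matrix, i):.3f}, "
                  f"precision {precision(matrix, i):.3f}")
    print("class frequencies:",
          [round(f, 3) for f in class_frequencies(TABLE_XII)])
```

Both matrices sum to 2126 records with the same row totals (295 Suspect, 1655 Normal, 176 Pathological), so the class frequencies are identical for the two runs.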
Fig. 3. Directed Acyclic Graph (DAG)

TABLE X. ACCURACY BY CLASS WHEN K=10

Classifier   k=10
BayesNet     91.5334%

IV. CONCLUSION
This study compares the accuracies of the classifiers in Supervised Learning Techniques. The data set used contains 2126 records of features extracted from Cardiotocogram exams with 21 attributes, which were classified by health professionals into 3 Fetal Health classes: Normal, Suspect and Pathological. The data set was prepared, and 7 types of classifiers, namely K-Nearest Neighbour (KNN), Linear SVM (Support Vector Machine), Naïve Bayes, Decision Tree (J48), Ada Boost, Bagging and Stacking, were applied to it; the results were compared to evaluate which model gave the highest accuracy. Ada Boost with sub-model Random Forest has the highest accuracy, 94.7%, with k=10. It is known that Ada Boost works well with the decision tree or Random Forest. AdaBoost makes a new prediction by adding up the weight (of each tree) multiplied by the prediction (of each tree) [22]. Finally, a Bayes Network was applied to the data set to build the Directed Acyclic Graph (DAG).

Machine learning is important for the healthcare industry as it can help medical practitioners to diagnose and give the right treatment to the patient, and to predict and prevent future diseases. Future work could apply Reinforcement Learning to the data, or obtain a dataset from a local hospital, as the disease may vary by country due to population, economy, pollution, nutrition, etc.

ACKNOWLEDGEMENT
The work is partially supported by Universiti Teknologi Malaysia (Vote No: Q.K130000.3556.05G66) and the Ministry of Education Malaysia for guidance and resources.

REFERENCES
[1] "3.2 By 2030, end preventable deaths of newborns and children under 5 years of age, with all countries aiming to reduce neonatal mortality to at least as low as 12 per 1,000 live births and under-5 mortality to at least as low as 25 per 1,000 live births – Indicators and a Monitoring Framework", Indicators.report, 2021. [Online]. Available: https://indicators.report/targets/3-2/. [Accessed: 15-Oct-2021].
[2] "Artificial Intelligence in Health Care - AziKar24", AziKar24, 2021. [Online]. Available: https://azikar24.com/artificial-intelligence-in-health-care. [Accessed: 15-Jun-2021].
[3] J. A. Cruz and D. S. Wishart. “Applications of machine learning in cancer prediction and prognosis”. Cancer Inform. 2007;2:59–77.
[4] S. F. Weng, J. Kai, J. M. Garibaldi, and N. Qureshi. “Can machine-learning improve cardiovascular risk prediction using routine clinical data?”. PLoS One. 2017;12:e0174944.
[5] S. Wang and R. M. Summers. “Machine learning and radiology”. Med Image Anal. 2012;16:933–51.
[6] H. Ocak. “A medical decision support system based on support vector machines and the genetic algorithm for the evaluation of fetal well-being”. J Med Syst. 2013;37:9913.
[7] Z. Cömert, A. F. Kocamaz, S. Güngör. “Cardiotocography signals with artificial neural networks and extreme learning machines”. Signal Processing and Communication Application Conference (SIU), IEEE. 2016.
[8] C. Sundar, M. Chitradevi and G. Geetharamani. “Classification of cardiotocograph data using neural network-based machine learning technique”. Int J Comput Appl. 2012;47:14.
[9] "Repository UIML", Archive.ics.uci.edu, 2018. [Online]. Available: https://archive.ics.uci.edu/ml/index.php/. [Accessed: 11-Jun-2018].
[10] Ayres de Campos et al. (2000) SisPorto 2.0 A Program for Automated Analysis of Cardiotocograms. J Matern Fetal Med 5:311-318.
[11] S. B. Kotsiantis, I. Zaharakis, and P. Pintelas, “Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering”, 160(1), 3-24. 2007.
[12] "Weka 3 - Data Mining with Open Source Machine Learning Software in Java", Cs.waikato.ac.nz, 2021. [Online]. Available: https://www.cs.waikato.ac.nz/ml/weka/. [Accessed: 15-Oct-2021].
[13] "Support Vector Machines(SVM) — An Overview", Medium, 2021. [Online]. Available: https://towardsdatascience.com/https-medium-com-pupalerushikesh-svm-f4b42800e989. [Accessed: 15-Jun-2021].
[14] J. Han, M. Kamber, and J. Pei. “Data Mining: Concepts and techniques”. 3rd ed, The Morgan Kaufmann Series: Waltham, USA, 2012.
[15] N. Saravanan and V. Gayathri, "Performance and Classification Evaluation of J48 Algorithm and Kendall’s Based J48 Algorithm (KNJ48)", International Journal of Computer Trends and Technology, vol. 59, no. 2, pp. 73-80, 2018. Available: 10.14445/22312803/ijctt-v59p112.
[16] "AdaBoost Algorithm: Boosting Algorithm in Machine Learning", GreatLearning Blog: Free Resources what Matters to shape your Career!, 2021. [Online]. Available: https://www.mygreatlearning.com/blog/adaboost-algorithm/. [Accessed: 15-Oct-2021].
[17] I. Education, "What is Bagging?", Ibm.com, 2021. [Online]. Available: https://www.ibm.com/cloud/learn/bagging. [Accessed: 15-Oct-2021].
[18] J. Brownlee, "Stacking Ensemble Machine Learning With Python", Machine Learning Mastery, 2021. [Online]. Available: https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python/. [Accessed: 20-Jun-2021].
[19] J. Brownlee, "A Gentle Introduction to Bayesian Belief Networks", Machine Learning Mastery, 2021. [Online]. Available: https://machinelearningmastery.com/introduction-to-bayesian-belief-networks/. [Accessed: 15-Jun-2021].
[20] J. Brownlee, "A Gentle Introduction to k-fold Cross-Validation", Machine Learning Mastery, 2021. [Online]. Available: https://machinelearningmastery.com/k-fold-cross-validation/. [Accessed: 15-Jun-2021].
[21] "Idiot’s Guide to Precision, Recall, and Confusion Matrix - KDnuggets", KDnuggets, 2021. [Online]. Available: https://www.kdnuggets.com/2020/01/guide-precision-recall-confusion-matrix.html. [Accessed: 15-Jun-2021].
[22] "Basic Ensemble Learning (Random Forest, AdaBoost, Gradient Boosting)- Step by Step Explained", Medium, 2021. [Online]. Available: https://towardsdatascience.com/basic-ensemble-learning-random-forest-adaboost-gradient-boosting-step-by-step-explained-95d49d1e2725. [Accessed: 15-Jun-2021].