

A Multilayer Perceptron Neural Network Model For Predicting Diabetes

Preprint · January 2020


DOI: 10.13140/RG.2.2.23203.89126



International Journal of Grid and Distributed Computing
Vol. 13, No. 1, (2020), pp. 1018-1025

A Multilayer Perceptron Neural Network Model For Predicting Diabetes


Dr. Garima Verma1*, Dr. Hemraj Verma2
1* Department of Information Technology, DIT University, Dehradun, INDIA
2 Faculty of Management Studies, DIT University, Dehradun, INDIA

Abstract

Diabetes is a silent metabolic disorder that affects a very large number of people worldwide. It harms
the entire body and can cause many complications if it goes unidentified. One of the main reasons for
the growing number of people affected by the disease is lifestyle: today, most of us lead lives
characterized by little physical activity coupled with heavy consumption of junk food. In such a
scenario, it becomes critical to identify the disease as early as possible. To this end, this study
proposes a machine learning model based on a Multilayer Perceptron (MLP) neural network that can
identify diabetes patients. In addition to the MLP, 5-fold cross-validation is applied to obtain better
results on test data after training. The experimental results show a prediction accuracy of 82%, which
is higher than that of the existing approaches considered. The results of the proposed model are also
compared with some existing state-of-the-art approaches.
Keywords- metabolic disorder, diabetes, machine learning, multilayer perceptron, accuracy.
Introduction
Diabetes is a disease that occurs when blood sugar, also called blood glucose, is too high because it
is not metabolized in the body. Blood glucose is the body's main source of energy and comes from the
food a person eats every day. The pancreas produces a hormone called insulin, which helps glucose from
food get absorbed by the body's cells so that it can be used for energy. Sometimes the body makes no
insulin or not enough of it; in this situation, glucose remains in the blood and does not reach the
cells (Swapna et al., 2018; Report, 2016). The main problem with this disease is that it has no cure; a
person can only take steps to manage diabetes and stay healthy for the rest of their life. It often
goes undiagnosed because it shows no symptoms. There are three main types of diabetes. In type 1, the
cause lies in the pancreas, which stops functioning properly and produces very little or no insulin.
Type 2 diabetes is very common; in it, the body does not produce insulin properly, so glucose is not
converted into energy. The third type is gestational diabetes, which generally affects 2 to 10% of
pregnant women. Gestational diabetes generally disappears after delivery and is treatable. Type 1 and
type 2 diabetes, however, have no cure and slowly lead to many complications if they are not diagnosed
and controlled properly (Chiang et al., 2014; Begum et al., 2014).
Many researchers have worked on predicting diabetes from features that are helpful for prediction.
Doctors generally ask patients to take certain tests to determine whether they have diabetes
(Report, 2016). This study makes an effort in the same direction, i.e., predicting diabetes from
certain features. A machine learning algorithm, the Multilayer Perceptron (MLP), is used for the
prediction. The experiment is performed using five-fold cross-validation and achieves an accuracy of
about 82%.
The rest of the paper is organized as follows. Section 2 presents a literature survey of the machine
learning techniques used to make such predictions. Section 3 presents the proposed model. Section 4
describes the dataset and the MLP in detail. The performance of the model and its prediction accuracy
are discussed in Section 5. Finally, the conclusion and future scope of the work are presented in
Section 6.
Related Work
Diabetes has become a very common and dangerous disease. There are various causes of this disease, but
the major one is an increase of glucose in the blood. The disease not only affects day-to-day
activities but also severely affects other organs of the body such as the kidneys, heart, and eyes.

ISSN: 2005-4262 IJGDC


Copyright ⓒ2020 SERSC

Various studies have addressed the prediction of this disease using features collected from different
sources.

Swapna et al. (2018) designed a deep learning model for the prediction of diabetes. They used heart
rate variability signals to predict diabetes, with CNN and CNN-LSTM architectures for deep learning.
Orabi et al. (2016) proposed a diabetes prediction model that predicts diabetes only at a particular
age, using the decision tree machine learning technique. Aljumah et al. (2013) carried out a predictive
analysis using data mining techniques; they used Oracle Data Miner with a Support Vector Machine (SVM)
to predict treatment modes for diabetes. They concluded that young patients can be treated gradually,
whereas elderly patients should be given proper treatment immediately. Three machine learning
classification algorithms, decision tree, SVM, and Naïve Bayes (DSN), have been used to detect diabetes
at an early stage, especially in pregnant women (Sisodia et al., 2018). The algorithms were evaluated
using measures such as precision, accuracy, and the Receiver Operating Characteristic (ROC) curve.
Kaur et al. (2018) used the machine learning techniques linear SVM, radial basis function kernel SVM,
k-nearest neighbor (KNN), and artificial neural network (ANN) to classify patients as diabetic or
non-diabetic. Selvakumar et al. (2017) performed classification using the machine learning techniques
KNN and Logistic Regression (LR), abbreviated KLR, and compared the accuracy of both algorithms.
Maniruzzaman et al. (2017) used linear discriminant analysis and quadratic discriminant analysis to
classify diabetes data, and compared the performance of these techniques in terms of accuracy,
sensitivity, and ROC curve.
Methods and Materials
Multilayer Perceptron (MLP)
The study aimed at developing an MLP model for predicting diabetes. MLP is a supervised machine
learning algorithm. As the name indicates, it has multiple layers. If the problem is linear, only one
layer is required, but for complex and non-linear problems more layers are added to a single-layer
perceptron (Sonawane et al., 2014). The resulting network is called a multilayer perceptron. MLP is a
feed-forward neural network with one or more hidden layers. An MLP with one hidden layer is shown in
Fig. 1. An MLP consists of a minimum of three layers: an input layer, one or more hidden layers, and an
output layer. The input layer passes its inputs to the next layer with a linear activation function and
no threshold, whereas the hidden and output layers have thresholds and non-linear activation functions
(Nazzal et al., 2008). The network in Fig. 1 has an input layer with multiple neurons
($x_1, x_2, \ldots, x_n$) and a bias, one hidden layer with multiple neurons ($a_1, a_2, \ldots, a_n$)
and a bias, and finally an output layer. In the hidden layer, each neuron transforms the values of the
previous layer by a linear summation $w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$, in which each input is
multiplied by a weight $w$, followed by a non-linear function called the activation function. Trained
on a dataset, an MLP learns a function $f(\cdot): \mathbb{R}^i \rightarrow \mathbb{R}^o$, where $i$ is
the size of the input vector $x$ and $o$ is the size of the output vector $f(x)$. Eq. (1) shows this
function in matrix notation for a network with one hidden layer.

$$f(x) = G\left(b_2 + W_2\, s(b_1 + W_1 x)\right) \qquad (1)$$

where $b_1$ and $b_2$ are the bias vectors, $W_1$ and $W_2$ are the weight matrices, and $G$ and $s$ are
the activation functions.
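
Eq. (1) can be evaluated directly once the weights are known. The following is a minimal sketch of that forward pass in plain Python. The choice of tanh for s and the logistic sigmoid for G, the helper names `affine` and `mlp_forward`, and the toy weights are all assumptions made for illustration; the paper does not state which activation functions or weights were used.

```python
import math

def sigmoid(z):
    """Logistic function, used here as the output activation G (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Return b + W x for a weight matrix W (list of rows), bias b, and input x."""
    return [b_j + sum(w * x_i for w, x_i in zip(row, x)) for row, b_j in zip(W, b)]

def mlp_forward(x, W1, b1, W2, b2):
    """Evaluate f(x) = G(b2 + W2 s(b1 + W1 x)) with s = tanh and G = sigmoid."""
    hidden = [math.tanh(z) for z in affine(W1, b1, x)]   # s(b1 + W1 x)
    return [sigmoid(z) for z in affine(W2, b2, hidden)]  # G(b2 + W2 hidden)

# Hypothetical toy weights: 2 inputs, 2 hidden neurons, 1 output.
W1 = [[0.5, -0.5], [1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, -1.0]]
b2 = [0.0]

# With a zero input and zero biases every pre-activation is 0, so the output is sigmoid(0).
print(mlp_forward([0.0, 0.0], W1, b1, W2, b2))  # [0.5]
```

With a sigmoid output the value can be read as the predicted probability of the positive class, which is one common way to use an MLP for binary classification.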


Fig 1. MLP with one hidden layer


Description of Dataset
A dataset of diabetes patients from the UCI machine learning repository, accessed in December 2018, is
used for the study. The dataset has a total of 768 records, all of female patients with a minimum age
of 21 years. It contains 8 risk features: number of times pregnant, plasma glucose, blood pressure,
skin thickness, insulin level, body mass index (BMI), diabetes pedigree function (DPF), and age. The
description of the dataset is shown in Table 1. For evaluation of the proposed model, the PyCharm IDE
with Python 3.6 is used. Fig. 2 shows a bar chart of how many cases in the dataset are diabetic and how
many are non-diabetic, based on the target variable of the dataset.

Fig. 2. Plot between diabetic and non-diabetic cases.


Table 1. Description of dataset

Statistic        Pregnancies  Glucose   BP        SkinThickness  Insulin   BMI       DPF     Age       Outcome
Mean             3.85         120.89    69.11     20.54          79.80     31.99     0.47    33.24     0.35
Std. Error       0.12         1.15      0.70      0.58           4.16      0.28      0.01    0.42      0.02
Median           3.00         117.00    72.00     23.00          30.50     32.00     0.37    29.00     0.00
Mode             1.00         100.00    70.00     0.00           0.00      32.00     0.25    22.00     0.00
Std. Deviation   3.37         31.97     19.36     15.95          115.24    7.88      0.33    11.76     0.48
Sample Variance  11.35        1022.25   374.65    254.47         13281.18  62.16     0.11    138.30    0.23
Kurtosis         0.16         0.64      5.18      -0.52          7.21      3.29      5.59    0.64      -1.60
Skewness         0.90         0.17      -1.84     0.11           2.27      -0.43     1.92    1.13      0.64
Range            17.00        199.00    122.00    99.00          846.00    67.10     2.34    60.00     1.00
Min              0.00         0.00      0.00      0.00           0.00      0.00      0.08    21.00     0.00
Max              17.00        199.00    122.00    99.00          846.00    67.10     2.42    81.00     1.00
Sum              2953.00      92847.00  53073.00  15772.00       61286.00  24570.30  362.40  25529.00  268.00
Count            768.00       768.00    768.00    768.00         768.00    768.00    768.00  768.00    768.00
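
Some rows of Table 1 can be derived from others, which gives a quick consistency check. The sketch below recomputes the standard error for three features from the reported standard deviation and count, and reads the class balance off the Outcome column. The variable names are illustrative; the numbers are the ones reported in Table 1.

```python
import math

n = 768  # Count row of Table 1

# Standard deviations reported in Table 1 for three of the features.
std_dev = {"Pregnancies": 3.37, "Glucose": 31.97, "Insulin": 115.24}

# Standard error of the mean = standard deviation / sqrt(n).
std_err = {name: round(sd / math.sqrt(n), 2) for name, sd in std_dev.items()}
print(std_err)  # {'Pregnancies': 0.12, 'Glucose': 1.15, 'Insulin': 4.16}

# Outcome is coded 0/1, so its Sum row counts the diabetic cases.
diabetic = 268
non_diabetic = n - diabetic
print(diabetic, non_diabetic, round(diabetic / n, 2))  # 268 500 0.35
```

The recomputed values match the Std. Error row and the mean of Outcome, and they show that roughly 35% of the records belong to the diabetic class.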

Proposed Model
The proposed model is shown in Fig. 3, which depicts the flow and steps of the model. After the dataset
is loaded, cleaning and refining are applied to improve its quality, in particular to handle missing
values and values of 0.
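
The paper does not say how the zero values were handled. Table 1 shows a minimum of 0 for Glucose, BP, SkinThickness, Insulin, and BMI, which is not physiologically plausible, so such zeros are often treated as missing readings. The sketch below shows one common option, replacing zeros with the median of the non-zero values. This is an assumption for illustration, not necessarily the authors' method; the function name and the sample readings are hypothetical.

```python
from statistics import median

def replace_zeros_with_median(values):
    """Treat 0 as a missing reading and replace it with the median of the non-zero values."""
    observed = [v for v in values if v != 0]
    if not observed:  # nothing to impute from; return the column unchanged
        return list(values)
    fill = median(observed)
    return [fill if v == 0 else v for v in values]

# Hypothetical glucose readings; the zeros stand for missing measurements.
glucose = [148, 85, 0, 89, 137, 0, 116]
print(replace_zeros_with_median(glucose))  # [148, 85, 116, 89, 137, 116, 116]
```

The median is less sensitive than the mean to skewed columns such as Insulin. Columns where 0 is a valid value, such as the number of pregnancies, would be left untouched.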

[Figure: flow diagram. Input features (Pregnancies, Glucose, BP, Skin Thickness, Insulin, BMI, DPF,
Age) -> Dataset -> Data Cleaning and Refining -> MLP -> 5-Fold Cross Validation]

Fig 3. Proposed Model

For prediction, the MLP algorithm is used with 5-fold cross-validation. MLP is very sensitive to feature
scaling; therefore, the data is rescaled (normalized) to get better results (Jayalakshmi et al., 2011).
The StandardScaler method is used for rescaling. The MLP is then evaluated with 5-fold
cross-validation: the dataset is split into 5 partitions, called folds. One fold is used as the test
dataset and the union of the remaining folds is used as the training dataset, and the model is tested
for accuracy. This process is repeated 5 times, so that each fold serves once as the test set. The
proposed model algorithm is shown in Fig. 4.
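
The fold construction described above can be sketched in a few lines of plain Python. The helper `k_fold_indices` is hypothetical and splits the record indices into contiguous folds without shuffling. Whether the authors shuffled or stratified the folds is not stated, so this only illustrates the partitioning idea.

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(768, k=5)
for i, test_idx in enumerate(folds):
    # Fold i is the test set; the union of the other folds is the training set.
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    print(i, len(train_idx), len(test_idx))
```

For 768 records this gives three folds of 154 records and two of 153, so every record is used for testing exactly once and for training four times.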


Begin
1. Load the dataset
2. Data cleaning and refining
3. Define X as the collection of independent variables
4. Define Y as the dependent variable or target
5. Rescaling of data
6. Creation of the MLP model with 3 hidden layers of 100 units
7. Set 5-fold cross-validation on the model with X and Y
8. Find the accuracy score
End

Fig 4. Algorithm of the proposed model
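
The steps of Fig. 4 could be expressed with scikit-learn roughly as follows. This is a sketch under several assumptions. The paper names StandardScaler but not the library, so scikit-learn is assumed. "3 hidden layers of 100 units" is read as 100 units per layer. The data is a synthetic stand-in generated with `make_classification` rather than the real dataset. `max_iter` and `random_state` are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-4: synthetic stand-in for the 768 x 8 feature matrix X and target Y.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5, random_state=0)

# Steps 5-6: rescaling followed by an MLP with three hidden layers of 100 units each.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100, 100, 100), max_iter=200, random_state=0),
)

# Steps 7-8: 5-fold cross-validation, reporting the accuracy of each fold and the mean.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores, scores.mean())
```

Placing the scaler inside the pipeline means it is fitted on the training folds only, which avoids leaking test-fold statistics into training. Fig. 4 lists rescaling as a separate step before cross-validation, so the authors may have scaled the whole dataset once instead.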


Results and Discussions
Experiment Setup
The proposed work was simulated using Python in the PyCharm 2016.3.6 IDE on a workstation with an
Intel Core™ i3 3.2 GHz processor. The proposed approach was applied to the dataset of 768 female
patients with a minimum age of 21 years. The whole dataset was divided into two parts, training and
testing: 80% of the data was used for training the model and the remaining 20% for testing. Model
performance was evaluated using the confusion matrix, the Area Under the Curve (AUC) of the Receiver
Operating Characteristic (ROC) curve, precision, recall, and accuracy.

Confusion Matrix
The confusion matrix is a table of observed versus predicted values with four entries: True Positive
(TP), False Positive (FP), False Negative (FN), and True Negative (TN). The confusion matrix of the
proposed model on the testing data is shown in Table 2.

Table 2. Confusion Matrix (testing data)

              Predicted 0   Predicted 1
Observed 0    113           12
Observed 1    22            45
Precision, Recall and Accuracy
Precision measures how often the model is correct when it predicts that a patient has diabetes. For the
proposed model the precision is 78.9%. Precision is calculated by Eq. 2.

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$

Recall measures how many of the patients in the test dataset who actually have diabetes are identified
by the proposed model. For the proposed model the recall is 67.16%. Recall is calculated by Eq. 3.

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$


The overall accuracy of the model is approximately 82%. Accuracy is calculated by Eq. 4.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$
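
As a worked check, Eqs. 2 to 4 can be applied to the counts in Table 2, taking class 1 (diabetic) as the positive class. The short sketch below reproduces the reported precision, recall, and accuracy; only the variable names are introduced here.

```python
# Counts from Table 2, with class 1 (diabetic) as the positive class.
TN, FP = 113, 12   # observed 0: predicted 0, predicted 1
FN, TP = 22, 45    # observed 1: predicted 0, predicted 1

precision = TP / (TP + FP)                   # Eq. 2
recall = TP / (TP + FN)                      # Eq. 3
accuracy = (TP + TN) / (TP + TN + FP + FN)   # Eq. 4

print(f"precision={precision:.3f} recall={recall:.4f} accuracy={accuracy:.3f}")
# precision=0.789 recall=0.6716 accuracy=0.823
```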

The classification report of the proposed model is shown in Table 3.

Table 3. Classification Report

              precision   recall   f1-score   support
0             0.84        0.90     0.87       125
1             0.79        0.67     0.73       67
avg / total   0.82        0.82     0.82       192
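
The f1-score column is not defined in the text. Assuming the usual definition, the harmonic mean of precision and recall, the reported values follow from the counts in Table 2:

```latex
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}

% Class 1 (diabetic): TP = 45, FP = 12, FN = 22
F_1^{(1)} = \frac{2 \cdot 45}{2 \cdot 45 + 12 + 22} = \frac{90}{124} \approx 0.73

% Class 0 (non-diabetic), treated as the positive class: TP = 113, FP = 22, FN = 12
F_1^{(0)} = \frac{2 \cdot 113}{2 \cdot 113 + 22 + 12} = \frac{226}{260} \approx 0.87
```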
AUC-ROC Curve
This measure is used to visualize the performance of the algorithm. The ROC curve plots the True
Positive Rate (TPR) against the False Positive Rate (FPR) as the classification threshold varies, and
the AUC summarizes how well the model separates the two classes. The higher the AUC, the better the
model (Fawcett, 2006). The ROC curve of the proposed model, together with its AUC score, is shown in
Fig. 5.
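
The AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it from that definition by comparing every positive/negative pair. The function name `roc_auc` and the toy labels and scores are hypothetical and are not the paper's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly here, hence 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.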

Performance Comparison
To validate the performance of the proposed model, we compared our approach with existing approaches
proposed by Sisodia et al. (2018), Selvakumar et al. (2017), and Jian et al. (2013). Sisodia et al.
(2018) examined three machine learning algorithms, decision tree, SVM, and Naïve Bayes, to predict
diabetes at an early stage. They measured the accuracy of all three models and reported that Naïve
Bayes performed better than the other two. Selvakumar et al. (2017) developed prediction models based
on three algorithms: logistic regression, k-nearest neighbor (KNN), and MLP. They evaluated the models
on the basis of accuracy and reported that KNN performed better than the other two. Jian et al. (2013)
developed an algorithm that uses Heart Rate Variability (HRV) obtained from ECG signals to detect
diabetes. They applied several machine learning algorithms, namely a fuzzy classifier, Gaussian Mixture
Model (GMM), SVM, Probabilistic Neural Network (PNN), Naïve Bayes, KNN, and decision tree. The
classifiers were evaluated by their accuracy, and SVM achieved the best accuracy among all models. The
comparison is summarized in Table 4.


Fig 5. ROC curve of the proposed model using test data

Table 4. Comparative analysis with existing state-of-the-art approaches

S. No.   Study                            Method        Accuracy
1.       DSN (Sisodia et al., 2018)       Naïve Bayes   76.3%
2.       KLR (Selvakumar et al., 2017)    KNN           80%
3.       HRV (Jian et al., 2013)          SVM           79.93%
4.       Proposed                         MLP           82%

Conclusion and Future Work


In this study, an effort has been made to develop a model that can predict diabetes in patients at an
early stage. The model uses a Multilayer Perceptron algorithm with a 5-fold cross-validation technique.
The experiment was performed using the dataset from the UCI machine learning repository with 768
records. The results of the experiments are presented in the form of the confusion matrix, precision,
recall, overall accuracy, and AUC-ROC curve. The overall accuracy of the model is 82%.
In the future, the study may be improved by including more machine learning algorithms, and the final
evaluation may use ensemble techniques for better accuracy.

References
1) Aljumah, A.A., Ahamad, M.G. and Siddiqui, M.K., (2013). Application of data mining: Diabetes
health care in young and old patients. Journal of King Saud University-Computer and Information
Sciences, 25(2), pp.127-136.
2) Begum, S.A., Afroz, R., Khanam, Q., Khanom, A. and Choudhury, T.S., (2014). Diabetes mellitus
and gestational diabetes mellitus. Journal of Paediatric Surgeons of Bangladesh, 5(1), pp.30-35.
3) Chiang, J.L., Kirkman, M.S., Laffel, L.M. and Peters, A.L., (2014). Type 1 diabetes through the
life span: a position statement of the American Diabetes Association. Diabetes care, 37(7),
pp.2034-2054.
4) Fawcett, T., (2006). An introduction to ROC analysis. Pattern recognition letters, 27(8), pp.861-874.
5) Jayalakshmi, T. and Santhakumaran, A., (2011). Statistical normalization and back propagation for
classification. International Journal of Computer Theory and Engineering, 3(1), pp.1793-8201.
6) Jian, L.W. and Lim, T.C., (2013). Automated detection of diabetes by means of higher order
spectral features obtained from heart rate signals. Journal of Medical Imaging and Health
Informatics, 3(3), pp.440-447.
7) Kaur, H. and Kumari, V., (2018). Predictive modelling and analytics for diabetes using a machine
learning approach. Applied Computing and Informatics.
8) Maniruzzaman, M., Kumar, N., Abedin, M.M., Islam, M.S., Suri, H.S., El-Baz, A.S. and Suri, J.S.,
(2017). Comparative approaches for classification of diabetes mellitus data: Machine learning
paradigm. Computer methods and programs in biomedicine, 152, pp.23-34.
9) Nazzal, J.M., El-Emary, I.M. and Najim, S.A., (2008). Multilayer perceptron neural network
(MLPs) for analyzing the properties of Jordan Oil Shale 1.
10) Orabi, K.M., Kamal, Y.M. and Rabah, T.M., (2016), July. Early predictive system for diabetes
mellitus disease. In Industrial Conference on Data Mining (pp. 420-427). Springer, Cham.
11) Selvakumar, S., Kannan, K.S. and GothaiNachiyar, S., (2017). Prediction of Diabetes Diagnosis
Using Classification Based Data Mining Techniques. International Journal of Statistics and
Systems, 12(2), pp.183-188.
12) Sisodia, D. and Sisodia, D.S., (2018). Prediction of diabetes using classification
algorithms. Procedia computer science, 132, pp.1578-1585.


13) Sonawane, J.S. and Patil, D.R., (2014), February. Prediction of heart disease using multilayer
perceptron neural network. In International Conference on Information Communication and
Embedded Systems (ICICES2014) (pp. 1-6). IEEE.
14) Swapna, G., Vinayakumar, R. and Soman, K.P., (2018). Diabetes detection using deep learning
algorithms. ICT Express, 4(4), pp.243-246.
15) UCI Machine learning repository, Diabetes dataset, Accessed on 10th December 2018.
