Performance Analysis and Comparison of Machine Learning and LoRa-Based Healthcare Model
https://doi.org/10.1007/s00521-023-08411-5
ORIGINAL ARTICLE
Received: 2 August 2022 / Accepted: 13 February 2023 / Published online: 7 March 2023
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023
Abstract
Diabetes Mellitus (DM) is a widespread condition and one of the main causes of health crises around the world, and health monitoring is one of the sustainable development topics. Currently, Internet of Things (IoT) and Machine Learning (ML) technologies work together to provide a reliable method of monitoring and predicting Diabetes Mellitus. In this paper, we present the performance of a model for real-time patient data collection that employs the Hybrid Enhanced Adaptive Data Rate (HEADR) algorithm for the Long-Range (LoRa) protocol of the IoT. On the Contiki Cooja simulator, the LoRa protocol's performance is measured in terms of high dissemination and dynamic data transmission range allocation. Machine Learning prediction is then performed by applying classification methods to the data acquired via the LoRa (HEADR) protocol in order to detect diabetes severity levels. A variety of Machine Learning classifiers, implemented in the Python programming language, are employed, and the final results are compared with existing models; the Random Forest and Decision Tree classifiers outperform the others in terms of precision, recall, F-measure, and receiver operating characteristic (ROC) curve. We also found that using k-fold cross-validation with the k-neighbors, Logistic Regression (LR), and Gaussian Naïve Bayes (GNB) classifiers improved accuracy.
Keywords Machine learning · Diabetes mellitus · Internet of Things · LoRa · Contiki Cooja
12752 Neural Computing and Applications (2023) 35:12751–12761
heterogeneous IoT network. It is designed to cope with unstable frequency conditions in rain or bad weather, and the LoRa (HEADR) protocol also specifies data rate adaptation by end devices. ML techniques are unquestionably important in DM for analysis, monitoring, and other clinical difficulties related to diagnosis. In addition, there is an optimization approach based on scikit-learn that employs train-test splitting and k-fold cross-validation. ML algorithms are commonly used to predict diabetes, and they deliver better results.

As the population of various countries grows, so does the burden of responsibility for the health of this growing population on hospitals, doctors, and nursing staff. The present study therefore focuses on the IoT, since it has the potential to reduce the pressure on health care systems. According to Verma et al. [2], Diabetes Healthcare Monitoring Systems are crucial right now, particularly for remote health care monitoring, because visiting hospitals and standing in line for routine monitoring wastes patients' time. Waiting in line is also extremely dangerous for diabetic patients, since their lives might be in danger at any moment [3]. During the COVID-19 pandemic, a remote monitoring and ML prediction system such as this one may be particularly helpful when we are unable to care for our elderly guardians and relatives in person. Fig. 1 illustrates the application of IoT sensors and technology in the health care industry.

2 Related work

2.1 Patient monitoring

Islam et al. [4] offer a useful analysis and a model suited to collecting patients' physical health data from supporting sensors and IoT health care. The IoT system needs to be adaptable so that it can provide all amenities during an emergency while also maintaining the presence of medical and nursing staff in remote areas. This technology also automates data collection, making it more reliable than manual patient data entry. The LoRa network has been used to turn an interface called MySignals into a health monitoring system that gathers heart rate, ECG, body temperature, and pulse rate readings. The LoRa module's performance was assessed once it was integrated into the terminal application, and it showed promise for gathering information from the patient's body.

In the work of Lavric [5], the effectiveness of the LoRa protocol is estimated with a focus on measures including the number of packet collisions and network performance. Accordingly, that work describes how many LoRa nodes can connect to the gateway at once while still abiding by the protocol's rules. To lessen collision incidence and increase communication channel efficiency, several factors have been studied, including the spreading factor, data transmission rate, and duty cycle. One of the author's suggestions for decreasing the frequency of collisions is the Adaptive Data Rate (ADR) technique, which modifies the data rate without human intervention when a collision is detected [6, 7]. However, this tactic results in more energy being used by the LoRa node.

A comparison of short- and long-range communication protocols is given in Table 1.
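To make the link between spreading factor and data rate concrete, the following is a minimal sketch of the nominal LoRa bit rate relation, R_b = SF · (BW / 2^SF) · CR, as given in LoRa transceiver datasheets. The function name `lora_bit_rate`, the 125 kHz default bandwidth, and the 4/5 default coding rate are assumptions chosen for illustration; they are not parameters reported in this paper, and the sketch does not model HEADR or ADR itself.

```python
def lora_bit_rate(spreading_factor: int,
                  bandwidth_hz: float = 125_000.0,
                  coding_rate_denominator: int = 5) -> float:
    """Nominal LoRa bit rate in bit/s: SF * (BW / 2**SF) * (4 / denominator)."""
    # Assumption of this sketch: only the SF7..SF12 range is accepted.
    if not 7 <= spreading_factor <= 12:
        raise ValueError("spreading factor expected in 7..12")
    # Coding rates 4/5, 4/6, 4/7 and 4/8 are the standard LoRa options.
    if not 5 <= coding_rate_denominator <= 8:
        raise ValueError("coding rate denominator expected in 5..8")
    symbol_rate = bandwidth_hz / (2 ** spreading_factor)  # symbols per second
    return spreading_factor * symbol_rate * 4.0 / coding_rate_denominator


# Bit rate for each spreading factor at 125 kHz bandwidth and coding rate 4/5.
rates = {sf: lora_bit_rate(sf) for sf in range(7, 13)}
for sf, rate in rates.items():
    print(f"SF{sf}: {rate:.1f} bit/s")
```

Under these assumed settings, SF7 yields about 5.47 kbit/s and SF12 about 0.29 kbit/s, which shows why a data rate adaptation scheme trades range (higher spreading factor) against throughput and airtime.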
2.2 Machine learning methods

According to Zou et al. [8], information on over 70,000 people, both healthy individuals and diabetes patients, was physically collected from a hospital in Luzhou, China. Decision trees, random forests, and neural networks with k-fold validation were used to predict diabetes mellitus, while PCA and mRMR were employed to reduce dimensionality. The random forest produced the most accurate prediction (0.8084). The authors also note that choosing the right features requires careful consideration of the classifier technique.

In the work of Sneha and Gangil [9], the research goal was to use predictive analysis to identify which features have the most influence on early diabetes mellitus prediction. Diabetic hyperglycemia is linked to harm to several organs, including the heart, kidneys, veins, eyes, and nerves. The aim of that research is to use ML to create a classifier model whose results are close to clinical outcomes, together with a prediction algorithm that considers the important factors. The final results showed that the decision tree and random forest gave maximum specificities of 98.20 percent and 98.00 percent, respectively, for the interpretation of diabetes data. The approach with the highest accuracy, NB, has an accuracy of 82.30 percent. The study extends the selection of top attributes from the dataset to increase classification accuracy.

A local hospital in Kano, Nigeria provided the diabetes diagnosis dataset used in the investigation by Muhammad et al. [10]. Using this dataset, models were created with SVM, k-nn, LR, RF, NB, and GB. The random forest and gradient boosting predictive models reach accuracies of 86.28 and 88.76 percent, respectively, and the receiver operating characteristic (ROC) curve indicates that these are the best models. The algorithm will help health care workers and medical experts identify and predict type 2 diabetes in those who are suspected of having the condition.

In the work of Ramesh et al. [11], using the PIMA Indian dataset, an SVM model was used to estimate the risk factor for diabetes after feature scaling, selection, augmentation, and imputation. The accuracy, sensitivity, and specificity of this model were assessed using a tenfold stratified cross-validation approach: 83.20 percent accuracy, 87.20 percent sensitivity, and 79 percent specificity. Patients can send data through smart devices like smartphones and smartwatches, but the underlying communication technology is not specified. This suggested strategy might help medical professionals make judgments at an early stage depending on the risk predicted by the computer.

[Table 1: only a header fragment survives. Short-range protocols: Bluetooth low energy, ZigBee, MQTT]

3 Proposed work

Fig. 2 shows the suggested model. The LoRa-based Diabetes Predictive Model consists of two components: the first is the communication module, and the second is the processing module, whose flowchart is given in Figs. 3 and 4. LoRa-enabled sensors on the patient's body first send the data to the gateway node, as explained in Algorithm 1; the data are then stored on a server (i.e., the Dataset of Patients). The data are then preprocessed and normalized, and ML classification algorithms are applied, which is the standard procedure for applying an ML technique.

sensor, we may examine the patient's blood for the presence of glucose. Depending on the situation, the glucose sensor can either be placed internally beneath the skin or externally on the skin as part of the continuous glucose monitoring (CGM) system.

3.2 Data transmission

The LoRa (HEADR) protocol is used to transmit data from the patient to the gateway. The patient dataset unit receives this information from the gateway. The sensed data may be sent at regular intervals or whenever the patient's biomedical sensor readings change significantly. The new Hybrid Enhanced Adaptive Data Rate (HEADR) technique also enhances LoRa device implementation and resolves the sensor's data transmission range distribution issue. To evaluate network performance, this model demonstrates real-time data detection and transfer over the IoT protocol using the Contiki Cooja simulator.

3.3 Dataset validation and ML classification
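As a rough illustration of the train-test splitting and normalization steps that this section relies on, the sketch below splits a small table of readings and rescales each attribute to [0, 1]. Min-max scaling is only one common choice; the paper does not state which normalization it uses. The helper names (`train_test_split_rows`, `fit_min_max`, `apply_min_max`) and the sample readings are hypothetical and are not taken from the authors' dataset or code.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_min_max(rows):
    """Learn the per-column (min, max) bounds from the given rows."""
    return [(min(column), max(column)) for column in zip(*rows)]


def apply_min_max(rows, bounds):
    """Rescale every column with previously fitted bounds; constant columns map to 0."""
    return [
        [(value - low) / (high - low) if high > low else 0.0
         for value, (low, high) in zip(row, bounds)]
        for row in rows
    ]


# Hypothetical readings: [glucose in mg/dL, BMI, age in years].
readings = [
    [85.0, 22.1, 31], [168.0, 33.6, 50], [110.0, 26.4, 27], [142.0, 30.2, 45],
    [95.0, 24.8, 38], [183.0, 35.1, 61], [126.0, 28.9, 42], [101.0, 23.5, 29],
]

train, test = train_test_split_rows(readings, test_fraction=0.25, seed=42)
bounds = fit_min_max(train)                 # bounds come from the training split only
train_scaled = apply_min_max(train, bounds)
test_scaled = apply_min_max(test, bounds)   # test values may fall slightly outside [0, 1]
print(train_scaled[0], test_scaled[0])
```

Fitting the bounds on the training split and reusing them for the test split keeps information from the held-out rows out of the preprocessing step, which is the usual reason for ordering the two operations this way.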
training method to the training dataset. Consequently, with the assistance of this training procedure, a trained model is produced that operates on the values of the features in the training data, logic, and algorithm. The aim of normalization is to bring all of the attributes to the same scale.

(vi) Machine learning techniques ML techniques can be used only after the data have been prepared appropriately. Medical diagnostic datasets may be effectively mined for information using ML techniques. We may use a variety of classification and ensemble algorithms, as used by Onan et al. [12], to predict diabetes from the diabetes dataset. The main goal of employing ML techniques is to understand how these classification methods are implemented, to determine their accuracy, and to identify the major feature that is crucial for diabetes prediction. ML methods may be divided into three groups: supervised learning, unsupervised learning, and reinforcement learning. In this prototype, we employ supervised learning, in which the model is trained on a labeled dataset and then used to make predictions.

Classification and regression are further divisions of supervised learning:

(i) Gradient Boosting The name “gradient boosting” refers to the fact that the gradient of the prediction error determines the target outputs for each case. In the work of Lai et al. [13], every new model makes predictions on each training example to reduce the error.

(ii) Random Forest Several well-known ensemble techniques exist, including bagging, boosting, gradient boosting, ada-boosting, averaging, and voting, as mentioned by Onan et al. [14]. For forecasting diabetes, we employ Random Forest, a bagging ensemble technique. Random Forest may be used for both classification and regression tasks.

(iii) Decision Tree It approximates the data using basic if-then-else decision rules. According to Li et al. [15], for a deeper decision tree, the model becomes more complex and model fitting becomes more difficult.

(iv) k-neighbors The k-nn approach, a supervised ML algorithm, is often employed to solve classification and regression problems [10]. K-nn is a lazy learning method that presupposes that similar objects are close to one another, which is usually the case with comparable data points.

(v) Logistic Regression Logistic regression also belongs to the category of supervised learning classification techniques, as mentioned by Butt et al. [16]. By presenting the output in binary form, 1 or 0, we can distinguish between individuals who are positive or negative for diabetes in this diabetic dataset. Logistic regression is often used to categorize distinct data items.

(vi) Gaussian Naïve Bayes According to Sneha and Gangil [9], Naïve Bayes classifiers can be trained quickly and effectively, especially in a supervised setting. Naïve Bayes classifiers need only a small amount of training data to estimate the parameters required for classification. Gaussian Naïve Bayes supports continuous-valued features and models them as following a Gaussian (normal) distribution.

(vii) SVM The Support Vector Machine approach is one of the supervised machine learning techniques. The SVM-generated hyperplane divides the data into two categories. In high-dimensional space, it may also produce one or more hyperplanes that can be used for classification or regression, as also mentioned by Bondre et al. [17].

3.4 Execution of the whole process step-by-step

(1) LoRa (HEADR)-enabled sensors are placed on the bodies of patients in the network region for data collection (Algorithm 1).
(2) The gateway sends the gathered data to the database system.
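The k-neighbors idea and the k-fold validation used in this work can be sketched together in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the data are synthetic two-feature points, the helper names (`knn_predict`, `k_fold_indices`, `cross_validate_knn`) are hypothetical, and in practice the scikit-learn library named in the paper would normally be used instead.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle the sample indices once and deal them into disjoint test folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[fold::n_folds] for fold in range(n_folds)]


def cross_validate_knn(x, y, n_folds=5, k=3, seed=0):
    """Pool predictions over all folds; label 1 is treated as the positive class."""
    tp = fp = fn = tn = 0
    for test_idx in k_fold_indices(len(x), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            if knn_predict(train_x, train_y, x[i], k) == 1:
                tp, fp = tp + (y[i] == 1), fp + (y[i] == 0)
            else:
                fn, tn = fn + (y[i] == 1), tn + (y[i] == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(x), "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Synthetic, already normalized data: two well-separated groups (label 0 and label 1).
rng = random.Random(1)
x = ([[rng.gauss(0.3, 0.08), rng.gauss(0.3, 0.08)] for _ in range(30)]
     + [[rng.gauss(0.7, 0.08), rng.gauss(0.7, 0.08)] for _ in range(30)])
y = [0] * 30 + [1] * 30

scores = cross_validate_knn(x, y, n_folds=5, k=3)
print(scores)
```

Every sample is used for testing exactly once, so the pooled metrics describe performance on data the classifier did not see during that fold's training. The same F-measure formula, applied to the precision (94.56%) and recall (90.24%) reported for RF in the results, gives roughly 92.35%, consistent with the reported F-measure.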
Fig. 5 a Packets received by 20 nodes in 30 min. b Constant and dynamic data transmission
Table 3 Dataset structure (our dataset [20]); columns: Attributes, Type, Values, Units
Fig. 8 a PIMA Classifier. b PIMA Accuracy after k-fold validation

The ML classification results are shown in Fig. 7a; after k-fold validation, RF ranks best in Accuracy (96.28%), Precision (94.56%), Recall (90.24%), F-Measure (92.35%), and ROC (95%), as shown in Fig. 7b. The results based on the PIMA dataset, shown in Fig. 8a and b, are lower than our ML classification results. The structure of both datasets is shown in Table 3. The ROC curve and the values of the evaluation parameters are shown in Fig. 6 and Table 4.

5 Conclusion

In this research, we have presented a healthcare model that is used to track a patient's health, particularly diabetic symptoms, and to determine the severity of the patient's illness. In previous studies, monitoring and diagnosis were done independently. The suggested HEADR method is used to monitor the LoRa protocol in an IoT network, and its operation is assessed on the Contiki Cooja simulator, where we found that the data rate is improved by quick switching and by employing low, mid, and high band millimeter waves, and for data processing, the

Declarations

References
1. Mekki K, Bajic E, Chaxel F, Meyer F (2019) A comparative study of LPWAN technologies for large-scale IoT deployment. ICT Express 5(1):1–7. https://doi.org/10.1016/j.icte.2017.12.005
2. Verma N, Singh S, Prasad D (2021) A review on existing IoT architecture and communication protocols used in healthcare monitoring system. J Inst Eng Ser B. https://doi.org/10.1007/s40031-021-00632-3
3. Vizhi K, Dash A (2020) Diabetes prediction using machine learning. Int J Adv Sci Technol 29(6):2842–2852. https://doi.org/10.32628/cseit206463
4. Islam MS, Islam MT, Almutairi AF, Beng GK, Misran N, Amin N (2019) Monitoring of the human body signal through the Internet of Things (IoT) based LoRa wireless network system. Appl Sci 9(9). https://doi.org/10.3390/app9091884
5. Lavric A (2019) LoRa (long-range) high-density sensors for internet of things. J Sensors. https://doi.org/10.1155/2019/3502987
6. Abrardo A, Pozzebon A (2019) A multi-hop LoRa linear sensor network for the monitoring of underground environments: the case of the medieval aqueducts in Siena, Italy. Sensors (Switzerland) 19(2). https://doi.org/10.3390/s19020402
7. “Adaptive Data Rate | The Things Network.” https://www.thethingsnetwork.org/docs/lorawan/adaptive-data-rate/. Accessed January 15, 2022
8. Zou Q, Qu K, Luo Y, Yin D, Ju Y, Tang H (2018) Predicting diabetes mellitus with machine learning techniques. Front Genet 9(November):1–10. https://doi.org/10.3389/fgene.2018.00515
9. Sneha N, Gangil T (2019) Analysis of diabetes mellitus for early prediction using optimal features selection. J Big Data 6(1). https://doi.org/10.1186/s40537-019-0175-6
10. Muhammad LJ, Algehyne EA, Usman SS (2020) Predictive supervised machine learning models for diabetes mellitus. SN Comput Sci 1(5):1–10. https://doi.org/10.1007/s42979-020-00250-8
11. Ramesh J, Aburukba R, Sagahyroon A (2021) A remote healthcare monitoring framework for diabetes prediction. Healthc Technol Lett, pp 45–57. https://doi.org/10.1049/htl2.12010
12. Onan A (2022) Bidirectional convolutional recurrent neural network architecture with group-wise enhancement mechanism for text sentiment classification. J King Saud Univ Comput Inf Sci 34:2098–2117. https://doi.org/10.1016/j.jksuci.2022.02.025
13. Lai H, Huang H, Keshavjee K, Guergachi A, Gao X (2019) Predictive models for diabetes mellitus using machine learning techniques. BMC Endocr Disord 19(1):1–9. https://doi.org/10.1186/s12902-019-0436-6
14. Onan A, Korukoğlu S, Bulut H (2017) A hybrid ensemble pruning approach based on consensus clustering and multi-objective evolutionary algorithm for sentiment classification. Inf Process Manag 53(4):814–833. https://doi.org/10.1016/j.ipm.2017.02.008
15. Li Y, Li H, Yao H (2018) Analysis and study of diabetes follow-up data using a data-mining-based approach in new urban area of Urumqi, Xinjiang, China, 2016–2017. Comput Math Methods Med. https://doi.org/10.1155/2018/7207151
16. Butt UM, Letchmunan S, Ali M, Hassan FH, Baqir A, Sherazi HHR (2021) Machine learning based diabetes classification and prediction for healthcare applications. J Healthc Eng 2021. https://doi.org/10.1155/2021/9930985
17. Bondre VM, Umare PN, Patle PG (2016) Parallel artificial bee colony optimisation for solving curricula time-tabling problem. Int J Innov Res Comput Commun Eng 2016(1):1–8. https://doi.org/10.15680/IJIRCCE.2016.
18. Lavric A, Popa V (2018) Performance evaluation of LoRaWAN communication scalability in large-scale wireless sensor networks. Wirel Commun Mob Comput. https://doi.org/10.1155/2018/6730719
19. Sripreethaa NYKR (2017) Diabetes prediction in healthcare systems using machine learning algorithms on Hadoop cluster. Cluster Comput. https://doi.org/10.1007/s10586-017-1532-x
20. Verma N, Singh S, Prasad D (2022) Machine learning and IoT-based model for patient monitoring and early prediction of diabetes. Concurr Comput Pract Exp. https://doi.org/10.1002/cpe.7219

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.