
Medicine in Novel Technology and Devices 22 (2024) 100297

Contents lists available at ScienceDirect

Medicine in Novel Technology and Devices


journal homepage: www.journals.elsevier.com/medicine-in-novel-technology-and-devices/

An efficient tool for Parkinson's disease detection and severity grading based
on time-frequency and fuzzy features of cumulative gait signals through
improved LSTM networks
Farhad Abedinzadeh Torghabeh a, Yeganeh Modaresnia a, Seyyed Abed Hosseini b, *
a Department of Biomedical Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran
b Department of Electrical Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran

ARTICLE INFO

Keywords: Parkinson's disease grading; Cumulative gait signal; Vertical ground reaction force; Fuzzy feature; Bayesian optimization; Long short-term memory

ABSTRACT

Parkinson's disease (PD) is a widespread neurodegenerative condition that affects many individuals annually. Early identification and monitoring of disease progression are crucial to effectively managing symptoms and preventing motor complications. This research proposes an automated PD diagnosis and severity-grading model based on time-frequency and fuzzy features using improved uni-directional and bi-directional long short-term memory networks with sensitive hyperparameters optimization. We utilize vertical ground reaction force signals collected from Physionet's publicly available dataset recorded during regular and dual-task clinical trials of walking measurements. Only the cumulative signal of both feet was then utilized and segmented into 30-s windows without further pre-processing. Subsequently, we extracted only four key time-frequency and fuzzy features from each segment, effectively capturing the signal's inherent uncertainty. Bayesian optimization is employed in both detection and grading approaches to fine-tune the two critical hyperparameters: the initial learning rate and the number of hidden units in the network. The detection phase yields an exceptional accuracy of 99.19%, surpassing state-of-the-art studies with the same dataset. In the grading phase, classification based on the unified PD rating scale values achieves an accuracy of 92.28%. The proposed study delves into the potential of cumulative gait signals as a powerful diagnostic tool for PD, aiming to extract precise and intricate information by implementing straightforward and minimal processing endeavors. This method demonstrates significant efficiency in terms of complexity, cost, and energy consumption by utilizing a single-dimensional signal, eliminating the need for pre-processing steps, and limiting the features used for training.

1. Introduction

Neurodegeneration refers to the progressive deterioration of neurons' structure and function, which might lead to the demise of these cells. This condition, also known as a neurodegenerative disease (NDD), encompasses a range of ailments, such as amyotrophic lateral sclerosis (ALS), multiple sclerosis (MS), Parkinson's disease (PD), Alzheimer's disease (AD), Huntington's disease (HD), multiple system atrophy (MSA), and prion diseases [1]. PD is the second most frequent NDD, which causes a wide range of life-changing symptoms, including tremors in the upper extremities [2]. Motor function frequently declines in individuals with moderate to severe PD, as indicated by a hunched upper body position, short and shuffling steps, a displacement of the body's center of mass towards the front, reduced walking pace, compromised equilibrium, higher variations in gait, and instances of gait freezing [3].

According to the Parkinson's Foundation “Parkinson's Prevalence Project,” nearly one million Americans live with PD. It is anticipated that by 2030, the total number of individuals affected will reach 1.2 million. Annually, over 90,000 individuals receive a diagnosis of PD in the United States [4]. Furthermore, the prevalence of PD tends to increase with age and is higher in males [5]. The following motor symptoms often characterize PD: trembling, bradykinesia (slowed movement), rest tremor (rhythmic shaking), rigidity, reduced postural reflexes, impaired balance, and dysarthria (speech impairments) [6,7]. This highlights the criticality of early detection of PD in improving patient care.

Currently, medical practitioners evaluate symptoms that include tremors, bradykinesia, akinesia, and gait issues. Several rating scales were developed by medical societies to assess the severity of motor

* Corresponding author.
E-mail address: abed_hosseyni@yahoo.com (S.A. Hosseini).

https://doi.org/10.1016/j.medntd.2024.100297
Received 4 November 2023; Received in revised form 19 February 2024; Accepted 17 March 2024
2590-0935/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

symptom impairments. The unified PD rating scale (UPDRS) is a widely utilized tool in the clinical evaluation of PD. This scale includes 42 questions/criteria that assess various aspects of PD, encompassing motor symptoms such as gait, behavioral characteristics, and daily activities. Moreover, it has the capability to differentiate postural instability, mild and severe PD, and assess the quality of life (QoL) of patients who have moderate to severe PD.

There is evidence in the literature that tremors and abnormal gait are early symptoms of PD [8]. These walking patterns can differ among individuals. Some of the common gait disturbances in PD include shuffling gait, festinating gait, freezing of gait, wide-based gait, stooped posture, and reduced arm swing [8]. These patterns can adversely affect balance, mobility, and QoL, increasing the risk of falls and injuries. However, physical therapy and other interventions can be beneficial in enhancing gait and mobility in individuals with PD. Thus, analyzing gait movements can be an efficient approach for an early diagnosis of PD. Apart from detecting PD, analyzing gait movements can also provide insight into the advancement of the disease and its severity, as they include periodic and rhythmic foot movements. It has also been documented that gait abnormalities often occur during the early phases of PD [9].

Several characteristics of Parkinsonian gait include reduced step length, a slower cycle, greater variability in stride length, shorter duration for the swing, longer stance phases, and a flat-footed stride rather than a toe-to-heel strike [10]. PD is diagnosed by examining these features as part of the diagnosis process. Performing gait analysis can be challenging as it can be impacted by multiple factors, including but not limited to age and physical fitness. Additionally, physicians do not have an objective tool to assist them in the gait analysis. Wearable system technology has made gait analysis an increasingly popular tool for assessing and detecting PD [10].

The utilization of wearable gait sensors diminishes the necessity of expensive laboratory setup and expert guidance. The ground reaction force (GRF) sensor is the most widely adopted for evaluating PD [11]. By utilizing wearable pad sensors, GRF sensors can accurately map joint movements and muscle activities with exceptional precision [10]. Due to their small size, non-invasive nature, and low cost, these gait sensors are the most commonly used ones in gait analysis studies.

In recent years, advances in machine learning (ML) have made it possible to automate gait analysis to reduce the time and workforce problems associated with using conventional techniques for analyzing gait. Moreover, it has the capability to tackle the challenge of evaluating the severity of PD, which can be a demanding task for both healthcare professionals and patients in terms of time and energy. The incorporation of wearable sensors and ML can facilitate the speedy and efficient monitoring of PD symptoms by both parties.

The objective of this study is to create a predictive system for PD patients, utilizing data from wearable sensors that can be easily obtained. To this end, we introduce an intelligent tool that can identify symptoms of PD and estimate the corresponding severity level (based on UPDRS) using gait data analysis. Our framework leverages wearable GRF sensors to predict UPDRS scores and assess the severity of PD.

Our study proposed a multistage approach wherein the first stage involved classification of PD patients using uni-directional long short-term memory (Uni-LSTM) and bi-directional LSTM (Bi-LSTM) neural networks, and the second stage involved assessing the severity of PD based on the UPDRS rating scale and the same models. In the first approach, a PD detection algorithm, we employed the aforementioned neural networks to evaluate their efficacy for PD detection. There would be a substantial benefit in using these networks in the early diagnosis of PD if they could perform the required task exceptionally well in classifying patients with PD and those with health conditions. This technology allows physicians to make optimal clinical decisions by utilizing artificial intelligence solutions that can be deployed rapidly. This is because it can save time, resources, and technical effort in creating models that may not be better or worse than those currently used. In the detection phase, all PD patients with different UPDRS scores are combined into a single category.

Both detection and grading methodologies extracted complementary time-frequency (TF) and fuzzy features (FFs) from the time-series data. By analyzing a time series from both a time and frequency perspective, TF analysis is able to efficiently capture short-term changes by applying a function with a two-dimensional (2D) domain in the real plane. TF transform is used to analyze time series data rather than interpreting a single-dimensional signal. The FF transformation is also a more comprehensive technique compared to analyzing only a single aspect of the data. It involves transforming the time series data into a multidimensional space, where each dimension represents a distinct aspect of the data. This allows for a more detailed analysis of the data.

Our multistage approach makes four major contributions.

• First, this study introduces a TF/FF model as an effective and durable classification tool for wearable GRF signals. The utilization of simple TF features and FFs in the employed networks produces higher accuracy and a shorter training period compared to the conventional networks.
• Secondly, Bayesian optimization was utilized for optimizing two hyperparameters, which resulted in an effective outcome.
• Thirdly, this approach uses UPDRS values to estimate the prognosis of PD using a recurrent neural network (RNN) architecture.
• Lastly, but certainly not least, this model outperformed other models used in previous studies to detect PD patients. While it had a slightly lower accuracy in differentiating between types of Parkinson's, it was noted for its reduced weight and high precision, making it more versatile than current state-of-the-art models. The study outlines a highly effective algorithm that predicts the severity of UPDRS, which is beneficial for clinical decision-making systems.

This study is structured as follows: Section 2 examines previous research on automated methods for diagnosing PD. Materials and methods are presented in Section 3. The experimental results and discussions are presented in Section 4. The study is concluded in Section 5. Finally, Section 6 is dedicated to discussing limitations.

2. Literature review

PD is the second most common NDD that impairs the physical abilities of individuals over time. This disease is characterized by the presence of tremors, stiffness, and bradykinesia, which can lead to difficulty in movement and loss of balance. Several modalities have been used for the automated detection of PD, including imaging, non-imaging, and sensor-based modalities. Imaging modalities such as magnetic resonance imaging (MRI), positron emission tomography (PET), and single-photon emission computed tomography (SPECT) have been utilized for automatic PD detection.

These modalities allow for the visualization of the brain and can identify modifications in both the structure and functioning of the brain. Several studies have shown that MRI has the capability to identify alterations in the brain composition of individuals with PD. As an example, Biundo et al. [12] found that PD patients had significant changes in brain volume compared to healthy controls (HC). PET and SPECT have also been used to visualize changes in dopamine activity in the brain, which is a hallmark of PD. However, these modalities are expensive and may not be feasible for routine clinical use.

Non-imaging modalities such as electroencephalography (EEG), electromyography (EMG), and electrooculography (EOG) have also been used for the automated detection of PD. These modalities measure the electrical activity in the brain, muscles, and eyes, respectively. Several studies have shown that EEG can detect changes in brain activity in PD patients. Accordingly, Anjum et al. [13] found that EEG could differentiate PD patients from HC with 85.3% accuracy. EMG and EOG have also been used to detect changes in muscle activity and eye movements, respectively, in patients with PD. However, these modalities may not be


as specific as imaging modalities and may require additional validation.

Sensor-based modalities such as accelerometers, gyroscopes, and force sensors have been used for the automated detection of PD. These modalities are capable of capturing and analyzing movement data, facilitating the identification of potential variations in motor function among individuals diagnosed with PD. According to multiple research studies, it has been demonstrated that sensor-based modalities are capable of precisely detecting modifications in motor function among individuals suffering from PD. For example, Williamson et al. [14] found that a wrist-worn accelerometer could detect tremors in PD patients and detect PD with an area under the ROC curve (AUC) of 0.69 on gait data. However, these modalities may not be as sensitive as imaging or non-imaging modalities and may require additional validation. Although every modality exhibits distinct advantages and drawbacks, gait disturbances represent a prevalent manifestation of PD and can offer significant insights into the prompt identification and assessment of the condition.

In recent years, researchers have focused on using gait signals for PD detection and grading. Several studies have investigated the correlation between gait signals and PD using ML algorithms.

Zeng et al. [15] used deterministic learning theory to distinguish PD gait patterns from those of HC subjects using gait signals. Their methodology comprises a training phase, during which gait dynamics are extracted and saved in constant radial basis function neural networks. Additionally, a classification phase is executed, where a set of dynamical estimators is generated to contrast the gait patterns of test PD patients and HC subjects with the training set. The results show an accuracy of 96.39%, indicating the effectiveness of the features and classifiers used in separating gait patterns between the groups.

Joshi et al. [16] proposed a model for identifying Parkinson's gait using a combination of wavelet analysis and support vector machine (SVM). The findings demonstrated that their approach attains a classification accuracy of 90.32% by using a single gait parameter and achieves 100% accuracy when all parameters from the left leg are employed. For specific gait variables, the Haar wavelet outperformed the Daubechies2 wavelet. Consequently, the study implies that wavelet analysis presents a promising method for the automated and non-invasive classification of NDDs. Abdulhay et al. [17] employed the publicly available Physionet dataset in their investigation to classify PD and HC. They used peak detection, pulse duration algorithms, medium Gaussian SVM, and medium tree. They extracted several gait tremor features, such as stride time, stance time, swing time, and foot strike profile, which were significantly different between the two groups, to achieve an average accuracy of 92.7%.

Khoury et al. [18] described an ML technique for the diagnosis of PD that employs vertical GRFs (VGRFs). Their approach consists of four distinct steps: data pre-processing, feature extraction and selection, data classification, and performance evaluation. Initially, the study excluded the first and last 20 s of data to eliminate starting and stopping artifacts. Next, they extracted 19 distinct features from the remaining data. Feature selection was performed using a wrapper approach based on the random forest (RF) algorithm. Their methodology involved supervised classification techniques such as K-nearest neighbor (KNN), decision tree (DT), RF, naïve Bayes (NB), SVM, and unsupervised methods such as K-means and the Gaussian mixture model (GMM). The SVM model achieved a classification accuracy rate of 90.32% in distinguishing between PD individuals and HC subjects. The effectiveness of the methodology was evaluated using a distinct dataset to classify PD against two other NDDs. The KNN algorithm achieved an accuracy of 83.33% in differentiating PD from HD and an accuracy of 92.86% in classifying PD from ALS.

El Maachi et al. [19] applied the analysis of gait signals using a one-dimensional (1D) convolutional neural network (CNN) to build a classifier. Their algorithm achieved an accuracy of 98.7% in PD detection and 85.3% in predicting the severity of the disease. Wang et al. [20] proposed a gait classification system that combines signal processing and ML techniques to detect gait anomalies and assess the severity of PD patients. Their method utilized a VGRF time series, wavelet analysis to extract discriminant gait features, and an SVM for classification. Their study reported a classification accuracy of 98.20% for early PD detection and 96.69% for severity detection. Their approach offers a promising non-invasive tool for automatic NDD classification.

Vidya and Sasikumar [21] proposed a model based on gait analysis, using a hybrid CNN-LSTM network to predict PD severity. Initially, the researchers utilized variability analysis to obtain noteworthy VGRF signals. Subsequently, they decomposed these signals utilizing the empirical mode decomposition (EMD) technique to extract the significant intrinsic mode functions (IMFs) that contain crucial gait features. In the second step, the dominant IMFs of the selected VGRF signals were identified through power spectral analysis, and the CNN-LSTM classifier model was trained using these features. Their approach attained an accuracy of 98.32% in multi-class classification. Klinton Amaladass et al. [22] proposed a novel technique for the early detection of PD using a signal processing and feature extraction approach, namely shifted extended local binary pattern (S-ELBP), and an artificial neural network (ANN) classifier. Their method achieved promising results with an accuracy of 97.6%. Ma et al. [23] proposed a model that utilizes feature extraction, dimension reduction, data balancing mechanisms, CNN, and three ML models, namely SVM, KNN, and extreme gradient boosting (XGBoost). Using XGBoost, the performance analysis had an accuracy of 97.32%, while using CNN, the accuracy was 98.4%.

In general, analysis of gait signals, which can detect changes in motor function, has the potential to serve as a valuable diagnostic tool for PD. The use of VGRF signals in PD detection and severity grading provides several benefits. First, gait disturbances are a common symptom of PD, and the VGRF signals can provide valuable information about gait patterns. Secondly, the VGRF signal can be utilized to analyze the variability of gait patterns, which can be a sensitive indicator of disease progression. Thirdly, the VGRF signal can be decomposed into IMFs, which contain vital gait features that can be used to train ML models for accurate disease diagnosis and severity grading. Finally, the use of VGRF signal analysis is non-invasive and cost-effective, making it an attractive approach for the early detection and monitoring of PD. However, further research is necessary to standardize gait analysis techniques and determine their clinical applicability in PD diagnosis and management.

3. Materials and methods

The GRF sensor, also known as a force plate or pressure plate, is used to measure the forces exerted on the ground during human movements. These sensors can provide valuable information about gait patterns, balance, and stability, and can measure various parameters such as peak forces, force duration, and patterns, which can be used to quantify gait abnormalities. Different walking patterns can result in different values of this force, making it an ideal tool for researching personal gait analysis. Fig. 1 illustrates our proposed multistage DL approach, including detection and grading.

3.1. Materials

Physionet Gait in PD [24], a public access dataset comprised of gait data, was used in this study. Data included 93 Parkinsonian patients (mean age 66.3 years, 63% male) and 73 HC subjects (mean age 63.7 years, almost 55% male). Gait signal measurements were performed with dual-tasking and the usual walking procedure [25]. During normal walking, VGRFs were measured as the subjects walked on level ground for 20 m and returned to their initial location at their natural and usual pace for about 2 min [25].

Eight sensors under each foot measured the force as a function of time (in Newtons). Each of these 16 sensor outputs was digitized and recorded at a sampling frequency of 100 Hz, together with two other 1D VGRF signals representing the sum of the eight sensor outputs for each foot.
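As a small illustration of how the two per-foot sum signals relate to the 16 raw sensor channels, the following plain-Python sketch adds up the eight readings under each foot for every sample. The row layout (time first, then eight left-foot values, then eight right-foot values) and the function name `total_forces` are assumptions made for this example only; the actual file layout should be checked against the dataset documentation.

```python
# Assumed layout of one sample row (hypothetical, for illustration only):
# [time_s, left_1 .. left_8, right_1 .. right_8], forces in Newtons.
N_SENSORS_PER_FOOT = 8
FS_HZ = 100  # sampling frequency stated in the text


def total_forces(rows):
    """Return (left_total, right_total, both_feet_total) for every sample."""
    totals = []
    for row in rows:
        left = sum(row[1:1 + N_SENSORS_PER_FOOT])
        right = sum(row[1 + N_SENSORS_PER_FOOT:1 + 2 * N_SENSORS_PER_FOOT])
        totals.append((left, right, left + right))
    return totals


# Two synthetic samples: each left sensor reads 10 N, each right sensor 20 N.
demo = [[t / FS_HZ] + [10.0] * 8 + [20.0] * 8 for t in range(2)]
print(total_forces(demo))  # [(80.0, 160.0, 240.0), (80.0, 160.0, 240.0)]
```

The third value in each tuple is the sum over both feet, which is the kind of single combined signal the Methods section builds on.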


Fig. 1. Proposed framework for detecting and evaluating PD through automation.

Dual-tasking was conducted by asking subjects to serially subtract seven from a three-digit number (e.g., 200, 193, 186) while walking along the same pathway and under the same conditions as the usual walking. Consequently, 306 signals were recorded, forming an unbalanced set (70% PD with various UPDRS scores and 30% HC).

In addition, this database contains demographic information and measures of disease severity (e.g., Hoehn & Yahr (HY) staging and/or UPDRS). In PD severity prediction, we used UPDRS, where we converted the continuous scale into five levels:

Class 1: UPDRS < 4
Class 2: 5 ≤ UPDRS < 15
Class 3: 15 ≤ UPDRS < 25
Class 4: 25 ≤ UPDRS < 35
Class 5: 35 ≤ UPDRS

3.2. Methods

In this study, the total force signals from the right and left feet (TFRF and TFLF) were first extracted for each subject. An analysis of the gait must consider the coordination of the legs. Merely considering a one-foot movement signal would not be sufficient; thus, we totaled up the force applied by each foot (TFLF + TFRF). These cumulative VGRFs are referred to as CVGRFs in this study. Fig. 2 illustrates the TFLF, TFRF, and CVGRF of HC and five-class PD patients.

Fig. 2. The TFLF, TFRF, and CVGRF of individuals a) without any health conditions and b-f) those diagnosed with PD classified according to the five UPDRS classes.

The CVGRF data were segmented into equal segments of 30-s windows (equivalent to 3000 samples). This segmentation function ignores signals with a duration of fewer than 30 s. It also segments a signal with more than 30 s of samples into segments with 30-s windows and ignores the remaining part of the signal. As an example, a signal that lasts 95 s is divided into three signals of 30 s each, and the remaining 5 s are ignored. This segmentation may not only shorten the time of feature extraction and training of the network, but it may also contribute to making our method more suitable for real-time applications. It is also possible that some gait patterns occur when walking for a shorter period of time. Furthermore, it is more convenient for the patient to record a shorter signal during a real-time clinical diagnosis.

3.2.1. Feature extraction

The performance of LSTM models can be improved by extracting features from the data rather than relying solely on conventional LSTMs [26]. To accomplish this, TF features and FFs were extracted.

3.2.1.1. Time-frequency features. The TF analysis is a non-linear dynamics technique that enables the visualization of state recurrences and extraction of unique features by transforming a 1D signal into a 2D space. To determine which features to extract, we are inspired by the idea that a CNN can be trained using TF images, such as spectrograms. The approach must be translated to 1D signals since we intend to use LSTMs rather than CNNs. TF moments, including instantaneous frequency (IF) and spectral entropy (SE), are utilized as a technique to obtain relevant information from spectrograms.

• Instantaneous Frequency

IF is a dynamic parameter that varies with time in a non-stationary signal, which represents the rate of change of the signal's phase with respect to time [27,28]. It is computed as the average frequency of the signal over a small time window as it evolves over time $t$. For the calculation of IF, MATLAB's instfreq function was utilized, where the function interpolates the signal to a uniform grid if the given signal is not uniformly sampled. In this instance, the computation of the IF involves determining the first conditional spectral moment of the TF distribution of the given input signal, as expressed by Equation (1):

$$f_{\mathrm{inst}}(t) = \frac{\int_0^{\infty} f\,P(t,f)\,df}{\int_0^{\infty} P(t,f)\,df} \tag{1}$$

Initially, the algorithm computes the power spectrum $P(t,f)$ of the input signal using the short-time Fourier transform, generating a spectrogram that represents the signal's TF distribution. Then, it employs Equation (1) to estimate the IF.

• Spectral Entropy

In the frequency domain, SE is computed by taking the Shannon entropy of the normalized power distribution of the signal. In this context, Shannon entropy refers to the signal's SE. SE equations are derived from equations for power spectra and probability distributions. For a 1D signal $x(n)$, the power spectrum is $S(m) = |X(m)|^2$, where $X(m)$ is the discrete Fourier transform of $x(n)$. The probability distribution $P(m)$ is then:

$$P(m) = \frac{S(m)}{\sum_i S(i)} \tag{2}$$

The SE, denoted as $H$, is given as:

$$H = -\sum_{m=1}^{N} P(m)\log_2 P(m) \tag{3}$$

It is then normalized by dividing by $\log_2 N$, which represents the maximum possible SE, that of a frequency-domain white noise signal with a uniform distribution. The probability distribution of a TF power spectrogram $S(t,f)$ can be represented mathematically as:

$$P(m) = \frac{\sum_t S(t,m)}{\sum_f \sum_t S(t,f)} \tag{4}$$

SE is still the same as Equation (3). To compute the instantaneous SE given a TF power spectrogram $S(t,f)$, the probability distribution at time $t$ is:

$$P(t,m) = \frac{S(t,m)}{\sum_f S(t,f)} \tag{5}$$

Then the SE at time $t$ will be:

$$H(t) = -\sum_{m=1}^{N} P(t,m)\log_2 P(t,m) \tag{6}$$
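To make Equations (2), (3), (5), and (6) concrete, the following is a minimal plain-Python sketch, not the MATLAB routine used in the study. The helper names (`power_spectrum`, `spectral_entropy`, `instantaneous_spectral_entropy`) are hypothetical, the DFT is a direct O(N²) sum that is only practical for short demo signals, and the full two-sided spectrum is used for simplicity; library implementations may differ in windowing and in how the spectrum is defined.

```python
import cmath
import math


def power_spectrum(x):
    """S(m) = |X(m)|^2, with X(m) computed by a direct DFT sum."""
    n = len(x)
    spectrum = []
    for m in range(n):
        acc = sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(power, normalize=True):
    """Eqs. (2)-(3): Shannon entropy of the normalized power distribution.

    Assumes the total power is non-zero. With normalize=True the result is
    divided by log2(N), so a flat (white-noise-like) spectrum gives 1.
    """
    total = sum(power)
    probs = [s / total for s in power]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return entropy / math.log2(len(power)) if normalize else entropy


def instantaneous_spectral_entropy(spectrogram):
    """Eqs. (5)-(6): spectrogram[t][m] holds S(t, m); returns H(t) per frame."""
    return [spectral_entropy(frame, normalize=False) for frame in spectrogram]


# A unit impulse has a flat power spectrum, so its normalized SE is maximal.
impulse = [1.0] + [0.0] * 15
print(round(spectral_entropy(power_spectrum(impulse)), 6))  # 1.0
```

Passing each spectrogram time slice through `spectral_entropy` is all that Equation (6) requires, which is why the instantaneous version is a one-line wrapper here.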


SE was calculated by using MATLAB's pentropy function.
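The segmentation rule and Equation (1) can be tied together in a short plain-Python sketch. It is an illustration under stated assumptions rather than a reimplementation of MATLAB's instfreq: the names `segment`, `first_spectral_moment`, and `instantaneous_frequency` are hypothetical, frames are rectangular and non-windowed, the integral in Equation (1) is replaced by a sum over one-sided DFT bins, and the DFT is a direct sum suitable only for short frames.

```python
import cmath
import math

FS_HZ = 100      # sampling frequency stated in the text
WINDOW_S = 30    # segment length stated in the text


def segment(signal, fs=FS_HZ, window_s=WINDOW_S):
    """Split into non-overlapping windows; signals shorter than one window
    yield no segments and any trailing remainder is dropped."""
    size = fs * window_s
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def first_spectral_moment(frame, fs):
    """Discrete form of Eq. (1) for one frame: sum(f * P) / sum(P).

    Assumes the frame has non-zero power.
    """
    n = len(frame)
    num = den = 0.0
    for m in range(n // 2 + 1):  # one-sided spectrum, 0 .. fs/2
        xm = sum(frame[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        power = abs(xm) ** 2
        num += (m * fs / n) * power
        den += power
    return num / den


def instantaneous_frequency(signal, fs, frame_len, hop):
    """One IF estimate per short-time frame."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [first_spectral_moment(signal[s:s + frame_len], fs) for s in starts]


# A 95-s recording at 100 Hz gives three 30-s segments; 5 s are discarded.
segments = segment([0.0] * 9500)
print(len(segments), len(segments[0]))  # 3 3000

# A pure 10 Hz tone should give an IF of about 10 Hz in every frame.
tone = [math.cos(2 * math.pi * 10 * k / FS_HZ) for k in range(200)]
print([round(v, 3) for v in instantaneous_frequency(tone, FS_HZ, 50, 50)])
```

In practice the IF sequence would be computed on each 30-s segment returned by `segment`, giving one feature sequence per segment for the network.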


3.2.1.2. Fuzzy features. The fuzzy analysis technique, which is commonly used in non-linear dynamics, involves transforming a 1D signal into a 2D space. This method extracts unique features that describe the behavior of various complex dynamic mechanisms underlying non-linear time series. Extracting these novel features enhances the abilities of LSTM networks for robust signal classification as well as signal compression efficiency for DL models [26]. This study subjected CVGRFs to fuzzy recurrence image entropy (FRIE) and fuzzy recurrence entropy (FRE) extraction.

• Fuzzy Recurrence Plot

In the study of dynamical systems, a common approach is to convert a sequence of values into objects in space for analysis purposes. This transformation enables the sequence to be examined in space, and the resulting space is known as phase space. The set of objects within phase space is referred to as the phase-space object. One way to convert a sequence of values into a phase-space object is through time-delay embedding [26]. It should be noted that the embedding dimension pertains to the spatial extent, such as a line, area, or volume, which encloses the object in phase space. A time delay, also known as lag, is an expression of how much an event is offset by another event. In mathematical terms, phase-space reconstruction can be performed for a time series $(z_1, z_2, \ldots, z_I)$ using time-delay embedding as follows:

$$y_i = \left(z_i, z_{i+\varphi}, \ldots, z_{i+(d-1)\varphi}\right), \quad i = 1, \ldots, I-(d-1)\varphi \tag{7}$$

where $\varphi$ and $d$ are the time delay and embedding dimension, respectively. Fuzzy sets, used in fuzzy logic, allow for partial membership of elements. Traditional sets have elements that either belong or do not belong to the set, while fuzzy sets allow for a degree of membership ranging from 0 to 1, where zero indicates no membership, and one indicates full membership. Fuzzy sets were introduced by Zadeh [29] as a way to represent uncertainty and vagueness in human reasoning. A detailed description of fuzzy sets can be found in Ref. [29].

In brief, mathematically, let $U$ be the universe of discourse and $F$ its subset. The fuzzy set $F$ is defined by its membership function $\mu_F(x)$, which maps each element $x \in U$ to the interval $[0,1]$: $\mu_F(x): U \to [0,1]$. The real value of $\mu_F(x)$ is called the fuzzy membership grade of $x$ in $F$. Accordingly, the greater the value of the fuzzy membership grade, the greater the probability that $x$ is a member of $F$.

Fuzzy recurrence plot (FRP) is a visualization tool used in the analysis of time series data. It is a variation of the recurrence plot (RP) technique, which is commonly used to study the recurrence properties of a system [30,31]. FRP presents recurrences of states of a dynamical system as a grayscale image, where each pixel represents the degree of similarity or membership between two states in the phase space. The membership is calculated using fuzzy sets in the fuzzy logic context, allowing for a more detailed representation of the system's recurrence properties [26]. In contrast to the binary RP, which only considers the presence or absence of recurrences, FRP provides a more refined view of the system's behavior, making it a preferred texture analysis approach [32]. FRP can be used for pattern recognition tasks, including the classification of PD patients and control subjects using DL [33,34].

Briefly, FRP is a representation of a time series signal transformed

• Fuzzy Recurrence Image Entropy

FRIE is a measure of the complexity or irregularity of an FRP image. It is based on Shannon entropy, which measures the uncertainty or randomness of a system. FRIE is calculated by first computing the normalized histogram of gray levels in the FRP image and then calculating the Shannon entropy of the histogram. A higher FRIE value indicates a more complex and irregular FRP image, while a lower FRIE value indicates a more regular and predictable image. Mathematically, FRIE is defined as follows:

$$E_{\mathrm{FRI}} = -\sum_{k=1}^{K} P_k \log_2 P_k \tag{9}$$

Here, $K$ represents the number of gray levels in the FRP, and $P_k$ is the probability of the intensity level in the $k$-th bin of the normalized histogram. The real values of pixels, ranging from 0 to 1, are converted to integers in the range of 0 to 255 to obtain the gray levels of the FRP.

• Fuzzy Recurrence Entropy

FRE is a quantitative measure that describes the irregularity or complexity of an FRP image. It is calculated as the Shannon entropy of the normalized histogram of the FRP image, where each bin represents the probability of occurrence of a certain intensity level. The non-probabilistic entropy of a fuzzy set provides a definition for the entropy of an $N \times N$ FRP, also known as the entropy of fuzzy recurrences. This entropy measures the level of uncertainty associated with recurrences in the reconstructed phase space of a signal. Specifically, it is defined as follows:

$$E_{\mathrm{FR}} = -\sum_{i=1}^{N}\sum_{j=1}^{N}\Big[\mu(x_i,x_j)\log_2 \mu(x_i,x_j) + \big(1-\mu(x_i,x_j)\big)\log_2\big(1-\mu(x_i,x_j)\big)\Big] \tag{10}$$

where $\mu(x_i, x_j)$ corresponds to $\tilde{R}(i,j)$ defined in Equation (8). The FRE value ranges from 0 to 1, where a higher value indicates higher irregularity or complexity in the FRP image.

3.2.2. Normalization

All extracted features are then normalized: the mean of each sample is subtracted from it, and the result is divided by the standard deviation.

3.2.3. Bayesian optimization

The Bayesian optimization method is a sequential optimization strategy that does not assume a functional form for black-box functions. The method is typically used to optimize functions with a high evaluation cost. It is commonly used in ML and engineering applications where the goal is to find the best set of parameters or hyperparameters for a model. The technique involves building a probabilistic model of the function that is being optimized, using a prior distribution to capture any existing knowledge or assumptions about the function. The model is then updated as new evaluations of the function are made, incorporating the observed data to improve its accuracy and reduce uncertainty. At each iteration of
into a 2D (phase) space using fuzzy sets. It is denoted by R ~ and is defined the optimization process, the algorithm selects the next point to evaluate
as a square matrix of size N  N, where N is the length of the time series based on a trade-off between exploration and exploitation. The algorithm
signal. Each element R ~ ði; jÞ of the matrix represents the fuzzy recurrence balances the need to explore regions of the input space that are uncertain
rate between the i-th and j-th phase-space vectors of the time series or promising with the desire to exploit regions that are likely to yield the
signal, which is determined based on their fuzzy membership grades and best results. Using Bayesian optimization, a scalar objective function f ðxÞ
ranges between 0 and 1. is minimized for x concerning a bounded domain. Depending on the
function, it may return deterministic or stochastic results, meaning that
 
~ ði; jÞ ¼ μ xi ; xj ; i ¼ 1; …; N
R (8) the same function evaluated at the same point x may provide different
results.


When training a deep neural network, it is crucial to regulate
numerous hyperparameters that are both important and sensitive, such as the
learning rate, weight decay, and optimization algorithm, as well as some less
sensitive hyperparameters, such as the momentum setting. These hyperparameters
control various aspects of the training process and can significantly impact
the performance and convergence of the network. Therefore, it is necessary to
optimize these hyperparameters to achieve the best possible results. Among all
the hyperparameters, only two were optimized: owing to their importance, the
number of hidden units in the LSTM layer and the initial learning rate were
tuned using Bayesian optimization. MATLAB's bayesopt function was employed to
fine-tune these parameters. The other hyperparameters were selected based on
evidence reported in the literature.

• Initial Learning Rate

The learning rate is the parameter that controls the size of the weight and
bias changes made by the training algorithm. The initial learning rate is a
hyperparameter that determines the size of the update to the model's weights
during the initial phase of training. A high initial learning rate can cause
the model to overshoot the optimal weights, leading to poor performance or
even divergence, while a low initial learning rate can result in slow
convergence and suboptimal results. Therefore, choosing an appropriate initial
learning rate is critical for achieving good performance in DL models.

• Hidden Units of the LSTM Layer

The number of hidden units is a positive integer that sets the
dimensionality of the output space of the LSTM layer. It corresponds to the
amount of information the layer remembers between time steps (the hidden
state). The hidden state can contain information from all previous time steps,
regardless of the length of the sequence. However, excessive hidden units may
result in overfitting, where the layer learns to fit the training data too
well, leading to poor generalization on unseen data. The number of hidden
units does not constrain the number of time steps that the layer can process
during an iteration.

3.2.4. Uni- and Bi-directional LSTM

• Uni-directional LSTM

LSTM networks belong to the family of RNNs that specialize in learning
long-term dependencies between time steps in sequence data. An LSTM layer
processes a time series $x_t$, where the output $h_t$ represents the hidden
state and $c_t$ represents the cell state at each time step $t$. The first block
of the LSTM computes the initial output and updated cell state using the
initial network state and the first time step. Subsequently, at each time step
$t$, the block uses the current network state $(c_{t-1}, h_{t-1})$ and the next
time step of the sequence to compute the output and the updated cell state
$c_t$.
In an LSTM layer, two states exist, namely, the hidden state (also known as
the output state) and the cell state. At each time step $t$, the hidden state
contains the output of the LSTM layer, while the cell state retains
information from previous time steps. Gates are employed to regulate the flow
of information in and out of the cell state at each time step. These gates
control the degree of addition or deletion of information in the cell state,
thus maintaining its relevance to the current state of the sequence. The
components listed in Table 1 play a significant role in determining the cell
state and the hidden state of the layer.
The parameters that undergo learning in an LSTM layer consist of the input
weights ($W$), the recurrent weights ($R$), and the bias ($B$). The learnable
input weights pertain to the transformation of input data to the hidden state.
The recurrent weights handle the influence of previous time steps on the
current state, while the bias contributes to the decision-making process of
the model. The concatenation of the input weights, recurrent weights, and bias
for each component is represented by the matrices $W$, $R$, and $B$,
respectively, as given by Equation (11):

$$W = \begin{bmatrix} W_i \\ W_f \\ W_g \\ W_o \end{bmatrix}, \quad R = \begin{bmatrix} R_i \\ R_f \\ R_g \\ R_o \end{bmatrix}, \quad B = \begin{bmatrix} B_i \\ B_f \\ B_g \\ B_o \end{bmatrix}. \tag{11}$$

The subscripts denote the four components of the LSTM layer: the input gate
($i$), forget gate ($f$), cell candidate ($g$), and output gate ($o$). These
gates play a crucial role in determining the amount of information to keep,
forget, or update in the cell state. The cell state at time step $t$ is the
outcome of these gates' collective influence on the previous cell state,
current input, and output from the hidden state, as follows:

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t, \tag{12}$$

where $\odot$ represents the Hadamard product (element-wise multiplication of
vectors). Equation (13) represents the hidden state at time step $t$.

$$h_t = o_t \odot \sigma_c(c_t). \tag{13}$$

The state activation function is represented by $\sigma_c$. At each time step
$t$, the components that determine the state of the LSTM layer are listed in
Table 1.
The gate activation function $\sigma_g$ is the sigmoid function, which is
expressed as follows:

$$\sigma(x) = \left(1 + e^{-x}\right)^{-1}. \tag{14}$$

The input at each time point is formed by concatenating the four extracted
features of the segment that corresponds to the same time point.

• Bi-directional LSTM

An LSTM layer that learns long-term bi-directional relationships between
time steps is called a Bi-LSTM layer. The network can benefit from these
dependencies when learning from the entire time series at each time interval.
In other words, a Bi-LSTM is an extension of the conventional Uni-LSTM that
can increase sequence classification performance. Instead of training a single
LSTM on the input time series, a Bi-LSTM architecture is trained
simultaneously with hidden forward and backward layers in both time
directions. The LSTM layers used a hyperbolic tangent activation function,
followed by a time-distributed fully connected layer (i.e., the output at each
time step passes through the fully connected layer) with two units for the PD
detection approach and five units for the severity prediction approach,
followed by SoftMax activation.

4. Experimental results and discussion

The dataset used in this study consisted of 166 individuals, including
73 HC and 93 PD patients labelled according to their stage of disease
progression using the UPDRS scale.

Table 1
Description of the formulas for LSTM architecture components at time step t.

Component        Formula
Input gate       i_t = σ_g(W_i x_t + R_i h_{t-1} + B_i)
Forget gate      f_t = σ_g(W_f x_t + R_f h_{t-1} + B_f)
Cell candidate   g_t = σ_c(W_g x_t + R_g h_{t-1} + B_g)
Output gate      o_t = σ_g(W_o x_t + R_o h_{t-1} + B_o)
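The gate formulas of Table 1 together with Equations (12) to (14) can be illustrated with a minimal pure-Python sketch of one LSTM time step. This is not the MATLAB implementation used in the study: the helper names (`affine`, `lstm_step`, `zeros`), the all-zero demonstration weights, and the choice of two hidden units are assumptions made only for illustration. The state activation is taken to be the hyperbolic tangent, and each input has four entries to mirror the four extracted features.

```python
import math

def sigmoid(x):
    """Gate activation sigma_g, Eq. (14)."""
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, R, h, b):
    """Compute W x + R h + b, with matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(r * hi for r, hi in zip(r_row, h))
        + b_k
        for w_row, r_row, b_k in zip(W, R, b)
    ]

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step; params[name] = (W, R, B) for name in 'i', 'f', 'g', 'o'."""
    pre = {name: affine(W, x_t, R, h_prev, B) for name, (W, R, B) in params.items()}
    i_t = [sigmoid(v) for v in pre["i"]]      # input gate
    f_t = [sigmoid(v) for v in pre["f"]]      # forget gate
    g_t = [math.tanh(v) for v in pre["g"]]    # cell candidate (sigma_c = tanh)
    o_t = [sigmoid(v) for v in pre["o"]]      # output gate
    # Eq. (12): element-wise update of the cell state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Eq. (13): hidden state from the output gate and the activated cell state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# Hypothetical sizes: four input features per time step, two hidden units.
n_features, n_hidden = 4, 2
params = {
    name: (zeros(n_hidden, n_features), zeros(n_hidden, n_hidden), [0.0] * n_hidden)
    for name in "ifgo"
}
h, c = [0.0, 0.0], [2.0, -2.0]
for x_t in [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]:
    h, c = lstm_step(x_t, h, c, params)
```

With all weights set to zero, every gate evaluates to 0.5 and the cell candidate to 0, so the cell state is simply halved at each step; this makes the role of the forget gate in Equation (12) easy to check by hand.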


The segmentation technique was employed on the premise that the
feature dimensions of each subject are uniform, thereby ensuring that the
neural network does not experience any adverse effects. Creating signals with
equal duration (30-s windows) has several benefits for training networks that
process sequential data. When dividing the data into mini-batches for
training, having signals of equal duration allows for easier processing and
less computation time. This is because equal-length signals can be packed into
a single tensor directly, without the padding or truncation otherwise needed
to give all signals the same length. Padding and truncation can degrade the
performance of the network: padding can add irrelevant information, while
truncation can remove important information.
Then, without any pre-processing, we extracted four features, comprising TF
features and FFs, from the CVGRF signal. These features produce a series of
numerical values, which are utilized as the input of the network. Our findings
demonstrate that the four features used in this study are highly informative.
Fig. 3 illustrates a sample FRP image from the HC and five UPDRS groups. These
FRP images generate the FFs (FRIE and FRE features). As per Fig. 3, there
seems to be a clear-cut trend in the FRP illustrations that relate to each
group, specifically in CVGRF, which makes them significant features for
further examination and utilization. Furthermore, they are capable of
characterizing complex, non-linear systems by quantifying pattern recurrences
within them. Using these features as inputs to train an LSTM network has
several advantages. First, they can capture the non-linear dynamics of the
system, which is particularly important for time series data. Secondly, they
can reduce the dimensionality of the input data, making it easier to train the
LSTM network and decreasing the risk of overfitting. Thirdly, FFs are capable
of handling fuzzy or uncertain data, which is especially useful when working
with real-world data that is often noisy and imprecise. Incorporating fuzzy
set theory into the feature extraction process can capture the inherent
uncertainty and variability of the data, leading to more robust and accurate
results.
In both phases, the Bayesian optimization algorithm was employed to optimize
two hyperparameters that are difficult to choose. The Bayesian optimization
algorithm iteratively updates the parameters of each curve until they converge
to the local minimum for each stage and classifier, resulting in improved
accuracy rates for the gait detection system. By analyzing the objective
function curve, we can determine the most effective combination of the initial
learning rate and hidden units of the LSTM layer for achieving high accuracy
rates in the system.
Fig. 4 depicts the process of optimizing the learning rate and the number of
hidden units. This curve represents the changes in loss over 30 iterations
based on the model mean and the model minimum feasible. The fine-tuned
training options obtained with the Bayesian optimization algorithm are
summarized in Table 2 for each stage and classifier. In the training
environment, the MaxEpochs is 80, and the MiniBatchSize is 150.
After observing the trend of the curve and identifying the optimal values for
both parameters, we compare the final cost values of each curve to determine
which one converges faster. Fig. 5 compares the observed minimum function and
estimated minimum function for the detection and grading phases of PD and the
networks used in this study with the purpose of minimizing the loss function.
The proximity of the minimum observed function to the minimum estimated
objective can offer information on the accuracy and reliability of the
estimation method utilized.
We then trained the Uni-LSTM and Bi-LSTM networks using the training set and
tested the accuracy of the models using 10-fold cross-validation to achieve a
generalizable result. The results showed that the LSTM network achieved an
accuracy of 99.19% in classifying PD patients and HC subjects, while the
Bi-LSTM network achieved an accuracy of 97.28%. The sensitivity and
specificity of the LSTM network were 99.12% and 99.72%, respectively, whereas
the sensitivity and specificity of the Bi-LSTM network were 97.76% and 97.72%,
respectively. The LSTM network outperformed the Bi-LSTM network for the
detection phase in all evaluation metrics summarized in Table 3. The results
suggest that the LSTM network is a more suitable model for the detection of PD
in a two-class classification task. The results are reported as average values
and standard deviations based on 10-fold cross-validation.
In the severity prediction phase, we created a balanced dataset with an equal
representation of each stage of disease progression using the synthetic
minority oversampling technique (SMOTE). The results showed that the Bi-LSTM
network achieved an accuracy of 92.28%, while the LSTM network achieved an
accuracy of 90.29% in classifying patients into the five stages. The
sensitivity and specificity of the Bi-LSTM network were 92.35% and 97.68%,
respectively, whereas the sensitivity and specificity of the LSTM network were
90.36% and 96.92%, respectively. The Bi-LSTM network outperformed the LSTM
network in the grading phase, as shown in Table 4. The results suggest that
the Bi-LSTM network is a more suitable model for the classification of PD
patients into the five stages of disease progression.
By using only one signal rather than multiple sources and extracting only four
valuable features, we could develop automated diagnostic systems for the early
detection of PD, which can improve the QoL of PD patients and monitor disease
progression. Additionally, our proposed method uses a 2-layer
Uni-LSTM/Bi-LSTM model that requires only 51 kilobytes (KB) of computer memory
and can be easily implemented on a microcontroller for real-time applications
[35]. Moreover, the elimination of pre-processing steps can reduce the
computational complexity of the model and improve its speed and scalability,
making it more practical for real-world applications. Table 5 provides a
review of previous research on automated methods for distinguishing PD
patients from HC subjects using the same dataset.
Abdulhay et al. [17] used a Chebyshev2 high-pass filter to remove
low-frequency components associated with changes in body orientation during
the measurement process that may affect the outcome of time-domain gait
analysis. They applied a fast Fourier transform to analyze the signal in the
frequency domain and, to obtain maximum passband flatness, a second-order
Butterworth filter as a pre-processing filter. Using peak detection and pulse
duration, they extracted stride time, stance time, swing time, and foot strike
profile as features. These features were then classified into normal and PD.
Their efforts, while commendable, do not result in greater accuracy than our
approach, which uses no pre-processing and a Uni-LSTM.
Nineteen features were extracted by Khoury et al. [18] with considerable
effort. Then, they selected the most relevant subset of features by using a
wrapper feature selection technique. They tested their proposed method using
various supervised classifiers and two unsupervised models. They achieved an
accuracy rate of 90.32% using the SVM classifier to classify PD and HC
subjects. Even though the process of building CNNs from scratch is
computationally expensive, El Maachi et al. [19] invested considerable effort
in developing a unique model to detect and predict the severity of PD. A model
was developed to process 18 1D signals from foot sensors measuring VGRF. The
first part of their network consists of 18 parallel 1D-CNNs that correspond to
system inputs. To obtain a final classification, the outputs of the
concatenated 1D-CNNs are connected to a fully connected network. They achieved
98.7% accuracy in detecting PD and 85.3% accuracy in predicting severity.
Wang et al. [20] first constructed the phase space of the VGRF, which
preserves non-linear gait system dynamics. Then, the characteristic envelope
of the phase-space signal is extracted using Shannon energy. Subsequently, the
Shannon energy envelope is subjected to dual Q-factor signal decomposition,
which is derived from the tunable Q-factor wavelet transform, to isolate its
high and low resonance components. Afterward, the high and low resonance
components are decomposed into distinct intrinsic modes using variational mode
decomposition, which yields representative features. These features are then
inputted into five diverse ML-based classifiers to detect anomalies and rate
the severity of PD patients based on the HY scale. Regarding the detection of
PD and


Fig. 3. A sample image of FRP for a healthy subject and five-class PD, where each pixel indicates the degree of similarity between two states in the phase space.
Compared to the binary RP method, which only detects the presence or absence of recurrences, the FRP approach provides a more detailed and preferred texture
analysis of the system's behavior.
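As a rough illustration of how an FRP such as those in Fig. 3 and its entropy can be obtained, the sketch below chains time-delay embedding (Equation (7)), a fuzzy recurrence matrix (Equation (8)), and the entropy of fuzzy recurrences (Equation (10)). Two points are assumptions rather than the procedure of the study: the membership grade is replaced by a simple Gaussian similarity with a hypothetical width `sigma`, and the double sum is divided by N² so that the value stays within [0, 1]. The sine test signal and all function names are likewise illustrative.

```python
import math

def time_delay_embedding(z, d, phi):
    """Eq. (7): y_i = (z_i, z_{i+phi}, ..., z_{i+(d-1)phi})."""
    n_vectors = len(z) - (d - 1) * phi
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen d and phi")
    return [[z[i + k * phi] for k in range(d)] for i in range(n_vectors)]

def fuzzy_recurrence_plot(states, sigma=1.0):
    """N x N matrix of grades in [0, 1], in the spirit of Eq. (8).

    ASSUMPTION: a Gaussian similarity stands in for the fuzzy membership
    grade; the study's actual membership computation is not reproduced here.
    """
    def mu(a, b):
        sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return [[mu(a, b) for b in states] for a in states]

def fuzzy_term(mu):
    """-mu*log2(mu) - (1-mu)*log2(1-mu), with the limit 0 at mu = 0 or 1."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -mu * math.log2(mu) - (1.0 - mu) * math.log2(1.0 - mu)

def fuzzy_recurrence_entropy(frp, normalize=True):
    """Double sum of Eq. (10); ASSUMPTION: optionally averaged over N*N pixels."""
    total = sum(fuzzy_term(m) for row in frp for m in row)
    n = len(frp)
    return total / (n * n) if normalize else total

# Hypothetical test signal standing in for a 30-s CVGRF segment.
signal = [math.sin(0.3 * t) for t in range(60)]
states = time_delay_embedding(signal, d=3, phi=2)
frp = fuzzy_recurrence_plot(states, sigma=0.5)
fre = fuzzy_recurrence_entropy(frp)
```

Each pixel contributes most when its grade is close to 0.5 and nothing when it is exactly 0 or 1, which is why a crisp, binary-looking plot yields a low entropy and a blurred one a high entropy.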


Fig. 4. A model's objective function has been optimized by the Bayesian optimization algorithm to determine the optimal initial learning rate and the number of
hidden units. The test dataset's loss function was set as an objective during 30 iterations of optimization, with a range of [0.001 to 1] for the initial learning rate and
[50 to 120] for the number of hidden units. This optimization method can improve the performance of the model.
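The search summarized in Fig. 4 can be mimicked with a small, self-contained sketch of Bayesian optimization over the same two ranges and 30 evaluations. It does not reproduce MATLAB's bayesopt: the Gaussian-process surrogate with a fixed squared-exponential kernel, the expected-improvement acquisition maximized over random candidates, and above all `toy_loss`, a hypothetical stand-in for the loss of a trained LSTM, are assumptions chosen to keep the example short and dependency-free.

```python
import math
import random

def rbf(a, b, length=0.3):
    """Squared-exponential kernel on points of the unit square."""
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq / (2.0 * length ** 2))

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_sub(L, y):
    """Solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, noise=1e-4):
    """Factorize the kernel matrix once per iteration."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    y_mean = sum(y) / n
    alpha = backward_sub(L, forward_sub(L, [v - y_mean for v in y]))
    return L, alpha, y_mean

def gp_predict(model, X, x_new, noise=1e-4):
    """Posterior mean and standard deviation at x_new."""
    L, alpha, y_mean = model
    k = [rbf(x, x_new) for x in X]
    mean = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward_sub(L, k)
    var = max(1.0 + noise - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement for minimization."""
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def decode(u):
    """Map a unit-square point to (initial learning rate, hidden units)."""
    lr = 10.0 ** (-3.0 + 3.0 * u[0])      # log-uniform in [0.001, 1]
    units = 50 + int(round(70 * u[1]))    # integer in [50, 120]
    return lr, units

def toy_loss(lr, units):
    """HYPOTHETICAL stand-in for the loss of a trained LSTM."""
    return (math.log10(lr) + 2.4) ** 2 + ((units - 70) / 70.0) ** 2

def bayes_opt(n_init=5, n_iter=25, n_candidates=100, seed=0):
    rng = random.Random(seed)
    X = [(rng.random(), rng.random()) for _ in range(n_init)]
    y = [toy_loss(*decode(u)) for u in X]
    for _ in range(n_iter):
        model, best = gp_fit(X, y), min(y)
        candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]
        scores = [expected_improvement(*gp_predict(model, X, c), best) for c in candidates]
        u_next = candidates[scores.index(max(scores))]
        X.append(u_next)
        y.append(toy_loss(*decode(u_next)))
    i_best = y.index(min(y))
    return decode(X[i_best]), y[i_best], y

(best_lr, best_units), best_loss, history = bayes_opt()
```

Searching the learning rate on a logarithmic scale is a design choice of this sketch; it spreads the candidates evenly across the three decades between 0.001 and 1 instead of concentrating them near the upper bound.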

Table 2
Summary of fine-tuned training options using the Bayesian optimization.

Stage          Classifier   InitialLearnRate                     NumHiddenUnits
                            (Optimization Range = 0.001 to 1)    (Optimization Range = 50 to 120)
Two-classes    Uni-LSTM     0.0043                               73
               Bi-LSTM      0.0013                               50
Five-classes   Uni-LSTM     0.0566                               55
               Bi-LSTM      0.0038                               50

severity rating, the SVM classifier has reported classification accuracies
of 98.20% and 96.69%, respectively. Although their severity-rating accuracy
is higher, we achieved an accuracy of 92.285% using a time-efficient,
affordable model with one 1D signal, CVGRF, and four features. In addition, an
SVM model requires 1490 kB of memory storage when implemented in a
microcontroller, whereas a two-layer LSTM only requires 54 kB [35].
Klinton Amaladass et al. [22] used gait signals to extract S-ELBP features,
which were then utilized as inputs to an ANN for classification as either HC
or PD. Their methodology yielded a classification accuracy of 97.6%, which
was lower than that of our proposed method. Ma et al. [23] applied three
domains, namely the force domain, peak domain, and abnormality domain, in the
gait analysis to extract twenty-two distinct features. An analysis of variance
(ANOVA) with a recursive reduction technique was employed to determine whether
certain features should be eliminated. Ultimately, the researchers reduced the
number of sensors from 16 to 2 and identified six critical features based on
the obtained test results. By using XGBoost and CNN, they achieved
classification accuracies of 97.32% and 98.4%, respectively. In summary, we
achieved


Fig. 5. Model's minimum observed objective and minimum estimated objective during 30 iterations of employing Bayesian optimization to find the optimal value of
the initial learning rate and the number of hidden units of models.

Table 3
Two-class detection results of PD from healthy individuals: evaluation metrics presented as percentages of Accuracy, Sensitivity, Specificity, Precision, and F1-Score.
Accuracy Sensitivity Specificity Precision F1-Score

Uni-LSTM 99.192 ± 0.528 99.125 ± 0.641 99.727 ± 0.179 99.194 ± 0.566 99.149 ± 0.613
Bi-LSTM 97.283 ± 0.99 97.761 ± 1.447 97.725 ± 3.338 96.833 ± 2.174 97.220 ± 0.953

Table 4
PD grading results based on five-class severity: evaluation metrics presented as percentages of Accuracy, Sensitivity, Specificity, Precision, and F1-Score.
Accuracy Sensitivity Specificity Precision F1-Score

Uni-LSTM 90.291 ± 1.321 90.361 ± 1.389 96.920 ± 0.461 90.857 ± 1.219 90.149 ± 1.494
Bi-LSTM 92.285 ± 2.266 92.355 ± 2.246 97.689 ± 0.385 92.844 ± 1.755 92.082 ± 2.514
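For reference, the five metrics reported in Tables 3 and 4 can be derived from a confusion matrix as sketched below. How the study averaged the per-class values in the five-class case is not stated, so the macro average (unweighted mean over classes) used here is an assumption, and the example confusion matrix is invented purely to exercise the function.

```python
def classification_metrics(cm):
    """Metrics from a confusion matrix, where cm[t][p] counts samples of
    true class t predicted as class p.

    ASSUMPTION: per-class values are macro-averaged (unweighted mean).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sens, spec, prec, f1 = [], [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[t][k] for t in range(n)) - tp
        tn = total - tp - fn - fp
        s = tp / (tp + fn) if tp + fn else 0.0
        p = tp / (tp + fp) if tp + fp else 0.0
        sens.append(s)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        prec.append(p)
        f1.append(2 * p * s / (p + s) if p + s else 0.0)

    def mean(values):
        return sum(values) / len(values)

    return {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "sensitivity": mean(sens),
        "specificity": mean(spec),
        "precision": mean(prec),
        "f1": mean(f1),
    }

# Invented two-class example (rows: true class, columns: predicted class).
example_cm = [[50, 2],
              [1, 47]]
metrics = classification_metrics(example_cm)
```

In a 10-fold cross-validation setting, this function would be applied to each fold's confusion matrix, and the mean and standard deviation of each metric across folds would then be reported.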

near-perfect performance by utilizing a lightweight and efficient model
and extracting only four features from one 1D signal.

5. Conclusion

In this research, a new intelligent approach is presented for detecting
and grading PD using only a single signal obtained from wearable sensors. The
proposed method uses the CVGRF, which has been shown to be effective in both
classification and grading. Wearable sensors are increasingly being used to
evaluate patient outcomes in clinical trials. Our proposed model achieves
comparable or better results without the need for additional pre-processing
steps, thus emphasizing its simplicity and efficiency. In addition, the
proposed method takes advantage of both frequency features and FFs of signals,
resulting in a shorter feature extraction time, shorter model training time,
lower computational complexity, and potential savings in a real-time system.
The authors of the study assert that the proposed method can be readily
applied to classify other biological and clinical signals, thus expanding its


applicability in medical research and clinical settings. Additionally, the
outcomes of the study suggest that this approach holds the potential to reduce
the necessity of multiple sensors in recording physiological data, leading to
both cost reduction and increased patient comfort. To conclude, this study
highlights the potential of using a single signal obtained from wearable
sensors and the CVGRF approach for PD detection and grading. It also
underscores the value of wearable sensors and their potential to revolutionize
the way patient outcomes are evaluated in clinical trials. Further research in
this area could enhance the accuracy and reliability of PD diagnosis and
treatment, leading to improved patient outcomes and QoL.

Table 5
Comparison between the proposed method and existing papers for two-class
automated PD diagnosis on the same dataset.

Reference         Method                                                         Accuracy (%)
[17]              Peak detection and pulse duration algorithms, SVM              92.68
[18]              KNN, DT, RF, NB, SVM, and K-means, GMM                         90.32
[19]              Deep 1D-CNN                                                    98.07
[20]              Phase-space reconstruction, Shannon energy envelope, and SVM   98.20
[22]              S-ELBP and ANN                                                 97.60
[23]              XGBoost and CNN                                                98.41
Proposed method   Uni-LSTM with Bayesian optimization                            99.19

Limitations

While the research has yielded commendable outcomes, it is imperative to
acknowledge certain limitations. Firstly, the analysis for PD diagnosis relies
solely on gait sensor data. Future research could enhance accuracy by
incorporating multiple-sensor data from a multi-modal framework. Another
constraint is identified in the potential enhancement of the model through
exploring alternative segmentation strategies or integrating dynamic
segmentation methods. Moreover, the study exclusively extracted four
time-frequency and fuzzy features, possibly overlooking other crucial
information embedded in gait signals. This limitation might have constrained
the model's ability to fully capture the complexity of the data. It is
noteworthy, however, that the primary emphasis of the proposed method was on
simplicity and minimal processing. Exploring additional sensitive
hyperparameters and employing state-of-the-art optimization techniques could
also enhance our understanding of the model's stability, leading to superior
performance in future studies.

Funding

The authors declare that no funds, grants, or other forms of support were
received during the preparation of this manuscript.

Data Availability Statement

The datasets analyzed for this study can be accessed by the public through
the “Gait in Parkinson's Disease v1.0.0” database available at
https://physionet.org/content/gaitpdb/1.0.0/.

CRediT authorship contribution statement

Farhad Abedinzadeh Torghabeh: Conceptualization, Data curation,
Methodology, Software, Visualization, Writing – original draft. Yeganeh
Modaresnia: Visualization, Writing – review & editing. Seyyed Abed Hosseini:
Investigation, Supervision, Validation, Writing – review & editing.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this study, the authors used ChatGPT and
Grammarly to improve readability and language. After using these tools, the
authors reviewed and edited the content as needed and take full responsibility
for the publication's content.

Declaration of competing interest

The authors declare that they have no known competing financial interests
or personal relationships that could have appeared to influence the work
reported in this paper.

Acknowledgments

We deeply appreciate Jeffrey Hausdorff and his esteemed colleagues for
generously sharing their publicly accessible database, which has proved
invaluable to the progress of our research.

References

[1] Dugger BN, Dickson DW. Pathology of neurodegenerative diseases. Cold Spring Harb Perspect Biol 2017;9(7):a028035. https://doi.org/10.1101/cshperspect.a028035.
[2] Jankovic J. Parkinson's disease: clinical features and diagnosis. J Neurol Neurosurg Psychiatry 2008;79(4):368–76. https://doi.org/10.1136/JNNP.2007.131045.
[3] Pardoel S, Kofman J, Nantel J, Lemaire ED. Wearable-sensor-based detection and prediction of freezing of gait in Parkinson's disease: a review. Sensors 2019;19(23):1–37. https://doi.org/10.3390/s19235141.
[4] Statistics | Parkinson's Foundation. https://www.parkinson.org/understanding-parkinsons/statistics/. [Accessed 9 March 2023].
[5] Willis AW, Roberts E, Beck JC, Fiske B, Ross W, Savica R, et al. Incidence of Parkinson disease in north America. NPJ Parkinsons Dis 2022;8(1):170. https://doi.org/10.1038/s41531-022-00410-y.
[6] Varadi C. Clinical features of Parkinson's disease: the evolution of critical symptoms. Biology 2020;9(5):103. https://doi.org/10.3390/biology9050103.
[7] Abedinzadeh Torghabeh F, Hosseini SA, Ahmadi Moghadam E. Enhancing Parkinson's disease severity assessment through voice-based wavelet scattering, optimized model selection, and weighted majority voting. Med Nov Technol Devices 2023;20:100266. https://doi.org/10.1016/j.medntd.2023.100266.
[8] Perry J, Burnfield JM, Cabico LM. Gait analysis: normal and pathological function. J Sports Sci Med 2010;9(2):353.
[9] Pistacchi M, Gioulis M, Sanson F, Giovannini ED, Filippi G, Rossetto F, et al. Gait analysis and clinical correlations in early Parkinson's disease. Funct Neurol 2017;32(1):28–34. https://doi.org/10.11138/FNEUR/2017.32.1.028.
[10] Tong J, Zhang J, Dong E, Du S. Severity classification of Parkinson's disease based on permutation-variable importance and persistent entropy. Appl Sci 2021;11(4):1834. https://doi.org/10.3390/APP11041834.
[11] Aşuroğlu T, Açıcı K, Berke Erdaş Ç, Kılınç Toprak M, Erdem H, Oğul H. Parkinson's disease monitoring from gait analysis via foot-worn sensors. Biocybern Biomed Eng 2018;38(3):760–72. https://doi.org/10.1016/J.BBE.2018.06.002.
[12] Biundo R, Formento-Dojot P, Facchini S, Vallelunga A, Ghezzo L, Foscolo L, et al. Brain volume changes in Parkinson's disease and their relationship with cognitive and behavioural abnormalities. J Neurol Sci 2011;310(1–2):64–9. https://doi.org/10.1016/j.jns.2011.08.001.
[13] Anjum MF, Dasgupta S, Mudumbai R, Singh A, Cavanagh JF, Narayanan NS. Linear predictive coding distinguishes spectral EEG features of Parkinson's disease. Parkinsonism Relat Disord 2020;79:79–85. https://doi.org/10.1016/j.parkreldis.2020.08.001.
[14] Williamson JR, Telfer B, Mullany R, Friedl KE. Detecting Parkinson's disease from wrist-worn accelerometry in the U.K. biobank. Sensors 2021;21(6):2047. https://doi.org/10.3390/s21062047.
[15] Zeng W, Liu F, Wang Q, Wang Y, Ma L, Zhang Y. Parkinson's disease classification using gait analysis via deterministic learning. Neurosci Lett 2016;633:268–78. https://doi.org/10.1016/j.neulet.2016.09.043.
[16] Joshi D, Khajuria A, Joshi P. An automatic non-invasive method for Parkinson's disease classification. Comput Methods Programs Biomed 2017;145:135–45. https://doi.org/10.1016/j.cmpb.2017.04.007.
[17] Abdulhay E, Arunkumar N, Narasimhan K, Vellaiappan E, Venkatraman V. Gait and tremor investigation using machine learning techniques for the diagnosis of Parkinson disease. Future Gener Comput Syst 2018;83:366–73. https://doi.org/10.1016/j.future.2018.02.009.
[18] Khoury N, Attal F, Amirat Y, Oukhellou L, Mohammed S. Data-driven based approach to aid Parkinson's disease diagnosis. Sensors 2019;19(2):242. https://doi.org/10.3390/s19020242.


[19] El Maachi I, Bilodeau GA, Bouachir W. Deep 1D-Convnet for accurate Parkinson disease detection and severity prediction from gait. Expert Syst Appl 2020;143:113075. https://doi.org/10.1016/j.eswa.2019.113075.
[20] Wang Q, Zeng W, Dai X. Gait classification for early detection and severity rating of Parkinson's disease based on hybrid signal processing and machine learning methods. Cogn Neurodyn 2022. https://doi.org/10.1007/s11571-022-09925-9.
[21] Vidya B, Sasikumar P. Parkinson's disease diagnosis and stage prediction based on gait signal analysis using EMD and CNN–LSTM network. Eng Appl Artif Intell 2022;114:105099. https://doi.org/10.1016/j.engappai.2022.105099.
[22] Klinton Amaladass P, Subathra MSP, Jeba Priya S, Sivakumar M. Enhanced local pattern transformation based feature extraction for identification of Parkinson's disease using gait signals. SN Comput Sci 2023;4(2):200. https://doi.org/10.1007/s42979-022-01603-1.
[23] Ma YW, Chen JL, Chen YJ, Lai YH. Explainable deep learning architecture for early diagnosis of Parkinson's disease. Soft Comput 2023;27(5):2729–38. https://doi.org/10.1007/s00500-021-06170-w.
[24] Goldberger A, Amaral L, Glass L, Hausdorff J, Ivanov PC, Mark R, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Gait in Parkinson's Disease v1.0.0; Circulation 2000;101(23):e215–2738. https://doi.org/10.13026/C24H3N.
[25] Hausdorff JM, Balash J, Giladi N. Effects of cognitive challenge on gait variability in patients with Parkinson's disease. J Geriatr Psychiatry Neurol 2003;16(1):53–8. https://doi.org/10.1177/0891988702250580.
[26] Pham TD. Time–frequency time–space LSTM for robust classification of physiological signals. Sci Rep 2021;11(1):1–11. https://doi.org/10.1038/s41598-021-86432-7.
[27] Boashash B. Estimating and interpreting the instantaneous frequency of a signal—Part 1: fundamentals. Proc IEEE 1992;80(4). https://doi.org/10.1109/5.135376.
[28] Boashash B. Estimating and interpreting the instantaneous frequency of a signal—Part 2: algorithms and applications. Proc IEEE 1992;80(4). https://doi.org/10.1109/5.135378.
[29] Zadeh LA. Fuzzy sets. Information and control 1965;8(3):338–53. https://doi.org/10.1016/S0019-9958(65)90241-X.
[30] Marwan N, Carmen Romano M, Thiel M, Kurths J. Recurrence plots for the analysis of complex systems. Phys Rep 2007;438(5–6):237–329. https://doi.org/10.1016/J.PHYSREP.2006.11.001.
[31] Zou Y, Donner RV, Marwan N, Donges JF, Kurths J. Complex network approaches to nonlinear time series analysis. Phys Rep 2019;787:1–97. https://doi.org/10.1016/J.PHYSREP.2018.10.005.
[32] Pham TD. Fuzzy recurrence plots. Europhys Lett 2017;116(5):50008. https://doi.org/10.1209/0295-5075/116/50008.
[33] Cantürk İ. Fuzzy recurrence plot-based analysis of dynamic and static spiral tests of Parkinson's disease patients. Neural Comput Appl 2021;33(1):349–60. https://doi.org/10.1007/S00521-020-05014-2/METRICS.
[34] Pham TD, Wardell K, Eklund A, Salerud G. Classification of short time series in early Parkinson's disease with deep learning of fuzzy recurrence plots. IEEE/CAA J Autom Sin 2019;6(6):1306–17. https://doi.org/10.1109/JAS.2019.1911774.
[35] Shalin G, Pardoel S, Lemaire ED, Nantel J, Kofman J. Prediction and detection of freezing of gait in Parkinson's disease from plantar pressure data using long short-term memory neural-networks. J NeuroEng Rehabil 2021;18(1):1–15. https://doi.org/10.1186/S12984-021-00958-5/TABLES/14.
