Applied Soft Computing Journal: Yılmaz Kaya, Ömer Faruk Ertuğrul
highlights
• A novel feature extraction approach, called motif patterns, is proposed.
• It is employed to estimate the neurological status from non-EEG bio-signals.
• The proposed approach is sensitive to local changes.
• The neurological status was determined accurately from non-EEG signals.
• The heart rate sensor was found to be the optimal sensor for determining the neurological status.
Article info

Article history:
Received 17 March 2019
Received in revised form 8 June 2019
Accepted 29 June 2019
Available online 2 July 2019

Keywords:
Motif patterns
One-dimensional local binary patterns
Local changes
Neurological status
Bio-signals
Feature extraction

Abstract

In this paper, a novel feature extraction approach, called motif patterns, is proposed and employed to estimate the neurological status from non-electroencephalography (non-EEG) bio-signals. The literature shows that successful results have been obtained with feature extraction methods that are sensitive to local changes, such as one-dimensional local binary patterns (1D-LBP). In 1D-LBP, the local changes in a signal are determined by comparing each ‘‘central value’’ with its neighbors. To increase the sensitivity of the extracted features to the local changes in a signal, each ‘‘value’’ in the signal was compared with its neighbor, and in this way a motif was obtained from the comparisons in a specified window. To evaluate and validate the proposed approach, non-EEG bio-signals recorded by electrodermal activity, temperature, accelerometer, heart rate, and arterial oxygen level sensors were employed. The features extracted from these signals by the proposed motif patterns were classified by machine learning methods. The neurological status of each of the samples was classified accurately by the proposed approach. Furthermore, the optimal sensor types were investigated, and it was found that heart rate signals are sufficient to estimate the neurological status.
© 2019 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.asoc.2019.105609
1568-4946/© 2019 Elsevier B.V. All rights reserved.
2 Y. Kaya and Ö.F. Ertuğrul / Applied Soft Computing Journal 83 (2019) 105609
Block 2: In this block, the histograms of the detected motifs were computed.
Block 3: Seventeen (17) statistical features, which are given in Table 1, were extracted from the obtained histograms.
Block 4: The extracted statistical features were classified into neurological statuses by popular machine learning methods such as the artificial neural network (ANN), extreme learning machine (ELM), logistic regression (LR), random forest (RF), functional tree (FT), linear discriminant analysis (LDA), Bayesian network (BN), Naïve Bayes (NB), and k nearest neighbor (kNN) methods. In all experiments, each of the machine learning processes was done according to 10-fold cross-validation. In the cross-validation process, the dataset was divided into 10 subsets. In each epoch, 1 of the 10 subsets was used for testing and 9
Table 1
Employed statistical features.

Feature                  Equation
Mean                     $f_1 = \frac{\sum_{i=1}^{N} X_i}{N}$  (1)
Standard deviation       $f_2 = \sqrt{\frac{\sum_{i=1}^{N} (X_i - f_1)^2}{N}}$  (2)
Energy                   $f_3 = \sqrt{\frac{\sum_{i=1}^{N} (X_i)^2}{N}}$  (3)
Entropy                  $f_4 = -\sum_{i=1}^{N} \frac{X_i}{f_3} \log\left(\frac{X_i}{f_3}\right)$  (4)
Correlation              $f_5 = \sum_{i=1}^{N} \frac{i \ast X_i - f_1}{\sigma_x}$  (5)
Absolute mean            $f_6 = \frac{\sum_{i=1}^{N} |X_{i+1} - X_i|}{N}$  (6)
Kurtosis                 $f_7 = \frac{\sqrt{N(N-1)}}{N-2} \left( \frac{\frac{1}{N}\sum_{i=1}^{N} (X_i - f_1)^3}{\frac{1}{N}\sum_{i=1}^{N} (X_i - f_1)^2} \right)^{3/2}$  (7)
Skewness                 $f_8 = \frac{N-1}{(N-2)(N-3)} \left[ (N+1) \left( \left( \frac{\frac{1}{N}\sum_{i=1}^{N} (X_i - f_1)^4}{\frac{1}{N}\sum_{i=1}^{N} (X_i - f_1)^2} \right) - 3 \right) + 6 \right]$  (8)
Median                   $f_9$
Minimum                  $f_{10} = \min\{X_1, X_2, X_3, X_4, \ldots, X_N\}$
Maximum                  $f_{11} = \max\{X_1, X_2, X_3, X_4, \ldots, X_N\}$
Coefficient of variance  $f_{12} = \frac{f_1}{f_2}$  (9)
Root mean square         $f_{13} = \sqrt{\frac{\sum_{i=1}^{N} X_i^2}{N}}$  (10)
Shape factor             $f_{14} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N} X_i^2}}{\frac{1}{N}\sum_{i=1}^{N} |X_i|}$  (11)
Crest factor             $f_{15} = \frac{\max|X_i|}{\sqrt{\frac{1}{N}\sum_{i=1}^{N} X_i^2}}$  (12)
Margin factor            $f_{16} = \frac{\max|X_i|}{\left( \frac{1}{N}\sum_{i=1}^{N} \sqrt{|X_i|^2} \right)^2}$  (13)
Impulse factor           $f_{17} = \frac{\max|X_i|}{\frac{1}{N}\sum_{i=1}^{N} |X_i|}$  (14)

Here, N is the length of the histogram, which depends on the length of the window. X_i and σ_x are the signal and the standard deviation, respectively.
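As a minimal sketch of how a few of the Table 1 features could be computed from one motif histogram, the following Python function evaluates the mean, standard deviation, median, minimum, maximum, root mean square, shape factor, crest factor, and impulse factor. The function name `histogram_features` and the dictionary keys are hypothetical, the input is assumed to be a non-empty histogram with at least one non-zero bin, and the remaining features of Table 1 are left out.

```python
import math
from statistics import median


def histogram_features(x):
    """Compute a subset of the Table 1 features for one histogram.

    Assumption: x is a non-empty sequence of numbers with at least one
    non-zero entry, so the divisions below are well defined.
    """
    n = len(x)
    mean = sum(x) / n                                        # Eq. (1)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)     # Eq. (2)
    rms = math.sqrt(sum(v * v for v in x) / n)               # Eq. (10)
    mean_abs = sum(abs(v) for v in x) / n                    # (1/N) * sum |X_i|
    peak = max(abs(v) for v in x)                            # max |X_i|
    return {
        "mean": mean,
        "std": std,
        "median": median(x),
        "min": min(x),
        "max": max(x),
        "rms": rms,
        "shape_factor": rms / mean_abs,                      # Eq. (11)
        "crest_factor": peak / rms,                          # Eq. (12)
        "impulse_factor": peak / mean_abs,                   # Eq. (14)
    }


if __name__ == "__main__":
    # Toy histogram, not taken from the dataset.
    print(histogram_features([1, 2, 3, 4]))
```

Each histogram yields one feature vector, and these vectors are what the classifiers in Block 4 receive.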
subsets were employed for training. In this way, each subset was employed as a test dataset while it was not utilized in the training dataset. Each of the employed performance measures was taken as the average of these 10 trials.
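The 10-fold procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Matlab code: the helper names `k_fold_indices` and `cross_validate` are hypothetical, the samples are shuffled with a fixed seed before splitting (the text does not say whether shuffling was used), and `evaluate` stands for any user-supplied function that trains on the training indices and returns a score for the test indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumption: shuffled split
    folds = [indices[i::k] for i in range(k)]   # k nearly equal subsets
    for i in range(k):
        test = folds[i]                         # 1 subset for testing
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test                       # remaining k-1 subsets for training


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average a performance measure over the k trials."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # 80 is an arbitrary sample count chosen for illustration.
    for train, test in k_fold_indices(80):
        print(len(train), len(test))
```

Every sample appears in exactly one test subset, so each trial is evaluated on data that was not used to train it.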
2.4. Employed performance measures

In order to validate the proposed approach, the following performance measures were employed.

$\mathrm{Accuracy}\ (\%) = 100 \ast \frac{TP + TN}{TP + TN + FP + FN}$  (15)

$\mathrm{Precision} = \frac{TP}{TP + FP}$  (16)

$\mathrm{Recall} = \frac{TP}{TP + FN}$  (17)

$F\text{-}\mathrm{Measure} = \frac{2(\mathrm{Recall} \ast \mathrm{Precision})}{\mathrm{Recall} + \mathrm{Precision}}$  (18)

where T, F, P, and N denote true, false, positive, and negative, respectively. For example, TP is the number of correctly classified positive samples, and FN is the number of positive samples that were wrongly classified as negative. The accuracy is the ratio of all correctly classified samples to all samples, the precision is the ratio of correctly classified positive samples to all samples classified as positive, and the recall is the ratio of correctly classified positive samples to all actual positive samples. The F-measure is calculated from the precision and recall. These metrics were employed in order to show the effectiveness of the proposed approach.

3. Obtained results

The Non-EEG Dataset for Assessment of Neurological Status [22] was employed in order to validate the proposed approach. All experiments were performed based on 10-fold cross-validation. The feature extraction, selection, and classification processes were performed on a computer with an Intel Core i5 microprocessor, 8 GB RAM, and a Windows operating system. Trials were performed with Matlab R2017b.
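The performance measures of Eqs. (15)–(18) in Section 2.4 can be sketched as a small Python function that takes the four confusion counts. The function name `classification_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for robustness; the paper does not discuss that case.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy (%), precision, recall and F-measure, Eqs. (15)-(18)."""
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total if total else 0.0       # Eq. (15)
    precision = tp / (tp + fp) if (tp + fp) else 0.0             # Eq. (16)
    recall = tp / (tp + fn) if (tp + fn) else 0.0                # Eq. (17)
    denom = recall + precision
    f_measure = 2 * recall * precision / denom if denom else 0.0 # Eq. (18)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Made-up confusion counts for illustration only.
    print(classification_metrics(tp=18, tn=58, fp=2, fn=2))
```

For a multi-class problem such as the four neurological statuses, one common choice is to compute these counts per class (one class as positive, the rest as negative) and then average; whether the authors did so is not stated.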
3.1. Determining optimal window length and machine learning method

Since changing the window length yields different features, determining the window length is a critical issue in the proposed approach. Therefore, the window length was varied from 4 to 7, and the extracted features were classified by ANN, SVM, LR, RF, FT, LDA, BN, NB, and kNN. The obtained accuracies (%) are summarized in Table 2. Here, all features were extracted by MP from the signals recorded by each of the EDA, temperature, acceleration, HR, and SpO2 sensors.

Table 2
Obtained accuracies (%) in The Non-EEG Dataset for Assessment of Neurological Status based on different window lengths and various machine learning methods.

Window length  ANN    SVM    LR     RF   FT   LDA    BN     NB     kNN
4              96.25  92.50  91.25  100  100  98.75  100    97.50  100
5              96.25  93.00  91.25  100  100  85.00  98.75  95.00  100
6              92.50  92.50  87.50  100  100  95.00  98.75  96.25  95.00
7              95.00  92.50  87.50  100  100  92.50  98.75  96.25  95.00

As seen in Table 2, the neurological status can be successfully classified by RF and FT at each of the window lengths.

3.2. Determining optimal sensor type in classifying the neurological status

In order to determine the optimal sensor type, the features extracted by MP from each sensor type were classified by RF (because RF showed higher success; see Table 2). The accuracies obtained for different window lengths are given in Table 3.

Table 3
Obtained accuracies (%) in signals that were recorded from each of the employed sensor types.

Window length  Accelerometer  HR   SpO2  EDA    Temperature
4              97.50          100  100   96.25  100
5              97.50          100  100   100    100
6              98.75          100  100   100    100
7              97.50          100  100   100    100

As seen in Table 3, 100% accuracies were obtained in many cases. High correlations were found between the neurological status and each of the electrodermal activity (EDA), temperature, and heart rate (HR) signals, which is consistent with the literature findings. In order to assess the success of HR, the boxplot of the features extracted by MP with a window length of 4 is given in Fig. 5. It can be seen in Fig. 5 that the neurological statuses can be easily distinguished by the extracted features. Since the accelerometer sensor is 3-dimensional, the accuracies obtained by RF for each of the axes are summarized in Table 4.

Table 4
Obtained accuracies (%) in each of the axes of the accelerometer by RF.

Window length  X      Y      Z
4              97.50  100    97.50
5              97.50  100    96.25
6              98.75  97.50  97.50
7              98.75  97.50  98.75

As seen in Table 4, higher accuracies were obtained by using the accelerometer signal that belongs to the Y-axis.

3.3. Determining relevant motifs in classifying the neurological status

Since using a lower window length yields a lower computational cost, assigning the window length as 4 is enough to achieve acceptable accuracies (see Tables 2–4). The relevancy of the features that were extracted by MP was determined by a correlation-based feature selection method that was proposed by Hall [32]. The obtained relevant features and the accuracies achieved by employing the selected features are summarized in Table 5.

Table 5
Selected features and achieved accuracies (%) by RF.

Bio-signal            Selected motifs                Accuracy (%)
Accelerometer X-axis  M3, M5, M6, M14, M15           81.25
Accelerometer Y-axis  M5, M9, M14, M16, M23          83.50
Accelerometer Z-axis  M1, M6, M9, M14, M15, M17      87.25
Temperature           M3, M4, M5, M6, M13, M15       83.25
EDA                   M1, M16, M20                   71.25
SpO2                  M1, M2, M5, M6, M11, M13, M22  81.25
Heart rate            M6, M8, M15                    88.75

As seen in Table 5, searching for and employing only the M6, M8, and M15 motifs, which can be extracted from the heart rate signal, is enough to determine the neurological status.

4. Discussion

4.1. Assessing the effectivity of the histograms that were obtained by the proposed motif patterns

In order to assess the effectivity of the histograms extracted by MP, the features extracted from the heart rate signals of the first subject in various neurological statuses (relaxation, physical stress, cognitive stress, and emotional stress) were examined. The histograms obtained by MP are given in Fig. 6.

As seen in Fig. 6, different motifs were extracted from signals that belong to different neurological statuses. The major reason behind this is the high sensitivity of MP to the local changes in the signal. Please note that the histograms shown in Fig. 6 were extracted with a window length of 4. Therefore, there are 24 motifs in this figure. Furthermore, the relevant motifs in estimating the neurological status were selected by a correlation-based feature selection method [32] and are shown in a red frame in Fig. 6. As seen in this figure, the most relevant motifs are M6, M8, and M15.

4.2. Comparison of the obtained results with literature and some other methods

In the literature, accuracies of 97.5% and 98.8% were reported with the k nearest neighbor and artificial neural network methods, respectively, by using all of the bio-signals in the employed Non-EEG Dataset for Assessment of Neurological Status. Furthermore, accuracies of 88.33% (secondarily generalized seizures) and 84.72% (complex partial seizures) were obtained in [33] by using only heart rate signals. The accuracies achieved by the proposed approach were higher than the literature findings, since the achieved results are without personalization.

In order to validate MP, features were extracted from the Non-EEG Dataset for Assessment of Neurological Status [22] by the one-dimensional local binary pattern (1D-LBP), one-dimensional mean local binary pattern (1D-M-LBP), and one-dimensional median local binary pattern (1D-Med-LBP). The extracted features were classified by RF according to 10-fold cross-validation, and the obtained success rates are summarized in Table 6.

As seen in Table 6, MP shows a higher success in determining the neurological status than the other employed feature extraction methods, namely 1D-LBP, 1D-M-LBP, and 1D-Med-LBP. In the LBP operators, the comparisons are done only between a central value and its neighbors in a specified window, while in MP, the comparisons are applied to each of the neighbors in a specified window. Therefore, the sensitivity of MP to the local changes is higher than
Fig. 5. Boxplot of the extracted features from HR signal by MP while window length is 4.
Fig. 6. Obtained histograms extracted from the heart rate signal in neurological statuses: (A) relaxation, (B) physical stress, (C) cognitive stress, and (D) emotional
stress.
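A histogram such as those in Fig. 6 can be sketched in Python under a strong assumption: that a motif is the rank order (ordinal pattern) of the values inside a sliding window, which gives 4! = 24 possible motifs for a window length of 4, matching the 24 motifs reported for Fig. 6. The exact comparison rule of MP and the numbering M1 to M24 are not reproduced here. The function name `motif_histogram`, the step size of 1, the tie-breaking by position, and the lexicographic motif numbering are all hypothetical choices.

```python
from itertools import permutations


def motif_histogram(signal, window=4):
    """Count ordinal-pattern motifs in a 1-D signal (assumed motif definition)."""
    # Assumption: motif IDs follow the lexicographic order of the permutations.
    motif_ids = {perm: i for i, perm in enumerate(permutations(range(window)))}
    hist = [0] * len(motif_ids)
    for start in range(len(signal) - window + 1):   # assumption: step of 1
        segment = signal[start:start + window]
        # Positions ordered by increasing value; sorted() is stable, so equal
        # values keep their original order (tie-breaking by position).
        pattern = tuple(sorted(range(window), key=lambda j: segment[j]))
        hist[motif_ids[pattern]] += 1
    return hist


if __name__ == "__main__":
    # Toy signal, not taken from the dataset.
    hist = motif_histogram([72, 75, 74, 78, 77, 80, 79], window=4)
    print(len(hist), sum(hist))
```

The resulting histogram has one bin per motif, and its length grows factorially with the window length under this assumption, which is one reason a short window keeps the computational cost low.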