
Ischemic Stroke Detection using EEG Signals

Conference Paper · October 2018


Arooj Ahmed Qureshi
Dept. of Computing and Software
McMaster University
Hamilton, Ontario
quresa9@mcmaster.ca

Canxiu Zhang
Dept. of Electrical and Computer Engineering
McMaster University
Hamilton, Ontario
zhanc3@mcmaster.ca

Rong Zheng
Dept. of Computing and Software
McMaster University
Hamilton, Ontario
rzheng@mcmaster.ca

Ahmed Elmeligi
HiNT
Hamilton, Ontario
ahmed@hintneuro.com

ABSTRACT
Stroke is the second leading cause of death in the United States of America. 87% of all strokes are ischemic strokes, which are mainly caused by the blockage of small blood vessels around the brain. Magnetic resonance imaging (MRI) provides the gold standard for accurate diagnosis of ischemic strokes, but it is both time-consuming and unsuitable for 24/7 monitoring. In this paper, we propose an ischemic stroke detection method based on multi-domain analysis of EEG brain signals from wearable EEG devices and machine learning. Using data from 40 healthy subjects and 40 patients, we find that Multi-Layered Perceptron (MLP) and Bootstrap models (Extra-Tree and Decision-Tree) can achieve a test accuracy of 95% with an area under the ROC curve of 0.85.

CCS CONCEPTS
• Computing methodologies → Machine learning algorithms; Bagging; Neural networks; Representation of mathematical functions; Cross-validation;

KEYWORDS
Ischemic Stroke, Multi-Layered Perceptron (MLP), Bootstrap models (Extra-Tree and Decision-Tree), Multi Domain Features.

ACM Reference Format:
Arooj Ahmed Qureshi, Canxiu Zhang, Rong Zheng, and Ahmed Elmeligi. 2018. Ischemic Stroke Detection using EEG Signals. In Proceedings of CASCON ’18 (CASCON’18).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CASCON’18, October 2018, Markham, Ontario Canada

1 INTRODUCTION
Stroke is the second leading cause of death worldwide [12], [22]. One out of twenty deaths in America is due to stroke, and the 62,000 strokes that occur each year in Canada affect all age groups and lead to a lifetime impact on health [12]. According to the Centers for Disease Control and Prevention, someone in the United States suffers a stroke every 40 seconds [4].

There are two main types of stroke: ischemic and hemorrhagic. Ischemic stroke occurs when a blockage (obstruction) of small blood vessels occurs around the brain (Fig. 1). Magnetic Resonance Imaging (MRI) gives accurate results for stroke detection, yet it is an expensive resource, requires several hours to generate an examination report and is applicable for a limited time. MRI is used only in situations where there is no time pressure to offer a diagnosis, typically as follow-up imaging. It is also not available at all healthcare locations. In comparison, Electroencephalography (EEG) offers a continuous, real-time, non-invasive measure of brain function [5] and is capable of detecting ischemic stroke through variations in cerebral blood flow in the blood vessels. It has proven to be effective in detecting various other brain-related activities such as Rapid Eye Movement (REM), sleep and awake stages, and seizures [17].

Figure 1: Types of Strokes: Ischemic and Hemorrhagic. In Ischemic Brain Stroke (left), a blood clot has blocked the flow of blood to a specific area of the brain.

In this paper, EEG data from 40 healthy subjects and 40 ischemic stroke patients are used to detect ischemic stroke if it occurs in a healthy person. For this purpose, EEG data have been collected from six channels (two at the rear and two at the front of the head, plus two reference channels). The signal is further divided into EEG frequency bands to measure the signal fluctuation. Various features are extracted to detect this fluctuation (in 30-second epochs) in brain functionality due to ischemic stroke. Besides frequency domain features, time domain features are extracted as well to monitor fluctuation trends in the EEG data due to ischemic stroke. Finally, these features are used together to train Multi-Layered Perceptron (MLP) and Bootstrap (Extra-Tree and Decision-Tree) models. This approach is shown to be effective, as it achieves 95% accuracy and an area under the ROC curve of 0.85 on test data. These results are promising and may be used to detect the early onset of Transient Ischemic Attacks (TIA).

There are two main contributions of this paper in the area of ischemic stroke detection. The first contribution lies in utilizing multi-domain features for stroke detection using EEG signals. EEG feature extraction methods are primarily based on single-domain analysis. However, due to the highly non-stationary and nonlinear characteristics of the EEG, it is challenging to extract information from a single domain only. We extract frequency domain (FFT, PSD, Ratio) and time domain (Root Mean Square, DFA) features. The second contribution is to deal with imbalanced training data using Bootstrap models, achieving an accuracy of 95% and a greater area under the ROC curve. Imbalanced training data is common in medical domains, as people suffering from a particular ailment are often a small percentage of the general population. Handling the imbalance improves the overall area under the ROC curve (by reducing false negatives) with an accuracy of 95%.

This paper is organized as follows. Background on EEG and related work are discussed in Section 2. In Section 3, we present the methodology, data pre-processing and feature extraction methods. In Section 4, the performance evaluation and results are presented, and finally, the conclusion and future work are discussed in Section 5.

2 BACKGROUND AND RELATED WORK
Electroencephalographic (EEG) signals serve as a vital source of information when it comes to brain function. It is possible to recognize abnormal brain activity using EEG signals. Most of the cerebral signal observed in scalp EEG falls in the range of 1 to 20 Hz. Waveforms are subdivided into bandwidths known as delta (δ), theta (θ), alpha (α), and beta (β) to signify the majority of the EEG used in clinical practice (Fig. 2) [5].

Figure 2: EEG Channel Sub-bands Frequency range.

Ischemic stroke is primarily caused by changes in Cerebral Blood Flow (CBF), and it can be detected through changes in EEG signal patterns [9]. Prominent changes include the reduction of delta (lowest frequency band) or the presence of high-frequency bands (beta and alpha) [5], [3]. Furthermore, the power density ratio between bands of the different hemispheres changes as stroke affects one hemisphere [5], [3], [10]. The best results for ischemic stroke detection so far are obtained using MRI scans in conjunction with meta-data such as patients' history, medical prescriptions and, most importantly, the MRI scan itself [20]. Omar et al. used the power spectral density (PSD) ratio between different EEG bands and channels as features for ischemic stroke detection [13], [14]. PSD is used to train various classifiers such as KNN [14], ANN [13] and Extreme Learning Machine (ELM) [16] for ischemic stroke detection. With these settings, they were able to achieve 85% correct classification with an Artificial Neural Network (ANN) [13] and 85% accuracy with a KNN classifier [14]. The authors of [16] trained an Extreme Learning Machine with the relative power ratio (RPR) as well and achieved an accuracy of 93%. Moreover, spectrograms are used as features to train a Convolutional Neural Network (CNN) to map the changes in the EEG signal due to ischemic stroke [11]. The spectrogram is a visual representation of the strength of each frequency at each time step. In [11], the EEG signal is divided into 30-second time windows with 20% overlap. The spectrogram of each time window is used as an input for training the Convolutional Neural Network.
In the EEG signal analysis process, feature extraction is performed by applying a series of transformations so that the required information can be studied or observed easily in the transform domain, providing the best input to the classifier [19].

In this paper, we have used a combination of the relative power ratio (RPR) between hemispheres and channels along with detrended fluctuation analysis as features to train three models: multi-layered perceptron (MLP), Decision-Tree (DT) and Extra-Tree (ET). As we are dealing with imbalanced data, decision-tree and extra-tree classifiers are used. These classifiers handle imbalanced datasets better than baseline classifiers [1].

3 METHODOLOGY
The proposed methodology is based on supervised learning. The EEG signal is analyzed in the time domain as well as in the frequency domain. To the best of our knowledge, no previous work has exploited multi-domain features explicitly for ischemic stroke detection with EEG signals. The process shown in Fig. 3 involves data cleaning, feature extraction, and the training and testing of machine learning models. Data cleaning comprises data pre-processing, which leads to the extraction of time and frequency domain features. The feature set is used to train and test machine learning models. More details are presented in the subsequent sections.

3.1 Data Pre-Processing
The EEG signal is sampled at 256 Hz. Data preprocessing consists of three steps: data referencing, segmentation and artifact removal.

3.1.1 Data Referencing. The EEG data were collected by placing electrodes at six locations, also called six channels: C3, C4, O1, O2, and two reference channels behind the two ears, as illustrated in Fig. 4. The mean of the reference channels is first subtracted from the data in C3, C4, O1 and O2:

$$x_c(n) = r_c(n) - \frac{r_{ref_1}(n) + r_{ref_2}(n)}{2} \tag{1}$$

where $x_c(n)$ represents the referenced EEG signal of channel $c \in \{C3, C4, O1, O2\}$, $r_c(n)$ represents the raw EEG signal in channel $c \in \{C3, C4, O1, O2\}$, and $r_{ref_1}(n)$ and $r_{ref_2}(n)$ refer to the raw EEG signals in the two reference channels.

3.1.2 Segmentation. The EEG data in the 4 channels collected from healthy participants and patients may last for hours. In general, the signals are non-stationary. However, it is widely accepted that if we divide them into 30-second epochs, each epoch of the measured EEG data represents a wide-sense stationary signal. Hence in this paper, the data are analyzed in 30-second epochs. After data referencing, the signal in each channel is split into 30-second epochs of 7680 samples each.

3.1.3 Artifact Rejection. Wearable EEG devices are susceptible to the same sources of artifacts found with traditional research-grade devices, including eye blinks and eye movements (electrooculographic, or EOG, artifacts), as well as artifacts from jaw clenches, facial expressions, and other muscle activity. The presence of artifacts in EEG signals can distort the features which carry information about brain stroke and can therefore lead to false detection results.

The conventional method for artifact rejection in clinical usage is to identify artifacts by visual inspection and then manually remove the artifact signal. Other methods proposed to remove artifacts from EEG recordings include regression in time/frequency, linear decomposition and reconstruction, and many more [6].

Since artifact rejection is not the primary focus of this project, we adopted visual inspection and manual artifact rejection. Manual artifact rejection was applied to the collected EEG signals by experts.

3.2 Feature Extraction
Our experiment demonstrates that both spatial and temporal features are useful in detecting ischemic stroke. Next, we provide a detailed description of feature extraction in the time domain and the frequency domain.

3.2.1 Frequency Domain. According to related medical literature, occurrences of ischemic strokes affect the signals in the low-frequency range (for example the delta, theta and alpha bands) [3], [13]. Since an ischemic stroke usually occurs in a specific area (hemisphere) of the brain, there will be differences in relative power among different channels [3]. Therefore, changes in the power of each sub-band of the channels and the relative power ratio between channels provide good indicators of when an ischemic stroke occurs.

Sub-band Power Estimation. For each epoch of a single-channel EEG signal $x(n)$, $n = 0, ..., N-1$, we used Welch's weighted overlapped segment averaging method to estimate the power spectral density, which involves splitting the recorded signal into overlapped windows of length $L$, calculating modified periodograms of these windows, and averaging these modified periodograms [21].

The resulting modified periodogram for the $i$th window is

$$\bar{p}^{\,i}(f) = \frac{1}{LU} \left| \sum_{n=0}^{L-1} x_i(n)\, w(n)\, e^{-j 2\pi f n} \right|^2, \tag{2}$$

where $U$ is the normalization factor for the power in the window function:

$$U = \frac{1}{L} \sum_{n=0}^{L-1} w^2(n) \tag{3}$$
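As an illustration of Eqs. (2) and (3) and of the averaging step of Welch's method, the sketch below computes the modified periodograms of one epoch with NumPy and averages them over windows. The Hann window and the helper names `modified_periodograms` and `welch_psd` are assumptions made for this example, since the paper does not state which window function was used; the 512-sample window with 50% overlap follows the settings reported in the paper.

```python
import numpy as np


def modified_periodograms(x, L=512, overlap=0.5):
    """Return one modified periodogram per window, shape (K, L), as in Eq. (2)."""
    x = np.asarray(x, dtype=float)
    step = int(L * (1.0 - overlap))          # hop between consecutive window starts
    w = np.hanning(L)                        # assumed window function w(n)
    U = np.mean(w ** 2)                      # Eq. (3): power normalisation of the window
    K = (len(x) - L) // step + 1             # number of complete windows in the epoch
    segments = np.stack([x[i * step:i * step + L] for i in range(K)])
    spectrum = np.fft.fft(segments * w, axis=1)
    return np.abs(spectrum) ** 2 / (L * U)   # Eq. (2), two-sided, one row per window


def welch_psd(x, L=512, overlap=0.5):
    """Average the modified periodograms over all windows of the epoch."""
    return modified_periodograms(x, L, overlap).mean(axis=0)
```

For a 30-second epoch of 7680 samples this yields 29 windows. A constant unit signal gives an estimate whose mean over the frequency bins equals 1, which is a quick check that the normalisation by $LU$ is applied as intended.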

Figure 3: Methodology

Here $w(n)$ is the window function. The Welch power spectrum is the average of these modified periodograms:

$$\bar{p}(f) = \frac{1}{K} \sum_{i=0}^{K-1} \bar{p}^{\,i}(f) \tag{4}$$

In this paper, we used a 50% overlapped sliding window with a 2-second window length, i.e., $L = 2\,\text{sec} \times f_s = 512$, to estimate the Power Spectral Density. The number of windows in an epoch is given by $K = T - 1 = 29$, where $T = 30$ is the epoch length in seconds (a 2-second window advanced in 1-second steps fits 29 times into a 30-second epoch). The average power of sub-band $b \in \{\delta, \alpha, \theta\}$ for channel $c \in \{C3, C4, O1, O2\}$ is calculated by:

$$\bar{P}_b^{c} = \frac{\sum_{f_{min} \le f < f_{max}} \bar{p}(f)}{n_b}, \tag{5}$$

where $f_{min}$ and $f_{max}$ are the lower and upper limits of the frequency range of each sub-band shown in Fig. 2, and $n_b = (f_{max} - f_{min})/f_r$ is the number of frequency samples in each sub-band range, with frequency resolution $f_r = f_s/L$.

Figure 4: EEG Channels with blue arrows are C3, C4, O1, O2 in front and back hemisphere.

Relative Band Power. From the PSD, the frequency bands and their relative strengths are known. For each channel, the relative power of each sub-band is calculated by (6):

$$\frac{\bar{P}_b^{c}}{\sum_{c' \in C} \bar{P}_b^{c'}}, \tag{6}$$

where $c \in C = \{C3, C4, O1, O2\}$, $b \in \{\delta, \theta, \alpha\}$, and $\bar{P}_b^{c}$ is the average power of sub-band $b$ recorded by channel $c$. Since we have three sub-bands and four channels for each observation, the total number of relative band power features is twelve.

Relative Hemisphere Power. This feature shows the difference between the left hemisphere and the right hemisphere of a person (Fig. 4). In our EEG recording device, the C3 and C4 channels are located at the front left and the front right hemisphere respectively, and the O1 and O2 channels are located at the back left and the back right hemisphere respectively. We calculate the relative front hemisphere power $RPR(b)_{fh}$ as the difference between C3 and C4 for each sub-band in (7):

$$RPR(b)_{fh} = \frac{|\bar{P}_b^{C3} - \bar{P}_b^{C4}|}{\bar{P}_b^{C3} + \bar{P}_b^{C4}}, \tag{7}$$

where $\bar{P}_b^{C3}$ and $\bar{P}_b^{C4}$ are the average power of sub-band $b \in \{\delta, \theta, \alpha\}$ in channels C3 and C4 respectively.

Similarly, we calculate the relative back hemisphere power $RPR(b)_{bh}$ as the difference between O1 and O2 for each sub-band in (8):

$$RPR(b)_{bh} = \frac{|\bar{P}_b^{O1} - \bar{P}_b^{O2}|}{\bar{P}_b^{O1} + \bar{P}_b^{O2}}, \tag{8}$$

where $\bar{P}_b^{O1}$ and $\bar{P}_b^{O2}$ are the average power of sub-band $b \in \{\delta, \theta, \alpha\}$ in channels O1 and O2 respectively.

Combining (6), (7) and (8), there are 18 features for each epoch in the frequency domain.

3.2.2 Time Domain Features. The EEG signal shows the fluctuation of voltages due to specific activities in the brain. These signals possess a scale-invariant structure which repeats itself on subintervals of a signal [10].
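The referencing and segmentation of Section 3.1 and the frequency-domain features of Section 3.2.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the band edges in `BANDS` are assumed values (the paper gives them only in Fig. 2), the function names are hypothetical, and `scipy.signal.welch` is used with its default Hann window, per-segment mean removal and one-sided density scaling.

```python
import numpy as np
from scipy.signal import welch

FS = 256                                   # sampling rate reported in Section 3.1
CHANNELS = ("C3", "C4", "O1", "O2")
# Assumed band edges in Hz; the paper defines them only graphically in Fig. 2.
BANDS = {"delta": (1.0, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}


def reference_and_segment(raw, ref1, ref2, epoch_len=30 * FS):
    """Apply Eq. (1) to one channel, then split into non-overlapping 30 s epochs."""
    x = np.asarray(raw, dtype=float) - (np.asarray(ref1) + np.asarray(ref2)) / 2.0
    n_epochs = len(x) // epoch_len           # a trailing partial epoch is dropped
    return x[:n_epochs * epoch_len].reshape(n_epochs, epoch_len)


def band_powers(epoch):
    """Eq. (5): mean Welch PSD inside each sub-band for one channel epoch."""
    f, psd = welch(epoch, fs=FS, nperseg=2 * FS, noverlap=FS)  # 2 s windows, 50% overlap
    return {b: psd[(f >= lo) & (f < hi)].mean() for b, (lo, hi) in BANDS.items()}


def frequency_features(epochs):
    """Build the 18 features of Eqs. (6)-(8) from a dict mapping channel -> epoch."""
    P = {c: band_powers(epochs[c]) for c in CHANNELS}
    feats = {}
    for b in BANDS:
        total = sum(P[c][b] for c in CHANNELS)
        for c in CHANNELS:
            feats[f"rel_{b}_{c}"] = P[c][b] / total              # Eq. (6)
        feats[f"rpr_{b}_fh"] = (abs(P["C3"][b] - P["C4"][b])
                                / (P["C3"][b] + P["C4"][b]))     # Eq. (7)
        feats[f"rpr_{b}_bh"] = (abs(P["O1"][b] - P["O2"][b])
                                / (P["O1"][b] + P["O2"][b]))     # Eq. (8)
    return feats
```

With three bands and four channels the dictionary holds 12 relative band powers and 6 hemisphere ratios. Each hemisphere ratio lies between 0 and 1, and for a given band the four relative band powers sum to 1.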

Detrended Fluctuation Analysis (DFA). Detrended fluctuation analysis (DFA) is capable of measuring the intensity of fluctuation in the brain [11]. DFA can differentiate between regular and irregular changes. DFA along with Root Mean Square (RMS) is used to measure the Hurst exponent (h) [7], which defines the specific scale-invariant structure and fluctuation level of each 30-second epoch in healthy and patient data. It is a five-step process.

Step 1 Mean Centering. This initial calculation converts the noise-like time series into random-walk data (the signal $\tilde{x}$) by subtracting the mean $\bar{x}$ of $x(n)$ from each data point of $x(n)$ and accumulating the result,

$$\tilde{x}(n) = \sum_{k=0}^{n} \big(x(k) - \bar{x}\big), \quad n = 0, 1, 2, ..., (L-1) \tag{9}$$

where $L$ here denotes the length of the series being analyzed.

Step 2 Split data. Split $\tilde{x}(n)$ into $N_s = \lceil L/s \rceil$ non-overlapping time windows of size $s$.

Step 3 Local Linear Regression. Calculate the local linear trend $y_v$ of each window $v$ by applying a least-squares fit,

$$F^2(v, s) = \frac{1}{s} \sum_{j=1}^{s} \big( \tilde{x}[(v-1)s + j] - y_v(j) \big)^2 \tag{10}$$

where $j = 1, 2, ..., s$ and $v = 1, 2, ..., N_s$.

Step 4 Local Root Mean Square. The RMS fluctuation of $x(n)$ is calculated from the equation below when $q = 2$:

$$F_q(s) = \sqrt{ \frac{1}{N_s} \sum_{v=1}^{N_s} \big[ F^2(v, s) \big] } \tag{11}$$

Step 5 Exponent h(q). The Hurst exponent of each epoch is calculated from the RMS values $F_q(s)$ by varying $s$. It is the exponent of the power-law relation between the RMS values and the corresponding window sizes.

All features are summarized in Table 1. Altogether there are 22 features.

Table 1: Total Features

Description                             Count   Features
Detrended Fluctuation Analysis (DFA)    4       $h(q)^{c_1}$, $h(q)^{c_2}$, $h(q)^{c_3}$, $h(q)^{c_4}$
Relative Power Ratio (Sub band)         12      $\bar{P}_\alpha^{c_1}$, $\bar{P}_\theta^{c_1}$, $\bar{P}_\delta^{c_1}$, $\bar{P}_\alpha^{c_2}$, $\bar{P}_\theta^{c_2}$, $\bar{P}_\delta^{c_2}$, $\bar{P}_\alpha^{c_3}$, $\bar{P}_\theta^{c_3}$, $\bar{P}_\delta^{c_3}$, $\bar{P}_\alpha^{c_4}$, $\bar{P}_\theta^{c_4}$, $\bar{P}_\delta^{c_4}$
Relative Power Ratio (Hemisphere)       6       $RPR(\delta)_{fh}$, $RPR(\delta)_{bh}$, $RPR(\theta)_{fh}$, $RPR(\theta)_{bh}$, $RPR(\alpha)_{fh}$, $RPR(\alpha)_{bh}$

3.3 Recursive Feature Elimination
Recursive Feature Elimination (RFE) works by recursively removing attributes (features) and building a model on those features that remain. It uses the model accuracy to identify which features (and which combination of features) contribute the most to predicting the target attribute. The features which are ranked 1 (and marked True) are the best 14 features, which contribute the most to the model accuracy. All time domain features are selected along with ten frequency domain features, mostly related to the alpha and delta bands.

3.4 Classifiers
Three classifiers, Multi-layered Perceptron (MLP) and two bootstrap models (Decision-Tree and Extra-Tree), are trained with the multi-domain feature set shown in Table 1.

3.4.1 Multi-layered Perceptron (MLP). A Multi-Layer Perceptron is composed of one input layer, one or more hidden layers and a final output layer. Every layer except the output layer is fully connected to the next layer and includes a bias neuron. This model is sensitive to hyper-parameters, which is why we used a grid search to identify appropriate hyper-parameter values of the bootstrap models for the project.

3.4.2 Bootstrap Aggregation Techniques (Bagging Method). Bootstrap aggregation, also known as bagging, is a robust ensemble method. It helps to reduce the variance of the classifiers. Random forest, Decision-Tree, and Extra-Tree are applications of bagging techniques. They train several models independently and then average their predictions. On average, the combined estimator is usually better than any single base estimator because its variance is reduced. In this work, we adopt two bagging techniques.

Decision-Tree Classifier. The purpose of Decision-Trees (DT) is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.

Extra-Tree Classifier. In extremely randomized trees, randomness goes one step further in the way splits are computed. As in random forests, a random subset of candidate features is used, but instead of looking for the most discriminative thresholds, thresholds are drawn at random for each candidate feature, and the best of these randomly generated thresholds is picked as the splitting rule.
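Returning to the time-domain feature of Section 3.2.2, the five DFA steps can be sketched in NumPy as below. This is a hedged illustration rather than the authors' code: the function name `dfa_hurst` and the set of window sizes in `scales` are assumptions (the paper does not list the window sizes), and only complete windows are used, so the number of windows per scale is rounded down.

```python
import numpy as np


def dfa_hurst(x, scales=(16, 32, 64, 128, 256, 512)):
    """Estimate the q = 2 scaling exponent h of one epoch via DFA, Eqs. (9)-(11)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())                  # Step 1: mean-centred cumulative sum
    fluctuations = []
    for s in scales:
        n_win = len(profile) // s                      # Step 2: complete windows only (assumption)
        segs = profile[:n_win * s].reshape(n_win, s)
        j = np.arange(s)
        coeffs = np.polyfit(j, segs.T, deg=1)          # Step 3: one least-squares line per window
        trend = np.outer(j, coeffs[0]) + coeffs[1]     # fitted trends, shape (s, n_win)
        f2 = np.mean((segs.T - trend) ** 2, axis=0)    # Eq. (10): residual variance per window
        fluctuations.append(np.sqrt(f2.mean()))        # Step 4: Eq. (11) with q = 2
    # Step 5: slope of log F(s) against log s gives the exponent
    h, _ = np.polyfit(np.log(scales), np.log(fluctuations), deg=1)
    return h
```

Applied to each of the four channels of an epoch, this gives the four DFA features of Table 1. As a sanity check, uncorrelated white noise is expected to give an exponent near 0.5 and its running sum (a random walk) an exponent near 1.5.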

4 RESULTS AND PERFORMANCE EVALUATION
The dataset used for this project was shared by HiNT (Healthcare Innovation in Neuro Technology). HiNT develops a wearable point-of-care monitoring device that detects when patients at high risk are having a stroke [8]. The dataset consists of 40 patients with a history of ischemic stroke and 40 healthy people. Patients have an average age of 72 with a standard deviation of 13.6, whereas healthy persons have a mean age of 73 with a standard deviation of 7.1. For each person, EEG signals were recorded for between 15 minutes and 4 hours and sampled at 256 Hz.

For this project, data from four channels and two reference channels are collected, as shown in Fig. 4.

Keeping in mind the nature of this project, we consider accuracy, ROC score, precision, recall and F1-score as evaluation metrics, as accuracy and ROC score alone are not sufficient for model evaluation [18]. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true positive rate is also known as sensitivity, recall or probability of detection in machine learning. Since ischemic stroke detection is a binary classification problem, we also include the confusion matrix to gain further insights into the performance of each algorithm.

Our implementation is based on the Python libraries Keras [2] and scikit-learn [15].

4.1 Classifiers
The parameters of the Multi-layered Perceptron, Decision-Tree and Extra-Tree models are shown in Table 2.

Table 2: Parameters of machine learning classifiers.

Classifiers                       Parameters
Multi-Layered Perceptron (MLP)    Fully connected (dense) model. Layer 1: 45 neurons; Layer 2: 18 neurons; Layer 3: 1 neuron
Decision-Tree (DT)                Number of trees = 675
Extra-Tree (ET)                   Number of trees = 675

4.2 Q-fold cross-validation method
Ideally, the performance of our EEG classification algorithm should be measured by its probability of error, which requires knowledge of the ground truth of the patient's health state. However, since the ground truth of the health state of a patient measured from the signal epoch is not known, we treat the library of signal epochs classified by clinical experts as the ground truth. From the library of collected signal epochs, we randomly selected some as training signals and some as test signals, so that the validation of our classification methods is carried out as follows:
(1) For each of the classes $C_l$, $l \in \{0, 1\}$, containing $N_l$ epochs of the patients or healthy persons, we randomly choose $N_{lT}$ epochs as the test set and the rest $(N_l - N_{lT})$ as the training (library) set.
(2) Different classifiers were applied on the features extracted from each epoch.
(3) The above steps are repeated Q times (Q-fold cross-validation), each time choosing different sets of training and test features in $C_l$. The probability of correct classification for each class can then be estimated by $\hat{P}_{c_l} = \frac{1}{Q} \sum_{q=1}^{Q} \hat{P}_{c_l q}$, where $\hat{P}_{c_l q}$ denotes the estimated probability of correct classification of class $l$ at the $q$th trial, $q = 1, ..., Q$, i.e., $\hat{P}_{c_l q} = N_{lc}/N_{lT}$, with $N_{lc}$ and $N_{lT}$ being the number of correct classifications and the total number of test members in class $C_l$ at the $q$th trial.
We adopted the 10-fold cross-validation method in this paper.

4.3 Results
For validation of results, 40 different combinations of test and training datasets were considered for the training and testing of all classifiers. The data of the 40 healthy subjects and 40 patients are randomly divided into training and test datasets. 80% of the healthy and patient data is in the training set, and 20% is in the test set. No patient appears in both the training and the test data.

Figure 5: Classifiers Performance regarding accuracy. Blue refers to the accuracy of the test dataset, whereas grey represents the accuracy of the training dataset. The black error bar shows the standard deviation.

4.3.1 Accuracy. From Fig. 5, it is evident that the performance of the bootstrap classifiers is consistent regarding the accuracy of training (which is 100%) and test (where accuracy is more than 93%). The accuracy of MLP is lower than that of the DT and ET classifiers but is still reasonable. It is observed that there is no significant difference between the MLP accuracy on the training and test datasets. This implies that the MLP model might under-fit the data. A deeper architecture may be adopted.
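The evaluation protocol described above (an 80%/20% split per class with no subject shared between training and test data, feature selection, an Extra-Tree model with the tree count of Table 2, and the listed metrics) can be sketched with scikit-learn as follows. The helper names `subject_split` and `evaluate_extra_trees`, the use of an Extra-Trees model as the RFE ranker, and the label convention (1 = patient) are assumptions for illustration; the Keras MLP of Table 2 is not shown.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)


def subject_split(y, subject_ids, test_frac=0.2, seed=0):
    """Hold out test_frac of the subjects of each class; returns boolean epoch masks."""
    rng = np.random.default_rng(seed)
    test_subjects = []
    for label in np.unique(y):
        subjects = np.unique(subject_ids[y == label])
        n_test = max(1, round(test_frac * len(subjects)))
        test_subjects.extend(rng.choice(subjects, size=n_test, replace=False))
    test_mask = np.isin(subject_ids, test_subjects)
    return ~test_mask, test_mask


def evaluate_extra_trees(X, y, subject_ids, n_trees=675, n_selected=14, seed=0):
    """Train on one subject-wise split and report the metrics used in Section 4."""
    train, test = subject_split(y, subject_ids, seed=seed)
    # Assumption: RFE ranks features with a small Extra-Trees model;
    # the paper does not name the estimator used inside RFE.
    ranker = ExtraTreesClassifier(n_estimators=50, random_state=seed)
    selector = RFE(ranker, n_features_to_select=n_selected).fit(X[train], y[train])
    clf = ExtraTreesClassifier(n_estimators=n_trees, random_state=seed)
    clf.fit(selector.transform(X[train]), y[train])
    X_test = selector.transform(X[test])
    pred = clf.predict(X_test)
    proba = clf.predict_proba(X_test)[:, 1]            # probability of class 1 (patient)
    return {
        "accuracy": accuracy_score(y[test], pred),
        "roc_auc": roc_auc_score(y[test], proba),
        "precision": precision_score(y[test], pred),
        "recall": recall_score(y[test], pred),
        "f1": f1_score(y[test], pred),
        "confusion": confusion_matrix(y[test], pred, labels=[0, 1]),
    }
```

Here `X` is an epochs-by-features array (22 columns in the paper), `y` holds one label per epoch and `subject_ids` identifies the person each epoch came from. Calling the function with different `seed` values gives different train/test combinations, in the spirit of the repeated splits described in Section 4.3.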
4.3.2 ROC. A ROC score is considered good if it is close to 1 (or 100%). Overall, the ROC score of all three classifiers is higher than 80%. The average ROC score of these models is higher than 85%. It is evident from Fig. 6 that with the MLP classifier, the micro-average ROC score is higher than the overall ROC curve score (black line), which reflects that the data is not balanced and that this has affected the performance of the MLP classifier.

Figure 6: ROC Curve of MLP Classifier.

The Extra-Tree model shown in Fig. 8 is the most promising and stable compared to the other two models.

Figure 7: ROC Curve of Decision-Tree Classifier.

Figure 8: ROC Curve of Extra-Tree Classifier.

4.3.3 Confusion Matrix. The confusion matrix provides the TPR and FNR. It validates and verifies the ROC curve score and the accuracy. A few interesting facts are prominent from the confusion matrices of the different models shown in Figs. 9 and 10.
• Healthy datasets are predicted with 100% accuracy by the bootstrap models with both healthy and patient data (Fig. 9), whereas there are variations with the test datasets (Fig. 10).
• The number of epochs for healthy datasets is more than 8 times higher than the number of epochs of patient datasets. This leads to the data imbalance problem (Fig. 9).

Figure 9: Confusion matrix of all three models’ performance with the training dataset.

Figure 10: Confusion matrix of all three models’ performance with the test dataset.

It can be concluded that data imbalance has affected the overall performance of all models. For example, the training dataset (Fig. 9) contains 17276 healthy epochs and 9660 patient epochs, so the healthy data considerably outnumber the patient data. ET handles imbalanced datasets slightly better than MLP and DT.

4.3.4 Classification Report. Overall, the precision, recall and F1-score of DT and ET are higher than 90%, which reveals that these two classifiers have predicted more accurate results than MLP. However, the MLP precision, recall and F1-score are reasonable as well (Fig. 11).

Figure 11: Average Precision, Recall and F1-Score of all classifiers. The black line marks 90% benchmark.

5 CONCLUSION AND FUTURE WORK
Ischemic stroke detection with a multi-domain feature set has shown promising results for all three selected classifiers (Multi-layered Perceptron, Decision-Tree and Extra-Tree) when there is no issue of data imbalance. The performance of all classifiers is triangulated by three different performance metrics. The bootstrap-based classifiers (Decision-Tree and Extra-Tree) have shown consistency in their performance compared to the MLP classifier.

For future work, a number of directions will be pursued. First, we need to improve the classification accuracy of unhealthy subjects. Currently, the best performing model has zero FPR but a relatively high FNR. This implies that, in practice, unhealthy subjects may be diagnosed as healthy subjects. This is unacceptable for medical diagnosis. We believe that this is in part due to different signal characteristics arising from the locality of strokes. Possible remedies include more robust features and location-specific models. Another source of errors may come from unbalanced data between healthy and unhealthy subjects. Secondly, the application of these models on wearable devices has its own set of challenges, such as computation and power drainage. We will investigate how to reduce the model complexity without sacrificing performance.

REFERENCES
[1] Jason Brownlee. 2015. Tactics to combat imbalanced classes in your machine learning dataset. Machine Learning Mastery 19 (2015).
[2] François Chollet et al. 2015. Keras. https://github.com/fchollet/keras.
[3] Simon Finnigan, Andrew Wong, and Stephen Read. 2016. Defining abnormal slow EEG activity in acute ischaemic stroke: Delta/alpha ratio as an optimal QEEG index. Clinical Neurophysiology 127, 2 (2016), 1452–1459.
[4] Division for Heart Disease and Stroke Prevention. 2018. CDC National Center for Chronic Disease Prevention and Health Promotion, Division for Heart Disease and Stroke Prevention. https://www.cdc.gov/stroke/index.htm
[5] Brandon Foreman and Jan Claassen. 2012. Quantitative EEG for the detection of brain ischemia. Critical care 16, 2 (2012), 216.
[6] Gabriele Gratton, Michael GH Coles, and Emanuel Donchin. 1983. A new method for off-line removal of ocular artifact. Electroencephalography and clinical neurophysiology 55, 4 (1983), 468–484.
[7] Espen Alexander Fürst Ihlen. 2012. Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology 3 (2012), 141.
[8] HiNT inc. 2017. HiNT Healthcare Innovation in NeuroTechnology. https://www.hintneuro.com/
[9] Jan W Kantelhardt, Stephan A Zschiegner, Eva Koscielny-Bunde, Shlomo Havlin, Armin Bunde, and H Eugene Stanley. 2002. Multifractal detrended fluctuation analysis of nonstationary time series. Physica A: Statistical Mechanics and its Applications 316, 1-4 (2002), 87–114.
[10] Zhiyong Liu, Jinwei Sun, Yan Zhang, and Peter Rolfe. 2016. Sleep staging from the EEG signal using multi-domain feature extraction. Biomedical Signal Processing and Control 30 (2016), 86–97.
[11] Vladimir Matic, Perumpillichira Joseph Cherian, Ninah Koolen, Amir H Ansari, Gunnar Naulaers, Paul Govaert, Sabine Van Huffel, Maarten De Vos, and Sampsa Vanhatalo. 2015. Objective differentiation of neonatal EEG background grades using detrended fluctuation analysis. Frontiers in human neuroscience 9 (2015), 189.
[12] Dariush Mozaffarian, Emelia J Benjamin, Alan S Go, Donna K Arnett, Michael J Blaha, Mary Cushman, Sandeep R Das, Sarah de Ferranti, Jean-Pierre Després, Heather J Fullerton, et al. 2015. Heart disease and stroke statistics-2016 update: a report from the American Heart Association. Circulation (2015), CIR–0000000000000350.
[13] WRW Omar, Z Mohamad, MN Taib, and R Jailani. 2014. ANN classification of ischemic stroke severity using EEG sub band relative power ratio. In Systems, Process and Control (ICSPC), 2014 IEEE Conference on. IEEE, 157–161.
[14] WRW Omar, MN Taib, R Jailani, Z Mohamad, AH Jahidin, and Z Sharif. 2014. Application of Discriminant Function Analysis in ischemic stroke group level discrimination. In Signal Processing & its Applications (CSPA), 2014 IEEE 10th International Colloquium on. IEEE, 229–232.
[15] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825–2830.
[16] Osmalina N Rahma, Sastra K Wijaya, Cholid Badri, et al. 2017. Electroencephalogram analysis with extreme learning machine as a supporting tool for classifying acute ischemic stroke severity. In Sensors, Instrumentation, Measurement and Metrology (ISSIMM), 2017 International Seminar on. IEEE, 180–186.
[17] J Röschke and JB Aldenhoff. 1992. A nonlinear approach to brain function: deterministic chaos and sleep EEG. Sleep 15, 2 (1992), 95–101.
[18] Takaya Saito and Marc Rehmsmeier. 2015. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS one 10, 3 (2015), e0118432.
[19] Pushpendra Singh, Shiv Dutt Joshi, Rakesh Kumar Patney, and Kaushik Saha. 2016. Fourier-based feature extraction for classification of EEG signals using EEG rhythms. Circuits, Systems, and Signal Processing 35, 10 (2016), 3700–3715.
[20] Zhichuan Tang, Chao Li, and Shouqian Sun. 2017. Single-trial EEG classification of motor imagery using deep convolutional neural networks. Optik-International Journal for Light and Electron Optics 130 (2017), 11–18.
[21] Peter Welch. 1967. The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Transactions on audio and electroacoustics 15, 2 (1967), 70–73.
[22] WHO. 2018. WHO World Health Organization. http://www.who.int/topics/cerebrovascular_accident/en/
