
Computer Methods and Programs in Biomedicine 140 (2017) 77–91


A comparative review on sleep stage classification methods in patients and healthy individuals

Reza Boostani a, Foroozan Karimzadeh a,∗, Mohammad Nami b

a Department of Computer Science and Information Technology, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
b Department of Neuroscience, School of Advanced Medical Sciences and Technologies, Shiraz University of Medical Sciences, Shiraz, Iran

∗ Corresponding author. E-mail addresses: boostani@shirazu.ac.ir (R. Boostani), f.karimzadeh@cse.shirazu.ac.ir, binazir.bfk@gmail.com (F. Karimzadeh), torabinami@sums.ac.ir (M. Nami).

Article history: Received 7 August 2016; Revised 17 November 2016; Accepted 5 December 2016

Keywords: Sleep stage classification; Wavelet transform; Random forest classifier; Entropy

Abstract

Background and objective: Proper scoring of sleep stages can provide clinical information for diagnosing patients with sleep disorders. Since traditional visual scoring of the entire sleep is highly time-consuming and dependent on experts' experience, automatic schemes based on electroencephalogram (EEG) analysis have been broadly developed to solve these problems. This review presents an overview of the most suitable methods in terms of preprocessing, feature extraction, feature selection and classifier adopted to precisely discriminate the sleep stages.

Methods: This study rounds up a wide range of research findings concerning the application of sleep stage classification. The fundamental qualitative methods along with the state-of-the-art quantitative techniques for sleep stage scoring are comprehensively introduced. Moreover, according to the results of the investigated studies, five research papers are chosen and practically implemented on a well-known, publicly available sleep EEG dataset. They are applied to single-channel EEG of 40 subjects containing equal numbers of healthy and patient individuals. Feature extraction and classification schemes are assessed in terms of accuracy and robustness against noise. Furthermore, an additional implementation phase is added to this research in which all combinations of the implemented features and classifiers are considered to find the best combination for sleep analysis.

Results: According to the results achieved on both groups, the entropy of wavelet coefficients together with the random forest classifier are chosen as the best feature and classifier, respectively. This feature and classifier provide 87.06% accuracy on healthy subjects and 69.05% on the patient group.

Conclusions: In this paper, the road map of EEG-based sleep stage scoring methods is clearly sketched. Implementing the state-of-the-art methods, and even their combinations, on both healthy and patient datasets indicates that, although the accuracy on healthy subjects is remarkable, the results obtained by the quantitative methods for the main target community (the patient group) are not yet promising. The reason arises from adopting sleep EEG features that are not matched to the problem, borrowed from other signal processing fields such as communication. In conclusion, developing sleep-pattern-related features is deemed necessary to enhance the performance of this process.

© 2016 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Sleep covers almost one third of the human lifespan. Due to the direct relationship between sleep quality and humans' physical and mental performance, sufficient night sleep is crucial. As a result of mechanized and stressful life, sleep disturbance is increasing in modern societies. In addition, research findings suggest that several psychological and neurological disorders can deteriorate normal sleep patterns [1]. According to the International Classification of Sleep Disorders (ICSD-II) criteria [2], eighty-four different sleep disorders are defined. Sleep disorders not only cause a reduction in physical performance during the day, but in the long term also leave negative effects on cognitive functions such as attention, learning and memory [3]. For instance, besides the significant side effects of obstructive sleep apnea syndrome (OSAS), including the increased risk of cardiovascular diseases, neurocognitive decline and excessive daytime sleepiness are considered potential consequences [3].

To achieve the right diagnosis and treatment based on the various biological records, accurate sleep scoring is deemed to be a crucial part of the process. Up to now, the conventional

visual scoring method is still the most acceptable approach, though it involves visual interpretation of different signals [4]. Qualitative scoring, however, is subject to some pitfalls, including the dependence on experts' experience, which might result in different scoring results by different experts [5,6]. In an optimistic view, the agreement between the results obtained by two experts is, on average, 83 ± 3% [7], which is not convincing. Moreover, visual inspection is a time-consuming process for whole-night EEG labeling. Therefore, automatic scoring is deemed to be an efficient approach [8,9].

Table 1
The frequency range of sleep EEG bands and events.

Freq. band        Freq. range (Hz)
Delta             0.5–4
Theta             4–8
Alpha             8–13
Beta              13–30
Sleep spindles    12–14
K-complex         0.5–1.5
Several research teams have recently proposed various methods to automate the process of sleep classification (sleep scoring). Several signal processing techniques along with machine learning algorithms have been adopted to obtain useful information from biological signals [10]. Such methods are divided into two categories, i.e. multi-channel and single-channel processing. In the former approach, the combination of various biological signals, such as multi-channel EEG, electromyogram (EMG) [11] and electrooculogram (EOG) signals, is utilized to extract informative features [12–18]. While the use of multi-channel signals leads to higher performance [19], it imposes a considerable cost on patients, especially in home sleep testing [5]. Moreover, the excessive number of wire connections during the recording process might per se result in sleep disturbance [20].

On the other hand, single-channel EEG based analysis is a cheap way of automatic sleep scoring. EEG contains valuable and interpretable information reflecting brain activity, which is used not only in extensive research contexts pertaining to the brain, but also to diagnose and consequently treat neurological disorders [21]. Sleep neurology is a progressively evolving subspecialty in which sleep EEG signals are utilized to study the function of the brain during sleep and to diagnose various types of disorders based on sleep stage analysis. There are many single-channel approaches for automatic sleep stage classification in the literature [9,15,22–24]. According to the available evidence [23], EEG signals are almost sufficient for reliable scoring.

To the best of our knowledge, the reported classification accuracies of the suggested methods are mostly obtained from healthy subjects [22,24,25]. Only a few methods in the literature are tested on patients with various sleep disorders [26]. However, it should be noted that automatic sleep scoring methods should achieve acceptable performance when analyzing EEG signals in sleep disorders. Sleep disorders (such as sleep-disordered breathing, REM behavioral disorder and sleep-related movement disorders) impose disruptive effects on the recorded signals. In these cases, sleep signals behave more irregularly and contain more movement artifacts. In addition, drug consumption may also change sleep patterns [6]. Such pitfalls may influence both manual and automatic sleep scoring, and the issue tends to be more profound in automatic methods. Inaccurate sleep scoring leads to misdiagnosis; consequently, treatment based on this wrong diagnosis has negative consequences for the patients' disorder outcome and well-being [6]. A few reports confirm that, due to the irregularity of sleep EEG among patients, the scoring accuracy does not exceed 75%, which is considered below expected standards [1].

This paper reviews most of the state-of-the-art automatic sleep scoring methods and discusses their pros and cons. Such insights are expected to help in implementing several single-channel methods and applying them to normal and patient groups in order to assess the performance of published methods in different circumstances. To our knowledge, thus far, no comprehensive review on sleep EEG scoring has compared the results of state-of-the-art single-channel methods on both patients and healthy subjects. Moreover, the performance of different classifiers is compared to find the best classifier for this application. In addition, to assess the robustness of these methods, Gaussian noise is added at different signal-to-noise-ratio (SNR) values and performance is measured in the presence of the noise.

In the remainder of this report, Section 2 explains the qualitative and quantitative sleep stage assessments. Section 3 describes several single-channel based methods in detail. In Section 4, the results of these methods on both normal and patient data are demonstrated. The final section is dedicated to the discussion and conclusion.

2. Methodology

Sleep stages can be analyzed qualitatively or quantitatively. In this section, the visual sleep stage scoring criteria (qualitative methods) are explained first. Then, several quantitative sleep scoring methods are introduced in detail.

2.1. Polysomnographic data and qualitative assessment

Sleep medicine uses polysomnography (PSG) as an efficient method to record several biological signals for evaluating sleep quality. PSG recordings generally involve overnight monitoring of the patient's sleep EEG, airflow through the nose and mouth, respiratory rate, blood pressure changes, electrocardiogram (ECG) signals, blood oxygen level, EOG signals, as well as the chin and leg surface EMGs [26,27].

The qualitative analysis (visual inspection) of a whole-night PSG recording is performed according to one of the two available standards [28]: the traditional Rechtschaffen and Kales (R&K) rules [29] and the more recently developed standard laid down by the American Academy of Sleep Medicine (AASM) [30]. Based on both the R&K and AASM criteria, the EEG is the most informative signal compared to the others. Experts analyze the EEG visually within successive 30-second intervals (epochs), mainly based on its standard rhythms (frequency bands), and assign a sleep stage as a label to each epoch successively [28]. The standard sleep EEG rhythms are categorized as the Delta, Theta, Alpha and Beta bands (Table 1). Moreover, two important events occurring in sleep EEG are sleep spindles and K-complexes, which both occur exclusively in the second sleep stage.

For almost three decades, the R&K sleep classification manual was the only widely accepted standard to describe the human sleep process [31]. According to the R&K criteria, a sleep study comprises seven stages: wakefulness (W), non-rapid eye movement (NREM) stages 1–4, rapid eye movement (REM) and movement time (MT). Although the recommended setup is brief and the instructions are easy to follow, many issues in sleep studies remained unresolved [32].

Regarding the most recent AASM manual, at least 3 electrodes should be placed on the frontal, central and occipital head regions to record EEG signals [30]. The criteria concerning sleep-wake transition, sleep stages, sleep spindles, K-complexes, micro-arousals, slow waves and REM sleep were revised. Unlike the R&K recommendations, in the new manual stages 3 and 4 are merged into N3 and the MT stage is removed from the analysis.
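Because the datasets used later in this review are annotated with R&K labels while the analysis in Section 4 is reported in AASM terms (S3 and S4 merged into N3, movement and unscored epochs discarded), a relabeling step of this kind is usually applied first. The following is a minimal Python sketch of such a mapping; the label strings follow the annotation conventions quoted in Section 3.1, and the helper itself is ours, not part of any reviewed method.

```python
# Minimal sketch: map R&K epoch annotations to AASM-style labels.
# Label strings follow the dataset descriptions in Section 3.1; adapt as needed.
RK_TO_AASM = {
    "W": "W",                        # wakefulness
    "S1": "N1", "S2": "N2",
    "S3": "N3", "S4": "N3",          # AASM merges stages 3 and 4 into N3
    "R": "REM",
    "MT": None, "M": None, "U": None,  # movement time / unscored epochs are discarded
}

def relabel_epochs(rk_labels):
    """Convert a list of R&K stage labels to AASM labels, dropping unusable epochs."""
    aasm, keep_idx = [], []
    for i, lab in enumerate(rk_labels):
        mapped = RK_TO_AASM.get(lab)
        if mapped is not None:
            aasm.append(mapped)
            keep_idx.append(i)
    return aasm, keep_idx

# Example: ["W", "S1", "S2", "S4", "MT", "R"] -> (["W", "N1", "N2", "N3", "REM"], [0, 1, 2, 3, 5])
```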
The transition from the R&K rules to the new AASM standard had only a minor influence on total sleep time (TST), sleep efficiency (SE) and REM stage measures, while it affected the quantification of sleep latency, wake after sleep onset (WASO) and the distribution of NREM sleep stages. These differences have implications in both the clinical and research fields.

According to the AASM manual, each of the five stages is defined below and is also illustrated in Fig. 1:

• W: The awake state (stage W) is characterized by alpha or faster frequency bands occupying more than 50% of the epoch, frequent eye movements and high EMG tone.
• N1: Stage N1 is scored when alpha occupies more than 50% of the epoch while theta activity, slow rolling eye movements and vertex waves are evident.
• N2: Stage N2 is scored when sleep spindles or K-complexes (less than 3 min apart) are noted.
• N3: Stage N3 is characterized by delta activity detected in over 20% of the epoch length.
• REM: An epoch is marked as REM when saw-tooth waves along with rapid eye movements, as well as the lowest EMG activity, are observed throughout the epoch.

Fig. 1. This figure illustrates the Wake, N1, N2, N3 and REM stages, from top to bottom panels, respectively.

2.2. Quantitative assessment

Several automatic sleep stage classification methods have been presented in the literature [33,34]. To demonstrate the growing interest in sleep stage classification, the number of papers published in 28 relevant journals from the year 2000 to 2015 is shown in Table 2. Among such publications, "Computer Methods and Programs in Biomedicine" and the "Journal of Neuroscience Methods" are of the greatest interest.

As demonstrated in Fig. 2, the rate of publications remained steady between 2000 and 2009, while it increased quickly from 2010 to 2015. This may indicate the recent growing attention and joint cooperation of biomedical engineers and practitioners towards achieving an accurate automated (expert-independent) method for sleep stage scoring.

Fig. 2. Distribution of research papers by year of publication from 2000 to 2015.

Based on the available literature, quantitative sleep stage scoring schemes comprise three common steps: preprocessing, feature extraction and classification. Furthermore, in some papers a feature selection step is added after feature extraction in order to find a suitable subset of features [35]. What follows is an introduction to the most frequent techniques in each step. The process of automatic sleep stage scoring is schematically illustrated in Fig. 3.

Fig. 3. Automatic sleep stage classification process.

2.2.1. Preprocessing
The presence of artifacts might lead to misinterpretation, inaccuracy and distorted quantitative results; therefore, a preprocessing step is necessary to remove artifacts and magnify the informative components of the raw EEG signals prior to any further analysis [36]. There are biasing factors which potentially affect the accuracy of sleep EEG scoring [26]. These factors can be measurement noise, physiological factors (including psycho-physiological variants and aging) and pathological conditions (such as sleep-disordered breathing and REM behavioral disorder) [30,37].

Despite hardware filtering during the recording, EEG signals are still contaminated by other noise sources [38]. Diffuse artifacts may mask cerebral activity and simulate sleep phasic events (sharp vertex waves, K-complexes). Moreover, some noise and artifacts are created by the measurement apparatus. Power line interference (the 50/60 Hz components) or movements of the wires and electrodes can impose changes in the baseline characteristics of the signals. In addition, some artifacts originate from non-cerebral sources such as eye movements, muscle activity and cardiac activity. Rolling eye movements occur during wakefulness, while in REM sleep the eye movements acquire saccadic patterns. Furthermore, given the comparatively high amplitude of the ECG, the QRS complex regularly interferes with the EEG signals, causing spiky patterns [39].

Table 2
Distribution of research papers by journal.

Journal title Amount Percentage (%)

Computer Methods and Program in Biomedicine 6 10.27


Journal of Neuroscience Methods 6 10.27
Computers in Biology and Medicine 5 8.47
Journal of Medical Systems 5 8.47
Expert System with Application 4 6.78
Artificial Intelligence in Medicine 3 5.08
IEEE Transactions on Biomedical Engineering 3 5.08
Biomedical Signal Processing and Control 2 3.39
International Journal of Neural System 2 3.39
Journal of Sleep Research 2 3.39
Methods of information in medicine 2 3.39
Neurocomputing 2 3.39
Sleep Medicine Reviews 2 3.39
Advances in Adaptive Data Analysis 1 1.69
Biomedical Engineering Applications, Basis and Communications 1 1.69
Biomedical Engineering On-line 1 1.69
Brain Research Bulletin 1 1.69
Clinical Neurophysiology 1 1.69
Computers and Biomedical Research 1 1.69
Frontiers of Computer Science 1 1.69
IEEE Transactions on Neural Systems and Rehabilitation Engineering 1 1.69
IEEE Transactions on Instrumentation and Measurement 1 1.69
International Journal of Adaptive Control and Signal Processing 1 1.69
Medical and Biological Engineering and Computing 1 1.69
Neural Computing and Applications 1 1.69
Sleep 1 1.69
Sleep Medicine 1 1.69
World Academy of Science, Engineering and Technology 1 1.69
Total 59 100

Muscle activity generates EMG signals which can corrupt and distort the EEG, especially during arousal periods [26,40]. Further to the instrumental and biological artifacts, inter-individual variability of background activity and phasic events, as well as age-related changes, may influence sleep stage scoring. On the other hand, focal cerebral lesions might decrease the amplitude and density of phasic events and therefore mimic the REM pattern. In patients with wake or nocturnal epilepsy, interictal epileptic EEG activity may similarly mask sleep activity or bias sleep EEG phasic events [6]. Consequently, the above biasing factors may hinder accurate scoring and emphasize the role of preprocessing in order to avoid misleading clinical results, both in visual and in automatic scoring.

Several methods have been proposed to attenuate or eliminate the effects of artifacts in EEG signals. These methods are described below:

• Digital Filtering: This covers a vast class of different filters, which can be linear or non-linear. To pre-process raw EEGs, it is common to apply a band-pass filter (e.g. Butterworth) to eliminate the undesired parts. In order to eliminate the power line noise from the signals, a digital notch filter is utilized [38]. Band-pass filters [41] such as Butterworth [38] are used to reduce muscle artifacts and eliminate linear trends [42]. Further to the linear filtering, adaptive filters are used to remove the effects of EOG and ECG artifacts, whose frequency content overlaps with the EEG spectrum [43].
• Regression: Regression methods are able to learn a signal's behavior. If an additive colored noise distorts the EEG, a regression method is used to learn this trend; by modeling the colored noise, it can then be subtracted from the recorded signal [44].
• Wavelet Transform (WT): The WT decomposes a signal into its sub-bands. By calculating the energy in each sub-band and applying a threshold, the noisy and undesired bands are eliminated [45]. Since there are many mother wavelet functions, the one with the highest similarity to the nature of EEG (in terms of frequency bands) should be selected. Reviewing the existing papers, we can observe that Symmlet and Daubechies mother wavelets are the most commonly used wavelets for EEG processing [46].
• Blind Source Separation (BSS): An alternative artifact removal approach is the family of BSS algorithms. Independent Component Analysis (ICA) [39,40] is one of the well-known BSS methods for producing de-correlated brain source signals after fading out the artifacts. By calculating the energy of the source signals and using a threshold, the independent signals with lower energy are eliminated. Finally, with an inverse transform, the denoised signals are projected back into the spatio-temporal domain.

After the preprocessing step, based on the mentioned criteria, the EEG record is segmented into successive 30-s epochs with no overlap, after which further analysis is carried out on each epoch; a short sketch of this chain follows.
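As an illustration of the preprocessing chain described above (band-pass filtering, notch filtering of the power-line component and segmentation into non-overlapping 30-s epochs), a minimal sketch is given below. It assumes NumPy/SciPy, a single-channel recording and the order-5 Butterworth band-pass of 0.5–35 Hz reported in Section 4; it is not the exact pipeline of any of the reviewed papers.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess(eeg, fs=100.0, epoch_len_s=30):
    """Band-pass + notch filter a single-channel EEG and cut it into 30-s epochs."""
    # Order-5 Butterworth band-pass, 0.5-35 Hz (removes baseline drift and high-frequency noise).
    b, a = butter(5, [0.5, 35.0], btype="bandpass", fs=fs)
    clean = filtfilt(b, a, np.asarray(eeg, dtype=float))
    # 50 Hz notch for power-line interference (only applicable if fs is high enough).
    if fs > 2 * 50.0:
        bn, an = iirnotch(50.0, Q=30.0, fs=fs)
        clean = filtfilt(bn, an, clean)
    # Segment into successive non-overlapping epochs.
    n = int(epoch_len_s * fs)
    n_epochs = len(clean) // n
    return clean[: n_epochs * n].reshape(n_epochs, n)

# epochs = preprocess(raw_signal, fs=100.0)   # shape: (n_epochs, 3000) at 100 Hz
```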

2.2.2. Feature extraction
Features are descriptive values that reflect the content of a signal from a particular aspect. Usually, different types of features are extracted from a windowed signal; therefore, to avoid collecting redundant information, the elicited features should be as independent as possible. Extracting informative, discriminative and independent features is a crucial step in preparing a suitable set of values for a classifier [47]. A wide variety of signal processing techniques are utilized to extract discriminative information from the EEG signals in order to automatically classify the sleep stages [10,36]. Features can be broadly divided into four categories (time domain, frequency domain, time-frequency domain and nonlinear), which are briefly discussed below.

Time domain features. Time domain features can represent the morphological characteristics of a signal. They are easily interpretable and suitable for real-time applications. Some of the widespread time-domain features are described below.

Table 3
Time domain features.

Feature name           Formula                                                        Elucidation
Mean                   μ = (1/N) Σ_{n=1}^{N} x_n                                      x_n (n = 1, 2, …, N) is a time series.
Variance               var = (1/(N−1)) Σ_{n=1}^{N} (x_n − μ)^2                        A measure of how far a set of numbers is spread out from the mean.
Standard deviation     std = (var)^{1/2}                                              A measure of dispersion of a dataset.
Skewness               skew = E[(x − μ)^3] / (E[(x − μ)^2])^{3/2}                     A measure of the symmetry of a dataset about its mean.
Kurtosis               kurt = E[(x − μ)^4] / (E[(x − μ)^2])^2                         A measure of the tailedness of a dataset.
Threshold percentile   Th_per = (Th × N) / 100                                        Selects the samples that fall below the threshold (in percentage).
Median                 M = x_{(N+1)/2} for odd N; M = (x_{N/2} + x_{N/2+1})/2 for even N   The sample value separating the higher half from the lower half of a dataset.

• Statistical parameters: The 1st- to 4th-order moments of a time series, i.e. mean, standard deviation, skewness and kurtosis [25,41], as well as the median and the 25th and 75th percentiles of the signal distribution [35], are known as the most effective time domain features for EEG signals. Their mathematical formulas are presented in Table 3. They are not only applied to EEG signals but are also used to extract features from other biological signals, including EOG and EMG [13,38].
• Hjorth parameters: The Hjorth parameters were introduced by Bo Hjorth in 1970 [48]. These parameters include Activity (H_a), Mobility (H_m) and Complexity (H_c), as measures of the variance of a time series x(t), the proportion of the standard deviation of the power spectrum, and the change in frequency, respectively. They are widely applied features for EEG analysis [41,49] and are described below.

H_a = var(x(t))    (1)

H_m = sqrt( var(dx(t)/dt) / var(x(t)) )    (2)

H_c = H_m(dx(t)/dt) / H_m(x(t))    (3)

• Zero Crossing Rate (ZCR): The ZCR represents the number of times a signal crosses the baseline (Eq. (4)). The baseline can be obtained by taking the mean value of the windowed signal [41]. It measures the number of sign changes within a windowed signal. Given the differences in quality and appearance of each sleep stage in the time domain, the ZCR may vary from one stage to another; as such, it can be used as a suitable feature for sleep stage classification [25,35,50]. Nevertheless, the ZCR is highly sensitive to additive noise.

ZCR = count{ n | x_n · x_{n−1} < 0 }    (4)

Frequency domain features. Frequency domain features are versatile features which are repeatedly utilized for describing changes in EEG signals [7,14,38,51,52]. There are several approaches to extract frequency features, summarized herewith.

Spectral estimation. To obtain the spectral characteristics of EEG signals, the time series should first be transferred to the frequency domain. Due to the stochastic nature of EEG signals, the signal autocorrelation needs to be estimated first. Then, by taking the Fourier transform (FT) of the autocorrelation function, the power spectral density (PSD) is obtained. The two approaches to estimate the PSD of a windowed signal are parametric and non-parametric methods.

• Parametric methods: Parametric methods are model-based approaches in which the spectrum is estimated from a signal model. Autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models are amongst the popular methods in EEG processing [53]. When signals have low SNR and long length, parametric approaches provide a smoother and better estimation of the power spectrum. However, determining the model order of the AR model is a main concern in such methods, since AR is able to model the two others. To solve this problem, some criteria such as Akaike's information criterion (AIC) [54] are introduced in the literature [55].
• Non-parametric methods: In non-parametric approaches, PSD values are calculated directly from the signal samples in a given windowed signal. They are based on the Fourier transform [56]. Among non-parametric methods, the Periodogram and Welch [57] schemes are common for calculating the PSD and extracting features from EEGs. Easy implementation based on the Fast Fourier Transform (FFT) algorithm is the merit of non-parametric methods; however, they attain low frequency resolution when the signal interval becomes short.

Higher Order Spectra (HOS). HOS is another method used to extract frequency domain features. HOS techniques are employed in several biomedical applications [58]. HOS represents the frequency content of the higher order statistics of a signal (cumulants) [59,60]. The power spectrum corresponds to the second-order spectrum, which loses the phase information. The 3rd-order spectrum is known as the bi-spectrum, shown in Eq. (5).

B(f_1, f_2) = E[ X(f_1) X(f_2) X*(f_1 + f_2) ]    (5)

where E[.] is the expectation operator, X(f) is the Fourier transform of the signal and f is the frequency.

The advantage of using HOS is its ability to reveal non-Gaussian and nonlinear characteristics of the data [58,60]. Since EEG is a complex signal subject to non-linear interactions among its frequency components, HOS methods are useful approaches for analyzing sleep EEG signals [59]. A small extraction sketch covering several of the above features follows.
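The sketch below computes a few of the features discussed above for a single epoch, namely the Table 3 statistics, the Hjorth parameters (Eqs. (1)–(3)), the ZCR (Eq. (4)) and Welch-based relative band powers over the Table 1 rhythms. NumPy/SciPy are assumed, and the parameter choices (4-s Welch segments, the Table 1 band edges, 100 Hz sampling) are illustrative rather than taken from any reviewed paper.

```python
import numpy as np
from scipy.signal import welch
from scipy.stats import kurtosis, skew

BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Table 1 (Hz)

def hjorth_parameters(x):
    """Hjorth activity, mobility and complexity (Eqs. (1)-(3))."""
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def zero_crossing_rate(x):
    """Number of sign changes around the epoch mean (Eq. (4))."""
    c = x - np.mean(x)
    return int(np.sum(c[:-1] * c[1:] < 0))

def relative_band_powers(x, fs):
    """Relative power in each Table 1 band, from a Welch PSD estimate (4-s segments)."""
    freqs, psd = welch(x, fs=fs, nperseg=int(4 * fs))
    total = np.trapz(psd, freqs)
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = np.trapz(psd[mask], freqs[mask]) / total
    return powers

def epoch_features(epoch, fs=100.0):
    """Feature dictionary of one 30-s epoch: Table 3 statistics, Hjorth, ZCR, band powers."""
    epoch = np.asarray(epoch, dtype=float)
    act, mob, comp = hjorth_parameters(epoch)
    feats = {
        "mean": np.mean(epoch), "variance": np.var(epoch, ddof=1),
        "std": np.std(epoch, ddof=1), "skewness": skew(epoch),
        "kurtosis": kurtosis(epoch, fisher=False),
        "p25": np.percentile(epoch, 25), "median": np.median(epoch),
        "p75": np.percentile(epoch, 75),
        "hjorth_activity": act, "hjorth_mobility": mob, "hjorth_complexity": comp,
        "zcr": zero_crossing_rate(epoch),
    }
    feats.update(relative_band_powers(epoch, fs))
    return feats
```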

Time-frequency domain features. Due to the non-stationary nature of the EEG signal, i.e. the variation of its properties over time, a wide range of time-frequency methods are utilized as efficient tools in this field. There are three ways to transfer a signal into the time-frequency plane: signal decomposition, energy distribution and modeling. The first two are utilized in sleep applications [33,61] and are explained below.

• Signal decomposition: These methods decompose signals into a series of basis functions. The short-time Fourier transform (STFT) is a very simple time-frequency analysis. In this method, the signal is first uniformly windowed and then the FT is applied to each window, as described in Eq. (6).

STFT_x(t, f) = ∫ x(τ) h*(τ − t) exp(−j2πfτ) dτ    (6)

where h(.) is a window function.
To use this method as a means of feature extraction, the issue of time and frequency resolution needs to be noted. There is an inverse relationship between the length of the window and the time resolution. In addition, the time and frequency resolutions trade off against each other: the shorter the data segment, the higher the time resolution and, therefore, the lower the frequency resolution. However, a fixed window length is used in the STFT, leading to a limited frequency resolution.
The wavelet transform (WT) is a popular time-frequency transformation that applies different filters to decompose a signal into dyadic frequency scales. These filters are generated from a function called the mother wavelet. Both the discrete and continuous forms of the WT are utilized for sleep classification [33,62]. The continuous wavelet transform of a given signal x(t) is shown in Eq. (7).

W_{a,s} = ∫ x(t) φ_{a,s}(t) dt    (7)

where a and s are the scale and time-shift parameters, respectively, and φ_{a,s} is the mother wavelet, defined as:

φ_{a,s}(t) = (1/√|a|) φ((t − s)/a)    (8)

The WT is a powerful method since it describes the signal at different frequency resolutions, and the decomposed signals are orthogonal for most mother wavelets. In addition, a colored noise cannot invade all of the wavelet features at different scales [46].
• Energy distribution (ED): ED represents the distribution of energy simultaneously in the time and frequency domains. Several methods are placed in this category. The Choi–Williams distribution (CWD) is energy conservative and preserves time shifts as well as frequency shifts. These characteristics make it a favorable method for time-frequency analysis [33,63]. The Wigner–Ville distribution (WVD) is another non-linear time-frequency method which is widely used to analyze non-stationary signals. However, the presence of cross-terms in the time-frequency domain for multi-component non-stationary signals hinders its efficiency. Therefore, the smoothed pseudo Wigner–Ville distribution (SPWVD), a modified version of the WVD, is proposed to avoid the effects of the cross-terms [61], as described in Eq. (9).

S(t, f) = ∫∫ h(τ) f(u − t) x(u + τ/2) x*(u − τ/2) exp(−jωτ) du dτ    (9)

where f(.) and h(.) are the time and frequency smoothing functions, respectively.
In addition, the Hilbert–Huang transform (HHT) is a newer method for analyzing nonlinear and non-stationary signals such as EEG. It has recently been employed in many biomedical applications, including sleep stage scoring [33].

Non-linear features. Since EEG signals exhibit non-linear characteristics and complex dynamics [64], non-linear measures are effective techniques for EEG processing [65,66]. They are also widespread in sleep stage scoring for extracting efficient features from sleep EEG [15,18,65,67,68]. Non-linear methods are mainly divided into two categories, described as follows:

• Entropy and complexity-based: Entropy-based methods calculate the irregularity and impurity of a signal in the time domain [66]. In other words, the more regular the changing patterns inside a windowed signal, the lower the entropy value of that windowed signal, and vice versa. The most famous entropy measure was introduced by Shannon and is described below:

Entropy(X) = − Σ_{i=1}^{N} P_i ln(P_i)    (10)

where N is the number of samples and P_i is the probability of the ith sample.
Several entropy and complexity measures, such as Renyi's entropy [33], sample entropy [7], Tsallis entropy [64], permutation entropy [34], Lempel–Ziv complexity [41], multiscale entropy (MSE) [20] and approximate entropy (ApEn) [67], are introduced in the literature. They have enough potential to serve as informative features in sleep stage classification [10].
• Fractal-based: A noisy-looking shape can be constructed by repeatedly applying an operator to a geometric element. Several phenomena exist which have a noisy behavior while being rule-based in nature. For instance, the behavior of EEG signals or the shape of the leaves of a tree looks random, while one can hardly claim that the creation of EEG signals or leaves is random. Mandelbrot [69] first proposed a method to measure the fractal dimension of irregular shapes. The concept of fractal dimension can describe the behavior of a random-like shape by determining the amount of self-similarity in that shape or signal. There are many fractal-based methods, such as the correlation dimension [59], the Lyapunov exponent [41] and the Hurst exponent [38], which first map a signal into the phase space and then measure the self-similarity of its trajectory shape. In other words, instead of analyzing a signal in the time domain, these methods analyze its trajectory behavior in the phase space in terms of self-similarity. There are some other methods that determine the roughness or irregularity of the signal in the time domain, such as the Katz [70], Petrosian [35] and Higuchi fractal dimensions (HFD) [41]. Experimental results from different studies have suggested that measuring the behavior in the phase space is more accurate than doing so in the time domain, though it imposes a higher computational burden [70].

2.2.3. Feature selection
In some sleep stage classification methods, especially in multi-channel approaches, feature selection techniques are applied after the feature extraction stage to find a discriminative subset of features. The purpose of adding a feature selection step is to obtain a minimum number of features without redundancy while producing higher accuracy. Furthermore, it avoids the issue of over-fitting and reduces the computational time. Since an overnight PSG record is large, feature selection is useful, especially when different types of features are extracted.

To this end, some statistical techniques are proposed in the literature. These techniques try to find an appropriate and discriminative subset of features to distinguish different sleep stages. Sequential forward selection (SFS) [15] and sequential backward selection (SBS) [15,71] are two simple strategies to explore the features and find a proper subset. Nevertheless, SFS and SBS highly suffer from the lack of backtracking. In addition, feature extraction algorithms such as principal component analysis (PCA) and linear discriminant analysis (LDA) [72] are used for feature reduction. On the other hand, meta-heuristic algorithms like the genetic algorithm (GA) [66] and particle swarm optimization [73] are repeatedly used for feature selection.

2.2.4. Classification
Various classifiers are utilized to classify the elicited features and assign a sleep stage to each epoch. These classifiers are trained to construct linear/non-linear boundaries to separate the feature vectors of different classes. Some of the popular classifiers used in sleep stage scoring are explained herewith; a comparison sketch in code is given at the end of this subsection.

• K-Nearest Neighbor (KNN): This is a local classifier which is suitable for multi-modally distributed data. In other words, when the samples of a certain class are modularly scattered in different locations of the feature space, even flexible and strong classifiers cannot accurately put a boundary around these localities. KNN assigns a label to an input pattern based on the majority vote of its k nearest samples [57].

It can be statistically demonstrated that for k = 1 the error probability of 1-NN is bounded between that of the optimum (Bayes) classifier and twice this value [74].

P_error(Bayes) ≤ P_error(1NN) ≤ 2 P_error(Bayes)    (11)

• Support Vector Machine (SVM): The main discriminative advantage of SVM compared to other classifiers is its objective function. The SVM goal is to minimize the risk of error [75,76], while other classifiers concentrate on minimizing the error itself. SVM is learned under a two-objective fitness function [77]: it maximizes the margin width around the separating plane between two classes while minimizing the training error. The Lagrange method is adopted to solve this constrained optimization problem. After solving it, only those samples which fall within the margin have a non-zero Lagrange multiplier, and the other samples play no role in determining the separating plane. These marginal vectors are called support vectors, and the SVM boundary is set according to these samples. The number of support vectors is very low compared to the number of samples within the classes; therefore, if a few noisy samples fall inside the margin and become support vectors, the SVM separating plane is highly deteriorated.
• Random Forest (RF): RF, proposed by Breiman [78], consists of an ensemble of tree structures. Each individual tree plays the role of an individual classifier. The main difference between RF and other classifiers is that the training samples are fed to the trees as randomly as possible, through a random selection followed by different bootstrap selections. This process is repeated several times to blend the samples in order to desensitize the training phase to the effect of noisy and outlier samples. Finally, the output is determined by voting over the trees' outputs. In order to build the RF classifier, first an arbitrarily large number of decision trees are randomly generated to construct a forest. Each of the trees (e.g. C4.5 or a random tree) is grown through bootstrap training. For each tree, 63.2% of the training data is randomly selected with replacement (in-bag data) and the rest is considered as the validation set (out-of-bag data). Each tree has a partial observation of the training samples, and the trees are trained independently of each other. If the input sets of different trees are selected to be as uncorrelated as possible, the performance of the constructed forest becomes high in terms of accuracy and robustness against noise. Moreover, the strength of a forest highly depends on its trees' performance. There is no need for cross validation to set the structure of a random forest [33]. The general structure of the RF classifier is summarized in Fig. 4.

Fig. 4. RF structure.

• Linear Discriminant Analysis (LDA) & Nearest Centers (NC): LDA was developed by Fisher in 1936 and is optimized by a very popular criterion function, called the Fisher criterion [79]. In two-class problems LDA can be considered as a classifier, whereas in multi-class problems LDA acts as a feature-extraction method. Since sleep EEG contains 5 classes, LDA provides more separable features for the subsequent classifier. LDA tries to maximize the ratio of the between-class to within-class scatter matrices. This is done by projecting the input samples onto a small number of hyperplanes (depending on the number of classes) such that the separability among the samples is maximized in the projected space [66]. The Fisher criterion is described as:

J(W) = (W^T S_B W) / (W^T S_W W)    (12)

where W is a hyperplane (in two-class problems) or a matrix of hyperplanes (in multi-class problems), and S_B and S_W are the between-class and within-class scatter matrices, defined as:

S_{W_i} = Σ_{x∈c_i} (x − m_i)(x − m_i)^T,  i = 1, 2, …, c    (13)

S_W = Σ_{i=1}^{c} S_{W_i}    (14)

S_B = Σ_{k=1}^{c} (m_k − m)(m_k − m)^T    (15)

where x is the input vector, c is the number of classes, m_i is the mean of class i and m is the overall mean.
The final decision is made by applying a distance-based classifier (NC) to the LDA outputs. In other words, LDA acts as a feature extractor and the projected features are assigned to the corresponding classes according to the minimum distance of the projected sample to the center of each class [46].
• Neural Network (ANN): An ANN is an artificial information processing system comprising a number of interconnected processors called neurons. Scientists have tried to assign functions to these neurons following the real behavior of biological neural cells. The most commonly employed neuron functions in different ANNs are sigmoid, hyperbolic tangent and linear functions. An ANN architecture can contain several hidden layers, each including several neurons. With a limited number of samples, it is common to use just one hidden layer to prevent over-fitting, whereas with access to a huge number of samples (e.g. image datasets), several hidden layers are used. When the number of hidden layers exceeds five, the network is called a deep network, in which the memory of the network is remarkably increased to preserve the information of the input samples [80].
Most of the utilized neural networks include an input layer, a hidden layer and an output layer [36]. The number of nodes in the input layer depends on the number of input features and the number of nodes in the output layer depends on the number of classes. The number of neurons in the hidden layer can be determined through a cross validation phase. ANNs are widely used in sleep stage scoring [7,13,16,36,81]. The schematic diagram of a feed-forward multi-layer perceptron (MLP) is shown in Fig. 5. The outputs (y) are calculated based on a transfer function as illustrated in Eq. (16).

y_j = Σ_{k=1}^{t} W^{(2)}_{kj} f_k    (16)

f_k = f( Σ_i W^{(1)}_{ik} x_i + b_k )    (17)

where W^{(1)} and W^{(2)} are the input and output weight matrices, respectively, t is the number of neurons in the hidden layer and b_k is a threshold value which lets a neuron fire.
In contrast to the sluggish training phase of ANNs, these networks act very fast in the test phase. The main drawback of ANNs is their high tendency to over-fit to noisy samples, which are mostly located along the margin space between the classes. ANNs do not include a margin that would leave some room for the deterioration of test samples around the border. Moreover, the selection of their parameters, such as the network size and training algorithm, is not trivial, since such parameters can affect the classifier performance.

• Gaussian Mixture Model (GMM): This statistical model is utilized in a wide range of applications as a density estimator [82] and as a classifier [59]. The model consists of several Gaussian functions integrated with different coefficients, and the weighted summation of these probabilities constructs the output [59]. Therefore, a GMM has three types of parameters which need to be estimated for each Gaussian separately: the mean vector (μ), the covariance matrix (Σ) and the weight (W) [83]. GMM is utilized to estimate a continuous probability density function from a multi-dimensional feature set. The Gaussian mixture distribution and the probability density of a single D-dimensional Gaussian component are given in Eqs. (18) and (19), respectively. The model parameters are estimated using the Expectation Maximization (EM) algorithm such that the likelihood of the observations is maximized [83,84]. A schematic diagram of the GMM is shown in Fig. 6.

p(x) = Σ_{i=1}^{N} W_i g(x | μ_i, Σ_i)    (18)

g(x | μ_i, Σ_i) = (1 / ((2π)^{D/2} |Σ_i|^{1/2})) exp( −(1/2) (x − μ_i)^T Σ_i^{−1} (x − μ_i) )    (19)

where T denotes the vector transpose.
In fact, independent Gaussian functions with different weights can model any arbitrary distribution function. In order to use GMM as a classifier, the distribution of the input features of each class is learned by a GMM, where each GMM acquires a specific label, as illustrated in Fig. 6. Consequently, when a sample is entered into the parallel GMMs, its label is assigned according to the label of the GMM which produces the highest probability.

Fig. 5. ANN architecture, feed-forward multi-layer perceptron (MLP).

Fig. 6. GMM structure.

• Hidden Markov Model (HMM): The HMM is a comprehensive version of the Markov chain model in which the sequence of states is unknown. The HMM is a dynamic classifier which can tolerate time warping of the input observations O = {O_1, …, O_T}, where O_i is the ith observation (feature vector). The semi-continuous HMM consists of a few continuous states, each constructed by Gaussian functions, which are connected by discrete transition probabilities. The main structure of the HMM is the semi-continuous left-to-right model, i.e.

λ = (π, b_i, A)    (20)

where A is the transition matrix, whose element a_ij is the probability of moving from the ith to the jth state, b_i is the ith state probability and π denotes the probability of the first state.
The HMM training process involves three stages. First, the HMM parameters are determined by the Baum–Welch algorithm. Afterwards, the sequence of states is found by the Viterbi algorithm [14]. Following the above, the problem is simplified into calculating the output of the Markov model (stage 3) [14]. The number of states and their number of Gaussian functions are determined by the Expectation Maximization (EM) criterion [85]. Fig. 7 shows a schematic diagram of an HMM.

Fig. 7. HMM architecture.

• Clustering: Clustering is a kind of grouping or unsupervised categorization in which the samples lack labels. Therefore, this grouping should be performed based on a similarity measure such as distance metrics, information criteria or statistical measures. The idea of clustering is to create different clusters such that each cluster contains similar samples according to one of the mentioned criteria. Clustering methods are generally divided into four categories: flat (e.g. K-means) [86,87], hierarchical (e.g. single and complete linkage) [88,89], graph-based (e.g. shared nearest neighbor (SNN)) [90] and density-based (e.g. DBSCAN, OPTICS and DenClue) [88]. Clustering methods, especially K-means, are repeatedly used in sleep stage scoring [57]. In other words, after clustering the whole set of sleep features, it is expected that the feature vectors belonging to each sleep stage are gathered in a certain cluster.
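To make the later comparison concrete, the sketch below wires a few of the classifiers discussed in this subsection (RF, KNN, LDA followed by a nearest-center rule, and a one-GMM-per-stage classifier) into the leave-one-subject-out protocol used in Section 4. scikit-learn is assumed; the hyper-parameter values (10 trees, K = 30, 3 Gaussian components) echo settings mentioned in this review but are otherwise illustrative, and the helper names are ours.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.pipeline import make_pipeline

class GMMClassifier:
    """One Gaussian mixture per sleep stage; predict by the highest log-likelihood."""
    def __init__(self, n_components=3):
        self.n_components = n_components
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = {c: GaussianMixture(self.n_components, covariance_type="full",
                                           random_state=0).fit(X[y == c])
                        for c in self.classes_}
        return self
    def predict(self, X):
        scores = np.column_stack([self.models_[c].score_samples(X) for c in self.classes_])
        return self.classes_[np.argmax(scores, axis=1)]

CLASSIFIERS = {
    "RF": RandomForestClassifier(n_estimators=10, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=30),
    "LDA+NC": make_pipeline(LinearDiscriminantAnalysis(), NearestCentroid()),
    "GMM": GMMClassifier(),
}

def leave_one_subject_out(features_per_subject, labels_per_subject, clf):
    """features_per_subject: list of (n_epochs, n_features) arrays, one per subject;
    labels_per_subject: the corresponding stage-label arrays. Returns mean and std accuracy."""
    accs = []
    for held_out in range(len(features_per_subject)):
        X_tr = np.vstack([f for i, f in enumerate(features_per_subject) if i != held_out])
        y_tr = np.concatenate([l for i, l in enumerate(labels_per_subject) if i != held_out])
        clf.fit(X_tr, y_tr)
        pred = clf.predict(features_per_subject[held_out])
        accs.append(np.mean(pred == labels_per_subject[held_out]))
    return float(np.mean(accs)), float(np.std(accs))
```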
3. State-of-the-art studies in automatic sleep stage classification

Each of the available automatic sleep classification methods is evaluated on a certain dataset with its own signal quality, while changes in signal quality lead to deviations in the final results. In practice, due to the importance of accurate sleep scoring for patients with sleep abnormalities, these methods should perform well on patients' recorded signals. To the best of our knowledge, evidence on various methods applied to a specific dataset acquired from patients with sleep disorders is thin. The present review is an attempt to discuss the most recent automatic sleep stage classification methods using single-channel EEG data.

3.1. Polysomnographic datasets

Two EEG datasets, available online in the PhysioNet data repository [91], have been employed in this study. The first dataset contains EEG signals belonging to healthy subjects, whereas

the second one is acquired from patients who suffer from REM behavioral disorder (RBD). The employed datasets are described as follows.

3.1.1. Sleep-EDF dataset [extended]
This dataset includes a collection of PSG records with the respective sleep stage annotations [92]. It consists of two sets of files, SC and ST, where the first contains PSGs of 20 healthy males and females aged between 25–34 years, without any sleep-related medication. The records consist of EEG (from the Fpz-Cz and Pz-Oz electrode locations), EOG, submental chin EMG and an event marker. The EEG signals are acquired with a sampling rate of 100 Hz. The ST files consist of records of healthy males and females with only mild difficulty falling asleep. In addition, hardware filtering is applied to the signals. Each 30-s epoch is manually scored based on the R&K standard, and the annotations consist of the sleep stages W, R, S1, S2, S3, S4, M (movement epochs) and U (unscored). In the present study, 20 records from the SC files are selected and the bipolar records from the Pz-Oz channel are used as single-channel EEG.

3.1.2. CAP sleep dataset
This collection contains PSG recordings of healthy subjects as well as of patients with seven various sleep disorders [93]. The recordings consist of at least 3 EEG channels (F3 or F4, C3 or C4 and O1 or O2, referenced to A1 or A2), ECG, submental muscle EMG, bilateral anterior tibial EMG and respiration signals. The EEG signals are sampled at 512 Hz. Prefiltering is also applied to the signals: LP (30 Hz), HP (0.3 Hz) and a notch filter (50 Hz). In addition, the sleep stage annotations are available as W = wake, S1–S4 = sleep stages, R = REM and MT = body movements. In this study, 20 single-channel EEG records of the RBD dataset are selected as the patient dataset.

3.2. Comparative studies

1. Informative features from the higher order spectral (HOS) space are elicited for sleep stage classification by Acharya et al. [59]. First, the bi-spectrum of the EEG is calculated in each epoch as explained in Eq. (5). Then, the following features are extracted in the bi-spectrum space.

• Normalized bispectral entropy (BE1):

BE1 = − Σ_n p_n log(p_n)    (21)

where

p_n = |B(f_1, f_2)| / Σ_Ω |B(f_1, f_2)|    (22)

and Ω is the non-redundant region shown in Fig. 8.

• Sum of logarithmic amplitudes of the bi-spectrum:

H1 = Σ_Ω log(|B(f_1, f_2)|)    (23)

• First-order moment of the amplitude of the diagonal elements of the bi-spectrum:

mAmp = Σ_{k=1}^{N} k · log(|B(f_k, f_k)|)    (24)

where k indexes the samples lying on the diagonal line of the bi-spectrum plane. It should be noted that Eq. (24) is determined in the non-redundant region shown in Fig. 8.

• Weighted center of the bi-spectrum (WCOB):

WCOB = Σ_{i,j} i · B(i, j) / Σ_{i,j} B(i, j)    (25)

Fig. 8. Non-redundant region for bi-spectrum calculation.

After extracting the above HOS features from each epoch, a GMM classifier is trained to learn the feature vectors of each sleep stage. Finally, for the classification of a whole night of sleep, the elicited feature vector of each epoch is entered into all trained GMMs. Since each GMM has a label, the label of the GMM which produces the highest probability for a feature vector is assigned to the input.

2. In another study, the CWT is used to represent EEG signals in the time-frequency domain [46]. To have a rich set of features, the authors employed three different mother wavelets: the reverse bi-orthogonal (rbio3.3), Daubechies (db20) and Gaussian of order 1 (gauss1) wavelets, with center frequencies of 0.43, 0.67 and 0.20, respectively. After passing the EEG through the different wavelet filters, the entropy of each filtered signal is determined according to Eq. (26) for each of the frequency bands shown in Table 1. Moreover, the Beta band is divided into two sub-bands of 13–22 and 22–35 Hz; therefore, 7 frequency bands are considered in total.

Ent = − Σ_{i=1}^{n} p_i log(p_i)    (26)

where p is the histogram distribution of the wavelet coefficients in each band with n bins.
These entropy values are arranged in a feature vector for each epoch. Therefore, by calculating the entropy in seven frequency bands for each CWT separately, a feature vector of 21 elements is formed for each epoch, and this is continued for the entire EEG signal. The sleep stage classification process is completed by feeding the LDA+NC classifier with the extracted features.

3. The Welch method is employed to extract 129 features from each epoch by Günes et al. [57]. Due to the high number of input features, four new features, namely the minimum, maximum, mean value and standard deviation, are extracted from each feature vector. Then, a feature weighting process is pursued using K-means clustering based feature weighting (KMCFW), as described in Algorithm 1. Afterwards, these weighted features are applied to a KNN classifier, where K is set to 30 through the cross validation phase.

Algorithm 1 K-means clustering based feature weighting (KMCFW).
Require: the dataset including classes
1: Find the centers of the features using K-means clustering (KMC)
2: Calculate the ratios of the means of the features to their centers
3: Multiply these ratios with each feature

Algorithm 2 MSE algorithm.
Require: A time series x = {x1, x2, …, xN}
1: Divide x into non-overlapping windows of length τ (scale).
2: Determine the average of each window as a one-dimensional sample.
3: Compute the entropy of these successive samples as a sample entropy feature.
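Method 4 below relies on the multiscale entropy of Algorithm 2. A minimal Python sketch of that computation is given here; the sample-entropy routine follows a common formulation with template length m = 2 and a tolerance of 0.15 times the standard deviation, which are our illustrative defaults rather than values reported by the reviewed paper.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) of a 1-D series: -ln(A / B), where B and A count template matches of
    length m and m+1 within tolerance r (Chebyshev distance); a common formulation."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.15 * np.std(x)
    N = len(x)
    def count_matches(mm):
        templates = np.array([x[i:i + mm] for i in range(N - mm)])
        count = 0
        for i in range(len(templates) - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count
    B = count_matches(m)
    A = count_matches(m + 1)
    return np.inf if A == 0 or B == 0 else -np.log(A / B)

def multiscale_entropy(epoch, max_scale=13, m=2):
    """Algorithm 2: coarse-grain the epoch at scales 1..max_scale and take SampEn of each."""
    epoch = np.asarray(epoch, dtype=float)
    feats = []
    for tau in range(1, max_scale + 1):
        n = len(epoch) // tau
        coarse = epoch[: n * tau].reshape(n, tau).mean(axis=1)
        feats.append(sample_entropy(coarse, m=m))
    return feats  # 13 MSE values per epoch, as used by method 4
```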

4. In another attempt, an MSE feature containing scales 1 to 13, along with AR coefficients (order 8), is extracted to describe the EEG behavior by Liang et al. [20]. The MSE procedure is given in Algorithm 2. It should be noted that the AR coefficients are extracted from the delta band of the EEG signal. The combination of 8 AR coefficients and 13 MSE features therefore builds a feature vector with 21 elements for each epoch. The process is completed by classifying the extracted features using the LDA+NC classifier.

5. In the last investigated method, three different time-frequency techniques, namely the CWT, CWD and HHT, are separately adopted to represent the EEG content in the time-frequency domain [33]. Each of the time-frequency transforms divides the EEG into dyadic sub-bands (as described in Table 1). To elicit informative features, Renyi's entropy is applied to each sub-band within each epoch. The formula of Renyi's entropy is described in Eq. (27).

REnt = (1/(1 − α)) log2( Σ_{i=1}^{n} P_i^α )    (27)

where P_i is the histogram distribution of the coefficients of each of the mentioned time-frequency algorithms in each band with n bins, and α is the order of Renyi's entropy. When α equals 1, Renyi's entropy converges to the Shannon entropy.
These entropy values are extracted from the sub-bands of each time-frequency transform separately. In the proposed method, an assessment is carried out by applying the entropy values belonging to each transform to a random forest classifier (containing 10 trees). According to the obtained results, the CWT features provide the highest accuracy compared to the others.
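As an illustration of the band-wise entropy features used by methods 2 and 5, the sketch below decomposes an epoch with a discrete wavelet transform and computes the Renyi entropy of Eq. (27) (which reduces to the Shannon entropy of Eq. (26) for α = 1) from the coefficient histogram of each sub-band. PyWavelets is assumed; the mother wavelet, decomposition depth, α and bin count are illustrative choices, not those of the original papers (which use continuous transforms).

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def renyi_entropy(coeffs, alpha=2, n_bins=32):
    """Renyi entropy of order alpha (Eq. (27)) of the coefficient histogram."""
    hist, _ = np.histogram(np.ravel(coeffs), bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    if alpha == 1:                       # limiting case: Shannon entropy (Eq. (26))
        return float(-np.sum(p * np.log2(p)))
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

def wavelet_entropy_features(epoch, wavelet="db4", level=5, alpha=2):
    """One entropy value per DWT sub-band of a 30-s epoch; with fs = 100 Hz and 5 levels
    the sub-bands roughly follow the rhythms of Table 1."""
    bands = pywt.wavedec(np.asarray(epoch, dtype=float), wavelet, level=level)
    return [renyi_entropy(band, alpha=alpha) for band in bands]
```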
4. Results with higher total accuracy (i.e. 5th approach) is demonstrated
in Table 5. This table is similar to what confusion matrix (CM)
After investigating the results from five elegant studies, their present, but it is not exactly CM since CM is defined for 2 classes,
methods are applied to the present datasets (containing healthy but here we have 5 classes. According to the obtained results,
subjects and patients) as described in Section 3. The employed Among all sleep stages, the classification accuracies of the W stage
datasets are annotated by experts according to R&K rules [92,93]. and N2 have shown the best performance, while the classification
Since AASM is an improved and more recent standard for sleep of stage N1 and REM have the lowest performance, respectively.
stage scoring, to match the results with this standard, S3 and S4 stage N1 mostly misclassified with stage W, N2 and REM, while
are merged into one stage (N3). These five methods are evaluated more than 20% of stage REM was classified as stage N2.
in terms of classification accuracy and their robustness against Moreover, as demonstrated in Fig. 9, the accuracy of five
additive noise (with different SNRs). To find the best combination methods on the RBD dataset is reduced more than 25% in av-
of features and classifier for achieving accurate and reliable re- erage. Although the accuracy of all methods has decreased for
sults, here, additional implementations are suggested and executed the patients’ dataset, the 5th method again presented the best
by applying the deployed features by the mentioned papers to classification performance.
different classifiers including: RF, KNN, LDA+NC and GMM. On the other hand, when comparing these five approaches from
To proceed to the preprocessing phase, the raw signals are fairly the robustness point of view, white Gaussian noise is added to the
cleaned by applying a bandpass Butterworth filter (order 5 with raw signals with different levels of SNR and then the whole pro-
frequency cutoff between 0.5 − 35 Hz) to remove the base-line cess is executed. The SNR levels are 1, 10, 20 dB. Since the results
drift and linear trends. Next, EEGs are segmented into successive of patients are not comparable to normals, additive noises are just
30 s epochs and then the features and classifier of each paper are added to the raw normal signals. According to the results which
separately executed to those epochs in order to produce the results are shown in Fig. 10, The 5th method shows of the lowest sensi-
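A minimal sketch of this preprocessing and evaluation protocol is shown below, assuming a 100 Hz sampling rate and an arbitrary, pluggable feature extractor; any of the feature sets in Table 4 could be substituted.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.ensemble import RandomForestClassifier

FS = 100          # sampling rate in Hz (assumed)
EPOCH_SEC = 30    # R&K / AASM epoch length

def preprocess(raw):
    """Order-5 Butterworth band-pass (0.5-35 Hz) applied forward-backward,
    then segmentation into non-overlapping 30 s epochs."""
    b, a = butter(5, [0.5, 35.0], btype="bandpass", fs=FS)
    clean = filtfilt(b, a, raw)
    n = len(clean) // (EPOCH_SEC * FS)
    return clean[: n * EPOCH_SEC * FS].reshape(n, EPOCH_SEC * FS)

def loso_accuracy(features, labels, subject_ids,
                  make_clf=lambda: RandomForestClassifier(n_estimators=10, random_state=0)):
    """Leave-one-subject-out: every subject is used exactly once as the blind test set."""
    logo = LeaveOneGroupOut()
    accs = []
    for train_idx, test_idx in logo.split(features, labels, groups=subject_ids):
        clf = make_clf().fit(features[train_idx], labels[train_idx])
        accs.append(clf.score(features[test_idx], labels[test_idx]))
    return np.mean(accs), np.std(accs)
```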
Since the main concern of automatic methods is the accurate detection of sleep stages and abnormalities in patients' records, the methodologies of the papers are applied to both the normal and the patient group to obtain a realistic comparison on real patients. Not surprisingly, lower performance is expected on patients than on the normal group. Fig. 9 illustrates the classification accuracy of the five methods on both groups, while Table 4 reports the classification accuracy of those methods on normal subjects only, over the entire sleep. Moreover, a short description of each method is presented in Table 4.

The best performance in terms of accuracy belongs to the 5th approach, in which entropy of the CWT and RF are chosen as the features and classifier, respectively. This approach provides over 80% classification accuracy on average over the whole set of sleep stages. Moreover, to better understand the misclassification among stages and reveal the effect of the classes on each other, the mean percentage of accuracy for each pair of stages is demonstrated in Table 5 for the approach with the highest total accuracy (i.e., the 5th approach). This table is similar to a confusion matrix (CM), although here it is computed over the five sleep stages and reports percentages rather than counts. According to the obtained results, among all sleep stages the W and N2 stages are classified best, while stages N1 and REM show the lowest performance, respectively. Stage N1 is mostly misclassified as stages W, N2 and REM, while more than 20% of stage REM is classified as stage N2.

Moreover, as demonstrated in Fig. 9, the accuracy of the five methods on the RBD dataset drops by more than 25% on average. Although the accuracy of all methods decreases on the patients' dataset, the 5th method again presents the best classification performance.

On the other hand, to compare these five approaches from the robustness point of view, white Gaussian noise is added to the raw signals at different SNR levels (1, 10 and 20 dB) and the whole process is executed again. Since the results of patients are not comparable to those of normals, the additive noise is applied only to the raw normal signals. According to the results shown in Fig. 10, the 5th method exhibits the lowest sensitivity to noise compared to the others at all SNR levels.

As mentioned earlier, the present study adds an additional phase to find the best combination of features and classifiers. As such, the feature set of each approach is applied to the four classifiers (RF, KNN, LDA+NC, GMM). According to the results (Fig. 11), the random forest produces the best classification for all of the different feature sets. Moreover, the highest accuracy belongs to the time-frequency domain features (i.e., CWT using different mother wavelets) combined with the random forest classifier (87.49%).
Table 4
Brief description of each method along with the obtained total accuracy (mean (%) ± standard deviation (std)).

No | Authors | Extracted features | Feature selection | Classifier | Result: mean (%) (± std)
1 | Acharya et al. 2010 [59] | 4 features; HOS-based features | - | GMM classifier | 77.25 ± 0.43
2 | Fraiwan et al. 2010 [46] | 21 features; continuous wavelet transform (CWT) using 3 mother wavelets: reverse biorthogonal (rbio3.3), Daubechies (db20) and Gaussian of order 1 (gauss1); Shannon entropy | - | LDA+NC classifier | 80.76 ± 0.06
3 | Gunes et al. 2010 [57] | 129 features; Welch spectral analysis; k-means clustering based feature weighting (KMCFW) | 4 features: max, min, mean, std of the previous features | KNN classifier | 71.81 ± 0.18
4 | Liang et al. 2012 [20] | 21 features; multiscale entropy (MSE) and autoregressive (AR) coefficients | - | LDA+NC classifier | 75.15 ± 0.09
5 | Fraiwan et al. 2012 [33] | 7 features; continuous wavelet transform (CWT), Choi-Williams distribution (CWD), Hilbert-Huang transform (HHT) and Renyi's entropy measures | - | Random forest classifier | 87.06 ± 0.09
Table 5
The mean accuracy (%) of the test data for each stage against the others (columns: gold-standard stage; rows: automatically scored stage).

Stage | W    | N1   | N2  | N3   | REM
W     | 98.9 | 14.6 | 1.6 | 0.5  | 0.8
N1    | 0.01 | 25.5 | 0.2 | 0.1  | 8.8
N2    | 0.2  | 27.5 | 87  | 20.5 | 25.4
N3    | 0.04 | 0.1  | 5.1 | 79.1 | 0
REM   | 0.8  | 36.1 | 5.8 | 0.2  | 65
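One way to derive a table of this kind from the reference and predicted hypnograms is to column-normalize the raw confusion counts, as in the hypothetical helper below (stage labels and ordering are assumptions; the published values are averaged over test subjects, so columns need not sum exactly to 100).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

STAGES = ["W", "N1", "N2", "N3", "REM"]

def stage_vs_stage_percentages(y_true, y_pred):
    """Percentage of each gold-standard stage (columns) assigned to each
    automatically scored stage (rows), in the spirit of Table 5."""
    cm = confusion_matrix(y_true, y_pred, labels=STAGES)
    cm = cm.T.astype(float)            # rows of confusion_matrix are true labels, so transpose
    return 100.0 * cm / cm.sum(axis=0, keepdims=True)
```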
5. Discussion and conclusion

The aim of this study is to carry out a comprehensive review of automatic sleep stage classification methods, and the investigation is performed to give an insight into, and enlighten the horizon of, this field. Sleep scoring is widely used for clinical purposes as well as for investigating brain functions. Although visual sleep scoring is a traditional, time-consuming method and the accuracy of the sleep report highly depends on the expert's experience, it is still a widely accepted approach. Nevertheless, visual scoring reports vary from one expert to another, such that the agreement of two experts on a certain sleep signal does not exceed 83% [7]. Moreover, the limitations of the human eye and brain fatigue make visual scoring even less stable.

There is a tendency to automate the sleep scoring process using signal processing methods (Fig. 2). If a simple prediction model is applied to the statistics elicited in Fig. 2, one can expect this trend to grow significantly in the future. Since the results of quantitative schemes do not suffer from such subjectivity issues, they can be considered an efficient auxiliary diagnostic tool that provides valuable information to sleep specialists. Although several automatic sleep stage classification methods have been suggested, a large body of them is applied to the EEG signals of healthy subjects, while the main concern of automatic methods should be the detection of abnormalities in patients' sleep signals.

After reviewing the existing evidence, we conclude that the earlier research has exclusively helped to develop feature extraction/selection or to modify a classification method. Among the state-of-the-art studies, five papers are selected and discussed with regard to their performance. The pros and cons of their methods are discussed below.
Fig. 10. Comparing the accuracy by adding noise with different SNRs.

Fig. 11. Accuracy of combining different feature sets and classifiers.
5.1. Evaluation of features

The EEG features suggested in those papers (listed in Table 4) can be divided into four categories: spectrum-based features, time-frequency features, time features, and fractal and entropy features.

5.1.1. Spectrum-based features

Spectrum estimation methods are repeatedly used in sleep stage classification studies, in which sleep stages are characterized by specific frequency ranges and are classified accordingly. To estimate the frequency content of each epoch, parametric (e.g., AR, ARMA), non-parametric (e.g., periodogram, Welch) and HOS methods are widely utilized. Based on the existing evidence, the features elicited by the Welch method distinguish the sleep stages better than those of the parametric methods. This supremacy roots in the lower sensitivity of non-parametric methods to the remaining motion artifacts and noise compared to the parametric and cumulant-based methods. This is because non-parametric methods (e.g., the Welch method) need to estimate just one auto-correlation function, while parametric ones have to construct a matrix of auto-correlation functions. Each auto-correlation function has a bias and a variance; therefore, calculating the auto-correlation matrix and taking its inverse subjects the estimation to a large bias and variance. In addition, the impact of the bias and variance of a cumulant function is much worse than that of an auto-correlation function, which deteriorates the HOS results.

On the other hand, HOS is capable of revealing the phase coupling inside a signal as well as the frequency behavior of the corresponding cumulant. Since the HOS of a signal is presented as a surface in the bi-frequency space, it contains unique information, and several features can be elicited from this space compared to other spectrum-based methods. Furthermore, bispectral analysis of EEG signals during surgical operations can finely estimate the depth of anesthesia [58]. Given the similarity of anesthesia to sleep to some extent, HOS features are expected to provide discriminative values for the different sleep stages. Moreover, higher-order cumulants easily remove additive Gaussian noise: in the case of sleep signals, if the movement artifacts and other additive noises have a Gaussian distribution, HOS can eliminate them perfectly. However, cumulant estimation imposes a high computational burden, owing to the several multiplications of the signal with its shifted versions. The main flaw of applying HOS arises from the narrow-band content of EEGs and the large number of cross-terms created in this thin frequency band.
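As a concrete illustration of the non-parametric route, the following sketch estimates a Welch periodogram for one epoch and integrates it over conventional EEG bands; the 4 s segment length and the band edges are assumptions rather than settings taken from the reviewed papers.

```python
import numpy as np
from scipy.signal import welch

FS = 100.0
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "sigma": (12, 16), "beta": (16, 35)}

def welch_band_powers(epoch, fs=FS):
    """Relative power in each EEG band from a Welch PSD of one 30 s epoch."""
    f, pxx = welch(epoch, fs=fs, nperseg=int(4 * fs))   # 4 s Hann segments, 50% overlap
    in_range = (f >= 0.5) & (f <= 35)
    total = np.trapz(pxx[in_range], f[in_range])
    feats = []
    for lo, hi in BANDS.values():
        sel = (f >= lo) & (f < hi)
        feats.append(np.trapz(pxx[sel], f[sel]) / total)
    return np.array(feats)
```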
5.1.2. Time features

Based on the discussed publications, time-domain features are frequently used for sleep EEG analysis. The main reason for using such features is that the amplitude of EEG signals varies significantly from one sleep stage to another. So far, AR coefficients, Hjorth parameters and statistical parameters of the time-domain signal (e.g., variance, skewness, kurtosis and MSE) have been suggested as discriminative features for sleep analysis. Among the methods discussed, the AR model is frequently used for EEG analysis, and it also provides good results for discriminating the depth of hypnosis [94] and the depth of anesthesia [95], as well as for sleep stage analysis [20]. As such, AR coefficients can be considered a versatile feature for different EEG applications. The main advantage of the AR model is that it encodes the time behavior of the EEG into a few coefficients; in other words, AR coefficients elicit the underlying dynamics of EEGs through the time domain, while other related features may fail to detect such information. Nevertheless, the main flaw of the AR model is its high sensitivity to additive noise, since these coefficients are determined from auto-correlation functions.
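The autocorrelation-based estimation referred to above can be illustrated with the Yule-Walker equations; the model order below is an arbitrary choice, and the biased autocorrelation estimate is exactly the quantity whose noise sensitivity is discussed in the text.

```python
import numpy as np
from scipy.linalg import toeplitz, solve

def ar_coefficients(epoch, order=8):
    """Yule-Walker AR(p) coefficients of one EEG epoch (biased autocorrelation)."""
    x = epoch - epoch.mean()
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    R = toeplitz(r[:order])           # autocorrelation matrix
    a = solve(R, r[1 : order + 1])    # AR coefficients a_1 ... a_p
    return a
```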
5.1.3. Time-frequency features

Since the Fourier transform fails to provide frequency and time resolution simultaneously, time-frequency domain transforms have been developed. These transforms can be divided into two categories, linear and nonlinear. Well-known linear transforms are the continuous and discrete wavelet transforms and the STFT. Nonlinear transforms such as the spectrogram and the Choi-Williams distribution can demonstrate the energy distribution of a signal, although, similar to HOS, they suffer from cross-terms in very low-frequency band signals like EEGs. The wavelet transform decomposes the frequency content of a signal in a dyadic manner matched to the standard EEG frequency bands; in other words, the wavelet acts as a bank of filters that decomposes the signal into its sub-bands. Since sleep stage transitions depend on both EEG frequency band changes and amplitude variations, wavelet features may efficiently detect the stage transitions (similar to frequency features). The main advantage of wavelet features compared to spectrum-based features is that they do not involve calculating inaccurate auto-correlation functions. Moreover, the wavelet transform can be applied to a whole epoch, since it is not sensitive to the non-stationary nature of EEG signals. In addition, unlike the Welch method, no averaging is taken to determine the wavelet coefficients throughout an epoch. In conclusion, the wavelet transform retains comparative advantages over the other time-frequency domain transforms in EEG analysis. It should be noted that for this specific application, the CWT provides more separable features than the DWT in terms of frequency resolution within the same time frame. In addition, the wavelet coefficients calculated by the CWT are more redundant than those of the DWT; this redundancy provides more stable elicited features for the sleep stage classification problem, leading to a higher accuracy [11].
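The filter-bank behaviour described here can be made concrete with a discrete wavelet decomposition; the sampling rate, mother wavelet and number of levels below are assumptions chosen so that the sub-bands roughly line up with the standard EEG bands.

```python
import numpy as np
import pywt

def dwt_subband_features(epoch, wavelet="db4", level=5):
    """Log-energy and Shannon entropy of each DWT sub-band of a 30 s epoch.
    At fs = 100 Hz the detail levels D1..D5 cover roughly 25-50, 12.5-25,
    6.25-12.5, 3.1-6.25 and 1.6-3.1 Hz, and A5 covers 0-1.6 Hz."""
    coeffs = pywt.wavedec(epoch, wavelet, level=level)   # [A5, D5, D4, D3, D2, D1]
    feats = []
    for c in coeffs:
        energy = np.sum(c ** 2)
        p = c ** 2 / energy
        feats.extend([np.log(energy), -np.sum(p * np.log(p + 1e-12))])
    return np.array(feats)
```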
5.1.4. Fractal and entropy features

Since EEG signals behave irregularly, fractal and entropy methods are suitable for measuring the amount of roughness captured inside the signal. As the entropy of a signal increases, its information content increases accordingly; in other words, the more irregularly a signal behaves, the larger its fractal dimension and entropy become. When a sleep stage changes, the effective frequency bandwidth and the signal amplitude vary accordingly; therefore, it is expected that fractal and entropy measures can finely describe each sleep stage. Although these measures are somewhat correlated with each other, the determination of entropy is less sensitive to noise than that of the fractal dimension. Accurate computation of the fractal dimension and entropy requires a large number of samples, which limits their application to short signals (e.g., an EEG epoch). Nevertheless, ApEn is a fast entropy measure at the cost of losing some accuracy. Standard EEG bands are defined by their frequency bandwidth and amplitude variations, while entropy and fractal measures are not designed to capture the signal bandwidth; therefore, these features might appear effective only in detecting the stage transitions.
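Two representative measures of this family are sketched below: Petrosian's fractal dimension, which needs only the sign changes of the first difference, and a histogram-based Shannon entropy. Both are illustrative choices rather than the exact estimators used in the reviewed studies.

```python
import numpy as np

def petrosian_fd(x):
    """Petrosian fractal dimension of a 1-D signal."""
    n = len(x)
    diff = np.diff(x)
    n_delta = np.sum(diff[1:] * diff[:-1] < 0)    # sign changes of the derivative
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))

def shannon_entropy(x, n_bins=100):
    """Shannon entropy of the amplitude distribution of a 1-D signal."""
    hist, _ = np.histogram(x, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))
```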
5.2. Classifiers

The final phase of automatic sleep stage scoring is classification. Some of the most frequently used classifiers in this field are compared in this study (outlined in Fig. 11). Among the compared classifiers, RF produces the highest sleep stage classification accuracy for the different types of feature sets introduced in the five selected publications (Fig. 11).
The random forest is fast for large-scale samples, it does not over-fit, it has shown great performance in several real-world applications, it works well on imbalanced datasets, it generates an internal unbiased estimate of the generalization error as its structure grows, and it performs well even if a large portion of the data is missing. The random selection of features in the RF classifier increases the classification accuracy, decreases the sensitivity to noise in the data, and minimizes the correlation among the features. In addition, the high efficiency of RF is rooted in the nature of EEG signals: since such signals behave irregularly, their elicited features are scattered through the feature space. To distinguish such scattered samples, classic methods including LDA+NC, KNN and GMM cannot perform well, whereas RF is designed to classify scattered samples belonging to different classes with a high overlap. For instance, KNN takes its decision according to the K nearest samples of an input pattern. When the samples of a class are not dense, there is no guarantee that the neighbors of a given pattern belong to a certain class; in other words, when the numbers of samples belonging to different classes become equal in the vicinity of an input pattern, KNN cannot take a reliable decision.

Since GMM is a density estimator, each sleep stage is modeled by a certain GMM. When the overlap of the classes is fairly high, their GMM distributions will be similar in terms of mean vectors and covariance matrices. Since the recognition of a sleep stage is performed by finding the maximum value among the GMM outputs, in the case of fairly equal probabilities of the different GMMs (due to the high overlap among the classes), this classifier cannot distinguish the sleep stages precisely.
Further to the above, LDA is a versatile method in EEG signal processing applications [66]. When LDA faces a multi-class dataset, hyperplanes are constructed to project the input samples into a more separable lower dimension. Since the final decision is made by the NC rule, the results are not necessarily satisfactory. This stems from the fact that NC measures the distance of the projected input samples from the class centers without considering the covariance information of each class. In other words, in real applications, especially when the distribution of a class is multi-modal or skewed, the class mean is not a good representative of all samples of that class; therefore, the decision fails in practice when the class distribution is not symmetric.
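The comparison summarized in Fig. 11 can be reproduced in outline by fitting the four classifier families on a common feature matrix, as in the sketch below. Hyper-parameters are illustrative, and the GMM classifier is built as one density model per stage with maximum-likelihood assignment, matching the description above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture
from sklearn.pipeline import make_pipeline

class GMMStageClassifier:
    """One Gaussian mixture per sleep stage; predict by maximum log-likelihood."""
    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = {c: GaussianMixture(self.n_components, random_state=0).fit(X[y == c])
                        for c in self.classes_}
        return self

    def predict(self, X):
        ll = np.column_stack([self.models_[c].score_samples(X) for c in self.classes_])
        return self.classes_[np.argmax(ll, axis=1)]

CLASSIFIERS = {
    "RF": RandomForestClassifier(n_estimators=10, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LDA+NC": make_pipeline(LinearDiscriminantAnalysis(n_components=4), NearestCentroid()),
    "GMM": GMMStageClassifier(),
}

def compare(X_train, y_train, X_test, y_test):
    """Fit each classifier on the same features and report its test accuracy."""
    scores = {}
    for name, clf in CLASSIFIERS.items():
        clf.fit(X_train, y_train)
        scores[name] = np.mean(clf.predict(X_test) == y_test)
    return scores
```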
5.3. Natural bottleneck of patients' sleep analysis

Although the entropy of CWT features along with the RF classifier provides an acceptable performance for healthy subjects, the result of this combination is still unfavorable in patients (Fig. 9). Since patients with RBD are subject to position changes and movement, their EEG signals contain high-amplitude movement artifacts compared to healthy subjects. Especially in RBD patients, dreaming is accompanied by muscle contractions, which alters the amplitude of REM behaviors and makes them very similar to the wake stage [96]. Thus, regular EEG rhythms become irregular in patients [1], and conventional features brought from other fields (e.g., communications) cannot handle this variation. When analyzing the classification accuracy of each sleep stage separately (shown in Table 5), we see that N1 is repeatedly misclassified due to its high similarity with REM in terms of amplitude and spectrum variations. Moreover, among the sleep stages, N1 has the lowest number of samples, causing the employed classifiers to fail to learn this class as well as the others; the classifier boundary is thus biased toward the classes with a higher population. Consequently, the higher number of samples in the REM stage compared to N1 makes the classifier vote in favor of REM. In contrast, owing to its difference from the other stages (except REM) in terms of frequency content, the wake pattern is easily detected, and it is therefore not surprising that this stage is detected best.

6. Future work

In this research, the available studies on sleep stage classification are investigated. Since the main goal of automatic sleep stage classification methods is to detect the abnormal behavior of EEG signals in patients, none of the proposed methods achieves a convincing result for the patient group. Therefore, developing customized EEG-based features to detect EEG arrhythmia is deemed necessary. Although these signal processing methods seem to be robust for analyzing artificial and quasi-rhythmic signals (e.g., ECG, radar signals), they are not designed to analyze a chaotic-shaped signal like the EEG of patients with sleep disorders. It should be mentioned that the role of the preprocessing and feature extraction parts in this application is more important than that of the classification part. The most similar sleep stages are N1, N2 and REM, which could not be well separated by the traditional communication-based signal processing features. This does not mean that the REM content is independent of the other sleep stages; in fact, there is a degree of similarity between REM and all of the other sleep stages, which decreases the REM classification accuracy. Nonetheless, the similarity between REM, N1 and N2 is significantly higher than between the others. Combinatorial features capturing the roughness, frequency bands and amplitude variations of an EEG epoch could significantly enhance the sleep staging performance.

References

[1] H.-J. Park, J.-S. Oh, D.-U. Jeong, K.-S. Park, Automated sleep stage scoring using hybrid rule- and case-based reasoning, Comput. Biomed. Res. 33 (5) (2000) 330–349.
[2] Z.-T. Yeh, R.P.-Y. Chiang, S.-C. Kang, C.-H. Chiang, Development of the insomnia screening scale based on ICSD-II, Int. J. Psychiatry Clin. Pract. 16 (4) (2012) 259–267.
[3] M. Torabi-Nami, S. Mehrabi, A. Borhani-Haghighi, S. Derman, Withstanding the obstructive sleep apnea syndrome at the expense of arousal instability, altered cerebral autoregulation and neurocognitive decline, J. Integr. Neurosci. 14 (02) (2015) 169–193.
[4] M. Younes, W. Thompson, C. Leslie, T. Egan, E. Giannouli, Utility of technologist editing of polysomnography scoring performed by a validated automatic system, Ann. Am. Thorac. Soc. 12 (8) (2015) 1206–1218.
[5] A. Malhotra, M. Younes, S.T. Kuna, R. Benca, C.A. Kushida, J. Walsh, A. Hanlon, B. Staley, A.I. Pack, G.W. Pien, Performance of an automated polysomnography scoring system versus computer-assisted manual scoring, Sleep 36 (4) (2013) 573–582.
[6] N.A. Collop, Scoring variability between polysomnography technologists in different sleep laboratories, Sleep Med. 3 (1) (2002) 43–47.
[7] F. Chapotot, G. Becq, Automated sleep–wake staging combining robust feature extraction, artificial neural network classification, and flexible decision rules, Int. J. Adapt. Control Signal Process. 24 (5) (2010) 409–423.
[8] R. Ferri, F. Rundo, L. Novelli, M.G. Terzano, L. Parrino, O. Bruni, A new quantitative automatic method for the measurement of non-rapid eye movement sleep electroencephalographic amplitude variability, J. Sleep Res. 21 (2) (2012) 212–220.
[9] C.-C. Chiu, B.H. Hai, S.-J. Yeh, Recognition of sleep stages based on a combined neural network and fuzzy system using wavelet transform features, Biomed. Eng.: Appl. Basis Commun. 26 (02) (2014) 1450029.
[10] S. Motamedi-Fakhr, M. Moshrefi-Torbati, M. Hill, C.M. Hill, P.R. White, Signal processing techniques applied to human sleep EEG signals: a review, Biomed. Signal Process. Control 10 (2014) 21–33.
[11] M. Shokrollahi, S. Krishnan, A review of sleep disorder diagnosis by electromyogram signal analysis, Crit. Rev. Biomed. Eng. 43 (1) (2015).
[12] V.F. Helland, A. Gapelyuk, A. Suhrbier, M. Riedl, T. Penzel, J. Kurths, N. Wessel, et al., Investigation of an automatic sleep stage classification by means of multiscorer hypnogram, Methods Inf. Med. 49 (2010) 467.
[13] S. Özşen, Classification of sleep stages using class-dependent sequential feature selection and artificial neural network, Neural Comput. Appl. 23 (5) (2013) 1239–1250.
[14] S.-T. Pan, C.-E. Kuo, J.-H. Zeng, S.-F. Liang, A transition-constrained discrete hidden Markov model for automatic sleep staging, Biomed. Eng. Online 11 (1) (2012) 1.
[15] A. Krakovská, K. Mezeiová, Automatic sleep scoring: a search for an optimal combination of measures, Artif. Intell. Med. 53 (1) (2011) 25–33.
[16] M.E. Tagluk, N. Sezgin, M. Akin, Estimation of sleep stages by an artificial neural network employing EEG, EMG and EOG, J. Med. Syst. 34 (4) (2010) 717–725.
[17] J. Shi, X. Liu, Y. Li, Q. Zhang, Y. Li, S. Ying, Multi-channel EEG-based sleep stage classification with joint collaborative representation and multiple kernel learning, J. Neurosci. Methods 254 (2015) 94–101.
[18] A. Piryatinska, W.A. Woyczynski, M.S. Scher, K.A. Loparo, Optimal channel selection for analysis of EEG-sleep patterns of neonates, Comput. Methods Programs Biomed. 106 (1) (2012) 14–26.
[19] S. Khalighi, T. Sousa, G. Pires, U. Nunes, Automatic sleep staging: a computer assisted approach for optimal combination of features and polysomnographic channels, Expert Syst. Appl. 40 (17) (2013) 7046–7059.
[20] S.-F. Liang, C.-E. Kuo, Y.-H. Hu, Y.-H. Pan, Y.-H. Wang, Automatic stage scoring of single-channel sleep EEG by using multiscale entropy and autoregressive models, IEEE Trans. Instrum. Meas. 61 (6) (2012) 1649–1657.
[21] D. Al-Jumeily, S. Iram, F.-B. Vialatte, P. Fergus, A. Hussain, A novel method of early diagnosis of Alzheimer's disease based on EEG signals, Sci. World J. 2015 (2015).
[22] C. Berthomier, X. Drouot, M. Herman-Stoïca, P. Berthomier, J. Prado, D. Bokar-Thire, O. Benoit, J. Mattout, M. d'Ortho, Automatic analysis of single-channel sleep EEG: validation in healthy individuals, Sleep 30 (11) (2007) 1587.
[23] A. Flexer, G. Gruber, G. Dorffner, A reliable probabilistic sleep stager based on a single EEG signal, Artif. Intell. Med. 33 (3) (2005) 199–207.
[24] Y.-L. Hsu, Y.-T. Yang, J.-S. Wang, C.-Y. Hsu, Automatic sleep stage recurrent neural classifier using energy features of EEG signals, Neurocomputing 104 (2013) 105–114.
[25] K. Šušmáková, A. Krakovská, Discrimination ability of individual measures used in sleep stages classification, Artif. Intell. Med. 44 (3) (2008) 261–277.
[26] T. Penzel, R. Conradt, Computer based sleep recording and analysis, Sleep Med. Rev. 4 (2) (2000) 131–148.
[27] D.-K. Kim, J. Choi, K.R. Kim, K.-G. Hwang, S. Ryu, S.H. Cho, Rethinking AASM guideline for split-night polysomnography in Asian patients with obstructive sleep apnea, Sleep and Breathing 19 (4) (2015) 1273–1277.
[28] L. Novelli, R. Ferri, O. Bruni, Sleep classification according to AASM and Rechtschaffen and Kales: effects on sleep scoring parameters of children and adolescents, J. Sleep Res. 19 (1p2) (2010) 238–247.
[29] A. Rechtschaffen, A. Kales, A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects (1968).
[30] R.B. Berry, R. Budhiraja, D.J. Gottlieb, D. Gozal, C. Iber, V.K. Kapur, C.L. Marcus, R. Mehra, S. Parthasarathy, S.F. Quan, et al., Rules for scoring respiratory events in sleep: update of the 2007 AASM manual for the scoring of sleep and associated events, J. Clin. Sleep Med. 8 (5) (2012) 597–619.
[31] T. Hori, Y. Sugita, E. Koga, S. Shirakawa, K. Inoue, S. Uchida, H. Kuwahara, M. Kousaka, T. Kobayashi, Y. Tsuji, et al., Proposed supplements and amendments to a manual of standardized terminology, techniques and scoring system for sleep stages of human subjects, the Rechtschaffen & Kales (1968) standard, Psychiatry Clin. Neurosci. 55 (3) (2001) 305–310.
[32] S.-L. Himanen, J. Hasan, Limitations of Rechtschaffen and Kales, Sleep Med. Rev. 4 (2) (2000) 149–167.
[33] L. Fraiwan, K. Lweesy, N. Khasawneh, H. Wenz, H. Dickhaus, Automated sleep stage identification system based on time–frequency analysis of a single EEG channel and random forest classifier, Comput. Methods Programs Biomed. 108 (1) (2012) 10–19.
[34] T. Lajnef, S. Chaibi, P. Ruby, P.-E. Aguera, J.-B. Eichenlaub, M. Samet, A. Kachouri, K. Jerbi, Learning machines and sleeping brains: automatic sleep stage classification using decision-tree multi-class support vector machines, J. Neurosci. Methods 250 (2015) 94–105.
[35] B. Şen, M. Peker, A. Çavuşoğlu, F.V. Çelebi, A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms, J. Med. Syst. 38 (3) (2014) 1–21.
[36] M. Ronzhina, O. Janoušek, J. Kolářová, M. Nováková, P. Honzík, I. Provazník, Sleep scoring using artificial neural networks, Sleep Med. Rev. 16 (3) (2012) 251–263.
[37] Y. Kim, S. Kazukawa, M. Kurachi, M. Horita, K. Matsuura, [Problems and a resolution of clinical sleep research: a recording system of polysomnography], Seishin Shinkeigaku Zasshi (Psychiatria et Neurologia Japonica) 93 (1) (1990) 59–62.
[38] B. Weiss, Z. Clemens, R. Bódizs, Z. Vágó, P. Halász, Spatio-temporal analysis of monofractal and multifractal properties of the human sleep EEG, J. Neurosci. Methods 185 (1) (2009) 116–124.
[39] M.B. Hamaneh, N. Chitravas, K. Kaiboriboon, S.D. Lhatoo, K.A. Loparo, Automated removal of EKG artifact from EEG data using independent component analysis and continuous wavelet transformation, IEEE Trans. Biomed. Eng. 61 (6) (2014) 1634–1641.
[40] M. Crespo-Garcia, M. Atienza, J.L. Cantero, Muscle artifact removal from human sleep EEG by using independent component analysis, Ann. Biomed. Eng. 36 (3) (2008) 467–475.
[41] B. Koley, D. Dey, An ensemble system for automatic sleep stage classification using single channel EEG signal, Comput. Biol. Med. 42 (12) (2012) 1186–1195.
[42] J. Muthuswamy, N.V. Thakor, Spectral analysis methods for neurological signals, J. Neurosci. Methods 83 (1) (1998) 1–14.
[43] A.G. Correa, E. Laciar, H. Patiño, M. Valentinuzzi, Artifact removal from EEG signals using adaptive filters in cascade, in: Journal of Physics: Conference Series, 90, IOP Publishing, 2007, p. 012081.
[44] R.J. Croft, J.S. Chandler, R.J. Barry, N.R. Cooper, A.R. Clarke, EOG correction: a comparison of four methods, Psychophysiology 42 (1) (2005) 16–24.
[45] J.-A. Jiang, C.-F. Chao, M.-J. Chiu, R.-G. Lee, C.-L. Tseng, R. Lin, An automatic analysis method for detecting and eliminating ECG artifacts in EEG, Comput. Biol. Med. 37 (11) (2007) 1660–1671.
[46] L. Fraiwan, K. Lweesy, N. Khasawneh, M. Fraiwan, H. Wenz, H. Dickhaus, et al., Classification of sleep stages using multi-wavelet time frequency entropy and LDA, Methods Inf. Med. 49 (3) (2010) 230.
[47] C.M. Bishop, Pattern recognition, Mach. Learn. 128 (2006) 1–729.
[48] B. Hjorth, EEG analysis based on time domain properties, Electroencephalogr. Clin. Neurophysiol. 29 (3) (1970) 306–310.
[49] L.J. Herrera, C.M. Fernandes, A.M. Mora, D. Migotina, R. Largo, A. Guillén, A.C. Rosa, Combination of heterogeneous EEG feature extraction methods and stacked sequential learning for sleep stage classification, Int. J. Neural Syst. 23 (03) (2013) 1350012.
[50] B.-L. Su, Y. Luo, C.-Y. Hong, M.L. Nagurka, C.-W. Yen, Detecting slow wave sleep using a single EEG signal channel, J. Neurosci. Methods 243 (2015) 47–52.
[51] C. Stepnowsky, D. Levendowski, D. Popovic, I. Ayappa, D.M. Rapoport, Scoring accuracy of automated sleep staging from a bipolar electroocular recording compared to manual scoring by multiple raters, Sleep Med. 14 (11) (2013) 1199–1207.
[52] A. Brignol, T. Al-Ani, X. Drouot, Phase space and power spectral approaches for EEG-based automatic sleep–wake classification in humans: a comparative study using short and standard epoch lengths, Comput. Methods Programs Biomed. 109 (3) (2013) 227–238.
[53] T. Kayikcioglu, M. Maleki, K. Eroglu, Fast and accurate PLS-based classification of EEG sleep using single channel data, Expert Syst. Appl. 42 (21) (2015) 7825–7830.
[54] H. Akaike, A new look at the statistical model identification, IEEE Trans. Automat. Contr. 19 (6) (1974) 716–723.
[55] L. Sörnmo, P. Laguna, Bioelectrical Signal Processing in Cardiac and Neurological Applications, 8, Academic Press, 2005.
[56] S.-F. Liang, C.-E. Kuo, Y.-H. Hu, Y.-S. Cheng, A rule-based automatic sleep staging method, J. Neurosci. Methods 205 (1) (2012) 169–176.
[57] S. Güneş, K. Polat, Ş. Yosunkaya, Efficient sleep stage recognition system based on EEG signal using k-means clustering based feature weighting, Expert Syst. Appl. 37 (12) (2010) 7922–7928.
[58] K.C. Chua, V. Chandran, U.R. Acharya, C.M. Lim, Application of higher order statistics/spectra in biomedical signals: a review, Med. Eng. Phys. 32 (7) (2010) 679–689.
[59] U.R. Acharya, E.C.-P. Chua, K.C. Chua, L.C. Min, T. Tamura, Analysis and automatic identification of sleep stages using higher order spectra, Int. J. Neural Syst. 20 (06) (2010) 509–521.
[60] A. Petropulu, Higher-order spectral analysis, in: V.K. Madisetti, D.B. Williams (Eds.), Digital Signal Processing Handbook, Chapman & Hall/CRCnetBASE, 1999.
[61] V. Bajaj, R.B. Pachori, Automatic classification of sleep stages based on the time-frequency image of EEG signals, Comput. Methods Programs Biomed. 112 (3) (2013) 320–328.
[62] T. Schlüter, S. Conrad, An approach for automatic sleep stage scoring and apnea-hypopnea detection, Front. Comput. Sci. 6 (2) (2012) 230–241.
[63] L. Fraiwan, K. Lweesy, N. Khasawneh, M. Fraiwan, H. Wenz, H. Dickhaus, Time frequency analysis for automated sleep stage identification in fullterm and preterm neonates, J. Med. Syst. 35 (4) (2011) 693–702.
[64] C.-F.V. Latchoumane, J. Jeong, Quantification of brain macrostates using dynamical nonstationarity of physiological time series, IEEE Trans. Biomed. Eng. 58 (4) (2011) 1084–1093.
[65] F. Karimzadeh, E. Seraj, R. Boostani, M. Torabi-Nami, Presenting efficient features for automatic CAP detection in sleep EEG signals, in: Telecommunications and Signal Processing (TSP), 2015 38th International Conference on, IEEE, 2015, pp. 448–452.
[66] M. Sabeti, S. Katebi, R. Boostani, Entropy and complexity measures for EEG signal classification of schizophrenic and control participants, Artif. Intell. Med. 47 (3) (2009) 263–274.
[67] R. Acharya, O. Faust, N. Kannathal, T. Chua, S. Laxminarayan, Non-linear analysis of EEG signals at various sleep stages, Comput. Methods Programs Biomed. 80 (1) (2005) 37–45.
[68] A. Piryatinska, G. Terdik, W.A. Woyczynski, K.A. Loparo, M.S. Scher, A. Zlotnik, Automated detection of neonate EEG sleep stages, Comput. Methods Programs Biomed. 95 (1) (2009) 31–46.
[69] B.B. Mandelbrot, How long is the coast of Britain, Science 156 (3775) (1967) 636–638.
[70] R. Esteller, G. Vachtsevanos, J. Echauz, B. Litt, A comparison of waveform fractal dimension algorithms, IEEE Trans. Circuits Syst. I 48 (2) (2001) 177–183.
[71] L. Zoubek, S. Charbonnier, S. Lesecq, A. Buguet, F. Chapotot, Feature selection for sleep/wake stages classification using data driven methods, Biomed. Signal Process. Control 2 (3) (2007) 171–179.
[72] C. Vural, M. Yildiz, Determination of sleep stage separation ability of features extracted from EEG signals using principal component analysis, J. Med. Syst. 34 (1) (2010) 83–89.
[73] B. Xue, M. Zhang, W.N. Browne, Particle swarm optimisation for feature selection in classification: novel initialisation and updating mechanisms, Appl. Soft Comput. 18 (2014) 261–276.
[74] K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, 2013.
[75] V. Vapnik, The Nature of Statistical Learning Theory, Springer Science & Business Media, 2013.
[76] H.-T. Wu, R. Talmon, Y.-L. Lo, Assess sleep stage by modern signal processing techniques, IEEE Trans. Biomed. Eng. 62 (4) (2015) 1159–1168.
[77] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297.
[78] L. Breiman, Random forests, Mach. Learn. 45 (1) (2001) 5–32.
[79] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, John Wiley & Sons, 2012.
[80] J. Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. 61 (2015) 85–117.
[81] R.K. Sinha, Artificial neural network and wavelet based automated detection of sleep spindles, REM sleep and wake states, J. Med. Syst. 32 (4) (2008) 291–299.
[82] C. Fraley, A.E. Raftery, Model-based clustering, discriminant analysis, and density estimation, J. Am. Stat. Assoc. 97 (458) (2002) 611–631.
[83] D.A. Reynolds, R.C. Rose, Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Trans. Speech Audio Process. 3 (1) (1995) 72–83.
[84] J.A. Bilmes, et al., A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models, Int. Comput. Sci. Inst. 4 (510) (1998) 126.
[85] F. Yaghouby, S. Sunderam, Quasi-supervised scoring of human sleep in polysomnograms using augmented input variables, Comput. Biol. Med. 59 (2015) 54–63.
[86] J. MacQueen, et al., Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, Oakland, CA, USA, 1967, pp. 281–297.
[87] R. Agarwal, J. Gotman, Computer-assisted sleep staging, IEEE Trans. Biomed. Eng. 48 (12) (2001) 1412–1423.
[88] P. Berkhin, A survey of clustering data mining techniques, in: Grouping Multidimensional Data, Springer, 2006, pp. 25–71.
[89] F. Murtagh, A survey of recent advances in hierarchical clustering algorithms, Comput. J. 26 (4) (1983) 354–359.
[90] R.A. Jarvis, E.A. Patrick, Clustering using a similarity measure based on shared near neighbors, IEEE Trans. Comput. 100 (11) (1973) 1025–1034.
[91] PhysioBank, PhysioNet: components of a new research resource for complex physiologic signals, Circulation 101 (23) e215–e220.
[92] B. Kemp, A.H. Zwinderman, B. Tuk, H.A. Kamphuisen, J.J. Oberye, Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG, IEEE Trans. Biomed. Eng. 47 (9) (2000) 1185–1194.
[93] M.G. Terzano, L. Parrino, A. Sherieri, R. Chervin, S. Chokroverty, C. Guilleminault, M. Hirshkowitz, M. Mahowald, H. Moldofsky, A. Rosa, et al., Atlas, rules, and recording techniques for the scoring of cyclic alternating pattern (CAP) in human sleep, Sleep Med. 2 (6) (2001) 537–553.
[94] Z. Elahi, R. Boostani, A.M. Nasrabadi, Estimation of hypnosis susceptibility based on electroencephalogram signal features, Scientia Iranica 20 (3) (2013) 730–737.
[95] E. Jensen, P. Lindholm, S. Henneberg, Autoregressive modeling with exogenous input of middle-latency auditory-evoked potentials to measure rapid changes in depth of anesthesia, Methods Inf. Med. 35 (3) (1996) 256–260.
[96] M. Livia Fantini, J.-F. Gagnon, D. Petit, S. Rompré, A. Décary, J. Carrier, J. Montplaisir, Slowing of electroencephalogram in rapid eye movement sleep behavior disorder, Ann. Neurol. 53 (6) (2003) 774–780.