Fully Parametric Sleep Staging Compatible
DOI 10.1007/s12021-009-9059-9
Introduction
Sleep stages scoring, as described in Rechtschaffen
and Kales (1968), is based upon recognition of several basic features of polysomnographic recordings.
These basic elements can be either structures (called
transients or waveforms) present in the EEG, or
some general properties of the analyzed epoch of the
polysomnographic recording, relating also to the EMG
(electromyogram, that is muscle activity) and EOG
(electrooculogram, measuring eye movements).
EEG structure, relevant to staging, includes slow
waves, sleep spindles, K-complexes, vertex sharp waves
(VSW), alpha, theta and sawtooth waves. More or less
exact definitions of these phenomena were developed
in decades of visual analysis of EEG for the purpose
of standardization. They rely on terms like cycles per
second and time duration, which at a first glance
may appear to be easily translatable to the language
of computer methods used for signal analysis. This has been
true for frequency (cycles per second) since the
first application of the Fourier transform to EEG (Dietsch
1932). However, the following decades of development of
computer methods brought surprisingly little progress
in reliable estimation of the other parameter important
in the classification of waveforms, namely their time duration. This stems from the fact that most methods
rely on a prior setting of this parameter, e.g. the constant
length of the analysis window in a spectrogram (short-time Fourier transform) or the constant ratio of frequency
to the time width of wavelets.
The first signal processing algorithm that adapted
the window length to the local features of the analyzed
signal (which also allows a reliable estimation of the
time width of detected waveforms) was matching
pursuit (MP), proposed by Mallat and Zhang (1993).
It implements a suboptimal, iterative solution to the
problem of adaptive approximations in time-frequency
dictionaries of functions. Over 10 years of MP applications to various paradigms of EEG and MEG have
proven its sensitivity, selectivity, and high concordance
with visual detection of certain EEG waveforms, which
led to the thesis of unification of visual and computer
EEG analysis within this framework (Durka 2007b). In
particular, detection of sleep spindles and delta waves,
and its concordance with visual analysis, was presented in Durka et al. (2005), Malinowska et al. (2006), and
Zygierewicz et al. (1999).
This study gathers all these results into an open, user-friendly
system for sleep staging, based upon a bottom-up approach: starting with detection and parameterization of relevant waveforms and patterns, and then
combining these results (together with information
from EMG and EOG signals) into a final decision assigning sleep stages to epochs according to the R&K rules
(Rechtschaffen and Kales 1968).
Previous approaches to automatic sleep staging include hybrid systems (Park et al. 2000; Smith and
Karacan 1971), methods based on statistical pattern
recognition techniques (Martin et al. 1972; Stanus et al.
1987), neural networks (Schaltenbrand et al. 1996) and
expert systems (Ray et al. 1986; Kubat et al. 1994);
these examples include only a fraction of the studies on
digital sleep analysis reviewed in Penzel et al. (2007). In
addition to the automation achieved so far, the procedure presented here can be fully controlled by the
user at any of its stages, from detection of waveforms
and patterns, through primary assignment of stages, to
final smoothing. All these steps can be verified separately by experts, who can adjust the parameters of
detection using terms and notions from the electroencephalographer's rather than the engineer's dictionary, for
example, the minimum amplitude (in microvolts) of delta
waves or sleep spindles. Primary assignment of stages
is a direct interpretation of the R&K rules (Rechtschaffen
and Kales 1968), while the smoothing algorithm incorporates the classic rules and experience published so
far in the relevant literature. Given the above possibility
of interaction, and the clear foundations of the whole
algorithm, electroencephalographers can gain a full understanding of the procedures and their limitations, and
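$$ g_\gamma(t) = K(\gamma)\, e^{-\pi \left( \frac{t-u}{w} \right)^2} \cos\left( 2\pi f (t-u) + \phi \right) \tag{1} $$

$$ x \approx \sum_{n=0}^{M-1} a_n\, g_{\gamma_n} \tag{2} $$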
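The iterative decomposition behind Eqs. 1 and 2 can be illustrated by a minimal sketch. It assumes a tiny, hand-picked dictionary of unit-norm Gabor functions and a toy signal; the names `gabor_atom`, `matching_pursuit`, `PARAMS` and `DICTIONARY`, as well as all parameter values, are hypothetical and are not taken from the system described here, which relies on much larger dictionaries.

```python
import math

def gabor_atom(n_samples, u, w, f, phi=0.0):
    """Unit-norm Gabor function: Gaussian envelope (center u, width w) times a cosine."""
    raw = [
        math.exp(-math.pi * ((t - u) / w) ** 2)
        * math.cos(2 * math.pi * f * (t - u) + phi)
        for t in range(n_samples)
    ]
    norm = math.sqrt(sum(v * v for v in raw))
    # Dividing by the norm plays the role of the normalizing factor K(gamma).
    return [v / norm for v in raw]

def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP: in each step pick the atom with the largest |<residual, atom>|."""
    residual = list(signal)
    expansion = []  # pairs (index of chosen atom, coefficient a_n)
    for _ in range(n_iter):
        products = [dot(residual, g) for g in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(products[i]))
        a = products[best]
        # Subtract the contribution of the chosen atom from the residual.
        residual = [r - a * g for r, g in zip(residual, dictionary[best])]
        expansion.append((best, a))
    return expansion, residual

# Hypothetical toy dictionary: positions u and widths w in samples, f in cycles per sample.
N = 64
PARAMS = [(u, w, f) for u in (16, 32, 48) for w in (8, 16) for f in (0.05, 0.1, 0.2)]
DICTIONARY = [gabor_atom(N, u, w, f) for (u, w, f) in PARAMS]

# Toy signal: a single dictionary atom scaled by 3.
signal = [3.0 * v for v in DICTIONARY[4]]
expansion, residual = matching_pursuit(signal, DICTIONARY, n_iter=1)
```

In this toy case the first iteration recovers the atom that generated the signal, and its parameters (`PARAMS[4]`) directly give the position, time width and frequency of the detected structure.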
Frequency      Time duration   Min. amplitude
0.2–4 Hz       0.5 s           A_SWA
11–15 Hz       0.5–2.5 s       13 µV
8–12 Hz        1.5 s           5 µV
4–8 Hz         0.1 s           15 µV
0.05–2.5 Hz    0.3–1.5 s       100 µV
of filtered EMG signal, as for MT above) were calculated, after omitting 10% of the smallest and 10% of
the largest values (Anderer et al. 2005). This procedure
was adopted to reject the burst-type high EMG activity
that also occurs in stage REM. Thresholds for the
EMG tone used to detect stage REM (Thr2) were
determined on the basis of the data collected to train
the system. These thresholds are computed for each
sleep recording as the average EMG tone of all the
epochs (with the additional constraint of not exceeding
50 µV).
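A minimal sketch of this step is given below. It assumes that the EMG tone of an epoch is the mean of amplitude values left after discarding the lowest and highest 10%, which is one possible reading of the description above; the function names `trimmed_mean` and `rem_emg_threshold` are hypothetical.

```python
def trimmed_mean(values, trim_fraction=0.10):
    """Mean after omitting the given fraction of smallest and of largest values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)

def rem_emg_threshold(epoch_tones, cap_uv=50.0):
    """Thr2 for one recording: average EMG tone over all epochs, not exceeding 50 microvolts."""
    mean_tone = sum(epoch_tones) / len(epoch_tones)
    return min(mean_tone, cap_uv)
```

Trimming makes the tone estimate insensitive to short bursts: a single very large value in an epoch is dropped before averaging.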
Rapid eye movements provide supporting information
needed to assign stage REM. Detection of rapid eye
movements is based upon correlation coefficients between two EOG derivations, EOGL and EOGR, and
the deflection between left and right EOG. The correlation
coefficients between EOGL and EOGR are computed
in 0.5-s epochs. Thresholds Thr3 for minimum deflection
between left and right EOG were also determined
based on the training data. If the correlation coefficient falls below 0.9, and the EOGL-EOGR deflection is
lower than Thr3, a rapid eye movement event is scored.
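The two quantities used by this rule can be computed as in the following sketch. It assumes non-overlapping 0.5-s windows and takes the deflection to be the largest absolute difference between EOGL and EOGR within a window; both choices, and the names `pearson` and `eog_window_features`, are assumptions made for illustration. The decision rule itself is left to the text above.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either signal is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def eog_window_features(eog_l, eog_r, fs, window_s=0.5):
    """Return (correlation, deflection) for consecutive non-overlapping windows."""
    step = int(round(fs * window_s))
    features = []
    for start in range(0, len(eog_l) - step + 1, step):
        left = eog_l[start:start + step]
        right = eog_r[start:start + step]
        # Assumed definition: largest absolute left-right difference in the window.
        deflection = max(abs(a - b) for a, b in zip(left, right))
        features.append((pearson(left, right), deflection))
    return features

# Toy example at 8 Hz: opposite-going signals first, then identical signals.
feats = eog_window_features(
    [0, 1, 2, 3, 0, 1, 2, 3],
    [0, -1, -2, -3, 0, 1, 2, 3],
    fs=8,
)
```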
Assigning Stages to Epochs
Automatic classification of sleep stages is performed in
a hierarchical manner, presented in Fig. 2. In the
first step, each 20-s epoch is tested for muscle artifacts
in the EEG or EMG derivations, which can indicate
Movement Time (MT). The relevant parameter, reflecting
an increase in amplitude, is calculated as described in
Section "Detection of Body and Eye Movements". If
at least one of the analyzed derivations (C3, C4 or
EMG) exceeds the corresponding threshold (Thr1EEG,
common for C3 and C4, and Thr1EMG for EMG) in
more than 50% of the 20-s epoch, the epoch is scored
as Movement Time.
In the second step, the algorithm detects slow wave
sleep (SWS), stages 3 and 4, by applying fixed 20%
and 50% thresholds to the fraction of the epoch's time
occupied by slow waves. Estimation of this parameter is
described in Section "Parameterization of EEG Waveforms" and in Durka et al. (2005).
In the next step, the algorithm detects sleep spindles
and K-complexes, which are related to stage 2. If at
least one sleep spindle or K-complex occurs in a 20-s
epoch, and less than 75% of the epoch is occupied
by alpha activity (Section "Parameterization of EEG
Waveforms"), the epoch is scored as stage 2.
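The first three steps of this hierarchy can be summarized in a short sketch. It assumes that the per-epoch quantities (fractions of the epoch above the Thr1 thresholds, fraction occupied by slow waves and by alpha activity, counts of spindles and K-complexes) have already been computed. Mapping the 20% and 50% thresholds onto stages 3 and 4 follows the R&K convention and, like the function name `score_epoch`, is an assumption of this illustration.

```python
def score_epoch(artifact_fraction, slow_wave_fraction,
                n_spindles, n_k_complexes, alpha_fraction):
    """Return 'MT', 'S4', 'S3' or 'S2', or None if later steps must decide.

    artifact_fraction maps a derivation name (e.g. 'C3', 'C4', 'EMG') to the
    fraction of the epoch in which its amplitude parameter exceeds its threshold.
    """
    # Step 1: Movement Time if any derivation is above threshold for over 50% of the epoch.
    if any(fraction > 0.5 for fraction in artifact_fraction.values()):
        return "MT"
    # Step 2: slow wave sleep, using the fixed 50% and 20% thresholds.
    if slow_wave_fraction > 0.5:
        return "S4"
    if slow_wave_fraction >= 0.2:
        return "S3"
    # Step 3: stage 2 requires a spindle or K-complex and less than 75% alpha.
    if (n_spindles + n_k_complexes) >= 1 and alpha_fraction < 0.75:
        return "S2"
    # Remaining epochs are resolved by the subsequent steps (REM, Wake, stage 1).
    return None
```

The order of the tests matters: an epoch dominated by movement artifacts is never examined for slow waves, and an epoch with enough slow waves is never examined for spindles.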
If alpha activity occupies more than 75% of the epoch, the EOG and
EMG signals of this epoch are examined to distinguish stage REM from Wake. Stage REM is scored
when the EMG tone is below Thr2, in presence of
Table 2 Inter-expert agreement in epoch-by-epoch scoring of sleep stages, evaluated on 7 overnight recordings (9128 epochs) scored
by 2 experts (rows: Expert A; columns: Expert B)

Stage              S1     S2     S3     S4   SREM      W     MT   Selectivity (%)
S1                126     36      1      0     48     36      7   49
S2                 85   3764     69      0    223     34     17   89
S3                  1    736    189      7     23      0      0   20
S4                  0    164    494    848      0      0      3   56
SREM               12     77      2      0   1663      3      9   94
W                  18      4      0      0      2    188      4   87
MT                 13     32      5      0     29     27    129   55
Sensitivity (%)    49     78     24     99     83     65     76   76

Total concordance 76% (pair agreement range 70–84%), Kappa coefficient 0.65, Stage 1 concordance 33%, S2: 71%, S3: 13%, S4: 56%,
SREM: 80%, W: 60%, MT: 47%
Results
Staging criteria defined in Rechtschaffen and Kales
(1968) leave a significant margin for subjective interpretation; therefore, hypnograms constructed for the same
recording, even by experts from the same laboratory,
can be slightly different (Kim et al. 1992; Monroe 1967).
This effect is illustrated in Fig. 3, where hypnograms
by three human experts are displayed together with the
hypnogram constructed automatically by the system.
These differences in scoring of the analyzed data are
summarized by statistical measures in Tables 2 and 3,
which are tables of counts that cross-classify the data. Table 2
presents a comparison of scoring sleep stages in 7
recordings by at least two experts.
As presented in Table 3, the total number of 20-s
epochs used in this study was 25316, corresponding to
Table 3 Concordance of the automatic detection of sleep stages, based upon MP parameterization of EEG (by rows) evaluated on 20
overnight recordings (25316 epochs), with hypnograms by human expert (by columns)

Stage              S1     S2     S3     S4   SREM      W     MT   Selectivity (%)
S1                279    125     59      0    181     73     42   37
S2                168   9554    664     58    701    167    172   83
S3                 96   1294   1646    814     64     93     62   41
S4                  3    175    232   2170      6      1      3   83
SREM              235    370      6      0   4290    112    119   83
W                  90     45     12      0      0    393     43   67
MT                 24    102      9      6     83    130    315   47
Sensitivity (%)    31     82     63     71     81     39     42   73

Total concordance 73% (range 66–81%), Kappa coefficient 0.63, Stage 1 concordance 20%, S2: 69%, S3: 33%, S4: 63%, SREM: 69%,
W: 33%, MT: 28%
Fig. 3 Example hypnograms scored independently for the same recording by 3 human experts (upper plots) and the presented algorithm (bottom)
[Figure: hypnogram panels including Expert2, Expert3 and MP Stager; vertical axis: M, W, S1, REM, S2, S3, S4; horizontal axis: time in hours]
about 140 h of polysomnographic recordings. Total concordance of the proposed automatic detection of sleep
stages with hypnograms by human experts, scored for
these epochs, is 73% (ranging from 66% to 81% for individual
recordings), with Cohen's kappa of about 0.63.
Taking the stages separately, mean concordance between visual scoring and automated analysis was 21%
for sleep stage 1, 69% for stage 2, 33% for stage 3,
63% for stage 4, 69% for REM, and 33% and 28% for
stages Wake and Movement Time, respectively. These
concordances exhibit a clearly different pattern across
the stages for the system-expert and inter-expert cases
(Fig. 4). For stages 3 and 4 the algorithm achieves,
on average, better concordance with human experts (respectively 33% and 63%) than the inter-expert
concordance for these stages (13% and 56%). This is
not a paradox: if a system achieves a stable performance close to the average of the experts' decisions, then
the mean distance between any expert's score and the
center of all the scores should be smaller than the
mean distance between random expert-expert pairs.
A similar result is reported in Anderer et al. (2005).
Location of the system's results close to the center of
the experts' decisions is achieved owing to the adaptive
time-frequency parameterization by matching pursuit,
discussed in Section Parameterization of EEG Wave-
Fig. 4 The concordance (percentage) within each of the categories (sleep stages: S1, S2, S3, S4, SREM, W, MT), computed as the
number of epochs scored as a given stage by both experts (or by both the system and the expert),
divided by the number of epochs scored as that stage by either of the experts (or by either the system or the
expert)
[Figure: bar plot of concordance in % (vertical axis, 0 to 80) for each stage; series: system-expert and expert-expert]
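The agreement measures used in Tables 2 and 3 and in Fig. 4 can be recomputed from a table of counts, as in the sketch below. It uses the pooled counts read from Table 2 (rows: Expert A, columns: Expert B); the function name `agreement_stats` is hypothetical, and the per-stage concordance is taken as the number of epochs on the diagonal divided by the number of epochs assigned to that stage by either scorer. Computing all measures from pooled counts, rather than per recording, is an assumption of this illustration.

```python
STAGES = ["S1", "S2", "S3", "S4", "SREM", "W", "MT"]

# Counts from Table 2: rows = Expert A, columns = Expert B.
TABLE_2 = [
    [126,   36,   1,   0,   48,  36,   7],
    [ 85, 3764,  69,   0,  223,  34,  17],
    [  1,  736, 189,   7,   23,   0,   0],
    [  0,  164, 494, 848,    0,   0,   3],
    [ 12,   77,   2,   0, 1663,   3,   9],
    [ 18,    4,   0,   0,    2, 188,   4],
    [ 13,   32,   5,   0,   29,  27, 129],
]

def agreement_stats(matrix):
    """Return (total concordance, Cohen's kappa, per-stage concordances)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]
    # Observed agreement: fraction of epochs on the diagonal.
    observed = sum(diagonal) / total
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    # Per stage: scored by both, divided by scored by either.
    per_stage = [d / (r + c - d) for d, r, c in zip(diagonal, row_sums, col_sums)]
    return observed, kappa, per_stage

observed, kappa, per_stage = agreement_stats(TABLE_2)
concordance_by_stage = dict(zip(STAGES, per_stage))
```

For Table 2 this pooled computation reproduces the reported total concordance and kappa coefficient at the quoted precision, and the per-stage values agree with those listed under the table to within about one percentage point.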
Discussion
Results from the previous section indicate that the concordance of the presented algorithm with visual staging
is similar to the inter-expert concordance. Although
similar results were already reported in the literature, their
direct comparison to the performance of the presented
system, evaluated in detail in Section "Results", is very
difficult because of different approaches to reporting
concordance. For example, Schaltenbrand et al. (1996)
reported sensitivity between 80 and 84.5%; Prinz et al.
(1994) a mean proportion of agreement of 0.74 and a
mean kappa coefficient of 0.57; Hashizume et al. (2001) a
total agreement ratio of 85.8%; Park et al. (2000) an
average agreement rate of 87.5% in normal recordings;
Stanus et al. (1987) found 70–75% concordance; and
Hasan et al. (1993) reported agreement between
the computer and visual scores that was relatively good for 5
subjects having a prominent occipital alpha activity
during wakefulness (range 70–79%) but less promising
(range 64–70%) for the other 4 subjects with poor
occipital alpha activity. A review of these results is
given in Penzel et al. (2007).
These expert systems, tuned explicitly for maximizing concordance with visual staging, were usually based
on black-box approaches such as artificial neural networks, which, according to Caffarel et al. (2006), are not
sufficiently accurate for sleep study analysis using the
References
Anderer, P., Gruber, G., Parapatics, S., et al. (2005). An
E-health solution for automatic sleep classification according to Rechtschaffen and Kales: Validation study of the
Somnolyzer 24 × 7 utilizing the Siesta database. Neuropsychobiology, 51, 115–133.
Baumgart-Schmitt, R., Herrmann, W., & Eilers, R. (1998). On
the use of neural network techniques to analyze sleep EEG
data. third communication: Robustification of the classifi-
ing using the neural network model: Comparison between
visual and automatic analysis in normal subjects and patients. Sleep, 9(1), 26–35.
Smith, J., & Karacan, I. (1971). EEG sleep stage scoring by an automatic hybrid system. Electroencephalography and Clinical
Neurophysiology, 31(3), 231–237.
Stanus, E., Lacroix, B., Kerkhofs, M., & Mendlewicz, J. (1987).
Automated sleep scoring: a comparative reliability study of
two algorithms. Electroencephalography and Clinical Neurophysiology, 66(4), 448–456.
Virkkala, J., Hasan, J., Värri, A., Himanen, S.-L., & Müller,
K. (2007). Automatic sleep stage classification using two-channel electro-oculography. Journal of Neuroscience Methods, 166, 109–115.
Zygierewicz, J., Blinowska, K. J., Durka, P. J., Szelenberger, W.,
Niemcewicz, S., & Androsiuk, W. (1999). High resolution
study of sleep spindles. Clinical Neurophysiology, 110(12),
2136–2147.