Classification of EEG Signals Using the Wavelet Transform

Neep Hazarika, Jean Zhu Chen, Ah Chung Tsoi, Alex Sergejew

Department of Computer Science, University of Aston, Birmingham, B4 7ET, England.
Department of Electrical and Computer Engineering, University of Queensland, St Lucia, Queensland 4072, Australia.
Faculty of Informatics, University of Wollongong, Northfields Ave., Wollongong, NSW 2522, Australia.
Centre for Applied Neurosciences, Swinburne University of Technology, Hawthorn, Victoria 3122, Australia.

Abstract- This paper describes the application of an artificial neural network (ANN) technique together with a feature extraction technique, viz., the wavelet transform, for the classification of EEG signals. Three classes of EEG signals were used: Normal, Schizophrenia (SCH), and Obsessive Compulsive Disorder (OCD). The architecture of the artificial neural network used in the classification is a three-layered feedforward network which implements the backpropagation of error learning algorithm. After training, the network with wavelet coefficients was able to correctly classify over 66% of the normal class and 71% of the schizophrenia class of EEGs. The wavelet transform thus provides a potentially powerful technique for preprocessing EEG signals prior to classification.

Keywords- EEG Classification, Neural Networks, Wavelet Transform

I. INTRODUCTION

One of the first attempts to apply artificial neural network (ANN) techniques to the problem of electroencephalogram (EEG) signal classification in psychiatric disorders was [13], which indicated that by using autoregressive modelling of EEG signals together with a nonlinear classification scheme, namely, the multilayer perceptron (a particular class of artificial neural network architectures), it was possible to classify EEG signals obtained from those suffering from schizophrenia and obsessive-compulsive disorder as well as EEG signals obtained from normal subjects. This study sets out to extend these initial findings.

When the Fast Fourier transform is applied to successive segments of an EEG signal, the frequency spectrum is observed to vary over time as the Fourier coefficients vary [15]; this indicates that the EEG signal is a non-stationary signal. The principal motivation behind the work reported here is our reasoning that if the feature extraction method could include modelling of possible non-stationary effects in the underlying signal, better classification results may be obtained than with the use of AR coefficients.

Many attempts have been made by engineers to process non-stationary signals in an appropriate way in order to circumvent the disadvantages of the Fourier transform. Of particular interest is the wavelet transform, which provides an efficient alternative to Fourier analysis as far as the analysis of non-stationary signals is concerned.

Signal analysis aims to extract appropriate information from a signal; this may be achieved through a transformation of the signal which enables a detailed study of relevant properties. An assumption is made that the transformation is also invertible. A transform, $F$, such that $y = Fx$, where $x$ is the signal to be transformed, and $y$ is the transformed variable, is said to be invertible if $F^{-1}$ exists such that $x = F^{-1}y$. In this case, the analysis can be shown to represent the signal unambiguously, and more involved operations such as feature extraction and parameter estimation can be performed on the “transform side” [11]. The wavelet transform is an example of such an invertible transform.

The wavelet transform can be thought of as an extension of the classic Fourier transform, except that, instead of working on a single scale (time or frequency), it works on a multi-scale basis. This multi-scale feature of the wavelet transform allows the decomposition of a signal into a number of scales, each scale representing a particular “coarseness” of the signal under study [9]. This essentially decomposes the signal into a set of signals of varying “coarseness”, ranging from low frequency components progressively to high frequency components. Thus, if one can make some decision concerning the underlying frequency components of the signal, one may choose the appropriate scale in the wavelet transform, whilst ignoring the contribution of the other scales. This decomposition of the signal into different scales is particularly useful if the wavelet transform is performed on an orthogonal basis (footnote 1). Hence, one may think of the high frequency components as representing the “noise” content in a signal, and which may therefore be ignored. It is this concept of decomposing the signal into a number of scales, and the ability to ignore some of the decomposed signals, which recommends the wavelet transform as a possible method for signal processing.

Footnote 1: The wavelet transform may be performed on a set of orthogonal or non-orthogonal basis. The orthogonal basis transform is more compact in its representations, as it allows the decomposition of the underlying space into a set of orthogonal subspaces, thus making it possible to ignore some of the decomposed signals.

The structure of this paper is as follows: in section II, we will briefly describe the method for computing the wavelet coefficients, and their interpretation, of a given signal. Then, we will use the wavelet coefficients as features for the classification of EEG signals using an artificial neural network in section III. The results on the performance of the wavelet transform method are shown in section IV. Finally, some conclusions are drawn concerning this classification methodology.

The wavelet transform decomposes a signal onto a set of basis functions called wavelets. These are obtained from a single prototype wavelet, called a mother wavelet, by dilations and contractions, as well as shifts.

In this paper, we describe briefly how the wavelet transform can be applied to extract the wavelet coefficients of discrete time signals [5], [9]. The signal $f(x)$ is decomposed onto an orthonormal basis. Given an original sequence $f(n)$, $n \in Z$, where $f(n)$ is the discrete version of $f(x)$, we derive the difference of information between the approximations of the signal at the resolutions $2^j$ and $2^{j+1}$ (footnote 2). In order to compute this difference, we build an orthonormal basis by dilating and translating a particular function $\psi(x)$, called an orthogonal wavelet, or alternatively, a mother wavelet, where

$$\psi_{2^j}(x) = 2^j \psi(2^j x). \qquad (1)$$

Equation (1) is the central equation in the wavelet transform theory. It is observed that if the mother wavelet $\psi(x)$ is given, then the other wavelet functions can be computed from (1) by dilation and translation. Different mother wavelets give rise to different classes of wavelets, and hence the behaviour of the “decomposed” signal could be quite different [5]. There are a number of methods for obtaining an appropriate mother wavelet design. For details, the reader is referred to [9]. The mother wavelet should be chosen carefully, such that it exhibits good localization properties in both the frequency and spatial domains [9]. One such wavelet is the Lemarie wavelet [8]. We will use this wavelet basis in the current work.

Footnote 2: Note that, here, we have assumed for simplicity that the resolutions are “doubled”, or “halved”. In the literature, there is some work on the possibility of using resolutions which are non-commensurate ratios, but this work is still in its early days, and hence it is not yet clear how they can be applied to the wavelet basis chosen for this paper.

Since we are using the subsampling by 2 method, it would be useful to consider the length of a segment of signal as $2^N$. In this case, there are a total of $N + 1$ levels of resolution, containing, respectively, $2^N$ points, $2^{N-1}$ points, ..., $2^0 = 1$ point. The total number of coefficients in a wavelet transform is the sum of all the transformed points at all the levels, including the original signal itself, i.e., $\sum_{i=0}^{N} 2^i$. It can be shown that, given all the wavelet coefficients at all $N + 1$ levels, it is possible to reconstruct the original exactly [9].

Often, as we are using an orthogonal wavelet basis, it is possible to “ignore” some of the “upper” levels of the wavelet transforms, e.g., we may ignore the contribution of the wavelet transforms for level $i > M$, where $M < N$. This is based on the assumption that the “upper” levels of the wavelet transform consist of mainly “noise” components, and hence can be “ignored”. The result is a set of wavelet coefficients of an approximation of the original signal. In the present work, there are $n = 2^N$ values $f_1, f_2, \ldots, f_n$, which are equally spaced values of a function $f(x)$. The goal, then, is to split $f$ into its components at different scales. We shall use the superscript $N$ to indicate the level of decomposition. At each new level, the meshwidth is cut in half and the number of wavelet coefficients is doubled. The decomposition can then be represented as

$$f^{(N)} = g^{(N-1)} + g^{(N-2)} + \cdots + g^{(N-M)} + f^{(N-M)}, \qquad (2)$$

where $g^{(\cdot)}$ is called the “detail” signal [9]. $M$ is so chosen that $f^{(N-M)}$ is sufficiently “blurred” [9].

Further, since we are using an orthogonal wavelet basis, by ignoring the “upper” levels, e.g., the levels $i > M$, of the wavelet transform, the reconstructed signal will be the best representation of the signal up to that particular level $i$. Thus, the wavelet transform can be used to obtain the best representation of the underlying signal at different scales, up to a particular level.

One way of reducing the number of wavelet coefficients to be used as features representing each segment of the EEG signals is to prescribe a “stopping criterion” for the value of $M$ in equation (2); this can be achieved through a thresholding operation [6].
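The level counting, exact reconstruction and "blurred plus detail" split described in this section can be illustrated with a short numerical sketch. It is not the authors' implementation: the Haar wavelet is used purely as a simple orthonormal stand-in for the Lemarie wavelet, and the function names `haar_decompose` and `haar_reconstruct`, the random test signal and the example values N = 7, M = 4 are assumptions made for illustration.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_decompose(f, M):
    """Split f (length divisible by 2**M) into a blurred part and M detail arrays.

    Haar is only a simple orthonormal stand-in (assumption);
    the paper itself uses the Lemarie wavelet.
    """
    approx = np.asarray(f, dtype=float)
    details = []                      # finest detail first, coarsest last
    for _ in range(M):
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / SQRT2)
        approx = (pairs[:, 0] + pairs[:, 1]) / SQRT2
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose by undoing the levels from coarsest to finest."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        finer = np.empty(2 * approx.size)
        finer[0::2] = (approx + d) / SQRT2
        finer[1::2] = (approx - d) / SQRT2
        approx = finer
    return approx

N, M = 7, 4                           # example values (assumption)
rng = np.random.default_rng(0)
f = rng.standard_normal(2 ** N)       # stand-in for one signal segment

# Points at the N + 1 resolution levels: 2^N, 2^(N-1), ..., 2^0
level_sizes = [2 ** i for i in range(N, -1, -1)]
total_points = sum(level_sizes)       # equals 2^(N+1) - 1

approx, details = haar_decompose(f, M)
restored = haar_reconstruct(approx, details)          # exact reconstruction

# Dropping every detail leaves the blurred approximation; the remainder
# is the sum of the detail signals in equation (2).
blurred = haar_reconstruct(approx, [np.zeros_like(d) for d in details])
detail_part = f - blurred
```

Because the basis is orthonormal, `blurred` and `detail_part` are orthogonal to each other, which is what makes it safe to discard selected levels without disturbing the rest of the representation.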

III. ANALYSED EEG SIGNALS AND THEIR CLASSIFICATION

The classification of the EEG signals comprises the following steps:

1. Preprocessing of the signals. In the present work, this comprises determination of the wavelet coefficients, described in section II. These coefficients will be used as “features” describing the signal.
2. The features thus extracted from the preprocessing operation are input into an artificial neural network which carries out a classification over the set of extracted parameters, in this case, the set of wavelet coefficients.

Three data types were selected. Each data type comprised EEG signals recorded from subjects from one of three diagnostic groups: normal control subjects, patients diagnosed with schizophrenia (SCH), and patients diagnosed with obsessive-compulsive disorder (OCD).

Details of the recording methodology used for all subjects are given elsewhere [15], [12]. All EEG signals were acquired at 128 Hz and digitised with 8-bit resolution using a Bio-Logic Brain Atlas III system. Each recording used a gain of 30,000 and a 1-30 Hz band pass filter. Only data taken with eyes open was subsequently analysed, and of the 19 channels of data actually acquired, only data recorded from the vertex of the scalp was used for this study. All EEG data was visually inspected off-line and data contaminated by artefacts was manually rejected.

Each EEG signal was divided into segments, with each segment comprising $2^7 = 128$ samples, i.e. each segment was 1 second in duration. 120 such segments were taken from each subject. Thus, up to seven levels of decomposition of the signal onto the wavelet basis are possible. Our preliminary work [15], [12], [14], in which we studied the effects of varying segment lengths and applied a likelihood ratio analysis, indicated that these EEG signals were generally approximately stationary within one second segments.

A total of 41, 60 and 35 EEG files were obtained respectively from normal, schizophrenic and OCD subjects. We used 26, 39 and 24 EEG files respectively for training, and the rest of the files for testing purposes. The testing data files

We chose the value $M = 4$ in Equation (2), ignoring the higher levels of decomposition of the wavelet transform. We chose the Lemarie wavelet [8] as the basis for decomposition of the EEG signals. At each level of decomposition, we measured the absolute value of the wavelet coefficients, and retained the two coefficients with the highest magnitude. Thus, for $M = 4$, there were 4 x 2 = 8 coefficients for each segment of the EEG signals.

An artificial neural network that employs multi-layer perceptrons [7] with a single hidden layer using a gradient search technique is used to classify the signals. We will not describe the training algorithm used here as it is the standard backpropagation of error type algorithm, but we refer the readers to [7] for details.

Since this is a study of the performance of ANNs as a classifying tool for EEG signals, we used a simple trial and error approach of changing the number of hidden layers and hidden units to determine the most suitable ANN architecture for the different EEG data sets under consideration. Based on this approach, it was found that, using 8 signal feature parameters as the input, a network with a single hidden layer containing fifty hidden units and three output units performed well for each of the data sets under consideration.

Note that, in the training of the multilayer perceptron, we could have used the entire subject's complement of 120 x 8 parameters as inputs, together with the associated output classification. However, this would have caused the training process to be extremely slow, as the resulting network would have had a large number of weights. Instead, we have treated each segment as independent, hence we only had 8 input parameters and an associated output classification. This resulted in a much smaller neural network.

IV. RESULTS

The layered feedforward net was trained using the standard backpropagation of error method [7]. The output activation is considered to be unknown if all the values of the activations at the output nodes are less than 0.5, or if there is a tie in the number of output activations for each class of EEG.

The output activations and the classifications based on the activations of the output nodes using the method described in this paper are shown in Figure 1 (footnote 3). Each bar represents the ANN clas-
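The feature selection of Section III (M = 4 levels, two largest-magnitude coefficients per level), the network-size argument, and the "unknown" rule stated above can be sketched as follows. This is an illustrative reading, not the authors' code, and it rests on several assumptions: Haar replaces the Lemarie wavelet, the magnitudes (rather than the signed values) of the retained coefficients are used as features, bias weights are counted, and the "tie" is read as a tie in per-segment votes across one file. All function names are hypothetical.

```python
import numpy as np

CLASSES = ("Normal", "SCH", "OCD")

def haar_details(segment, M):
    """Detail coefficients of M Haar levels (stand-in for the Lemarie wavelet)."""
    approx = np.asarray(segment, dtype=float)
    details = []
    for _ in range(M):
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0))
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    return details

def segment_features(segment, M=4, keep=2):
    """Keep the `keep` largest coefficient magnitudes at each of the M levels."""
    feats = []
    for d in haar_details(segment, M):
        largest = np.sort(np.abs(d))[-keep:][::-1]   # descending magnitudes
        feats.extend(largest)
    return np.array(feats)

def mlp_weight_count(n_in, n_hidden=50, n_out=3):
    """Weights of a single-hidden-layer net, counting one bias per unit (assumption)."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

def segment_label(activations, threshold=0.5):
    """Unknown if every output is below the threshold, else the strongest class."""
    a = np.asarray(activations, dtype=float)
    if np.all(a < threshold):
        return "Unknown"
    return CLASSES[int(np.argmax(a))]

def file_label(segment_activations):
    """Majority vote over segments; no votes or a tie gives Unknown (assumed reading)."""
    votes = [segment_label(a) for a in segment_activations]
    counts = [votes.count(c) for c in CLASSES]
    best = max(counts)
    if best == 0 or counts.count(best) > 1:
        return "Unknown"
    return CLASSES[counts.index(best)]

features = segment_features(np.arange(128.0))   # 4 levels x 2 = 8 features
small_net = mlp_weight_count(8)                  # per-segment input
large_net = mlp_weight_count(120 * 8)            # whole-subject input
```

Under these assumptions the per-segment network has a few hundred weights while the whole-subject alternative has tens of thousands, which is the size difference the text appeals to when treating segments independently.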
(rows: actual class; columns: ANN classification)

          Normal   SCH      OCD      Unknown
Normal    10/15    1/15     2/15     2/15
SCH       2/21     15/21    2/21     2/21
OCD       3/11     1/11     4/11     3/11

TABLE I
THE CONFUSION TABLE OF THE CLASSIFICATION RESULTS ON THE TESTING DATA SET USING THE WAVELET TRANSFORM.

This work was supported by a grant from the National Health and Medical Research Council. The authors acknowledge the technical assistance of Dr. Greg Price of the Clinical Electrophysiology Unit, Wolston Park Hospital, Wacol,

included in the printed proceedings. Interested readers can contact the first or the third author for a copy of the figure.
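As a check on Table I, the per-class rates quoted in the abstract can be recomputed from the table entries. The sketch below assumes that the four columns are ordered Normal, SCH, OCD, Unknown (the extracted table carries no header row, so this ordering is an inference), and the names `table`, `columns` and `correct_rate` are hypothetical.

```python
from fractions import Fraction

# Assumed column order; the source table has no surviving header row.
columns = ["Normal", "SCH", "OCD", "Unknown"]

# Numerators of Table I, one row per actual class.
table = {
    "Normal": [10, 1, 2, 2],
    "SCH":    [2, 15, 2, 2],
    "OCD":    [3, 1, 4, 3],
}

def correct_rate(true_class):
    """Fraction of test files of `true_class` assigned to that same class."""
    row = table[true_class]
    return Fraction(row[columns.index(true_class)], sum(row))

for name in table:
    print(f"{name}: {float(correct_rate(name)):.1%}")
```

The row totals (15, 21 and 11) match the numbers of test files left over after training (41 - 26, 60 - 39 and 35 - 24), and the Normal and SCH rates agree with the "over 66%" and "71%" figures in the abstract.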
