Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO.

12, DECEMBER 2004 2103

Classification of Normal and Dysphagic Swallows


by Acoustical Means
Lisa J. Lazareck, Member, IEEE, and Zahra M. K. Moussavi*, Member, IEEE

Abstract—This paper proposes a noninvasive, acoustic-based theory explaining the physiological cause or acoustical charac-
method to differentiate between individuals with and without dys- teristics of normal swallowing sounds. Thus, without a thorough
phagia or swallowing dysfunction. Swallowing sound signals, both study of the normal swallowing sound, the potential for using
normal and abnormal (i.e., at risk of some degree of dysphagia)
were recorded with accelerometers over the trachea. Segmentation
acoustical analysis to understand dysphagic swallowing sounds
based on waveform dimension trajectory (a distance-based tech- is limited.
nique) was developed to segment the nonstationary swallowing Normal swallowing involves intricate control and coordina-
sound signals. Two characteristic sections emerged, Opening and tion of three swallowing phases, commonly referred to as oral,
Transmission, and 24 characteristic features were extracted and pharyngeal, and esophageal [3], [6]. The oral phase begins with
subsequently reduced via discriminant analysis. A discriminant the insertion of the food bolus (solid or liquid) into the mouth.
algorithm was also employed for classification, with the system
trained and tested using the leave-one-out approach. Overall, 350 The lips, tongue, and teeth confine and process the bolus within
signals were used from three bolus consistencies (semisolid, thick the mouth. The pharyngeal phase begins once the bolus reaches
and thin liquids). A final screening algorithm correctly classified the epiglottis and synchronized measures (with the breathing
13 of 15 control subjects and 11 of 11 subjects with some degree mechanism) are taken to protect the airway. Subsequently, the
of dysphagia and/or neurological impairments. The proposed bolus is passed into the esophagus. Lastly, the esophageal phase
method has great potential to reduce the need for videofluoro-
begins once the bolus is within the esophagus and the crico-pha-
scopic swallowing studies (the current gold standard method for
swallowing assessment, which is invasive and nonportable) and ryngeus juncture sphincter, which is always closed, momen-
to assist in the overall clinical assessment of swallowing sound tarily relaxes during deglutition [6], [7]. Peristaltic waves propel
signals. the bolus into the stomach. The oral phase is voluntary, while the
Index Terms—Classification, dysphagia, segmentation, signal pharyngeal and esophageal phases are involuntary muscle syn-
analysis, swallowing, waveform dimension. chronizations. Dysphagic individuals lack consistency of this
systematic coordination, which may cause aspiration. Abnor-
malities in food processing cause delays in initiating swallows,
I. INTRODUCTION difficulties in liquid swallowing and inefficient oral and pharyn-
geal clearance of swallowed material [8]. Overall, it is the del-
S WALLOWING dysfunction or dysphagia is most common
in individuals with neurological impairment such as brain-
stem stroke, head/neck injuries, and spinal cord injury with ante-
icate nature of the swallowing mechanism that lends itself to
increased vulnerability and possible dysphagia.
rior cervical fusion [1]. Also affected are individuals who suffer The current gold-standard method for the assessment of aspi-
from some degree of muscular incoordination and/or paralysis ration is the videofluorosopic swallow study (VFSS). VFSS is a
as a result of poliomyelitis, Guillian-Barré Syndrome and cere- radiologic procedure, whereby subjects ingest small amounts of
bral palsy [2]. Up to 80% of individuals severely affected by barium-coated boluses while X-rays penetrate the subject and
cerebral palsy have some degree of dysphagia, placing them the resultant images are video-recorded. VFSS allows imme-
at risk of aspiration (the event of food being drawn into the diate visual inspection, detection, and localization of abnormal-
airway below the glottis) [3], [4]. Repeated choking, malnutri- ities and demonstrates function impairments, which may result
tion, dehydration, recurrent respiratory disease, cachexia, and in aspiration [9]. However, VFSS is time-consuming and results
death may all be consequences of recurring aspiration and dys- in some radiation exposure. The test is not portable and cannot
phagia [5]. Despite severe and serious repercussions, our under- be done in the patients’ typical eating environment, which po-
standing of the fundamental dysphagic swallowing mechanism tentially limits the patients’ cooperation. Because of the X-ray
is still far from complete. Moreover, there is not a well-accepted exposure and lack of portability, VFSS cannot be used repeat-
edly when assessing/monitoring intervention strategies or as-
sessing an evolving neurological condition of a patient. In addi-
Manuscript received July 4, 2003; revised April 16, 2004. This work was
supported in part by the Natural Sciences and Engineering Research Council tion, VFSS may be challenging in neurologically impaired chil-
(NSERC) of Canada. Asterisk indicates corresponding author. dren or adults who may not cooperate or position easily into the
L. J. Lazareck was with the Department of Electrical and Computer Engi- testing area. Thus, new techniques need to be developed to help
neering, University of Manitoba, Winnipeg, MB R3T 5V6 Canada. She is now
with the Department of Engineering Science, University of Oxford, Oxford OX1 assess the performance of the swallowing mechanism.
3PJ U.K. (e-mail: lisa.lazareck@eng.ox.ac.uk). In recent years, cervical auscultation has been used by
*Z. M. K. Moussavi is with the Department of Electrical and Computer En- clinicians and researchers who investigate acoustical patterns
gineering, University of Manitoba, Winnipeg, MB R3T 5V6 Canada (e-mail:
mousavi@ee.umanitoba.ca). and temporal relations, which exceed auditory perception, at
Digital Object Identifier 10.1109/TBME.2004.836504 both subjective levels via the stethoscope and objective levels
0018-9294/04$20.00 © 2004 IEEE
2104 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO. 12, DECEMBER 2004

via surface microphones or accelerometers (with subsequent sounds. The specific goals were to divide the swallowing sound
computer programming and pattern recognition techniques). signal into characteristic segments (i.e., pharyngeal, esophageal,
Three distinct and previously unrecognized structures within etc.) using WDT; and to extract the characteristic features to
swallowing sound signals of infants were identified: initial classify between the swallowing sounds of normal subjects and
discrete sound (IDS), bolus transit sound (BTS), and final patients who might be at risk of having swallowing disorders
discrete sound (FDS), where the discrete sounds demonstrated (i.e., dysphagia). In particular, this paper reports on the tech-
remarkably stable repetition of the swallow composite [10]. IDS niques for swallowing mechanic corroboration, waveform di-
and FDS were also documented as the characteristic swishing mension method utilized for segmentation, feature extraction,
double-click when the bolus passes through the pharynx into and classification of a number of normal and dysphagic swal-
the esophagus [11], [12]. lowing sound signals.
In the late 1990s, the majority of acoustical swallow studies
were mainly concerned with the timing of the swallow within
the breath cycle [13], [14]. Breath and swallow coordination II. METHODS
was studied to assess the maturity and competence of swal-
lowing mechanisms along with the rhythmic breath-swallow In this study, the term “normal” was applied to the swal-
patterns in preterm infants [15]. Later, more attention was paid lowing sounds recorded from healthy subjects with no history
to the basic characteristics of the swallowing sound and whether of swallowing dysfunction. These individuals were considered
it could be used as an indicator of abnormalities [16], [19]. as “control” subjects. On the other hand, the term “abnormal”
Based on studying suck and swallows of infants over the first was applied to the swallowing sounds recorded from patients
month of life, a maturational pattern of suck-swallow rhythmic with swallowing disorders and/or neurological impairments. For
integration was proposed to exist and could be compared with these subjects, we analyzed the swallowing sounds in which as-
patterns from infants with bronchopulmonary dysplasia (BPD) piration had not transpired. This selection was made for two rea-
[16], [19]. Stationarity and normality of pediatric aspiration sons: first, the recording was stopped as soon as an aspiration
sound signals were studied and it was found that of the 94 occurred; hence, there were very few swallows with aspiration
waveforms studied, 62% were stationary over small window (one per patient) for analysis. Second, the goal was to develop a
sizes and 96% violated normality [17]. Given the fact that technique that could identify patients at risk of aspiration, only
swallowing sounds are nonstationary by nature, in a pilot study, by acoustical means; otherwise, detection of aspiration is fairly
normal swallowing sound signals were divided into stationary simple as the breath sound immediately following an aspiration
segments using variance fractal dimension (VFD) [18]. VFD is changes significantly (i.e., the breath sound is damped due to an
a fractal-based measurement describing the underlying signal external object in the airway). Therefore, the swallowing sounds
complexity or major changes in the signal’s variance [20], [21]. analyzed in this study were normal and marginally normal (i.e.,
Loosely based on the principle of fractal dimension, wave- “abnormal,” based on the definition above). Fig. 1 depicts the
form dimension (WD) is also a measurement of the degree of swallowing and breath sound signals of a normal subject in both
complexity or meandering between the points of a signal in the the time and frequency (i.e., spectrogram) domains along with
time domain. More specifically, WD is a measurement calcu- the corresponding airflow. Note that airflow was not calibrated
lated for a specified window size, with the window being slid and was not used for any analysis in this study, and is shown
through the entire signal, creating a waveform dimension tra- only for the purpose of illustration. The spectrogram was cre-
jectory (WDT) ated using a 100-ms (1024 samples) Hanning window with 50%
overlap between successive segments and calculating the dis-
(1) crete Fourier transform of each segment.

A. Subjects
where (within a time window) is the number of steps in the
waveform, is the planar extent or diameter of the waveform Two groups of subjects participated in the study, including
(that can be considered to be the farthest distance between the 15 controls and 11 patients. The controls were 12 children
starting point “1” and any other point “i” of the waveform), (ages 3–16 years) and three adults (ages 35, 38, and 54 years)
is the total length of the waveform, and is the average dis- with no history of swallowing dysfunction, eating or nutrition
tance between successive points [22]. In comparison with other problems, or lower respiratory tract infections. Controls were
waveform-based algorithms, such as the Higuchi and Petrosian tested in the Respiratory Acoustics Laboratory at the Children’s
methods, the WDT calculation procedure is reported to be fairly Hospital, Winnipeg, MB, Canada. The patients were 11 young
insensitive to noise [23]. In comparison with the VFD algorithm, adults (ages 16–25 years), divided into two groups: subjects
VFD trajectories heavily depend on the selected window size with swallowing dysfunction due to cancer and/or brain tumor
and the percentage of (sample) overlap; on the other hand, WD (2 subjects, at age 16) and severely affected by cerebral palsy
trajectories are less susceptible to these changes by default [18]. but feeding orally with no clinical signs or symptoms of aspi-
The WD is also a faster algorithm than VFD. Thus, the WDT ration (9 subjects). The former group was recruited from the
was used for the current investigation. Winnipeg Children’s Hospital; they were considered healthy
The objective of this study was to investigate whether there prior to illness (i.e., illnesses were noncongenital). These two
are any differences between normal and abnormal swallowing subjects were tested in the videofluoroscopy room, Radiology
LAZARECK AND MOUSSAVI: CLASSIFICATION OF NORMAL AND DYSPHAGIC SWALLOWS BY ACOUSTICAL MEANS 2105

the healthy children group ages 3–16 years. Lastly, it is impor-


tant to note that this study is retrospective and the convenience
sample of subjects (i.e., controls and heterogeneous mix of sub-
jects with neurological impairments) was originally collected
for the purpose of studying the pattern of swallow in respira-
tory cycle.

B. Experiments
Subjects participated in one of the two conducted exper-
iments: videofluoroscopic swallow study (VFSS) or audio-
recording session (non-VFSS). The VFSS was administered
for three healthy adults and two subjects with cancer and/or
brain tumor who were thought to be aspirating based on the
clinical evaluation, with both video and audio signals recorded.
Audio recording alone (non-VFSS) was performed on 12
healthy children and 9 subjects with cerebral palsy, with only
audio signals recorded. For both experiments, subjects were
fed three textures, “semisolid,” “thick liquid,” and “thin liquid”
in the same order. Respectively, prepackaged pudding, pudding
diluted with an equal quantity of milk, and fruit juice were
consumed. The physician in charge monitored each feeding for
all subjects. Subjects were offered five to ten 5-mL spoonfuls
of pudding and single-bolus sized sips of thick and thin liquids
respectively. Overall, 350 swallowing sound signals were
utilized (108 semisolid, 135 thick liquid, and 107 thin liquid).

C. Data Acquisition and Signal Processing


In both experiments, breath and swallowing sound signals
were recorded by accelerometers (Siemens EMT25C) being
placed over the suprasternal notch of the trachea and mid-
clavicular second intercostal space, respectively and secured
with double-sided adhesive tape rings. The signals were ampli-
fied with the same gain for all subjects and bandpass filtered
(50–2500 Hz) in hardware to minimize very-low-frequency
Fig. 1. Typical corresponding airflow (top), swallowing and breath sound
signals in the time (middle) and frequency (bottom) domains. Legend: “au,” movement artifacts and high-frequency noises. During VFSS
= =
arbitrary units for normalized amplitude; “SW” swallow, “I” inspiration, experiments, breath and swallowing sounds along with the
=
and “E” expiration. video images were simultaneously recorded with a Super VHS
Sony VCR. Later, the audio/video signals were digitized at
Department (Winnipeg Children’s Hospital), during their reg- 11.025-kHz sampling rate and 30 frames/s with a frame grabber
ular swallowing assessment. The latter group consisted of non- (ATI Multimedia Software). During non-VFSS experiments,
ambulant, wheelchair users affected by spastic quadriparetic simultaneously with breath and swallowing sounds, airflow
cerebral palsy. These patients resided in a congregate care set- was also recorded by nasal cannulae attached to a Fleisch
ting (St. Amant Center in Winnipeg, MB, Canada). Each of (No 3) pneumotachograph and Validyne differential pressure
the subjects had good head control (but no independent sitting transducer (Northridge, CA). The signals were digitized at
ability), moderate or severe cognitive impairment, and good nu- 10.240-kHz sampling rate; however, airflow signals were later
tritional status (as judged by their Center’s clinical dietician). decimated to 320 Hz.
In addition, they were free of lower respiratory tract infec- Swallowing sounds were extracted from each recording. The
tions for at least one year prior to investigation. Subject selec- start and end of each swallow were determined manually by
tion was based on the recommendation of their care providers. repeated listening and monitoring of the signal in the time and
These subjects were tested in St. Amant Center, their familiar frequency domains. The start point of a swallow was chosen
care environment, during regular meal or snack times by their as the first point of the deglutition apnea after any inspiration
regular feeders. In each experiment, informed consent was ob- or expiration. The end point of the swallow was chosen as the
tained from all participants or their parents/guardians prior to sample point preceding the FDS if present, or by the breath
the study. The study was approved by the Ethics Committee immediately following deglutition apnea. FDS may or may
for the Use of Human Subjects in Research of the University not be present in a swallowing sound. Each swallow event
of Manitoba. It should be noted that the age group of 16–25 was chosen through auditory analysis of the sound signal
years of cerebral palsy patients was considered equivalent to (both datasets), visual inspection of the signal in the time and
2106 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO. 12, DECEMBER 2004

2) Feature Extraction: Having done Part 1 (verifying the


match of WDT with time transcriptions and physiological
events), the next step was to extract characteristic features
for every swallowing section. From the results of Part 1,
across the 350 signals, two prominent and similar character-
istic sections (amongst and between subjects) emerged and
were denoted “Opening” and “Transmission.” The Opening
section represented a bolus entering the esophagus and was
defined as Swallow Commencement to IDS Termination
(abovementioned time transcription titles). The Transmission
section represented a bolus traveling down the esophagus to
the stomach and was defined as the swallowing sound signal
from the sample point following IDS Termination to Swallow
Termination. Both Opening and Transmission sections were
Fig. 2. Typical normal subject swallowing sound and corresponding WDT found to be wide-sense stationary; i.e., the mean value and vari-
signals. Legend: “au,” arbitrary units for normalized amplitude; “d,” waveform ance were unchanged per subject. Also, note that extraneous
0 2
dimension units [normalized by (WDT 1:0000 10 )]; “C” = click segment,
noises (“clack,” “gulp,” and so on) were not removed from the
“Q” = quiet segment; solid line demarcates Opening and Transmission sections;
dotted line indicates segmentation for matching sound descriptions. Opening and Transmission sections. The swallowing signals
depicted in Fig. 2 were demarcated accordingly.
frequency domain (both datasets), and visual inspection of Next, potential features were extracted, all pertaining to
the video frames (VFSS dataset). A time transcription was Opening and Transmission sections plus the “Total” swallowing
prepared of the “Swallow Commencement” and “Swallow sound signal in its entirety. Stemming from the time transcrip-
Termination” for each extracted swallow. In addition, for each tions, WDT and the swallowing sound signal magnitude, three
swallow, another time transcription was prepared for the “IDS sets of features were computed for each of the Opening, Trans-
Commencement,” and “IDS Termination.” Other acoustically mission, and Total sections. The first feature set was “Time
evident modulations were also noted, as most swallowing sound Duration,” or length of the swallowing sound signal in the time
signals included typical onomatopoeically designated “clacks,” domain per section. The second feature set was “Waveform
“gulps,” “squishes,” and “braaps.” The overall averaged timings Dimension,” the maximum value of WDT for the Opening and
(across normal subjects) were compared to those previously Transmission sections and the mean value of WDT for the Total
published. section. The third feature set was “Magnitude,” or the mean
1) Adaptive Segmentation Using Waveform Dimension rectified value of the swallowing sound in the time domain
Trajectory (WDT): An adaptive segmentation method based for the Opening, Transmission and Total sections. Following
on the aforementioned WD was applied to the 350 swal- the initial intuitive features, more features were selected and
designated for the Opening section only, as the Opening sec-
lowing sound signals. For each single swallowing sound
tion was the least contaminated by extraneous noises, such as
extracted from the entire breath and swallowing sound signal,
“gulp” and “clack.” Using the fast Fourier transform, the power
a 512-sample window was used to calculate WD [using (1)].
spectrum was calculated for every 100 ms segment of the signal
The window was slid across the entire signal with an increment
with 50% overlap between adjacent segments. Then from the
of one sample at a time, and WD was computed accordingly.
spectrum of the Opening section, “Fpeak,” “Fmax,” “Fmean,”
The 512-window corresponded to 46.4 and 50 ms for the
“Fmedian,” and “Pave” were calculated. Fpeak (Hz) was the
VFSS and non-VFSS datasets (sampling rates of 11.025- and
frequency corresponding to the maximum power spectrum
10.240-kHz). From studying the swallowing sound signals in magnitude, Fmax (Hz) was the frequency beyond which the
the time and frequency domains, the shortest characteristic power spectrum magnitude dissipated to less than 10% of its
signal of a swallowing sound is a “click” sound, which at maximum, and Fmean (Hz) and Fmedian (Hz) were the sta-
most is approximately 33 ms in duration. Thus, a 512-sample tistical mean and median frequencies of the power spectrum.
window was long enough to capture the “click,” and short Pave (dB) was the average power calculated over seven specified
enough to eliminate any redundant details. Once the window frequency bands (read subscripts, in Hertz), thus branching
completely slid through the signal, the resultant collection of into seven additional features: “ ,” “ ,”
WD values was denoted as the WDT, which had a one-to-one “ ,” “ ,” “ ,” “ ,”
correspondence with its analogous swallowing sound signal. and “ .” Finally, for the Opening and Transmission
For each WDT signal, significant changes were marked as sections, “Skewness” and “Kurtosis” features were computed.
the boundaries of characteristic segments of the swallowing Skewness and Kurtosis are third- and fourth-order statistics,
sound signal. Fig. 2 illustrates a typical normal swallowing respectively, measuring the asymmetry of the data around its
sound signal and the corresponding WD trajectory. The WDT sample mean and how outlier-prone the distribution is.
boundaries were compared to the manually transcribed timings Following feature extraction, statistical t-tests were per-
(VFSS and non-VFSS datasets) to investigate whether the WD formed, testing the hypothesis that there was a true difference
technique was able to detect the characteristic sections of a between the two means (normal and abnormal) of each fea-
swallowing sound. ture. In addition, t-tests were also performed between texture
LAZARECK AND MOUSSAVI: CLASSIFICATION OF NORMAL AND DYSPHAGIC SWALLOWS BY ACOUSTICAL MEANS 2107

pairs (semisolid-thick liquid, thick liquid-thin liquid, and thin this screening, a new accuracy (or misclassification rate) was
liquid-semisolid) to investigate any significant difference for calculated for each texture. Also, for each texture, only sub-
any of the 24 features, between the normal and abnormal groups jects contributing more than one swallow were included for the
. screening procedure.
3) Feature Reduction & Classification Techniques: In this
study, each swallowing sound signal was classified as normal or III. RESULTS
abnormal (marginally normal) as described at the beginning of
Section II. As each case was known prior to classification, su- 1) Adaptive Segmentation Using Waveform Dimension Tra-
pervised learning was used [24]. The classification model used jectory (WDT): Prior to the assessment of WDT segmentation,
in this part of the study was discriminant analysis [25] using manual inspection of the swallowing sound signals, by visual
Statistical Package for the Social Sciences (SPSS) software. and auditory means in the time and frequency domains as well
To begin with, the 24 computed features were assembled into as the video data, was crucial, revealing much regarding swal-
groups similar in meaning and importance. The following sym- lowing sound corroboration with swallowing mechanism. Con-
bols represented the feature groups: T Time Duration and sequently, these results (described first) were then utilized in
Waveform Dimension, M Magnitude, F Frequency, P the evaluation of WDT results (described second). To begin,
Power, and N Normality. In order to select the best group the Opening section corresponded with the bolus entering the
or groups of characteristic features, discriminant analysis was pharynx by way of the opening of the crico-pharyngeus muscle.
performed with each individual group, utilizing all cases as the The IDS of the swallowing signal was definitely present in all
training data. Thick liquid texture was selected for this part of swallows in the controls and patients (all datasets). The Trans-
the study as it contained the highest number of overall swal- mission section corresponded with the bolus traveling down the
lowing sound signals (135) in comparison to semisolid and thin esophagus and the return of the epiglottis (commonly causing
liquid textures (108 and 107). Using the leave-one-out method, an ensuing “click” sound, noted as FDS). The FDS of the swal-
the optimal feature set or combination was selected. The dataset lowing signal was noticeably absent in a number of swallows;
was then divided into training and testing subsets. Since the therefore, FDS was not included in this analysis. The averaged
number of subjects was not large enough for randomly dividing IDS and FDS (where present) duration, for all video-based swal-
between the training and testing subsets, the leave-one-out ap- lowing sound signals, were found to be approximately 33.3 ms
proach was used such that first, the individual swallowing sound or one frame (both normal and abnormal subjects). In addition
signals were grouped together (per subject); second, data of all to the IDS and FDS, several extraneous noises were catego-
subjects—except one—were selected as training data to create rized. The “clack” sound corresponded to tongue movement,
the classification model. The left-out subject’s data were se- and the “gulp,” “squish,” and “braap” sounds were of a more
lected as testing data to assess the accuracy of the classifica- general nature (i.e., not dependent on one physiologically spe-
tion model. This procedure was repeated until all subjects were cific movement). These universal and superfluous sounds typi-
considered as the test subject once. The classification accuracy, cally emanated from bolus movement within the esophagus (i.e.,
however, was calculated considering each swallowing sound as Transmission section), were temporally sporadic, and although
a case. Therefore, for each subject, classification accuracy was disruptive (i.e., high in amplitude), were not as prominent as the
calculated for the number of swallowing sounds that the sub- IDS and FDS signals.
ject contributed. Then, the classification accuracy was averaged Subsequent WDT segmentation revealed a definite corre-
between the subjects. The entire procedure of discriminant anal- spondence between its boundary markings and the aforemen-
ysis was repeated for every texture. tioned manual time transcriptions. For example, Fig. 2 depicts
This preliminary classification detailed the classification rates the demarcation of one swallowing sound signal into the
of each swallow per subject. From each completed analysis, Opening and Transmission sections via WD trajectories in the
a subsequent tally of each predicted group membership was time domain. First and foremost, the prominent IDS signals (or
recorded. A secondary classification, or screening process, was click segments) were always recognized within the swallowing
then employed to assess the overall classification rates for each sound signal as they corresponded with the largest variations
subject. For each texture and each subject, two error rates were or contours in the WD trajectory signals. In addition, extended
calculated: , the chance that a swallow was falsely classified as quiet periods and extraneous noises were also consistently iso-
abnormal; and , the chance that a swallow was falsely classi- lated via WDT signals. Overall, manual time transcriptions and
fied as normal. Each subject had predicted group memberships WDT segmentation resulted in highly correlated swallowing
equal to the number of times discriminant analysis was per- sound identification. In comparison to VFDT reported in [18],
formed; hence, the error rates had to be averaged per subject. the WD algorithm resulted in more accurate segmentation.
Next, the error rates were compared to a preliminary threshold 2) Feature Extraction: After corroboration of manual time
of 50%, which was further investigated to determine whether or transcriptions and WD trajectories, feature extraction and
not a tighter threshold was possible for use. If more than 50% t-tests were performed to identify the most important signal
of the swallowing signals of a subject were predicted as being characteristics; i.e., features that would best differentiate be-
normal, the subject’s swallowing was considered to be normal tween normal and abnormal swallowing sound signals, and
overall. Conversely, if more than 50% of the swallowing sound furthermore, between textures for both normal and abnormal
signals of a subject were predicted as being abnormal, the sub- swallows. Stemming from manual time transcriptions and the
ject’s swallowing was considered to be abnormal overall. After corresponding WDT signals, Time Duration was first examined
2108 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO. 12, DECEMBER 2004

TABLE I
TIME DURATION (SECONDS) OF SWALLOWING SOUND SIGNALS (MEAN 6 STD)

“ ” = significant difference (SD) between “N” = normal and “A” = abnormal mean values (P < 0:05)

(Table I). The durations of Opening, Transmission and Total more damped) the average power. This effect can be heard
sections were significantly longer for abnormal than normal from repeated audition of swallowing sound signals from one
swallowing sounds for both semisolid and thick liquid textures. subject, while changing bolus textures.
No significant differences were observed for thin liquid texture. Finally, normality features were examined. For Gaussian or
For normal data, Opening Time Duration yielded significant normal distributions, skewness and kurtosis are, respectively,
differences between thick and thin liquids textures, whereas, for zero and three, representing data perfectly symmetric in distri-
abnormal data, Transmission and Total Time Duration yielded bution. Considering the results of Skewness, one may conclude
significant differences between semisolid and thick liquid as that the swallowing sounds were slightly skewed toward either
well as between thick and thin liquid textures. left or right of the sample mean (which is zero as the sounds
In terms of Waveform Dimension value, for thick and thin were recorded by a bandpass filter, without a DC offset). How-
liquids, Opening WDT was significantly lower (i.e., less com- ever, since the results of Kurtosis were far above 3, overall, one
plex) for abnormal than normal swallowing sounds. Also, thick may conclude that the normality test was failed in swallowing
liquid Total WDT was significantly lower for abnormal than sound signals. In addition, a subsequent Lilliefors test for good-
normal swallowing sounds. For normal data, Opening WDT ness of fit to a normal distribution for all textures (Opening and
yielded significant differences between thick and thin liquids Transmission sections) revealed the absence of normality in all
and semisolid and thin liquid, whereas Total WDT yielded sig- swallowing sound signals. Nevertheless, while there were no
nificant differences between semisolid and thin liquid. For ab- significant differences in the Skewness feature between normal
normal data, both Opening and Transmission WDT yielded sig- and abnormal swallowing sound signals, Kurtosis showed sig-
nificant differences between semisolid and thick liquid. nificant differences between the two groups. On the whole, Kur-
Magnitude features (of Opening, Transmission, and Total tosis was smaller for abnormal than normal swallowing sound
sections) yielded significantly higher values for abnormal than signals. However, it was variable between the subjects (i.e., high
normal swallowing sounds, for every texture. Overall, abnormal standard deviations). Neither Skewness nor Kurtosis showed
swallowing sounds were louder than normal ones. For normal any significant differences between the textures.
data, Opening Magnitude yielded significant differences be- 3) Feature Reduction and Classification Tech-
tween thick and thin liquids, whereas both Transmission niques: Discriminant analysis was performed with
and Total Magnitude yielded significant differences between each individual group (all cases as training data). As previously
semisolid and thick liquid. For abnormal data, Opening Mag- mentioned, Group T yielded the highest overall accuracy;
nitude yielded significant differences between thick and thin therefore, it was included in all further testing. Subsequently,
liquids and semisolid and thin liquid. Group T was subdivided into Time Duration and Waveform
Of the Frequency features, for thick and thin liquids, Fpeak Dimension features with discriminant analysis performed
was significantly lower for abnormal than normal swallowing with each subset. The Time Duration features (first subset)
sounds. For semisolid, Fmax was significantly higher for ab- resulted in higher accuracy for semisolid and thick liquid
normal than normal swallowing sounds. However, Fmean and textures in comparison with Waveform Dimension features
Fmedian features did not show any significant differences be- (second subset). For thin liquid, the accuracies between subsets
tween either normal/abnormal swallowing sounds or different were comparable. Including Group T in every group, sixteen
textures. new combinations were tested. The combinations ‘TMP’
Average Power in all seven frequency bands yielded signifi- and ‘TMPN’ yielded the highest accuracy: 94.8% (overall),
cant differences for semisolid data, where abnormal swallowing 94.8% (within the normal class), 94.7% (within the abnormal
sounds had significantly higher average power than normal class). However, since the removal of the Group N features
swallowing sounds. In addition, thin liquid data showed did not result in a reduction of classification accuracy, the
significantly higher power for abnormal than TMP combination was considered to be the optimal feature
normal swallowing sounds and conversely, higher power set. Next, the individual features of Groups T, M and P were
for normal than abnormal swallowing sounds. examined using the leave-one-out approach. The eliminated
Between the textures, all seven Power features showed sig- features included Time Duration (of the Total section) as well
nificant differences between semisolid and thick liquid, and as , , and (of
semisolid and thin liquid textures. Overall, a damping effect the Opening section). Based on the results of analysis using
was observed; i.e., the thicker the bolus texture, the lower (or SPSS, Time Duration Total was removed due to its strong
LAZARECK AND MOUSSAVI: CLASSIFICATION OF NORMAL AND DYSPHAGIC SWALLOWS BY ACOUSTICAL MEANS 2109

TABLE II
TRAINING AND TESTING ACCURACY OF CLASSIFICATION FOR SEMISOLID, THICK LIQUID, AND THIN LIQUID TEXTURES (MEAN 6 STD ERROR)

TABLE III
TOTAL TALLY AND AVERAGES OF ERROR RATES OF CLASSIFICATION OF THE TWO NORMAL AND ABNORMAL GROUPS

correlation with Opening and Transmission Time Duration 9/10 normal subjects and all (10/10, 7/7, and 4/4) abnormal sub-
features. This result was expected as the sum of Opening and jects (Table II). The only three misclassified normal cases (two
Transmission Time Duration gives the Total Time Duration. with semisolid, one with thin liquid) stemmed from the two
Also, according to the results of analysis using SPSS, the older adult subjects in the control group, ages 38 and 54.
aforementioned power features were eliminated because they Finally, considering each swallowing sound as a case, two
did not contribute any distinctive information or tangible averages were calculated for the and error rates per subject
characteristics to the classification model. (Table III). First, “Exclusion A” described the averaged error
With the remaining eleven features, further testing did not re- rates excluding subjects who contributed only one swallow for
veal any additional classification improvement; i.e., the classifi- a texture. Second, “Exclusion B” described the averaged error
cation accuracy reached a plateau. Overall, the reduced feature rates excluding the two older adult control subjects and any sub-
set contained the following features: Time Duration (Opening, ject who contributed only one swallow for a texture (one control
Transmission), WD (Opening, Transmission, Total), Magnitude and three patients). Exclusion A and B averages ranged from
(Opening, Transmission, Total), and , , 4.0% to 22.2% and 0.5% to 17.2%, respectively. With the ex-
. All of these features, with the exception of WD clusion of the adult subjects, Exclusion B averages indicated a
(of the Transmission section), yielded significant differences be- drastic reduction of error rates.
tween normal and abnormal swallowing sound signals for at
least one texture. IV. DISCUSSION
Discriminant analysis was performed (with the eleven fea- Time, frequency, audio and video domain analyses were re-
tures identified in the preceding paragraph) for preliminary clas- quired to fully identify, extract and compare swallowing sound
sification of normal and abnormal swallowing sound signals for signals. First, identification of individual swallowing sound sig-
each texture. Considering each swallowing sound as one case, nals depended upon the detection of IDS and FDS signals (in
overall training and testing classification accuracies were aver- both the time and frequency domains). These sounds were con-
aged between the subjects (Table II). For semisolid, thick, and sistently sharp, crackling and at times brilliant, as described by
thin liquids, the training accuracies were 82.0%, 94.9%, and Qureshi et al. [11]. On the average, IDS was 33 ms in dura-
94.4% for normals and 89.8%, 95.1%, and 87.5% for abnor- tion in comparison with 10 to 30 ms reported by Vice et al.
mals, respectively; the testing accuracies were 58.0%, 76.6%, [10]. The difference may be due to different video digitization
and 84.6% for normals and 84.8%, 88.1%, and 63.3% for ab- rates. The IDS signals were initiated by a prominent excursion
normals, respectively. Secondary classification or screening in- with a rapid rise, graduating into activity of lower amplitude and
creased the overall classification accuracy significantly for all lower frequency of repetition; on the other hand, the FDS sig-
textures, as it yielded correct classifications for 7/9, 10/10, and nals were less prominent and more variable in amplitude and
2110 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO. 12, DECEMBER 2004

morphology. Physically (in the audio and video domains), the GERD group demonstrated longer time durations than their
IDS corresponded to the subsequent opening or relaxing of the control group counterparts. In addition, all of the remaining
crico-pharyngeus sphincter and shifting of the bolus into the recorded events, such as laryngeal elevation and closure, were
pharynx (once the epiglottis gradually moved downward, which also longer in duration in the GERD than control group, but
took up to 100 ms). After the explosive shift, deglutition con- not significantly so [30]. Also within Group T were Waveform
tinued until the epiglottis was returned to its normal position Dimension features, which revealed significantly lower com-
(which occurred at a much faster rate than its initial downward plexity in abnormal than normal swallowing sound signals.
movement; i.e., was at most about 33 ms or one frame) and the This finding was interesting as WDT values may be potential
airways were reopened. These final movements corresponded to indicators of specific deficiencies within the swallowing mech-
the FDS (although they were not always present). Overall, con- anism. When an aspiration event occurs, an error in the normal
sistent identification of IDS and FDS were congruent with the swallow sequence transpires. This error may be due to a number
findings of Vice et al. and Kahrilas et al. [12], [26]. of problems, including a few milliseconds of delay with the
With the success of WDT as a signal segmentation tool, airway protective features, a mistiming of the breath-swallow
characteristic Opening and Transmission sections were ob- coordination, or a poor clearance of the swallowed bolus. The
served and defined. Subsequently, features were calculated neurological impairments that underlie these problems in the
with various degrees of significance. In general, a high degree swallowing mechanism are numerous. Regardless of the cause
of overlap was observed in all 24 calculated features between of swallow difficulty, the results of this study suggest that
normal and abnormal swallowing signal characteristics (i.e., abnormal swallowing sounds are lacking some characteristic
large standard deviations values were observed). This result was component compared to normal swallowing sounds, and that
entirely expected, as the abnormal swallowing sound signals this is primarily reflected in the waveform dimension results.
were actually marginally normal (or the successful, nonaspi- Of interest were the Frequency and Power features of Groups
rated, swallows generated by neurologically impaired subjects). F and P. Specifically, we noted an Fpeak of 500–900 Hz, which
Thus, ensuing classification to detect abnormal swallowing has an approximate agreement with the “wet” or water swallow
sound signals regardless of their marginal normalcy must reportedly reaching 400–600 Hz and 1000 Hz (respectively for
overcome this commonality factor (i.e., shared or overlapping the beginning and ending of a swallowing sound signal) [31].
characteristics between normal and abnormal groups). Also, averaged Fpeak, Fmax, and Fmean were 800, 2700, and
From the Opening and Transmission sections, and of the 1250 Hz in comparison to the 300, 700, and 350 Hz reported
five major feature groups (T, M, F, P, and N), Group T was the by [32] for breath sound signals, indicating the dominating
most important, yielding the highest individual classification high-frequency components of a swallow within the breath and
accuracy. In addition, and within Group T, the Time Dura- swallowing sound signal. Significant differences were observed
tion features were deemed to be the most important (as they between all Power features for semisolid data. This result
yielded higher classification accuracies than the Waveform provides the quintessential basis for this study; i.e., abnormal
Dimension features). Within Time Duration, of Group T, the swallowing sound signals are distinct (in their average values)
most notable feature was Total Time Duration, although it and yet parallel (in their standard deviations) versions of normal
was excluded from the reduced feature set due to its strong swallowing sound signals.
correlation with Time Duration of the Opening and Transmis- The Normality features of Group N, Skewness and Kurtosis,
sion sections. In deglutition studies, Total Time Duration is a indicated that both normal and abnormal swallowing sound
frequently calculated feature. For our normal subjects, Total signal segments lack normality. One recent study described
Time Duration feature values (averaged across ten subjects, 94 aspirated swallowing sound signals from children as non-
ages 3–54 years) ranged from 790 to 880 ms, which differed stationary with departures from normality [17]; however, it
slightly from total deglutition apnea duration values reported was not clear which exact section of the swallowing sound
by [26]–[29]. Kahrilas et al. reported a duration of 870 ms was studied. Also, given the fact that physicians normally halt
from upper esophageal sphincter opening to airway reopening, videofluoroscopy after observing one or two aspirations, it was
with subsequent swallow in between, for one normal subject not clear if the data included the swallows in which aspiration
(male, age 35 years) [26]. The other researchers [27]–[29] had occurred or swallows of subjects at the risk of aspiration.
respectively reported apnea durations of 600, 800, and 750 ms Nevertheless, their finding of the absence of normality in
in their bolus viscosity and volume studies (with respectively swallowing sound signals is congruent with our results.
92, 15, and 12 healthy normal adult participants). Another Of the 24 features extracted in this study, 11 were selected as
notable result was that all of the Time Duration features were the optimal feature set with the majority (all but one) yielding
significantly different between normal/abnormal swallows for significant differences between normal and abnormal swal-
semisolid and thick liquid textures; i.e., abnormal swallowing lowing sound signals for at least one bolus texture. The feature
sound signals were significantly longer than normal swallowing exception was WD of the Transmission section. We speculate
sound signals. This result was in accordance with those found that the extraneous bolus transmission sounds (“gulp” and
in a recent pilot study comparing healthy controls and subjects so on) frequently emanating from the Transmission section
with gastroesophageal reflux disease (GERD) [30]. For several interfered with the section’s true characteristic sound.
pharyngeal swallow events, such as pharyngeal response and With the optimal feature set, discriminant analysis proved
transit time, base of tongue movement to the posterior pharyn- successful as a classification tool. The classification was
geal wall, hyoid movement, and cricopharyngeal opening, the achieved in two stages. In Stage 1, discriminant analysis was
LAZARECK AND MOUSSAVI: CLASSIFICATION OF NORMAL AND DYSPHAGIC SWALLOWS BY ACOUSTICAL MEANS 2111

used to classify each swallow as either normal or abnormal. In Moreover, in another recent study, Hiss et al. [34] tested 60
Stage 2, screening was used to classify each subject as either adults (20 per group; ages 20–39, 40–59, and 60–83 years) and
normal or abnormal. As a rule of thumb, a subject was classified reported an increase in swallowing apnea durations with age. In
as abnormal (at risk) if more than 50% of his/her swallows addition, oro-pharyngeal and hypo-pharyngeal pressures were
were classified as abnormal and vice versa. However, the results reportedly higher in amplitude and duration for older individ-
showed that the same second stage classification accuracy could uals in comparison with younger adults. Thus, in aging individ-
be achieved with a threshold of 70%, with the exception of one uals, changes in swallowing sound signal are directly correlated
case (an adult control). The secondary screening significantly to age-related changes in anatomy and physiology [34]. This
improved all classification accuracies, for all textures. The most may explain the reason for our two older control subjects being
important results stemmed from thick liquid, as this texture misclassified.
contained the largest number of swallowing sound signals. The proposed methods in this study hold promise as tools
Final classification accuracies for thick liquid were 100.0% for in the evaluation of swallowing dysfunction; and more im-
both normal and abnormal subjects. portantly, in understanding the complexity of the swallowing
Next in importance was thin liquid, as the texture posed the process in individuals with and without neurological impair-
highest risk for our subjects (i.e., the subjects aspirated more ment. Stemming from the results, long-term objectives include
frequently with the thin liquid than with the other two textures). viable, noninvasive diagnosis of subjects (normal and abnormal)
However, in subjects with significant weakness, thick liquid based on the analysis of externally recorded swallowing sound
may pose a greater risk than thin liquid. Thus, the importance signals. The results of this study pave the way toward reducing
of texture is relative to the conditions of the subjects. For the need for videofluoroscopic swallow studies and assisting in
thin liquid, final classification accuracies were 90.0% (9 of the overall clinical assessment of swallowing sound signals.
10 subjects) and 100% for normal and abnormal subjects, re-
spectively. Of the ten normal subjects tested, one subject—the ACKNOWLEDGMENT
eldest normal subject (age 54 years)—was misclassified; i.e.,
he was classified as at the risk of aspiration. As a result of this L. J. Lazareck and Z. Moussavi sincerely thank Dr. G. Rempel
interesting finding, we postulate that age increases a normal for her review, input, and assistance in data collection.
subject’s likelihood to experience some degree of swallowing
dysphagia (for example, recurring aspiration). According to the REFERENCES
error rates averaged in Table III, the inclusion of two older adult [1] J. A. Logemann, Manual for the Videoflourographic Study of Swal-
lowing, 2nd ed. Boston, MA: College-Hill Press, 1986.
subjects as controls contributed to higher misclassification [2] N. P. Reddy, B. R. Costarella, R. C. Grotz, and E. P. Canilang, “Biome-
percentages for all textures. Without these two subjects, the chanical measurements to characterize the oral phase of dysphagia,”
overall error rates were extremely low. IEEE Trans. Biomed. Eng., vol. 37, pp. 392–397, Apr. 1990.
[3] W. G. Selley, F. C. Flack, R. E. Ellis, and W. A. Brooks, “The exeter
Last in importance was semisolid, as the texture posed the dysphagia assessment technique,” Dysphagia, vol. 4, pp. 227–235, 1990.
lowest risk for our subjects (i.e., the subjects aspirated less fre- [4] P. L. Mirrett, J. E. Riski, J. Glascott, and V. Johnson, “Videofluroscopic
quently with semisolid consistency than with the other two tex- assessment of dysphagia in children with severe cerebral palsy,” Dys-
phagia, vol. 8, pp. 209–214, 1993.
tures). The semisolid texture, the safest texture to consume, also [5] N. P. Reddy, E. P. Canilang, R. C. Grotz, M. B. Rane, J. Casterline,
had the lowest error rate. For semisolid, the final classification and B. R. Costarella, “Biomechanical quantification for assessment and
accuracies were 77.8% (7 of 9 subjects) and 100.0% for normal diagnosis of dysphagia,” IEEE Eng. Med. Biol. Mag., vol. 7, pp. 16–20,
Sept. 1988.
and abnormal subjects, respectively. Again, the two misclassi- [6] H. Firmin, S. Reilly, and A. Fourcin, “Non-invasive monitoring of re-
fied subjects were the two older controls (ages 38 and 54 years). flexive swallowing,” Speech, Hearing, Language, vol. 10, pp. 305–312,
1997.
Of interest were the results of Selley et al., where four groups [7] R. J. Last, ANATOMY Regional and Applied. London, U.K.: Churchill
of healthy subjects were tested to establish the normal swal- Livingstone, 1978, ch. 6, pp. 405–430.
lowing behavior and mechanism [3]. The groups, all without [8] M. J. Casa, K. A. McPerson, and D. J. Kenny, “Durational aspects of
oral swallowing neurologically normal children and children with cere-
neurological diseases or swallowing problems, included 25 chil- bral palsy: an ultrasound investigation,” Dysphagia, vol. 8, pp. 359–367,
dren (ages 2–11 years), 23 school pupils (ages 11–18 years), 1995.
15 university students and staff (ages 18–30 years), and 18 el- [9] J. B. Palmer, K. V. Kuhlemeier, D. C. Tippett, and C. Lynch, “A pro-
tocol for the videofluorographic swallowing study,” Dysphagia, vol. 8,
derly adults (ages 60–90 years). Selley et al. noted that the com- pp. 209–214, 1993.
plex feeding respiratory pattern in a given adult repeats in 95% [10] F. L. Vice, J. M. Heinz, G. Giuriati, M. Hood, and J. F. Bosma, “Cervical
of swallowing events, and maturation of the feeding respira- auscultation of suckle feeding in newborn infants,” Developmental Med.
Child Neurol., vol. 32, pp. 760–768, 1990.
tory pattern emerges in the teenage years and becomes remark- [11] S. Hamlet, D. G. Penney, and J. Formolo, “Stethoscope acoustics and
ably consistent thereafter [3]. However, they focused mainly on cervical auscultation of swallowing,” Dysphagia, vol. 9, pp. 63–68,
respiratory patterns (i.e., percentage of expirations immediately 1994.
[12] F. L. Vice, O. Bamford, J. M. Heinz, and J. F. Bosma, “Correlation of
following the swallowing sound signal) in addition to time du- cervical auscultation with physiological recording during suckle-feeding
ration. On the other hand are the speculations of Fillion and Kil- in newborn infants,” Developmental Med. Child Neurol., vol. 37, pp.
167–179, 1995.
cast [33], who conjectured that dysphagia, in the form of poor [13] S. C. Tarrant, R. E. Ellis, F. C. Flack, and W. G. Selley, “Comparative
oral sensory integrity and consequently bolus formation and review of techniques for recording respiratory events at rest and during
movement, stems from age-related changes in muscle strength deglutition,” Dysphagia, vol. 12, pp. 24–38, 1997.
[14] Z. K. Moussavi, M. T. Leopando, and G. R. Rempel, “Automated detec-
and a generalized decrease in epithelial sensitivity (caused by a tion of respiratory phases by acoustical means,” in Proc. 20th Annu. Int.
decrease in the number of nerve endings with increased age). Conf. IEEE Eng. Med. Biol. Soc., vol. 20, 1998, pp. 21–24.
2112 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 51, NO. 12, DECEMBER 2004

[15] I. H. Gewolb, J. F. Bosma, V. L. Taciak, and F. L. Vice, “Abnormal [33] L. Fillion and D. Kilcast, “Food texture and eating difficulties in the
developmental patterns of suck and swallow rhythms during feeding elderly,” Food Ind. J., vol. 4, no. 1, pp. 27–33, 2001.
in preterm infants with bronchopulmonary dysplasia,” Developmental [34] S. G. Hiss, K. Treole, and A. Stuart, “Effects of age, gender, bolus
Med. Child Neurol., vol. 43, no. 7, pp. 454–459, July 2001. volume, and trial on swallowing apnea duration and swallow/respiratory
[16] M. A. Qureshi, F. L. Vice, V. L. Taciak, J. F. Bosma, and I. H. Gewolb, phase relationships of normal adults,” Dysphagia, vol. 16, pp. 128–135,
“Changes in rhythmic suckle feeding patterns in term infants in the first 2001.
month of life,” Developmental Med. Child Neurol., vol. 44, no. 1, pp.
34–39, January 2002.
[17] T. Chau, M. Casas, G. Berall, and D. Kenny, “Testing the stationarity
and normality of paediatric aspiration signals,” in Proc. IEEE Eng. Med.
Biol. Soc. Conf. ’02, vol. 1, 2002, pp. 186–187. Lisa J. Lazareck (M’98) was born in Winnipeg, MB
[18] L. J. Lazareck and Z. K. Moussavi, “Adaptive swallowing sound seg- Canada, in 1979. She received the B.Sc. and M.Sc.
mentation by variance dimension,” in Proc. EMBEC 02 Eur. Medical degrees in electrical engineering from the University
and Biological Engineering Conf., vol. 1, 2002, pp. 492–493. of Manitoba, Winnipeg, MB, in 2001 and 2003
[19] I. H. Gewolb, J. F. Bosma, E. W. Reynolds, and F. L. Vice, “Integra- respectively. Her M.Sc. degree thesis focus included
tion of suck and swallow rhythms during feeding in preterm infants with classification of normal and dysphagic swallowing
and without bronchopulmonary dysplasia,” Developmental Med. Child sound signals by acoustical means. Currently, she is
Neurol., vol. 45, no. 5, pp. 344–348, May 2003. working towards the D.Phil degree at the University
[20] W. Kinsner, “Batch and Real-Time Computation of a Fractal Dimension of Oxford, Department of Engineering Science in
Based on Variance of a Time Series,” Dept. Elect. Comput. Eng., Univ. electrical engineering.
Manitoba, Winnipeg, MB, Canada, Tech. Rep., DEL94-6, 1994. She is a member of the Signal Processing and
[21] H.-O. Peitgen, H. Jürgens, and D. Saupe, Chaos and Fractals New Fron- Neural Network Engineering Research Group and St. John’s College, Oxford,
tiers of Science. New York: Springer-Verlag, 1992, ch. 9, pp. 457–503. U.K. Her current research interests include the analysis and associated pattern
[22] M. J. Katz, “Fractals and the analysis of waveforms,” Comput. Biol. recognition of electroencephalogram (EEG) signals as they pertain to sleep
Med., vol. 18, no. 3, pp. 145–156, 1998. disorders, specifically obstructive sleep apnoea.
[23] R. Esteller, G. Vachtsevanos, J. Echauz, and B. Litt, “A comparison of Ms Lazareck is a member of IEEE, WIE (Women In Engineering), and EMBS
waveform fractal dimension algorithms,” IEEE Trans. Circuits Syst-I, (Engineering in Medicine and Biology Society)—where she serves as the EMBS
vol. 48, no. 2, pp. 177–183, Feb. 2001. Student Representative for the 2004–2006 term. In 2000 and 2002, she received
[24] C. M. Bishop, Neural Networks for Pattern Recognition. New York: a NSERC Undergraduate Research Award and a NSERC Post Graduate Schol-
Oxford Univ. Press, 1995, ch. 1, pp. 1–32. arship respectively where she worked on her B.Sc. and M.Sc. thesis projects.
[25] M. J. Norušis, SPSS Professional Statistics 6.1. Chicago, IL: SPSS
Inc., 1994, ch. 1, pp. 1–45.
[26] P. J. Kahrilas, S. Lin, J. Chen, and J. A. Logemann, “Three-dimensional
modeling of the oropharynx during swallowing,” Radiology, vol. 194,
no. 2, pp. 575–579, 1995. Zahra M. K. Moussavi (M’98) received the B.Sc.
[27] M. Boiron, P. Rouleau, and E. H. Metman, “Exploration of pharyngeal from Sharif University of Technology, Tehran, Iran,
swallowing by audio signal recording,” Dysphagia, vol. 12, pp. 86–92, in 1987, the M.Sc. degree from the University of Cal-
1997. gary, Calgary, AB, Canada, in 1993, and the Ph.D.
[28] M. S. Klahn and A. L. Perlman, “Temporal and durational patterns as- degree from University of Manitoba, Winnipeg, MB,
sociating respiration and swallowing,” Dysphagia, vol. 14, pp. 131–138, Canada, in 1997, all in electrical engineering.
1999. She then joined the respiratory research group
[29] A. L. Perlman, S. L. Ettema, and J. Barkmeier, “Respiratory and acoustic of the Winnipeg Children’s Hospital and worked as
signals associated with bolus passage during swallowing,” Dysphagia, a Research Associate for one and one-half years.
vol. 15, pp. 89–94, 2000. In 1999, she joined the Biomedical Engineering
[30] D. A. Mendell and J. A. Logemann, “A retrospective analysis of the pha- Department of Johns Hopkins University, Baltimore,
ryngeal swallow in patients with a clinical diagnosis of GERD compared MD, and worked there as a Postdoctoral Fellow for one year. Following that,
with normal controls: a pilot study,” Dysphagia, vol. 17, pp. 220–226, she joined the University of Manitoba, Department of Electrical and Computer
2002. Engineering as a faculty member, where she is currently an Assistant Professor.
[31] R. C. Mackowiak, H. S. Brenman, and M. H. F. Friedman, “Acoustic She is also an Adjunct Professor at the TR lab of Winnipeg. Her current
profile of deglutition,” in Proc. Soc. Experimental Biology and Medicine, research includes respiratory and swallowing sound analysis, postural control
vol. 125, 1967, pp. 1149–1152. and balance, rehabilitation, and human motor learning.
[32] M. J. Mussell and Y. Miyamoto, “Comparison of normal respiratory Dr. Moussavi is a member of IEEE Engineering in Medicine and Biology
sounds recorded from the chest and trachea at various respiratory air Society (EMBS), and the CMBES and International Lung Sound (ILSA) asso-
flow levels,” Frontiers Med. Biol. Eng., vol. 4, no. 2, pp. 73–85, 1992. ciations. She is also currently the EMBS Chapter Chair, Winnipeg Section.

You might also like