Age-related changes in human vocal tract configurations and the effects on speakers’ vowel formant frequencies: a pilot study
An Xue¹, Jack Jiang², Emily Lin², Raymond Glassenberg² and Peter B. Mueller³
¹Ohio University, Athens, OH, ²Northwestern University Medical School, Chicago, IL, ³Kent State University, Kent,
Log Phon Vocol 1998; 24: 132–137

The study aims to: (i) make accurate measurements of age-related changes in female speakers’ vocal tract configurations
with acoustic reflection technique (ART); (ii) obtain acoustic information of vowel formant frequency changes as a function
of aging; and (iii) test the hypothesis that there are age-related vocal tract dimensional changes and concomitant decreases
in all the vowel formant frequencies as people age. Preliminary findings indicate that older female subjects tend to have a
more expanded pharyngeal lumen, but not a longer vocal tract, than their younger counterparts, and that formant frequencies
do not uniformly decrease as a function of aging. The study highlights the importance of larger-scale measurements
of age-related vocal tract configuration changes and the necessity of developing new acoustic models that will delineate all
the concomitant formant frequency and other acoustic changes as people age.
Key words: acoustic reflection technique, aging, speech.
(Steve) An Xue, Ph.D., School of Hearing & Speech Sciences, Ohio University, Lindley Hall 201, Athens, OH 45706, USA.
Tel: +1 740 5930171; Fax: +1 740 5930287; E-mail:

INTRODUCTION

Previous research dealing with age-related anatomical and physiological changes of speech mechanisms mainly
focuses on the following 4 systems: respiratory system (14, 15, 41), laryngeal system (10, 13, 17, 19–21, 23,
32, 33, 38), supraglottal system (1, 2, 5, 18, 24, 25, 28, 29, 39, 45) and nervous system (22, 30, 31, 34).
Many researchers also concentrate on the acoustic and functional changes of the elderly speakers’ speech, and
attempt to develop reasonable predictions with regard to the effects of anatomical and physiological changes
on the speech of the aged (3, 11, 16, 26, 35, 37, 43, 44).

Among the 4 systems mentioned above, senile anatomical and physiological changes in the supraglottal system
and the effects of these changes on the speech of the aged may be the least studied. Weismer & Liss (42)
indicated that, to date, no extant theories of speech production have addressed aging effects, and “this is
probably because of the absence of relevant data especially on the supraglottal articulatory function in the
aged” (p. 221). This slow progress may also be directly related to the lack of efficient investigative devices
applicable in this particular area. For example, several studies speculated on the senile vocal tract changes
based on their acoustic research findings. Linville and Fisher (27), employing as subjects 75 women at three
age levels (25–35, 45–55, 70–80 years), intended to compare acoustic data of phonatory stability and resonance
features in women of different ages. Resonance peaks (F1 and F2) were determined for both the normally
phonated and whispered vowel /ae/. Both formant measures showed a progressive decrease in frequency from young
adulthood to old age for the vowel, and there was “a significant lowering of the frequency of the first
formant (F1) with advanced age” (p. 324). They suggested that this decrease reflected either age-related
changes in vocal tract dimensions or the positioning of speech structures. A decrease in vowel formant
frequencies with age was also reported by Endres et al. (8), who speculated that these trends were due to
enlargement of the vocal tract with age. However, contradictory findings were also reported. Rastatter and
Jacques (36) found that for the elderly speakers the average F1 frequency levels for the front and mid vowels
were significantly higher as compared to the younger subjects. They also observed that for the back vowels
(e.g. /a/) F1 frequency levels were significantly lower and F2 frequency levels tended to be higher for the
older speakers. Based on these findings, they speculated that articulatory positioning, and not alterations in
the dimensions of the pharynx, may be held accountable for such differences. It is very important to note that
none of the above mentioned studies had made any objective measurements of the dimensional changes of the
vocal tracts in their subjects (probably due to the limitations of the available measurement devices). Thus,
they all failed to make any valid and convincing correlation between their acoustic findings and their
speculated changes of human vocal tract as a function of aging. A logical solution to the contradictory
speculations of these previous studies is to make accurate measurements of vocal tract dimensions of both
younger and older speakers, and to obtain the formant frequency measurements from the same speakers.

The development of the acoustic reflection technique (ART), and its extensive applications in medical research
and treatment, has provided a non-invasive, precise and practical means of measuring the dimensions of human
oral and pharyngeal lumina. ART was developed as a medical diagnostic device used for the objective assessment
of the upper respiratory airway. It uses reflected acoustic signals to provide a graphical representation, the
area–distance curve, of the supraglottal lumen. Its essential clinical and physiological applications include
demonstration of: (i) structural and functional abnormalities of the oral/pharyngeal lumina and glottis;
(ii) impact of tonsils on the upper airway; (iii) risk factors for obstructive sleep apnea; (iv) airway
response to therapeutic intervention; (v) screening indices for prediction of intubation failure; and
(vi) site and degree of airway obstructions. Current literature on the application of ART consistently
supports its reliability for accurately measuring human oral/pharyngeal lumina (4, 6, 7, 9,

This study aims to: (i) make accurate measurements of age-related changes in female speakers’ vocal tract
configurations with ART; (ii) obtain acoustic information of vowel formant frequency changes as a function of
age; and (iii) test the hypotheses that there are no significant differences in the dimensions of the female
speakers’ vocal tract as they age, and there is no significant difference in the vowel formant frequencies as
people age (8, 27).

METHODS AND MATERIALS

Two groups of female subjects were involved in the study. The younger age group consisted of 10 subjects
(mean age = 40; range = 33–48 years; SD = 4.93) and the older age group consisted of 12 subjects (mean
age = 56; range = 50–66 years; SD = 5.78). All subjects were free of any language/speech or any oral
peripheral disorders at the time of the experiment. The average height and weight for the younger group were
164 cm (SD = 4.01) and 69.35 kg (SD = 13.48), and for the older group were 163.98 cm (SD = 3.93) and 83.92 kg
(SD = 24.65). Analysis of variance (ANOVA) showed that the 2 groups of subjects did not differ significantly
(at 0.05 significance level) in terms of height and weight.

Data collection

The subjects were seated in a double-walled room to perform the experimental task. The upper airway was
assessed using a prototype Acoustic Pharyngometer two-microphone imaging system (E. Benson Hood Laboratories,
Pembroke, MA). The device consisted of 2 microphones and 1 sound-generator mounted on a 30-cm-long, 1.89-cm
inner diameter wave tube, and a microcomputer equipped with digital-to-analog and analog-to-digital converters
and software for data processing (see Fig. 1). The instrument underwent a self-calibration procedure prior to
recording each subject. During the actual recording, the proprietary data acquisition algorithm collected 10
consecutive incident-reflected wave combinations. A moving window maintained the 10 most recent curves in
computer memory.

Fig. 1. Schematic of the acoustic pharyngometer 2-microphone imaging system.

Fig. 2. Characteristic area–distance curves measured for a subject. Landmarks are: a, hard palate; b, soft
palate; c, uvula; d, posterior oropharynx; e, hypopharynx; f, glottis; g, subglottic expansion; O, oral
cavity; P, pharyngeal

Subjects inhaled room air by mouth via a respiratory mouthpiece without vocalizing through the wave tube. Data
were acquired with the neck held in neutral position. Individual acoustic reflection data files were first
filtered manually to smooth artifacts and eliminate divergent points caused by reduced resolution at large
distances. After data filtration, the
cumulative upper airway volume was calculated as the area under the airway area–distance curves using
Simpson’s one-third rule for numerical integration. The measured area–distance curves (see Fig. 2) and the
resultant volume–distance relationships (see Fig. 3) were divided into two sections: an oral region extending
from the incisors to the anterior margin of the soft palate, and a pharyngeal region extending from the soft
palate to the glottis opening. The demarcation of these two regions and their lengths were evaluated
individually for each curve obtained. The level of the mouthpiece held by the incisors (zero distance on
Figs. 2 and 3) was readily identified as the point at which the area was 1.8 cm². The anterior margin of the
soft palate corresponded to the deflection point occurring at approximately 6 cm distance, labelled b in
Fig. 2. The level of the glottis was at the local minimum in area f (Fig. 2) preceding the distinct subglottic
area expansion (area g, Fig. 2). The glottic landmark position, located approximately 16 cm from the incisors,
has been well-established (12).

Fig. 3. Characteristic cumulative airway volume–distance curves for a subject. Volume data correspond to area
data shown in Fig. 2.

For acoustic recordings, the output of the microphone amplifier was connected to the audio input of a digital
audio processor (SONY PCM-501ES). The output of the processor was wired to the video input of a video cassette
recorder (Panasonic AG-6300MD) and recorded onto a videotape (SONY VHS-60). For signal retrieval, the video
output of the video cassette recorder was fed to the digital audio processor. The audio output of the
processor was connected to a personal computer (AST Advantage Adventure 8090P) via a 12-bit A/D converter
(National Instruments AT-MIO-E-2). The investigators located the target signal by listening to the acoustic
signal from the headphone output of the recorder. All subjects were instructed to produce sustained phonation
of /a/ at their comfort levels of pitch and loudness for at least 3 sec. A 1-sec long steady segment was
randomly selected from the middle portion of the sustained vowel phonation and then digitized at a sampling
rate of 20 kHz. The digitized signals were processed with an automated algorithm developed in the Laryngeal
Physiology Laboratory at the Northwestern University Medical School. The program was designed to perform a
linear prediction coding (LPC) analysis on the acoustic signals and to detect spectral peaks from the LPC
spectrum. The parameter settings for the LPC analysis were: sampling rate = 20 kHz; filter order = 25; total
data length = 16384 sampling points; no pre-emphasis. The frequency and amplitude of the first 3 formants were
obtained.

RESULTS

Table 1 summarizes the means, SDs, minimum and maximum values for length of vocal tract (in cm), pharyngeal
volume (in ml) and the first 3 formant frequencies (in Hz) of the younger and older subjects. ANOVA was
applied to the 2 age groups with the first 3 formant frequencies (in Hz), length of vocal tract (in cm) and
pharyngeal volume (in ml) as the dependent variables. Descriptive statistics showed that F1 (668.34 Hz) and F3
(2974.28 Hz) of the older group were lower than those of the younger group (897.83 Hz and 3191.28 Hz,
respectively). ANOVA results showed that F1 of the older group was significantly lower (df = 1, 20; F = 4.4;
p < 0.05) than that of the younger group. Results also showed that, unlike F1 and F3, F2 of the older group
(1760.73 Hz) was 128.89 Hz higher than that of their younger counterparts (1631.84 Hz) (see Fig. 4), but this
difference was not statistically significant. The 2 groups shared very similar total vocal tract length (i.e.
18.15 cm for the younger group and 17.58 cm for the older group, with a group difference of 0.57 cm). The
older group tended to have larger pharyngeal volume (31.06 ml) than the younger group (27.12 ml), with a group
difference of 3.94 ml. Neither of these differences was statistically significant.
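The volume calculation described under Data collection can be illustrated with a short sketch. It applies the composite Simpson's one-third rule to an area–distance curve sampled at equal distance steps. The function name, the 0.5-cm step and the sample areas are illustrative assumptions; they are not taken from the pharyngometer software or from the subjects' data.

```python
def simpson_volume(areas_cm2, step_cm):
    """Composite Simpson's 1/3 rule: integrate area (cm^2) over distance (cm).

    Returns the volume in ml, since 1 cm^2 x 1 cm = 1 ml.
    """
    n = len(areas_cm2) - 1  # number of intervals
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's 1/3 rule needs an even number of intervals")
    total = areas_cm2[0] + areas_cm2[-1]
    total += 4 * sum(areas_cm2[1:-1:2])  # odd-indexed interior samples
    total += 2 * sum(areas_cm2[2:-1:2])  # even-indexed interior samples
    return step_cm * total / 3.0


# Hypothetical area-distance samples taken every 0.5 cm (not measured data)
areas = [1.8, 2.4, 3.1, 3.6, 3.0, 2.2, 1.5, 2.0, 2.8]
volume_ml = simpson_volume(areas, 0.5)
print(f"volume = {volume_ml:.2f} ml")
```

The rule requires an even number of intervals, so a curve with an even number of samples would have to be trimmed or its last interval handled by another rule. Integrating only the samples between the soft palate and glottis landmarks would give the pharyngeal portion of the volume.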
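The laboratory's formant program is not reproduced in the text, so the following is only a minimal sketch of the kind of analysis described: an LPC analysis with the stated settings (20 kHz, order 25, 16384 samples, no pre-emphasis) followed by peak picking on the LPC spectrum. The autocorrelation method with a Levinson-Durbin recursion, the absence of a window, the 2048-point frequency grid and the synthetic test signal (the impulse response of three resonators at 800, 1200 and 2500 Hz with 80-Hz bandwidths) are all assumptions made for illustration.

```python
import cmath
import math

FS = 20000         # sampling rate in Hz (setting stated in the text)
ORDER = 25         # LPC filter order (setting stated in the text)
N_SAMPLES = 16384  # analysis length in samples (setting stated in the text)


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no window, no pre-emphasis)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., ap] of A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err          # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k      # updated prediction error
    return a


def lpc_spectrum(a, fs, n_points=2048):
    """Magnitude of 1/A on a uniform frequency grid from 0 Hz to fs/2."""
    freqs, mags = [], []
    for j in range(n_points + 1):
        w = math.pi * j / n_points
        denom = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        freqs.append(fs * j / (2.0 * n_points))
        mags.append(1.0 / abs(denom))
    return freqs, mags


def lowest_peaks(freqs, mags, count=3):
    """Frequencies of the lowest `count` local maxima of the spectrum."""
    peaks = [freqs[j] for j in range(1, len(mags) - 1)
             if mags[j - 1] < mags[j] >= mags[j + 1]]
    return peaks[:count]


def synthetic_vowel(resonances_hz, bandwidth_hz, fs, n):
    """Hypothetical test signal: impulse response of cascaded two-pole resonators."""
    a = [1.0]
    r = math.exp(-math.pi * bandwidth_hz / fs)  # pole radius for the bandwidth
    for f in resonances_hz:
        sec = [1.0, -2.0 * r * math.cos(2.0 * math.pi * f / fs), r * r]
        # multiply the running denominator polynomial by this section
        a = [sum(a[i] * sec[k - i] for i in range(len(a)) if 0 <= k - i < 3)
             for k in range(len(a) + 2)]
    y = []
    for t in range(n):
        acc = 1.0 if t == 0 else 0.0            # unit impulse input
        for i in range(1, min(t, len(a) - 1) + 1):
            acc -= a[i] * y[t - i]
        y.append(acc)
    return y


signal = synthetic_vowel([800.0, 1200.0, 2500.0], 80.0, FS, N_SAMPLES)
coeffs = levinson_durbin(autocorrelation(signal, ORDER), ORDER)
freqs, mags = lpc_spectrum(coeffs, FS)
f1, f2, f3 = lowest_peaks(freqs, mags)
print(f"F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz, F3 = {f3:.0f} Hz")
```

On recorded speech, simply taking the lowest three local maxima can pick up spurious peaks, so a practical analysis would normally add checks such as a minimum peak prominence or a plausible frequency range for each formant.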
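As a plausibility check on the reported F1 result, the F ratio of a one-way ANOVA can be recomputed from the group sizes, means and SDs given in the subject description and Table 1. The sketch assumes a univariate one-way ANOVA and that the tabled SDs are sample SDs (n − 1 denominator); neither is stated explicitly in the text, and the function name is illustrative.

```python
def one_way_anova_from_summary(groups):
    """F ratio from summary statistics.

    groups: list of (n, mean, sd) tuples, sd being the sample standard deviation.
    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# F1 of /a/: younger group (n = 10) and older group (n = 12), values from Table 1
f1_groups = [(10, 897.83, 136.76), (12, 668.34, 323.10)]
f_value, df_b, df_w = one_way_anova_from_summary(f1_groups)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")  # prints "F(1, 20) = 4.36"
```

Under these assumptions the ratio is about 4.36 on (1, 20) degrees of freedom, which agrees with the reported F = 4.4 and lies just above the 0.05 critical value of about 4.35 for F(1, 20).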
Table 1. Descriptive statistic summary of the means, SDs, minimal and maximum values for length of vocal tract
(in cm), pharyngeal volume (in ml) and the first 3 formants (in Hz) of the younger and older subjects

                                    Younger subjects     Older subjects
Vocal tract length mean (in cm)     18.15                17.58
  SD                                1.27                 1.26
  Range                             16.00–19.50          15.50–19.50
Pharyngeal volume mean (in ml)      27.12                31.06
  SD                                5.56                 8.35
  Range                             17.80–36.10          22.20–53.30
Formant 1 mean (in Hz)              897.83               668.34
  SD                                136.76               323.10
  Range                             585.94–1086.43       227.05–1109.62
Formant 2 mean (in Hz)              1631.84              1760.76
  SD                                590.88               1090.81
  Range                             1219.48–             726.32–4226.07
Formant 3 mean (in Hz)              3191.28              2974.28
  SD                                715.03               1427.18
  Range                             2486.57–5133.06      1350.10–6640.63

DISCUSSION

The current study is an innovative attempt at making objective measurements of age-related human vocal tract
configuration changes, and at discovering possible correlations between such vocal tract configuration changes
and concomitant changes in vowel formant frequencies of the speakers. Preliminary findings of this pilot study
indicate that age-related changes in human vocal tract dimensions may be more complicated than the notion of
progressive increases in the vocal tract length, as previous studies have assumed (18). They also suggest that
the assumption of consistent decrease in the vowel formant frequencies of the speakers as they age (27) may be
over-generalized, in that such a trend could vary depending on the different vowels utilized for formant
frequency measurements. Enlargement of the pharynx tends to occur as a function of aging. The resultant
acoustic effects of such enlargement, as well as the assumed alterations and/or modifications in the
articulatory positioning of older speakers (36, 39), must await further studies and investigations.

It is important to point out that the current pilot study is limited by the small number of subjects. In
addition, only 1 vowel (i.e. the back vowel /a/) was used to obtain formant frequency changes. The average
ages of the two groups (40 years for the younger group and 56 years for the older group) were only 16 years
apart. In addition, this study only investigated
senile vocal tract and vowel formant frequency changes of female speakers. Another investigation on senile
vocal tract changes of male speakers is being carried out by the investigators. Future studies should involve
a larger number of subjects with even younger (e.g. 20–30 years) and older (e.g. 70–80 years) age groups, in
order to obtain more reliable normative data on human vocal tract configuration changes. For data on vowel
formant frequency changes, all the cardinal vowels should be used in order to substantiate the role of
articulatory modifications of speakers from different age groups. Only then can new speech models be developed
that will fully account for all age-related changes in human vocal tract configurations and the concomitant
changes in the formant frequency levels of the speakers.

Fig. 4. The first 3 formants of sustained /a/ productions from the younger and older subjects.

7. Eckmann DM, Glassenberg R, Gavriely N. Acoustic reflectometry and endotracheal intubation. Anesth Analg 1996; 83: 1084–9.
8. Endres W, Bambach W, Flosser G. Voice spectrograms as a function of age, voice disguise, and voice imitation. J Acoust Soc Am 1971; 49: 1842–7.
9. Fredberg JJ, Wohl MEB, Glass GM, Dorkin HL. Airway area by acoustic reflections measured at the mouth. J Appl Physiol 1980; 48: 749–58.
10. Gracco LC. Age related changes in the human vestibular fold of the larynx: a histomorphometric study. Unpublished doctoral dissertation, University of Wisconsin-Madison, 1988.
11. Hartman DE. The perceptual identity and characteristics of aging in normal male adult speakers. J Commun Dis 1979; 12: 53–61.
12. Hilberg O, Jensen FT, Pedersen OF. Nasal airway geometry: comparison between acoustic reflections and magnetic resonance scanning. J Appl Physiol 1993; 75: 2811–9.
13. Hirano M, Jurita S, Nakashima T. Growth development and aging in the human vocal folds. In: Bless D,

SAMMANFATTNING


