Journal of Voice

Vol. 2, No. 3, pp. 183-194

© 1988 Raven Press, Ltd., New York

A Framework for the Study of Vocal Registers

Ingo R. Titze

Voice Acoustics and Biomechanics Laboratory, Department of Speech Pathology and Audiology, University of Iowa,
Iowa City, Iowa, and Recording and Research Center, Denver Center for the Performing Arts,
Denver, Colorado, U.S.A.

Summary: Register transitions are divided into two classes, periodicity transitions and timbre transitions. Periodicity transitions refer to changes in vocal quality that occur whenever glottal pulses are perceived as individual events rather than as a continuous auditory stimulus. Timbre transitions refer to changes in vocal quality associated with changes in spectral balance. Physiologically, these can be quantified with an abduction quotient. The singing registers appear to be based on timbre transitions resulting from subglottal resonances that interfere with the vocal fold driving pressure. Four of the major singing register shifts are predicted (in frequency and relative importance) on the basis of the first subglottal formant. Strategies for register equalization are proposed on the basis of supraglottal formant tuning (vowel modification) and adjustments in glottal adduction.
Key Words: Register--Register transitions.

Address correspondence and reprint requests to Dr. I. R. Titze at Voice Acoustics and Biomechanics Laboratory, Department of Speech Pathology and Audiology, University of Iowa, Iowa City, IA 52242, U.S.A.

The term "register" has been used to describe perceptually distinct regions of vocal quality. Abrupt changes in register may occur voluntarily or involuntarily. In some voices they seem to be less apparent than in others. If a voice is noticeably registered, a stair-step effect in quality is perceived. This is illustrated in Fig. 1, where vocal quality is plotted against some physiologic or acoustic variable that controls pitch, loudness, or vowel. It is not important to specify at this point what the physiologic or acoustic variable is. It is important, however, to state that this variable should be able to change continuously and gradually over some significant range. Vocal quality, measured in psychophysical terms, then appears to have plateaus (or near-plateaus) that are connected by abrupt transitions. The transitions are perceptually more salient than the plateaus. A one-register voice is effectively a no-register voice because there are no transitions to draw attention to quality change.

Without giving names to all the registers that have been claimed in the literature [e.g., Large (1), Hollien (2), or Miller (3)], we propose to classify the registers in terms of two types of transitions. The first is a periodicity transition. Below some fundamental frequency Fc, which we call the crossover frequency, the ear seems to be able to detect individual pulses in the speech waveform if a major vocal tract excitation is localized within the glottal cycle. Above the crossover frequency, the perception is a continuous tone rather than a series of pulses. It appears, then, that the major acoustic variable (and perhaps the only acoustic variable of significance for this transition) is fundamental frequency. All vocal sounds with localized excitations and F0 below Fc are perceived as belonging to the "pulse register" or the "vocal fry." Above Fc there are other registers, but none have the pulse quality.

The second type of register transition, which we will call a timbre transition, is characterized by an abrupt quality change that results from loss or gain of high-frequency sound energy at the source. An acoustic variable of importance is the spectral slope, also known as spectral tilt or spectral rolloff. It measures the decrease in amplitude of successive


partials of the voice source, usually in decibels per octave. Using the qualities of musical instruments as an analogy, a brass instrument has a rich timbre, with a lot of high-frequency partials (i.e., a small spectral slope). A flute, on the other hand, has fewer high-frequency partials, with a greater spectral slope. In the human voice, abrupt timbre transitions can be brought about by changes in the closure conditions of the glottis. The major physiologic variable would seem to be the degree of vocal fold adduction. We postulate here that the primo passaggio and secondo passaggio (the primary and secondary passages, or transitions) in the male and the female voice are timbre transitions, whereas the chest-fry transition is a periodicity transition.

FIG. 1. General characteristic of a register transition. [Plot of registers joined by a transition; horizontal axis: Physiologic or Acoustic Variable.]

In addition to these two basic register transitions, it is possible that a third type of transition may involve a shift from vocal fold vibration to whistling in the larynx. This is pure conjecture at this point, however. There is virtually no evidence that the so-called whistle register is anything but vocal fold vibration at a very high pitch. Given the uncertainty about the physiology of this register, we will exclude it from further discussion in this article.

Any eventual theory of vocal registers must take into account the voluntary and involuntary transition phenomena. Timbre transitions can be induced or inhibited over a large range of fundamental frequencies by skilled vocalists, but they also exhibit a certain invariance with F0 when unattended to. Thus, vocal pedagogues have charted the location of "natural" register breaks on the keyboard for sopranos, altos, tenors, and basses, and several categories in between.

The major objectives of this article are to show that (a) involuntary timbre transitions result from resonances in the subglottal system (the trachea), (b) voluntary timbre transitions result from regulating vocal fold adduction, and (c) periodicity transitions have a characteristic frequency that depends on formant bandwidth. We begin with the periodicity transitions.

FIG. 2. Periodicity register transition. a: Identification curve; b: pressure waveforms corresponding to four frequencies above. (After Keidar, ref. 4.) [Panel a horizontal axis: Fundamental Frequency (Hz); panel b horizontal axis: Time (ms).]

A periodicity transition is illustrated in Fig. 2. Results of a perceptual experiment conducted by Keidar (4) are displayed in Fig. 2a. Listeners were presented synthetic vowel stimuli that differed in fundamental frequency and glottal flow pulse



shape. They were asked to identify the production as either pulsed (fry) or nonpulsed. The pulse shape was varied with respect to open quotient, speed quotient, and a rounding factor that controlled the abruptness of closure (5). The stimuli were randomized and repeated several times. It was found that glottal flow pulse shape had a minor effect on the register perception, even though chest-like and falsetto-like flow pulses were part of the battery of stimuli. Fundamental frequency dictated the perception of fry, with the identification function in Fig. 2a displaying the typical ogive found in categorical perception of speech sounds (6).

An explanation for this identification function is given in Fig. 2b. Two periods of a speech waveform with a single formant are shown for each of four fundamental frequencies. These frequencies correspond to the 40-, 60-, 80-, and 100-Hz stimuli in Fig. 2a. For the waveform to be perceived as pulse-like, we postulate that the formant energy must die out before a new excitation occurs. In other words, there should be no significant carryover of acoustic energy from one cycle to the next. Each excitation will then be perceived as a separate event.

From a practical point of view, let us decide that the formant energy is sufficiently damped out when the envelope of the exponentially decaying sinusoid has reached 1% of its original value (−40 dB). We can then write

$$e^{-Bt/2} = 0.01 \tag{1}$$

where B is the formant bandwidth (rad/s) and t is the decay time to the 1% value. (Readers not familiar with the above expression are referred to engineering textbooks, e.g., ref. 7.) For the first formant bandwidth of 77 Hz [an average over 10 vowels (8)] the decay time is 19 ms. This is less than the period of the 40-Hz signal in Fig. 2b, but greater than the periods of the 60-, 80-, and 100-Hz signals. We would expect, therefore, that the 40-Hz signal would be judged as pulse register 100% of the time by listeners, whereas the others would be judged as pulse register less frequently. This is borne out in Fig. 2a. The 100-Hz signal, for example, was judged as pulse register only 3% of the time. At its 10-ms period, the envelope of the decaying sinusoid is 13% of its original value. Thus, a significant amount of energy is carried into the next period.

The 50% crossover frequency Fc in the identification function came at 70 Hz in the Keidar study (dotted lines in Fig. 2a; the corresponding crossover period is shown with a dotted line in Fig. 2b). Vocalizations with F0 below Fc were judged more often as pulsed, and those with F0 above Fc were judged more often as nonpulsed. In natural speech, however, the value of Fc could easily be affected by a number of factors. First, there could be multiple excitations within the period, as shown in Fig. 3. Here a secondary excitation renews the formant energy, thereby extending the total decay time. Such a secondary excitation may result from successive partial closures of the glottis. Fc may also be affected to some extent by glottal pulse shape, as preliminary results suggest (4). Finally, it should be pointed out that damping does not usually remain constant throughout the cycle. The formant bandwidth increases in the open portion of the glottal cycle (9). Also, the damping is vowel dependent. For example, the /a/ vowel has about twice the bandwidth of the /u/ vowel (8). We would predict, therefore, that pulse register would be perceived at slightly higher pitches for /a/ than for /u/.

FIG. 3. Double-pulse pressure waveform. [Horizontal axis: Time.]

A series of perceptual studies is needed to determine the range of Fc. Hollien and Michel (10), for example, found that 12 male and 11 female subjects had pulsed phonation ranges from 0 to 60 Hz and nonpulsed ranges beginning at ~80 Hz. This establishes the 70-Hz crossover frequency within a ±10-Hz range without specific control over pulse shape or number of excitations within the cycle. The effect of vowel needs further investigation, however.
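The decay-time argument built on Eq. 1 can be checked numerically. The following is a minimal sketch, assuming only what the text states: the envelope is exp(−Bt/2) with B in rad/s, and the formant energy counts as "damped out" at 1% of its starting value. The function names are illustrative and do not come from the article. Under these assumptions the 77-Hz bandwidth gives a decay time of about 19 ms, so only the 40-Hz stimulus has a period longer than the decay time. The same expression evaluates to roughly 9% at 10 ms, somewhat below the 13% quoted for the 100-Hz stimulus, so that figure may rest on assumptions not spelled out here.

```python
import math


def decay_time_s(bandwidth_hz, residual=0.01):
    """Solve exp(-B*t/2) = residual for t, with B = 2*pi*bandwidth_hz (rad/s)."""
    b_rad = 2.0 * math.pi * bandwidth_hz
    return -2.0 * math.log(residual) / b_rad


def envelope_fraction(bandwidth_hz, t_s):
    """Fraction of the initial envelope amplitude remaining after t_s seconds."""
    b_rad = 2.0 * math.pi * bandwidth_hz
    return math.exp(-b_rad * t_s / 2.0)


def carries_over(f0_hz, bandwidth_hz, residual=0.01):
    """True if the glottal period is shorter than the decay time,
    i.e., formant energy is still above `residual` at the next excitation."""
    return (1.0 / f0_hz) < decay_time_s(bandwidth_hz, residual)


if __name__ == "__main__":
    bw = 77.0  # first-formant bandwidth quoted in the text (Hz)
    print(f"decay time to 1%: {decay_time_s(bw) * 1e3:.1f} ms")
    for f0 in (40, 60, 80, 100):  # stimulus frequencies of Fig. 2
        left = envelope_fraction(bw, 1.0 / f0)
        print(f"F0 = {f0:3d} Hz: carryover = {carries_over(f0, bw)}, "
              f"envelope at next pulse = {left:.3f}")
```

Doubling `bandwidth_hz` halves the decay time, which is the reasoning behind the prediction that a vowel with a wider first-formant bandwidth should be heard as pulsed up to a slightly higher pitch.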


A confounding factor in the perception of periodicity is the presence of subharmonics in the speech waveform, illustrated in Fig. 4. In Fig. 4b, high- and low-amplitude pulses are shown to alternate, equally spaced in time. The effect is an amplitude modulation at F0/2, a subharmonic frequency. Subharmonic generation can occur when there is left-right asymmetry in the mechanical or geometric properties of the vocal folds (11,12). Two or more independent natural modes compete for dominance in the vibration pattern. Some phase locking between the motion of the left fold and the right fold is required to sustain a periodic signal, but it may require several openings and closings to establish a fundamental periodicity. Such periodicity may show up every second, third, or nth cycle of the unmodulated waveform.

From a perceptual standpoint, it is important to judge at what percent modulation the subharmonic frequency F0/n is responsible for the perceived pitch. Clearly, at 100% modulation the low-amplitude pulse in Fig. 4b disappears completely, making the pitch an octave below the pitch for the unmodulated signal. At 10% modulation, on the other hand, it is anticipated that the pitch will not be different from that in the unmodulated case, but a roughness component will be present. Somewhere in the vicinity of 50% modulation, a crossover should occur in the pitch perception. Thus, a rough-sounding voice may change to a lower-pitched voice if asymmetries in vocal fold movement become more and more severe. A recent perceptual study (13) supports this hypothesis. In a pitch-matching paradigm, the vowel productions of moderately to severely dysphonic patients received significantly lower pitch judgments than vowels produced by mildly dysphonic patients.

The perception of pulse register is likely to occur if F0/n < 70 Hz. Thus, in the presence of an F0/2 subharmonic, pulse register may be perceived when F0 gets below 140 Hz. Similarly, an F0/3 subharmonic could give rise to the perception of pulse at frequencies below 210 Hz. The so-called "diplophonic" or "creaky" voices, both characterized by vocal roughness, may be perceived as pulse-like if the frequency is low enough and the modulation is high enough. A hypothetical identification curve, illustrated in Fig. 4a, could become the basis for some future synthesis studies. For the F0/2 subharmonic, the independent variable would be an amplitude modulation index, such as (Ai − Ai+1)/(Ai + Ai+1), where the adjacent amplitudes Ai and Ai+1 are shown in Fig. 4b.
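The two conditions described above, a subharmonic frequency F0/n below the crossover frequency and a sufficiently deep amplitude modulation, can be written out as a short sketch. The 70-Hz crossover is taken from the text. The 0.5 modulation crossover is only the "vicinity of 50%" guess mentioned above, not a measured value, and the function names are illustrative.

```python
PULSE_CROSSOVER_HZ = 70.0       # crossover frequency Fc quoted in the text
MODULATION_CROSSOVER = 0.5      # assumption: the "vicinity of 50%" guess


def modulation_index(a_i, a_next):
    """Amplitude modulation index (A_i - A_{i+1}) / (A_i + A_{i+1})."""
    total = a_i + a_next
    if total == 0:
        raise ValueError("adjacent pulse amplitudes must not both be zero")
    return (a_i - a_next) / total


def pulse_threshold_hz(n, fc_hz=PULSE_CROSSOVER_HZ):
    """Highest F0 for which the F0/n subharmonic still lies below Fc."""
    return n * fc_hz


def may_sound_pulsed(f0_hz, n, index,
                     fc_hz=PULSE_CROSSOVER_HZ,
                     index_crossover=MODULATION_CROSSOVER):
    """Hypothetical decision rule: the subharmonic is low enough
    and the modulation is deep enough for it to dominate perception."""
    return (f0_hz / n) < fc_hz and index >= index_crossover


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"F0/{n} subharmonic: pulse possible below {pulse_threshold_hz(n):.0f} Hz")
    # Alternating pulses of amplitude 1.0 and 0.2 at F0 = 120 Hz
    idx = modulation_index(1.0, 0.2)
    print(f"modulation index = {idx:.2f}, pulsed = {may_sound_pulsed(120.0, 2, idx)}")
```

A hard threshold is used here only for brevity; the identification curve hypothesized in Fig. 4a is a smooth function of the modulation index.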


FIG. 4. Subharmonic periodicity register transition. a: Hypothetical identification curve as a function of amplitude modulation index; b: pressure waveform showing amplitude modulation at F0/2 to produce subharmonic frequency.

Consider now the timbre transition, illustrated in Fig. 5. Here a sudden change in quality may result from a discontinuity in the derivative of the glottal flow waveform. This is shown in Fig. 5b, where four cases of glottal flow and flow derivative are plotted on the left side, and corresponding shapes of the glottis are illustrated on the right. In the glottal shape illustrations, the maximum and minimum glottis are shown with solid lines, and the prephonatory glottis is shown with dashed lines. Although the progression from top to bottom looks quite continuous on the flow waveforms and the glottal shapes, a distinct difference in the flow derivative is seen between the top two and the bottom two cases. A sharp negative peak in the flow derivative, followed by a sudden return to the baseline, is observed whenever there is uniform closure over the entire glottis in a horizontal sense. This occurs whenever the vocal fold processes are in contact (top two cases, right side). The degree to which the processes are then pressed together may alter the pulse shape and the open quotient further, but the flow discontinuity effect has been saturated.

On the other hand, separation of the vocal processes allows for a gradual reduction of the glottal area (and hence the glottal flow) in the posterior glottis, as seen in the bottom two cases (exaggerated for visual clarity). The degree of the separation will again alter the pulse shape and the open quotient, but the continuity (roundness) effect of the flow derivative has been saturated. (See ref. 14 for an extensive discussion of the flow derivative.)

FIG. 5. Timbre register transition. a: Identification curve as a function of abduction quotient; b: flow waveforms, flow derivatives, and glottal configurations corresponding to four abduction quotients above. [Panel a horizontal axis: Abduction Quotient Qa = ξ0/A.]

The critical separation is between the second and third case. It can be quantified by an abduction quotient Qa, defined by Titze (15) as a measure of the effective separation of the vocal processes. Qa is the ratio of the prephonatory glottal halfwidth ξ0 at the vocal processes to the amplitude of vibration A at the center of the glottis (see bottom right case in Fig. 5b for definition). The minimum glottal area and minimum flow depend not only on the prephonatory separation ξ0, but also on the amplitude of vibration. A small A can fail to reduce the glottal chink created by a large ξ0. By taking the quotient Qa = ξ0/A, the combined effect of both variables can be assessed. The quotient is the kinematic equivalent of a DC/AC ratio for a glottal flow waveform. Negative values of Qa are permissible if we interpret them as describing pressed phonation. Although the tissue cannot overlap, as negative values of ξ0 would suggest, vertical deformation (squishing) of the tissue during collision has an effect similar to overlap.

Figure 5a shows the timbre transition in the perceptual domain. Percentage identification of voiced sounds having a rich timbre (chest register) is plotted against the abduction quotient Qa. The curve is still hypothetical, but some confirmation of its shape can be obtained from the data of Colton (16) and Keidar et al. (17). In the Keidar et al. study, two vocalists, a male and a female, produced a series of vocal qualities that could easily be identified by a panel of listeners to be either chest register or falsetto register. As expected on the basis of previous work (16,18,19), there was a direct correlation between spectral slope and perceived register. The difference in spectral slope between those productions identified >80% as rich timbre (chest register) and those identified >80% as poor timbre (falsetto register) was ~6 dB/octave on the average for both the male and the female subject (note the right-hand scale in Fig. 5a for the two data points in the middle). This difference was determined by averaging the spectral slope over four octaves of the filter frequencies selected by Keidar et al. for frequency band analysis. It must be remembered, however, that a single spectral slope measure captures only the gross features of the source spectrum. The low-frequency harmonics are affected differently from the higher ones by the glottal waveshape (14).

By further relating spectral slope to abduction quotient on waveforms produced with a computer model (15), we were able to predict Fig. 5a. This curve is presently more of a hypothesis than proven


fact. Further perceptual experiments are needed to map out this identification function of timbre registers directly on the basis of the abduction quotient. We predict that the crossover point will be somewhere between Qa = 0 and Qa = 1 (dotted lines). It is interesting to note that an abduction quotient slightly greater than zero produced a maximum loudness level at constant subglottal pressure in excised canine larynges (20). This suggests that perhaps a mixture between the chest register and the falsetto register is ideal for maximizing vocal loudness at a given subglottic pressure and fundamental frequency. The "flow" mode of phonation, described by Sundberg and Gauffin (35) as being preferable to either breathy or pressed, may lie close to this crossover point.

A major unresolved issue in the study of registers is the consistency with which involuntary timbre transitions can be located at specific fundamental frequencies. Timbre transitions appear almost like normal modes in vibrating strings, plates, or membranes. On a neuromuscular basis, one would not expect difficulty in control of adductory-abductory movement to be confined to several finely tuned regions within the frequency range. Why, for example, is a major involuntary timbre transition so consistently found in the region of ~300-350 Hz for both males and females? In singing terminology, this region is called the primo passaggio (first passage, or transition) for females and the secondo passaggio (second passage, or transition) for males. The two passages seem to reflect the same phenomenon, a breaking into or out of the chest register. Speakers and singers of both sexes tend to shift register between D4 (294 Hz) and F4 (349 Hz). The transition can be smoothed out by training, as is done in the operatic and lieder styles of singing, or it can be accentuated and employed artistically, as in yodeling and some country-western styles.

Two hypotheses are worthy of testing in the search for a physiologic or acoustic explanation of the invariance of involuntary timbre transitions. The first is that subglottal resonances inhibit or facilitate vocal fold vibration. Vibratory modes in the tissue would be influenced by acoustic modes in the trachea. The second hypothesis is that intrinsic muscle forces, in particular the cricothyroid and thyroarytenoid forces, cannot maintain a proper balance in certain regions of the frequency range. We will not develop the second hypothesis in this article because (a) it has lower priority in our current thinking and (b) it requires more background in muscle physiology than can be conveniently covered here.

SUBGLOTTAL RESONANCES AND
INVOLUNTARY TIMBRE TRANSITIONS

Some discussion of the effects of subglottal resonances on the vibration pattern of the vocal folds was offered by van den Berg (21) and more recently by Titze (22). Acoustic pressures below the vocal folds are believed to be phased in such a way that they interfere constructively with the intraglottal driving pressures for rich timbre (chest register) and destructively for poor timbre (falsetto register). Van den Berg appeared to have a good case for this on the basis of his 300-Hz measurement of the first subglottal formant on cadavers. The measurement was either in error or inappropriate, however, for later measurements on live subjects have established a 500- to 600-Hz range for the first formant frequency in the trachea (23,24). Based on these revised measurements, and in particular the most recent 510-Hz measurement reported by Cranen and Boves (24), we will now provide evidence that timbre registers are linked to resonance phenomena in the trachea.

Figure 6 shows several cases of constructive and destructive interference between subglottal pressure and vocal fold movement. In all cases, excitation and decay of the first subglottal formant are shown with solid lines. The abscissa is labeled in periods of F1′, the lowest subglottal resonance frequency. Superimposed (in dashed lines) is one period of the glottal area for six different values of F0 that correspond to specific fractions of F1′. The range is from F0 = (2/7)F1′ to F0 = 3F1′. The six cases were carefully chosen to reflect maximum constructive and destructive interference between subglottal pressure and vocal fold vibration. The illustration is based on a conceptualization that was previously introduced by the author (22). Each waveform begins at glottal closure, where the formant energy is renewed and the subglottal pressure is maximum. The peak value of 10 cm H2O is typical in the data reported by Miller and Schutte (25) for a bass-baritone professional singer. It seemed to remain nearly constant over an entire octave in their study. Exponential decay of the pressure amplitude is based on a damping coefficient of 0.1, calculated from data by Cranen and Boves (24) and Miller and Schutte. The glottal area function is a truncated sinusoid and has been assigned an open quotient of



0.5 for simplicity. (Future discussions will include a varying open quotient and skewed area pulses.)

FIG. 6. Phase relationships between the pressure waveform of the first subglottal formant F1′ (solid lines) and the glottal area waveform (dashed lines) for systematically increasing fundamental frequency F0 (bottom to top). [Panels from bottom to top run from F0 = (2/7)F1′ = 146 Hz to F0 = 3F1′ = 1,530 Hz; horizontal axis: Periods of F1′.]

Positive reinforcement of vocal fold vibration is achieved when the subglottal pressure Ps is positive during opening and negative during closing. In other words, when the vocal folds are moving outward, Ps should help to push them outward, and when the vocal folds are moving inward, Ps should help to suck them inward. (We are referring here only to the time-varying part of Ps. A large DC component exists, of course, that keeps the overall subglottal pressure positive throughout the cycle. This DC component does not affect the arguments presented here.) Note that conditions of constructive and destructive interference exist for alternate cases shown. Stippled regions show where the pressure affects the movement during the opening portion of the cycle and hatched regions show where the pressure affects the closing portion. For example, the F0 = (3/5)F1′ case shows positive pressure during the entire outward movement (first half of the open phase) and negative pressure during the entire inward movement (second half of the open phase). The F0 = F1′ case, on the other hand, shows the opposite situation. We conclude from this that subglottal resonance reinforces vocal fold vibration at F0 = (3/5)F1′ and impedes it at F0 = F1′. For F1′ = 510 Hz, the favorable fundamental frequency is 306 Hz and the unfavorable frequency is 510 Hz. A similar pair of frequencies exists in the lower part of the frequency range, at F0 = (2/7)F1′ and F0 = (2/5)F1′, which correspond to 146 and 204 Hz, respectively. Above F0 = 2F1′, there is only constructive interference because the strong first negative dip in Ps is avoided in the opening phase by the short fundamental period. Two cases of positive reinforcement are shown at the top of Fig. 6, corresponding to frequencies of 1,020 and 1,530 Hz.

To quantify the resonance effects vis-à-vis register perception, we seek to first evaluate the change in amplitude of vibration that is brought about by the reinforced or diminished driving pressure on the vocal folds. In a recent theoretical study of vocal fold vibration (26), an expression for the aerodynamic driving pressure in the open glottis was derived. Excepting some minor correction terms, this driving pressure can be written as

$$P_g = P_i + (P_s - P_i)\left(1 - \frac{a_2}{a_1}\right) \tag{2}$$

where Pi is the input pressure to the vocal tract (supraglottal pressure), a1 is the glottal entry area, and a2 is the glottal exit area. The glottal convergence ratio a2/a1 changes from a value near zero at the beginning of opening to a value near 1.0 (or slightly greater than 1.0) prior to closing. The transglottal pressure Ps − Pi is therefore most effective in driving the vocal folds in the early part of the opening phase, where (1 − a2/a1) ≈ 1. This is also where Ps has the greatest amplitude in most of the cases shown in Fig. 6. Increased damping throughout the open phase reduces the effect of subglottal resonance in the closing phase. Thus, for a simplified analysis, we can neglect the closing phase for two reasons: (a) less of the transglottal pressure is applied to the medial surface of the folds, and (b) most of what would be applied has already been damped out.

For the present, we will effectively decouple the supraglottal system by letting Pi = 0. This will allow treatment of the effects of tracheal resonance without vowel dependence. Thus, the involuntary


transitions will be exposed. In a following section, a brief description will be given of how supraglottal resonances can be used to tune out undesirable subglottal effects by vowel modification.

Using the impulse momentum principle of mechanics, the integrated force over the opening phase can be related to the change in average momentum over the same interval. This is written as

$$LT \int_0^{T_0/4} P_g \, dt = \Delta(M\bar{v}) \tag{3}$$

where LT (length × thickness) is the medial surface over which the pressure Pg is acting to produce the driving force, M is the vocal fold mass, $\bar{v} = 2\omega A/\pi$ is the average velocity of a sinusoidal displacement $A \sin \omega t$ from t = 0 to t = T0/4, ω is the radian frequency, and A is the amplitude of vibration. We see that the change in amplitude ΔA is proportional to the integrated driving pressure Pg.

Since we have few data on the precise variation of (1 − a2/a1) in Eq. 2, and on the vibratory mass M and the surface area LT, it is difficult to obtain exact calculations of ΔA. An order of magnitude estimation is possible, however, with the following approximations. Since a2/a1 is negligible over most of the opening phase (owing to high glottal convergence), we can let Pg = Ps and integrate graphically over the stippled portions shown in Fig. 6. Numerical values of the impulse can thus be obtained. Then, from Eq. 3, we calculate

$$\Delta A = \left(\frac{\pi LT}{2M\omega}\right) \int_0^{T_0/4} P_s \, dt \tag{4}$$

provided L, T, and M are known. The value of LT can be approximated at 1 cm² (1.4 cm length × 0.7 cm thickness). This value will be nearly constant at all frequencies, since length and thickness are inversely proportional (owing to conservation of tissue volume). We assume also that the product Mω is constant for all frequencies, i.e., that the vibrating mass changes inversely with ω. By letting M = 0.1 g at 100 Hz, changes in amplitude can be computed at each frequency of interest. For example, a ΔA of 0.8 mm is computed for the 306-Hz case in Fig. 6. This can be highly significant, given that typical vibrational amplitudes are on the order of 1 mm. It appears, therefore, that the vibratory pattern may be affected significantly by the time-varying components of the subglottal pressure. Rothenberg (27) found interactions between vocal fold movement and supraglottal pressures that seemed to change the amplitude of vibration by as much as a factor of 2. This was based on observed changes in the shape of electroglottographic signal. Given that the pressure variations above and below the glottis are similar in magnitude (25), the agreement between Rothenberg's results and our predictions is not surprising. Changes in the amplitude of the electroglottographic signal at natural register boundaries have been further documented in recent studies (28).

FIG. 7. Predicted change in vibrational amplitude as a function of fundamental frequency due to subglottal pressure variations in the vocal fold driving pressure. [Horizontal axis: F0 in Hz.]

The entire response curve of ΔA as a function of F0 is shown in Fig. 7. The two major peaks and two major valleys correspond to the bottom four cases shown in Fig. 6. The frequencies of the peaks are 146 and 306 Hz (approximately D3 and D4, respectively), whereas the frequencies of the valleys are 204 and 510 Hz (approximately G♯3 and C5, respectively). The peaks are the frequencies where the amplitude of vibration increases maximally owing to subglottal reinforcement, even though the lung pressure and all other adjustments stay the same. In particular, if the prephonatory halfwidth ξ0 is kept constant (no change in arytenoid spacing),



the abduction quotient ξ0/A decreases and the per- by changing adduction or (b) by changing lung pres-
ception of register changes toward the richer timbre sure. An increase in lung pressure will increase the
(Fig. 5a). The notes around D3 and D4 should there- amplitude of vibration, thereby reducing the abduc-
fore be optimally suited for chest register produc- tion quotient and enriching the timbre. A messa di
tion. For slightly higher frequencies, it is seen in voce exercise, a gradual crescendo followed by a
Fig. 7 that the vibrational amplitude decreases decrescendo, may involve a timbre change if con-
quickly, causing the abduction quotient to increase trol of amplitude and adduction are not well coor-
and the register to shift toward a poorer timbre. The dinated. As amplitude of vibration increases in the
notes around 200 Hz (G3~) and those in the region of crescendo segment, the vocal folds should be ab-
400--800 Hz (about G4-Gs) should therefore be less ducted slightly to keep Qa constant. Conversely, as
suitable for chest production. Tenors are well aware amplitude decreases in the decrescendo segment,
of this, the region from G 4 to C 5 being the most the vocal folds should be adducted slightly. Other-
difficult for production with rich timbre (chest wise, the m e s s a di voce is likely to involve pressed
voice). voice and breathy voice at the loudness extremes.
It is conceivable that all of the major register tran- As yet, no obvious neuromuscular mechanisms
sitions identified by vocal pedagogues for singing have been discovered that would prevent adduc-
(see ref. 1 or 3 for reviews) can be accounted for by tory-ab.ductory maneuvers to be executed with pre-
the first subglottal resonance. If we consider an ap- cision over the entire frequency range. Should there
proximate 10-20% difference in tracheal length be- indeed be none, adductory control could be used to
tween basses and sopranos, the peaks and valleys equalize the involuntary register transitions brought
will all have a spread of this percentage. In musical about by subglottal resonance. In the facilitory tran-
terms, it would correspond to 2-3 semitones, which sitions, where there is a natural tendency toward a
agrees with the observed spread in involuntary reg- richer timbre, the vocal processes would be spread
ister transitions. Furthermore, if we consider the apart slightly to offset the increasing amplitude of
involuntary transitions to take place somewhere be- vibration as the pitch is raised. Alternatively, or
tween the peaks and valleys in Fig. 7, i.e., near the additionally, the thyroarytenoid (vocalis) muscle
zero crossings, we would predict a total of four could be relaxed gradually to allow the inferior
transitions. Three transitions would be in the male part of the vocal fold to abduct (29,30). For exam-
two-octave range C3-C 5 (note scale on top) and two ple, the adjustment could begin to take place some-
in the female two-octave range C4-C 6. One transi- where around C4 and continue in varying amounts
tion overlaps both genders. The common transition t h r o u g h F4, with maximum equalization at D 4 (Fig.
is the secondo p a s s a g g i o for males and the p r i m o 7). In the inhibitory transition, a gradual increase in
passaggio for females. It is the major chest to fal- adduction may be necessary to offset the decreasing
setto transition following D 4. amplitude of vibration. This adjustment could begin
Two of the transitions might be called facilitory at F 4 and continue past C5.
and two inhibitory for chest voice. Thus, the D3-G3e Some limitations may be encountered with this
and the D4-C 5 transitions are inhibitory (downward equalization technique, especially at high funda-
in Fig. 7), whereas the Gae-D4 and the Cs--C6 tran- mental frequencies. Increased adduction, together
sitions are facilitory (upward in Fig. 7). Much at- with a reduced vibrational amplitude, may cause a
tention is paid to the inhibitory transitions in voice "strangulation" effect. Even though the timbre
training because they have a greater potential for may be equalized reasonably well, the output power
vocal debilitation. Both types of transitions need to may dwindle because there is too little airflow. At
be smoothed out, however, for even vocal produc- this point, the singer usually resorts to both subglot-
tion over a wide frequency range. We now address tal pressure increase and vowel modification to
some of the techniques employed in this smoothing- boost the output power and tune out the undesirable
out process. subglottal resonances. After Sundberg (31), the
vowel modification process has been called "for-
VOLUNTARY TRANSITIONS AND mant tuning." Supraglottal resonances, unlike the
REGISTER EQUALIZATION subglottal resonances, are highly adjustable by ar-
ticulatory changes. Even within a given vowel cat-
There appear to be at least two ways in which egory, the first and second formant frequencies (F 1
timbre transitions can be executed voluntarily: (a) and F2) can vary by as much as 50-100%. This


makes formant tuning an attractive way to equalize the subglottal registers.
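
The percentages quoted in this section can be placed on a musical scale with the equal-tempered relation n = 12 log2(f2/f1). The Python sketch below is an added illustration, not part of the original analysis, and the function name is hypothetical. It assumes that a given percentage difference in tracheal length produces the same percentage spread in resonance frequency, as the text states, and it compares that 10-20% spread with the 50-100% adjustability of F1 and F2 within a vowel category.

```python
import math

def ratio_to_semitones(ratio: float) -> float:
    """Equal-tempered interval, in semitones, spanned by a frequency ratio."""
    return 12.0 * math.log2(ratio)

# Spread of the subglottal peaks and valleys for a 10-20% tracheal-length difference
tracheal_spread = [ratio_to_semitones(1.0 + p) for p in (0.10, 0.20)]

# Adjustability of F1 or F2 within one vowel category (50-100%)
formant_spread = [ratio_to_semitones(1.0 + p) for p in (0.50, 1.00)]

print([round(s, 2) for s in tracheal_spread])  # [1.65, 3.16]
print([round(s, 2) for s in formant_spread])   # [7.02, 12.0]
```

The first pair brackets the 2-3 semitones cited for the spread in involuntary transitions. The second pair suggests that a supraglottal formant can be moved by roughly a fifth to an octave, which is much larger than the subglottal spread.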
To illustrate the point, Fig. 6 can be used to describe the effect of supraglottal resonances on vocal fold vibration if the polarity of the damped exponentials is reversed, i.e., if the formant patterns are flipped upside down. The physical reason for this is that the supraglottal pressure is maximally negative at glottal closure, whereas the subglottal pressure is maximally positive. One can get an intuitive feeling for this opposite polarity by recognizing that an air condensation and air rarefaction take place below the glottis and above the glottis, respectively, as the vocal tract air column attempts to maintain its forward motion through glottal closure. Excellent examples of this are provided by Koike (32), Kitzing and Löfqvist (33), and Miller and Schutte (25). All of the frequency ratios established in Fig. 6 will therefore apply to the supraglottal system, except that constructive and destructive cases are interchanged. However, since F1 is highly adjustable, there are no invariant frequencies where transitions occur.

FIG. 8. F2-F1 vowel chart showing critical frequencies (dashed lines) where peaks and valleys in Fig. 7 occur. Slightly curved diagonal lines indicate where double formant tuning is possible. (After Peterson and Barney, ref. 34.) (Horizontal axis: F1 in Hz.)
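
The polarity argument can be sketched numerically with Eq. 4. The Python fragment below is an added illustration under several assumptions: the integral in Eq. 4 is replaced by a mean pressure multiplied by T0/4, the 400-Pa mean pressure is a hypothetical value chosen only to land near the 0.8-mm estimate quoted for the 306-Hz case, and applying the same expression to the supraglottal side with reversed sign is a simplification of the statement that constructive and destructive cases are interchanged. The constants LT = 1 cm² and M = 0.1 g at 100 Hz (with Mω held constant) are taken from the text.

```python
import math

# Constants from the text accompanying Eq. 4
LT = 1.0e-4                              # length x thickness: 1 cm^2 in m^2
M_OMEGA = 1.0e-4 * 2 * math.pi * 100.0   # M*omega for M = 0.1 g at 100 Hz, in kg/s (held constant)

def delta_amplitude(mean_pressure_pa: float, f0_hz: float) -> float:
    """Eq. 4 with the integral approximated as mean pressure times T0/4 (a simplification)."""
    impulse = mean_pressure_pa * (1.0 / f0_hz) / 4.0   # Pa*s over the quarter period
    return math.pi * LT / (2.0 * M_OMEGA) * impulse     # amplitude change in metres

P_MEAN = 400.0  # Pa, hypothetical mean acoustic pressure over the integration interval

sub = delta_amplitude(+P_MEAN, 306.0)    # subglottal polarity
supra = delta_amplitude(-P_MEAN, 306.0)  # same magnitude, reversed polarity
print(f"{sub * 1e3:+.2f} mm, {supra * 1e3:+.2f} mm")  # +0.82 mm, -0.82 mm
```

Under these assumptions, a resonance that adds to the vibrational amplitude when it acts from below the glottis subtracts the same amount when it acts from above, which is the sense in which the two cases trade places.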
Consider an example of register equalization by formant tuning. The difficult region between 400 and 800 Hz in Fig. 7, which results from F0 being in the vicinity of F1', can be equalized by placing F1 in the same region. A number of vowels, particularly the /z/, /u/, /e/, and /~/ vowels, can be used for this purpose. Baritones and tenors know that these vowels, rather than the so-called corner vowels /i/, /a/, and /u/, are easiest to produce in the high part of their ranges.

A more complete picture of the register-vowel interaction can be obtained by locating the peaks and valleys of Fig. 7 on the classic Peterson and Barney vowel chart (34). This is done with dashed vertical lines in Fig. 8. F2 is plotted against F1 for a number of clusters of American English vowels. The locations of phonetic symbols represent average adult male values for F1 < 600 Hz and average adult female values for F1 > 600 Hz. This division was created to highlight vowel modifications in the upper part of the respective ranges. Note that the four male vowels /~/, /u/, /o/, and /e/ lie very close to the F1' vertical line, suggesting that register equalization is readily accomplished with these vowels. The corner vowels, on the other hand, are far removed from the F1' line.

The (2/7)F1', (2/5)F1', and (3/5)F1' lines all lie in the /i/ and /u/ vowel spaces. Some equalization could be expected in this 150- to 300-Hz region with these vowels, but it is more likely that adductory equalization will be dominant here, given that the vowel selection is rather poor.

In the upper part of the female range (500-1,000 Hz), there is a gradual and sustained facilitory transition out of the 500-Hz valley in Fig. 7. Here the tuning of F1 to F1' can gradually be replaced by tuning to F0. Tuning to F0 is not to equalize registers so much as to increase output power (30). The vowels best suited for this are the brighter vowels /æ/, /ʌ/, and /a/ in Fig. 8. Note the wide range of F1 for these vowels, well over an octave. Thus, F1 = F0 can be maintained over most of the upper female range with these vowels.

A double-tuning effect can be achieved by letting F1 = F0 and F2 = 2F0, the second harmonic. This condition is represented by the lower of the two curved solid lines of Fig. 8. If for every fundamental frequency a vowel is used that falls on one of these lines, significant boosting of the output power may be possible. For example, at G5 (784 Hz) the vowel /ʌ/ lies near the F2 = 2F1 curve. This vowel can be modified and blended into the /æ/ vowel to cover the entire D5-D6 octave (587-1,175 Hz) without detuning either the first or the second harmonic. This double tuning can also be used in the 300- to 500-Hz range, but only for the closed vowels /u/ and /ʊ/. By lowering the larynx, the /~/ vowel could possibly qualify if F1 is appreciably lowered by this action.

The second double-tuning curve (upper sloping line in Fig. 8) is characterized by F1 = F0 and F2 = 3F0. Here the vowel must be some mixture between /e/ and /æ/ for females in the upper range, but the fundamental frequency range is limited to about 800 Hz. For the tenor high C, however, the condition seems to be ideal if the neutral vowel /ə/ is used. Here the unwanted subglottal resonance can be canceled with F1 while the third harmonic can resonate together with F0. The vowel can be gradually blended into the /u/ vowel for lower fundamental frequencies. This may be related to the "covered" sound sometimes mentioned in pedagogy.

Much more research is needed to quantify the many other resonance possibilities. In theory, n formants can be tuned to m harmonics, yielding a double-infinite set of possibilities. In practice, of course, only a handful make any sense. Furthermore, personal preference in sound quality may steer the vocalist away from overuse of vowel-modified productions. In such a case, register equalization by adductory control is still an option, as discussed earlier. In many styles of singing, no equalization is preferred, particularly when the verbal message takes precedence over a smooth timbre.

SUMMARY AND CONCLUSION

This has been an exploration of one hypothesis about vocal registers, namely, that subglottal resonances induce the involuntary register transitions. In particular, the primo and secondo passaggi in the singing voice would appear to achieve their invariance in frequency by the relatively constant length of the trachea. The hypothesis has been shown to have strong support in a number of areas. First, the location of the register transitions is predicted accurately. Second, the number of transitions and their relative importance are predicted, as far as the author can glean from the literature. In particular, the difficulty that vocalists encounter in carrying the chest voice from F4 to C5 is explained rather well on the basis of tracheal resonance. Third, the techniques used by vocalists to equalize the registers, by adductory control or by vowel modification, fit well into the scheme. Fourth, voluntary register changes are not affected by the involuntary transitions.

There are some phenomena related to registers that are not explained by this simple hypothesis. For example, the fact that involuntary register transitions from chest voice to falsetto voice are often accompanied by large upward jumps in fundamental frequency is not explained. Also not explained is the apparent need to deactivate the thyroarytenoid muscle for fundamental frequencies above F4 (29). We are currently investigating another hypothesis that a physiologic limit in the maximum active stress in the thyroarytenoid muscle may dictate a transition from chest register to head register, and ultimately to falsetto. Such a transition would be superimposed on the series of transitions described here. Thus, an acoustic and a neuromuscular genesis of vocal registers could coexist.

The transition into pulse register (vocal fry) stands apart from all the other transitions in its origin. It appears to be an interaction between formant bandwidth and fundamental frequency. When formant energy is damped out completely before a new vocal tract excitation occurs within the glottal cycle, individual pulses are perceived. The perception is basically independent of the shape of the glottal pulse. Subharmonic frequencies in a voice signal may also lead to the perception of vocal fry, but formal experiments are needed to confirm this.

Finally, a word about register terminology is in order. The term "chest register" is appropriate in light of the current findings. Given that tracheal resonances carry energy into the upper thoracic region, there should be little objection to labeling the phenomenon according to a sensation experienced by the vocalist. The term "modal" is particularly unfortunate because modes of vibration have a specific, but rather different, meaning in acoustics and other fields of physics. This author prefers "pulse register" to "vocal fry" or other names that have been invented. Pulses are indeed perceived and pulses are observed in the waveforms. "Falsetto register" gets this author's vote because it is such a unique word. It can hardly be confused with any other word in speech or voice science. Even though it conveys no sensory information, it has been around long enough that everyone can relate to it. Finally, "head register" does not get a particularly high score in the author's view. It does convey sensory information, but because it is likely to be some mixture of chest and falsetto register, it does not identify an independent phenomenon. On the other hand, if singers can easily and consistently relate to it as a distinct production, another label does not hurt. Whenever possible, scientists and pedagogues


would do well to clarify their labels in terms of supplementary acoustic and physiologic information.

Acknowledgment: This work was supported by a grant from the National Institutes of Health, no. NS 16320-08. The author appreciates the assistance of David Druker, Sue Ann Philippbar, and Linnie Southard in preparation of the figures and manuscript.

REFERENCES

1. Large J, ed. Vocal registers in singing. The Hague: Mouton,
2. Hollien H. A review of vocal registers. In: Lawrence V, ed. Transactions of the twelfth symposium on care of the professional voice. New York: Voice Foundation, 1983:1-6.
3. Miller R. The structure of singing. New York: Schirmer,
4. Keidar A. Vocal register change: an investigation of perceptual and acoustic isomorphism. Ph.D. dissertation, University of Iowa, 1986.
5. Titze IR. Synthesis of sung vowels using a time-domain approach. In: Lawrence V, ed. Transactions of the twelfth symposium on care of the professional voice. New York: Voice Foundation, 1983:90-8.
6. Liberman A, Harris K, Hoffman H, Griffith B. The discrimination of speech sounds within and across phoneme boundaries. J Exp Psychol 1957;54:358-68.
7. Van Valkenburg M. Network analysis. Englewood Cliffs, NJ: Prentice Hall, 1955:194-97, 386.
8. Klatt D. Software for a cascade/parallel synthesizer. J Acoust Soc Am 1980;67:971-95.
9. Fant G, Ananthapadmanabha T. Truncation and superposition. In: Quarterly Progress and Status Report STL-QPSR 2-3. Stockholm: Speech Transmission Laboratory, Royal Institute of Technology, 1982:1-17.
10. Hollien H, Michel J. Vocal fry as a phonational register. J Speech Hear Res 1968;11:600-4.
11. Ishizaka K, Isshiki N. Computer simulation of pathological vocal cord vibration. J Acoust Soc Am 1976;60:1193-8.
12. Wong D. A hybrid model of vocal fold vibration with application to some pathological cases. M.S. thesis, University of British Columbia, 1985.
13. Wolfe V, Ratusnik D. Acoustic and perceptual measurements of roughness influencing judgments of pitch. J Speech Hearing Disord 1988;53:15-22.
14. Fant G, Liljencrants J, Qi-guang L. A four-parameter model of glottal flow. STL-QPSR 4/85. Stockholm: Speech Transmission Laboratory, Royal Institute of Technology, 1985:1-14.
15. Titze IR. Parameterization of the glottal area, glottal flow, and vocal fold contact area. J Acoust Soc Am 1984;75:570-80.
16. Colton R. Spectral characteristics of the modal and falsetto registers. Folia Phoniatr 1972;24:337-44.
17. Keidar A, Hurtig R, Titze I. The perceptual nature of vocal register change. J Voice 1987;1:223-33.
18. Large J. An acoustical study of isoparametric tones in the female chest and middle registers in singing. NATS Bull 1968;25:12-5.
19. Sundberg J. The source spectrum in professional singing. Folia Phoniatr 1973;25:71-90.
20. Titze IR. Regulation of vocal power and efficiency by subglottal pressure and glottal width. In: Fujimura O, ed. Vocal Physiology. New York: Raven Press, 1988.
21. van den Berg JW. An electrical analogue of the trachea, lungs, and tissues. Acta Physiol Pharmacol (Neerl) 1960;9:1-24.
22. Titze IR. The importance of vocal tract loading in maintaining vocal fold oscillation. Proc Stockholm Musical Acoustics Conf SMAC83, vol 1, Royal Swedish Academy of Music no. 46(2), 1983, pp 61-72.
23. Ishizaka K, Matsudaira M, Kaneko T. Input acoustic impedance measurement of the subglottal system. J Acoust Soc Am 1976;60:190-7.
24. Cranen B, Boves L. On subglottal formant analysis. J Acoust Soc Am 1987;81:734-46.
25. Miller D, Schutte H. Characteristic patterns of sub- and supraglottal pressure variations within the glottal cycle. In: Lawrence V, ed. Transcripts of the thirteenth symposium on care of the professional voice. New York: Voice Foundation, 1985:70-5.
26. Titze IR. The physics of small amplitude oscillation of the vocal folds. J Acoust Soc Am 1988;83:1536-52.
27. Rothenberg M. Acoustic reinforcement of vocal fold vibratory behavior in singing. In: Fujimura O, ed. Vocal Physiology. New York: Raven Press, 1988.
28. Roubeau B, Chevrie-Muller C, Arabia-Guidet C. Electroglottographic study of the changes of voice registers. Folia Phoniatr 1987;39:280-9.
29. Hirano M, Vennard W, Ohala J. Regulation of register, pitch, and intensity of voice. Folia Phoniatr 1970;22:1-20.
30. Hirano M. Phonosurgery. Basic and clinical investigations. Official report, 76th Annual Convention Oto-Rhino-Laryngological Society, Japan, 1975.
31. Sundberg J. The acoustics of the singing voice. Sci Am 1977;236:82-91.
32. Koike Y. Sub- and supraglottal pressure variation during phonation. In: Stevens K, Hirano M, eds. Vocal fold physiology. Tokyo: University of Tokyo Press, 1981:181-91.
33. Kitzing P, Löfqvist A. Subglottal and oral air pressures during phonation. Med Biol Eng 1975;13:644-8.
34. Peterson G, Barney H. Control methods used in a study of the vowels. J Acoust Soc Am 1952;24:175-84.
35. Sundberg J, Gauffin J. Waveform and spectrum of the glottal voice source. In: Frontiers of Speech Communication Research: Festschrift for Gunnar Fant. London: Academic Press, 1979:301-20.


