Substyles of Belting: Phonatory and Resonatory Characteristics

*,†Johan Sundberg, †Margareta Thalén, and ‡Lisa Popeil, *†Stockholm, Sweden, and ‡Los Angeles, California

Summary: Belting has been described as speechlike, yell-like, or shouting voice production commonly used in
contemporary commercial music genres and substantially differing from the esthetic of the Western classical voice
tradition. This investigation attempts to describe phonation and resonance characteristics of different substyles of belt-
ing (heavy, brassy, ringy, nasal, and speechlike) and the classical style. A professional singer and voice teacher, skilled
in these genres, served as the single subject. The recorded material was found representative according to a classification
test performed by an expert panel. Subglottal pressure was measured as the oral pressure during the occlusion for the
consonant /p/. The voice source and formant frequencies were analyzed by inverse filtering the audio signal. The
subglottal pressure and measured flow glottogram parameters differed clearly between the styles, with heavy and classical
assuming opposite extremes in most parameters. The formant frequencies, by contrast, showed fewer and less systematic
differences between the substyles but were clearly separated from the classical style with regard to the first formant.
Thus, the differences between the belting substyles mainly concerned the voice source.
Key Words: Formant frequencies–Flow glottogram–NAQ–Subglottal pressure.

Accepted for publication October 6, 2010.
This investigation was first presented at the Annual Symposium, Care of the Professional Voice, Philadelphia, June 2009.
From the *Department of Speech Music Hearing, School of Computer Science and Communication, KTH, Stockholm, Sweden; †University College of Music Education in Stockholm (SMI), Stockholm, Sweden; and ‡Los Angeles, California.
Address correspondence and reprint requests to Johan Sundberg, Department of Speech Music Hearing, School of Computer Sciences and Communication, KTH, Stockholm SE-10044, Sweden. E-mail: pjohan@speech.kth.se
Journal of Voice, Vol. 26, No. 1, pp. 44-50
0892-1997/$36.00
© 2012 The Voice Foundation
doi:10.1016/j.jvoice.2010.10.007

INTRODUCTION

The term “belting” voice refers to the speechlike, yell-like, or shouting voice production heard in such commercial vocal substyles as pop, rock, R&B, jazz, country, and world music as well as in musical theater. It differs substantially from the esthetic of the Western classical voice tradition.

Traditionally, the belting voice has been described as loud, nasal, and more akin to shouting than to the operatic esthetic of singing. Over time, this sound, as it is used in the professional musical theater world, has evolved into a number of discernible substyles, which might be termed ringy, brassy, nasal, speechlike, and heavy belts.1 In musical theater singing, each substyle represents and enhances the portrayal of characters on the stage. The substyles are commonly used also in other types of so-called contemporary commercial music.

The need for a scientific description of belting was pointed out by Hollien and Miles.2 Physiological findings differentiating classical and belting voice production have shown many important differences including variances in vocal fold vibrational patterns, speed quotients, and closed quotients of the vocal folds, air pressure, cartilage tilting differences of the thyroid and cricoid cartilages, the position of the hyoid bone, involvement of the supralaryngeal muscles, resonance characteristics, and comparative laryngeal height.

In a single subject investigation, Estill3 analyzed acoustical and physiological characteristics of belting as compared with “speech” and “opera” modes. She found that belting was considerably louder than these styles and also that it was produced with a high degree of EMG activity in the vocalis muscle. Moreover, she found that the contact quotient,a as measured from an electroglottogram (EGG), was highest in belt and lowest in opera. She also noted differences in the electromyogram (EMG) signals recorded from a number of extrinsic laryngeal muscles.

a It should be noted that Estill and also many other authors, including Schutte and Miller4 and Evans and Howard,6 use the term “closed phase” instead of contact phase.

Schutte and Miller4 compared “nonclassical” and operatic styles of singing with respect to spectrographic, electroglottographic, and sub- and supraglottal pressure measurements and arrived at a definition of “belting” on the basis of their findings: “Belting is a manner of loud singing that is characterized by consistent use of ‘chest’ register (>50% closed phase of glottis) in a range in which larynx elevation is necessary to match the first formant with the second harmonic on open (high F1) vowels, that is ∼G4-D5 in female voices.”

Bestebreurtje and Schutte5 studied resonance strategies and EGG contact quotients that their single subject used in belting and an “unoptimized speechlike” style of singing in some vowels sung on the pitches G4 and B4-flat. They found that for /e/, the second formant F2 was close to the fifth harmonic partial, whereas for /i/, F1 was midway between the first and second partials. They concluded that the loud bright sound of the belting style is achieved by the implementation of resonance strategies that enhance higher harmonics. The contact quotient for belting averaged above 52%.

In a third single subject study, Evans and Howard6 compared the contact quotient in the belting and opera styles in some vowels sung on the pitches C4, E4, C5, and E5. For the opera style, the contact quotient varied in the range 18–34%, whereas the belt varied in the range 43–57%. Some variation with F0 was observed.

The purpose of the present study was to identify characteristic properties in some substyles of belting. Examples of the various belting substyles produced by a single subject were recorded. To ensure that these examples were representative of the respective substyles, a listening test was carried out by
an expert panel. The examples were analyzed with regard to subglottal pressure, voice source, and formant frequency characteristics.

METHOD

The single subject of the study was coauthor L.P., who is a voice coach and professional singer. She performed examples of five substyles of belt commonly heard in musical theater: heavy, brassy, ringy, nasal, and speechlike, and also classical voice production for comparison.1

The song material was an excerpt taken from “Everything’s Coming Up Roses” from Gypsy (1959, music by Jule Styne and lyrics by Stephen Sondheim), Figure 1. She sang this example first twice with the original lyrics and then twice with the syllable /pae/ replacing the syllables of the lyrics. Finally, she sang diminuendos repeating the syllable /pae/ on the pitch of F#4 (370 Hz). The subject sang each of these examples in each of the five substyles of belting and also, for comparison purposes, in classical style of singing.

FIGURE 1. Music of the song excerpt used in the recordings.

The recordings were made in the anechoic chamber of the Linguistics Department of Stockholm University. Figure 2 shows the experimental setup in terms of a block diagram. The audio signal was picked up by a Brüel & Kjær (Nærum, Denmark) condenser microphone (B&K 4190 2435611) at a distance of 30 cm. A Gaeltec (Dunvegan, Isle of Skye, Scotland) S7b pressure transducer, which the singer held in the corner of her mouth, was used to capture the oral pressure. The oral pressure during the occlusion for the consonant /p/ was used as an estimate of subglottal pressure. The electroglottograph signal was recorded from a Glottal Enterprises (Syracuse, NY) EG2 two-channel electroglottograph. These three signals were digitized and recorded onto different channels of a wav-file using SoundSwell software (Hitech Development, Solna, Sweden).

FIGURE 2. Experimental setup during the recordings in the anechoic chamber.

The audio signal was calibrated by sustaining a vowel, the sound pressure level (SPL) of which was measured by means of an Ono Sokki LA-210 (Ono Sokki Co., Ltd., Yokohama, Japan) sound level meter held next to the recording microphone. The SPL value thus observed was announced in the recording. The pressure transducer was calibrated by first immersing it into water at a depth that was measured and announced in the recording and then holding it in free air.

As the experiment was run on one single subject only, it was important to evaluate the examples of the various vocal substyles recorded. This was done in terms of a listening test with eight experts. The experts were all active as professional teachers of singing in these styles and thus thoroughly acquainted with the styles concerned. In the test, the stimuli were divided into three groups: (1) the song excerpt with the original lyrics, (2) the song excerpt sung on the syllable /pae/, and (3) the four first (loudest) tones from the diminuendos on the syllable /pae/, all produced in each of the six vocal styles. The stimuli were copied onto three test files, one per stimulus group. In each group, each stimulus occurred twice. The stimuli were separated by 5-second long pauses and appeared in random order. The total duration of the three test files was 460 seconds.

The test files were copied onto a CD and sent to each panel member. Their task was to classify the stimuli on paper sheets. Before the test, proper anchor stimuli were presented, typical of the various vocal substyles (Table 1).

TABLE 1.
List of the Anchor Stimuli Used in the Listening Test
Heavy: Lisa Kirk, Big Time from Mack & Mabel, music and lyrics by Jerry Herman
Brassy: Ethel Merman, There’s No Business Like Show Business, from Annie Get Your Gun, music and lyrics by Irving Berlin
Ringy: Debbie Gravitte, Secret Love from Calamity Jane, by Sammy Fain and Paul Francis Webster
Nasal: Patti Lupone, As Long As He Needs Me from Oliver!, music and lyrics by Lionel Bart
Speechlike: Idina Menzel, No Good Deed from Wicked, music and lyrics by Stephen Schwartz
Classical: Beverly Sills, Una Voce Poco Fa, from Il Barbiere di Siviglia by Gioachino Rossini

The voice source was analyzed by inverse filtering the audio signal by means of the custom-made Decap program (Svante Granqvist, KTH, Sweden), complemented by the derivative of the EGG waveform (dEGG). Two criteria were applied for tuning the inverse filters: (1) a ripple-free closed phase and (2) synchronicity of the main positive peak of the dEGG and flow waveform discontinuity representing the closing of the glottis.7 The frequencies and bandwidths of the inverse filter settings and the resulting flow glottograms were saved to files together with their derivatives and the dEGG signals. From the flow
glottogram were determined (1) period T0, (2) closed phase, (3) pulse amplitude, (4) maximum flow declination rate (MFDR), and (5) the level of the voice source fundamental relative to that of the second source spectrum partial, H1-H2. The closed quotient Qclosed was calculated as the ratio between the closed phase duration and T0, and the normalized amplitude quotient (NAQ) was calculated as the ratio between the pulse amplitude and the product of MFDR and T0. Unfortunately, the speechlike /pae/-diminuendo examples could not be inverse filtered because of nasalization, since nasalization makes the transfer function of the voice organ too complex.

RESULT

To check the consistency of the listening panel members, each stimulus occurred twice, in a random order, in the test. The numbers in Table 2 show the consistency of listeners’ classifications in terms of the percentage of identical classifications of repeated stimuli. The classifications given by listener 8 were discarded because they were far less consistent than those of the remaining experts. The remaining listeners responded consistently to between 61% and 100% of the stimuli. This suggests that their classifications represented rather reliable information.

TABLE 2.
Consistency of Listeners’ Classifications
                      Rater
Stimulus Group          1    2    3    4    5    6    7    8
Single tones /pae/     67  100  100   83  100   83  100   50
Song /pae/             33  100  100   50  100  100   83   33
Song, lyrics           83  100  100   50  100   67   33   50
The numbers show, for each listener and for the indicated stimulus group, the percentage of identical classifications of the six repeated stimuli.

The above results do not reveal how representative the various stimuli were for the different styles. This, however, can be seen in Table 3, showing the percentages of classifications that were in agreement with the singer’s intentions. In most cases, the singer’s intended styles resulted in high percentages. This implies that most of the examples were representative.

TABLE 3.
Confusion Matrix Showing the Percentages of the Seven Experts’ Classifications That Were in Agreement With the Singer’s
Intentions in Each of the Stimulus Groups. Bold Digits Represent Classifications in Agreement With the Singer’s Intentions.
Classified as
Intended Brassy Classical Heavy Nasal Ringy Speechlike
Single pitches /pae/
Brassy 64 7 22 7
Classical 100
Heavy 7 93
Nasal 14 57 29
Ringy 64 36
Speechlike 14 86
Song /pae/
Brassy 93 7
Classical 93 7
Heavy 7 86 7
Nasal 64 7 29
Ringy 29 64 7
Speechlike 100
Song lyrics
Brassy 64 7 7 15 7
Classical 100
Heavy 7 93
Nasal 86 7 7
Ringy 14 7 65 14
Speechlike 7 93

As might be expected, the stimuli consisting of the four first tones from the diminuendos on the syllable /pae/ seemed to be somewhat more difficult to classify than the song stimuli. Classical, speechlike, and heavy received the highest number of “intended” classifications, whereas ringy, nasal, and brassy received the lowest numbers. The four-tone stimuli in the ringy style were actually classified in ways deviating from the singer’s intentions in a majority of cases.

Summarizing, the listening test showed that most of the stimuli were rather easy to classify and that in most cases, the classifications made by the expert listeners were in agreement with those of the singer. This supported the assumption that it was meaningful to analyze in more detail the phonation and resonance characteristics of these stimuli.

Voice source properties

Figure 3 shows subglottal pressure Psub and equivalent sound level Leq, averaged across tones for the different vocal styles
in the /pae/ version of the song excerpt and in the diminuendo. Because of technical problems, no Psub data were obtained for the song example in classical. Heavy was produced with the highest pressures and also showed the greatest variation of pressure, as indicated by the high standard deviation in the song.

FIGURE 3. Psub and Leq averaged across tones for the different vocal styles in the /pae/ version of the song excerpt and, in the left graph, also for the diminuendo sequences.

Figure 4A–E shows the MFDR, closed quotient, NAQ, H1-H2, and pulse amplitude, respectively, for data observed in the inverse filtering of the diminuendo sequences of the /pae/. The large symbols included in the same graphs refer to values observed in the /pae/ version of the song excerpt produced at an F0 within ±1 semitone from the F0 used in the diminuendo sequences of /pae/.

FIGURE 4. A. MFDR, B. closed quotient, C. NAQ, D. H1-H2, and E. pulse amplitude for data observed in the flow glottograms of the diminuendo sequences of /pae/. The large open symbols refer to values observed in the /pae/ version of the song excerpt and produced at an F0 within ±1 semitone from the F0 used in the diminuendo sequences of /pae/.

Heavy produced extreme values both in Psub and the various glottogram parameters; MFDR and Qclosed were high, whereas H1-H2 and NAQ were low. Classical assumed the opposite extreme, in the sense that Psub, MFDR, and Qclosed were low and H1-H2 and NAQ were high. Brassy showed high Qclosed and low H1-H2.

For heavy and brassy, the song excerpt values are reasonably close to the values observed from the diminuendo material. For ringy, on the other hand, the song excerpt yielded values quite separate from the diminuendo values. Speechlike assumed a position with medium Psub; low MFDR, H1-H2, and NAQ; and high Qclosed. In all these parameters except NAQ, speechlike was close to brassy. The pulse amplitude showed no clear differences between the vocal styles, neither for the /pae/-diminuendo nor for the /pae/ version of the song.

Figure 5A–E summarizes the data shown in Figure 4 in terms of the averages and standard deviations of MFDR, Qclosed, H1-H2, NAQ, and pulse amplitude, respectively, as functions
of the average Psub for the diminuendo sequences of /pae/. As expected, heavy assumes an extreme position not only with regard to Psub but also to MFDR, H1-H2, and NAQ, whereas classical assumes the opposite extreme positions not only with regard to a low Psub but also to low Qclosed and high H1-H2. Brassy and ringy are close to each other with regard to Psub, MFDR, NAQ, and pulse amplitude, whereas brassy as compared with ringy is higher in Qclosed and lower in H1-H2. The standard deviations are mostly small for classical and heavy.

FIGURE 5. Data shown in Figure 4 but here represented by their averages and standard deviations plotted as functions of the associated averages of Psub.

The long-term average spectrum (LTAS) curves of the different versions of the song are shown in Figure 6. Classical stands out because of low levels between 500 and 3000 Hz, a marked peak at 1600 Hz and a comparatively high level between 3000 and 4000 Hz. Also nasal shows a peak near 1600 Hz. All other substyles show a marked minimum at 2500 Hz. The highest peak appears near 350 Hz for classical. This is close to the average F0 of 345 Hz of the song example; so this LTAS peak simply reflects a strong fundamental. The other substyles show a high peak near 750 Hz, that is, at twice the mean F0; so in these styles, the second partial tended to be strong. Heavy, ringy, and brassy have high levels between about 1000 and 1700 Hz, producing the highest LTAS levels. All styles except classical show a marked peak near 3000 Hz. It should be kept in mind, however, that the example was no more than about 10 seconds long and, disregarding the single, short, penultimate note, had an ambitus of no more than a minor third.

FIGURE 6. LTAS curves of the different versions of the song excerpt.

The formant frequencies derived from inverse filter analysis of the various syllables of the lyrics are shown in Figure 7. The variation between styles appears to be unsystematic except for F1 in classical, which is markedly lower than in the belting substyles. The higher F1 values in the belting substyles would reflect articulatory characteristics, such as a wider jaw opening, a narrower pharynx, and/or spread lips.

The dotted curves in Figure 7 show the frequencies of the three lowest spectrum partials. The formant frequencies show no consistent tendency to coincide with the two lowest partials. The first formant is close to the second partial in no more than two cases, in the vowel (e) in the word great, F0 ≈ 343 Hz, and on the vowel [ ] in the word the, F0 ≈ 332 Hz, where the second partial was about 15 and 10 dB stronger than the fundamental, respectively. The level difference between partials 1 and 2 in the spectrum seemed to vary depending on F0 and vowel.
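
The comparison between formants and spectrum partials described above can be illustrated numerically. The following is a minimal sketch, not the authors' analysis procedure: it lists the lowest partials of a tone as integer multiples of F0 and reports how far, in semitones, a formant frequency lies from the nearest partial. The function names and the F1 value of 700 Hz are hypothetical and chosen only for illustration; the F0 values of 343 and 332 Hz are the ones quoted in the text.

```python
import math


def partial_frequencies(f0_hz, n_partials=3):
    """Return the frequencies of the n lowest partials (integer multiples of F0)."""
    return [k * f0_hz for k in range(1, n_partials + 1)]


def semitone_distance(f_a_hz, f_b_hz):
    """Signed distance from f_b to f_a in equal-tempered semitones."""
    return 12.0 * math.log2(f_a_hz / f_b_hz)


def nearest_partial(formant_hz, f0_hz, n_partials=3):
    """Return (partial number, signed distance in semitones) for the closest partial."""
    partials = partial_frequencies(f0_hz, n_partials)
    distances = [semitone_distance(formant_hz, p) for p in partials]
    best = min(range(n_partials), key=lambda i: abs(distances[i]))
    return best + 1, distances[best]


if __name__ == "__main__":
    # F0 values are taken from the text; the F1 value is a made-up illustration.
    hypothetical_f1_hz = 700.0
    for f0_hz in (343.0, 332.0):
        number, distance = nearest_partial(hypothetical_f1_hz, f0_hz)
        print(f"F0 = {f0_hz:.0f} Hz: nearest partial is no. {number}, "
              f"{distance:+.2f} semitones away")
```

Expressing the distance in semitones rather than in hertz keeps the comparison meaningful across pitches, since the spacing between partials grows with F0.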
FIGURE 7. Formant frequencies, derived from the inverse filter analysis of the various syllables of the lyrics of the song excerpt.

DISCUSSION

Our investigation has focused on one single subject, which calls into question the general validity of the findings. To counteract this, we presented a listening test to an expert panel well acquainted with the styles of singing concerned. The result of this test revealed that all listeners except one mostly classified the same stimulus the same way both times it occurred in the listening test. Thus, the classifications seemed to cause rather limited difficulties to the panel. Moreover, the panel’s classifications were mostly in good agreement with the intentions of the singer subject; so it seems fair to conclude that the examples analyzed were representative of the styles considered and hence merit detailed analysis. Furthermore, analyzing one single subject implies that the same voice organ was used in all styles, thus entailing the method’s advantage that the subject serves as her own control.

Schutte and Miller4 reported that the belt and legit styles of singing differed in a number of ways. First, belt is produced with an elevated position of the larynx “necessary to match the first formant with the second harmonic on open (high F1) vowels, that is, G4–D5 in female voices.” Recently, Bourne and Garnier8 noted that F1, which they refer to as R1, was higher in “chesty belt” than in legit and reported that “‘Chesty-belt’ quality was characterized by a tuning of R1 to 2f0, whereas ‘Legit’ did not show any particular resonance adjustment.” Our analyses concerned pitches in the range F4–Ab4 mainly, but there were several examples of high F1 vowels. As shown in Figure 7, there was no tendency for F1 to be close to the second spectrum partial. This observation is in accordance with findings recently reported by Lebowitz and Baken.9 Indeed, in the word gonn(a), the first formant was quite close to the third partial. In the source spectrum, on the other hand, the first partial was much stronger in classical than in the belt styles, particularly the heavy.

We found no striking formant frequency differences between the styles except that classical had a consistently lower F1 in all vowels analyzed. As the examples sounded clearly different, as quantitatively evidenced by the listening test, this result implies that the voice source was the main factor distinguishing the styles.

Belting has been found to be produced with high Psub (ref. 10) and in modal register.4 Our analyses of the voice source properties produced nicely stratified distributions of data parameters. On average across tones and pitches, Psub was highest in heavy and lowest in classical, with ringy and brassy being similar to each other in this parameter. Ringy, in particular, was quite close to classical. As belt is produced in modal register, the vocal folds can be assumed to be thick, thus requiring high driving pressures. A further support for this assumption is provided by Qclosed, which was close to 0.5 in heavy and brassy and no more than 0.25 in classical. Ringy showed an intermediate Qclosed average of about 0.3. The low quotient for classical may reflect a register effect; in the F0 region concerned, classically trained female singers rarely use a pure modal register.

The voice source fundamental was quite dominant in classical and much weaker in the other styles, heavy showing the weakest, as might be expected from the Qclosed data. The pulse amplitude average was rather similar for the different belting substyles. This is not entirely surprising, given their Psub differences. An increase of Psub, a decrease of glottal adduction, and a decrease of the closed phase tend to contribute to an increase of the pulse amplitude. NAQ, inversely related to perceived degree of pressedness,11 was low for heavy and higher and similar for the other styles, including classical.

MFDR merits a special analysis because it reflects the vocal tract excitation strength, that is, the contribution of the voice source to the radiated sound level. Heavy showed the highest MFDR values, indicating that here the voice source was contributing substantially to vocal loudness. The means for brassy, classical, and ringy were all similar, and the rank order was the same as for the Leq.

We showed the results of the averages of the voice source parameters as functions of Psub. This may be a promising method of describing the voice use in a particular style of singing. The three main control parameters of the voice source are Psub, F0, and glottal adduction. The first two are easy to measure; but in the present
investigation, F0 varied within a very narrow range, no more than a minor third, if a single low tone is disregarded. Therefore, F0 did not seem to be an important parameter. Glottal adduction, by contrast, was certainly a relevant source parameter, and Qclosed, H1-H2, and NAQ separated the different styles more or less clearly. At present, it seems difficult to decide which parameter or combination of these parameters reflects glottal adduction most faithfully. In future investigations of vocal styles, it may be worthwhile to plot voice source data in a three-dimensional graph with Psub, F0, and glottal adduction as dimensions.

Acknowledgments

Hassan Djamshidpey at Department of Linguistics of Stockholm University and Professor Sten Ternström kindly assisted in the recordings. Financial support for this research and development project was provided by University College of Music Education in Stockholm (SMI), Sweden.

REFERENCES

1. Popeil L. Multiplicity of belting. J Sing. 2007;64:77–80.
2. Hollien H, Miles B. Whither belting? J Voice. 1990;4:64–70.
3. Estill J. Belting and classic voice quality: some physiological differences. Med Probl Perform Art. 1988;3:37–43.
4. Schutte HK, Miller DG. Belting and pop, nonclassical approaches to the female middle voice: some preliminary considerations. J Voice. 1988;7:142–150.
5. Bestebreurtje M, Schutte H. Resonance strategies for the belting style: results of a single female subject study. J Voice. 2000;14:194–204.
6. Evans M, Howard D. Larynx closed quotient in female belt and opera qualities: a case study. J Voice. 1993;2:7–14.
7. Sundberg J, Thalén M, Alku P, Vilkman E. Estimating perceived phonatory pressedness in singing from flow glottograms. J Voice. 2004;18:56–62.
8. Bourne T, Garnier M. Physiological and acoustic characteristics of the female music theatre voice in ‘belt’ and ‘legit’ qualities. Paper presented at: Proceedings of the International Symposium on Music Acoustics. August 25–31, 2010; Australia: 1–5.
9. Lebowitz A, Baken R. Correlates of the belt voice: a broader examination. J Voice. 2011;25:159–165.
10. Hein M. Die Gesangtechnik des Beltings, eine Studie über Atemdruck, Lungenvolumen und Atembewegungen. Diss., Universität Hamburg; 2010.
11. Henrich N, d’Alessandro C, Doval B, Castellengo M. On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation. J Acoust Soc Am. 2004;115:1321–1332.
