David A. Berry
The University of Iowa, Iowa City, IA

Katherine Verdolini
Harvard Medical School, Boston, MA

Douglas W. Montequin
The University of Iowa, Iowa City, IA

Markus M. Hess
University of Hamburg, Hamburg, Germany

Roger W. Chan
Ingo R. Titze
The University of Iowa, Iowa City, IA

A Quantitative Output-Cost Ratio in Voice Production

A quantitative output-cost ratio (OCR) is proposed for objective use in voice production and is defined as the ratio of the acoustic output intensity to the collision intensity of the vocal folds. Measurement of the OCR is demonstrated in an experiment using 5 excised canine larynges and a transducer designed for use on human subjects. Data were gathered at a constant fundamental frequency (150 Hz). Subglottal pressure was varied from 1.0 to 1.6 kPa, and glottal width at the vocal processes was varied from a pressed condition to a 2-mm gap. The OCR was plotted as a function of glottal width. With no vocal tract, the excised larynx experiments yielded a broad maximum in the OCR curves, across all subglottal pressure conditions, at about 0.6 mm. Computer simulations indicate that sharper maxima may occur when the influence of the vocal tract is taken into account. The potential clinical utility of the OCR is discussed for treatment of a wide range of voice disorders, including those involving both hyper- and hypoadduction.

KEY WORDS: vocal folds, impact stress, intraglottal stress transducers, vocal efficiency, vocal economy

The general concept of an output-cost ratio (OCR) has been considered important in voice physiology for decades. As an example, glottal efficiency (Schutte, 1981) has been defined as the ratio of radiated oral acoustic output power to aerodynamic input power, the latter being mean subglottal pressure times mean flow during phonation (Bouhuys, Mead, Proctor, & Stevens, 1968; Schutte, 1981; van den Berg, 1956). As another example, the ratio of AC to DC flow through the glottis during phonation has also been labeled a type of vocal efficiency (Isshiki, 1981; see also Hillman, Holmberg, Perkell, Walsh, & Vaughn, 1989).
However, such measures of aerodynamic efficiency sometimes favor an effortful, pressed production, while ignoring the expense to vocal fold tissues. Clinically, it is reasonable to consider a different type of output-cost relation in phonation, where cost is considered to be the potential insult to the tissue. Within this framework, the OCR would be the amount of acoustic output obtained during phonation divided by the amount of mechanical stress inflicted upon the vocal folds. There is evidence to suggest that impact stress is an important causal factor for vocal nodules (Jiang & Titze, 1994; Titze, 1994a), lesions that are common among those who speak loudly or at high pitches (see, for example, FitzHugh, Smith, & Chiong, 1958; Nagata et al., 1983). Pathological profiles of nodular tissue include obliteration of microvilli and surface desquamation of epithelial cells (Gray, Titze, & Lusk, 1987), a reduction in basal membrane adherence to the lamina propria, abnormal fibronectin accumulation, and appearance of collagen type IV within the lamina propria (Gray, 1991; Gray, Hirano, & Sato, 1993; Gray, Pignatari, & Harding, 1994). Thus, impact stress should be a good indicator of potential trauma.

Journal of Speech, Language, and Hearing Research • Vol. 44 • 29-37 • February 2001 • ©American Speech-Language-Hearing Association • 1092-4388/01/4401-0029

These considerations provided the basis for a previous report that proposed the concept of "laryngeal efficiency" as an approach to output-cost relations in voice production (Verdolini & Titze, 1995). The prediction in this earlier study was that maximum economy would be obtained with barely abducted folds.
This prediction was based on (a) acoustic power calculations of an inverse-filtered glottal waveform (Titze, 1994b, Chapter 9), which show a broad maximum around an open quotient slightly greater than 0.5, and (b) impact stress data obtained from a canine hemilarynx investigation, with results averaged over nine larynges (Jiang & Titze, 1994). Although the general optimization of an OCR was demonstrated by gleaning data from several previous studies, further quantitative experimental follow-up was needed to position it for use in the clinic. Specifically, the measure (a) needed to be performed with a transducer that could be used on human subjects and (b) needed to be explored within the context of a single, cohesive investigation.

In this report, we begin to address these issues by pursuing the concept of an OCR from both a physical and mathematical perspective. Specifically, OCR is measured on excised canine larynges, utilizing a transducer that could be used with human subjects. The specific questions to be addressed were: Do data from the excised larynx experiments confirm the presence of maxima in the OCR function, and if so, for which specific glottal widths? The results should be useful for clinical estimates of glottal configurations that maximize voice output, while at the same time providing relative protection to the tissue from phonotrauma.

Definition of the Output-Cost Ratio

In this paper, the OCR is defined as the ratio of acoustic output intensity to the impact intensity of the folds (i.e., the rate at which energy per unit area is absorbed by laryngeal tissues due to collision). In a previous study (Verdolini & Titze, 1995), the ratio of acoustic output power to impact stress was investigated. Although the output-cost idea is similar in either case, most traditional efficiency and economy ratios have been defined as dimensionless quantities.
Intuitively, it is appealing to compare similar physical quantities (intensity and intensity, rather than power and stress). Furthermore, when deviations are expected to span orders of magnitude, as in this study, it is important to have a dimensionless quantity so that ratios can be reported on a common logarithmic scale, such as dB. When so expressed, the OCR can be computed as the difference between the two intensity curves.

A physical argument will now be given to demonstrate that impact stress squared, rather than stress by itself, is most directly related to impact intensity. First, observe that the general definition for the stored energy $E$ in a small volume $V$ of tissue may be expressed by the following integral:

$$E = \frac{1}{2}\int_V \sigma\,\varepsilon\,dV \tag{1}$$

where $\sigma$ is the tissue stress and $\varepsilon$ is the tissue strain (deformation). If we assume a linearly viscoelastic material for vocal fold tissues, then a constitutive equation will be of the form (Chan & Titze, 1999; Fung, 1993)

$$\sigma = \mu\,\varepsilon + \eta\,\dot{\varepsilon} \tag{2}$$

where $\mu$ is an elastic shear modulus, $\eta$ is a viscosity, and $\dot{\varepsilon}$ is the strain rate. For sinusoidal time variation, the strain can be written as

$$\varepsilon = \varepsilon_0\,e^{j\omega t} \tag{3}$$

which results in the following equation:

$$\sigma = (\mu + j\omega\eta)\,\varepsilon \tag{4}$$

Note that $\varepsilon$ can now be written as $\sigma/(\mu + j\omega\eta)$ and that the impact energy is proportional to $\sigma^2$.

Acoustic intensity is measured with a sound level meter, which senses the acoustic sound pressure $P$ and references it to some nominal value of pressure $P_0$ (normally 20 μPa). This measure is valid because sound intensity is known to be proportional to the square of the sound pressure, just as impact intensity was shown to be proportional to the square of the impact stress. Thus, the output-cost ratio becomes:

$$\mathrm{OCR} = 20\log_{10}\!\left(\frac{P}{P_0}\right) - 20\log_{10}\!\left(\frac{\sigma_i}{\sigma_0}\right) \tag{6}$$

where $\sigma_i$ is the impact stress and $\sigma_0$ is some nominal value of stress. With this definition, OCR can be computed as the SPL (sound pressure level) minus the ISL (impact stress level).
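As a worked illustration of this definition, the following Python sketch computes the SPL, the ISL, and their difference. The function names and the input values are hypothetical and chosen only for illustration. The 1 kPa stress reference is the one stated for the impact intensity curves in the Results, and 20 μPa is the usual SPL reference.

```python
import math

P_REF_PA = 20e-6        # SPL reference pressure: 20 micropascals
SIGMA_REF_PA = 1000.0   # stress reference: 1 kPa (reference quoted in the Results)


def level_db(value, reference):
    """Return 20*log10(value/reference); both arguments must be positive."""
    if value <= 0 or reference <= 0:
        raise ValueError("value and reference must be positive")
    return 20.0 * math.log10(value / reference)


def output_cost_ratio_db(sound_pressure_pa, impact_stress_pa,
                         p_ref=P_REF_PA, sigma_ref=SIGMA_REF_PA):
    """OCR in dB, computed as SPL minus ISL (hypothetical helper)."""
    spl = level_db(sound_pressure_pa, p_ref)
    isl = level_db(impact_stress_pa, sigma_ref)
    return spl - isl


# Illustrative inputs only (not measured values from this study):
# a sound pressure of 0.2 Pa and an impact stress of 2 kPa.
ocr = output_cost_ratio_db(0.2, 2000.0)
print(f"OCR = {ocr:.2f} dB")
```

Because both terms are logarithmic, doubling the impact stress at a fixed sound pressure lowers the OCR by about 6 dB.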
Method

This experiment was performed with a stress transducer, which has been used successfully to measure intraglottal impact stress on human subjects, as will be documented in a future report. In addition, this experiment complements a previous investigation by Jiang and Titze (1994) that utilized a transducer appropriate for a hemilarynx methodology, but not for the clinic. Indeed, one of the purposes of the present investigation was to help bridge the gap between laboratory hemilarynx experiments and experiments performed on human subjects.

Five excised canine larynges were provided by a cardiac research laboratory at the University of Iowa. Canine weights ranged between 20 and 28 kg. Animals were sacrificed for use in cardiac research, and the larynges were made available to us about an hour post-mortem.

The experimental set-up was developed previously (Durham, Scherer, Druker, & Titze, 1987). In addition, specifics pertinent to this study were described in detail in a recent report (Verdolini, Chan, Titze, Hess, & Bierhals, 1998). In brief, the larynx was mounted on a laboratory bench. A three-pronged device, coupled to a micrometer, was used to manipulate the arytenoids and thereby the intervocal process width. Another micrometer attached at the thyroid notch was used to establish vocal fold length. During phonation trials, predetermined intervocal process widths were held constant with wooden shims, the thickness of which had been confirmed with a digital caliper (Mitutoyo Digimatic).

Phonation was induced by delivering humidified, warmed air to the larynx, using an air compressor and a ConchaTherm III heater-humidifier (Respiratory Care, Inc.). Subglottal pressures were measured with an open-ended manometer (Dwyer No. 1211). The vocal folds were moistened with saline (0.9%) intermittently throughout the experiment.
Trials were conducted with several intervocal process widths, including −1.0, −0.5, 0.0, 0.27, 0.5, 1.0, 1.5, 1.75, and 2.0 mm. Negative widths did not imply tissue overlap, but rather a squeezing together of the tissue relative to the 0 mm width. Such negative widths were employed to simulate a "pressed" vocal fold configuration. Acoustic intensity and impact stress were measured for all glottal widths. Within each width condition, subglottal pressures of 1.0, 1.2, 1.4, and 1.6 kPa were delivered to induce phonation. The target f0 was 150 Hz, although the actual f0 for most trials was 150 ± 5 Hz.

Output intensity measures were obtained in dB using a Brüel and Kjaer 2230 sound level meter at a constant distance of 16 cm and 45 degrees azimuth. The C scale weighting was used to eliminate low-frequency room noise and artifacts below about 50 Hz. Fundamental frequency was measured using a Shure (SM48) dynamic microphone with a dual-beam oscilloscope.

During each phonation trial, impact stress measures were made at the midpoint of the membranous vocal folds. This location was identified with a digital caliper prior to data collection. For each trial, impact stress was measured with a modified piezoelectric catheter transducer, which used a lateral window for stimulation (see Figure 1a). By itself, the transducer had a catheter diameter of 2.0 mm, which was deemed too large for intraglottal insertion. To adapt the transducer for human subjects and to effectively "thin" out the intraglottal tip, an elliptical silicone bulb was attached to the end of the catheter. The silicone bulb was liquid filled to induce traveling waves from the point of contact (the bulb) to the transducer window. To help stabilize the bulb, a palladium wire (0.2 mm diameter) was arched, making a single-wire frame.
Measurements and a visual representation of the transducer modification are presented in Figures 1b and 1c. As shown, the dimensions for intraglottal insertion of the modified transducer tip were 1.8 mm in the medial-lateral direction and 3.5 mm in the anterior-posterior direction. The transducer's frequency response extended from DC into the kHz range, and the dynamic range was 0-250 psi. Calibration was conducted using a sphygmomanometer (Nissei D-256038) and a Fluke digital multimeter. We evaluated responses between 0 and 40 mm Hg (0-6.92 kPa) and confirmed linearity within this range.

Prior to each trial, the sensor was placed and stabilized at the midpoint of the membranous vocal folds, as noted, by a three-dimensional manipulation micrometer so that the bottom 2-3 mm of the transducer tip was inserted between the folds. This positioning was chosen because further embedding would have precluded visual verification of constant sensor depth, and less embedding would have risked sensor extrusion during vocal fold oscillation.

Acoustic and impact stress signals were monitored online with a Tektronix 2212 60-MHz digital storage oscilloscope. Output intensities were directly noted from the sound level meter during each trial and saved for later analyses. Impact stress signals were digitized at 10 kHz following anti-aliasing filtering and captured with a Sony PC-108M eight-channel instrumentation cassette digital audio tape (DAT) recorder.

Data Processing and Analysis

Signals were digitized at 10 kHz with a 12-bit CODAS/WINDAQ A/D conversion board and CODAS software. For each phonation trial, stable segments of 20 or more cycles were visually identified, which corresponded to stable phonation as auditorily perceived. Peak-to-peak values in the AC stress signal were computed, and an average value was obtained for the segment. Sample impact stress waveforms are shown in Figure 2 for subglottal pressures of 1.0, 1.2, 1.4, and 1.6 kPa.
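The peak-to-peak averaging step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the study identified stable segments visually, whereas this sketch approximates cycles by consecutive fixed-length windows derived from the nominal f0. The function name and the synthetic sinusoid are hypothetical and are not data from the experiment.

```python
import math


def mean_peak_to_peak(signal, sample_rate_hz, f0_hz):
    """Average the per-cycle peak-to-peak amplitude of a stable segment.

    Assumption: cycles are approximated by consecutive fixed-length windows
    of round(sample_rate_hz / f0_hz) samples.
    Returns (mean peak-to-peak value, number of cycles used).
    """
    samples_per_cycle = int(round(sample_rate_hz / f0_hz))
    if samples_per_cycle < 1:
        raise ValueError("sample rate is too low for the given f0")
    n_cycles = len(signal) // samples_per_cycle
    if n_cycles == 0:
        raise ValueError("segment is shorter than one cycle")
    total = 0.0
    for k in range(n_cycles):
        cycle = signal[k * samples_per_cycle:(k + 1) * samples_per_cycle]
        total += max(cycle) - min(cycle)
    return total / n_cycles, n_cycles


# Synthetic stand-in for a stress signal (kPa): a 150 Hz sinusoid with
# 1.5 kPa amplitude sampled at 10 kHz for 0.2 s. Not data from the study.
FS_HZ, F0_HZ = 10_000, 150
segment = [1.5 * math.sin(2 * math.pi * F0_HZ * n / FS_HZ) for n in range(2000)]
mean_p2p, cycles = mean_peak_to_peak(segment, FS_HZ, F0_HZ)
print(f"{cycles} cycles, mean peak-to-peak = {mean_p2p:.3f} kPa")
```

Returning the cycle count alongside the mean makes it easy to check that a segment meets the 20-cycle minimum before its average is used.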
For subglottal pressures greater than or equal to 2.0 kPa, the impact stress waveforms were similar to the "typical" impact stress waveform shown in Jiang and Titze (1994). For each trial, the OCR was calculated using the formula presented in Equation 6.

Figure 1. A schematic of the modified piezoelectric transducer used to measure impact stress. Panels: lateral view of the unmodified catheter; lateral view and coronal view of the modified catheter transducer.

Results

Glottal Output Intensity

The average acoustic data across the five larynges are shown in Figure 3 as a function of glottal width for four distinct subglottal pressure conditions. For all conditions, maximum acoustic intensity was produced with the vocal folds separated by 0.0-0.27 mm at the vocal processes. Assuming the data to be linear functions of subglottal pressure, and either linear or quadratic functions of glottal width, best-fit solutions to the data are shown in Figures 3a and 3b, respectively. The R² values for the solutions in Figures 3a and 3b were 0.36 and 0.90. Cubic functions of glottal width were also considered. Although they captured the acoustic intensity peak between 0.0 and 0.27 mm quite well, they also predicted that acoustic intensity would start to increase again for glottal widths above 1.75 mm. Consequently, the solutions were deemed to be non-physiological.

Vocal Fold Impact Stress

Average impact stress data across the five larynges are shown in Figure 4 for four distinct subglottal pressure conditions. Assuming the data to be linear functions of subglottal pressure and cubic functions of glottal width, best-fit curves are plotted. Cubic functions of glottal width are suggested by the data, as well as logically. The data show an increase in impact intensity as the glottal width approaches zero.
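The kind of fit described in this section (linear in subglottal pressure, polynomial in glottal width) can be sketched as an ordinary least-squares problem. The exact model form and fitting procedure are not given in the text, so the sketch below assumes an additive model with no pressure-width interaction terms and solves the normal equations directly. All function names, the coefficient values, and the generated points are hypothetical placeholders, not results from the study.

```python
def design_row(ps_kpa, width_mm, degree):
    """Row [1, Ps, w, w^2, ..., w^degree] of the assumed additive model."""
    return [1.0, ps_kpa] + [width_mm ** k for k in range(1, degree + 1)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("system is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def predict(coeffs, ps_kpa, width_mm, degree):
    """Evaluate the fitted model at one (pressure, width) point."""
    return sum(c * x for c, x in zip(coeffs, design_row(ps_kpa, width_mm, degree)))


def fit_surface(points, degree):
    """Least-squares coefficients for points given as (Ps, width, value)."""
    rows = [design_row(ps, w, degree) for ps, w, _ in points]
    ys = [y for _, _, y in points]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve_linear_system(ata, aty)


def r_squared(points, coeffs, degree):
    """Coefficient of determination of the fit."""
    ys = [y for _, _, y in points]
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - predict(coeffs, ps, w, degree)) ** 2 for ps, w, y in points)
    return 1.0 - ss_res / ss_tot


# Synthetic check (arbitrary units, not study data): generate points from
# known coefficients on a pressure-by-width grid and recover them.
true_coeffs = [2.0, 1.5, 0.3, -0.8, 0.25]
pressures = [1.0, 1.2, 1.4, 1.6]
widths = [0.27, 0.5, 1.0, 1.5, 1.75, 2.0]
points = [(ps, w, predict(true_coeffs, ps, w, 3)) for ps in pressures for w in widths]
coeffs = fit_surface(points, 3)
print([round(c, 6) for c in coeffs], r_squared(points, coeffs, 3))
```

Changing `degree` to 1 or 2 gives the linear and quadratic width models under the same assumption. A library least-squares routine would normally be preferable to hand-solved normal equations, which are used here only to keep the sketch self-contained.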
The data also show a decrease in impact stress for glottal widths above 1.5 mm. In the mid-range (glottal width of 0.5-1.5 mm), the impact stress remains relatively constant. A cubic function of glottal width was needed to describe such curves. The R² value for the solutions shown in Figure 4 was 0.89. Solid lines are shown for solutions within the range of data collected, and dashed lines for extrapolated regions. Data values collected at non-positive glottal widths were not plotted and were not used for curve fitting, for reasons to be described in the Discussion section. For comparison, the Jiang and Titze (1994) impact stress data gathered from previous hemilarynx experiments were plotted and fit to a straight line, as shown in Figure 4.

Output-Cost Ratio

Figure 5 shows the OCR plotted as a function of glottal width. To derive this function, impact intensity curves (referenced to an impact pressure of 1 kPa) were subtracted from acoustic intensity curves. Across all subglottal pressure conditions, the curves revealed maxima in the OCR function for intervocal process widths of approximately 0.6 mm.

Discussion

Acoustic Intensity

In Figure 3, the acoustic intensity data were essentially flat as a function of glottal width. However, at high

