
AES

JOURNAL OF THE AUDIO ENGINEERING SOCIETY


AUDIO / ACOUSTICS / APPLICATIONS
Volume 52 Number 11 2004 November
AUDIO ENGINEERING SOCIETY, INC.
INTERNATIONAL HEADQUARTERS
60 East 42nd Street, Room 2520, New York, NY 10165-2520, USA
Tel: +1 212 661 8528. Fax: +1 212 682 0477
E-mail: HQ@aes.org. Internet: http://www.aes.org

ADMINISTRATION
Roger K. Furness, Executive Director
Sandra J. Requa, Executive Assistant to the Executive Director

OFFICERS 2004/2005
Theresa Leonard, President
Neil Gilchrist, President-Elect
Ronald Streicher, Past President
Jim Anderson, Vice President, Eastern Region, USA/Canada
Frank Wells, Vice President, Central Region, USA/Canada
Bob Moses, Vice President, Western Region, USA/Canada
Søren Bech, Vice President, Northern Region, Europe
Bozena Kostek, Vice President, Central Region, Europe
Ivan Stamac, Vice President, Southern Region, Europe
Mercedes Onorato, Vice President, Latin American Region
Neville Thiele, Vice President, International Region
Han Tendeloo, Secretary
Marshall Buck, Treasurer
Louis Fielder, Treasurer-Elect

GOVERNORS
Ronald Aarts, Jerry Bruck, Kees Immink, Garry Margolis, Ulrike Schwarz, Richard Small, Peter Swarte, John Vanderkooy

COMMITTEES
Awards: Kees A. Immink, Chair
Conference Policy: Søren Bech, Chair
Convention Policy & Finance: Marshall Buck, Chair
Education: Jason Corey, Chair
Future Directions: Theresa Leonard, Chair
Historical: J. G. (Jay) McKnight, Chair; Irving Joel, Vice Chair; Donald J. Plunkett, Chair Emeritus
Laws & Resolutions: Neil Gilchrist, Chair
Membership/Admissions: Francis Rumsey, Chair
Nominations: Ronald Streicher, Chair
Publications Policy: Richard H. Small, Chair
Regions and Sections: Subir Pramanik and Roy Pritts, Cochairs
Standards: Richard Chalmers, Chair
Tellers: Christopher V. Freitag, Chair

TECHNICAL COUNCIL
Wieslaw V. Woszczyk, Chair; Jürgen Herre and Robert Schulein, Vice Chairs

TECHNICAL COMMITTEES
Acoustics & Sound Reinforcement: Mendel Kleiner, Chair; Kurt Graffy, Vice Chair
Archiving, Restoration and Digital Libraries: David Ackerman and Chris Lacinak, Cochairs
Audio for Games: Martin Wilde, Chair; Ted Laverty, Vice Chair
Audio for Telecommunications: Bob Zurek, Chair; Andrew Bright, Vice Chair
Automotive Audio: Richard S. Stroud, Chair; Tim Nind, Vice Chair
Coding of Audio Signals: James Johnston and Jürgen Herre, Cochairs
High-Resolution Audio: Malcolm Hawksford, Chair; Vicki R. Melchior and Josh Reiss, Vice Chairs
Loudspeakers & Headphones: David Clark, Chair; Juha Backman, Vice Chair
Microphones & Applications: Geoff Martin, Chair; David Josephson, Vice Chair
Multichannel & Binaural Audio Technologies: Francis Rumsey, Chair; Gunther Theile, Vice Chair
Network Audio Systems: Jeremy Cooperstock, Chair; Robert Rowe and Thomas Sporer, Vice Chairs
Audio Recording & Storage Systems: Derk Reefman, Chair; Kunimaro Tanaka, Vice Chair
Perception & Subjective Evaluation of Audio Signals: Durand Begault, Chair; Søren Bech and Eiichi Miyasaka, Vice Chairs
Semantic Audio Analysis: Mark Sandler, Chair; Dan Ellis, Vice Chair
Signal Processing: Ronald Aarts, Chair; James Johnston and Christoph M. Musialik, Vice Chairs
Studio Practices & Production: George Massenburg, Chair; David Smith and Mick Sawaguchi, Vice Chairs
Transmission & Broadcasting: Stephen Lyman and Kimio Hamasaki, Cochairs

STANDARDS COMMITTEE
Richard Chalmers, Chair
Mark Yonge, Secretary, Standards Manager
John Woodgate, Vice Chair
Yoshizo Sohma, Vice Chair, International
Bruce Olson, Vice Chair, Western Hemisphere

SC-02 SUBCOMMITTEE ON DIGITAL AUDIO
Robin Caine, Chair; Robert A. Finger, Vice Chair
Working Groups:
SC-02-01 Digital Audio Measurement Techniques: Steve Harris, Ian Dennis, Michael Keyhl
SC-02-02 Digital Input-Output Interfacing: John Grant, Robert A. Finger
SC-02-08 Audio-File Transfer and Exchange: Mark Yonge, Brooks Harris
SC-02-12 Audio Applications Using High Performance Serial Bus (IEEE 1394): John Strawn, Bob Moses
SC-02-14 Internet Audio Delivery Systems: Karlheinz Brandenburg

SC-03 SUBCOMMITTEE ON THE PRESERVATION AND RESTORATION OF AUDIO RECORDING
Ted Sheldon, Chair; Dietrich Schüller and Chris Chambers, Vice Chairs
Working Groups:
SC-03-01 Analog Recording: J. G. McKnight
SC-03-02 Transfer Technologies: Lars Gaustad, Greg Faris
SC-03-04 Storage and Handling of Media: Ted Sheldon, Gerd Cyrener
SC-03-06 Digital Library and Archives Systems: David Ackerman, Ted Sheldon
SC-03-07 Audio Metadata: Chris Chambers
SC-03-12 Forensic Audio: Tom Owen, M. McDermott, Eddy Bogh Brixen

SC-04 SUBCOMMITTEE ON ACOUSTICS
Mendel Kleiner, Chair; David Josephson, Vice Chair
Working Groups:
SC-04-01 Acoustics and Sound Source Modeling: Richard H. Campbell, Wolfgang Ahnert
SC-04-03 Loudspeaker Modeling and Measurement: David Prince, Neil Harris, Steve Hutt
SC-04-04 Microphone Measurement and Characterization: David Josephson, Jackie Green
SC-04-07 Listening Tests: David Clark, T. Nousaine

SC-05 SUBCOMMITTEE ON INTERCONNECTIONS
Ray Rayburn, Chair; John Woodgate, Vice Chair
Working Groups:
SC-05-02 Audio Connectors: Ray Rayburn, Werner Bachmann
SC-05-05 Grounding and EMC Practices: Bruce Olson, Jim Brown

Correspondence to AES officers and committee chairs should be addressed to them at the society's international headquarters.
AES Journal of the Audio Engineering Society (ISSN 1549-4950), Volume 52, Number 11, 2004 November. Published monthly, except January/February and July/August when published bimonthly, by the Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA, Telephone: +1 212 661 8528, Fax: +1 212 682 0477, E-mail: HQ@aes.org. Periodical postage paid at New York, New York, and at an additional mailing office. Postmaster: Send address corrections to AES Journal of the Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520.

The Audio Engineering Society is not responsible for statements made by its contributors.

AES REGIONAL OFFICES
Europe Conventions: Zevenbunderslaan 142/9, BE-1190 Brussels, Belgium, Tel: +32 2 345 7971, Fax: +32 2 345 3419, E-mail for convention information: euroconventions@aes.org.
Europe Services: B.P. 50, FR-94364 Bry Sur Marne Cedex, France, Tel: +33 1 4881 4632, Fax: +33 1 4706 0648, E-mail for membership and publication sales: euroservices@aes.org.
United Kingdom: British Section, Audio Engineering Society Ltd., P. O. Box 645, Slough, SL1 8BJ UK, Tel: +44 1628 663725, Fax: +44 1628 667002, E-mail: UK@aes.org.
Japan: AES Japan Section, 1-38-2 Yoyogi, Room 703, Shibuya-ku, Tokyo 151-0053, Japan, Tel: +81 3 5358 7320, Fax: +81 3 5358 7328, E-mail: aes_japan@aes.org.

EDITORIAL STAFF
Daniel R. von Recklinghausen, Editor
William T. McQuaide, Managing Editor
Gerri M. Calamusa, Senior Editor
Abbie J. Cohen, Senior Editor
Mary Ellen Ilich, Associate Editor
Patricia L. Sarch, Art Director
Flávia Elzinga, Advertising
Ingeborg M. Stochmal, Copy Editor
Barry A. Blesser, Consulting Technical Editor
Stephanie Paynes, Writer

REVIEW BOARD
Ronald M. Aarts, James A. S. Angus, George L. Augspurger, Jerry Bauck, James W. Beauchamp, Søren Bech, Durand Begault, Barry A. Blesser, John S. Bradley, Robert Bristow-Johnson, John J. Bubbers, Marshall Buck, Mahlon D. Burkhard, Richard C. Cabot, Robert R. Cordell, Andrew Duncan, John M. Eargle, Louis D. Fielder, Edward J. Foster, Mark R. Gander, Earl R. Geddes, Bradford N. Gover, David Griesinger, Dorte Hammershøi, Malcolm O. J. Hawksford, Jürgen Herre, Tomlinson Holman, Andrew Horner, Jyri Huopaniemi, James D. Johnston, Arie J. M. Kaizer, James M. Kates, D. B. Keele, Jr., Mendel Kleiner, David L. Klepper, Wolfgang Klippel, Bozena Kostek, W. Marshall Leach, Jr., Stanley P. Lipshitz, Robert C. Maher, Dan Mapes-Riordan, Geoff Martin, J. G. (Jay) McKnight, Guy W. McNally, D. J. Meares, Robert A. Moog, Brian C. J. Moore, James A. Moorer, Dick Pierce, Martin Polon, D. Preis, Derk Reefman, Francis Rumsey, Kees A. Schouhamer Immink, Manfred R. Schroeder, Robert B. Schulein, Richard H. Small, Julius O. Smith III, Gilbert Soulodre, Herman J. M. Steeneken, John S. Stewart, John Strawn, G. R. (Bob) Thurmond, Jiri Tichy, Floyd E. Toole, Emil L. Torick, John Vanderkooy, Alexander Voishvillo, Daniel R. von Recklinghausen, Rhonda Wilson, John M. Woodgate, Wieslaw V. Woszczyk

AES REGIONS AND SECTIONS
Eastern Region, USA/Canada
Sections: Atlanta, Boston, District of Columbia, New York, Philadelphia, Toronto
Student Sections: American University, Appalachian State University, Berklee College of Music, Carnegie Mellon University, Duquesne University, Fredonia, Full Sail Real World Education, Hampton University, Institute of Audio Research, McGill University, New York University, Peabody Institute of Johns Hopkins University, Pennsylvania State University, University of Hartford, University of Massachusetts-Lowell, University of Miami, University of North Carolina at Asheville, William Patterson University, Worcester Polytechnic Institute

Central Region, USA/Canada
Sections: Central Indiana, Chicago, Cincinnati, Detroit, Kansas City, Nashville, Heartland, New Orleans, St. Louis, Upper Midwest, West Michigan
Student Sections: Ball State University, Belmont University, Columbia College, Michigan Technological University, Middle Tennessee State University, Music Tech College, SAE Nashville, Ohio University, Ridgewater College, Hutchinson Campus, Texas State University–San Marcos, University of Arkansas-Pine Bluff, University of Cincinnati, University of Illinois-Urbana-Champaign, University of Michigan, Webster University

Western Region, USA/Canada
Sections: Alberta, Colorado, Los Angeles, Pacific Northwest, Portland, San Diego, San Francisco, Utah, Vancouver
Student Sections: American River College, Brigham Young University, California State University–Chico, Citrus College, Cogswell Polytechnical College, Conservatory of Recording Arts and Sciences, Expression Center for New Media, Long Beach City College, San Diego State University, San Francisco State University, Cal Poly San Luis Obispo, Stanford University, The Art Institute of Seattle, University of Colorado at Denver, University of Southern California, Vancouver

Northern Region, Europe
Sections: Belgian, British, Danish, Finnish, Moscow, Netherlands, Norwegian, St. Petersburg, Swedish
Student Sections: All-Russian State Institute of Cinematography, Danish, Netherlands, Russian Academy of Music, St. Petersburg, University of Lulea-Pitea

Central Region, Europe
Sections: Austrian, Belarus, Czech, Central German, North German, South German, Hungarian, Lithuanian, Polish, Slovakian Republic, Swiss, Ukrainian
Student Sections: Aachen, Berlin, Czech Republic, Darmstadt, Detmold, Düsseldorf, Graz, Ilmenau, Technical University of Gdansk (Poland), Vienna, Wroclaw University of Technology

Southern Region, Europe
Sections: Bosnia-Herzegovina, Bulgarian, Croatian, French, Greek, Israel, Italian, Portugal, Romanian, Slovenian, Spanish, Serbia and Montenegro, Turkish
Student Sections: Croatian, Conservatoire de Paris, Italian, Louis-Lumière

Latin American Region
Sections: Argentina, Brazil, Chile, Colombia, Ecuador, Mexico, Peru, Uruguay, Venezuela
Student Sections: Del Bosque University, I.A.V.Q., Javeriana University, Los Andes University, Orson Welles Institute, San Buenaventura University, Taller de Arte Sonoro (Caracas)

International Region
Sections: Adelaide, Brisbane, Hong Kong, India, Japan, Korea, Malaysia, Melbourne, Philippines, Singapore, Sydney

PURPOSE: The Audio Engineering Society is organized for the purpose of: uniting persons performing professional services in the audio engineering field and its allied arts; collecting, collating, and disseminating scientific knowledge in the field of audio engineering and its allied arts; advancing such science in both theoretical and practical applications; and preparing, publishing, and distributing literature and periodicals relative to the foregoing purposes and policies.

MEMBERSHIP: Individuals who are interested in audio engineering may become members of the AES. Information on joining the AES can be found at www.aes.org. Grades and annual dues are: Full members and associate members, $95 for both the printed and online Journal; $60 for online Journal only. Student members: $55 for printed and online Journal; $20 for online Journal only. A subscription to the Journal is included with all memberships. Sustaining memberships are available to persons, corporations, or organizations who wish to support the Society.

COPYRIGHT: Copyright © 2004 by the Audio Engineering Society, Inc. It is permitted to quote from this Journal with customary credit to the source.

ONLINE JOURNAL: AES members can view the Journal online at www.aes.org/journal/online.

SUBSCRIPTIONS: The Journal is available by subscription. Annual rates are $190 surface mail, $240 air mail. For information, contact AES Headquarters.

COPIES: Individual readers are permitted to photocopy isolated articles for research or other noncommercial use. Permission to photocopy for internal or personal use of specific clients is granted by the Audio Engineering Society to libraries and other users registered with the Copyright Clearance Center (CCC), provided that the base fee of $1 per copy plus $.50 per page is paid directly to CCC, 222 Rosewood Dr., Danvers, MA 01923, USA. 0004-7554/95. Photocopies of individual articles may be ordered from the AES Headquarters office at $5 per article.

BACK ISSUES: Selected back issues are available: From Vol. 1 (1953) through Vol. 12 (1964), $10 per issue (members), $15 (nonmembers); Vol. 13 (1965) to present, $6 per issue (members), $11 (nonmembers). For information, contact AES Headquarters office.

MICROFILM: Copies of Vol. 19, No. 1 (1971 January) to the present edition are available on microfilm from University Microfilms International, 300 North Zeeb Rd., Ann Arbor, MI 48106, USA.

REPRINTS AND REPUBLICATION: Multiple reproduction or republication of any material in this Journal requires the permission of the Audio Engineering Society. Permission may also be required from the author(s). Send inquiries to AES Editorial office.

ADVERTISING: Call the AES Editorial office or send e-mail to: JournalAdvertising@aes.org.

MANUSCRIPTS: For information on the presentation and processing of manuscripts, see Information for Authors.
CONTENTS
President’s Message ....................................................................................................Theresa Leonard 1123
PAPERS
Dithered Noise Shapers and Recursive Digital Filters
...............................................................Stanley P. Lipshitz, Robert A. Wannamaker, and John Vanderkooy 1124
Quantizers combining the use of colored, nonsubtractive dither with or without noise-shaping error feedback
are closely examined for the first time. In particular, it is shown that the appropriate use of spectrally shaped
dither signals entails subtle practical and theoretical considerations, especially if such dithers are combined
with noise-shaping schemes. A rigorous analysis of systems employing such dither signals with and without
feedback is undertaken, yielding practical guidelines that ensure satisfactory results in applications. In
particular, it is shown that the class of dither signals suitable for combination with noise shaping is greatly
restricted.
Motion-Tracked Binaural Sound ...............V. Ralph Algazi, Richard O. Duda, and Dennis M. Thompson 1142
By using a head tracking system that selects from an array of microphones placed on a surface that
approximates the listener’s head, the authors have created an alternative to the conventional binaural
recording technique. Based on the orientation of the listener, the appropriate microphone is selected or the
signals from a pair of microphone signals are interpolated. Head tracking provides strong dynamic cues that
create a strong sense of realism with a reduced need for pinna matching. This approach supports multiple
listeners from a single microphone array.
Importance and Representation of Phase in the Sinusoidal Model
...............................................................................................Tue Haste Andersen and Kristoffer Jensen 1157
While the importance of differential phase among the components of an audio signal has been debated over
the years, experiments show that both synthesized and encoded audio need to consider phase integrity.
Subjective listening tests with sounds that were synthesized with varying amounts of phase information
clearly demonstrate the need to use a reliable phase model, especially with common harmonic musical
sounds. A novel phase representation, called partial-period phase, characterizes phase evolution as an
almost stationary parameter.
ENGINEERING REPORTS
Direct Approximate Third-Order Response Synthesis of Vented-Box Loudspeaker Systems
....................................................................................................................................Bernat Llamazares 1170
Extending the earlier work on vented-box loudspeaker systems, a new approach is proposed that provides
approximate third-order frequency-response alignments characterized by featuring a good transient
response at reduced efficiency for the lower frequencies. These new alignments are very suitable for
low-resonance, low-Q drivers, producing a response characteristic midway between a second-order
sealed-box and a fourth-order vented-box loudspeaker system. This approach provides an additional degree
of freedom in the design.
CORRECTIONS
Correction to “Analysis of Loudspeaker Line Arrays” .................................................Mark S. Ureda 1176

STANDARDS AND INFORMATION DOCUMENTS


AES Standards Committee News........................................................................................................... 1177
Synchronization; optical disks; media storage; file interchange; jitter
FEATURES
Metadata Revisited: Six New Things to Know About Audio Metadata ............................................... 1178
Digital Archive Strategies and Solutions for Radio Broadcasting...................................................... 1180
New Officers 2004/2005........................................................................................................................... 1185
26th Conference, Denver, Call for Papers ............................................................................................. 1200

DEPARTMENTS
News of the Sections ............................... 1188
Upcoming Meetings .................................. 1192
Sound Track ........................................ 1192
New Products and Developments ...................... 1193
Available Literature ............................... 1194
Membership Information ............................. 1195
Advertiser Internet Directory ...................... 1197
Sections Contacts Directory ........................ 1201
AES Conventions and Conferences .................... 1208
PRESIDENT'S MESSAGE

It is a great pleasure for me to address the membership of this truly remarkable society. In the past I have served the AES as a Governor and as Education Chair. I have also served on two convention committees, chaired an international conference, and have been chair of a regional section for the past nine years. Through these experiences, I have developed tremendous respect for the society and am very grateful to the membership for your vote of support. I would also like to take this opportunity to thank the staff at HQ and all the committee members I have worked with over the years. I look at this presidency as a new opportunity to make a unique contribution to the society and its members.

It is clear that there are many wonderful aspects to the AES, but what I would like to focus on are the challenges that we will face and try to overcome over the next year. One challenge is maintaining and broadening our membership. Over the last two years we have seen a dramatic increase in our overall membership numbers—especially with respect to students. This is partly due to streamlining the online application and registration procedure. How do we continue to keep our membership interested and growing? Our membership committee has cited ways we can retain and grow, through online access as well as by offering more resources through our website on a subscription basis with preferential rates for members. We should also consider enhanced online delivery of educational tools for students; for example, tutorials and convention workshop materials placed on the website for our members to access. Expanding our reach through online services and registrations will also facilitate a more international/global set of initiatives. With market reforms in the Far East, particularly China, the time is upon us to diversify and expand our membership in these growing technological markets. We must give special attention to recruitment and setting up new sections in these areas, in addition to promoting regional conferences and activities, which has been shown to spur membership growth.

Because of the growing rate of student membership, we have a responsibility in AES to help set standards for audio education programs worldwide. Students continually look to us for educational advice; they are the present, as well as the future, of this society. I have spent a great deal of time working with the Education Committee and students and know that there is much more we can do. Growing emphasis on recording, postproduction, and design competitions with increased sponsorship at conventions is only a small part of what must happen in education. The Education Committee has set up a forum for discussion of the role of the AES in audio education and our role in promoting mentorship. Through a growing number of tutorials and workshops at conventions, the AES promotes education at all levels, not just for students. It is my strong feeling that AES must also work with other organizations to bring hearing protection awareness and understanding to students as well as educators.

We rely on our sustaining members and exhibitors and must be continually aware of their needs and concerns in this changing marketplace. AES conventions and conferences are still the best meeting place for professionals involved in all aspects of audio, but more can still be done to promote this partnership at conventions, regional conferences, and through online marketing and development.

Finally, we, the Board of Governors, are accountable to the membership of the society. We must be open to change, be very good listeners, and set future goals and directions for marketing strategies, membership growth, and international standards. It is my intent to work very closely with the BOG, and especially with the vice-presidents of the various regions, to make this happen. Having said this, I welcome feedback on the operations of the AES Executive Committee from all our members on an ongoing basis.

On my recent visit to Latin America—which has shown extreme growth in AES membership over the past few years—there were three words that I used frequently to express my gratitude and appreciation for the friendship and professionalism I experienced from the section members. I would like to conclude by repeating these words: "Es un honor." It really is.

Theresa Leonard
President
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1123
PAPERS

Dithered Noise Shapers and Recursive Digital Filters*

STANLEY P. LIPSHITZ, AES Fellow, ROBERT A. WANNAMAKER, AND JOHN VANDERKOOY, AES Fellow

Audio Research Group, University of Waterloo, Waterloo, ON N2L 3G1, Canada

The question of which spectrally shaped dither signals are appropriate for use in quantizing systems, with and without noise-shaping error feedback, or in recursive digital filters using dithered quantization at the output, is addressed. It is shown that dithers that are acceptable without feedback present may be unacceptable if feedback is introduced. In each case, certain classes of dither generators are shown to be appropriate for audio applications.

0 INTRODUCTION

Today dithering and spectral shaping of quantization noise by means of error feedback are two techniques that have found widespread application in contemporary audio engineering. They are frequently used together with excellent results. In conjunction they can eliminate distortion and reduce the real and/or perceived noise level in the audio band. Many people may be surprised to learn that no research has been published that proves that these two technologies can be combined without deleterious effects, yet this is indeed the case. In fact, this paper will show that dither signals whose samples are statistically independent of one another and that are suitable for use in systems without noise shaping are also safe for use in noise-shaping systems and recursive digital filters (that is, they eliminate distortion and noise modulation, and they render the error spectrum input independent), but that considerable care is required if noise shaping is to be used with spectrally shaped dithers. Such dithers are often discussed not only because of the modest perceptual benefits that may be associated with them, but also because some of them can be generated more quickly than comparable white dithers. In fact, general practical recommendations for the proper use of such dithers in systems without error feedback are also absent from the literature, with the exception of certain simple special cases [1], [2], although much of the necessary underlying theory has now been published [2], [3]. The aim of this paper, then, is to analyze the action of spectrally shaped dithers in systems both with and without feedback and to establish some guidelines to which such dithers should conform if they are to be used in audio. We will see that in many (but not all) applications the use of spectrally shaped dither is superseded by the use of noise shaping, and that only a very restricted class of dither signals (of which i.i.d. TPDF dither is the best known representative) is suitable for use in noise-shaping systems.¹

0.1 Background: Spectrally White Dithers (without Feedback)

Given a nonsubtractively dithered quantizing system without noise-shaping error feedback (see Fig. 1), it is well known [4], [5], [3] that i.i.d. triangular probability-density-function (TPDF) dither of 2 least significant bits (LSB) peak-to-peak amplitude will render the mean and the variance of the total error signal ε independent of the system input x, assuming that the dither ν and input samples are statistically independent of one another. In particular, this will yield an error with mean and variance

E[ε] = 0    (1)

E[ε²] = Δ²/4    (2)

where Δ represents 1 LSB. Furthermore, statistical independence of distinct dither samples ensures that the total error will be spectrally white, independent of the input signal; that is, when such a TPDF dither is used the one-sided power spectral density PSD_ε of ε will be given by

PSD_ε(f) = Δ²T/6    (3)

where f represents the frequency variable and T the sampling period of the system, so that the integrated PSD between zero and the Nyquist frequency 1/2T gives the total error power Δ²/4.

0.2 Spectrally Colored Dithers

The recent literature has seen some discussion of dithers that are not spectrally white, and whose samples are therefore not all statistically independent of one another. Necessary and sufficient conditions upon the statistics of such dithers have been discovered that guarantee that the corresponding total error spectra will be the sum of the dither spectrum and a white quantization noise component of Δ²/12 total power [2, theorem 4]; that is,

PSD_ε(f) = PSD_ν(f) + Δ²T/6.    (4)

At least one spectrally shaped dither is well known [1] that has also been shown to satisfy these conditions [2]—it is simple first-order high-pass dither such as would be generated by the scheme shown in Fig. 2. Here the TPDF dither samples ν represent differences between consecutive samples of a random process η,

ν_n = η_n − η_{n−1}    (5)

where the subscripts time-index the samples and the η_n are i.i.d. with rectangular probability density function (RPDF). That is to say, the η_n all have probability density functions (pdfs) of the form

p_{η_n}(η) = Π_Δ(η)    (6)

where the rectangular window function of width Γ, Π_Γ, is defined as

Π_Γ(η) ≜ { 1/Γ,  if −Γ/2 < η ≤ Γ/2
          { 0,    otherwise.    (7)

Here ≜ denotes equality by definition.²

Note that this sort of high-pass dither requires the generation of only one new random number η_n per sampling period, as opposed to two such numbers per sampling period if white TPDF dither is used. This slight saving of time is sometimes important to software designers working near the temporal limits of their signal processing hardware.

0.3 Dithered Systems with Noise-Shaping Error Feedback

The question of a dither's appropriateness is complicated by the adoption of a noise-shaping quantization scheme of the sort illustrated in Fig. 3. Such systems produce a shaped total error e that is spectrally shaped with respect to ε according to the formula [6]

PSD_e(f) = |1 − H(e^{−j2πfT})|² PSD_ε(f)    (8)

where H(e^{−j2πfT}) represents the frequency response of the noise-shaping filter H(z) shown in Fig. 3. This filter always includes one implicit delay element that prevents the current error from being subtracted from the current input.

From Eq. (8) we note that for a given noise-shaper design the power spectrum of e is entirely determined by the spectrum of ε. It has been observed [6], [7] that use of the usual TPDF dither with samples that are statistically

¹ i.i.d. refers to a random process whose samples are (statistically) independent (of one another and) identically distributed. Such a process has a flat (white) power spectrum.

² The combination, either additive or subtractive, of n statistically independent RPDF processes yields a process whose pdf is the convolution of n rectangular window functions. Throughout the sequel, we will refer to the combination of n independent RPDF processes as an nRPDF process, so that an RPDF process may also be referred to as 1RPDF and a TPDF process as 2RPDF.

* Manuscript received 2003 December 17; revised 2004 July 22.

Fig. 1. Schematic of nonsubtractively dithered quantizing system without noise-shaping error feedback.

Fig. 2. Simple high-pass TPDF dither generator.
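The noise-shaping loop of Section 0.3 can be sketched the same way. This is our illustration, not code from the paper: it assumes the simplest noise-shaping filter, H(z) = z⁻¹ (the total error fed back with a one-sample delay), for which Eq. (8) predicts a first-difference-shaped error. With white TPDF dither the quantizer's total error ε stays white with power Δ²/4, so the shaped error e = y − x should show r_e(0) ≈ Δ²/2 and r_e(1) ≈ −Δ²/4.

```python
import numpy as np

rng = np.random.default_rng(1)
DELTA = 1.0
N = 1 << 16

def quantize(v):
    # Mid-tread uniform quantizer with step DELTA (assumed model).
    return DELTA * np.round(v / DELTA)

def noise_shape(x, dither):
    """First-order noise shaper, H(z) = z**-1: the previous total error
    is subtracted from the current input before dithered requantization."""
    y = np.empty_like(x)
    e_prev = 0.0
    for n in range(len(x)):
        u = x[n] - e_prev                # input minus fed-back error
        y[n] = quantize(u + dither[n])   # nonsubtractively dithered quantizer
        e_prev = y[n] - u                # total error eps_n at the quantizer
    return y

x = 0.3 * DELTA * np.sin(2 * np.pi * 0.01 * np.arange(N))
tpdf = rng.uniform(-DELTA / 2, DELTA / 2, N) + rng.uniform(-DELTA / 2, DELTA / 2, N)

e = noise_shape(x, tpdf) - x             # shaped total error e = y - x
r0 = np.mean(e ** 2)                     # ~ DELTA**2/2
r1 = np.mean(e[1:] * e[:-1])             # ~ -DELTA**2/4
print(r0, r1)
```

Replacing `tpdf` here with the high-pass dither of Fig. 2 would feed vestiges of η_{n−1} back into the quantizer input while the same samples appear in ν_n, which is exactly the dither/feedback coupling the paper goes on to analyze.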
LIPSHITZ ET AL. PAPERS

independent of one another results in an ε spectrum that is flat and independent of the system input x′. The reason for this is that the noise-shaping feedback path always includes the aforementioned delay element so that the current dither sample ν_n is always statistically independent of the current input to the dither summer x_n despite the presence of the feedback path. We will see that this is sufficient to ensure that the spectrum of ε is flat and independent of the system input, as has been observed in practice.

Unfortunately the one-sample delay is not as helpful in a system using spectrally shaped dither. One would hope that such a system would produce a shaped total error spectrum given by substitution of Eq. (4) into Eq. (8), namely,

$$\mathrm{PSD}_e(f) = \left|1 - H(\mathrm{e}^{-\mathrm{j}2\pi fT})\right|^2\left[\mathrm{PSD}_\nu(f) + \frac{\Delta^2 T}{6}\right].\tag{9}$$

Unfortunately this is not always the case. To understand why, consider simple high-pass dither. Here x_n contains vestiges of η_{n−1} arriving via the feedback path, and this signal is also present in ν_n. Hence the current dither sample is not, in general, independent of the current x. In fact, we will show that with high-pass dither and an arbitrary noise-shaping system the shaped total error spectrum is not given by Eq. (9) and is not independent of the system input.

Analysis of this sort of system is of potentially very wide interest, not only because noise-shaping converters are now commonplace, but also because garden-variety digital filters often employ feedback of this sort when the filtering operation must produce an output of specified precision. A direct-form recursive filter of this sort is shown in Fig. 4, where Q represents the dominant requantization. Optional noise shaping by a filter H(z) and the production of spectrally shaped dither by means of a "dither filter" G(z) are also incorporated. Note that both H(z) and the recursive part of the primary filter (the b_i coefficients) return the quantizer output y to its input w. For the purposes of our analysis these two feedback paths are equivalent: they impose similar restrictions on the statistics of the dither. Hence although the following treatment will be of interest in a broad class of filtering applications where signal quantization is not the primary objective, we will not explicitly consider such elaborate systems since the results derived for a simple noise-shaping quantizer will be directly applicable thereto.

0.4 Outline of Paper

The salient question raised by the preceding discussion, then, is the following: what spectrally shaped dithers are appropriate for use in quantizing systems? Furthermore, it is clear that for systems without noise shaping, the answer will be very different from that for noise-shaping systems due to the absence of a feedback path. Indeed, we will proceed by treating the two cases quite separately.

In both instances, however, we will assume the same reasonably general scheme for the generation of dither. The first step in our treatment, then, will be to define and characterize the statistics of the family of dithers under consideration. This is done in Section 1.

The analysis of quantizing systems will begin in Section 2 with the simpler case where no feedback is present, the case of noise shaping being taken up in Section 3. In both instances our objective will be to find conditions upon the dither signal that will ensure that the spectrum PSD_ε of the total error is independent of the system input.³ To accomplish this we will examine E[ε₁ε₂], the correlation between two samples of the total error, ε₁ and ε₂, separated in time by a lag of ℓ sampling periods. If the value of this quantity depends only on ℓ (not on the system inputs) then the error is said to be wide-sense stationary and we can construct its autocorrelation function,

$$r_\varepsilon(\ell) = \begin{cases} E[\varepsilon^2], & \text{if } \ell = 0\\ E[\varepsilon_1\varepsilon_2](\ell), & \text{otherwise.}\end{cases}\tag{10}$$

The one-sided power spectrum of the error is the discrete-time Fourier transform (DTFT) of this [8],

$$\mathrm{PSD}_\varepsilon(f) = 2T\sum_{\ell=-\infty}^{\infty} r_\varepsilon(\ell)\,\mathrm{e}^{-\mathrm{j}2\pi f\ell T}.\tag{11}$$

³ This will guarantee, of course, that the error mean, variance, and spectral shape are all input independent, since these quantities are all derivable from the PSD.

Fig. 3. Schematic of nonsubtractively dithered quantizing system with noise-shaping error feedback.
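The hoped-for shaped spectrum of Eq. (9) is simple to evaluate numerically. The sketch below (our own helper, assuming Δ = T = 1 and a one-tap noise-shaping filter H(z) = h·z⁻¹ as an illustrative example) combines a dither PSD with the white quantization-noise term and applies the noise-shaping magnitude response; for real h the magnitude is unchanged by conjugating the exponent:

```python
import numpy as np

def shaped_error_psd(f, dither_psd, h=-0.5, delta=1.0, T=1.0):
    """Eq. (9): PSD_e(f) = |1 - H(e^{-j 2 pi f T})|^2 [PSD_nu(f) + Delta^2 T / 6]
    for an assumed one-tap noise-shaping filter H(z) = h * z**-1."""
    Hf = h * np.exp(2j * np.pi * f * T)   # H on the unit circle (real h: sign of
    return np.abs(1 - Hf) ** 2 * (dither_psd + delta**2 * T / 6)  # exponent is immaterial)

f = np.linspace(0.0, 0.5, 257)                        # 0 .. Nyquist for T = 1
hp_psd = (1.0 / 3.0) * (1.0 - np.cos(2 * np.pi * f))  # high-pass dither PSD (Eq. (19))
psd_e = shaped_error_psd(f, hp_psd)
```

At dc the dither term vanishes and only the shaped white-noise floor |1 + 0.5|²·(1/6) remains, while at Nyquist both terms contribute.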

PAPERS DITHERED NOISE SHAPERS

Here T is the sampling period of the system, f is the frequency variable in hertz, and the normalization of the transform is such that if it is integrated between zero and the Nyquist frequency 1/2T, the result is equal to the variance of ε (which is the same as the normalization of the PSDs just cited). Clearly, since r_ε(ℓ) is input-independent by definition, if it can be constructed as indicated, then the power spectrum PSD_ε(f) will be input-independent also.

The discussion is necessarily mathematical, but the treatment has been organized so that readers uninterested in the technical details can extract the results of broadest interest by reading Section 1.1 and the discussions following Corollary 1 (in Section 2.3) and Corollary 2 (in Section 3.3). These address the most common sorts of dithers that might be used in systems with and without noise-shaping error feedback, respectively.

1. FILTERED DITHERS

1.1 Definition

Detailed knowledge of the dither signal's statistical properties is required in order to analyze the behavior of the systems under consideration. Thus we begin by specifying the family of dither signals to be considered and computing the form of the characteristic function (cf) of dithers in this family. The characteristic function of a random variable is the Fourier transform of its pdf [9]. For our present purposes it is a more convenient specification of a signal's statistics with which to work than the pdf itself.

We will consider a dither signal whose nth sample can be written as

$$\nu_n = \sum_{i=-\infty}^{\infty} c_i\,\eta_{n-i}.\tag{12}$$

Throughout the sequel we will assume that all η_i are i.i.d. Furthermore, although the assumption will not be made explicit until it is required in Section 3, in practice it will be taken for granted that c_i = 0 for i < 0, so that ν corresponds to the strict-sense stationary⁴ output of a causal nonrecursive dither filter G of the form

$$G(z) = \sum_{i=0}^{\infty} c_i z^{-i}\tag{13}$$

with an i.i.d. input signal η (see Figs. 4 and 5). η is assumed to be statistically independent of the system input so that ν is as well. We will refer to a dither of this sort as a filtered dither.

1.2 Characteristic Function of a Filtered Dither

We define the vectors

$$\nu = (\ldots, \nu_{-1}, \nu_0, \nu_1, \ldots)$$

⁴ A random process is said to be strict-sense stationary if all of its samples are identically distributed.

Fig. 4. Schematic of general direct-form recursive digital filter using error feedback around output requantizer.
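The filtered dither of Eqs. (12) and (13) amounts to FIR filtering of an i.i.d. sequence, so it can be generated with a single convolution. A sketch under our own naming, with Δ = 1 and an RPDF input:

```python
import numpy as np

def filtered_dither(c, n_samples, delta=1.0, seed=1):
    """nu_n = sum_i c_i eta_{n-i} (Eq. (12)): causal FIR dither filter
    with coefficients c = (c_0, c_1, ...) driven by i.i.d. RPDF eta."""
    rng = np.random.default_rng(seed)
    eta = rng.uniform(-delta / 2, delta / 2, n_samples + len(c) - 1)
    return np.convolve(eta, c, mode="valid")   # exactly n_samples outputs

# simple high-pass dither, c = {1, -1}:  nu_n = eta_n - eta_{n-1}
nu = filtered_dither([1.0, -1.0], 200_000)
```

With c = {1, −1} each output is the difference of two independent uniforms, hence TPDF with variance 2Δ²/12.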


and

$$\eta = (\ldots, \eta_{-1}, \eta_0, \eta_1, \ldots).$$

Now we write the joint pdf

$$p_{\nu,\eta}(\nu, \eta) = p_{\nu|\eta}(\nu, \eta)\,p_\eta(\eta) = \prod_{j=-\infty}^{\infty}\delta\!\left(\nu_j - \sum_{i=-\infty}^{\infty} c_i\,\eta_{j-i}\right)p_\eta(\eta_j).$$

Here we have used the facts that ν_j is completely determined by choosing the η_i and that the η_i are i.i.d., so that their joint pdf splits into a product of identical functions that we will simply denote by p_η; that is, we take p_{η_i} ≡ p_η ∀i.

To obtain the associated cf, we now Fourier transform all variables. The transform variable corresponding to ν_j will be u_j and that corresponding to η_i will be w_i, where, as before, we will form real vectors u and w from these components for notational convenience,

$$\begin{aligned}
P_{\nu,\eta}(u, w) &= \prod_{j=-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\!\left(-\mathrm{j}2\pi u_j\sum_{i=-\infty}^{\infty} c_i\eta_{j-i}\right)p_\eta(\eta_j)\,\mathrm{e}^{-\mathrm{j}2\pi\eta_j w_j}\,\mathrm{d}\eta_j\\
&= \prod_{j=-\infty}^{\infty}\int_{-\infty}^{\infty}\prod_{i=-\infty}^{\infty}\mathrm{e}^{-\mathrm{j}2\pi u_j c_i\eta_{j-i}}\,p_\eta(\eta_j)\,\mathrm{e}^{-\mathrm{j}2\pi\eta_j w_j}\,\mathrm{d}\eta_j\\
&= \prod_{i=-\infty}^{\infty}\int_{-\infty}^{\infty}\prod_{j=-\infty}^{\infty}\mathrm{e}^{-\mathrm{j}2\pi c_{j-i}u_j\eta_i}\,p_\eta(\eta_i)\,\mathrm{e}^{-\mathrm{j}2\pi\eta_i w_i}\,\mathrm{d}\eta_i\\
&= \prod_{i=-\infty}^{\infty} P_\eta\!\left(w_i + \sum_{j=-\infty}^{\infty} c_{j-i}u_j\right).
\end{aligned}\tag{14}$$

One very appealing property of characteristic functions is that marginal cfs can be obtained from joint cfs by simply setting the unwanted variables to zero [9]. Thus setting w_i = 0 for all i and u_j = 0 for all j ≠ n, we directly obtain the cf we require,

$$P_{\nu_n}(u_n) = \prod_{i=-\infty}^{\infty} P_\eta(c_{n-i}u_n).$$

Since ν is strict-sense stationary, we will drop the unneeded time index n and reindex the c_i to obtain

$$P_\nu(u) = \prod_{i=-\infty}^{\infty} P_\eta(c_i u).\tag{15}$$

Also, by setting to zero all of the w_i and all of the u_j except for u_n and u_{n+ℓ} (which we relabel u₁ and u₂), Eq. (14) yields

$$P_{\nu_n,\nu_{n+\ell}}(u_1, u_2) = \prod_{i=-\infty}^{\infty} P_\eta(c_{n-i}u_1 + c_{n+\ell-i}u_2) = \prod_{i=-\infty}^{\infty} P_\eta(c_i u_1 + c_{i+\ell}u_2).\tag{16}$$

The moments of a random variable can be computed from the derivatives of its characteristic function at the origin [9]. In particular, for arbitrary random variables x and y

$$E[x^m] \triangleq \int_{-\infty}^{\infty} x^m p_x(x)\,\mathrm{d}x = \left(\frac{\mathrm{j}}{2\pi}\right)^{m} P_x^{(m)}(0)\tag{17}$$

and

$$E[xy] \triangleq \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,p_{x,y}(x, y)\,\mathrm{d}x\,\mathrm{d}y = \left(\frac{\mathrm{j}}{2\pi}\right)^{2} P_{x,y}^{(1,1)}(0, 0).\tag{18}$$

(We will use superscripts (m, n) to indicate a function differentiated m times with respect to its first argument and n times with respect to its second.) Differentiating Eqs. (15) and (16) and making the simplifying assumption that E[η] = 0 yields

$$r_\nu(\ell) = E[\eta^2]\sum_{j=-\infty}^{\infty} c_j c_{j+\ell}$$

the DTFT of which is the dither spectrum,

$$\mathrm{PSD}_\nu(f) = 2T\,E[\eta^2]\left[\sum_{j=-\infty}^{\infty} c_j^2 + 2\sum_{\ell=1}^{\infty}\sum_{j=-\infty}^{\infty} c_j c_{j+\ell}\cos(2\pi f\ell T)\right].\tag{19}$$

2. SPECTRALLY SHAPED DITHERS IN SYSTEMS WITHOUT FEEDBACK

We proceed to determine conditions on dithers of the sort described in the preceding such that they will render the total error of the quantizing system wide-sense stationary. With reference to the autocorrelation function of the error as defined by Eq. (10), we will need to treat the first-order (ℓ = 0) and second-order (ℓ ≠ 0) statistics of the error signal separately.

Fig. 5. Schematic of general filtered dither generator.
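For a finite coefficient set, the autocorrelation r_ν(ℓ) = E[η²] Σ_j c_j c_{j+ℓ} underlying Eq. (19) reduces to a finite dot product. The sketch below (coefficients chosen arbitrarily for illustration) compares the closed form against an empirical estimate from a simulated filtered dither:

```python
import numpy as np

def r_nu(c, lag, eta_var):
    """r_nu(l) = E[eta^2] * sum_j c_j c_{j+l} for finite FIR coefficients c."""
    c = np.asarray(c, dtype=float)
    l = abs(lag)
    if l >= len(c):
        return 0.0                      # no overlap beyond the filter support
    overlap = c @ c if l == 0 else c[:-l] @ c[l:]
    return eta_var * float(overlap)

c = [0.5, -1.0, 0.5]                    # arbitrary illustrative coefficients
rng = np.random.default_rng(2)
eta = rng.uniform(-0.5, 0.5, 500_000)   # RPDF: E[eta^2] = 1/12
nu = np.convolve(eta, c, mode="valid")
emp_r1 = float(np.mean(nu[:-1] * nu[1:]))   # empirical r_nu(1)
emp_r2 = float(np.mean(nu[:-2] * nu[2:]))   # empirical r_nu(2)
```

The zero-lag value r_ν(0) is just the dither variance E[η²] Σ_j c_j², the first term inside the brackets of Eq. (19).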

2.1 First-Order Error Statistics (without Feedback)

In the first-order case, the theorem of fundamental importance is given in [2, corollary 2 to theorem 1] (which is also in [3, theorem 7] in a slightly different but entirely equivalent form).

Lemma 1. In an NSD quantizing system E[ε^m] is independent of the distribution of the system input x for m = 1, 2, …, M if and only if

$$P_\nu^{(i)}\!\left(\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0,\quad i = 0, 1, 2, \ldots, M-1.$$

If the conditions of Lemma 1 are satisfied, then it has been shown elsewhere [2], [3] that for 0 ≤ m ≤ N the moments of the error are related to those of the dither by the formula

$$E[\varepsilon^m] = \sum_{\ell=0}^{\lfloor m/2\rfloor}\binom{m}{2\ell}\left(\frac{\Delta}{2}\right)^{2\ell}\frac{E[\nu^{m-2\ell}]}{2\ell+1}\tag{20}$$

where the floor operator ⌊·⌋ returns the greatest integer less than or equal to its argument. These relations are often referred to as Sheppard's corrections. The first two of them are of special interest, since they concern the mean and the variance of the error,

$$E[\varepsilon] = E[\nu]\tag{21}$$

$$E[\varepsilon^2] = E[\nu^2] + \frac{\Delta^2}{12}.\tag{22}$$

Note that Eq. (21) implies that there is no distortion of the input signal because the mean error is zero, whereas Eq. (22) implies that the error variance is also input independent so that no so-called noise modulation is present.

We need to ascertain conditions under which the filtered dither cf of Eq. (15) satisfies the requirements imposed by Lemma 1. We begin with the case of the error mean (M = 1), which entails the requirement that

$$P_\nu\!\left(\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0\tag{23}$$

in order that it be independent of the input and given by Eq. (21). Clearly, this condition will be satisfied by the dither cf of Eq. (15) if and only if for each k ≠ 0 there exists an i such that

$$P_\eta\!\left(c_i\frac{k}{\Delta}\right) = 0.$$

Requiring that the error variance be input-independent introduces an additional constraint,

$$P_\nu^{(1)}\!\left(\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.\tag{24}$$

Applying the product rule for differentiation to Eq. (15) we have

$$P_\nu^{(1)}\!\left(\frac{k}{\Delta}\right) = \sum_{j=-\infty}^{\infty} c_j P_\eta^{(1)}\!\left(c_j\frac{k}{\Delta}\right)\prod_{\substack{i=-\infty\\ i\ne j}}^{\infty} P_\eta\!\left(c_i\frac{k}{\Delta}\right).\tag{25}$$

This expression will go to zero at the required locations if and only if for each k ≠ 0 either 1) there exists an i such that

$$P_\eta^{(1)}\!\left(c_i\frac{k}{\Delta}\right) = 0\quad\text{and}\quad P_\eta\!\left(c_i\frac{k}{\Delta}\right) = 0$$

or 2) there exist two distinct values of i such that

$$P_\eta\!\left(c_i\frac{k}{\Delta}\right) = 0$$

so that, although terms occur in Eq. (25) in which either one of these two functions alone will be differentiated, in any given term one will be undifferentiated and will cause the respective term to vanish in the required places.

2.2 Second-Order Error Statistics (without Feedback)

In audio applications it is not sufficient to ensure only that the error's mean and variance are input independent. More generally, the error's power spectrum should be constant and predictable. We now proceed to find conditions under which a spectrally shaped dither will render the complete autocorrelation function (and hence the power spectrum) of the total error input independent. Necessary and sufficient conditions for E[ε₁ε₂] to be input independent, presented in [2, theorem 4], are transcribed here as the following lemma.

Lemma 2. In an NSD system where all dither values are statistically independent of all system input values,

$$E[\varepsilon_1\varepsilon_2] = E[\nu_1\nu_2]$$

for arbitrary input distributions if and only if the following three conditions are satisfied:

$$P_{\nu_1,\nu_2}\!\left(\frac{k_1}{\Delta}, \frac{k_2}{\Delta}\right) = 0\qquad\forall (k_1, k_2)\ne(0, 0)\tag{26}$$

$$P_{\nu_1,\nu_2}^{(0,1)}\!\left(\frac{k_1}{\Delta}, 0\right) = 0\qquad\forall k_1\ne 0\tag{27}$$

$$P_{\nu_1,\nu_2}^{(1,0)}\!\left(0, \frac{k_2}{\Delta}\right) = 0\qquad\forall k_2\ne 0.\tag{28}$$

Subject to the conditions of Lemma 2, then, we have

$$E[\varepsilon_n\varepsilon_{n+\ell}] = \begin{cases}E[\varepsilon_n^2], & \text{if } \ell = 0\\ E[\nu_n\nu_{n+\ell}], & \text{otherwise.}\end{cases}\tag{29}$$
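Sheppard's corrections (21) and (22) can be verified by direct simulation of a nonsubtractively dithered quantizer. The sketch below (our own helper names; Δ = 1) uses i.i.d. TPDF dither, which satisfies the conditions of Lemma 1 for the first two moments:

```python
import numpy as np

def nsd_total_error(x, nu, delta=1.0):
    """Total error eps = Q(x + nu) - x of a nonsubtractively dithered
    mid-tread quantizer with step size delta."""
    return delta * np.round((x + nu) / delta) - x

rng = np.random.default_rng(3)
n = 500_000
x = rng.uniform(-4.0, 4.0, n)                                # arbitrary input
nu = rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)   # i.i.d. TPDF dither
eps = nsd_total_error(x, nu)
# Eq. (21): E[eps] = E[nu] = 0;  Eq. (22): E[eps^2] = E[nu^2] + 1/12 = 1/6 + 1/12
```

The simulated error mean and mean square converge to the input-independent values 0 and 1/4 LSB², regardless of the input distribution.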


The case where the lag parameter ℓ equals zero has already been handled in Section 2.1. Provided that the conditions for constancy of the error variance are met so that it is given by Eq. (22), then Eq. (29) is just the autocorrelation function of the dither apart from an added Δ²/12 at ℓ = 0. In this case Fourier transforming both sides of Eq. (29) yields Eq. (4). Hence the error spectrum will be equal to the dither spectrum, apart from an additive white-noise component arising from the zero-lag term.

We proceed by applying each of the three conditions required by the lemma to the two-dimensional dither cf of Eq. (16). Condition 1 [Eq. (26)] is satisfied for all lags ℓ ≠ 0 if and only if for all (k₁, k₂) ≠ (0, 0) and ℓ ≠ 0 there exists an i such that

$$P_\eta\!\left(c_i\frac{k_1}{\Delta} + c_{i+\ell}\frac{k_2}{\Delta}\right) = 0.\tag{30}$$

Note that if this equation holds, then Eq. (23) necessarily holds as well.

Proceeding to condition 2 [Eq. (27)] and again applying the product rule, we have

$$P_{\nu_n,\nu_{n+\ell}}^{(0,1)}\!\left(\frac{k_1}{\Delta}, 0\right) = \sum_{j=-\infty}^{\infty} c_{j+\ell} P_\eta^{(1)}\!\left(c_j\frac{k_1}{\Delta}\right)\prod_{\substack{i=-\infty\\ i\ne j}}^{\infty} P_\eta\!\left(c_i\frac{k_1}{\Delta}\right).$$

Demanding that all terms in this sum go to zero at the required locations for all ℓ ≠ 0, we arrive at the same condition found for constancy of the error variance above; that is, we require for all k ≠ 0 that either P_η(c_i k/Δ) = 0 and P_η^{(1)}(c_i k/Δ) = 0 for some value of i, or P_η(c_i k/Δ) = 0 for any two values of i.

Condition 3 [Eq. (28)] is symmetric with condition 2 and yields the same conditions on the cf of η.

Clearly, the conditions for the joint error moments to be input independent are stronger than (and include) the corresponding conditions for the mean and the variance of the error. Collecting them gives us the following sufficient conditions for the error spectrum to be constant and input independent.

Theorem 1. In a nonsubtractively dithered quantizing system without noise-shaping error feedback and using dither of the form described by Eq. (12), the total error will be wide-sense stationary and independent of the system input with its PSD given by Eq. (4) if both of the following conditions are satisfied:

1) For each pair (k₁, k₂) ≠ (0, 0) and for each ℓ ≠ 0 there exists an i such that

$$P_\eta\!\left(c_i\frac{k_1}{\Delta} + c_{i+\ell}\frac{k_2}{\Delta}\right) = 0\tag{31}$$

and 2) for each k ≠ 0, either there exists a value of i such that

$$P_\eta\!\left(c_i\frac{k}{\Delta}\right) = 0\tag{32}$$

and

$$P_\eta^{(1)}\!\left(c_i\frac{k}{\Delta}\right) = 0\tag{33}$$

or there exist two distinct values of i such that

$$P_\eta\!\left(c_i\frac{k}{\Delta}\right) = 0.\tag{34}$$

The conditions in the theorem are sufficient but not necessary, with more complicated and general conditions probably existing. In spite of this, the conditions of the theorem are so general as to be difficult to use, but they are of the form required for certain η pdfs (see [10]). We will interpret them next in a special case of interest.

2.3 Illustrated Special Case: η Is nRPDF (without Feedback)

We will now interpret the conditions of Theorem 1 in the common case where η represents an i.i.d. nRPDF random process.

The Fourier transform of the rectangular window function Π_Δ is a sinc function,⁵

$$\mathrm{sinc}(u) = \frac{\sin(\pi\Delta u)}{\pi\Delta u}$$

and the cf of a sum of statistically independent random processes is the product of their individual cfs [5], so the cf of an nRPDF dither is P_ν(u) = sincⁿ(u). It follows that if the η are i.i.d. and nRPDF, then condition 1 of Theorem 1 will be satisfied for all ℓ ≠ 0 if for each ℓ ≠ 0 there exists an i, call it i₀, such that of c_{i₀} and c_{i₀+ℓ} one is zero and the other is a nonzero integer. To see why this is so, note that for an η of this sort Eq. (30) becomes

$$P_\eta\!\left(c_i\frac{k_1}{\Delta} + c_{i+\ell}\frac{k_2}{\Delta}\right) = \mathrm{sinc}^n\!\left(c_{i_0}\frac{k_1}{\Delta} + c_{i_0+\ell}\frac{k_2}{\Delta}\right).$$

This equation must hold if both k₁ ≠ 0 and k₂ ≠ 0, since the argument of the sine function will then be a nonzero integer multiple of π under this condition. What happens in the case where c_{i₀} = 0 and k₂ = 0 (k₁ ≠ 0)? Then there exists i₁ = i₀ + ℓ such that Eq. (30) holds and becomes

$$\mathrm{sinc}^n\!\left(c_{i_1}\frac{k_1}{\Delta}\right) = 0.$$

A similar factor exists if c_{i₀+ℓ} = 0 and k₁ = 0 (k₂ ≠ 0). Hence for each pair (k₁, k₂) ≠ (0, 0) there exists, under the stated condition, an i such that Eq. (31) holds.

What does condition 2 of Theorem 1 entail when η is nRPDF with n ≥ 1? In such a case we see that the existence of two distinct c_i with values that are nonzero integers is sufficient to satisfy the requirements of both Eq.

⁵ Note that this definition of the sinc function differs slightly from the most standard definition appearing in the literature, which omits the factors of Δ.
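With Δ = 1 the paper's sinc coincides with numpy's normalized np.sinc, so the cf of a filtered nRPDF dither, P_ν(u) = ∏_i sincⁿ(c_i u), can be evaluated directly and compared with an empirical characteristic function. The check below (our own formulation) uses the simple high-pass coefficients {1, −1}:

```python
import numpy as np

def cf_filtered_nrpdf(c, u, n=1):
    """P_nu(u) = prod_i sinc^n(c_i u) for Delta = 1;
    np.sinc(x) = sin(pi x)/(pi x), matching the paper's definition."""
    out = 1.0
    for ci in c:
        out *= np.sinc(ci * u) ** n
    return out

c = [1.0, -1.0]                        # simple high-pass dither filter
rng = np.random.default_rng(4)
eta = rng.uniform(-0.5, 0.5, 400_000)
nu = eta[1:] - eta[:-1]                # filtered RPDF dither with c = {1, -1}
emp_cf = np.mean(np.exp(-2j * np.pi * 1.0 * nu))   # empirical cf at u = k/Delta = 1
```

At u = k/Δ with integer c_i, every factor vanishes, which is exactly the zero pattern the theorem's conditions demand.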


(32) and Eq. (33). If, on the other hand, η is nRPDF with n ≥ 2, then it is sufficient that one nonzero integral c_i exists to satisfy the requirements of Eq. (34). For instance, the cf of a TPDF process

$$P_{\eta_i}(u) = \mathrm{sinc}^2(u)\tag{35}$$

goes to zero at u = c_i k/Δ for all k ≠ 0 if c_i is an integer, and so does its first derivative.

We collect these conclusions into the following useful corollary to Theorem 1.

Corollary 1. In a nonsubtractively dithered quantizing system without noise-shaping error feedback and using dither of the form described by Eq. (12) with η representing an i.i.d. nRPDF random process, the total error will be wide-sense stationary and independent of the system input with a PSD given by Eq. (4) if both of the following conditions are satisfied:

1) For each ℓ ≠ 0 there exists an i such that of c_i and c_{i+ℓ} one is zero and the other is a nonzero integer, and

2) either η is nRPDF with n ≥ 1 and there exist at least two distinct values of i such that c_i is a nonzero integer, or η is nRPDF with n ≥ 2 and there exists at least one value of i such that c_i is a nonzero integer.

Clearly, the conditions of the theorem are met by i.i.d. TPDF dither, which corresponds to the case where η is TPDF and the associated dither filter has a single nonzero coefficient, c₀ = 1. Now consider a system with a stationary RPDF η signal. What sets of dither filter coefficients satisfy these conditions? Obviously, the requirements are met by the simple high-pass dither coefficients

{…, 0, 0, 0, 1, −1, 0, 0, 0, …}.

This coefficient set is associated with a dither whose spectrum has a simple high-pass form, as given by Eq. (19)

$$\mathrm{PSD}_\nu(f) = \frac{\Delta^2 T}{3}\,[1 - \cos(2\pi f T)].$$

Any coefficient sequence satisfies the conditions if it contains at least two integers, at least one of which is either leading or trailing. For instance, the following permuted sequences

{…, 0, 0, 0, 1, −1, 1/2, −1/2, 0, 0, 0, …}

{…, 0, 0, 0, 1/2, −1/2, 1, −1, 0, 0, 0, …}

{…, 0, 0, 0, 1, −1/2, 1, −1/2, 0, 0, 0, …}

{…, 0, 0, 0, 1/2, −1, 1/2, −1, 0, 0, 0, …}

{…, 0, 0, 0, 1, −1/2, 1/2, −1, 0, 0, 0, …}

{…, 0, 0, 0, 1, −1/2, 0, 1/2, −1, 0, 0, 0, …}

all satisfy the requirements. Fig. 6 shows the total error spectrum from a system using one of these dithers with a null system input, as well as that error spectrum normalized by the error PSD as predicted by Eq. (4) for a properly dithered system.⁶ The result of the normalization is flat, indicating that the spectrum is of the expected shape. Other suitable sequences without leading or trailing integers can also be constructed. An example is

{…, 0, 0, 0, 1/2, −1, 0, 1, −1/2, 0, 0, 0, …}.

⁶ All power spectra shown in this paper represent the average of 12 000 256-point FFTs of 50% overlapping Hann-windowed data generated by computer-simulated quantization. 0 dB represents the PSD of a random process whose values are RPDF and i.i.d.; that is, 0 dB represents Δ²T/6.

Fig. 6. PSD_ε(f) for quantizing system without error feedback and using a dither filter with RPDF input and coefficients {0.5, −1.0, 0.5, −1.0}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by predicted PSD.
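Because only finitely many c_i are nonzero in practice, the two conditions of Corollary 1 can be checked mechanically. The helper below is our own formulation (coefficients outside the listed window are treated as zero, and integer tests use a floating-point tolerance); it reproduces the classifications discussed in the text:

```python
def satisfies_corollary_1(c, n=1, tol=1e-9):
    """Check the Corollary 1 conditions for finite dither-filter
    coefficients c driving an i.i.d. nRPDF process eta."""
    is_zero = lambda v: abs(v) < tol
    is_nzint = lambda v: not is_zero(v) and abs(v - round(v)) < tol
    coef = lambda i: c[i] if 0 <= i < len(c) else 0.0
    # condition 1: for each lag l != 0, some pair (c_i, c_{i+l}) is
    # {zero, nonzero integer}; lags beyond the support reduce to l = len(c)
    def cond1(l):
        return any(
            (is_zero(coef(i)) and is_nzint(coef(i + l))) or
            (is_nzint(coef(i)) and is_zero(coef(i + l)))
            for i in range(-l, len(c))
        )
    # condition 2: two nonzero-integer taps for 1RPDF; one suffices for n >= 2
    n_int = sum(map(is_nzint, c))
    cond2 = n_int >= 2 if n == 1 else (n >= 2 and n_int >= 1)
    return cond2 and all(cond1(l) for l in range(1, len(c) + 1))

ok = satisfies_corollary_1([1, -1, 0.5, -0.5], n=1)    # a listed good sequence
bad = satisfies_corollary_1([0.5, -1, 1, -0.5], n=1)   # fails for lag +-1
```

Negative lags need no separate loop because the pairs for lag −ℓ are the mirror images of those for lag +ℓ.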


The pair of unit-magnitude coefficients in each of these sequences ensures satisfaction of the second condition of Corollary 1. With an RPDF η these coefficients ensure that the total dither contains a TPDF component. The corollary guarantees that the presence of other components in the total dither does not interfere with the well-known benefits of such a TPDF component (that is, the elimination of distortion and error-variance modulation; see Section 0.1). If η were itself TPDF rather than RPDF, then only a single integer coefficient would be required in order to achieve this result, as indicated in the corollary. On the other hand, the first condition of the corollary introduces an additional restriction on the coefficient sequence that ensures not only that distortion and error-variance modulation are eliminated, but that modulation of the shape of the error's power spectrum is also prevented.

Consider, for instance, the coefficient sequence

{…, 0, 0, 0, 1/2, −1, 1, −1/2, 0, 0, 0, …}.

Like the sequences shown in the preceding, this one meets the second condition of Corollary 1. However, it fails to meet the first condition of Corollary 1 for ℓ = ±1. Fig. 7 shows error spectra from a system using this sort of dither with different static system inputs. Also shown are normalizations of these spectra by the expected curve specified by Eq. (4). The results of the normalization are clearly not flat, indicating that the error spectrum is not of the sort predicted. Furthermore, variation of the spectrum with the system input value is apparent.⁷ Resorting to a TPDF η not only increases the error variance, but it does not ameliorate the problem of input dependence because it does nothing to satisfy the first condition of the corollary. This is illustrated in Fig. 8, where a small but consistently reproducible deviation from the expected spectrum is observed at low frequencies. On the other hand, simply doubling the coefficient sequence (or, equivalently, doubling η) yields a suitable dither, although it also increases the error variance and somewhat alters the spectral shape as well [since the additive white-noise component in Eq. (4) is not doubled]. The result is illustrated in Fig. 9.

⁷ Interestingly, this coefficient sequence does satisfy the second condition of the corollary, and thus the total error variance assumes the predicted value given by Eq. (22) regardless of the input. It is only the error's spectral shape and not its variance that is input dependent.

Fig. 7. PSD_ε(f) for quantizing system without error feedback and using a dither filter with RPDF input and coefficients {0.5, −1.0, 1.0, −0.5}. System was presented with static inputs. (a) Observed PSD with 0.0 LSB input. (b) Observed PSD normalized by predicted PSD for 0.0 LSB input. (c) Observed PSD with 0.25 LSB input. (d) Observed PSD normalized by predicted PSD for 0.25 LSB input.


We observe that in NSD systems we cannot generate arbitrary total error spectra by varying the dither spectrum. Not only does Corollary 1 impose significant restrictions on the allowable dither signals, but Eq. (4) indicates that an additive white-noise component will always be present. There are many applications where more complete control of the error spectrum is desirable [6], [7], and this may be achieved using noise-shaping error feedback (see Section 3). Spectrally shaped dithers remain of interest in certain applications, however (see [11]). Furthermore, they are useful in high-speed applications where it is prohibitively time-consuming to generate nRPDF dither using n newly calculated random numbers per data sample. In such cases a single new η may be generated per sample and placed in a delay line to generate spectrally shaped dither of the sort described by Eq. (12). A commonly used example is the simple high-pass dither mentioned above, which may be generated using dither filter coefficients

{…, 0, 0, 0, 1, −1, 0, 0, 0, …}.

Such dither is TPDF, but only one new random number is calculated per sampling period.

3. SPECTRALLY SHAPED DITHERS IN SYSTEMS WITH FEEDBACK

We now explore the use of filtered dithers in quantizing systems employing noise-shaping error feedback. While the use of such feedback allows general control over the spectral shape of the total error, we will see that the class of dither signals suitable for use with noise shaping is greatly restricted.

3.1 First-Order Error Statistics (with Feedback)

Again, we begin with first-order statistics. Unfortunately we cannot draw upon any results preexisting in the literature here because none have been published for systems in which feedback introduces vestiges of past dither values into the current input to the quantizing network (see Fig. 3). We will have to derive the appropriate theorems. Fortunately the derivations proceed in a fashion analogous to those for the nonfeedback theorems in [2], [3]. The details are contained in the Appendix. Here we will simply state the resulting theorems as required, beginning with the following generalization of Lemma 1.

Fig. 8. PSD_ε(f) for quantizing system without error feedback and using a dither filter with TPDF input and coefficients {0.5, −1.0, 1.0, −0.5}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by expected PSD.

Fig. 9. PSD_ε(f) for quantizing system without error feedback and using a dither filter with TPDF input and coefficients {1.0, −2.0, 2.0, −1.0}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by predicted PSD.
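The one-new-random-number-per-sample scheme above is straightforward to implement: keep the previous η in a one-sample delay and emit η_n − η_{n−1}. A streaming sketch (our own class, Δ = 1):

```python
import numpy as np

class HighPassTPDFDither:
    """Simple high-pass TPDF dither, nu_n = eta_n - eta_{n-1}, using one
    new RPDF sample per output and a one-sample delay for the previous eta."""
    def __init__(self, delta=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.delta = delta
        self.prev = self.rng.uniform(-delta / 2, delta / 2)

    def __call__(self):
        eta = self.rng.uniform(-self.delta / 2, self.delta / 2)
        nu, self.prev = eta - self.prev, eta
        return nu

gen = HighPassTPDFDither()
nu = np.array([gen() for _ in range(200_000)])
```

Each output is the difference of two independent uniforms, hence TPDF with variance Δ²/6 and lag-1 autocorrelation −Δ²/12, yielding the simple high-pass spectrum quoted earlier.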


Lemma 3. In an NSD quantizing system in which the dither ν and the system input signal x are not necessarily statistically independent, E[ε^ℓ] is independent of the distribution of the input x for ℓ = 1, 2, …, N if and only if the joint characteristic function of the dither and the input, P_{ν,x}(u, v), obeys the condition that

$$P_{\nu,x}^{(i,0)}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0,\quad i = 0, 1, 2, \ldots, N-1.\tag{36}$$

Subject to the conditions of Lemma 3, E[ε^m] for 0 ≤ m ≤ N is given by the Sheppard's corrections, as before.

The derivation of P_{ν,x} in terms of the η_i proceeds precisely as for the case where x is not involved, and we simply state the result:

$$P_{\nu,\eta,x}(u, w, v) = P_{\eta,x}(\gamma, v)\tag{37}$$

where

$$x = (\ldots, x_{-1}, x_0, x_1, \ldots)$$

and

$$v = (\ldots, v_{-1}, v_0, v_1, \ldots)$$

v being the corresponding vector of Fourier-transformed variables. γ is a similar vector with components

$$\gamma_i = w_i + \sum_{j=-\infty}^{\infty} c_{j-i}u_j.$$

By setting all the unwanted variables in Eq. (37) to zero we obtain

$$P_{\nu_n,x_n}(u_n, v_n) = P_{\eta,x_n}(\mu, v_n)\tag{38}$$

where the components of μ are

$$\mu_i = c_{n-i}u_n$$

and where we will retain the time indices since the relative times of η_i and x_n must be taken into account. [Note that if the η are all mutually independent and we let v_n = 0, then Eq. (38) reduces to Eq. (15).]

In order for the mean and the variance of the error to be input independent, Lemma 3 requires that

$$P_{\nu_n,x_n}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0\tag{39}$$

and

$$P_{\nu_n,x_n}^{(1,0)}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.\tag{40}$$

At first glance, interpretation of these conditions in terms of Eq. (38) appears to be frustrated by the fact that we know nothing about the quantity P_{η,x_n}. However, we can assume that 1) the dither filter is causal, so that c_i = 0 for all i < 0, and that 2) η_i is statistically independent of the random vector (…, x_{n−2}, x_{n−1}, x_n) for i ≥ n, where we recall that the noise-shaping filter H(z) must contain an implicit single-sample delay. Thus there exists exactly one value of i such that c_{n−i} ≠ 0 and for which η_i is statistically independent of x_n. This is i = n, so that Eq. (38) can be written as the product

$$P_{\nu_n,x_n}(u_n, v_n) = P_{\eta_n}(c_0u_n)\,P_{\eta,x_n}(\mu', v_n)$$

where

$$\mu_i' = \begin{cases}\mu_i, & \text{if } i < n\\ 0, & \text{if } i \ge n.\end{cases}$$

We conclude that Eq. (39) holds if

$$P_\eta\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0\tag{41}$$

and, similarly, that Eq. (40) holds if

$$P_\eta^{(1)}\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.\tag{42}$$

3.2 Second-Order Error Statistics (with Feedback)

The analysis of the two-dimensional statistics proceeds in the usual fashion. The following obvious generalization of Lemma 2 is derived in the Appendix.

Lemma 4. Consider two values ε_n and ε_{n+ℓ} of the total error produced by an NSD quantizing system in which the dither and the input to the quantizing system are not necessarily statistically independent. Let these error samples be separated in time by τ = ℓT, where T is the sampling period of the system and ℓ ≠ 0. Denote by P_{(ν_n,ν_{n+ℓ}),(x_n,x_{n+ℓ})} the joint cf of the dither and input values ν_n, ν_{n+ℓ}, x_n, and x_{n+ℓ}, corresponding to ε_n and ε_{n+ℓ}, respectively. If and only if

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}\!\left(\frac{k_1}{\Delta}, \frac{k_2}{\Delta}, \frac{k_1}{\Delta}, \frac{k_2}{\Delta}\right) = 0\qquad\forall(k_1, k_2)\ne(0, 0)\tag{43}$$

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}^{(0,1,0,0)}\!\left(\frac{k_1}{\Delta}, 0, \frac{k_1}{\Delta}, 0\right) = 0\qquad\forall k_1\ne 0\tag{44}$$

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}^{(1,0,0,0)}\!\left(0, \frac{k_2}{\Delta}, 0, \frac{k_2}{\Delta}\right) = 0\qquad\forall k_2\ne 0\tag{45}$$

then

$$E[\varepsilon_n\varepsilon_{n+\ell}] = E[\nu_n\nu_{n+\ell}].$$

From Eq. (37) we have

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}(u_1, u_2, v_1, v_2) = P_{\eta,(x_n,x_{n+\ell})}(\mu, v_1, v_2)\tag{46}$$

where

$$\mu_i = c_{n-i}u_1 + c_{n+\ell-i}u_2.$$

We first consider the case where ℓ > 0. Using the same brand of reasoning that we used in the one-dimensional case, we note that there exists exactly one value of i for which (c_{n−i}, c_{n+ℓ−i}) ≠ (0, 0) and for which η_i is statistically independent of (x_n, x_{n+ℓ}). This is i = n + ℓ, so that Eq. (46) can be written

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}(u_1, u_2, v_1, v_2) = P_{\eta_{n+\ell}}(c_0u_2)\,P_{\eta,(x_n,x_{n+\ell})}(\mu', v_1, v_2)\tag{47}$$

where only c₀ remains since the other coefficient is zero, and where

$$\mu_i' = \begin{cases}\mu_i, & \text{if } i < n+\ell\\ 0, & \text{if } i \ge n+\ell.\end{cases}$$

According to Eq. (47), condition 1 [Eq. (43)] of Lemma 4 will be satisfied for ℓ > 0 and k₂ ≠ 0 if

$$P_\eta\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.$$

On the other hand, if k₂ = 0, then Eqs. (43) and (46) yield

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}\!\left(\frac{k_1}{\Delta}, 0, \frac{k_1}{\Delta}, 0\right) = P_{\eta,x_n}\!\left(\mu'', \frac{k_1}{\Delta}\right)\bigg|_{u_1=k_1/\Delta}\tag{48}$$

where

$$\mu_i'' = c_{n-i}u_1.$$

Then there exists exactly one i such that c_{n−i} ≠ 0 and for which η_i is independent of x_n. This is i = n. Thus the right-hand side of Eq. (48) splits into a product that goes to zero if

$$P_\eta\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.$$

Thus condition 1 is satisfied for all (k₁, k₂) ≠ (0, 0) subject to this requirement. By symmetry, the ℓ < 0 case produces identical conditions.

Conditions 2 and 3 [Eqs. (44) and (45)] are handled by application of the product rule, as before. We omit the details, but it can be shown that these conditions are satisfied if Eqs. (41) and (42) hold. All three conditions being satisfied, Eq. (4) gives the total error spectrum in terms of the dither spectrum.

We will now collect the conclusions from the foregoing analysis.

Theorem 2. In an NSD quantizing system with arbitrary noise-shaping error feedback and using filtered dither of the form described by Eq. (12), the shaped total error will be wide-sense stationary and independent of the system input with a PSD given by

$$\mathrm{PSD}_e(f) = \left|1 - H(\mathrm{e}^{-\mathrm{j}2\pi fT})\right|^2\left[\mathrm{PSD}_\nu(f) + \frac{\Delta^2 T}{6}\right]$$

if both of the following conditions are satisfied:

$$P_\eta\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0$$

and

$$P_\eta^{(1)}\!\left(c_0\frac{k}{\Delta}\right) = 0\qquad\forall k\ne 0.$$

3.3 Illustrated Special Case: η Is nRPDF (with Feedback)

Corollary 2. In an NSD quantizing system with arbitrary noise-shaping error feedback and using filtered dither with η being an i.i.d. mRPDF random process, the shaped total error will be wide-sense stationary and independent of the system input with a PSD given by Eq. (9) if c₀ is a nonzero integer and m ≥ 2.

The result exploits the inherent single-sample delay in the feedback loop (see Section 0.3), which guarantees that at least the most recent η value is independent of x because it has not been recirculated. Thus, whatever the remaining components in the total dither signal may be, this η can single-handedly provide a suitable dither signal if it is at least TPDF (that is, m ≥ 2) and if it has an appropriate width (that is, if c₀ is a nonzero integer).

To appreciate just how restrictive this condition really is, it should be noted that simple TPDF high-pass dither generated by filtering an RPDF random process does not satisfy it. This is confirmed by Fig. 10, which shows the spectrum of ε from a noise shaper using this kind of dither and a one-tap feedback filter with coefficient −0.5. [As was pointed out in the Introduction, the shaped total error spectrum PSD_e(f) will have the expected form given by Eq. (9) if and only if the total error spectrum PSD_ε(f) has the form given by Eq. (4); that is, we only require that the dither fix PSD_ε(f), since PSD_e(f) is then determined via Eq. (8).] Also shown is the spectrum normalized by the predicted spectrum of Eq. (4). Two static inputs (x = 0.0 and 0.5 LSB, respectively) were used. The normalized spectra are not flat, indicating that the error spectra are not of the expected shape. Furthermore, the two spectra are different, clearly indicating that the error spectrum is input dependent.

The observed spectral modulations can be eliminated by using a TPDF rather than an RPDF η, in which case the conditions of Corollary 2 are satisfied since m = 2, but the error variance is increased. This is illustrated in Fig. 11. However, it is not clear that this use of spectrally shaped dither offers an advantage over using simple i.i.d. TPDF dither, since any desired error spectrum can be obtained using noise-shaping error feedback. Fig. 12 shows power spectra of ε in the case where the noise-shaping system is the same as that of Fig. 10 but where i.i.d. TPDF dither is used rather than filtered RPDF (high-pass) dither. The normalized spectra are flat, indicating that the error spectra are of the expected shape and are input independent.

When dithers that do not satisfy the conditions of Corollary 2 are used in conjunction with noise shaping, modulation of the error spectrum typically decreases in magnitude with increasing complexity of the noise-shaping filter. For instance, the plots in Fig. 13 correspond to those in Fig. 10, with the sole difference being the use of a three-coefficient noise-shaping filter with psychoacoustically optimized coefficients (see [6]). Although some variation of the spectrum with input is probably still present, it is apparently negligible. To further characterize this variation would require a general statistical model of signals
(with Feedback) in the noise shaper, and the development of such a model
If ␩ is mRPDF we reach the following simple but quite remains an open problem. In any event, we do not recom-
restrictive conclusion. mend the use of dithers that violate the conditions of Corol-
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1135
LIPSHITZ ET AL. PAPERS

lary 2 in conjunction with noise shaping. For most applica- fective dither filter, 1 − H(z), must be minimum phase for
tions, simple i.i.d. TPDF dither is the best choice, with Fig. 14 to be realizable; that is it must be invertible.] This
spectral shaping of the error effected by means of noise- means that for such noise shapers, the broad class of
shaping feedback rather than by spectrally shaping the dither. shaped dithers defined by the conditions of Section 2 must
produce the expected input-independent error spectra. This
3.4 Results for Special Classes of Shapers is confirmed by Fig. 15, which shows total error spectra
We have so far been unable to find necessary and suf- PSD␧, unnormalized and normalized, for such a system
ficient conditions that will guarantee input independence using the simple high-pass dither that failed when a feed-
of the error spectrum for an arbitrary noise shaper (al- back filter with noninteger coefficients was used.
though a set of weaker but more complicated conditions
for mRPDF ␩ is given in [11, theorem 6]). However, some 4 CONCLUSIONS
interesting results are known for certain special classes of
shapers. The most obvious is that if the feedback filter Systems without noise-shaping feedback respond quite
H(z) is FIR and its first ᐉ coefficients are all zero, the differently to the use of particular spectrally shaped dither
shaped total error spectrum is wide-sense stationary and signals from those with error-feedback paths. The total
given by Eq. (9) if ci ⳱ 0 for i > ᐉ and the conditions of error of the system may be wide-sense stationary in one
Theorem 1 are satisfied. This ensures that xi contains no case and not in the other. For instance, simple high-pass
vestiges of any ␩j that will also be present in the current dither renders the total error wide-sense stationary if no
dither sample ␯i, so that xi and ␯i will be independent. feedback is present, but fails to do so for systems with
An interesting result has been obtained for a special arbitrary noise shapers.
class of noise-shaper designs by Gerzon et al. [12]. These Dithered systems using spectrally colored dither signals
shapers employ feedback filters H (z) whose filter coeffi- should be designed according to the criteria of Corollary 1
cients are all integers. Gerzon et al. have shown that any when no noise-shaping error feedback is to be used. How-
such system produces precisely the same output as the ever, in most applications the greater flexibility of noise-
system of Fig. 14, which employs no feedback. [The ef- shaping error feedback will supersede the use of spectrally

Fig. 10. PSD␧(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. A
single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by
expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB
input.

PAPERS DITHERED NOISE SHAPERS

shaped dither. When such noise shaping is employed, the dither signal should meet the restrictive conditions of Corollary 2. Many spectrally shaped dithers will introduce unexpected error modulations due to recirculation of the dither by the feedback loop, so that it is no longer independent of the input signal. Thus we recommend simple i.i.d. TPDF dither for most noise-shaping applications, since any desired shape of error spectrum can be achieved by specifying a suitable feedback filter. Such precautions will guarantee that the total systemic error is wide-sense stationary, as is appropriate in audio systems, and is possessed of the expected spectral characteristics.

5 ACKNOWLEDGMENT

Stanley P. Lipshitz and John Vanderkooy have been supported by operating grants from the Natural Sciences and Engineering Research Council of Canada.

6 REFERENCES

[1] S. P. Lipshitz and J. Vanderkooy, "High-Pass Dither," presented at the 4th Regional Convention of the Audio Engineering Society, Tokyo (1989 June); in Collected Preprints (AES Japan Section, Tokyo, 1989), pp. 72–75.
[2] R. A. Wannamaker, S. P. Lipshitz, J. Vanderkooy, and J. N. Wright, "A Theory of Non-Subtractive Dither," IEEE Trans. Signal Process., vol. 48, pp. 499–516 (2000 Feb.).
[3] S. P. Lipshitz, R. A. Wannamaker, and J. Vanderkooy, "Quantization and Dither: A Theoretical Survey," J. Audio Eng. Soc., vol. 40, pp. 355–375 (1992 May).
[4] S. P. Lipshitz and J. Vanderkooy, "Digital Dither," presented at the 81st Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 34, p. 1030 (1986 Dec.), preprint 2412.
[5] J. Vanderkooy and S. P. Lipshitz, "Digital Dither: Signal Processing with Resolution Far below the Least Significant Bit," in Proc. AES 7th Int. Conf. on Audio in Digital Times (Toronto, ON, Canada, 1989 May), pp. 87–96.
[6] R. A. Wannamaker, "Psychoacoustically Optimal Noise Shaping," J. Audio Eng. Soc., vol. 40, pp. 611–620 (1992 July/Aug.).
[7] S. P. Lipshitz, J. Vanderkooy, and R. A. Wannamaker, "Minimally Audible Noise Shaping," J. Audio Eng. Soc., vol. 39, pp. 836–852 (1991 Nov.).
[8] J. S. Lim and A. V. Oppenheim, Advanced Topics in Signal Processing (Prentice-Hall, Englewood Cliffs, NJ, 1988).
[9] A. Papoulis, Probability, Random Variables, and Stochastic Processes, 2nd ed. (McGraw-Hill, New York, 1984).
[10] S. P. Lipshitz and C. Travis, "The Generation of Non-White Dithers of Specified Probability Density Function," presented at the 94th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 41, p. 404 (1993 May), no preprint.
[11] R. A. Wannamaker, "Subtractive and Nonsubtractive Dithering: A Comparative Analysis," J. Audio Eng. Soc., vol. 52 (2004 Dec.), to be published.
[12] M. A. Gerzon, P. G. Craven, J. R. Stuart, and R. J. Wilson, "Psychoacoustic Noise-Shaped Improvements in CD and Other Linear Digital Media," presented at the 94th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 41, p. 394 (1993 May), preprint 3501.
[13] R. A. Wannamaker, "The Mathematical Theory of Dithered Quantization," Ph.D. thesis, Dept. of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada (1997 June).

Fig. 11. PSD_ε(f) for quantizing system with error feedback and using a dither filter with TPDF input and coefficients {1.0, −1.0}. A single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.

Fig. 12. PSD_ε(f) for quantizing system with error feedback and using simple i.i.d. TPDF dither. A single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.

APPENDIX

Dither Theory with Dependent Input and Dither

It is the aim of this appendix to derive conditions on the dither signal ν in a nonsubtractively dithered quantizing system that will guarantee that the moments of the total error signal ε are independent of the input x to the quantizing network. It will not be assumed that any dither and input values are statistically independent of one another, although they may be. We will address both simple one-dimensional moments and multidimensional joint moments between total error values separated in time. The approach used will be of the most general and efficient sort. Readers who are unfamiliar with this approach may wish to refer to [2], [3] for an introduction thereto.

For notational convenience we will define the following vectors:

  x ≜ (x₁, x₂, x₃, . . . , x_N) ∈ ℝ^N
  ν ≜ (ν₁, ν₂, ν₃, . . . , ν_N) ∈ ℝ^N
  ε ≜ (ε₁, ε₂, ε₃, . . . , ε_N) ∈ ℝ^N
  k ≜ (k₁, k₂, k₃, . . . , k_N) ∈ ℤ^N


where ℝ^N is the space of all N-vectors with real components and ℤ^N is the space of all N-vectors with integer components. x₁ and w₁, for example, represent signals present in the system at the same time instant, whereas x₁ and x₂ represent distinct but not necessarily successive samples. We note that if N = 1 then the results that follow reduce directly to the "one-dimensional" results of [2]. However, N may have any value between 1 and infinity.

Using the definition of conditional probability [9], we may express the joint pdf of the signals under consideration as

  p_{ε,y,ν,x}(ε,y,ν,x) = p_{ε|y,ν,x}(ε,y,ν,x) p_{y|ν,x}(ε,y,ν,x) p_{ν,x}(ε,y,ν,x)   (49)

where it should be kept in mind that the arguments and subscripts in general represent vectors. We will compute the factors on the right-hand side of this equation. p_{ν,x} will

Fig. 13. PSD_ε(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. A three-tap FIR noise-shaping filter with coefficients {1.33, −0.73, 0.65} was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.

Fig. 14. System equivalent to that of Fig. 3 for case where all coefficients of error-feedback filter H(z) are integers.


be regarded as given. Since ε ≡ y − x,

  p_{ε|y,ν,x}(ε,y,ν,x) = δ(ε − y + x)

where the delta function with a vector argument is defined as a product of delta functions,

  δ(x) = ∏_{i=1}^{N} δ(x_i).   (50)

Thus the only nontrivial factor in Eq. (49) is p_{y|ν,x}. For simplicity, let us initially consider the case N = 1. Assuming for the sake of definiteness an infinite uniform midtread quantizer with transfer characteristic

  Q(w) = Δ ⌊w/Δ + 1/2⌋   (51)

we observe that if

  (2n − 1)Δ/2 ≤ x + ν < (2n + 1)Δ/2

then the quantizer output is nΔ. Thus p_{y|ν,x} can be expressed as the following product of a window function with an impulse train,

  p_{y|ν,x}(y,ν,x) = Δ Π_Δ[y − (x + ν)] W_Δ(y)   (52)

where

  W_Δ(y) = Σ_{k=−∞}^{∞} δ(y − kΔ).

Since the quantizer output at any particular time is completely determined if the signals x and ν at that time are specified, the treatment is trivially extended to handle N ≥ 1 by defining the following scalar functions of vector arguments:

  Π_Δ(y) = ∏_{i=1}^{N} Π_Δ(y_i)

and

  W_Δ(y) = ∏_{i=1}^{N} W_Δ(y_i) = Σ_{k∈ℤ^N} δ(y − kΔ)

where now k ∈ ℤ^N as defined at the outset. With these definitions, Eq. (52) applies if N ≥ 1.

Taking the Fourier transforms with respect to all variables of the three factors just computed, we obtain

  P_{ε|y,ν,x}(u_ε,u_y,u_ν,u_x) = δ(u_y + u_ε) δ(u_x − u_ε) δ(u_ν)

  P_{y|ν,x}(u_ε,u_y,u_ν,u_x) = δ(u_ε) sinc(u_x) Σ_{k∈ℤ^N} δ(u_x + u_y − k/Δ) δ(u_ν + u_y − k/Δ)

  P_{ν,x}(u_ε,u_y,u_ν,u_x) = δ(u_ε, u_y) P_{ν,x}(u_ν, u_x)

where u_x = (u_{x1}, u_{x2}, u_{x3}, . . . , u_{xN}) ∈ ℝ^N is a vector of transform-domain variables associated with x, where u_ε, u_y, and u_ν are similarly defined, where k ∈ ℤ^N, and where sinc(u) = ∏_{i=1}^{N} sinc(u_i). After transforming, the multiplications of Eq. (49) become convolutions, so in order to compute the joint cf P_{ε,y,ν,x} we convolve the three preceding expressions with one another (separate convolutions over each transform variable being required). After simplification the result is

  P_{ε,y,ν,x}(u_ε,u_y,u_ν,u_x) = Σ_{k∈ℤ^N} sinc(u_ε + u_y − k/Δ) P_{ν,x}(u_ε + u_y + u_ν − k/Δ, u_y + u_x − k/Δ).

The desired marginal cf P_ε is obtained by simply setting unwanted variables to zero [9],

  P_ε(u) = Σ_{k∈ℤ^N} sinc(u − k/Δ) P_{ν,x}(u − k/Δ, −k/Δ).   (53)

Fig. 15. PSD_ε(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. System was presented with null static input (0.0 LSB), and a single-tap noise-shaping filter with coefficient 1.0 was used. (a) Observed PSD. (b) Observed PSD normalized by expected PSD.
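As a numerical aside (ours, not part of the paper), the quantizer model of Eq. (51) can be exercised directly to see the kind of input independence that the characteristic-function conditions buy. The sketch below uses Δ = 1; the sample counts and the particular static test inputs are arbitrary choices. With i.i.d. TPDF dither, the mean and mean square of the total error ε = y − x are the same for every input, whereas the undithered quantizer's error moments depend on x.

```python
import numpy as np

DELTA = 1.0                      # quantizer step (1 LSB); our normalization
rng = np.random.default_rng(1)

def Q(w):
    # Infinite uniform midtread quantizer of Eq. (51): Q(w) = DELTA*floor(w/DELTA + 1/2)
    return DELTA * np.floor(w / DELTA + 0.5)

def error_moments(x, dithered, n=400_000):
    # First two moments of the total error eps = y - x for a static input x,
    # with or without i.i.d. TPDF dither (sum of two independent RPDF samples).
    v = (rng.uniform(-DELTA / 2, DELTA / 2, n) +
         rng.uniform(-DELTA / 2, DELTA / 2, n)) if dithered else np.zeros(n)
    eps = Q(x + v) - x
    return eps.mean(), np.mean(eps ** 2)

for x in (0.0, 0.25 * DELTA, 0.4 * DELTA):
    m1, m2 = error_moments(x, dithered=False)
    d1, d2 = error_moments(x, dithered=True)
    print(f"x = {x:4.2f}  undithered: E[eps] = {m1:+.3f}, E[eps^2] = {m2:.3f}   "
          f"TPDF dither: E[eps] = {d1:+.4f}, E[eps^2] = {d2:.4f}")
```

With the dither on, E[ε] stays near 0 and E[ε²] near Δ²/4 = E[ν²] + Δ²/12 for every input; without it, both moments track the input.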


Moments of ε are determined by the derivatives of its cf at the origin [9]. Consider, for instance, N = 1. Then Eq. (53) becomes

  P_ε(u) = Σ_{k=−∞}^{∞} sinc(u − k/Δ) P_{ν,x}(u − k/Δ, −k/Δ)

and

  E[ε^m] = (j/2π)^m P_ε^{(m)}(0)
         = (j/2π)^m Σ_{k=−∞}^{∞} Σ_{r=0}^{m} \binom{m}{r} sinc^{(r)}(−k/Δ) P_{ν,x}^{(m−r,0)}(−k/Δ, −k/Δ).   (54)

We discern that E[ε^m] is independent of the distribution of x if

  P_{ν,x}^{(i,0)}(k/Δ, k/Δ) = 0   ∀ k ≠ 0, i = 0, 1, 2, . . . , m − 1.   (55)

This is the forward direction of the assertion in Lemma 3. (The converse is proven in [13] using induction.) In this case Eq. (54) reduces to

  E[ε^m] = (j/2π)^m Σ_{r=0}^{m} \binom{m}{r} sinc^{(r)}(0) P_ν^{(m−r)}(0)
         = Σ_{r=0}^{⌊m/2⌋} \binom{m}{2r} (Δ/2)^{2r} E[ν^{m−2r}] / (2r + 1)   (56)

which are Sheppard's corrections.

Consider now the case where N = 2, and let us explore the joint statistics between total error samples separated in time in NSD systems. From Eq. (53) we have

  P_ε(u_ε) = Σ_{k∈ℤ²} sinc(u_ε − k/Δ) P_{ν,x}(u_ε − k/Δ, −k/Δ)

but now we will let ε = (ε₁, ε₂), u_ε = (u_{ε₁}, u_{ε₂}), and k = (k₁, k₂). Then

  P_ε(u_ε) = P_{ε₁,ε₂}(u_{ε₁}, u_{ε₂})
           = Σ_{k∈ℤ²} sinc(u_{ε₁} − k₁/Δ) sinc(u_{ε₂} − k₂/Δ) P_{ν₁,ν₂,x₁,x₂}(u_{ε₁} − k₁/Δ, u_{ε₂} − k₂/Δ, −k₁/Δ, −k₂/Δ).   (58)

The correlation between two total error samples separated in time by a nonzero lag is

  E[ε₁ε₂] ≜ (j/2π)² P_{ε₁,ε₂}^{(1,1)}(0, 0)
  = (j/2π)² Σ_{k∈ℤ²} { sinc^{(1)}(−k₁/Δ) sinc^{(1)}(−k₂/Δ) P_{ν₁,ν₂,x₁,x₂}^{(0,0,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
    + sinc(−k₁/Δ) sinc^{(1)}(−k₂/Δ) P_{ν₁,ν₂,x₁,x₂}^{(1,0,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
    + sinc^{(1)}(−k₁/Δ) sinc(−k₂/Δ) P_{ν₁,ν₂,x₁,x₂}^{(0,1,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
    + sinc(−k₁/Δ) sinc(−k₂/Δ) P_{ν₁,ν₂,x₁,x₂}^{(1,1,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ) }.   (59)

Careful inspection of Eq. (59), keeping in mind that the derivatives of the sinc function vanish at the origin, shows that it will be independent of the system input distribution if and only if the following three conditions are satisfied:

  P_{ν₁,ν₂,x₁,x₂}(k₁/Δ, k₂/Δ, k₁/Δ, k₂/Δ) = 0   ∀ (k₁, k₂) ≠ (0, 0)   (60)

  P_{ν₁,ν₂,x₁,x₂}^{(0,1,0,0)}(k₁/Δ, 0, k₁/Δ, 0) = 0   ∀ k₁ ≠ 0   (61)

  P_{ν₁,ν₂,x₁,x₂}^{(1,0,0,0)}(0, k₂/Δ, 0, k₂/Δ) = 0   ∀ k₂ ≠ 0.   (62)

In that case Eq. (59) reduces to

  E[ε₁ε₂] = (j/2π)² P_{ν₁,ν₂}^{(1,1)}(0, 0) ≜ E[ν₁ν₂].   (63)

We have now derived all of the theorems used in the body of the paper, including, although this may not be obvious, those in Section 2. The latter follow from assuming that the random vectors ν and x are statistically independent (since no feedback is present). That is, we let

  P_{ν,x}(ν, x) = P_ν(ν) P_x(x).

By then insisting that P_x be arbitrary, the results in this appendix immediately reduce to those of [2], [3].

The biographies of Stanley P. Lipshitz and John Vanderkooy were published in the 2004 March issue of the Journal. The biography of Robert A. Wannamaker was published in the 2004 June issue of the Journal.
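The paper's conclusions, and the correlation identity of Eq. (63), lend themselves to a quick simulation. The sketch below is our own illustration (Δ = 1; seeds and signal lengths are arbitrary): without feedback, the simple high-pass dither made by filtering RPDF noise through {1.0, −1.0} yields a wide-sense-stationary total error with variance Δ²/4 and lag-1 correlation equal to that of the dither, E[ν₁ν₂] = −Δ²/12, while i.i.d. TPDF dither inside a one-tap noise shaper leaves the total error white with the same variance.

```python
import numpy as np

DELTA = 1.0
rng = np.random.default_rng(7)

def quantize(w):
    return DELTA * np.floor(w / DELTA + 0.5)   # midtread quantizer, Eq. (51)

def hp_rpdf(n):
    # RPDF noise through the dither filter {1.0, -1.0}: high-pass TPDF dither
    r = rng.uniform(-DELTA / 2, DELTA / 2, n + 1)
    return r[1:] - r[:-1]

def iid_tpdf(n):
    return (rng.uniform(-DELTA / 2, DELTA / 2, n) +
            rng.uniform(-DELTA / 2, DELTA / 2, n))

def total_error(x_dc, v, h=()):
    # Dithered quantizer with optional FIR error feedback h (noise shaping):
    # u[n] = x - sum_j h[j]*eps[n-1-j],  y[n] = Q(u[n] + v[n]),  eps[n] = y[n] - u[n]
    n, eps = len(v), np.zeros(len(v))
    for i in range(n):
        fb = sum(h[j] * eps[i - 1 - j] for j in range(len(h)) if i - 1 - j >= 0)
        u = x_dc - fb
        eps[i] = quantize(u + v[i]) - u
    return eps

def lag1(e):
    e = e - e.mean()
    return float(np.mean(e[1:] * e[:-1]))

n = 200_000
for x in (0.0, 0.4 * DELTA):
    e0 = total_error(x, hp_rpdf(n))               # no feedback, HP dither
    e1 = total_error(x, iid_tpdf(n), h=(-0.5,))   # one-tap shaper, i.i.d. TPDF
    print(f"x={x}: no feedback + HP dither   var={e0.var():.4f} lag1={lag1(e0):+.4f}")
    print(f"x={x}: -0.5 shaper + TPDF dither var={e1.var():.4f} lag1={lag1(e1):+.4f}")
```

The measured statistics match the theoretical values (0.25 and −0.0833 for the first case, 0.25 and 0 for the second) for both static inputs, which is exactly the input independence the conditions guarantee in these two configurations.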


Motion-Tracked Binaural Sound*

V. RALPH ALGAZI, AES Member, RICHARD O. DUDA, AES Member, AND


DENNIS M. THOMPSON, AES Student Member

CIPIC Interface Laboratory, University of California, Davis, CA 95616, USA

A new method is presented for capturing, recording, and reproducing spatial sound that
provides a vivid sense of realism. The method generalizes binaural recording, preserving the
information needed for dynamic head-motion cues. These dynamic cues greatly reduce the
need for customization to the listener. During either capture or recording, the sound field in
the vicinity of the head is sampled with a microphone array. During reproduction, a head
tracker is used to determine the microphones that are closest to the positions of the listener’s
ears. Interpolation procedures are used to produce the headphone signals. The properties of
different methods for interpolating the microphone signals are presented and analyzed.

0 INTRODUCTION

This paper presents and analyzes a new, motion-sensitive method for capturing, recording, and reproducing spatial sound¹ that provides listeners with important dynamic localization cues. The basic idea is to distribute a number of microphones over a surface that approximates the listener's head, and to use a head tracker to determine the location of each of the listener's ears relative to the microphones. If one of the listener's ears happens to coincide with a particular microphone, the signal from that microphone is sent directly to the listener's corresponding headphone. If the ear is between two microphones, the signals are interpolated and sent to the headphones (see Fig. 1). In effect, the procedure places virtual copies of the listener's ears in the incident sound field, moving them as the listener moves his or her head, thereby capturing the dynamic cues that are missing in conventional binaural recordings. We call this procedure motion-tracked binaural (MTB).

MTB sound capture has several advantages over existing techniques. It generates a stable spatial sound image in which the perceived locations of the sound sources do not change when the listener turns his or her head. It greatly reduces the front/back confusion and the "frontal collapse" that occurs with artificial-head binaural recordings. It automatically captures all of the room reflections and reverberation that characterize the acoustic environment. These advantages lead to a strong sense of realism and presence. Experimental results show that the number N of microphones does not have to be large, so that MTB sound capture is relatively efficient in the use of bandwidth. There is no limit to the number of listeners who can move their heads independently and still listen simultaneously to the signals from the same N microphones. The signals for other kinds of spatial sound recordings can readily be transformed into this format, thereby preserving legacy stereo and surround-sound recordings. Finally, the technique can also be used with computer-generated sounds by simulating the microphone array, thereby securing the advantages of MTB for sounds produced in virtual auditory space.

However, the MTB procedure is not without limitations. Both a head tracker and headphones are required.² Although the sound image does not rotate when the listener

*Manuscript received 2003 December 23; revised 2004 September 7.
¹ Patent pending.
² It is possible to reproduce MTB signals directly over loudspeakers without head tracking and achieve many of the spatial effects of headphone listening, much as Johnston and Lam did using another approach [1], but "crosstalk" introduces audible artifacts. When the listeners are in separated sound environments, it is possible to use crosstalk cancellation techniques and replace the headphones by loudspeakers [2], [3], but the cancellation algorithm must be responsive to head motion. In either case, the listener is confined to a relatively small "sweet spot."

Fig. 1. Basic components of motion-tracked binaural system. A head tracker is used to find the microphones closest to the listener's ear and to interpolate between their outputs, thereby dynamically capturing the sound at the point where the ear would be located.
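The microphone selection and crossfade that Fig. 1 describes can be sketched in a few lines. This is a minimal illustration with hypothetical names (`ear_signal`, a ring of equally spaced microphones, a linear crossfade); the paper analyzes several interpolation methods, of which a linear crossfade is only the simplest.

```python
def ear_signal(mic_signals, ear_azimuth_deg):
    """Crossfade the two microphones that bracket the tracked ear position.
    mic_signals: N equal-length sample blocks from microphones equally spaced
    on a horizontal ring; ear_azimuth_deg: ear azimuth from the head tracker."""
    n = len(mic_signals)
    spacing = 360.0 / n
    pos = (ear_azimuth_deg % 360.0) / spacing   # ear position in "microphone units"
    i = int(pos)                                # microphone on one side of the ear
    j = (i + 1) % n                             # its neighbor on the other side
    frac = pos - i                              # 0 -> ear at mic i, 1 -> ear at mic j
    return [(1.0 - frac) * a + frac * b
            for a, b in zip(mic_signals[i], mic_signals[j])]

# Toy check: constant-valued blocks make the crossfade weights visible.
mics = [[float(k)] * 4 for k in range(8)]
print(ear_signal(mics, 90.0))    # ear exactly at microphone 2
print(ear_signal(mics, 112.5))   # ear halfway between microphones 2 and 3
```

When the ear coincides with a microphone the crossfade weight is 1 for that microphone, which is the "sent directly" case described above.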


rotates his or her head, the sound image translates when the listener translates. The perceived source locations are completely stabilized only when the radius of the microphone array is the same as the radius of the listener's head. Additional bandwidth is needed for the microphone signals. If too few microphones are used, objectionable interpolation artifacts may be audible. Because the microphones are not equipped with artificial outer ears or pinnae, the signals lack the listener-dependent spectral cues for elevation.

In this paper we present a detailed analysis of the MTB system and show how many of these limitations can be overcome. We begin with a brief review of the physical and psychophysical principles of spatial hearing, placing particular emphasis on the theoretical properties of the spherical-head model. We then describe the errors that are introduced when the sound field is sampled spatially. We present several alternative approaches of increasing effectiveness for reducing these errors, and the classes of applications for which each is most appropriate.

1 BACKGROUND

1.1 Spatial Hearing

Research on the physical and psychophysical basis for sound localization has a long history [4]–[7]. The many auditory cues used by people include:
1) The interaural time difference (ITD)
2) The interaural level difference (ILD)
3) Monaural spectral cues introduced by the pinnae
4) Torso reflection and diffraction cues
5) The ratio of direct to reverberant energy
6) Cue changes induced by voluntary head motion
7) Familiarity with the sound of the source
All of the acoustic cues vary with azimuth, elevation, range, and frequency. The two interaural difference cues are particularly important, because they are largely independent of the source spectrum. Lord Rayleigh's pioneering and well-known duplex theory asserts that the ITD is exploited at low frequencies and the ILD is exploited at high frequencies, the crossover frequency being around 1.5 kHz [8]. Indeed, the ITD and the ILD are the primary cues for estimating the so-called lateral angle θ, the angle between a ray from the center of the head to the sound source and the vertical median plane. Above 3 kHz the monaural spectral changes introduced by the pinnae provide the primary cues for estimating elevation [9], whereas below 3 kHz the torso provides weak but useful elevation cues [10]. For estimating range, the primary cues appear to be familiarity with the source [11], the ILD for close sources [12], and the direct-to-reverberant energy ratio for distant sources [13].

The fact that people also use head motion to help localize sounds has long been recognized [14]. In a series of classic experiments, Wallach demonstrated that motion cues can override pinna cues in resolving front/back confusion [15], [16]. Although pinna cues are also important [17], and although head motion is not effective for localizing very brief sounds [5], subsequent research has largely confirmed the importance of these dynamic cues for resolving front/back ambiguities and improving localization accuracy. (See [18] for recent results and additional references.)

1.2 Spatial Sound Technology

There are fundamentally only two different engineering approaches to reproducing spatial sound: wavefield synthesis and binaural reproduction [19], [20]. When used for faithful reproduction, stereo and the various forms of surround sound (such as quadraphonics, 5.1-channel surround, Ambisonics) can all be viewed as attempts to reconstruct the sound field that was sampled by the recording microphones. Although this technology is commercially dominant, the theoretical requirements for exact wavefield synthesis throughout the audible frequency range are severe [21]. For example, the required area sampling density for a hexagonal array of microphones spaced a half-wavelength apart is 8/(√3 λ²); for 20-kHz bandwidth, this calls for about 15 700 microphones per square meter. Ambisonic recording provides a theoretically well founded local approximation to exact reconstruction that is vastly more efficient, but the listener is confined to a relatively small "sweet spot," and multiple loudspeakers are still needed for spatially faithful reproduction [22], [23].

Binaural sound reproduction has the great advantage of being able to produce fully three-dimensional sound with only two signals: the pressure waveforms at the listener's eardrums. Reproduced over properly compensated headphones, binaural reproduction can sound impressively realistic. However, there are several reasons why the binaural approach has not been accepted widely.
1) The listener must either wear headphones or be confined to a small "sweet spot."
2) Differences between the size and shape of the pinnae of the dummy head and the pinnae of the listener can cause the apparent source elevation to be either poorly defined or seriously in error.
3) The perceived auditory field turns if the listener turns his or her head. This is unacceptable if the sound must be spatially registered with imagery.
4) Sound sources that are in front are often perceived as being in back or in the head.
The first of these problems is intrinsic to binaural reproduction. The second problem can be ameliorated, though it remains a challenge. However, the other two problems can be solved completely if head motion is taken into account.

In 1941 de Boer and van Urk showed that front/back confusion could be eliminated with a spherical dummy head by rotating the head back and forth, provided that the listener turned his or her head back and forth in synchrony with it [24]. More recent work using a head tracker on the listener and a servomechanism to turn the dummy head in accordance with the listener's head motion produced a stable acoustic field and eliminated front/back confusion [25]. Clearly, this solution cannot be used to record sound, and even for remote listening it requires a separate dummy head for each listener. The MTB method can be viewed as a generalization of the servomechanism approach that 1) eliminates the need for physically turning the dummy

head, 2) allows recording as well as remote listening, and 3) allows multiple listeners to listen simultaneously.

Binaural reproduction also plays a central role in the creation of virtual auditory space. Here the left-ear and right-ear signals for any sound source are computed by convolving the source signal with the head-related impulse responses for the respective ears [26]–[28]. The Fourier transforms of these impulse responses are the head-related transfer functions (HRTFs), which capture all of the acoustic sound localization cues. Because the HRTFs depend on the location of the source relative to the head, the HRTFs change if the source moves or if the listener turns his or her head. The use of a head tracker to modify HRTFs in real time was reported as early as 1988 [29], and is now common practice [26]. In particular, it is the basis for the stabilization of stereo and surround-sound recordings for headphone listening [30]. The techniques used to generate virtual auditory space can be readily modified to simulate MTB sound capture, thereby allowing multiple listeners to experience the same computer-generated spatial sounds.

A variety of different dummy heads have been developed for binaural recording [31]. Differences in their pinna shapes produce corresponding differences in their HRTFs, particularly at high frequencies. In the absence of head motion, these differences impact the perception of elevation, front/back discrimination, and externalization [32]. In our experience, however, the dynamic cues that arise from head motion are sufficiently strong that they often dominate pinna cues. The subtle differences between these HRTFs become much less significant when head motion is accounted for. In fact, with proper compensation, remarkably good results can be obtained from a spherically shaped head. This leads us to examine the theoretical behavior of an ideal spherical-head model.

2 SPHERICAL-HEAD MODEL

Consider an ideal rigid sphere of radius a that is scattering incident plane waves of angular frequency ω. In particular, suppose that the free-field sound pressure at the origin (the pressure due to the source when the sphere is removed) is given by the real part of P_ff exp(jωt), where P_ff is the phasor free-field pressure. Let α be the observation angle, the angle between a ray to the sound source and a ray to any observation point on the surface of the sphere (see Fig. 2), and let the total pressure at the observation point be the real part of P_op exp(jωt), where P_op is the phasor pressure at the observation point. Then it can be shown that the HRTF H is given by

  H(ω, α) = P_op / P_ff = (1/μ²) Σ_{m=0}^{∞} j^{m−1} (2m + 1) P_m(cos α) / h′_m(μ)   (1)

where μ = ωa/c is the normalized frequency, c is the speed of sound, P_m(·) is the Legendre polynomial of degree m, and h′_m(·) is the derivative of the spherical Hankel function of order m [33].

This solution for an idealized model provides a useful and widely used first-order approximation to a human HRTF. For best results, the radius a should be adapted to each listener [34]. However, the major features of the HRTF behavior can be illustrated using the traditional value of 87.5 mm for the head radius. We used this value and the algorithm given in [33] to compute H numerically. The HRTFs for other radii can be found by scaling frequency inversely with head size.

Fig. 3 shows the resulting magnitude responses for 19 different observation angles. Because the sphere does not appreciably disturb the incident field at low frequencies (frequencies where the wavelength is greater than the circumference of the sphere), all of the curves approach 0 dB at low frequencies. In general, the high frequencies are boosted on the ipsilateral side of the sphere (α < 90°) and are cut on the contralateral side (α > 90°). This contralateral high-frequency attenuation is commonly referred to as "head shadow." However, the strongest head shadow does not occur at α = 180°. As the observation point approaches 180°, it enters the so-called "bright spot," where waves traveling over the surface of the sphere come together in phase and the response becomes essentially flat. A consequence of the extreme symmetry of the sphere, the bright spot is not as pronounced in human HRTF data, although it can be seen there as well [35].

Fig. 2. Infinitely distant sound source producing plane waves that are scattered by a rigid sphere. Pressure at observation point varies with frequency ω, observation angle α, and radius a of sphere.

Fig. 3. HRTF magnitude response for 87.5-mm-radius sphere. Effects of head shadow are revealed by reduced high-frequency response on contralateral side (α > 90°).

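The series of Eq. (1) is straightforward to evaluate numerically. The following Python sketch is illustrative only; the use of spherical Hankel functions of the second kind (appropriate for the exp(jωt) convention assumed here), the upward recurrences, and the 60-term truncation are our implementation choices, not details from the paper:

```python
import math

def sphere_hrtf(mu, alpha, terms=60):
    """Evaluate the spherical-head HRTF series of Eq. (1).

    mu    -- normalized frequency, mu = omega * a / c
    alpha -- observation angle in radians (0 = facing the source)

    Spherical Hankel functions of the second kind, h_m = j_m - 1j*y_m,
    are built by the upward recurrence h_{m+1} = ((2m+1)/x) h_m - h_{m-1},
    and the derivative comes from h'_m = h_{m-1} - ((m+1)/x) h_m.
    """
    x = mu
    # h_0 and h_1 from the closed forms of j_m and y_m
    h = [complex(math.sin(x), math.cos(x)) / x,
         complex(math.sin(x) / x - math.cos(x),
                 math.cos(x) / x + math.sin(x)) / x]
    for m in range(1, terms):
        h.append((2 * m + 1) / x * h[m] - h[m - 1])
    # Legendre polynomials P_m(cos alpha) by the three-term recurrence
    z = math.cos(alpha)
    p = [1.0, z]
    for m in range(1, terms):
        p.append(((2 * m + 1) * z * p[m] - m * p[m - 1]) / (m + 1))
    total = 0j
    for m in range(terms):
        hd = -h[1] if m == 0 else h[m - 1] - (m + 1) / x * h[m]
        total += 1j ** (m - 1) * (2 * m + 1) * p[m] / hd
    return total / x ** 2
```

For small μ the magnitude approaches unity (0 dB), and for μ of a few units the ipsilateral response exceeds the shadowed contralateral response, consistent with the behavior described for Fig. 3.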
1144 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November
PAPERS MOTION-TRACKED BINAURAL SOUND

These results can also be used to compute the ILD as a function of the lateral angle θ. Although people's ears are usually somewhat below and behind the center of the head, for simplicity we assume that the ears are on opposite sides of a diameter of the sphere. With that assumption, θ = π/2 − α (see Fig. 4). It follows that the ILD (in dB) is given by

    ILD(ω, θ) = 20 log₁₀ | H(ω, π/2 − θ) / H(ω, π/2 + θ) |.    (2)

The variations of the ILD with the lateral angle are shown in Fig. 5. Note that substantial interaural level differences can be developed for frequencies above 1.5 kHz, where the ILD contributes strongly to sound localization. The reduction in the magnitude of the ILD as |θ| approaches 90° is due to the bright spot. ILDs measured for human heads have this same general behavior, but because the bright spot is not as strong, they tend to vary more monotonically with the angle.

The equally important ITD can be obtained from the phase response.³ The variations of the ITD with the lateral angle are shown in Fig. 6. For human hearing the most important frequency range is below 1.5 kHz [37]. Below about 600 Hz it can be shown that the first two terms in Eq. (1) provide the following low-frequency approximation:

    ITD_low-frequency ≈ 3 (a/c) sin θ.    (3)

A simple ray-tracing argument provides an approximation for the high-frequency ITD, which is known as Woodworth's formula [38],

    ITD_high-frequency ≈ (a/c)(θ + sin θ).    (4)

As Fig. 6 illustrates, these approximations agree closely with the computed results. Although the perceptual significance of the difference between low- and high-frequency behavior has been questioned [7], we note in passing that the difference between low-frequency and high-frequency ITDs is greatest when θ = 60°, and the percentage difference is greatest when θ = 0°. To be more specific, the percentage difference is 35.8% when θ = 60° and 50% when θ = 0° [39].

It is natural to ask how well the HRTF for the spherical-head model matches human HRTFs. The pinna has a strong effect on the HRTF at high frequencies, which complicates a direct comparison. However, a simple comparison can be made with the HRTF for the KEMAR mannequin, for which the pinnae can be removed. Figs. 7 and 8 show the angular dependence of the ILD and the ITD for a pinnaless KEMAR for a source in the anterior horizontal plane.

³ Although the group delay is sometimes used to define the ITD, its frequency dependence is usually determined from the phase delay [36], which is consistent with neurophysiological auditory models. For this reason we also use the phase delay to define the ITD. Below 1.5 kHz, where hearing is phase sensitive, the differences between the two measures of the ITD for the sphere are relatively small [4, p. 74].

Fig. 4. Top view of listener's head for source in the horizontal plane. Because of symmetry of sphere, same diagram applies to any plane through the interaural axis. ILD and ITD are constant on a surface of constant lateral angle, called the cone of confusion.

Fig. 5. Angular dependence of ILD for 87.5-mm-radius sphere. Above 1.5 kHz magnitude of ILD provides a significant cue for the lateral angle. Bright spot reduces ILD when sound source is directed into one ear.

Fig. 6. Angular dependence of phase-derived ITD for 87.5-mm-radius sphere. The three solid lines show exact results; low-frequency approximation ∇ obtained from Eq. (3); Woodworth's formula [Eq. (4)] provides a simple high-frequency approximation Δ.

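Eqs. (3) and (4) are easy to compare numerically. This short sketch (ours, using the 87.5-mm radius of the figures) reproduces the percentage differences quoted above:

```python
import math

A = 0.0875   # sphere radius in meters (87.5 mm, as used in the figures)
C = 343.0    # speed of sound, m/s

def itd_low(theta):
    """Eq. (3): low-frequency ITD of the spherical model, in seconds."""
    return 3.0 * A / C * math.sin(theta)

def itd_high(theta):
    """Eq. (4): Woodworth's high-frequency approximation, in seconds."""
    return A / C * (theta + math.sin(theta))
```

At θ = 60° the low-frequency value exceeds Woodworth's by 35.8%, and the ratio approaches 3/2 (a 50% difference) as θ → 0, matching the figures quoted in the text.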
ALGAZI ET AL. PAPERS

There are noticeable differences between the low-frequency ILDs for the KEMAR and the sphere (compare Figs. 5 and 7). Most of these differences can be attributed to torso reflections [35]. In addition, the bright spot, which reduces the ILD when |θ| is close to 90°, is significantly stronger for the sphere than for the KEMAR. Despite these differences, one sees the same general behavior, namely, a strong increase in ILD with frequency and with lateral angle. In addition, the equally important ITD response is quite close to the results for the sphere, both at low and at high frequencies (compare Figs. 6 and 8). Clearly, the general behavior of the HRTF for the sphere retains the basic features of the HRTF for a pinnaless KEMAR.

In the remainder of this paper we find it more revealing to display frequency-response data using polar coordinates and an image representation. For purposes of illustration, the magnitude response data shown in Fig. 3 are presented as an image in Fig. 9. Here the brightness at any point in the image represents the dB magnitude, the radius specifies the frequency on a logarithmic scale, and the polar angle directly specifies the observation angle. This makes it easy to visualize how the sound spectrum changes and, by direct implication, how the ILD changes as the head is rotated. In Fig. 9 the incident sound is propagating down the positive y axis. Thus the ipsilateral responses appear in the top half of the image, and the contralateral responses appear in the bottom half. The strong response on the ipsilateral side and both the head shadow and the bright spot on the contralateral side are clearly seen in this representation.

Figs. 5, 6, and 9 show how the critical cues provided by a spherical-head model vary with rotation, revealing a significant and continuous variation of the ILD, the ITD, and the monaural spectrum with the lateral angle. Although the corresponding behavior of a human head is more complex, the dynamic effects that are produced by this simple approximation can produce a compelling perceptual experience. The MTB recording method approximates this continuous behavior through sampling and interpolation, thus introducing errors. We now use the spherical-head model to investigate the spectral errors that different interpolation procedures introduce.

3 MTB SAMPLING AND INTERPOLATION

3.1 The Interpolation Problem

The core problem for MTB sound capture is to recover the sound pressure at the location of a listener's ear from the signals picked up by a small number of microphones. In this section we analyze and evaluate the behavior of three different interpolation procedures. In each case we assume that the N microphones are equally spaced around the equator of a rigid sphere. Fig. 10 illustrates this case for N = 8. We assume that the signals from a head tracker can be used to determine the locations of the listener's left and right ears relative to the center of the sphere. The problem is to find a good approximation to the signals at the corresponding ear locations on the sphere.

3.2 Nearest Microphone Selection

A simple way to obtain a signal for a particular ear is merely to select the microphone nearest to that ear. This is analogous to temporal sampling with a sample-and-hold circuit. The obvious limitation of this method is that it

Fig. 7. Angular dependence of ILD for KEMAR mannequin with pinnae removed and source in front in the horizontal plane. Comparison with ideal response (Fig. 5) reveals many differences in detail but generally similar behavior.

Fig. 8. Angular dependence of ITD for KEMAR mannequin with pinnae removed. Comparison with ideal response (Fig. 6) reveals similar behavior at both low and high frequencies.

Fig. 9. Image representation of magnitude response of Fig. 3. Response in dB for any particular observation angle α can be seen from brightness variations along corresponding radial line. Center of circle corresponds to low frequencies (0.1 kHz); outermost circle corresponds to high frequencies (10 kHz). For clarity, coordinate system shown in (a) is removed in (b).


divides the circle into N sectors within which the system does not respond to changes in head motion. Thus for N = 8, listeners will hear the location of the source turn with their heads as they turn through a 45° angle, and then suddenly jump back to the initial position each time a switching boundary is crossed. In addition to the positional jump, a discontinuity or "click" will be heard each time a boundary is crossed.⁴

Discontinuities occur in the ITD, the ILD, and the monaural spectrum, which are piecewise constant functions of the lateral angle. These discontinuities are clearly visible in the spectra shown in Fig. 11. Here the source is assumed to be aimed directly at the first microphone and infinitely distant, that is, propagating plane waves. Fig. 11(a) shows the magnitude spectrum for an eight-microphone MTB array. If there were no error, this image would be identical to the image in Fig. 9(b). As Fig. 11(c) illustrates, a much better approximation can be obtained by increasing the number of microphones to 32. In practice, many microphones are required to make the artifacts introduced by this method inaudible.

3.3 Full-Bandwidth Linear Interpolation

The discontinuities of nearest microphone selection can be eliminated by interpolating rather than switching between the outputs of the microphones. Let β_N = 2π/N be the angle in radians between two adjacent microphones, and let β be the angle between the ear and the nearest microphone (see Fig. 12). Then if x_n(t) is the output of the nearest microphone and x_nn(t) is the output of the next nearest microphone, the signal x(t) at the ear location can be estimated by

    x̂(t) = (1 − w) x_n(t) + w x_nn(t)    (5)

where the interpolation coefficient w is given by

    w = β/β_N.    (6)

However, when microphone signals are linearly combined, phase interference can produce comb filtering and significant linear distortion.

⁴ These switching artifacts can be reduced by cross fading rather than switching between microphones, and by including hysteresis to prevent "chattering" when the listener's head is on a switching boundary. However, unless the number of microphones is quite large, listeners are still aware of the sudden changes that occur when a switching boundary is crossed.

Fig. 10. Using eight microphones to sample sound field around equator of rigid sphere. The problem is to produce the signals that would be picked up at the locations of the left and right ears.

Fig. 11. (a) Magnitude response for interpolation by nearest microphone selection for eight-microphone MTB array. If there were no error, response would be the same as in Fig. 9(b). (b) Interpolation error. Largest errors occur when ear is on contralateral side. However, discontinuities at visible sector boundaries are also audible. (c), (d) Corresponding results for 32-microphone array.

Fig. 12. Angles to nearest and next nearest microphones, which are used by MTB system for interpolation of microphone signals.

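A minimal sketch of the microphone selection and weighting of Eqs. (5) and (6); the equally spaced equatorial layout of Fig. 10 is assumed, and the function names are ours:

```python
def mic_weights(ear_angle_deg, n_mics):
    """Nearest/next-nearest microphone selection and the linear
    interpolation weight of Eqs. (5) and (6).

    Microphone k is assumed to sit at k * 360/N degrees around the
    equator.  Returns (nearest, next_nearest, w), where w = beta/beta_N
    and beta is the angular distance from the ear to the nearest
    microphone (so 0 <= w <= 0.5).
    """
    pos = (ear_angle_deg % 360.0) * n_mics / 360.0   # ear position in sectors
    nearest = int(round(pos)) % n_mics
    offset = pos - round(pos)                        # signed, in sectors
    next_nearest = (nearest + (1 if offset > 0 else -1)) % n_mics
    return nearest, next_nearest, abs(offset)

def interpolate(x_n, x_nn, w):
    """Eq. (5): x_hat = (1 - w) x_n + w x_nn, sample by sample."""
    return [(1.0 - w) * a + w * b for a, b in zip(x_n, x_nn)]
```

For example, with N = 8 an ear at 10° lies between microphones 0 (at 0°) and 1 (at 45°), giving w = 10/45.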

The degree to which x̂(t) approximates x(t) can be determined by comparing H(ω, α), the desired HRTF, to Ĥ_N(ω, α), the HRTF produced by Eq. (5),

    Ĥ_N(ω, α) = (1 − w) H(ω, α − β) + w H(ω, α + β_N − β)
              = [1 − (α − α_n)/β_N] H(ω, α_n) + [(α − α_n)/β_N] H(ω, α_n + β_N)    (7)

where α is the angle between the sound source and the ear, and α_n = α − β is the angle between the sound source and the nearest microphone. We will use this expression to evaluate the interpolation error. However, we begin with a simple approximate analysis that provides some physical insight.

3.3.1 Approximate Analysis

When adjacent microphones are sufficiently close, the primary difference between x_n(t) and x_nn(t) is a time delay T. Thus x_nn(t) ≈ x_n(t − T), and the signal x(t) at the ear occurs at an intermediate delay, x(t) ≈ x_n(t − wT). By substituting x_n(t − T) for x_nn(t) in Eq. (5) and approximating x_n(t − T) by the first two terms in a Taylor's series expansion, we obtain

    x̂(t) ≈ (1 − w) x_n(t) + w x_n(t − T)
         ≈ (1 − w) x_n(t) + w [x_n(t) − ẋ_n(t) T]
         ≈ x_n(t) − ẋ_n(t)(wT)
         ≈ x_n(t − wT)

as desired. That is, the weighted combination of the signal and the time-delayed signal is approximately the signal that arrives at the ear, and thus linear interpolation produces the equivalent intermediate delay for any signal direction.

However, the Taylor's series approximation breaks down when the delay T is so large that quadratic and higher order terms are required. For a sinusoidal signal, the approximation becomes poor when the delay is greater than about a quarter of a period. If the source contains no significant energy above some maximum frequency f_max, we can expect that the linear interpolation will be acceptable if T < 1/(4 f_max).

In addition, note that when x_nn(t) ≈ x_n(t − T) and when w = 0.5, it follows from Eq. (5) that X̂(ω) = Ĥ(ω) X_n(ω), where Ĥ(ω) = 0.5 [1 + exp(−jωT)]. This is the transfer function for a comb filter. This filter has its first notch at f = 1/(2T), and its response is 3 dB down at f = 1/(4T). It follows that spectral coloration will be strong unless T < 1/(4 f_max). Thus both the time-delay and the spectral-coloration considerations lead to the requirement that the delay be less than a quarter of the shortest period.

Large delays cause serious problems. The delay T is maximum when the sound wave is traveling around the sphere from one microphone to the next. Because the distance between microphones is 2πa/N, the maximum value for T is 2πa/(Nc). Thus for good performance in the worst-case situation there should be no significant spectral energy above

    f_max = Nc/(8πa).    (8)

Equivalently, the distance between microphones should be no more than one quarter of the shortest wavelength.⁵ For N = 8 and a = 87.5 mm, f_max = 1.25 kHz. In principle, 128 microphones are needed to meet the sampling conditions out to 20 kHz. Fortunately, as we shall see in Section 3.4, simple modifications of the full-bandwidth interpolation procedure reduce this requirement significantly.

3.3.2 Exact Response

We now compare the spectra for the exact HRTF and the interpolated HRTF. Similar errors also appear in the ITD and the ILD. Ideally, we would like to see no difference between the exact HRTF H(ω, α) given by Eq. (1) and the interpolated HRTF Ĥ_N(ω, α) given by Eq. (7). We used Eq. (7) and the algorithm given in [33] to compute Ĥ_N(ω, α) for the case a = 87.5 mm and for several different values of N. In every case we use the microphone configuration shown in Fig. 10, with the first microphone at the top.

Fig. 13(a) shows the dB magnitude of Ĥ_N(ω, α) for the case where N = 8 and where the source is directed at the first microphone. The error, shown in Fig. 13(b), is small at low frequencies, and there is no error at all when the ear is positioned at one of the microphones. This is reflected visually in the clear division of the image into eight sectors, defined by the eight radial streaks in the Fig. 13 images. However, as Eq. (8) predicts, significant errors occur above about 1.25 kHz. The negative errors, shown as dark spots in Fig. 13(b), are a consequence of phase interference. The positive errors, shown as bright streaks at the bottom of Fig. 13(b), stem from the failure of the interpolation procedure to properly reproduce the strong head shadow that occurs between microphone locations.

⁵ The microphones can be thought of as sampling the sound field spatially. From sampling theory one might expect that it would be sufficient to have two samples per wavelength. However, that would require interpolation involving more microphones and a more sophisticated interpolation procedure.

Fig. 13. (a) Magnitude response for nearest microphone full-bandwidth interpolation for eight-microphone MTB array. As in Fig. 9, frequency range is from 100 Hz to 10 kHz. In the worst case, a deep interference notch occurs near 2.5 kHz, which is twice f_max. (b) Interpolation error. Dotted circle identifies 1-kHz frequency contour; solid contours are for −3 dB. Error is small for frequencies ≤ 1.5 kHz.

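The worst-case delay and the limits it implies can be checked directly. This sketch (ours) evaluates Eq. (8) and the first comb-filter notch for the array geometry discussed above:

```python
import math

def worst_case(n_mics, a=0.0875, c=343.0):
    """Worst-case inter-microphone delay and the limits it implies.

    T       = 2*pi*a / (N*c) -- wave creeping between adjacent microphones
    f_max   = 1 / (4*T)      -- Eq. (8): f_max = N*c / (8*pi*a)
    f_notch = 1 / (2*T)      -- first comb-filter notch when w = 0.5
    """
    t = 2.0 * math.pi * a / (n_mics * c)
    return t, 1.0 / (4.0 * t), 1.0 / (2.0 * t)
```

For N = 8 and a = 87.5 mm it returns f_max ≈ 1.25 kHz and a worst-case notch near 2.5 kHz, matching the values quoted in the text and in the caption of Fig. 13.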

Slightly different results are obtained if the source is moved so that it is not aimed directly at a microphone. The results when the source is rotated 11.25° and 22.5° clockwise are shown in Fig. 14. Note that when the source is rotated 22.5°, the incident sound wave arrives at the first two microphones at the same time, and no interference notches develop. In general, the errors are reduced when the source is aimed at a point between the microphones, but the overall behavior is generally similar to that shown in Fig. 13.

For applications in which the source sounds have limited bandwidth and the requirements for sound quality are not high, an eight-microphone array with full-bandwidth interpolation may suffice. In particular, the sound coloration introduced by the interference notches is less perceptible than might be expected, and even though an eight-channel system would not be acceptable for high-quality music reproduction, it is very suitable for speech applications. As Eq. (8) indicates, better performance can be obtained by increasing the number of microphones.

The results for 16 and 32 microphones are shown in Fig. 15. As expected, the response gets closer to the ideal response shown in Fig. 9(b) as the number of microphones is increased. In theory the error can be made arbitrarily small, but at the cost of requiring an arbitrarily large number of microphones. Fortunately more efficient solutions can be obtained by exploiting the characteristics of human spatial hearing.

3.4 Two-Band Interpolation

The direction-dependent audible artifacts due to phase interference that are revealed in Figs. 13 to 15 can be eliminated from the interpolated signal x̂(t) by employing a low-pass antialiasing filter and then restoring the high-frequency components. This approach fits very naturally with Rayleigh's duplex theory of sound localization. It exploits the fact that the low-frequency ITD is the dominant cue for sound localization [37], and that simple linear interpolation will produce the correct ITD cues at low frequencies. It also exploits the fact that the human auditory system is insensitive to the interaural time differences for frequencies above 1.3–1.4 kHz [40], [41]. Thus the ITD cues are preserved in the low-pass filtered signal if f_max ≥ 1.5 kHz. Although the ILD cues are not as critical as the ITD cues, they are also important. The degree to which the ILD cues are preserved depends on the method used to restore the high frequencies. We consider three approaches of increasing capability and complexity in turn.

3.4.1 Fixed-Microphone Restoration

A simple way to restore the missing high frequencies is to add in a high-pass-filtered version of the signal x_c(t) from a fixed omnidirectional microphone called the complementary microphone. Fig. 16 shows a block diagram for this procedure. Here H_LP is the low-pass antialiasing filter, and H_HP is a complementary high-pass filter.

Fig. 14. As in Fig. 13, but with source rotated. Note that there is less phase interference when source is directed between microphones and signals arrive more nearly in phase. (a) Source at 11.25°. (c) Source at 22.5°. (b), (d) Error.

Fig. 15. Magnitude response for nearest microphone full-bandwidth interpolation for (a) 16 and (c) 32 microphones. Worst-case notch frequencies are approximately 6 kHz for 16 microphones and 12 kHz for 32 microphones. In general, response gets closer to ideal response of Fig. 9(b) as number of microphones increases. (b), (d) Error.

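The two-band structure of Fig. 16 can be sketched with an ideal complementary filter pair. The brickwall DFT split below is our simplification of the H_LP/H_HP pair; any complementary pair could be substituted:

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def two_band(x_low_source, x_high_source, cutoff_bin):
    """Combine the low band of one signal with the high band of another
    using an ideal (brickwall) complementary filter pair."""
    n = len(x_low_source)
    lo, hi = dft(x_low_source), dft(x_high_source)
    out = [lo[k] if min(k, n - k) <= cutoff_bin else hi[k]
           for k in range(n)]
    return idft(out)
```

Because the two bands are complementary, feeding the same signal to both inputs reconstructs it exactly; in the scheme of Fig. 16 the low input is the interpolated signal and the high input is the complementary microphone.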

The results of this procedure are shown in Fig. 17 for the case of eight microphones and ideal low-pass and high-pass filters with a cutoff frequency of 1.5 kHz. Below the cutoff frequency both the ITD and the ILD are essentially correct. However, above the cutoff frequency both the ITD and the ILD are zero. The single complementary microphone is able to restore the spectral energy above the cutoff frequency, but the high-frequency directional cues are wrong for sound sources that are not in the median plane.

Because the auditory system is not sensitive to phase at high frequencies, the erroneous high-frequency ITD cues may not be serious. However, the erroneous ILD cues produce perceptual errors. Sound sources that have little energy above the cutoff frequency tend to be heard correctly. Sound sources that have most of their energy above the cutoff frequency appear to be in the median plane, usually at or near the center of the head. In our informal tests, broad-band sound sources often produce a split sound image, with a low-frequency image heard correctly and a high-frequency image heard near the center of the head. Thus this simple procedure is potentially very useful for speech and similar limited-bandwidth applications, but is less than ideal.

3.4.2 Nearest Microphone Restoration

The high-frequency ILD can be roughly restored by using the nearest microphone to provide the high-frequency information, that is, by letting x_c(t) = x_n(t). This sample-and-hold approach for the high frequencies leads to the magnitude responses shown in Fig. 18.

It is important to note that the performance of this procedure is significantly better than the elementary nearest microphone procedure illustrated in Fig. 11. For wide-band sources one can still hear the spectral discontinuities that are clearly visible in Fig. 18(a). These can be perceived as sudden changes in the brightness of the tone color, perhaps accompanied by small jumps in location due to jumps in the high-frequency ILD. However, below 1.5 kHz both the temporal and the spectral cues vary continuously with the head motion, which largely eliminates the positional jumps heard with the simpler procedure.

3.4.3 Spectral-Interpolation Restoration

It is also possible to use a small number of microphones and spectral interpolation to obtain high-frequency content that varies continuously with the head motion. Let M_n(ω) be the magnitude of the short-time Fourier transform of x_n(t), and let M_nn(ω) be the magnitude of the short-time Fourier transform of x_nn(t). Then we can estimate the magnitude of the short-time Fourier transform of x(t) by

    M_c(ω) = (1 − w) M_n(ω) + w M_nn(ω)    (9)

and we can use any of several standard methods to recover the time signal x_c(t) from M_c(ω) [42], [43].

The magnitude responses resulting from this procedure are shown in Fig. 19. The responses now vary continuously with the head motion, and there are no artifacts from switching discontinuities. As with all of the two-band procedures, the ITD cues are properly reproduced.

Fig. 16. Signal processing for two-band interpolation. Signals from nearest and next nearest microphones are interpolated and low-pass filtered. High frequencies are restored by adding a high-pass-filtered version of signal x_c from a complementary microphone.

Fig. 17. Two-band interpolation with high frequencies restored from fixed omnidirectional microphone. Low-frequency interaural cues are correct, but high-frequency interaural cues are incorrect. (a) Single-microphone restoration. (b) Error.

Fig. 18. Two-band interpolation with high frequencies restored from nearest microphone. (a) 8 microphones. (c) 32 microphones. (b), (d) Error.

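Eq. (9) itself is a one-line operation. The sketch below (ours) interpolates magnitude spectra bin by bin, leaving phase recovery to the standard magnitude-only reconstruction methods cited above [42], [43]:

```python
def interp_magnitude(mag_n, mag_nn, w):
    """Eq. (9): interpolate short-time spectral magnitudes bin by bin.

    mag_n and mag_nn are the STFT magnitudes of the nearest and next
    nearest microphone signals; w is the weight of Eq. (6).
    """
    return [(1.0 - w) * m1 + w * m2 for m1, m2 in zip(mag_n, mag_nn)]
```

At w = 0 the result is simply the nearest microphone's spectrum, and the estimate crossfades smoothly toward the next nearest microphone as the ear moves across the sector.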

With 32 microphones the ILD is also properly reproduced, and it is reasonably well approximated even with only eight microphones.

Fig. 19. Two-band interpolation with high frequencies restored by spectral interpolation. (a) 8 microphones. (c) 32 microphones. (b), (d) Error.

3.5 Comparison of Interpolation Methods

We have presented three different methods for interpolation, the third method having three different ways to restore high frequencies. For brevity we identify these methods as follows:

NM   Nearest microphone selection
FB   Full-bandwidth interpolation
TB1  Two-band interpolation, fixed-microphone restoration
TB2  Two-band interpolation, nearest microphone restoration
TB3  Two-band interpolation, spectral-interpolation restoration.

Of these five methods, NM is the simplest, TB3 provides the best performance, and TB2 offers an attractive simplicity/performance compromise. However, each method might be the preferred choice for a particular application, and in this section we compare their different advantages and disadvantages. A concise summary is given in Table 1.

If the use of 16 or more microphones is acceptable, the conceptual simplicity of NM is attractive. However, with a small number of microphones the spatial instability and discontinuities make it unacceptable for music, and its use is limited to low-quality applications.

The instability and discontinuity problems of NM are largely removed by the other methods. Although the spectral notches introduced at high frequencies make FB unacceptable for music (see Figs. 13–15), in our informal listening tests FB is remarkably good for speech. It is worth observing that reflections from walls, tables, and other environmental surfaces also introduce spectral notches that change with changes in head position, and familiarity with these effects may account for part of the surprisingly small degree to which this spectral coloration is distracting.

The two-band methods (TB1, TB2, and TB3) exploit the psychoacoustics of spatial hearing, with the ITD cues being confined to the low-frequency band. All of them essentially eliminate the spectral notches (see Figs. 17–19). By using only one full-bandwidth channel, TB1 is particularly efficient in the use of bandwidth. The price for this bandwidth efficiency is the absence of high-frequency ILD cues and the appearance of split images for wide-band sources. However, TB1 is an attractive option for speech, and it may even be acceptable for moderate-quality music if the dominant sources are more or less in front of the listener. By sacrificing bandwidth efficiency and including the high-frequency ILD cues, TB2 provides good performance on music as well as speech with a small number of microphones. However, the spectral differences in the different sectors revealed in Fig. 18(a) are audible. TB3 removes this flaw at the cost of higher computational requirements.

4 EXTENSIONS AND APPLICATIONS

4.1 Spatial Sampling Strategies

There are several different ways to sample the space around the head with microphones. The method of spatial sampling that is most appropriate depends on the nature of the application. We find it useful to distinguish three gen-

Table 1. Comparison of five methods for MTB spatial sound capture.*

Advantages                                     NM    FB    TB1   TB2   TB3
Captures ITD cues                              *     ***   ***   ***   ***
Captures ILD cues                              *     *           **    ***
Stabilizes sound images with few microphones         *     *     **    ***
Responds continuously with few microphones           **    *     **    ***
Free from split images                         ***   **          ***   ***
Uses bandwidth efficiently                                 ***
Has modest computational requirements          ***   **    **    *

* The number of asterisks in a cell gives a rough indication of the degree to which a method possesses a particular advantage.


eral classes of applications, which we term panoramic, frontal, and omnidirectional. In every case the basic principle is to make the sampling density proportional to the probability density for the ear locations.

With panoramic applications listeners are equally likely to turn to face any position around a full horizontal circle, but will usually not tilt or roll their heads. For these applications it is appropriate to have the N microphones equally spaced around the equator of a sphere [see Fig. 20(a)]. This is the case that was analyzed in Section 3.

Frontal applications are a restricted form of panoramic applications in which there is a preferred direction of attention, and listeners usually restrict head motions to turning no more than 45° from side to side. This is typically the situation for musical and theater performances. For frontal applications the microphones can be spaced more closely along the sides of the sphere [see Fig. 20(b)]. If the listener turns his or her head beyond the limit of an outermost microphone, one could either continue to interpolate between the more widely separated microphones, or maintain the signal from the outermost microphone. With the latter procedure the sound field is no longer stabilized beyond the maximum angle of head rotation, but there are no sudden spectral artifacts from phase cancellation. Once again, the preferred choice is application dependent.

For true omnidirectional applications, the microphones should be spaced uniformly over the sphere. We use the formula for a hexagonal grid to estimate the number of microphones required to cover a sphere of radius a with quarter-wavelength sampling, obtaining

    N ≈ (128π/3) (a f_max / c)².    (10)

For f_max = 1.5 kHz, a = 87.5 mm, and c = 343 m/s this formula calls for 20 microphones. Because eight-track recordings are technically convenient, and because sampling near the top and bottom may not be necessary, a 16-microphone configuration is attractive for practical omnidirectional applications.

4.2 Customization to Individual Listeners

Basic MTB sound reproduction can be thought of as rendering spatial sound by substituting the HRTF of a sphere for the HRTF of the listener. Even though people can adapt to perceptual distortions, it is well known that people are better at localizing sounds with their own HRTFs than with other people's HRTFs [44], [45]. For many applications absolute localization may not be important, and the dynamic cues may more than compensate for the loss of spectral cues [17]. However, for some applications performance can be improved significantly by customizing the procedure to the individual listener. In this section we present some possible customization techniques.

4.2.1 Head Size

It was observed in the Introduction that if the radius a of the sphere differs significantly from the radius b of the listener's head, the apparent locations of the sound sources are not stable, but shift systematically with head motion. Specifically, if a < b, the perceived motion is in the direction of the listener's motion, whereas if a > b, the motion is retrograde.

In the United States, 98% of the adult population has a head radius within approximately ±15% of the mean [46]. Thus for most listeners who turn their heads through an angle θ, the magnitude of the apparent angular motion of the source is at worst 0.15θ. This is usually a small effect, but it may be important for demanding applications.

For frontal applications in which the sound sources of interest are in front of the listener, the disturbance can be reduced significantly by simply replacing the measured head rotation angle ψ by the scaled value (b/a)ψ, limiting the magnitude of the result to 90°. Equivalently, the listener can be allowed to adjust the scale factor interactively until the perceived stability of the sound image is maximized.

4.2.2 Pinna Compensation

An isolated sphere is only a first approximation to the human head, and sounds captured by an MTB array lack the directional cues provided by the torso and pinnae. In static listening tests increased front/back confusion and excessive elevation are commonly experienced consequences of the lack of pinna cues. Although head motion cues resolve front/back confusion and help to establish elevation, people listening to basic MTB reproduction frequently comment that sound sources appear to be elevated. In addition some listeners comment that a source seems to rise in elevation when they turn to face it.

The effects of the pinna on the HRTF have been studied extensively, but are still not completely understood, the role of the so-called pinna notch being particularly controversial [4], [17], [47]–[51]. It is possible, of course, to affix nonindividualized, "average" pinnae to the surface used for an MTB array, just as is done for dummy-head recordings. This is particularly attractive for frontal applications, where left ears can be used on the left side of the head and right ears on the right side. However, in addition to being visually intrusive, acoustic interference between adjacent pinnae will introduce spectral disturbances if the spacing between microphones is small.

In informal listening tests we have found that both the excessive apparent elevation of sound sources and its dependence on head rotation can be reduced by inserting a filter that introduces a simulated pinna notch. A typical filter has a center frequency from 5 to 8 kHz, a Q of 3, and a 20-dB depth. Of course the elevation cues provided by the pinna depend on both the listener and the source location, and a fixed pinna-notch filter cannot provide the proper correction for all listeners and all source locations. However, one can allow the listener to adjust the filter

Fig. 20. Appropriate sampling patterns. (a) Panoramic application. (b) Frontal application. (c) Omnidirectional application. Arrow in (b) points in direction of listener's preferred orientation.
1152 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November
PAPERS MOTION-TRACKED BINAURAL SOUND

parameters for best results. Furthermore the characteristics of the pinna notch change slowly for sources in the anterior horizontal plane, which makes this form of pinna correction particularly effective for frontal applications. Progress on this form of customization is reported in [52].

4.3 MTB in Virtual Auditory Space

The MTB method for spatial sound reproduction is also applicable to computer-generated spatial sound and provides the same benefits: stabilized sound images, elimination of front/back confusion, and support for an arbitrary number of simultaneous listeners.

A typical system for generating virtual auditory space includes subsystems for modeling the source, modeling the acoustics of the room, and modeling the listener [26], [28]. Usually either individualized or nonindividualized HRTFs are used to model the listener. If head tracking is used, the HRTF can be changed dynamically, but if several listeners are using the system simultaneously, the headphone output must be rendered for each listener separately.

With the use of MTB, these individual two-channel, variable-location HRTFs are replaced by the N-channel, fixed-location HRTFs for the MTB array. This HRTF can be approximated by a simple and very efficient fixed-pole variable-zero-plus-delay model [53]. With this approach the computational requirements for supporting N listeners are not substantially greater than the computational requirements for supporting one listener.

4.4 Alternative Mounting Surfaces

To simplify analysis, up to this point we have assumed that the microphones are mounted on the surface of a sphere. However, other alternatives may be preferable. For example, the microphones could be effectively suspended in space by supporting them by stiff rods, they could be mounted on any surface of revolution about a vertical axis, or they could be mounted on the flat surfaces of a vertical prism. Fig. 21 shows two experimental MTB arrays, one with the microphones mounted on a truncated cylinder and the other with the microphones mounted on a sphere.

Any of the nonspherical surfaces have the advantage of not developing a strong bright spot, and thus behaving more like the HRTF for a human head. Other considerations such as ease of manufacturing, ruggedness, directional sound properties, or aesthetic appeal can also affect the final selection.

Fig. 21. Two experimental MTB microphone arrays.

4.5 Applications

There are many potential applications for MTB sound capture and reproduction. They can be broadly classified into three categories: 1) remote listening, 2) recording, and 3) immersive interactive multimedia. We consider each of these briefly in turn.

Teleconferencing and collaborative work systems are obvious remote listening applications for MTB. These are typically frontal applications, and thus can be customized to individual listeners for optimum performance [52]. MTB can also expand the functionality of omnidirectional surveillance and security systems, and is potentially valuable for the remote operation of equipment (teleoperations). In particular, it is well known that divers have difficulty localizing sound sources because the higher speed of sound under water leads to small ITDs. If the radius of the array can be scaled appropriately, an MTB array could prove useful in underwater activities.

Home theater sound and musical entertainment recordings are frontal applications, and thus both can be customized to individual listeners. Although it would be best to make new recordings using MTB microphone arrays, it is also possible to convert legacy recordings into the MTB format. Because the locations of the sound sources for surround-sound recordings are known exactly, either generalized or individualized pinna cues can be added to control elevation and to enhance static front/back discrimination.

In Section 4.3 we described how a virtual MTB system can be used to generate sound for virtual auditory space. The main advantage of this approach stems from its efficiency at supporting multiple simultaneous listeners. MTB also offers a simple and effective way to enhance video and computer games. Finally, creating augmented reality systems by combining remote listening and virtual auditory space provides a particularly attractive application of MTB technology [54]. The live sounds can be acquired directly by an MTB array, and the virtual sounds can be efficiently rendered in MTB format. In this way any of the remote listening applications described here can be enhanced with computer-generated audio information.

5 CONCLUSION

MTB is a new method for capturing, recording, and reproducing spatial sound. It generalizes binaural recording, preserving the information needed for dynamic head-motion cues. By producing stable sound images, reducing front/back confusion, and supporting an arbitrary number of simultaneous listeners, it promises to enable a wide range of applications in remote listening, recording, and immersive interactive multimedia.

6 ACKNOWLEDGMENT

The authors would like to thank Eric Angel, Robert Dalton, Roger Hom, and Josh Melick for their contribu-
ALGAZI ET AL. PAPERS

tions to the implementation of the MTB system. Support was provided by the National Science Foundation under grants IIS-00-97256 and ITR-00-86075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the view of the National Science Foundation.

7 REFERENCES

[1] J. D. Johnston and Y. H. Lam, “Perceptual Soundfield Reconstruction,” presented at the 109th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 48, p. 1102 (2000 Nov.), preprint 5202.
[2] W. G. Gardner, 3-D Audio Using Loudspeakers (Kluwer Academic, Boston, MA, 1998).
[3] J. Bauck, “A Simple Loudspeaker Array and Associated Crosstalk Canceler for Improved 3D Audio,” J. Audio Eng. Soc., vol. 49, pp. 3–13 (2001 Jan./Feb.).
[4] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, rev. ed. (MIT Press, Cambridge, MA, 1997).
[5] J. C. Middlebrooks and D. M. Green, “Sound Localization by Human Listeners,” Ann. Rev. Psychol., vol. 42, no. 5, pp. 135–159 (1991).
[6] S. Carlile, “The Physical and Psychophysical Basis of Sound Localization,” in Virtual Auditory Space: Generation and Applications, S. Carlile, Ed. (R. G. Landes, Austin, TX, 1996), pp. 27–78.
[7] F. L. Wightman and D. J. Kistler, “Factors Affecting the Relative Salience of Sound Localization Cues,” in Binaural and Spatial Hearing in Real and Virtual Environments, R. H. Gilkey and T. R. Anderson, Eds. (Lawrence Erlbaum Assoc., Mahwah, NJ, 1997), pp. 1–23.
[8] E. A. Macpherson and J. C. Middlebrooks, “Listener Weighting of Cues for Lateral Angle: The Duplex Theory of Sound Localization Revisited,” J. Acoust. Soc. Am., vol. 111, pt. 1, pp. 2219–2236 (2002 May).
[9] S. K. Roffler and R. A. Butler, “Factors that Influence the Localization of Sound in the Vertical Plane,” J. Acoust. Soc. Am., vol. 43, pp. 1255–1259 (1967 Dec.).
[10] V. R. Algazi, C. Avendano, and R. O. Duda, “Elevation Localization and Head-Related Transfer Function Analysis at Low Frequencies,” J. Acoust. Soc. Am., vol. 109, pp. 1110–1122 (2001 Mar.).
[11] M. B. Gardner, “Distance Estimation of 0° or Apparent 0°-Oriented Speech Signals in Anechoic Space,” J. Acoust. Soc. Am., vol. 45, pp. 47–53 (1969 Jan.).
[12] D. S. Brungart, “Auditory Localization of Nearby Sources. III. Stimulus Effects,” J. Acoust. Soc. Am., vol. 106, pp. 3589–3602 (1999 Dec.).
[13] A. W. Bronkhorst and T. Houtgast, “Auditory Distance Perception in Rooms,” Nature, vol. 397, pp. 517–520 (1999 Feb.).
[14] P. T. Young, “The Role of Head Movements in Auditory Localization,” J. Exper. Psychol., vol. 14, pp. 96–124 (1931).
[15] H. Wallach, “On Sound Localization,” J. Acoust. Soc. Am., vol. 10, pp. 270–274 (1939 Apr.).
[16] H. Wallach, “The Role of Head Movements and Vestibular and Visual Cues in Sound Localization,” J. Exper. Psychol., vol. 27, pp. 339–368 (1940 Apr.).
[17] H. G. Fisher and S. J. Freedman, “The Role of the Pinna in Auditory Localization,” J. Audit. Res., vol. 8, pp. 15–26 (1968).
[18] F. L. Wightman and D. J. Kistler, “Resolution of Front–Back Ambiguity in Spatial Hearing by Listener and Source Movement,” J. Acoust. Soc. Am., vol. 105, pp. 2841–2853 (1999 May).
[19] M. F. Davis, “History of Spatial Coding,” J. Audio Eng. Soc. (Features), vol. 51, pp. 554–569 (2003 June).
[20] F. Rumsey, Spatial Audio (Focal Press, Oxford, UK, 2001).
[21] M. M. Boone, “Acoustic Rendering with Wave Field Synthesis,” in Proc. ACM SIGGRAPH and Eurographics Campfire: Acoustic Rendering for Virtual Environments (Snowbird, UT, 2001 May).
[22] M. A. Gerzon, “Ambisonics in Multichannel Broadcasting and Video,” J. Audio Eng. Soc., vol. 33, pp. 859–871 (1985 Nov.).
[23] J. S. Bamford and J. Vanderkooy, “Ambisonic Sound for Us,” presented at the 99th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 43, p. 1095 (1995 Dec.), preprint 4138.
[24] K. de Boer and A. T. van Urk, “Some Particulars of Directional Hearing,” Philips Tech. Rev., vol. 6, pp. 359–364 (1941).
[25] U. Horbach, A. Karamustafaoglu, R. Pellegrini, P. Mackensen, and G. Theile, “Design and Applications of a Data-Based Auralization System for Surround Sound,” presented at the 106th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 47, p. 528 (1999 June), preprint 4976.
[26] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia (AP Professional, Boston, MA, 1994).
[27] B. Shinn-Cunningham and A. Kulkarni, “Recent Developments in Virtual Auditory Space,” in Virtual Auditory Space: Generation and Applications, S. Carlile, Ed. (R. G. Landes, Austin, TX, 1996), pp. 185–243.
[28] L. Savioja, J. Huopaniemi, T. Lokki, and R. Väänänen, “Creating Interactive Virtual Acoustic Environments,” J. Audio Eng. Soc., vol. 47, pp. 675–705 (1999 Sept.).
[29] E. M. Wenzel, F. L. Wightman, D. J. Kistler, and S. H. Foster, “The Convolvotron: Realtime Synthesis of Out-of-Head Localization,” presented at the 2nd Joint Meeting of the Acoustical Societies of America and Japan (Honolulu, HI, 1988 Nov.).
[30] K. Inanaga, Y. Yamada, and H. Koizumi, “Headphone System with Out-of-Head Localization Applying Dynamic HRTF (Head-Related Transfer Function),” presented at the 98th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 43, pp. 401, 402 (1995 May), preprint 4011.
[31] H. Møller, D. Hammershøi, C. B. Jensen, and M. F. Sørensen, “Evaluation of Artificial Heads in Listening Tests,” J. Audio Eng. Soc., vol. 47, pp. 83–100 (1999 Mar.).
[32] P. Minnaar, S. K. Olesen, F. Christensen, and H. Møller, “Localization with Binaural Recordings from Artificial and Human Heads,” J. Audio Eng. Soc., vol. 49, pp. 323–336 (2001 May).
[33] R. O. Duda and W. L. Martens, “Range Dependence of the Response of a Spherical Head Model,” J. Acoust. Soc. Am., vol. 104, pp. 3048–3058 (1998 Nov.).
[34] V. R. Algazi, C. Avendano, and R. O. Duda, “Estimation of a Spherical-Head Model from Anthropometry,” J. Audio Eng. Soc., vol. 49, pp. 472–479 (2001 June).
[35] V. R. Algazi, R. O. Duda, R. Duraiswami, N. Gumerov, and Z. Tang, “Approximating the Head-Related Transfer Function Using Simple Geometric Models of the Head and Torso,” J. Acoust. Soc. Am., vol. 112, pp. 2053–2064 (2002 Nov.).
[36] P. Minnaar, J. Plogsties, S. K. Olesen, F. Christensen, and H. Møller, “The Interaural Time Difference in Binaural Synthesis,” presented at the 108th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 48, p. 359 (2000 Apr.), preprint 5133.
[37] F. L. Wightman and D. J. Kistler, “The Dominant Role of Low-Frequency Interaural Time Differences in Sound Localization,” J. Acoust. Soc. Am., vol. 91, pp. 1648–1661 (1992 Mar.).
[38] R. S. Woodworth and H. Schlosberg, Experimental Psychology (Holt, Rinehart and Winston, New York, 1962), pp. 349–361.
[39] G. F. Kuhn, “Physical Acoustics and Measurements Pertaining to Directional Hearing,” in Directional Hearing, W. A. Yost and G. Gourevitch, Eds. (Springer, New York, 1987), pp. 3–25.
[40] J. Zwislocki and R. S. Feldman, “Just Noticeable Differences in Dichotic Phase,” J. Acoust. Soc. Am., vol. 28, pp. 860–864 (1956 Sept.).
[41] A. W. Mills, “On the Minimum Audible Angle,” J. Acoust. Soc. Am., vol. 30, pp. 237–246 (1958 Apr.).
[42] M. R. Portnoff, “Implementation of the Digital Phase Vocoder Using the Fast Fourier Transform,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, pp. 243–248 (1976 June).
[43] J. F. Alm and J. S. Walker, “Time–Frequency Analysis of Musical Instruments,” SIAM Rev., vol. 44, pp. 457–476 (2002).
[44] E. M. Wenzel, M. Arruda, D. J. Kistler, and F. L. Wightman, “Localization Using Nonindividualized Head-Related Transfer Functions,” J. Acoust. Soc. Am., vol. 94, pp. 111–123 (1993 July).
[45] H. Møller, M. F. Sørensen, C. B. Jensen, and D. Hammershøi, “Binaural Technique: Do We Need Individual Recordings?,” J. Audio Eng. Soc., vol. 44, pp. 451–469 (1996 June).
[46] H. Dreyfuss Assoc., The Measure of Man and Woman (Whitney Library of Design, New York, 1993).
[47] E. A. G. Shaw, “Acoustical Features of the Human External Ear,” in Binaural and Spatial Hearing in Real and Virtual Environments, R. H. Gilkey and T. R. Anderson, Eds. (Lawrence Erlbaum Assoc., Mahwah, NJ, 1997), pp. 25–47.
[48] R. A. Butler and K. Belendiuk, “Spectral Cues Utilized in the Localization of Sound in the Median Sagittal Plane,” J. Acoust. Soc. Am., vol. 61, pp. 1264–1269 (1977 May).
[49] H. L. Han, “Measuring a Dummy Head in Search of Pinna Cues,” J. Audio Eng. Soc., vol. 42, pp. 15–37 (1994 Jan./Feb.).
[50] E. A. López-Poveda and R. Meddis, “A Physical Model of Sound Diffraction and Reflections in the Human Concha,” J. Acoust. Soc. Am., vol. 100, pp. 3248–3259 (1996).
[51] Y. Kahana and P. A. Nelson, “Spatial Acoustic Mode Shapes of the Human Pinna,” presented at the 109th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 48, pp. 1102, 1103 (2000 Nov.), preprint 5218.
[52] J. Melick, V. R. Algazi, R. Duda, and D. Thompson, “Customization for Personalized Rendering of Motion-Tracked Binaural Sound,” presented at the 117th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 52 (2004 Dec.), convention paper 6225.
[53] R. Algazi, R. O. Duda, and D. M. Thompson, “The Use of Head-and-Torso Models for Improved Spatial Sound Synthesis,” presented at the 113th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 50, pp. 976, 977 (2002 Nov.), convention paper 5712.
[54] A. Härmä, J. Jakka, M. Tikander, M. Karjalainen, T. Lokki, H. Nironen, and S. Vesa, “Techniques and Applications of Wearable Augmented Reality Audio,” presented at the 114th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 51, p. 419 (2003 May), convention paper 5768.

THE AUTHORS

V. R. Algazi R. O. Duda D. M. Thompson


V. Ralph Algazi received a degree of Ingénieur Radio from l’Ecole Supérieure d’Electricité (ESE), Paris, France, and M.S. and Ph.D. degrees from the Massachusetts Institute of Technology, Cambridge, in 1952, 1955, and 1963, respectively.
He was at MIT from 1959 to 1965 as a research and teaching assistant and then as a postdoctoral fellow and assistant professor. On the faculty of the University of California, Davis, since 1965, he was chairman of the Department of Electrical and Computer Engineering from 1975 to 1986. He founded CIPIC, the Center for Image Processing and Integrated Computing, in 1989 and served as its director until 1994. He is now a research professor at CIPIC, pursuing research interests in signal processing, engineering applications of human perception for both speech and images, and image and video processing and coding.
Dr. Algazi is a life senior member of the IEEE and is a member of the AES, SPIE, and AAAS.

Richard O. Duda was born in Evanston, IL, in 1936. He received B.S. and M.S. degrees in engineering from UCLA, Los Angeles, CA, in 1958 and 1959, respectively, and a Ph.D. degree in electrical engineering from MIT, Cambridge, MA, in 1962. He was in the Artificial Intelligence Center at SRI International from 1962 to 1980, serving as a visiting professor at the University of Texas at Austin during the 1973/74 academic year. From 1980 to 1983 he was at the Laboratory for AI Research at Fairchild Semiconductor, after which he joined Syntelligence. In 1988 he became emeritus professor of Electrical Engineering at San Jose State University, and currently is a visiting professor in the Department of Electrical and Computer Engineering at the University of California at Davis. His research interests include pattern recognition, image analysis, expert systems, auditory scene analysis, and the localization and synthesis of spatial sound.
Dr. Duda is the coauthor with Peter Hart and David Stork of Pattern Classification, 2nd Ed. (Wiley-Interscience, 2001). He is a member of the Audio Engineering Society and the Acoustical Society of America, and is a fellow of the IEEE and the American Association for Artificial Intelligence.

Dennis M. Thompson was born in Bradenton, FL, in 1958. He studied electronic technology at the College of the Redwoods, Eureka, CA. He is currently working toward a degree in electrical engineering at the University of California at Davis.
In 1958 he started Yknot Sound, a regional sound company that specializes in live music PA systems. Currently he is working at the CIPIC Interface Lab, where he designs hardware and software. His main research interest is high-quality 3-D sound reproduction. He still enjoys working in concert hall reinforcement, with an emphasis on quality over quantity.
He is a student member of the Audio Engineering Society.


Importance and Representation of Phase in the Sinusoidal Model*

TUE HASTE ANDERSEN AND KRISTOFFER JENSEN, AES Member

Department of Computer Science, University of Copenhagen, Copenhagen, Denmark

Work is presented on the representation and perceptual importance of phase. Based on a standard sinusoidal analysis/synthesis system, the phase alignment of the sound components is analyzed. A novel phase representation, partial-period phase, is introduced, which characterizes phase evolution over time with an almost stationary parameter for many musical sounds. The proposed partial-period phase representation is used to control the phase when synthesizing sounds. Sounds synthesized with varying amounts of phase information are compared in a listening experiment with 11 subjects. It is shown that phase is of great importance to the perception of the sound quality of common harmonic musical sounds, but indications are found that phase is not of importance to the slightly inharmonic piano sounds. In particular, the sound degradation is large for low-pitched sounds, approaching “slightly annoying” when no phase information is used. In addition, a model based on the partial-period phase representation has a significantly better perceived sound quality than sounds with random phase shifts.
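As a concrete anchor for the model discussed throughout the paper, the additive framework, in which a sound is a sum of partials with individual amplitudes, frequencies, and starting phases, can be reduced to a few lines. This is a minimal sketch of the general idea, not the authors’ analysis/synthesis system; the function name and the fixed per-block parameters are our simplification (the model in the paper lets amplitude and frequency vary in time).

```python
import math

def additive_block(partials, n, sr=44100):
    """Synthesize n samples as a sum of sinusoids.
    partials: list of (amplitude, frequency_hz, start_phase_rad)."""
    return [
        sum(a * math.sin(2.0 * math.pi * f * k / sr + p)
            for (a, f, p) in partials)
        for k in range(n)
    ]

# Two harmonic partials of a 440-Hz tone, all starting in sine phase (0 rad);
# changing only the start phases alters the waveform, not the magnitude spectrum.
sine_phase = additive_block([(1.0, 440.0, 0.0), (0.5, 880.0, 0.0)], 512)
cosine_phase = additive_block([(1.0, 440.0, math.pi / 2), (0.5, 880.0, math.pi / 2)], 512)
```

The listening experiment described in the paper compares exactly such resyntheses, generated with and without the measured phase information, against the original recordings.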

0 INTRODUCTION

Synthetic sounds are more and more common in contemporary music. Sampling and algorithmic synthesis techniques are used not only for artistic reasons, but also to improve sound quality and reduce production costs. Sampling provides good sound quality, but is limited to the timbre of the recorded sound, with few parameters to control the timbre. One solution to this problem is to use parametric analysis/synthesis methods. These methods combine models with good sound quality and a large parameter space, allowing near real-time sound transformations for changing the perceptually important characteristics [1]. The additive (sinusoidal) sound model [2] is the most widely used parametric analysis/synthesis method in sound modeling research. The additive parameters form a convenient representation for the deterministic part of the sound, allowing for easy transformations of perceptually important characteristics such as duration, pitch, and timbre.

Literature studies using harmonic tones show that the perceived sound quality is influenced by synthesizing tones with the same harmonic content, but with different phase shifts. However, it is not clear to what extent these effects are important to the perception of naturally occurring sounds.

We propose a novel phase representation, partial-period phase, to describe phase evolution over time. The partial-period phase representation is a convenient way to represent phase in the additive model. The representation solves several problems with the comparison of consecutive phase values in time, resulting in near constant partial phase trajectories.

Using the additive analysis/synthesis framework, we conducted a listening experiment where resynthesized sounds were compared with original recorded sounds. The sounds were resynthesized with phase information, without phase information, and with two different models of phase. The perceived degradation was judged by 11 subjects and compared using analysis of variance.

In Section 1 previous work related to the perception of phase is reviewed. Section 2 gives an overview of the sinusoidal analysis technique, and Section 3 describes the improved phase representation. Section 4 presents the listening experiments, and conclusions are given in Section 5.

*Manuscript received 2002 December 9; revised 2003 December 8; 2004 July 27; and 2004 August 24.

1 PREVIOUS WORK

In this section the literature showing the relevance of phase in musical sound modeling is briefly reviewed. From the experiments reported here it is clear that phase is of importance to the perception of complex tones, and especially to the perception of timbre. The question remains as to what extent phase is of importance to the quality of naturally occurring sounds, and if so, how
ANDERSEN AND JENSEN PAPERS

phase can be modeled in sound transformations and pure synthesis.

1.1 Does Phase Affect Timbre?

One of the first experiments concerning the perception of timbre of complex tones in relation to phase was conducted by von Helmholtz [3]. Using a special technique he was able to generate complex tones consisting of eight sinusoids (partials) with variable phase and fundamental frequencies of 120 and 240 Hz. Helmholtz concluded that “the changes in timbre are not distinct enough to be observed after a few seconds required to alter the phases; anyhow these changes are too small to transform one vowel in another,” and “harmonics beyond the sixth to eighth give dissonances and beats, so it is not excluded that, for these higher harmonics, a phase effect does exist.” These conclusions have often been interpreted as indicating that phase has no influence on timbre [4], even though later experiments showed otherwise [5], [6].

Plomp and Steeneken [4] conducted a number of experiments involving complex tones with ten harmonic partials and equal spectral envelope, but different phase shifts. The most important finding was that the maximum effect of phase on timbre perception occurs when a tone containing harmonic partials that all start at sine phase (0°) is compared to one where the partials alternate between sine phase and cosine phase (90°). The effect of reducing the level of each successive partial by 2 dB was greater than the maximum phase effect described earlier. Also, the effect of phase on timbre appeared to be independent of the sound level.

Patterson [7] presented psychoacoustic experiments involving alternating-phase (APH) waves, that is, harmonic partials in which even partials start in cosine phase while odd ones start in cosine phase + D°. It was found that the value of D leading to a just noticeable difference (JND) between a sound with partials in cosine phase and an APH sound was lower for sounds with high bandwidth, low repetition rate, and high signal level. The signal duration was found to have no, or very little, effect on the JND. Progressively improved models, using summary autocorrelation [8], [9], auditory imaging [10], and models including the behavior of early cortical stages, using the summary measure of spectrograms [11], have provided explanations for the observed effect of phase.

Alcántara et al. [12] studied the influence of phase on the identification of vowel-like sounds. The “vowels” were created by increasing the level of three pairs of successive harmonic partials. They found better identification when the components had cosine starting phase than when they had random phase, and poorer performance for weaker stimuli. Pressnitzer and McAdams [13] studied the influence of phase on roughness perception and found that roughness is linked to shapes of the waveforms at the output of the simulated auditory filter. Roberts et al. [14] showed that phase shifts could influence stream segregation for rapid sound sequences. Gockel et al. [15] studied the influence of phase on loudness and forward masking produced by harmonic complex tones. They found that a tone with components added in cosine phase was louder but was a less effective forward masker than a tone with components added in random phase.

Whereas phase changes are detectable in controlled situations, as even a polarity change was found to be audible in two-component signals [16], the discrimination of phase changes in individual components often requires specific phase alignment, such as cosine phase [17].

1.2 Importance of Phase in Transients

Patterson and Green [18] used Huffman sequences, in which the phases can be varied independently of the energy spectrum, to assess the discrimination of phase changes in transients. They found that phase changes could be discriminated reliably, for some stimulus waveforms, for durations above 5 ms.

Wakefield et al. [19] conducted a study of the perception of transients using filtered noise, where a two-interval forced-choice adaptive psychophysical procedure was used to find the JND between a given sound and a copy of the sound where the magnitude spectrum was smoothed and the phase spectrum held constant. The surprising result was that the JND depended strongly on the phase pattern used. It was concluded that “the effect for short duration signals is greater than what the (sparse) literature on the auditory perception of transients would suggest.”

The perception of clicks and chirps was further investigated by Uppenkamp et al. [20]. The up-chirps used by Uppenkamp et al. are signals constructed to contain the same frequencies as clicks, but where the phase is manipulated to compensate for the spatial dispersion along the cochlea. Up-chirps should therefore reach maximum amplitude at the same moment in time at all places of the basilar membrane. They compared the perceived “compactness” of clicks to that of chirps and found that clicks were perceptually more compact than up-chirps, but that down-chirps, that is, up-chirps reversed in time, sounded more compact than up-chirps. Even though up-chirps are aligned in time at the basilar membrane output, they have a longer within-channel impulse response than down-chirps and clicks. This suggests that “the perceived ‘compactness’ of a sound is apparently more determined by the fine structure of excitation within each peripheral channel than by between-channel phase differences.”

1.3 Phase Models

Schroeder [21] reported a number of effects related to sounds with up to 31 harmonic partials. Most interesting is the reported strong dependence of timbre on the peak factor. The peak factor can be minimized via an analytical approximation equation [22]. The synchronization index model (SIM) of Leman [23] employs a functional model of the auditory periphery and a method of predicting the roughness of a sound. This model was used by Tind and Jensen [24] to devise a propagation formula of the phase shifts that control the roughness output of the SIM. By basing the propagation formula on the roughness prediction for three partials, they obtained a correspondence between the roughness control parameter and the predicted roughness for complex harmonic sounds. They concluded that there exists a (nonunique) phase shift
PAPERS PHASE IN THE SINUSOIDAL MODEL

for a given perceptual roughness of complex harmonic sounds.

2 SINUSOIDAL ANALYSIS/SYNTHESIS

This study is based on the analysis-by-synthesis methodology [25], using additive (sinusoidal) analysis/synthesis techniques. In the additive framework, sounds are modeled as a sum of sinusoids with time-varying amplitudes, frequencies, and sometimes also phase shifts.

The short-time Fourier transform (STFT) [26] is a related technique that can be used for analysis/synthesis and transformations of sounds [27]. In the STFT, overlapping blocks of the windowed sound are Fourier transformed, modified, and inverse Fourier transformed. However, for harmonic sounds or sounds with strong partials, the frequency components between the strong partials are masked. Because a large number of the frequency components in the STFT are masked, the number of parameters used to model the sound can be greatly reduced. This is the assumption in the additive model that is used in this work. The additive model is chosen for two reasons. First, it is well suited for further high-level modeling of musical sounds. Second, the additive model is being used in many research and development prototypes today [28]–[31], and thus it provides a stable framework for exploration of the perception of natural sounds. The additive model has, however, several shortcomings. Transients are often smeared in block-based analysis/synthesis, and noise is not well represented.

Several methods exist for determining the time-varying amplitudes and frequencies of the harmonic partials. Already in the last century, musical instrument tones were divided into their Fourier series [3]. Early techniques for the time-varying analysis of the additive parameters are presented by Matthews et al. [32] and Freedman [33]. Today the most common technique for the additive analysis of musical signals is based on STFT analysis [2]. In order to retain the noise components, several noise models of musical sounds have been presented, including the residual noise model in the fast Fourier transform (FFT) [28], [2], the bandwidth-enhanced additive synthesis [34], [1], and the narrow-band basis functions (NBBF) in speech models [35]. In order to improve the frequency, and in particular the time resolution, that is, to better retain the transient behavior of percussive musical instruments, time–frequency-based methods [1], [36] could be used, and the time and frequency reassignment method [37] has recently gained popularity [38], [34]. Ding and Qian [39] have presented an interesting method for improving the time resolution, fitting a waveform by minimizing the energy of the residual. This was improved and dubbed adaptive analysis by Röbel [40].

We used a software package developed by the authors [1], [41], previously used in explorations of the timbre of musical instruments. It is based on the classic peak-picking method, where overlapping blocks are windowed, and the amplitudes, frequencies, and phases are found from interpolated peaks of the magnitude of the FFT. This method has been shown to work well in forming stable sinusoidal tracks over time from analyzed recordings of instrument sounds. The synthesis quality of a comparable analysis method, when phase information is not used, has previously been measured to be equal to, or better than, "perceptible but not annoying" when compared to the original recorded monophonic sounds [1].

2.1 Analysis

For each sound under analysis, the fundamental frequency ω0 is estimated using autocorrelation [42]. This method, which is applicable only to monophonic quasi-harmonic sounds, is used to determine the fixed block size used in the analysis of the given sound. For each block of sound k under analysis, a new local measure ωk0 of the fundamental frequency is calculated, again by use of autocorrelation. From this measure an FFT is performed and a search for peaks is done near the regions of the quasi-harmonic frequencies. The amplitude Aki and frequency ωki are stored for each partial i and time frame k.

In order to retain more of the additive noise components, a method inspired by the NBBF [35] has been employed here, in which sinusoids are estimated in between the harmonic partials if the fundamental frequency is above 400 Hz. This method essentially retains noises such as hammer noise or the additive noise in wind instruments. Peaks from adjacent blocks are connected to form sinusoidal tracks. The system has been extended to output not only the amplitude and frequency of the tracks, but also the phase θki for each block. To model the phase over time, high precision of the estimated phase values is required. To achieve this it was found necessary to extend the length of each analysis block from 2.8 fundamental periods to 4 periods. By doing this, the time resolution is affected, and thus the sound quality of transient sounds is degraded.

2.2 Synthesis

The sound is synthesized using the analysis parameters in the following way:

    s(n) = Σi=0…N Ai(n) cos[θi(n)]    (1)

for N partials, where θi(n) denotes the time-varying phase for partial i and sample index n. In practice the values of Ai(n) used in the synthesis are obtained by linear interpolation of the measured amplitude values between the block boundaries. Two methods for finding the phase θi are used:

Sa—Synthesis without measured phase information
Sb—Synthesis with measured phase information.

When synthesizing sound without the measured phase information θki, the phases of the sinusoidal tracks are found by the cumulative sum of the interpolated frequency values over time,

    θi(n) = Σm=0…n ωi(m).    (2)
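As a rough sketch, the Sa synthesis path of Eqs. (1) and (2) can be written as follows. The function name, the (blocks × partials) array layout, and the conversion of frequencies in Hz to radians per sample are our own assumptions, not the authors' software:

```python
import numpy as np

def synthesize_sa(amps, freqs, hop, sr):
    """Additive synthesis without measured phases (method Sa).

    amps, freqs: (num_blocks, num_partials) arrays holding the per-block
                 amplitude and frequency (Hz) of each sinusoidal track
                 (hypothetical layout).
    hop: analysis step size in samples; sr: sample rate in Hz.
    """
    num_blocks, num_partials = amps.shape
    n = (num_blocks - 1) * hop
    t_blocks = np.arange(num_blocks) * hop   # block centers, in samples
    t = np.arange(n)
    s = np.zeros(n)
    for i in range(num_partials):
        # Linear interpolation of amplitude and frequency between block boundaries
        a = np.interp(t, t_blocks, amps[:, i])
        f = np.interp(t, t_blocks, freqs[:, i])
        # Eq. (2): phase as the cumulative sum of the interpolated frequency,
        # converted from Hz to radians per sample
        theta = np.cumsum(2.0 * np.pi * f / sr)
        # Eq. (1): sum of cosines
        s += a * np.cos(theta)
    return s
```

Because each phase track is a running sum started from an arbitrary value, the measured phase relations between partials are discarded, which is exactly what distinguishes Sa from Sb.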

ANDERSEN AND JENSEN PAPERS

When synthesizing the sound using phase information, the phase trajectory is interpolated in such a way that boundary conditions are satisfied. This can be done by cubic interpolation [2] of the phase using the measured frequency and phase values. In this way the measured phase and frequency values are preserved at the block boundaries, but oscillating frequency tracks can occur between the block boundaries [39]. A solution to this problem is to use quadratic interpolation, where the phase and frequency values cannot be preserved at the boundaries. Instead a weighting factor is used to determine the importance of the estimated phase relative to the frequency [39]. However, no degradation caused by the oscillations in frequency has been found in this work; therefore the cubic interpolation method is used.

3 PHASE REPRESENTATION

The goal of additive phase modeling is to improve the sound quality in pure synthesis models such as the Timbre Engine, based on the timbre model [31], to improve the sound quality in time–frequency scaling of signals, and finally to gain a better understanding of the perception of musical signals. We chose to investigate the phase as a function of time, and thus a convenient representation of the phase trajectories over time is needed.

3.1 Visualization

The phase is rarely considered or even visualized in the sound modeling literature. A way to visualize the phase is to use a spectrogram, but plotting phase instead of energy of the discrete Fourier transform. Fig. 1 demonstrates this for two types of musical sounds. Fig. 1(a) shows a stationary part of a soprano voice, where the phase progresses in a coherent way through time and frequency. The attack of a guitar note is shown in Fig. 1(b). Here the phase evolution over time is less coherent. The goal of the phase representation presented in this section is to describe stable sounds, such as the sustained part of most sounds from musical instruments, using a few parameters.

3.2 Phase Delay

The phase values θ(ω) obtained from the discrete Fourier transform, and thus also the values used in additive analysis, are specified as the phase shift in radians for each sinusoidal component. Another way to represent phase is as phase delay [43],

    P(ω) = −θ(ω)/ω    (3)

where θ(ω) is the phase at frequency ω, and P(ω) expresses the time delay in seconds relative to the center of the frame. The magnitude, phase, and phase delay as a function of frequency of a stable part of a saxophone sound are shown in Fig. 2. Phase delay is not a common way to represent phase in the sinusoidal model. However, it is shown here as it is used as the basis of the relative phase delay described in the following section.

3.3 Relative Phase Delay

Waveform preservation when performing time or pitch scaling of harmonic sounds can be achieved without the use of a specific phase representation, by using analysis step sizes exactly equal to the fundamental period length.

Fig. 1. Phase as a function of time and frequency. Brightness represents phase (radians) between −π and π. (a) Sustained part of a soprano voice. (b) Attack of a guitar. Fundamental frequency of both sounds is approximately 500 Hz.
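A minimal sketch of Eq. (3); the function name is ours, and ω is taken in radians per second so that the result is in seconds. A sinusoid delayed by t0 seconds has phase θ = −ω·t0, so the function recovers t0:

```python
import numpy as np

def phase_delay(theta, omega):
    """Eq. (3): phase delay P(w) = -theta(w)/w, in seconds, for a phase
    theta (radians) measured at angular frequency omega (rad/s)."""
    return -np.asarray(theta) / np.asarray(omega)

# A partial delayed by 5 ms relative to the frame center:
omega = 2.0 * np.pi * 100.0   # 100 Hz partial
theta = -omega * 0.005        # phase of the delayed sinusoid
```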


However, in many cases it is not convenient to have forced nonconstant step sizes during analysis, and thus another phase representation is needed. To overcome this problem, while still being able to preserve the shape of the waveform when doing time or pitch scaling, Di Federico proposed a representation, relative phase delay (RPD) [44], based directly on additive parameters, representing the phase trajectories as phase delays relative to the phase delay of the first partial. When performing time scaling, the amplitudes and frequencies of the partials are left untouched, but the phase values of the fundamental are updated using a propagation formula. After the new phase values of the fundamental are found, the phase values of the other partials are changed, based on their position relative to a fixed point in the fundamental period.

RPD is based on the definition of phase delay from Eq. (3), and it is defined as

    τi,k = θi,k/ωi,k    (4)

where i is the index of the partial and k is the analysis frame index. τ expresses the distance in time between the analysis frame center and a specific point in the partial period.

The relative phase delay is defined for the partials as the difference between the phase delay of the fundamental and that of the partial i,

    Δτi,k = τi,k − τ1,k.    (5)

Since the relative phase delay Δτi,k for i = 2, . . . , N is defined relative to a fixed point in the fundamental period, the overall waveform characteristics are preserved, and thus the phase delay of the fundamental can be chosen at random. Having modified the phase delay of the fundamental, the phase delays of the other partials are converted back into phase values,

    θi,k = mod[(θ1,k/ω1,k + Δτi,k) ωi,k, 2π],  i = 2, . . . , N    (6)

where θ1,k is the phase of the fundamental.

Relative phase delay is a representation that works well for harmonic sounds in that the waveform characteristics are preserved. However, to actually use this representation it is necessary to take phase wrapping into account. Another problem with the relative phase delay is that the phase delay calculated from the wrapped phase values approaches zero as the frequency increases. This makes it difficult to compare relative phase delays and plot them visually. Finally, if the sound is slightly inharmonic, a drift in the relative phase delays will occur for the partials. To show this, imagine a nearly harmonic signal with two sinusoids of start phase 0, one with a frequency of 110 Hz and one with a frequency of 225 Hz. The sound is analyzed using a step size of 100 ms. In the first analysis frame the relative phase delay between the first and second partials is 0. At the next frame, t = 0.1 s, the phase of each partial is

    θ0,1 = mod(0.1 s · 2π · 110 Hz, 2π) = 0
    θ1,1 = mod(0.1 s · 2π · 225 Hz, 2π) = π.    (7)

The phase delay of the first partial is τ0,1 = 0/110 Hz = 0 s. The phase delay of the second is evaluated at an integer multiple of the "fundamental frequency," 2 · 110 Hz = 220 Hz, τ1,1 = π/220 Hz ≈ 0.014 s, and thus a drift in the relative phase delay of the second partial has occurred. If the second partial had been at frequency 220 Hz, no drift would have occurred, and the relative phase delay representation would have given a usable result. This drift can be demonstrated on synthetic and recorded signals [41].

Fig. 2. One analysis frame of sustained part of a saxophone sound. (a) Magnitude. (b) Phase. (c) Phase delay. +—spectral peaks in magnitude plot.

3.4 Fundamental-Period Phase Representation

To overcome some of the problems of the relative phase delay, an improved phase representation is proposed. One of the goals that was achieved with the RPD was that phase values between frames could be compared. This is usually possible only in a frame-based analysis when the frame size is exactly an integer multiple of the fundamental period. In this case the phase value is measured at the same point in the waveform period for successive frames, and is thus comparable between frames. In the RPD this problem was overcome by using phase delays. Another way to make the phase values comparable between frames is used in the fundamental-period phase representation. In this representation the measured phase values of the fundamental, θk,0, for a given block k, are corrected by a linear change in phase, corresponding to a time difference Δtk,0 at the measured frequency ωk,0. Δtk,0 is defined as the time difference between a fixed point in the fundamental period, that is, a point in the period which is the same between frames, and the point where θk,0 is measured. The point where θk,0 is measured is dependent on the step size Ra. More formally, Δtk,0 can be found by the following formula, where Lk,0 represents the length of the fundamental period in the kth frame:

    Δtk,0 = [1 − mod((Ra − Δtk−1,0)/Lk−1,0, 1)] Lk,0.    (8)

The modulus function is used to ensure that the phase correction stays in the interval between −π and π. Note that Lk,0 can be found by knowing the frequency of the fundamental ωk,0,

    Lk,0 = 2π/ωk,0.    (9)

The corrected phase for the fundamental φk,0 can now be found,

    φk,0 = θk,0 + Δtk,0 ωk,0.    (10)

The phase values of the other partials are corrected using the same time difference Δtk,0 that was used in correcting the fundamental,

    φk,i = θk,i + Δtk,0 ωk,i.    (11)

This representation is called the fundamental-period phase representation and is equivalent to the RPD, apart from the fact that the RPD uses phase delays measured in time to represent the phase differences, whereas radians are used in the fundamental-period phase representation. This means that now we have a representation preserving the waveform characteristics and allowing for a comparison of phase values between frames, as in the RPD representation. Furthermore, the new way of representing the phase solves the phase unwrapping problem of the RPD.

3.5 Partial-Period Phase Representation

To correct for drifting phase values in inharmonic sounds, an improvement to the fundamental-period phase representation is proposed, the partial-period phase (PPP), in which the phase is expressed relative to the same point between frames of the partial period instead of relative to a point in the fundamental period. The method presented here bears some similarity to the phase propagation employed in STFT-based phase vocoders [45] when time or pitch scaling a signal.

Eqs. (10) and (8) then give

    φk,i = θk,i + Δtk,i ωk,i    (12)

where k is the frame number, i is the index of the partial, and Δtk,i is the time difference between the point at which the phase value is measured and the corrected value (see Fig. 3),

    Δtk,i = [1 − mod((Ra − Δtk−1,i)/Lk−1,i, 1)] Lk,i.    (13)

Fig. 4 shows an example of the difference between the fundamental-period phase representation and the PPP representation. A segment of a piano sound is analyzed, and the corrected phase values for the first five partials are shown. The piano sound is known to have stretched harmonic frequencies [46], and thus effectively demonstrates the problem with RPD and the fundamental-period phase representation. The partial-period phase representation [Fig. 4(b)] is clearly superior to the fundamental-period phase representation [Fig. 4(a)], removing the phase drift caused by nonharmonic partials.

All phase representations presented here preserve the phase information, and thus no degradation in sound quality results from the use of these representations. By substituting Ra in Eq. (13) with the synthesis frame length Rs we obtain the phase values used in the cubic interpolation when synthesizing,

    θk,i = φk,i − Δtk,i ωk,i.    (14)

Using Eq. (10) all partials in a given analysis frame are analyzed relative to the same time in the fundamental period. In the partial-period representation of Eq. (12) each partial phase value is evaluated relative to the same point in the last analysis frame of the particular partial. In a transient or low-energy part of the sound, the estimation of the partial frequencies is likely to fail, resulting in new absolute phase values, and thus the occurrence of transients has to be taken into consideration when using the partial-period phase representation. In practice the phase representation and modeling should be coupled with a transient detector to signify in which portions of the sound the phase values are comparable.

4 EXPERIMENT: THE IMPORTANCE OF PHASE

The purpose of this experiment is to determine how important phase is with regard to sound quality when synthesizing monophonic singing voice or other musical instruments. Recorded instrument sounds are analyzed using the method described in Section 2, followed by modification of the phase trajectories using the partial-period phase representation. Finally the sounds are resynthesized from the modified analysis parameters and compared in a listening experiment.

4.1 Sound Reproduction Methods

Five conditions were used in a repeated-measures full factorial experiment. In each condition the original sound was compared to one of the following sounds:
1) Original sound (ORG)
2) Synthesized sound, with full phase information, maintaining absolute and relative phase (ARP)
3) Synthesized sound, with phase information, maintaining relative phase (RP)
4) Synthesized sound, constant partial-period phase, approximating absolute phase values in the stationary part (AP)
5) Synthesized sound, no phase information (NP).

For the synthesized sounds method Sb described in Section 2.2 was used, except for the sound with no phase information, where method Sa was used.

In ARP all phase information is preserved, and thus no modification to the phase information is made between analysis and synthesis.

To change the partial phase trajectories in RP and AP, we use the partial-period phase representation. In synthesizing RP the relative phase shift between each partial is preserved, but the absolute phase value of each partial is discarded. This is accomplished by randomizing the start phase of each phase trajectory in the partial-period phase representation.

Fig. 3. Schematic drawing of the partial-period phase value for one partial. |—block boundaries; θ—measured phase value for each block; φ—PPP value found by knowing the frequency of the partial and the step size Ra. As seen, φk and φk−1 are at the same position in the partial period, even though the phase values θk and θk−1 are measured at different positions in the partial period.

Fig. 4. First five partials of a piano sound. (a) Fundamental-period phase. (b) Partial-period phase. Noise at the start is due to the transient sound of the piano attack, for which no stable frequency information can be estimated. Partial-period phase is clearly superior to fundamental-period phase in that the phase trajectories are nearly constant over time for the sustained part of the sound.
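The recursion of Eqs. (12) and (13) can be sketched per partial as follows. We assume Δt is kept in samples, the first frame serves as the fixed reference (Δt = 0), and frequencies are given in Hz; none of these conventions are stated explicitly in the text:

```python
import numpy as np

def partial_period_phase(theta, freq, ra, sr):
    """Partial-period phase (PPP) correction for one partial track.

    theta: measured phase (radians) in each analysis frame
    freq:  measured frequency (Hz) in each analysis frame
    ra:    analysis step size in samples; sr: sample rate in Hz

    Sketch of Eqs. (12)-(13): each frame's phase is moved to the same
    point in the partial period as in the previous frame, making the
    corrected values comparable between frames.
    """
    theta = np.asarray(theta, dtype=float)
    freq = np.asarray(freq, dtype=float)
    period = sr / freq                    # partial period L, in samples
    phi = np.empty_like(theta)
    phi[0] = theta[0]
    dt = 0.0                              # reference offset of frame 0
    for k in range(1, len(theta)):
        # Eq. (13): offset to the fixed point in the partial period
        dt = (1.0 - np.mod((ra - dt) / period[k - 1], 1.0)) * period[k]
        # Eq. (12): linear phase correction at the measured frequency
        phi[k] = theta[k] + dt * 2.0 * np.pi * freq[k] / sr
    return np.mod(phi, 2.0 * np.pi)
```

For a perfectly stationary partial of any frequency, harmonic or not, the corrected trajectory φ is constant from frame to frame, which is the behavior Fig. 4(b) shows for the sustained part of the piano sound.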


In AP the partial-period phase trajectory is approximated with a constant κi. The start phase is chosen so that the phase is close to the phase of the original sound in the stationary part of the sound. This method poses a problem as to how the attack should be synthesized. The attack can be synthesized with random start phases as in NP, but then the phases have to be interpolated at the start frame of the stationary sound ks to the phase value κi. Another alternative is to synthesize the attack with full phase information as in ARP, to minimize the phase difference between the attack and the stationary part at frame ks. This last method is chosen to avoid artifacts introduced by interpolation between frames ks − 1 and ks. The selection of ks was made by manual segmentation. κi was approximated by the following function:

    κi = ( Σk=ks…ke φi,k Ai,k ) / ( Σk=ks…ke Ai,k )    (15)

where φi,k is the partial-period phase for partial i at frame k, ks is the first frame of the stationary part of the sound, and ke is the ending frame. The function weights the phase values in the high-energy portion of the sound more highly than the phase values in low-energy portions of the sound. As the AP synthesis method retains the waveform of the high-energy harmonic partials, the resulting AP sounds have a waveform that resembles the original waveform.

Sound is synthesized with no phase information (NP) using synthesis method Sa described in Section 2.2. The start phase of each trajectory is randomized.

4.2 Sounds

The sounds used in the experiment were bass clarinet, bass trombone, cello, piano, and singing voice. The instruments were chosen to represent a broad spectrum of acoustic instruments and because of their extended pitch range. The singing voice is probably the best known musical instrument; the piano is chosen for its percussive, slightly inharmonic sound; the cello has a particular sound because of the jitter introduced by the bow–string interaction. The bass clarinet and bass trombone are examples of reed- and lip-driven wind instruments. These instruments are believed to be representative of today's musical instruments, and in particular, of the different aspects of timbre in common instruments that could influence the perception of phase. For all instruments, recordings of five fundamental frequencies were used, as shown in Table 1. The length is approximately 2 seconds for most sounds, and they were played forte for the most part.

Table 1. Reference sounds used in the listening experiment.

                      Frequency (Hz)
Instrument       f1     f2     f3     f4     f5
Bass clarinet    59.2   109.7  176.0  295.0  469.2
Bass trombone    58.8   117.0  237.6  263.2  469.7
Cello            65.6   98.3   131.2  196.1  525.4
Piano            49.4   97.9   132.3  263.2  526.6
Singing voice    82.5   109.5  216.7  273.7  391.0

4.3 Procedure

We used the double-blind triple-stimulus with hidden reference method [47] for assessing perceptual differences between original and synthesized sounds. This method was used in previous studies to measure the sound quality of timbre models based on analyzed sounds without phase information [1]. In these experiments it was found that the sound quality was dependent on the fundamental frequency of the synthesized sounds. In general, resynthesized sounds with a high fundamental frequency were rated to have less degradation than resynthesized sounds with a low fundamental frequency.

For each sound the subject first heard a reference sound, in this case the original recorded sound, followed by two sounds in random order, where one was the reference and the other was the experimental sound, a sound from one of the five conditions described in the preceding. The subject was then asked to rate the degradation of the two sounds relative to the reference. The degradation of the sounds was rated on a scale from 1 to 5, where 1 corresponds to "very annoying" and 5 to "imperceptible." One decimal could be used in the rating, and one of the sounds had to be given the score 5. The full scale is shown in Table 2. The subjects were allowed to listen to the sounds as many times as found necessary.

Statistical analysis of the results was done using repeated-measures analysis of variance, with synthesis method, instrument type, and fundamental frequency as independent variables. The individual levels were compared using a post-hoc least significant difference (LSD) test at a 0.05 significance level. The results were evaluated according to a measure of degradation, which is the difference between the rating of the reference Sr and the rating of the processed sound Sp,

    d = Sp − Sr.    (16)

The degradation measure takes into account the rating of the reference sound to compensate for any erroneous rating by the subjects. No statistically significant variation between subjects in the rating of the reference sounds was found (F10,2739 = 1.58, p = 0.106). In the following, sound quality is defined by the degradation d; a low degradation is assumed to be synonymous with high sound quality.

All sounds were played back over Beyerdynamic DT990 headphones, connected to an M-Audio FireWire 410 sound interface. Eleven subjects, most of them male in their mid-twenties, listened to each condition twice, and thus 250 sets of sounds were presented to each subject. The

Table 2. Scale used in the listening experiment.

Score   Impairment
5.0     Imperceptible
4.0     Perceptible, but not annoying
3.0     Slightly annoying
2.0     Annoying
1.0     Very annoying
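The amplitude weighting of Eq. (15) is a weighted mean of the PPP values over the stationary frames. A small sketch, assuming ke is inclusive and that phase wrapping inside the stationary part is negligible:

```python
import numpy as np

def constant_phase(phi, amp, ks, ke):
    """Eq. (15): approximate the PPP trajectory of one partial by the
    amplitude-weighted mean kappa_i over frames ks..ke (inclusive)."""
    phi = np.asarray(phi, dtype=float)[ks:ke + 1]
    amp = np.asarray(amp, dtype=float)[ks:ke + 1]
    return float(np.sum(phi * amp) / np.sum(amp))
```

High-amplitude frames dominate the average, so the constant tracks the phase of the partial where it contributes most energy.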


experiment took about one hour, and thus time was the limiting factor for the number of repetitions, reproduction conditions, and instrument types used in the experiment.

4.4 Results and Discussion

After the experiment the subjects were asked to comment on the experiment. Many stated that the perceived difference between reference and processed sound was due to changes in the sustained part of the sound. The sounds used in the experiment all had soft attacks and no transients, except for the piano, which has a fast attack when the hammer hits the string. For the piano one subject commented on a perceived difference in the attack.

The degradation varied significantly across conditions (F4,84 = 60.7, p = 0.001). Fig. 5 shows the mean degradation and the standard error of the mean for each reproduction type. Pairwise comparison showed that the individual levels of reproduction were significantly different from each other (p ≤ 0.001). The order of the mean degradations ranged, as expected, from imperceptible toward larger degradation of the sound quality as phase information was removed. One exception is that the mean degradation of relative phase (RP) was lower than that of absolute phase (AP). In synthesizing RP far more phase information is used than in AP. Even though AP is rated lower than ARP, the results show that AP, the model using the partial-period phase representation, can indeed retain some of the perceptually important phase information. Another explanation for the finding may be the fact that the attack in AP is identical to the attack in ARP and thus close to that of the original sound.

Fig. 5. Mean degradation for different reproduction types.

The variation in degradation across the fundamental frequency groups was also significant (F4,84 = 82.5, p < 0.001). Fig. 6 shows the mean degradations for the fundamental frequencies f1 to f5, as defined in Table 1, for the different levels of reproduction. The sound quality improved as a function of frequency, except for f5, where it was significantly lower than for f3 and f4, at p ≤ 0.024. The interaction of reproduction and fundamental frequency was significant (F16,336 = 30.1, p < 0.001), with an unexplained large mean degradation in the AP and NP reproductions for f5 compared to f3 and f4. For ARP and RP we see a clear relationship between fundamental frequency and perceived sound quality, where a low fundamental frequency results in larger degradation. It seems that ARP and RP retain the phase relations that are important when modeling the noise between the harmonic partials in the high-pitched sounds. The noise is modeled using additional nonharmonic sinusoids, which make "the noise take on a tonal quality that is unnatural and annoying" if the phase information is not used [2]. This only applies to the high-pitched tones, however.

Mean degradation as a function of instrument type and reproduction is shown in Fig. 7. The degradation varies significantly across the type of instrument (F4,84 = 54.4, p < 0.001). In general the degradation was lower for cello and piano than for the rest of the instruments. The interaction of instrument type and reproduction was significant (F16,336 = 28.4, p < 0.001). For bass trombone and bass clarinet, AP gave a lower degradation than the other reproduction methods, with the exception of ORG and ARP.

Piano gave the largest degradation for ARP reproduction, which is most likely due to errors in the reproduction of the attack. The transient caused by the hammer–string interaction in the piano attack is the only fast transient that occurs in the instrument selection included in this experiment. Because of the window used in the block-based analysis, smearing does occur, which is harmful to the modeling of fast transients. This may explain the higher degradation in ARP for piano. A within-subject analysis of the degradation for the piano sounds reveals a significant difference between reproduction types (F4,84 = 4.5, p =


0.002). However, pairwise comparison of the four different synthesized reproduction types reveals no significant difference between any of them. The piano tones are slightly inharmonic, and thus the relative phase shift is not constant over time. This change in relative phase shift may explain why no significant difference in degradation is observed between the synthesis methods, since the nonharmonic relationship prevents stable auditory waveforms that would influence the perceived sound quality [22], [13]. In addition, the relatively short transient of approximately 5 ms [48] of the piano attack makes phase changes less perceptible [18].

Fig. 6. Mean degradation for different reproduction types as a function of fundamental frequency group. A high group number corresponds to a high fundamental frequency.

Fig. 7. Mean degradation for different reproduction types as a function of instrument type.

5 CONCLUSIONS

We have presented an improved representation, partial-period phase, for use in the sinusoidal analysis/synthesis framework. The proposed representation makes phase values from consecutive frames comparable while solving problems with phase unwrapping. The representation results in stable phase trajectories for the stationary part of harmonic and inharmonic sounds. Partial-period phase is of direct use in additive analysis/synthesis, where phase is modeled together with frequency and amplitude, when transforming or synthesizing sounds.


An experiment was conducted where synthesized sounds were compared to original recorded sounds. The results of the experiment show that the inclusion of phase alignment enhances the sound quality of the analysis/synthesis system. A significant change in mean degradation was found between synthesis without and with phase, going from "perceptible, but not annoying" to "imperceptible." This result is in agreement with the literature on auditory perception of complex tones. A significant effect of fundamental frequency was found, resulting in degradation approaching "slightly annoying" for sounds with fundamental frequencies below approximately 100 Hz synthesized without phase (NP).

By use of the partial-period phase representation, a phase model (AP) is proposed where the sustained part of the sound is modeled by a constant partial-period phase trajectory. The experiment shows that this model is significantly better than when discarding phase alignment information (NP), or when maintaining the relative phase shift (RP) but discarding the absolute alignment of the partials.

For the piano, synthesis with full phase information (ARP) was worse than for the other instruments, which is most likely due to the smearing of the fast transient in the attack. No significant difference was found between the different synthesis methods of the piano sound.

6 ACKNOWLEDGMENT

The authors would like to thank the reviewers, Brian C. J. Moore and one anonymous reviewer, for helpful comments and suggestions. We would also like to thank the subjects participating in the experiment.

7 REFERENCES

[1] K. Jensen, "Timbre Models of Musical Sounds," Ph.D. dissertation, Tech. Rep. 99/7, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark (1999).
[2] R. J. MacAulay and T. F. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 744–754 (1986 Aug.).
[3] H. Helmholtz, On the Sensation of Tone, 2nd English ed., based on 4th German ed. of 1877 (Dover, New York, 1954).
[4] R. Plomp and H. J. M. Steeneken, "Effect of Phase on the Timbre of Complex Tones," J. Acoust. Soc. Am., vol. 46, pp. 409–421 (1969).
[5] R. C. Mathes and R. L. Miller, "Phase Effects in Monaural Perception," J. Acoust. Soc. Am., vol. 19, pp. 780–797 (1947).
[6] J. L. Goldstein, "Auditory Spectral Filtering and Monaural Phase Perception," J. Acoust. Soc. Am., vol. 41, pp. 458–479 (1967).
[7] R. D. Patterson, "A Pulse Ribbon Model of Monaural Phase Perception," J. Acoust. Soc. Am., vol. 82, pp. 1560–1586 (1987).
[8] R. Meddis and M. J. Hewitt, "Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. I: Pitch Identification," J. Acoust. Soc. Am., vol. 89, pp. 2866–2882 (1991).
[9] R. Meddis and M. J. Hewitt, "Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. II: Phase Sensitivity," J. Acoust. Soc. Am., vol. 89, pp. 2883–2894 (1991).
[10] R. D. Patterson, M. H. Allerhand, and C. Giguere, "Time-Domain Modelling of Peripheral Auditory Processing: A Modular Architecture and a Software Platform," J. Acoust. Soc. Am., vol. 98, pp. 1890–1894 (1995).
[11] R. P. Carlyon and S. Shamma, "An Account of Monaural Phase Sensitivity," J. Acoust. Soc. Am., vol. 114, pp. 333–348 (2003).
[12] J. I. Alcántara, I. Holube, and B. C. J. Moore, "Effects of Phase and Level on Vowel Identification: Data and Predictions Based on a Nonlinear Basilar-Membrane Model," J. Acoust. Soc. Am., vol. 100, pp. 2382–2392 (1996).
[13] D. Pressnitzer and S. McAdams, "Two Phase Effects on Roughness Perception," J. Acoust. Soc. Am., vol. 105, pp. 2773–2782 (1999).
[14] B. Roberts, B. R. Glasberg, and B. C. J. Moore, "Primitive Stream Segregation of Tone Sequences without Differences in F0 or Passband," J. Acoust. Soc. Am., vol. 112, pp. 2074–2085 (2002).
[15] H. Gockel, B. C. J. Moore, R. D. Patterson, and R. Meddis, "Louder Sounds Can Produce Less Forward Masking: Effects of Component Phase in Complex Tones," J. Acoust. Soc. Am., vol. 114, pp. 978–990 (2003).
[16] R. A. Greiner and D. E. Melton, "Observations on the Audibility of Acoustic Polarity," J. Audio Eng. Soc., vol. 42, pp. 245–253 (1994 Apr.).
[17] B. C. J. Moore and B. R. Glasberg, "Difference Limens for Phase in Normal and Hearing-Impaired Subjects," J. Acoust. Soc. Am., vol. 86, pp. 1351–1365 (1989).
[18] J. H. Patterson and D. M. Green, "Discrimination of Transient Signals Having Identical Energy Spectra," J. Acoust. Soc. Am., vol. 48, pp. 894–905 (1970).
[19] G. H. Wakefield, L. M. Heller, L. H. Carney, and M. Mellody, "On the Perception of Transients: Applying Psychophysical Constraints to Improve Audio Analysis and Synthesis," in Proc. Int. Computer Music Conf. (2000), pp. 225–228.
[20] S. Uppenkamp, S. Fobel, and R. D. Patterson, "The Effects of Temporal Asymmetry on the Detection and the Perception of Short Chirps," Hear. Res., vol. 158, pp. 71–83 (2001).
[21] M. R. Schroeder, "New Results Concerning Monaural Phase Sensitivity," J. Acoust. Soc. Am., vol. 31, p. 1579 (1959).
[22] M. R. Schroeder, "Synthesis of Low-Peak-Factor Signals and Binary Sequences with Low Autocorrelation," IEEE Trans. Inform. Theory, vol. 16, pp. 85–89 (1970).
[23] M. Leman, "Visualization and Calculation of the Roughness of Acoustical Musical Signals Using the Synchronization Index Model (SIM)," in Proc. Conf. on Digital Audio Effects (DAFX-00) (2000), pp. 125–130.
[24] E. Tind and K. Jensen, "Phase Models to Control Roughness in Additive Synthesis," in Proc. Int. Computer Music Conf. (Miami, FL, 2004), to be published (2004 Nov.).
[25] J. C. Risset and D. L. Wessel, "Exploration of Timbre by Analysis and Synthesis," in Psychology of Music, D. Deutsch, Ed. (Academic Press, New York, 1982).
[26] M. R. Portnoff, "Implementation of the Digital Phase Vocoder Using the Fast Fourier Transform," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, pp. 243–248 (1976).
[27] J. B. Allen, "Short Term Spectral Analysis, Synthesis and Modification by Discrete Fourier Transform," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, pp. 235–238 (1977).
[28] X. Serra and J. Smith, "Spectral Modeling Synthesis: A Sound Analysis/Synthesis System Based on a Deterministic Plus Stochastic Decomposition," Computer Music J., vol. 14, pp. 12–24 (winter 1990).
[29] K. Fitz and L. Haken, "Sinusoidal Modeling and Manipulation Using Lemur," Computer Music J., vol. 20, no. 4, pp. 44–59 (1996).
[30] X. Rodet, "The Additive Analysis–Synthesis Package," Tech. Rep., IRCAM, Paris, France (2004 July). www.ircam.fr/equipes/analyse-synthese/DOCUMENTATIONS/additive/index-e.html.
[31] K. Jensen, "The Timbre Model," in Proc. Workshop on Current Research Directions in Computer Music (Barcelona, Spain, 2001), pp. 174–186.
[32] M. V. Matthews, J. E. Miller, and E. E. David, "Pitch Synchronous Analysis of Voiced Speech," J. Acoust. Soc. Am., vol. 33, pp. 179–186 (1961 Feb.).
[33] M. D. Freedman, "Analysis of Musical Instrument Tones," J. Acoust. Soc. Am., vol. 41, pp. 793–806 (1967).
[34] K. Fitz and L. Haken, "On the Use of Time–Frequency Reassignment in Additive Sound Modeling," J. Audio Eng. Soc., vol. 50, pp. 879–893 (2002 Nov.).
[35] J. S. Marques and L. B. Almeida, "New Basis Functions for Sinusoidal Decompositions," in Proc. 8th Eur. Conf. in Electrotechnics (EUROCON'88) (1988 June), pp. 48–51.
[36] P. Guillemain, "Analyse et modélisation de signaux sonores par des représentations temps–fréquence linéaires" ("Analysis and Modeling of Sound Signals by Linear Time–Frequency Representations"), Ph.D. thesis, Université d'Aix–Marseille II, France (1994).
[37] F. Auger and P. Flandrin, "Improving the Readability of Time Frequency and Time Scale Representations by the Reassignment Method," IEEE Trans. Signal Process., vol. 43, pp. 1068–1089 (1995).
[38] S. Borum and K. Jensen, "Additive Analysis/Synthesis Using Analytically Derived Windows," in Proc. Digital Audio Effects Workshop (Trondheim, Norway, 1999), pp. 125–128.
[39] Y. Ding and X. Qian, "Processing of Musical Tones Using a Combined Quadratic Polynomial-Phase Sinusoid and Residual (QUASAR) Signal Model," J. Audio Eng. Soc., vol. 45, pp. 571–584 (1997 July/Aug.).
[40] A. Röbel, "Adaptive Additive Synthesis of Sound," in Proc. Int. Computer Music Conf. (Berlin, Germany, 1999), pp. 256–259.
[41] T. H. Andersen, "Phase Models in Real-Time Analysis/Synthesis of Voiced Sounds," Master's thesis, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark (2002 Jan.).
[42] L. R. Rabiner, "On the Use of Autocorrelation Analysis for Pitch Detection," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, pp. 24–33 (1977).
[43] A. Papoulis, Signal Analysis (McGraw-Hill, New York, 1977).
[44] R. Di Federico, "Waveform Preserving Time Stretching and Pitch Shifting for Sinusoidal Models of Sound," in Proc. COST-G6 Digital Audio Effects Workshop (1998), pp. 44–48.
[45] J. Laroche and M. Dolson, "New Phase-Vocoder Techniques for Real-Time Pitch Shifting, Chorusing, Harmonizing, and Other Exotic Audio Modifications," J. Audio Eng. Soc., vol. 47, pp. 928–936 (1999 Nov.).
[46] H. Fletcher, E. D. Blackham, and R. Stratton, "Quality of Piano Tones," J. Acoust. Soc. Am., vol. 34, pp. 749–761 (1962).
[47] ITU-R 8510, "Methods for the Subjective Assessment of Small Impairments in Audio Systems, Including Multichannel Sound Systems," Tech. Rep., International Telecommunications Union, Geneva, Switzerland (1994 Mar.).
[48] J. Bensa, K. Jensen, and R. Kronland-Martinet, "A Hybrid Resynthesis Model for Hammer–String Interaction of Piano Tones," EURASIP J. Appl. Signal Process., vol. 7, pp. 1021–1035 (2004).

THE AUTHORS

T. H. Andersen K. Jensen


Tue Haste Andersen received a master's degree in computer science from the Department of Computer Science, University of Copenhagen, Copenhagen, Denmark, in 2002. At present he is pursuing graduate studies in the same department, working with human–computer interaction aspects of sound and music.

Kristoffer Jensen received a master's degree in computer science from the Technical University of Lund, Sweden, and a D.E.A. in signal processing from ENSEEIHT, Toulouse, France. In 1999 he received a Ph.D. degree from the Department of Datalogy, University of Copenhagen, Denmark, doing work in analysis/synthesis, signal processing, classification, and modeling of musical sounds.

Dr. Jensen is an assistant professor in the Department of Datalogy. He has a broad background in signal processing and has been involved in synthesizers for children, state-of-the-art next-generation effect processors, and general topics in music informatics. His current research topic is signal processing with musical applications, which includes knowledge of perception, psychoacoustics, physical models, and expression of music.

ENGINEERING REPORTS

Direct Approximate Third-Order Response


Synthesis of Vented-Box Loudspeaker Systems*

BERNAT LLAMAZARES, AES Associate Member

Polytechnic University of Catalonia (UPC), 08034 Barcelona, Spain

A new approach to designing vented-box loudspeaker systems exhibiting approximate third-order responses is presented, and as a result a new family of intermediate alignments is obtained. These alignments are characterized by providing a better transient response at the expense of a less than optimum efficiency in the low-frequency range (provided both fourth- and approximate third-order filter functions are adjusted in a similar way). Different cases are illustrated relating to a set of well-known solutions.

0 INTRODUCTION

Thiele in a series of seminal papers [1] deals with the possible frequency-response alignments of vented-box loudspeaker systems, the quasi-third-order Butterworth (QB3) and sub-Chebyshev (SC4) responses being among these. The first are derived by setting equal to zero the coefficients for the sixth and fourth powers of frequency in the expression for the squared modulus of the fourth-order filter; the second are generated by multiplying the real part of the Butterworth poles by a factor k greater than unity. In any case it has to be noted that both of these alignment types smoothly depart from the reference fourth-order Butterworth and feature smoothly rounded shapes in their respective cutoff regions. They can also be described as having a cutoff frequency and box tuning frequency above the driver resonance frequency, and as being suitable for use with low-Q, low-resonance drivers and relatively small enclosures.

The new alignments presented in this engineering report have performance characteristics that are qualitatively similar to those of the Thiele alignments. They result in a series of advantages (when compared to the same characteristics of a sealed-box system) such as a reduced diaphragm excursion at frequencies near the vented enclosure resonance frequency fB (and consequent higher displacement-limited power capacity and lower values of nonlinear and modulation distortion), a higher efficiency, and a lower cutoff frequency. Furthermore these alignments require low-QT drivers. In this regard another advantage to be considered is that pointed out in Maxwell [2], namely, the margin for error in the nominal Thiele–Small parameters in a loudspeaker design is larger for lower quality factors when it is a question of minimizing possible changes in the frequency response. On the down side there would be two main disadvantages: 1) a frequency response that may be considered intermediate between conventional second-order sealed-box systems and fourth-order vented-box systems, and therefore a transient response not as good as that of a sealed-box system, and 2) an increasing cone excursion for very low frequencies.

The basis of the approach to be described in the present engineering report involves cascading a prototype third-order high-pass function having optimal behavior with a first-order high-pass function at a lower frequency. This will force the fourth-order system function denominator to consist of two real poles and one complex conjugate pair, thus generating approximate third-order responses by only shifting the single real pole of the first-order function once the s plane is normalized to the angular frequency variable of first- and third-order transfer functions. It should also be clear that because the overall transfer function is the cascade of an optimal prototype high-pass function with another high-pass function, the result is no longer optimum in the same way.

*Manuscript received 2004 April 22; revised 2004 August 18, September 1, and September 13.

0.1 List of Symbols

d      Uncoupling coefficient
fB     Resonance frequency of vented enclosure
fS     Resonance frequency of driver
f3     Cutoff (−3-dB) frequency of loudspeaker system response
F(s)   First-order transfer function
G(s)   Response transfer function
h      System tuning ratio, = fB/fS
QL     Enclosure Q at fB resulting from leakage losses
QT     Total driver Q at fS resulting from all system resistances
s      Complex frequency variable
T(s)   Third-order transfer function
U      Uncoupling factor
VAS    Volume of air having same acoustic compliance as driver suspension
VB     Net internal volume of enclosure
ω0     Angular frequency variable of fourth-order transfer function
ω1     Angular frequency variable of first- and third-order transfer functions
ωT3    Cutoff angular frequency of third-order transfer function
α      System compliance ratio, = VAS/VB

1 RESPONSE SYNTHESIS

The response transfer function of a vented-box system is a fourth-order high-pass filter and can be written in the form

G(s) = s^4 / (s^4 + a1*ω0*s^3 + a2*ω0^2*s^2 + a3*ω0^3*s + ω0^4).   (1)

For the purpose of this engineering report, this function is factored into the product of first- and third-order high-pass transfer functions as follows:

G(s) = F(s) × T(s)   (2)

where

F(s) = s / (s + d*ω1)   (3)

T(s) = s^3 / (s^3 + b1*ω1*s^2 + b2*ω1^2*s + b3*ω1^3)   (4)

d being an uncoupling coefficient equal to the ratio of the cutoff angular frequency of F(s) to the angular frequency variable ω1.

Rewriting Eq. (2), G(s) is given by

G(s) = s^4 / [s^4 + (b1 + d)*ω1*s^3 + (b2 + b1*d)*ω1^2*s^2 + (b3 + b2*d)*ω1^3*s + b3*d*ω1^4].   (5)

By defining

c = (b3*d)^(1/4)   (6)

ω0 = c*ω1   (7)

G(s) becomes

G(s) = s^4 / {s^4 + [(b1 + d)/c]*ω0*s^3 + [(b2 + b1*d)/c^2]*ω0^2*s^2 + [(b3 + b2*d)/c^3]*ω0^3*s + ω0^4}.   (8)

Equating Eqs. (1) and (8) yields the following coefficient relationships:

a1 = (b1 + d)/c
a2 = (b2 + b1*d)/c^2   (9)
a3 = (b3 + b2*d)/c^3.

2 DETERMINATION OF UNCOUPLING FACTOR

We can determine an uncoupling factor between the two transfer functions by evaluating the log magnitude-squared form of F(s) at the cutoff angular frequency of T(s). That is, we define the uncoupling factor U as follows:

U = 10 log |F(ωT3)|^2 = 10 log {1 / [1 + (d*ω1/ωT3)^2]}.   (10)

Thus the uncoupling factor U indicates how much additional attenuation F(s) contributes at the cutoff angular frequency of T(s).

To illustrate with some values, let us start with an approximate third-order Butterworth response (AB3),

T(s) = B3(s) = s^3 / (s^3 + 2*ω1*s^2 + 2*ω1^2*s + ω1^3).

Then

ωT3 = ω1
U = 10 log |F(ωT3)|^2 = 10 log [1/(1 + d^2)]

and

d      U (dB)
0.05   −0.009
0.1    −0.044
0.2    −0.168
0.3    −0.376
0.4    −0.645

Now let us consider an approximate third-order Bessel response (ABL3),

T(s) = BL3(s) = s^3 / (s^3 + 2.466*ω1*s^2 + 2.433*ω1^2*s + ω1^3).

Then

ωT3 = 1.405*ω1
U = 10 log |F(ωT3)|^2 = 10 log {1/[1 + (d/1.405)^2]}

and

d      U (dB)
0.05   −0.004
0.1    −0.022
0.2    −0.088
0.3    −0.195
0.4    −0.339
0.5    −0.521

Finally let us suppose an approximate third-order Chebyshev response with a 0.5-dB peak dip (AC3(0.5)),

T(s) = C3(0.5)(s) = s^3 / (s^3 + 2.145*ω1*s^2 + 1.751*ω1^2*s + 1.397*ω1^3).

Then

ωT3 = 0.855*ω1
U = 10 log |F(ωT3)|^2 = 10 log {1/[1 + (d/0.855)^2]}

and

d      U (dB)
0.05   −0.013
0.1    −0.061
0.2    −0.232
0.3    −0.506

It then follows that by using an uncoupling coefficient d that is not too large, the response of the fourth-order vented-box system is made to closely resemble that of a prototype third-order high-pass function.

3 COMPUTATION OF PARAMETERS

In this section we show the calculated Thiele–Small parameters for three typical classes of responses. In all cases the vented-box system is assumed to have a leakage loss of QL = 7 and a desired uncoupling factor |U| ≤ 0.4 dB.

To do this we follow the steps described in Small [3].

1) Calculate

c1 = a1*QL,  c2 = a3*QL.   (11)

2) Find the largest positive real root r of

r^4 − c1*r^3 + c2*r − 1 = 0.   (12)

3) Then the alignment parameters are

h = r^2
α = a2*h − h^2 − 1 − (a3*h^(1/2)*QL − 1)/QL^2
QT = h*QL / (a3*h^(1/2)*QL − 1).   (13)

Furthermore

ω0 = h^(1/2)*ωS.   (14)

Hence

ω1 = (h^(1/2)/c)*ωS.   (15)

Some examples are shown next.

3.1 Approximate Third-Order Butterworth Responses (AB3)

The coefficient values are given by

b1 = b2 = 2,  b3 = 1.

The system parameters for QL = 7 are given in Table 1.

Table 1. System parameters for a set of AB3 responses.

d      a1     a2     a3      h      α       QT     f3/fS
0.05   4.334  9.386  10.395  2.509  13.922  0.154  3.352
0.1    3.737  6.965  6.760   1.880  7.256   0.206  2.448
0.2    3.288  5.362  4.676   1.464  3.919   0.265  1.832
0.3    3.108  4.748  3.948   1.297  2.854   0.298  1.583

3.2 Approximate Third-Order Bessel Responses (ABL3)

The coefficient values are given by

b1 = 2.466,  b2 = 2.433,  b3 = 1.

The system parameters for QL = 7 are given in Table 2.

Table 2. System parameters for a set of ABL3 responses.

d      a1     a2      a3      h      α       QT     f3/fS
0.05   5.319  11.426  10.599  2.053  16.094  0.136  4.259
0.1    4.566  8.484   7.004   1.570  8.622   0.182  3.142
0.2    3.985  6.538   4.965   1.266  4.896   0.233  2.393
0.3    3.738  5.794   4.269   1.153  3.716   0.260  2.098
0.4    3.605  5.410   3.927   1.096  3.161   0.276  1.943

3.3 Approximate Third-Order Chebyshev Responses (AC3)

Note that both Butterworth and Bessel specify unique alignments, whereas Chebyshev is a family that must have a parameter specified, such as the ripple magnitude. The coefficient values for a 0.5-dB peak dip are given by

b1 = 2.145,  b2 = 1.751,  b3 = 1.397.

The system parameters for QL = 7 are given in Table 3.

Table 3. System parameters for a set of AC3(0.5) responses.

d      a1     a2     a3      h      α      QT     f3/fS
0.05   4.270  7.034  10.932  2.686  8.139  0.151  2.734
0.1    3.674  5.265  6.892   1.954  4.114  0.206  1.965
0.2    3.226  4.125  4.547   1.450  2.117  0.272  1.433

From Tables 1–3 it can be seen that smaller box volumes require lower quality factors and higher tuning ratios, thus yielding higher cutoff frequencies.
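The steps above can be checked numerically. The following Python sketch is illustrative only (the function names are mine, not from the report): it derives a1, a2, and a3 from Eqs. (6) and (9), solves the quartic of Eq. (12) by bracketing and bisection, and evaluates Eq. (13). One caveat: the quartic also has other positive real roots (including a large one near a1*QL) that lead to a negative compliance ratio, so the sketch keeps the root for which both α and QT come out positive rather than simply the largest one.

```python
import math

def uncoupling_factor(d, wT3_over_w1):
    """Eq. (10): extra attenuation (dB) contributed by the first-order
    section F(s) at the cutoff angular frequency of T(s)."""
    return -10.0 * math.log10(1.0 + (d / wT3_over_w1) ** 2)

def alignment(b1, b2, b3, d, QL=7.0):
    """Carry out Eqs. (6), (9), and (11)-(13) for one alignment,
    returning a1..a3, h, alpha, and QT."""
    c = (b3 * d) ** 0.25                      # Eq. (6)
    a1 = (b1 + d) / c                         # Eq. (9)
    a2 = (b2 + b1 * d) / c ** 2
    a3 = (b3 + b2 * d) / c ** 3
    c1, c2 = a1 * QL, a3 * QL                 # Eq. (11)

    def p(r):                                 # Eq. (12), with r = sqrt(h)
        return r ** 4 - c1 * r ** 3 + c2 * r - 1.0

    # Locate sign changes of p on a fine grid, then bisect each bracket.
    roots = []
    grid = [i * 0.01 for i in range(1, int(100.0 * (c1 + 2.0)))]
    for lo, hi in zip(grid, grid[1:]):
        if p(lo) * p(hi) <= 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if p(lo) * p(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))

    # Keep the physically meaningful root: positive alpha and QT.
    for r in roots:
        h = r * r                             # Eq. (13)
        denom = a3 * math.sqrt(h) * QL - 1.0
        if denom <= 0.0:
            continue
        alpha = a2 * h - h * h - 1.0 - denom / QL ** 2
        QT = h * QL / denom
        if alpha > 0.0 and QT > 0.0:
            return {"a1": a1, "a2": a2, "a3": a3,
                    "h": h, "alpha": alpha, "QT": QT}
    raise ValueError("no physically meaningful root found")
```

For b1 = b2 = 2, b3 = 1 (AB3) with d = 0.1 and QL = 7 this returns h ≈ 1.878, QT ≈ 0.206, and α ≈ 7.24, in close agreement with the d = 0.1 row of Table 1 (small third-decimal differences remain, presumably from rounding in the published tables). Likewise uncoupling_factor(0.4, 1.405) ≈ −0.339 dB reproduces the ABL3 illustration of Section 2.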
As to the response shape, it is clear that as d decreases, the approximate rolloff of 18 dB per octave will be extended down to lower frequencies below the cutoff frequency.

4 COMPARISON OF APPROXIMATE THIRD-ORDER AND QB3–SC4 ALIGNMENTS

The QB3 and SC4 responses of Thiele [1] can be calculated by a coefficient parameter B and a pole-shifting factor k, respectively, as described in [3]. It would therefore be interesting to compare both of these alignment types to the new approximate third-order alignments.

Tables 4–6 list the system parameter values of different alignments with QT being held constant in each case. To be able to compare these readily, we plotted their respective normalized response curves in Figs. 1–3.

Table 4. System parameters for different responses. QT = 0.136, QL = 7.

Alignment   Parameter    a1     a2      a3      h      α       f3/fS
AB3         d = 0.038    4.616  10.650  12.503  2.839  18.136  3.818
ABL3        d = 0.050    5.319  11.426  10.599  2.053  16.094  4.259
AC3(0.5)    d = 0.040    4.496  7.777   12.780  2.989  10.175  3.050
QB3         B = 11.622   4.615  10.650  12.505  2.839  18.185  3.818

Table 5. System parameters for different responses. QT = 0.206, QL = 7.

Alignment   Parameter    a1     a2     a3     h      α      f3/fS
AB3         d = 0.100    3.737  6.965  6.760  1.880  7.256  2.448
ABL3        d = 0.140    4.260  7.425  5.857  1.402  6.474  2.736
AC3(0.5)    d = 0.100    3.674  5.265  6.892  1.954  4.114  1.965
QB3         B = 5.645    3.732  6.965  6.767  1.885  7.269  2.449

Table 6. System parameters for different responses. QT = 0.272, QL = 7.

Alignment   Parameter    a1     a2     a3     h      α      f3/fS
AB3         d = 0.217    3.248  5.225  4.511  1.428  3.672  1.779
ABL3        d = 0.370    3.636  5.499  4.004  1.109  3.286  1.979
AC3(0.5)    d = 0.200    3.226  4.125  4.547  1.450  2.117  1.433
QB3         B = 3.175    3.233  5.225  4.531  1.440  3.694  1.777
SC4         k = 3.380    3.702  5.916  3.950  1.073  3.633  2.082

Fig. 1. Normalized response curves corresponding to alignments of Table 4.

Comments to Figs. 1–3 can be summarized as follows.

1) For low QT values, AB3 and QB3 alignments exhibit almost identical response shapes and provide similar system parameter values. Note that in both cases one gets the largest values of α (a larger value of α means a smaller box size).

2) The lowest cutoff frequencies and steepest cutoff slopes always occur for AC3 alignments.

3) The ABL3 and SC4 alignments feature the most rounded response—in other words the best transient response—but the price is that they provide the highest cutoff frequencies.

4) It has to be emphasized that for quality factors QT having values less than about 0.23, SC4 alignments are no longer possible.

5 CONCLUSIONS

We have described a simple method to design vented-box loudspeaker systems exhibiting approximate third-order responses. For this to be intuitive and easy to manage, the introduction of an uncoupling factor has proved to be very useful. Examples shown include the so-called approximate third-order Butterworth (AB3), Bessel (ABL3), and Chebyshev (AC3) responses. It is noteworthy that AB3 alignments and Thiele's QB3 show almost identical characteristics for low QT values.

In summary, these new alignments are very suitable for low-resonance, low-QT drivers and offer the designer an alternative system design option having response characteristics midway between second-order sealed-box and fourth-order vented-box systems.

6 ACKNOWLEDGMENT

The author would like to express his gratitude to the two reviewers for their valuable comments and suggestions.

7 REFERENCES

[1] A. N. Thiele, "Loudspeakers in Vented Boxes, Parts I and II," J. Audio Eng. Soc., vol. 19, pp. 382–392 (1971 May); pp. 471–483 (1971 June).
[2] R. G. Maxwell, "Low-Frequency Options: Design Curves for Vented-Box Loudspeakers," J. Audio Eng. Soc. (Engineering Reports), vol. 41, pp. 44–48 (1993 Jan./Feb.).
[3] R. H. Small, "Vented-Box Loudspeaker Systems, Parts I–IV," J. Audio Eng. Soc., vol. 21, pp. 363–372 (1973 June); pp. 438–444 (1973 July/Aug.); pp. 549–554 (1973 Sept.); pp. 635–639 (1973 Oct.).

Fig. 2. Normalized response curves corresponding to alignments of Table 5.

Fig. 3. Normalized response curves corresponding to alignments of Table 6.


THE AUTHOR

Bernat Llamazares was born in Barcelona, Spain, in 1965. He received a degree in telecommunication engineering in 1998 and a master's degree in private and public telecommunication services and networks in 2003, both from the Polytechnic University of Catalonia (UPC). He has been working with different consultancies on both communication and system projects and is presently managing his own company. Besides a passion for music, his main interests include loudspeaker systems, room acoustics, and audio signal processing.

Mr. Llamazares is an associate member of the AES.

LETTERS

CORRECTIONS
CORRECTION TO “ANALYSIS OF LOUDSPEAKER LINE ARRAYS”

In the above paper1 Figs. 39(b) and 39(c) should have appeared as follows. The author wishes to thank Greg Oshiro for bringing this to his attention.

Fig. 39. Comparison of directivity functions of a stack of three curved sources and a straight-line source. Curved sources have element length L = 150 mm, total included angle θ = 20°. Straight-line source has total length 3L.

MARK S. UREDA, AES Member


JBL Professional
Northridge, CA 91329, USA

*Manuscript received 2004 September 23.
1 M. S. Ureda, J. Audio Eng. Soc., vol. 52, pp. 467–495 (2004 May).

AES STANDARDS
COMMITTEE NEWS
Detailed information regarding AES Standards Committee
(AESSC) proceedings including structure, procedures, reports,
meetings, and membership is published on the AES Standards Web
site at http://www.aes.org/standards/. Membership of AESSC work-
ing groups is open to any individual materially and directly affect-
ed by the work of the group. For current project schedules, see the
project-status document also on the Web site.

Work of SC-02-05 Complete

SC-02-05, the working group on synchronization under subcommittee SC-02 on Digital Audio, has been responsible for two standards, AES5—Preferred Sampling Frequencies—and AES11—Synchronization in Studio Operations. These two standards have laid down the fundamental rules for handling timing and delay issues in all areas using sampled audio, including the specification of the synchronization distribution interface and the relationship between audio and video signals.

While the interface specified in AES11 is very similar to the digital audio interface AES3, it should not be assumed that it relates solely to AES3. Rather, the standard defines synchronization issues for all audio interfaces. However, the details of how newer interfaces such as AES47 (ATM), IEEE 1394 (Firewire), and others achieve the process of synchronization require expert input from the groups managing those interfaces. Accordingly, projects addressing these have been allocated to the relevant interface working groups.

SC-02-05 has achieved its objective by defining the principles of synchronization. SC-02 and the AESSC accept that it has no immediate projects. It is therefore closed, the maintenance of its standards reallocated to SC-02-02, the working group on Digital Input-Output Interfacing.

Robin Caine, Chair SC-02, October 2004

For its published documents and reports the AESSC is guided by International Electrotechnical Commission (IEC) style as described in the ISO/IEC Directives, Part 3. IEC style differs in some respects from the style of the AES as used elsewhere in this Journal. AESSC document stages referenced are: Project initiation request (PIR); Proposed task-group draft (PTD); Proposed working-group draft (PWD); Proposed call for comment (PCFC); Call for comment (CFC).

New Projects Initiated

If you have an interest in helping to develop these standards projects, go to the AES Standards Web site at http://www.aes.org/standards/ and click on "Participation" to learn more.

Project AES-X145 Care and Handling of Optical Discs
Working Group SC-03-04.
Scope: This standard concerns the care and handling of digital optical discs during use. It addresses the issues of physical integrity of the medium necessary to preserve access to the recorded data. Included are recommendations for handling procedures to maximize the life of optical discs. This standard addresses the following subjects: use and handling environments, including pollutants, temperature and humidity, light exposure and magnetic fields; contamination concerns; inspection; cleaning and maintenance, including cleaning methods and frequency; transportation; disasters, including water, fire, construction and post-disaster procedures; and staff training.

Project AES-X146 Extended Term Storage Environment for Multiple Media Archives
Working Group SC-03-04.
Scope: This project seeks to provide suggested guidelines for four extended-term storage environments for archives that contain a variety of recording media, based on the corresponding AES and ISO storage standards for those media. This technical report will not replace those storage standards.

Project AES-X148 Sample-Accurate Timing in AES47
Working Group SC-02-02.
Scope: To specify how the timing markers specified in 4.1.4.1.1 and 4.5 of AES47 may be used to associate an absolute timestamp with individual audio samples.

Project AES-X149 Format and Recommended Usage of the Direct Stream Digital Interchange File Format (DSDIFF)
Working Group SC-06-01.
Scope: To standardize a file format to allow general interchange of sound files based on DSD-coded data, based on an existing specification. The DSDIFF file format is intended for storage and transfer of DSD and DST (encoded DSD) material.

Project AES-X151 Jitter Performance Specification
Working Group SC-02-01.
Scope: To establish unambiguous, concise, and useful ways of expressing the jitter performance of audio components and equipment. To find, develop, and recommend jitter characterization practices and jitter specification terminologies that will help audio equipment designers and audio system integrators.

METADATA REVISITED
Six New Things to Know About Audio Metadata

Last year we published an article designed to demystify and explain a number of key concepts relating to audio metadata (see JAES July/August 2003). Since that time the field has moved on, and the AES 25th International Conference held in London during June 2004 provided an opportunity to find out more about recent developments, as well as topics not covered in the original article. The following is a short summary of some of these, concentrating primarily on audio metadata standards and applications rather than on feature extraction (which was another key concept discussed at the AES 25th Conference).

WHAT'S A REGISTRY?

Philippa Morrell, industry standards manager of the BOSS Federation, described the purpose and nature of metadata registries. These are secure, central repositories of data that are increasingly used for business and commercial operations, particularly on the Internet where e-commerce leads to the need for centralized licensing and searching.

Registries provide cross-references between digital items (such as a song stored on a server somewhere on the Internet) and the information describing them. This is done either directly or via a proxy (an indirect address or server that handles the means of accessing the real information). Morrell's main point was that registries provide a means by which everyone can be "singing off the same hymn sheet." In other words, everyone is using the same descriptive and licensing information from a common and reliable source, rather than

that metadata is stored, as is planned for a system being developed by the recording industry.

Morrell mentioned the ISBN (International Standard Book Number) system as a good example of a well-known registry system that has enabled the development of a number of useful tools for libraries and publishers: a one-stop source for bibliographic information on English-language books in print; a range of online data interchange and order-routing solutions for books; and a database of publisher information. Global Data Synchronization (see www.e-centre.org.uk), which uses the EAN.UCC method for numbering and barcoding, was developed as a system for relating product, company, and location metadata to facilitate collaborative business processes. There is also a global registry—Global Product Classification (GPC)—that keeps track of the original data relating to products and companies by providing a common link between the classification systems of different companies.

WHAT'S CORE METADATA?

Richard Wright of BBC Information and Archives argued persuasively at the conference that the typical definition of metadata as being "data about data" is unfortunately too simple. In fact it is more correctly labeled "beyond data." It is the organization, naming, and relationships of the descriptive elements; the structure of data rather than any actual data. Wright pointed out that there are really three layers in any system: object (such as a digital audio file), descriptive data (data describing the file) and metadata (the convention or struc-

sal structure for data, can be either a lowest common denominator or a complete description that fits the most general case. While the former seems limited, the latter is virtually impossible to achieve in Wright's view. Although a lowest common denominator approach is limited, it can do a finite and rather small job in a finite time. Such a core standard should be essential, general, simple, and popular. The recognized standard is Dublin Core (see the previous article on metadata in the July/August 2003 JAES for more details). Given the fact that digital data is impermanent and that filing systems are superseded thanks to the evolution of technology, the primary need is for core standards that are quickly implemented and simple.

WHAT'S THE SAM/EBU DUBLIN CORE STANDARD?

Following on from this last point, Lars Jonsson and Gunnar Dahl explained that the Scandinavian Academy of Management worked with 25 archive specialists and engineers to specify a core metadata standard for use within the audio industry. This was proposed to the European Broadcasting Union (EBU) and later approved as EBU Tech. Dec. 3293 by the EBU panel P-FRA (Future Radio Archives). It is based on Dublin Core (DC) and includes some additional internal fields for the transfer of near-online production audio files within organizations. Use of the XML syntax for transferring the extended Dublin Core metadata information enables the addition of other information that the organization might find important. It is possible to incorporate this
there being numerous different versions ture for that descriptive data). new form of metadata within Broadcast
in numerous locations. A registry may Descriptive data is often mistakenly WAVE files using a header defined in
not itself contain metadata relating to termed metadata. Supplement 5 to the BWF standard,
digital items, but may point to where Core metadata, being a basic, univer- using the XML structure. One important

1178 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November
Metadata RevPAT 10/19/04 1:04 PM Page 2

METADATA REVISITED
feature is that none of the 15 fields in basic Dublin Core metadata is compulsory; only titles and names could be used if desired.

The implementation developed by the SAM/EBU group is known as AXML, which adds four TYPE values—PGR (program group), program, item (constituent editorial part of a program), and MOB (media object)—to the 15 basic elements of Dublin Core. These come from the EBU P-META standard (see the previous article on metadata in the July/August 2003 JAES).

HOW TO SOLVE PROBLEMS SEARCHING MPEG-7 DATABASES
Max Jacob from IRCAM in Paris showed that there is no common way to manage sound databases. Searching such databases is not straightforward, but the MPEG-7 framework may provide a way of enabling more efficient processes in this regard. Included in MPEG-7 are a number of parts to the XML schema, including audio descriptors and multimedia description schemes (MDS). The latter defines complex object-oriented data structures and is used for managing relations between descriptors, content segmentation, and semantic descriptions, among other things. Some issues need to be addressed, namely validation, management of very large MPEG-7 documents, and efficient searches.

With regard to efficient searches, the Worldwide Web Consortium (W3C) has developed a language called XPath (XML Path Language), which is a simple way of addressing nodes in an XML tree. Jacob believes that XPath 2.0 is theoretically capable of undertaking more or less all the searches one might want to perform on an MPEG-7 database, but states that the practical implementation of this is more difficult. Fast searches in practice require what are called indices focusing on a specific task, but there have been no tools for indexing MPEG-7 data in a way that can be understood by XPath. This problem was tackled in the CUIDADO project, within which a number of such limitations were addressed.

They decided to adopt the open-source database system PostgreSQL and set up a method of associating event handlers with database operations undertaken on the XML documents. These allow indices to be updated when changes are made to the elements of the XML tree (which are stored separately in the database). The indices are stored as case tables, which are fast and easy to search, so there is no need to browse the whole XML tree. Elements can be inserted and updated. (SAX is the Simple API for XML; in other words, the SAX parser is an application programming interface that parses XML documents into a series of events delimited by the relevant tags.)

MAPPING AES3/BWF AUDIO AND METADATA INTO MXF
Bruce Devlin, David Brooks, and David Schweinsberg from Snell and Wilcox discussed ways in which audio and metadata from BWF or AES3 (the standard digital audio interface) structures can be mapped into the MXF (Material Exchange Format), a new SMPTE standard for interchange in the broadcasting world. As explained in the previous article on metadata, the MXF format is principally a streaming format for media data but can also transmit edited projects.

An MXF file essentially has a number of separate tracks, each containing a different stream, which can be audio, video, metadata, data, or timecode. The tracks are grouped to form packages. A document has been created to deal with AES/BWF mapping, known as SMPTE 382M, currently in committee-draft form. This essentially allows multichannel audio to be stored, and individual or stereo audio tracks to be extracted from it. An important factor to consider was the need to ensure that one or more audio tracks could be synchronized to any of the other media formats (primarily video) contained within an MXF file (such as uncompressed, DV, or MPEG elementary streams). Metadata extracted from the audio stream or file can be taken into account and synchronized with the storage of audio. Specific metadata contained in the original audio format, such as the format chunk in BWF, can be mapped into relevant MXF descriptors and metadata components. Channel status data from an AES3 stream can be incorporated into MXF; the data mode is mapped to the AES3 audio essence descriptor set, and the channel status data itself is stored as data essence (on a separate track) or in the file header.

WHAT'S CMML?
CMML (the Continuous Media Markup Language), as described by Claudia Schremmer, Steve Cassidy, and Silvia Pfeiffer, is a means of marking up time-continuous media such as audio and video for integration into the searching, linking, and browsing functionality of the worldwide web. This is the basis of the so-called Continuous Media Web (CMWeb). As Schremmer pointed out, it is relatively easy to search and find static data that has descriptive metadata of one sort or another, but streamed media that evolves over time is a form of "dark matter" on the Internet, because it cannot easily be searched or browsed. A new format called Annodex is used to stream such material; it allows annotation and indexing so that the material can be integrated into the URL-based hyperlinking approach found on the Internet. Essentially what happens is that a markup file written in CMML is interleaved with the media stream to create an Annodex representation that can be searched and browsed.

Editor's note: Look for a third metadata article in another year or so as new tools are developed and systems are refined.

USEFUL WEBLINKS
EBU Tech. Doc. 3293-2001: http://www.ebu.ch/tech_32/tech_t3293.html
SMPTE standards: http://www.smpte.org
Annodex: http://www.annodex.net
XPath: http://www.w3.org/TR/xpath
Global Data Synchronization: http://www.e-centre.org.uk

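As a concrete sketch of the mechanism described above, the snippet below builds a small extended-Dublin-Core-style record and wraps it in a RIFF chunk of the kind a Broadcast WAVE file carries. The "axml" chunk identifier follows BWF Supplement 5; the XML element names are simplified placeholders and not the exact EBU Tech. 3293 schema.

```python
import struct
import xml.etree.ElementTree as ET

def build_dc_record(title, creator, item_type):
    # Illustrative extended-Dublin-Core record; element names are
    # simplified placeholders, not the exact EBU Tech. 3293 schema.
    root = ET.Element("record")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "creator").text = creator
    ET.SubElement(root, "type").text = item_type  # e.g. "item" or "MOB"
    return ET.tostring(root, encoding="utf-8")

def axml_chunk(xml_bytes):
    # RIFF chunk layout: 4-byte ID, little-endian 32-bit payload size,
    # payload, plus a pad byte when the payload length is odd.
    chunk = b"axml" + struct.pack("<I", len(xml_bytes)) + xml_bytes
    if len(xml_bytes) % 2:
        chunk += b"\x00"
    return chunk

xml_payload = build_dc_record("Morning News", "NDR", "item")
chunk = axml_chunk(xml_payload)
```

A real implementation would splice this chunk into the RIFF structure of an existing WAVE file; here only the chunk itself is formed.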

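Jacob's description of XPath as a simple way of addressing nodes in an XML tree can be tried with Python's standard ElementTree module, which supports a subset of XPath 1.0 (the XPath 2.0 capabilities discussed above are a separate matter). The MPEG-7-flavored tag names below are simplified for illustration and are not the real MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

# A toy MPEG-7-flavored description; tag names are illustrative only.
doc = ET.fromstring("""
<Mpeg7>
  <Description>
    <AudioSegment id="seg1"><Title>Interview</Title></AudioSegment>
    <AudioSegment id="seg2"><Title>Jingle</Title></AudioSegment>
  </Description>
</Mpeg7>
""")

# XPath-style addressing of nodes in the tree.
titles = [t.text for t in doc.findall(".//AudioSegment/Title")]
seg2 = doc.find(".//AudioSegment[@id='seg2']")
```

Without indices, every such query walks the tree, which is exactly the scaling problem the CUIDADO work addressed for large MPEG-7 documents.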
Digital Archive Strategies and Solutions for Radio Broadcasting

In this article we review coverage of digital archiving issues for radio broadcasting from the 116th AES Convention, held in May in Berlin, Germany.

A WORKSHOP ON RADIO ARCHIVING TOOLS: STRATEGIES AND SOLUTIONS
Klaus Heidrich, chair, explained that this workshop, designed to deal specifically with the needs of radio broadcasters and originally entitled "Comparison of Existing Archiving Tools," had been given the subtitle Solutions and Strategies. He was familiar with a number of recent digital archiving projects. The successful ones always had an overall strategy designed to provide a solution to the archiving problem; this ultimately gave rise to specific tools. The field as a whole, however, did not conform to a fixed scenario but was something of a moving target.

Heidrich began the workshop by introducing the panel: Ernst Dohlus, Bavarian Radio, head of production and playout; Wingolf Grieger, Norddeutscher Rundfunk (NDR), system coordination for digital archives; Niko Waesche, IBM Business Consulting Services; Rainer Kellerhals, Tecmath AG, executive VP of product and solutions; and Karl Pieper, general manager of VCS Media Broadcasting Solutions.

INTRODUCING THE QUESTIONS
Heidrich evaluated reasons for introducing digital archives and asked some important questions that he hoped the panel would address. Commonly used terms in this field are audio archives, content management, and media-asset management, but the distinction between these terms is not always completely clear. The term essence is often applied to the content that is stored, whereas metadata is used to structure and describe the content. Information about the right to use the material completes the picture.

The task of migrating to a digital archive involves dealing with legacy storage media such as tapes and the handling of legacy database structures that require the addition of new metadata. Last but not least, there is the question of "how to get along with rights management," which is usually the responsibility of administrative departments and which, on the basis of current experience, may not always be in the form of concise software and systems specifications that can be integrated with a digital archive.

There is a wide variety of legacy solutions already in use for database management, including more recent relational databases. The question arises as to whether it is possible simply to add mass storage for digital content to existing database structures. In general, it turns out that this is not adequate. In fact, it is necessary to incorporate features that are specific to digital archives, such as browsing and coping with the range of different audio codecs involved. Interfaces to the digital archive solution are crucial to the success of a system. Business management systems for rights management also need to be integrated. A digital archive solution therefore requires a qualified integration concept to ensure that everything works together.

The wide use of digital playout systems has been the driver for incorporating IT tools into the archiving process. Heidrich asked whether we are seeking a solution that improves quality and saves money, or perhaps something to preserve existing analog assets, or maybe a business requirement to serve the next generation of digital platforms for on-demand program delivery. Or is it a market-driven change led by the advance of technology?

Digital archive projects tend to be fairly expensive, and there are at least three cost elements involved. There is the equipment, of course, but there is also the time to transfer the legacy archive. Then there is the cost of ownership over the lifetime of the system. The question remains: can we justify such an investment? Can business processes and workflow be improved, and can new business be generated as a result?

The topic of convergence has been around for years (that is, convergence between the different broadcasting and content-delivery media made possible by information technology). Is the introduction of digital archives closely related to convergence? From an operational point of view, it is interesting to consider whether the operational process is driving the solution or whether the solution is driving the process. Related to this is the question of whether an archiving solution can really be an off-the-shelf product. In particular, does an archive system have to be broadcast-specific, or can it be a generic solution?

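The three ingredients named above (the stored essence, the metadata that structures and describes it, and rights information) can be pictured as a minimal data model. The class and field names here are invented for illustration and are not drawn from any of the systems discussed.

```python
from dataclasses import dataclass, field

@dataclass
class ArchiveItem:
    # "Essence": the stored content itself (here just a file reference).
    essence_path: str
    # Metadata: structures and describes the content.
    metadata: dict = field(default_factory=dict)
    # Rights: who may use the material, and under what conditions.
    rights: dict = field(default_factory=dict)

item = ArchiveItem(
    essence_path="tape_0042.wav",
    metadata={"title": "Evening Concert", "duration_s": 3600},
    rights={"repeat_fee_required": True},
)
```

The point of keeping the three parts distinct is the one made in the workshop: rights information usually lives with administrative departments, so it must be attachable to the essence without being entangled with it.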

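Heidrich's three cost elements (equipment, the effort of transferring the legacy archive, and cost of ownership over the system lifetime) can be put into a toy model to show how the purchase price is only one term in the total. All figures below are hypothetical placeholders, not numbers from the workshop.

```python
def archive_project_cost(equipment, transfer, yearly_ownership, years):
    # Total = up-front equipment + legacy-transfer effort
    #         + ownership cost over the system lifetime.
    return equipment + transfer + yearly_ownership * years

# Hypothetical figures in Euro, for illustration only.
total = archive_project_cost(
    equipment=2_000_000,
    transfer=5_000_000,
    yearly_ownership=300_000,
    years=10,
)
```

Even in this made-up example the equipment is a minority share of the total, which is the shape of the economics the panelists describe below.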
Standards, common practice, and common middleware solutions are all issues to be considered, as are standard interfaces and metadata structures. Convergence is a thorny subject, liable to fill many workshops in its own right; however, it was considered very important. "Everything should be made as simple as possible but not simpler!"

WHY SHOULD A PUBLIC RADIO STATION ADOPT A DIGITAL ARCHIVE?
Ernst Dohlus from Bavarian Radio asked why a public radio station in Germany should adopt a digital mass-storage system and transfer its program archive to this format. Such a project is inevitably complicated and expensive. All of the companies offering content-management systems inevitably have to develop software before they can deliver products. They often run short of time and postpone delivery dates. There is no model solution for radio archives, so each project involves new software development. Radio broadcasters are often offered solutions that were originally developed for TV operations, and these are rarely suitable without considerable modification. Digital archiving solutions can therefore almost never be used right out of the box. Complex interfacing problems arise, making the introduction of these systems complicated and expensive. So why are we so attracted to such solutions, and why would we want to participate in such a risky venture? It is because the wave of technical developments drives us forward. Vintage equipment, for example tape machines, is becoming increasingly rare and difficult to maintain and operate.

Although some old tapes are gradually deteriorating, in fact most of the content in existing archives does not suffer from this problem, so media degradation is not the key driving factor. Actually, it is changes in broadcast production that are pulling us to a new solution. Journalists, editors, and producers are used to browsing for material on computers, and it is becoming common practice to find all this material online in one way or another. It is no longer considered possible to do effective production work without the introduction of a mass-storage archive for radio content. For example, in future the music industry will deliver audio files instead of CDs, and this content has to be kept in an archive unless it is only to be used temporarily.

NDR has calculated that its mass-storage project—involving 90,000 hours of old tape material and 2,000 hours of new production per year—requires an investment of around 7M Euro, as well as 10M Euro for the outsourcing of digitization. Each hour of archived material therefore costs about 180 Euro. Why should we make this kind of investment? Are we attempting to perform a cultural role, digitizing the cultural radio heritage of the nation? Probably not, considering the competitive business of radio today, but certainly we should preserve important examples from radio history. Are we digitizing in order to facilitate new business solutions, such as educational programs on demand? This may be the case for commercial broadcasters, but public radio in Germany is limited in the extent to which it can undertake such commercial operations. The copyright laws also make such enterprises complicated. For example, repeat fees to artists are common, especially for older material, when contracts did not explicitly state that material could be reused many times. No, the primary reason for moving to a digital archive is the way in which production techniques and delivery methods are changing, driven by new technology, making conventional recording equipment and techniques largely obsolete.

A SOLUTION OR THE START OF NEW PROBLEMS?
Wingolf Grieger asked the provocative question: "Is a solution actually a solution or is it the start of new problems?" The project at NDR, active since 2002, is called Digital Long Term Archive (or DELA, to abbreviate the German title). Historically, and still to some extent today, people have been employed simply to find material in the vaults and bring it out. Finding it usually involves a database of some sort. As more and more material on more channels is generated, the conventional archive finds it increasingly difficult to cope and will eventually be paralyzed by complexity. Unless the archive is digitized there will no longer be a functional archive, and a radio house without an archive "is like a kitchen without a pantry."

The question of workflow was addressed by Grieger, in the form of a question as to whether the workflow of a traditional carrier-based archive is compatible with the workflow of a digital system. Traditional material has to be stored, documented, and ultimately brought out again for broadcasting. In the digital world there is no huge vault containing tapes and disks, but a mass-storage archive of some terabytes occupying about 6 square meters. In fact almost none of the documentation at NDR had to be changed with the switch to a digital archive. They are still using the same database as before. The new system actually made the process of documentation easier.

An important test of a system is its approach to the handling of errors; this involves primarily the errors of operators as opposed to software errors. The archive number of an item, for example, acts as the unique key to the content and its location in the traditional archive. It acts in a similar way in relation to audio files in the new system. However, if an operator errs in the specification of this number, then a tape might be lost in the archive, and human ingenuity is needed to find it, either by accident or design. Such errors will always occur, and in the digital domain a tool is necessary to enable humans to intervene in similar ways, so as to be able to correct errors and search for things that may have been erroneously labeled. Grieger therefore stressed the need not to lose the possibility of human control.

THE MANUFACTURERS SPEAK
Niko Waesche from IBM wanted to reinforce the point made by Dohlus that digital archives are of vital strategic importance to the future of broadcasters. Process change in broadcasting is the key issue. Archives have to be shifted from being a "luxury item" to a core technology. However, if operating costs after the implementation of a digital archive become greater than they were before, many of the advantages will be negated. Owing to the way in which technology has changed the process of archiving, it is no longer separate from the remainder of the
broadcast production process. Because of the integration between storage and production, and because of media convergence, the concept of a separate archive is no longer particularly meaningful. While the idea of creating a cultural-heritage archive may indeed be a luxury, it can be a by-product of implementing a digital archive for process-change reasons. Radio broadcasters may be able to "hit two birds with one stone." They may also find that their overall operating costs can be reduced.

Waesche was not in favor of the so-called turnkey solution, partly because of the negative aspects of being locked into a single supplier with a niche solution. Technical support and skills could become problematic later on.

Rainer Kellerhals from Tecmath, a specialist supplier of radio asset-management systems, spoke about some of the issues that he had encountered in projects to date. Typically, he said, a key issue relates to the cataloging systems already employed by the organization, some of which have been in place for years. If the existing system is to be kept, then the question is one of how to interface to that system and how the user interface is to be integrated. If the organization chooses to replace the existing cataloging system, then the challenge is to find a suitable supplier and develop a way of migrating the metadata from one catalog to the other. One of the biggest problems, as Ernst Dohlus pointed out, is to find a suitable product that can handle the task. The market for software solutions to these issues is relatively small, probably consisting of only hundreds of companies in the world.

Audio editing systems need to be interfaced to digital archives, but editing systems have typically been stand-alone black boxes without the necessary APIs (application programming interfaces) to interface to other systems. Gradually this is changing, as manufacturers are being encouraged by users to make this more possible.

It is well worth spending additional time on a project to ensure a high level of integration and automation, as it will give benefits in the long run in terms of usability and speed of operation. The more the introduction of a digital archive goes beyond a technology-deployment issue and becomes a "change management" process the better, because changes in workflow are implied. Involving the users in the project and educating them during the process is vitally important to its ultimate success. Data entry has to be carefully considered. Repeatedly having to enter similar data is very time consuming and inefficient, so options to use master data are needed.

In the short term the benefits to the organization will tend to be relatively soft in terms of return on investment (for example, greater ease of use), but in the long term some more concrete benefits are likely to accrue. In any case, it seems there is no way of avoiding the introduction of digital archives—they are here to stay.

Karl Pieper from VCS Media Broadcasting Solutions began by speaking about the financial aspects of digital archives. Much of the work of transferring material to an archive is time-consuming and labor-intensive, and more automatic tools are required to assist in this process. In fact, the investment element of a digital archive project (that is, investment in new technology) is generally a much smaller component of the total cost than that of conversion (of old tape-based material into digital form).

The archive should really be a servant to broadcasting production rather than the other way around. The technology should not drive you into complicated changes of workflow; avoiding such changes makes the acceptance of a system much easier. If you choose the right system architecture, then a ratio of about 80% off-the-shelf products to 20% customization or adaptation should be possible. However, the environment in which an archive operates will change as the years go by (for example, new metadata structures may be introduced), so the system must be adaptable to such changes, and interfaces should be flexible enough to accommodate changes in configuration. We know that mass-storage technology will change in the future; therefore a radio archive solution really has to be independent of any particular mass-storage type or device. Standards develop slowly, and it is probably best not to wait for them but to enable the system to be adapted to new standards as they arise.

THE DELEGATES DISCUSS
During a discussion session delegates considered the topic of what to keep and what to throw away when transferring an old archive. Who will be the judge of quality? Who dares to make that decision? In fact, it might be cheaper to digitize everything rather than to employ people to decide what to keep. After having done this at NDR, some totally unexpected content has subsequently been used by producers. Some of that material would almost certainly not have been transferred if a decision had been made in advance. Once archive content becomes available in an online database, it is often surprising what people decide to use.

As time goes on, the functionality that can be delivered by out-of-the-box products is increasing, thereby reducing the amount of custom work required. Digitization on the fly, or on demand, whereby tape-based archive material is digitized only when it is requested by a producer or editor, could be quite a popular approach with broadcasters. However, in many cases this process is farmed out to agencies that might need a few weeks' notice to carry out the digitization. At NDR this process accounts for only 3 to 5% of digitization activity. Ernst Dohlus pointed out that while they knew that perhaps half of their archive might be used at some point in the future, they did not know which half. Hence, digitization on the fly might provide some clue as to the most desirable material. Nonetheless, a producer would not necessarily know what was worth using if he could not listen to it in advance; and he would not find it if it had not been digitized, which makes a strong argument for digitizing as much as possible up front. The ability to browse the archive has often provided inspiration to journalists to make a program on something they might never have thought of before. A mixed approach to digitization is needed.

A question was asked about how radio houses will continue to be able to play back tape-based material, now that tape machines are becoming obsolete. It was reckoned that in fact there are very large numbers of analog tape machines still in existence and that sufficient numbers of them can be
kept going for at least the next ten years to make this possible.

Another questioner asked about quality levels when material is digitized, and what is being done with the tapes after digitization. In response, one of the German broadcasters said that they are storing material in linear PCM format at 16 bits and 48 kHz and that currently they keep the tapes afterward. In fact, deleting a tape is more expensive than one might think, and one cannot be sure that the digitization will have been done correctly, it was suggested. Another broadcaster said that they have decided to keep the tape for two years after digitization, after which they will destroy it. One delegate questioned the quality of 16-bit, 48-kHz digital audio for the treasures that might be stored in some of these archives. The broadcasters expressed satisfaction with the current standard, however, and said that it had been decided at a point in time when the economics of storage were a key issue. The quality of analog-taped material in the archive was not so high that it pushed the limit of the digital standard employed. Of course archives have to be reliable, so that material is not lost or damaged, and they need to use a quality of mass-storage media comparable to that used in banks and other similarly secure environments. Methods of backup and automatic copying are also employed.

Summing up the workshop, Heidrich asked the panel for a final statement after having heard what all the others had to say. They agreed that the specification phase of the project is absolutely crucial and that broadcasters have to be clear about the business goals for the project to ensure that they get more honest answers from potential suppliers. Journalists at NDR really love the digital archive that has been introduced, so it has been a great success in the eyes of the production staff. The same type of questions discussed here arose when digital production and playout systems were introduced some ten years ago; these now seem to be very much the norm.

TAKING CARE OF TOMORROW BEFORE IT IS TOO LATE
Johan de Koster of Radio Netherlands and coauthor Hans, in a paper also presented at the AES 116th Convention in Berlin, suggested some new approaches to a pragmatic archiving strategy based on case studies of real projects. They began with some striking statistics suggesting that by 2020 it will be possible to store 1.4 million hours of audio content online for a cost of less than 100 Euros. One petabyte (one thousand terabytes) of storage will hold 165 years' worth of continuous broadcast material encoded at 16 bits, 48 kHz. To store the equivalent in analog-tape form would require 2.5 million items spread over 80 km of shelves.

Echoing the comments of one of the workshop panelists noted earlier, these authors suggest that archivists need to move out of their traditional environment—the basement archive—and into the production space. Integrating the archive with the production workflow ensures that metadata are properly and quickly collected as part of the production process and that search and retrieval are made easier. In fact, it may even be possible to generate some metadata automatically in the future, thanks to the development of systems that analyze the audio material for speech and other content, generating metadata to enable the cataloging and searching of radio content. An example of this is described in a separate paper by Löffler et al., "Automatic Extraction of MPEG-7 Audio Metadata Using the Media Asset Management System iFinder," from the AES 25th International Conference.

Although many institutions regard archiving onto CD as the most cost-effective approach, the handling of these media is an expensive operation. For example, the CD has to be burned, labeled, barcoded, and shelved. Consequently the archiving of one year's worth of audio costs nearly 40,000 Euros, of which 80% is labor. The corresponding storage in an integrated online system using hard drives amounts to about half that amount, where the handling operations are limited to triggering file copies. The cost of the equivalent amount of analog tape storage is close to ten times the cost of CD storage, even if the space

"…riers" are the way of the future. In other words, the search for the ideal long-term physical carrier for archiving is a relatively fruitless exercise, because new media continue to come along. An archive becomes a logical space that is independent of the production environment, whose physical attributes may need to change as time passes.

One of the main problems of digitization is when and where to start. New material continues to be added to the archive while we are digitizing the older material. Considering what should be digitized and when, another question raised in the workshop, Hans and De Koster referred to the experience at CBC in Canada, where it had been found that 50% of requests were for material that had been broadcast in the preceding 12 months, so the organization stopped adding tapes to its archives and instead began by digitizing its ongoing broadcasts. The point they make is that such an approach gives rise to a fast return on investment. Digitizing current programs requires much less expenditure than transferring past recordings.

The digitization of archives can also lead to new business opportunities and increased use of the archived material, as suggested by workshop panelists. For example, at RAI in Italy the video archive experienced an 85% increase in use after digitization. Furthermore, a number of new digital television channels dedicated to historical recordings emerged.

Based on this workshop we can see that the digitization of radio archives is here to stay. It is not a question of if but when. Many radio houses have already embarked upon this operation, and more systems are available. The increased production flexibility and business opportunities that arise help to justify the high costs of such projects.

Editor's note: The two papers mentioned in this article—and all other AES Journal, convention, and conference papers—are now available for online purchase. Go to http://www.aes.org/journal/search.cfm for Journal articles or http://www.aes.org/publications/
A paper (AES number 6009) by Nico- could be found. The key to the preprints/search.cfm for convention
las Hans of Dalet Digital Media and authors’ argument is that “virtual car- and conference papers.
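The storage economics reported above can be restated as a quick back-of-envelope calculation. In the sketch below the euro figures are those quoted in the workshop report, while the audio format (16-bit stereo PCM at the 48-kHz broadcast standard, running around the clock) is my own assumption for illustration:

```python
# Back-of-envelope restatement of the archive storage economics above.
# Euro figures are the ones quoted in the workshop report; the audio
# format (16-bit stereo PCM at 48 kHz, 24-hour operation) is assumed.

rate_bps = 48_000 * 2 * 16                  # bits per second of linear stereo PCM
bytes_per_hour = rate_bps // 8 * 3600       # 691,200,000 bytes, about 0.69 GB/hour
tb_per_year = rate_bps / 8 * 3600 * 24 * 365 / 1e12   # about 6.05 TB per year

cd_cost = 40_000                            # EUR to archive one year onto CD
cd_labor = 0.8 * cd_cost                    # 80% of that is handling labor
online_cost = cd_cost / 2                   # integrated hard-drive system: about half
tape_cost = 10 * cd_cost                    # analog tape: close to ten times CD

print(f"{tb_per_year:.2f} TB/yr, labor share {cd_labor:.0f} EUR")
```

On these assumptions a year of broadcast audio is only a few terabytes, which makes the labor component, rather than the raw media, the dominant cost in the CD approach.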

1184 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November

AES Officers 2004-2005

PRESIDENT

THERESA LEONARD is the director of audio for music and sound at The Banff Centre. She is responsible for overseeing the audio work/study program and directing activities at the centre's extensive audio facilities. Her work spans many aspects of audio production, administration, and engineering, including both studio and live recording and postproduction in a variety of musical genres, as well as audio for video postproduction. As director of The Banff Centre's audio education program, she works closely with top industry personnel, who serve as faculty members and guest lecturers.

Leonard holds bachelor's degrees in music and education, and a master's degree in music from McGill University, where she was enrolled in the sound recording program. Her thesis, "Time Delay Compensation of Distributed Multiple Microphones in Recording: An Experimental Evaluation," was later transcribed into an AES paper and presented in New York City at the 95th Convention in 1993. Trained as a classical pianist, she previously taught music in French and English schools in eastern Canada, worked as an audio postproduction engineer for a Canadian TV series, and as an audio engineer and instructor at the University of Iowa School of Music. She is the regional representative for the Alberta Recording Industry Association, a past member of the AES Board of Governors, and founder and chair of the AES' Alberta Section. She served on the executive committee of the AES convention in Los Angeles in 2002 and as chair of the AES conference on multichannel audio at The Banff Centre in June 2003.

PRESIDENT-ELECT

NEIL GILCHRIST joined the BBC after graduating from Manchester University in 1965 with a B.Sc. honours degree in physics and electronic engineering. As a BBC engineer, he worked on broadcast audio, PCM for national radio distribution, and NICAM for television sound. He participated in the EUREKA 147 (Digital Audio Broadcasting) project, and toward the end of his BBC career led the European ACTS ATLANTIC project to a successful conclusion in its final year. From 1981 to 1996 he represented the UK in the former CCIR, including chairmanship of CCIR Interim Working Party 10/6 (international exchange of sound programs). He represented the BBC on Sub-group V3 (Sound) of the EBU, and served on both the AES and the EBU groups, which prepared the specification for the AES/EBU digital audio interface. His AES activities have included frequent contributions to papers and workshop sessions at AES conventions, and chairmanship of the British Section in 1990/91. In 1995 the AES awarded him a fellowship for his contribution to digital audio technology and AES standards activities. He also served as governor from 1999 to 2001. With Christer Grewin of the Swedish Broadcasting Corporation he assembled and edited the AES special publication Collected Papers on Digital Audio Bit-rate Reduction. Neil left the BBC in 2002 to work as a consultant in audio and broadcasting. An offshoot of his consultancy work is a recording service for musicians and societies in his area. He has just completed two CD masters for a mechanical musical instrument museum.

SECRETARY

HAN TENDELOO was born in Amsterdam, the Netherlands, in 1936. He received his master's degree in electrical engineering from the Technical University of Delft in the Netherlands, with a specialization in semiconductors. He has been employed by Philips-related companies such as PolyGram and PDO in the fields of recording, duplication, replication, and product development and marketing: LP, MC, VLP, CD, CD-i, CD-Video, DCC, and packaging. He is coinventor of the CD jewel box. He was a long-time chair of NEC TC60 (IEC Audio and Video-Recording Standardization) and a member of the Society of Motion Picture and Television Engineers (SMPTE).


NEW OFFICERS

After his retirement he freelanced for Philips and the International Federation of the Phonographic Industry (IFPI) in London. An AES member since the mid-60s, he has held the following AES offices: vice president, Northern Region, Europe; governor; vice chair of the Standards Committee, Europe Region; chair and member of the Publications Policy Committee; convention chair; convention vice chair; and convention program coordinator. He was awarded a fellowship in 1977 and has received three Board of Governors Awards. His focus in recent times is on improvement of information to the membership about upcoming AES conventions by introducing bar-graph convention calendars, comprehensive semi-interactive convention Web sites, and detailed on-site convention planners.

TREASURER-ELECT

LOUIS FIELDER received a B.S. degree in electrical engineering from the California Institute of Technology in 1974 and an M.S. degree in acoustics from the University of California in Los Angeles in 1976. Between 1976 and 1978 he worked on electronic component design for custom sound-reinforcement systems at Paul Veneklasen and Associates. From 1978 to 1984 he was involved in digital-audio and magnetic recording research at Ampex Corporation. At that time he became interested in applying psychoacoustics to the design and analysis of digital-audio conversion systems. Since 1984 he has worked at Dolby Laboratories on the application of psychoacoustics to the development of audio systems and on the development of a number of bit-rate reduction audio coders for music distribution, transmission, and storage applications. He has also investigated perceptually derived limits for the performance of digital-audio conversion and low-frequency loudspeaker systems. Currently, he is working in the area of acoustics for small to medium-sized rooms. An AES fellow, he served on its Board of Governors from 1990 to 1992, and was AES president from 1994 to 1995. As president and president-elect, he worked very closely on financial concerns with the treasurer. He has also been involved with AES digital audio measurement standards from 1984 to 1998. Since 1994 he has been on the Publications Policy Committee to promote the use of electronic media for AES publications.

GOVERNORS

RONALD AARTS was born in 1956 in Amsterdam. He received a B.Sc. degree in electrical engineering in 1977, and a Ph.D. from Delft University of Technology in 1994. In 1977 he joined the optics group of Philips Research Laboratories, Eindhoven. Until 1984, his primary contributions were in the fields of servos, signal processing for video long-play players, and Compact Disc players. In 1984 he joined the acoustics group of the Philips Research Laboratories and was engaged in the development of CAD tools and signal processing for loudspeaker systems.

In 1994 he became a member of the DSP group, where he studied the improvement of sound reproduction by exploiting DSP and psychoacoustical phenomena. He currently holds the position of research fellow. He has published more than 120 technical papers and reports, and holds over a dozen U.S. patents, while another 40 are pending. He has been a member of the organizing committee and chair of various conventions. Currently he is the chair of AES' Technical Committee on Signal Processing and a reviewer for the AES Journal. He is a senior member of the IEEE, NAG (Dutch Acoustical Society), and ASA (Acoustical Society of America). He has been a member of the Dutch AES committee in various positions, recently as chair. He was the papers chair for the 104th, 110th, and 114th conventions in Amsterdam in '98, '01, and '03, and AES governor from 1999 to 2000. Aarts was made a fellow of the AES in 1998 for major contributions to sound reproduction and assessment.

ULRIKE KRISTINA SCHWARZ started pursuing a career in the music industry with classical piano training at the Richard Strauß Conservatory, Munich, Germany. In addition to the Tonmeister program at the University of the Arts Berlin (UdK) and the Technical University Berlin, which she entered in 1994, she expanded her knowledge by taking part in the Summer Performance Program at the Berklee School of Music, Boston, MA, USA. A scholarship for a six-month work-study with acclaimed jazz recording engineers brought her to New York City. There she established contacts with the major recording facilities of New York and was involved in productions for all major jazz labels, including artists like Joe Henderson and Horace Silver, and on the classical side, Lorin Maazel and Itzhak Perlman. Several of these productions have received Grammy nominations or awards. In 2000 Schwarz graduated from the UdK Berlin with the Tonmeister-Diplom, the equivalent of a double M.A. and M.Sc. in classical music production and recording science in the U.S. In 2001 she joined the TV department of Bayerischer Rundfunk, Munich, Germany, as video and sound engineer. During this engagement she recorded Yale University's Chamber Music Festival 2002, featuring the Tokyo String Quartet. Since 2003 she has been a sound engineer for studio and remote productions in BR's radio department.

She was chair of the Student Delegate Assembly, Europe/International Regions, in 1999, and facilities assistant at the conventions in New York in 2001, Munich 2002, New York 2003, and Berlin 2004, with increasing involvement in education events and section activities.


JOHN VANDERKOOY was born in 1943 in Maasland, the Netherlands. He emigrated to Canada with his family at an early age. All of his education was completed in Canada, with a B.Eng. degree in engineering physics in 1963 and a Ph.D. in physics in 1967, both from McMaster University in Hamilton, Ontario. After a two-year postdoctoral appointment at the University of Cambridge in the UK, he went to the University of Waterloo. For some years, he followed his doctoral interests in high magnetic-field, low-temperature physics of metals. His research interests since the late 1970s, however, have been mainly in audio and electroacoustics. He is currently a full professor of physics at the University of Waterloo. Over the years he has spent sabbatical research leaves at the University of Maryland, Chalmers University in Gothenburg, the Danish Technical University in Lyngby, the University of Essex in the UK, the Bang & Olufsen Research Centre in Struer, Denmark, and Philips National Labs in Eindhoven, the Netherlands. Vanderkooy is a fellow of the AES and a recipient of its Silver Medal and several Publication Awards. Over the years he has contributed a wide variety of technical papers in such areas as loudspeaker crossover design, electroacoustic measurement techniques, dithered quantizers, and acoustics. Together with his colleague Stanley Lipshitz and a number of graduate students, he forms the Audio Research Group at the University of Waterloo. His important contributions were papers on dither in digital audio and MLS measurement systems. He brings an academic point of view to the AES.

FACULTY VACANCY ANNOUNCEMENT: Music Business Production Emphasis.

RANK AND SALARY: Assistant Professor/Associate Professor (dependent upon qualifications and experience).

RESPONSIBILITIES: Teach courses in music recording production and audio engineering technology. Specifically, beginning, intermediate, and advanced courses in studio recording theory, history, and practice. Typical load is 24 hours per year (four class sections per semester) plus student advising. Classes may include topics in studio, mastering, postproduction audio for video, or concert remote recording. This position involves a full-time commitment to teaching. The base contract is a ten-month cycle, with an additional summer teaching option available.

QUALIFICATIONS: Teaching experience and a Master's degree in a related discipline with current or future pursuit of a Doctorate preferred. Experience and progress toward a terminal degree may be considered. Experience with studio record production and session procedures, professional experience with commercially released credits, and demonstrated ability to communicate and work as part of an accomplished team are required. Must possess comprehensive knowledge of microphone design and studio and concert recording techniques, and must have historical as well as functional and theoretical knowledge of both analogue (Neve and SSL console operations; Studer and Otari 2-inch machine alignment and operations, including synchronization procedures) and digital recording technology (specifically ProTools, Nuendo, Sony DASH, and Otari RADAR HD systems).

BELMONT UNIVERSITY: A coeducational university located in Nashville, TN, Belmont is a student-centered, teaching university focusing on academic excellence. The university is dedicated to providing students from diverse backgrounds an academically challenging education in a Christian community, and is affiliated with the Tennessee Baptist Convention.

THE MIKE CURB COLLEGE OF ENTERTAINMENT AND MUSIC BUSINESS: Located near Nashville's dynamic Music Row, the Mike Curb College of Entertainment and Music Business enrolls 900+ majors and combines classroom experience with real-world applications. The curriculum comprises a BBA with emphasis areas in Music Business and Music Production. Facilities feature eight state-of-the-art recording studios, including the award-winning Ocean Way Nashville studios, historic RCA Studio B, and the state-of-the-art Robert E. Mulloy Student Studios in the Center for Music Business.

APPLICATION PROCESS: Candidates are asked to respond to Belmont's mission, vision, and values statement in a written statement articulating how the applicant's knowledge, experience, and beliefs have prepared them to function in support of that statement. Send a letter of application including a statement of personal educational philosophy, a complete resumé/curriculum vitae, and contact information for at least three references to:

Dr. Wesley A. Bulla
Associate Dean
Mike Curb College of Entertainment and Music Business
1900 Belmont Blvd.
Nashville, TN 37212

APPLICATION DEADLINE: Review of applications will begin immediately.

BELMONT IS AN EOE/AA employer under all applicable civil rights laws. Women and minorities are encouraged to apply.

The preceding biographies were provided by the respective officers. While the Audio Engineering Society believes them to be accurate, it cannot assume responsibility for accuracy or completeness.


NEWS OF THE SECTIONS

We appreciate the assistance of the section secretaries in providing the information for the following reports.

Hollywood Bowl

Los Angeles Section visits newly refurbished Hollywood Bowl in July. Photo by Mel Lambert/content-creators.com

On July 28, 90 members of the Los Angeles Section met at the newly refurbished Hollywood Bowl, where participants received an insider's look at the new digital mixing and LCR line-array sound system. The Hollywood Bowl is an impressive outdoor venue that poses a number of creative challenges to the acoustic designer. With a seating capacity of 17,500 within a large natural amphitheater that stretches 450 feet from an acoustical shell, the venue's primary role is classical music. The Bowl serves as the summer home of the Los Angeles Philharmonic. However, it also needs to be versatile enough to accommodate jazz, opera, and popular music, in addition to leased events.

Demolition of the old structure began last October so that the new system would be ready for use in late June 2004. The new shell, video and lighting hardware, technical infrastructure, and line-array sound system cost a reported $18 million. At the heart of the new sound system is an L-Acoustics V-DOSC/dV-DOSC rig powered by L-Acoustics/Lab Gruppen power amplifiers and user-programmable Contour systems from Lake Technology. Yamaha digital consoles at the front-of-house and stage-monitor positions cover all sound mixing duties.

AES participants split into groups in order to rotate through four specially organized lecture stations, where representatives from hardware manufacturers, as well as members of the Bowl's sound department, provided insight into the workings of the new systems. The group thanked Fred Vogler, Los Angeles Philharmonic sound designer, and Michael Cooper, head of Hollywood Bowl's Audio/Video Department, who outlined the Bowl's design criteria and the roles that these individuals played in defining the various operational and technical parameters. Marc Lopez, a Yamaha Corporation product manager, described the use of front-of-house assignable PM1D/DM2000 digital consoles and the stage-left PM1D monitor console, together with the locations of DSP engines for short analog cable runs plus direct digital interface to the Contour systems.

Paul Freudenberg, L-Acoustics U.S. sales and marketing director, and Bernie Broderick, director of technical services, detailed the design of the new multi-element left/center/right line-array system, including component and cabinet layout. Bruce Jackson, senior vice president with Lake Technology, described the use of Contour Digital Loudspeaker Processors to control all equalization, dynamics, delay, and overall signal conditioning of the flown systems, as well as a direct AES/EBU-format digital interface from the mixing consoles and signal distribution to the various power amp racks.

The group thanked four AES Executive Committee members: Joe Carter, Bob Lee (with last-minute substitute Bill Hogan), Phil Richards, and Bejan Amini, who served as Bowl stewards, shepherding tour groups between the various areas.

As an extra treat, the tour concluded with a concert by the Count Basie Orchestra, which this year is celebrating the 100th birthday of "the kid from Red Bank."

Mel Lambert

McGill Students Meet

One hundred and twenty people gathered on January 9 for the McGill University Student Section meeting in Clara Lichtenstein Hall at the Strathcona Music Building in Montreal. Bob Ludwig, renowned mastering engineer, was the featured speaker. The Centre for Interdisciplinary Research in Music Media and Technology sponsored the event.

Ludwig spoke to the standing-room-only crowd about his mastering facility, Gateway Mastering in Portland, Maine. He addressed the issues involved in building such a facility. He showed pictures of the studios and discussed the rooms, wiring, acoustics, and equipment, among other important elements essential to a mastering facility. Audience members asked Ludwig's opinion on issues such as DVD versus SACD, the use of compression in pop music, and the general state of the industry.

In addition to this appearance, Ludwig also gave a private lecture to the McGill AES students. During this session, Ludwig talked about a DVD art and music project mixed in 5.1 that Gateway had worked on with the composer Steve Reich. Students were also given the unique opportunity to have Ludwig critique their material.

The section was grateful to Ludwig for taking time out of his busy schedule to come to Montreal. This event also helped generate a lot of publicity for the Audio Engineering Society among Montreal's professional audio community.

Time Traveler

On February 9, members of the McGill University Student Section met at the Strathcona Music Building in Montreal to hear Jamie Howarth, president of Plangent Processes, talk about a hardware/software solution for audio restoration and analog recording.

Plangent Processes has developed a processor called "Time Traveler" that removes speed variations from analog recordings as they are digitized. The processor corrects speed and pitch variations known as wow and flutter using a wideband reproducer technology, whereby artifacts of the recording process are extracted and conditioned. These artifacts are then converted into a pseudoclock source. Using this clock information, the audio is reconfigured in DSP with a unique application of Irregularly Spaced Sampling Theory. In so doing, the recorded material provides the information necessary to counteract the mechanical defects inherent in the analog recorder/reproducer as manifested during the original analog recording and subsequent playback.

Howarth provided a complete description of the steps necessary to obtain results with Time Traveler. He also had samples from familiar master tapes, charts, and other analytical data, which he used to demonstrate the process. The conclusion was that the unit helps reduce FM distortion by as much as 30 dB.

April Cech

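The wow-and-flutter correction described in the Time Traveler report amounts to resampling the digitized transfer on a non-uniform time grid recovered from the tape itself. The sketch below illustrates only that general idea: the Plangent clock-recovery process is proprietary, so the per-sample timing (`t_actual`) is simply simulated here, and the parameters (0.1% speed wobble at 2 Hz, a 1-kHz test tone) are invented for the demonstration.

```python
import numpy as np

# Generic sketch of wow/flutter correction by non-uniform resampling.
# t_actual would normally be recovered from artifacts of the recording
# process; here it is simulated with an assumed 0.1% speed wobble at 2 Hz.

fs = 48_000                                         # transfer sample rate, Hz
n = np.arange(fs)                                   # one second of samples
speed = 1 + 0.001 * np.sin(2 * np.pi * 2 * n / fs)  # instantaneous tape speed
t_actual = np.cumsum(speed) / fs                    # program time each sample really represents

tone = np.sin(2 * np.pi * 1000 * t_actual)          # 1-kHz tone as the wobbly deck delivered it

# Resample onto a uniform grid using the recovered timeline:
t_uniform = t_actual[0] + np.arange(fs - 200) / fs
restored = np.interp(t_uniform, t_actual, tone)

raw_err = np.max(np.abs(tone - np.sin(2 * np.pi * 1000 * n / fs)))
fixed_err = np.max(np.abs(restored - np.sin(2 * np.pi * 1000 * t_uniform)))
# raw_err is on the order of 1; fixed_err is smaller by orders of magnitude
```

Even this crude linear interpolation removes most of the pitch error once the true sample times are known; the hard part, which the sketch sidesteps, is recovering that clock from the tape.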

DC Hears about Ethernet

Ray Rayburn of Peak Audio of Boulder, Colorado, spoke about "Audio over Ethernet" to the DC Section on August 16. Fourteen members came to the National Public Radio headquarters to hear his talk. Peak Audio, a subsidiary of Cirrus Logic, licenses its CobraNet approach to delivering high-quality audio over Ethernet systems. Rayburn typically wears two hats: that of system designer and technical support. On this occasion he wore a third hat: that of educator.

The first question from members concerned the derivation of the name CobraNet. Rayburn explained that the name derives from the interest in Shelby Cobra race cars that an investor in Peak Audio had, which resulted in the original #1 car being exhibited in the Peak Audio booth at a Las Vegas trade show.

Ethernet and CobraNet

Rayburn's presentation concentrated on why audio should be transferred over Ethernet systems, and the advantages the CobraNet technology brings. He first reviewed the history of Ethernet's evolution from the original sketch of Xerox's Metcalf to today's configurations, which provide for the transport and distribution of multichannel audio and control data over modern Ethernet networks. Over 40 manufacturers have selected Peak Audio's CobraNet, making it the most popular real-time audio-over-Ethernet technology.

Two major advantages of using Ethernet for audio distribution are reliability and installation cost. Rayburn explained that high levels of connectivity are complex when using traditional point-to-point wiring, requiring trained professionals for proper installation using expensive audio cabling along with patch bays and switching. On the other hand, Ethernet wiring using CAT-5 unshielded twisted-pair cabling (or fiber optics for longer distances) is inexpensive and can easily be installed by electricians. Patch bays and complex switching are replaced by software commands and user-friendly graphics. For example, any audio input can be routed to any audio output in any combination, and real-time system adjustments and configurations can be easily performed. The digital transmission is based on isochronous data streaming, which guarantees real-time audio transfer of up to 64 channels of 24-bit audio at 48 kHz within 5 1⁄3 ms in each direction on a single 100-Mbit Ethernet link.

Rayburn passed around a small Ethernet box with the question "what is missing on this box?" Besides Ethernet connectors, the box contained microphone input and loudspeaker output jacks. Missing was any power connection. This box represented the first standards-compliant production hardware that exploits power over Ethernet. It can drive a loudspeaker in an Ethernet audio distribution system, eliminating separate power amplifier and AC voltage connections, yet retaining full system addressability.

There was much discussion during

and after Rayburn's presentation. A topic of concern among the members was the time delay through a CobraNet system (5 1⁄3 ms) and whether the relative time delay between channels is sufficiently small to preserve proper channel-to-channel coherence in a 5.1 multichannel program (it is).
Fred Geil
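The CobraNet figures quoted in the presentation can be sanity-checked with a little arithmetic. The sketch below counts audio sample bits only and ignores Ethernet framing and CobraNet packet overhead, which is precisely why the headroom below 100 Mbit matters:

```python
# Audio payload implied by the CobraNet figures quoted above.
# Sample bits only; Ethernet framing and packet overhead are ignored.

channels, bits, fs = 64, 24, 48_000
payload_mbps = channels * bits * fs / 1e6   # 73.728 Mbit/s of raw audio
latency_samples = fs * 16 // 3000           # 5 1/3 ms = 16/3000 s -> 256 samples

print(payload_mbps, latency_samples)
```

The raw payload of 73.728 Mbit/s fits a 100-Mbit link with room left for overhead, and the 5 1⁄3-ms latency works out to a convenient power-of-two buffer of 256 samples at 48 kHz.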

Thiele in India

Neville Thiele presents two technical papers at India Section meeting.

The India Section met on Monday, July 19, at the Ramee Guestline hotel, Juhu, Mumbai, to listen to a presentation of two technical papers by AES international vice president Neville Thiele. Before the meeting, he had a one-on-one conversation with section committee members. The venue was packed to capacity, with almost every member present to listen to a legend in our field of work.

The international vice president was very pleased with the good work being carried out by the India Section and recommended that all sections follow their example. Besides encouraging scientific research in the field of audio engineering, the section has been building awareness among the general public about the misconceptions associated with audio and loud music, along with informing them on the finer technical aspects of modern audio systems and the fascinating world of recording sound in a studio.

The two papers presented were "The Dynamics of Reproduced Sound" and "The Thiele-Small Parameters for Measuring, Specifying and Designing Loudspeakers."

Thiele explained how, in the more than 100 years since sound was first recorded in a form that could be replayed, the quality of reproduction has improved dramatically. Nevertheless, sound reproduction, although better in many respects, still too often retains the problem of restriction of dynamic range. The ways in which this restriction sometimes enhances but very often impairs the quality of sound reproduction were explored and explained in lucid detail. Now, more than 40 years since they were first published, the T-S parameters continue to prove extremely useful.

Realizing he had lost track of the time, Thiele had to abruptly conclude his presentation in order to catch a flight back to Australia. Thiele promised section members to follow up on his all-too-brief visit with another visit in the near future. The meeting ended with a sumptuous Indian feast in honor of Thiele's visit.

Russell A. Corte-Real

Solid State Power Amps

To kick off the fall season, the Penn State Section met on September 9 to hear Eli Hughes speak on Modern Solid State Power Amps: Concerns and Implementation. After some opening remarks by Dan Valente, Hughes began his talk.

Hughes covered some general topics, such as why build your own amp, what should be understood before building one of your own, and what the basic idea of an amp is. Overall it was very informative for those just getting started in the area. Hughes also described his method of design: Load to Line. He advocates building the amp for what you want to use it on.

Schematics

When the basics of amps had been discussed, he continued with a brief discussion of the schematics of his Linear Single Channel 50 W amp. At the end of his talk, he provided some useful resources for those interested in learning more about amps. On the Web visit http://www.diyaudio.com and http://www.passdiy.com, and for a place to order boards http://www.4pcb.com.

Effects of Digital TV

Jim Hilson of Dolby Labs visited the Nashville Section on June 29 to talk about the effects of digital television transmission on the delivery of multichannel sound.

Hilson discussed various logistical and physical problems involved in delivering multichannel sound with the new Digital TV standard. He cited many specific examples from the summer Olympics in Athens, Greece, in order to demonstrate the complexity of keeping audio and video in sync with various processing taking place up and down the transmission path.

Jim Ferguson, chief engineer at WNPT, the local PBS affiliate, said that the station is not yet using the additional bandwidth of the digital TV broadcast frequency for HD programming. However, they are experimenting with the multicast delivery of specialized programming to private sectors of the community while delivering standard-definition programming that mirrors the analog transmission over the digital channel.

J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1191
SOUND TRACK

Upcoming Meetings

2005 April 18-21: International Conference on Emerging Technologies of Noise and Vibration Analysis and Control, Saint Raphael, France. E-mail: goran.pavic@insa-lyon.fr.

2005 May 16-20: 149th Meeting of the Acoustical Society of America, Vancouver, British Columbia, Canada. ASA, Suite 1NO1, 2 Huntington Quadrangle, Melville, NY 11747, USA. Fax: +1 516 576 2377; Web: http://www.asa.aip.org.

2005 May 28-31: 118th AES Convention, Barcelona, Spain. See page 1208 for details.

2005 July 7-9: 26th AES Conference, "Audio Forensics in the Digital Age," Denver, CO, USA. See page 1208.

2005 October 7-10: 119th AES Convention, Jacob K. Javits Convention Center, New York, NY, USA.

MAGNETIC RECORDING: The Ups and Downs of a Pioneer
The Memoirs of Semi Joseph Begun
Soft cover. Prices: $15.00 members, $20.00 nonmembers.
AUDIO ENGINEERING SOCIETY, Tel: +1 212 661 8528 ext 39; e-mail Andy Veloz at AAV@aes.org; http://www.aes.org

ABOUT COMPANIES…

AES sustaining member Solid State Logic of Begbroke, UK, has named AudioPro International, Inc. of Toronto, Canada, the exclusive distributor of the new SSL AWS 900 Analogue Workstation System for the Canadian market.

The advance of Pro Tools and other such recording and editing systems used in many facilities today makes the AWS 900 suitable for bridging the gap between a simple digital control unit and the high-end sound of an SSL SuperAnalogue console. The AWS 900 provides the dual benefits of a fully featured SuperAnalogue signal path coupled with a DAW controller at a lower price point. According to the principal partner of AudioPro International, this kind of solution suits the Canadian market, which has many smaller facilities that cannot afford a full-blown 9000 J or K Series.

The representatives of AudioPro International, Inc. specialize in surround sound applications and bring a combined 60 years of experience to the audio sales field. AudioPro will be setting up a demo facility for the SSL AWS 900 workstation in the near future.

SRS Labs, Inc., of Santa Ana, California, a sustaining member of the AES, has announced that 38 member stations of JFN Network, Japan's leading broadcast network association of radio stations, aired Japan's first surround sound soccer broadcast on August 1.

The participating stations broadcast the Kashima Antlers vs. FC Barcelona game in full 5.1 surround using SRS Labs' Circle Surround (CS) technology. Tokyo FM broadcast the soccer game and encoded the surround mix live into the standard two-channel broadcast format using SRS Circle Surround. The signal was then broadcast to over 100 million soccer fans in Japan via the JFN network. Listeners with a multichannel home theater or automotive system heard lifelike surround sound, while those listening over regular two-loudspeaker stereo systems heard the broadcast in enhanced stereo.

Circle Surround is a patented multichannel audio encoding and decoding technology capable of supporting a wide range of surround sound creation and playback applications. CS hardware and software encoders can encode up to 6.1 channels of discrete audio for distribution over existing two-channel carriers such as broadcast television, cable and satellite transmission, streaming media over the Internet, CDs, and VHS tapes.

AES sustaining member Klipsch Audio Technologies of Indianapolis, Indiana, has reached an agreement with Oxmoor Corporation LLC of Birmingham, Alabama, making Klipsch the exclusive global distributor of ZON Whole House Digital Audio products. The announcement came just before Klipsch debuted 50 of its own new high-performance loudspeaker products geared toward the custom installation market, 28 of which were on display recently at the Custom Electronic Design and Installation Association (CEDIA) Expo in Indianapolis.

According to Paul Jacobs, Klipsch president, Klipsch has experienced significant growth in the retail, commercial, multimedia, and professional cinema segments of the audio business over the past five years. The ZON alliance will now allow Klipsch to make an even greater impact on the residential contracting market.

Oxmoor's award-winning ZON Digital Whole House Audio System features a stylish, easy-to-use control with a 60-watt integrated amp, analog and digital audio inputs, IR routing, RS-232 control, source selection, EQ, balance, paging, and other advanced features. ZON received honors for this system last year from the Consumer Electronics Association (CEA) as a "Mark of Excellence Award Winner."

NEW PRODUCTS AND DEVELOPMENTS

Product information is provided as a service to our readers. Contact manufacturers directly for additional information, and please refer to the Journal of the Audio Engineering Society.

AES SUSTAINING MEMBER
DIGITAL RADIO TUNER is designed for use with Kenwood Excelon™ and Kenwood in-dash DVD and CD receivers. Model KTC-HR100 works with more than two dozen 2003 and 2004 Kenwood models. WHUR-FM, at Howard University, is now broadcasting with HD Radio™ technology, making it the first commercial station to bring digital radio to the Washington, D.C. metropolitan area. The station is broadcasting signals with a Harris ZDD64HDC 28 000-W solid-state FM digital broadcast transmitter using the Harris DEXSTAR™ HD Radio exciter. The Harris equipment transmits the HD radio audio and data created by software developer iBiquity Digital. These combined technologies provide a platform for integrated wireless data services that deliver a variety of additional information via scrolling text. Kenwood USA Corporation, P. O. Box 22745 MDS, Long Beach, CA 90801, USA; tel. +1 310 639 4200; fax +1 310 604 4487; Web site www.kenwoodusa.com.

AES SUSTAINING MEMBER
MULTICHANNEL AD/DA CONVERTER adds a Firewire (IEEE 1394) interface module that is compatible with the latest Apple OS X operating system. Support is also planned for Windows XP. The new module allows the ADA-8 to operate with software such as Emagic Logic Audio V6, Apple's Final Cut Pro, and many other applications. In this configuration, the ADA-8 can provide eight channels of AES/EBU digital and analog input and output at sample rates up to 96 kHz. The converter is compatible with a wide range of PC- and Mac-based recording, editing, and sequencing systems including Digidesign Pro Tools Mix and Pro Tools HD, Logic Audio, Final Cut Pro, and many others. In addition to Firewire, ADA-8 interfaces include AES, Pro Tools MIX/HD, and DSD. Connection to Pro Tools is direct to the Mix or Core card, eliminating the Pro Tools I/O hardware. Prism Media Products Inc., 21 Pine Street, Rockaway, NJ 07866, USA; tel. +1 973 983 9577 (US); fax +1 973 983 9588; in UK: tel. +44 1223 424 988; fax +44 1223 424 023; e-mail sales@prismsound.com; Web site http://www.prismsound.com.

PASSIVE IN-LINE AUDIO LEVEL CONTROLLER is designed to control mono/stereo audio signals from powered monitors and amplifier units that have no output control, such as a series of microphone preamplifiers, power amplifiers, or loudspeakers. The new model ATTY controller has two Neutrik combo 1/4-inch XLR input jacks and two balanced output XLRs. Features include a level control knob and a mute switch for those moments when an immediate response is required. The mute switch operates as a "panic button," shorting the signal until full control is regained. The unit weighs 2.5 pounds and measures 2 x 3 x 5 inches for easy placement. A Designs Audio, P. O. Box 4255, West Hills, CA 91304, USA; tel. +1 818 716 4153; fax +1 818 716 4153; e-mail sales@adesignsaudio.com; Web site http://www.adesignsaudio.com.

AES SUSTAINING MEMBER
RECHARGEABLE BATTERY PACK powers the Portadisc for approximately three and one half hours. As a single, sealed unit, the MDPBP battery pack does not suffer the problems that can occur with caddies containing individual AA-sized cells, in which batteries can be inserted incorrectly or rechargeable and alkaline batteries may be inadvertently combined. Designed and engineered for the most demanding conditions, the MDPBP is both short-circuit and temperature protected, with all internal metal contacts securely welded for long-term reliability. Also new from the company is the ACS110, a microprocessor-controlled charger specifically developed for use with the MDPBP battery pack. An ACS110 charger and two MDPBPs will supply a Portadisc with continuous battery power. Additional ACS110 features include a discharge function for effective management of the MDPBP pack and comprehensive LED metering of the charging functions. Sennheiser Electronic Corporation, 1 Enterprise Drive, Old Lyme, CT 06371, USA; tel. +1 860 434 9190; fax +1 860 434 1759; Web site www.sennheiserusa.com.

AVAILABLE LITERATURE

The opinions expressed are those of the individual reviewers and are not necessarily endorsed by the Editors of the Journal.

IN BRIEF AND OF INTEREST…

Building Valve Amplifiers, by Morgan Jones (Newnes/Elsevier), is a practical guide to building, modifying, troubleshooting, and repairing valve amplifiers. Valuable to anyone working with tube audio equipment, this 354-page guide provides a hands-on approach to valve electronics, both classic and modern, and describes the technology using a minimum of theoretical language. Chapters include Planning, Metalwork for Poets, Wiring, Testing, Faultfinding to Fettling, and Performance Testing. Topics covered include chassis layout, safety and drilling, earthing, layout of components, test equipment, linear distortions, and more.

Jones pays particular attention to answering questions commonly asked by newcomers to the world of the vacuum tube. As such, the paperback is useful both to enthusiasts tackling their first project and to more experienced amplifier designers seeking to work further with valves. Clear illustrations, photographs, reference lists, and an index are included for clarification. Contact Newnes (an imprint of Elsevier Publishing), Linacre House, Jordan Hill, Oxford OX2 8DP, UK; or 200 Wheeler Road, Burlington, MA 01803, USA. On the Internet: http://www.books.elsevier.com.

Digitalization of Radio Broadcasting–the Next Step is a white paper published by Digigram that illustrates the innovative approach of its eXaudi IP Audio Streaming, Processing, and Routing System. The eXaudi allows for the integration of radio automation applications and audio transport over IP networks within a single scalable, manageable, and integrated system. The eXaudi XIP882 is the first product in the eXaudi range and is available only to Digigram's OEM development partners. One of the first adopters of the technology, Finnish software house Jutel, has already integrated eXaudi into its broadcast content management solution RadioMan.

The eXaudi XIP882 exemplifies Digigram's network-centric approach to integrating multiple applications that are already digitized but show little operational interoperability. Some of these "digital islands" are radio automation, studio transmitter links, logging, monitoring, content sharing, and program localization.

At the heart of the system are ready-to-use, stand-alone devices like the eXaudi XIP882, which combine the well-proven audio processing inherited from Digigram's PCX range of sound cards with IP streaming capabilities. Other key features include encoding and decoding in various formats (MPEG-1 and -2 Layer 2, MP3, MPEG-4 AAC+, and others); an embedded matrix, which provides routing from any input (physical, file, or IP stream) to any output (physical, file, or IP stream); and control and management capabilities based on an SDK and a network API.

The eXaudi system is based on open standards, which makes it easy to integrate into software applications and existing IP networking infrastructure: IPv4 and IPv6, RTP-RTCP-RTSP streaming protocols, Linux as the operating system, MP3/MPEG-4 AAC+ encoding/decoding, and SNMP output. The Digitalization of Radio Broadcasting–the Next Step white paper is available for download at www.digigram.com/exaudi.

Proceedings of the AES 24th International Conference: Multichannel Audio, The New Reality
Banff, Alberta, Canada, 2003 June 26-28
Conference Chair: Theresa Leonard

This conference was a follow-up to the 19th Conference on surround sound. These papers describe multichannel sound from production and engineering to research and development, manufacturing, and marketing. 350 pages. Also available on CD-ROM.

You can purchase the book and CD-ROM online at www.aes.org. For more information email Andy Veloz at AAV@aes.org or telephone +1 212 661 8528 ext. 39.

MEMBERSHIP
INFORMATION
Section symbols are: Aachen Student Section (AA), Adelaide (ADE), Alberta (AB), All-Russian State Institute of Cinematography
(ARSIC), American River College (ARC), American University (AMU), Appalachian State University (ASU), Argentina (RA),
Atlanta (AT), Austrian (AU), Ball State University (BSU), Belarus (BLS), Belgian (BEL), Belmont University (BU), Berklee
College of Music (BCM), Berlin Student (BNS), Bosnia-Herzegovina (BA), Boston (BOS), Brazil (BZ), Brigham Young University
(BYU), Brisbane (BRI), British (BR), Bulgarian (BG), Cal Poly San Luis Obispo State University (CPSLO), California State
University–Chico (CSU), Carnegie Mellon University (CMU), Central German (CG), Central Indiana (CI), Chicago (CH), Chile
(RCH), Cincinnati (CIN), Citrus College (CTC), Cogswell Polytechnical College (CPC), Colombia (COL), Colorado (CO),
Columbia College (CC), Conservatoire de Paris Student (CPS), Conservatory of Recording Arts and Sciences (CRAS), Croatian
(HR), Croatian Student (HRS), Czech (CR), Czech Republic Student (CRS), Danish (DA), Danish Student (DAS), Darmstadt
(DMS), Del Bosque University (DBU), Detmold Student (DS), Detroit (DET), District of Columbia (DC), Duquesne University
(DU), Düsseldorf (DF), Ecuador (ECU), Expression Center for New Media (ECNM), Finnish (FIN), Fredonia (FRE), French
(FR), Full Sail Real World Education (FS), Graz (GZ), Greek (GR), Hampton University (HPTU), Heartland (HRT), Hong Kong
(HK), Hungarian (HU), I.A.V.Q. (IAVQ), Ilmenau (IM), India (IND), Institute of Audio Research (IAR), Israel (IS), Italian (IT),
Italian Student (ITS), Japan (JA), Javeriana University (JU), Kansas City (KC), Korea (RK), Lithuanian (LT), Long
Beach/Student (LB/S), Los Andes University (LAU), Los Angeles (LA), Louis Lumière (LL), Malaysia (MY), McGill University
(MGU), Melbourne (MEL), Mexican (MEX), Michigan Technological University (MTU), Middle Tennessee State University
(MTSU), Moscow (MOS), Music Tech (MT), Nashville (NA), Netherlands (NE), Netherlands Student (NES), New Orleans (NO),
New York (NY), New York University (NYU), North German (NG), Norwegian (NOR), Ohio University (OU), Orson Welles
Institute (OWI), Pacific Northwest (PNW), Peabody Institute of Johns Hopkins University (PI), Pennsylvania State University
(PSU), Peru (PER), Philadelphia (PHIL), Philippines (RP), Polish (POL), Portland (POR), Portugal (PT), Ridgewater College,
Hutchinson Campus (RC), Romanian (ROM), Russian Academy of Music, Moscow (RAM/S), SAE Nashville (SAENA), St. Louis
(STL), St. Petersburg (STP), St. Petersburg Student (STPS), San Buenaventura University (SBU), San Diego (SD), San Diego
State University (SDSU), San Francisco (SF), San Francisco State University (SFU), Serbia and Montenegro (SAM), Singapore
(SGP), Slovakian Republic (SR), Slovenian (SL), South German (SG), Spanish (SPA), Stanford University (SU), Swedish (SWE),
Swiss (SWI), Sydney (SYD), Taller de Arte Sonoro, Caracas (TAS), Technical University of Gdansk (TUG), Texas State
University—San Marcos (TSU), The Art Institute of Seattle (TAIS), Toronto (TOR), Turkey (TR), Ukrainian (UKR), University of
Arkansas at Pine Bluff (UAPB), University of Cincinnati (UC), University of Colorado at Denver (UCDEN), University of
Hartford (UH), University of Illinois at Urbana-Champaign (UIUC), University of Luleå-Piteå (ULP), University of
Massachusetts–Lowell (UL), University of Miami (UOM), University of Michigan (UMICH), University of North Carolina at
Asheville (UNCA), University of Southern California (USC), Upper Midwest (UMW), Uruguay (ROU), Utah (UT), Vancouver
(BC), Vancouver Student (BCS), Venezuela (VEN), Vienna (VI), Webster University (WEB), West Michigan (WM), William
Paterson University (WPU), Worcester Polytechnic Institute (WPI), Wroclaw University of Technology (WUT).

These listings represent new membership according to grade.


MEMBERS

Uffe L. Madsen
Uffe Lomholt Madsen Consult & Trade, Sauntesvej 15 A, DK 2820, Gentofte, Denmark (DA)

ASSOCIATES

Gustavo B. Quintana Castillo
Mcal. Lopez, 1 de Marzo, Eusebio Ayala, Cordillera, Paraguay
Brian Redmond
810 Broadway St. # 5, Lowell, MA 01854 (BOS)
Erin Rettig
1555 Scott Rd. #222, Burbank, CA 91504 (LA)
Lionel Rey
6 rue Lancret, FR 75016, Paris, France (FR)
Michael Rice
616 Pacific St. 1st Fl., Brooklyn, NY 11217 (NY)
Carl Riehl
W. 30th St. # 4E, New York, NY 10001 (NY)
Lorenzo Rizzi
Corso Matteotti 3A, IT 23900, Lecco, Italy (IT)
Dietmar Rottinghaus
Connex GmbH, Dinklagerstrasse 96, DE 49393, Lohne, Germany (NG)
Alexander Rueger
Wittekindstrasse 7, DE 12103, Berlin, Germany (NG)
Phillip Rutschman
3958 S. Lucile St., Seattle, WA 98118 (PNW)
Rosy Saboh
Rapsodivagen 157, SE 14241, Skogas/Stockholm, Sweden (SWE)
Mojtaba Saeidi
IRIB London Bureau, 31 Warple Way, Acton, London, W3 0RX, UK (BR)
Paul S. San Bartolome
Calle B #120 Urb. Pando 6ta. etapa, San Miguel, Lima 32, Peru (PER)
Victor Sapphire
7510 Sunset Blvd. #116, Los Angeles, CA 90046 (LA)
Justin Schadt
208 Glenbrook Dr. SE, Cedar Rapids, IA 52403 (HRT)
Christopher Scheuermann
361 Eastview Dr., Urbana, IL 43078 (DET)
Mikkel Schille
Prestegaardsveien 9, NO 1710, Sarpsborg, Norway (NOR)
Haiko Schillert
Kurfuerstenstrasse 17, DE 12249, Berlin, Germany (NG)
Berthold Schlenker
Bussardweg 80, DE 98693, Ilmenau, Germany (SG)
John Schmidt
7399 Circle Dr., Rohnert Park, CA 94928 (SF)
Mark Schultz
12122 J Natural Bridge Rd., Bridgeton, MO 63044 (STL)
Greg Scott
14004 Mercado Dr., Del Mar, CA 92014 (SD)
Peter Seckel
123 Tuscan Rd., Maplewood, NJ 07040 (NY)
Samantha Selig
154 Brown St., Tewksbury, MA 01876 (BOS)
Vinay Shrivastava
Broadcast & Elec. Comm. Arts Dept., San Francisco State University, 1600 Holloway Ave., San Francisco, CA 94132 (SF)
Samir Sinha
495 14th Ave. #1, San Francisco, CA 94118 (SF)
Ali H. Sleiman
ACC wll, Shuwaikh, 176 Safat, 13002 Kuwait
Kevin Smith
204-73 Coburg St., New Westminster, V3L 2E7, British Columbia, Canada (BC)
Jerome Smith
Klepto Records/Diffrent Fur, 3470 19th St., San Francisco, CA 94110 (SF)
Adam Sohmer
Sohmer Associates LLC, 507 17th St., Brooklyn, NY 11215 (NY)
Eric Southam
Easyplug Inc., 2300 S. Decker Lake Blvd., Salt Lake City, UT 84119 (UT)
Bryan Steele
42855 W. 19th St., Lancaster, CA 93534 (LA)
Stefan Stenzel
Stadler Electro GbR, Neustrasse 12, DE 53498, Waldorf, Germany (CG)
Jim Stephens
1755 John Richardson Ln., Vale, NC 28168 (AT)
Daniel Stevens
155 Burton Ave., Hasbrouck Heights, NJ 07604 (NY)
David Stinson
54 Coady Ave., Toronto, M4M 2Y8, Ontario, Canada (TOR)
Chris Sturwold
This is Oddyo Inc., 4139-98 St. NW, Edmonton, T6E 5N5, Alberta, Canada (AB)
Olav G. Sunde
Car Konows Gate 14, NO 5161, Laksevag, Norway (NOR)
MyungHoon Sunwo
Ajou Univ. San 5 Wonchon-Dong, Paldal-Gu, Suwon, Kyunggi-Do 442-749, Korea (RK)
Dennis Tabuena
1075 Trinity Dr., Menlo Park, CA 94025 (LA)
Joel Tan
2/F 6D Babington Path, Hong Kong (HK)
Sean Tan
1556 Ambergrove Dr., San Jose, CA 95131 (SF)
Joachim Thiemann
4657 du Parc, Montreal, H2V 4E4, Quebec, Canada (MGU)
Jan Thore Hol
Rabbenveien 6C, NO 3039, Drammen, Norway (NOR)
Albert Trezza
1211 Court N. Dr., Melville, NY 11747 (NY)
Cartsen Tringgaard
Tjearebyvej 111, DK 4000, Roskilde, Denmark (DA)
Tony Tseng
8F-6 no. 351 Chung Shan Rd., Sec. 2, Chung Ho City, Taipei 235, Taiwan
Christian Ulbrich
Strelitzer str 18, DE 10115, Berlin, Germany (NG)
James W. Urick
1510 Hillside Oak Dr., Grayson, GA 30017 (AT)
Jaime Valenszuela
338 Pasqual Ave., San Gabriel, CA 91775 (LA)
Peter Van Dam
ATS bvba, Wingepark 17, BE 3110, Rotselaar, Belgium (BEL)
William Vaughan
1029 Park Rd. NW, Washington, DC 20010 (DC)
Marcus Venturi
Visu-IT! GmbH, Donaustaufer Str. 93, DE 93059, Regensburg, Germany (SG)
Fabio Vignoli
Philips Research Laboratories, Philips Research WY-21, Prof. Holstlaan 4, NL 5656AA, Eindhoven, Netherlands (NE)
Yon Visell
Kozada 1 Stinjan, Franinovic, HR 52000, Pula, Croatia (HR)
Travis Walat
1115 Providence Ct., Frederick, MD 21703 (DC)
Joseph Warda
43-60 Douglaston Pkwy. # 420, Douglaston, NY 11363 (NY)
Mark Wherry
1547 14th St., Santa Monica, CA 90404 (LA)
Silvia Weise
Falkentalersteig 58, DE 13467, Berlin, Germany (NG)
Joey White
1220 Wright St., Reno, NV 89509
Monte Wise
13216 Marion Dr., Burnsville, MN 55337 (UMW)
Doug Wong
215-36 Ave. NE Unit 7, Calgary, T2E 2L4, Alberta, Canada (AB)
Lonce Wyse
1F Pine Grove, 13-31, 595001 Singapore (SGP)
Vladimir S. Zverev
Chernichnaya str 22, Vsevolozhskiy raion, Toksovo, RU 188666, Leningradskaya oblast, Russia (STP)
Kevin Zwack
9653 Lamar Pl., Westminster, CO 80021 (CO)

STUDENTS

Daniel Epstein
4423 N. Paulina St. # 1, Chicago, IL 60640 (CC)
Michael Epstein
400 E. 66th St. # 4F, New York, NY 10017 (IAR)
Chris Fletcher
6400 Christie Ave. #4118, Emeryville, CA 94608 (ECNM)
Megan Foley
P. O. Box 4567, Davis, CA 95617 (SFU)
Daniel Forsberg
Ankarskatav 85b, SE 94134, Pitea, Sweden (ULP)
Benjamin M. Foxx
167 W. Hudson St., Long Beach, NY 11561 (IAR)
Daniel Fritz
Millergasse 50/14, AT 1060, Vienna, Austria (VI)
Stefan Fuhmann
Oehrenstoecker str 3, DE 98693, Ilmenau, Germany (IM)
Nozumu Furuya
3 Admiral Dr. #F269, Emeryville, CA 94608 (ECNM)
Matthew Gagnon
6230 N. Kenmore Ave. # 908, Chicago, IL 60660 (CC)
Diego F. Galceran Fernandez
Juan Paulier 1018, Montevideo 11300, Uruguay
Louis Galliot
4 allée du Clos de la Croix, FR 78290, Croissy sur Seine, France (CPS)
Carol A. Galvis Jimenez
Cra 23 No. 39 A 40 Apto. 302, Bogota, Colombia (JU)
Laporschea Gamble
3802 Sutton Place Blvd. #1324, Winter Park, FL 32792 (FS)
Aaron Gandia
Villa de Torrimar 428 Valle Rey Luis, Guaynabo, PR 00969 (FS)
Cole Gaugler
27520 N. Sierra Hwy. # H205, Canyon Country, CA 91351 (CRAS)
Oren Gertlitz
Kopernikusstrasse 5, DE 10243, Berlin, Germany (BNS)
Jason Goldkamp
2409 Lancaster Dr. # 11, Richmond, CA 94806 (ECNM)
Brandon Gonzalez
11867 SW 9th Manor, Davie, FL 33325 (UOM)
Rodrigo Gonzalez-Hverta
Calle 13 No. 230, Cordoba 24500, Mexico
Christos Goussios
12 Askitou Str., GR 54624, Thessaloniki, Greece
Zachary Gowen
210 Crystal Lake Dr., Clermont, FL 34711 (FS)
Celine Grangey
139 rue Manin, Apt. 64, FR 75019, Paris, France (CPS)
Ulf Grunbaum
Ankarskatav 84e, SE 94134, Pitea, Sweden (ULP)
Manuel D. Guevara Alvarez
Calle 8f No. 79-37, Bogota, Colombia (SBU)
Benjamin Gugler II
2113 Irise Ct. Apt. 306, Orlando, FL 32807 (FS)
Steven Guilliams
1803 Golden Gate Ave., San Francisco, CA 94115 (SU)
Hannelore Guittet
13 rue Jules Auffret, FR 93500, Pantin, France (CPS)
Ajay Gupta
P. O. Box 115, Notre Dame, IN 46556 (BSU)
Mike Gurnari
8 Fuente Ave., San Francisco, CA 94132 (SFU)
Erik Gustafsson
Ankarskatav 79c, SE 94134, Pitea, Sweden (ULP)
Stanley Haggard
1202 Hillside Ave., Richmond, VA 23229 (HPTU)
Christopher Harrelson
1911 28th St., Sacramento, CA 95816 (SFU)
Benjamin Harris
7985 W. 51st Ave. #8, Arvada, CO 80002 (UCDEN)
Julia Havenstein
Kaiser-Friedrich-str 64, DE 10627, Berlin, Germany (BNS)
Joshua Hearst
872 Queen Anne Pl., St. Louis, MO 63122 (WEB)
Wiebke Heldmann
Chorinerstr. 61, DE 10435, Berlin, Germany (BNS)
Yves L. Henry
53 Hempstead Rd., Spring Valley, NY 10977 (NYU)
Ralf Herrmann
Juelicher str 80, DE 40477, Duesseldorf, Germany (DF)
Arthur T. Hill
2005 N. Ball Ave., Muncie, IN 47304 (BSU)
Simeon Hinton
96 Autumn Breeze Way, Winter Park, FL 32792 (FS)
William Ho
Apt. 404, 1745 Wilcox Ave., Los Angeles, CA 90028 (USC)
Ricardo Hohmann
Rindermarkt 16, DE 80331, Munich, Germany
Jason Holderness
1118 Nord Ave. #32, Chico, CA 95926 (CSU)
Sean Hopper
2 Millridge Estates, Elora, N0B 1S0, Ontario, Canada
Tomislav Horvat
A. Mihanovica 22, HR 44322, Kutina, Croatia (HRS)
Lisa M. Host
2728 N. 83 St., Omaha, NE 68134
Mattew Houston
906 N. Dodge #10, Iowa City, IA 52245
Daniel Howd
801 E. Benjamin Ave., P. O. Box 150, Norfolk, NE 68702
Michael T. Hudson
11801 High Tech Ave. #324, Orlando, FL 32817 (FS)
Mats Ingemansson
Ankarskatav 71b, SE 94134, Pitea, Sweden (ULP)
Anamaria D. Irisarri
Carrera 2A #72-67 Apt. 201, Bogota, Colombia (JU)
Matthew C. Irvin
121 Hazelwood Dr. # E23, Hendersonville, TN 37075 (SAENA)
Nicholas Jacalone
5410 Loma Ave., Temple City, CA 91780 (CSU)
Jeremy Jacobs
31103 Pierce Ct., Crown Point, IN 46307 (FS)
Eric Jacobsen
936 Bishop Park Ct. # 1321, Winter Park, FL 32792 (FS)
Chris Jara
4733 N. Goldenrod Rd. Apt. D, Winter Park, FL 32792 (FS)
Shawn Jennings
202 E. Peabody Dr., URH 361 Scott Hall, Champaign, IL 61820 (UIUC)

Advertiser Internet Directory

Belmont University.........................1187
www.belmont.edu

BSWA Technology Co., Ltd...........1197
www.bswa-tech.com

*Prism Media Products Limited ......1189
www.prismsound.com

Rohde & Schwarz GmbH & Co......1183
www.rohde-schwarz.com

*AES Sustaining Member.
John Jensen
517 V St., Sacramento, CA 95818 (SFU)
Sverre K. Johansen
Madlamarkveien 6 l.118, NO 4041, Stavanger, Norway
Daniel Johnson
8196 SW 53rd Ct., Ocala, FL 34476 (UOM)
Paul Johnson
4420 N. Varsity Ave. #1074, San Bernardino, CA 92407 (SDSU)
Dave Jones
2404B Crestmoor Rd., Nashville, TN 37215 (SAENA)
William Jones
1759 N. Semoran Circle # 203, Winter Park, FL 32730 (FS)
Johannes Kammann
Oldenburger str 31, DE 10551, Berlin, Germany (BNS)
Gregory Kares
501 Clinton St. #3, Brooklyn, NY 11231 (NYU)
Jill Kares
501 Clinton St. #3, Brooklyn, NY 11231 (NYU)
Robert Kawiak
ul. Fr. Sokoka 150, PL 80-603, Gdynia, Poland (TUG)
Travis Kessler
3309 Horst Ln., Chambersburg, PA 17201 (PSU)
Nick Kettman
1200 Barton Hills Dr. # 209, Austin, TX 78704 (TSU)
George Kim
P. O. Box 745020, Los Angeles, CA 90004 (LB/S)
Craig King
1323 Whitewood Dr., Deltona, FL 32725 (FS)
Rishi Kirby
411 Lincoln Ave. Unit 36, Glendale, CA 91205 (CTC)
Mariusz Klawikowski
Nowy Barkoczyn 112, PL 83-422, Nowy Barkoczyn, Poland (TUG)
Matthew Kline
703 Jackpine, P. O. Box 864, Stanton, NE 68779
Michal Klos
ul. Szarych Szeregow 4/2, PL 88-100, Inowroclaw, Poland (TUG)
Ivan Kovacevic
Grcica Milenka 3, YU 11000, Belgrade, Yugoslavia
Lucille Kyle
1411 Marchbanks Dr. #3, Walnut Creek, CA 94598 (ECNM)
Robert Lapp
3212 Arden Villas Blvd. #27, Orlando, FL 32817 (FS)
Bryan Laseter
7934-B Shoals Dr., Orlando, FL 32817 (FS)
Victor Laugier
212 avenue Jean Jaures, FR 75019, Paris, France (CPS)
David Layne
100 Anavista Ave., San Francisco, CA 94115 (ECNM)
Seng Siong Lee
29 Jalan 3/149G, Taman Sri Endah, Kuala Lumpur 57000, Malaysia
Mario A. Lemus Mendez
Calle 48 No. 15-92 Apt. 302, Bogota, Colombia (SBU)
Tobias Lentz
Viktoriastrasse 87, DE 52066, Aachen, Germany (AA)
Josue C. Lescano
unidad Vecinal de matute 39-H, La Victoria, Lima 13, Peru (OWI)
Oscar Andre Lie Foss
Tiurkroken 26, NO 2050, Jessheim, Norway
Lena Lienig
Ugleveien 6a, NO 4042, Hafrsfjord, Norway
Maria Linares
Calle 74 #6-11 Apto. 302, Bogota, Colombia (JU)
Olov Lindberg
Brandellsvag 8, SE 93133, Skelleftea, Sweden (ULP)
Johannes Lindemann
Eidelstedter Weg 9, DE 20255, Hamburg, Germany
Carlos Llorens
Universidad Politecnica Valencia, Ctra Nazaret-Olivia SN, ES 46730, Grao Gandia, Spain
Erin Lockhart
3886 Calibre Bend Ln. # 809, Winter Park, FL 32792 (FS)
Gabe D. Long
3416 Murphy Rd. # C-11, Nashville, TN 37203 (SAENA)
Emil Lorelius
Ankarskatav 84e, SE 94134, Pitea, Sweden (ULP)
Bob Lorentz
2 Spinozalaan, NL 2273 XA, Voorburg, Netherlands (NES)
Christoph Lowis
Erdbergstr 101/22, AT 1030, Vienna, Austria (VI)
Elizabeth Luchenbill
611 St. Johns Ct., Winter Park, FL 32792 (FS)
Oliver Ludecke
Turmstr 76, DE 10551, Berlin, Germany (DF)
Andrew Lux
2841 Harrison, San Francisco, CA 94110 (SFU)
John Madden
1812 Page St. #5, San Francisco, CA 94117 (SFU)
Zoran Maksimovic
Faculty of Dramatic Arts, Bulevar Umetnosti 20, HR 11000, Zagreb, Croatia (HRS)
Valdemar J. Maldonado
12508 200th Ave. E., Sumner, WA 98390 (TAIS)
Carlos A. Manrique Alonso
Cra. 25 #142-60 Apto. 502, Bogota, Colombia (JU)
Brian Markman
3733 Goldenrod Rd. # 1109, Winter Park, FL 32792 (FS)
Michael A. Mavriokos
418 Autumn Breeze Way, Winter Park, FL 32792 (FS)
Oscar E. Mazuera Escobar
Trans. 9c No. 130 b-81, Bogota, Colombia (SBU)
Sebastian Mazur
ul. Wyspianskiego 7, PL 80-434, Gdansk, Poland (TUG)
Kevin McCormick
2039 New Stonecastle Terrace #111, Winter Park, FL 32792 (FS)
Kelly McCoy
225 Brown Rd., Lot 46, Franklin, KY 42134
Mike McKenzie
1701 Lee Rd. M431, Winter Park, FL 32789 (FS)
Anthony McMahon
Auchlinsky House, Burnfoot, Glendevon, Clackmannanshire, FK14 7JY, Scotland
Drew McNally
338 Newtown Rd., Richboro, PA 18954 (PSU)
Matthew B. Meares
211 Lenora St. # 203, Seattle, WA 98121 (TAIS)
Volker Meitz
Bornholmer Str. 95, DE 10439, Berlin, Germany (BNS)
David Menke
Riglergasse 14/12a, AT 1180, Wien, Austria (VI)
Jessica Mercel
2096 St. Clair Ave. West, Toronto, M6J 3W6, Ontario, Canada
Daniel Miller
630 Hickory Club Dr., Antioch, TN 37013 (SAENA)
Jeremy Miller
5029 SW Grayson, Seattle, WA 98116 (TAIS)
Jonathan Patton
530 Byron Rd., Winter Park, FL 32792 (FS)

10,000
Journal technical articles, convention preprints,
and conference papers at your fingertips
The Audio Engineering Society has published a 20-disk electronic library containing most
of the Journal technical articles, convention preprints, and conference papers published
by the AES since its inception through the year 2003. The approximately 10,000 papers
and articles are stored in PDF format, preserving the original documents to the highest
fidelity possible while permitting full-text and field searching. The library can be viewed on
Windows, Mac, and UNIX platforms.
You can purchase the entire 20-disk library or disk 1 alone. Disk 1 contains the program
and installation files that are linked to the PDF collections on the other 19 disks. For
reference and citation convenience, disk 1 also contains a full index of all documents
within the library, permitting you to retrieve titles, author names, original publication
name, publication date, page numbers, and abstract text without
ever having to swap disks.


For price and ordering information send email to Andy Veloz at aav@aes.org, visit the AES web site at http://www.aes.org, or call any AES office at +1 212 661 8528 ext. 39 (USA); +44 1628 663725 (UK); +33 1 4881 4632 (Europe).
AUDIO ENGINEERING SOCIETY 26th Conference
CALL for PAPERS
AES 26th International Conference: Audio Forensics in the Digital Age
Dates: July 7-9, 2005
Location: Denver, Colorado, USA
Chair: Roy Pritts, Univ. of Colorado at Denver, Email: 26th_chair@aes.org

The AES 26th International Conference is designed to explore the history, hardware, and techniques of forensic investigation of
audio materials. The field has gone through significant advances with the advent of digital audio recording, signal processing, and
computer-assisted evaluation. The appropriate use of analog and digital processes provides the contemporary audio
engineer with powerful tools for quality audio investigation in support of the law enforcement, legal, archival, and restoration
communities.
The AES 26th Conference Committee invites submission of research and technical papers. By January 31, 2005, a proposed title,
60-120 word abstract, and a 500-1000 word précis of the paper should be submitted via the Internet to the AES 26th Committee at
the following site: http://www.aes.org/26th_authors. A preference will be given to papers that combine a lecture with a listening
experience.
A full day of tutorial studies will be held at the University of Colorado at Denver on July 7 to provide historical and practical perspective to the technical papers and workshops of the 26th Conference to be held on July 8 and 9. High quality audio and video
support will be provided for presentations and laboratory experiences. Authors may submit proposals for papers, workshops, and
tutorials.
The author’s information, title, abstract, and précis should all be submitted online. The précis should describe the work performed,
methods employed, conclusion(s), and significance of the paper. Titles and abstracts should follow the guidelines in Information for
Authors at http://www.aes.org/journal/con_infoauth.html. You can visit this site for more information and complete instructions
for using the site anytime after November 9, 2004. Acceptance of papers will be determined by the 26th Conference review committee based on an assessment of the abstract and précis. A conference paper, submitted via the Internet by 2005 April 19, will be a
condition for presentation at the conference.

PROPOSED TOPICS FOR PAPERS


Audio Sources and Acquisition
Voice or Data Recorders
Audio Recovery and Enhancement, Analog and Digital
Authenticity, Analog and Digital
Security Applications
Forensics in Audio Curriculum
Voice Recognition
Voice Identification
Media Archiving and Restoration

SUBMISSION OF PAPERS

Please submit proposed title, abstract, and précis at http://www.aes.org/26th_authors no later than 2005 January 31. If you have any questions, contact the papers cochairs.

SCHEDULE

Proposal deadline: 2005 January 31
Acceptance emailed: 2005 March 7
Paper deadline: 2005 April 19

PAPERS COCHAIRS

Richard Sanders, University of Colorado at Denver
Tom Owen, Owl Investigations
Email: 26th_papers@aes.org

Authors whose contributions have been accepted for presentation will receive additional instructions for submission of their manuscripts.

p1201to1207_Sec. Contact 10/12/04 11:57 AM Page 1

SECTIONS CONTACTS
DIRECTORY
The following is the latest information we have available for our sections contacts. If you
wish to change the listing for your section, please mail, fax or e-mail the new information
to: Mary Ellen Ilich, AES Publications Office, Audio Engineering Society, Inc., 60 East
42nd Street, Suite 2520, New York, NY 10165-2520, USA. Telephone +1 212 661 8528,
ext. 23. Fax +1 212 661 7829. E-mail MEI@aes.org.
Updated information that is received by the first of the month will be published in the
next month’s Journal. Please help us to keep this information accurate and timely.

EASTERN REGION, 2712 Leslie Dr. Worcester Polytechnic New York University Section
USA/CANADA Atlanta, GA 30345 Institute Section (Student) (Student)
Tel./Fax +1 770 908 1833 William Michalson Robert Rowe, Faculty Advisor
Vice President: E-mail Faculty Advisor Steinhardt School of Education
Jim Anderson atlanta_section@aes.org AES Student Section 35 West 4th St., 777G
12 Garfield Place Worcester Polytechnic Institute New York, NY 10012
Brooklyn, NY 11215 MARYLAND 100 Institute Rd. Tel. +1 212 998 5435
Tel. +1 718 369 7633 Peabody Institute of Johns Worcester, MA 01609 E-mail nyu@aes.org
Fax +1 718 669 7631 Hopkins University Section Tel. +1 508 831 5766
E-mail vp_eastern_usa@aes.org E-mail wpi@aes.org NORTH CAROLINA
(Student)
Neil Shade, Faculty Advisor Appalachian State University
UNITED STATES OF NEW JERSEY Section (Student)
AMERICA AES Student Section
Peabody Institute of Johns William Paterson University Michael S. Fleming
CONNECTICUT Hopkins University Section (Student) Faculty Advisor
University of Hartford Recording Arts & Science Dept. David Kerzner, Faculty Advisor Appalachian State University
Section (Student) 2nd Floor Conservatory Bldg. AES Student Section Hayes School of Music
Timothy Britt 1 E. Mount Vernon Place William Paterson University 813 Rivers St.
Faculty Advisor Baltimore, MD 21202 300 Pompton Rd. Boone, NC 28608
AES Student Section Tel. +1 410 659 8100 ext. 1226 Wayne, NJ 07470-2103 Home Tel. +1 828 263 0454
University of Hartford E-mail peabody@aes.org Tel. +1 973 720 3198 Bus. Tel. +1 828 262 7503
Ward College of Technology Fax +1 973 720 2217 E-mail appalachian@aes.org
200 Bloomfield Ave. MASSACHUSETTS E-mail wpu_section@aes.org University of North Carolina
West Hartford, CT 06117 Berklee College of Music at Asheville Section (Student)
Tel. +1 860 768 5358 NEW YORK
Section (Student) Wayne J. Kirby
Fax +1 860 768 5074 Eric Reuter, Faculty Advisor Fredonia Section (Student) Faculty Advisor
E-mail Berklee College of Music Bernd Gottinger, Faculty Advisor AES Student Section
u_hartford_section@aes.org Audio Engineering Society AES Student Section University of North Carolina at
FLORIDA c/o Student Activities SUNY–Fredonia Asheville
Full Sail Real World 1140 Boylston St., Box 82 1146 Mason Hall Dept. of Music
Education Section (Student) Boston, MA 02215 Fredonia, NY 14063 One University Heights
Bill Smith, Faculty Advisor Tel. +1 617 747 8251 Tel. +1 716 673 4634 Asheville, NC 28804
AES Student Section Fax +1 617 747 2179 Fax +1 716 673 3154 Tel. +1 828 251 6432
Full Sail Real World Education E-mail E-mail Fax +1 828 253 4573
3300 University Blvd., Suite 160 berklee_section@aes.org fredonia_section@aes.org E-mail north_carolina@aes.org
Winter Park, FL 32792 Boston Section Institute of Audio Research PENNSYLVANIA
Tel. +1 800 679 0100 Matthew Girard Section (Student)
E-mail full_sail@aes.org Carnegie Mellon University
Tel. +1 781 883 1248 Noel Smith, Faculty Advisor Section (Student)
University of Miami Section E-mail AES Student Section Thomas Sullivan
(Student) boston_section@aes.org Institute of Audio Research Faculty Advisor
Ken Pohlmann, Faculty Advisor 64 University Pl. AES Student Section
AES Student Section University of Massachusetts New York, NY 10003 Carnegie Mellon University
University of Miami –Lowell Section (Student) Tel. +1 212 677 7580 University Center Box 122
School of Music John Shirley, Faculty Advisor Fax +1 212 677 6549 Pittsburgh, PA 15213
P. O. Box 248165 AES Student Chapter E-mail iar_section@aes.org Tel. +1 412 268 3351
Coral Gables, FL 33124-7610 University of Massachusetts–Lowell E-mail carnegie_mellon@aes.org
Tel. +1 305 284 6252 Dept. of Music New York Section
Fax +1 305 284 4448 35 Wilder St., Ste. 3 Bill Siegmund Duquesne University Section
E-mail miami_section@aes.org Lowell, MA 01854-3083 Digital Island Studios (Student)
Tel. +1 978 934 3886 71 West 23rd Street Suite 504 Francisco Rodriguez
GEORGIA Fax +1 978 934 3034 New York, NY 10010 Faculty Advisor
Atlanta Section E-mail Tel. +1 212 243 9753 AES Student Section
Robert Mason umass_lowell_section@aes.org E-mail new_york@aes.org Duquesne University

School of Music McGill University Champaign University of Michigan
600 Forbes Ave. Sound Recording Studios Urbana, IL 61801 Section (Student)
Pittsburgh, PA 15282 Strathcona Music Bldg. Tel. +1 217 384 5242 Jason Corey Faculty Advisor
Tel. +1 412 434 1630 555 Sherbrooke St. W. E-mail urbana_section@aes.org University of Michigan School
Fax +1 412 396 5479 Montreal, Quebec H3A 1E3 of Music
E-mail duquesne@aes.org Canada INDIANA 1100 Baits Drive
Tel. +1 514 398 4535 ext. 0454 Ball State University Section Ann Arbor, MI 48109
Pennsylvania State University
E-mail mcgill_u_section@aes.org (Student) E-mail
Section (Student)
Michael Pounds, Faculty Advisor univ_michigan_section@aes.org
Dan Valente Toronto Section
AES Penn State Student Chapter Earl McCluskie AES Student Section
Ball State University West Michigan Section
Graduate Program in Acoustics E32-223 Pioneer Dr. Carl Hordyk
Pennsylvania State University Kitchener, Ontario MET Studios
2520 W. Bethel Ave. Calvin College
P. O. Box 30 N2P 1L9, Canada 3201 Burton S.E.
State College, PA 16803 Tel. +1 519 894 5308 Muncie, IN 47306
Tel. +1 765 285 5537 Grand Rapids, MI 49546
Tel. +1 814 865 2859 Fax +1 416 364 1310 Tel. +1 616 957 6279
Cell +1 814 360 83399 E-mail toronto@aes.org Fax +1 765 285 8768
E-mail Fax +1 616 957 6469
E-mail E-mail
penn_state_section@aes.org ball_state_section@aes.org
west_mich_section@aes.org
CENTRAL REGION,
Philadelphia Section USA/CANADA Central Indiana Section
Rebecca Mercuri MINNESOTA
James Latta
P. O. Box 1166 Vice President: Sound Around Music Tech College Section
Philadelphia, PA 19105 Frank Wells 6349 Warren Ln. (Student)
Tel. +1 215 327 7105 2130 Creekwalk Drive Brownsburg, IN 46112 Michael McKern
E-mail philly@aes.org Murfreesboro, TN Tel. +1 317 852 8379 Faculty Advisor
Tel. +1 615 848 1769 Fax +1 317 858 8105 AES Student Section
VIRGINIA Fax +1 615 848 1108 E-mail Music Tech College
Hampton University Section E-mail vp_central_usa@aes.org central_indiana_section@aes.org 19 Exchange Street East
(Student) Saint Paul, MN 55101
Bob Ransom, Faculty Advisor UNITED STATES OF KANSAS Tel. +1 651 291 0177
AES Student Section AMERICA Kansas City Section Fax +1 651 291 0366
Hampton University Jim Mitchell E-mail
Dept. of Music ARKANSAS musictech_student@aes.org
Custom Distribution Limited
63 Litchfield Close University of Arkansas at 12301 Riggs Rd.
Hampton, VA 23668 Pine Bluff Section (Student) Ridgewater College,
Overland Park, KS 66209 Hutchinson Campus Section
Office Tel. +1 757 727 5658, Robert Elliott, Faculty Advisor Tel. +1 913 661 0131
+1 757 727 5404 AES Student Section (Student)
Fax +1 913 663 5662 Dave Igl, Faculty Advisor
Home Tel. +1 757 826 0092 Music Dept. Univ. of Arkansas E-mail
Fax +1 757 727 5084 at Pine Bluff AES Student Section
kansas_city_section@aes.org Ridgewater College, Hutchinson
E-mail hampton_u@aes.org 1200 N. University Drive
Pine Bluff, AR 71601 Campus
LOUISIANA
WASHINGTON, DC Tel. +1 870 575 8916 2 Century Ave. S.E.
New Orleans Section Hutchinson, MN 55350
American University Section Fax +1 870 543 8108 Joseph Doherty
(Student) E-mail pinebluff@aes.org E-mail ridgewater@aes.org
Factory Masters
Rebecca Stone-Gordon 4611 Magazine St. Upper Midwest Section
Faculty Advisor ILLINOIS
New Orleans, LA 70115 Greg Reierson
AES Student Section Chicago Section Tel. +1 504 891 4424 Rare Form Mastering
American University Jeff Segota Cell +1 504 669 4571 4624 34th Avenue South
4400 Massachusetts Ave., N.W. 2955 No. Halsted #3 Fax +1 504 891 9262 Minneapolis, MN 55406
Washington, DC 20016 Chicago, IL 60657 E-mail new_orleans@aes.org Tel. +1 612 327 8750
Tel. +1 202 885 3242 E-mail chicago_section@aes.org E-mail
E-mail MICHIGAN upper_midwest_section@aes.org
american_u_section@aes.org Columbia College Section
(Student) Detroit Section
District of Columbia Section David Carlstrom MISSOURI
Dominique J. Chéenne
Fred G. Geil Faculty Advisor DaimlerChrysler St. Louis Section
Sound Engineering Company AES Student Section Tel. +1 313 493 4035 John Nolan, Jr.
1408 Harmony Lane 676 N. LaSalle, Ste. 300 E-mail detroit@aes.org 693 Green Forest Dr.
Annapolis, MD 21401 Chicago, IL 60610 Fenton, MO 63026
Tel. +1 410 260 5924 Michigan Technological Tel./Fax +1 636 343 4765
Tel. +1 312 344 7802
Fax +1 410 260 5430 University Section (Student) E-mail st_louis_section@aes.org
Fax +1 312 482 9083
E-mail dc_section@aes.org Greg Piper
E-mail
AES Student Section Webster University Section
CANADA columbia_section@aes.org
Michigan Technological (Student)
McGill University Section University of Illinois at University Gary Gottleib, Faculty Advisor
(Student) Urbana-Champaign Section 121 EERC Building Webster University
William L. Martens and (Student) 1400 Townsend Dr. 470 E. Lockwood Ave.
Martha De Francisco Michael Peterson Houghton, MI 49931 Webster Groves, MO 63119
Faculty Advisors AES Student Section Tel. +1 906 482 3581 Tel. +1 961 2660 x7962
AES Student Section University of Illinois, Urbana- E-mail michigan_tech@aes.org E-mail webster_st_louis@aes.org

NEBRASKA Tel. +1 615 335 8520 Cal Poly San Luis Obispo 5534 Encino Ave. # 214
Heartland Section Fax +1 615 335 8625 State University Section Encino, CA 91316
Anthony D. Beardslee E-mail nashville@aes.org (Student) Tel. +1 818 830 8775
Northeast Community College Bryan J. Mealy, Faculty Advisor E-mail la_section@aes.org
P.O. Box 469 SAE Nashville Section (Student) AES Student Section
Larry Sterling, Faculty Advisor California Polytechnic State San Diego Section
Norfolk, NE 68702 J. Russell Lemon
Tel. +1 402 844 7365 AES Student Section University
7 Music Circle N. Dept. of Electrical Engineering 2031 Ladera Ct.
Fax +1 209 254 8282 Carlsbad, CA 92009-8521
E-mail Nashville, TN 37203 San Luis Obispo, CA 93407
Tel. +1 615 244 5848 Tel. +1 805 756 2300 Home Tel. +1 760 753 2949
heartland_section@aes.org E-mail
Fax +1 615 244 3192 Fax +1 805 756 1458
E-mail saenash_student@aes.org E-mail san_diego_section@aes.org
OHIO
Cincinnati Section san_luis_obispo_section@aes.org San Diego State University
TEXAS
Dan Scherbarth California State University Section (Student)
Digital Groove Productions Texas State University—San –Chico Section (Student) John Kennedy, Faculty Advisor
5392 Conifer Dr. Marcos Section (Student) Keith Seppanen, Faculty Advisor AES Student Section
Mason, OH 45040 Mark C. Erickson AES Student Section San Diego State University
Tel. +1 513 325 5329 Faculty Advisor California State University–Chico Electrical & Computer
E-mail cincinnati@aes.org AES Student Section 400 W. 1st St. Engineering Dept.
Southwest Texas State University Chico, CA 95929-0805 5500 Campanile Dr.
Ohio University Section 224 N. Guadalupe St. Tel. +1 530 898 5500 San Diego, CA 92182-1309
(Student) San Marcos, TX 78666 E-mail chico@aes.org Tel. +1 619 594 1053
Erin M. Dawes Tel. +1 512 245 8451 Fax +1 619 594 2654
AES Student Section Fax +1 512 396 1169 Citrus College Section (Student) E-mail sdsu@aes.org
Ohio University, RTVC Bldg. E-mail tsu_sm@aes.org Stephen O’Hara, Faculty Advisor
AES Student Section San Francisco Section
9 S. College St.
Citrus College Conrad Cooke
Athens, OH 45701-2979
WESTERN REGION, Recording Arts 1046 Nilda Ave.
Home Tel. +1 740 597 6608
USA/CANADA 1000 W. Foothill Blvd. Mountain View, CA 94040
E-mail ohio@aes.org
Glendora, CA 91741-1899 Office Tel. +1 650 846 1132
University of Cincinnati Vice President: Fax +1 626 852 8063 Home Tel. +1 650 321 0713
Section (Student) Bob Moses E-mail san_francisco@aes.org
Island Digital Media Group, Cogswell Polytechnical
Thomas A. Haines San Francisco State
LLC College Section (Student)
Faculty Advisor University Section (Student)
26510 Vashon Highway S.W. Tim Duncan, Faculty Advisor
AES Student Section John Barsotti, Faculty Advisor
Vashon, WA 98070 AES Student Section
University of Cincinnati AES Student Section
Tel. +1 206 463 6667 Cogswell Polytechnical College
College-Conservatory of Music San Francisco State University
Fax +1 810 454 5349 Music Engineering Technology
M.L. 0003 Broadcast and Electronic
E-mail vp_western_usa@aes.org 1175 Bordeaux Dr.
Cincinnati, OH 45221 Communication Arts Dept.
Sunnyvale, CA 94089
Tel. +1 513 556 9497 1600 Holloway Ave.
Tel. +1 408 541 0100, ext. 130
Fax +1 513 556 0202 UNITED STATES OF San Francisco, CA 94132
Fax +1 408 747 0764
E-mail AMERICA Tel. +1 415 338 1507
E-mail cogswell_section@aes.org
univ_cincinnati@aes.org E-mail sfsu_section@aes.org
ARIZONA
Expression Center for New
Conservatory of The Media Section (Student) Stanford University Section
TENNESSEE
Recording Arts and Sciences John Scanlon, Faculty Advisor (Student)
Belmont University Section Jay Kadis, Faculty Advisor
(Student) Section (Student) Director of Sound Arts
Glenn O’Hara, Faculty Advisor AES Student Section Stanford AES Student Section
Wesley Bulla, Faculty Advisor Stanford University
AES Student Section AES Student Section Ex’pression Center for New Media
Conservatory of The Recording 6601 Shellmound St. CCRMA/Dept. of Music
Belmont University Stanford, CA 94305-8180
Nashville, TN 37212 Arts and Sciences Emeryville, CA 94608
2300 E. Broadway Rd. Tel. +1 510 654 2934 Tel. +1 650 723 4971
E-mail Fax +1 650 723 8468
belmont_section@aes.org Tempe, AZ 85282 Fax +1 510 658 3414
Tel. +1 480 858 9400, 800 562 E-mail expression_center_ E-mail stanford@aes.org
Middle Tennessee State 6383 (toll-free) section@aes.org University of Southern
University Section (Student) Fax +1 480 829 1332 California Section (Student)
E-mail Long Beach City College
Phil Shullo, Faculty Advisor Kenneth Lopez
conservatory_RAS@aes.org Section (Student)
AES Student Section Faculty Advisor
Nancy Allen, Faculty Advisor
Middle Tennessee State University AES Student Section
CALIFORNIA AES Student Section
301 E. Main St., Box 21 University of Southern California
Long Beach City College
Murfreesboro, TN 37132 American River College 840 W. 34th St.
4901 E. Carson St.
Tel. +1 615 898 2553 Section (Student) Los Angeles, CA 90089-0851
Long Beach, CA 90808
E-mail mtsu_section@aes.org Eric Chun, Faculty Advisor Tel. +1 213 740 3224
Tel. +1 562 938 4312
AES Student Section Fax +1 213 740 3217
Fax +1 562 938 4409
Nashville Section American River College Chapter E-mail usc@aes.org
E-mail long_beach@aes.org
Tom Edwards 4700 College Oak Dr.
MTV Networks Sacramento, CA 95841 Los Angeles Section COLORADO
330 Commerce St. Tel. +1 916 484 8420 Geoff Christopherson Colorado Section
Nashville, TN 37201 E-mail american_river@aes.org JBL Professional Roy Pritts

2873 So. Vaughn Way AES Student Section c/o Stenbaek Fax +7 095 187 7174
Aurora, CO 80014 The Art Institute of Seattle Mozartsvej 11, 1 TV E-mail all_russian_state@aes.org
Tel. +1 303 369 9514 2323 Elliott Ave. DK-2450, Copenhagen SV,
E-mail Seattle, WA 98121 Denmark Moscow Section
colorado_section@aes.org Tel. +1 206 448 0900 Tel. +45 6133 4588 Michael Lannie
E-mail art_institute_seattle_ E-mail denmark_section@aes.org Research Institute for
University of Colorado at section@aes.org Television and Radio
Denver Section (Student) Danish Student Section Acoustic Laboratory
Roy Pritts, Faculty Advisor Preben Kvist 12-79 Chernomorsky bulvar
AES Student Section CANADA c/o Stenbaek RU-113452 Moscow, Russia
University of Colorado at Denver Alberta Section Mozartsvej 11, 1 TV Tel. +7 095 2502161, +7 095
Dept. of Professional Studies Joshua Tidsbury DK-2450, Copenhagen SV, 1929011
Campus Box 162 AES Alberta Section Denmark Fax +7 095 9430006
P.O. Box 173364 716 Lake Ontario Dr. S.E. Tel. +45 6133 4588 E-mail
Denver, CO 80217-3364 Calgary, Alberta T2J 3J8 E-mail moscow_section@aes.org
Tel. +1 303 556 2795 Canada copenhagen_section@aes.org
Fax +1 303 556 2335 Tel. +1 403 803 4522 Russian Academy of Music
E-mail E-mail alberta@aes.org FINLAND Student Section
cu_denver_section@aes.org Finnish Section Igor Petrovich Veprintsev
Vancouver Section Kalle Koivuniemi Faculty Advisor
OREGON David Linder Nokia Research Center Sound Engineering Division
Portland Section 93.7 JRfm/600am Radio, A P.O. Box 100 30/36 Povarskaya Street
Tony Dal Molin Division of the Jim Pattison FI-33721 Tampere, Finland RU 121069, Moscow, Russia
Audio Precision, Inc. Broadcast Group Tel. +358 7180 35452 Tel. +7 095 291 1532
5750 S.W. Arctic Dr. 300-1401 West 8th Ave. Fax +358 7180 35897 E-mail russian_academy_
Portland, OR 97005 Vancouver, BC V6H 1C9 E-mail finnish_section@aes.org section@aes.org
Tel. +1 503 627 0832 Canada
Fax +1 503 641 8906 E-mail NETHERLANDS St. Petersburg Section
E-mail portland_section@aes.org vancouver_section@aes.org Irina A. Aldoshina
Netherlands Section
Vancouver Student Section Rinus Boone St. Petersburg University of
UTAH Gregg Gorrie, Faculty Advisor Voorweg 105A Telecommunications
AES Greater Vancouver NL-2715 NG Zoetermeer Gangutskaya St. 16, #31
Brigham Young University
Student Section Netherlands RU-191187 St. Petersburg
Section (Student)
Centre for Digital Imaging and Tel. +31 15 278 14 71, +31 62 Russia
Timothy Leishman,
Sound 127 36 51 Tel. +7 812 272 4405
Faculty Advisor
3264 Beta Ave. Fax +31 79 352 10 08 Fax +7 812 316 1559
BYU-AES Student Section
Burnaby, B.C. V5G 4K4, Canada E-mail E-mail
Department of Physics and
Tel. +1 604 298 5400 netherlands_section@aes.org st_petersburg_section@aes.org
Astronomy
Brigham Young University E-mail Netherlands Student Section St. Petersburg Student Section
Provo, UT 84602 vancouver_student@aes.org Maurik van den Steen Natalia V. Tyurina
Tel. +1 801 422 4612 AES Student Section Faculty Advisor
E-mail Prins Willemstraat 26 Prosvescheniya pr., 41, 185
brigham_young_section@aes.org NORTHERN REGION, 2584 HV Den Haag, Netherlands RU-194291 St. Petersburg, Russia
Utah Section EUROPE Tel. +31 6 45702051 Tel. +7 812 595 1730
Deward Timothy E-mail netherlands_student_ Fax +7 812 316 1559
Vice President: section@aes.org E-mail st_petersburg_student
c/o Poll Sound
Søren Bech _section@aes.org
4026 S. Main Bang & Olufsen a/s
Salt Lake City, UT 84107 CoreTech NORWAY
Tel. +1 801 261 2500 SWEDEN
Peter Bangs Vej 15 Norwegian Section
Fax +1 801 262 7379 DK-7600 Struer, Denmark Jan Erik Jensen Swedish Section
E-mail utah_section@aes.org Tel. +45 96 84 49 62 Nøklesvingen 74 Ingemar Ohlsson
Fax +45 97 85 59 50 NO-0689 Oslo, Norway Audio Data Lab
WASHINGTON E-mail Office Tel. +47 22 24 07 52 Katarinavägen 22
Pacific Northwest Section vp_northern_europe@aes.org Home Tel. +47 22 26 36 13 SE-116 45 Stockholm, Sweden
Gary Louie Fax +47 22 24 28 06 Tel. +46 8 644 58 65
University of Washington BELGIUM E-mail norway_section@aes.org Fax +46 8 641 67 91
School of Music E-mail sweden@aes.org
Belgian Section
P. O. Box 353450 RUSSIA
Hermann A. O. Wilms University of Luleå-Piteå
4522 Meridian Ave. N., #201 AES Europe Region Office All-Russian State Institute of Section (Student)
Seattle, WA 98103 Zevenbunderslaan 142, #9 Cinematography Section Lars Hallberg, Faculty Sponsor
Tel. +1 206 543 1218 BE-1190 Vorst-Brussels, Belgium (Student) AES Student Section
Fax +1 206 685 9499 Tel. +32 2 345 7971 Leonid Sheetov, Faculty Sponsor University of Luleå-Piteå
E-mail Fax +32 2 345 3419 AES Student Section School of Music
pacific_nw_section@aes.org E-mail belgian_section@aes.org All-Russian State Institute of Box 744
The Art Institute of Seattle Cinematography (VGIK) S-94134 Piteå, Sweden
DENMARK W. Pieck St. 3 Tel. +46 911 726 27
Section (Student)
David G. Christensen Danish Section RU-129226 Moscow, Russia Fax +46 911 727 10
Faculty Advisor Preben Kvist Tel. +7 095 181 3868 E-mail lulea_pitea@aes.org

UNITED KINGDOM Czech Republic Student Düsseldorf Section (Student) Fax +48 71 320 3189
British Section Section Corinna A. Bock E-mail poland_section@aes.org
Heather Lane Libor Husník, Faculty Advisor AES Student Section
Audio Engineering Society AES Student Section Juelicher Str. 80 Technical University of Gdansk
P. O. Box 645 Czech Technical Univ. at Prague DE 40477 Düsseldorf, Germany Section (Student)
Slough SL1 8BJ Technická 2, Tel. +49 211 484 6665 Pawel Zwan
United Kingdom CZ-116 27 Prague 6 E-mail duesseldorf_student_ AES Student Section
Tel. +44 1628 663725 Czech Republic section@aes.org Technical University of Gdansk
Fax +44 1628 667002 Tel. +420 2 2435 2115 Sound Engineering Dept.
Ilmenau Section
E-mail uk@aes.org E-mail ul. Narutowicza 11/12
(Student)
czech_student_section@aes.org PL-80 952 Gdansk, Poland
Karlheinz Brandenburg
Home Tel. +48 58 347 23 98
Faculty Advisor
CENTRAL REGION, Office Tel. +4858 3471301
GERMANY AES Student Section
EUROPE Fax +48 58 3471114
Aachen Section (Student) Fraunhofer Institute for Digital
E-mail gdansk_u@aes.org
Michael Vorländer Media Technology IDMT
Vice President:
Faculty Advisor Langewiesener Str. 22 Wroclaw University of
Bozena Kostek DE-98693 Ilmenau, Germany
Multimedia Systems Institut für Technische Akustik Technology Section (Student)
RWTH Aachen Tel. +49 3677 69 4340
Department Andrzej B. Dobrucki
Templergraben 55 E-mail
Gdansk University of Technology Faculty Sponsor
D-52065 Aachen, Germany ilmenau_student_section@aes.org
Ul. Narutowicza 11/12 AES Student Section
80-952 Gdansk, Poland Tel. +49 241 807985 North German Section Institute of Telecommunications
Tel. +48 58 347 2717 Fax +49 241 8888214 Reinhard O. Sahr and Acoustics
Fax +48 58 347 1114 E-mail Eickhopskamp 3 Wroclaw Univ.Technology
E-mail aachen_section@aes.org DE-30938 Burgwedel, Germany Wybrzeze Wyspianskiego 27
vp_central_europe@aes.org Tel. +49 5139 4978 PL-503 70 Wroclaw, Poland
Berlin Section (Student) Fax +49 5139 5977 Tel. +48 71 320 30 68
AUSTRIA Bernhard Güttler E-mail Fax +48 71 320 31 89
Austrian Section Zionskirchstrasse 14 n_german_section@aes.org E-mail
Franz Lechleitner DE-10119 Berlin, Germany wroclaw@aes.org
Tel. +49 30 4404 72 19 South German Section
Lainergasse 7-19/2/1 Gerhard E. Picklapp
AT-1230 Vienna, Austria Fax +49 30 4405 39 03 REPUBLIC OF BELARUS
E-mail berlin@aes.org Landshuter Allee 162
Office Tel. +43 1 4277 29602 DE-80637 Munich, Germany Belarus Section
Fax +43 1 4277 9296 Tel. +49 89 15 16 17
Central German Section Valery Shalatonin
E-mail austrian_section@aes.org Fax +49 89 157 10 31
Ernst-Joachim Völker Belarusian State University of
Graz Section (Student) E-mail Informatics and
Institut für Akustik und s_german_section@aes.org
Robert Höldrich Bauphysik Radioelectronics
Faculty Sponsor Kiesweg 22-24 vul. Petrusya Brouki 6
HUNGARY BY-220027 Minsk
Institut für Elektronische Musik DE-61440 Oberursel, Germany
und Akustik Hungarian Section Republic of Belarus
Tel. +49 6171 75031
Inffeldgasse 10 István Matók Tel. +375 17 239 80 95
Fax +49 6171 85483
AT-8010 Graz, Austria Rona u. 102. II. 10 Fax +375 17 231 09 14
E-mail
Tel. +43 316 389 3172 HU-1149 Budapest, Hungary E-mail
c_german_section@aes.org
Fax +43 316 389 3171 Home Tel. +36 30 900 1802 belarusian_section@aes.org
E-mail Fax +36 1 383 24 81
Darmstadt Section (Student) E-mail
graz_student_section@aes.org SLOVAK REPUBLIC
G. M. Sessler, Faculty Sponsor hungarian_section@aes.org
Vienna Section (Student) AES Student Section Slovakian Republic Section
Jürg Jecklin, Faculty Sponsor Technical University of LITHUANIA Richard Varkonda
Vienna Student Section Darmstadt Lithuanian Section Centron Slovakia Ltd.
Universität für Musik und Institut für Übertragungstechnik Vytautas J. Stauskis Podhaj 107
Darstellende Kunst Wien Merkstr. 25 Vilnius Gediminas Technical SK-48103 Bratislava
Institut für Elektroakustik und DE-64283 Darmstadt, Germany University Slovak Republic
Experimentelle Musik Tel. +49 6151 162869 Traku 1/26, Room 112 Tel. +421 7 781 128, 7 788 437
Rienösslgasse 12 E-mail LT-2001 Vilnius, Lithuania Fax. +421 7 762 955
AT-1040 Vienna, Austria darmstadt_student@aes.org Tel. +370 5 262 91 78 E-mail
Tel. +43 1 587 3478 Fax +370 5 261 91 44 slovakian_rep@aes.org
Fax +43 1 587 3478 20 Detmold Section (Student) E-mail lithuania@aes.org
E-mail Andreas Meyer, Faculty Sponsor SWITZERLAND
vienna_student_section@aes.org AES Student Section POLAND
c/o Erich Thienhaus Institut Polish Section Swiss Section
CZECH REPUBLIC Tonmeisterausbildung Andrzej Dobrucki Joël Godel
Czech Section Hochschule für Musik Wroclaw University of AES Swiss Section
Jiri Ocenasek Detmold Technology Sonnmattweg 6
Dejvicka 36 Neustadt 22, DE-32756 Institute of Telecommunication CH-5000 Aarau
CZ-160 00 Prague 6, Czech Detmold, Germany and Acoustics Tel./Fax +41 26 670 2033
Republic Tel/Fax +49 5231 975639 Wybrzeze Wyspiannkiego 27 Switzerland
Home Tel. +420 2 24324556 E-mail PL-50-370 Wroclaw, Poland E-mail
E-mail czech_section@aes.org detmold@aes.org Tel. +48 71 320 3068

UKRAINE 209 ave Jean Jaures R. Paulo Renato 1, 2A Talcahuano 141
Ukrainian Section FR-75019 Paris, France PT-2745-147 Linda-a-Velha Buenos Aires, Argentina
Dimitri Danyuk Tel. +33 40 40 4614 Portugal Tel./Fax +5411 4 375 0116
32-38 Artyoma St., Apt. 38 Fax +33 40 40 4768 Tel. +351 214145827 E-mail
UA 04053 Kiev, Ukraine E-mail portugal@aes.org vp_latin_american@aes.org
E-mail ukrainian@aes.org Paris_student_section@aes.org
ROMANIA ARGENTINA
French Section Romanian Section Argentina Section
SOUTHERN REGION, Michael Williams Marcia Taiachin German Olguin
EUROPE Ile du Moulin Radio Romania Talcahuano 141
62 bis Quai de l’Artois 60-62 Grl. Berthelot St. Buenos Aires, Argentina 1013
Vice President: FR-94170 Le Perreux sur RO-79756 Bucharest, Romania Tel./Fax +5411 4 375 0116
Ivan Stamac Marne, France Tel. +40 1 303 12 07 E-mail
Ivlje 4 Tel. +33 1 48 81 46 32 Fax +40 1 222 69 19 argentina_section@aes.org
HR-10040 Zagreb, Croatia Fax +33 1 47 06 06 48 E-mail
Tel. +385 1 482 23 61 E-mail french_section@aes.org romanian_section@aes.org BRAZIL
Tel./Fax +385 1 457 44 03
E-mail SERBIA AND MONTENEGRO Brazil Section
Louis Lumière Section
vp_southern_europe@aes.org José Carlos Giner
(Student) Serbia and Montenegro Rua Marechal Cantuária # 18
Julien Basseres Section Urca-Rio de Janeiro
BOSNIA-HERZEGOVINA 4 rue d’Issy Tomislav Stanojevic RJ-2291-060, Brazil
Bosnia-Herzegovina Section FR 92170, Vanves, France Sava centre Tel. +55 21 2244 6530
Jozo Talajic Tel. +33 06 60 12 44 92 M. Popovica 9 Fax +55 21 2244 7113
Bulevar Mese Selimovica 12 E-mail YU-11070 Belgrade, Yugoslavia E-mail aesbrasil@aes.org
BA-71000 Sarajevo louis_lumiere_section@aes.org Tel. +381 11 311 1368, Fax +381
Bosnia–Herzegovina 11 605 578 CHILE
Tel. +387 33 455 160 GREECE E-mail
serbia_montenegro_section Chile Section
Fax +387 33 455 163 Greek Section Andres Schmidt
E-mail bosnia_herzegovina_ @aes.org
Vassilis Tsakiris Hernan Cortes 2768
section@aes.org Crystal Audio Ñuñoa, Santiago de Chile
SLOVENIA
Aiantos 3a Vrillissia Tel. +56 2 4249583
BULGARIA GR 15235 Athens, Greece Slovenian Section
E-mail chile@aes.org
Tel. + 30 2 10 6134767 Tone Seliskar
Bulgarian Section RTV Slovenija
Fax + 30 2 10 6137010 COLOMBIA
Konstantin D. Kounov Kolodvorska 2
Bulgarian National Radio E-mail Colombia Section
greek_section@aes.org SI-1550 Ljubljana, Slovenia
Technical Dept. Tel. +386 61 175 2708 Mercedes Onorato
4 Dragan Tzankov Blvd. Fax +386 61 175 2710 Talcahuano 141
ISRAEL Buenos Aires, Argentina
BG-1040 Sofia, Bulgaria E-mail
Tel. +359 2 65 93 37, +359 2 Israel Section slovenian @ aes.org Tel./Fax +5411 4 375 0116
9336 6 01 Ben Bernfeld Jr. E-mail
Fax +359 2 963 1003 H. M. Acustica Ltd. SPAIN colombia_section@aes.org
E-mail 20G/5 Mashabim St. Spanish Section
IL-45201 Hod Hasharon, Israel Javeriana University Section
bulgarian_section@ aes.org Juan Recio Morillas
Tel./Fax +972 9 7444099 (Student)
Spanish Section Silvana Medrano
CROATIA E-mail israel_section@aes.org C/Florencia 14 3oD Carrera 7 #40-62
Croatian Section ES-28850 Torrejon de Ardoz Bogota, Colombia
ITALY
Zoran Vertlberg (Madrid), Spain Tel./Fax +57 1 320 8320
Italian Section Tel. +34 91 675 49 98 E-mail
Hrvatski Radio
Prisavlje 3 Carlo Perretta E-mail spanish @ aes.org javeriana_section@aes.org
HR-10000 Zagreb, Croatia AES Italian Section
Piazza Cantore 10 TURKEY Los Andes University Section
Tel. +385 1 634 27 23
IT-20134 Milan, Italy Turkish Section (Student)
Fax +385 1 634 30 65,
Tel. +39 338 9108768 Sorgun Akkor Jorge Oviedo Martinez
or 1 611 58 29
Fax +39 02 58440640 STD Gazeteciler Sitesi, Transversal 44 # 96-17
E-mail croatian_section@aes.org
E-mail italian@aes.org Yazarlar Sok. 19/6 Bogota, Colombia
Croatian Student Section Esentepe 80300 Tel./Fax +57 1 339 4949 ext.
Marija Salovarda Italian Student Section Istanbul, Turkey 2683
Tatjane Marinic 2 Franco Grossi, Faculty Advisor Tel. +90 212 2889825 E-mail losandes @aes.org
HR 10430 Samobor, Croatia AES Student Section Fax +90 212 2889831 San Buenaventura University
Tel. +385 1 3363 103E-mail Viale San Daniele 29 E-mail Section (Student)
croatian_student_section@aes.org IT-33100 Udine, Italy aesturkey@aes.org Nicolas Villamizar
Tel. +39 0432227527 Transversal 23 # 82-41 Apt. 703
FRANCE E-mail Int.1
Conservatoire de Paris italian_student@aes.org Bogota, Colombia
Section (Student) LATIN AMERICAN REGION
Tel. +57 1 616 6593
Daniel Zalay, Faculty Advisor PORTUGAL Fax +57 1 622 3123
Conservatoire de Paris Portugal Section Vice President: E-mail
Department Son Rui Miguel Avelans Coelho Mercedes Onorato sanbuenaventura@aes.org
1206 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November
SECTIONS CONTACTS DIRECTORY

ECUADOR
Ecuador Section
Juan Manuel Aguillo
Av. La Prensa 4316 y Vaca de Castro
Quito, Ecuador
Tel./Fax +59 32 2598 889
E-mail ecuador_section@aes.org

I.A.V.Q. Section (Student)
Felipe Mardones
315 Carrion y Plaza
Quito, Ecuador
Tel./Fax +59 3 225 61221
E-mail iavq@aes.org

MEXICO
Mexican Section
Jorge Urbano
Cofre de Perote 132
Fracc. Los Pirules Tlalnepantla
Edo. de Mexico, C.P. 54040, Mexico
Tel./Fax +52 55 5240 1203
E-mail mexican_section@aes.org

PERU
Orson Welles Institute Section (Student)
Javier Antón
Av. Salaberry 3641, San Isidro
Lima, Peru
Tel. +51 1 264 1773
Fax +51 1 264 1878
E-mail orsonwelles@aes.org

Peru Section
Armando Puente De La Vega
Av. Salaberry 3641, San Isidro
Lima, Peru
Tel. +51 1 264 1773
Fax +51 1 264 1878
E-mail peru_section@aes.org

URUGUAY
Uruguay Section
César Lamschtein
Universidad ORT
Cuareim 1451
Montevideo, Uruguay
Tel. +59 1 902 1505
Fax +59 1 900 2952
E-mail uruguay@aes.org

VENEZUELA
Taller de Arte Sonoro, Caracas Section (Student)
Carmen Bell-Smythe de Leal, Faculty Advisor
AES Student Section
Taller de Arte Sonoro
Ave. Rio de Janeiro
Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail caracas_section@aes.org

Venezuela Section
Elmar Leal
Ave. Rio de Janeiro
Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail venezuela_section@aes.org

INTERNATIONAL REGION
Vice President: Neville Thiele
10 Wycombe St.
Epping, NSW AU-2121, Australia
Tel. +61 2 9876 2407
Fax +61 2 9876 2749
E-mail vp_international@aes.org

AUSTRALIA
Adelaide Section
David Murphy
Krix Loudspeakers
14 Chapman Rd.
Hackham AU-5163, South Australia
Tel. +618 8 8384 3433
Fax +618 8 8384 3419
E-mail adelaidean_section@aes.org

Brisbane Section
David Spearritt
AES Brisbane Section
P.O. Box 642
Roma St. Post Office
Brisbane, Qld. AU-4003, Australia
Office Tel. +61 7 3364 6510
E-mail brisbane_section@aes.org

Melbourne Section
Graham J. Haynes
P.O. Box 5266
Wantirna South, Victoria AU-3152, Australia
Tel. +61 3 9887 3765
Fax +61 3 9887 1688
E-mail melbourne@aes.org

Sydney Section
Howard Jones
AES Sydney Section
P.O. Box 766
Crows Nest, NSW AU-2065, Australia
Tel. +61 2 9417 3200
Fax +61 2 9417 3714
E-mail sydney@aes.org

HONG KONG
Hong Kong Section
Goeffrey Stitt
HKAPA, School of Film and Television
1 Gloucester Rd.
Wanchai, Hong Kong
Tel. +852 2584 8664
Fax +852 2588 1303
E-mail hong_kong@aes.org

INDIA
India Section
Avinash Oak
Avisound
A-20, Deepanjali
Shahaji Raje Marg
Vile Parle East
Mumbai IN-400 057, India
Tel. +91 22 26827535
E-mail indian_section@aes.org

JAPAN
Japan Section (AES)
Katsuya (Vic) Goh
2-15-4 Tenjin-cho, Fujisawa-shi
Kanagawa-ken 252-0814, Japan
Tel./Fax +81 466 81 0681
E-mail aes_japan_section@aes.org

KOREA
Korea Section
Seong-Hoon Kang
Taejeon Health Science College
Dept. of Broadcasting Technology
77-3 Gayang-dong Dong-gu
Taejeon, Korea
Tel. +82 42 630 5990
Fax +82 42 628 1423
E-mail korea_section@aes.org

MALAYSIA
Malaysia Section
C. K. Ng
King Musical Industries Sdn Bhd
Lot 5, Jalan 13/2
MY-46200 Kuala Lumpur, Malaysia
Tel. +603 7956 1668
Fax +603 7955 4926
E-mail malaysia@aes.org

PHILIPPINES
Philippines Section
Dario (Dar) J. Quintos
125 Regalia Park Tower
P. Tuazon Blvd., Cubao
Quezon City, Philippines
Tel./Fax +63 2 4211790, +63 2 4211784
E-mail philippines_section@aes.org

SINGAPORE
Singapore Section
Kenneth J. Delbridge
480B Upper East Coast Rd.
Singapore 466518
Tel. +65 9875 0877
Fax +65 6220 0328
E-mail singapore@aes.org

STUDENT DELEGATE ASSEMBLY

NORTH/LATIN AMERICA REGIONS
Chair: Marie Desmarteau
McGill University Section (AES)
72 Delaware Avenue
Ottawa K2P 0Z3, Ontario, Canada
Home Tel. +1 613 236 5411
Office Tel. +1 514 398 4535
E-mail tonmaestra@hotmail.com

Vice Chair: Felice Santos-Martin
American River College (AES)
Tel. +1 916 802 2084
E-mail felicelazae@hotmail.com

EUROPE/INTERNATIONAL REGIONS
Chair: Martin Berggren
European Student Section
Varvsgatan 35
Arvika, SE-67133, Sweden
Home Tel. +46 0570 12018
Office Tel. +46 0570 38500
E-mail martin.bergren@imh.se

Vice Chair: Daniel Hojka
TU Graz
Moserhofgasse 34/28
AT-8010 Graz, Austria
Tel. +43 650 6471049
E-mail daniel.hojka@toningenieur.info
AES CONVENTIONS AND CONFERENCES
The latest details on the following events are posted on the AES Website: http://www.aes.org

117th Convention
San Francisco, CA, USA
Date: 2004 October 28–31
Location: Moscone Center, San Francisco, CA, USA

Convention chair:
John Strawn
S Systems
15 Willow Avenue
Larkspur, CA 94939, USA
Telephone: +1 415 927 8856
Fax: +1 415 927 2935
Email: 117th_chair@aes.org

Papers cochair:
Brian Link
Dolby Laboratories
Email: 117th_papers@aes.org

Papers cochair:
Rob Maher
Montana State University-Bozeman
Email: 117th_papers@aes.org

Exhibit information:
Chris Plunkett/Donna Vivero
Telephone: +1 212 661 8528, ext. 30
Fax: +1 212 682 0477
Email: 117th_exhibits@aes.org

Call for papers: Vol. 52, No. 3, p. 319 (2004 March)
Call for workshop participants: Vol. 52, No. 5, p. 569 (2004 May)
Convention preview: Vol. 52, No. 7/8, pp. 828–859 (2004 July/August)

118th Convention
Barcelona, Spain
Date: 2005 May 28–31
Location: CCIB Convention Centre, Barcelona, Spain

Convention cochair:
Eloi Batlle
Universitat Pompeu Fabra
Email: 118th_chair@aes.org

Convention cochair:
Luis Ortiz Berenguer
Universidad Politecnica de Madrid
Email: 118th_chair@aes.org

Papers cochair:
Benjamin Bernfeld
Fax: +49 (0)40 3603 046919
Email: 118th_papers@aes.org

Papers cochair:
Basilio Pueo Ortega
University of Alicante
Email: 118th_papers@aes.org

Exhibit information:
Thierry Bergmans
Email: 118th_exhibits@aes.org

Call for papers: Vol. 52, No. 10, p. 1111 (2004 October)

26th Conference, "Audio Forensics in the Digital Age"
Denver, CO, USA
Date: 2005 July 7–9
Location: Adam's Mark Hotel, Denver, CO, USA

Conference chair:
Roy Pritts
University of Colorado at Denver
Email: 26th_chair@aes.org

Papers cochair:
Richard Sanders
University of Colorado at Denver
Email: 26th_papers@aes.org

Papers cochair:
Tom Owen
Owl Investigations
Email: 26th_papers@aes.org

Call for papers: this issue, p. 1200

119th Convention
New York, NY, USA
Date: 2005 October 7–10
Location: Jacob K. Javits Convention Center, New York, NY, USA

Exhibit information:
Chris Plunkett/Donna Vivero
Telephone: +1 212 661 8528, ext. 30
Fax: +1 212 682 0477
Email: 119th_exhibits@aes.org

Reports of recent AES conventions and conferences are now available online; go to www.aes.org/events/reports.

All of the papers from AES conventions and conferences through 2003 are available on the 20-disk AES Electronic Library. The 2003 update disks for the Electronic Library are now available. For price and ordering information go to http://www.aes.org, send email to Andy Veloz at aav@aes.org, or call any AES office at +1 212 661 8528, ext. 39 (USA), +44 1628 663725 (UK), +33 1 4881 4632 (Europe).

INFORMATION FOR AUTHORS

Presentation
Authors should submit a PDF for review by e-mail to Gerri Calamusa, Senior Editor, gmc@aes.org. If the manuscript is accepted for publication, the author will be asked to submit the original word-processing file (double-spaced for copyediting) and illustration files.

Review
Manuscripts are reviewed anonymously by members of the review board. After the reviewers' analysis and recommendation to the editors, the author is advised of either acceptance or rejection. On the basis of the reviewers' comments, the editor may request that the author make certain revisions which will allow the paper to be accepted for publication.

Content
Technical articles should be informative and well organized. They should cite original work or review previous work, giving proper credit. Results of actual experiments or research should be included. The Journal cannot accept unsubstantiated or commercial statements.

Organization
An informative and self-contained abstract of about 60 words must be provided. The manuscript should develop the main point, beginning with an introduction and ending with a summary or conclusion. Pages should be numbered consecutively. Illustrations must have informative captions and must be referred to in the text.

References should be cited numerically in brackets in order of appearance in the text. Footnotes should be avoided, when possible, by making parenthetical remarks in the text.

Mathematical symbols, abbreviations, acronyms, etc., which may not be familiar to readers must be spelled out or defined the first time they are cited in the text.

Subheads are appropriate and should be inserted where necessary. Paragraph division numbers should be of the form 0 (only for introduction), 1, 1.1, 1.1.1, 2, 2.1, 2.1.1, etc.

References should appear at the end of the manuscript after the text in order of appearance. References to periodicals should include the authors' names, title of article, periodical title, volume, page numbers, year, and month of publication. Book references should contain the names of the authors, title of book, edition (if other than first), name and location of publisher, publication year, and page numbers. References to AES convention papers should be replaced with Journal publication citations if the convention paper has been published in the Journal.

Illustrations
Figure captions should be duplicated in the word-processing document following the references. Captions should be concise. All figures should be labeled with the author's name and figure number. All illustrations are printed in black and white. For more information about preparing digital art go to http://dx.sheridan.com.

The size of illustrations when printed in the Journal is usually 82 mm (3.25 inches) wide, although 170 mm (6.75 inches) wide can be used if required. Letters on original illustrations (before reduction) must be large enough so that the smallest letters are at least 1.5 mm (1/16 inch) high when the illustrations are reduced to one of the above widths. If possible, letters on all original illustrations should be the same size.

Units and Symbols
Metric units according to the System of International Units (SI) should be used. For more details, see G. F. Montgomery, "Metric Review," JAES, Vol. 32, No. 11, pp. 890–893 (1984 Nov.), and J. G. McKnight, "Quantities, Units, Letter Symbols, and Abbreviations," JAES, Vol. 24, No. 1, pp. 40, 42, 44 (1976 Jan./Feb.). Following are some frequently used SI units and their symbols, some non-SI units that may be used with SI units (▲), and some non-SI units that are deprecated (■).

Unit Name                      Unit Symbol
ampere                         A
bit or bits                    spell out
bytes                          spell out
decibel                        dB
degree (plane angle) (▲)       °
farad                          F
gauss (■)                      Gs
gram                           g
henry                          H
hertz                          Hz
hour (▲)                       h
inch (■)                       in
joule                          J
kelvin                         K
kilohertz                      kHz
kilohm                         kΩ
liter (▲)                      l, L
megahertz                      MHz
meter                          m
microfarad                     µF
micrometer                     µm
microsecond                    µs
milliampere                    mA
millihenry                     mH
millimeter                     mm
millivolt                      mV
minute (time) (▲)              min
minute (plane angle) (▲)       ’
nanosecond                     ns
oersted (■)                    Oe
ohm                            Ω
pascal                         Pa
picofarad                      pF
second (time)                  s
second (plane angle) (▲)       ”
siemens                        S
tesla                          T
volt                           V
watt                           W
weber                          Wb

AES SUSTAINING MEMBER ORGANIZATIONS