Journal of The Audio Engineering Society Audio / Acoustics / Applications
DEPARTMENTS
News of the Sections .......................... 1188
Upcoming Meetings ............................. 1192
Sound Track ................................... 1192
New Products and Developments ................. 1193
Available Literature .......................... 1194
Membership Information ........................ 1195
Advertiser Internet Directory ................. 1197
Sections Contacts Directory ................... 1201
AES Conventions and Conferences ............... 1208
PRESIDENT’S MESSAGE
It is a great pleasure for me to address the membership of this truly remarkable society. In the past I have served the AES as a Governor and as Education Chair. I have also served on two convention committees, chaired an international conference, and have been chair of a regional section for the past nine years. Through these experiences, I have developed tremendous respect for the society and am very grateful to the membership for your vote of support. I would also like to take this opportunity to thank the staff at HQ and all the committee members I have worked with over the years. I look at this presidency as a new opportunity to make a unique contribution to the society and its members.

It is clear that there are many wonderful aspects to the AES, but what I would like to focus on are the challenges that we will face and try to overcome over the next year. One challenge is maintaining and broadening our membership. Over the last two years we have seen a dramatic increase in our overall membership numbers—especially with respect to students. This is partly due to streamlining the online application and registration procedure. How do we continue to keep our membership interested and growing? Our membership committee has cited ways we can retain and grow, through online access as well as by offering more resources through our website on a subscription basis with preferential rates for members. We should also consider enhanced online delivery of educational tools for students; for example, tutorials and convention workshop materials placed on the website for our members to access. Expanding our reach through online services and registrations will also facilitate a more international/global set of initiatives. With market reforms in the Far East, particularly China, the time is upon us to diversify and expand our membership in these growing technological markets. We must give special attention to recruitment and setting up new sections in these areas, in addition to promoting regional conferences and activities, which has been shown to spur membership growth.

Because of the growing rate of student membership, we have a responsibility in AES to help set standards for audio education programs worldwide. Students continually look to us for educational advice; they are the present, as well as the future, of this society. I have spent a great deal of time working with the Education Committee and students and know that there is much more we can do. Growing emphasis on recording, postproduction, and design competitions with increased sponsorship at conventions is only a small part of what must happen in education. The Education Committee has set up a forum for discussion of the role of the AES in audio education and our role in promoting mentorship. Through a growing number of tutorials and workshops at conventions, the AES promotes education at all levels, not just for students. It is my strong feeling that AES must also work with other organizations to bring hearing protection awareness and understanding to students as well as educators.

We rely on our sustaining members and exhibitors and must be continually aware of their needs and concerns in this changing marketplace. AES conventions and conferences are still the best meeting place for professionals involved in all aspects of audio, but more can still be done to promote this partnership at conventions, regional conferences, and through online marketing and development.

Finally, we, the Board of Governors, are accountable to the membership of the society. We must be open to change, be very good listeners, and set future goals and directions for marketing strategies, membership growth, and international standards. It is my intent to work very closely with the BOG, and especially with the vice-presidents of the various regions, to make this happen. Having said this, I welcome feedback on the operations of the AES Executive Committee from all our members on an ongoing basis.

On my recent visit to Latin America—which has shown extreme growth in AES membership over the past few years—there were three words that I used frequently to express my gratitude and appreciation for the friendship and professionalism I experienced from the section members. I would like to conclude by repeating these words: “Es un honor.” It really is.

Theresa Leonard
President
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1123
PAPERS
STANLEY P. LIPSHITZ, AES Fellow, ROBERT A. WANNAMAKER, AND JOHN VANDERKOOY, AES Fellow
The question of which spectrally shaped dither signals are appropriate for use in quantizing systems, with and without noise-shaping error feedback, or in recursive digital filters using dithered quantization at the output, is addressed. It is shown that dithers that are acceptable when no feedback is present may be unacceptable if feedback is introduced. In each case, certain classes of dither generators are shown to be appropriate for audio applications.
PAPERS DITHERED NOISE SHAPERS
tween zero and the Nyquist frequency $1/2T$ gives the total error power $\Delta^2/4$.

0.2 Spectrally Colored Dithers

The recent literature has seen some discussion of dithers that are not spectrally white, and whose samples are therefore not all statistically independent of one another. Necessary and sufficient conditions upon the statistics of such dithers have been discovered that guarantee that the corresponding total error spectra will be the sum of the dither spectrum and a white quantization noise component of $\Delta^2/12$ total power [2, theorem 4]; that is,

$$\mathrm{PSD}_\varepsilon(f) = \mathrm{PSD}_\nu(f) + \frac{\Delta^2 T}{6}. \tag{4}$$

At least one spectrally shaped dither is well known [1] that has also been shown to satisfy these conditions [2]—it is simple first-order high-pass dither such as would be generated by the scheme shown in Fig. 2. Here the TPDF dither samples $\nu$ represent differences between consecutive samples of a random process $\eta$,

$$\nu_n = \eta_n - \eta_{n-1} \tag{5}$$

where the subscripts time-index the samples and the $\eta_n$ are i.i.d. with rectangular probability density function (RPDF). That is to say, the $\eta_n$ all have probability density functions (pdfs) of the form

$$\Pi_\Gamma(\eta) \triangleq \begin{cases} \dfrac{1}{\Gamma}, & \text{if } -\dfrac{\Gamma}{2} < \eta \le \dfrac{\Gamma}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$

Here $\triangleq$ denotes equality by definition.²

Note that this sort of high-pass dither requires the generation of only one new random number $\eta_n$ per sampling period, as opposed to two such numbers per sampling period if white TPDF dither is used. This slight saving of time is sometimes important to software designers working near the temporal limits of their signal processing hardware.

0.3 Dithered Systems with Noise-Shaping Error Feedback

The question of a dither’s appropriateness is complicated by the adoption of a noise-shaping quantization scheme of the sort illustrated in Fig. 3. Such systems produce a shaped total error $e$ that is spectrally shaped with respect to $\varepsilon$ according to the formula [6]

$$\mathrm{PSD}_e(f) = |1 - H(\mathrm{e}^{-\mathrm{j}2\pi fT})|^2\, \mathrm{PSD}_\varepsilon(f) \tag{8}$$

where $H(\mathrm{e}^{-\mathrm{j}2\pi fT})$ represents the frequency response of the noise-shaping filter $H(z)$ shown in Fig. 3. This filter always includes one implicit delay element that prevents the current error from being subtracted from the current input.

From Eq. (8) we note that for a given noise-shaper design the power spectrum of $e$ is entirely determined by the spectrum of $\varepsilon$. It has been observed [6], [7] that use of the usual TPDF dither with samples that are statistically

² [...] the convolution of $n$ rectangular window functions. Throughout the sequel, we will refer to the combination of $n$ independent RPDF processes as an $n$RPDF process, so that an RPDF process may also be referred to as 1RPDF and a TPDF process as 2RPDF.
Fig. 1. Schematic of nonsubtractively dithered quantizing system without noise-shaping error feedback.
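The high-pass dither generator of Eq. (5) is easy to exercise numerically. The sketch below (NumPy; the seed, sample count, and variable names are our own choices, not the paper's) differences an RPDF process and checks two properties quoted in the text: the resulting dither is TPDF with power Delta^2/6, and adjacent samples are anticorrelated, which is what tilts its spectrum toward high frequencies.

```python
import numpy as np

# Sketch of the first-order high-pass dither of Eq. (5):
#   nu_n = eta_n - eta_{n-1},  eta_n i.i.d. RPDF of width Delta.
rng = np.random.default_rng(0)
Delta = 1.0                                   # quantizer step size
N = 200_000                                   # arbitrary sample count
eta = rng.uniform(-Delta / 2, Delta / 2, N)   # RPDF process, width Delta
nu = eta[1:] - eta[:-1]                       # Eq. (5): TPDF high-pass dither

# Differencing two independent RPDF samples yields a TPDF sample, so the
# dither power is Delta^2/6, twice the RPDF power Delta^2/12.
var_nu = nu.var()

# The lag-1 autocorrelation is -E[eta^2] = -Delta^2/12: adjacent dither
# samples are anticorrelated, giving the high-pass spectral tilt.
r1 = np.mean(nu[1:] * nu[:-1])
```

Only one new random number per sample is drawn here, matching the economy argument made in the text.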
independent of one another results in an $\varepsilon$ spectrum that is flat and independent of the system input $x'$. The reason for this is that the noise-shaping feedback path always includes the aforementioned delay element so that the current dither sample $\nu_n$ is always statistically independent of the current input to the dither summer $x_n$ despite the presence of the feedback path. We will see that this is sufficient to ensure that the spectrum of $\varepsilon$ is flat and independent of the system input, as has been observed in practice.

Unfortunately the one-sample delay is not as helpful in a system using spectrally shaped dither. One would hope that such a system would produce a shaped total error spectrum given by substitution of Eq. (4) into Eq. (8), namely,

$$\mathrm{PSD}_e(f) = |1 - H(\mathrm{e}^{-\mathrm{j}2\pi fT})|^2 \left[\mathrm{PSD}_\nu(f) + \frac{\Delta^2 T}{6}\right]. \tag{9}$$

Unfortunately this is not always the case. To understand why, consider simple high-pass dither. Here $x_n$ contains vestiges of $\eta_{n-1}$ arriving via the feedback path, and this signal is also present in $\nu_n$. Hence the current dither sample is not, in general, independent of the current $x$. In fact, we will show that with high-pass dither and an arbitrary noise-shaping system the shaped total error spectrum is not given by Eq. (9) and is not independent of the system input.

Analysis of this sort of system is of potentially very wide interest, not only because noise-shaping converters are now commonplace, but also because garden-variety digital filters often employ feedback of this sort when the filtering operation must produce an output of specified precision. A direct-form recursive filter of this sort is shown in Fig. 4. [...]jective, we will not explicitly consider such elaborate systems since the results derived for a simple noise-shaping quantizer will be directly applicable thereto.

0.4 Outline of Paper

The salient question raised by the preceding discussion, then, is the following: what spectrally shaped dithers are appropriate for use in quantizing systems? Furthermore, it is clear that for systems without noise shaping, the answer will be very different from that for noise-shaping systems due to the absence of a feedback path. Indeed we will proceed by treating the two cases quite separately.

In both instances, however, we will assume the same reasonably general scheme for the generation of dither. The first step in our treatment, then, will be to define and characterize the statistics of the family of dithers under consideration. This is done in Section 1.

The analysis of quantizing systems will begin in Section 2 with the simpler case where no feedback is present, the case of noise shaping being taken up in Section 3. In both instances our objective will be to find conditions upon the dither signal that will ensure that the spectrum $\mathrm{PSD}_\varepsilon$ of the total error is independent of the system input.³ To accomplish this we will examine $E[\varepsilon_1\varepsilon_2]$, the correlation between two samples of the total error, $\varepsilon_1$ and $\varepsilon_2$, separated in time by a time lag of $\ell$ sampling periods. If the value of this quantity depends only on $\ell$ (not on the system inputs) then the error is said to be wide-sense stationary and we can construct its autocorrelation function,

$$r_\varepsilon(\ell) \triangleq \begin{cases} E[\varepsilon^2], & \text{if } \ell = 0 \\ E[\varepsilon_1\varepsilon_2](\ell), & \text{otherwise.} \end{cases} \tag{10}$$
Fig. 3. Schematic of nonsubtractively dithered quantizing system with noise-shaping error feedback.
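The behavior claimed above for white TPDF dither can be checked in simulation. The sketch below is one common reading of the Fig. 3 loop, not a verbatim transcription: a mid-tread quantizer, additive TPDF dither, and the past total error fed back through a first-order shaper H(z) = z^-1 (this particular H, the seed, the static input, and all names are our own illustrative choices). With white TPDF dither the total error epsilon should be white with power Delta^2/4, so the shaped error e = (1 - z^-1) epsilon should carry twice that power.

```python
import numpy as np

# Minimal sketch of a noise-shaping dithered quantizer in the spirit of
# Fig. 3, assuming H(z) = z^{-1} (one-tap feedback, coefficient 1.0).
rng = np.random.default_rng(1)
Delta = 1.0
N = 200_000
h = [1.0]                        # h[k-1] weights eps[n-k]; implicit delay

x = 0.25 * Delta * np.ones(N)    # static system input (arbitrary)
eps = np.zeros(N)                # total error: quantizer out - quantizer in
e = np.zeros(N)                  # shaped total error: output - input
for n in range(N):
    fb = sum(hk * eps[n - k] for k, hk in enumerate(h, start=1) if n - k >= 0)
    u = x[n] - fb                # input after error feedback
    nu = rng.uniform(-Delta/2, Delta/2) + rng.uniform(-Delta/2, Delta/2)  # TPDF
    y = Delta * np.round((u + nu) / Delta)   # mid-tread quantizer
    eps[n] = y - u
    e[n] = y - x[n]

# White TPDF dither: eps is white with power Delta^2/4 (dither Delta^2/6
# plus quantization Delta^2/12); e = (1 - z^{-1}) eps doubles that power.
var_eps = eps.var()
var_e = e.var()
```

The flat, input-independent epsilon spectrum observed here is exactly the property that spectrally shaped dithers can lose once the feedback path is present, which is the subject of the remainder of the paper.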
Here $T$ is the sampling period of the system, $f$ is the frequency variable in hertz, and the normalization of the transform is such that if it is integrated between zero and the Nyquist frequency $1/2T$, the result is equal to the variance of $\varepsilon$ (which is the same as the normalization of the PSDs just cited). Clearly, since $r_\varepsilon(\ell)$ is input independent by definition, if it can be constructed as indicated, then the power spectrum $\mathrm{PSD}_\varepsilon(f)$ will be input independent also.

The discussion is necessarily mathematical, but the treatment has been organized so that readers uninterested in the technical details can extract the results of broadest interest by reading Section 1.1 and the discussions following Corollary 1 (in Section 2.3) and Corollary 2 (in Section 3.3). These address the most common sorts of dithers that might be used in systems with and without noise-shaping error feedback, respectively.

[...] of a signal’s statistics with which to work than the pdf itself.

We will consider a dither signal $\nu$ whose $n$th sample can be written as

$$\nu_n = \sum_{i=-\infty}^{\infty} c_i \eta_{n-i}. \tag{12}$$

Throughout the sequel we will assume that all $\eta_i$ are i.i.d. Furthermore, although the assumption will not be made explicit until it is required in Section 3, in practice it will be taken for granted that $c_i = 0$ for $i < 0$, so that $\nu$ corresponds to the strict-sense stationary⁴ output of a causal nonrecursive dither filter $G$ of the form

$$G(z) = \sum_{i=0}^{\infty} c_i z^{-i}. \tag{13}$$
Fig. 4. Schematic of general direct-form recursive digital filter using error feedback around output requantizer.
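The filtered-dither model of Eqs. (12) and (13) is a plain FIR filter driven by i.i.d. noise, so its second-order statistics can be verified directly. The sketch below (our own illustration; the coefficient choice is the simple high-pass example from Section 0.2) generates such a dither by convolution and compares its empirical autocorrelation against the closed form r_nu(l) = E[eta^2] * sum_j c_j c_{j+l} used later in the paper.

```python
import numpy as np

# Dither generator of Eqs. (12)-(13): nu is the output of a causal FIR
# filter G(z) driven by i.i.d. RPDF noise eta.
rng = np.random.default_rng(2)
Delta = 1.0
N = 400_000
c = np.array([1.0, -1.0])                    # G(z) = 1 - z^{-1} (high-pass)
eta = rng.uniform(-Delta/2, Delta/2, N)      # RPDF, E[eta^2] = Delta^2/12
nu = np.convolve(eta, c)[:N]                 # Eq. (12): nu_n = sum_i c_i eta_{n-i}

# Predicted autocorrelation: r_nu(l) = E[eta^2] * sum_j c_j c_{j+l},
# giving Delta^2/6 at lag 0 and -Delta^2/12 at lag 1 for c = {1, -1}.
E_eta2 = Delta**2 / 12
r_pred = [E_eta2 * float(np.dot(c[: len(c) - l], c[l:])) for l in range(len(c))]
r_emp = [np.mean(nu[l:] * nu[: N - l]) for l in range(len(c))]
```

Any coefficient list can be substituted for `c`; the same comparison then exercises the general dither family the paper analyzes.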
The joint pdf of the dither and the $\eta$ process can be written

$$p_{\nu,\eta}(\boldsymbol{\nu}, \boldsymbol{\eta}) = \prod_{j=-\infty}^{\infty} \delta\!\left(\nu_j - \sum_{i=-\infty}^{\infty} c_i \eta_{j-i}\right) p(\eta_j).$$

Here we have used the facts that $\nu_j$ is completely determined by choosing the $\eta_i$ and that the $\eta_i$ are i.i.d. so that their joint pdf splits into a product of identical functions that we will simply denote by $p$; that is, we take $p_{\eta_i} \equiv p\ \forall i$.

To obtain the associated cf, we now Fourier transform all variables. The transform variable corresponding to $\nu_j$ will be $u_j$ and that corresponding to $\eta_i$ will be $w_i$, where, as before, we will form real vectors $\mathbf{u}$ and $\mathbf{w}$ from these components for notational convenience,

$$\begin{aligned}
P_{\nu,\eta}(\mathbf{u}, \mathbf{w}) &= \prod_{j=-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\!\left(-\mathrm{j}2\pi \sum_{i=-\infty}^{\infty} c_i \eta_{j-i}\, u_j\right) p(\eta_j)\, \mathrm{e}^{-\mathrm{j}2\pi \eta_j w_j}\, \mathrm{d}\eta_j \\
&= \prod_{j=-\infty}^{\infty} \int_{-\infty}^{\infty} \prod_{i=-\infty}^{\infty} \mathrm{e}^{-\mathrm{j}2\pi u_j c_i \eta_{j-i}}\, p(\eta_j)\, \mathrm{e}^{-\mathrm{j}2\pi \eta_j w_j}\, \mathrm{d}\eta_j \\
&= \prod_{i=-\infty}^{\infty} \int_{-\infty}^{\infty} \prod_{j=-\infty}^{\infty} \mathrm{e}^{-\mathrm{j}2\pi c_{j-i} u_j \eta_i}\, p(\eta_i)\, \mathrm{e}^{-\mathrm{j}2\pi \eta_i w_i}\, \mathrm{d}\eta_i \\
&= \prod_{i=-\infty}^{\infty} P_\eta\!\left(w_i + \sum_{j=-\infty}^{\infty} c_{j-i} u_j\right).
\end{aligned} \tag{14}$$

By setting to zero all of the $w_i$ and all of the $u_j$ except for $u_n$, Eq. (14) yields the cf of a single dither sample,

$$P_{\nu_n}(u_n) = \prod_{i=-\infty}^{\infty} P_\eta(c_{n-i} u_n). \tag{15}$$

Also, by setting to zero all of the $w_i$ and all of the $u_j$ except for $u_n$ and $u_{n+\ell}$ (which we relabel $u_1$ and $u_2$), Eq. (14) yields

$$P_{\nu_n,\nu_{n+\ell}}(u_1, u_2) = \prod_{i=-\infty}^{\infty} P_\eta(c_{n-i}u_1 + c_{n+\ell-i}u_2) = \prod_{i=-\infty}^{\infty} P_\eta(c_i u_1 + c_{i+\ell}u_2). \tag{16}$$

The moments of a random variable can be computed from the derivatives of its characteristic function at the origin [9]. In particular, for arbitrary random variables $x$ and $y$

$$E[x^m] \triangleq \int_{-\infty}^{\infty} x^m p_x(x)\, \mathrm{d}x = \left(\frac{\mathrm{j}}{2\pi}\right)^{\!m} P_x^{(m)}(0) \tag{17}$$

and

$$E[xy] \triangleq \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} xy\, p_{x,y}(x, y)\, \mathrm{d}x\, \mathrm{d}y = \left(\frac{\mathrm{j}}{2\pi}\right)^{\!2} P_{x,y}^{(1,1)}(0, 0). \tag{18}$$

Applied to the filtered dither of Eq. (12), these relations give the dither autocorrelation

$$r_\nu(\ell) = E[\eta^2] \sum_{j=-\infty}^{\infty} c_j c_{j+\ell}$$

and hence the dither power spectrum

$$\mathrm{PSD}_\nu(f) = 2T\, E[\eta^2] \left[\sum_{j=-\infty}^{\infty} c_j^2 + 2 \sum_{\ell=1}^{\infty} \sum_{j=-\infty}^{\infty} c_j c_{j+\ell} \cos(2\pi f \ell T)\right]. \tag{19}$$

2.1 First-Order Error Statistics (without Feedback)

In the first-order case, the theorem of fundamental importance is given in [2, corollary 2 to theorem 1] (which is also in [3, theorem 7] in a slightly different but entirely equivalent form).

Lemma 1. In an NSD system where all dither values are statistically independent of all system input values, $E[\varepsilon^m]$ is independent of the distribution of the input $x$ for $m = 1, 2, \ldots, M$ if and only if

$$P_\nu^{(i)}\!\left(\frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0,\ i = 0, 1, 2, \ldots, M-1$$

in which case

$$E[\varepsilon^m] = \sum_{\ell=0}^{\lfloor m/2 \rfloor} \binom{m}{2\ell} \left(\frac{\Delta}{2}\right)^{\!2\ell} \frac{E[\nu^{m-2\ell}]}{2\ell + 1} \tag{20}$$

where the floor operator $\lfloor \cdot \rfloor$ returns the greatest integer less than or equal to its argument. These relations are often referred to as Sheppard’s corrections. The first two of them are of special interest, since they concern the mean and the variance of the error,

$$E[\varepsilon] = E[\nu] \tag{21}$$

$$E[\varepsilon^2] = E[\nu^2] + \frac{\Delta^2}{12}. \tag{22}$$

Note that Eq. (21) implies that there is no distortion of the input signal because the mean error is zero, whereas Eq. (22) implies that the error variance is also input independent so that no so-called noise modulation is present.

We need to ascertain conditions under which the filtered dither cf of Eq. (15) satisfies the requirements imposed by Lemma 1. We begin with the case of the error mean ($m = 1$), which entails the requirement that

$$P_\nu\!\left(\frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0 \tag{23}$$

in order that it be independent of the input and given by Eq. (21). Clearly, this condition will be satisfied by the dither cf of Eq. (15) if and only if for each $k \ne 0$ there exists an $i$ such that

$$P_\eta\!\left(c_i \frac{k}{\Delta}\right) = 0.$$

Requiring that the error variance be input independent introduces an additional constraint,

$$P_\nu^{(1)}\!\left(\frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0. \tag{24}$$

Applying the product rule for differentiation to Eq. (15) we have

$$P_\nu^{(1)}\!\left(\frac{k}{\Delta}\right) = \sum_{j=-\infty}^{\infty} c_j P_\eta^{(1)}\!\left(c_j \frac{k}{\Delta}\right) \prod_{\substack{i=-\infty \\ i \ne j}}^{\infty} P_\eta\!\left(c_i \frac{k}{\Delta}\right). \tag{25}$$

This will vanish as required if for some value of $i$ both

$$P_\eta\!\left(c_i \frac{k}{\Delta}\right) = 0 \quad \text{and} \quad P_\eta^{(1)}\!\left(c_i \frac{k}{\Delta}\right) = 0$$

or if $P_\eta(c_i k/\Delta) = 0$ for two distinct values of $i$, so that, although terms occur in Eq. (25) in which either one of these two functions alone will be differentiated, in any given term one will be undifferentiated and will cause the respective term to vanish in the required places.

2.2 Second-Order Error Statistics (without Feedback)

In audio applications it is not sufficient to ensure only that the error’s mean and variance are input independent. More generally, the error’s power spectrum should be constant and predictable. We now proceed to find conditions under which a spectrally shaped dither will render the complete autocorrelation function (and hence the power spectrum) of the total error input independent. Necessary and sufficient conditions for $E[\varepsilon_1\varepsilon_2]$ to be input independent, presented in [2, theorem 4], are transcribed here as the following lemma.

Lemma 2. In an NSD system where all dither values are statistically independent of all system input values,

$$E[\varepsilon_1\varepsilon_2] = E[\nu_1\nu_2]$$

for arbitrary input distributions if and only if the following three conditions are satisfied:

$$P_{\nu_1,\nu_2}\!\left(\frac{k_1}{\Delta}, \frac{k_2}{\Delta}\right) = 0 \quad \forall (k_1, k_2) \ne (0, 0) \tag{26}$$

$$P^{(0,1)}_{\nu_1,\nu_2}\!\left(\frac{k_1}{\Delta}, 0\right) = 0 \quad \forall k_1 \ne 0 \tag{27}$$

$$P^{(1,0)}_{\nu_1,\nu_2}\!\left(0, \frac{k_2}{\Delta}\right) = 0 \quad \forall k_2 \ne 0. \tag{28}$$

Subject to the conditions of Lemma 2, then, we have

$$E[\varepsilon_n \varepsilon_{n+\ell}] = \begin{cases} E[\varepsilon_n^2], & \text{if } \ell = 0 \\ E[\nu_n \nu_{n+\ell}], & \text{otherwise.} \end{cases} \tag{29}$$
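The first two Sheppard's corrections, Eqs. (21) and (22), can be confirmed numerically for the simplest case in which Lemma 1 holds: a quantizer without feedback using i.i.d. TPDF dither. The sketch below (the static input value and all names are arbitrary illustrative choices) checks that the mean error vanishes and that the mean-square error equals E[nu^2] + Delta^2/12.

```python
import numpy as np

# Numerical check of Eqs. (21)-(22) for a nonsubtractively dithered
# quantizer without feedback, using i.i.d. TPDF dither (which satisfies
# the conditions of Lemma 1 for M = 2).
rng = np.random.default_rng(3)
Delta = 1.0
N = 400_000
x = 0.37 * Delta                  # static system input (arbitrary)
nu = rng.uniform(-Delta/2, Delta/2, N) + rng.uniform(-Delta/2, Delta/2, N)
y = Delta * np.round((x + nu) / Delta)    # mid-tread quantizer
eps = y - x                               # total error

mean_eps = eps.mean()             # Eq. (21): E[eps] = E[nu] = 0
msq_eps = np.mean(eps**2)         # Eq. (22): E[eps^2] = Delta^2/6 + Delta^2/12
```

Repeating the run with other static inputs leaves both statistics unchanged, which is the input independence the lemma guarantees.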
The case where the lag parameter $\ell$ equals zero has already been handled in Section 2.1. Provided that the conditions for constancy of the error variance are met so that it is given by Eq. (22), then Eq. (29) is just the autocorrelation function of the dither apart from an added $\Delta^2/12$ at $\ell = 0$. In this case Fourier transforming both sides of Eq. (29) yields Eq. (4). Hence the error spectrum will be equal to the dither spectrum, apart from an additive white-noise component arising from the zero-lag term.

We proceed by applying each of the three conditions required by the lemma to the two-dimensional dither cf of Eq. (16). Condition 1 [Eq. (26)] is satisfied for all lags $\ell \ne 0$ if and only if for all $(k_1, k_2) \ne (0, 0)$ and $\ell \ne 0$ there exists an $i$ such that

$$P_\eta\!\left(c_i \frac{k_1}{\Delta} + c_{i+\ell} \frac{k_2}{\Delta}\right) = 0. \tag{30}$$

Condition 2 [Eq. (27)] applied to Eq. (16) gives, by the product rule,

$$P^{(0,1)}_{\nu_n,\nu_{n+\ell}}\!\left(\frac{k_1}{\Delta}, 0\right) = \sum_{j=-\infty}^{\infty} c_{j+\ell}\, P_\eta^{(1)}\!\left(c_j \frac{k_1}{\Delta}\right) \prod_{\substack{i=-\infty \\ i \ne j}}^{\infty} P_\eta\!\left(c_i \frac{k_1}{\Delta}\right).$$

Demanding that all terms in this sum go to zero at the required locations for all $\ell \ne 0$, we arrive at the same condition found for constancy of the error variance above; that is, we require for all $k \ne 0$ that either $P_\eta(c_i k/\Delta) = 0$ and $P_\eta^{(1)}(c_i k/\Delta) = 0$ for some value of $i$, or $P_\eta(c_i k/\Delta) = 0$ for any two values of $i$.

Condition 3 [Eq. (28)] is symmetric with condition 2 and yields the same conditions on the cf of $\eta$.

Clearly, the conditions for the joint error moments to be input independent are stronger than (and include) the corresponding conditions for the mean and the variance of the error. Collecting them gives us the following sufficient conditions for the error spectrum to be constant and input independent.

Theorem 1. In a nonsubtractively dithered quantizing system without noise-shaping error feedback and using dither of the form described by Eq. (12), the total error will be wide-sense stationary and independent of the system input with its PSD given by Eq. (4) if both of the following conditions are satisfied:

1) For each pair $(k_1, k_2) \ne (0, 0)$ and for each $\ell \ne 0$ there exists an $i$ such that

$$P_\eta\!\left(c_i \frac{k_1}{\Delta} + c_{i+\ell} \frac{k_2}{\Delta}\right) = 0 \tag{31}$$

and

2) for each $k \ne 0$, either there exists a value of $i$ such that

$$P_\eta\!\left(c_i \frac{k}{\Delta}\right) = 0 \tag{32}$$

and

$$P_\eta^{(1)}\!\left(c_i \frac{k}{\Delta}\right) = 0 \tag{33}$$

or there exist two distinct values of $i$ such that

$$P_\eta\!\left(c_i \frac{k}{\Delta}\right) = 0. \tag{34}$$

The conditions in the theorem are sufficient but not necessary, with more complicated and general conditions probably existing. In spite of this, the conditions of the theorem are so general as to be difficult to use, but they are of the form required for certain pdfs (see [10]). We will interpret them next in a special case of interest.

Recall that the cf of an RPDF process of width $\Delta$ is⁵

$$\mathrm{sinc}(u) \triangleq \frac{\sin(\pi \Delta u)}{\pi \Delta u}$$

and the cf of a sum of statistically independent random processes is the product of their individual cfs [5], so the cf of an $n$RPDF dither is $P(u) = \mathrm{sinc}^n(u)$. It follows that if the $\eta$ are i.i.d. and $n$RPDF, then condition 1 of Theorem 1 will be satisfied for all $\ell \ne 0$ if for each $\ell \ne 0$ there exists an $i$, call it $i_0$, such that of $c_{i_0}$ and $c_{i_0+\ell}$ one is zero and the other is a nonzero integer. To see why this is so, note that for an $\eta$ of this sort Eq. (30) becomes

$$P_\eta\!\left(c_i \frac{k_1}{\Delta} + c_{i+\ell} \frac{k_2}{\Delta}\right) = \mathrm{sinc}^n\!\left(c_{i_0} \frac{k_1}{\Delta} + c_{i_0+\ell} \frac{k_2}{\Delta}\right).$$

This equation must hold if both $k_1 \ne 0$ and $k_2 \ne 0$ since the argument of the sine function will then be a nonzero integer multiple of $\pi$ under this condition. What happens in the case where $c_{i_0} = 0$ and $k_2 = 0$ ($k_1 \ne 0$)? Then there exists $i_1 = i_0 + \ell$ such that Eq. (30) holds and becomes

$$\mathrm{sinc}^n\!\left(c_{i_1} \frac{k_1}{\Delta}\right) = 0.$$

A similar factor exists if $c_{i_0+\ell} = 0$ and $k_1 = 0$ ($k_2 \ne 0$). Hence for each pair $(k_1, k_2) \ne (0, 0)$ there exists, under the stated condition, an $i$ such that Eq. (31) holds.

What does condition 2 of Theorem 1 entail when $\eta$ is $n$RPDF with $n \ge 1$? In such a case we see that the existence of two distinct $c_i$ with values that are nonzero integers is sufficient to satisfy the requirements of both Eq.

⁵ Note that this definition of the sinc function differs slightly from the most standard definition appearing in the literature, which omits the factors of $\Delta$.
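The sinc-power form of the nRPDF characteristic function, and the zeros it places at nonzero integer multiples of 1/Delta, can be checked against the empirical cf E[exp(-j 2 pi u eta)]. The sketch below is our own illustration; it relies only on NumPy's `sinc`, whose built-in normalization sin(pi t)/(pi t) matches the paper's convention once the argument is scaled by Delta.

```python
import numpy as np

# The cf of an RPDF process of width Delta is sinc(u) = sin(pi Delta u)
# / (pi Delta u) in the paper's convention, so an nRPDF process has cf
# sinc^n(u), vanishing at every nonzero integer multiple of 1/Delta.
rng = np.random.default_rng(4)
Delta = 1.0
N = 1_000_000
eta_rpdf = rng.uniform(-Delta/2, Delta/2, N)                 # 1RPDF
eta_tpdf = eta_rpdf + rng.uniform(-Delta/2, Delta/2, N)      # 2RPDF (TPDF)

def emp_cf(samples, u):
    # Empirical characteristic function at frequency u.
    return np.mean(np.exp(-2j * np.pi * u * samples))

# At u = k/Delta (k a nonzero integer) both cfs must vanish -- this is
# the condition P_eta(c_i k/Delta) = 0 with c_i = 1.
cf1 = abs(emp_cf(eta_rpdf, 1.0 / Delta))
cf2 = abs(emp_cf(eta_tpdf, 2.0 / Delta))
# Off the zeros the TPDF cf equals sinc^2: e.g. at u = 0.5/Delta.
cf_mid = abs(emp_cf(eta_tpdf, 0.5 / Delta))
```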
(32) and Eq. (33). If, on the other hand, $\eta$ is $n$RPDF with $n \ge 2$, then it is sufficient that one nonzero integral $c_i$ exists to satisfy the requirements of Eq. (34). For instance, the cf of a TPDF process is

$$P_{\eta_i}(u) = \mathrm{sinc}^2(u).$$

Corollary 1. In a nonsubtractively dithered quantizing system without noise-shaping error feedback and using filtered dither with $\eta$ an i.i.d. $n$RPDF process, the total error will be wide-sense stationary and independent of the system input with a PSD given by Eq. (4) if both of the following conditions are satisfied:

1) For each $\ell \ne 0$ there exists an $i$ such that of $c_i$ and $c_{i+\ell}$ one is zero and the other is a nonzero integer, and

2) either $\eta$ is $n$RPDF with $n \ge 1$ and there exist at least two distinct values of $i$ such that $c_i$ is a nonzero integer, or $\eta$ is $n$RPDF with $n \ge 2$ and there exists at least one value of $i$ such that $c_i$ is a nonzero integer.

Clearly, the conditions of the theorem are met by i.i.d. TPDF dither, which corresponds to the case where $\eta$ is TPDF and the associated dither filter has a single nonzero coefficient, $c_0 = 1$. Now consider a system with a stationary RPDF $\eta$ signal. What sets of dither filter coefficients satisfy these conditions? Obviously, the requirements are met by the simple high-pass dither coefficients

$$\{\ldots, 0, 0, 0, 1, -1, 0, 0, 0, \ldots\}.$$

Any coefficient sequence satisfies the conditions if it contains at least two integers, at least one of which is either leading or trailing. For instance, the permuted sequences

$$\{\ldots, 0, 0, 0, 1, -\tfrac{1}{2}, \tfrac{1}{2}, -1, 0, 0, 0, \ldots\}$$

$$\{\ldots, 0, 0, 0, 1, -\tfrac{1}{2}, 0, \tfrac{1}{2}, -1, 0, 0, 0, \ldots\}$$

both satisfy the requirements. Fig. 6 shows the total error spectrum from a system using one of these dithers with a null system input, as well as that error spectrum normalized by the error PSD as predicted by Eq. (4) for a properly dithered system.⁶ The result of the normalization is flat, indicating that the spectrum is of the expected shape. Other suitable sequences without leading or trailing integers can also be constructed; an example is

$$\{\ldots, 0, 0, 0, \tfrac{1}{2}, -1, 0, 1, -\tfrac{1}{2}, 0, 0, 0, \ldots\}.$$
Fig. 6. PSD(f) for quantizing system without error feedback and using a dither filter with RPDF input and coefficients {0.5, −1.0, 0.5,
−1.0}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by predicted PSD.
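The two conditions of Corollary 1 are purely combinatorial for a finite coefficient list, so they lend themselves to a small checker. The sketch below is our own construction (function names and the zero-padding device are not from the paper): implicit zeros surrounding a finite sequence are made explicit by padding, so that a leading or trailing nonzero integer automatically pairs with a zero at every lag.

```python
# Checker for the two conditions of Corollary 1, assuming an RPDF eta
# (n = 1, so condition 2 needs two distinct nonzero-integer taps).
def is_nonzero_integer(v, tol=1e-12):
    return abs(v) > tol and abs(v - round(v)) < tol

def condition1(c):
    # For each lag l != 0 there must exist an i such that of
    # (c_i, c_{i+l}) one is zero and the other a nonzero integer.
    # Padding with l zeros per side exposes the implicit zeros; lags
    # beyond len(c) behave like l = len(c), so that range suffices.
    for l in range(1, len(c) + 1):
        padded = [0.0] * l + list(c) + [0.0] * l
        ok = any((abs(padded[i]) < 1e-12 and is_nonzero_integer(padded[i + l])) or
                 (is_nonzero_integer(padded[i]) and abs(padded[i + l]) < 1e-12)
                 for i in range(len(padded) - l))
        if not ok:
            return False
    return True

def condition2(c, n=1):
    ints = sum(1 for v in c if is_nonzero_integer(v))
    return ints >= 2 if n == 1 else ints >= 1

good_hp   = condition1([1, -1]) and condition2([1, -1])
good_perm = condition1([1, -0.5, 0.5, -1]) and condition2([1, -0.5, 0.5, -1])
bad_seq   = condition1([0.5, -1, 1, -0.5])   # fails at lag 1, per the text
```

Running it on the sequences quoted in the text reproduces the paper's verdicts: the high-pass and permuted sequences pass, while {1/2, -1, 1, -1/2} fails condition 1.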
The pair of unit-magnitude coefficients in each of these sequences ensures satisfaction of the second condition of Corollary 1. With an RPDF $\eta$ these coefficients ensure that the total dither contains a TPDF component. The corollary guarantees that the presence of other components in the total dither does not interfere with the well-known benefits of such a TPDF component (that is, the elimination of distortion and error-variance modulation; see Section 0.1). If $\eta$ were itself TPDF rather than RPDF, then only a single integer coefficient would be required in order to achieve this result, as indicated in the corollary. On the other hand, the first condition of the corollary introduces an additional restriction on the coefficient sequence that ensures not only that distortion and error-variance modulation are eliminated, but that modulation of the shape of the error’s power spectrum is also prevented.

Consider, for instance, the coefficient sequence

$$\{\ldots, 0, 0, 0, \tfrac{1}{2}, -1, 1, -\tfrac{1}{2}, 0, 0, 0, \ldots\}.$$

Like the sequences shown in the preceding, this one meets the second condition of Corollary 1. However, it fails to meet the first condition of Corollary 1 for $\ell = \pm 1$. Fig. 7 shows error spectra from a system using this sort of dither with different static system inputs. Also shown are normalizations of these spectra by the expected curve specified by Eq. (4). The results of the normalization are clearly not flat, indicating that the error spectrum is not of the sort predicted. Furthermore, variation of the spectrum with the system input value is apparent.⁷ Resorting to a TPDF $\eta$ not only increases the error variance, but it does not ameliorate the problem of input dependence because it does nothing to satisfy the first condition of the corollary. This is illustrated in Fig. 8, where a small but consistently reproducible deviation from the expected spectrum is observed at low frequencies. On the other hand, simply doubling the coefficient sequence (or, equivalently, doubling $\eta$) yields a suitable dither, although it also increases the error variance and somewhat alters the spectral shape as well [since the additive white-noise component in Eq. (4) is not doubled]. The result is illustrated in Fig. 9.

⁷ Interestingly, this coefficient sequence does satisfy the second condition of the corollary, and thus the total error variance assumes the predicted value given by Eq. (22) regardless of the input. It is only the error’s spectral shape and not its variance that is input dependent.
Fig. 7. PSD(f) for quantizing system without error feedback and using a dither filter with RPDF input and coefficients {0.5, −1.0, 1.0,
−0.5}. System was presented with static inputs. (a) Observed PSD with 0.0 LSB input. (b) Observed PSD normalized by predicted PSD
for 0.0 LSB input. (c) Observed PSD with 0.25 LSB input. (d) Observed PSD normalized by predicted PSD for 0.25 LSB input.
Fig. 8. PSD(f) for quantizing system without error feedback and using a dither filter with TPDF input and coefficients {0.5, −1.0, 1.0,
−0.5}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by expected PSD.
Fig. 9. PSD(f) for quantizing system without error feedback and using a dither filter with TPDF input and coefficients {1.0, −2.0, 2.0,
−1.0}. System was presented with static null input (0.0 LSB). (a) Observed PSD. (b) Observed PSD normalized by predicted PSD.
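Footnote 7's claim can be tested directly: with the failing coefficient sequence {0.5, -1.0, 1.0, -0.5} on RPDF eta, the total error variance should still equal Eq. (22)'s prediction for any static input, even though the error's spectral shape (Fig. 7) is input dependent. The sketch below (our own simulation; the inputs follow the figure captions, everything else is an arbitrary choice) checks the variance at the two inputs used for Fig. 7.

```python
import numpy as np

# Variance check for the dither filter {0.5, -1.0, 1.0, -0.5} on RPDF
# eta, which fails condition 1 but satisfies condition 2 of Corollary 1.
rng = np.random.default_rng(5)
Delta, N = 1.0, 400_000
c = np.array([0.5, -1.0, 1.0, -0.5])

def total_error_msq(x_static):
    eta = rng.uniform(-Delta/2, Delta/2, N + len(c))
    nu = np.convolve(eta, c, mode="valid")[:N]      # Eq. (12)
    y = Delta * np.round((x_static + nu) / Delta)   # mid-tread quantizer
    return np.mean((y - x_static) ** 2)

var_nu = (Delta**2 / 12) * np.sum(c**2)   # dither power: E[eta^2] * sum c_i^2
predicted = var_nu + Delta**2 / 12        # Eq. (22)
v0 = total_error_msq(0.00 * Delta)        # null input, as in Fig. 7(a)
v1 = total_error_msq(0.25 * Delta)        # 0.25 LSB input, as in Fig. 7(c)
```

Both measurements land on the same predicted value, so only a spectral (not a variance) measurement can expose this dither's defect.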
Lemma 3. In an NSD quantizing system in which the dither $\nu$ and the system input signal $x$ are not necessarily statistically independent, $E[\varepsilon^\ell]$ is independent of the distribution of the input $x$ for $\ell = 1, 2, \ldots, N$ if and only if the joint characteristic function of the dither and the input $P_{\nu,x}(u, v)$ obeys the condition that

$$P^{(i,0)}_{\nu,x}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0,\ i = 0, 1, 2, \ldots, N-1. \tag{36}$$

Subject to the conditions of Lemma 3, $E[\varepsilon^m]$ for $0 \le m \le N$ is given by the Sheppard’s corrections, as before.

The derivation of $P_{\nu,x}$ in terms of the $\eta_i$ proceeds precisely as for the case where $x$ is not involved, and we simply state the result:

$$P_{\nu,\eta,x}(\mathbf{u}, \mathbf{w}, \mathbf{v}) = P_{\eta,x}(\boldsymbol{\gamma}, \mathbf{v}) \tag{37}$$

where

$$\mathbf{x} = (\ldots, x_{-1}, x_0, x_1, \ldots) \qquad \text{and} \qquad \mathbf{v} = (\ldots, v_{-1}, v_0, v_1, \ldots)$$

$\mathbf{v}$ being the corresponding vector of Fourier-transformed variables. $\boldsymbol{\gamma}$ is a similar vector with components

$$\gamma_i = w_i + \sum_{j=-\infty}^{\infty} c_{j-i} u_j.$$

By setting all the unwanted variables in Eq. (37) to zero we obtain

$$P_{\nu_n,x_n}(u_n, v_n) = P_{\eta,x_n}(\boldsymbol{\upsilon}, v_n) \tag{38}$$

so that the conditions of Lemma 3 become

$$P_{\nu_n,x_n}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0 \tag{39}$$

and

$$P^{(1,0)}_{\nu_n,x_n}\!\left(\frac{k}{\Delta}, \frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0. \tag{40}$$

At first glance, interpretation of these conditions in terms of Eq. (38) appears to be frustrated by the fact that we know nothing about the quantity $P_{\eta,x_n}$. However, we can assume that 1) the dither filter is causal so that $c_i = 0$ for all $i < 0$, and that 2) $\eta_i$ is statistically independent of the random vector $(\ldots, x_{n-2}, x_{n-1}, x_n)$ for $i \ge n$, where we recall that the noise-shaping filter $H(z)$ must contain an implicit single-sample delay. Thus there exists exactly one value of $i$ such that $c_{n-i} \ne 0$ and for which $\eta_i$ is statistically independent of $x_n$. This is $i = n$, so that Eq. (38) can be written as the product

$$P_{\nu_n,x_n}(u_n, v_n) = P_\eta(c_0 u_n)\, P_{\eta,x_n}(\boldsymbol{\upsilon}', v_n)$$

where

$$\upsilon'_i = \begin{cases} \upsilon_i, & \text{if } i < n \\ 0, & \text{if } i \ge n. \end{cases}$$

We conclude that Eq. (39) holds if

$$P_\eta\!\left(c_0 \frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0 \tag{41}$$

and, similarly, that Eq. (40) holds if

$$P_\eta^{(1)}\!\left(c_0 \frac{k}{\Delta}\right) = 0 \quad \forall k \ne 0. \tag{42}$$

3.2 Second-Order Error Statistics (with Feedback)

The analysis of the two-dimensional statistics proceeds in the usual fashion. The following obvious generalization of Lemma 2 is derived in the Appendix.

Lemma 4. Consider two values $\varepsilon_n$ and $\varepsilon_{n+\ell}$ of the total error produced by an NSD quantizing system in which the dither and the input to the quantizing system are not necessarily statistically independent. Let these error samples be separated in time by $\tau = \ell T$, where $T$ is the sampling period of the system and $\ell \ne 0$. Denote by $P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}$ the joint cf of the dither and input values $\nu_n$, $\nu_{n+\ell}$, $x_n$, and $x_{n+\ell}$, corresponding to $\varepsilon_n$ and $\varepsilon_{n+\ell}$, respectively. If and only if the three conditions of Eqs. (43)–(45), the analogs of Eqs. (26)–(28) for this joint cf, are satisfied, then

$$E[\varepsilon_n \varepsilon_{n+\ell}] = E[\nu_n \nu_{n+\ell}].$$

From Eq. (37) we have

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}(u_1, u_2, v_1, v_2) = P_{\eta,(x_n,x_{n+\ell})}(\boldsymbol{\upsilon}, v_1, v_2) \tag{46}$$

where

$$\upsilon_i = c_{n-i}u_1 + c_{n+\ell-i}u_2.$$

We first consider the case where $\ell > 0$. Using the same brand of reasoning that we used in the one-dimensional case, we note that there exists exactly one value of $i$ for which $(c_{n-i}, c_{n+\ell-i}) \ne (0, 0)$ and for which $\eta_i$ is statistically independent of $(x_n, x_{n+\ell})$. This is $i = n + \ell$, so that Eq. (46) can be written

$$P_{(\nu_n,\nu_{n+\ell}),(x_n,x_{n+\ell})}(u_1, u_2, v_1, v_2) = P_\eta(c_0 u_2)\, P_{\eta,(x_n,x_{n+\ell})}(\boldsymbol{\upsilon}', v_1, v_2) \tag{47}$$
where only c₀ remains since the other coefficient is zero, and where

ξ′_i = ξ_i if i < n + ℓ, and ξ′_i = 0 if i ≥ n + ℓ.

According to Eq. (47), condition 1 [Eq. (43)] of Lemma 4 will be satisfied for ℓ > 0 and k₂ ≠ 0 if

P_ξ(c₀k/Δ) = 0  ∀k ≠ 0.

On the other hand, if k₂ = 0, then Eqs. (43) and (46) yield

P_{(ν_n,ν_{n+ℓ}),(x_n,x_{n+ℓ})}(k₁/Δ, 0, k₁/Δ, 0) = P_{ξ,x_n}(κ″, k₁/Δ)|_{u₁=k₁/Δ}  (48)

where

κ″_i = c_{n−i} u₁.

Then there exists exactly one i such that c_{n−i} ≠ 0 and for which ξ_i is independent of x_n. This is i = n. Thus the right-hand side of Eq. (48) splits into a product that goes to zero if

P_ξ(c₀k/Δ) = 0  ∀k ≠ 0.

Thus condition 1 is satisfied for all (k₁, k₂) ≠ (0, 0) subject to this requirement. By symmetry, the ℓ < 0 case produces identical conditions.

Conditions 2 and 3 [Eqs. (44) and (45)] are handled by application of the product rule, as before. We omit the details, but it can be shown that these conditions are satisfied if Eqs. (41) and (42) hold. All three conditions being satisfied, Eq. (4) gives the total error spectrum in terms of the dither spectrum.

We will now collect the conclusions from the foregoing analysis.

Theorem 2 In an NSD quantizing system with arbitrary noise-shaping error feedback and using filtered dither of the form described by Eq. (12), the shaped total error will be wide-sense stationary and independent of the system input with a PSD given by

PSD_e(f) = |1 − H(e^{j2πfT})|² [PSD_ν(f) + Δ²T/6]

if both of the following conditions are satisfied:

P_ξ(c₀k/Δ) = 0  ∀k ≠ 0

and

P_ξ^{(1)}(c₀k/Δ) = 0  ∀k ≠ 0.

3.3 Illustrated Special Case: ξ Is mRPDF (with Feedback)

If ξ is mRPDF we reach the following simple but quite restrictive conclusion.

Corollary 2 In an NSD quantizing system with arbitrary noise-shaping error feedback and using filtered dither with ξ being an i.i.d. mRPDF random process, the shaped total error will be wide-sense stationary and independent of the system input with a PSD given by Eq. (9) if c₀ is a nonzero integer and m ≥ 2.

The result exploits the inherent single-sample delay in the feedback loop (see Section 0.3), which guarantees that at least the most recent ξ value is independent of x because it has not been recirculated. Thus whatever the remaining components in the total dither signal may be, this ξ can single-handedly provide a suitable dither signal if it is at least TPDF (that is, m ≥ 2) and if it has an appropriate width (that is, if c₀ is a nonzero integer).

To appreciate just how restrictive this condition really is, it should be noted that simple TPDF high-pass dither generated by filtering an RPDF random process does not satisfy it. This is confirmed by Fig. 10, which shows the spectrum of ε from a noise shaper using this kind of dither and a one-tap feedback filter with coefficient −0.5. [As was pointed out in the Introduction, the shaped total error spectrum PSD_e(f) will have the expected form given by Eq. (9) if and only if the total error spectrum PSD_ε(f) has the form given by Eq. (4); that is, we only require that the dither fix PSD_ε(f), since PSD_e(f) is then determined via Eq. (8).] Also shown is the spectrum normalized by the predicted spectrum of Eq. (4). Two static inputs (x = 0.0 and 0.5 LSB, respectively) were used. The normalized spectra are not flat, indicating that the error spectra are not of the expected shape. Furthermore, the two spectra are different, clearly indicating that the error spectrum is input dependent.

The observed spectral modulations can be eliminated by using a TPDF ξ rather than an RPDF ξ, in which case the conditions of Corollary 2 are satisfied since m = 2, but the error variance is increased. This is illustrated in Fig. 11. However, it is not clear that this use of spectrally shaped dither offers an advantage over using simple i.i.d. TPDF dither, since any desired error spectrum can be obtained using noise-shaping error feedback. Fig. 12 shows power spectra of ε in the case where the noise-shaping system is the same as that of Fig. 10 but where i.i.d. TPDF dither is used rather than filtered RPDF (high-pass) dither. The normalized spectra are flat, indicating that the error spectra are of the expected shape and are input independent.

When dithers that do not satisfy the conditions of Corollary 2 are used in conjunction with noise shaping, modulation of the error spectrum typically decreases in magnitude with increasing complexity of the noise-shaping filter. For instance, the plots in Fig. 13 correspond to those in Fig. 10, with the sole difference being the use of a three-coefficient noise-shaping filter with psychoacoustically optimized coefficients (see [6]). Although some variation of the spectrum with input is probably still present, it is apparently negligible. To further characterize this variation would require a general statistical model of signals in the noise shaper, and the development of such a model remains an open problem. In any event, we do not recommend the use of dithers that violate the conditions of Corollary 2 in conjunction with noise shaping. For most applications, simple i.i.d. TPDF dither is the best choice, with spectral shaping of the error effected by means of noise-shaping feedback rather than by spectrally shaping the dither.

LIPSHITZ ET AL. PAPERS

3.4 Results for Special Classes of Shapers

We have so far been unable to find necessary and sufficient conditions that will guarantee input independence of the error spectrum for an arbitrary noise shaper (although a set of weaker but more complicated conditions for mRPDF ξ is given in [11, theorem 6]). However, some interesting results are known for certain special classes of shapers. The most obvious is that if the feedback filter H(z) is FIR and its first ℓ coefficients are all zero, the shaped total error spectrum is wide-sense stationary and given by Eq. (9) if c_i = 0 for i > ℓ and the conditions of Theorem 1 are satisfied. This ensures that x_i contains no vestiges of any ξ_j that will also be present in the current dither sample ν_i, so that x_i and ν_i will be independent.

An interesting result has been obtained for a special class of noise-shaper designs by Gerzon et al. [12]. These shapers employ feedback filters H(z) whose filter coefficients are all integers. Gerzon et al. have shown that any such system produces precisely the same output as the system of Fig. 14, which employs no feedback. [The effective dither filter, 1 − H(z), must be minimum phase for Fig. 14 to be realizable; that is, it must be invertible.] This means that for such noise shapers, the broad class of shaped dithers defined by the conditions of Section 2 must produce the expected input-independent error spectra. This is confirmed by Fig. 15, which shows total error spectra PSD_ε, unnormalized and normalized, for such a system using the simple high-pass dither that failed when a feedback filter with noninteger coefficients was used.

4 CONCLUSIONS

Systems without noise-shaping feedback respond quite differently to the use of particular spectrally shaped dither signals from those with error-feedback paths. The total error of the system may be wide-sense stationary in one case and not in the other. For instance, simple high-pass dither renders the total error wide-sense stationary if no feedback is present, but fails to do so for systems with arbitrary noise shapers.

Dithered systems using spectrally colored dither signals should be designed according to the criteria of Corollary 1 when no noise-shaping error feedback is to be used. However, in most applications the greater flexibility of noise-shaping error feedback will supersede the use of spectrally
Fig. 10. PSD_ε(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. A single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.
shaped dither. When such noise shaping is employed, the dither signal should meet the restrictive conditions of Corollary 2. Many spectrally shaped dithers will introduce unexpected error modulations due to recirculation of the dither by the feedback loop, so that it is no longer independent of the input signal. Thus we recommend simple i.i.d. TPDF dither for most noise-shaping applications, since any desired shape of error spectrum can be achieved by specifying a suitable feedback filter. Such precautions will guarantee that the total systemic error is wide-sense stationary, as is appropriate in audio systems, and is possessed of the expected spectral characteristics.

5 ACKNOWLEDGMENT

Stanley P. Lipshitz and John Vanderkooy have been supported by operating grants from the Natural Sciences and Engineering Research Council of Canada.

6 REFERENCES

[1] S. P. Lipshitz and J. Vanderkooy, "High-Pass Dither," presented at the 4th Regional Convention of the Audio Engineering Society, Tokyo (1989 June); in Collected Preprints (AES Japan Section, Tokyo, 1989), pp. 72–75.

[2] R. A. Wannamaker, S. P. Lipshitz, J. Vanderkooy, and J. N. Wright, "A Theory of Non-Subtractive Dither," IEEE Trans. Signal Process., vol. 48, pp. 499–516 (2000 Feb.).

[3] S. P. Lipshitz, R. A. Wannamaker, and J. Vanderkooy, "Quantization and Dither: A Theoretical Survey," J. Audio Eng. Soc., vol. 40, pp. 355–375 (1992 May).

[4] S. P. Lipshitz and J. Vanderkooy, "Digital Dither," presented at the 81st Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 34, p. 1030 (1986 Dec.), preprint 2412.

[5] J. Vanderkooy and S. P. Lipshitz, "Digital Dither: Signal Processing with Resolution Far below the Least Significant Bit," in Proc. AES 7th Int. Conf. on Audio in Digital Times (Toronto, ON, Canada, 1989 May), pp. 87–96.

[6] R. A. Wannamaker, "Psychoacoustically Optimal Noise Shaping," J. Audio Eng. Soc., vol. 40, pp. 611–620 (1992 July/Aug.).

[7] S. P. Lipshitz, J. Vanderkooy, and R. A. Wannamaker,
Fig. 11. PSD_ε(f) for quantizing system with error feedback and using a dither filter with TPDF input and coefficients {1.0, −1.0}. A single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.
Fig. 12. PSD_ε(f) for quantizing system with error feedback and using simple i.i.d. TPDF dither. A single-tap noise-shaping filter with coefficient −0.5 was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.
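The contrast between Figs. 10 and 12 (input-dependent error with filtered RPDF dither versus a stationary, input-independent error with i.i.d. TPDF dither) is easy to reproduce in simulation. The following is an illustrative pure-Python sketch, not the authors' code; it uses the one-tap feedback coefficient −0.5 and Δ = 1 LSB from the text and checks the first two moments of the total error ε for two static inputs:

```python
import math
import random

DELTA = 1.0  # quantizer step: 1 LSB

def quantize(w):
    # midtread quantizer: Q(w) = DELTA * floor(w / DELTA + 1/2)
    return DELTA * math.floor(w / DELTA + 0.5)

def total_error_stats(x, dither, h=(-0.5,), n=200_000, seed=1):
    """Dithered noise shaper with static input x and feedback taps h.
    The total error eps = y - u (quantization error plus dither) is fed
    back through h; the shaped error y - x is then (1 - H(z)) applied to
    eps. Returns the sample mean and variance of eps."""
    rng = random.Random(seed)
    hist = [0.0] * len(h)              # recent total-error samples
    s1 = s2 = 0.0
    for _ in range(n):
        u = x - sum(hi * ei for hi, ei in zip(h, hist))
        w = u + dither(rng)            # add dither before quantizing
        eps = quantize(w) - u          # total error sample
        hist = [eps] + hist[:-1]
        s1 += eps
        s2 += eps * eps
    mean = s1 / n
    return mean, s2 / n - mean * mean

def tpdf(rng):  # i.i.d. TPDF dither: sum of two independent RPDF samples
    return rng.uniform(-DELTA / 2, DELTA / 2) + rng.uniform(-DELTA / 2, DELTA / 2)

# With i.i.d. TPDF dither the total error has mean ~0 and variance
# ~DELTA^2/12 + DELTA^2/6 = 0.25 for both static inputs.
for x in (0.0, 0.5):
    mean, var = total_error_stats(x, tpdf)
    print(f"x = {x} LSB: mean = {mean:+.3f}, variance = {var:.3f}")
```

The measured variance Δ²/12 + Δ²/6 = Δ²/4, identical for both inputs, is exactly the input independence that Corollary 2 asserts for TPDF dither with c₀ = 1.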
where ℝ^N is the space of all N-vectors with real components and ℤ^N is the space of all N-vectors with integer components. x₁ and w₁, for example, represent signals present in the system at the same time instant, whereas x₁ and x₂ represent distinct but not necessarily successive samples. We note that if N = 1 then the results that follow reduce directly to the "one-dimensional" results of [2]. However, N may have any value between 1 and infinity.

Using the definition of conditional probability [9], we may express the joint pdf of the signals under consideration as

p_{ε,y,ν,x}(ε, y, ν, x) = p_{ε|y,ν,x}(ε, y, ν, x) p_{y|ν,x}(ε, y, ν, x) p_{ν,x}(ε, y, ν, x)  (49)

where it should be kept in mind that the arguments and subscripts in general represent vectors. We will compute the factors on the right-hand side of this equation. p_{ν,x} will be left arbitrary.
Fig. 13. PSD_ε(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. A three-tap FIR noise-shaping filter with coefficients {1.33, −0.73, 0.65} was used. (a) Observed PSD for 0.0 LSB input. (b) Observed PSD normalized by expected PSD for 0.0 LSB input. (c) Observed PSD for 0.5 LSB input. (d) Observed PSD normalized by expected PSD for 0.5 LSB input.
Fig. 14. System equivalent to that of Fig. 3 for case where all coefficients of error-feedback filter H(z) are integers.
Since the total error is defined as ε ≜ y − x, we have

p_{ε|y,ν,x}(ε, y, ν, x) = δ(ε − y + x).

Noting that y is produced from w = x + ν by a midtread quantizer with transfer characteristic

Q(w) = Δ⌊w/Δ + 1/2⌋  (51)

we observe that if

(2n − 1)Δ/2 ≤ x + ν < (2n + 1)Δ/2

then the quantizer output is nΔ. Thus p_{y|ν,x} can be expressed as the following product of a window function with an impulse train,

p_{y|ν,x}(y, ν, x) = ΔΠ_Δ[y − (x + ν)] W_Δ(y)  (52)

where

W_Δ(y) ≜ Σ_{k=−∞}^{∞} δ(y − kΔ)

and

Π_Δ(y) = ∏_{i=1}^{N} Π_Δ(y_i).

In the transform domain the three factors of Eq. (49) become

P_{ε|y,ν,x}(u_ε, u_y, u_ν, u_x) = δ(u_y + u_ε) δ(u_x − u_ε) δ(u_ν)

P_{y|ν,x}(u_ε, u_y, u_ν, u_x) = δ(u_ε) Σ_{k∈ℤ^N} sinc(u_ν) δ(u_ν + u_y − k/Δ) δ(u_x + u_y − k/Δ)

P_{ν,x}(u_ε, u_y, u_ν, u_x) = δ(u_ε, u_y) P_{ν,x}(u_ν, u_x)

where u_x = (u_{x1}, u_{x2}, u_{x3}, . . . , u_{xN}) ∈ ℝ^N is a vector of transform-domain variables associated with x, where u_ε, u_y, and u_ν are similarly defined, where k ∈ ℤ^N, and where sinc(u) ≜ ∏_{i=1}^{N} sinc(u_i). After transforming, the multiplications of Eq. (49) become convolutions, so in order to compute the joint cf P_{ε,y,ν,x} we convolve the three preceding expressions with one another (separate convolutions over each transform variable being required). After simplification the result is

P_{ε,y,ν,x}(u_ε, u_y, u_ν, u_x) = Σ_{k∈ℤ^N} sinc(u_ε + u_y − k/Δ) P_{ν,x}(u_ε + u_y + u_ν − k/Δ, u_y + u_x − k/Δ)

and setting u_y = u_ν = u_x = 0 gives the cf of the total error,

P_ε(u_ε) = Σ_{k∈ℤ^N} sinc(u_ε − k/Δ) P_{ν,x}(u_ε − k/Δ, −k/Δ).  (53)

Fig. 15. PSD_ε(f) for quantizing system with error feedback and using a dither filter with RPDF input and coefficients {1.0, −1.0}. System was presented with null static input (0.0 LSB), and a single-tap noise-shaping filter with coefficient 1.0 was used. (a) Observed PSD. (b) Observed PSD normalized by expected PSD.
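As a quick illustrative check (our sketch, not part of the paper), Eq. (51) and the interval condition stated below it describe the same mapping:

```python
import math
import random

DELTA = 0.25  # arbitrary step size for this check

def Q(w):
    # midtread quantizer of Eq. (51): Q(w) = DELTA * floor(w/DELTA + 1/2)
    return DELTA * math.floor(w / DELTA + 0.5)

rng = random.Random(0)
for _ in range(100_000):
    w = rng.uniform(-10.0, 10.0)
    n = math.floor(w / DELTA + 0.5)  # the integer singled out by the interval rule
    assert (2 * n - 1) * DELTA / 2 <= w < (2 * n + 1) * DELTA / 2
    assert Q(w) == n * DELTA
print("Q(w) = n*DELTA whenever (2n-1)*DELTA/2 <= w < (2n+1)*DELTA/2")
```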
Moments of ε are determined by the derivatives of its cf at the origin [9]. Consider, for instance, N = 1. Then Eq. (53) becomes

P_ε(u) = Σ_{k=−∞}^{∞} sinc(u − k/Δ) P_{ν,x}(u − k/Δ, −k/Δ)

and

E[ε^m] ≜ (j/2π)^m P_ε^{(m)}(0)
= (j/2π)^m Σ_{k=−∞}^{∞} Σ_{r=0}^{m} (m choose r) sinc^{(r)}(−k/Δ) P_{ν,x}^{(m−r,0)}(−k/Δ, −k/Δ).  (54)

E[ε^m] will therefore be independent of the distribution of the system input if

P_{ν,x}^{(i,0)}(k/Δ, k/Δ) = 0  ∀k ≠ 0, i = 0, 1, 2, . . . , m − 1.  (55)

This is the forward direction of the assertion in Lemma 3. (The converse is proven in [13] using induction.) In this case Eq. (54) reduces to

E[ε^m] = (j/2π)^m Σ_{r=0}^{m} (m choose r) sinc^{(r)}(0) P_{ν,x}^{(m−r,0)}(0, 0)
= Σ_{r=0}^{⌊m/2⌋} (m choose 2r) (Δ/2)^{2r} E[ν^{m−2r}] / (2r + 1).  (56)

We now examine the joint statistics between total error samples separated in time in NSD systems. From Eq. (53) we have

P_ε(u) = Σ_{k∈ℤ²} sinc(u − k/Δ) P_{ν,x}(u − k/Δ, −k/Δ)

but now we will let ε = (ε₁, ε₂), u = (u₁, u₂), and k = (k₁, k₂). Then

P_ε(u) = P_{ε₁,ε₂}(u₁, u₂)
= Σ_{k∈ℤ²} sinc(u₁ − k₁/Δ) sinc(u₂ − k₂/Δ) × P_{ν₁,ν₂,x₁,x₂}(u₁ − k₁/Δ, u₂ − k₂/Δ, −k₁/Δ, −k₂/Δ).  (58)

The correlation between two total error samples separated in time by a nonzero lag is

E[ε₁ε₂] ≜ (j/2π)² P_{ε₁,ε₂}^{(1,1)}(0, 0)
= (j/2π)² Σ_{k∈ℤ²} { sinc^{(1)}(−k₁/Δ) sinc^{(1)}(−k₂/Δ) × P_{ν₁,ν₂,x₁,x₂}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
+ sinc(−k₁/Δ) sinc^{(1)}(−k₂/Δ) × P_{ν₁,ν₂,x₁,x₂}^{(1,0,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
+ sinc^{(1)}(−k₁/Δ) sinc(−k₂/Δ) × P_{ν₁,ν₂,x₁,x₂}^{(0,1,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ)
+ sinc(−k₁/Δ) sinc(−k₂/Δ) × P_{ν₁,ν₂,x₁,x₂}^{(1,1,0,0)}(−k₁/Δ, −k₂/Δ, −k₁/Δ, −k₂/Δ) }.  (59)

Careful inspection of Eq. (59), keeping in mind that the derivatives of the sinc function vanish at the origin, shows that it will be independent of the system input distribution if and only if the following three conditions are satisfied:

P_{ν₁,ν₂,x₁,x₂}(k₁/Δ, k₂/Δ, k₁/Δ, k₂/Δ) = 0  ∀(k₁, k₂) ≠ (0, 0)  (60)

P_{ν₁,ν₂,x₁,x₂}^{(0,1,0,0)}(k₁/Δ, 0, k₁/Δ, 0) = 0  ∀k₁ ≠ 0  (61)

P_{ν₁,ν₂,x₁,x₂}^{(1,0,0,0)}(0, k₂/Δ, 0, k₂/Δ) = 0  ∀k₂ ≠ 0.  (62)

In that case Eq. (59) reduces to

E[ε₁ε₂] ≜ (j/2π)² P_{ε₁,ε₂}^{(1,1)}(0, 0) = E[ν₁ν₂].  (63)

We have now derived all of the theorems used in the body of the paper, including, although this may not be obvious, those in Section 2. The latter follow from assuming that the random vectors ν and x are statistically independent (since no feedback is present). That is, we let

P_{ν,x}(u_ν, u_x) = P_ν(u_ν) P_x(u_x).

By then insisting that P_x be arbitrary, the results in this appendix immediately reduce to those of [2], [3].

The biographies of Stanley P. Lipshitz and John Vanderkooy were published in the 2004 March issue of the Journal. The biography of Robert A. Wannamaker was published in the 2004 June issue of the Journal.
A new method is presented for capturing, recording, and reproducing spatial sound that
provides a vivid sense of realism. The method generalizes binaural recording, preserving the
information needed for dynamic head-motion cues. These dynamic cues greatly reduce the
need for customization to the listener. During either capture or recording, the sound field in
the vicinity of the head is sampled with a microphone array. During reproduction, a head
tracker is used to determine the microphones that are closest to the positions of the listener’s
ears. Interpolation procedures are used to produce the headphone signals. The properties of
different methods for interpolating the microphone signals are presented and analyzed.
² It is possible to reproduce MTB signals directly over loudspeakers without head tracking and achieve many of the spatial effects of headphone listening, much as Johnston and Lam did using another approach [1], but "crosstalk" introduces audible artifacts. When the listeners are in separated sound environments, it is possible to use crosstalk cancellation techniques and replace the headphones by loudspeakers [2], [3], but the cancellation algorithm must be responsive to head motion. In either case, the listener is confined to a relatively small "sweet spot."

Fig. 1. Basic components of motion-tracked binaural system. A head tracker is used to find the microphones closest to the listener's ear and to interpolate between their outputs, thereby dynamically capturing the sound at the point where the ear would be located.
PAPERS MOTION-TRACKED BINAURAL SOUND
rotates his or her head, the sound image translates when the listener translates. The perceived source locations are completely stabilized only when the radius of the microphone array is the same as the radius of the listener's head. Additional bandwidth is needed for the microphone signals. If too few microphones are used, objectionable interpolation artifacts may be audible. Because the microphones are not equipped with artificial outer ears or pinnae, the signals lack the listener-dependent spectral cues for elevation.

In this paper we present a detailed analysis of the MTB system and show how many of these limitations can be overcome. We begin with a brief review of the physical and psychophysical principles of spatial hearing, placing particular emphasis on the theoretical properties of the spherical-head model. We then describe the errors that are introduced when the sound field is sampled spatially. We present several alternative approaches, of increasing effectiveness, for reducing these errors, and identify the classes of applications for which each is most appropriate.

1 BACKGROUND

1.1 Spatial Hearing

Research on the physical and psychophysical basis for sound localization has a long history [4]–[7]. The many auditory cues used by people include:
1) The interaural time difference (ITD)
2) The interaural level difference (ILD)
3) Monaural spectral cues introduced by the pinnae
4) Torso reflection and diffraction cues
5) The ratio of direct to reverberant energy
6) Cue changes induced by voluntary head motion
7) Familiarity with the sound of the source

All of the acoustic cues vary with azimuth, elevation, range, and frequency. The two interaural difference cues are particularly important, because they are largely independent of the source spectrum. Lord Rayleigh's pioneering and well-known duplex theory asserts that the ITD is exploited at low frequencies and the ILD is exploited at high frequencies, the crossover frequency being around 1.5 kHz [8]. Indeed, the ITD and the ILD are the primary cues for estimating the so-called lateral angle θ, the angle between a ray from the center of the head to the sound source and the vertical median plane. Above 3 kHz the monaural spectral changes introduced by the pinnae provide the primary cues for estimating elevation [9], whereas below 3 kHz the torso provides weak but useful elevation cues [10]. For estimating range, the primary cues appear to be familiarity with the source [11], the ILD for close sources [12], and the direct-to-reverberant energy ratio for distant sources [13].

The fact that people also use head motion to help localize sounds has long been recognized [14]. In a series of classic experiments, Wallach demonstrated that motion cues can override pinna cues in resolving front/back confusion [15], [16]. Although pinna cues are also important [17], and although head motion is not effective for localizing very brief sounds [5], subsequent research has largely confirmed the importance of these dynamic cues for resolving front/back ambiguities and improving localization accuracy. (See [18] for recent results and additional references.)

1.2 Spatial Sound Technology

There are fundamentally only two different engineering approaches to reproducing spatial sound: wavefield synthesis and binaural reproduction [19], [20]. When used for faithful reproduction, stereo and the various forms of surround sound (such as quadraphonics, 5.1-channel surround, Ambisonics) can all be viewed as attempts to reconstruct the sound field that was sampled by the recording microphones. Although this technology is commercially dominant, the theoretical requirements for exact wavefield synthesis throughout the audible frequency range are severe [21]. For example, the required area sampling density for a hexagonal array of microphones spaced a half-wavelength apart is 8/(√3λ²); for 20-kHz bandwidth, this calls for about 15 700 microphones per square meter. Ambisonic recording provides a theoretically well founded local approximation to exact reconstruction that is vastly more efficient, but the listener is confined to a relatively small "sweet spot," and multiple loudspeakers are still needed for spatially faithful reproduction [22], [23].

Binaural sound reproduction has the great advantage of being able to produce fully three-dimensional sound with only two signals: the pressure waveforms at the listener's eardrums. Reproduced over properly compensated headphones, binaural reproduction can sound impressively realistic. However, there are several reasons why the binaural approach has not been accepted widely.
1) The listener must either wear headphones or be confined to a small "sweet spot."
2) Differences between the size and shape of the pinnae of the dummy head and the pinnae of the listener can cause the apparent source elevation to be either poorly defined or seriously in error.
3) The perceived auditory field turns if the listener turns his or her head. This is unacceptable if the sound must be spatially registered with imagery.
4) Sound sources that are in front are often perceived as being in back or in the head.

The first of these problems is intrinsic to binaural reproduction. The second problem can be ameliorated, though it remains a challenge. However, the other two problems can be solved completely if head motion is taken into account.

In 1941 de Boer and van Urk showed that front/back confusion could be eliminated with a spherical dummy head by rotating the head back and forth, provided that the listener turned his or her head back and forth in synchrony with it [24]. More recent work using a head tracker on the listener and a servomechanism to turn the dummy head in accordance with the listener's head motion produced a stable acoustic field and eliminated front/back confusion [25]. Clearly, this solution cannot be used to record sound, and even for remote listening it requires a separate dummy head for each listener. The MTB method can be viewed as a generalization of the servomechanism approach that 1) eliminates the need for physically turning the dummy
ALGAZI ET AL. PAPERS
head, 2) allows recording as well as remote listening, and 3) allows multiple listeners to listen simultaneously.

Binaural reproduction also plays a central role in the creation of virtual auditory space. Here the left-ear and right-ear signals for any sound source are computed by convolving the source signal with the head-related impulse responses for the respective ears [26]–[28]. The Fourier transforms of these impulse responses are the head-related transfer functions (HRTFs), which capture all of the acoustic sound localization cues. Because the HRTFs depend on the location of the source relative to the head, the HRTFs change if the source moves or if the listener turns his or her head. The use of a head tracker to modify HRTFs in real time was reported as early as 1988 [29], and is now common practice [26]. In particular, it is the basis for the stabilization of stereo and surround-sound recordings for headphone listening [30]. The techniques used to generate virtual auditory space can be readily modified to simulate MTB sound capture, thereby allowing multiple listeners to experience the same computer-generated spatial sounds.

A variety of different dummy heads have been developed for binaural recording [31]. Differences in their pinna shapes produce corresponding differences in their HRTFs, particularly at high frequencies. In the absence of head motion, these differences impact the perception of elevation, front/back discrimination, and externalization [32]. In our experience, however, the dynamic cues that arise from head motion are sufficiently strong that they often dominate pinna cues. The subtle differences between these HRTFs become much less significant when head motion is accounted for. In fact, with proper compensation, remarkably good results can be obtained from a spherically shaped head. This leads us to examine the theoretical behavior of an ideal spherical-head model.

2 SPHERICAL-HEAD MODEL

Consider an ideal rigid sphere of radius a that is scattering incident plane waves of angular frequency ω. In particular, suppose that the free-field sound pressure at the origin (the pressure due to the source when the sphere is removed) is given by the real part of P_ff exp(jωt), where P_ff is the phasor free-field pressure. Let α be the observation angle, the angle between a ray to the sound source and a ray to any observation point on the surface of the sphere (see Fig. 2), and let the total pressure at the observation point be the real part of P_op exp(jωt), where P_op is the phasor pressure at the observation point. Then it can be shown that the HRTF H is given by

H(ω, α) = P_op / P_ff = (1/μ²) Σ_{m=0}^{∞} j^{m−1} (2m + 1) P_m(cos α) / h′_m(μ)  (1)

where μ = ωa/c is the normalized frequency, P_m is the Legendre polynomial of degree m, and h′_m is the derivative of the mth-order spherical Hankel function.

This solution for an idealized model provides a useful and widely used first-order approximation to a human HRTF. For best results, the radius a should be adapted to each listener [34]. However, the major features of the HRTF behavior can be illustrated using the traditional value of 87.5 mm for the head radius. We used this value and the algorithm given in [33] to compute H numerically. The HRTFs for other radii can be found by scaling frequency inversely with head size.

Fig. 3 shows the resulting magnitude responses for 19 different observation angles. Because the sphere does not appreciably disturb the incident field at low frequencies (frequencies where the wavelength is greater than the circumference of the sphere), all of the curves approach 0 dB at low frequencies. In general, the high frequencies are boosted on the ipsilateral side of the sphere (α < 90°) and are cut on the contralateral side (α > 90°). This contralateral high-frequency attenuation is commonly referred to as "head shadow." However, the strongest head shadow does not occur at α = 180°. As the observation point approaches 180°, it enters the so-called "bright spot," where waves traveling over the surface of the sphere come together in phase and the response becomes essentially flat. Being a consequence of the extreme symmetry of the sphere, the bright spot is not as pronounced in human HRTF data, although it can be seen there as well [35].

Fig. 2. Infinitely distant sound source producing plane waves that are scattered by a rigid sphere. Pressure at observation point varies with frequency ω, observation angle α, and radius a of sphere.
These results can also be used to compute the ILD as a function of the lateral angle θ. Although people's ears are usually somewhat below and behind the center of the head, for simplicity we assume that the ears are on opposite sides of a diameter of the sphere. With that assumption, θ = π/2 − α (see Fig. 4). It follows that the ILD (in dB) is given by

ILD(ω, θ) = 20 log₁₀ | H(ω, π/2 − θ) / H(ω, π/2 + θ) |.  (2)

The variations of the ILD with the lateral angle are shown in Fig. 5. Note that substantial interaural level differences can be developed for frequencies above 1.5 kHz, where the ILD contributes strongly to sound localization. The reduction in the magnitude of the ILD as |θ| approaches 90° is due to the bright spot. ILDs measured for human heads have this same general behavior, but because the bright spot is not as strong, they tend to vary more monotonically with the angle.

The equally important ITD can be obtained from the phase response.³ The variations of the ITD with the lateral angle are shown in Fig. 6. For human hearing the most important frequency range is below 1.5 kHz [37]. Below about 600 Hz it can be shown that the first two terms in Eq. (1) provide the following low-frequency approximation:

ITD_low frequency ≈ 3 (a/c) sin θ.  (3)

A simple ray-tracing argument provides an approximation for the high-frequency ITD, which is known as Woodworth's formula [38],

ITD_high frequency ≈ (a/c)(θ + sin θ).  (4)

As Fig. 6 illustrates, these approximations agree closely with the computed results. Although the perceptual significance of the difference between low- and high-frequency behavior has been questioned [7], we note in passing that the difference between low-frequency and high-frequency ITDs is greatest when θ = 60°, and the percentage difference is greatest when θ = 0°. To be more specific, the percentage difference is 35.8% when θ = 60° and 50% when θ = 0° [39].

It is natural to ask how well the HRTF for the spherical-head model matches human HRTFs. The pinna has a strong effect on the HRTF at high frequencies, which complicates a direct comparison. However, a simple comparison can be made with the HRTF for the KEMAR mannequin, for which the pinnae can be removed. Figs. 7 and 8 show the angular dependence of the ILD and the ITD for a pinnaless KEMAR for a source in the anterior horizontal plane.

³ Although the group delay is sometimes used to define the ITD, its frequency dependence is usually determined from the phase delay [36], which is consistent with neurophysiological auditory models. For this reason we also use the phase delay to define the ITD. Below 1.5 kHz, where hearing is phase sensitive, the differences between the two measures of the ITD for the sphere are relatively small [4, p. 74].

Fig. 4. Top view of listener's head for source in the horizontal plane. Because of symmetry of sphere, same diagram applies to any plane through the interaural axis. ILD and ITD are constant on a surface of constant lateral angle, called the cone of confusion.
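The two ITD approximations, and the 35.8% and 50% figures quoted above, can be checked directly. In this sketch the speed of sound c = 343 m/s is our assumption; the text specifies only the 87.5-mm radius:

```python
import math

A = 0.0875  # head radius in meters (the 87.5-mm value used in the text)
C = 343.0   # speed of sound in m/s (our assumed value)

def itd_low(theta):   # Eq. (3): low-frequency approximation
    return 3.0 * (A / C) * math.sin(theta)

def itd_high(theta):  # Eq. (4): Woodworth's high-frequency formula
    return (A / C) * (theta + math.sin(theta))

def pct_difference(theta):
    # percentage by which the low-frequency ITD exceeds the high-frequency ITD
    return 100.0 * (itd_low(theta) - itd_high(theta)) / itd_high(theta)

print(f"{pct_difference(math.radians(60.0)):.1f}%")  # 35.8%
print(f"{pct_difference(1e-9):.1f}%")                # 50.0% (theta -> 0 limit)
```

Note that the radius a cancels in the percentage, so these figures hold for any head size.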
There are noticeable differences between the low-frequency ILDs for the KEMAR and the sphere (compare Figs. 5 and 7). Most of these differences can be attributed to torso reflections [35]. In addition, the bright spot, which reduces the ILD when |θ| is close to 90°, is significantly stronger for the sphere than for the KEMAR. Despite these differences, one sees the same general behavior, namely, a strong increase in ILD with frequency and with lateral angle. In addition, the equally important ITD response is quite close to the results for the sphere, both at low and at high frequencies (compare Figs. 6 and 8). Clearly, the general behavior of the HRTF for the sphere retains the basic features of the HRTF for a pinnaless KEMAR.

In the remainder of this paper we find it more revealing to display frequency-response data using polar coordinates and an image representation. For purposes of illustration, the magnitude response data shown in Fig. 3 are presented as an image in Fig. 9. Here the brightness at any point in the image represents the dB magnitude, the radius specifies the frequency on a logarithmic scale, and the polar angle directly specifies the observation angle. This makes it easy to visualize how the sound spectrum changes and, by direct implication, how the ILD changes as the head is rotated. In Fig. 9 the incident sound is propagating down the positive y axis. Thus the ipsilateral responses appear in the top half of the image, and the contralateral responses appear in the bottom half. The strong response on the ipsilateral side and both the head shadow and the bright spot on the contralateral side are clearly seen in this representation.

Figs. 5, 6, and 9 show how the critical cues provided by a spherical-head model vary with rotation, revealing a significant and continuous variation of the ILD, the ITD, and the monaural spectrum with the lateral angle. Although the corresponding behavior of a human head is more complex, the dynamic effects that are produced by this simple approximation can produce a compelling perceptual experience. The MTB recording method approximates this continuous behavior through sampling and interpolation, thus introducing errors. We now use the spherical-head model to investigate the spectral errors that different interpolation procedures introduce.

3 MTB SAMPLING AND INTERPOLATION

3.1 The Interpolation Problem

The core problem for MTB sound capture is to recover the sound pressure at the location of a listener’s ear from the signals picked up by a small number of microphones. In this section we analyze and evaluate the behavior of
three different interpolation procedures. In each case we
assume that the N microphones are equally spaced around
the equator of a rigid sphere. Fig. 10 illustrates this case
for N ⳱ 8. We assume that the signals from a head tracker
can be used to determine the locations of the listener’s left
and right ears relative to the center of the sphere. The
problem is to find a good approximation to the signals at
the corresponding ear locations on the sphere.
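As a sketch of what this recovery involves, the full-bandwidth procedure analyzed below selects the two microphones that bracket the ear location and combines them linearly, x̂(t) = (1 − w) x_n(t) + w x_nn(t). The function and variable names here are ours, not the paper’s:

```python
import numpy as np

def interp_weight(ear_angle, N):
    """Index n of the microphone at or just below the ear angle and the
    fractional weight w in [0, 1), for N microphones equally spaced
    around the equator (angular spacing 2*pi/N)."""
    delta = 2.0 * np.pi / N
    pos = (ear_angle % (2.0 * np.pi)) / delta
    n = int(pos) % N          # nearest microphone "behind" the ear
    w = pos - int(pos)        # fraction of the spacing toward the next one
    return n, w

def interpolate(x, ear_angle):
    """Linear interpolation x_hat(t) = (1 - w) x_n(t) + w x_nn(t),
    where x is an (N, num_samples) array, one row per microphone."""
    N = len(x)
    n, w = interp_weight(ear_angle, N)
    nn = (n + 1) % N          # next microphone around the circle
    return (1.0 - w) * x[n] + w * x[nn]
```

For an ear exactly at a microphone, w = 0; halfway between microphones, w = 0.5, which is the worst case for the phase interference discussed below.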
PAPERS MOTION-TRACKED BINAURAL SOUND
divides the circle into N sectors within which the system does not respond to changes in head motion. Thus for N = 8, listeners will hear the location of the source turn with their heads as they turn through a 45° angle, and then suddenly jump back to the initial position each time a switching boundary is crossed. In addition to the positional jump, a discontinuity or “click” will be heard each time a boundary is crossed.⁴

Discontinuities occur in the ITD, the ILD, and the monaural spectrum, which are piecewise constant functions of the lateral angle. These discontinuities are clearly visible in the spectra shown in Fig. 11. Here the source is assumed to be aimed directly at the first microphone and infinitely distant, that is, propagating plane waves. Fig. 11(a) shows the magnitude spectrum for an eight-microphone MTB array. If there were no error, this image would be identical to the image in Fig. 9(b). As Fig. 11(c) illustrates, a much better approximation can be obtained by increasing the number of microphones to 32. In practice, many microphones are required to make the artifacts introduced by this method inaudible.

x̂(t) = (1 − w) x_n(t) + w x_nn(t)   (5)

where the interpolation coefficient w is given by

w = ψ/Δ_N   (6)

where ψ is the angle from the nearest microphone to the ear and Δ_N = 2π/N is the angular spacing between adjacent microphones. However, when microphone signals are linearly combined, phase interference can produce comb filtering and significant linear distortion. The degree to which x̂(t) approximates x(t) can be determined by comparing H(ω, α),

⁴ These switching artifacts can be reduced by cross fading rather than switching between microphones, and by including hysteresis to prevent “chattering” when the listener’s head is on a switching boundary. However, unless the number of microphones is quite large, listeners are still aware of the sudden changes that occur when a switching boundary is crossed.

Fig. 11. (a) Magnitude response for interpolation by nearest microphone selection for eight-microphone MTB array. If there were no error, response would be the same as in Fig. 9(b). (b) Interpolation error. Largest errors occur when ear is on contralateral side. However, discontinuities at visible sector boundaries are also audible. (c), (d) Corresponding results for 32-microphone array.
the desired HRTF, to Ĥ_N(ω, α), the HRTF produced by Eq. (5),

Ĥ_N(ω, α) = (1 − w)H(ω, α − ψ) + wH(ω, α + Δ_N − ψ)
          = [1 − (α − α_n)/Δ_N] H(ω, α_n) + [(α − α_n)/Δ_N] H(ω, α_n + Δ_N)   (7)

where α is the angle between the sound source and the ear, and α_n = α − ψ is the angle between the sound source and the nearest microphone. We will use this expression to evaluate the interpolation error. However, we begin with a simple approximate analysis that provides some physical insight.

3.3.1 Approximate Analysis

When adjacent microphones are sufficiently close, the primary difference between x_n(t) and x_nn(t) is a time delay T. Thus x_nn(t) ≈ x_n(t − T), and the signal x(t) at the ear occurs at an intermediate delay, x(t) ≈ x_n(t − wT). By substituting x_n(t − T) for x_nn(t) in Eq. (5) and approximating x_n(t − T) by the first two terms in a Taylor’s series expansion, we obtain

x̂(t) ≈ (1 − w) x_n(t) + w x_n(t − T)
     ≈ (1 − w) x_n(t) + w [x_n(t) − ẋ_n(t) T]
     ≈ x_n(t) − ẋ_n(t)(wT)
     ≈ x_n(t − wT)

as desired. That is, the weighted combination of the signal and the time-delayed signal is approximately the signal that arrives at the ear, and thus linear interpolation produces the equivalent intermediate delay for any signal direction.

However, the Taylor’s series approximation breaks down when the delay T is so large that quadratic and higher order terms are required. For a sinusoidal signal, the approximation becomes poor when the delay is greater than about a quarter of a period. If the source contains no significant energy above some maximum frequency f_max, we can expect that the linear interpolation will be acceptable if T < 1/(4 f_max).

In addition, note that when x_nn(t) ≈ x_n(t − T) and when w = 0.5, it follows from Eq. (5) that X̂(ω) = Ĥ(ω) X_n(ω), where Ĥ(ω) = 0.5 [1 + exp(−jωT)]. This is the transfer function for a comb filter. This filter has its first notch at f = 1/(2T), and its response is 3 dB down at f = 1/(4T). It follows that spectral coloration will be strong unless T < 1/(4 f_max). Thus both the time-delay and the spectral-coloration considerations lead to the requirement that the delay be less than a quarter of the shortest period.

Large delays cause serious problems. The delay T is maximum when the sound wave is traveling around the sphere from one microphone to the next. Because the distance between microphones is 2πa/N, the maximum value for T is 2πa/(Nc). Thus for good performance in the worst-case situation there should be no significant spectral energy above

f_max = Nc/(8πa).   (8)

Equivalently, the distance between microphones should be no more than one quarter of the shortest wavelength.⁵ For N = 8 and a = 87.5 mm, f_max = 1.25 kHz. In principle, 128 microphones are needed to meet the sampling conditions out to 20 kHz. Fortunately, as we shall see in Section 3.4, simple modifications of the full-bandwidth interpolation procedure reduce this requirement significantly.

3.3.2 Exact Response

We now compare the spectra for the exact HRTF and the interpolated HRTF. Similar errors also appear in the ITD and the ILD. Ideally, we would like to see no difference between the exact HRTF H(ω, α) given by Eq. (1) and the interpolated HRTF Ĥ(ω, α) given by Eq. (7). We used Eq. (7) and the algorithm given in [33] to compute Ĥ_N(ω, α) for the case a = 87.5 mm and for several different values of N. In every case we use the microphone configuration shown in Fig. 10, with the first microphone at the top.

Fig. 13(a) shows the dB magnitude of Ĥ_N(ω, α) for the case where N = 8 and where the source is directed at the first microphone. The error, shown in Fig. 13(b), is small at low frequencies, and there is no error at all when the ear is positioned at one of the microphones. This is reflected visually in the clear division of the image into eight sectors, defined by the eight radial streaks in the Fig. 13 images. However, as Eq. (8) predicts, significant errors occur above about 1.25 kHz. The negative errors, shown as dark spots in Fig. 13(b), are a consequence of phase interference. The positive errors, shown as bright streaks at the bottom of Fig. 13(b), stem from the failure of the interpolation procedure to properly reproduce the strong head shadow that occurs between microphone locations.

⁵ The microphones can be thought of as sampling the sound field spatially. From sampling theory one might expect that it would be sufficient to have two samples per wavelength. However, that would require interpolation involving more microphones and a more sophisticated interpolation procedure.

Fig. 13. (a) Magnitude response for nearest microphone full-bandwidth interpolation for eight-microphone MTB array. As in Fig. 9, frequency range is from 100 Hz to 10 kHz. In the worst case, a deep interference notch occurs near 2.5 kHz, which is twice f_max. (b) Interpolation error. Dotted circle identifies 1-kHz frequency contour; solid contours are for −3 dB. Error is small for frequencies ≤ 1.5 kHz.
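The numbers above are easy to check. A short sketch (our function names, not code from the paper) evaluates Eq. (8) and the w = 0.5 comb filter, whose magnitude is |cos(πfT)|:

```python
import math

def f_max(N, a, c=343.0):
    """Eq. (8): highest frequency for acceptable linear interpolation,
    f_max = N*c / (8*pi*a), with a in meters and c in m/s."""
    return N * c / (8.0 * math.pi * a)

def comb_gain_db(f, T):
    """Gain in dB of the w = 0.5 comb filter H = 0.5*(1 + exp(-j*2*pi*f*T));
    its magnitude is |cos(pi*f*T)|."""
    return 20.0 * math.log10(abs(math.cos(math.pi * f * T)))

print(round(f_max(8, 0.0875)))    # about 1.25 kHz for N = 8, a = 87.5 mm
print(round(f_max(128, 0.0875)))  # about 20 kHz for N = 128

T = 1e-3                          # an arbitrary illustrative delay
print(round(comb_gain_db(1.0 / (4.0 * T), T), 2))  # -3.01 dB at f = 1/(4T)
```

The first notch at f = 1/(2T) drives the gain to −∞ dB, which is the deep interference notch visible near 2.5 kHz in Fig. 13.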
The results of this procedure are shown in Fig. 17 for the case of eight microphones and ideal low-pass and high-pass filters with a cutoff frequency of 1.5 kHz. Below the cutoff frequency both the ITD and the ILD are essentially correct. However, above the cutoff frequency both the ITD and the ILD are zero. The single complementary microphone is able to restore the spectral energy above the cutoff frequency, but the high frequency directional cues are wrong for sound sources that are not in the median plane.

Because the auditory system is not sensitive to phase at high frequencies, the erroneous high-frequency ITD cues may not be serious. However, the erroneous ILD cues produce perceptual errors. Sound sources that have little energy above the cutoff frequency tend to be heard correctly. Sound sources that have most of their energy above the cutoff frequency appear to be in the median plane, usually at or near the center of the head. In our informal tests, broad-band sound sources often produce a split sound image, with a low-frequency image heard correctly and a high-frequency image heard near the center of the head. Thus this simple procedure is potentially very useful for speech and similar limited-bandwidth applications, but is less than ideal.

3.4.2 Nearest Microphone Restoration

The high-frequency ILD can be roughly restored by using the nearest microphone to provide the high-frequency information, that is, by letting x_c(t) = x_n(t). This sample-and-hold approach for the high frequencies leads to the magnitude responses shown in Fig. 18.

It is important to note that the performance of this procedure is significantly better than the elementary nearest microphone procedure illustrated in Fig. 11. For wide-band sources one can still hear the spectral discontinuities that are clearly visible in Fig. 18(a). These can be perceived as sudden changes in the brightness of the tone color, perhaps accompanied by small jumps in location due to jumps in the high-frequency ILD. However, below 1.5 kHz both the temporal and the spectral cues vary continuously with the head motion, which largely eliminates the positional jumps heard with the simpler procedure.

3.4.3 Spectral-Interpolation Restoration

It is also possible to use a small number of microphones and spectral interpolation to obtain high-frequency content that varies continuously with the head motion. Let M_n(ω) be the magnitude of the short-time Fourier transform of x_n(t), and let M_nn(ω) be the magnitude of the short-time Fourier transform of x_nn(t). Then we can estimate the magnitude of the short-time Fourier transform of x(t) by

M_c(ω) = (1 − w)M_n(ω) + wM_nn(ω)   (9)

and we can use any of several standard methods to recover the time signal x_c(t) from M_c(ω) [42], [43].

The magnitude responses resulting from this procedure are shown in Fig. 19. The responses now vary continuously with the head motion, and there are no artifacts from switching discontinuities. As with all of the two-band procedures, the ITD cues are properly reproduced. With 32 microphones
the ILD is also properly reproduced, and it is reasonably well approximated even with only eight microphones.

3.5 Comparison of Interpolation Methods

We have presented three different methods for interpolation, the third method having three different ways to restore high frequencies. For brevity we identify these methods as follows:

NM   Nearest microphone selection
FB   Full-bandwidth interpolation
TB1  Two-band interpolation, fixed-microphone restoration
TB2  Two-band interpolation, nearest microphone restoration
TB3  Two-band interpolation, spectral-interpolation restoration.

Of these five methods, NM is the simplest, TB3 provides the best performance, and TB2 offers an attractive simplicity/performance compromise. However, each method might be the preferred choice for a particular application, and in this section we compare their different advantages and disadvantages. A concise summary is given in Table 1.

If the use of 16 or more microphones is acceptable, the conceptual simplicity of NM is attractive. However, with a small number of microphones the spatial instability and discontinuities make it unacceptable for music, and its use is limited to low-quality applications.

The instability and discontinuity problems of NM are largely removed by the other methods. Although the spectral notches introduced at high frequencies make FB unacceptable for music (see Figs. 13–15), in our informal listening tests FB is remarkably good for speech. It is worth observing that reflections from walls, tables, and other environmental surfaces also introduce spectral notches that change with changes in head position, and familiarity with these effects may account for part of the surprisingly small degree to which this spectral coloration is distracting.
The two-band methods (TB1, TB2, and TB3) exploit
the psychoacoustics of spatial hearing, with the ITD cues
being confined to the low-frequency band. All of them
essentially eliminate the spectral notches (see Figs. 17–
19). By using only one full-bandwidth channel, TB1 is
particularly efficient in the use of bandwidth. The price for
this bandwidth efficiency is the absence of high-frequency
ILD cues and the appearance of split images for wide-band
sources. However, TB1 is an attractive option for speech,
and it may even be acceptable for moderate-quality music
if the dominant sources are more or less in front of the
listener. By sacrificing bandwidth efficiency and including
the high-frequency ILD cues, TB2 provides good perfor-
mance on music as well as speech with a small number of
microphones. However, the spectral differences in the dif-
ferent sectors revealed in Fig. 18(a) are audible. TB3 re-
moves this flaw at the cost of higher computational
requirements.
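The spectral-interpolation restoration of Eq. (9) might be sketched as follows. The STFT helpers and names are ours, and resynthesizing with the phase of the nearer microphone is a simplification standing in for the magnitude-inversion methods of [42], [43]:

```python
import numpy as np

def stft(x, frame=256, hop=128):
    win = np.hanning(frame)
    frames = [win * x[i:i + frame] for i in range(0, len(x) - frame + 1, hop)]
    return np.fft.rfft(np.asarray(frames), axis=1)

def istft(spec, frame=256, hop=128, length=None):
    win = np.hanning(frame)
    n = (spec.shape[0] - 1) * hop + frame
    out, norm = np.zeros(n), np.zeros(n)
    for k, f in enumerate(np.fft.irfft(spec, n=frame, axis=1)):
        out[k * hop:k * hop + frame] += win * f
        norm[k * hop:k * hop + frame] += win ** 2
    out[norm > 1e-12] /= norm[norm > 1e-12]
    return out[:length] if length else out

def tb3_highband(xn, xnn, w):
    """Eq. (9): M_c = (1 - w) M_n + w M_nn on STFT magnitudes, then
    resynthesis with the phase of the nearer microphone (a stand-in
    for the standard magnitude-inversion methods of [42], [43])."""
    Sn, Snn = stft(xn), stft(xnn)
    Mc = (1.0 - w) * np.abs(Sn) + w * np.abs(Snn)
    phase = np.angle(Sn if w < 0.5 else Snn)
    return istft(Mc * np.exp(1j * phase), length=len(xn))
```

In a full TB3 implementation this processing would be applied to the high band only, with the low band handled by the linear interpolation of Eq. (5).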
several classes of applications, which we term panoramic, frontal, and omnidirectional. In every case the basic principle is to make the sampling density proportional to the probability density for the ear locations.

With panoramic applications listeners are equally likely to turn to face any position around a full horizontal circle, but will usually not tilt or roll their heads. For these applications it is appropriate to have the N microphones equally spaced around the equator of a sphere [see Fig. 20(a)]. This is the case that was analyzed in Section 3.

Frontal applications are a restricted form of panoramic applications in which there is a preferred direction of attention, and listeners usually restrict head motions to turning no more than 45° from side to side. This is typically the situation for musical and theater performances. For frontal applications the microphones can be spaced more closely along the sides of the sphere [see Fig. 20(b)]. If the listener turns his or her head beyond the limit of an outermost microphone, one could either continue to interpolate between the more widely separated microphones, or maintain the signal from the outermost microphone. With the latter procedure the sound field is no longer stabilized beyond the maximum angle of head rotation, but there are no sudden spectral artifacts from phase cancellation. Once again, the preferred choice is application dependent.

For true omnidirectional applications, the microphones should be spaced uniformly over the sphere. We use the formula for a hexagonal grid to estimate the number of microphones required to cover a sphere of radius a with quarter-wavelength sampling, obtaining

N ≈ (128π/3)(a f_max/c)².   (10)

For f_max = 1.5 kHz, a = 87.5 mm, and c = 343 m/s this formula calls for 20 microphones. Because eight-track recordings are technically convenient, and because sampling near the top and bottom may not be necessary, a 16-microphone configuration is attractive for practical omnidirectional applications.

Fig. 20. Appropriate sampling patterns. (a) Panoramic application. (b) Frontal application. (c) Omnidirectional application. Arrow in (b) points in direction of listener’s preferred orientation.

4.2 Customization to Individual Listeners

Basic MTB sound reproduction can be thought of as rendering spatial sound by substituting the HRTF of a sphere for the HRTF of the listener. Even though people can adapt to perceptual distortions, it is well known that people are better at localizing sounds with their own HRTFs than with other people’s HRTFs [44], [45]. For many applications absolute localization may not be important, and the dynamic cues may more than compensate for the loss of spectral cues [17]. However, for some applications performance can be improved significantly by customizing the procedure to the individual listener. In this section we present some possible customization techniques.

4.2.1 Head Size

It was observed in the Introduction that if the radius a of the sphere differs significantly from the radius b of the listener’s head, the apparent locations of the sound sources are not stable, but shift systematically with head motion. Specifically, if a < b, the perceived motion is in the direction of the listener’s motion, whereas if a > b, the motion is retrograde.

In the United States, 98% of the adult population has a head radius within approximately ±15% of the mean [46]. Thus for most listeners who turn their heads through an angle θ, the magnitude of the apparent angular motion of the source is at worst 0.15θ. This is usually a small effect, but it may be important for demanding applications.

For frontal applications in which the sound sources of interest are in front of the listener, the disturbance can be reduced significantly by simply replacing the measured head rotation angle θ by the scaled value (b/a)θ, limiting the magnitude of the result to 90°. Equivalently, the listener can be allowed to adjust the scale factor interactively until the perceived stability of the sound image is maximized.

4.2.2 Pinna Compensation

An isolated sphere is only a first approximation to the human head, and sounds captured by an MTB array lack the directional cues provided by the torso and pinnae. In static listening tests increased front/back confusion and excessive elevation are commonly experienced consequences of the lack of pinna cues. Although head motion cues resolve front/back confusion and help to establish elevation, people listening to basic MTB reproduction frequently comment that sound sources appear to be elevated. In addition some listeners comment that a source seems to rise in elevation when they turn to face it.

The effects of the pinna on the HRTF have been studied extensively, but are still not completely understood, the role of the so-called pinna notch being particularly controversial [4], [17], [47]–[51]. It is possible, of course, to affix nonindividualized, “average” pinnae to the surface used for an MTB array, just as is done for dummy-head recordings. This is particularly attractive for frontal applications, where left ears can be used on the left side of the head and right ears on the right side. However, in addition to being visually intrusive, acoustic interference between adjacent pinnae will introduce spectral disturbances if the spacing between microphones is small.

In informal listening tests we have found that both the excessive apparent elevation of sound sources and its dependence on head rotation can be reduced by inserting a filter that introduces a simulated pinna notch. A typical filter has a center frequency from 5 to 8 kHz, a Q of 3, and a 20-dB depth. Of course the elevation cues provided by the pinna depend on both the listener and the source location, and a fixed pinna-notch filter cannot provide the proper correction for all listeners and all source locations. However, one can allow the listener to adjust the filter
parameters for best results. Furthermore the characteristics of the pinna notch change slowly for sources in the anterior horizontal plane, which makes this form of pinna correction particularly effective for frontal applications. Progress on this form of customization is reported in [52].

4.3 MTB in Virtual Auditory Space

The MTB method for spatial sound reproduction is also applicable to computer-generated spatial sound and provides the same benefits: stabilized sound images, elimination of front/back confusion, and support for an arbitrary number of simultaneous listeners.

A typical system for generating virtual auditory space includes subsystems for modeling the source, modeling the acoustics of the room, and modeling the listener [26], [28]. Usually either individualized or nonindividualized HRTFs are used to model the listener. If head tracking is used, the HRTF can be changed dynamically, but if several listeners are using the system simultaneously, the headphone output must be rendered for each listener separately.

With the use of MTB, these individual two-channel, variable-location HRTFs are replaced by the N-channel, fixed-location HRTFs for the MTB array. This HRTF can be approximated by a simple and very efficient fixed-pole variable-zero-plus-delay model [53]. With this approach the computational requirements for supporting N listeners are not substantially greater than the computational requirements for supporting one listener.

4.4 Alternative Mounting Surfaces

To simplify analysis, up to this point we have assumed that the microphones are mounted on the surface of a sphere. However, other alternatives may be preferable. For example, the microphones could be effectively suspended in space by supporting them by stiff rods, they could be mounted on any surface of revolution about a vertical axis, or they could be mounted on the flat surfaces of a vertical prism. Fig. 21 shows two experimental MTB arrays, one with the microphones mounted on a truncated cylinder and the other with the microphones mounted on a sphere.

Any of the nonspherical surfaces have the advantage of not developing a strong bright spot, and thus behaving more like the HRTF for a human head. Other considerations such as ease of manufacturing, ruggedness, directional sound properties, or aesthetic appeal can also affect the final selection.

4.5 Applications

There are many potential applications for MTB sound capture and reproduction. They can be broadly classified into three categories: 1) remote listening, 2) recording, and 3) immersive interactive multimedia. We consider each of these briefly in turn.

Teleconferencing and collaborative work systems are obvious remote listening applications for MTB. These are typically frontal applications, and thus can be customized to individual listeners for optimum performance [52]. MTB can also expand the functionality of omnidirectional surveillance and security systems, and is potentially valuable for the remote operation of equipment (teleoperations). In particular, it is well known that divers have difficulty localizing sound sources because the higher speed of sound under water leads to small ITDs. If the radius of the array can be scaled appropriately, an MTB array could prove useful in underwater activities.

Home theater sound and musical entertainment recordings are frontal applications, and thus both can be customized to individual listeners. Although it would be best to make new recordings using MTB microphone arrays, it is also possible to convert legacy recordings into the MTB format. Because the locations of the sound sources for surround-sound recordings are known exactly, either generalized or individualized pinna cues can be added to control elevation and to enhance static front/back discrimination.

In Section 4.3 we described how a virtual MTB system can be used to generate sound for virtual auditory space. The main advantage of this approach stems from its efficiency at supporting multiple simultaneous listeners. MTB also offers a simple and effective way to enhance video and computer games. Finally, creating augmented reality systems by combining remote listening and virtual auditory space provides a particularly attractive application of MTB technology [54]. The live sounds can be acquired directly by an MTB array, and the virtual sounds can be efficiently rendered in MTB format. In this way any of the remote listening applications described here can be enhanced with computer-generated audio information.
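Two of the calculations above can be checked or sketched in code: the hexagonal-grid microphone count of Eq. (10), and a simulated pinna-notch filter with the stated parameters (center frequency in the 5–8-kHz range, Q of 3, 20-dB depth). The biquad realization below uses the standard RBJ peaking-EQ formulas as a stand-in for whatever filter structure an implementation might choose; all function names are ours:

```python
import math

def num_mics_omnidirectional(f_max, a, c=343.0):
    """Eq. (10): hexagonal-grid estimate of the number of microphones
    for quarter-wavelength coverage of a sphere of radius a,
    N ~ (128*pi/3) * (a*f_max/c)**2."""
    return (128.0 * math.pi / 3.0) * (a * f_max / c) ** 2

print(round(num_mics_omnidirectional(1500.0, 0.0875)))  # 20

def pinna_notch_coeffs(f0=7000.0, fs=44100.0, q=3.0, depth_db=20.0):
    """Simulated pinna notch as an RBJ peaking-EQ biquad used as a cut:
    depth_db of attenuation at f0 (7 kHz here, inside the 5-8-kHz range)
    with quality factor q.  Returns normalized (b, a) coefficients."""
    A = 10.0 ** (-depth_db / 40.0)            # cut, so negative gain
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a0 = 1.0 + alpha / A
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha / A) / a0]
    return [bi / a0 for bi in b], a

def biquad(x, b, a):
    """Direct-form I filtering of the sequence x (a is normalized)."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(yn)
        x1, x2, y1, y2 = xn, x1, yn, y1
    return y
```

The listener-adjustable parameters suggested in Section 4.2.2 map directly onto f0 and depth_db.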
5 CONCLUSION
6 ACKNOWLEDGMENT
tions to the implementation of the MTB system. Support Vestibular and Visual Cues in Sound Localization,” J.
was provided by the National Science Foundation under Exper. Psychol., vol. 27, pp. 339–368 (1940 Apr.).
grants IIS-00-97256 and ITR-00-86075. Any opinions, [17] H. G. Fisher and S. J. Freedman, “The Role of the
findings, and conclusions or recommendations expressed Pinna in Auditory Localization,” J. Audit. Res., vol. 8, pp.
in this material are those of the authors and do not neces- 15–26 (1968).
sarily reflect the view of the National Science Foundation. [18] F. L. Wightman and D. L. Kistler, “Resolution of
Front–Back Ambiguity in Spatial Hearing by Listener and
7 REFERENCES Source Movement,” J. Acoust. Soc. Am., vol. 105, pp.
2841–2853 (1999 May).
[1] J. D. Johnston and Y. H. Lam, “Perceptual Sound- [19] M. F. Davis, “History of Spatial Coding,” J. Audio.
field Reconstruction,” presented at the 109th Convention Eng. Soc. (Features), vol. 51, pp. 554–569 (2003 June).
of the Audio Engineering Society, J. Audio Eng. Soc. (Ab- [20] F. Rumsey, Spatial Audio (Focal Press, Oxford,
stracts), vol. 48, p. 1102 (2000 Nov.), preprint 5202. UK, 2001).
[2] W. G. Gardner, 3-D Audio Using Loudspeakers [21] M. M. Boone, “Acoustic Rendering with Wave
(Kluwer Academic, Boston, MA, 1998). Field Synthesis,” in Proc. ACM SIGGRAPH and Euro-
[3] J. Bauck, “A Simple Loudspeaker Array and Asso- graphics Campfire: Acoustic Rendering for Virtual Envi-
ciated Crosstalk Canceler for Improved 3D Audio,” J. Au- ronments (Snowbird, UT, 2001 May).
dio Eng. Soc., vol. 49, pp. 3–13 (2001 Jan./Feb.). [22] M. A. Gerzon, “Ambisonics in Multichannel
[4] J. Blauert, Spatial Hearing: The Psychophysics of Broadcasting and Video,” J. Audio Eng. Soc., vol. 33, pp.
Human Sound Localization, rev. ed. (MIT Press, Cam- 859–871 (1985 Nov.).
bridge, MA, 1997). [23] J. S. Bamford and J. Vanderkooy, “Ambisonic
[5] J. C. Middlebrooks and D. M. Green, “Sound Lo- Sound for Us,” presented at the 99th Convention of the
calization by Human Listeners,” Ann. Rev. Psychol., vol. Audio Engineering Society, J. Audio Eng. Soc. (Ab-
42, no. 5, pp. 135–159 (1991). stracts), vol. 43, p. 1095 (1995 Dec.), preprint 4138.
[6] S. Carlile, “The Physical and Psychophysical Basis [24] K. de Boer and A. T. van Yrk, “Some Particulars of
of Sound Localization,” in Virtual Auditory Space: Gen- Directional Hearing,” Philips Tech. Rev., vol. 6, pp.
eration and Applications, S. Carlile, Ed. (R. G. Landes, 359–364 (1941).
Austin, TX, 1996), pp. 27–78. [25] U. Horbach, A. Karamustafaoglu, R. Pellegrini, P.
[7] F. L. Wightman and D. J. Kistler, “Factors Affecting Mackensen, and G. Theile, “Design and Applications of a
the Relative Salience of Sound Localization Cues,” in Bin- Data-Based Auralization System for Surround Sound,”
aural and Spatial Hearing in Real and Virtual Environ- presented at the 106th Convention of the Audio Engineer-
ments, R. H. Gilkey and T. R. Anderson, Eds. (Lawrence ing Society, J. Audio Eng. Soc. (Abstracts), vol. 47, p. 528
Erlbaum Assoc., Mahwah, NJ, 1997), pp. 1–23. (1999 June), preprint 4976.
[8] E. A. Macpherson and J. C. Middlebrooks, “Listener [26] D. R. Begault, 3-D Sound for Virtual Reality and
Weighting of Cues for Lateral Angle: The Duplex Theory Multimedia (AP Professional, Boston, MA, 1994).
of Sound Localization Revisited,” J. Acoust. Soc. Am., vol. [27] B. Shinn-Cunningham and A. Kulkarni, “Recent
111, Prt. 1, pp. 2219–2236 (2002 May). Developments in Virtual Auditory Space,” in Virtual Au-
[9] S. K. Roffler and R. A. Butler, “Factors that Influ- ditory Space: Generation and Applications, S. Carlile, Ed.
ence the Localization of Sound in the Vertical Plane,” J. (R. G. Landes, Austin, TX, 1996), pp. 185–243.
Acoust. Soc. Am., vol. 43, pp. 1255–1259 (1967 Dec.). [28] L. Savioja, J. Huopaniemi, T. Lokki, and R. Vään-
[10] V. R. Algazi, C. Avendano, and R. O. Duda, “El- änen, “Creating Interactive Virtual Acoustic Environ-
evation Localization and Head-Related Transfer Function ments,” J. Audio Eng. Soc., vol. 47, pp. 675–705 (1999
Analysis at Low Frequencies,” J. Acoust. Soc. Am., vol. Sept.).
109, pp. 1110–1122 (2001 Mar.). [29] E. M. Wenzel, F. L. Wightman, D. J. Kistler, and
[11] M. B. Gardner, “Distance Estimation of 0° or Ap- S. H. Foster, “The Convolvotron: Realtime Synthesis of
parent 0°-Oriented Speech Signals in Anechoic Space,” J. Out-of-Head Localization,” presented at the 2nd Joint
Acoust. Soc. Am., vol. 45, pp. 47–53 (1969 Jan.). Meeting of the Acoustical Societies of America and Japan
[12] D. S. Brungart, “Auditory Localization of Nearby (Honolulu, HI, 1988 Nov.).
Sources. III. Stimulus Effects,” J. Acoust. Soc. Am., vol. [30] K. Inanaga, Y. Yamada, and H. Koizumi, “Head-
106, pp. 3589–3602 (1999 Dec.). phone System with Out-of-Head Localization Applying
[13] A. W. Bronkhorst and T. Houtgast, “Auditory Dis- Dynamic HRTF (Head-Related Transfer Function),” pre-
tance Perception in Rooms,” Nature, vol. 397, pp. sented at the 98th Convention of the Audio Engineering
517–520 (1999 Feb.). Society, J. Audio Eng. Soc. (Abstracts), vol. 43, pp. 401,
[14] P. T. Young, “The Role of Head Movements in 402 (1995 May), preprint 4011.
Auditory Localization,” J. Exper. Psychol., vol. 14, pp. [31] H. Møller, D. Hammershøi, C. B. Jensen, and M. F.
96–124 (1931). Sørensen, “Evaluation of Artificial Heads in Listening
[15] H. Wallach, “On Sound Localization,” J. Acoust. Tests,” J. Audio Eng. Soc., vol. 47, pp. 83–100 (1999 Mar.).
Soc. Am., vol. 10, pp. 270–274 (1939 Apr.). [32] P. Minnaar, S. K. Olesen, F. Christensen, and H.
[16] H. Wallach, “The Role of Head Movements and Møller, “Localization with Binaural Recordings from Ar-
1154 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November
PAPERS MOTION-TRACKED BINAURAL SOUND
THE AUTHORS
ALGAZI ET AL. PAPERS
V. Ralph Algazi received a degree of Ingénieur Radio from l'Ecole Supérieure d'Electricité (ESE), Paris, France, and M.S. and Ph.D. degrees from the Massachusetts Institute of Technology, Cambridge, in 1952, 1955, and 1963, respectively.

He was at MIT from 1959 to 1965 as a research and teaching assistant and then as a postdoctoral fellow and assistant professor. On the faculty of the University of California, Davis, since 1965, he was chairman of the Department of Electrical and Computer Engineering from 1975 to 1986. He founded CIPIC, the Center for Image Processing and Integrated Computing, in 1989 and served as its director until 1994. He is now a research professor at CIPIC, pursuing research interests in signal processing, engineering applications of human perception for both speech and images, and image and video processing and coding.

Dr. Algazi is a life senior member of the IEEE and is a member of the AES, SPIE, and AAAS.

●

Richard O. Duda was born in Evanston, IL, in 1936. He received B.S. and M.S. degrees in engineering from UCLA, Los Angeles, CA, in 1958 and 1959, respectively, and a Ph.D. degree in electrical engineering from MIT, Cambridge, MA, in 1962. He was in the Artificial Intelligence Center at SRI International from 1962 to 1980, serving as a visiting professor at the University of Texas at Austin during the 1973/74 academic year. From 1980 to 1983 he was at the Laboratory for AI Research at Fairchild Semiconductor, after which he joined Syntelligence. In 1988 he became emeritus professor of Electrical Engineering at San Jose State University, and currently is a visiting professor in the Department of Electrical and Computer Engineering at the University of California at Davis. His research interests include pattern recognition, image analysis, expert systems, auditory scene analysis, and the localization and synthesis of spatial sound.

Dr. Duda is the coauthor with Peter Hart and David Stork of Pattern Classification, 2nd Ed. (Wiley-Interscience, 2001). He is a member of the Audio Engineering Society and the Acoustical Society of America, and is a fellow of the IEEE and the American Association for Artificial Intelligence.

●

Dennis M. Thompson was born in Bradenton, FL, in 1958. He studied electronic technology at the College of the Redwoods, Eureka, CA. He is currently working toward a degree in electrical engineering at the University of California at Davis.

In 1958 he started Yknot Sound, a regional sound company that specializes in live music PA systems. Currently he is working at the CIPIC Interface Lab, where he designs hardware and software. His main research interest is high-quality 3-D sound reproduction. He still enjoys working in concert hall reinforcement, with an emphasis on quality over quantity.

He is a student member of the Audio Engineering Society.
PAPERS PHASE IN THE SINUSOIDAL MODEL

phase can be modeled in sound transformations and pure synthesis.

1.1 Does Phase Affect Timbre?

One of the first experiments concerning the perception of timbre of complex tones in relation to phase was conducted by von Helmholtz [3]. Using a special technique he was able to generate complex tones consisting of eight sinusoids (partials) with variable phase and fundamental frequencies of 120 and 240 Hz. Helmholtz concluded that "the changes in timbre are not distinct enough to be observed after a few seconds required to alter the phases; anyhow these changes are too small to transform one vowel in another," and "harmonics beyond the sixth to eighth give dissonances and beats, so it is not excluded that, for these higher harmonics, a phase effect does exist." These conclusions have often been interpreted as indicating that phase has no influence on timbre [4], even though later experiments showed otherwise [5], [6].

Plomp and Steeneken [4] conducted a number of experiments involving complex tones with ten harmonic partials and equal spectral envelope, but different phase shifts. The most important finding was that the maximum effect of phase on timbre perception occurs when a tone containing harmonic partials that all start at sine phase (0°) is compared to one where the partials alternate between sine phase and cosine phase (90°). The effect of reducing the level of each successive partial by 2 dB was greater than the maximum phase effect described earlier. Also the effect of phase on timbre appeared to be independent of the sound level.

Patterson [7] presented psychoacoustic experiments involving alternating phase (APH) waves, that is, harmonic partials in which even partials start in cosine phase while odd ones start in cosine phase + D°. It was found that the value of D leading to a just noticeable difference (JND) between a sound with partials in cosine phase and an APH sound was lower for sounds with high bandwidth, low repetition rate, and high signal level. The signal duration was found to have no, or very little, effect on the JND. Progressively improved models, using summary autocorrelation [8], [9], auditory imaging [10], and models including the behavior of early cortical stages, using the summary measure of spectrograms [11], have provided explanations for the observed effect of phase.

Alcántara et al. [12] studied the influence of phase on the identification of vowel-like sounds. The "vowels" were created by increasing the level of three pairs of successive harmonic partials. They found better identification when the components had cosine starting phase than when they had random phase, and poorer performance for weaker stimuli. Pressnitzer and McAdams [13] studied the influence of phase on roughness perception and found that roughness is linked to the shapes of the waveforms at the output of the simulated auditory filter. Roberts et al. [14] showed that phase shifts could influence stream segregation for rapid sound sequences. Gockel et al. [15] studied the influence of phase on loudness and forward masking produced by harmonic complex tones. They found that a tone with components added in cosine phase was louder but was a less effective forward masker than a tone with components added in random phase.

Whereas phase changes are detectable in controlled situations, as even polarity change was found to be audible in two-component signals [16], the discrimination of phase changes in individual components often requires specific phase alignment, such as cosine phase [17].

1.2 Importance of Phase in Transients

Patterson and Green [18] used Huffman sequences, in which the phases can be varied independently of the energy spectrum, to assess the discrimination of phase changes in transients. They found that phase changes could be discriminated reliably, for some stimulus waveforms, for durations above 5 ms.

Wakefield et al. [19] conducted a study of the perception of transients using filtered noise, where a two-interval forced-choice adaptive psychophysical procedure was used to find the JND between a given sound and a copy of the sound where the magnitude spectrum was smoothed and the phase spectrum held constant. The surprising result was that the JND depended strongly on the phase pattern used. It was concluded that "the effect for short duration signals is greater than what the (sparse) literature on the auditory perception of transients would suggest."

The perception of clicks and chirps was further investigated by Uppenkamp et al. [20]. The up-chirps used by Uppenkamp et al. are signals constructed to contain the same frequencies as clicks, but where the phase is manipulated to compensate for the spatial dispersion along the cochlea. Up-chirps should therefore reach maximum amplitude at the same moment in time at all places of the basilar membrane. They compared the perceived "compactness" of clicks to that of chirps and found that clicks were perceptually more compact than up-chirps, but that down-chirps, that is, up-chirps reversed in time, sounded more compact than up-chirps. Even though up-chirps are aligned in time at the basilar membrane output, they have a longer within-channel impulse response than down-chirps and clicks. This suggests that "the perceived 'compactness' of a sound is apparently more determined by the fine structure of excitation within each peripheral channel than by between-channel phase differences."

1.3 Phase Models

Schroeder [21] reported a number of effects related to sounds with up to 31 harmonic partials. Most interesting is the reported strong dependence of timbre on the peak factor. The peak factor can be minimized via an analytical approximation equation [22]. The synchronization index model (SIM) of Leman [23] employs a functional model of the auditory periphery and a method of predicting the roughness of a sound. This model was used by Tind and Jensen [24] to devise a propagation formula of the phase shifts that control the roughness output of the SIM. By basing the propagation formula on the roughness prediction for three partials, they obtained a correspondence between the roughness control parameter and the predicted roughness for complex harmonic sounds. They concluded that there exists a (nonunique) phase shift for a given perceptual roughness of complex harmonic sounds.

2 SINUSOIDAL ANALYSIS/SYNTHESIS

This study is based on the analysis-by-synthesis methodology [25], using additive (sinusoidal) analysis/synthesis techniques. In the additive framework, sounds are modeled as a sum of sinusoids with time-varying amplitudes, frequencies, and sometimes also phase shifts.

The short-time Fourier transform (STFT) [26] is a related technique that can be used for analysis/synthesis and transformations of sounds [27]. In the STFT, overlapping blocks of the windowed sound are Fourier transformed, modified, and inverse Fourier transformed. However, for harmonic sounds or sounds with strong partials, the frequency components between the strong partials are masked. Because a large number of the frequency components in the STFT are masked, the number of parameters used to model the sound can be greatly reduced. This is the assumption in the additive model that is used in this work. The additive model is chosen for two reasons. First, it is well suited for further high-level modeling of musical sounds. Second, the additive model is being used in many research and development prototypes today [28]–[31], and thus it provides a stable framework for exploration in the perception of natural sounds. The additive model has, however, several shortcomings. The transients are often smeared in block-based analysis/synthesis, and noise is not well represented.

Several methods exist for determining the time-varying amplitudes and frequencies of the harmonic partials. Already in the last century, musical instrument tones were divided into their Fourier series [3]. Early techniques for the time-varying analysis of the additive parameters are presented by Matthews et al. [32] and Freedman [33]. Today the most common technique for the additive analysis of musical signals is based on STFT analysis [2]. In order to retain the noise components, several noise models of musical sounds have been presented, including the residual noise model in the fast Fourier transform (FFT) [28], [2], the bandwidth-enhanced additive synthesis [34], [1], and the narrow-band basis functions (NBBF) in speech models [35]. In order to improve the frequency, and in particular the time resolution, that is, to better retain the transient behavior of percussive musical instruments, time–frequency based methods [1], [36] could be used, and the time and frequency reassignment method [37] has recently gained popularity [38], [34]. Ding and Qian [39] have presented an interesting method for improving the time resolution, fitting a waveform by minimizing the energy of the residual. This was improved and dubbed adaptive analysis by Röbel [40].

We used a software package developed by the authors [1], [41], previously used in explorations of the timbre of musical instruments. It is based on the classic peak-picking method, where overlapping blocks are windowed, and the amplitudes, frequencies, and phases are found from interpolated peaks of the magnitude of the FFT. This method has been shown to work well in forming stable sinusoidal tracks over time from analyzed recordings of instrument sounds. The synthesis quality of a comparable analysis method, when phase information is not used, has previously been measured to be equal to, or better than, "perceptible but not annoying," when compared to the original recorded monophonic sounds [1].

2.1 Analysis

For each sound under analysis, the fundamental frequency ω0 is estimated using autocorrelation [42]. This method, which is applicable only to monophonic quasi-harmonic sounds, is used to determine the fixed block size used in the analysis of the given sound. For each block of sound k under analysis, a new local measure ω_{k,0} of the fundamental frequency is calculated, again by use of autocorrelation. From this measure an FFT is performed and a search for peaks is done near the regions of the quasi-harmonic frequencies. The amplitude A_{k,i} and frequency ω_{k,i} are stored for each partial i and time frame k.

In order to retain more of the additive noise components, a method inspired by the NBBF [35] has been employed here, in which sinusoids are estimated in between the harmonic partials if the fundamental frequency is above 400 Hz. This method essentially retains noises such as hammer noise or the additive noise in wind instruments. Peaks from adjacent blocks are connected to form sinusoidal tracks. The system has been extended to output not only the amplitude and frequency of the tracks, but also the phase φ_{k,i} for each block. To model the phase over time, high precision of the estimated phase values is required. To achieve this it was found necessary to extend the length of each analysis block from 2.8 periods of the fundamental period length to 4 periods. By doing this, the time resolution is affected, and thus the sound quality of transient sounds is degraded.

2.2 Synthesis

The sound is synthesized using the analysis parameters in the following way:

    s(n) = Σ_{i=0}^{N} A_i(n) cos[φ_i(n)]   (1)

for N partials, where φ_i(n) denotes the time-varying phase for partial i and sample index n. In practice the values of A_i(n) used in the synthesis are obtained by linear interpolation of the measured amplitude values between the block boundaries. Two methods for finding the phase φ_i are used:

Sa: Synthesis without measured phase information
Sb: Synthesis with measured phase information.

When synthesizing sound without the measured phase information φ_{k,i}, the phases of the sinusoidal tracks are found by the cumulative sum of the interpolated frequency values over time,

    φ_i(n) = Σ_{m=0}^{n} ω_i(m).   (2)
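Section 1.3 notes that the peak factor of a harmonic complex can be minimized via an analytical approximation [22]. The paper does not spell out the formula; the sketch below assumes the commonly used Schroeder phase rule for equal-amplitude partials and compares the resulting peak (crest) factor against an all-cosine-phase complex. Function names and parameters here are invented for illustration.

```python
import numpy as np

def harmonic_complex(phases, f0=100.0, fs=16000, dur=0.1):
    """Sum of equal-amplitude harmonic partials with given start phases."""
    t = np.arange(int(fs * dur)) / fs
    return sum(np.cos(2 * np.pi * (k + 1) * f0 * t + p)
               for k, p in enumerate(phases))

def crest(x):
    """Peak factor: peak amplitude over RMS."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

N = 31                                   # number of partials, as in [21]
k = np.arange(1, N + 1)
cosine_phase = np.zeros(N)               # all partials start in cosine phase
schroeder = -np.pi * k * (k - 1) / N     # assumed low-peak-factor phase rule

print(crest(harmonic_complex(cosine_phase)))  # ~ sqrt(2N): large peak factor
print(crest(harmonic_complex(schroeder)))     # considerably smaller
```

The waveforms have identical magnitude spectra; only the phases differ, which is exactly the peak-factor dependence on phase that the text describes.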
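The first step of Section 2.1, estimating the fundamental by autocorrelation of a monophonic quasi-harmonic block, can be sketched as follows. This is a minimal illustration under the stated assumptions, not the authors' implementation; names and the search range are invented for the example.

```python
import numpy as np

def estimate_f0(x, fs, fmin=80.0, fmax=1000.0):
    """Pick the strongest autocorrelation lag in a plausible pitch range.

    Only meaningful for monophonic, quasi-harmonic blocks, as noted in
    the text; the result can be used to fix the analysis block size."""
    x = x - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags >= 0
    lo, hi = int(fs / fmax), int(fs / fmin)            # candidate period range
    lag = lo + np.argmax(r[lo:hi])                     # strongest periodicity
    return fs / lag

# toy test tone: three harmonics of 220 Hz
fs = 16000
t = np.arange(2048) / fs
tone = sum(np.cos(2 * np.pi * 220 * h * t) / h for h in (1, 2, 3))
print(estimate_f0(tone, fs))   # close to 220 Hz
```

The integer-lag resolution is coarse at high frequencies, which is one reason the paper re-estimates a local ω_{k,0} per block rather than relying on a single global value.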
ANDERSEN AND JENSEN PAPERS
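The oscillator-bank synthesis of Eqs. (1) and (2), with linear interpolation of amplitude and frequency between block boundaries (method Sa, no measured phases), could look roughly like the sketch below. The frame data are invented toy values, and the function name is hypothetical.

```python
import numpy as np

def resynth_sa(A_frames, f_frames, hop, fs):
    """Method Sa: additive resynthesis without measured phase.

    A_frames, f_frames: (n_frames, n_partials) amplitudes and frequencies
    (Hz) at block boundaries.  hop: step size in samples.  Per Eq. (2),
    each partial's phase is the running sum of its interpolated frequency."""
    n_frames, n_partials = A_frames.shape
    t_bound = np.arange(n_frames) * hop            # block-boundary times
    t = np.arange((n_frames - 1) * hop)            # output sample times
    s = np.zeros(len(t))
    for i in range(n_partials):
        A = np.interp(t, t_bound, A_frames[:, i])  # linear amplitude interp
        f = np.interp(t, t_bound, f_frames[:, i])  # linear frequency interp
        phase = 2 * np.pi * np.cumsum(f) / fs      # Eq. (2), discrete form
        s += A * np.cos(phase)                     # Eq. (1)
    return s

# toy frames: three partials of a 220-Hz tone fading from 1.0 to 0.5
fs, hop = 16000, 512
f_frames = np.tile(220.0 * np.arange(1, 4), (5, 1))
A_frames = np.linspace(1.0, 0.5, 5)[:, None] * np.array([1.0, 0.5, 0.25])
s = resynth_sa(A_frames, f_frames, hop, fs)
```

Because the phase is a plain running sum, the measured start phases are discarded, which is exactly what distinguishes Sa from Sb in the listening experiment later in the paper.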
When synthesizing the sound using phase information, the phase trajectory is interpolated in such a way that boundary conditions are satisfied. This can be done by cubic interpolation [2] of the phase using the measured frequency and phase values. In this way the measured phase and frequency values are preserved at the block boundaries, but oscillating frequency tracks can occur between the block boundaries [39]. A solution to this problem is to use quadratic interpolation, in which the phase and frequency values cannot both be preserved at the boundaries. Instead a weighting factor is used to determine the importance of the estimated phase relative to the frequency [39]. However, no degradation caused by the oscillations in frequency has been found in this work; therefore the cubic interpolation method is used.

3 PHASE REPRESENTATION

The goal of additive phase modeling is to improve the sound quality in pure synthesis models such as the Timbre Engine, based on the timbre model [31], to improve the sound quality in time–frequency scaling of signals, and finally to gain a better understanding of the perception of musical signals. We chose to investigate the phase as a function of time, and thus a convenient representation of the phase trajectories over time is needed.

Fig. 1 shows the phase as a function of time and frequency for two types of musical sounds. Fig. 1(a) shows a stationary part of a soprano voice, where the phase progresses in a coherent way through time and frequency. The attack of a guitar note is shown in Fig. 1(b). Here the phase evolution over time is less coherent. The goal of the phase representation presented in this section is to describe stable sounds, such as the sustained part of most sounds from musical instruments, using a few parameters.

3.2 Phase Delay

The phase values φ(ω) obtained from the discrete Fourier transform, and thus also the values used in additive analysis, are specified as the phase shift in radians for each sinusoidal component. Another way to represent phase is as phase delay [43],

    P(ω) = −φ(ω)/ω   (3)

where φ(ω) is the phase at frequency ω, and P(ω) expresses the time delay in seconds relative to the center of the frame. The magnitude, phase, and phase delay as a function of frequency of a stable part of a saxophone sound are shown in Fig. 2. Phase delay is not a common way to represent phase in the sinusoidal model. However, it is shown here as it is used as the basis of the relative phase delay described in the following section.

Fig. 1. Phase as a function of time and frequency. Brightness represents phase (radians) between −π and π. (a) Sustained part of a soprano voice. (b) Attack of a guitar. Fundamental frequency of both sounds is approximately 500 Hz.
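For method Sb the paper interpolates the phase cubically between block boundaries so that the measured phases and frequencies are preserved there. The source cites [2] but does not give the closed-form solution, so the sketch below assumes the well-known McAulay–Quatieri construction; treat the formulas as a standard form rather than the authors' exact code.

```python
import numpy as np

def cubic_phase_track(phi0, w0, phi1, w1, T):
    """Cubic phase over T samples matching phase (rad) and frequency
    (rad/sample) at both block boundaries.

    M unwraps the end phase so the cubic is maximally smooth; the cubic's
    varying slope is the source of the oscillating frequency tracks that
    the text mentions can occur between boundaries."""
    M = np.round(((phi0 + w0 * T - phi1) + (w1 - w0) * T / 2) / (2 * np.pi))
    d = (phi1 + 2 * np.pi * M) - phi0 - w0 * T     # phase to make up
    a = 3 * d / T**2 - (w1 - w0) / T
    b = -2 * d / T**3 + (w1 - w0) / T**2
    t = np.arange(T + 1)
    return phi0 + w0 * t + a * t**2 + b * t**3

p = cubic_phase_track(0.0, 0.10, 2.0, 0.12, 100)
# p starts at phase 0.0 and ends at 2.0 up to a multiple of 2*pi
```

The endpoint derivatives of the cubic equal w0 and w1 by construction, so both the measured phases and the measured frequencies survive at the boundaries, as the text requires.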
However, in many cases it is not convenient to have forced the overall waveform characteristics are preserved, and
nonconstant step sizes during analysis, and thus another thus the phase delay of the fundamental can be chosen at
phase representation is needed. To overcome this problem, random. Having modified the phase delay of the funda-
while still being able to preserve the shape of the wave- mental, phase delays of the other partials are converted
form when doing time or pitch scaling, Di Federico pro- back into phase values,
posed a representation, relative phase delay (RPD) [44],
based directly on additive parameters, by representing the
phase trajectories as phase delays relative to the phase
i,k = mod 冋冉 1,k
1,k 冊 册
+ ⌬i,k i,k, 2 , k = 2, . . . , N
(6)
delay of the first partial. When performing time scaling,
the amplitudes and frequencies of the partials are left un- where 1,k is the phase of the fundamental.
touched, but the phase values of the fundamental are up- Relative phase delay is a representation that works well
dated using a propagation formula. After the new phase for harmonic sounds in that the waveform characteristics
values of the fundamental are found, the phase values of are preserved. However, to actually use this representation
the other partials are changed, based on their position rela- it is necessary to take phase wrapping into account. An-
tive to a fixed point in the fundamental period. other problem with the relative phase delay is that the
RPD is based on the definition of phase delay from Eq. phase delay calculated from the wrapped phase values
(3), and it is defined as approaches zero as the frequency increases. This makes it
difficult to compare relative phase delays and plot them
i,k visually. Finally if the sound is slightly inharmonic, a drift
i,k = (4)
i,k in the relative phase delays will occur for the partials. To
show this, imagine a nearly harmonic signal with two
where i is the index of the partial and k is the analysis sinusoids of start phase 0, one with a frequency of 110 Hz
frame index. expresses the distance in time between the and one with a frequency of 225 Hz. The sound is ana-
analysis frame center and a specific point in the partial lyzed using a step size of 100 ms. In the first analysis
period. frame the relative phase delay between the first and second
The relative phase delay is defined for the partials as the partials is 0. At the next frame, t ⳱ 0.1 s, the phase of each
difference between the phase delay of the fundamental and partial is
the partial i,
0,1 = mod共0.1 s ⭈ 2 ⭈ 110 Hz, 2兲 = 0
⌬i,k ⳱ i,k − 1,k. (5) (7)
1,1 = mod共0.1 s ⭈ 2 ⭈ 225 Hz, 2兲 = .
Since the relative phase delay ⌬i,k for i ⳱ 2, . . . , N is The phase delay of the first partial is 0,1 ⳱ 0/110 Hz ⳱
defined relative to a fixed point in the fundamental period, 0 s. The phase delay of the second is evaluated at an
Fig. 2. One analysis frame of sustained part of a saxophone sound. (a) Magnitude. (b) Phase. (c) Phase delay. +—spectral peaks in
magnitude plot.
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1161
ANDERSEN AND JENSEN PAPERS
integer multiple of the “fundamental frequency,” 2 ⭈ 110 tion. Furthermore, the new way of representing the phase
Hz ⳱ 220 Hz, 1,1 ⳱ /220 Hz ≈ 0.014 s, and thus a drift solves the phase unwrapping problem of the RPD.
in the relative phase delay of the second partial has oc-
curred. If the second partial had been at frequency 220 Hz, 3.5 Partial-Period Phase Representation
no drift would have occurred, and the relative phase delay To correct for drifting phase values in inharmonic
representation would have given a usable result. This drift sounds, an improvement to the fundamental-period phase
can be demonstrated on synthetic and recorded signals [41]. representation is proposed, the partial-period phase (PPP),
in which the phase is expressed relative to the same point
3.4 Fundamental Period Phase Representation between frames of the partial period instead of relative to
To overcome some of the problems of the relative phase a point in the fundamental period. The method presented
delay, an improved phase representation is proposed. One here bears some similarity to the phase propagation em-
of the goals that was achieved with the RPD was that ployed in STFT-based phase vocoders [45] when time or
phase values between frames could be compared. This is pitch scaling a signal.
usually possible only in a frame-based analysis when the Eqs. (10) and (8) then give
frame size is exactly an integer multiple of the fundamen-
k,i ⳱ k,i + ⌬tk,ik,i (12)
tal period. In this case the phase value is measured at the
same point in the waveform period for successive frames, where k is the frame number, i is the index of the partial,
and is thus comparable between frames. In the RPD this and ⌬tk,i is the time difference between the point at which
problem was overcome by using phase delays. Another the phase value is measured and the corrected value (see
way to make the phase values comparable between frames Fig. 3),
冋 冉 冊册
is used in the fundamental-period phase representation. In Ra − ⌬tk−1,i
this representation the measured phase values of the fun- ⌬tk,i = 1 − mod ,1 Lk,i. (13)
damental, k,0, for a given block k, are corrected by a linear Lk−1,i
change in phase, corresponding to a time difference ⌬tk,0 Fig. 4 shows an example of the difference between the
at the measured frequency k,0. ⌬tk,0 is defined as the time fundamental-period phase representation and the PPP rep-
difference between a fixed point in the fundamental pe- resentation. A segment of a piano sound is analyzed, and
riod, that is, a point in the period which is the same be- the corrected phase values for the first five partials are
tween frames, and the point where k,0 is measured. The shown. The piano sound is known to have stretched har-
point where k,0 is measured is dependent on the step size monic frequencies [46], and thus effectively demonstrates
Ra. More formally ⌬tk,0 can be found by the following the problem with RPD and fundamental-period phase rep-
formula, where Lk,0 represents the length of the fundamen- resentation. The partial-period phase representation [Fig.
tal period in the kth frame. 4(b)] is clearly superior to the fundamental-period phase
冋 冉 冊册
representation [Fig. 4(a)], removing the phase drift caused
Ra − ⌬tk−1,0
⌬tk,0 = 1 − mod ,1 Lk,0. (8) by nonharmonic partials.
Lk−1,0 All phase representations presented here preserve the
The modulus function is used to ensure that the phase phase information, and thus no degradation in sound qual-
correction stays in the interval between − and . Note ity results from the use of these representations. By sub-
that Lk,0 can be found by knowing the frequency of the stituting Ra in Eq. (13) with the synthesis frame length Rs
fundamental k,0, we obtain the phase values used in the cubic interpolation
when synthesizing,
2
Lk,0 = . (9) k,i ⳱ k,i − ⌬tk,ik,i (14)
k,0
Using Eq. (10) all partials in a given analysis frame are
The corrected phase for the fundamental k,0 can now be analyzed relative to the same time in the fundamental pe-
found, riod. In the partial-period representation of Eq. (12) each
k,0 ⳱ k,0 + ⌬tk,0k,0. (10) partial phase value is evaluated relative to the same point
in the last analysis frame of the particular partial. In a
The phase values of the other partials are corrected using transient or low-energy part of the sound, the estimation of
the same time difference ⌬tk,0 that was used in correcting the partial frequencies is likely to fail, resulting in new
the fundamental, absolute phase values, and thus occurrence of transients
k,i ⳱ k,i + ⌬tk,0k,i. (11) has to be taken into consideration when using the partial-
period phase representation. In practice the phase repre-
This representation is called the fundamental-period phase representation and is equivalent to the RPD, apart from the fact that the RPD uses phase delays measured in time to represent the phase differences, whereas radians are used in the fundamental-period phase representation. This means that we now have a representation that preserves the waveform characteristics and allows for a comparison of phase values between frames, as in the RPD representation. This representation and modeling should be coupled with a transient detector to signify in which portions of the sound the phase values are comparable.

1162 J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November

PAPERS PHASE IN THE SINUSOIDAL MODEL

4 EXPERIMENT: THE IMPORTANCE OF PHASE

The purpose of this experiment is to determine how important phase is with regard to sound quality when synthesizing monophonic singing voice or other musical instruments. Recorded instrument sounds are analyzed using the method described in Section 2, followed by modification of the phase trajectories using the partial-period phase representation. Finally the sounds are resynthesized from the modified analysis parameters and compared in a listening experiment.

4.1 Sound Reproduction Methods

Five conditions were used in a repeated-measures full factorial experiment. In each condition the original sound was compared to one of the following sounds:
1) Original sound (ORG)
2) Synthesized sound, with full phase information, maintaining absolute and relative phase (ARP)
3) Synthesized sound, with phase information, maintaining relative phase (RP)
4) Synthesized sound, constant partial-period phase, approximating absolute phase values in the stationary part (AP)
5) Synthesized sound, no phase information (NP).
For the synthesized sounds, method Sb described in Section 2.2 was used, except for the sound with no phase information, where method Sa was used.
In ARP all phase information is preserved, and thus no modification to the phase information is made between analysis and synthesis.
To change the partial phase trajectories in RP and AP, we use the partial-period phase representation. In synthesizing RP the relative phase shift between each partial is preserved, but the absolute phase value of each partial is discarded. This is accomplished by randomizing the start phase of each phase trajectory in the partial-period phase representation.
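As an illustration of these conditions, the sketch below (our own simplification, not the authors' methods Sa/Sb: partials are held stationary rather than following measured trajectories, and all parameter values are invented) synthesizes a three-partial tone under the ARP, RP, and NP phase treatments.

```python
import numpy as np

def synth(amps, freqs, start_phases, sr=44100, dur=0.1):
    # Additive synthesis of stationary partials: a simplified stand-in for
    # the paper's synthesis methods (no time-varying trajectories here).
    t = np.arange(int(sr * dur)) / sr
    return sum(a * np.cos(2 * np.pi * f * t + p)
               for a, f, p in zip(amps, freqs, start_phases))

rng = np.random.default_rng(0)
amps  = [1.0, 0.5, 0.25]            # hypothetical partial amplitudes
freqs = [220.0, 440.0, 660.0]       # hypothetical harmonic frequencies
meas  = np.array([0.3, 1.1, -0.8])  # "measured" start phases (invented)

arp = synth(amps, freqs, meas)                              # ARP: phases kept
rp  = synth(amps, freqs, meas + rng.uniform(0, 2 * np.pi))  # RP: common random offset
nop = synth(amps, freqs, np.zeros(3))                       # NP: phase discarded
```

Here the RP condition is approximated by one common random offset applied to all start phases, which preserves the inter-partial phase shifts while discarding absolute phase; the paper performs the randomization in the partial-period phase representation, so this is only an illustrative reading.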
Fig. 3. Schematic drawing of the partial-period phase value for one partial, showing the block boundaries, the measured phase value for each block, and the PPP value found by knowing the frequency of the partial and the step size Ra. The PPP values at blocks k and k − 1 are at the same position in the partial period, even though the phase values at k and k − 1 are measured at different positions in the partial period.
Fig. 4. First five partials of a piano sound. (a) Fundamental-period phase. (b) Partial-period phase. The noise at the start is due to the transient sound of the piano attack, for which no stable frequency information can be estimated. Partial-period phase is clearly superior to fundamental-period phase in that the phase trajectories are nearly constant over time for the sustained part of the sound.
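The mapping illustrated in Fig. 3 can be sketched as follows: a minimal illustration (ours, with hypothetical parameter values) that removes the phase advance expected from the partial's frequency from each per-block measured phase, so that a stationary partial yields a constant trajectory, as in Fig. 4(b).

```python
import numpy as np

def partial_period_phase(phases, freq, hop, sr):
    # Map per-block measured phases to a common position in the partial's
    # period by removing the phase advance expected from the frequency.
    k = np.arange(len(phases))
    expected = 2 * np.pi * freq * (k * hop / sr)       # advance since block 0
    return np.angle(np.exp(1j * (phases - expected)))  # wrap to (-pi, pi]

sr, hop, f = 44100, 512, 441.0                          # assumed analysis setup
k = np.arange(8)
# Phases a perfectly stationary partial with start phase 0.7 would measure:
measured = np.angle(np.exp(1j * (0.7 + 2 * np.pi * f * (k * hop / sr))))
ppp = partial_period_phase(measured, f, hop, sr)
# ppp stays at the start phase 0.7 for every block
```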
ANDERSEN AND JENSEN PAPERS
The experiment took about one hour, and thus time was the limiting factor for the number of repetitions, reproduction conditions, and instrument types used in the experiment.

4.4 Results and Discussion

After the experiment the subjects were asked to comment on the experiment. Many stated that the perceived difference between the reference and the processed sound was due to changes in the sustained part of the sound. The sounds used in the experiment all had soft attacks and no transients, except for the piano, which has a fast attack when the hammer hits the string. For the piano one subject commented on a perceived difference in the attack.

The degradation varied significantly across conditions (F4,84 = 60.7, p = 0.001). Fig. 5 shows the mean degradation and the standard error of the mean for each reproduction type. Pairwise comparison showed that the individual levels of reproduction were significantly different from each other (p ≤ 0.001). The order of the mean degradations ranged, as expected, from imperceptible toward larger degradation of the sound quality as phase information was removed. One exception is that the mean degradation of absolute phase (AP) was lower than that of relative phase (RP), even though in synthesizing RP far more phase information is used than in AP. Even though AP is rated lower than ARP, the results show that AP, the model using the partial-period phase representation, can indeed retain some of the perceptually important phase information. Another explanation for the finding may be the fact that the attack in AP is identical to the attack in ARP and thus close to that of the original sound.

The variation in degradation across the fundamental frequency group was also significant (F4,84 = 82.5, p < 0.001). Fig. 6 shows the mean degradations for the fundamental frequencies f1 to f5 as defined in Table 1 for the different levels of reproduction. The sound quality improved as a function of frequency, except for f5, where it was significantly lower than for f3 and f4, at p ≤ 0.024. The interaction of reproduction and fundamental frequency was significant (F16,336 = 30.1, p < 0.001), with an unexplained large mean degradation in the AP and NP reproductions for f5 compared to f3 and f4. For ARP and RP we see a clear relationship between fundamental frequency and perceived sound quality, where a low fundamental frequency results in larger degradation. It seems that ARP and RP retain the phase relations that are important when modeling the noise between the harmonic partials in the high-pitched sounds. The noise is modeled using additional nonharmonic sinusoids, which make "the noise take on a tonal quality that is unnatural and annoying" if the phase information is not used [2]. This only applies to the high-pitched tones, however.

Mean degradation as a function of instrument type and reproduction is shown in Fig. 7. The degradation varies significantly across the type of instrument (F4,84 = 54.4, p < 0.001). In general the degradation was lower for cello and piano than for the rest of the instruments. The interaction of instrument type and reproduction was significant (F16,336 = 28.4, p < 0.001). For bass trombone and bass clarinet, AP gave a lower degradation than the other reproduction methods, with the exception of ORG and ARP.

Piano gave the largest degradation for ARP reproduction, which is most likely due to errors in the reproduction of the attack. The transient caused by the hammer–string interaction in the piano attack is the only fast transient that occurs in the instrument selection included in this experiment. Because of the window used in the block-based analysis, smearing does occur, which is harmful to the modeling of fast transients. This may explain the higher degradation in ARP for piano. A within-subject analysis of the degradation for the piano sounds reveals a significant difference between reproduction types (F4,84 = 4.5, p = …
Fig. 6. Mean degradation for different reproduction types as a function of fundamental frequency. A high fundamental frequency group number corresponds to a high fundamental frequency.
Fig. 7. Mean degradation for different reproduction types as a function of instrument type.
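The significance levels quoted above can be sanity-checked against the F distribution. The sketch below is ours (a self-contained stand-in for a statistics library's F survival function) and numerically integrates the F(4, 84) density.

```python
import math
import numpy as np

def f_sf(x, d1, d2, hi=1e8, n=200001):
    # Upper-tail probability P(F > x) for an F(d1, d2) distribution,
    # obtained by integrating the density on a log-spaced grid.
    lnB = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
           - math.lgamma((d1 + d2) / 2))
    t = np.logspace(math.log10(x), math.log10(hi), n)
    lnpdf = ((d1 / 2) * math.log(d1 / d2) + (d1 / 2 - 1) * np.log(t)
             - ((d1 + d2) / 2) * np.log1p(d1 * t / d2) - lnB)
    y = np.exp(lnpdf)
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(t)))  # trapezoid rule

p = f_sf(60.7, 4, 84)  # the F statistic reported for the reproduction conditions
# p lands many orders of magnitude below the 0.001 level quoted in the text
```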
An experiment was conducted where synthesized sounds were compared to original recorded sounds. The results of the experiment show that the inclusion of phase alignment enhances the sound quality of the analysis/synthesis system. A significant change in mean degradation was found between synthesis without and with phase, going from "perceptible, but not annoying" to "imperceptible." This result is in agreement with the literature on auditory perception of complex tones. A significant effect of fundamental frequency was found, resulting in degradation approaching "slightly annoying" for sounds with fundamental frequencies below approximately 100 Hz synthesized without phase (NP).

By use of the partial-period phase representation, a phase model (AP) is proposed where the sustained part of the sound is modeled by a constant partial-period phase trajectory. The experiment shows that this model is significantly better than discarding the phase alignment information (NP), or maintaining the relative phase shift (RP) but discarding the absolute alignment of the partials.

For the piano, synthesis with full phase information (ARP) was worse than for the other instruments, which is most likely due to the smearing of the fast transient in the attack. No significant difference was found between the different synthesis methods for the piano sound.

6 ACKNOWLEDGMENT

The authors would like to thank the reviewers, Brian C. J. Moore and an anonymous reviewer, for helpful comments and suggestions. We would also like to thank the subjects who participated in the experiment.

7 REFERENCES

[1] K. Jensen, "Timbre Models of Musical Sounds," Ph.D. dissertation, Tech. Rep. 99/7, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark (1999).
[2] R. J. McAulay and T. F. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, pp. 744–754 (1986 Aug.).
[3] H. Helmholtz, On the Sensations of Tone, 2nd English ed., based on the 4th German ed. of 1877 (Dover, New York, 1954).
[4] R. Plomp and H. J. M. Steeneken, "Effect of Phase on the Timbre of Complex Tones," J. Acoust. Soc. Am., vol. 46, pp. 409–421 (1969).
[5] R. C. Mathes and R. L. Miller, "Phase Effects in Monaural Perception," J. Acoust. Soc. Am., vol. 19, pp. 780–797 (1947).
[6] J. L. Goldstein, "Auditory Spectral Filtering and Monaural Phase Perception," J. Acoust. Soc. Am., vol. 41, pp. 458–479 (1967).
[7] R. D. Patterson, "A Pulse Ribbon Model of Monaural Phase Perception," J. Acoust. Soc. Am., vol. 82, pp. 1560–1586 (1987).
[8] R. Meddis and M. J. Hewitt, "Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. I: Pitch Identification," J. Acoust. Soc. Am., vol. 89, pp. 2866–2882 (1991).
[9] R. Meddis and M. J. Hewitt, "Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. II: Phase Sensitivity," J. Acoust. Soc. Am., vol. 89, pp. 2883–2894 (1991).
[10] R. D. Patterson, M. H. Allerhand, and C. Giguere, "Time-Domain Modelling of Peripheral Auditory Processing: A Modular Architecture and a Software Platform," J. Acoust. Soc. Am., vol. 98, pp. 1890–1894 (1995).
[11] R. P. Carlyon and S. Shamma, "An Account of Monaural Phase Sensitivity," J. Acoust. Soc. Am., vol. 114, pp. 333–348 (2003).
[12] J. I. Alcántara, I. Holube, and B. C. J. Moore, "Effects of Phase and Level on Vowel Identification: Data and Predictions Based on a Nonlinear Basilar-Membrane Model," J. Acoust. Soc. Am., vol. 100, pp. 2382–2392 (1996).
[13] D. Pressnitzer and S. McAdams, "Two Phase Effects on Roughness Perception," J. Acoust. Soc. Am., vol. 105, pp. 2773–2782 (1999).
[14] B. Roberts, B. R. Glasberg, and B. C. J. Moore, "Primitive Stream Segregation of Tone Sequences without Differences in F0 or Passband," J. Acoust. Soc. Am., vol. 112, pp. 2074–2085 (2002).
[15] H. Gockel, B. C. J. Moore, R. D. Patterson, and R. Meddis, "Louder Sounds Can Produce Less Forward Masking: Effects of Component Phase in Complex Tones," J. Acoust. Soc. Am., vol. 114, pp. 978–990 (2003).
[16] R. A. Greiner and D. E. Melton, "Observations on the Audibility of Acoustic Polarity," J. Audio Eng. Soc., vol. 42, pp. 245–253 (1994 Apr.).
[17] B. C. J. Moore and B. R. Glasberg, "Difference Limens for Phase in Normal and Hearing-Impaired Subjects," J. Acoust. Soc. Am., vol. 86, pp. 1351–1365 (1989).
[18] J. H. Patterson and D. M. Green, "Discrimination of Transient Signals Having Identical Energy Spectra," J. Acoust. Soc. Am., vol. 48, pp. 894–905 (1970).
[19] G. H. Wakefield, L. M. Heller, L. H. Carney, and M. Mellody, "On the Perception of Transients: Applying Psychophysical Constraints to Improve Audio Analysis and Synthesis," in Proc. Int. Computer Music Conf. (2000), pp. 225–228.
[20] S. Uppenkamp, S. Fobel, and R. D. Patterson, "The Effects of Temporal Asymmetry on the Detection and the Perception of Short Chirps," Hear. Res., vol. 158, pp. 71–83 (2001).
[21] M. R. Schroeder, "New Results Concerning Monaural Phase Sensitivity," J. Acoust. Soc. Am., vol. 31, p. 1579 (1959).
[22] M. R. Schroeder, "Synthesis of Low-Peak-Factor Signals and Binary Sequences with Low Autocorrelation," IEEE Trans. Inform. Theory, vol. 16, pp. 85–89 (1970).
[23] M. Leman, "Visualization and Calculation of the Roughness of Acoustical Musical Signals Using the Synchronization Index Model (sim)," in Proc. Conf. on Digital Audio Effects (DAFX-00) (2000), pp. 125–130.
[24] E. Tind and K. Jensen, "Phase Models to Control Roughness in Additive Synthesis," in Proc. Int. Computer Music Conf. (Miami, FL, 2004 Nov.), to be published.
[25] J. C. Risset and D. L. Wessel, "Exploration of Timbre by Analysis and Synthesis," in Psychology of Music, D. Deutsch, Ed. (Academic Press, New York, 1982).
[26] M. R. Portnoff, "Implementation of the Digital Phase Vocoder Using the Fast Fourier Transform," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, pp. 243–248 (1976).
[27] J. B. Allen, "Short Term Spectral Analysis, Synthesis and Modification by Discrete Fourier Transform," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, pp. 235–238 (1977).
[28] X. Serra and J. Smith, "Spectral Modeling Synthesis: A Sound Analysis/Synthesis System Based on a Deterministic Plus Stochastic Decomposition," Computer Music J., vol. 14, pp. 12–24 (1990 Winter).
[29] K. Fitz and L. Haken, "Sinusoidal Modeling and Manipulation Using Lemur," Computer Music J., vol. 20, no. 4, pp. 44–59 (1996).
[30] X. Rodet, "The Additive Analysis–Synthesis Package," Tech. Rep., IRCAM, Paris, France (2004 July). www.ircam.fr/equipes/analyse-synthese/DOCUMENTATIONS/additive/index-e.html.
[31] K. Jensen, "The Timbre Model," in Proc. Workshop on Current Research Directions in Computer Music (Barcelona, Spain, 2001), pp. 174–186.
[32] M. V. Mathews, J. E. Miller, and E. E. David, "Pitch Synchronous Analysis of Voiced Speech," J. Acoust. Soc. Am., vol. 33, pp. 179–186 (1961 Feb.).
[33] M. D. Freedman, "Analysis of Musical Instrument Tones," J. Acoust. Soc. Am., vol. 41, pp. 793–806 (1967).
[34] K. Fitz and L. Haken, "On the Use of Time–Frequency Reassignment in Additive Sound Modeling," J. Audio Eng. Soc., vol. 50, pp. 879–893 (2002 Nov.).
[35] J. S. Marques and L. B. Almeida, "New Basis Functions for Sinusoidal Decompositions," in Proc. 8th Eur. Conf. on Electrotechnics (EUROCON'88) (1988 June), pp. 48–51.
[36] P. Guillemain, "Analyse et modélisation de signaux sonores par des représentations temps–fréquence linéaires," Ph.D. thesis, Université d'Aix–Marseille II, France (1994).
[37] F. Auger and P. Flandrin, "Improving the Readability of Time-Frequency and Time-Scale Representations by the Reassignment Method," IEEE Trans. Signal Process., vol. 43, pp. 1068–1089 (1995).
[38] S. Borum and K. Jensen, "Additive Analysis/Synthesis Using Analytically Derived Windows," in Proc. Digital Audio Effects Workshop (Trondheim, Norway, 1999), pp. 125–128.
[39] Y. Ding and X. Qian, "Processing of Musical Tones Using a Combined Quadratic Polynomial-Phase Sinusoid and Residual (QUASAR) Signal Model," J. Audio Eng. Soc., vol. 45, pp. 571–584 (1997 July/Aug.).
[40] A. Röbel, "Adaptive Additive Synthesis of Sound," in Proc. Int. Computer Music Conf. (Berlin, Germany, 1999), pp. 256–259.
[41] T. H. Andersen, "Phase Models in Real-Time Analysis/Synthesis of Voiced Sounds," Master's thesis, Dept. of Computer Science, University of Copenhagen, Copenhagen, Denmark (2002 Jan.).
[42] L. R. Rabiner, "On the Use of Autocorrelation Analysis for Pitch Detection," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-25, pp. 24–33 (1977).
[43] A. Papoulis, Signal Analysis (McGraw-Hill, New York, 1977).
[44] R. Di Federico, "Waveform Preserving Time Stretching and Pitch Shifting for Sinusoidal Models of Sound," in Proc. COST-G6 Digital Audio Effects Workshop (1998), pp. 44–48.
[45] J. Laroche and M. Dolson, "New Phase-Vocoder Techniques for Real-Time Pitch Shifting, Chorusing, Harmonizing, and Other Exotic Audio Modifications," J. Audio Eng. Soc., vol. 47, pp. 928–936 (1999 Nov.).
[46] H. Fletcher, E. D. Blackham, and R. Stratton, "Quality of Piano Tones," J. Acoust. Soc. Am., vol. 34, pp. 749–761 (1962).
[47] ITU-R BS.1116, "Methods for the Subjective Assessment of Small Impairments in Audio Systems, Including Multichannel Sound Systems," International Telecommunication Union, Geneva, Switzerland (1994 Mar.).
[48] J. Bensa, K. Jensen, and R. Kronland-Martinet, "A Hybrid Resynthesis Model for Hammer–String Interaction of Piano Tones," EURASIP J. Appl. Signal Process., vol. 7, pp. 1021–1035 (2004).
THE AUTHORS
T. H. Andersen K. Jensen
Tue Haste Andersen received a master's degree in computer science from the Department of Computer Science, University of Copenhagen, Copenhagen, Denmark, in 2002. At present he is pursuing graduate studies in the same department, working with human–computer interaction aspects of sound and music.

●

Kristoffer Jensen received a master's degree in computer science from the Technical University of Lund, Sweden, and a D.E.A. in signal processing from ENSEEIHT, Toulouse, France. In 1999 he received a Ph.D. degree from the Department of Datalogy, University of Copenhagen, Denmark, doing work in analysis/synthesis, signal processing, classification, and modeling of musical sounds.
Dr. Jensen is an assistant professor in the Department of Datalogy. He has a broad background in signal processing and has been involved in synthesizers for children, state-of-the-art next-generation effect processors, and general topics in music informatics. His current research topic is signal processing with musical applications, which includes knowledge of perception, psychoacoustics, physical models, and expression of music.
ENGINEERING REPORTS
h  System tuning ratio, = fB/fS
QL  Enclosure Q at fB resulting from leakage losses
QT  Total driver Q at fS resulting from all system resistances
s  Complex frequency variable
T(s)  Third-order transfer function
U  Uncoupling factor
VAS  Volume of air having same acoustic compliance as driver suspension
VB  Net internal volume of enclosure
ω0  Angular frequency variable of fourth-order transfer function
ω1  Angular frequency variable of first- and third-order transfer functions
ωT3  Cutoff angular frequency of third-order transfer function
α  System compliance ratio, = VAS/VB

1 RESPONSE SYNTHESIS

T(s) = s^3 / (s^3 + b1 ω1 s^2 + b2 ω1^2 s + b3 ω1^3)   (4)

c = (b3 d)^(1/4)   (6)

ω0 = c ω1.   (7)

Equating Eqs. (1) and (8) yields the following coefficient relationships:

a1 = (b1 + d)/c
a2 = (b2 + b1 d)/c^2   (9)
a3 = (b3 + b2 d)/c^3.

2 DETERMINATION OF UNCOUPLING FACTOR

We can determine an uncoupling factor between the two transfer functions by evaluating the log magnitude-squared form of F(s) at the cutoff angular frequency of T(s); that is, we define the uncoupling factor U as follows:

U = 10 log |F(ωT3)|^2 = 10 log [1/(1 + d^2)]

For the approximate third-order Bessel response,

T(s) = BL3(s) = s^3 / (s^3 + 2.466 ω1 s^2 + 2.433 ω1^2 s + ω1^3).

Hence

d     U (dB)
0.05  −0.004
0.1   −0.022
0.2   −0.088
0.3   −0.195
0.4   −0.339
0.5   −0.521

Finally let us suppose an approximate third-order Chebyshev response with a 0.5-dB peak dip (AC3(0.5)),

T(s) = C3(0.5)(s) = s^3 / (s^3 + 2.145 ω1 s^2 + 1.751 ω1^2 s + 1.397 ω1^3).

Then

ωT3 = 0.855 ω1

U = 10 log |F(ωT3)|^2 = 10 log [1/(1 + (d/0.855)^2)]

and

d     U (dB)
0.05  −0.013
0.1   −0.061
0.2   −0.232
0.3   −0.506

It then follows that by using an uncoupling coefficient d that is not too large, the response of the fourth-order vented-box system is made to closely resemble that of a prototype third-order high-pass function.

3 COMPUTATION OF PARAMETERS

In this section we show the calculated Thiele–Small parameters for three typical classes of responses. In all cases the vented-box system is assumed to have a leakage loss of QL = 7 and a desired uncoupling factor |U| ≤ 0.4 dB.
To do this we follow the steps described in Small [3].
1) Calculate

c1 = a1 QL,  c2 = a3 QL.   (11)

2) Find the largest positive real root r of

r^4 − c1 r^3 + c2 r − 1 = 0.   (12)

3) Then the alignment parameters are

h = r^2
α = a2 h − h^2 − 1 − (1/QL^2)(a3 h^(1/2) QL − 1)   (13)
QT = h QL / (a3 h^(1/2) QL − 1).

Furthermore

ω0 = h^(1/2) ωS   (14)

and hence

ω1 = (h^(1/2)/c) ωS.   (15)

Some examples are shown next.

3.1 Approximate Third-Order Butterworth Responses (AB3)

The coefficient values are given by

b1 = b2 = 2,  b3 = 1.

The system parameters for QL = 7 are given in Table 1.

Table 1. System parameters for a set of AB3 responses.

d     a1     a2     a3      h      α       QT     f3/fS
0.05  4.334  9.386  10.395  2.509  13.922  0.154  3.352
0.1   3.737  6.965  6.760   1.880  7.256   0.206  2.448
0.2   3.288  5.362  4.676   1.464  3.919   0.265  1.832
0.3   3.108  4.748  3.948   1.297  2.854   0.298  1.583

3.2 Approximate Third-Order Bessel Responses (ABL3)

The coefficient values are given by

b1 = 2.466,  b2 = 2.433,  b3 = 1.

The system parameters for QL = 7 are given in Table 2.

Table 2. System parameters for a set of ABL3 responses.

d     a1     a2      a3      h      α       QT     f3/fS
0.05  5.319  11.426  10.599  2.053  16.094  0.136  4.259
0.1   4.566  8.484   7.004   1.570  8.622   0.182  3.142
0.2   3.985  6.538   4.965   1.266  4.896   0.233  2.393
0.3   3.738  5.794   4.269   1.153  3.716   0.260  2.098
0.4   3.605  5.410   3.927   1.096  3.161   0.276  1.943

3.3 Approximate Third-Order Chebyshev Responses (AC3)

Note that both Butterworth and Bessel specify unique alignments, whereas Chebyshev is a family that must have a parameter specified, such as the ripple magnitude. The coefficient values for a 0.5-dB peak dip are given by

b1 = 2.145,  b2 = 1.751,  b3 = 1.397.

The system parameters for QL = 7 are given in Table 3.

Table 3. System parameters for a set of AC3(0.5) responses.

d     a1     a2     a3      h      α      QT     f3/fS
0.05  4.270  7.034  10.932  2.686  8.139  0.151  2.734
0.1   3.674  5.265  6.892   1.954  4.114  0.206  1.965
0.2   3.226  4.125  4.547   1.450  2.117  0.272  1.433

From Tables 1–3 it can be seen that smaller box volumes require lower quality factors and higher tuning ratios, thus
ENGINEERING REPORTS THIRD-ORDER RESPONSES AND VENTED-BOX LOUDSPEAKERS
yielding higher cutoff frequencies. As to the response shape, it is clear that as d decreases, the approximate rolloff of 18 dB per octave will be extended down to lower frequencies below the cutoff frequency.

4 COMPARISON OF APPROXIMATE THIRD-ORDER AND QB3–SC4 ALIGNMENTS

The QB3 and SC4 responses of Thiele [1] can be calculated by a coefficient parameter B and a pole-shifting factor k, respectively, as described in [3]. It would therefore be interesting to compare both of these alignment types to the new approximate third-order alignments. Tables 4–6 list the system parameter values of different alignments with QT being held constant in each case. To be able to compare these readily, we plotted their respective normalized response curves in Figs. 1–3.

Comments on Figs. 1–3 can be summarized as follows.
1) For low QT values, AB3 and QB3 alignments exhibit almost identical response shapes and provide similar system parameter values. Note that in both cases one gets the largest values of α (a larger value of α means a smaller box size).
2) The lowest cutoff frequencies and steepest cutoff slopes always occur for AC3 alignments.
3) The ABL3 and SC4 alignments feature the most rounded response (in other words, the best transient response), but the price is that they provide the highest cutoff frequencies.
4) It has to be emphasized that for quality factors QT having values less than about 0.23, SC4 alignments are no longer possible.

LLAMAZARES ENGINEERING REPORTS

5 CONCLUSIONS

… characteristics midway between second-order sealed-box and fourth-order vented-box systems.

6 ACKNOWLEDGMENT

The author would like to express his gratitude to the two reviewers for their valuable comments and suggestions.
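The procedure of Eqs. (6), (9), and (11)-(13), together with the uncoupling factor of Section 2, can be checked numerically. The sketch below is ours, not the author's code; note in particular that Eq. (12) has several positive real roots for these coefficients, and we select the moderate-sized one, which is the choice that reproduces the published tables.

```python
import numpy as np

def alignment(b1, b2, b3, d, QL=7.0):
    # From prototype coefficients b1..b3 and uncoupling coefficient d,
    # compute a1..a3 (Eqs. 6 and 9) and the system parameters h, alpha,
    # and QT following steps 1)-3) of Section 3 (Eqs. 11-13).
    c = (b3 * d) ** 0.25                                          # Eq. (6)
    a1, a2, a3 = (b1 + d) / c, (b2 + b1 * d) / c**2, (b3 + b2 * d) / c**3
    c1, c2 = a1 * QL, a3 * QL                                     # Eq. (11)
    roots = np.roots([1.0, -c1, 0.0, c2, -1.0])                   # Eq. (12)
    real = roots[np.abs(roots.imag) < 1e-6].real
    r = min(x for x in real if x > 1.0)    # root selection: see lead-in note
    h = r * r                                                     # Eq. (13)
    alpha = a2 * h - h**2 - 1 - (a3 * np.sqrt(h) * QL - 1) / QL**2
    QT = h * QL / (a3 * np.sqrt(h) * QL - 1)
    return a1, a2, a3, h, alpha, QT

a1, a2, a3, h, alpha, QT = alignment(2, 2, 1, d=0.05)  # AB3, first row of Table 1
# Expect roughly a1 = 4.334, h = 2.509, alpha = 13.922, QT = 0.154

U = 10 * np.log10(1 / (1 + (0.3 / 0.855) ** 2))  # Chebyshev case of Section 2, d = 0.3
# Close to the -0.506 dB entry in the AC3(0.5) d-U table
```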
THE AUTHOR
Bernat Llamazares was born in Barcelona, Spain, in 1965. He received a degree in telecommunication engineering in 1998 and a master's degree in private and public telecommunication services and networks in 2003, both from the Polytechnic University of Catalonia (UPC). He has been working with different consultancies on both communication and system projects and is presently managing his own company. Besides a passion for music, his main interests include loudspeaker systems, room acoustics, and audio signal processing.
Mr. Llamazares is an associate member of the AES.
LETTERS
CORRECTIONS
CORRECTION TO “ANALYSIS OF LOUDSPEAKER LINE ARRAYS”
In the above letter to the editor,1 Figs. 39(b) and (c) should have appeared as follows. The author wishes to thank Greg Oshiro for bringing this to his attention.
Fig. 39. Comparison of directivity functions of a stack of three curved sources and a straight-line source. Curved sources have element length L = 150 mm, total included angle = 20°. Straight-line source has total length 3L.
AES STANDARDS
COMMITTEE NEWS
Detailed information regarding AES Standards Committee
(AESSC) proceedings including structure, procedures, reports,
meetings, and membership is published on the AES Standards Web
site at http://www.aes.org/standards/. Membership of AESSC work-
ing groups is open to any individual materially and directly affect-
ed by the work of the group. For current project schedules, see the
project-status document also on the Web site.
METADATA REVISITED
Six New Things to Know About Audio Metadata
Last year we published an article designed to demystify and explain a number of key concepts relating to audio metadata (see JAES July/August 2003). Since that time the field has moved on, and the AES 25th International Conference held in London during June 2004 provided an opportunity to find out more about recent developments, as well as topics not covered in the original article. The following is a short summary of some of these, concentrating primarily on audio metadata standards and applications rather than on feature extraction (which was another key concept discussed at the AES 25th Conference).

WHAT'S A REGISTRY?

Philippa Morrell, industry standards manager of the BOSS Federation, described the purpose and nature of metadata registries. These are secure, central repositories of data that are increasingly used for business and commercial operations, particularly on the Internet, where e-commerce leads to the need for centralized licensing and searching.

Registries provide cross-references between digital items (such as a song stored on a server somewhere on the Internet) and the information describing them. This is done either directly or via a proxy (an indirect address or server that handles the means of accessing the real information). Morrell's main point was that registries provide a means by which everyone can be "singing off the same hymn sheet." In other words, everyone is using the same descriptive and licensing information from a common and reliable source, rather than there being numerous different versions in numerous locations. A registry may not itself contain metadata relating to digital items, but may point to where that metadata is stored, as is planned for a system being developed by the recording industry.

Morrell mentioned the ISBN (International Standard Book Number) system as a good example of a well-known registry system that has enabled the development of a number of useful tools for libraries and publishers: a one-stop source for bibliographic information on English-language books in print; a range of online data interchange and order-routing solutions for books; and a database of publisher information. Global Data Synchronization (see www.e-centre.org.uk), which uses the EAN.UCC method for numbering and barcoding, was developed as a system for relating product, company, and location metadata to facilitate collaborative business processes. There is also a global registry, Global Product Classification (GPC), that keeps track of the original data relating to products and companies by providing a common link between the classification systems of different companies.

WHAT'S CORE METADATA?

Richard Wright of BBC Information and Archives argued persuasively at the conference that the typical definition of metadata as being "data about data" is unfortunately too simple. In fact it is more correctly labeled "beyond data." It is the organization, naming, and relationships of the descriptive elements; the structure of data rather than any actual data. Wright pointed out that there are really three layers in any system: the object (such as a digital audio file), descriptive data (data describing the file), and metadata (the convention or structure for that descriptive data). Descriptive data is often mistakenly termed metadata.

Core metadata, being a basic, universal structure for data, can be either a lowest common denominator or a complete description that fits the most general case. While the former seems limited, the latter is virtually impossible to achieve in Wright's view. Although a lowest common denominator approach is limited, it can do a finite and rather small job in a finite time. Such a core standard should be essential, general, simple, and popular. The recognized standard is Dublin Core (see the previous article on metadata in the July/August 2003 JAES for more details). Given the fact that digital data is impermanent and that filing systems are superseded thanks to the evolution of technology, the primary need is for core standards that are quickly implemented and simple.

WHAT'S THE SAM/EBU DUBLIN CORE STANDARD?

Following on from this last point, Lars Jonsson and Gunnar Dahl explained that the Scandinavian Academy of Management worked with 25 archive specialists and engineers to specify a core metadata standard for use within the audio industry. This was proposed to the European Broadcasting Union (EBU) and later approved as EBU Tech. Doc. 3293 by the EBU panel P-FRA (Future Radio Archives). It is based on Dublin Core (DC) and includes some additional internal fields for the transfer of near-online production audio files within organizations. Use of the XML syntax for transferring the extended Dublin Core metadata information enables the addition of other information that the organization might find important. It is possible to incorporate this new form of metadata within Broadcast WAVE files using a header defined in Supplement 5 to the BWF standard, using the XML structure. One important
feature is that none of the 15 fields in basic Dublin Core metadata is compulsory; only titles and names could be used if desired.

The implementation developed by the SAM/EBU group is known as AXML, which adds four TYPE values, PGR (program group), program, item (constituent editorial part of a program), and MOB (media object), to the 15 basic elements of Dublin Core. These come from the EBU P-META standard (see the previous article on metadata in the July/August 2003 JAES).

USEFUL WEBLINKS
EBU Tech. Doc. 3293-2001: http://www.ebu.ch/tech_32/tech_t3293.html
SMPTE standards: http://www.smpte.org
Annodex: http://www.annodex.net
XPath: http://www.w3.org/TR/xpath
Global Data Synchronization: http://www.e-centre.org.uk

HOW TO SOLVE PROBLEMS SEARCHING MPEG-7 DATABASES

Max Jacob from IRCAM in Paris showed that there is no common way to manage sound databases. Searching such databases is not straightforward, but the MPEG-7 framework may provide a way of enabling more efficient processes in this regard. Included in MPEG-7 are a number of parts to the XML schema, including audio descriptors and multimedia description schemes (MDS). The latter defines complex object-oriented data structures and is used for managing relations between descriptors, content segmentation, and semantic descriptions, among other things. Some issues need to be addressed, namely validation, management of very large MPEG-7 documents, and efficient searches.

With regard to efficient searches, the Worldwide Web Consortium (W3C) has developed a language called XPath (XML Path Language), which is a simple way of addressing nodes in an XML tree. Jacob believes that XPath 2.0 is theoretically capable of undertaking more or less all the searches one might want to perform on an MPEG-7 database, but states that the practical implementation of this is more difficult. … undertaken on the XML documents. These allow indices to be updated when changes are made to the elements of the XML tree (which are stored separately in the database). The indices are stored as case tables, which are fast and easy to search, so there is no need to browse the whole XML tree. Elements can be inserted and updated. (SAX is a Simple API for XML; in other words, the SAX parser is an application programming interface that enables the parsing of XML documents so as to separate them into events separated by relevant tags.)

MAPPING AES3/BWF AUDIO AND METADATA INTO MXF

Bruce Devlin, David Brooks, and David Schweinsberg from Snell and Wilcox discussed ways in which audio and metadata from BWF or AES3 (the standard digital audio interface) structures can be mapped into the MXF (Material Exchange Format), a new SMPTE standard for interchange in the broadcasting world. As explained in the previous article on metadata, the MXF format is principally a streaming format for media data but can also transmit edited projects.

An MXF file essentially has a number of separate tracks, each containing a different stream, which can be audio, video, metadata, data, or timecode. The tracks are grouped to form packages. A … elementary streams). Metadata extracted from the audio stream or file can be taken into account and synchronized with the storage of audio. Specific metadata contained in the original audio format, such as the format chunk in BWF, can be mapped into relevant MXF descriptors and metadata components. Channel status data from an AES3 stream can be incorporated into MXF; the data mode is mapped to the AES3 audio essence descriptor set, and the channel status data itself is stored as data essence (on a separate track) or in the file header.

WHAT'S CMML?

CMML (the Continuous Media Markup Language), as described by Claudia Schremmer, Steve Cassidy, and Silvia Pfeiffer, is a means of marking up time-continuous media such as audio and video for integration into the searching, linking, and browsing functionality of the worldwide web. This is the basis of the so-called Continuous Media Web (CMWeb). As Schremmer pointed out, it is relatively easy to search and find static data that has descriptive metadata of one sort or another, but streamed media that evolves over time is a form of "dark matter" on the Internet, because it cannot easily be searched or browsed. A new format called Annodex is used
Fast searches in practice require what document has been created to deal with to stream such material, which allows
are called indices focusing on a specific AES/BWF mapping, known as SMPTE annotation and indexing so that it can
task, but there have been no tools for 382M, currently in committee draft be integrated into the URL-based
indexing MPEG-7 data in a way that can form. This essentially allows multichan- hyperlinking approach found on the
be understood by XPath. This problem nel audio to be stored and individual or Internet. Essentially what happens is
was addressed in the CUIDADO project stereo audio tracks extracted from it. An that a markup file written in CMML is
within which a number of such limita- important factor to consider was the interleaved with the media stream to
tions were addressed. need to ensure that one or more audio create an Annodex representation that
They decided to adopt an open-source tracks could be synchronized to any of can be searched and browsed.
database system called PostgreSQL and the other media formats (primarily Editor’s note: Look for a third metadata
set up a method of associating event video) contained within an MXF file article in another year or so as new tools
handlers with database operations (such as uncompressed, DV, or MPEG are developed and systems are refined.
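As a concrete illustration of the XPath-style addressing Jacob discussed, the sketch below queries a toy MPEG-7-like document using the limited XPath subset built into Python's standard library. The element names are simplified stand-ins for this example, not the real MPEG-7 schema:

```python
import xml.etree.ElementTree as ET

# Toy MPEG-7-style description. The element names are illustrative
# stand-ins; the real MPEG-7 MDS schema is far more elaborate.
doc = """
<Mpeg7>
  <Description>
    <AudioSegment id="seg1">
      <Keyword>applause</Keyword>
    </AudioSegment>
    <AudioSegment id="seg2">
      <Keyword>speech</Keyword>
    </AudioSegment>
  </Description>
</Mpeg7>
"""
root = ET.fromstring(doc)

# XPath-style addressing of nodes: every AudioSegment, anywhere in
# the tree, that has a Keyword child element.
segments = root.findall(".//AudioSegment[Keyword]")
ids = [seg.get("id") for seg in segments]            # ["seg1", "seg2"]

# Narrow the query with a text predicate: only segments whose
# Keyword content is "speech".
speech_ids = [seg.get("id")
              for seg in root.findall(".//AudioSegment[Keyword='speech']")]
# speech_ids == ["seg2"]
```

Note that queries of this kind walk the tree on every call, which is exactly the scaling problem the CUIDADO indexing approach was designed to avoid.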
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1179
A WORKSHOP ON RADIO ARCHIVING TOOLS: STRATEGIES AND SOLUTIONS

Klaus Heidrich, chair, explained that this workshop, designed to deal specifically with the needs of radio broadcasters and originally entitled "Comparison of Existing Archiving Tools," had been given the subtitle of Solutions and Strategies. He was familiar with a number of recent digital archiving projects. The successful ones always had an overall strategy designed to provide a solution to the archiving problem. This ultimately gave rise to specific tools. The field as a whole, however, did not conform to a fixed scenario but was something of a moving target.

Heidrich began the workshop by introducing the panel: Ernst Dohlus, Bavarian Radio, head of production and playout; Wingolf Grieger, Nord Deutscher Rundfunk (NDR), system coordination for digital archives; Niko Waesche, IBM Business Consulting Services; Rainer Kellerhals, Tecmath AG, executive VP of product and solutions; and Karl Pieper, general manager of VCS Media Broadcasting Solutions.

INTRODUCING THE QUESTIONS
Heidrich evaluated reasons for introducing digital archives and asked some important questions that he hoped the panel would address. Commonly used terms in this field are audio archives, content management, and media-asset management, but the distinction between these terms is not always completely clear. The term essence is often applied to the content that is stored, whereas metadata is used to structure and describe the content. Information about the right to use the material completes the picture.

The task of migrating to a digital archive involves dealing with legacy storage media such as tapes and the handling of legacy database structures that require the addition of new metadata. Last but not least there is the question of "how to get along with rights management," which is usually the responsibility of administrative departments and which, on the basis of current experience, may not always be in the form of concise software and systems specifications that can be integrated with a digital archive.

There is a wide variety of different legacy solutions already in use for database management, including more recent relational databases. The question arises as to whether it is possible simply to add mass storage for digital content to existing database structures. In general, it turns out that this is not adequate. In fact, it is necessary to incorporate features that are specific to digital archives, such as browsing and coping with the range of different audio codecs involved. Interfaces to the digital archive solution are crucial to the success of a system. Business management systems for rights management also need to be integrated. A digital archive solution therefore requires a qualified integration concept to ensure that everything works together.

The wide use of digital playout systems has been the driver for incorporating IT tools into the archiving process. Heidrich asked whether we are seeking a solution that improves quality and saves money, or perhaps something to preserve existing analog assets, or maybe a business requirement to serve the next generation of digital platforms for on-demand program delivery. Or is it a market-driven change led by the advance of technology?

Digital archive projects tend to be fairly expensive, and there are at least three cost elements involved. There is the equipment, of course, but there is also the time to transfer the legacy archive. Then there is the cost of ownership over the lifetime of the system. The question remains, can we justify such an investment? Can business processes and workflow be improved, and can new business be generated as a result?

The topic of convergence has been around for years (that is, convergence between the different broadcasting and content delivery media made possible by information technology). Is the introduction of digital archives closely related to convergence? From an operational point of view, it is interesting to consider whether the operational process is driving the solution or whether the solution is driving the process. Related to this is the question of whether an archiving solution can really be an off-the-shelf product. In particular, does an archive system have to be broadcast-specific or can it be a generic solution?
Standards, common practice, and common middleware solutions are all issues to be considered, as are standard interfaces and metadata structures. Convergence is a thorny subject, liable to fill many workshops in its own right; nonetheless it was considered very important. "Everything should be made as simple as possible but not simpler!"

WHY SHOULD A PUBLIC RADIO STATION ADOPT A DIGITAL ARCHIVE?
Ernst Dohlus from Bavarian Radio asked why a public radio station in Germany should adopt a digital mass-storage system and transfer its program archive to this format. Such a project is inevitably complicated and expensive. All of the companies offering content-management systems inevitably have to develop software before they can deliver products. They often run short of time and postpone delivery dates. There is no model solution for radio archives, so each project involves new software development. Radio broadcasters are often offered solutions that were originally developed for TV operations, and these are rarely suitable without considerable modification. Digital archiving solutions can therefore almost never be used right out of the box. Complex interfacing problems arise, making the introduction of these systems complicated and expensive. So why are we so attracted to such solutions, and why would we want to participate in such a risky venture? It is because the wave of technical developments drives us forward. Vintage equipment, for example tape machines, is becoming increasingly rare and difficult to maintain and operate.

Although some old tapes are gradually deteriorating, in fact most of the content in existing archives does not suffer from this problem, so media degradation is not the key driving factor. Actually, it is changes in broadcast production pulling us to a new solution. Journalists, editors, and producers are used to browsing for material on computers, and it is becoming common practice to find all this material online in one way or another. It is no longer considered possible to do effective production work without the introduction of a mass-storage archive for radio content. For example, in future the music industry will deliver audio files instead of CDs, and this content has to be kept in an archive unless it is only to be used temporarily.

NDR has calculated that its mass-storage project—involving 90,000 hours of old tape material and 2,000 hours of new production per year—requires an investment of around 7M Euro as well as 10M Euro for the outsourcing of digitization. Each hour of archived material, therefore, costs about 180 Euro. Why should we make this kind of investment? Are we attempting to perform a cultural role, digitizing the cultural radio heritage of the nation? Probably not, considering the competitive business of radio today, but certainly we should preserve important examples from radio history. Are we digitizing in order to facilitate new business solutions, such as educational programs on demand? This may be the case for commercial broadcasters, but public radio in Germany is limited in the extent to which it can undertake such commercial operations. The copyright laws also make such enterprises complicated. For example, repeat fees to artists are common, especially for older material when contracts did not explicitly state that material could be reused many times. No, the primary reason for moving to a digital archive is the way in which production techniques and delivery methods are changing, driven by new technology, making conventional recording equipment and techniques largely obsolete.

A SOLUTION OR THE START OF NEW PROBLEMS?
Wingolf Grieger asked the provocative question: "Is a solution actually a solution or is it the start of new problems?" The project at NDR, active since 2002, is called Digital Long Term Archive (or DELA, to abbreviate the German title). Historically, and still to some extent today, people have been employed simply to find material in the vaults and bring it out. Finding it usually involves a database of some sort. As more and more material with more channels is generated, the conventional archive finds it increasingly difficult to cope and will eventually be paralyzed by complexity. Unless the archive is digitized there will no longer be a functional archive, and a radio house without an archive "is like a kitchen without a pantry."

The question of workflow was addressed by Grieger, in the form of a question as to whether the workflow of a traditional carrier-based archive is compatible with the workflow of a digital system. Traditional material has to be stored, documented, and ultimately brought out again for broadcasting. In the digital world there is no huge vault containing tapes and disks, but a mass-storage archive of some terabytes occupying about 6 square meters. In fact almost none of the documentation at NDR had to be changed with the switch to a digital archive. They are still using the same database as before. The new system actually made the process of documentation easier.

An important test of a system is in its approach to the handling of errors; this involves primarily the errors of operators as opposed to software errors. The archive number of an item, for example, acts as the unique key to the content and its location in the traditional archive. It acts in a similar way in relation to audio files in the new system. However, if an operator errs in the specification of this number, then a tape might be lost in the archive, and human ingenuity is needed to find it, either by accident or design. Such errors will always occur, and in the digital domain a tool is necessary to enable humans to intervene in similar ways so as to be able to correct errors and search for things that may have been erroneously labeled. Grieger therefore stressed the need not to lose the possibility for human control.

THE MANUFACTURERS SPEAK
Niko Waesche from IBM wanted to reinforce the point made by Dohlus that digital archives are of vital strategic importance to the future of broadcasters. Process change in broadcasting is the key issue. Archives have to be shifted from being a "luxury item" to a core technology. However, if operating costs after the implementation of a digital archive become greater than they were before, many of the advantages will be negated. Owing to the way in which technology has changed the process of archiving, it is no longer separate from the remainder of the ➥
NEW OFFICERS

PRESIDENT
THERESA LEONARD is the director of audio for music and sound at The Banff Centre. She is responsible for overseeing the audio work/study program and directing activities at the centre's extensive audio facilities. Her work spans many aspects of audio production, administration, and engineering, including both studio and live recording and postproduction in a variety of musical genres, as well as audio for video postproduction. As director of The Banff Centre's audio education program, she works closely with top industry personnel, who serve as faculty members and guest lecturers.

Leonard holds bachelor degrees in music and education, and a master's degree in music from McGill University, where she was enrolled in the sound recording program. Her thesis, "Time Delay Compensation of Distributed Multiple Microphones in Recording: An Experimental Evaluation," was later transcribed into an AES paper and presented in New York City at the 95th Convention in 1993. Trained as a classical pianist, she previously taught music in French and English schools in eastern Canada, worked as audio postproduction engineer for a Canadian TV series, and as an audio engineer and instructor at the University of Iowa School of Music. She is the regional representative for the Alberta Recording Industry Association, a past member of the AES Board of Governors, and founder and chair of the AES' Alberta Section. She served on the executive committee of the AES convention in Los Angeles in 2002 and as chair of the AES conference on multichannel audio at The Banff Centre in June 2003.

PRESIDENT-ELECT
NEIL GILCHRIST joined the BBC after graduating from Manchester University in 1965 with a B.Sc. honours degree in physics and electronic engineering. As a BBC engineer, he worked on broadcast audio, PCM for national radio distribution, and NICAM for television sound. He participated in the EUREKA 147 (Digital Audio Broadcasting) project, and toward the end of his BBC career led the European ACTS ATLANTIC project to a successful conclusion in its final year. From 1981 to 1996 he represented the UK in the former CCIR, including chairmanship of CCIR Interim Working Party 10/6 (international exchange of sound programs). He represented the BBC on Sub-group V3 (Sound) of the EBU, and served on both the AES and the EBU groups, which prepared the specification for the AES/EBU digital audio interface. His AES activities have included frequent contributions to papers and workshop sessions at AES conventions, and chairmanship of the British Section in 1990/91. In 1995 the AES awarded him a fellowship for his contribution to digital audio technology and AES standards activities. He also served as governor from 1999 to 2001. With Christer Grewin of the Swedish Broadcasting Corporation he assembled and edited the AES special publication Collected Papers on Digital Audio Bit-rate Reduction. Neil left the BBC in 2002 to work as a consultant in audio and broadcasting. An offshoot of his consultancy work is a recording service for musicians and societies in his area. He has just completed two CD masters for a mechanical musical instrument museum.

SECRETARY
HAN TENDELOO was born in Amsterdam, the Netherlands, in 1936. He received his master's degree in electrical engineering from the Technical University of Delft in the Netherlands, with a specialization in semiconductors. He has been employed by Philips-related companies such as PolyGram and PDO in the fields of recording, duplication, replication, and product development and marketing: LP, MC, VLP, CD, CD-i, CD-Video, DCC, packaging. He is coinventor of the CD jewel box. He was a long-time chair of NEC TC60 (IEC Audio and Video-Recording Standardization) and a member of the Society of Motion Picture and Television Engineers (SMPTE). ➥
After his retirement he freelanced for Philips and the International Federation of the Phonographic Industry (IFPI) in London. An AES member since the mid-60s, he has held the following AES offices: vice president, Northern Region, Europe; governor; vice chair of the Standards Committee, Europe Region; chair and member of the Publications Policy Committee; convention chair; convention vice chair; and convention program coordinator. He was awarded a fellowship in 1977 and has received three Board of Governors Awards. His focus in recent times is on improvement of information to the membership about upcoming AES conventions by introducing bar-graph convention calendars, comprehensive semi-interactive convention Web sites, and detailed on-site convention planners.

TREASURER-ELECT
LOUIS FIELDER received a B.S. degree in electrical engineering from the California Institute of Technology in 1974 and an M.S. degree in acoustics from the University of California in Los Angeles in 1976. Between 1976 and 1978 he worked on electronic component design for custom sound-reinforcement systems at Paul Veneklasen and Associates. From 1978 to 1984 he was involved in digital-audio and magnetic recording research at Ampex Corporation. At that time he became interested in applying psychoacoustics to the design and analysis of digital-audio conversion systems. Since 1984 he has worked at Dolby Laboratories on the application of psychoacoustics to the development of audio systems and on the development of a number of bit-rate reduction audio coders for music distribution, transmission, and storage applications. He has also investigated perceptually derived limits for the performance of digital-audio conversion and low-frequency loudspeaker systems. Currently, he is working in the area of acoustics for small to medium-sized rooms. An AES fellow, he served on its Board of Governors from 1990 to 1992, and was AES president from 1994 to 1995. As president and president-elect, he worked very closely on financial concerns with the treasurer. He has also been involved with AES digital audio measurement standards from 1984 to 1998. Since 1994 he has been on the Publications Policy Committee to promote the use of electronic media for AES publications.

GOVERNORS
RONALD AARTS was born in 1956 in Amsterdam. He received a B.Sc. degree in electrical engineering in 1977, and a Ph.D. from Delft University of Technology in 1994. In 1977 he joined the optics group of Philips Research Laboratories, Eindhoven. Until 1984, his primary contributions were in the fields of servos, signal processing for video long-play players, and Compact Disc players. In 1984 he joined the acoustics group of the Philips Research Laboratories and was engaged in the development of CAD tools and signal processing for loudspeaker systems.

In 1994 he became a member of the DSP group, where he studied the improvement of sound reproduction by exploiting DSP and psychoacoustical phenomena. He currently holds the position of research fellow. He has published more than 120 technical papers and reports, and holds over a dozen U.S. patents, while another 40 are pending. He has been a member of the organizing committee and chair of various conventions. Currently he is the chair of AES' Technical Committee on Signal Processing and a reviewer for the AES Journal. He is a senior member of the IEEE, NAG (Dutch Acoustical Society), and ASA (Acoustical Society of America). He has been a member of the Dutch AES committee in various positions, recently as chair. He was the papers chair for the 104th, 110th, and 114th conventions in Amsterdam ('98, '01, and '03) and AES governor from 1999 to 2000. Aarts was made a fellow of the AES in 1998 for major contributions to sound reproduction and assessment.

ULRIKE KRISTINA SCHWARZ started pursuing a career in the music industry with classical piano training at the Richard Strauß Conservatory, Munich, Germany. In addition to the Tonmeister program at the University of the Arts Berlin (UdK) and the Technical University Berlin, which she entered in 1994, she expanded her knowledge by taking part in the Summer Performance Program at the Berklee School of Music, Boston, MA, USA. A scholarship for a six-month workstudy with acclaimed jazz recording engineers brought her to New York City. There she established contacts with the major recording facilities of New York and was involved in productions for all major jazz labels, including artists like Joe Henderson and Horace Silver, and, on the classical side, Lorin Maazel and Itzhak Perlman. Several of these productions have received Grammy nominations or awards. In 2000 Schwarz graduated from the UdK Berlin with the Tonmeister-Diplom, the equivalent of a double M.A. and M.Sc. in classical music production and recording science in the U.S. In 2001 she joined the TV department of Bayerischer Rundfunk, Munich, Germany, as video and sound engineer. During this engagement she recorded Yale University's Chamber Music Festival 2002, featuring the Tokyo String Quartet. Since 2003 she has been a sound engineer for studio and remote productions in BR's radio department.

She was chair of the Student Delegate Assembly, Europe/International Regions, in 1999, and facilities assistant at the conventions in New York in 2001, Munich 2002, New York 2003, and Berlin 2004, with increasing involvement in education events and section activities.
JOHN VANDERKOOY was born in 1943 in Maasland, the Netherlands. He emigrated to Canada with his family at an early age. All of his education was completed in Canada, with a B.Eng. degree in engineering physics in 1963 and a Ph.D. in physics in 1967, both from McMaster University in Hamilton, Ontario. After a two-year postdoctoral appointment at the University of Cambridge in the UK, he went to the University of Waterloo. For some years, he followed his doctoral interests in high magnetic-field, low-temperature physics of metals. His research interests since the late 1970s, however, have been mainly in audio and electroacoustics. He is currently a full professor of physics at the University of Waterloo. Over the years he has spent sabbatical research leaves at the University of Maryland, Chalmers University in Gothenburg, the Danish Technical University in Lyngby, the University of Essex in the UK, the Bang & Olufsen Research Centre in Struer, Denmark, and Philips National Labs in Eindhoven, the Netherlands. Vanderkooy is a fellow of the AES and a recipient of its Silver Medal and several Publication Awards. Over the years he has contributed a wide variety of technical papers in such areas as loudspeaker crossover design, electroacoustic measurement techniques, dithered quantizers, and acoustics. Together with his colleague Stanley Lipshitz and a number of graduate students, he forms the Audio Research Group at the University of Waterloo. His important contributions were papers on dither in digital audio and MLS measurement systems. He brings an academic point of view to the AES.

FACULTY VACANCY ANNOUNCEMENT: Music Business Production Emphasis.

RANK AND SALARY: Assistant Professor/Associate Professor (dependent upon qualifications and experience).

RESPONSIBILITIES: Teach courses in music recording production and audio engineering technology; specifically, beginning, intermediate, and advanced courses in studio recording theory, history, and practice. Typical load is 24 hours per year (four class sections per semester) plus student advising. Classes may include topics in studio, mastering, postproduction audio for video, or concert remote recording. This position involves a full-time commitment to teaching. Base contract salary is on a ten-month cycle with an additional summer teaching option available.

QUALIFICATIONS: Teaching experience and a Master's degree in a related discipline with current or future pursuit of a Doctorate preferred. Experience and progress toward a terminal degree may be considered. Experience with studio record production and session procedures, professional experience with commercially released credits, and demonstrated ability to communicate and work as part of an accomplished team are required. Must possess comprehensive knowledge of microphone design and studio and concert recording techniques, and must have historical as well as functional and theoretical knowledge of both analogue (Neve and SSL console operations; Studer and Otari 2-inch machine alignment and operations, including synchronization procedures) and digital recording technology (specifically ProTools, Nuendo, Sony DASH, and Otari RADAR HD systems).

BELMONT UNIVERSITY: A coeducational university located in Nashville, TN, Belmont is a student-centered, teaching university focusing on academic excellence. The university is dedicated to providing students from diverse backgrounds an academically challenging education in a Christian community, and is affiliated with the Tennessee Baptist Convention.

THE MIKE CURB COLLEGE OF ENTERTAINMENT AND MUSIC BUSINESS: Located near Nashville's dynamic Music Row, the Mike Curb College of Entertainment and Music Business enrolls 900+ majors and combines classroom experience with real-world applications. The curriculum comprises a BBA with emphasis areas in Music Business and Music Production. Facilities feature eight state-of-the-art recording studios, including the award-winning Ocean Way Nashville studios, historic RCA Studio B, and the state-of-the-art Robert E. Mulloy Student Studios in the Center for Music Business.

APPLICATION PROCESS: Candidates are asked to respond to Belmont's mission, vision, and values statement in a written statement articulating how the applicant's knowledge, experience, and beliefs have prepared them to function in support of that statement. Send a letter of application including a statement of personal educational philosophy, a complete resumé/curriculum vitae, and contact information for at least three references to:

Dr. Wesley A. Bulla
Associate Dean
Mike Curb College of Entertainment and Music Business
1900 Belmont Blvd.
Nashville, TN 37212

APPLICATION DEADLINE: Review of applications will begin immediately.

BELMONT IS AN EOE/AA employer under all applicable civil rights laws. Women and minorities are encouraged to apply.
NEWS OF THE SECTIONS
We appreciate the assistance of the
section secretaries in providing the
information for the following reports.
ogy, whereby artifacts of the recording original analog recording and subse-
McGill Students Meet process are extracted and conditioned. quent playback.
One hundred and twenty people gath- These artifacts are then converted into Howarth provided a complete
ered on January 9, for the McGill a pseudoclock source. Using this clock description of the steps necessary to
University Student Section meeting in Clara Lichtenstein Hall at the Strathcona Music Building in Montreal. Bob Ludwig, renowned mastering engineer, was the featured speaker. The Centre for Interdisciplinary Research in Music Media and Technology sponsored the event.

Ludwig spoke to the standing room only crowd about his mastering facility, Gateway Mastering in Portland, Maine. He addressed the issues involved in building such a facility. He showed pictures of the studios and discussed the rooms, wiring, acoustics and equipment, among other important elements essential to a mastering facility. Audience members asked Ludwig’s opinion on issues such as DVD versus SACD, the use of compression in pop music, and the general state of the industry.

In addition to this appearance, Ludwig also gave a private lecture to the McGill AES students. During this session, Ludwig talked about a DVD art and music project mixed in 5.1 that Gateway had worked on with the composer, Steve Reich. Students were also given the unique opportunity to have Ludwig critique their material.

The section was grateful to Ludwig for taking time out of his busy schedule to come to Montreal. This event also helped generate a lot of publicity for the Audio Engineering Society among Montreal’s professional audio community.

Time Traveler
On February 9, members of the McGill University Student Section met at the Strathcona Music Building in Montreal to hear Jamie Howarth, president of Plangent Processes, talk about a hardware/software solution for audio restoration and analog recording.

Plangent Processes has developed a processor called “Time Traveler” that removes speed variations from analog recordings as they are digitized. The processor corrects speed and pitch variations known as wow and flutter using a wideband reproducer technology; […] the audio is reconfigured in DSP with a unique application of Irregularly Spaced Sampling Theory. In so doing, the recorded material provides the information necessary to counteract the mechanical defects inherent within the analog recorder/reproducer as manifested during the […] obtain results with Time Traveler. He also had samples from familiar master tapes, charts, and other analytical data, which he used to demonstrate the process. The conclusion was that the unit helps reduce FM distortion by as much as 30 dB.
April Cech
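Plangent's actual process is proprietary, but the underlying idea of correcting wow and flutter by resampling can be sketched generically. In this hypothetical example (not Plangent's algorithm), the instantaneous tape-speed ratio is assumed to be already known, e.g. recovered from a bias or pilot tone; the digitized samples are then treated as irregularly spaced in "program time" and interpolated back onto a uniform grid:

```python
import numpy as np

def correct_wow(captured, speed, fs):
    """Undo playback-speed variation by non-uniform resampling.

    captured : samples digitized at uniform wall-clock rate fs
    speed    : instantaneous tape-speed ratio per sample (1.0 = nominal)

    A generic sketch only; Plangent's actual algorithm is proprietary.
    """
    # Tape position (seconds of program material) at each capture instant:
    # each wall-clock sample advances the tape by speed/fs seconds.
    position = np.concatenate(([0.0], np.cumsum(speed[:-1]))) / fs
    # The captured samples are therefore irregularly spaced in program
    # time; interpolate them back onto a uniform grid.
    uniform = np.linspace(0.0, position[-1], len(captured))
    return np.interp(uniform, position, captured)

# Example: a 100 Hz tone recorded through a +/-2 percent sinusoidal wow.
fs = 8000
n = np.arange(2 * fs)
speed = 1.0 + 0.02 * np.sin(2 * np.pi * n / fs)      # 1 Hz speed wobble
position = np.concatenate(([0.0], np.cumsum(speed[:-1]))) / fs
wowed = np.sin(2 * np.pi * 100.0 * position)         # what gets digitized
restored = correct_wow(wowed, speed, fs)
```

With the wobble removed, `restored` is within linear-interpolation error of a steady 100 Hz tone; a production system would use higher-order (e.g. windowed-sinc) interpolation to push the residual FM lower.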
J. Audio Eng. Soc., Vol. 52, No. 11, 2004 November 1189
NEWS
OF THE
SECTIONS
Thiele in India
The India Section met on Monday, July 19 at the Ramee Guestline hotel, Juhu, Mumbai, to listen to a presentation of two technical papers by AES international vice-president Neville Thiele. Before the meeting, he had a one-on-one conversation with section committee members. The venue was packed to capacity, with almost every member present to listen to a legend in our field of work.

The international vice-president was very pleased with the good work being carried out by the India Section and recommended that all sections follow its example. Besides encouraging scientific research in the field of audio engineering, the section has been building awareness among the general public about the misconceptions associated with audio and loud music, along with informing them on the finer technical aspects of modern audio systems and the fascinating world of recording sound in a studio.

The two papers presented were “The Dynamics of Reproduced Sound” and “The Thiele-Small Parameters for Measuring, Specifying and Designing Loudspeakers.”

Thiele explained how, in the more than 100 years since sound was first recorded in a form that could be replayed, the quality of reproduction has improved dramatically. Nevertheless, sound reproduction, although better in many respects, still too often retains the problem of restriction of dynamic range. The ways in which this restriction sometimes enhances but very often impairs the quality of sound reproduction were explored and explained in lucid detail. Now, more than 40 years since they were first published, the T-S parameters continue to prove extremely useful.

Realizing he had lost track of the time, Thiele had to abruptly conclude his presentation in order to catch a flight back to Australia. He promised section members to follow up on his all too brief visit with another one in the near future. The meeting ended with a sumptuous Indian feast in honor of Thiele’s visit.
Russell A. Corte-Real

[Photo caption: Neville Thiele presents two technical papers at India Section meeting.]

Solid State Power Amps
To kick off the fall season, the Penn State Section met on September 9 to hear Eli Hughes speak on Modern Solid State Power Amps: Concerns and Implementation. After some opening remarks by Dan Valente, Hughes began his talk.

Hughes covered some general topics, such as why build your own amp, what should be understood before building one, and what the basic idea of an amp is. Overall it was very informative for those just getting started in the area. Hughes also described his method of design, Load to Line: he advocates building the amp for what you want to use it on.

Schematics
When the basics of amps had been discussed, he continued with a brief discussion of the schematics of his Linear Single Channel 50 W amp. At the end of his talk, he provided some useful resources for those interested in learning more about amps: on the Web, visit http://www.diyaudio.com and http://www.passdiy.com, and for a place to order boards, http://www.4pcb.com.

Effects of Digital TV
Jim Hilson of Dolby Labs visited the Nashville Section on June 29 to talk about the effects of digital television transmission on the delivery of multichannel sound.

Hilson discussed various logistical and physical problems involved in delivering multichannel sound with the new digital TV standard. He cited many specific examples from the summer Olympics in Athens, Greece, in order to demonstrate the complexity of keeping audio and video in sync with various processing taking place up and down the transmission path.

Jim Ferguson, chief engineer at WNPT, the local PBS affiliate, said that the station is not yet using the additional bandwidth of the digital TV broadcast frequency for HD programming. However, it is experimenting with multicast delivery of specialized programming to private sectors of the community while delivering standard-definition programming that mirrors the analog transmission over the digital channel.
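Thiele's point about the continued usefulness of the T-S parameters is easy to illustrate: two of them (the free-air resonance fs and total Q, Qts) plus the equivalent compliance volume Vas predict a closed-box loudspeaker system's resonance and Q directly. A minimal sketch using the standard textbook relations follows; the driver values are invented for the example, not taken from any real driver:

```python
import math

def sealed_box_alignment(fs_hz, qts, vas_l, vb_l):
    """Closed-box system resonance and total Q from driver T-S parameters.

    Standard relations: fc = fs * sqrt(1 + Vas/Vb), Qtc = Qts * sqrt(1 + Vas/Vb),
    where alpha = Vas/Vb is the compliance ratio.
    """
    alpha = vas_l / vb_l                  # compliance ratio
    fc = fs_hz * math.sqrt(1.0 + alpha)   # system resonance, Hz
    qtc = qts * math.sqrt(1.0 + alpha)    # system total Q
    return fc, qtc

# Hypothetical driver (fs = 25 Hz, Qts = 0.35, Vas = 140 L) in a 40 L box:
fc, qtc = sealed_box_alignment(fs_hz=25.0, qts=0.35, vas_l=140.0, vb_l=40.0)
```

For these made-up values the box raises the resonance to about 53 Hz with Qtc near 0.74, close to the maximally flat Butterworth alignment (Qtc = 0.707), which is exactly the kind of design decision the parameters were published to support.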
SOUND TRACK

Upcoming Meetings
2005 April 18-21: International Conference on Emerging Technologies of Noise and Vibration Analysis and Control, Saint Raphael, France. E-mail: goran.pavic@insa-lyon.fr.
2005 May 16-20: 149th Meeting of the Acoustical Society of America, Vancouver, British Columbia, Canada. ASA, Suite 1NO1, 2 Huntington Quadrangle, Melville, NY 11747, USA. Fax: +1 516 576 2377; Web: http://www.asa.aip.org.
2005 May 28-31: 118th AES Convention, Barcelona, Spain. See page 1208 for details.
2005 July 7-9: 26th AES Conference, “Audio Forensics in the Digital Age,” Denver, CO, USA. See page 1208.
2005 October 7-10: 119th AES Convention, Jacob K. Javits Convention Center, New York, NY, USA.

ABOUT COMPANIES…
AES sustaining member Solid State Logic of Begbroke, UK, has named AudioPro International, Inc. of Toronto, Canada, the exclusive distributor of the new SSL AWS 900 Analogue Workstation System for the Canadian market. The advance of Pro Tools and other such recording and editing systems used in many facilities today makes the AWS 900 suitable for bridging the gap between a simple digital control unit and the high-end sound of an SSL SuperAnalogue console. The AWS 900 provides the dual benefits of a fully featured SuperAnalogue signal path coupled with a DAW controller at a lower price point. According to the principal partner of AudioPro International, this kind of solution suits the Canadian market, which has many smaller facilities that cannot afford a full-blown 9000 J or K Series. The representatives of AudioPro International specialize in surround sound applications and bring a combined 60 years of experience to the audio sales field. AudioPro will be setting up a demo facility for the SSL AWS 900 workstation in the near future.

SRS Labs, Inc., of Santa Ana, California, a sustaining member of the AES, has announced that 38 member stations of the JFN Network, Japan’s leading broadcast network association of radio stations, aired Japan’s first surround sound soccer broadcast on August 1. The participating stations broadcast the Kashima Antlers vs. FC Barcelona game in full 5.1 surround using SRS Labs’ Circle Surround (CS) technology. Tokyo FM broadcast the soccer game and encoded the surround mix live into the standard two-channel broadcast format using SRS Circle Surround. The signal was then broadcast to over 100 million soccer fans in Japan via the JFN network. Listeners with a multichannel home theater or automotive system heard lifelike surround sound, while those listening over regular two-loudspeaker stereo systems heard the broadcast in enhanced stereo. Circle Surround is a patented multichannel audio encoding and decoding technology capable of supporting a wide range of surround sound creation and playback applications. CS hardware and software encoders can encode up to 6.1 channels of discrete audio for distribution over existing two-channel carriers such as broadcast television, cable, and satellite transmission, streaming media over the Internet, CDs, and VHS tapes.

AES sustaining member Klipsch Audio Technologies of Indianapolis, Indiana, has reached an agreement with Oxmoor Corporation LLC of Birmingham, Alabama, making Klipsch the exclusive global distributor of ZON Whole House Digital Audio products. The announcement came just before Klipsch debuted 50 of its own new high-performance loudspeaker products geared toward the custom installation market, 28 of which were on display recently at the Custom Electronic Design and Installation Association (CEDIA) Expo in Indianapolis. According to Paul Jacobs, Klipsch president, Klipsch has experienced significant growth in the retail, commercial, multimedia, and professional cinema segments of the audio business over the past five years. The ZON alliance will now allow Klipsch to make an even greater impact on the residential contracting market. Oxmoor’s award-winning ZON Digital Whole House Audio System features a stylish, easy-to-use control with 60-watt integrated amp, analog and digital audio inputs, IR routing, RS-232 control, source selection, EQ, balance, paging, and other advanced features. ZON received honors for this system last year from the Consumer Electronics Association (CEA) as a “Mark of Excellence Award Winner.”

MAGNETIC RECORDING: The Ups and Downs of a Pioneer. The Memoirs of Semi Joseph Begun. Soft cover. Prices: $15.00 members, $20.00 nonmembers. Audio Engineering Society, tel. +1 212 661 8528 ext. 39; e-mail Andy Veloz at AAV@aes.org; http://www.aes.org.
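Matrix systems like Circle Surround work by folding extra channels into a stereo-compatible pair with amplitude (and, in real systems, phase) relationships that a decoder can later separate. SRS's actual coefficients and phase networks are proprietary, so the sketch below uses generic textbook -3 dB values purely to show the principle of a 4:2 fold-down and sum/difference recovery:

```python
import numpy as np

K = 1.0 / np.sqrt(2.0)  # -3 dB mix coefficient (generic, not SRS's values)

def matrix_encode(left, right, center, surround):
    """Fold four channels into a stereo-compatible Lt/Rt pair.

    A simplified amplitude-only matrix for illustration; real encoders
    such as Circle Surround also use phase-shift networks and carry
    more channels (up to 6.1).
    """
    lt = left + K * center - K * surround
    rt = right + K * center + K * surround
    return lt, rt

def passive_decode(lt, rt):
    """Simplest possible decode: front = sum, rear = difference."""
    return lt + rt, lt - rt

# A surround-only test signal cancels at the front and appears
# (inverted, at twice the mix gain) in the rear difference channel.
s = np.ones(8)
lt, rt = matrix_encode(np.zeros(8), np.zeros(8), np.zeros(8), s)
front, rear = passive_decode(lt, rt)
```

Played over an ordinary stereo pair, Lt/Rt still sounds like a sensible two-channel mix, which is why listeners without a decoder simply heard enhanced stereo.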
NEW PRODUCTS
AND
DEVELOPMENTS
Product information is provided as a service to our readers. Contact manufacturers directly for additional information, and please refer to the Journal of the Audio Engineering Society.

AES SUSTAINING MEMBER DIGITAL RADIO TUNER is designed for use with Kenwood Excelon™ and Kenwood in-dash DVD and CD receivers. Model KTC-HR100 works with more than two dozen 2003 and 2004 Kenwood models. WHUR-FM, at Howard University, is now broadcasting HD Radio™ technology, making it the first commercial station to bring digital radio to the Washington, D.C. metropolitan area. The station is broadcasting signals with a Harris ZDD64HDC 28 000-W, solid-state FM digital broadcast transmitter using the Harris DEXSTAR™ HD Radio exciter. The Harris equipment transmits the HD Radio audio and data created by software developer iBiquity Digital. These combined technologies provide a platform for integrated wireless data services that deliver a variety of additional information via scrolling text. Kenwood USA Corporation, P. O. Box 22745 MDS, Long Beach, CA 90801, USA; tel. +1 310 639 4200; fax +1 310 604 4487; Web site www.kenwoodusa.com.

AES SUSTAINING MEMBER MULTICHANNEL AD/DA CONVERTER adds a Firewire (IEEE 1394) interface module that is compatible with the latest Apple OS X operating system. Support is also planned for Windows XP. The new module allows the ADA-8 to operate with software such as Emagic Logic Audio V6, Apple’s Final Cut Pro, and many other applications. In this configuration, the ADA-8 can provide eight channels of AES/EBU digital and analog input and output at sample rates up to 96 kHz. The converter is compatible with a wide range of PC- and Mac-based recording, editing, and sequencing systems including Digidesign Pro Tools Mix and Pro Tools HD, Logic Audio, Final Cut Pro, and many others. In addition to Firewire, ADA-8 interfaces include AES, Pro Tools MIX/HD, and DSD. Connection to Pro Tools is direct to the Mix or Core card, eliminating the Pro Tools I/O hardware. Prism Media Products Inc., 21 Pine Street, Rockaway, NJ 07866, USA; tel. +1 973 983 9577 (US); fax +1 973 983 9588; in UK: tel. +44 1223 424 988; fax +44 1223 424 023; e-mail sales@prismsound.com; Web site http://www.prismsound.com.

PASSIVE IN-LINE AUDIO LEVEL CONTROLLER is designed to control mono/stereo audio signals from powered monitors and amplifier units that have no output control, such as a series of microphone preamplifiers, power amplifiers, or loudspeakers. The new model ATTY controller has two Neutrik combo 1/4-inch XLR input jacks and two balanced output XLRs. Features include a level control knob and a mute switch for those moments when an immediate response is required. The mute switch operates as a “panic button,” shorting the signal until full control is regained. The unit weighs 2.5 pounds and measures 2 inches x 3 inches x 5 inches for easy placement. A Designs Audio, P. O. Box 4255, West Hills, CA 91304, USA; tel. +1 818 716 4153; fax +1 818 716 4153; e-mail sales@adesignsaudio.com; Web site http://www.adesignsaudio.com.

AES SUSTAINING MEMBER RECHARGEABLE BATTERY PACK powers the Portadisc for approximately three and one half hours. As a single, sealed unit, the MDPBP battery pack does not suffer the problems that can occur with caddies containing individual AA-sized cells, in which batteries can be inserted incorrectly, or rechargeable and alkaline batteries may be inadvertently combined. Designed and engineered for the most demanding conditions, the MDPBP is both short-circuit and temperature protected, with all internal metal contacts securely welded for long-term reliability. Also new from the company is the ACS110, a microprocessor-controlled charger specifically developed for use with the MDPBP battery pack. An ACS110 charger and two MDPBPs will supply a Portadisc with continuous battery power. Additional ACS110 features include a discharge function for effective management of the MDPBP pack and comprehensive LED metering of the charging functions. Sennheiser Electronic Corporation, 1 Enterprise Drive, Old Lyme, CT 06371, USA; tel. +1 860 434 9190; fax +1 860 434 1759; Web site www.sennheiserusa.com.
MEMBERSHIP
INFORMATION
Section symbols are: Aachen Student Section (AA), Adelaide (ADE), Alberta (AB), All-Russian State Institute of Cinematography
(ARSIC), American River College (ARC), American University (AMU), Appalachian State University (ASU), Argentina (RA),
Atlanta (AT), Austrian (AU), Ball State University (BSU), Belarus (BLS), Belgian (BEL), Belmont University (BU), Berklee
College of Music (BCM), Berlin Student (BNS), Bosnia-Herzegovina (BA), Boston (BOS), Brazil (BZ), Brigham Young University
(BYU), Brisbane (BRI), British (BR), Bulgarian (BG), Cal Poly San Luis Obispo State University (CPSLO), California State
University–Chico (CSU), Carnegie Mellon University (CMU), Central German (CG), Central Indiana (CI), Chicago (CH), Chile
(RCH), Cincinnati (CIN), Citrus College (CTC), Cogswell Polytechnical College (CPC), Colombia (COL), Colorado (CO),
Columbia College (CC), Conservatoire de Paris Student (CPS), Conservatory of Recording Arts and Sciences (CRAS), Croatian
(HR), Croatian Student (HRS), Czech (CR), Czech Republic Student (CRS), Danish (DA), Danish Student (DAS), Darmstadt
(DMS), Del Bosque University (DBU), Detmold Student (DS), Detroit (DET), District of Columbia (DC), Duquesne University
(DU), Düsseldorf (DF), Ecuador (ECU), Expression Center for New Media (ECNM), Finnish (FIN), Fredonia (FRE), French
(FR), Full Sail Real World Education (FS), Graz (GZ), Greek (GR), Hampton University (HPTU), Heartland (HRT), Hong Kong
(HK), Hungarian (HU), I.A.V.Q. (IAVQ), Ilmenau (IM), India (IND), Institute of Audio Research (IAR), Israel (IS), Italian (IT),
Italian Student (ITS), Japan (JA), Javeriana University (JU), Kansas City (KC), Korea (RK), Lithuanian (LT), Long
Beach/Student (LB/S), Los Andes University (LAU), Los Angeles (LA), Louis Lumière (LL), Malaysia (MY), McGill University
(MGU), Melbourne (MEL), Mexican (MEX), Michigan Technological University (MTU), Middle Tennessee State University
(MTSU), Moscow (MOS), Music Tech (MT), Nashville (NA), Netherlands (NE), Netherlands Student (NES), New Orleans (NO),
New York (NY), New York University (NYU), North German (NG), Norwegian (NOR), Ohio University (OU), Orson Welles
Institute (OWI), Pacific Northwest (PNW), Peabody Institute of Johns Hopkins University (PI), Pennsylvania State University
(PSU), Peru (PER), Philadelphia (PHIL), Philippines (RP), Polish (POL), Portland (POR), Portugal (PT), Ridgewater College,
Hutchinson Campus (RC), Romanian (ROM), Russian Academy of Music, Moscow (RAM/S), SAE Nashville (SAENA), St. Louis
(STL), St. Petersburg (STP), St. Petersburg Student (STPS), San Buenaventura University (SBU), San Diego (SD), San Diego
State University (SDSU), San Francisco (SF), San Francisco State University (SFU), Serbia and Montenegro (SAM), Singapore
(SGP), Slovakian Republic (SR), Slovenian (SL), South German (SG), Spanish (SPA), Stanford University (SU), Swedish (SWE),
Swiss (SWI), Sydney (SYD), Taller de Arte Sonoro, Caracas (TAS), Technical University of Gdansk (TUG), Texas State
University—San Marcos (TSU), The Art Institute of Seattle (TAIS), Toronto (TOR), Turkey (TR), Ukrainian (UKR), University of
Arkansas at Pine Bluff (UAPB), University of Cincinnati (UC), University of Colorado at Denver (UCDEN), University of
Hartford (UH), University of Illinois at Urbana-Champaign (UIUC), University of Luleå-Piteå (ULP), University of
Massachusetts–Lowell (UL), University of Miami (UOM), University of Michigan (UMICH), University of North Carolina at
Asheville (UNCA), University of Southern California (USC), Upper Midwest (UMW), Uruguay (ROU), Utah (UT), Vancouver
(BC), Vancouver Student (BCS), Venezuela (VEN), Vienna (VI), Webster University (WEB), West Michigan (WM), William
Paterson University (WPU), Worcester Polytechnic Institute (WPI), Wroclaw University of Technology (WUT).
Greg Scott, 14004 Mercado Dr., Del Mar, CA 92014 (SD)
Peter Seckel, 123 Tuscan Rd., Maplewood, NJ 07040 (NY)
Samantha Selig, 154 Brown St., Tewksbury, MA 01876 (BOS)
Vinay Shrivastava, Broadcast & Elec. Comm. Arts Dept., San Francisco State University, 1600 Holloway Ave., San Francisco, CA 94132 (SF)
Samir Sinha, 495 14th Ave. #1, San Francisco, CA 94118 (SF)
Ali H. Sleiman, ACC wll, Shuwaikh, 176 Safat, 13002 Kuwait
Kevin Smith, 204-73 Coburg St., New Westminster, V3L 2E7, British Columbia, Canada (BC)
Jerome Smith, Klepto Records/Diffrent Fur, 3470 19th St., San Francisco, CA 94110 (SF)
Adam Sohmer, Sohmer Associates LLC, 507 17th St., Brooklyn, NY 11215 (NY)
Eric Southam, Easyplug Inc., 2300 S. Decker Lake Blvd., Salt Lake City, UT 84119 (UT)
Bryan Steele, 42855 W. 19th St., Lancaster, CA 93534 (LA)
Stefan Stenzel, Stadler Electro GbR, Neustrasse 12, DE 53498, Waldorf, Germany (CG)
Jim Stephens, 1755 John Richardson Ln., Vale, NC 28168 (AT)
Daniel Stevens, 155 Burton Ave., Hasbrouck Heights, NJ 07604 (NY)
David Stinson, 1115 Providence Ct., Frederick, MD 21703 (DC)
Chris Sturwold, This is Oddyo Inc., 4139-98 St. NW, Edmonton, T6E 5N5, Alberta, Canada (AB)
Olav G. Sunde, Car Konows Gate 14, NO 5161, Laksevag, Norway (NOR)
MyungHoon Sunwo, Ajou Univ., San 5 Wonchon-Dong, Paldal-Gu, Suwon, Kyunggi-Do 442-749, Korea (RK)
Dennis Tabuena, 1075 Trinity Dr., Menlo Park, CA 94025 (LA)
Joel Tan, 2/F 6D Babington Path, Hong Kong (HK)
Sean Tan, 1556 Ambergrove Dr., San Jose, CA 95131 (SF)
Joachim Thiemann, 4657 du Parc, Montreal, H2V 4E4, Quebec, Canada (MGU)
Jan Thore Hol, Rabbenveien 6C, NO 3039, Drammen, Norway (NOR)
Albert Trezza, 1211 Court N. Dr., Melville, NY 11747 (NY)
Cartsen Tringgaard, Tjearebyvej 111, DK 4000, Roskilde, Denmark (DA)
Tony Tseng, 8F-6 no. 351 Chung Shan Rd., Sec. 2, Chung Ho City, Taipei 235, Taiwan
Christian Ulbrich, Strelitzer str 18, DE 10115, Berlin, Germany (NG)
James W. Urick, 1510 Hillside Oak Dr., Grayson, GA 30017 (AT)
Jaime Valenszuela, 338 Pasqual Ave., San Gabriel, CA 91775 (LA)
Peter Van Dam, ATS bvba, Wingepark 17, BE 3110, Rotselaar, Belgium (BEL)
William Vaughan, 1029 Park Rd. NW, Washington, DC 20010 (DC)
Marcus Venturi, Visu-IT! GmbH, Donaustaufer Str. 93, DE 93059, Regensburg, Germany (SG)
Fabio Vignoli, Philips Research Laboratories, Philips Research WY-21, Prof. Holstlaan 4, NL 5656AA, Eindhoven, Netherlands (NE)
Yon Visell, Kozada 1 Stinjan, Franinovic, HR 52000, Pula, Croatia (HR)
Travis Walat, 54 Coady Ave., Toronto, M4M 2Y8, Ontario, Canada (TOR)
Joseph Warda, 43-60 Douglaston Pkwy. #420, Douglaston, NY 11363 (NY)
Mark Wherry, 1547 14th St., Santa Monica, CA 90404 (LA)
Silvia Weise, Falkentalersteig 58, DE 13467, Berlin, Germany (NG)
Joey White, 1220 Wright St., Reno, NV 89509
Monte Wise, 13216 Marion Dr., Burnsville, MN 55337 (UMW)
Doug Wong, 215-36 Ave. NE Unit 7, Calgary, T2E 2L4, Alberta, Canada (AB)
Lonce Wyse, 1F Pine Grove, 13-31, 595001 Singapore (SGP)
Vladimir S. Zverev, Chernichnaya str 22, Vsevolozhskiy raion, Toksovo, RU 188666, Leningradskaya oblast, Russia (STP)
Kevin Zwack, 9653 Lamar Pl., Westminster, CO 80021 (CO)

STUDENTS
Daniel Epstein, 4423 N. Paulina St. #1, Chicago, IL 60640 (CC)
Michael Epstein, 400 E. 66th St. #4F, New York, NY 10017 (IAR)
Chris Fletcher, 6400 Christie Ave. #4118, Emeryville, CA 94608 (ECNM)
Megan Foley, P. O. Box 4567, Davis, CA 95617 (SFU)
Daniel Forsberg, Ankarskatav 85b, SE 94134, Pitea, Sweden (ULP)
Benjamin M. Foxx, 167 W. Hudson St., Long Beach, NY 11561 (IAR)
Daniel Fritz, Millergasse 50/14, AT 1060, Vienna, Austria (VI)
Stefan Fuhmann, Oehrenstoecker str 3, DE 98693, Ilmenau, Germany (IM)
Nozumu Furuya, 3 Admiral Dr. #F269, Emeryville, CA 94608 (ECNM)
Matthew Gagnon, 6230 N. Kenmore Ave. #908, Chicago, IL 60660 (CC)
Diego F. Galceran Fernandez, Juan Paulier 1018, Montevideo 11300, Uruguay
Louis Galliot, 4 allée du Clos de la Croix, FR 78290, Croissy sur Seine, France (CPS)
Carol A. Galvis Jimenez, Cra 23 No. 39 A 40 Apto. 302, Bogota, Colombia (JU)
Laporschea Gamble, 3802 Sutton Place Blvd. #1324, Winter Park, FL 32792 (FS)
Aaron Gandia, Villa de Torrimar 428 Valle Rey Luis, Guaynabo, PR 00969 (FS)
Cole Gaugler, 27520 N. Sierra Hwy. #H205, Canyon Country, CA 91351 (CRAS)
Oren Gertlitz, Kopernikusstrasse 5, DE 10243, Berlin, Germany (BNS)
Jason Goldkamp, 2409 Lancaster Dr. #11, Richmond, CA 94806 (ECNM)
Brandon Gonzalez, 11867 SW 9th Mamor, Davie, FL 33325 (UOM)
Rodrigo Gonzalez-Hverta, Calle 13 No. 230, Cordoba 24500, Mexico
Christos Goussios, 12 Askitou Str., GR 54624, Thessaloniki, Greece
Zachary Gowen, 210 Crystal Lake Dr., Clermont, FL 34711 (FS)
Celine Grangey, 139 rue Manin, Apt. 64, FR 75019, Paris, France (CPS)
Ulf Grunbaum, Ankarskatav 84e, SE 94134, Pitea, Sweden (ULP)
Manuel D. Guevara Alvarez, Calle 8f No. 79-37, Bogota, Colombia (SBU)
Benjamin Gugler II, 2113 Irise Ct. Apt. 306, Orlando, FL 32807 (FS)
Steven Guilliams, 1803 Golden Gate Ave., San Francisco, CA 94115 (SU)
Hannelore Guittet, 13 rue Jules Auffret, FR 93500, Pantin, France (CPS)
Ajay Gupta, P. O. Box 115, Notre Dame, IN 46556 (BSU)
Mike Gurnari, 8 Fuente Ave., San Francisco, CA 94132 (SFU)
Erik Gustafsson, Ankarskatav 79c, SE 94134, Pitea, Sweden (ULP)
Stanley Haggard, 1202 Hillside Ave., Richmond, VA 23229 (HPTU)
Christopher Harrelson, 1911 28th St., Sacramento, CA 95816 (SFU)
Benjamin Harris, 7985 W. 51st Ave. #8, Arvada, CO 80002 (UCDEN)
Julia Havenstein, Kaiser-Friedrich-str 64, DE 10627, Berlin, Germany (BNS)
Joshua Hearst, 872 Queen Anne Pl., St. Louis, MO 63122 (WEB)
Wiebke Heldmann, Chorinerstr. 61, DE 10435, Berlin, Germany (BNS)
Yves L. Henry, 53 Hempstead Rd., Spring Valley, NY 10977 (NYU)
Ralf Herrmann, Juelicher str 80, DE 40477, Duesseldorf, Germany (DF)
Arthur T. Hill, 2005 N. Ball Ave., Muncie, IN 47304 (BSU)
Simeon Hinton, 96 Autumn Breeze Way, Winter Park, FL 32792 (FS)
William Ho, Apt. 404, 1745 Wilcox Ave., Los Angeles, CA 90028 (USC)
Ricardo Hohmann, Rindermarkt 16, DE 80331, Munich, Germany
Jason Holderness, 1118 Nord Ave. #32, Chico, CA 95926 (CSU)
Sean Hopper, 2 Millridge Estates, Elora, N0B 1S0, Ontario, Canada
Tomislav Horvat, A. Mihanovica 22, HR 44322, Kutina, Croatia (HRS)
Lisa M. Host, 2728 N. 83 St., Omaha, NE 68134
Mattew Houston, 906 N. Dodge #10, Iowa City, IA 52245
Daniel Howd, 801 E. Benjamin Ave., P. O. Box 150, Norfolk, NE 68702
Michael T. Hudson, 11801 High Tech Ave. #324, Orlando, FL 32817 (FS)
Mats Ingemansson, Ankarskatav 71b, SE 94134, Pitea, Sweden (ULP)
Anamaria D. Irisarri, Carrera 2A #72-67 Apt. 201, Bogota, Colombia (JU)
Matthew C. Irvin, 121 Hazelwood Dr. #E23, Hendersonville, TN 37075 (SAENA)
Nicholas Jacalone, 5410 Loma Ave., Temple City, CA 91780 (CSU)
Jeremy Jacobs, 31103 Pierce Ct., Crown Point, IN 46307 (FS)
Eric Jacobsen, 936 Bishop Park Ct. #1321, Winter Park, FL 32792 (FS)
Chris Jara, 4733 N. Goldenrod Rd. Apt. D, Winter Park, FL 32792 (FS)
Shawn Jennings, 202 E. Peabody Dr., URH 361 Scott Hall, Champaign, IL 61820 (UIUC)

Advertiser Internet Directory
Belmont University ............ 1187 (www.belmont.edu)
BSWA Technology Co., Ltd. ............ 1197 (www.bswa-tech.com)
*Prism Media Products Limited ............ 1189 (www.prismsound.com)
Rohde & Schwarz GmbH & Co. ............ 1183 (www.rohde-schwarz.com)
*AES Sustaining Member.
John Jensen, 517 V St., Sacramento, CA 95818 (SFU)
Sverre K. Johansen, Madlamarkveien 6 l.118, NO 4041, Stavanger, Norway
Daniel Johnson, 8196 SW 53rd Ct., Ocala, FL 34476 (UOM)
Paul Johnson, 4420 N. Varsity Ave. #1074, San Bernardino, CA 92407 (SDSU)
Dave Jones, 2404B Crestmoor Rd., Nashville, TN 37215 (SAENA)
William Jones, 1759 N. Semoran Circle #203, Winter Park, FL 32730 (FS)
Johannes Kammann, Oldenburger str 31, DE 10551, Berlin, Germany (BNS)
Gregory Kares, 501 Clinton St. #3, Brooklyn, NY 11231 (NYU)
Jill Kares, 501 Clinton St. #3, Brooklyn, NY 11231 (NYU)
Robert Kawiak, ul. Fr. Sokoka 150, PL 80-603, Gdynia, Poland (TUG)
Travis Kessler, 3309 Horst Ln., Chambersburg, PA 17201 (PSU)
Nick Kettman, 1200 Barton Hills Dr. #209, Austin, TX 78704 (TSU)
George Kim, P. O. Box 745020, Los Angeles, CA 90004 (LB/S)
Craig King, 1323 Whitewood Dr., Deltona, FL 32725 (FS)
Rishi Kirby, 411 Lincoln Ave. Unit 36, Glendale, CA 91205 (CTC)
Mariusz Klawikowski, Nowy Barkoczyn 112, PL 83-422, Nowy Barkoczyn, Poland (TUG)
Matthew Kline, 703 Jackpine, P. O. Box 864, Stanton, NE 68779
Michal Klos, ul. Szarych Szeregow 4/2, PL 88-100, Inowroclaw, Poland (TUG)
Ivan Kovacevic, Grcica Milenka 3, YU 11000, Belgrade, Yugoslavia
Lucille Kyle, 1411 Marchbanks Dr. #3, Walnut Creek, CA 94598 (ECNM)
Robert Lapp, 3212 Arden Villas Blvd. #27, Orlando, FL 32817 (FS)
Bryan Laseter, 7934-B Shoals Dr., Orlando, FL 32817 (FS)
Victor Laugier, 212 avenue Jean Jaures, FR 75019, Paris, France (CPS)
David Layne, 100 Anavista Ave., San Francisco, CA 94115 (ECNM)
Seng Siong Lee, 29 Jalan 3/149G, Taman Sri Endah, Kuala Lumpur 57000, Malaysia
Mario A. Lemus Mendez, Calle 48 No. 15-92 Apt. 302, Bogota, Colombia (SBU)
Tobias Lentz, Viktoriastrasse 87, DE 52066, Aachen, Germany (AA)
Josue C. Lescano, unidad Vecinal de matute 39-H, La Victoria, Lima 13, Peru (OWI)
Oscar Andre Lie Foss, Tiurkroken 26, NO 2050, Jessheim, Norway
Lena Lienig, Ugleveien 6a, NO 4042, Hafrsfjord, Norway
Maria Linares, Calle 74 #6-11 Apto. 302, Bogota, Colombia (JU)
Olov Lindberg, Brandellsvag 8, SE 93133, Skelleftea, Sweden (ULP)
Johannes Lindemann, Eidelstedter Weg 9, DE 20255, Hamburg, Germany
Carlos Llorens, Universidad Politenica Valencia, Ctra Nazaret-Olivia SN, ES 46730, Grao Gandia, Spain
Erin Lockhart, 3886 Calibre Bend Ln. #809, Winter Park, FL 32792 (FS)
Gabe D. Long, 3416 Murphy Rd. #C-11, Nashville, TN 37203 (SAENA)
Emil Lorelius, Ankarskatav 84e, SE 94134, Pitea, Sweden (ULP)
Bob Lorentz, 2 Spinozalaan, NL 2273 XA, Voorburg, Netherlands (NES)
Christoph Lowis, Erdbergstr 101/22, AT 1030, Vienna, Austria (VI)
Elizabeth Luchenbill, 611 St. Johns Ct., Winter Park, FL 32792 (FS)
Oliver Ludecke, Turmstr 76, DE 10551, Berlin, Germany (DF)
Andrew Lux, 2841 Harrison, San Francisco, CA 94110 (SFU)
John Madden, 1812 Page St. #5, San Francisco, CA 94117 (SFU)
Zoran Maksimovic, Faculty of Dramatic Arts, Bulevar Umetnosti 20, HR 11000, Zagreb, Croatia (HRS)
Valdemar J. Maldonado, 12508 200th Ave. E., Sumner, WA 98390 (TAIS)
Carlos A. Manrique Alonso, Cra. 25 #142-60 Apto. 502, Bogota, Colombia (JU)
Brian Markman, 3733 Goldenrod Rd. #1109, Winter Park, FL 32792 (FS)
Michael A. Mavriokos, 418 Autumn Breeze Way, Winter Park, FL 32792 (FS)
Oscar E. Mazuera Escobar, Trans. 9c No. 130 b-81, Bogota, Colombia (SBU)
Sebastian Mazur, ul. Wyspianskiego 7, PL 80-434, Gdansk, Poland (TUG)
Kevin McCormick, 2039 New Stonecastle Terrace #111, Winter Park, FL 32792 (FS)
Kelly McCoy, 225 Brown Rd., Lot 46, Franklin, KY 42134
Mike McKenzie, 1701 Lee Rd. M431, Winter Park, FL 32789 (FS)
Anthony McMahon, Auchlinsky House, Burnfoot, Glendevon, Clackmannanshire, FK14 7JY, Scotland
Drew McNally, 338 Newtown Rd., Richboro, PA 18954 (PSU)
Matthew B. Meares, 211 Lenora St. #203, Seattle, WA 98121 (TAIS)
Volker Meitz, Bornholmer Str. 95, DE 10439, Berlin, Germany (BNS)
David Menke, Riglergasse 14/12a, AT 1180, Wien, Austria (VI)
Jessica Mercel, 2096 St. Clair Ave. West, Toronto, M6J 3W6, Ontario, Canada
Daniel Miller, 630 Hickory Club Dr., Antioch, TN 37013 (SAENA)
Jeremy Miller, 5029 SW Grayson, Seattle, WA 98116 (TAIS)
Jonathan Patton, 530 Byron Rd., Winter Park, FL 32792 (FS)
10,000 Journal technical articles, convention preprints, and conference papers at your fingertips
The Audio Engineering Society has published a 20-disk electronic library containing most
of the Journal technical articles, convention preprints, and conference papers published
by the AES since its inception through the year 2003. The approximately 10,000 papers
and articles are stored in PDF format, preserving the original documents to the highest
fidelity possible while permitting full-text and field searching. The library can be viewed on
Windows, Mac, and UNIX platforms.
You can purchase the entire 20-disk library or disk 1 alone. Disk 1 contains the program
and installation files that are linked to the PDF collections on the other 19 disks. For
reference and citation convenience, disk 1 also contains a full index of all documents
within the library, permitting you to retrieve titles, author names, original publication
name, publication date, page numbers, and abstract text without
ever having to swap disks.
AES 26th International Conference: Call for Papers

The AES 26th International Conference is designed to explore the history, hardware, and techniques of forensic investigation of
audio materials. The field has gone through significant advances with the advent of digital audio recording, signal processing, and
computer-assisted evaluation. The appropriate use of analog and digital processes provides the contemporary audio
engineer with powerful tools for quality audio investigation in support of the law enforcement, legal, archival, and restoration
communities.
The AES 26th Conference Committee invites the submission of research and technical papers. A proposed title, a 60-120 word abstract, and a 500-1000 word précis of the paper should be submitted via the Internet by January 31, 2005, to the AES 26th Committee at the following site: http://www.aes.org/26th_authors. Preference will be given to papers that combine a lecture with a listening experience.
A full day of tutorial studies will be held at the University of Colorado at Denver on July 7 to provide historical and practical perspective for the technical papers and workshops of the 26th Conference, which will be held on July 8 and 9. High-quality audio and video support will be provided for presentations and laboratory experiences. Authors may submit proposals for papers, workshops, and tutorials.
The author’s information, title, abstract, and précis should all be submitted online. The précis should describe the work performed,
methods employed, conclusion(s), and significance of the paper. Titles and abstracts should follow the guidelines in Information for
Authors at http://www.aes.org/journal/con_infoauth.html. You can visit this site for more information and complete instructions
for using the site anytime after November 9, 2004. Acceptance of papers will be determined by the 26th Conference review committee based on an assessment of the abstract and précis. Submission of a conference paper via the Internet by 2005 April 19 will be a condition for presentation at the conference.
SECTIONS CONTACTS
DIRECTORY
The following is the latest information we have available for our sections contacts. If you
wish to change the listing for your section, please mail, fax or e-mail the new information
to: Mary Ellen Ilich, AES Publications Office, Audio Engineering Society, Inc., 60 East
42nd Street, Suite 2520, New York, NY 10165-2520, USA. Telephone +1 212 661 8528,
ext. 23. Fax +1 212 661 7829. E-mail MEI@aes.org.
Updated information that is received by the first of the month will be published in the
next month’s Journal. Please help us to keep this information accurate and timely.
EASTERN REGION, USA/CANADA
Vice President: Jim Anderson, 12 Garfield Place, Brooklyn, NY 11215. Tel. +1 718 369 7633. Fax +1 718 669 7631. E-mail vp_eastern_usa@aes.org

UNITED STATES OF AMERICA

CONNECTICUT
University of Hartford Section (Student). Timothy Britt, Faculty Advisor. AES Student Section, University of Hartford, Ward College of Technology, 200 Bloomfield Ave., West Hartford, CT 06117. Tel. +1 860 768 5358. Fax +1 860 768 5074. E-mail u_hartford_section@aes.org

FLORIDA
Full Sail Real World Education Section (Student). Bill Smith, Faculty Advisor. AES Student Section, Full Sail Real World Education, 3300 University Blvd., Suite 160, Winter Park, FL 32792. Tel. +1 800 679 0100. E-mail full_sail@aes.org
University of Miami Section (Student). Ken Pohlmann, Faculty Advisor. AES Student Section, University of Miami, School of Music, P.O. Box 248165, Coral Gables, FL 33124-7610. Tel. +1 305 284 6252. Fax +1 305 284 4448. E-mail miami_section@aes.org

GEORGIA
Atlanta Section. Robert Mason, 2712 Leslie Dr., Atlanta, GA 30345. Tel./Fax +1 770 908 1833. E-mail atlanta_section@aes.org

MARYLAND
Peabody Institute of Johns Hopkins University Section (Student). Neil Shade, Faculty Advisor. AES Student Section, Peabody Institute of Johns Hopkins University, Recording Arts & Science Dept., 2nd Floor Conservatory Bldg., 1 E. Mount Vernon Place, Baltimore, MD 21202. Tel. +1 410 659 8100 ext. 1226. E-mail peabody@aes.org

MASSACHUSETTS
Berklee College of Music Section (Student). Eric Reuter, Faculty Advisor. Berklee College of Music, Audio Engineering Society, c/o Student Activities, 1140 Boylston St., Box 82, Boston, MA 02215. Tel. +1 617 747 8251. Fax +1 617 747 2179. E-mail berklee_section@aes.org
Boston Section. Matthew Girard. Tel. +1 781 883 1248. E-mail boston_section@aes.org
University of Massachusetts–Lowell Section (Student). John Shirley, Faculty Advisor. AES Student Chapter, University of Massachusetts–Lowell, Dept. of Music, 35 Wilder St., Ste. 3, Lowell, MA 01854-3083. Tel. +1 978 934 3886. Fax +1 978 934 3034. E-mail umass_lowell_section@aes.org
Worcester Polytechnic Institute Section (Student). William Michalson, Faculty Advisor. AES Student Section, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, MA 01609. Tel. +1 508 831 5766. E-mail wpi@aes.org

NEW JERSEY
William Paterson University Section (Student). David Kerzner, Faculty Advisor. AES Student Section, William Paterson University, 300 Pompton Rd., Wayne, NJ 07470-2103. Tel. +1 973 720 3198. Fax +1 973 720 2217. E-mail wpu_section@aes.org

NEW YORK
Fredonia Section (Student). Bernd Gottinger, Faculty Advisor. AES Student Section, SUNY–Fredonia, 1146 Mason Hall, Fredonia, NY 14063. Tel. +1 716 673 4634. Fax +1 716 673 3154. E-mail fredonia_section@aes.org
Institute of Audio Research Section (Student). Noel Smith, Faculty Advisor. AES Student Section, Institute of Audio Research, 64 University Pl., New York, NY 10003. Tel. +1 212 677 7580. Fax +1 212 677 6549. E-mail iar_section@aes.org
New York Section. Bill Siegmund, Digital Island Studios, 71 West 23rd St., Suite 504, New York, NY 10010. Tel. +1 212 243 9753. E-mail new_york@aes.org
New York University Section (Student). Robert Rowe, Faculty Advisor. Steinhardt School of Education, 35 West 4th St., 777G, New York, NY 10012. Tel. +1 212 998 5435. E-mail nyu@aes.org

NORTH CAROLINA
Appalachian State University Section (Student). Michael S. Fleming, Faculty Advisor. AES Student Section, Appalachian State University, Hayes School of Music, 813 Rivers St., Boone, NC 28608. Home Tel. +1 828 263 0454. Bus. Tel. +1 828 262 7503. E-mail appalachian@aes.org
University of North Carolina at Asheville Section (Student). Wayne J. Kirby, Faculty Advisor. AES Student Section, University of North Carolina at Asheville, Dept. of Music, One University Heights, Asheville, NC 28804. Tel. +1 828 251 6432. Fax +1 828 253 4573. E-mail north_carolina@aes.org

PENNSYLVANIA
Carnegie Mellon University Section (Student). Thomas Sullivan, Faculty Advisor. AES Student Section, Carnegie Mellon University, University Center Box 122, Pittsburgh, PA 15213. Tel. +1 412 268 3351. E-mail carnegie_mellon@aes.org
Duquesne University Section (Student). Francisco Rodriguez, Faculty Advisor. AES Student Section, Duquesne University
School of Music, 600 Forbes Ave., Pittsburgh, PA 15282. Tel. +1 412 434 1630. Fax +1 412 396 5479. E-mail duquesne@aes.org
Pennsylvania State University Section (Student). Dan Valente. AES Penn State Student Chapter, Graduate Program in Acoustics, Pennsylvania State University, P.O. Box 30, State College, PA 16803. Tel. +1 814 865 2859. Cell +1 814 360 83399. E-mail penn_state_section@aes.org
Philadelphia Section. Rebecca Mercuri, P.O. Box 1166, Philadelphia, PA 19105. Tel. +1 215 327 7105. E-mail philly@aes.org

VIRGINIA
Hampton University Section (Student). Bob Ransom, Faculty Advisor. AES Student Section, Hampton University, Dept. of Music, 63 Litchfield Close, Hampton, VA 23668. Office Tel. +1 757 727 5658, +1 757 727 5404. Home Tel. +1 757 826 0092. Fax +1 757 727 5084. E-mail hampton_u@aes.org

WASHINGTON, DC
American University Section (Student). Rebecca Stone-Gordon, Faculty Advisor. AES Student Section, American University, 4400 Massachusetts Ave., N.W., Washington, DC 20016. Tel. +1 202 885 3242. E-mail american_u_section@aes.org
District of Columbia Section. Fred G. Geil, Sound Engineering Company, 1408 Harmony Lane, Annapolis, MD 21401. Tel. +1 410 260 5924. Fax +1 410 260 5430. E-mail dc_section@aes.org

CANADA
McGill University Section (Student). William L. Martens and Martha De Francisco, Faculty Advisors. AES Student Section, McGill University, Sound Recording Studios, Strathcona Music Bldg., 555 Sherbrooke St. W., Montreal, Quebec H3A 1E3, Canada. Tel. +1 514 398 4535 ext. 0454. E-mail mcgill_u_section@aes.org
Toronto Section. Earl McCluskie, E32-223 Pioneer Dr., Kitchener, Ontario N2P 1L9, Canada. Tel. +1 519 894 5308. Fax +1 416 364 1310. E-mail toronto@aes.org

CENTRAL REGION, USA/CANADA
Vice President: Frank Wells, 2130 Creekwalk Drive, Murfreesboro, TN. Tel. +1 615 848 1769. Fax +1 615 848 1108. E-mail vp_central_usa@aes.org

UNITED STATES OF AMERICA

ARKANSAS
University of Arkansas at Pine Bluff Section (Student). Robert Elliott, Faculty Advisor. AES Student Section, Music Dept., Univ. of Arkansas at Pine Bluff, 1200 N. University Drive, Pine Bluff, AR 71601. Tel. +1 870 575 8916. Fax +1 870 543 8108. E-mail pinebluff@aes.org

ILLINOIS
Chicago Section. Jeff Segota, 2955 No. Halsted #3, Chicago, IL 60657. E-mail chicago_section@aes.org
Columbia College Section (Student). Dominique J. Chéenne, Faculty Advisor. AES Student Section, 676 N. LaSalle, Ste. 300, Chicago, IL 60610. Tel. +1 312 344 7802. Fax +1 312 482 9083. E-mail columbia_section@aes.org
University of Illinois at Urbana-Champaign Section (Student). Michael Peterson. AES Student Section, University of Illinois, Urbana-Champaign, Urbana, IL 61801. Tel. +1 217 384 5242. E-mail urbana_section@aes.org

INDIANA
Ball State University Section (Student). Michael Pounds, Faculty Advisor. AES Student Section, Ball State University, 2520 W. Bethel Ave., Muncie, IN 47306. Tel. +1 765 285 5537. Fax +1 765 285 8768. E-mail ball_state_section@aes.org
Central Indiana Section. James Latta, Sound Around, 6349 Warren Ln., Brownsburg, IN 46112. Tel. +1 317 852 8379. Fax +1 317 858 8105. E-mail central_indiana_section@aes.org

KANSAS
Kansas City Section. Jim Mitchell, Custom Distribution Limited, 12301 Riggs Rd., Overland Park, KS 66209. Tel. +1 913 661 0131. Fax +1 913 663 5662. E-mail kansas_city_section@aes.org

LOUISIANA
New Orleans Section. Joseph Doherty, Factory Masters, 4611 Magazine St., New Orleans, LA 70115. Tel. +1 504 891 4424. Cell +1 504 669 4571. Fax +1 504 891 9262. E-mail new_orleans@aes.org

MICHIGAN
Detroit Section. David Carlstrom, DaimlerChrysler. Tel. +1 313 493 4035. E-mail detroit@aes.org
Michigan Technological University Section (Student). Greg Piper. AES Student Section, Michigan Technological University, 121 EERC Building, 1400 Townsend Dr., Houghton, MI 49931. Tel. +1 906 482 3581. E-mail michigan_tech@aes.org
University of Michigan Section (Student). Jason Corey, Faculty Advisor. University of Michigan School of Music, 1100 Baits Drive, Ann Arbor, MI 48109. E-mail univ_michigan_section@aes.org
West Michigan Section. Carl Hordyk, MET Studios, Calvin College, 3201 Burton S.E., Grand Rapids, MI 49546. Tel. +1 616 957 6279. Fax +1 616 957 6469. E-mail west_mich_section@aes.org

MINNESOTA
Music Tech College Section (Student). Michael McKern, Faculty Advisor. AES Student Section, Music Tech College, 19 Exchange Street East, Saint Paul, MN 55101. Tel. +1 651 291 0177. Fax +1 651 291 0366. E-mail musictech_student@aes.org
Ridgewater College, Hutchinson Campus Section (Student). Dave Igl, Faculty Advisor. AES Student Section, Ridgewater College, Hutchinson Campus, 2 Century Ave. S.E., Hutchinson, MN 55350. E-mail ridgewater@aes.org
Upper Midwest Section. Greg Reierson, Rare Form Mastering, 4624 34th Avenue South, Minneapolis, MN 55406. Tel. +1 612 327 8750. E-mail upper_midwest_section@aes.org

MISSOURI
St. Louis Section. John Nolan, Jr., 693 Green Forest Dr., Fenton, MO 63026. Tel./Fax +1 636 343 4765. E-mail st_louis_section@aes.org
Webster University Section (Student). Gary Gottleib, Faculty Advisor. Webster University, 470 E. Lockwood Ave., Webster Groves, MO 63119. Tel. +1 961 2660 x7962. E-mail webster_st_louis@aes.org
NEBRASKA
Heartland Section. Anthony D. Beardslee, Northeast Community College, P.O. Box 469, Norfolk, NE 68702. Tel. +1 402 844 7365. Fax +1 209 254 8282. E-mail heartland_section@aes.org

OHIO
Cincinnati Section. Dan Scherbarth, Digital Groove Productions, 5392 Conifer Dr., Mason, OH 45040. Tel. +1 513 325 5329. E-mail cincinnati@aes.org
Ohio University Section (Student). Erin M. Dawes. AES Student Section, Ohio University, RTVC Bldg., 9 S. College St., Athens, OH 45701-2979. Home Tel. +1 740 597 6608. E-mail ohio@aes.org
University of Cincinnati Section (Student). Thomas A. Haines, Faculty Advisor. AES Student Section, University of Cincinnati, College-Conservatory of Music, M.L. 0003, Cincinnati, OH 45221. Tel. +1 513 556 9497. Fax +1 513 556 0202. E-mail univ_cincinnati@aes.org

TENNESSEE
Belmont University Section (Student). Wesley Bulla, Faculty Advisor. AES Student Section, Belmont University, Nashville, TN 37212. E-mail belmont_section@aes.org
Middle Tennessee State University Section (Student). Phil Shullo, Faculty Advisor. AES Student Section, Middle Tennessee State University, 301 E. Main St., Box 21, Murfreesboro, TN 37132. Tel. +1 615 898 2553. E-mail mtsu_section@aes.org
Nashville Section. Tom Edwards, MTV Networks, 330 Commerce St., Nashville, TN 37201. Tel. +1 615 335 8520. Fax +1 615 335 8625. E-mail nashville@aes.org
SAE Nashville Section (Student). Larry Sterling, Faculty Advisor. AES Student Section, 7 Music Circle N., Nashville, TN 37203. Tel. +1 615 244 5848. Fax +1 615 244 3192. E-mail saenash_student@aes.org

TEXAS
Texas State University—San Marcos Section (Student). Mark C. Erickson, Faculty Advisor. AES Student Section, Southwest Texas State University, 224 N. Guadalupe St., San Marcos, TX 78666. Tel. +1 512 245 8451. Fax +1 512 396 1169. E-mail tsu_sm@aes.org

WESTERN REGION, USA/CANADA
Vice President: Bob Moses, Island Digital Media Group, LLC, 26510 Vashon Highway S.W., Vashon, WA 98070. Tel. +1 206 463 6667. Fax +1 810 454 5349. E-mail vp_western_usa@aes.org

UNITED STATES OF AMERICA

ARIZONA
Conservatory of The Recording Arts and Sciences Section (Student). Glenn O'Hara, Faculty Advisor. AES Student Section, Conservatory of The Recording Arts and Sciences, 2300 E. Broadway Rd., Tempe, AZ 85282. Tel. +1 480 858 9400, 800 562 6383 (toll-free). Fax +1 480 829 1332. E-mail conservatory_RAS@aes.org

CALIFORNIA
American River College Section (Student). Eric Chun, Faculty Advisor. AES Student Section, American River College Chapter, 4700 College Oak Dr., Sacramento, CA 95841. Tel. +1 916 484 8420. E-mail american_river@aes.org
Cal Poly San Luis Obispo State University Section (Student). Bryan J. Mealy, Faculty Advisor. AES Student Section, California Polytechnic State University, Dept. of Electrical Engineering, San Luis Obispo, CA 93407. Tel. +1 805 756 2300. Fax +1 805 756 1458. E-mail san_luis_obispo_section@aes.org
California State University–Chico Section (Student). Keith Seppanen, Faculty Advisor. AES Student Section, California State University–Chico, 400 W. 1st St., Chico, CA 95929-0805. Tel. +1 530 898 5500. E-mail chico@aes.org
Citrus College Section (Student). Stephen O'Hara, Faculty Advisor. AES Student Section, Citrus College, Recording Arts, 1000 W. Foothill Blvd., Glendora, CA 91741-1899. Fax +1 626 852 8063
Cogswell Polytechnical College Section (Student). Tim Duncan, Faculty Advisor. AES Student Section, Cogswell Polytechnical College, Music Engineering Technology, 1175 Bordeaux Dr., Sunnyvale, CA 94089. Tel. +1 408 541 0100, ext. 130. Fax +1 408 747 0764. E-mail cogswell_section@aes.org
Expression Center for New Media Section (Student). John Scanlon, Faculty Advisor, Director of Sound Arts. AES Student Section, Ex'pression Center for New Media, 6601 Shellmound St., Emeryville, CA 94608. Tel. +1 510 654 2934. Fax +1 510 658 3414. E-mail expression_center_section@aes.org
Long Beach City College Section (Student). Nancy Allen, Faculty Advisor. AES Student Section, Long Beach City College, 4901 E. Carson St., Long Beach, CA 90808. Tel. +1 562 938 4312. Fax +1 562 938 4409. E-mail long_beach@aes.org
Los Angeles Section. Geoff Christopherson, JBL Professional, 5534 Encino Ave. #214, Encino, CA 91316. Tel. +1 818 830 8775. E-mail la_section@aes.org
San Diego Section. J. Russell Lemon, 2031 Ladera Ct., Carlsbad, CA 92009-8521. Home Tel. +1 760 753 2949. E-mail san_diego_section@aes.org
San Diego State University Section (Student). John Kennedy, Faculty Advisor. AES Student Section, San Diego State University, Electrical & Computer Engineering Dept., 5500 Campanile Dr., San Diego, CA 92182-1309. Tel. +1 619 594 1053. Fax +1 619 594 2654. E-mail sdsu@aes.org
San Francisco Section. Conrad Cooke, 1046 Nilda Ave., Mountain View, CA 94040. Office Tel. +1 650 846 1132. Home Tel. +1 650 321 0713. E-mail san_francisco@aes.org
San Francisco State University Section (Student). John Barsotti, Faculty Advisor. AES Student Section, San Francisco State University, Broadcast and Electronic Communication Arts Dept., 1600 Holloway Ave., San Francisco, CA 94132. Tel. +1 415 338 1507. E-mail sfsu_section@aes.org
Stanford University Section (Student). Jay Kadis, Faculty Advisor. Stanford AES Student Section, Stanford University, CCRMA/Dept. of Music, Stanford, CA 94305-8180. Tel. +1 650 723 4971. Fax +1 650 723 8468. E-mail stanford@aes.org
University of Southern California Section (Student). Kenneth Lopez, Faculty Advisor. AES Student Section, University of Southern California, 840 W. 34th St., Los Angeles, CA 90089-0851. Tel. +1 213 740 3224. Fax +1 213 740 3217. E-mail usc@aes.org

COLORADO
Colorado Section. Roy Pritts
2873 So. Vaughn Way, Aurora, CO 80014. Tel. +1 303 369 9514. E-mail colorado_section@aes.org
University of Colorado at Denver Section (Student). Roy Pritts, Faculty Advisor. AES Student Section, University of Colorado at Denver, Dept. of Professional Studies, Campus Box 162, P.O. Box 173364, Denver, CO 80217-3364. Tel. +1 303 556 2795. Fax +1 303 556 2335. E-mail cu_denver_section@aes.org

OREGON
Portland Section. Tony Dal Molin, Audio Precision, Inc., 5750 S.W. Arctic Dr., Portland, OR 97005. Tel. +1 503 627 0832. Fax +1 503 641 8906. E-mail portland_section@aes.org

UTAH
Brigham Young University Section (Student). Timothy Leishman, Faculty Advisor. BYU-AES Student Section, Department of Physics and Astronomy, Brigham Young University, Provo, UT 84602. Tel. +1 801 422 4612. E-mail brigham_young_section@aes.org
Utah Section. Deward Timothy, c/o Poll Sound, 4026 S. Main, Salt Lake City, UT 84107. Tel. +1 801 261 2500. Fax +1 801 262 7379. E-mail utah_section@aes.org

WASHINGTON
Pacific Northwest Section. Gary Louie, University of Washington School of Music, P.O. Box 353450, 4522 Meridian Ave. N., #201, Seattle, WA 98103. Tel. +1 206 543 1218. Fax +1 206 685 9499. E-mail pacific_nw_section@aes.org
The Art Institute of Seattle Section (Student). David G. Christensen, Faculty Advisor. AES Student Section, The Art Institute of Seattle, 2323 Elliott Ave., Seattle, WA 98121. Tel. +1 206 448 0900. E-mail art_institute_seattle_section@aes.org

CANADA
Alberta Section. Joshua Tidsbury, AES Alberta Section, 716 Lake Ontario Dr. S.E., Calgary, Alberta T2J 3J8, Canada. Tel. +1 403 803 4522. E-mail alberta@aes.org
Vancouver Section. David Linder, 93.7 JRfm/600am Radio, A Division of the Jim Pattison Broadcast Group, 300-1401 West 8th Ave., Vancouver, BC V6H 1C9, Canada. E-mail vancouver_section@aes.org
Vancouver Student Section. Gregg Gorrie, Faculty Advisor. AES Greater Vancouver Student Section, Centre for Digital Imaging and Sound, 3264 Beta Ave., Burnaby, B.C. V5G 4K4, Canada. Tel. +1 604 298 5400. E-mail vancouver_student@aes.org

NORTHERN REGION, EUROPE
Vice President: Søren Bech, Bang & Olufsen a/s, CoreTech, Peter Bangs Vej 15, DK-7600 Struer, Denmark. Tel. +45 96 84 49 62. Fax +45 97 85 59 50. E-mail vp_northern_europe@aes.org

BELGIUM
Belgian Section. Hermann A. O. Wilms, AES Europe Region Office, Zevenbunderslaan 142, #9, BE-1190 Vorst-Brussels, Belgium. Tel. +32 2 345 7971. Fax +32 2 345 3419. E-mail belgian_section@aes.org

DENMARK
Danish Section. Preben Kvist, c/o Stenbaek, Mozartsvej 11, 1 TV, DK-2450 Copenhagen SV, Denmark. Tel. +45 6133 4588. E-mail denmark_section@aes.org
Danish Student Section. Preben Kvist, c/o Stenbaek, Mozartsvej 11, 1 TV, DK-2450 Copenhagen SV, Denmark. Tel. +45 6133 4588. E-mail copenhagen_section@aes.org

FINLAND
Finnish Section. Kalle Koivuniemi, Nokia Research Center, P.O. Box 100, FI-33721 Tampere, Finland. Tel. +358 7180 35452. Fax +358 7180 35897. E-mail finnish_section@aes.org

NETHERLANDS
Netherlands Section. Rinus Boone, Voorweg 105A, NL-2715 NG Zoetermeer, Netherlands. Tel. +31 15 278 14 71, +31 62 127 36 51. Fax +31 79 352 10 08. E-mail netherlands_section@aes.org
Netherlands Student Section. Maurik van den Steen. AES Student Section, Prins Willemstraat 26, 2584 HV Den Haag, Netherlands. Tel. +31 6 45702051. E-mail netherlands_student_section@aes.org

NORWAY
Norwegian Section. Jan Erik Jensen, Nøklesvingen 74, NO-0689 Oslo, Norway. Office Tel. +47 22 24 07 52. Home Tel. +47 22 26 36 13. Fax +47 22 24 28 06. E-mail norway_section@aes.org

RUSSIA
All-Russian State Institute of Cinematography Section (Student). Leonid Sheetov, Faculty Sponsor. AES Student Section, All-Russian State Institute of Cinematography (VGIK), W. Pieck St. 3, RU-129226 Moscow, Russia. Tel. +7 095 181 3868. Fax +7 095 187 7174. E-mail all_russian_state@aes.org
Moscow Section. Michael Lannie, Research Institute for Television and Radio, Acoustic Laboratory, 12-79 Chernomorsky bulvar, RU-113452 Moscow, Russia. Tel. +7 095 2502161, +7 095 1929011. Fax +7 095 9430006. E-mail moscow_section@aes.org
Russian Academy of Music Student Section. Igor Petrovich Veprintsev, Faculty Advisor. Sound Engineering Division, 30/36 Povarskaya Street, RU-121069 Moscow, Russia. Tel. +7 095 291 1532. E-mail russian_academy_section@aes.org
St. Petersburg Section. Irina A. Aldoshina, St. Petersburg University of Telecommunications, Gangutskaya St. 16, #31, RU-191187 St. Petersburg, Russia. Tel. +7 812 272 4405. Fax +7 812 316 1559. E-mail st_petersburg_section@aes.org
St. Petersburg Student Section. Natalia V. Tyurina, Faculty Advisor. Prosvescheniya pr., 41, 185, RU-194291 St. Petersburg, Russia. Tel. +7 812 595 1730. Fax +7 812 316 1559. E-mail st_petersburg_student_section@aes.org

SWEDEN
Swedish Section. Ingemar Ohlsson, Audio Data Lab, Katarinavägen 22, SE-116 45 Stockholm, Sweden. Tel. +46 8 644 58 65. Fax +46 8 641 67 91. E-mail sweden@aes.org
University of Luleå-Piteå Section (Student). Lars Hallberg, Faculty Sponsor. AES Student Section, University of Luleå-Piteå, School of Music, Box 744, S-94134 Piteå, Sweden. Tel. +46 911 726 27. Fax +46 911 727 10. E-mail lulea_pitea@aes.org
UNITED KINGDOM
British Section. Heather Lane, Audio Engineering Society, P.O. Box 645, Slough SL1 8BJ, United Kingdom. Tel. +44 1628 663725. Fax +44 1628 667002. E-mail uk@aes.org

CENTRAL REGION, EUROPE
Vice President: Bozena Kostek, Multimedia Systems Department, Gdansk University of Technology, Ul. Narutowicza 11/12, 80-952 Gdansk, Poland. Tel. +48 58 347 2717. Fax +48 58 347 1114. E-mail vp_central_europe@aes.org

AUSTRIA
Austrian Section. Franz Lechleitner, Lainergasse 7-19/2/1, AT-1230 Vienna, Austria. Office Tel. +43 1 4277 29602. Fax +43 1 4277 9296. E-mail austrian_section@aes.org
Graz Section (Student). Robert Höldrich, Faculty Sponsor. Institut für Elektronische Musik und Akustik, Inffeldgasse 10, AT-8010 Graz, Austria. Tel. +43 316 389 3172. Fax +43 316 389 3171. E-mail graz_student_section@aes.org
Vienna Section (Student). Jürg Jecklin, Faculty Sponsor. Vienna Student Section, Universität für Musik und Darstellende Kunst Wien, Institut für Elektroakustik und Experimentelle Musik, Rienösslgasse 12, AT-1040 Vienna, Austria. Tel. +43 1 587 3478. Fax +43 1 587 3478 20. E-mail vienna_student_section@aes.org

CZECH REPUBLIC
Czech Section. Jiri Ocenasek, Dejvicka 36, CZ-160 00 Prague 6, Czech Republic. Home Tel. +420 2 24324556. E-mail czech_section@aes.org
Czech Republic Student Section. Libor Husník, Faculty Advisor. AES Student Section, Czech Technical Univ. at Prague, Technická 2, CZ-116 27 Prague 6, Czech Republic. Tel. +420 2 2435 2115. E-mail czech_student_section@aes.org

GERMANY
Aachen Section (Student). Michael Vorländer, Faculty Advisor. Institut für Technische Akustik, RWTH Aachen, Templergraben 55, D-52065 Aachen, Germany. Tel. +49 241 807985. Fax +49 241 8888214. E-mail aachen_section@aes.org
Berlin Section (Student). Bernhard Güttler, Zionskirchstrasse 14, DE-10119 Berlin, Germany. Tel. +49 30 4404 72 19. Fax +49 30 4405 39 03. E-mail berlin@aes.org
Central German Section. Ernst-Joachim Völker, Institut für Akustik und Bauphysik, Kiesweg 22-24, DE-61440 Oberursel, Germany. Tel. +49 6171 75031. Fax +49 6171 85483. E-mail c_german_section@aes.org
Darmstadt Section (Student). G. M. Sessler, Faculty Sponsor. AES Student Section, Technical University of Darmstadt, Institut für Übertragungstechnik, Merkstr. 25, DE-64283 Darmstadt, Germany. Tel. +49 6151 162869. E-mail darmstadt_student@aes.org
Detmold Section (Student). Andreas Meyer, Faculty Sponsor. AES Student Section, c/o Erich Thienhaus Institut, Tonmeisterausbildung, Hochschule für Musik Detmold, Neustadt 22, DE-32756 Detmold, Germany. Tel./Fax +49 5231 975639. E-mail detmold@aes.org
Düsseldorf Section (Student). Corinna A. Bock. AES Student Section, Juelicher Str. 80, DE-40477 Düsseldorf, Germany. Tel. +49 211 484 6665. E-mail duesseldorf_student_section@aes.org
Ilmenau Section (Student). Karlheinz Brandenburg, Faculty Advisor. AES Student Section, Fraunhofer Institute for Digital Media Technology IDMT, Langewiesener Str. 22, DE-98693 Ilmenau, Germany. Tel. +49 3677 69 4340. E-mail ilmenau_student_section@aes.org
North German Section. Reinhard O. Sahr, Eickhopskamp 3, DE-30938 Burgwedel, Germany. Tel. +49 5139 4978. Fax +49 5139 5977. E-mail n_german_section@aes.org
South German Section. Gerhard E. Picklapp, Landshuter Allee 162, DE-80637 Munich, Germany. Tel. +49 89 15 16 17. Fax +49 89 157 10 31. E-mail s_german_section@aes.org

HUNGARY
Hungarian Section. István Matók, Rona u. 102. II. 10, HU-1149 Budapest, Hungary. Home Tel. +36 30 900 1802. Fax +36 1 383 24 81. E-mail hungarian_section@aes.org

LITHUANIA
Lithuanian Section. Vytautas J. Stauskis, Vilnius Gediminas Technical University, Traku 1/26, Room 112, LT-2001 Vilnius, Lithuania. Tel. +370 5 262 91 78. Fax +370 5 261 91 44. E-mail lithuania@aes.org

POLAND
Polish Section. Andrzej Dobrucki, Wroclaw University of Technology, Institute of Telecommunication and Acoustics, Wybrzeze Wyspianskiego 27, PL-50-370 Wroclaw, Poland. Tel. +48 71 320 3068. Fax +48 71 320 3189. E-mail poland_section@aes.org
Technical University of Gdansk Section (Student). Pawel Zwan. AES Student Section, Technical University of Gdansk, Sound Engineering Dept., ul. Narutowicza 11/12, PL-80 952 Gdansk, Poland. Home Tel. +48 58 347 23 98. Office Tel. +48 58 347 13 01. Fax +48 58 347 11 14. E-mail gdansk_u@aes.org
Wroclaw University of Technology Section (Student). Andrzej B. Dobrucki, Faculty Sponsor. AES Student Section, Institute of Telecommunications and Acoustics, Wroclaw Univ. Technology, Wybrzeze Wyspianskiego 27, PL-503 70 Wroclaw, Poland. Tel. +48 71 320 30 68. Fax +48 71 320 31 89. E-mail wroclaw@aes.org

REPUBLIC OF BELARUS
Belarus Section. Valery Shalatonin, Belarusian State University of Informatics and Radioelectronics, vul. Petrusya Brouki 6, BY-220027 Minsk, Republic of Belarus. Tel. +375 17 239 80 95. Fax +375 17 231 09 14. E-mail belarusian_section@aes.org

SLOVAK REPUBLIC
Slovakian Republic Section. Richard Varkonda, Centron Slovakia Ltd., Podhaj 107, SK-48103 Bratislava, Slovak Republic. Tel. +421 7 781 128, 7 788 437. Fax +421 7 762 955. E-mail slovakian_rep@aes.org

SWITZERLAND
Swiss Section. Joël Godel, AES Swiss Section, Sonnmattweg 6, CH-5000 Aarau, Switzerland. Tel./Fax +41 26 670 2033. E-mail swiss_section@aes.org
UKRAINE
Ukrainian Section. Dimitri Danyuk, 32-38 Artyoma St., Apt. 38, UA-04053 Kiev, Ukraine. E-mail ukrainian@aes.org

SOUTHERN REGION, EUROPE
Vice President: Ivan Stamac, Ivlje 4, HR-10040 Zagreb, Croatia. Tel. +385 1 482 23 61. Tel./Fax +385 1 457 44 03. E-mail vp_southern_europe@aes.org

BOSNIA-HERZEGOVINA
Bosnia-Herzegovina Section. Jozo Talajic, Bulevar Mese Selimovica 12, BA-71000 Sarajevo, Bosnia-Herzegovina. Tel. +387 33 455 160. Fax +387 33 455 163. E-mail bosnia_herzegovina_section@aes.org

BULGARIA
Bulgarian Section. Konstantin D. Kounov, Bulgarian National Radio, Technical Dept., 4 Dragan Tzankov Blvd., BG-1040 Sofia, Bulgaria. Tel. +359 2 65 93 37, +359 2 9336601. Fax +359 2 963 1003. E-mail bulgarian_section@aes.org

CROATIA
Croatian Section. Zoran Vertlberg, Hrvatski Radio, Prisavlje 3, HR-10000 Zagreb, Croatia. Tel. +385 1 634 27 23. Fax +385 1 634 30 65 or 1 611 58 29. E-mail croatian_section@aes.org
Croatian Student Section. Marija Salovarda, Tatjane Marinic 2, HR-10430 Samobor, Croatia. Tel. +385 1 3363 103. E-mail croatian_student_section@aes.org

FRANCE
Conservatoire de Paris Section (Student). Daniel Zalay, Faculty Advisor. Conservatoire de Paris, Department Son, 209 ave Jean Jaures, FR-75019 Paris, France. Tel. +33 40 40 4614. Fax +33 40 40 4768. E-mail paris_student_section@aes.org
French Section. Michael Williams, Ile du Moulin, 62 bis Quai de l'Artois, FR-94170 Le Perreux sur Marne, France. Tel. +33 1 48 81 46 32. Fax +33 1 47 06 06 48. E-mail french_section@aes.org
Louis Lumière Section (Student). Julien Basseres, 4 rue d'Issy, FR-92170 Vanves, France. Tel. +33 06 60 12 44 92. E-mail louis_lumiere_section@aes.org

GREECE
Greek Section. Vassilis Tsakiris, Crystal Audio, Aiantos 3a, Vrillissia, GR-15235 Athens, Greece. Tel. +30 2 10 6134767. Fax +30 2 10 6137010. E-mail greek_section@aes.org

ISRAEL
Israel Section. Ben Bernfeld Jr., H. M. Acustica Ltd., 20G/5 Mashabim St., IL-45201 Hod Hasharon, Israel. Tel./Fax +972 9 7444099. E-mail israel_section@aes.org

ITALY
Italian Section. Carlo Perretta, AES Italian Section, Piazza Cantore 10, IT-20134 Milan, Italy. Tel. +39 338 9108768. Fax +39 02 58440640. E-mail italian@aes.org
Italian Student Section. Franco Grossi, Faculty Advisor. AES Student Section, Viale San Daniele 29, IT-33100 Udine, Italy. Tel. +39 0432227527. E-mail italian_student@aes.org

PORTUGAL
Portugal Section. Rui Miguel Avelans Coelho, R. Paulo Renato 1, 2A, PT-2745-147 Linda-a-Velha, Portugal. Tel. +351 214145827. E-mail portugal@aes.org

ROMANIA
Romanian Section. Marcia Taiachin, Radio Romania, 60-62 Grl. Berthelot St., RO-79756 Bucharest, Romania. Tel. +40 1 303 12 07. Fax +40 1 222 69 19. E-mail romanian_section@aes.org

SERBIA AND MONTENEGRO
Serbia and Montenegro Section. Tomislav Stanojevic, Sava centre, M. Popovica 9, YU-11070 Belgrade, Yugoslavia. Tel. +381 11 311 1368. Fax +381 11 605 578. E-mail serbia_montenegro_section@aes.org

SLOVENIA
Slovenian Section. Tone Seliskar, RTV Slovenija, Kolodvorska 2, SI-1550 Ljubljana, Slovenia. Tel. +386 61 175 2708. Fax +386 61 175 2710. E-mail slovenian@aes.org

SPAIN
Spanish Section. Juan Recio Morillas, C/Florencia 14 3ºD, ES-28850 Torrejon de Ardoz (Madrid), Spain. Tel. +34 91 675 49 98. E-mail spanish@aes.org

TURKEY
Turkish Section. Sorgun Akkor, STD Gazeteciler Sitesi, Yazarlar Sok. 19/6, Esentepe 80300, Istanbul, Turkey. Tel. +90 212 2889825. Fax +90 212 2889831. E-mail aesturkey@aes.org

LATIN AMERICAN REGION
Vice President: Mercedes Onorato, Talcahuano 141, Buenos Aires, Argentina. Tel./Fax +5411 4 375 0116. E-mail vp_latin_american@aes.org

ARGENTINA
Argentina Section. German Olguin, Talcahuano 141, Buenos Aires, Argentina 1013. Tel./Fax +5411 4 375 0116. E-mail argentina_section@aes.org

BRAZIL
Brazil Section. José Carlos Giner, Rua Marechal Cantuária #18, Urca-Rio de Janeiro, RJ-2291-060, Brazil. Tel. +55 21 2244 6530. Fax +55 21 2244 7113. E-mail aesbrasil@aes.org

CHILE
Chile Section. Andres Schmidt, Hernan Cortes 2768, Ñuñoa, Santiago de Chile. Tel. +56 2 4249583. E-mail chile@aes.org

COLOMBIA
Colombia Section. Mercedes Onorato, Talcahuano 141, Buenos Aires, Argentina. Tel./Fax +5411 4 375 0116. E-mail colombia_section@aes.org
Javeriana University Section (Student). Silvana Medrano, Carrera 7 #40-62, Bogota, Colombia. Tel./Fax +57 1 320 8320. E-mail javeriana_section@aes.org
Los Andes University Section (Student). Jorge Oviedo Martinez, Transversal 44 #96-17, Bogota, Colombia. Tel./Fax +57 1 339 4949 ext. 2683. E-mail losandes@aes.org
San Buenaventura University Section (Student). Nicolas Villamizar, Transversal 23 #82-41, Apt. 703 Int. 1, Bogota, Colombia. Tel. +57 1 616 6593. Fax +57 1 622 3123. E-mail sanbuenaventura@aes.org
ECUADOR
Ecuador Section
Juan Manuel Aguillo
Av. La Prensa 4316 y Vaca de Castro
Quito, Ecuador
Tel./Fax +59 32 2598 889
E-mail ecuador_section@aes.org

I.A.V.Q. Section (Student)
Felipe Mardones
315 Carrion y Plaza
Quito, Ecuador
Tel./Fax +59 3 225 61221
E-mail iavq@aes.org

MEXICO
Mexican Section
Jorge Urbano
Cofre de Perote 132
Fracc. Los Pirules Tlalnepantla
Edo. de Mexico, C.P. 54040, Mexico
Tel./Fax +52 55 5240 1203
E-mail mexican_section@aes.org

PERU
Orson Welles Institute Section (Student)
Javier Antón
Av. Salaberry 3641, San Isidro
Lima, Peru
Tel. +51 1 264 1773
Fax +51 1 264 1878
E-mail orsonwelles@aes.org

Peru Section
Armando Puente De La Vega
Av. Salaberry 3641, San Isidro
Lima, Peru
Tel. +51 1 264 1773
Fax +51 1 264 1878
E-mail peru_section@aes.org

URUGUAY
Uruguay Section
César Lamschtein
Universidad ORT
Cuareim 1451
Montevideo, Uruguay
Tel. +59 1 902 1505
Fax +59 1 900 2952
E-mail uruguay@aes.org

VENEZUELA
Taller de Arte Sonoro, Caracas Section (Student)
Carmen Bell-Smythe de Leal, Faculty Advisor
AES Student Section
Taller de Arte Sonoro
Ave. Rio de Janeiro
Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail caracas_section@aes.org

Venezuela Section
Elmar Leal
Ave. Rio de Janeiro
Qta. Tres Pinos
Chuao, VE-1061 Caracas, Venezuela
Tel. +58 14 9292552
Tel./Fax +58 2 9937296
E-mail venezuela_section@aes.org

INTERNATIONAL REGION
Vice President:
Neville Thiele
10 Wycombe St.
Epping, NSW AU-2121, Australia
Tel. +61 2 9876 2407
Fax +61 2 9876 2749
E-mail vp_international@aes.org

AUSTRALIA
Adelaide Section
David Murphy
Krix Loudspeakers
14 Chapman Rd.
Hackham AU-5163, South Australia
Tel. +618 8 8384 3433
Fax +618 8 8384 3419
E-mail adelaidean_section@aes.org

Brisbane Section
David Spearritt
AES Brisbane Section
P.O. Box 642
Roma St. Post Office
Brisbane, Qld. AU-4003, Australia
Office Tel. +61 7 3364 6510
E-mail brisbane_section@aes.org

Melbourne Section
Graham J. Haynes
P.O. Box 5266
Wantirna South, Victoria AU-3152, Australia
Tel. +61 3 9887 3765
Fax +61 3 9887 1688
E-mail melbourne@aes.org

Sydney Section
Howard Jones
AES Sydney Section
P.O. Box 766
Crows Nest, NSW AU-2065, Australia
Tel. +61 2 9417 3200
Fax +61 2 9417 3714
E-mail sydney@aes.org

HONG KONG
Hong Kong Section
Goeffrey Stitt
HKAPA, School of Film and Television
1 Gloucester Rd.
Wanchai, Hong Kong
Tel. +852 2584 8664
Fax +852 2588 1303
E-mail hong_kong@aes.org

INDIA
India Section
Avinash Oak
Avisound
A-20, Deepanjali
Shahaji Raje Marg
Vile Parle East
Mumbai IN-400 057, India
Tel. +91 22 26827535
E-mail indian_section@aes.org

JAPAN
Japan Section
Katsuya (Vic) Goh
2-15-4 Tenjin-cho, Fujisawa-shi
Kanagawa-ken 252-0814, Japan
Tel./Fax +81 466 81 0681
E-mail aes_japan_section@aes.org

KOREA
Korea Section
Seong-Hoon Kang
Taejeon Health Science College
Dept. of Broadcasting Technology
77-3 Gayang-dong Dong-gu
Taejeon, Korea
Tel. +82 42 630 5990
Fax +82 42 628 1423
E-mail korea_section@aes.org

MALAYSIA
Malaysia Section
C. K. Ng
King Musical Industries Sdn Bhd
Lot 5, Jalan 13/2
MY-46200 Kuala Lumpur, Malaysia
Tel. +603 7956 1668
Fax +603 7955 4926
E-mail malaysia@aes.org

PHILIPPINES
Philippines Section
Dario (Dar) J. Quintos
125 Regalia Park Tower
P. Tuazon Blvd., Cubao
Quezon City, Philippines
Tel./Fax +63 2 4211790, +63 2 4211784
E-mail philippines_section@aes.org

SINGAPORE
Singapore Section
Kenneth J. Delbridge
480B Upper East Coast Rd.
Singapore 466518
Tel. +65 9875 0877
Fax +65 6220 0328
E-mail singapore@aes.org

STUDENT DELEGATE ASSEMBLY

NORTH/LATIN AMERICA REGIONS
Chair:
Marie Desmarteau
McGill University Section (AES)
72 Delaware Avenue
Ottawa K2P 0Z3, Ontario, Canada
Home Tel. +1 613 236 5411
Office Tel. +1 514 398 4535
E-mail tonmaestra@hotmail.com

Vice Chair:
Felice Santos-Martin
American River College (AES)
Tel. +1 916 802 2084
E-mail felicelazae@hotmail.com

EUROPE/INTERNATIONAL REGIONS
Chair:
Martin Berggren
European Student Section
Varvsgatan 35
Arvika, SE 67133, Sweden
Home Tel. +46 0570 12018
Office Tel. +46 0570 38500
E-mail martin.bergren@imh.se

Vice Chair:
Daniel Hojka
TU Graz
Moserhofgasse 34/28
AT 8010, Graz, Austria
Tel. +43 650 6471049
E-mail daniel.hojka@toningenieur.info
AES CONVENTIONS AND CONFERENCES
The latest details on the following events are posted on the AES Website: http://www.aes.org

117th Convention
San Francisco
Exhibit information:
Chris Plunkett/Donna Vivero
Telephone: +1 212 661 8528, ext. 30
Fax: +1 212 682 0477
Email: 117th_exhibits@aes.org
Call for papers: Vol. 52, No. 3, p. 319 (2004 March)
Call for workshop participants: Vol. 52, No. 5, p. 569 (2004 May)
Convention preview: Vol. 52, No. 7/8, pp. 828–859 (2004 July/August)

118th Convention
Exhibit information:
Thierry Bergmans
Email: 118th_exhibits@aes.org
Call for papers: Vol. 52, No. 10, p. 1111 (2004 October)

119th Convention
New York, NY, USA
Date: 2005 October 7–10
Location: Jacob K. Javits Convention Center, New York, NY, USA
Call for papers: This issue, p. 1200 (2004 October)
Exhibit information:
Chris Plunkett/Donna Vivero
Telephone: +1 212 661 8528, ext. 30
Fax: +1 212 682 0477
Email: 119th_exhibits@aes.org

Reports of recent AES conventions and conferences are now available online; go to www.aes.org/events/reports.

INFORMATION FOR AUTHORS

Presentation
Authors should submit a PDF for review by e-mail to Gerri Calamusa, Senior Editor, gmc@aes.org. If the manuscript is accepted for publication, the author will be asked to submit the original word-processing file (double-spaced for copyediting) and illustrations files.

Review
Manuscripts are reviewed anonymously by members of the review board. After the reviewers' analysis and recommendation to the editors, the author is advised of either acceptance or rejection. On the basis of the reviewers' comments, the editor may request that the author make certain revisions which will allow the paper to be accepted for publication.

Content
Technical articles should be informative and well organized. They should cite original work or review previous work, giving proper credit. Results of actual experiments or research should be included. The Journal cannot accept unsubstantiated or commercial statements.

Organization
An informative and self-contained abstract of about 60 words must be provided. The manuscript should develop the main point, beginning with an introduction and ending with a summary or conclusion. Pages should be numbered consecutively. Illustrations must have informative captions and must be referred to in the text.

References should be cited numerically in brackets in order of appearance in the text. Footnotes should be avoided, when possible, by making parenthetical remarks in the text.

Mathematical symbols, abbreviations, acronyms, etc., which may not be familiar to readers must be spelled out or defined the first time they are cited in the text.

Subheads are appropriate and should be inserted where necessary. Paragraph division numbers should be of the form 0 (only for the introduction), 1, 1.1, 1.1.1, 2, 2.1, 2.1.1, etc.

References should appear at the end of the manuscript after the text, in order of appearance. References to periodicals should include the authors' names, title of article, periodical title, volume, page numbers, year, and month of publication. Book references should contain the names of the authors, title of book, edition (if other than first), name and location of publisher, publication year, and page numbers. References to AES convention papers should be replaced with Journal publication citations if the convention paper has been published in the Journal.

Illustrations
Figure captions should be duplicated in the word-processing document following the references. Captions should be concise. All figures should be labeled with the author's name and figure number. All illustrations are printed in black and white. For more information about preparing digital art, go to http://dx.sheridan.com.

The size of illustrations when printed in the Journal is usually 82 mm (3.25 inches) wide, although 170 mm (6.75 inches) wide can be used if required. Letters on original illustrations (before reduction) must be large enough that the smallest letters are at least 1.5 mm (1/16 inch) high when the illustrations are reduced to one of the above widths. If possible, letters on all original illustrations should be the same size.

Units and Symbols
Metric units according to the International System of Units (SI) should be used. For more details, see G. F. Montgomery, "Metric Review," JAES, Vol. 32, No. 11, pp. 890–893 (1984 Nov.), and J. G. McKnight, "Quantities, Units, Letter Symbols, and Abbreviations," JAES, Vol. 24, No. 1, pp. 40, 42, 44 (1976 Jan./Feb.). Following are some frequently used SI units and their symbols, some non-SI units that may be used with SI units (▲), and some non-SI units that are deprecated (■).

Unit Name                      Unit Symbol
ampere                         A
bit or bits                    spell out
bytes                          spell out
decibel                        dB
degree (plane angle) (▲)       °
farad                          F
gauss (■)                      Gs
gram                           g
henry                          H
hertz                          Hz
hour (▲)                       h
inch (■)                       in
joule                          J
kelvin                         K
kilohertz                      kHz
kilohm                         kΩ
liter (▲)                      l, L
megahertz                      MHz
meter                          m
microfarad                     µF
micrometer                     µm
microsecond                    µs
milliampere                    mA
millihenry                     mH
millimeter                     mm
millivolt                      mV
minute (time) (▲)              min
minute (plane angle) (▲)       ’
nanosecond                     ns
oersted (■)                    Oe
ohm                            Ω
pascal                         Pa
picofarad                      pF
second (time)                  s
second (plane angle) (▲)       ”
siemens                        S
tesla                          T
volt                           V
watt                           W
weber                          Wb
AES SUSTAINING MEMBER ORGANIZATIONS