
Applied Acoustics 67 (2006) 835–848

www.elsevier.com/locate/apacoust

A simple method to detect audible echoes in room acoustical design
Yoshinari Yamada a,*, Takayuki Hidaka a, Yôiti Suzuki b

a Takenaka R&D Institute, 1-5-1, Otsuka, Inzai, Chiba 270-1395, Japan
b Tohoku University, Research Institute of Electrical Communication, 2-1-1, Katahira,
Aoba-ku, Sendai, Miyagi 980-8577, Japan

Received 8 September 2005; received in revised form 17 November 2005; accepted 1 December 2005
Available online 7 March 2006

Abstract

A simple method to detect audible echoes is proposed as an objective criterion for room acoustics.
This method evaluates the perceptibility of sound reflections that are generated by an impulsive
sound source and identifies from reflectograms harmful reflections perceived as echoes. Particularly
with this method, the masking effect of reverberation is taken into consideration, which cannot be
treated sufficiently by the existing objective criteria. The applicability to room acoustical design is
verified by evaluating the impulse responses measured in real halls where audible echoes occurred.
It is shown that the proposed method detects audible echoes with an accuracy of more than 90%
and would be suitable for practical use.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Audible echo; Short pulse; Reverberant field; Masking effect

1. Introduction

In rooms for listening to music or speech, strong reflections comparatively delayed with
respect to the direct sound might become audible echoes and disturb or annoy the players
and the listeners [1,2]. Those harmful reflections should be carefully examined in acoustical
design and removed by suitable constructive measures [3].

* Corresponding author. Tel.: +81 476 47 1700; fax: +81 476 47 3122.
E-mail address: yamada.yoshinari@takenaka.co.jp (Y. Yamada).

0003-682X/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.apacoust.2005.12.009

Many researchers have investigated the perceptibility of sound reflections in rooms and
clarified their relationships with the physical characteristics of sound fields [4–7]. For
audible echoes in particular, objective criteria such as percentage disturbance [8], echo
coefficient [9], and echo criterion [10] have been proposed, making it possible to identify
from reflectograms the sound reflections perceived as echoes. There are, however, very few
systematic investigations of the influence of reverberation [11]. It is well known that, in
general, the longer the reverberation time, the less the subjective disturbance due to
echoes [12]. Although this phenomenon suggests that reverberation provides some masking
effect on discrete reflections [13], its mechanism has not been fully discussed in the
existing studies.
In this paper, a simple method to detect audible echoes that takes the masking effect of
reverberation into consideration is proposed. This method evaluates the perceptibility of
sound reflections generated by an impulsive sound source, which most readily reveals the
presence of echoes [14], whereas the majority of the existing objective criteria evaluate
the subjective disturbance when listening to music or speech. In particular, the method
targets monaural impulse responses, which are usually obtained from scale model tests and
numerical analyses of sound fields, and evaluates the hearing impression they produce.
Yamamoto [15] proposed an objective criterion to detect audible echoes that uses noise
pulses as the source signals. His method can evaluate multiplex reflections by considering
the temporal masking effects of individual reflections, but difficulties arise when
treating reverberation. The new method described in this paper takes into account both
masking effects, that of discrete reflections and that of reverberation.
In the following two sections, the threshold of perceptibility of sound reflections is
measured while listening to hypothetical impulse responses that simulate those in rooms with
high diffusion. Based on those data, an objective criterion to detect audible echoes is
derived, and a new physical measure is introduced. In the last section, the applicability
of the proposed method to room acoustical design is verified by evaluating the impulse
responses measured in real halls where audible echoes occurred.

2. Perceptibility of impulsive sound reflections in reverberant fields

2.1. Single reflections

2.1.1. Experimental method


Test signals, each composed of the direct sound, a single reflection, and an exponentially
decaying reverberation, were digitally synthesized as shown in Fig. 1. Each signal simulates
an impulse response in a reverberant sound field with high diffusion. The threshold of
perceptibility of the single reflection was measured under the monaural hearing condition
through headphones (STAX SRD-X) while changing the test frequency, the delay time of the
reflection, and the temporal structure of the reverberation.
Fig. 1. Block diagram of experimental setup and synthesis of test signal.
[Block labels: unit impulse, white noise, band-pass filter bank, gain 10^(ΔL/20), delay Δt ms,
window 10^(A/20) exp(−6.91t/T60), direct sound, reflected sound, reverberation, analytic
signal, shift of frequency, D/A, headphone amplifier.]

Two bands with center frequencies of 1 and 2 kHz and a width of 1 kHz were chosen for the
test. They belong to the mid-to-high range that contributes to the subjective disturbance or
the absolute threshold of sound reflections [16,17], and many realistic sound sources, such
as hand claps or castanets, have large energy there. The direct sound component was obtained
by convolving the unit impulse with an FIR band-pass filter that has the desired cutoff
frequencies. The reverberation component was made from white noise, which was band-limited
by the same FIR filter and then weighted by the exponential time window
w(t) = 10^(A/20) exp(−6.91t/T60). Here, the reverberation time T60 was 0.5, 1.0, and 1.5 s,
and the amplitude level A was 5 and 11 dB relative to that of the direct sound component,
values estimated roughly by observing impulse responses measured in a few halls. The
reverberation component was not delayed with respect to the direct sound component. The
reflected sound component was obtained by multiplying the direct sound component by the
amplitude ratio 10^(ΔL/20) and delaying it. The delay time Δt was 30, 60, 90, 120, 150, 200,
and 250 ms.
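The synthesis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FIR band-pass filter is omitted, the noise is unit-variance Gaussian, and the function names `reverb_window` and `synthesize_test_signal` are hypothetical. The constant 6.91 is approximately ln(1000), so the window amplitude falls by 60 dB at t = T60.

```python
import math
import random


def reverb_window(t_s, t60_s, a_db):
    """Exponential window w(t) = 10^(A/20) * exp(-6.91 * t / T60)."""
    return 10.0 ** (a_db / 20.0) * math.exp(-6.91 * t_s / t60_s)


def synthesize_test_signal(fs_hz, duration_s, t60_s, a_db, delta_l_db, delay_ms, seed=0):
    """Return (direct, reflection, reverberation) as lists of samples.

    Assumption: the band-pass filtering of the paper is omitted, so the
    direct sound is a bare unit impulse and the noise is broadband.
    """
    n = int(round(fs_hz * duration_s))
    rng = random.Random(seed)

    direct = [0.0] * n
    direct[0] = 1.0

    # Reflection: the direct sound scaled by 10^(dL/20) and delayed by dt.
    shift = int(round(delay_ms * 1e-3 * fs_hz))
    reflection = [0.0] * n
    if shift < n:
        reflection[shift] = 10.0 ** (delta_l_db / 20.0)

    # Reverberation: white noise weighted by the exponential window,
    # starting together with the direct sound (no extra delay).
    reverberation = [
        reverb_window(k / fs_hz, t60_s, a_db) * rng.gauss(0.0, 1.0)
        for k in range(n)
    ]
    return direct, reflection, reverberation
```

Summing the three lists sample by sample gives a test signal; leaving out the reflection gives the corresponding reference signal.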
To investigate the contributions of the center frequency and the temporal structure to the
threshold independently, the following signal processing was applied to prepare the test
signals. First, the sound components with a center frequency of 1 kHz were made by the
method described above and were converted to analytic signals by means of the Hilbert
transform. Next, the frequency of each analytic signal was shifted by 1 kHz to obtain the
sound components with a center frequency of 2 kHz, which have the same envelopes as those
with a center frequency of 1 kHz.
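The following Python sketch illustrates why this shift preserves the envelope. The analytic signal is multiplied by a unit-modulus phasor, which moves every spectral component by the same amount without changing the magnitude of the analytic signal. The analytic signal is built here with a naive DFT to keep the example self-contained; the function names are hypothetical and this is not the authors' implementation.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = 0j
        for k in range(n):
            acc += x[k] * cmath.exp(sign * 2j * math.pi * m * k / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Analytic signal x + j*H{x}, obtained by zeroing negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for m in range(1, upper):
        weights[m] = 2.0
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def shift_frequency(x, shift_hz, fs_hz):
    """Shift every spectral component of the real signal x upward by shift_hz."""
    z = analytic_signal(x)
    return [
        (z[k] * cmath.exp(2j * math.pi * shift_hz * k / fs_hz)).real
        for k in range(len(z))
    ]
```

Applied to a 1 kHz tone, `shift_frequency(tone, 1000.0, fs)` returns a 2 kHz tone with the same envelope.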
The threshold was measured by the method of limits. A reference signal that does not
contain the reflected sound component was always presented before the test signal. Listeners
put on the headphones, compared the two signals presented on the left channel, and judged
whether the reflection was perceived in the test signal. When a listener could not answer
confidently after listening once, a second presentation was allowed. The hearing level was
adjusted so that the peak level of the direct sound was about 100 dB at the entrance of
the ear canal. The relative level of the reflection ΔL was changed in steps of 1 dB. After
repeating the ascending and the descending series three times alternately, six values of
ΔL at which the response changed were obtained. The threshold level ΔL_T was defined
as their average. Four male listeners between 19 and 39 years old participated in this
experiment; all had normal hearing and had undergone sufficient training.
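The averaging over ascending and descending series can be illustrated with a small simulation. The sketch below assumes a perfectly consistent hypothetical listener and takes "the level at which the response changed" to be the first level whose response differs from the response at the start of the series; both are assumptions made for the example, and the function names are invented.

```python
def run_series(start_db, step_db, perceives, max_steps=200):
    """Step the reflection level until the listener's response changes.

    Returns the first level at which the response differs from the response
    at start_db (an assumed reading of the change point).
    """
    initial = perceives(start_db)
    level = start_db
    for _ in range(max_steps):
        level += step_db
        if perceives(level) != initial:
            return level
    raise RuntimeError("response did not change within max_steps")


def method_of_limits(perceives, low_db, high_db, step_db=1.0, repeats=3):
    """Alternate ascending and descending series and average the change points."""
    change_points = []
    for _ in range(repeats):
        change_points.append(run_series(low_db, +step_db, perceives))   # ascending
        change_points.append(run_series(high_db, -step_db, perceives))  # descending
    return sum(change_points) / len(change_points), change_points


# Hypothetical listener who hears the reflection at or above -10.5 dB.
threshold_db, points = method_of_limits(lambda level: level >= -10.5, -20.0, 0.0)
```

With three repeats the function collects six change points, matching the six values of ΔL averaged in the experiment.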

2.1.2. Result and discussion


Fig. 2 shows the threshold level ΔL_T obtained in the experiment as a function of the
delay time Δt. Comparing the results for the center frequencies of 1 and 2 kHz (circle
and square), the difference between them is 1.8 dB at most and appears to be independent of
the temporal structure. A four-way analysis of variance was carried out (using ANOVA of
StatSoft) with center frequency, reverberation time T60, amplitude level of reverberation A,
and delay time Δt as factors (2 × 3 × 2 × 7 levels). As a result, the interaction between
T60 and Δt is significant (F(12,252) = 75.07, p < 0.01), the main effect of A is significant
(F(1,252) = 1567.37, p < 0.01), and the center frequency does not contribute significantly
(F(1,252) = 0.74, p = 0.39). The threshold level under each condition may therefore be
represented by the average over the two bands (cross and minus).

Fig. 2. Threshold level ΔL_T of single reflection as a function of delay time Δt: (a) reverberation time T60 is 1.5 s;
(b) T60 is 1.0 s; (c) T60 is 0.5 s. Error bar is the standard deviation.
[Plot: ΔL_T (dB) versus delay time from 0 to 250 ms in three panels; symbols distinguish the
1 kHz band, the 2 kHz band, and the two-band average for each amplitude level A.]
In Fig. 3, the obtained threshold levels are plotted again together with their regression
lines ΔL_T(Δt) = aΔt + b. They are compared with the threshold levels measured from direct
and reflected sound components reproduced by frontal loudspeakers in a normal room
(T60 = 0.45 s) [18] and in the IEC recommended listening room (T60 = 0.4 s) [14], and with
the range of those measured without reverberation [6,14,19]. The slope of the regression
line a and the threshold level averaged over the delay times ΔL_AV are summarized in
Table 1. The threshold levels obtained in this paper have the following characteristics:

(1) They are larger by at least 20–30 dB than the threshold levels measured without
reverberation (Fig. 3). This difference is most likely caused by the masking effect of
reverberation.
Fig. 3. Threshold level ΔL_T of single reflection as a function of delay time Δt along with its regression line and
data from previous studies.

Table 1
Slope of regression line a and threshold level averaged over delay times ΔL_AV for experimental cases with different
reverberation time T60 and amplitude level of reverberation A

T60 (s)   A (dB)   a (dB/ms)   ΔL_AV (dB)
1.5       5        0.044       6.2
1.5       11       0.043       11.4
1.0       5        0.059       7.9
1.0       11       0.061       13.4
0.5       5        0.108       13.4
0.5       11       0.111       19.0

(2) When the reverberation time is 1.5, 1.0, and 0.5 s, the reverberant sound energy decays
at the rate of 0.04, 0.06, and 0.12 dB/ms, respectively. The slopes of the regression
lines are close to these values (Table 1). That is, the slope of the threshold level
as a function of the delay time is inversely proportional to the reverberation time.
(3) When the amplitude level of reverberation increases by 6 dB, the average threshold
level ΔL_AV increases by 5–6 dB (Table 1). That is, the threshold level is nearly
proportional to the amplitude level of reverberation.
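The decay rates quoted in characteristic (2) follow directly from the definition of the reverberation time, since the level falls by 60 dB in T60 seconds. The short Python check below compares them with the slope magnitudes listed in Table 1; the helper name is hypothetical.

```python
def decay_rate_db_per_ms(t60_s):
    """Decay rate of a sound field that falls 60 dB in t60_s seconds."""
    return 60.0 / (t60_s * 1000.0)


# Regression slope magnitudes copied from Table 1 (two amplitude levels each).
table1_slopes = {1.5: (0.044, 0.043), 1.0: (0.059, 0.061), 0.5: (0.108, 0.111)}

# For each T60: (decay rate from the definition, mean slope from Table 1).
comparison = {
    t60: (decay_rate_db_per_ms(t60), sum(slopes) / len(slopes))
    for t60, slopes in table1_slopes.items()
}
```

Halving T60 doubles the decay rate, which is the inverse proportionality stated in the text.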

The threshold levels of Tabara et al. and Olive et al. (asterisk and cross in Fig. 3) do not
decline linearly and show bends or bumps. However, regression analyses yielded slopes of
0.153 and 0.175 dB/ms, respectively, which support characteristic (2). Burgtorf et al. [11]
suggested characteristic (3). They measured the threshold level while fixing the delay time
of a reflection at 80 ms and changing the spatial correlation of the reverberation, and
concluded that the threshold level is simply proportional to the amplitude of the
reverberation when the spatial correlation is small. The threshold level of Olive et al.
(cross) is smaller by at least 10 dB than the other data. It should be noted, however, that
the amplitude of reverberation in their listening room was small (see Fig. 10 in [14]) and
that a broadband pulse with a duration of 10–40 μs was used as the source signal.

2.2. Multiplex reflections

2.2.1. Experimental method


Test signals were prepared by adding another reflection to the hypothetical impulse
responses used in the previous section, and the thresholds of the multiplex reflections were
measured.
Table 2 summarizes the experimental parameters, i.e., reverberation time T60, amplitude
level of reverberation A, delay time Δt_R and amplitude level ΔL_R of the additional
reflection, and delay time Δt of the variable reflection whose threshold was measured. The
test frequencies were the two bands with center frequencies of 1 and 2 kHz. The amplitude
levels of the additional reflections in cases a, b, and c are larger than the threshold
levels by 7, 15, and 7 dB, respectively, whereas that in case d is nearly equal to the
threshold level.

Table 2
Parameters of test signals in the threshold measurement with an additional reflection: reverberation time T60,
amplitude level of reverberation A, delay time Δt_R and amplitude level ΔL_R of the additional reflection, and delay
time of the variable reflection Δt

Case    T60 (s)   A (dB)   Δt_R (ms)   ΔL_R (dB)   Δt (ms)
a-30    1.5       5        60          3           30
a-90    1.5       5        60          3           90
a-120   1.5       5        60          3           120
b-30    0.5       11       60          3           30
b-90    0.5       11       60          3           90
b-120   0.5       11       60          3           120
c       1.0       5        75          2           90
d       1.0       5        75          4           90

The threshold was measured by the same procedure as in the previous section, except that
the reference signal also contained the additional reflection. Three listeners, all of whom
had taken part in the previous test, participated in this experiment.

2.2.2. Result and discussion


To investigate influences of the center frequency statistically, the analysis of variance
was carried out by using the center-frequency as a single factor (2 levels). The result
showed that the center frequency does not contribute to the threshold level significantly
(F(1,46) = 0.001, p = 0.98). This means that the threshold levels under each condition
may be redefined by the average values for the two bands.
In Fig. 4, the obtained threshold level is plotted (solid circle) along with the envelope of
the reference signal, and is compared with the threshold levels measured without additional
reflections in the previous section (open circle). The threshold levels in cases b-90 and c
are larger by 4–5 dB than those measured without additional reflections. A two-way analysis
of variance with experimental case and the existence of the additional reflection as factors
(8 × 2 levels) showed that the interaction between them was significant (F(7,96) = 8.91,
p < 0.01), so the simple main effect of the existence of the additional reflection was
examined for each of the eight cases (2 levels). As a result, the effect of the additional
reflection was significant in case d (F(1,12) = 9.93, p < 0.01) as well as in b-90
(F(1,12) = 53.27, p < 0.01) and c (F(1,12) = 71.40, p < 0.01).
The increase of the threshold level confirmed in this section is most likely caused by the
temporal masking effect of the additional reflection. This effect would act when the
loudness of the additional reflection is larger than that of the reverberation, because the
loudness of a short pulse decays rapidly with time, as suggested by the range of threshold
levels measured without reverberation (see Fig. 3).

3. An objective criterion to detect audible echoes

3.1. Temporal integration for reflection masking

The masking of a discrete reflection by reverberation or by other discrete reflections can
be regarded as a result of temporal integration in the auditory system. In this paper, that
integration is represented by an RC leaky-integrating circuit, which has very often been
used in models of the summation and decay of loudness [20]. For any signal x(t), the output
of the energy-based integration by the RC circuit is given by the following running average:

$$X^2(t;\tau,a) = a\,\tau^{-1}\int_{-\infty}^{t} x^2(u)\,\exp[-(t-u)/\tau]\,du, \qquad (1)$$

where τ is the time constant, a is the output gain, and X(t;τ,a) corresponds to the root
mean square (RMS) of the input signal x(t).
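In discrete time, the running average of Eq. (1) can be approximated by a one-pole recursion. The Python sketch below is one possible discretization (a rectangular-rule approximation with sampling interval dt = 1/fs), not necessarily the one used by the authors; the function names are hypothetical.

```python
import math


def leaky_integrate(x, fs_hz, tau_s, gain=1.0):
    """Discrete approximation of Eq. (1); returns samples of X^2.

    y[n] = exp(-dt/tau) * y[n-1] + gain * (dt/tau) * x[n]^2, with dt = 1/fs.
    """
    dt = 1.0 / fs_hz
    decay = math.exp(-dt / tau_s)
    weight = gain * dt / tau_s
    y, out = 0.0, []
    for sample in x:
        y = decay * y + weight * sample * sample
        out.append(y)
    return out


def level_db(energy, floor=1e-30):
    """Level in dB of an energy-like quantity, guarded against log(0)."""
    return 10.0 * math.log10(max(energy, floor))
```

For a constant input of unit amplitude the output settles near the gain, and after an isolated impulse the output level falls by 10 log10(e), about 4.34 dB, per time constant.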
Applying Eq. (1) to the envelope of the reference signal used in the previous section, the
threshold level ΔL_T(Δt) was fitted by the RMS signal X(Δt;τ,a). Using the method of least
squares, the parameters τ (time constant) and a (output gain) that minimize the following
quantity were determined:

$$\sigma_e = \sqrt{(N_T - 1)^{-1} S}, \qquad (2)$$

where N_T is the number of threshold level data and S is the sum of squared errors given by

$$S = \sum_{i=1}^{N_T} \{X(\Delta t;\tau,a) - \Delta L_T(\Delta t)\}^2. \qquad (3)$$

Fig. 4. Threshold level measured with additional reflection (arrow) along with that of single reflection (open
circle) and the envelope of test signal. Error bar is the standard deviation.
[Plot: four panels for cases a, b, c, and d; level (dB) versus time from 0 to 200 ms.]

To identify the RC circuit representing the masking effects of reverberation and discrete
reflections, all of the threshold levels measured in the previous section were fitted
together (N_T = 50). Fig. 5 shows the result of this curve fitting. In panel (a), the
contour lines indicate the σ_e values of Eq. (2) as the parameters of the RC circuit change.
The gain G on the ordinate is the level relative to the output gain a in Eq. (1) at which
the amplitude of the RMS signal coincides with that of the input signal at the location of
the direct sound. When one parameter is fixed, the optimal value of the other changes along
the saddle line (thick line). On this line, the fitting error X(Δt;τ,a) − ΔL_T(Δt) has a
mean that is usually 0 and a standard deviation identical to the σ_e value. Panel (b) shows
the standard deviation (thick line) and the maximum absolute value (thin line) of the
fitting error on the saddle line as a function of the time constant τ. The optimal
parameters are τ = 10 ms and G = 2 dB (circle); with these parameters, the standard
deviation and the maximum absolute value of the fitting error are 0.9 and 1.9 dB,
respectively.
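The fitting error of Eqs. (2) and (3) and a grid search over one parameter can be sketched as follows. This is a reduced illustration: the paper fits the time constant and the gain jointly, whereas the hypothetical helper `best_gain_offset` only scans a level offset in dB for model values that are already given.

```python
import math


def fitting_error(model_db, measured_db):
    """sigma_e of Eq. (2): sqrt(S / (N_T - 1)), with S from Eq. (3)."""
    if len(model_db) != len(measured_db) or len(model_db) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    s = sum((x - t) ** 2 for x, t in zip(model_db, measured_db))
    return math.sqrt(s / (len(model_db) - 1))


def best_gain_offset(model_db, measured_db, offsets_db):
    """Pick the level offset (in dB) that minimises sigma_e on a grid."""
    return min(
        offsets_db,
        key=lambda g: fitting_error([x + g for x in model_db], measured_db),
    )
```

Repeating the same scan for each candidate time constant would trace out a line of best gains comparable to the saddle line described above.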
Fig. 5. Result of curve fitting for measured threshold levels: (a) σ_e as contour lines (thin line) and its saddle line
(thick line), where G is the level relative to the gain a at which the amplitude of the RMS output signal coincides
with that of the input signal at the location of the direct sound; (b) the standard deviation (thick line) and
maximum absolute value (thin line) of the fitting error on the saddle line. Optimal parameters are τ = 10 ms and
G = 2 dB (circle).
[Plot: panel (a), gain G (dB) versus time constant τ from 10 to 50 ms; panel (b), error (dB) versus time constant τ (ms).]

Although the time constant of 10 ms is considerably smaller than that for loudness
summation, which is about 30–300 ms [21–23], many reports [15,24–27] support the obtained
result. Poulsen [24] and Kumagai et al. [25] showed that an RC circuit with a time constant
of 5–10 ms is effective for evaluating the loudness of short impact sounds correctly.
Zwicker [26] and Penner [27] measured the hearing threshold of a short pulse under
simultaneous masking by burst noise and concluded that the auditory system integrates the
burst noise acting as a masker over a time interval about one tenth as long as that for
loudness summation. The temporal masking pattern of Yamamoto [15] has a slope of
300^(−1) L0 dB/ms (L0 is the hearing level) and is equivalent to the impulse response of an
RC circuit with a time constant of 1300 L0^(−1) ms. Supposing that the hearing level L0 is
about 100 dB as in the previous section, the corresponding time constant is about 13 ms.
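The equivalence stated for Yamamoto's masking pattern can be checked numerically. The energy impulse response of an RC circuit is proportional to exp(−t/τ), so its level falls by 10 log10(e)/τ, about 4.34/τ dB, per unit time; equating this to L0/300 dB/ms gives τ of roughly 1300/L0 ms. The Python sketch below carries out that arithmetic with hypothetical helper names.

```python
import math

# Level drop, in dB, of exp(-t/tau) over one time constant (energy quantity).
DB_PER_TIME_CONSTANT = 10.0 * math.log10(math.e)  # about 4.343 dB


def rc_decay_slope_db_per_ms(tau_ms):
    """Slope magnitude of the RC energy impulse response in dB/ms."""
    return DB_PER_TIME_CONSTANT / tau_ms


def equivalent_time_constant_ms(hearing_level_db):
    """Time constant whose decay slope equals L0/300 dB/ms."""
    slope_db_per_ms = hearing_level_db / 300.0
    return DB_PER_TIME_CONSTANT / slope_db_per_ms
```

For L0 = 100 dB the function returns about 13.0 ms, in line with the value given in the text.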

3.2. Echo index

Fig. 6 shows an example in which the envelope of an impulse response measured in a hall,
i.e., the reflectogram, is the input signal to the RC circuit with the time constant
τ = 10 ms and the normalized gain G = 2 dB. The RMS output signal jumps upon the arrival of
each reflection. Denoting the amplitude level of a reflection by L_ENV and that of the RMS
signal just before the jump by L_RMS, the difference L_ENV − L_RMS is expected to be
positive or negative according to whether the amplitude level of the reflection exceeds the
threshold level.
In this paper, this difference in amplitude levels is referred to as the echo index (EI)
and is proposed as an objective criterion to detect audible echoes from reflectograms. In
addition, the maximum value of EI in a reflectogram is referred to as the characteristic
echo index (CEI) and is proposed as a physical measure for room acoustical design.
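A minimal Python sketch of the EI and CEI computation is given below. It rests on several assumptions that the text does not spell out: the reflection positions are supplied by the caller, "just before the jump" is taken as one sample earlier, and the gain G is applied as a plain level offset after normalizing the RMS level to the input level at the direct sound. The function names are hypothetical.

```python
import math


def echo_indices(envelope, fs_hz, reflection_idx, direct_idx=0, tau_s=0.010, g_db=2.0):
    """Return {sample index: EI in dB} for the listed reflections.

    envelope       : non-negative amplitude envelope (reflectogram)
    reflection_idx : sample indices of the reflections to evaluate; peak
                     picking is outside the scope of this sketch
    """
    if any(i <= direct_idx for i in reflection_idx):
        raise ValueError("reflections must come after the direct sound")

    # Leaky integration of the squared envelope (unit input weight).
    decay = math.exp(-1.0 / (fs_hz * tau_s))
    y, rms_db = 0.0, []
    for a in envelope:
        y = decay * y + a * a
        rms_db.append(10.0 * math.log10(max(y, 1e-30)))

    def env_db(i):
        return 20.0 * math.log10(max(envelope[i], 1e-15))

    # Normalise so the RMS level equals the input level at the direct sound,
    # then apply the gain G (assumed here to act as a plain level offset).
    offset = env_db(direct_idx) - rms_db[direct_idx] + g_db

    # EI = L_ENV at the reflection minus L_RMS one sample before it.
    return {i: env_db(i) - (rms_db[i - 1] + offset) for i in reflection_idx}


def characteristic_echo_index(ei_by_index):
    """CEI: the largest EI found in the reflectogram."""
    return max(ei_by_index.values())
```

A reflection arriving long after the integrator output has decayed yields a large positive EI, while one buried in the decaying RMS level yields a small or negative EI.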

4. Applicability to room acoustical design

4.1. Method of examination

The applicability of EI to room acoustical design was investigated by evaluating four
halls: the drama theaters D1 (the average of the reverberation times for the octave bands
with center frequencies of 1 and 2 kHz, RT_AV, is 0.9 s) and D2 (RT_AV = 0.9 s), and the
multi-purpose halls M1 (RT_AV = 1.1 s) and M2 (RT_AV = 2.0 s). In these halls, echoes were
perceived at the rear of the audience area when an omni-directional loudspeaker was set on
the centerline of the stage.

Fig. 6. An example of RMS output signals from the identified RC circuit, and definitions of EI and CEI. Input
signal is the envelope of a reflectogram in a real hall.
Monaural impulse responses measured at seats in each hall with an omni-directional
microphone were convolved with the FIR filters used in the previous section, and 34
(17 seats × 2 bands), 18 (9 seats × 2 bands), 20 (10 seats × 2 bands), and 34 (17 seats × 2
bands) test signals were obtained for the halls D1, D2, M1, and M2, respectively. Moreover,
to identify harmful reflections perceived as echoes and to investigate the frequency
dependence of the perception of echoes, the test signals for the hall D1 were supplemented
by the following method. First, after compressing prominent reflections in the original
signals (17 seats × 2 bands), 38 and 32 signals were obtained for the center frequencies of
1 and 2 kHz, respectively. Then, by shifting the frequency of each signal by 1 kHz to the
other band (see Fig. 1), 70 pairs of test signals, which have the same envelopes but
different center frequencies, were finally prepared.
Those 212 test signals were presented to the left ears of listeners through the headphones
(STAX SRD-X). Four male listeners between 23 and 40 years old participated in this test.
Three of them had not participated in the experiments in the previous section, but they
had normal hearing and much experience in hearing tests for room acoustics. The hearing
level was adjusted so that the peak level of the direct sound was about 100 dB at the
entrance of the ear canal. Each listener was asked to judge whether any echoes were
perceived and was instructed to listen to each test signal repeatedly until able to answer
with confidence. EI and CEI values were also derived to investigate their relationship
with the hearing impression.

4.2. Result and discussion

Fig. 7 shows the relationship between the total number of subjects who perceived echoes
(referred to as the "score") and the CEI value for the hall D1. Comparing the results for
the center frequencies of 1 and 2 kHz, the mean values of CEI differ by only 0.4 dB at most
(circle and square), and the standard deviations are almost identical (error bar). In
addition, the difference in score within each pair of signals with the same envelope was
found to be equal to or less than 1 in every case. This result supports the view that the
perception of echoes does not depend on the center frequency in the range under
consideration.

Fig. 7. Relationship between score and CEI for the hall D1.
[Plot: CEI (dB) versus score from 0 to 4 for the 1 kHz and 2 kHz bands with their means; the
dashed line marks EI_C = 1.5 dB.]
Now, supposing that the critical EI value EI_C is set to 1.5 dB (dashed line), the test
signals in which no listener perceives an echo (score 0) and those in which half or more of
the listeners perceive echoes (score 2–4) can be discriminated with an accuracy of 98%. The
reflectogram in Fig. 6, which represents the impulse response at the rear of this hall, has
prominent reflections at the delay times of 41, 100, and 215 ms, and all listeners perceived
echoes (CEI = 6.8 dB). The scores of the test signals in which each reflection is contained
separately were 0, 4, and 3, respectively. The corresponding EI values are 1.0, 6.8, and
4.0 dB, respectively, which correctly indicate that the reflections at the delays of
100 and 215 ms are perceived as echoes.
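The discrimination accuracy discussed above can be computed as sketched below. The decision rule (an echo is predicted when CEI is strictly greater than EI_C) is an assumption, the function name is hypothetical, and the sample values are made up for illustration; they are not data from the paper.

```python
def discrimination_accuracy(cases, ei_c_db=1.5):
    """Fraction of clear cases classified correctly by the rule CEI > EI_C.

    cases: iterable of (cei_db, score) pairs. Signals with score 1 are left
    out, since only score 0 and score 2-4 are compared in the text.
    """
    correct = total = 0
    for cei_db, score in cases:
        if score == 1:
            continue
        predicted_echo = cei_db > ei_c_db
        heard_echo = score >= 2
        correct += predicted_echo == heard_echo
        total += 1
    if total == 0:
        raise ValueError("no cases with score 0 or score >= 2")
    return correct / total


# Made-up (CEI, score) pairs for illustration only; not data from the paper.
toy_cases = [(-1.2, 0), (0.4, 0), (2.0, 0), (1.0, 1), (3.1, 2), (6.8, 4), (4.0, 3)]
toy_accuracy = discrimination_accuracy(toy_cases)
```

In this toy set one score-0 signal lies above the critical value, so five of the six clear cases are classified correctly.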
Next, the relationship between the score and the CEI value for the four halls is shown in
Fig. 8. For the hall D1, the results for the test signals without compression of
reflections or frequency shifting (17 seats × 2 bands) are plotted in this figure (circle)
so that actual reflectograms are considered for every hall. When the critical EI value EI_C
is set to 1.5 dB (dashed line), the test signals with score 0 and those with score 2–4 are
discriminated with an accuracy of 91%. Most of the incorrectly discriminated cases belong to
the halls M1 and M2. Close inspection of the corresponding reflectograms showed that
repetitive reflections appear in one case of M1 and that the reverberation has stepwise
discontinuities or bends in four cases of M2 (enclosed data). The accuracy of discrimination
improves to 97% if those cases are excluded by pre-processing.
Fig. 8. Same as Fig. 7 but for four halls. Enclosed data are singular cases in which repetitive reflection or
discontinuity of reverberation was observed.
[Plot: CEI (dB) versus score from 0 to 4 for the halls D1, D2, M1, and M2 with their means;
the dashed line marks EI_C = 1.5 dB.]

Fig. 9. Relationship between CEI and the perception rate (circle), and the probability of echo perception (thick
line).
[Plot: rate of perception versus CEI from −5 to 10 dB.]

Finally, the relationship between CEI and the perception rate is derived from all data
except the five singular cases in the halls M1 and M2. For this purpose, the perception
rate was calculated for each interval of CEI, divided in steps of 1 dB. Fig. 9 shows the
perception rate as a function of CEI. The plotted points (circle) are well represented by a
cumulative normal distribution. The method of least squares yielded a distribution function
with a mean of 2.5 dB and a standard deviation of 2.5 dB (thick line). The probability of
echo perception can be estimated from this function for any given CEI: it is 50% for
CEI = 2.5 dB and 35% for CEI = EI_C = 1.5 dB.
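The fitted cumulative normal distribution is easy to evaluate with the error function. The Python sketch below uses the mean and standard deviation reported above; the function name is hypothetical.

```python
import math


def echo_probability(cei_db, mean_db=2.5, sd_db=2.5):
    """Cumulative normal distribution evaluated at cei_db."""
    z = (cei_db - mean_db) / (sd_db * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))
```

Evaluating `echo_probability(2.5)` gives 0.5 and `echo_probability(1.5)` gives approximately 0.345, consistent with the 50% and 35% quoted in the text.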

5. Conclusion

The echo index (EI) was proposed as an objective criterion to detect audible echoes.
This method considers two masking effects, those of reverberation and of discrete
reflections, and identifies harmful sound reflections from reflectograms by the following
simple procedure.

(1) The envelope of the band-limited impulse response is obtained by means of the Hilbert
transform (see Fig. 1).
(2) The envelope is used as the input signal to the RC circuit with a time constant of
10 ms (Eq. (1)).
(3) For each discrete reflection, the difference between the amplitude level of the
input signal and that of the RMS output signal defines EI. This value is used to identify
harmful reflections in a reflectogram. The characteristic EI (CEI) is the maximum value of
EI in the reflectogram and is used as a physical measure for room acoustics (Fig. 6).

To investigate the applicability of this method to room acoustical design, monaural
impulse responses measured in real halls were evaluated. It was shown that EI is able to
detect audible echoes with an accuracy of more than 90% when the critical EI value is set
to EI_C = 1.5 dB (Figs. 7 and 8). To obtain better performance, pre-processing to detect
repetitive reflections or discontinuities of reverberation would be required, and the
critical EI value should be revised for such cases. In addition, from a detailed
investigation of the relationship between CEI and the perception rate of echoes, the
probability of echo perception was estimated as 50% for CEI = 2.5 dB and 35% for
CEI = EI_C = 1.5 dB (Fig. 9).

Acknowledgements

The authors express their gratitude to Professor M. Morimoto for advice on deriving the
probability of echo perception in Fig. 9 and to Dr. T. Okano for comments on an earlier
version of this paper.

References

[1] Everest FA. The master handbook of acoustics. USA: TAB Books; 1994.
[2] Beranek LL. Concert halls and opera houses. New York: Springer; 2004.
[3] Cremer L, Müller H. Principles and applications of room acoustics. London/New York: Applied Science;
1982.
[4] Seraphim H-P. Über die Wahrnehmbarkeit mehrerer Rückwürfe von Sprachschall. Acustica 1961;11:80–91.
[5] Schubert P. Die Wahrnehmbarkeit von Rückwürfen bei Musik. Zeitschr. Hochfrequenztechn. Elektroakust.
1969;78:230–45.
[6] Burgtorf W. Untersuchungen zur Wahrnehmbarkeit verzögerter Schallsignale. Acustica 1961;11:97–111.
[7] Morimoto M, Aokata H. Localization cues of sound sources in the upper hemisphere. J Acoust Soc Jpn (E)
1984;5:165–73.
[8] Bolt RH, Doak PE. A tentative criterion for the short-term transient response of auditoriums. J Acoust Soc
Am 1950;22:507–9.
[9] Niese H. Die Messung der Nutzschall- und Echogradverteilung zur Beurteilung der Hörsamkeit in Räumen.
Acustica 1961;11:201–13.
[10] Dietsch L, Kraak W. Ein objektives Kriterium zur Erfassung von Echostörungen bei Musik- und
Sprachdarbietungen. Acustica 1986;60:205–16.
[11] Burgtorf W, Wagener B. Verdeckung durch subjektiv diffuse Schallfelder. Acustica 1967/1968;19:72–9.
[12] Nickson AFB, Muncey RW, Dubout P. The acceptability of artificial echoes with reverberant speech and
music. Acustica 1954;4:447–50.
[13] Kuttruff H. Room acoustics. London: Spon Press; 2000.
[14] Olive SE, Toole FE. The detection of reflections in typical rooms. J Audio Eng Soc 1989;37:539–53.
[15] Yamamoto T. The perceptible limit of the echo due to multiplex reflections. J Acoust Soc Jpn 1971;26:153–62
[in Japanese].
[16] Haas H. The influence of a single echo on the audibility of speech. J Audio Eng Soc 1972;20:146–59.
[17] Olive SE, Toole FE. The modification of timbre by resonances: perception and measurement. J Audio Eng
Soc 1988;36:122–41.
[18] Tabara Y, Yamagata J, Sone T, Nimura T. The effect of the reverberation on detectable threshold for the
echo. Reports of the spring meeting of the Acoustical Society of Japan; 1969. p. 105–6 [in Japanese].
[19] Elliot LL. Backward and forward masking of probe tones of different frequencies. J Acoust Soc Am
1962;34:1116–7.
[20] Zwislocki J. Theory of temporal auditory summation. J Acoust Soc Am 1960;32:1046–60.
[21] Plomp R. Hearing threshold for periodic tone pulses. J Acoust Soc Am 1961;33:1561–9.
[22] Port E. Über die Lautstärke einzelner kurzer Schallimpulse. Acustica 1963;13:212–23.
[23] Reichardt W, Niese H. Choice of sound duration and silent intervals for test and comparison signals in the
subjective measurement of loudness level. J Acoust Soc Am 1969;47:1083–90.
[24] Poulsen T. Loudness of tone pulses in a free field. J Acoust Soc Am 1981;69:1786–90.
[25] Kumagai M, Ebata M, Sone T. Comparison of loudness of impact sounds with and without steady duration
(a study on the loudness of impact sound. II). J Acoust Soc Jpn (E) 1982;3:33–40.
[26] Zwicker E. Temporal effects in simultaneous masking by white-noise bursts. J Acoust Soc Am
1965;37:653–63.
[27] Penner MJ. A power law transformation resulting in a class of short term integrators that produce time-
intensity trades for noise bursts. J Acoust Soc Am 1977;63:195–200.
