
Robinson et al.: JASA Express Letters [http://dx.doi.org/10.1121/1.4798648] Published Online 8 April 2013

The effect of diffuse reflections on spatial discrimination in a simulated concert hall
Philip Robinson,a) Jukka Pätynen, and Tapio Lokki
Department of Media Technology, School of Science, Aalto University,
P.O. Box 15500, FI-00076 Aalto, Finland
philip.robinson@aalto.fi, jukka.patynen@aalto.fi, tapio.lokki@aalto.fi

Abstract: This letter presents results from a study on diffusive architectural
surfaces and auditory perception. Spatial discrimination of multiple
sources is investigated in a simulated performance venue with
various diffusive surface treatments. Simulations were generated with
closely spaced sound sources on the stage of a concert hall and a listener
in the audience area. Subjects were asked to distinguish signals in which
pairs of simultaneous talkers were presented at various lateral separa-
tions, in halls with flat or diffusive surfaces. The experiments reveal that
discriminating differences in the lateral arrangement of sources is possi-
ble at narrower separation angles when reflections come from flat rather
than diffusive surfaces.
© 2013 Acoustical Society of America
PACS numbers: 43.66.Lj, 43.66.Qp, 43.55.Br, 43.55.Fw [QJF]
Date Received: January 31, 2013 Date Accepted: March 18, 2013

1. Introduction
Diffusive architectural surfaces are widely used to evenly distribute sound throughout
rooms and eliminate echoes. To distribute reflected energy throughout a performance
venue, the acoustic designer is primarily concerned with the spatial diffusion a reflect-
ing surface produces. However, for a given listener position, other perceptually rele-
vant effects of a diffusive surface are the temporal diffusion and frequency spectrum of
its reflection. Additionally, individual diffusers have their own sound character
(Kleiner et al., 1992), and diffusers may create effects associated with reduced prefer-
ence, such as decreased reverberance and loudness (Ryu and Jeon, 2008).
This paper further investigates the perceptual effect of diffusers by attempting
to objectively quantify the effect of temporal diffusion on spatial impression by testing
listeners’ abilities to distinguish the positions of laterally separated acoustic sources.
The results are relevant to room acoustic designers who would like to maintain precise
localizability of individual sources in a group, or, on the other hand, achieve a specific
level of blend between sources through the application of diffuse architectural surfaces.
Localizing sound in a room is a complex auditory task. Subsequent to the
direct sound, a multitude of reflections arrive at the ears from different directions and
with various delays, levels, and frequency contents. Despite the additional sound waves
arriving at the ears, and in the absence of a very delayed strong reflection, listeners
perceive only one source, and are able to estimate the arrival direction of the direct
sound relative to their own position, even when the reverberant energy is greater than
that of the direct sound (Hartmann, 1983).
While the direct sound is important for localization, the reflected sound waves
contribute to spatial impression, which is comprised of at least two distinct effects. The first
is a broadening of the apparent width of a sound source, which is primarily determined by
early lateral reflections. The second is the listener’s sense of being enveloped by sound,
which is determined primarily by late-arriving reflections (Barron and Marshall, 1981;
Bradley and Soulodre, 1995). In addition to the aforementioned contribution to the perceived
source width, early reflections naturally amplify a source and may add a certain
sound color because of interfering direct and reflected sound (Lokki et al., 2011).

a) Author to whom correspondence should be addressed.

EL370 J. Acoust. Soc. Am. 133 (5), May 2013 © 2013 Acoustical Society of America

Due to difficulties in objectively measuring subjective percepts like apparent
source width and envelopment, an objective test was developed as a proxy. In this test,
spatial discrimination was measured by iteratively asking subjects to distinguish which
of three sound signals differed from the other two, while the spatial separation of two
sources was adaptively modified, to determine the threshold of discrimination with var-
ious reflection types.

2. Experimental setup and methods


The listening experiment utilizes auralizations produced from a hybrid simulation of
an abstracted concert hall. In each signal, two simultaneous talkers are present on the
stage. In one signal, they are presented at a close lateral separation and in the other
they are further apart. The difference in lateral separation between the signals is adap-
tively changed with each listening trial to determine the threshold of discrimination for
diffuse and non-diffuse reflections.

2.1 Hall simulation


A simulated concert hall was generated to provide the architectural context for the lis-
tening tests. The geometry provided 11 early reflections and measured late reverbera-
tion was added. Figure 1(A) conceptually illustrates the listening configuration.
Binaural impulse responses were created to be convolved with source material for the
listening tests. The impulse response synthesis consisted of four steps.
First, an image source model was utilized to attain an echogram containing
the arrival time, amplitude, and direction of the early reflections. The sources were
simulated as omni-directional and air absorption was applied based on reflection path
length.
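
The first step can be illustrated with a minimal sketch of a first-order image source model for a rectangular room. This is not the authors' implementation: the room dimensions, the source and receiver coordinates, the speed of sound, and the single broadband air-attenuation constant are all hypothetical placeholders, and the abstracted hall geometry with its 11 early reflections is not reproduced here.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def first_order_echogram(src, rcv, room, air_db_per_m=0.005):
    """Return sorted (delay_s, amplitude, azimuth_deg) for the direct sound
    and the six first-order image sources of a rectangular room.

    room = (Lx, Ly, Lz); walls lie at 0 and L on each axis.
    air_db_per_m is a hypothetical broadband attenuation constant.
    """
    images = [tuple(src)]
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]  # mirror the source in the wall
            images.append(tuple(img))

    echogram = []
    for img in images:
        dx, dy, dz = (img[i] - rcv[i] for i in range(3))
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        delay = dist / SPEED_OF_SOUND
        spreading = 1.0 / dist  # omnidirectional source, spherical spreading
        air = 10.0 ** (-air_db_per_m * dist / 20.0)  # path-length-dependent loss
        azimuth = math.degrees(math.atan2(dy, dx))
        echogram.append((delay, spreading * air, azimuth))
    return sorted(echogram)


# Hypothetical geometry: source on stage, receiver 12 m away along the x axis.
echogram = first_order_echogram(
    src=(2.0, 5.0, 1.5), rcv=(14.0, 5.0, 1.2), room=(30.0, 10.0, 12.0)
)
```

Each tuple carries the three quantities named in the text (arrival time, amplitude, direction); a fuller model would also track elevation and higher reflection orders.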
In step two for the diffusive cases, the early reflections in the image source
echogram were convolved with either a measured reflection response from a diffuser,
or a simulated reflection response with characteristics similar to the measured response,
but with a white frequency spectrum. In the Measured Diffusive case, to obtain a real-
istic reflection response for the early reflections, impulse responses from a scale model
cast plaster tessellated surface designed to be an effective diffuser between 500 Hz and

Fig. 1. (A) A schematic representation of the surfaces used in the image source model. The complete binaural
simulation was generated by adding the early reflections from these surfaces to measured late reverberation. (B)
The listening test setup. In each trial, the left talker in Condition B was adaptively positioned to change the sep-
aration distance angle. The receiver position is indicated by “R” and sources are indicated as “M” and “F” for
male and female, respectively. The male and female voice positions were randomly switched at each trial.


2 kHz was measured using techniques described by Robinson et al. (2009). These
impulse responses were processed to include only the reflection (Robinson and Xiang,
2010), scaled to normal audio frequencies (Robinson and Xiang, 2013), and normal-
ized to contain the same total energy as a specular reflection. The center panel of
Fig. 2 illustrates that these reflections are slightly more spread in time than specular
responses and have a less even frequency response. In the Simulated Diffusive case, a
reflection was generated to emulate the envelope and temporal spreading present in the
measured diffuse response, while maintaining a flat frequency response. The response
for the simulated diffuse reflection was generated by multiplying white Gaussian noise
with an envelope with the shape of the probability density function of a gamma distri-
bution of the desired length and shape. A similar method has been applied by Siltanen
et al. (2012). In this case, the reflection response was temporally diffused over approxi-
mately 16 ms. The resulting reflection impulse response was whitened by iteratively
multiplying the magnitude response by its inverse in the frequency domain until the
deviation from a flat spectrum was smaller than ±0.1 dB. For Specular reflections,
step two was omitted, resulting in ideal reflections that were identical to the direct
sound with the exception of amplitude and direction of arrival. The Measured and
Simulated diffuse responses were normalized to contain energy equal to the specular
reflection. This resulted in three early reflection conditions: Specular, from the image
source model; Measured Diffuse, which included spectral coloration and temporal
spreading; and Simulated Diffuse, which included temporal spreading but not spectral
coloration.
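
The Simulated Diffuse reflection described above can be sketched as white Gaussian noise multiplied by a gamma-density envelope and then normalized to the energy of a specular reflection. The sketch below is an illustration under stated assumptions, not the authors' code: the sampling rate, the gamma shape and scale values, and the random seed are hypothetical, a specular reflection is modeled as a unit impulse, and the iterative spectral whitening step is omitted.

```python
import math
import random


def gamma_envelope(n, shape, scale):
    """Gamma probability density evaluated at sample indices 0..n-1."""
    norm = math.gamma(shape) * scale ** shape
    return [(t ** (shape - 1.0)) * math.exp(-t / scale) / norm for t in range(n)]


def simulated_diffuse_reflection(fs=48000, duration_s=0.016, shape=2.0,
                                 scale_s=0.002, seed=0):
    """Gamma-shaped Gaussian noise burst, normalized to unit energy.

    shape and scale_s are hypothetical; only the roughly 16 ms duration
    comes from the text.
    """
    rng = random.Random(seed)
    n = int(round(fs * duration_s))
    envelope = gamma_envelope(n, shape, scale_s * fs)
    burst = [rng.gauss(0.0, 1.0) * e for e in envelope]
    energy = sum(x * x for x in burst)
    gain = 1.0 / math.sqrt(energy)  # match the energy of a unit impulse
    return [x * gain for x in burst]


reflection = simulated_diffuse_reflection()
```

Convolving each echogram entry with such a burst spreads the reflection in time while keeping its total energy equal to that of the corresponding specular reflection.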
The third step transformed the treated echogram into a binaural impulse
response. The reflections in the echogram were individually convolved with head
related transfer functions (HRTFs) for the appropriate arrival directions, and added to
create the early part of spatialized binaural impulse responses. The HRTF measure-
ment procedure can be found in Pulkki et al. (2010).
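
As a rough illustration of the third step, the sketch below filters each reflection with a left and right head-related impulse response (HRIR) for its arrival direction and sums the results at the arrival time. The `toy_hrir` function is a hypothetical stand-in that applies only a crude level difference; it is not measured HRTF data, and the azimuth sign convention (positive toward the left ear) is an assumption.

```python
import math


def convolve(x, h):
    """Direct-form linear convolution of two short sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render_binaural(echogram, reflection, hrir_for, fs, length):
    """Sum every reflection, filtered by the HRIR pair for its direction."""
    left = [0.0] * length
    right = [0.0] * length
    for delay_s, amplitude, azimuth_deg in echogram:
        start = int(round(delay_s * fs))
        hrir_l, hrir_r = hrir_for(azimuth_deg)
        for ear, hrir in ((left, hrir_l), (right, hrir_r)):
            for k, v in enumerate(convolve(reflection, hrir)):
                if start + k < length:
                    ear[start + k] += amplitude * v
    return left, right


def toy_hrir(azimuth_deg):
    """Hypothetical placeholder: one-tap filters with a level difference only."""
    pan = math.sin(math.radians(azimuth_deg))
    return [1.0 + 0.5 * pan], [1.0 - 0.5 * pan]


left, right = render_binaural(
    echogram=[(0.010, 1.0, 0.0), (0.015, 0.5, 90.0)],
    reflection=[1.0],  # unit impulse, i.e., the Specular case
    hrir_for=toy_hrir, fs=1000, length=32,
)
```

In the diffusive cases, `reflection` would be the measured or simulated reflection response rather than a unit impulse.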
Finally, in the fourth step, measured late reverberation was added. Late rever-
beration was measured with a Genelec 1029A loudspeaker at the center of the stage at
12 m distance from the receiving position in the Helsingin Konservatorio concert hall.
An impulse response was recorded with a G.R.A.S. 3-D intensity vector probe with
capsule pairs arranged in X, Y, and Z directions. These were processed with Spatial

Fig. 2. (Color online) The frequency response of the first 200 ms of the left ear from the simulated binaural
impulse response. The upper panel shows specular reflections, the middle panel shows an impulse response uti-
lizing the reflection response measured from a diffuser, and the lower panel shows a simulated diffusive surface
response.


Impulse Response Rendering (Pulkki and Merimaa, 2006) into a total of 24 virtual
loudspeakers surrounding the listener, using the same HRTF set as used for the early
reflections. The binaural reverberation was then joined to the simulated early part of
the binaural impulse response with a sinusoidal fade-in duration of 80 ms beginning at
60 ms after the direct sound. The procedure was repeated for all simulated source posi-
tions. The reverberant level was adjusted such that it was clearly audible without intro-
ducing artifacts during fade-in, and produced a smooth decay from the early reflections.
For further details on the complete simulation process, see Lokki et al. (2011).
Figure 2 illustrates example spectrograms including direct sound, reflections, and late
reverberation from the first 200 ms of the impulse responses for all three surface condi-
tions. The resulting impulse responses had similar acoustical parameters (see Table 1)
to measured responses reported by Lokki et al. (2012), and informal listening revealed
them to be similar enough to be considered natural and realistic.
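
The joining of early and late parts can be sketched as a gain ramp applied to the late reverberation before it is added to the early response. The text specifies only a sinusoidal fade-in of 80 ms beginning 60 ms after the direct sound; the quarter-sine ramp shape used below, and the convention that sample index 0 coincides with the direct sound, are assumptions.

```python
import math


def fade_in_gain(n, fs, start_s=0.060, duration_s=0.080):
    """Per-sample gain: 0 before start, quarter-sine rise, then 1."""
    start = int(round(start_s * fs))
    ramp = int(round(duration_s * fs))
    gain = []
    for i in range(n):
        if i < start:
            gain.append(0.0)
        elif i < start + ramp:
            gain.append(math.sin(0.5 * math.pi * (i - start) / ramp))
        else:
            gain.append(1.0)
    return gain


def join_early_and_late(early, late, fs):
    """Add faded-in late reverberation to the early part of the response."""
    n = max(len(early), len(late))
    gain = fade_in_gain(n, fs)
    out = []
    for i in range(n):
        e = early[i] if i < len(early) else 0.0
        r = late[i] if i < len(late) else 0.0
        out.append(e + gain[i] * r)
    return out


gain = fade_in_gain(n=200, fs=1000)
joined = join_early_and_late(early=[1.0], late=[0.5] * 200, fs=1000)
```

With this ramp the late reverberation is fully present from 140 ms onward, and the early reflections are left untouched throughout.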

2.2 Listening test


Listening tests were conducted at the Aalto University campus in Espoo, Finland, in
the Department of Media Technology. Subjects were presented binaural auralizations
over Sennheiser HD238 supra-aural headphones, fed by a Presonus FP10 digital audio
interface receiving signals from a MATLAB graphical user interface running on a per-
sonal computer in a quiet room or listening booth. The binaural impulse responses
described above were convolved with samples from the Coordinate Response Measure
speech corpus provided by Bolia et al. (2000). These samples are approximately 3 s
long and follow the format: “Ready call sign go to color, number now.” Listeners took
from 20 min to 1 h 15 min to complete the test. In total, 13 listeners who reported hav-
ing normal hearing participated. The listening level was adjusted to a comfortable set-
ting at the beginning of the experiment and kept constant for all subjects. All were
members of the Department of Media Technology or Acoustics and Signal Processing
groups at Aalto University and had extensive critical listening experience. Twelve sub-
jects were male and one was female, all were between 22 and 40 years old.
The test was set up to have listeners compare signals with pairs of talkers on
stage at various lateral separations. Figure 1(B) illustrates a typical trial setup. The receiver
was positioned 12 m from the line of sources on stage and the sources were positioned
in 18 equal steps from the center of the stage to stage right, resulting in 2.4°
separation between sources at the center, decreasing to 1.6° steps at the side of the
stage. In each signal, one talker was always in the center and the other was placed to
the left. In Signal A, the second talker was always 2.4° to the left of center and in
Signal B, the location of the second talker varied from trial to trial. In the first trial,
the second talker in Signal B was at the far left edge of the stage and moved towards

Table 1. Calculated acoustical parameters for the first 6 source positions in the simulated concert hall. Listed
parameters are Early Decay Time (EDT), Reverberation Time (T30), Clarity index (C50), and Inter-aural Cross
Correlation coefficient (IACC). Parameters are averaged from 500 to 2000 Hz.

           EDT [s]                T30 [s]                C50 [dB]               IACC
Position   Spec.  S.Dif  M.Dif    Spec.  S.Dif  M.Dif    Spec.  S.Dif  M.Dif    Spec.  S.Dif  M.Dif
1          1.84   1.78   1.75     2.08   2.08   2.05     0.54   0.32   0.22     0.42   0.42   0.43
2          1.66   1.59   1.58     2.06   2.05   2.02     0.09   0.06   0.59     0.42   0.42   0.40
3          1.71   1.64   1.65     2.06   2.06   2.02     0.44   0.28   0.36     0.28   0.29   0.29
4          1.68   1.62   1.60     2.06   2.05   2.01     0.46   0.35   0.35     0.27   0.28   0.32
5          1.69   1.63   1.64     2.06   2.06   2.01     0.12   0.01   0.05     0.28   0.29   0.29
6          1.72   1.64   1.64     2.06   2.05   2.01     0.20   0.00   0.08     0.26   0.28   0.30
Mean       1.72   1.65   1.64     2.06   2.06   2.02     0.31   0.15   0.47     0.32   0.33   0.34


the center as the test progressed. The position of the second talker was determined by
an adaptive one-up, one-down procedure per Levitt (1971), with progressively smaller
step sizes to determine the 50% detection threshold for differences in the separation of
two simultaneous sources. One of the talkers was male and the other female; whether
the male was on the right or left was determined randomly each trial. The male and
female talkers simultaneously recited a different randomly selected sentence from the
speech corpus, which changed with each trial. Listeners were presented with a graphi-
cal user interface with three play buttons. Two of these buttons were assigned to play
condition B and the third was assigned to play condition A. In 5% of the randomly
chosen trials, all of the buttons were assigned to play the same signal. Subjects were
allowed to listen to each signal as many times as they liked before deciding which sig-
nal was different from the others, or if they were all the same. The test continued until
eight reversals were recorded or the listener reached a minimum separation difference
angle of 2.4° on 4 consecutive trials.
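
The source geometry and the adaptive track can be sketched together. The spacing below is an assumption chosen so that the first step subtends 2.4° from a receiver 12 m away, which reproduces the roughly 1.6° step at the side of the stage. The step sizes, the averaging over the last four reversals, and the deterministic `ideal_listener` with a 9° threshold are hypothetical; the catch trials and the three-button interface are not modeled.

```python
import math

RECEIVER_DISTANCE_M = 12.0
# Assumed spacing: the first step subtends 2.4 degrees at the receiver.
STEP_M = RECEIVER_DISTANCE_M * math.tan(math.radians(2.4))


def azimuth_deg(position):
    """Angle from stage center, seen from the receiver, of position 0..18."""
    return math.degrees(math.atan2(position * STEP_M, RECEIVER_DISTANCE_M))


def difference_deg(position):
    """Separation difference between Signal B (talker at position) and Signal A."""
    return azimuth_deg(position) - azimuth_deg(1)


def run_staircase(respond, start=18, floor=2, step_sizes=(4, 2, 1),
                  max_reversals=8, floor_limit=4, max_trials=200):
    """One-up, one-down track over position index; returns reversal positions."""
    position, last_direction, at_floor, reversals = start, 0, 0, []
    for _ in range(max_trials):
        if len(reversals) >= max_reversals or at_floor >= floor_limit:
            break
        correct = respond(position)
        at_floor = at_floor + 1 if position == floor else 0
        direction = -1 if correct else 1  # correct answer: make the task harder
        if last_direction != 0 and direction != last_direction:
            reversals.append(position)
        last_direction = direction
        # Hypothetical schedule: shrink the step after every second reversal.
        step = step_sizes[min(len(reversals) // 2, len(step_sizes) - 1)]
        position = max(floor, min(start, position + direction * step))
    return reversals


def ideal_listener(position):
    """Hypothetical listener who is correct whenever the difference is >= 9 degrees."""
    return difference_deg(position) >= 9.0


reversals = run_staircase(ideal_listener)
threshold_deg = sum(difference_deg(p) for p in reversals[-4:]) / 4.0
```

A one-up, one-down rule converges on the 50% point of the psychometric function, which is why the track oscillates around the position where the simulated listener switches from incorrect to correct.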

3. Results and discussion


The mean discrimination angle across all subjects and all surface conditions is 10.4°.
This value is at the upper range of findings by Perrott (1984) on concurrent minimum
audible angle, but confounded by two factors: the signals used in this study were far
more complex and distinguishable, which should have made the task easier, but the pres-
ence of reverberation should have made it more difficult. Additionally, the results show
noteworthy differences in spatial difference discrimination angles between reflection con-
ditions. Figure 3 illustrates the differences between the three reflection situations. These
results were analyzed using a one-way analysis of variance procedure, which revealed significant
differences between the means [F(2, 36) = 10.42, p < 0.01]. Post hoc analysis
using Tukey’s least significant difference criterion (α = 0.05) of the three conditions
showed this difference to be between the measured and simulated diffuse cases and the
specular case. The mean angles are 11.2° for both diffuse conditions, and 8.7° for the
case with specular reflections. This means that a listener cannot distinguish, with better
than 50% success, whether a source has moved from a position about two degrees away from
the central source to a more distant position until that more distant position is
separated from the central source by at least this angle.
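
For readers unfamiliar with the statistic, a one-way analysis of variance compares the variance between group means with the variance within groups. The sketch below computes F for a toy data set that is purely illustrative and unrelated to the listening test results; only the degrees of freedom in the last line correspond to the design reported here, three conditions with 13 listeners each.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy numbers for illustration only; these are not data from the experiment.
f_value, df_between, df_within = one_way_anova_f(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
)

# Degrees of freedom for three conditions with 13 listeners each.
design_df = (3 - 1, 3 * 13 - 3)
```

The degrees of freedom (2, 36) quoted with the F value are consistent with this three-group, 39-observation layout.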
The results show that in the presence of reflections from diffusive surfaces, the
mean angle at which listeners could discriminate the change in position of a second
talker on stage was 2.5° larger than in the presence of reflections from specularly

Fig. 3. The mean separation difference discrimination angle and 95% confidence intervals for a simulated con-
cert hall. The results for the Specular case are significantly different from the results for both the Diffusive cases
[F(2, 36) = 10.42, p < 0.01]. Number of subjects = 13.


reflecting surfaces. For an instrumental ensemble or orchestra, this difference could
determine whether two instruments can be discriminated or blend together. When this angular swath
is projected to the back of the stage, more instruments may be included, since the rows
of the orchestra overlap. Likewise, for seats further back in the audience, the same
angle equates to an even larger swath of the stage. It should be also noted that the dif-
ferent reflection types have varying subjective effects, including tone coloration, per-
ceived reverberance, and clarity. Examination of these features is reserved for future
studies.
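
To give a sense of scale, the lateral extent on stage subtended by a given angle follows from simple trigonometry. The sketch below evaluates it for the 2.5° difference at the 12 m listening distance used here and, as a hypothetical comparison, at twice that distance; it assumes the swath is centered on the listener's line of sight and lies on a line perpendicular to it.

```python
import math


def stage_width_m(angle_deg, distance_m):
    """Width subtended by angle_deg at distance_m, centered on the line of sight."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)


width_at_12_m = stage_width_m(2.5, 12.0)  # about 0.52 m
width_at_24_m = stage_width_m(2.5, 24.0)  # hypothetical seat twice as far back
```

At 12 m the swath is roughly half a meter wide, and for small angles it grows almost linearly with distance, so it doubles for the more distant seat.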
Examination of the objective acoustic parameters associated with each source
position yields little insight into the reason for the difference in thresholds. Table 1 dis-
plays the common acoustical parameters, EDT, T30, C50, and IACC, for each source
position. Early decay time (EDT) for the specular case is slightly higher than the other
two, indicating more running reverberance that should interfere with localization.
However, the difference is generally within one just noticeable difference (JND), assumed to be 5%. Reverberation
time T30 and Clarity are also within 5% and 1 dB JNDs, respectively, between all
cases. There are differences in IACC between comparison conditions, such as between
Position 6 and Position 2. For the Specular case this difference is 0.16 and for the
Diffuse cases it is 0.14 and 0.10. Listeners are very sensitive to changes in IACC when
it is close to 1, with JNDs as low as 0.03 for narrowband noise; however, these values
increase with lower values of IACC to as large as 0.4 at an IACC of 0 (Gabriel and
Colburn, 1981). For a musical signal, Morimoto and Iida (1995) measured a JND of
0.1–0.12 at IACC of 0.5. The IACC range under investigation here is 0.42–0.26;
assuming a JND of greater than 0.12, all three cases may be within one JND, but the
Specular case is the most likely to be discernible. Further research is required to fully
develop this finding.
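
The IACC comparison above can be checked directly against the values in Table 1. The sketch below recomputes the Position 2 to Position 6 differences and compares them with a candidate JND; the value 0.12 is the upper end of the range measured by Morimoto and Iida (1995) at an IACC of 0.5, and the larger value of 0.2 is a hypothetical figure standing in for the increased JND expected at lower IACC.

```python
# IACC values copied from Table 1 for Positions 2 and 6.
IACC = {
    "Specular": {2: 0.42, 6: 0.26},
    "Simulated Diffuse": {2: 0.42, 6: 0.28},
    "Measured Diffuse": {2: 0.40, 6: 0.30},
}

differences = {name: round(v[2] - v[6], 2) for name, v in IACC.items()}


def cases_exceeding(jnd):
    """Names of the reflection conditions whose IACC difference exceeds jnd."""
    return sorted(name for name, d in differences.items() if d > jnd)


above_reported_jnd = cases_exceeding(0.12)
above_larger_jnd = cases_exceeding(0.2)  # 0.2 is a hypothetical larger JND
```

The Specular case shows the largest difference, which is why it is the most likely to be discernible, while a JND only moderately larger than 0.12 would place all three cases below threshold.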
4. Concluding remarks
Diffusive architectural surfaces are widely applied in spaces for critical listening, from
recording studios, to lecture rooms, to concert halls. Yet, despite their widespread use,
the perceptual consequences of using diffusive architectural surfaces are largely
unstudied. The results presented here begin to reveal some of their effects on spatial
perception. An objective spatial discrimination test revealed that listeners could not
distinguish between a pair of talkers 2.4° and one 11.2° apart in the presence of diffuse
early reflections, but they could distinguish between the former and a pair 8.7° apart in
a condition with specular early reflections. It can be inferred that a higher discrimina-
tion threshold is associated with more blended sources or a wider apparent source
width. This is a significant finding in the context of acoustic design for performing arts
venues, and one that needs further attention, particularly to determine why the audi-
tory system is more accurate when processing specular early reflections. Diffusive
surfaces may also affect subjective qualities of the sound field; this is an area for future
research.

Acknowledgments
P.R.’s research was supported by a U.S. Fulbright grant, funded by Finland’s Center for
International Mobility. Additional funding was provided by The Academy of Finland,
project no. 257099 and the European Research Council grant agreement no. 203636.
Thanks are also due to the listening test participants for their volunteered time and effort.

References and links


Barron, M., and Marshall, A. (1981). “Spatial impression due to early lateral reflections in concert halls:
The derivation of a physical measure,” J. Sound Vib. 77, 211–232.
Bolia, R., Nelson, W., Ericson, M., and Simpson, B. (2000). “A speech corpus for multi-talker communica-
tions research,” J. Acoust. Soc. Am. 107, 1065–1066.
Bradley, J., and Soulodre, G. (1995). “Objective measures of listener envelopment,” J. Acoust. Soc. Am.
98, 2590–2597.


Gabriel, K., and Colburn, H. S. (1981). “Interaural correlation discrimination: I. bandwidth and level
dependence,” J. Acoust. Soc. Am. 69, 1394–1401.
Hartmann, W. (1983). “Localization of sound in rooms,” J. Acoust. Soc. Am. 74, 1380–1391.
Kleiner, M., Svensson, U., and Dalenbäck, B. (1992). “Auralization of QRD and other diffusing surfaces
using scale modeling,” in 93rd AES Convention (Audio Engineering Society, New York), pp. 1–13.
Levitt, H. (1971). “Transformed up-down methods in psychoacoustics,” J. Acoust. Soc. Am. 49, 467–477.
Lokki, T., Pätynen, J., Kuusinen, A., and Tervo, S. (2012). “Disentangling preference ratings of concert
hall acoustics using subjective sensory profiles,” J. Acoust. Soc. Am. 132, 3148–3161.
Lokki, T., Pätynen, J., Tervo, S., Siltanen, S., and Savioja, L. (2011). “Engaging concert hall acoustics is
made up of temporal envelope preserving reflections,” J. Acoust. Soc. Am. 129, EL223–EL228.
Morimoto, M., and Iida, K. (1995). “A practical evaluation method of auditory source width in concert
halls,” J. Acoust. Soc. Jpn. 16, 59–69.
Perrott, D. (1984). “Concurrent minimum audible angle: A re-examination of the concept of auditory spa-
tial acuity,” J. Acoust. Soc. Am. 75, 1201–1206.
Pulkki, V., Laitinen, M., and Sivonen, V. (2010). “HRTF measurements with a continuously moving
loudspeaker and swept sines,” in Audio Engineering Society Convention (Audio Engineering Society,
New York), Vol. 128, p. 8090.
Pulkki, V., and Merimaa, J. (2006). “Spatial impulse response rendering II: Reproduction of diffuse sound
and listening tests,” J. Audio Eng. Soc. 54, 3–20.
Robinson, P., and Xiang, N. (2010). “On the subtraction method for in-situ reflection and diffusion coeffi-
cient measurements,” J. Acoust. Soc. Am. 127, EL99–EL104.
Robinson, P., and Xiang, N. (2013). “Construction and evaluation of a 1:8 scale model binaural manikin,”
J. Acoust. Soc. Am. 133, EL162–EL168.
Robinson, P., Xiang, N., and D’Antonio, P. (2009). “Measuring the uniform diffusion coefficient:
Synthesized aperture goniometer measurements,” Proc. Meet. Acoust. 6, 015003.
Ryu, J., and Jeon, J. (2008). “Subjective and objective evaluations of a scattered sound field in a scale
model opera house,” J. Acoust. Soc. Am. 124, 1538–1549.
Siltanen, S., Lokki, T., Tervo, S., and Savioja, L. (2012). “Modeling incoherent reflections from rough
room surfaces with image sources,” J. Acoust. Soc. Am. 131, 4606–4614.
