NeuroImage 17, 1742–1754 (2002)

doi:10.1006/nimg.2002.1295

Neural Correlates of Timbre Change in Harmonic Sounds


V. Menon,*,†,‡ D. J. Levitin,§ B. K. Smith,‖ A. Lembke,* B. D. Krasnow,* D. Glazer,¶
G. H. Glover,‡,¶ and S. McAdams‖
*Department of Psychiatry and Behavioral Sciences, †Program in Neuroscience, ‡Stanford Brain Research Center, Stanford University
School of Medicine, Stanford, California 94305; §Department of Psychology, McGill University, Montréal, Québec, Canada;
‖Institut de Recherche et Coordination Acoustique/Musique (IRCAM-CNRS), Paris, France; and ¶Department of Radiology,
Stanford University School of Medicine, Stanford, California 94305

Received October 22, 2001

Timbre is a major structuring force in music and one of the most important and ecologically relevant features of auditory events. We used sound stimuli selected on the basis of previous psychophysiological studies to investigate the neural correlates of timbre perception. Our results indicate that both the left and right hemispheres are involved in timbre processing, challenging the conventional notion that the elementary attributes of musical perception are predominantly lateralized to the right hemisphere. Significant timbre-related brain activation was found in well-defined regions of posterior Heschl's gyrus and superior temporal sulcus, extending into the circular insular sulcus. Although the extent of activation was not significantly different between left and right hemispheres, temporal lobe activations were significantly posterior in the left, compared to the right, hemisphere, suggesting a functional asymmetry in their respective contributions to timbre processing. The implications of our findings for music processing in particular and auditory processing in general are discussed. © 2002 Elsevier Science (USA)

Key Words: timbre; temporal lobe; superior temporal gyrus; superior temporal sulcus; middle temporal gyrus; fMRI.

INTRODUCTION

An auditory event comprises six general classes of perceptual attributes: pitch, loudness, timbre, perceived duration, spatial location, and reverberant environment (Levitin, 1999). Of these six, timbre is arguably the most important and ecologically relevant feature of auditory events. The timbre of a sound is the principal feature that distinguishes the growl of a lion from the purr of a cat, the crack of thunder from the crash of ocean waves, the voice of a friend from that of a bill collector one is trying to dodge. Timbral discrimination is so acute in humans that most of us can recognize hundreds of different voices and, within a single well-known voice, distinguish whether its possessor is in good health or suffering from a cold—even over the telephone. Much is now known about the psychological representation of timbre (McAdams and Cunibile, 1992; McAdams and Winsberg, 2000; McAdams et al., 1995; Pitt, 1995) but relatively little is known about its neural underpinnings (Shamma, 1999). This report is the first in a series of experiments designed to uncover the functional neuroanatomy of timbre perception in humans and its relation to human psychophysics.

"Timbre" refers to those aspects of sound quality other than pitch, loudness, perceived duration, spatial location, and reverberant environment (McAdams, 1993; Plomp, 1970; Risset and Wessel, 1999). More generally, timbre can be defined as that feature of an auditory stimulus that allows us to distinguish one source from another when all of the other five perceptual features are held constant. That is, timbre is the "tonal color and texture" that allows us to distinguish a trumpet from a clarinet playing the same note or to distinguish one speaker from another when they are reciting the same words. In spite of its prime importance in identification and memory, timbre is one of the least understood components of auditory perception. Helmholtz was the first to recognize that it is to some extent related to a tone's spectral composition (Helmholtz, 1863/1954). More recent investigators have found that it is crucially related not just to spectral parameters but also to the temporal aspects of tone, such as the rate of increase in level or the presence of inharmonic components in the initial part of the tone (Grey, 1977), acoustic features that affect the "attack quality" (Hajda et al., 1997). For example, a piano is easily recognized as such in a taped recording, but if that same recording is played backwards, the distinct timbre that identifies "piano" is lost, even though the long-term frequency spectrum remains unchanged for any given tone (Berger, 1964).

Accurate memory for the timbral qualities of everyday music was shown recently in a compelling study by Schellenberg and his colleagues (1999). Participants in this study were able to identify popular songs from brief excerpts of only 100 and 200 ms. The authors argued that this was too brief an excerpt for cues such as rhythm, pitch, or tonality to trigger memory and that it must have been timbral memory that was being accessed. Timbral memory may also be the mechanism by which some absolute pitch possessors identify tones (Takeuchi and Hulse, 1993). The distinctive sound of a piano or saxophone—presumably based on its timbre—is believed to serve as a cue for pitch identification among absolute pitch possessors who are unable to identify the pitch of sine tones or of instruments with which they are less familiar.


The psychological space for timbre representation has been investigated in an extensive series of studies by McAdams and his colleagues, among others. Since the work of Plomp (1970), the principal paradigm for uncovering the perceptual structure of timbre has employed multidimensional scaling of dissimilarity judgments among sets of musical sounds equated for pitch, loudness, and duration and generally presented over headphones or loudspeakers (see McAdams, 1993; Hajda et al., 1997, for reviews). Over many different studies, two or three perceptual dimensions are generally extracted. The acoustic correlates of these "timbre spaces" depend on the set of sounds tested and to some extent the listener population. For example, using a set of synthesized sounds designed to imitate orchestral instruments or hybrids of these instruments, McAdams et al. (1995) found a three-dimensional space with specific features on several of the timbres (Fig. 1). The acoustic correlates of the commonly shared dimensions were the logarithm of the attack time (the time it takes the energy envelope to go from perceptual threshold to its maximum), the spectral center of gravity, or spectral centroid (a measure of the relative presence of high-frequency versus low-frequency energy in the frequency spectrum), and spectral flux (a measure of how much the spectral envelope changes over the duration of a tone).

FIG. 1. The timbre space for 18 synthetic musical instrument sounds found by McAdams et al. (1995). A three-dimensional space with acoustic correlates of attack time, spectral centroid, and spectral flux was found. For comparison, Timbres A and B from the present study are shown approximately as they would appear in the space on the basis of these same acoustic parameters. Note that the change from Timbre A to Timbre B occurs as a diagonal across the space since all three dimensions change.
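To make the scaling paradigm concrete, the following sketch recovers a low-dimensional timbre space from a pairwise dissimilarity matrix using generic nonmetric multidimensional scaling in scikit-learn. It is an illustration only, not the extended CLASCAL model actually used by McAdams et al. (1995), and the ratings shown are invented.

import numpy as np
from sklearn.manifold import MDS

# Hypothetical pairwise dissimilarity ratings among four sounds equated for
# pitch, loudness, and duration (symmetric matrix, zero diagonal).
D = np.array([[0.0, 2.1, 5.3, 6.0],
              [2.1, 0.0, 4.8, 5.5],
              [5.3, 4.8, 0.0, 1.9],
              [6.0, 5.5, 1.9, 0.0]])

# Recover a low-dimensional "timbre space" whose distances approximate D.
mds = MDS(n_components=3, dissimilarity="precomputed", metric=False, random_state=0)
coords = mds.fit_transform(D)
print(coords.shape)  # (4, 3): one point per sound in a 3-D perceptual space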
Although the psychoacoustics of timbre continues to be extensively investigated (McAdams et al., 1995), information about its neural substrates remains sparse. In general, lesion studies have shown that the right temporal lobe is involved in spectral processing, i.e., pitch matching and frequency discrimination (Milner, 1962; Robin et al., 1990; Sidtis and Volpe, 1988; Zatorre, 1988), and the left temporal lobe is involved in temporal processing (Efron, 1963; Robin et al., 1990). Samson and Zatorre (1994) examined timbre discrimination in patients with unilateral temporal lobe excisions, right and left, while manipulating both the spectral and temporal components of timbre. To their surprise, only those subjects with right temporal lobe excisions were impaired in timbre discrimination, independently of the temporal and spectral cues. These findings support a functional role for the right temporal lobe in timbre discrimination, but leave ambiguous the role of the left temporal lobe, if any.

A few neuroimaging studies have examined the neural basis of music perception in general, with attempts to fractionate musical processing into component operations. The emerging picture is that music perception per se does not have a specific neural locus, but that the components of music perception—specifically pitch and rhythm—may. One early PET study showed a specialized role for the right temporal lobe in pitch perception (Zatorre et al., 1993); a more recent PET study by Griffiths et al. (1999) showed remarkably similar bilateral activation for both pitch and duration, in the cerebellum, posterior superior temporal cortices, and inferior frontal cortices. A PET study by Blood et al. (1999) examined affective processing of musical pieces and found activation of neocortical and paralimbic structures in response to pleasant/unpleasant musical stimuli. Belin et al. (2000) found regions in the superior temporal sulcus (bilaterally) that were involved in the processing of human voices. Although the study did not explicitly address timbre perception, timbre-related attributes would have been among the primary cues for subjects in this experiment to distinguish human from environmental sounds and to distinguish human voices from one another.

The functional neuroanatomy of timbre perception in humans remains almost entirely unknown, and yet this important topic may hold the key to understanding fundamental questions about voice identification, musical processing, and memory for the numerous sound sources encountered in everyday life. The only known study to date identified the right superior and middle frontal gyrus as candidates for selective attending to timbre in the presence of simultaneous variation in pitch and rhythm (Platel et al., 1997), yet this finding is difficult to interpret for several reasons: first, the timbral change was subtle and primarily spectral in nature (a "bright" vs "dull" oboe) and, second, the data analysis employed a subtraction in which selective attending to timbral change was compared to selective attending to pitch and rhythm changes. Johnsrude et al. (1997) examined processing of acoustic transients in nonspeech sounds using PET. This study is of particular interest because psychophysical and cognitive studies of timbre have shown transients to be an important component in timbral identification and recognition (Berger, 1964). Participants in this study discriminated pure tone stimuli incorporating frequency glides of either long (100 ms) or short (30 ms) duration. Comparison of these two conditions revealed significant activation foci in the left orbitofrontal cortex, left fusiform gyrus, and right cerebellum. The left hemisphere activations are consistent with previously cited work implicating left hemisphere structures for temporal processing (Efron, 1963; Robin et al., 1990).

Based therefore on psychophysical models of timbre perception, which point to an important role for both spectral and temporal parameters of sound, we hypothesized a critical role for the left and right temporal lobes in tasks involving timbral processing, as well as for those left hemisphere regions previously implicated by Johnsrude et al.'s (1997) study of transient perception. In order to maximize the difference along acoustic dimensions known to affect timbre perception, we selected two types of sounds that differed maximally in terms of attack time, spectral centroid, and spectral flux, as found by McAdams et al. (1995). The two distinct timbre stimuli (A and B) created for the present experiment differed significantly along these three dimensions. Their approximate positions with respect to the original space are shown in Fig. 1. Our experimental design was as follows. We compared brain activation in response to melodies composed of tones with a fast attack, low spectral centroid, and no spectral flux (Timbre A) to that resulting from melodies formed of tones with a slow attack, higher spectral centroid, and greater spectral flux (Timbre B). The theoretical interest in these sounds is that they include spectral, temporal, and spectrotemporal parameters that have been shown to be correlated with perceptual dimensions of timbre. A case in which timbres changed across the entire perceptual space needed to be tested first in order to demonstrate that perceptually salient stimuli reveal differences in activation as measured with fMRI. The critical question is thus: what are the characteristics of brain responses associated with differences in melodic stimuli created with precisely controlled synthetic sounds having different timbres?

MATERIALS AND METHODS

Subjects

Ten healthy adult subjects (4 males and 6 females; ages 21–45, mean 33) participated in the study after giving written informed consent, and all protocols were approved by the human subjects committee at Stanford University School of Medicine. All participants were volunteers and were treated in accordance with the APA "Ethical Principles of Psychologists and Code of Conduct."

Stimuli

All sounds were created digitally using a 22,050-Hz sampling rate and 16-bit resolution, although stimulus presentation in the MRI scanner used only the 8 most significant bits. Each sound had an amplitude envelope defined in terms of linear attack time (AT), sustain time (ST), and exponential decay time (DT). These three values were, respectively, 15, 60, and 198 ms for Timbre A and 80, 0, and 117 ms for Timbre B (Fig. 2). The main interest was in attack time, and the other two values were adjusted to achieve similar perceived durations across the two timbres. The sounds were composed of harmonic spectra. The levels of the harmonics decreased as a power function of harmonic rank (A_n = n^p). The exponent p of this function was used to establish the base spectral centroid (SCbase), which is defined as the amplitude-weighted mean harmonic rank. For a pure tone SCbase = 1, whereas for a tone composed of the first three harmonics with amplitudes equal to the inverse of the harmonic rank, SCbase = (1*1 + 2*1/2 + 3*1/3)/(1 + 1/2 + 1/3) = 1.63. Expressed in this way, the relative amplitudes among harmonics remain constant with variations in fundamental frequency. Spectral flux (ΔSC, expressed as a change in amplitude-weighted mean harmonic rank) was defined as a variation over time of SC with a rise time SCRT and a fall time SCFT. Between time 0 and SCRT, SC rose from SCbase to SCbase + ΔSC, following a raised cosine trajectory. From time SCRT to SCRT + SCFT, SC decreased back to SCbase, following a slower raised cosine trajectory. After time SCRT + SCFT, SC remained constant at SCbase. Timbre A was characterized by a fast attack (AT = 15 ms), a low base spectral centroid (SCbase = 1.05), and no spectral flux (ΔSC = 0). Timbre B was characterized by a slower attack (AT = 80 ms), a higher base spectral centroid (SCbase = 1.59), and a higher degree of spectral flux (ΔSC = 4.32, SCRT = 49 ms, SCFT = 102 ms). For Timbre B, the instantaneous SC therefore varied from 1.59 to 5.91 between 0 and 49 ms and from 5.91 back to 1.59 between 49 and 151 ms, remaining at 1.59 thereafter (Fig. 2). Prior to the experimental session, the two timbre stimuli were matched psychophysically for loudness by four raters, and this necessitated attenuating the sounds with Timbre B by 8.5 dB compared to those with Timbre A.
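To make these definitions concrete, the sketch below reproduces the SCbase computation and the amplitude-envelope and spectral-flux trajectories described above in Python/NumPy. It is an illustration under stated assumptions, not the authors' synthesis code: the exponential-decay time constant, and the omission of the actual additive synthesis of the harmonics, are our choices.

import numpy as np

def spectral_centroid(amps):
    # Amplitude-weighted mean harmonic rank (SC) for harmonics 1..N.
    ranks = np.arange(1, len(amps) + 1)
    return float(np.sum(ranks * amps) / np.sum(amps))

# Worked example from the text: first three harmonics with amplitudes 1/n.
print(spectral_centroid(np.array([1.0, 1 / 2, 1 / 3])))  # ~1.63

def amp_envelope(t, at, st, dt):
    # Linear attack (AT), flat sustain (ST), exponential decay (DT).
    # Decay constant (envelope down to exp(-5) of peak by AT+ST+DT) is an assumption.
    env = np.zeros_like(t)
    attack, sustain, decay = t < at, (t >= at) & (t < at + st), t >= at + st
    env[attack] = t[attack] / at
    env[sustain] = 1.0
    env[decay] = np.exp(-5.0 * (t[decay] - at - st) / dt)
    return env

def sc_trajectory(t, sc_base, d_sc, sc_rt, sc_ft):
    # Raised-cosine rise from SCbase to SCbase + dSC over SCRT, then a slower
    # raised-cosine fall back to SCbase over SCFT, constant thereafter.
    sc = np.full_like(t, sc_base)
    rise, fall = t < sc_rt, (t >= sc_rt) & (t < sc_rt + sc_ft)
    sc[rise] = sc_base + d_sc * 0.5 * (1 - np.cos(np.pi * t[rise] / sc_rt))
    sc[fall] = sc_base + d_sc * 0.5 * (1 + np.cos(np.pi * (t[fall] - sc_rt) / sc_ft))
    return sc

t = np.arange(0, 0.300, 1 / 22050)                     # one 300-ms tone slot
env_B = amp_envelope(t, at=0.080, st=0.000, dt=0.117)  # Timbre B amplitude envelope
sc_B = sc_trajectory(t, 1.59, 4.32, 0.049, 0.102)      # Timbre B spectral-flux curve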
Stimuli consisted of 60 six-note melodies played with these timbres. The range of fundamental frequencies was 288.2 Hz (C♯4) to 493.9 Hz (B4). One of these melodies, realized with each of the timbres, is shown in a spectrographic representation in Fig. 2. Ten melodies with a given timbre constituted a block. All melodies were tonal and ended on the tonic note. The pitch range was confined to an octave but varied from five to nine semitones across melodies: 21 melodies had a range of five semitones (ST), 21 of seven ST, and 18 of nine ST. Each block had 3 or 4 five-ST melodies, 3 or 4 seven-ST melodies, and 3 nine-ST melodies. Thirty melodies had one contour change, and 30 had two changes (five of each per block). There were 36 major-mode and 24 minor-mode melodies (6 major/4 minor per group). Key changed from melody to melody in order to keep the mean pitch constant across melodies. An attempt was also made to keep the pitch distribution relatively constant across blocks. The time interval between onsets of successive tones within a melody was 300 ms, making the total duration of each melody approximately 1.8 s for Timbre A melodies and 1.7 s for Timbre B melodies. A new melody occurred every 2.4 s, and a block of 10 melodies thus lasted 24 s.

FIG. 2. Acoustic features of Timbre A (top) and Timbre B (bottom). (Left panel) Spectrographic representation of one of the stimulus
melodies. The level is indicated by the darkness of the gray scale. The range represented by the gray scale is approximately 90 dB. (Top right
panel) The amplitude envelope of a single tone, representing the rms amplitude as a function of time. (Bottom right panel) The spectral flux
function for a single tone, representing the variation of the spectral centroid (SC) over time.

fMRI Tasks

Each subject performed two runs of the same experiment in the scanner. Each experiment consisted of 12 alternating 24-s epochs (epochs of 10 melodies with a given timbre), for a total of 120 melodies per scan. There were 6 Timbre A epochs interleaved with 6 Timbre B epochs. Each of the six blocks of ten melodies was presented for each timbre, but the order of blocks and the order of melodies within blocks were randomized for each subject. Subjects were asked to engage their attention on the melodies. For each six-note melody they were asked to press a response key as soon as the melody was over. In one run, the first epoch was played with Timbre A, and in the second run, the first epoch was played with Timbre B.

Procedure

The order of the two runs was randomized across subjects. Subjects were briefed on the task they were to perform during the scan and practiced the experiment prior to entering the scanner.

Behavioral Data Analysis

The aim of the behavioral data collection was to verify that listeners were paying attention to the stimuli to the same extent for A and B epochs. Reaction time (RT) and the number of correct and incorrect responses for A and B epochs were collapsed across the two runs. A response was considered correct if the subject pressed the appropriate response key after the end of the musical segment, but before the onset of the next stimulus. The percentages of correct responses and RTs to correct trials were compared using two-tailed paired t tests. The significance threshold was set at 0.05.
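A minimal sketch of this comparison is shown below, using SciPy's paired t test with invented per-subject values standing in for the study's data:

import numpy as np
from scipy import stats

# Hypothetical per-subject mean RTs (ms) to correct trials, collapsed across runs.
rt_A = np.array([1850, 1900, 1975, 1880, 1940, 1910, 1895, 1930, 1960])
rt_B = np.array([1870, 1925, 1990, 1860, 1955, 1900, 1915, 1945, 1985])

t_stat, p_val = stats.ttest_rel(rt_A, rt_B)  # two-tailed paired t test
print(f"t({len(rt_A) - 1}) = {t_stat:.3f}, P = {p_val:.2f}")  # compared against alpha = 0.05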

fMRI Acquisition

Images were acquired on a 3T GE Signa scanner using a standard GE whole head coil (software Lx 8.3). A custom-built head holder was used to prevent head movement; 25 axial slices (5 mm thick, 0 mm skip) parallel to the AC/PC line and covering the whole brain were imaged with a temporal resolution of 1.6 s using a T2*-weighted gradient echo spiral pulse sequence (TR, 1600 ms; TE, 30 ms; flip angle, 70°; 180 time frames; and one interleave) (Glover and Lai, 1998). The field of view was 220 × 220 mm², and the matrix size was 64 × 64, providing an in-plane spatial resolution of 3.44 mm. To reduce blurring and signal loss arising from field inhomogeneities, an automated high-order shimming method based on spiral acquisitions was used before acquiring functional MRI scans (Kim et al., 2000). Images were reconstructed, by gridding interpolation and inverse Fourier transform, for each of the 180 time points into 64 × 64 × 25 image matrices (voxel size, 3.44 × 3.44 × 5 mm). A linear shim correction was applied separately for each slice during reconstruction using a magnetic field map acquired automatically by the pulse sequence at the beginning of the scan (Glover and Lai, 1998).

To aid in localization of functional data, a high-resolution T1-weighted spoiled GRASS gradient recalled (SPGR) inversion recovery 3D MRI sequence was used with the following parameters: TI, 300 ms; TR, 8 ms; TE, 3.6 ms; flip angle, 15°; 22-cm field of view; 124 slices in the sagittal plane; 256 × 192 matrix; two averages; acquired resolution, 1.5 × 0.9 × 1.1 mm. The images were reconstructed as a 124 × 256 × 256 matrix with a 1.5 × 0.9 × 0.9 mm spatial resolution. Structural and functional images were acquired in the same scan session.

Stimulus Presentation

The task was programmed using PsyScope (Cohen et al., 1993) on a Macintosh (Cupertino, CA) computer. Initiation of scan and task was synchronized using a TTL pulse delivered to the scanner timing microprocessor board from a "CMU Button Box" microprocessor (http://poppy.psy.cmu.edu/psyscope) connected to the Macintosh. Auditory stimuli were presented binaurally using a custom-built magnet-compatible system. This pneumatic delivery system was constructed using a piezoelectric loudspeaker attached to a cone-shaped funnel, which in turn was connected to flexible plastic tubing leading to the participant's ears. This delivery system filters the sound with an overall low-pass characteristic and a spectral dip in the region of 1.5 kHz (Fig. 3). The tubing passed through foam earplug inserts that attenuated external sound by approximately 28 dB. The sound pressure level at the head of the participant due to the fMRI equipment during scanning was approximately 98 dB (A), and so after the attenuation provided by the ear inserts, background noise was approximately 70 dB (A) at the ears of the listener. The experimenters set the stimuli at a comfortable listening level determined individually by each participant during a test scan.

Image Preprocessing

fMRI data were preprocessed using SPM99 (http://www.fil.ion.ucl.ac.uk/spm). Images were corrected for movement using least squares minimization without higher order corrections for spin history (Friston et al., 1996) and were normalized to stereotaxic Talairach coordinates using nonlinear transformations (Ashburner and Friston, 1999; Friston et al., 1996). Images were then resampled every 2 mm using sinc interpolation and smoothed with a 4-mm Gaussian kernel to reduce spatial noise.

Statistical Analysis

Statistical analysis was performed on individual and group data using the general linear model and the theory of Gaussian random fields as implemented in SPM99. This method takes advantage of multivariate regression analysis and corrects for temporal and spatial autocorrelations in the fMRI data (Friston et al., 1995). Activation foci were superimposed on high-resolution T1-weighted images and their locations were interpreted using known neuroanatomical landmarks (Duvernoy et al., 1999). MNI coordinates were transformed to Talairach coordinates using a nonlinear transformation (Brett, 2000).

A within-subjects procedure was used to model all the effects of interest for each subject. Individual subject models were identical across subjects (i.e., a balanced design was used). Confounding effects of fluctuations in the global mean were removed by proportional scaling where, for each time point, each voxel was scaled by the global mean at that time point. Low-frequency noise was removed with a high-pass filter (0.5 cycles/min) applied to the fMRI time series at each voxel. A temporal smoothing function (Gaussian kernel corresponding to a half-width of 4 s) was applied to the fMRI time series to enhance the temporal signal-to-noise ratio. Regressors were modeled with a boxcar function corresponding to the epochs during which each timbre was presented and convolved with a hemodynamic response function. We then defined the effects of interest for each subject with the relevant contrasts of the parameter estimates. Group analysis was performed using a random-effects model that incorporated a two-stage hierarchical procedure. This model estimates the error variance for each condition of interest across subjects, rather than across scans, and therefore provides a stronger generalization to the population from which the data are acquired (Holmes and Friston, 1998). In the first stage, contrast images for each subject and each effect of interest were generated as described above. In the second stage, these contrast images were analyzed using a general linear model to determine voxelwise t statistics. One contrast image was generated per subject, for each effect of interest. A one-way, two-tailed t test was then used to determine group activation for each effect. Finally, the t statistics were normalized to Z scores, and significant clusters of activation were determined using the joint expected probability distribution of height and extent of Z scores (Poline et al., 1997), with height (Z > 1.67; P < 0.05) and extent (P < 0.05) thresholds.

Contrast images were calculated using a within-subjects design for the following comparisons: (1) Timbre B–Timbre A; (2) Timbre A–Timbre B.
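The block regressor described above can be sketched as follows. This is a generic NumPy/SciPy illustration of a boxcar convolved with a conventional double-gamma hemodynamic response function, not SPM99's internal implementation; the HRF parameters and the assumption that the run begins with a Timbre A epoch are ours.

import numpy as np
from scipy.stats import gamma

TR = 1.6        # volume repetition time (s)
n_frames = 180  # time frames per run
epoch_s = 24    # 12 alternating 24-s epochs (A, B, A, B, ...)

# Boxcar for the Timbre B epochs: 1 during B blocks, 0 during A blocks.
frame_times = np.arange(n_frames) * TR
epoch_index = (frame_times // epoch_s).astype(int)
boxcar_B = (epoch_index % 2 == 1).astype(float)  # assumes the run starts with Timbre A

# Conventional double-gamma HRF sampled at the TR (peak near 5 s, late undershoot).
t = np.arange(0, 32, TR)
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6
hrf /= hrf.sum()

# Convolve and truncate to the scan length; this column enters the design matrix,
# and the B-minus-A effect is a contrast on its parameter estimate.
regressor_B = np.convolve(boxcar_B, hrf)[:n_frames]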

FIG. 3. Fourier spectra (red) for Timbres A and B (left and right panels, respectively). The spectra were computed on 2048-sample (46
ms) segments with a Hanning window in the middle of the second tone from melody 1 (Fig. 2). The top panels show the spectra at the output
of the computer’s converters. The bottom panels show the sound after filtering by the pneumatic delivery system. Note an overall low-pass
characteristic plus a dip in the spectrum in the region of 1500 Hz. The filtering affects Timbre B sounds more than Timbre A sounds, but the
two are still quite distinct spectrally. Also superimposed on the bottom panels is the estimated spectrum of the scanner noise (black). This
noise could not be recorded under the same conditions due to the magnetic field of the scanner. This relative level is roughly estimated to
correspond to that present in the headphone system. It does not take into account filtering by the headphones, which would be quite variable
in the low-frequency region depending on the fit of the headphones to the head of each listener, nor the fact that the stimulus level was
adjusted individually for each listener.
FIG. 4. Surface rendering of brain activation in response to Timbre B stimuli compared to Timbre A stimuli. Significant clusters of
activation were observed in the central sections of the STG and STS. Note that left hemisphere activations are posterior to right hemisphere
activations. No brain regions showed greater activation in response to Timbre A stimuli, compared to those with Timbre B.

RESULTS

Behavioral

End-of-melody detections did not differ in accuracy between A and B blocks (t(8) = −0.900; P = 0.40), nor did RT (t(8) = −1.876; P = 0.10). Accuracy rates were 99.5 ± 2.8% for Timbre A and 99.8 ± 1.0% for Timbre B. RTs were 1914 ± 59.5 and 1931 ± 76.0 ms for Timbres A and B, respectively. We conclude that there were no behavioral differences (and thus no confounding attentional or arousal differences) between the two sets of stimulus blocks containing the two timbres.

TABLE 1
Brain Regions That Showed Significant Activation Differences between the Two Timbre Stimuli

                                   P value (corrected)   No. of voxels   Peak Z score   Talairach coordinates
Timbre B–Timbre A
  Right STG/STS (BA 41/22/21)      <0.001                297             3.31           50, −8, −4
  Left STG/STS (BA 41/22/21)       <0.002                285             2.55           −36, −24, 8
Timbre A–Timbre B                  No significant clusters

Note. For each significant cluster, the region of activation, significance level, number of activated voxels, maximum Z score, and location of the peak in Talairach coordinates are shown. Each cluster was significant after height and extent correction (P < 0.05). STG, superior temporal gyrus; STS, superior temporal sulcus; MTG, middle temporal gyrus; BA, Brodmann Area.

Brain Activation

Timbre B–Timbre A. Two significant, tightly focused clusters of activation were observed in both the left and right temporal lobes (Table 1, Fig. 4). In both hemispheres, the main activation foci were localized to the dorsomedial surface of the superior temporal gyrus (STG) (BA 41/42/22), adjoining the insular cortex and extending to the upper and lower banks of the STS (BA 21). Additionally, in the left hemisphere, a more posterior and medial activation focus overlying the circular insular sulcus was also observed.

Seven of the nine subjects showed significant activation in the left temporal lobe cluster and eight also showed activation in the right temporal lobe cluster. There was no difference overall between the percentages of voxels activated in the left and right temporal lobes (Wilcoxon matched pairs test, Z = 0.296; P = 0.76). Similarly, no difference was seen in the average intensity of activation between the two hemispheres (Z = 1.006; P = 0.31). Interestingly, activation in the left temporal lobe was posterior to the right temporal lobe activation (Fig. 5). In the left hemisphere, the most anterior activation point was at −34, −20, 8 and the most posterior point was at −41, −40, 10. In the right hemisphere, the most anterior point was at 54, −4, −8, and the most posterior point was at 40, −28, −6.

To quantify this asymmetry in activation, we examined the number of voxels activated anterior and posterior to y = −22 (midway between the anteriormost and posteriormost activations). Only 14 voxels were activated in the left anterior region compared to 271 in the left posterior region, whereas 286 voxels were activated in the right anterior region compared to only 11 in the right posterior region. Three of the nine subjects had no activated voxels in the left anterior region and six had no activation in the right posterior region. To further analyze the asymmetry in activation, we examined the Hemisphere by Anterior/Posterior interaction using Analysis of Variance (ANOVA). For this analysis, we constructed identical and (left/right) mirror-symmetric regions of interest (ROI) in the left and right hemispheres based on the observed group activation. Within these ROIs, the number of voxels activated in each Hemisphere in the Anterior and Posterior subdivisions was computed for each subject. As shown in Fig. 6, ANOVA revealed a significant Hemisphere × Anterior/Posterior interaction (F(1,8) = 20.11; P < 0.002). Mean hemispheric differences were not significant (F(1,8) = 0.21; P = 0.66), nor were mean Anterior/Posterior differences (F(1,8) = 0.72; P < 0.42).

Timbre A–Timbre B. No brain regions showed greater activation during Timbre A compared to Timbre B.
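A sketch of this quantification is given below. It assumes a per-subject binary activation mask and its voxel-to-Talairach affine (hypothetical inputs), uses raw voxel counts rather than the percentage measure reported in Fig. 6, and tests the Hemisphere by Anterior/Posterior interaction as a paired t test on difference scores, which for a 2 by 2 within-subjects design equals the ANOVA interaction (F = t squared). It is a reimplementation for illustration, not the authors' analysis code.

import numpy as np
from scipy import stats

Y_SPLIT = -22.0  # coronal plane (Talairach y, mm) midway between the extreme activations

def count_ant_post(mask, affine):
    # Count activated voxels anterior (y > -22) and posterior (y < -22) to the split,
    # separately for the left (x < 0) and right (x > 0) hemispheres.
    ijk = np.argwhere(mask)                       # indices of activated voxels
    xyz = ijk @ affine[:3, :3].T + affine[:3, 3]  # voxel indices -> mm coordinates
    left, right = xyz[:, 0] < 0, xyz[:, 0] > 0
    ant, post = xyz[:, 1] > Y_SPLIT, xyz[:, 1] < Y_SPLIT
    return {"L_ant": int((left & ant).sum()), "L_post": int((left & post).sum()),
            "R_ant": int((right & ant).sum()), "R_post": int((right & post).sum())}

def interaction_test(counts):
    # counts: one dict per subject, as returned by count_ant_post.
    la = np.array([c["L_ant"] for c in counts], float)
    lp = np.array([c["L_post"] for c in counts], float)
    ra = np.array([c["R_ant"] for c in counts], float)
    rp = np.array([c["R_post"] for c in counts], float)
    # Hemisphere x Anterior/Posterior interaction as a paired test on difference scores.
    t_stat, p_val = stats.ttest_rel(la - lp, ra - rp)
    return t_stat ** 2, p_val  # F(1, n-1) = t^2 for this within-subjects contrast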
DISCUSSION

Our findings strongly support the notion that timbre-related information is processed in both the left and right hemispheres. Furthermore, timbre-related differences in activation were detected only in mid sections of the STG, STS, and adjoining insular cortex. Although the overall extent of activation was not significantly different in the left and right temporal lobes, left temporal lobe activations were significantly posterior to right temporal lobe activations, suggesting a functional asymmetry in the left and right temporal lobe regions involved in timbre processing. The present study provides information about the precise temporal lobe regions involved in timbre processing. The implications of these findings for timbre processing are discussed below.

Specifically, we found that timbre stimuli with greater spectral flux, higher spectral centroid, and slower attack (Timbre B) resulted in greater activation than timbre stimuli with low spectral centroid, no spectral flux, and faster attack (Timbre A). No brain regions showed significantly greater activation during Timbre A compared to Timbre B. Another functional imaging study of timbre processing identified the right superior and middle frontal gyri as brain regions involved in timbral processing (Platel et al., 1997), a finding that was based on a comparison of selective attending to timbral change with selective attending to pitch or rhythm change rather than a direct comparison of two distinct timbre stimuli. The present study did not find significant activation in these regions. The lack of prefrontal cortical activation in the present study may be related to the fact that subjects did not have to make any complex comparisons or decisions about the timbre stimuli, nor did the task require the involvement of selective attentional processes. Unlike the Platel et al. study, which used a subtle, primarily spectral timbral change ("bright" vs "dull" oboe), the present study used timbre stimuli that have been shown to be perceptually very dissimilar (McAdams et al., 1995). The subtle timbral changes may also have been the reason for the lack of temporal lobe activation in the Platel et al. study. Our data are in agreement with those of lesion studies which have suggested involvement of the temporal lobe in timbre processing (Milner, 1962; Samson and Zatorre, 1994). Lesion studies from the Montreal Neurological Institute have reported that right temporal lobe lesions result in greater impairment in timbre discrimination than left temporal lobe lesions (Milner, 1962; Samson and Zatorre, 1994). Nevertheless, a closer look at the published data also suggests that left temporal lesions do impair timbre discrimination. These studies have not, however, provided any details of the precise temporal lobe regions involved in timbre perception and have instead focused on examining the relative contributions of the entire left and right temporal lobes to timbre processing. The present study clearly indicates that focal regions of the left and right temporal lobes are involved in timbre processing.

Equal levels of activation were detected in the left and right temporal lobes. To our knowledge, lateralization of timbre-related activation has never been directly tested before. Within the temporal lobes, activation was localized to the primary auditory cortex (PAC) as well as the auditory association areas in the belt areas of the STG and in the STS. Anteriorly, the activation foci did not extend beyond Heschl's gyrus (HG) (Howard et al., 2000; Kaas and Hackett, 2000; Tian et al., 2001). The most anterior activation point was at y = −4 mm (Talairach coordinates) in the right temporal lobe. The bilateral activation observed in the present study is in agreement with human electrophysiological studies that have found bilateral N100 sources during timbre-specific processing (Crummer et al., 1994; Jones et al., 1998; Pantev et al., 2001).

Interestingly, activation for Timbre B was greater than that for Timbre A, despite the matching of the two stimuli for loudness. One plausible explanation is that Timbre B has a greater spectral spread (or perceivable spectral bandwidth) than Timbre A and would thus have a more extensive tonotopic representation, at least in the primary and secondary auditory cortices (Howard et al., 2000; Wessinger et al., 2001). While the two timbres have the same number of harmonics and thus ideally would have the same bandwidths, it is likely that the higher harmonics of Timbre A are below the auditory threshold and/or masked by the scanner noise, thus creating a perceptible bandwidth difference. Bandwidth differences were also present in the Wessinger et al. (2001) and Hall et al. (2001) studies. However, in all three cases, other factors were also varied (the relative periodicity of the signals in Wessinger et al., the pitch in Hall et al., and attack time and spectral flux in the present study), so the true contribution of bandwidth per se cannot be unambiguously determined. Another possibility is that the greater spectral flux in Timbre B may result in greater cortical activation, particularly in the STS, as this region appears to be sensitive to temporal changes in spectral distribution. Finally, the relative salience of the stimuli may also be important since Timbre B is clearly more salient perceptually than A, but it is not easy to measure "salience" per se. At any rate, it may not be possible to independently control salience and timbre, since some timbral attributes, such as brightness and sharp attack, generally seem to make sounds "stand out."

Our results also bear upon the cortical representation of loudness and its examination using fMRI, since the fMRI activation patterns of two loudness-matched stimuli are so unbalanced. Studies of the effect of sound level on the extent and level of temporal lobe activation have shown complex relationships based on the nature of the pure/harmonic-complex tones used (Hall et al., 2001) as well as the precise subregion and laterality of the auditory cortical activation examined (Brechmann et al., 2002). For example, sound level had a small, but significant, effect on the extent of auditory cortical activation for the pure tone, but not for the harmonic-complex tone, while it had a significant effect on the response magnitude for both types of stimuli (Hall et al., 2001). Furthermore, the relationships among sound level, loudness perception, and intensity coding in the auditory system are complex (Moore, 1995). In any case, our results discount simple models for loudness that are based, implicitly or explicitly, on the degree of activation of neural populations.

Although there were no overall hemispheric differences in extent of activation, our results clearly point to differences in the specific temporal lobe regions activated in the left and right hemispheres. ANOVA revealed that distinct subregions of the STG/STS were activated in each hemisphere: activation in the left hemisphere was significantly posterior to right hemisphere activation, with very few voxels showing activation in the anterior regions (y > −22) in the left hemisphere. Conversely, very few voxels showed activation in the posterior regions (y < −22) in the right hemisphere. These results point to a hemispheric asymmetry in neural response to timbre; indeed, the location of peak activation in left auditory cortex was 16 mm posterior to that of peak activation in right auditory cortex. This shift is 8–11 mm more posterior than can be accounted for by morphological differences in the locations of the left and right auditory areas (Rademacher et al., 2001). Based on a detailed cytoarchitectonic study of postmortem brains, Rademacher et al. have reported a statistically significant asymmetry in the location of the PAC in the left and right hemispheres. The right PAC is shifted anteriorly by 7 or 8 mm and laterally by 4 or 5 mm with respect to the left PAC. An almost identical pattern of functional asymmetry was observed in the present study—the right hemisphere activation peak was shifted more anteriorly and laterally than the activation peak in the left hemisphere. While the centroid of the right PAC is roughly shifted by ΔX = 4, ΔY = 8, and ΔZ = 1 mm, the peak fMRI activations in the present study are shifted by ΔX = 14, ΔY = 16, and ΔZ = −12 mm in the X, Y, and Z directions, respectively. Thus, even after accounting for the anatomical asymmetry in the location of the PAC, left hemisphere activation is still considerably posterior to right hemisphere activation, thus suggesting a functional asymmetry in the left and right temporal lobe contributions to timbre processing. Furthermore, these hemispheric differences do not appear to arise from volumetric differences, since careful measurements of the highly convoluted auditory cortex surface have revealed no size difference between the left and right PAC or HG (Rademacher et al., 2001). Moreover, the functional asymmetry observed during timbre processing extends beyond the dorsal STG to the STS and the adjoining circular insular sulcus as well.

FIG. 5. (a) Coronal and (b) axial sections showing brain activation in response to Timbre B stimuli, compared to those with Timbre A.
Activation is shown superimposed on group-averaged, spatially normalized, T1-weighted images. In each hemisphere, two distinct subclus-
ters of activation could be clearly identified, one in the dorsomedial STG and the other localized to the STS. The activations also extend into
the circular insular sulcus and insula. See text for details.

In each hemisphere, two distinct subclusters of activation could be clearly identified, one in the STG and the other localized to the STS. The anteriormost aspects of the dorsomedial STG activations in both hemispheres were localized to a stripe-like pattern of activation in area T1b which follows the rostromedial slope of HG (Brechmann et al., 2002). In the left hemisphere, the activation extends posteriorly to area T2, which is centered on Heschl's sulcus. In both hemispheres, the activations appear to be anterior to area T3, which principally overlaps the main body of the planum temporale. As defined by Brechmann et al. (2002), area T1b borders the circular sulcus medially, the first transverse sulcus anteriorly, and the roof of HG caudolaterally. The caudally adjacent T2 was centered on Heschl's sulcus, thus including the posterior rim of HG and a similar area behind the sulcus on the anterior planum temporale/second Heschl's gyrus. There is very little activation in area T1a, the division of the HG rostral to the first transverse sulcus. Areas T1a and T1b overlap to a great extent the PAC probability maps and Talairach coordinate ranges reported by Rademacher et al. (2001).

Based on the landmarks identified by Rademacher et al. and Brechmann et al., it therefore appears that in the left hemisphere, the dorsal STG activation extends posteriorly beyond the PAC to the posterior HG/anterior tip of the planum temporale (BA 42/22) in the belt areas of the auditory association cortex. In the right STG, activation was primarily observed in anterior and posterior HG, including the PAC (BA 41). No activation was observed in the auditory association cortex A2 covering the planum polare (BA 52) anterior to the primary auditory cortex in either the left or right STG.

Although processing of nonlinguistic, nonvocal, auditory stimuli is generally thought to take place in the STG, we also found significant activation in the STS and adjoining circular insular sulcus and insula in both hemispheres. The STS regions activated in the present study have direct projections to the STG (Pandya, 1995) and are considered part of the hierarchical organization of the auditory association cortex (Kaas and Hackett, 2000). These STS zones receive input primarily from primary and secondary auditory cortices in the STG, and the upper bank receives input exclusively from the STG (Seltzer and Pandya, 1978). Note that these STS regions are anterior to, and functionally distinct from, the visual association areas of the STS that are involved in visual motion and gaze perception. This central STS region is not generally known to be activated by pure tones or even by frequency modulation (FM) of tones, in contrast to the T1b and T2 regions of the auditory cortex (Brechmann et al., 2002). Relatively little is known about the auditory functions of the STS, but Romanski et al. (1999) have proposed the existence of a ventral temporal lobe stream that plays a role in identifying coherent auditory objects, analogous to the ventral visual pathway involved in object recognition. Interestingly, Belin et al. (2000) have shown that similar central STS auditory areas are activated when subjects listen passively to vocal sounds, whether speech or nonspeech.

FIG. 6. To characterize the observed hemispheric asymmetry in temporal lobe activation, we used analysis of variance (ANOVA) on the dependent variable "percentage of voxels activated," with factors Hemisphere (Left/Right) and Anterior/Posterior divisions. ANOVA revealed a significant Hemisphere × Anterior/Posterior interaction (F(1,8) = 20.11; P < 0.002), with the left hemisphere activations significantly posterior to the right hemisphere activations. The coronal slice at y = −22 mm (Talairach coordinates), midway between the anterior- and posteriormost activation in the temporal lobes, was used to demarcate the anterior and posterior subdivisions. Note that this analysis was based on activation maps obtained in individual subjects.

The present study suggests that these STS regions can be activated even when stimuli do not have any vocal characteristics. A common feature that the vocal sounds used in the Belin et al. study and the Timbre B stimuli used in the present study have is that both involve rapid temporal changes in spectral energy distribution (spectral flux). We suggest that timbre, which plays an important role in both music and language, may invoke common neural processing mechanisms in the STS. Unlike the STG, the precise functional divisions of the left and right STS are poorly understood. One clue that may, in the future, aid in better understanding the functional architecture of the STS is that we observed a functional hemispheric asymmetry in the STS similar to that in the STG, in that left STS activations were also posterior to those observed in the right STS. The posterior insular regions activated in this study are known to be directly connected to the STG (Galaburda and Pandya, 1983), suggesting a role in auditory function, but the role of this region in auditory processing is less well understood than that of the STS.

The differential contributions of the left and right STG/STS to timbre processing are not clear. Lesion studies have provided very little information on this issue, partly because of the large extent of the excisions (Liegeois-Chauvel et al., 1998). It is possible that the right hemisphere bias for timbre processing observed in lesion studies may arise partly because more posterior left STG regions that contribute to timbre processing are generally spared in temporal lobectomies. One important reason that the contribution of the left temporal lobe to timbre processing is of considerable interest is the fact that the same acoustical cues involved in the perception of musical timbre may also be involved in processing linguistic cues. For example, spectral shape features may provide a more complete set of acoustic correlates for vowel identity than do formants (Zahorian and Jagharghi, 1993), analogous to the manner in which timbre is most closely associated with the distribution of acoustic energy across frequency (i.e., the shape of the power spectrum). Evidence for this comes from Quesnel's study, in which subjects were easily taught to identify subtle changes in musical timbre based on similarities between these timbres and certain English vowels (Quesnel, 2001). Although it has been commonly assumed that the right and left hemispheres play dominant roles in musical and language processing, respectively (Kolb and Willshaw, 2000), our findings suggest that music and language processing may in fact share common neural substrates. In this regard, our data are consistent with human electrophysiological recordings which have also found equivalent bilateral STG activation during music and language processing (Creutzfeldt and Ojemann, 1989). It has also been suggested that the left STG may be critically involved in making distinctions between voiced and unvoiced consonants (Liegeois-Chauvel et al., 1999). Conversely, the right STG is involved in processing vocal sounds, whether speech or nonspeech, compared to nonvocal environmental sounds (Belin et al., 2000).

The present study marks an important step in charting out the neural bases of timbre perception, a critical structuring force in music (McAdams, 1999), as well as in the identification of sound sources in general (McAdams, 1993). This study motivates thinking about acoustic stimuli in terms of their timbral properties. This, in turn, will help to establish a more ecological framework for interpreting brain activations arising from acoustic stimuli with different timbral properties. Second, it provides a review of neuropsychological investigations of timbre perception that is integrated with a detailed description of the hemispheric asymmetries in the differential response to sounds that fall within different regions of timbral space.

Future studies will need to examine the relative contribution of the different components of timbre with independent, parametric control. Further electrophysiological studies are also needed that examine how the spectral profile shape and its dynamics are encoded by neurons in the auditory cortex and how the tonotopic and temporal patterns of neuronal activity relate to timbre perception (Shamma, 1999). We expect that such future work will allow us to examine the neural basis of some of the known perceptual structure of timbre, and we expect to find further evidence of bilateral hemispheric contributions to this important parameter of auditory perception, discrimination, and identification.

ACKNOWLEDGMENTS

This research was supported by NIH Grant HD 40761 and a grant from the Norris Foundation to V.M., by NSERC Grant 228175-00, FCAR Grant 2001-SC-70936, and CFI New Opportunities Grant 2719 to D.J.L., and NIH Grant RR 09784 to G.H.G. We are grateful to Nicole Dolensky for her assistance with the behavioral data analysis.

REFERENCES

Ashburner, J., and Friston, K. J. 1999. Nonlinear spatial normalization using basis functions. Hum. Brain Mapp. 7(4): 254–266.

Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P., and Pike, B. 2000. Voice-selective areas in human auditory cortex. Nature 403(6767): 309–312.

Berger, K. 1964. Some factors in the recognition of timbre. J. Acoust. Soc. Am. 36: 1888–1891.

Blood, A. J., Zatorre, R. J., Bermudez, P., and Evans, A. C. 1999. Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neurosci. 2(4).

Brechmann, A., Baumgart, F., and Scheich, H. 2002. Sound-level-dependent representation of frequency modulations in human auditory cortex: A low-noise fMRI study. J. Neurophysiol. 87(1): 423–433.

Brett, M. 2000. The MNI brain and the Talairach atlas. http://www.mrc-cbu.cam.ac.uk/Imaging/mnispace.html.

Cohen, J. D., MacWhinney, B., Flatt, M., and Provost, J. 1993. PsyScope: A new graphic interactive environment for designing psychology experiments. Behav. Res. Methods Instr. Comp. 25(2): 257–271.

Creutzfeldt, O., and Ojemann, G. 1989. Neuronal activity in the human lateral temporal lobe. III. Activity changes during music. Exp. Brain Res. 77(3): 490–498.

Crummer, G. C., Walton, J. P., Wayman, J. W., Hantz, E. C., and Frisina, R. D. 1994. Neural processing of musical timbre by musicians, nonmusicians, and musicians possessing absolute pitch. J. Acoust. Soc. Am. 95(5, Pt. 1): 2720–2727.

Duvernoy, H. M., Bourgouin, P., Cabanis, E. A., and Cattin, F. 1999. The Human Brain: Functional Anatomy, Vascularization and Serial Sections with MRI. Springer-Verlag, New York.

Efron, R. 1963. Temporal perception, aphasia and deja vu. Brain 86: 403–424.

Friston, K. J., Holmes, A. P., Worsley, K. J., Poline, J. P., Frith, C. D., and Frackowiak, R. S. J. 1995. Statistical parametric maps in functional imaging: A general linear approach. Hum. Brain Mapp. 2: 189–210.

Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S., and Turner, R. 1996. Movement-related effects in fMRI time-series. Magn. Reson. Med. 35(3): 346–355.

Galaburda, A. M., and Pandya, D. N. 1983. The intrinsic architectonic and connectional organization of the superior temporal region of the rhesus monkey. J. Comp. Neurol. 221(2): 169–184.

Glover, G. H., and Lai, S. 1998. Self-navigated spiral fMRI: Interleaved versus single-shot. Magn. Reson. Med. 39(3): 361–368.

Grey, J. M. 1977. Multidimensional perceptual scaling of musical timbres. J. Acoust. Soc. Am. 61: 1270–1277.

Griffiths, R., Johnsrude, I., Dean, J., and Green, G. 1999. A common neural substrate for the analysis of pitch and duration pattern in segmented sound? NeuroReport 10(18): 3825–3830.

Hajda, J. M., Kendall, R. A., Carterette, E. C., and Harshberger, M. L. (Eds.). 1997. Methodological Issues in Timbre Research. Psychology Press, London.

Hall, D. A., Haggard, M. P., Summerfield, A. Q., Akeroyd, M. A., Palmer, A. R., and Bowtell, R. W. 2001. Functional magnetic resonance imaging measurements of sound-level encoding in the absence of background scanner noise. J. Acoust. Soc. Am. 109(4): 1559–1570.

Helmholtz, H. L. F. 1863/1954. On the Sensations of Tone (A. J. Ellis, Trans.). Dover, New York.

Holmes, A. P., and Friston, K. J. 1998. Generalisability, random effects and population inference. NeuroImage 7(4): S754.

Howard, M. A., Volkov, I. O., Mirsky, R., Garell, P. C., Noh, M. D., Granner, M., Damasio, H., Steinschneider, M., Reale, R. A., Hind, J. E., and Brugge, J. F. 2000. Auditory cortex on the human posterior superior temporal gyrus. J. Comp. Neurol. 416(1): 79–92.

Johnsrude, I. S., Zatorre, R. J., Milner, B. A., and Evans, A. C. 1997. Left-hemisphere specialization for the processing of acoustic transients. NeuroReport 8(7): 1761–1765.

Jones, S. J., Longe, O., and Vaz Pato, M. 1998. Auditory evoked potentials to abrupt pitch and timbre change of complex tones: Electrophysiological evidence of 'streaming'? Electroencephalogr. Clin. Neurophysiol. 108(2): 131–142.

Kaas, J. H., and Hackett, T. A. 2000. Subdivisions of auditory cortex and processing streams in primates. Proc. Natl. Acad. Sci. USA 97(22): 11793–11799.

Kim, D. H., Adalsteinsson, E., Glover, G. H., and Spielman, S. 2000. SVD Regularization Algorithm for Improved High-Order Shimming. Paper presented at the Proceedings of the 8th Annual Meeting of ISMRM, Denver.

Kolb, B., and Willshaw, I. Q. 2000. Brain and Behavior. Worth, New York.

Levitin, D. J. 1999. Memory for musical attributes. In Music, Cognition and Computerized Sound: An Introduction to Psychoacoustics (P. R. Cook, Ed.), pp. 209–227. MIT Press, Cambridge, MA.

Liegeois-Chauvel, C., de Graaf, J. B., Laguitton, V., and Chauvel, P. 1999. Specialization of left auditory cortex for speech perception in man depends on temporal coding. Cereb. Cortex 9(5): 484–496.

Liegeois-Chauvel, C., Peretz, I., Babai, M., Laguitton, V., and Chauvel, P. 1998. Contribution of different cortical areas in the temporal lobes to music processing. Brain 121(Pt. 10): 1853–1867.

McAdams, S. 1993. Recognition of sound sources and events. In Thinking in Sound: The Cognitive Psychology of Human Audition (S. McAdams and E. Bigand, Eds.), pp. 146–198. Oxford Univ. Press, Oxford.

McAdams, S. 1999. Perspectives on the contribution of timbre to musical structure. Comput. Music J. 23(2): 96–113.

McAdams, S., and Cunibile, J. C. 1992. Perception of timbral analogies. Phil. Trans. R. Soc. Lond. B 336: 383–389.

McAdams, S., and Winsberg, S. 2000. Psychophysical quantification of individual differences in timbre perception. In Contributions to Psychological Acoustics: Results of the 8th Oldenburg Symposium on Psychological Acoustics (A. Schick, M. Meis, and C. Reckhardt, Eds.), pp. 165–182. Bis, Oldenburg.

McAdams, S., Winsberg, S., Donnadieu, S., De Soete, G., and Krimphoff, J. 1995. Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychol. Res. 58: 177–192.

Milner, B. (Ed.). 1962. Laterality Effects in Audition. Johns Hopkins Press, Baltimore.

Moore, B. C. J. (Ed.). 1995. Hearing. Academic Press, San Diego.

Pandya, D. N. 1995. Anatomy of the auditory cortex. Rev. Neurol. 151(8,9): 486–494.

Pantev, C., Roberts, L. E., Schulz, M., Engelien, A., and Ross, B. 2001. Timbre-specific enhancement of auditory cortical representations in musicians. NeuroReport 12(1): 169–174.

Pitt, M. A. 1995. Evidence for a central representation of instrument timbre. Percept. Psychophys. 57(1): 43–55.

Platel, H., Price, C., Baron, J., Wise, R., Lambert, J., Frackowiak, R., Lechevalier, B., and Eustache, F. 1997. The structural components of music perception. A functional anatomical study. Brain 120: 229–243.

Plomp, R. 1970. Timbre as a multidimensional attribute of complex tones. In Frequency Analysis and Periodicity Detection in Hearing (R. Plomp and G. F. Smoorenburg, Eds.), pp. 397–414. Sijthoff, Leiden.

Poline, J. B., Worsley, K. J., Evans, A. C., and Friston, K. J. 1997. Combining spatial extent and peak intensity to test for activations in functional imaging. NeuroImage 5(2): 83–96.

Quesnel, R. 2001. A Computer-Assisted Method for Training and Researching Timbre Memory and Evaluation Skills. Unpublished doctoral dissertation, McGill University, Montreal.

Rademacher, J., Morosan, P., Schormann, T., Schleicher, A., Werner, C., Freund, H. J., and Zilles, K. 2001. Probabilistic mapping and volume measurement of human primary auditory cortex. NeuroImage 13(4): 669–683.

Risset, J.-C., and Wessel, D. L. 1999. Exploration of timbre by analysis and synthesis. In The Psychology of Music (D. Deutsch, Ed.), 2nd ed., pp. 113–168. Academic Press, San Diego.

Robin, D., Tranel, D., and Damasio, H. 1990. Auditory perception of temporal and spectral events in patients with focal left and right cerebral lesions. Brain Language 39: 539–555.

Romanski, L. M., Tian, B., Fritz, J., Mishkin, M., Goldman-Rakic, P. S., and Rauschecker, J. P. 1999. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat. Neurosci. 2(12): 1131–1136.

Samson, S., and Zatorre, R. J. 1994. Contribution of the right temporal lobe to musical timbre discrimination. Neuropsychologia 32(2): 231–240.

Schellenberg, E. G., Iverson, P., and McKinnon, M. C. 1999. Name that tune: Identifying popular recordings from brief excerpts. Psychon. Bull. Rev. 6(4): 641–646.

Seltzer, B., and Pandya, D. N. 1978. Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res. 149(1): 1–24.

Shamma, S. A. 1999. Physiological basis of timbre perception. In Cognitive Neuroscience (M. S. Gazzaniga, Ed.), pp. 411–423. MIT Press, Cambridge.

Sidtis, J., and Volpe, B. 1988. Selective loss of complex-pitch or speech discrimination after unilateral lesions. Brain Language 34: 235–245.

Takeuchi, A. H., and Hulse, S. H. 1993. Absolute pitch. Psychol. Bull. 113(2): 345–361.

Tian, B., Reser, D., Durham, A., Kustov, A., and Rauschecker, J. P. 2001. Functional specialization in rhesus monkey auditory cortex. Science 292(5515): 290–293.

Wessinger, C. M., VanMeter, J., Tian, B., Van Lare, J., Pekar, J., and Rauschecker, J. P. 2001. Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging. J. Cogn. Neurosci. 13(1): 1–7.

Zahorian, S. A., and Jagharghi, A. J. 1993. Spectral-shape features versus formants as acoustic correlates for vowels. J. Acoust. Soc. Am. 94(4): 1966–1982.

Zatorre, R. 1988. Pitch perception of complex tones and human temporal-lobe function. J. Acoust. Soc. Am. 84: 566–572.

Zatorre, R., Evans, A., and Meyer, E. 1993. Functional activation of right temporal and occipital cortex in processing tonal melodies. J. Acoust. Soc. Am. 93: 2364.
