Professional Documents
Culture Documents
0956 CH15
0956 CH15
15 of Complex Sounds
and Species-Specific
Vocalizations in the
Marmoset Monkey
(Callithrix jacchus)
Xiaoqin Wang, Siddhartha C. Kadia, Thomas Lu,
Li Liang, and James A. Agamaite
CONTENTS
I. Introduction
II. Temporal Processing
A. Rate vs. Temporal Coding of Time-Varying Signals
B. Processing of Acoustic Transients within the Temporal
Integration Window
C. Extraction of Temporal Profiles Embedded in Complex Sounds
III. Spectral Processing
A. Multipeaked Tuning and Long-Range Horizontal Connections
B. Spectral Inputs from Outside the Classical Receptive Field
IV. Representation of Species-Specific Vocalizations
A. Historical Studies
B. The Nature of Cortical Representations
C. Dependence of Cortical Responses on Behavioral Relevance
of Vocalizations
D. Mechanisms Related to Coding of Species-Specific Vocalizations
V. Summary
Acknowledgments
References
45 r 45 r
40 r 40 r
35 r 35 r
30 r 30 r
25 r 25 r
20 r 20 r
15 r 15 r
12.5 r 12.5 r
10 r 10 r
7.5 r 7.5 r
5r 5 r
3r 3 r
0 500 1000 1500 0 500 1000 1500
Time (ms) Time (ms)
b b 60
Rate (spikes/sec)
Rate (spikes/sec)
100
damped 40
50
20
ramped ramped
damped
0 0
0 50 100 0 50 100
ISI (ms) ISI (ms)
40
30
Synchronized
population
20
10
0
0 20 40 60 80 100
Inter-click Interval (ms)
30 msec, as illustrated by the example in Figure 15.1B. We have shown that the
limit on stimulus-synchronized responses is on the order of 20 to 25 msec (median
value of sampled population) in AI in the unanesthetized condition.35 The observation
that neurons are sensitive to changes of short ISIs indicates that a discharge-rate-
based mechanism may be in operation when ISIs are shorter than 20 to 30 msec.
We have identified two populations of AI neurons displaying these two types
of responses: synchronized and nonsynchronized.35 The two populations appeared
to encode sequential stimuli in complementary manners. Neurons in the synchro-
nized population showed stimulus-synchronized discharges at long ISIs but few
responses at short ISIs (Figure 15.1A). This population of neurons can thus represent
slowly occurring temporal events explicitly using a temporal code. The nonsynchro-
nized population of neurons did not exhibit stimulus-synchronized discharges at
either long or short ISIs (Figure 15.1A). This population of neurons can implicitly
represent rapidly changing temporal intervals by their average discharge rates. On
the basis of these two populations of neurons, the auditory cortex can represent a
wide range of time intervals in sequential, repetitive stimuli (Figure 15.1C). The
large number of neurons with nonsynchronized and sustained discharges at short
ISIs were observed in the AI of awake marmosets35 but were not observed in the
AI of anesthetized cats.34
A C
Discharge Rate (spk/s)
8
0
2
B D
30%
Cumulative % of Units
20% 80%
60%
10% 40%
20%
0% 0%
4 8 16 32 64 128 256 2 4 8 16 32 64 128 512
rBMF (Hz) fmax (Hz)
FIGURE 15.2 Cortical responses to temporally modulated stimuli recorded from AI of awake
marmoset (data based on Liang et al.38). (A) Discharge-rate-based modulation transfer func-
tions (rMTFs) measured in a single neuron using sAM (solid line), sFM (dashed line with
open circles), and nAM (dotted line with filled squares). (B) Population properties for dis-
charge-rate-based modulation frequency selectivity. Distribution of rBMF resulted from sAM
(open) and sFM (gray) stimuli. The bin widths of the histograms are on a base-2 logarithmic
scale. The distributions of rBMFsAM and rBMFsFM are not statistically different from each
other (Wilcoxon rank sum test, p = 0.17). The medians of the rBMF distributions are 22.59
Hz (sAM) and 20.01 Hz (sFM). (C) Comparison between rate-based and synchrony-based
modulation selectivity evaluated on the basis of individual neurons. tBMF is plotted vs. rBMF
for sAM (circles) and sFM (crosses) stimuli. The distributions of tBMF and rBMF are
statistically different from each other (Wilcoxon rank sum test, p < 0.005). The diagonal
dashed line has a slope of 1. The medians of the tBMF distributions are 10.12 Hz (sAM) and
10.8 Hz (sFM). (D) Cumulative distributions of fmax for sAM (solid line) and sFM (dashed
line) stimuli.
S2 Frequency (kHz)
3CF
CF2 29.5
14 21.3
13.2
CF
CF1
5
7
100 200 300 400 0 100 200 300 400 500
b b
% Change in Discharge Rate
CF2
105 0 S1 alone
CF
87 CF1
-20
68 -40
49 -60
3CF
30 -80
CF1 alone
11 -100
B D
0.5CF 2CF
30 10
CF
Number of Neurons
N = 38 (56 ratios) N = 51
Number of Ratios
20
6
4
10
0 0
1.5 2 3 4 0 0.5 1 1.5 2 3 4
Peak Frequency Ratio Major Inhibition Peaks (f/CF)
(CF2/CF1; CF3/CF2; CF3/CF1)
FIGURE 15.3
FIGURE 15.3 Characteristics of multipeaked (A) and (B) and single-peaked (C) and (D)
neurons recorded from AI of awake marmosets (data based on Kadia and Wang45). Each two-
tone pair is defined by S1 (fixed tone) and S2 (variable tone). (A, part a) Dot raster showing
single tone responses for a multipeaked neuron. Stimulus onset was at 200 msec, duration
100 msec (indicated by the bar under the horizontal axis). This neuron was responsive to two
frequency regions, with peaks at 9.5 kHz (CF1) and 19 kHz (CF2). The most responsive
frequency (the frequency evoking the strongest discharge rate) in each region was designated
as a frequency peak. Note that CF1 and CF2 are harmonically related with a peak frequency
ratio of 2. (A, part b) Responses to two simultaneously presented tones. Discharge rate
(calculated for the duration of the stimuli) is plotted vs. S2 frequency; S1 frequency was 10
kHz. The error bars represent standard errors of the means (SEMs). The two-tone responses
indicate maximum facilitation at the frequencies near ~20 kHz (CF2 of this neuron). The
dotted line indicates the strength of response to CF1 tone alone. (B) Distribution of peak
frequency ratios (CFn/CF1) in 38 multipeaked neurons. CFn is determined as the frequency
with the strongest discharge rate at a sound level near threshold. Because a multipeaked
neuron can have more than two peaks, a neuron may contribute to more than one ratio in this
plot. (C, part a) Dot raster showing two-tone responses for a single-peaked neuron. (C, part
b) Percent change in discharge rate, defined as 100 × (RS1 + S2 – RS1)/RS1, where RS1 + S2 is the
average discharge rate to two-tone pair, and RS1 is the average discharge rate to the S1 tone
alone, is plotted vs. S2 frequency for the data shown in part a. The dotted line indicates the
strength of response to the S1 tone alone. Notice the complete inhibition of discharges at a
frequency equal to 3 × CF, which can also be seen in part a. The drop of response magnitude
at CF was due to the nonmonotonic rate-level function of this neuron. (D) Distribution of
major inhibitory peaks in 51 single-peaked neurons. f/CF is the ratio of the inhibitory fre-
quency to the CF of a unit. Single-peaked neurons in AI have inhibitory influences from a
wide range of frequencies outside the receptive fields, predominantly from harmonically
related frequencies (0.5 × CF, 2 × CF).
RRev (Spikes/Sec)
30
20
10
0
0 10 20 30 40
RNat (Spikes/Sec)
B 20
Marmoset A1 (N = 89)
Marmoset
Cat
Number of Units
15
10
0
-1 -0.5 0 0.5 1
Reversed Natural
Selectivity Index (d)
C 20 Cat A1 (N = 69)
Number of Units
15
10
0
-1 -0.5 0 0.5 1
Reversed Natural
Selectivity Index (d)
FIGURE 15.4 Comparison between cortical responses to natural and time-reversed marmoset
twitter calls in the AI of two species, marmoset and cat (data based on Wang and Kadia60).
(A) Mean firing rate of responses to a time-reversed twitter call (Rrev) is plotted vs. that of
the natural twitter call (Rnat) on a unit-by-unit basis for marmosets. (B) The distribution of
the selectivity index, d = (Rnat – Rrev)/(Rnat + Rrev), is shown for marmosets. (C) The distribution
of the selectivity index is shown for cats. The means of these two distributions, indicated by
vertical dashed lines in part b, are as follows: cat, 0.047 ± 0.265; marmoset, 0.479 ± 0.361.
The two distributions differ significantly (t-test, p < 0.0001).
social contexts by marmosets66 and therefore must be discriminated by the animals. Our
quantitative analysis showed that trill and trillphee calls were separable into clusters in
a multidimensional space of particular acoustic parameters.65 Vocalizations of each type
Freq (kHz)
(N=1105) 20
Frequency (kHz)
10
Number of Calls
200 0
0 200 400 15
Time (ms)
10
CF2
0
400 5
Trillphee CF1
20
Freq (kHz)
(N=1010)
10
b
200 0 3
2
0
20 24 28 32 36 40
Modulation Frequency (Hz) 1
b
30% Trill/Trillphee
Percent of Samples
rBMFsAM 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0 0.2 0.4 0.6
rBMFsFM
Time (sec)
20%
10%
0%
4 8 16 32 64 128 256
Frequency (Hz)
FIGURE 15.5 (A) Correlation between modulation frequency in marmoset vocalizations and
temporal modulation selectivity of AI neurons. (a) Distribution of the modulation frequency
measured from trill (top) and trillphee (bottom) calls.65 The insets show spectra of the calls.
Trill: µ ± STD = 27.7 ± 1.9 Hz, N = 1105; trillphee: µ ± STD = 29.8 ± 2.4 Hz, N = 1010.
The two distributions are statistically different (Wilcoxon rank sum test, p << 0.001). The
samples of each call type were recorded from four marmosets. (b) Comparison between the
distribution of modulation frequency of trill and trillphee calls combined (black, µ = 28.7 ±
2.4 Hz, N = 2115) and those of rBMFsAM (open) and rBMFsFM (gray) obtained in awake
marmoset.38 (B) Nonlinear facilitation in the response of multipeaked neurons to marmoset
vocalizations.45 (a) Spectrogram of a natural trill call (left), its first harmonic component
(middle), and its second harmonic component (right). (b) PSTHs of responses to the stimuli
shown in part a. This neuron had two spectral peaks at approximately 4.8 kHz (CF1) and 9.6
kHz (CF2) which match two harmonics of the trill call. The average discharge rate to the
natural trill call (30 spikes/sec) was greater than the sum of discharge rates to the first and
second harmonics presented alone (12 and 9 spikes/sec, respectively).
V. SUMMARY
While much progress has been made in recent years in the study of the auditory
cortex, our understanding of cortical mechanisms for encoding non-human primate
vocalizations is still in its infancy. Much of our knowledge of the auditory cortex
has been based on studies in anesthetized animals. An important step in moving this
field forward is the use of awake and behaving preparations. Only then can issues
such as the behavioral state of an animal and attentional modulation of cortical
responses be adequately addressed. A limitation of existing neurophysiological tech-
niques is that the vocal behavior of animals is substantially restricted or eliminated
once an animal is restrained. Application of implanted and moveable electrodes
attached to a telemetry device will significantly improve our ability to correlate
natural vocal behavioral and brain activities. Finally, while the visual system has
served as a good model system for understanding cortical coding mechanisms, we
must bear in mind the fundamental difference between auditory and visual systems.
Unlike the visual system, a significant portion of behaviorally important inputs to
the auditory systems of vocal species such as humans and primates are actively
produced by the species themselves (i.e., speech and species-specific vocalizations).
The auditory cortical system in humans and non-human primates may therefore
ACKNOWLEDGMENTS
We thank Dr. Ross Snider and Steven Eliades for their work on the data acquisition
system used in our studies and Ashley Pistorio for colony management, animal
training, and proofreading the manuscript. Our research has been supported by NIH-
NIDCD Grant DC03180 and by a Presidential Early Career Award for Scientists
and Engineers (X.W.). Publications from our laboratory can be obtained at
www.bme.jhu.edu/~xwang/papers.html.
REFERENCES
1. Heffner, H.E. and Heffner, R.S., Effect of unilateral and bilateral auditory cortex
lesions on the discrimination of vocalizations by Japanese macaques, J. Neurophysiol.,
56, 683–701, 1986.
2. Penfield, W. and Roberts, L., Speech and Brain Mechanisms, Princeton University
Press, Princeton, NJ, 1959.
3. Pandya, D.N. and Yeterian, E.H., Architecture and connections of cortical association
areas, in Cerebral Cortex, Vol. 4, A. Peters and E.G. Jones, Eds., Plenum, New York,
1985, pp. 3–61.
4. Kaas, J.H., Hackett, T.A., and Tramo, M.J., Auditory processing in primate cerebral
cortex, Curr. Opin. Neurobiol., 9, 164–170, 1999.
5. Jones, E.G. and Powell, T.P.S., An anatomical study of converging sensory pathways
within the cerebral cortex of monkey, Brain, 93, 793–820, 1970.
6. Morel, A. and Kaas, J.H., Subdivisions and connections of auditory cortex in owl
monkeys, J. Comp. Neurol., 318, 27–63, 1992.
7. Kaas, J.H. and Hackett, T.A., Subdivisions of auditory cortex and processing streams
in primates, Proc. Natl. Acad. Sci. USA, 97, 11793–11799, 2000.
8. Hackett, T.A., Preuss, T.M., and Kaas, J.H., Architectonic identification of the core
region in auditory cortex of macaques, chimpanzees, and humans, J.þComp Neurol.,
441, 197–222, 2001.
9. Brodmann, K., Vergleichende Lokalisationslehre der Grobhirnrinde, J.A. Barth,
Leipzig, 1909.
10. Goldstein, M.H.J., Kiang, N.Y.-S., and Brown, R.M., Responses of the auditory cortex
to repetitive acoustic stimuli, J.þAcoust. Soc. Am., 31, 356–364, 1959.