

NEUROBIOLOGY OF LANGUAGE

Edited by

GREGORY HICKOK
Department of Cognitive Sciences, University of California, Irvine, CA, USA

STEVEN L. SMALL
Department of Neurology, University of California, Irvine, CA, USA

Academic Press is an imprint of Elsevier

SECTION D
LARGE-SCALE MODELS

CHAPTER 25
Neural Basis of Speech Perception

Gregory Hickok¹ and David Poeppel²,³
¹Department of Cognitive Sciences, Center for Language Science, Center for Cognitive Neuroscience, University of California, Irvine, CA, USA; ²Department of Psychology, New York University, New York, NY, USA; ³Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany

25.1 INTRODUCTION

The mind/brain must figure out at least two things when faced with the task of learning a natural language. One is how to transform the sound patterns of speech into a representation of the meaning of an utterance. The other is how to reproduce those sound patterns with the vocal tract (or, in the case of signed languages, with manual and facial gestures). Put differently, speech information must be processed along two different routes, an auditory-conceptual route and an auditory-motor route. These two processing streams involve partially segregated circuits in the brain and form the basis of the dual route model of speech processing (Hickok & Poeppel, 2000, 2004, 2007), which traces its roots to the classical model of Wernicke (Wernicke, 1874/1977) and parallels analogous proposals in the visual (Milner & Goodale, 1995) and somatosensory (Dijkerman & de Haan, 2007) systems. Thus, the division of labor proposed in dual route models, wherein one route is sensory-conceptual and the other is sensory-motor, appears to be a general organizational property of the cerebral cortex. This chapter outlines the dual route model as a foundation for understanding the functional anatomy of speech and language processing.

25.2 THE DUAL ROUTE MODEL OF SPEECH PROCESSING

The dual route model (Figure 25.1) holds that a ventral stream, which involves structures in the superior and middle portions of the temporal lobe, is involved in processing speech signals for comprehension. A dorsal stream, which involves structures in the posterior planum temporale region (at the parietal-temporal junction) and the posterior frontal lobe, is involved in translating acoustic-based representations of speech signals into articulatory representations, essential for speech production. In contrast to the canonical view that speech processing is mainly left hemisphere-dependent, a wide range of evidence suggests that the ventral stream is bilaterally organized (although with important computational differences between the two hemispheres). The compelling extent to which neuroimaging data implicate both hemispheres has recently been reviewed (Price, 2012; Schirmer, Fox, & Grandjean, 2012; Turkeltaub & Coslett, 2010). The dorsal stream, however, is traditionally, and in the model outlined here, held to be strongly left-dominant.

25.2.1 Ventral Stream: Mapping from Sound to Meaning

25.2.1.1 Bilateral Organization and Parallel Computation

The ventral stream is bilaterally organized, although not computationally redundant, in the two hemispheres. This may not be obvious based on a cursory evaluation of the clinical data. After all, left hemisphere damage yields language deficits of a variety of sorts, including comprehension impairment, whereas, in most cases, right hemisphere damage has little effect on phonological, lexical, or sentence-level language abilities. A closer look tells a different story. In particular, research in the 1980s showed that auditory comprehension deficits in aphasia (caused by unilateral left hemisphere lesions) were not caused primarily by impairment in the ability to perceive speech sounds, as Wernicke and later Luria proposed (Luria, 1970; Wernicke, 1874/1969). For example, when Wernicke's aphasics are asked to match pictures to auditorily presented words, their overall performance is well above chance, and when they do
Neurobiology of Language. DOI: http://dx.doi.org/10.1016/B978-0-12-407794-2.00025-0 © 2016 Elsevier Inc. All rights reserved.

FIGURE 25.1 Dual stream model of speech processing. The dual stream model (Hickok & Poeppel, 2000, 2004, 2007) holds that the early stages of speech processing occur bilaterally in auditory regions on the dorsal STG (spectrotemporal analysis; green) and STS (phonological access/representation; yellow) and then diverge into two broad streams: a temporal lobe ventral stream supports speech comprehension (lexical access and combinatorial processes; pink), whereas a strongly left-dominant dorsal stream supports sensory-motor integration and involves structures at the parietal-temporal junction (Spt) and frontal lobe. The conceptual network (gray box) is assumed to be widely distributed throughout cortex. IFG, inferior frontal gyrus; ITS, inferior temporal sulcus; MTG, middle temporal gyrus; PM, premotor; Spt, Sylvian parietal-temporal; STG, superior temporal gyrus; STS, superior temporal sulcus. From Hickok and Poeppel (2007) with permission from the publisher.
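For reference, the components, regions, and lateralization labeled in Figure 25.1 can be recorded as a small lookup table. This is purely a notational convenience for keeping the figure's labels at hand, not a computational claim from the chapter:

```python
# Components of the dual stream model as labeled in Figure 25.1.
# Regions and laterality values are transcribed from the figure;
# the data structure itself is just an illustrative encoding.

DUAL_STREAM = {
    "spectrotemporal analysis": {"regions": ["dorsal STG"],
                                 "laterality": "bilateral"},
    "phonological network":     {"regions": ["mid-post STS"],
                                 "laterality": "bilateral"},
    "lexical interface":        {"regions": ["pMTG", "pITS"],
                                 "laterality": "weak left-hemisphere bias",
                                 "stream": "ventral"},
    "combinatorial network":    {"regions": ["aMTG", "aITS"],
                                 "laterality": "left dominant?",
                                 "stream": "ventral"},
    "sensorimotor interface":   {"regions": ["Spt (parietal-temporal)"],
                                 "laterality": "left dominant",
                                 "stream": "dorsal"},
    "articulatory network":     {"regions": ["pIFG", "PM", "anterior insula"],
                                 "laterality": "left dominant",
                                 "stream": "dorsal"},
    "conceptual network":       {"regions": ["widely distributed"],
                                 "laterality": "bilateral"},
}

# The early, shared stages are bilateral; the dorsal stream components
# are left dominant, matching the figure's labels.
dorsal = [k for k, v in DUAL_STREAM.items() if v.get("stream") == "dorsal"]
assert all(DUAL_STREAM[k]["laterality"] == "left dominant" for k in dorsal)
```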

make errors they tend to confuse the correct answer with semantically similar alternatives more often than with phonemically similar foils (Baker, Blumstein, & Goodglass, 1981; Miceli, Gainotti, Caltagirone, & Masullo, 1980; Rogalsky, Pitz, Hillis, & Hickok, 2008; Rogalsky, Love, Driscoll, Anderson, & Hickok, 2011). A similar pattern of performance has been observed after acute deactivation of the entire left hemisphere in Wada procedures (Figure 25.2) (Hickok et al., 2008). "Speech perception" deficits can be identified in left-injured patients, but only on metalinguistic tasks such as syllable discrimination that involve some level of conscious attention to phonemic structure and working memory; the involvement of the left hemisphere in these tasks likely follows from the relation between working memory and speech articulation (Hickok & Poeppel, 2000, 2004, 2007). In contrast to the (minimal) effects of unilateral lesions on the processing of phoneme-level information during auditory comprehension, bilateral lesions involving the superior temporal lobe can have a devastating effect, as cases of word deafness attest (see Chapter 37) (Buchman, Garron, Trost-Cardamone, Wichter, & Schwartz, 1986; Poeppel, 2001).

Data from neuroimaging have been more controversial. One consistent and uncontroversial finding, however, is that, when contrasted with a resting baseline, listening to speech activates the STG bilaterally, including the dorsal STG and superior temporal sulcus (STS). But when listening to connected, intelligible speech is contrasted against various acoustic baselines, some studies have reported left-dominant activation patterns (Narain et al., 2003; Scott, Blank, Rosen, & Wise, 2000), leading some authors to argue for a fully left-lateralized network for speech perception (Rauschecker & Scott, 2009; Scott et al., 2000). Other studies report bilateral activation even when acoustic controls are subtracted out of the activation pattern (Okada et al., 2010) (for a review, see Hickok and Poeppel, 2007). The issue is still being actively debated within the functional imaging literature, although recent reviews and meta-analyses
FIGURE 25.2 Auditory comprehension performance during Wada procedure. Data show mean error rate on a four-alternative forced choice auditory comprehension task with phonemic, semantic, and unrelated foils (inset shows sample stimulus card for the spoken target word, bear) in 20 patients undergoing clinically indicated Wada procedures. Error rate is shown as a function of error type and amytal condition: left hemisphere injection, right hemisphere injection, and baseline. Note the overall low error rate, even with left hemisphere injection, and the dominance of semantic mis-selections when errors occur. Modified from Hickok et al. (2008).

support the conjecture of bilateral STG/STS involvement (Price, 2012; Schirmer et al., 2012; Turkeltaub & Coslett, 2010).

25.2.1.2 Computational Asymmetries

The hypothesis that sublexical-level processes in speech recognition are bilaterally organized does not imply that the two hemispheres are computationally identical. In fact, there is strong evidence for hemispheric differences in the processing of acoustic/speech information (Abrams, Nicol, Zecker, & Kraus, 2008; Boemio, Fromm, Braun, & Poeppel, 2005; Giraud et al., 2007; Hickok & Poeppel, 2007; Zatorre, Belin, & Penhune, 2002). The basis of these differences is currently being debated. One view, arguing for a domain-general perspective for all sounds, is that the difference turns on the selectivity for temporal (left hemisphere) versus spectral (right hemisphere) resolution (Obleser, Eisner, & Kotz, 2008; Zatorre et al., 2002). That is, the left hemisphere may be particularly well-suited for resolving rapid acoustic change (such as a formant transition), whereas the right hemisphere may have an advantage in resolving spectral frequency information. A closely related proposal is that the two hemispheres differ in terms of their preferred "sampling rate," with left auditory cortical regions incorporating a bias for faster rate sampling (25-50 Hz) and the right hemisphere for slower rate sampling (4-8 Hz) (Poeppel, 2003). These two proposals are not incompatible because there is a relation between sampling rate and spectral versus temporal resolution; rapid sampling allows the system to detect changes that occur over short timescales but sacrifices spectral resolution, and vice versa (Zatorre et al., 2002). Further research is needed to address these hypotheses. For the present purposes, the central point is that this asymmetry of function indicates that spoken word recognition involves parallel pathways, at least one in each hemisphere, in the mapping from sound to lexical meaning (Hickok & Poeppel, 2007), similar to well-accepted dual-route models of reading (grapheme-to-phoneme conversion and whole-word routes) (Coltheart, Curtis, Atkins, & Haller, 1993). Although the parallel pathway view differs from standard models of speech recognition (Luce & Pisoni, 1998; Marslen-Wilson, 1987; McClelland & Elman, 1986), wherein the processor proceeds from small to larger units in serial stages, it is consistent with the fact that speech contains redundant cues to phonemic information (e.g., in the speech envelope and fine spectral structure cues) and with behavioral evidence suggesting that the speech system can take advantage of these different cues (Remez, Rubin, Pisoni, & Carrell, 1981; Shannon, Zeng, Kamath, Wygonski, & Ekelid, 1995). It is worth bearing in mind that such computational asymmetries apply to all sounds that the auditory system analyzes.

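The tradeoff between sampling rate and spectral versus temporal resolution discussed above is a general property of time-frequency analysis and can be demonstrated with a short-time Fourier transform. The sketch below (plain NumPy; the tone frequencies, window lengths, and peak-picking rule are illustrative choices, not values from the chapter) shows that a short analysis window, loosely analogous to the proposed fast left-hemisphere integration window, merges two nearby tones that a long, slow window cleanly resolves:

```python
import numpy as np

fs = 16000                       # sampling rate, Hz
f1, f2 = 1000, 1030              # two tones only 30 Hz apart
t = np.arange(fs) / fs           # 1 s of signal
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

def peak_count(signal, window_ms, lo=900, hi=1130):
    """FFT a single analysis window of the given length and count
    distinct spectral peaks in the band containing the two tones."""
    n = int(fs * window_ms / 1000)                      # samples per window
    spec = np.abs(np.fft.rfft(signal[:n] * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, 1 / fs)
    band = spec[(freqs >= lo) & (freqs <= hi)]
    # local maxima exceeding half of the band's largest value
    peaks = [i for i in range(1, len(band) - 1)
             if band[i] > band[i - 1] and band[i] > band[i + 1]
             and band[i] > 0.5 * band.max()]
    return len(peaks)

# Short window: fine temporal resolution, but 40 Hz bin spacing
# smears the two tones into a single peak.
print(peak_count(x, 25))     # -> 1

# Long window: 2 Hz bin spacing resolves both tones, at the cost
# of temporal resolution.
print(peak_count(x, 500))    # -> 2
```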

These asymmetries reflect properties of neuronal ensembles that are like filters acting on any incoming signal. Specialization is likely to occur at the next stage, at which signals are translated into a format suitable for lexical access (given that words are stored in some format).

25.2.1.3 Phonological Processing and the STS

Beyond the earliest stages of speech recognition, there is accumulating evidence that portions of the STS are important for representing and/or processing phonological information (Binder et al., 2000; Hickok & Poeppel, 2004, 2007; Indefrey & Levelt, 2004; Liebenthal, Binder, Spitzer, Possing, & Medler, 2005; Price et al., 1996). The STS is activated by language tasks that require access to phonological information, including both the perception and production of speech (Indefrey & Levelt, 2004), and during active maintenance of phonemic information (Buchsbaum, Hickok, & Humphries, 2001; Hickok, Buchsbaum, Humphries, & Muftuler, 2003). Portions of the STS seem to be relatively selective for acoustic signals that contain phonemic information when compared with complex nonspeech signals (yellow shaded portion of Figure 25.1) (Hickok & Poeppel, 2007; Liebenthal et al., 2005; Narain et al., 2003; Okada et al., 2010). STS activation can be modulated by the manipulation of psycholinguistic variables that tap phonological networks (Okada & Hickok, 2006), such as phonological neighborhood density (the number of words that sound similar to a target word), and this region shows neural adaptation effects to phonological-level information (Vaden, Muftuler, & Hickok, 2009).

One currently unresolved question concerns the relative contribution of anterior versus posterior STS regions to phonological processing. Lesion evidence indicates that damage to posterior temporal lobe areas is most predictive of auditory comprehension deficits (Bates et al., 2003), and a majority of functional imaging studies targeting phonological processing in perception have identified regions in the posterior half of the STS (Hickok & Poeppel, 2007). Other studies, however, have reported anterior STS activation in perceptual speech tasks (Mazoyer et al., 1993; Narain et al., 2003; Scott et al., 2000; Spitsyna, Warren, Scott, Turkheimer, & Wise, 2006). These studies typically involved sentence-level stimuli, raising the possibility that anterior STS regions may be responding to some other aspect of the stimuli, such as their syntactic or prosodic organization (Friederici, Meyer, & von Cramon, 2000; Humphries, Binder, Medler, & Liebenthal, 2006; Humphries, Love, Swinney, & Hickok, 2005; Humphries, Willard, Buchsbaum, & Hickok, 2001; Vandenberghe, Nobre, & Price, 2002). Recent electrophysiological work supports the hypothesis that the left anterior temporal lobe (ATL) is critical to elementary structure building (Bemis & Pylkkanen, 2011), in line with the view that intelligibility tasks tap into additional operations beyond speech recognition. It will, in any case, be important in future work to understand the role of various portions of the STS in auditory speech perception and language processing.

25.2.1.4 Lexical-Semantic Access

During auditory comprehension, the goal of speech processing is to use phonological information to access the meanings of words: conceptual-semantic representations that are critical to comprehension. The dual stream model holds that conceptual-semantic representations are widely distributed throughout the cortex. However, a more focal system serves as a computational interface that maps between phonological-level representations of words or morphological roots and distributed conceptual representations (Hickok & Poeppel, 2000, 2004, 2007; Lau, Phillips, & Poeppel, 2008). This interface is not the site for storage of conceptual information. Instead, it is hypothesized to store information regarding the relation (or correspondences) between phonological information and conceptual information. Most authors agree that the temporal lobe(s) play a critical role in this process, but again there is disagreement regarding the role of anterior versus posterior regions. The evidence for both of these viewpoints is briefly presented here.

Damage to posterior temporal lobe regions, particularly along the MTG, has long been associated with auditory comprehension deficits (Bates et al., 2003; Damasio, 1991; Dronkers, Redfern, & Knight, 2000), an effect confirmed in a large-scale study involving 101 patients (Bates et al., 2003). We infer that these deficits are primarily postphonemic in nature because phonemic deficits after unilateral lesions to these areas are mild (Hickok & Poeppel, 2004). Data from direct cortical stimulation studies corroborate the involvement of the MTG in auditory comprehension and also indicate the involvement of a much broader network involving most of the superior temporal lobe (including anterior portions) and the inferior frontal lobe (Miglioretti & Boatman, 2003). Functional imaging studies have also implicated posterior middle temporal regions in lexical-semantic processing (Binder et al., 1997; Rissman, Eliassen, & Blumstein, 2003; Rodd, Davis, & Johnsrude, 2005). These findings do not preclude the involvement of more anterior regions in lexical-semantic access, but they do make a strong case for significant involvement of posterior regions.

Electrophysiological studies have successfully used paradigms building on the N400 response to study lexical-semantic processing. This response is very sensitive to a range of variables known to implicate lexical-level properties. A review of that literature (including source localization studies of the N400) also suggests that posterior MTG plays a key role, although

embedded in a network of anterior temporal, parietal, and inferior frontal regions (Lau et al., 2008).

ATL regions have been implicated both in lexical-semantic and in sentence-level processing (syntactic and/or semantic integration processes). Patients with semantic dementia, who have been used to argue for a lexical-semantic function (Scott et al., 2000; Spitsyna et al., 2006), have atrophy involving the ATL bilaterally, along with deficits in lexical tasks such as naming, semantic association, and single-word comprehension (Gorno-Tempini et al., 2004). However, these deficits are not specific to the mapping between phonological and conceptual representations and appear to involve more general semantic integration (Patterson, Nestor, & Rogers, 2007). Further, because atrophy in semantic dementia involves a number of regions in addition to the lateral ATL, including bilateral inferior and medial temporal lobe, bilateral caudate nucleus, and right posterior thalamus, among others (Gorno-Tempini et al., 2004), linking the deficits specifically to the ATL is difficult.

Higher-level syntactic and compositional semantic processing might involve the ATL. Functional imaging studies have found portions of the ATL to be more active while subjects listen to or read sentences rather than unstructured lists of words or sounds (Friederici et al., 2000; Humphries et al., 2001; Humphries et al., 2005; Mazoyer et al., 1993; Vandenberghe et al., 2002). This structured versus unstructured effect is independent of the semantic content of the stimuli, although semantic manipulations can modulate the ATL response somewhat (Vandenberghe et al., 2002). Recent electrophysiological data (Brennan & Pylkkanen, 2012; Bemis & Pylkkanen, 2013) also implicate the left ATL in elementary structure building. Damage to the ATL has also been linked to deficits in comprehending complex syntactic structures (Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004). However, data from semantic dementia are contradictory, because these patients are reported to have good sentence-level comprehension (Gorno-Tempini et al., 2004).

In summary, there is strong evidence that lexical-semantic access from auditory input involves the posterior lateral temporal lobe. In terms of syntactic and compositional semantic operations, neuroimaging evidence is converging on the ATL as an important component of the computational network (Humphries et al., 2005; Humphries et al., 2006; Vandenberghe et al., 2002); however, the neuropsychological evidence remains equivocal.

25.2.2 Dorsal Stream: Mapping from Sound to Action

The earliest proposals regarding the dorsal auditory stream argued that this system was involved in spatial hearing, a "where" function (Rauschecker, 1998), similar to the dorsal "where" stream proposal in the cortical visual system (Ungerleider & Mishkin, 1982). More recently, there has been some convergence on the idea that the dorsal stream supports auditory-motor integration (Hickok & Poeppel, 2000, 2004, 2007; Rauschecker, 2011; Rauschecker & Scott, 2009; Scott & Wise, 2004; Wise et al., 2001). Specifically, the idea is that the auditory dorsal stream supports an interface between auditory and motor representations of speech, a proposal similar to the claim that the dorsal visual stream has a sensory-motor integration function (Andersen, 1997; Milner & Goodale, 1995).

25.2.2.1 The Need for Auditory-Motor Integration

The idea of auditory-motor interaction in speech is not new. Wernicke's classic model of the neural circuitry of language incorporated a direct link between sensory and motor representations of words and argued explicitly that sensory systems participated in speech production (Wernicke, 1874/1969). More recently, research on motor control has revealed why this sensory-motor link is critical. Motor acts aim to hit sensory targets. In the visual-manual domain, we identify the location and shape of, say, a cup visually (the sensory target) and generate a motor command that allows us to move our limb toward that location and shape the hand to match the shape of the object. In the speech domain, the targets are not external objects but rather internal representations of the sound pattern (phonological form) of a word. We know that the targets are auditory in nature because manipulating a speaker's auditory feedback during speech production results in compensatory changes in motor speech acts (Houde & Jordan, 1998; Larson, Burnett, Bauer, Kiran, & Hain, 2001; Purcell & Munhall, 2006). For example, if a subject is asked to produce one vowel and the feedback that she hears is manipulated so that it sounds like another vowel, then the subject will change the vocal tract configuration so that the feedback sounds like the original vowel. In other words, talkers will readily modify their motor articulations to hit an auditory target, indicating that the goal of speech production is not a particular motor configuration but rather a speech sound (Guenther, Hampson, & Johnson, 1998). The role of auditory input is nowhere more apparent than in development, where the child must use acoustic information in the linguistic environment to shape vocal tract movements that reproduce those sounds.

A great deal of progress has been made in mapping the neural organization of sensorimotor integration for speech. Early functional imaging studies identified an auditory-related area in the left planum temporale region as involved in speech production (Hickok et al., 2000;


FIGURE 25.3 Location and functional properties of area Spt. (A) Activation map for covert speech articulation (rehearsal of a set of nonwords). (B) Activation timecourse (fMRI signal amplitude) in Spt during a sensorimotor task for speech and music. A trial is composed of 3 s of auditory stimulation, followed by 15 s of covert rehearsal/humming of the heard stimulus, followed by 3 s of auditory stimulation, followed by 15 s of rest. The two humps represent the sensory responses; the valley between the humps is the motor (covert rehearsal) response, and the baseline values at the onset and offset of the trial represent resting activity levels. Note the similar response to both speech and music. (C) Activation timecourse in Spt in three conditions: continuous speech (15 s, blue curve), listen + rest (3 s speech, 12 s rest, red curve), and listen + covert rehearse (3 s speech, 12 s rehearse, green curve). The pattern of activity within Spt (inset) was found to be different for listening to speech compared with rehearsing speech, assessed at the end of the continuous listen versus listen + rehearse conditions, despite the lack of a significant signal amplitude difference at that time point. (D) Activation timecourse in Spt in skilled pianists performing a sensorimotor task involving listening to novel melodies and then covertly humming them (blue curve) versus listening to novel melodies and imagining playing them on a keyboard (red curve). This indicates that Spt is relatively selective for vocal tract actions. (A) From Hickok and Buchsbaum (2003) with permission from the publisher. (B) Adapted from Hickok et al. (2003). (C) Adapted from Hickok, Okada, and Serences (2009b). (D) From Hickok (2009) with permission from the publisher.

Wise et al., 2001). Subsequent studies showed that this left-dominant region, dubbed Spt for its location in the Sylvian fissure at the parietal-temporal boundary (Figure 25.3A) (Hickok et al., 2003), exhibited a number of properties characteristic of sensorimotor integration areas such as those found in macaque parietal cortex (Andersen, 1997; Colby & Goldberg, 1999). Most fundamentally, Spt exhibits sensorimotor response properties, activating both during the passive perception of speech and during covert (subvocal) speech articulation (Buchsbaum et al., 2001; Buchsbaum, Olsen, Koch, & Berman, 2005; Hickok et al., 2003); further, different subregional patterns of activity are apparent during the sensory and motor phases of the task (Hickok, Okada, & Serences, 2009a), likely reflecting the activation of different neuronal subpopulations (Dahl, Logothetis, & Kayser, 2009), some sensory-weighted and others motor-weighted. Figure 25.3B-D shows examples of the sensory-motor response properties of Spt and the patchy organization of this region for sensory-weighted versus motor-weighted voxels (Figure 25.3C, inset).

Spt is not speech-specific; its sensorimotor responses are equally robust when the sensory stimulus consists of tonal melodies and (covert) humming is the motor task (see the two curves in Figure 25.3B) (Hickok et al., 2003).


FIGURE 25.4 Relation between lesions associated with conduction aphasia and the cortical auditory-motor network. A comparison of conduction aphasia, an auditory-motor task (listening to and then repeating back speech) in fMRI, and their overlap. The uninflated surface in the left panel shows the regional distribution of lesion overlap in patients with conduction aphasia (maximum is 12/14, or 85%, overlap). The middle panel shows the auditory-motor network in the fMRI analysis. The right panel shows the area of maximal overlap between the lesion and fMRI surfaces (lesion overlap >85% and significant fMRI activity). Modified from Buchsbaum et al. (2011).

Activity in Spt is highly correlated with activity in the pars opercularis (Buchsbaum et al., 2001; Buchsbaum et al., 2005), which is the posterior sector of Broca's region. White matter tracts identified via diffusion tensor imaging suggest that Spt and the pars opercularis are densely connected anatomically (for review, see Friederici, 2009; Rogalsky & Hickok, 2011). Finally, consistent with sensorimotor integration areas in the monkey parietal lobe (Andersen, 1997; Colby & Goldberg, 1999), Spt appears to be motor-effector selective, responding more robustly when the motor task involves the vocal tract than when it involves the manual effectors (Figure 25.3D) (Pa & Hickok, 2008). More broadly, Spt is situated in the middle of a network of auditory (STS) and motor (pars opercularis, premotor cortex) regions (Buchsbaum et al., 2001; Buchsbaum et al., 2005; Hickok et al., 2003), perfectly positioned both functionally and anatomically to support sensorimotor integration for speech and related vocal tract functions.

Lesion evidence is consistent with the functional imaging data implicating Spt as part of a sensorimotor integration circuit. Damage to auditory-related regions in the left hemisphere often results in speech production deficits (Damasio, 1991, 1992), demonstrating that sensory systems participate in motor speech. More specifically, damage to the left temporal-parietal junction is associated with conduction aphasia, a syndrome characterized by good comprehension but frequent phonemic errors in speech production (Baldo, Klostermann, & Dronkers, 2008; Damasio & Damasio, 1980; Goodglass, 1992), and the lesion distribution overlaps with the location of functional area Spt (Figure 25.4) (Buchsbaum et al., 2011). Conduction aphasia has classically been considered a disconnection syndrome involving damage to the arcuate fasciculus. However, there is now good evidence that this syndrome results from cortical dysfunction (Anderson et al., 1999; Hickok et al., 2000). The production deficit is load-sensitive: errors are more likely on longer, lower-frequency words and in verbatim repetition of strings of speech with little semantic constraint (Goodglass, 1992, 1993). In the context of this discussion, the effects of such lesions can be understood as an interruption of the system that serves as the interface between auditory targets and the motor speech actions that can achieve them (Hickok & Poeppel, 2000, 2004, 2007).

Recent theoretical work has clarified the computational details underlying auditory-motor integration in the dorsal stream. Drawing on advances in understanding motor control generally, speech researchers have emphasized the role of internal forward models in speech motor control (Golfinopoulos, Tourville, & Guenther, 2010; Hickok, Houde, & Rong, 2011; Houde & Nagarajan, in press; Houde & Nagarajan, 2011). The basic idea is that, to control action, the nervous system makes forward predictions about the future state of the motor articulators and the sensory consequences of the predicted actions. The predictions are assumed to be generated by an internal model that receives copies of motor commands and integrates them with information about the current state of the system and past experience (learning) of the relation between particular motor commands and their sensory consequences. This internal model affords a mechanism for detecting and correcting motor errors (i.e., motor actions that fail to hit their sensory targets).

Several models have been proposed with similar basic assumptions but slightly different architectures (Golfinopoulos et al., 2010; Guenther et al., 1998; Hickok et al., 2011; Houde & Nagarajan, in press; also


FIGURE 25.5 An integrated state feedback control (SFC) model of speech production. (A) Speech models derived from the feedback control, psycholinguistic, and neurolinguistic literatures are integrated into one framework, presented here. The architecture is fundamentally that of an SFC system with a controller, or set of controllers (Haruno, Wolpert, & Kawato, 2001), localized to primary motor cortex, which generates motor commands to the vocal tract and sends a corollary discharge to an internal model, which makes forward predictions about the dynamic state of the vocal tract and about the sensory consequences of those states. Deviations between predicted auditory states and the intended targets or actual sensory feedback generate an error signal that is used to correct and update the internal model of the vocal tract. The internal model of the vocal tract is instantiated as a "motor phonological system," which corresponds to the neurolinguistically elucidated phonological output lexicon and is localized to premotor cortex. Auditory targets and forward predictions of sensory consequences are encoded in the same network, namely the "auditory phonological system," which corresponds to the neurolinguistically elucidated phonological input lexicon and is localized to the STG/STS. Motor and auditory phonological systems are linked via an auditory-motor translation system localized to area Spt. The system is activated via parallel inputs from the lexical-conceptual system to the motor and auditory phonological systems. (B) Proposed source of the deficit in conduction aphasia: damage to the auditory-motor translation system. Input from the lexical-conceptual system to the motor and auditory phonological systems is unaffected, allowing for fluent output and accurate activation of sensory targets. However, internal forward sensory predictions are not possible, leading to an increase in error rate. Further, errors detected as a consequence of mismatches between sensory targets and actual sensory feedback cannot be used to correct motor commands. From Hickok et al. (2011) and Hickok and Poeppel (2004) with permission from the publisher.
see Chapter 58, this volume). One such model is shown in Figure 25.5 (Hickok et al., 2011). Input to the system comes from a lexical-conceptual network, as assumed by psycholinguistic models of speech production (Dell, Schwartz, Martin, Saffran, & Gagnon, 1997; Levelt, Roelofs, & Meyer, 1999). In between the input/output system is a phonological system that is split into two components, corresponding to sensory input and motor output subsystems and mediated by a sensorimotor translation system, which corresponds to area Spt (Buchsbaum et al., 2001; Hickok et al., 2003; Hickok et al., 2009a). Parallel inputs to sensory and motor systems are needed to explain neuropsychological observations (Jacquemot, Dupoux, & Bachoud-Lévi, 2007), such as conduction aphasia, as shown here. Inputs to the auditory phonological network define the auditory targets of speech acts. As a motor speech unit (ensemble) begins to be activated, its predicted auditory consequences can be checked against the auditory target. If they match, then that unit will continue to be activated, resulting in an articulation that will hit the target. If there is a mismatch, then a correction signal can be generated to activate the correct motor unit.

This model provides a natural explanation of conduction aphasia. A lesion to Spt would disrupt the ability to generate forward predictions in auditory cortex and thereby the ability to perform internal feedback monitoring, making errors more frequent than in an unimpaired system (Figure 25.5B). However, this would not disrupt the activation of auditory targets via the lexical-semantic system, thus leaving the patient capable of detecting errors in their own speech, a characteristic of conduction aphasia. Once an error is detected, however, the correction signal will not be accurately translated to the internal model of the vocal tract due to disruption of Spt. The ability to detect but not accurately correct speech errors should result in repeated unsuccessful self-correction attempts, again a characteristic of conduction aphasia.

25.3 CLINICAL CORRELATES OF THE DUAL STREAM MODEL

The dual stream model, like the classical Wernicke-Lichtheim model, provides an account of the major clinical aphasia syndromes (Hickok & Poeppel, 2004). Within the dual stream model, Broca's aphasia and conduction aphasia are considered to be dorsal stream-related syndromes, whereas Wernicke's aphasia, word deafness, and transcortical sensory aphasia are considered ventral stream syndromes. We have already noted that conduction aphasia can be conceptualized as a disruption of auditory-motor integration resulting from damage to area Spt. Broca's aphasia can be viewed as a disruption to representations that code for speech-related actions at multiple levels, from low-level phonetic features, to sequences of syllables, to sequences of words in structured sentences. Although Broca's area and Broca's aphasia are widely considered to be associated with deficits in receptive syntactic processing (Caramazza & Zurif, 1976; Grodzinsky, 2000), this issue is now being seriously questioned and remains debatable (Rogalsky et al., 2011).

Word deafness is the "lowest-level" ventral stream syndrome according to the dual stream model, affecting the processing of phonemic information during speech recognition. This differs from classic interpretations of word deafness as a disconnection syndrome (Geschwind, 1965). Due to the key role that auditory systems play in speech production, as discussed, we should expect that disruption to auditory speech systems, as in word deafness, will impact production as well. Although the canonical description of word deafness is a syndrome in which speech production is preserved, the majority of case descriptions that provide information on the speech output of word-deaf patients report the presence of paraphasic errors (Buchman et al., 1986).

Wernicke's aphasia is explained in terms of damage to multiple ventral stream processing levels in the dual stream model. Given the rather extensive posterior lesions that are typically required to yield a chronic Wernicke's aphasia (Dronkers & Baldo, 2009), it is likely that this syndrome results from damage to auditory-motor area Spt, left hemisphere auditory areas, and posterior middle temporal lexical-semantic interface systems. Such damage can explain the symptom complex: relatively good phonological-level speech recognition (due to the bilateral organization described above), poor comprehension at the higher semantic level (due to damage to lexical-semantic interface systems), fluent speech (due to preserved motor speech systems), poor repetition (due to disruption of the auditory-motor interface network), and paraphasic errors (due to disruption of the auditory-motor interface network).

Transcortical sensory aphasia, which is similar to Wernicke's aphasia but with preserved repetition, is conceptualized as a functionally more focal deficit involving the lexical-semantic interface network but sparing the auditory-motor network. Damage to the lexical-semantic interface explains the poor comprehension, whereas sparing of the auditory-motor interface explains the preserved repetition.

25.4 SUMMARY

Dual stream models of cortical organization have proven useful in understanding both language and visual-related systems, and have been a recurrent


theme in neural models stretching back more than a century (Wernicke, 1874/1977). Thus, the general concept underlying the model—that the brain must interface sensory information with two different systems, conceptual and motor—not only is intuitively appealing but also has a proven track record across domains. In the language domain, the dual stream model provides an explanation of classical language disorders (Hickok et al., 2011; Hickok & Poeppel, 2004) and provides a framework for integrating and unifying research across psycholinguistic, neurolinguistic, and neurophysiological traditions. Recent work has shown that still further integration with motor control models is possible (Hickok et al., 2011). All of this suggests that the dual stream framework is on the right track as a model of language organization and provides a rich context for guiding future research.

References

Abrams, D. A., Nicol, T., Zecker, S., & Kraus, N. (2008). Right-hemisphere auditory cortex is dominant for coding syllable patterns in speech. Journal of Neuroscience, 28(15), 3958–3965.
Andersen, R. (1997). Multimodal integration for the representation of space in the posterior parietal cortex. Philosophical Transactions of the Royal Society of London B Biological Sciences, 352, 1421–1428.
Anderson, J. M., Gilmore, R., Roper, S., Crosson, B., Bauer, R. M., Nadeau, S., et al. (1999). Conduction aphasia and the arcuate fasciculus: A reexamination of the Wernicke-Geschwind model. Brain and Language, 70, 1–12.
Baker, E., Blumstein, S. E., & Goodglass, H. (1981). Interaction between phonological and semantic factors in auditory comprehension. Neuropsychologia, 19, 1–15.
Baldo, J. V., Klostermann, E. C., & Dronkers, N. F. (2008). It's either a cook or a baker: Patients with conduction aphasia get the gist but lose the trace. Brain and Language, 105(2), 134–140.
Bates, E., Wilson, S. M., Saygin, A. P., Dick, F., Sereno, M. I., Knight, R. T., et al. (2003). Voxel-based lesion-symptom mapping. Nature Neuroscience, 6(5), 448–450.
Bemis, D. K., & Pylkkänen, L. (2011). Simple composition: A magnetoencephalography investigation into the comprehension of minimal linguistic phrases. The Journal of Neuroscience, 31(8), 2801–2814.
Bemis, D. K., & Pylkkänen, L. (2013). Basic linguistic composition recruits the left anterior temporal lobe and left angular gyrus during both listening and reading. Cerebral Cortex, 23(8), 1859–1873.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S., Springer, J. A., Kaufman, J. N., et al. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10, 512–528.
Binder, J. R., Frost, J. A., Hammeke, T. A., Cox, R. W., Rao, S. M., & Prieto, T. (1997). Human brain language areas identified by functional magnetic resonance imaging. Journal of Neuroscience, 17, 353–362.
Boemio, A., Fromm, S., Braun, A., & Poeppel, D. (2005). Hierarchical and asymmetric temporal sensitivity in human auditory cortices. Nature Neuroscience, 8(3), 389–395.
Brennan, J., & Pylkkänen, L. (2012). The time-course and spatial distribution of brain activity associated with sentence processing. Neuroimage, 60(2), 1139–1148.
Buchman, A. S., Garron, D. C., Trost-Cardamone, J. E., Wichter, M. D., & Schwartz, M. (1986). Word deafness: One hundred years later. Journal of Neurology, Neurosurgery, and Psychiatry, 49, 489–499.
Buchsbaum, B., Hickok, G., & Humphries, C. (2001). Role of left posterior superior temporal gyrus in phonological processing for speech perception and production. Cognitive Science, 25, 663–678.
Buchsbaum, B. R., Baldo, J., Okada, K., Berman, K. F., Dronkers, N., D'Esposito, M., et al. (2011). Conduction aphasia, sensory-motor integration, and phonological short-term memory: An aggregate analysis of lesion and fMRI data. Brain and Language, 119(3), 119–128.
Buchsbaum, B. R., Olsen, R. K., Koch, P., & Berman, K. F. (2005). Human dorsal and ventral auditory streams subserve rehearsal-based and echoic processes during verbal working memory. Neuron, 48(4), 687–697.
Caramazza, A., & Zurif, E. B. (1976). Dissociation of algorithmic and heuristic processes in sentence comprehension: Evidence from aphasia. Brain and Language, 3, 572–582.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589–608.
Dahl, C. D., Logothetis, N. K., & Kayser, C. (2009). Spatial organization of multisensory responses in temporal association cortex. The Journal of Neuroscience, 29(38), 11924–11932.
Damasio, A. R. (1992). Aphasia. New England Journal of Medicine, 326, 531–539.
Damasio, H. (1991). Neuroanatomical correlates of the aphasias. In M. Sarno (Ed.), Acquired aphasia (pp. 45–71). San Diego, CA: Academic Press.
Damasio, H., & Damasio, A. R. (1980). The anatomical basis of conduction aphasia. Brain, 103, 337–350.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review, 104, 801–838.
Dijkerman, H. C., & de Haan, E. H. (2007). Somatosensory processes subserving perception and action. The Behavioral and Brain Sciences, 30(2), 189–201 (discussion 201–239).
Dronkers, N., & Baldo, J. (2009). Language: Aphasia. In L. R. Squire (Ed.), Encyclopedia of neuroscience (Vol. 5, pp. 343–348). Oxford: Academic Press.
Dronkers, N. F., Redfern, B. B., & Knight, R. T. (2000). The neural architecture of language disorders. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 949–958). Cambridge, MA: MIT Press.
Dronkers, N. F., Wilkins, D. P., Van Valin, R. D., Jr., Redfern, B. B., & Jaeger, J. J. (2004). Lesion analysis of the brain areas involved in language comprehension. Cognition, 92(1–2), 145–177.
Friederici, A. D. (2009). Pathways to language: Fiber tracts in the human brain. Trends in Cognitive Sciences, 13(4), 175–181.
Friederici, A. D., Meyer, M., & von Cramon, D. Y. (2000). Auditory language comprehension: An event-related fMRI study on the processing of syntactic and lexical information. Brain and Language, 74, 289–300.
Geschwind, N. (1965). Disconnexion syndromes in animals and man. Brain, 88, 237–294, 585–644.
Giraud, A. L., Kleinschmidt, A., Poeppel, D., Lund, T. E., Frackowiak, R. S., & Laufs, H. (2007). Endogenous cortical rhythms determine cerebral specialization for speech perception and production. Neuron, 56(6), 1127–1134.
Golfinopoulos, E., Tourville, J. A., & Guenther, F. H. (2010). The integration of large-scale neural network modeling and functional brain imaging in speech motor control. Neuroimage, 52(3), 862–874.

Goodglass, H. (1992). Diagnosis of conduction aphasia. In S. E. Kohn (Ed.), Conduction aphasia (pp. 39–49). Hillsdale, NJ: Lawrence Erlbaum Associates.
Goodglass, H. (1993). Understanding aphasia. San Diego, CA: Academic Press.
Gorno-Tempini, M. L., Dronkers, N. F., Rankin, K. P., Ogar, J. M., Phengrasamy, L., Rosen, H. J., et al. (2004). Cognition and anatomy in three variants of primary progressive aphasia. Annals of Neurology, 55(3), 335–346.
Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences, 23, 1–21.
Guenther, F. H., Hampson, M., & Johnson, D. (1998). A theoretical investigation of reference frames for the planning of speech movements. Psychological Review, 105, 611–633.
Haruno, M., Wolpert, D. M., & Kawato, M. (2001). Mosaic model for sensorimotor learning and control. Neural Computation, 13(10), 2201–2220.
Hickok, G. (2009). The functional neuroanatomy of language. Physics of Life Reviews, 6, 121–143.
Hickok, G., & Buchsbaum, B. (2003). Temporal lobe speech perception systems are part of the verbal working memory circuit: Evidence from two recent fMRI studies. The Behavioral and Brain Sciences, 26, 740–741.
Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience, 15, 673–682.
Hickok, G., Erhard, P., Kassubek, J., Helms-Tillery, A. K., Naeve-Velguth, S., Strupp, J. P., et al. (2000). A functional magnetic resonance imaging study of the role of left posterior superior temporal gyrus in speech production: Implications for the explanation of conduction aphasia. Neuroscience Letters, 287, 156–160.
Hickok, G., Houde, J., & Rong, F. (2011). Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron, 69(3), 407–422.
Hickok, G., Okada, K., Barr, W., Pa, J., Rogalsky, C., Donnelly, K., et al. (2008). Bilateral capacity for speech sound processing in auditory comprehension: Evidence from Wada procedures. Brain and Language, 107(3), 179–184.
Hickok, G., Okada, K., & Serences, J. T. (2009a). Area Spt in the human planum temporale supports sensory-motor integration for speech processing. Journal of Neurophysiology, 101(5), 2725–2732.
Hickok, G., & Poeppel, D. (2000). Towards a functional neuroanatomy of speech perception. Trends in Cognitive Sciences, 4, 131–138.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92, 67–99.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5), 393–402.
Houde, J. F., & Jordan, M. I. (1998). Sensorimotor adaptation in speech production. Science, 279, 1213–1216.
Houde, J. F., & Nagarajan, S. S. (2011). Speech production as state feedback control. Frontiers in Human Neuroscience, 5. http://dx.doi.org/10.3389/fnhum.2011.00082
Humphries, C., Binder, J. R., Medler, D. A., & Liebenthal, E. (2006). Syntactic and semantic modulation of neural activity during auditory sentence comprehension. Journal of Cognitive Neuroscience, 18(4), 665–679.
Humphries, C., Love, T., Swinney, D., & Hickok, G. (2005). Response of anterior temporal cortex to syntactic and prosodic manipulations during sentence processing. Human Brain Mapping, 26, 128–138.
Humphries, C., Willard, K., Buchsbaum, B., & Hickok, G. (2001). Role of anterior temporal cortex in auditory sentence comprehension: An fMRI study. Neuroreport, 12, 1749–1752.
Indefrey, P., & Levelt, W. J. (2004). The spatial and temporal signatures of word production components. Cognition, 92(1–2), 101–144.
Jacquemot, C., Dupoux, E., & Bachoud-Lévi, A. C. (2007). Breaking the mirror: Asymmetrical disconnection between the phonological input and output codes. Cognitive Neuropsychology, 24(1), 3–22.
Larson, C. R., Burnett, T. A., Bauer, J. J., Kiran, S., & Hain, T. C. (2001). Comparison of voice F0 responses to pitch-shift onset and offset conditions. Journal of the Acoustical Society of America, 110(6), 2845–2848.
Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9(12), 920–933.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral & Brain Sciences, 22(1), 1–75.
Liebenthal, E., Binder, J. R., Spitzer, S. M., Possing, E. T., & Medler, D. A. (2005). Neural substrates of phonemic perception. Cerebral Cortex, 15(10), 1621–1631.
Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19, 1–36.
Luria, A. R. (1970). Traumatic aphasia. The Hague: Mouton.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25, 71–102.
Mazoyer, B. M., Tzourio, N., Frak, V., Syrota, A., Murayama, N., Levrier, O., et al. (1993). The cortical representation of speech. Journal of Cognitive Neuroscience, 5, 467–479.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.
Miceli, G., Gainotti, G., Caltagirone, C., & Masullo, C. (1980). Some aspects of phonological impairment in aphasia. Brain and Language, 11, 159–169.
Miglioretti, D. L., & Boatman, D. (2003). Modeling variability in cortical representations of human complex sound perception. Experimental Brain Research, 153(3), 382–387.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Narain, C., Scott, S. K., Wise, R. J., Rosen, S., Leff, A., Iversen, S. D., et al. (2003). Defining a left-lateralized response specific to intelligible speech using fMRI. Cerebral Cortex, 13(12), 1362–1368.
Obleser, J., Eisner, F., & Kotz, S. A. (2008). Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features. The Journal of Neuroscience, 28(32), 8116–8123.
Okada, K., & Hickok, G. (2006). Identification of lexical-phonological networks in the superior temporal sulcus using fMRI. Neuroreport, 17, 1293–1296.
Okada, K., Rong, F., Venezia, J., Matchin, W., Hsieh, I. H., Saberi, K., et al. (2010). Hierarchical organization of human auditory cortex: Evidence from acoustic invariance in the response to intelligible speech. Cerebral Cortex, 20(10), 2486–2495.
Pa, J., & Hickok, G. (2008). A parietal-temporal sensory-motor integration area for the human vocal tract: Evidence from an fMRI study of skilled musicians. Neuropsychologia, 46, 362–368.
Patterson, K., Nestor, P. J., & Rogers, T. T. (2007). Where do you know what you know? The representation of semantic knowledge in the human brain. Nature Reviews Neuroscience, 8(12), 976–987.
Poeppel, D. (2001). Pure word deafness and the bilateral processing of the speech code. Cognitive Science, 25, 679–693.


Poeppel, D. (2003). The analysis of speech in different temporal integration windows: Cerebral lateralization as "asymmetric sampling in time". Speech Communication, 41, 245–255.
Price, C. J. (2012). A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage, 62(2), 816–847.
Price, C. J., Wise, R. J. S., Warburton, E. A., Moore, C. J., Howard, D., Patterson, K., et al. (1996). Hearing and saying: The functional neuro-anatomy of auditory word processing. Brain, 119, 919–931.
Purcell, D. W., & Munhall, K. G. (2006). Compensation following real-time manipulation of formants in isolated vowels. Journal of the Acoustical Society of America, 119(4), 2288–2297.
Rauschecker, J. P. (1998). Cortical processing of complex sounds. Current Opinion in Neurobiology, 8(4), 516–521.
Rauschecker, J. P. (2011). An expanded role for the dorsal auditory pathway in sensorimotor control and integration. Hearing Research, 271(1–2), 16–25.
Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12(6), 718–724.
Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. (1981). Speech perception without traditional speech cues. Science, 212, 947–950.
Rissman, J., Eliassen, J. C., & Blumstein, S. E. (2003). An event-related fMRI investigation of implicit semantic priming. Journal of Cognitive Neuroscience, 15(8), 1160–1175.
Rodd, J. M., Davis, M. H., & Johnsrude, I. S. (2005). The neural mechanisms of speech comprehension: fMRI studies of semantic ambiguity. Cerebral Cortex, 15, 1261–1269.
Rogalsky, C., & Hickok, G. (2011). The role of Broca's area in sentence comprehension. Journal of Cognitive Neuroscience, 23(7), 1664–1680.
Rogalsky, C., Love, T., Driscoll, D., Anderson, S. W., & Hickok, G. (2011). Are mirror neurons the basis of speech perception? Evidence from five cases with damage to the purported human mirror system. Neurocase, 17(2), 178–187.
Rogalsky, C., Pitz, E., Hillis, A. E., & Hickok, G. (2008). Auditory word comprehension impairment in acute stroke: Relative contribution of phonemic versus semantic factors. Brain and Language, 107(2), 167–169.
Schirmer, A., Fox, P. M., & Grandjean, D. (2012). On the spatial organization of sound processing in the human temporal lobe: A meta-analysis. Neuroimage, 63(1), 137–147.
Scott, S. K., Blank, C. C., Rosen, S., & Wise, R. J. S. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123, 2400–2406.
Scott, S. K., & Wise, R. J. (2004). The functional neuroanatomy of prelexical processing in speech perception. Cognition, 92(1–2), 13–45.
Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270, 303–304.
Spitsyna, G., Warren, J. E., Scott, S. K., Turkheimer, F. E., & Wise, R. J. (2006). Converging language streams in the human temporal lobe. Journal of Neuroscience, 26(28), 7328–7336.
Turkeltaub, P. E., & Coslett, H. B. (2010). Localization of sublexical speech perception components. Brain and Language, 114(1), 1–15.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Vaden, K. I., Jr., Muftuler, L. T., & Hickok, G. (2009). Phonological repetition-suppression in bilateral superior temporal sulci. Neuroimage, 23(10), 2665–2674.
Vandenberghe, R., Nobre, A. C., & Price, C. J. (2002). The response of left temporal cortex to sentences. Journal of Cognitive Neuroscience, 14(4), 550–560.
Wernicke, C. (1874/1977). The symptom complex of aphasia: A psychological study on an anatomical basis. In R. S. Cohen, & M. W. Wartofsky (Eds.), Boston studies in the philosophy of science (pp. 34–97). Dordrecht: D. Reidel Publishing Company.
Wernicke, C. (1874). Der aphasische Symptomencomplex: Eine psychologische Studie auf anatomischer Basis. In G. H. Eggert (Ed.), Wernicke's works on aphasia: A sourcebook and review (pp. 91–145). The Hague: Mouton.
Wise, R. J. S., Scott, S. K., Blank, S. C., Mummery, C. J., Murphy, K., & Warburton, E. A. (2001). Separate neural sub-systems within Wernicke's area. Brain, 124, 83–95.
Zatorre, R. J., Belin, P., & Penhune, V. B. (2002). Structure and function of auditory cortex: Music and speech. Trends in Cognitive Sciences, 6, 37–46.
