Psychophysiology - 2005 - Heinks Maldonado - Fine Tuning of Auditory Cortex During Speech Production
Abstract
The cortex suppresses sensory information when it is the result of a self-produced motor act, including the motor act of
speaking. The specificity of the auditory cortical suppression to self-produced speech, a prediction derived from the
posited operation of a precise forward model system, has not been established. We examined the auditory N100
component of the event-related brain potential elicited during speech production. While subjects uttered a vowel, they
heard real-time feedback of their unaltered voice, their pitch-shifted voice, or an alien voice substituted for their own.
The subjects’ own unaltered voice feedback elicited a dampened auditory N100 response relative to the N100 elicited by
altered or alien auditory feedback. This is consistent with the operation of a precise forward model modulating the
auditory cortical response to self-generated speech and allowing immediate distinction of self and externally generated
auditory stimuli.
Descriptors: Efference copy, Event-related potential (ERP), N100, Auditory feedback, Speech production
Author note: This research was supported by NIH grants MH 58262 and MH067967, NARSAD, the Department of Veterans Affairs, and the German National Merit Foundation. We thank J. Houde, S. Nagarajan, W. Roth, A. Maldonado, and U. Halsband for their advice and assistance. Address reprint requests to: Judith M. Ford, Ph.D., Psychiatry Service 116A, VA Healthcare System, 950 Campbell Avenue, West Haven, CT 06516, USA. E-mail: judith.ford@yale.edu.

Sensory stimulation resulting from self-initiated actions is experienced differently than stimulation produced by an external source. When we move our eyes, we do not perceive a moving room, and even ticklish people cannot seem to tickle themselves (Blakemore, Wolpert, & Frith, 1998; Weiskrantz, Elliot, & Darlington, 1971), perhaps because the brain processes the sensory consequences of self-initiated actions differently from externally generated sensory input. It is as if the brain expects the sensory consequence of self-initiated action, enabling us to distinguish potentially important external events from stimulation that results from our own motor acts. It has been proposed that information about motor commands is transmitted in a forward system to make this distinction (Jeannerod, 1988; Wolpert, 1997; Wolpert, Ghahramani, & Jordan, 1995). These forward or "re-afference hypothesis" models propose that an efference copy of the motor commands is used to predict the sensory consequences (corollary discharge) of the action (Hein & Held, 1962; Sperry, 1950; von Holst & Mittelstädt, 1950). A subtractive comparison of this corollary discharge with the actual sensory feedback associated with the action ("re-afference") provides a mechanism for filtering sensory information. When there is a match between the predicted and actual sensory feedback, a net cancellation of sensory input results, leading to a dampened sensory experience. When these signals do not match, or when there is no corollary discharge to cancel the sensory feedback (as occurs when sensory stimulation results from external events), sensory experience is intensified, alerting us to potentially important environmental events.

Although this forward model has been applied most often in the visual or tactile modality, it can also be applied to other sensory responses that are affected by self-generated actions, such as the auditory response to self-produced speech. Early evidence for the effect of vocal production on the auditory system came from animal studies in bats, birds, and monkeys. For example, in bats a 15-dB attenuation of the responses in the lateral lemniscus of the midbrain has been found during vocalization (Suga & Schlegel, 1972; Suga & Shimozawa, 1974). In monkeys, activity in the auditory cortex is inhibited during vocalizations (Eliades & Wang, 2003; Müller-Preuss & Ploog, 1981).

In humans, the results are less consistent; however, there have been reports of dampened temporal lobe responsiveness during speech production (Creutzfeldt, Ojemann, & Lettich, 1989a, 1989b). Magnetoencephalography (MEG) recordings have shown that auditory cortical responses to self-produced speech are attenuated when compared to responses to tape-recorded speech (Curio, Neuloh, Numminen, Jousmaki, & Hari, 2000; Houde, Nagarajan, Sekihara, & Merzenich, 2002; Numminen & Curio, 1999; Numminen, Salmelin, & Hari, 1999; Pantev, Eulitz,
Psychophysiology, 2005, issue 2. DOI: 10.1111/j.1469-8986.2005.00272.x
Altered auditory feedback 181
Elbert, & Hoke, 1994). For example, studies by Curio et al. and Houde et al. showed a reduction of the auditory M100 component to speech sounds when subjects spoke them relative to when they heard them played back. Using electroencephalography (EEG) we showed a similar effect on the auditory N100 (N1) component of the event-related brain potential (ERP; Ford, Mathalon, Heinks, Kalba, & Roth, 2001), with the N100 response being smaller for speech during its production than during its playback; both the magnetic M100 and the EEG N100 have a dominant source in auditory cortex and its immediate environs (Hari et al., 1987; Krumbholz, Patterson, Seither-Preisler, Lammertmann, & Lutkenhoner, 2003; Ozaki et al., 2003; Pantev, Eulitz, Hampson, Ross, & Roberts, 1996; Reite et al., 1994; Sams et al., 1985).

These results could be accounted for by the operation of a forward model in which an efference copy of the speech commands and a corollary discharge representing their predicted auditory consequences modulate the responsiveness of the auditory cortex. However, these results could also be explained by a general gating or dampening of all incoming auditory stimulation during self-generated speech. Support for this general dampening hypothesis comes from ERP (Ford, Mathalon, Kalba, et al., 2001) and MEG (Houde et al., 2002) studies showing that the auditory cortical response to sound probes (e.g., phonemes, white noise bursts, tone pips) is attenuated when probes are presented while subjects produce speech relative to when subjects passively listen to the probes. However, these studies also showed that the cortical response to sound probes was similarly attenuated when these probes were presented during the tape-recorded playback of the speech produced by subjects. Whereas Ford, Mathalon, Kalba, et al. showed the response to sound probes to be equally attenuated by speaking and listening to playback of recorded speech, Houde et al. showed a very small decrement in the M100 to sound probes during speech production relative to speech playback. Indeed, the fact that this decrement was quite small in comparison with the large suppression of the M100 to speech itself during its production relative to its playback led Houde et al. to conclude that general dampening of the auditory cortex during speech production was, at best, a negligible effect. Instead, Houde et al. argued that their data provided strong support for the forward model hypothesis, in which sensory stimulation (in this case, the auditory re-afference) is specifically suppressed to the extent that it matches the predicted sensory consequences (i.e., expected sound) associated with the efference copy of the motor act (i.e., speech commands).

Sensorimotor studies in the somatosensory system (Blakemore et al., 1998; Blakemore, Wolpert, & Frith, 2000; Weiskrantz et al., 1971) provide evidence for a precise forward model in which sensory stimulation has to correspond accurately to the movement producing it in order to attenuate its perception, with the amount of perceptual attenuation being proportional to the accuracy of the sensory prediction. For example, in a study reported by Blakemore, Frith, and Wolpert (1999), subjects were asked to rate the sensation of self-produced tactile stimulation. When varied degrees of delay or trajectory rotation between the subject's movement and the resultant tactile stimulation were introduced, the tactile sensation was rated as more intense than when there was no externally manipulated alteration of the movement. Furthermore, the subjects reported incremental increases in perceived intensity as the delay or the trajectory rotation was parametrically increased.

Support for a similarly precise forward model effect in the auditory modality during speech production comes from several MEG (Curio et al., 2000; Houde et al., 2002) and ERP (Ford, Mathalon, Heinks, et al., 2001; Ford, Mathalon, Kalba, et al., 2001) studies, as mentioned above. However, conclusions from these studies are limited by their reliance on an experimental approach involving the comparison of auditory responses to speech during its production relative to its playback. This approach is associated with a potential confound: Although the loudness of the played-back speech was matched to the loudness of the speech as it was being spoken, the speech sounds may have differed in quality due to properties of bone conduction, middle ear muscle contraction, and the response characteristics of the ear. Thus, dampened cortical responsiveness during speaking relative to playback could, in part, be due to the different physical qualities of the sounds.

Another approach to testing the hypothesis that a precise forward model operates in the auditory system is to manipulate the re-afferent auditory feedback that subjects hear as they produce speech. Alteration of the auditory feedback experienced during speech allows for a direct test of the prediction, derived from the precise forward model hypothesis, that auditory cortical dampening during speech is greater to the exact speech sound and less to sounds that do not match it. There is evidence from PET studies (Hirano et al., 1997) that during talking, unaltered and altered auditory feedback (altered either by distortion or by time delay) activate different brain regions. However, PET studies are not able to reveal the temporal dynamics of activity on a millisecond scale like EEG or MEG. Houde et al. (2002) used MEG to compare the M100 to speech production versus speech playback in two different experiments, one involving accurate acoustic delivery of the speech sounds and the other involving the addition of white noise that coincided with and effectively masked speech sounds during their production and playback. The authors found that the M100 suppression observed during speech production relative to playback was abolished when subjects heard white noise instead of the expected voice feedback. Although these results may show some specificity of the cortical responsiveness, white noise is far different from speech and produces widespread activation of the auditory cortex. Thus, the results do not necessarily show the precision of the sensory attenuation during speaking. Moreover, the Houde et al. results were potentially confounded by their reliance on direct comparisons of spoken versus played-back speech, as discussed above.

Accordingly, our goal was to design an experiment to investigate the precision of the forward model hypothesis by assessing modulations of cortical responsiveness to speech sounds during their production without having to compare them to their playback. To this end, we altered the re-afferent auditory feedback associated with self-produced speech, allowing us to examine the degree to which suppression of the auditory cortical response depends on the closeness of the match between the auditory feedback and the predicted feedback (Figure 1). EEG was recorded while the subjects produced speech sounds and heard real-time feedback of either their unaltered speech, their pitch-shifted speech, or the voice of another person. The N100 component of the auditory ERP was compared across these different speech-feedback conditions. Two tasks were conducted. First, we tested the specificity of signal attenuation during speech production by presenting the subjects with the different feedback conditions. Second, we tested whether there was also a difference in cortical activity to the different speech conditions when the
182 T.H. Heinks-Maldonado et al.
[Figure 1 schematic; legible labels: motor command, desired speech sound, efference copy, internal model of vocal apparatus, model of environmental influences, no discrepancy, discrepancy, pitch shifted, alien voice.]
Figure 1. A model for determining the auditory consequences of speaking. An internal forward model makes predictions of the auditory feedback (corollary discharge) based on a copy of the motor command (efference copy). These predictions are then compared with the actual auditory feedback (re-afference). Self-produced speech sounds can be correctly predicted on the basis of the efference copy and are associated with little or no sensory discrepancy resulting from the comparison between predicted and actual feedback. This results in suppression of the auditory cortical response to the self-produced sound, reflected in a reduced N100 amplitude. When the actual feedback does not match the predicted feedback (because the feedback has been altered), the discrepancy increases, and so does the likelihood that the sound was externally produced. As a result, the cortical suppression decreases and the N100 amplitude increases. Such a system would allow the effects of self-produced speech to be canceled out, and thereby sounds due to self-produced speech to be distinguished from auditory feedback caused by the environment.
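The comparison stage in Figure 1 can be illustrated with a toy computation. This is a sketch only, not the authors' model: the feature vectors and the normalized-distance discrepancy measure are assumptions introduced here for illustration.

```python
import numpy as np

def n100_gain(predicted: np.ndarray, actual: np.ndarray) -> float:
    """Toy corollary-discharge comparator.

    predicted: feedback expected from the efference copy.
    actual: re-afferent feedback actually heard.
    Returns a gain in [0, 1] that scales N100 amplitude: a perfect
    match yields 0 (maximal suppression); a large mismatch approaches
    1 (little or no suppression).
    """
    discrepancy = np.linalg.norm(actual - predicted) / np.linalg.norm(predicted)
    return float(min(1.0, discrepancy))

# Hypothetical coarse spectral-envelope features for the feedback types.
self_voice = np.array([1.0, 0.8, 0.4, 0.2])
pitch_shifted = np.array([0.9, 0.9, 0.5, 0.2])  # same voice, pitch shifted
alien_voice = np.array([0.3, 0.5, 0.9, 0.6])    # different speaker

gains = [n100_gain(self_voice, v) for v in (self_voice, pitch_shifted, alien_voice)]
# Suppression is graded: self unaltered < self pitch-shifted < alien.
```

Under this sketch the response to the subject's own unaltered voice is maximally suppressed, and any mismatch in pitch or speaker releases the response, which is exactly the ordering the experiment tests.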
subject was passively listening to playback of the recorded speech. Data from these tasks were also compared to each other.

Methods

Participants
We recorded ERPs from 17 men (ages 21–48) who were fluent English speakers and naïve to the purpose of this study. After giving informed consent and passing a hearing test, each participant took part in the acclimation phase followed by two runs of the speaking task and two runs of the listening task.

Tasks
The experiment started with an acclimation phase in which participants produced the vowel [a:] while being made aware of the various feedback conditions. In the speaking task, participants were told to utter a short [a:] about every 5 s. The feedback voice participants heard over headphones was varied randomly between their own unaltered voice (self-unaltered), their own voice pitch-shifted downward by two semitones (self-pitch shifted), the alien unaltered voice (alien-unaltered), or the alien voice pitch-shifted downward by two semitones (alien-pitch shifted). As suggested by Shuster and Durrant (2002), the self-unaltered voice needed to be pitch-shifted down 0.3 semitones to best match the subjective experience of self-generated speech.

After each trial, participants were prompted to indicate via button press whether the feedback heard was their own voice, the alien voice, or whether they were unsure. Participants were required to respond within 1.5 s after the prompt. Responses falling outside that window were considered misses. Participants were told in the instructions that their own or the alien voice would sometimes be pitch-shifted, but that they were still required to decide whether its source was their own or the alien voice. These behavioral responses were collected to assess whether participants were actually able to distinguish between the sources of the different types of feedback.

Visual stimuli presented on a computer monitor were used to prompt the participant to speak or respond. To avoid an overlap of visual and auditory responses, the participants were instructed to speak after the disappearance of the visual cue on the screen. Considering an average vocal response time of about 200–500 ms, the auditory N100 should be relatively free of the influence of visual ERP components. To further avoid an overlap of the N100
with a motor potential associated with the subsequent button press, a stimulus was displayed on the screen 1.5 s after the onset of each speech sound to prompt the participants to indicate their response with a button press. All trials containing early responses or early speech sounds were excluded from analysis.

In the listening task, the recorded feedback sounds from the speaking task were played back, and participants were instructed to merely listen and then decide about the source of the voice heard. All other features remained the same as in the speaking task, including the same visual cues and volume. The listening task was carried out to replicate the approach of other studies comparing cortical responses during speaking and listening, as well as to determine whether there were differential effects of the feedback conditions when merely listening. Each task consisted of 240 trials, with 60 trials per condition.

Instrumentation
To create the different feedback conditions, we used an audio presentation system (Figure 2) that allowed us to detect the participant's vocalization and, in real time, modulate the participant's voice or substitute it with a prerecorded speech sample of a male voice ("alien"). When the participant vocalized, the speech signal was picked up by a microphone and sent through a preamplifier to a personal computer equipped with sound processing software and hardware. The incoming audio signal was used to generate a trigger pulse that initiated either a pitch shift or the alien voice sample (as shown in Figure 2), which was amplified and played to the participant via headphones.

The analog audio system consisted of an Audix OM2 microphone, Nady MM4 mini-mixer, RCA SA155 stereo amplifier, and audio-technica ATH-M40fs studiophones. Digital processing was accomplished with the Reaktor software program (Native Instruments) running on a Gateway PC (MS Windows 2000, 800 MHz) with an M-Audio Audiophile 2496 sound card. The digital sampling rate of the sound card was 44,100 Hz, and the ASIO drivers delivered 88 samples per processing bin. This, combined with a 1.25-ms software control rate, allowed us to detect and modulate in real time the participant's vocalizations through the digital processing stream with only 6 ms of delay, as measured with a Tektronix oscilloscope. A delay this small is not perceptible (Lee, 1950; Stone & Moore, 1999), and it is unlikely to influence the participant's performance or the ERP amplitudes or latencies.

A trigger pulse, signaling onset of vocalization, was generated within the software program on the rising edge of the rectified and low-pass filtered channel of the split incoming audio signal. This internal trigger pulse initiated all other software processing, including modification of the original incoming audio channel, resetting of the trigger production module, and insertion of a trigger code in the EEG data collection system. The rectified and filtered signal was also used internally to drive an envelope follower that modulated the processed output signal to match the incoming audio signal in amplitude and duration. The average duration of the participants' vocalizations was approximately 350 ms.

The mean sound pressure level (SPL) of the participants' utterances was 76 dB, measured at a 5-cm distance from the participants' mouths. During both the speaking and the listening tasks, the mean SPL of the speech sounds played back over the headphones was increased 15 dB over the average SPL of the participant's speech. This was necessary to mask the effect of bone conduction during vocalization. The SPL measurements were made directly at the headphone using a special coupler to connect the SPL meter and the headphones.

Data Acquisition and Processing
We acquired EEG data continuously from 27 sites (F7, F3, Fz, F4, F8, FT7, FC3, FCz, FC4, FT8, T7, C3, Cz, C4, T8, TP7, CP3, CPz, CP4, TP8, P7, P3, Pz, P4, P8, TP9, TP10) referenced to the nose. Additional electrodes were placed on the outer canthi of the eyes to measure horizontal eye movements, and above and below the right eye to monitor blinks and vertical eye movements. Epochs were synchronized to vocalization onset and corrected for eye movements and blinks (Gratton, Coles, & Donchin, 1983), and then re-referenced relative to the mastoid electrodes to minimize artifacts associated with talking as well as to be consistent with the reference sites used in our prior studies (Ford, Mathalon, Heinks, et al., 2001; Ford, Mathalon, Kalba, et al., 2001). After rejecting trials containing artifacts (voltages exceeding ±50 µV), averages using only correctly identified trials were created and then band-pass filtered 0.5–12 Hz. Averages containing fewer than 15 trials were not included in the statistical analyses.

N100 was defined as the most negative peak between 80 and 120 ms following the onset of the speech sound and was measured relative to a baseline of 150 ms prior to stimulus onset.

Statistics
Repeated-measures analyses of variance (ANOVA) were conducted to examine effects of Task (speaking, listening), Source (self, alien), and Pitch (unaltered, pitch-shifted) on the accuracy of participants' judgments regarding the source of the speech sounds they heard. Analysis of the ERP N100 data was guided by the forward model theory, which posits N100 attenuation to the auditory re-afference when it matches the expected sound associated with the produced speech (i.e., self-unaltered feedback), and by MEG studies reporting hemispheric lateralization of the suppression effect. Thus, N100 amplitudes were analyzed in a four-way ANOVA including factors of Task (speaking, listening), Condition (self unaltered, self pitch-shifted, alien unaltered, alien pitch-shifted), Laterality (left, right), and Electrode Site. We included 20 electrode sites in the analysis, 10 for each hemisphere.

[Figure 2 schematic; legible labels: "ah" (vocalization), no alteration, pitch shift, alien voice, amplifier.]
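The 6-ms feedback delay reported for the instrumentation is consistent with the buffer and control-rate figures given above. The arithmetic below is a rough check; how the individual stages sum to the measured total is an assumption, since the pipeline is not itemized in the text.

```python
SRATE = 44_100     # Hz, sound-card sampling rate
BUF = 88           # samples delivered per ASIO processing bin
CONTROL_MS = 1.25  # software control rate, ms

# One processing bin is about 2 ms of audio.
buffer_ms = 1000.0 * BUF / SRATE

# Assuming one input bin, one output bin, and one control tick,
# the path adds up to roughly 5.2 ms, in line with the ~6 ms
# measured on the oscilloscope.
approx_delay_ms = 2 * buffer_ms + CONTROL_MS
```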
Results
Figure 4. Event related potential (ERP) waveforms averaged over all subjects at all sites for the four conditions during speaking. A: Subjects heard their
own voice either unaltered or pitch-shifted. B: Subjects heard the alien voice either unaltered or pitch-shifted. In both plots 0 indicates the onset of the
speech sound.
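Waveforms like those in Figure 4 are scored with the N100 rule given in the Methods: the most negative peak 80–120 ms after sound onset, relative to a 150-ms pre-stimulus baseline. A minimal sketch of that measurement follows; the 500-Hz sampling rate and the single-channel array layout are assumptions, not details from the paper.

```python
import numpy as np

def score_n100(epoch: np.ndarray, srate: float = 500.0, onset_idx: int = 75) -> float:
    """Return N100 amplitude (µV) for one channel's epoch.

    epoch: voltages for one trial; onset_idx marks speech-sound onset.
    Baseline = mean of the 150 ms preceding onset; N100 = most negative
    baseline-corrected value in the 80-120 ms post-onset window.
    """
    n_base = int(0.150 * srate)
    baseline = epoch[onset_idx - n_base:onset_idx].mean()
    lo = onset_idx + int(0.080 * srate)
    hi = onset_idx + int(0.120 * srate)
    return float((epoch - baseline)[lo:hi + 1].min())

# Synthetic epoch: flat baseline with a -5 µV dip 100 ms after onset.
epoch = np.zeros(200)
epoch[125] = -5.0   # 50 samples after onset_idx 75 = 100 ms at 500 Hz
print(score_n100(epoch))  # -5.0
```

Because the measure is baseline-corrected, a constant voltage offset on the whole epoch leaves the score unchanged.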
Figure 4. (Continued)
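The pitch-shifted conditions shown in these figures correspond to simple frequency ratios: a shift of n semitones multiplies frequency by 2^(n/12). The following is a back-of-envelope check, not the actual Reaktor patch, and the 120-Hz example fundamental is an assumed value.

```python
def semitone_ratio(n: float) -> float:
    """Frequency ratio produced by a shift of n semitones (negative = down)."""
    return 2.0 ** (n / 12.0)

shift_2_down = semitone_ratio(-2.0)    # ~0.891, the self/alien pitch-shift condition
shift_0_3_down = semitone_ratio(-0.3)  # ~0.983, the self-voice calibration shift

# Example: an assumed 120-Hz male fundamental under each manipulation.
f0 = 120.0
print(round(f0 * shift_2_down, 1))    # 106.9
print(round(f0 * shift_0_3_down, 1))  # 117.9
```

So the two-semitone condition moves the voice by roughly 11% in frequency, while the 0.3-semitone calibration correction is a change of under 2%: small acoustically, but enough to matter for the perceived match to one's own voice.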
that subjects heard as they produced speech and assessed both their perception of this feedback and its evoked auditory cortical N100 response. The performance data showed that subjects had 90% accuracy in identifying their own unaltered voice as their own. In the other three conditions, incorrect and uncertainty responses increased. This can be explained by the nature of the behavioral task. Subjects were instructed to decide whether they believed the source of the auditory feedback was their own voice or someone else's, even if the feedback was altered in pitch. We assume that it was similarly difficult to decide whether subjects heard their own voice pitch-shifted or someone else's voice either unaltered or pitch-shifted. If the instructions had been to decide whether the sound had been altered or not, the results would perhaps have been different.

The number of uncertainty responses was similar for speaking and listening. There was a tendency, however, for subjects to respond incorrectly more often during speaking than during listening. Even though the ANOVAs did not reveal a significant interaction of Task × Source × Pitch, Figure 3 shows that during the speaking task subjects had a bias toward identifying the inputs as their own. This can be explained by the fact that the subjects were actually involved in the motor act of speaking, as compared to passively listening to the sounds, which may have increased the tendency to assume their own voices to be the source of the auditory feedback.

During speech production the N100 amplitude was maximally reduced to the subject's own unaltered voice feedback relative to the pitch-shifted and alien speech feedback (Figures 4, 5).
[ERP waveform grid: electrode sites F7, F3, F4, F8; FT7, FC3, FC4, FT8; T3, C3, C4, T4; TP7, CP3, CP4, TP8; T5, P3, P4, T6; amplitude scale +7 to −8 µV, time axis 0–300 ms. Legend: self unaltered vs. self pitch-shifted.]
Figure 5. Event related potential (ERP) waveforms averaged over all subjects at all sites for the four conditions during listening. A: Subjects heard their
own voice either unaltered or pitch-shifted. B: Subjects heard the alien voice either unaltered or pitch-shifted. In both plots 0 indicates the onset of the
speech sound.
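The averaged waveforms in Figures 4 and 5 follow the trial-selection rules from the Methods: epochs with voltages beyond ±50 µV are rejected, only correctly identified trials enter the average, and averages built from fewer than 15 trials are discarded. A sketch of that bookkeeping is shown below; the function and array layout are assumptions introduced here, not the authors' code.

```python
import numpy as np

def condition_average(trials, correct, limit_uv=50.0, min_trials=15):
    """Average one condition's epochs under the Methods' selection rules.

    trials: array (n_trials, n_samples) of single-trial voltages in µV.
    correct: boolean array marking trials with a correct source judgment.
    Returns the mean waveform, or None if fewer than min_trials survive.
    """
    trials = np.asarray(trials, dtype=float)
    clean = np.abs(trials).max(axis=1) <= limit_uv    # reject artifact epochs
    keep = clean & np.asarray(correct, dtype=bool)    # keep only correct trials
    if keep.sum() < min_trials:
        return None   # average excluded from the statistical analyses
    return trials[keep].mean(axis=0)
```

With 60 trials per condition, an average survives as long as artifact rejection and incorrect responses together do not remove more than 45 trials.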
The different feedback types (self-unaltered, self-pitch shifted, alien-unaltered, alien-pitch shifted) did not lead to differences in N100 amplitude during the listening task, even though subjects correctly identified the source.

Thus, auditory response attenuation during speech production is greatest when the re-afferent auditory feedback exactly matches the predicted auditory consequences of speech (i.e., corollary discharge).

Our results are consistent with those reported by Houde et al. (2002), who found that the M100 suppression during speech production relative to playback was reduced when subjects heard white noise instead of the expected voice feedback. However, our approach extends the findings of Houde et al. in two ways. First, Houde et al.'s use of broadband white noise to mask and replace the speech sounds during their production was problematic. Because of the substantial acoustic difference between white noise and speech sounds, as well as the fact that white noise broadly activates the auditory cortex, Houde et al.'s results could have been due to the activating effects of white noise rather than to the selective attenuation of the auditory response to the
[ERP waveform grid: electrode sites F7, F3, F4, F8; FT7, FC3, FC4, FT8; T3, C3, C4, T4; TP7, CP3, CP4, TP8; T5, P3, P4, T6; amplitude scale +8 to −8 µV, time axis 0–300 ms. Legend: alien unaltered vs. alien pitch-shifted.]
Figure 5. (Continued)
specific speech produced. By using voice feedback that differed from the subject's own speech output only in pitch and/or source (self/alien), that is, masking sounds that were much more similar to the subjects' own speech than the white noise mask used by Houde et al., we demonstrated unambiguously that the attenuation of the auditory sensory response during speech production is greatest to the subject's own speech. Second, Houde et al. relied on direct comparisons between spoken and played-back speech, which potentially differed in sound quality, as discussed in the introduction. In contrast, by changing the re-afferent auditory feedback during speech production and keeping all other parameters of the forward model constant, we were able to show selective effects of a precise forward model within the speaking task itself.

It is difficult to link the results of the performance data and the ERP data. During speaking it was easiest for subjects to identify their own unaltered voice as their own, and in this case we found a suppressed N100 amplitude; that is, a suppressed N100 precedes the correct behavioral response in the self-unaltered condition. Determining whether the suppression of N100 amplitude correlates with the correctness of the response would require analysis of the N100 amplitudes during the erroneous and uncertain responses and comparison with the N100 amplitudes during correct performance. This is not feasible in our study
REFERENCES
Blakemore, S., Frith, C., & Wolpert, D. (1999). Spatio-temporal prediction modulates the perception of self-produced stimuli. Journal of Cognitive Neuroscience, 11, 551–559.
Blakemore, S., Wolpert, D., & Frith, C. (1998). Central cancellation of self-produced tickle sensation. Nature Neuroscience, 1, 635–640.
Blakemore, S., Wolpert, D., & Frith, C. (2000). Why can't you tickle yourself? NeuroReport, 11, 11–16.
Creutzfeldt, O., Ojemann, G., & Lettich, E. (1989a). Neuronal activity in the human lateral temporal lobe: I. Responses to speech. Experimental Brain Research, 77, 451–475.
Creutzfeldt, O., Ojemann, G., & Lettich, E. (1989b). Neuronal activity in the human lateral temporal lobe: II. Responses to the subject's own voice. Experimental Brain Research, 77, 476–489.
Curio, G., Neuloh, G., Numminen, J., Jousmaki, V., & Hari, R. (2000). Speaking modifies voice-evoked activity in the human auditory cortex. Human Brain Mapping, 9, 183–191.
Eliades, S., & Wang, X. (2003). Sensory-motor interaction in the primate auditory cortex during self-initiated vocalizations. Journal of Neurophysiology, 89, 2194–2207.
Ford, J., Mathalon, D., Heinks, T., Kalba, S., & Roth, W. (2001). Neurophysiological evidence of corollary discharge dysfunctions in schizophrenia. American Journal of Psychiatry, 158, 2069–2071.
Ford, J. M., Mathalon, D. H., Kalba, S., Whitfield, S., Faustman, W. O., & Roth, W. T. (2001). Cortical responsiveness during talking and listening in schizophrenia: An event-related brain potential study. Biological Psychiatry, 50, 540–549.
Gratton, G., Coles, M., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484.
Hari, R., Pelizzone, M., Makela, J., Hallstrom, J., Leinonen, L., &
Ozaki, I., Suzuki, Y., Jin, C., Baba, M., Matsunaga, M., & Hashimoto, I. (2003). Dynamic movement of N100m dipoles in evoked magnetic field reflects sequential activation of isofrequency bands in human auditory cortex. Clinical Neurophysiology, 114, 1681–1688.
Pantev, C., Eulitz, C., Elbert, T., & Hoke, M. (1994). The auditory evoked sustained field: Origin and frequency dependence. Electroencephalography and Clinical Neurophysiology, 90, 82–90.
Pantev, C., Eulitz, C., Hampson, S., Ross, B., & Roberts, L. (1996). The auditory evoked "off" response: Sources and comparison with the "on" and the "sustained" responses. Ear & Hearing, 17, 255–265.
Reite, M., Adams, M., Simon, J., Teale, P., Sheeder, J., Richardson, D., et al. (1994). Auditory M100 component 1: Relationship to Heschl's gyri. Brain Research. Cognitive Brain Research, 2, 13–20.
Roth, W., Ford, J., Lewis, S., & Kopell, B. (1976). Effects of stimulus probability and task-relevance on event-related potentials. Psychophysiology, 13, 311–317.
Sams, M., Hamalainen, M., Antervo, A., Kaukoranta, E., Reinikainen, K., & Hari, R. (1985). Cerebral neuromagnetic responses evoked by short auditory stimuli. Electroencephalography and Clinical Neurophysiology, 61, 254–266.
Lounasmaa, O. V. (1987). Neuromagnetic responses of the human Shuster, L., & Durrant, J. (2003). Toward a better understanding of self-
auditory cortex to on- and offsets of noise bursts. Audiology, 26, produced speech. Journal of Communication Disorders, 36, 1–11.
31–43. Sperry, R. (1950). Neural basis of the spontaneous optokinetic response
Hein, A., & Held, R. (1962). A neural model for labile sensorimotor produced by visual inversion. Journal of Comparative and Physiolo-
coordination. In E. Bernard & M. Hare (Eds.), Biological prototypes gical Psychology, 43, 482–489.
and synthetic systems (pp. 71–74). New York: Plenum Press. Stone, M., & Moore, B. (1999). Tolerable hearing aid delays. I.
Hirano, S., Kojima, H., Naito, Y., Honjo, I., Kamoto, Y., & Okazawa, Estimation of limits imposed by the auditory path alone using
H., et al. (1997). Cortical processing mechanism for vocalization with simulated hearing losses. Ear & Hearing, 20, 182–192.
auditory verbal feedback. NeuroReport, 8, 2379–2382. Suga, N., & Schlegel, P. (1972). Neural attenuation of responses to
Houde, J., Nagarajan, S., Sekihara, K., & Merzenich, M. (2002). emitted sounds in echolocating bats. Science, 177, 82–84.
Modulation of the auditory cortex during speech: An MEG study. Suga, N., & Shimozawa, T. (1974). Site of neural attenuation of
Journal of Cognitive Neuroscience, 14, 1125–1138. responses to self-vocalized sounds in echolocating bats. Science, 183,
Jeannerod, M. (1988). The neural and behavioral organization of goal- 1211–1213.
directed movement. Oxford, UK: Oxford University Press. Tonndorf, J. (1972). Bone conduction. In J. Tobias (Ed.), Foundations of
Jeannerod, M. (2003). The mechanism of self-recognition in humans. modern auditory theory. New York: Academic Press.
Behavioral Brain Research, 142, 1–15. von Holst, E., & Mittelstädt, H. (1950). Das Reafferenzprinzip.
Krumbholz, K., Patterson, R., Seither-Preisler, A., Lammertmann, C., Naturwissenschaften, 37, 464–476.
& Lutkenhoner, B. (2003). Neuromagnetic evidence for a pitch Weiskrantz, L., Elliot, J., & Darlington, C. (1971). Preliminary
processing center in Heschl’s gyrus. Cerebral Cortex, 13, 765–772. observations on tickling oneself. Nature, 230, 598–599.
Lee, B. (1950). Some effects of side-tone delay. Journal of the Acoustic Wolpert, D. (1997). Computational approaches to motor control. Trends
Society of America, 22, 639–640. in Cognitive Science, 1, 209–216.
Müller-Preuss, P., & Ploog, D. (1981). Inhibition of auditory cortical Wolpert, D., & Flanagan, J. (2001). Motor prediction. Current Biology,
neurons during phonation. Brain Research, 215, 61–76. 11, 729–732.
Numminen, J., & Curio, G. (1999). Differential effects of overt, covert Wolpert, D., Ghahramani, Z., & Jordan, M. (1995). An internal model
and replayed speech on vowel-evoked responses of the human for sensorimotor integration. Science, 269, 1880–1882.
auditory cortex. Neurscience Letters, 272, 29–32.
Numminen, J., Salmelin, R., & Hari, R. (1999). Subject’s own speech
reduces reactivity of the human auditory cortex. Neurscience Letters,
265, 119–122. (Received August 18, 2004; Accepted November 29, 2004)