2018 - Attent Models - Garrido+
doi: 10.1093/cercor/bhx087
Advance Access Publication Date: 10 April 2017
Original Article
Abstract
Predictive coding posits that the human brain continually monitors the environment for regularities and detects
inconsistencies. It is unclear, however, what effect attention has on expectation processes, as there have been relatively few
studies and the results of these have yielded contradictory findings. Here, we employed Bayesian model comparison to
adjudicate between 2 alternative computational models. The “Opposition” model states that attention boosts neural
responses equally to predicted and unpredicted stimuli, whereas the “Interaction” model assumes that attentional boosting
of neural signals depends on the level of predictability. We designed a novel, audiospatial attention task that orthogonally
manipulated attention and prediction by playing oddball sequences in either the attended or unattended ear. We observed
sensory prediction error responses, with electroencephalography, across all attentional manipulations. Crucially, posterior
probability maps revealed that, overall, the Opposition model better explained scalp and source data, suggesting that
attention boosts responses to predicted and unpredicted stimuli equally. Furthermore, Dynamic Causal Modeling showed
that these Opposition effects were expressed in plastic changes within the mismatch negativity network. Our findings
provide empirical evidence for a computational model of the opposing interplay of attention and expectation in the brain.
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
1772 | Cerebral Cortex, 2018, Vol. 28, No. 5
Task
Participants were seated in front of a computer screen and
wore inner-ear buds for the duration of the experiment. Prior to
recordings, participants listened to an example auditory stream
of 1-min duration, which demonstrated the single and double
gaps in the white noise. Each participant then underwent a
brief practice session with auditory stimuli consisting of 9 single
and 9 double gaps, and a total of 110 tones. Participants
were given feedback about their accuracy in this practice block
but not in the experimental blocks. At the beginning of each
after source localization (see below). All sensor effects are reported at a threshold of P < 0.05 with family-wise error (FWE) correction for multiple comparisons over the whole spatiotemporal volume. For closer inspection of the main effects and interactions obtained at channel Fz (at which predictability effects are typically strongest, Naatanen and Alho (1997)), we implemented a 1D GLM approach using SPM12. We restricted our time window from 0 to 400 ms after stimulus onset and, in a separate analysis, between the typical MMN time window of 100–250 ms (FWE corrected over the time bins considered).

modeling the data with regressors describing the hypothesized relationships amongst the 4 different conditions.

Briefly, covariate regressor weights were applied to every participant and trial under the Opposition model, which predicts reductions in ERP amplitudes across conditions in the following order: (1) attended unpredicted, (2) unattended unpredicted/attended predicted, and (3) unattended predicted. Next, we specified a second model derived from Kok et al. (2012), the Interaction model, which predicts reductions in ERP amplitudes across conditions in the following order: (1) attended predicted, (2) attended
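As a minimal sketch of the Opposition ordering described above, the three amplitude levels can be turned into a per-trial covariate. The numeric weights (3, 2, 2, 1), the mean-centring step, and the names OPPOSITION_RANK and opposition_regressor are assumptions made for illustration; only the rank order of the conditions comes from the text, and this is not the SPM12 implementation used in the study. The Interaction ordering is left out because it is cut off in the text above.

```python
# Hypothetical ordinal weights for the Opposition model.
# Larger weight = larger predicted ERP amplitude. Only the rank order is
# taken from the text; the values 3, 2, 2, 1 are an illustrative assumption.
OPPOSITION_RANK = {
    ("attended", "unpredicted"): 3,
    ("unattended", "unpredicted"): 2,
    ("attended", "predicted"): 2,
    ("unattended", "predicted"): 1,
}


def opposition_regressor(trials):
    """Return one mean-centred covariate value per trial.

    `trials` is a list of (attention, prediction) labels, one per trial.
    Mean-centring is an assumption, not a step stated in the text.
    """
    raw = [OPPOSITION_RANK[trial] for trial in trials]
    mean = sum(raw) / len(raw)
    return [weight - mean for weight in raw]


# One trial per condition, in the order listed in the text.
trials = [
    ("attended", "unpredicted"),
    ("unattended", "unpredicted"),
    ("attended", "predicted"),
    ("unattended", "predicted"),
]
regressor = opposition_regressor(trials)
print(regressor)  # [1.0, 0.0, 0.0, -1.0]
```

With one trial per condition the covariate sums to zero, so it captures only the hypothesized ordering and not the overall signal level.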
candidate nodes for the DCMs. Whether including anatomical information would improve the source reconstruction results at the group level is unclear. This raises an interesting model comparison related to that addressed in Mattout et al. (2007); Henson et al. (2009), who showed that individual MRI does not add to the precision of source estimates compared with an individual deformed template. This was done for MEG data, however, and it is unclear what the impact on EEG might be when using an MNI template without individual deformations. Given that MEG has higher spatial resolution and is more sensi-

each ear—were grouped into unilateral (focused) or bilateral (divided) attention conditions (30 targets over 8 blocks and 60 targets over 4 blocks, respectively). We excluded any participants who did not achieve mean response accuracy >50%. There was no significant difference in response accuracy (P = 0.14) between the unilateral (M = 71.80%, standard error of mean [SEM] = 5.19%) and the bilateral (M = 68.33%, SEM = 5.13%) conditions. Participants were significantly faster (P = 0.03) to respond in the bilateral (M = 748.16 ms, SEM = 27.67 ms) than the unilateral conditions (M = 779.79 ms, SEM = 34.13 ms),
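The behavioral summary above combines an exclusion rule (mean accuracy above 50%) with group means and standard errors. A minimal sketch of that arithmetic follows. The accuracy values are made up for illustration and are not the study's data, and the function name mean_and_sem is hypothetical.

```python
import math


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (SEM).

    SEM = sample standard deviation (n - 1 denominator) / sqrt(n).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Made-up mean response accuracies (%), one per participant.
accuracies = [72.0, 45.0, 80.0, 64.0]

# Exclusion rule from the text: keep only participants with accuracy > 50%.
included = [a for a in accuracies if a > 50.0]

mean, sem = mean_and_sem(included)
print(included)       # [72.0, 80.0, 64.0]
print(round(mean, 2)) # 72.0
print(round(sem, 2))  # 4.62
```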
source-reconstructed images revealed 2 significant clusters for the main effect of Prediction in the left ([−42 −10 −38], peak-level Tmax = 4.14, cluster-level PFWE = 0.019) and right inferior temporal gyri ([44 0 −42], peak-level Tmax = 3.77, cluster-level PFWE = 0.023) (Fig. 4C).

Opposition Wins Over Interaction—Evidence From Posterior Probability Maps

Scalp Level

BMS was used to compare the 2 competing models of the relationship between Attention and Prediction (the Opposition or Interaction models; see Fig. 1). Specifically, we were interested in comparing the strength of neural activation under the different manipulations of attention and prediction. We used random effects BMS to create group-level PPMs for each model, derived from the log-model evidence of each participant, that is, the evidence that a given model (Opposition or Interaction) generated the data.

As shown in Figure 5, BMS revealed that the Opposition model (“Attention and Prediction oppose”) was the more likely (>75% model probability) explanation for the data across most frontocentral channel locations at the majority of time points (70–210 and 290–375 ms). However, the Interaction model (“Attention and Prediction interact”) had a higher probability (>75%) of explaining the data between 170 and 230 ms (i.e., within the MMN time window) at central and lateral parietal channel locations. Thus, the relationship between Attention and Prediction differed depending on both the time point and scalp location; although more often than not, Attention and Prediction had opposing effects.

The fact that the Interaction model won within the MMN window and yet we did not find a significant interaction in the classic GLM analysis could perhaps be explained by a Prediction by Attention interaction effect that did not survive
Modeling Attention and Prediction Garrido et al. | 1777
correction for multiple comparisons. We further examined a potential interaction effect, hindered perhaps by a rather conservative multiple comparison correction procedure. Firstly, we used more lenient, uncorrected peak-level statistics to select 2 small interaction clusters at 175 ms (peak-level Fmax = 5.79, peak-level Puncorr = 0.004; at central channels) and 360 ms (peak-level Fmax = 5.45, peak-level Puncorr = 0.006; at right parietal channels—see Fig. 6). We then took the spatiotemporal coordinates of these clusters and extracted the posterior probability of each model at that particular location. We constructed a 10³ cube around these coordinates and took the average posterior probability of each model over that volume. Our reasoning was that if an interaction between Attention and Prediction were present in the data, then the Interaction model would have a higher posterior probability compared with the Opposition model at these coordinates. We found that at 175 ms over frontocentral channels there was a negligible difference between the Opposition and Interaction models, with 48% and 52%, respectively (Fig. 6). However, at 360 ms over the right lateral parietal area, the Opposition model probability far exceeded that of the Interaction model, with a value of 80%. Thus, Attention and Prediction appear to have opposing effects later in time.

Source Level

Finally, we applied the same BMS technique employed at the sensor level to our source reconstructed results. BMS revealed that the Opposition model had the higher model probability and larger clusters at the source (Fig. 7). The Opposition model achieved >50% model probability in the left middle temporal gyrus (cluster size; KE = 82) and right inferior temporal gyrus (cluster size; KE = 288). Conversely, the Interaction model achieved >50% model probability in a smaller cluster in the left middle temporal gyrus (cluster size; KE = 32). We then compared the model probabilities at the center of these clusters and showed that the Opposition model was more probable than the Interaction model in the left middle temporal and right inferior temporal gyri (winning with 82% and 78% probability, respectively). Furthermore, model probabilities extracted
from the peak of the Interaction model cluster showed only a slight advantage for the Interaction over the Opposition model (with 57% probability for the Interaction model) in the left middle temporal gyrus. Such a small difference between the probability of the Interaction model over the Opposition model at this cluster suggests we should be cautious in drawing any strong conclusions about its functional anatomy.

Discussion

In this study, we adjudicated between 2 alternative computational models of the effect that spatial attention has on expectations. Using Bayesian model comparison of scalp PPMs we found that, except for an early time window (within the typical MMN), the Opposition model won over the Interaction model. This suggests that, for the most part, attention provides an equivalent boost to neuronal responses to predicted and
unpredicted stimuli. Similarly, at the source level we found stronger evidence for the Opposition model underlying a frontotemporal network. We investigated this further with DCMs that employed trial-dependent plastic changes according to either the Opposition or the Interaction model. In agreement with the model-based scalp and source analysis, we found that the family of Opposition models better explained the data.

Classic SPM analysis of spatiotemporal maps revealed an effect of prediction across and within all attentional manipulations, which peaked within the typical MMN time window and at frontocentral channels. This effect was statistically greater in the attended compared with the unattended conditions at the single channel level, where MMN is typically seen, suggesting that attention amplifies prediction errors. At the whole spatiotemporal map level, however, this interaction effect did not survive correction for multiple comparisons over the whole space-time, despite the appearance of somewhat larger clusters for the attended than the unattended condition.

Our finding of a prediction error effect in all attention conditions (attended, unattended, and divided) is in agreement with a vast body of work suggesting that the MMN is elicited regardless of attention, and hence is “pre-attentive” in nature (Naatanen et al. 2001). This is in contradistinction to Auksztulewicz and Friston (2015), who did not find an effect of prediction in the absence of attention (although this might have been due to a lack of power, as very few trials were included). Again, our finding of a prediction error effect regardless of attention is opposite to Todorovic et al. (2015), who found that while beta synchrony decreased with expectation in the unattended condition, no difference was found in the attended condition. The latter is seemingly at odds with the idea that attention amplifies prediction errors as previously shown (Jiang et al. 2013; Auksztulewicz and Friston 2015), and as revealed in the current study. A number of factors could explain such conflicting results. Perhaps most importantly, very different paradigms and measures were employed across the relevant experiments. Both our study and that of Auksztulewicz and Friston (2015) investigated the effects of attention and prediction on evoked responses in an oddball paradigm, whereas Todorovic et al. (2015) focused on endogenous oscillatory activity. Moreover, both Auksztulewicz and Friston (2015) and Todorovic et al. (2015) manipulated temporal attention, whereas here we manipulated spatial attention. Finally, in our experiment attention and prediction were manipulated within the same spatial location (left or right ears), but were drawn toward independent auditory “objects” (noise for the attention task, and tones for the concurrent oddball stream). By contrast, the aforementioned studies (and that of Kok et al. (2012)) manipulated attention and prediction within the same (visual or auditory) object. It is possible that our attention manipulation, based on spatial selectivity, had a small effect on the tones (in the attended condition), given that these were task-irrelevant and that they
never occurred at the same time as the task-relevant noise gaps. However, we believe that this is improbable for 2 reasons. First, the onset of the noise gaps was unpredictable and hence participants had to constantly monitor the stream of sounds on the task-relevant side of space. Second, it is unlikely that participants learned that the noise gaps never coincided with the tones, and could therefore momentarily disengage attention from the noise task. Having said that, the possibility remains that by having the participants focus on the noise streams instead of the tones, our attention manipulation might not have influenced the neural representations of the tones as much as it would have, had we asked the participants to focus on the tones. Future work should test whether manipulating attention and prediction for common versus independent stimuli alters the extent to which they interact.

In this work we directly compared 2 competing models of the effects of attention on expectations—the Interaction and Opposition models—put forward in Kok et al. (2012). The data in that study were consistent with the Interaction model when considering regions of the visual cortex (V1, V2, and V3). Here, however, we took a different approach by implementing the models computationally and directly testing them against our data. By using Bayesian model comparison of statistical maps of EEG activity, and DCMs for ERPs, we were able to quantify how likely each of these 2 models was at every point of space and time at the scalp level, at each voxel in source space, and in the trial-dependent plastic changes within a cortical network. The Opposition model was unambiguously favored in our data at every level, that is, scalp, source, and network. At the network level we found that the plastic changes according to the Opposition model were more pronounced in forward connections. This is consistent with the idea that attention boosts, or heavily weights, prediction errors, which are then conveyed upward in the cortical hierarchy. Such prediction errors signal the need to update an internal perceptual model of the world, in turn prompting learning. At first glance it may appear that boosting of prediction errors is more consistent with the Interaction model. It is important to note, however,
that the corollary of the Interaction model is that attention reverses prediction, such that larger responses will be observed for predicted compared with unpredicted stimuli. In this sense, attention changes the sign of the prediction error instead of boosting it. On the contrary, boosting of prediction errors could in principle be accommodated by the Opposition model as it predicts a larger difference between unpredicted and predicted responses in the attended versus unattended condition. Having said this, our instantiation of the Opposition model is agnostic to such a relationship and was not modeled explicitly here. The

Fellowship (FL110100103) to J.B.M., the ARC Centre of Excellence for Integrative Brain Function (ARC Centre Grant CE140100007) to M.I.G. and J.B.M., and an ARC Special Research Initiative—Science of Learning Research Centre (SR120300015) to J.B.M.

Notes

We thank the volunteers for participating in this study and Maria Joao Rosa for discussions. Conflict of Interest: The authors declare no competing financial interests.
Rao RP, Ballard DH. 1999. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci. 2:79–87.
Rosa MJ, Bestmann S, Harrison L, Penny W. 2010. Bayesian model selection maps for group studies. Neuroimage. 49:217–224.
Spratling MW. 2008. Reconciling predictive coding and biased competition models of cortical function. Front Comput Neurosci. 2:4.
Summerfield C, Egner T. 2009. Expectation (and attention) in visual cognition. Trends Cogn Sci. 13:403–409.
Summerfield C, Koechlin E. 2008. A neural representation of prior information during perceptual inference. Neuron. 59:336–347.
Todorovic A, Schoffelen JM, van Ede F, Maris E, de Lange FP. 2015. Temporal expectation and attention jointly modulate auditory oscillatory activity in the beta band. PloS One. 10:e0120288.