Findling & Wyart - 2021 - Computation Noise in Human Learning and Decision-Making: Origin, Impact, Function

Available online at www.sciencedirect.com
ScienceDirect

Computation noise in human learning and decision-making: origin, impact, function
Charles Findling1,2 and Valentin Wyart1,2
Making sense of uncertain and volatile environments, a cognitive process modeled across domains as statistical inference, constitutes a difficult yet ubiquitous challenge for human intelligence. Besides sensory errors and exploratory choices, recent research has identified the limited computational precision of cognitive inference as a surprisingly large contributor to the variability and suboptimality of perceptual and reward-guided decisions made under uncertainty. This focused review discusses the theoretical and experimental evidence scattered across psychology and neuroscience which, taken together, provides key insights into the origin, impact and function of this ‘computation noise’ for learning and decision-making. Moving beyond the classical description of internal noise as a performance-limiting constraint on neural function and cognition, we outline the possible emergent benefits of computation noise for adaptive behavior in adverse conditions and highlight open questions for future research.

Addresses
1 Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et de la Recherche Médicale, Paris, France
2 Département d’Études Cognitives, École Normale Supérieure, Université PSL, Paris, France

Corresponding authors: Findling, Charles (charles.findling@gmail.com), Wyart, Valentin (valentin.wyart@ens.fr)

Current Opinion in Behavioral Sciences 2021, 38:124–132
This review comes from a themed issue on Computational cognitive neuroscience
Edited by Angela J Langdon and Geoffrey Schoenbaum
For a complete overview see the Issue and the Editorial
Available online 12th March 2021
https://doi.org/10.1016/j.cobeha.2021.02.018
2352-1546/© 2021 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Prominent variability of human decisions made under uncertainty
Humans routinely navigate uncertain and volatile environments in everyday life, from a changing weather to unexpected incidents on our regular metro line. In such conditions, adaptive behavior requires making dynamic probabilistic inferences about external events (e.g. making accurate predictions about the weather) and action-outcome contingencies (e.g. choosing a detour which minimizes additional delays on our way to work).

In the laboratory, these dynamic inferences are investigated using controlled paradigms generating different forms of uncertainty (Figure 1a). The ‘weather prediction’ task [1–3] probes the ability to combine multiple sources of uncertain information about the location of a hidden reward. After learning probabilistic associations between a set of symbols – presented one at a time – and the location of the reward, subjects are asked to predict the location of the reward based on sequences of symbols. Each symbol taken in isolation predicts the same reward location as before, but this uncertain information has to be combined across symbols to predict accurately the location of the reward. In this task, even after reaching ceiling performance at predicting the location of the reward based on symbols presented one at a time, humans (and non-human primates) show a large trial-to-trial variability in their predictions based on sequences of symbols [3,4]. In other words, subjects sometimes choose a reward location which is not associated with the highest posterior probability of reward given the presented sequence of symbols.

Another widely used paradigm, the ‘reversal learning’ task [5–7], probes the ability to monitor changes in the reward probabilities associated with choice options (e.g. the two arms of a bandit, Figure 1b). Reward probabilities being uncertain, subjects need to distinguish ‘false alarms’ – missing rewards when choosing the option associated with the highest reward probability – from genuine changes in reward probabilities. Humans engaged in this task (or one of its variants) make a substantial fraction of ‘non-greedy’ choices – choices which do not maximize expected reward but reduce the uncertainty about recently unchosen options [5,8]. Therefore, and despite the several differences between these experimental paradigms, human decisions made in uncertain and volatile environments exhibit a pervasive variability which limits their accuracy.

Large contribution of inference errors to human decision variability
The origin of decision variability under uncertainty has usually been assigned to a single source, located either at the input or output of the probabilistic inference process used to update an internal model of the environment – the location of the hidden reward in the weather prediction task, or the reward probabilities associated with each choice option in the reversal learning task. However, these accounts fail to explain empirical observations, such as why decision variability grows linearly with the number

Figure 1
(a) Description of experimental paradigms studying dynamic inferences under uncertainty. Left: visual variant of the weather prediction task. Each
trial consists of a sequence of oriented patterns, drawn from one of two generative probability distributions (sources A and B). At sequence offset,
subjects are prompted to indicate the source from which they believed the oriented patterns were drawn. Right: restless variant of the reversal
learning task. On each trial, subjects are asked to choose between two colored symbols (options), and then obtain its associated reward. The
mean rewards associated with the two options drift continuously and randomly over time (thick lines represent the drifting mean rewards
associated with the two options, whereas thin lines represent rewards sampled from the drifting means). (b) Contributions of distinct sources of errors
to human behavioral variability in the weather prediction task (left) and the reversal learning task (right). Left: inference errors explain about 90% of
the observed behavioral variability in the weather prediction task. Right: inference errors in reinforcement learning explain more than 60% of
seemingly non-greedy decisions when the reward associated with the foregone option is not observed (partial outcome condition, left), and more
than 85% when the foregone reward is observed and there is thus equal uncertainty about chosen and unchosen options (complete outcome
condition, right). (c) Decomposition of human inference errors in terms of a computation bias-variance trade-off in the weather prediction task (left)
and the reversal learning task (right). The bias term (left) corresponds to predictable errors across repetitions of the same trial, whereas the
variance term (right) corresponds to unpredictable errors across trial repetitions – that is, computation noise. In the weather prediction task, the
bias term is split into temporal biases (green), perceptual biases (blue), trial history biases (brown), and other unspecified biases (gray).
Computation noise explains about two thirds of human inference errors in the two tasks.
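
The bias-variance split in panel (c) rests on comparing errors across repetitions of the same trial. The minimal Python sketch below illustrates that logic under simplifying assumptions: each inference error is the sum of a component that repeats identically across two presentations of a trial (bias) and an independent component (noise), and the errors are directly observable. In the cited studies the errors are latent and estimated from choices through model fitting, so this is an illustration of the principle rather than the authors' analysis. The function name `noise_fraction` and the simulated variances are hypothetical.

```python
import numpy as np

def noise_fraction(err_pass1, err_pass2):
    """Share of error variance that is NOT reproduced across two repetitions.

    Assumes error = bias + noise, with bias identical across repetitions
    and noise drawn independently on each repetition (illustrative model).
    """
    e1 = np.asarray(err_pass1, dtype=float)
    e2 = np.asarray(err_pass2, dtype=float)
    total_var = 0.5 * (e1.var(ddof=1) + e2.var(ddof=1))  # bias + noise variance
    shared_var = np.cov(e1, e2, ddof=1)[0, 1]            # variance of the repeated (bias) part
    return 1.0 - shared_var / total_var

# Simulated errors: bias variance 1, noise variance 2 (assumed values)
rng = np.random.default_rng(0)
n_trials = 20000
bias = rng.normal(0.0, 1.0, n_trials)
e1 = bias + rng.normal(0.0, np.sqrt(2.0), n_trials)
e2 = bias + rng.normal(0.0, np.sqrt(2.0), n_trials)

frac = noise_fraction(e1, e2)
print(f"estimated noise fraction: {frac:.2f}")  # close to 2/3 by construction
```

With these assumed variances, the unpredictable component makes up two thirds of the error variance, which the estimator recovers from the covariance between repetitions alone, without enumerating any specific bias.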
of presented symbols in the weather prediction task – at a rate which greatly exceeds sensory variability [4]. Or why subjects make non-greedy decisions even in conditions where they observe after each choice the foregone reward associated with the unchosen option, and there is thus equal uncertainty about chosen and unchosen options [8,9,10].

These different effects can be readily explained by the presence of significant inference errors – that is, errors in the updating of the internal model of the environment (Figure 1b). In the weather prediction task [4], these cognitive errors accumulate across successive inference steps and explain why decision variability grows with the number of stimuli in the sequence. Strikingly, inference errors account for more than 85% of the observed decision variability in this task. This result means that subjects almost never choose a reward location which is not associated with the highest perceived probability of reward. Rather, they make substantial cognitive errors inferring the most probable reward location based on the presented sequence of stimuli. In a ‘random walk’ variant of the reversal learning task where non-greedy decisions reduce the uncertainty about recently unchosen options, inference errors still make up more than 60% of these decisions [8]. This observation means that a substantial
fraction of these adaptive decisions is not driven by an explicit arbitration between exploration and exploitation, but by cognitive errors when inferring the reward probabilities associated with the different choice options. In other words, many of the non-greedy decisions labeled as ‘exploratory’ (uncertainty-minimizing) when assuming noise-free inference reflect ‘exploitative’ (reward-maximizing) decisions based on misestimated reward probabilities caused by inference noise. Accounting for inference errors during reward-guided learning also explains the non-greedy decisions observed when foregone rewards are presented together with obtained rewards – that is, conditions where these decisions have no adaptive value in terms of reward maximization.

Computation bias-variance structure of human inference errors
The surprisingly large contribution of inference errors to human decision variability requires qualifying their nature. The statistical signatures of inference errors differ from those of stochastic choice policies (e.g. softmax, Thompson sampling) in the sense that inference errors committed at one point in time corrupt (in a normative sense) the internal model of the environment and thus propagate forward – within a sequence of stimuli in the weather prediction task [4], or across successive trials in sequential reinforcement paradigms such as the reversal learning task [8]. In contrast to stochastic choice policies [11], inference errors do not reflect a sampling-based ‘read-out’ of the internal model of the environment, but fluctuations of the internal model itself.

A key question concerns the statistical structure of inference errors – in particular, whether they produce random (unpredictable) or systematic (predictable) variability in behavior – a decomposition known as the ‘bias-variance’ trade-off. Inference errors being defined as deviations from optimal (or near-optimal) inference [4,8], they may be caused by different cognitive biases known to affect probabilistic inference under uncertainty. For example, evidence integration shows idiosyncratic temporal biases (e.g. primacy or recency) that can produce large inference errors [12–15]. Similarly, confirmation biases and other forms of ‘trial history’ effects have been reported across decision domains [16–19]. A fundamental difference between such cognitive biases and computation ‘noise’ – that is, stochastic variability in the updating of the internal model of the environment – is that biases should produce correlated inference errors across repetitions of the same trial [4,8,20]. Instead of performing an exhaustive search of all possible cognitive biases [21], leveraging the consistency of human decisions across repeated trials [22] revealed that random variance accounts for 65% of inference errors across perceptual and reward-guided decisions [4,8] (Figure 1c). In other words, about two thirds of inference errors arise from the limited precision of underlying computations rather than from biased (systematically wrong) inference [23].

In volatile environments where the state of the environment changes over time, computation noise shows a scaling variance which matches Weber’s law of perceptual discrimination prevalent in numerous sensory modalities. Across different variants of the reversal learning task [8,24], computation noise grows with the prediction error between observed and expected reward – the ‘temporal difference’ (surprise) signal which drives model updating in reinforcement learning (RL) [25]. This Weber scaling structure of computation noise is not observed in stable environments where the state of the environment is uncertain but fixed, as in the weather prediction task [4]. These differences may be due to the distinct roles of surprise for learning in these two types of uncertain environments. Inferring the state of volatile environments constitutes an estimation task where surprise indicates a possible change in the current state of the environment – and is thus the primary signal for updating the internal model of the environment [26–28]. By contrast, inferring the state of stable environments relies on the integration of uncertain information over much longer (ideally infinite) time constants [29]. The relevance of surprise for inferring the state of a stable environment decays over time as the amount of integrated information grows, and optimal inference does not require computing surprise to update the internal model of the environment. Future research should examine further the relation between surprise, model updating and computation noise.

Despite these fine-grained differences, computation noise accounts for a dominant fraction of inference errors across perceptual and reward-guided decisions – two canonical types of decisions studied and theorized using different paradigms and cognitive models [4,8]. This pattern of findings sets the limited precision of probabilistic inference as an upper bound on the accuracy and trial-to-trial consistency of human decisions made in uncertain environments.

Constraints of computation noise on learning and decision-making
Computation noise, defined above as random variability in the updating of the internal model of the environment (the location of the hidden reward in the weather prediction task, or the reward probabilities associated with each choice option in the reversal learning task), can have very different substrates in the brain. Computation noise can reflect genuine neural noise such as stochastic synaptic release at the cellular level [30], random fluctuations in the tight excitation-inhibition balance required by neural populations to perform precise computations [31,32], or the variable pooling of task-relevant neural responses by top-down attention across cortical hierarchies [33]. But
computation noise can also arise from task-independent (background) input to neural circuits implementing probabilistic inference [34,35,36] – that is, ‘effective’ noise which may not be random in an absolute sense, but generates trial-to-trial variability in the updating of the internal model. And irrespective of its precise substrates, this type of internal noise is widely seen as a performance-limiting constraint which neural systems have evolved to cope with using ‘efficient’ mechanisms.

At the neural level, correlated noise across neurons (which does not average out by pooling neural responses) has been theorized [37] and recently shown [38,39] to constrain the precision of neural representations. Because this correlated noise aligns strikingly well with the neural ‘dimensions’ (in population space) which predict decisions on a trial-by-trial basis [40], its impact on task performance is expected to be substantial, and consistent with the large contribution of computation noise to decision variability. In agreement with this view, correlated noise has been shown to be the prime target of top-down attentional modulation [41–43], by decreasing pairwise correlations between sensory neurons tuned to attended locations or features, and increasing the gain of neural representations at the population level [44]. At the theoretical level, attention has been described as an evolved mechanism for approximate inference which can increase the precision of task-relevant computations in the presence of limited processing resources [45].

Other neural constraints which may give rise to computation noise include the multiplexing of several cognitive tasks by context-dependent computations in shared neural circuits in parietal and prefrontal cortices [36,46]. Concurrent, irrelevant input that has not been fully suppressed by context-dependent computations would produce task-independent variability with the same statistical signatures as the computation noise observed in the weather prediction task [4].

In terms of cognition, different lines of research have identified ‘efficient coding’ mechanisms for dealing with external and internal sources of noise [47]. Human perceptual biases running opposite to prior expectations (in apparent contradiction with statistical inference) can be explained by efficient coding principles [48]. Similarly, the variability and occasional irrationality of human preference-based decisions may arise from strikingly similar mechanisms applied to value signals [49]. The notion of limited processing resources, which lies at the heart of efficient coding theories, has recently been instantiated as a finite number of ‘particles’ in a sampling-based description of statistical inference [50,51–53]. As shown by these two examples, certain cognitive biases may be the consequence of computation noise. Sampling-based computations in a hierarchical inference circuit can produce either primacy or recency effects depending on the main source of uncertainty [15]. And more complex, ‘winner-take-all’ biases in information integration may have emerged to mitigate the impact of computation noise on performance [54].

Together, these recent behavioral findings blur the line between bias and variance in the statistical sense [23]: certain cognitive biases may correspond to direct mechanistic consequences of computation noise, while others may guard against the variability and suboptimality triggered by computation noise. In either case, this deep interplay between cognitive bias and variance emphasizes the importance of characterizing computation noise for understanding even seemingly unrelated aspects of human cognition [20,22,23].

Regulation of computation noise by noradrenergic neuromodulation
The large contribution of computation noise to human learning and decision-making suggests that neural systems should be involved in its active regulation [4,8]. Previous research which does not explicitly consider computation noise has identified the anterior cingulate cortex (ACC) as a selective neural correlate of behavioral variability in uncertain and volatile environments which require arbitrating between exploration and reward maximization [55,56]. Accounting for the presence of computation noise in reinforcement learning has revealed that trial-to-trial fluctuations in ACC activity covary with variability in the updating of choice values based on obtained rewards, even when it does not result in a behavioral switch (i.e. overt exploration) on the subsequent trial (Figure 2a) [8]. This finding indicates that the phasic ACC activity observed preceding exploratory choices may reflect a genuine resetting of the internal model of the environment rather than a temporary ‘release’ of the reliance on the internal model for guiding behavior.

Besides the ACC, the magnitude of computation noise in the updating of choice values also correlates robustly with trial-to-trial fluctuations in pupil dilation [8], a non-invasive proxy of locus coeruleus-norepinephrine (LC-NE) activity [57,58,59]. Noradrenergic neuromodulation has been linked to the regulation of arousal, but also strategic exploration through the adaptive gating of neural variability in the ACC and other decision circuits in the prefrontal cortex [60]. In volatile environments, pupil dilation reflects not only the updating of the internal model following surprising events [61], but also the variability of the resulting behavior [62]. Accounting for the presence of computation noise in similar conditions showed that pupil-linked variability in behavior arises from computation noise rather than fluctuations in the exploration-exploitation trade-off (Figure 2b) – an

Figure 2
Panel titles: (a) anterior cingulate cortex (ACC); (b) phasic pupil dilation; (c) reanalysis of Jepma et al., 2010; (d) computation noise estimates. Other labels: contributions to behavioral variability; parameter estimate (a.u.); posterior density.
(a) Regression of trial-to-trial fluctuations of computation noise with BOLD activity in the anterior cingulate cortex (ACC) locked to the onset of the
choice period. Left: the correlations of ACC activity with computation noise (blue) and prediction error triggered by the previous reward (orange)
emerge before choice onset and follow similar time courses. Right: fluctuations of ACC activity predict trial-to-trial changes in behavioral variability
to the same extent in the partial outcome condition (left bar) and the complete outcome condition (right bar) where exploration has no adaptive
value in terms of reward maximization. (b) Regression of trial-to-trial fluctuations of computation noise with phasic pupil dilation locked to the
onset of the choice period. Left: the correlation of pupil dilation with learning noise (blue) emerges before choice onset and precedes the
correlation of pupil dilation with choice value (purple). Right: like ACC activity, fluctuations of pupil dilation predict trial-to-trial changes in
behavioral variability to the same extent in the partial and complete outcome conditions. (c) Reanalysis of behavioral data from a human
pharmacological study of the effect of reboxetine (noradrenaline reuptake inhibitor) on reinforcement learning in a restless four-armed bandit task.
Contribution of inference errors in reinforcement learning to behavioral variability for the placebo group (left) and the reboxetine group (right).
Computation noise in reinforcement learning explains a larger fraction of behavioral variability in the reboxetine group (80%) than the placebo
group (53%). (d) Posterior distributions of computation noise estimates for the placebo group (gray) and reboxetine group (dark blue). Reboxetine
increases the magnitude of computation noise during reinforcement learning.
effect observed even when there is equal uncertainty about chosen and unchosen options [8]. Recent simultaneous recordings from the LC and the ACC have further shown that phasic LC activation increases pairwise correlations between ACC neurons [63], in a way that resembles the attentional modulation of correlated noise in visual cortex [41–44].

The idea that noradrenergic neuromodulation may regulate computation noise rather than the exploration-exploitation trade-off is also supported by causal pharmacological manipulations of the LC-NE system in humans. Small doses of atomoxetine, a noradrenaline reuptake inhibitor, increase the rate of endogenous perceptual alternations of bistable stimuli [64], in a task which does not involve any form of exploration. Furthermore, small doses of reboxetine, another noradrenaline reuptake inhibitor, do not produce exploratory behavior [65] but seem to increase the magnitude of computation noise in a preliminary reanalysis of this dataset using a reinforcement learning model which accounts for the presence of computation noise (Figure 2c,d) [8,66]. Future research should validate this preliminary finding and investigate further the possible involvement of the noradrenergic system in the active regulation of computation noise.
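
To make concrete what a reinforcement learning model with computation noise can look like, here is a minimal Python sketch. It assumes a standard delta-rule update whose result is corrupted by Gaussian noise with a standard deviation proportional to the absolute prediction error, in line with the Weber scaling described in this review. The names `noisy_update`, `alpha` (learning rate) and `zeta` (Weber fraction), and all numerical values, are illustrative assumptions and not the notation, parameters or code of the cited studies.

```python
import numpy as np

def noisy_update(q, reward, alpha, zeta, rng):
    """One delta-rule update of a choice value, corrupted by Weber-scaled noise.

    alpha (learning rate) and zeta (Weber fraction) are hypothetical names.
    """
    prediction_error = reward - q
    noise_sd = zeta * abs(prediction_error)  # noise grows with surprise
    return q + alpha * prediction_error + rng.normal(0.0, noise_sd)

rng = np.random.default_rng(1)
q0, alpha, zeta = 0.5, 0.3, 0.5  # assumed values for illustration

# Spread of the updated value after a small versus a large prediction error
small = np.array([noisy_update(q0, 0.6, alpha, zeta, rng) for _ in range(5000)])
large = np.array([noisy_update(q0, 1.0, alpha, zeta, rng) for _ in range(5000)])
print(small.std(), large.std())  # the second spread is roughly five times the first
```

Because the noise is injected into the stored value rather than into the choice rule, it carries over to later trials, which is the property that separates this kind of model from a stochastic choice policy such as softmax. Setting `zeta` to zero recovers the noise-free delta rule.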

Emergent benefits of computation noise for learning and decision-making
The research discussed above describes computation noise as an important constraint on neural function and cognition, whose impact on performance may be controlled by a specific neuromodulatory pathway. A remaining puzzle concerns the reasons for its very existence. If computation noise drives such a large fraction of human behavioral variability and suboptimality in uncertain environments, why hasn’t it been suppressed or reduced to a larger extent during evolution?

A first possible answer is that computation noise may optimize a trade-off between the marginal payoff of a computation and the cost associated with performing the computation at a certain precision [8]. The ACC has recently been hypothesized to arbitrate such a cost-benefit trade-off by computing an ‘expected value of control’ – defined as the difference between an expected payoff and its associated cost in terms of cognitive conflict [67]. The cost associated with a computation is likely to grow with its precision, such that the ACC may optimize this computation cost-benefit trade-off by regulating computation noise [8] (possibly, through bidirectional connectivity with the LC-NE system [58]). This proposal provides a natural explanation for the existence of computation noise, but also makes testable predictions. In particular, computation noise should increase in highly volatile conditions where the marginal payoff of maintaining a precise internal model of the environment decreases – a prediction which has recently been validated experimentally [24]. The Weber scaling structure of computation noise also drives transient increases in behavioral variability following surprising events by resetting the current state of the internal model [8,24].

Although this cost-benefit description of computation noise in volatile environments has its merits, it fails to explain why computation noise is similarly large in uncertain but fixed environments where behavioral variability is neither required nor useful, as in the weather prediction task [4]. This pervasiveness of computation noise suggests that it has broader emergent benefits for cognition.

Recent research has provided insights regarding what these emergent benefits may be. First, two neurophysiological correlates of computation noise (pupil-linked arousal and large-scale neural variability) have been proposed to counteract idiosyncratic biases during perceptual decisions [68–70,71]. This means that increases in computation noise may be accompanied by less biased perception, a cognitive trade-off which has also been reported in the reversal learning task [72]. Another possible benefit of computation noise is that it produces less predictable behavior – a key advantage in competitive social contexts. Behavioral experiments in rodents have shown that the animals can purposefully increase their behavioral variability when confronted with a computer-simulated competitor that is capable of predicting their upcoming choices [73,74].

Finally, and more generally, computation noise may produce beneficial stochasticity for the attractor-like dynamics of neural circuits implementing probabilistic inference [34,35,36,46]. Their dynamics are typically low-dimensional and converge on strong attractor states, such that computation noise could serve to destabilize these attractor states and allow for more flexible transitions between them. Such flexibility is particularly welcome in uncertain and volatile environments where the state of the environment can change unpredictably, and the internal model (reflected in the current attractor state of the circuit) needs to be updated following each change. This flexibility is typically implemented in cognitive models through explicit sophistication (e.g. hierarchical inference) [6,75,76]. Whether computation noise may provide such cognitive flexibility at zero cost therefore constitutes another important open question for future research.

Conflict of interest statement
Nothing declared.

Acknowledgements
We thank Marieke Jepma and Sander Nieuwenhuis for sharing the behavioral data of their pharmacological study, and Vasilisa Skvortsova for useful discussions. This work was supported by a starting grant from the European Research Council (OPTIMIZERR, ERC-StG-759341) awarded to V.W., and an institutional grant from the Agence Nationale de la Recherche (FrontCog, ANR-17-EURE-0017) awarded to the Département d’Études Cognitives of the École Normale Supérieure.

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
of special interest
of outstanding interest

1. Knowlton BJ, Mangels JA, Squire LR: A neostriatal habit learning system in humans. Science 1996, 273:1399-1402.

2. Poldrack RA, Clark J, Paré-Blagoev EJ, Shohamy D, Creso Moyano J, Myers C, Gluck MA: Interactive memory systems in the human brain. Nature 2001, 414:546-550.

3. Yang T, Shadlen MN: Probabilistic reasoning by neurons. Nature 2007, 447:1075-1080.

4. Drugowitsch J, Wyart V, Devauchelle A-D, Koechlin E: Computational precision of mental inference as critical source of human choice suboptimality. Neuron 2016, 92:1398-1411.
This study investigates the distinct sources of human decision variability and suboptimality in a visual variant of the weather prediction task. They introduce a cognitive modeling framework which decomposes decision variability into sensory, inference and selection errors based on their respective statistical signatures. By measuring the bias-variance trade-off of inference errors, they find that computation noise in inference accounts alone for more than two thirds of human decision variability.

5. Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ: Cortical substrates for exploratory decisions in humans. Nature 2006, 441:876-879.

6. Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS: Learning the value of information in an uncertain world. Nat Neurosci 2007, 10:1214-1221.

7. Izquierdo A, Brigman JL, Radke AK, Rudebeck PH, Holmes A: The neural basis of reversal learning: an updated perspective. Neuroscience 2017, 345:12-26.

8. Findling C, Skvortsova V, Dromnelle R, Palminteri S, Wyart V: Computational noise in reward-guided learning drives behavioral variability in volatile environments. Nat Neurosci 2019, 22:2066-2077.
This study investigates the contribution of computation noise in reinforcement learning to human decision variability in a random-walk variant of the reversal learning task. They find that computation noise drives a dominant fraction of non-greedy decisions otherwise assigned to the exploration-exploitation trade-off, and identify its neurophysiological correlates in the anterior cingulate cortex and phasic pupil dilation during reinforcement learning rather than choice.

9. Boorman ED, Behrens TEJ, Woolrich MW, Rushworth MFS: How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron 2009, 62:733-743.

10. Palminteri S, Khamassi M, Joffily M, Coricelli G: Contextual modulation of value signals in reward and punishment learning. Nat Commun 2015, 6:8096.

11. Gershman SJ: Deconstructing the human algorithms for exploration. Cognition 2018, 173:34-42.

12. Kiani R, Hanks TD, Shadlen MN: Bounded integration in parietal cortex underlies decisions even when viewing duration is dictated by the environment. J Neurosci 2008, 28:3017-3029.

13. Nienborg H, Cumming BG: Decision-related activity in sensory neurons reflects more than a neuron’s causal effect. Nature 2009, 459:89-92.

14. Wyart V, Myers NE, Summerfield C: Neural mechanisms of human perceptual choice under focused and divided attention. J Neurosci 2015, 35:3485-3498.

15. Lange RD, Chattoraj A, Beck JM, Yates JL, Haefner RM: A confirmation bias in perceptual decision-making due to hierarchical approximate inference. bioRxiv 2020 http://dx.doi.org/10.1101/440321.

16. Talluri BC, Urai AE, Tsetsos K, Usher M, Donner TH: Confirmation bias through selective overweighting of choice-consistent evidence. Curr Biol 2018, 28:3128-3135.

17. Akrami A, Kopec CD, Diamond ME, Brody CD: Posterior parietal cortex represents sensory history and mediates its effects on behaviour. Nature 2018, 554:368-372.

18. Lak A, Okun M, Moss MM, Gurnani H, Farrell K, Wells MJ, Reddy CB, Kepecs A, Harris KD, Carandini M: Dopaminergic and prefrontal basis of learning from sensory confidence and reward value. Neuron 2020, 105:700-711.e6.

19. Lak A, Hueske E, Hirokawa J, Masset P, Ott T, Urai AE, Donner TH, Carandini M, Tonegawa S, Uchida N et al.: Reinforcement biases subsequent perceptual decisions when confidence is low, a widespread behavioral phenomenon. eLife 2020:9.

20. Wyart V, Koechlin E: Choice variability and suboptimality in uncertain environments. Curr Opin Behav Sci 2016, 11:109-115.

21. Rahnev D, Denison RN: Suboptimality in perceptual decision making. Behav Brain Sci 2018, 41:e223.

22. Wyart V: Leveraging decision consistency to decompose suboptimality in terms of its ultimate predictability. Behav Brain Sci 2018, 41:e248.

23. Beck JM, Ma WJ, Pitkow X, Latham PE, Pouget A: Not noisy, just wrong: the role of suboptimal inference in behavioral variability. Neuron 2012, 74:30-39.
This review discusses the contribution of inference biases to decision variability based on uncertain sensory evidence. They propose that decision variability arises dominantly from biased computations which can amplify sensory noise, rather than from neural noise expressed at the level of individual neurons. They ground their hypothesis on theoretical considerations: 1. that neural noise should average out at the population level, and 2. that sensory variability dominates over motor variability in most perceptual decision-making tasks.

24. Findling C, Chopin N, Koechlin E: Imprecise neural computations as source of human adaptive behavior in volatile environments. Nat Hum Behav 2021, 5:99-112.
This study investigates the contribution of computation noise to human adaptive behavior in volatile environments. They show that computation noise during low-level, single-stage inference leads to the same near-optimal performance as normative hierarchical inference in volatile environments. They also find that noisy single-stage inference accounts for human behavioral variability better than noise-free hierarchical inference.

25. Sutton RS, Barto AG: Reinforcement Learning: An Introduction. MIT Press; 1998.

26. Rushworth MFS, Noonan MP, Boorman ED, Walton ME, Behrens TE: Frontal cortex and reward-guided learning and decision-making. Neuron 2011, 70:1054-1069.

27. O’Reilly JX, Schüffelgen U, Cuell SF, Behrens TEJ, Mars RB, Rushworth MFS: Dissociable effects of surprise and model update in parietal and anterior cingulate cortex. Proc Natl Acad Sci U S A 2013, 110:E3660-3669.

28. Farashahi S, Donahue CH, Khorsand P, Seo H, Lee D, Soltani A: Metaplasticity as a neural substrate for adaptive learning and choice under uncertainty. Neuron 2017, 94:401-414.e6.

29. Waskom ML, Kiani R: Decision making through integration of sensory evidence at prolonged timescales. Curr Biol 2018, 28:3850-3856.

30. Stevens CF: Neurotransmitter release at central synapses. Neuron 2003, 40:381-388.

31. Boerlin M, Machens CK, Denève S: Predictive coding of dynamical variables in balanced spiking networks. PLoS Comput Biol 2013, 9:e1003258.

32. Denève S, Machens CK: Efficient codes and balanced networks. Nat Neurosci 2016, 19:375-382.
This review discusses the advantages of a tight excitation-inhibition balance for precise computations in neural circuits. They describe how this tight balance can be used by associative cortical circuits to construct high-dimensional representations and learn complex functions of their input.

33. Pestilli F, Carrasco M, Heeger DJ, Gardner JL: Attentional enhancement via selection and pooling of early sensory responses in human visual cortex. Neuron 2011, 72:832-846.

34. Wong K-F, Wang X-J: A recurrent network mechanism of time integration in perceptual decisions. J Neurosci 2006, 26:1314-1328.

35. Soltani A, Wang X-J: Synaptic computation underlying probabilistic inference. Nat Neurosci 2010, 13:112-119.

36. Mante V, Sussillo D, Shenoy KV, Newsome WT: Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 2013, 503:78-84.
This study investigates how cortical circuits in the prefrontal cortex can perform context-dependent computations based on multidimensional stimuli. They find that population-level activity shows low-dimensional trajectories characteristic of the selection and integration of context-dependent decision signals. They show that the attractor-like dynamics of recurrent neural networks trained on the same task show the same characteristic features.

37. Moreno-Bote R, Beck J, Kanitscheider I, Pitkow X, Latham P, Pouget A: Information-limiting correlations. Nat Neurosci 2014, 17:1410-1417.
This theoretical study describes the precise statistical structure of correlated noise which limits information processing in populations of neurons tuned to a stimulus feature. They show that decorrelation alone does not necessarily increase the precision of sensory representations.

38. Rumyantsev OI, Lecoq JA, Hernandez O, Zhang Y, Savall J, Chrapkiewicz R, Li J, Zeng H, Ganguli S, Schnitzer MJ: Fundamental bounds on the fidelity of sensory cortical coding. Nature 2020, 580:100-105.

39. Kafashan M, Jaffe A, Chettih SN, Nogueira R, Arandia-Romero I, Harvey CD, Moreno-Bote R, Drugowitsch J: Scaling of information in large neural populations reveals signatures of information-limiting correlations. bioRxiv 2020 http://dx.doi.org/10.1101/2020.01.10.902171.

40. Ni AM, Ruff DA, Alberts JJ, Symmonds J, Cohen MR: Learning and attention reveal a general relationship between population activity and behavior. Science 2018, 359:463-465.
This study investigates how correlated noise in visual cortex covaries with behavioral performance as a function of attention and perceptual learning. They find that correlated noise projects dominantly on the neural dimensions which predict decisions at the trial-by-trial level, and that attention and perceptual learning decrease correlated noise in similar ways over distinct time scales.

41. Cohen MR, Maunsell JH: Attention improves performance primarily by reducing interneuronal correlations. Nat Neurosci 2009, 12:1594-1600.

42. Cohen MR, Kohn A: Measuring and interpreting neuronal correlations. Nat Neurosci 2011, 14:811-819.

43. Engel TA, Steinmetz NA, Gieselmann MA, Thiele A, Moore T, Boahen K: Selective modulation of cortical state during spatial attention. Science 2016, 354:1140-1144.

44. Rabinowitz NC, Goris RL, Cohen M, Simoncelli EP: Attention stabilizes the shared gain of V4 populations. eLife 2015, 4:e08998.

45. Whiteley L, Sahani M: Attention in a Bayesian framework. Front Hum Neurosci 2012, 6:100.

46. Yang GR, Joglekar MR, Song HF, Newsome WT, Wang X-J: Task representations in neural networks trained to perform many cognitive tasks. Nat Neurosci 2019, 22:297-306.

47. Barlow HB: Possible principles underlying the transformations of sensory messages. In Sensory Communication. Edited by Rosenblith WA. MIT Press; 1961:217-234.

48. Wei X-X, Stocker AA: A Bayesian observer model constrained by efficient coding can explain “anti-Bayesian” percepts. Nat Neurosci 2015, 18:1509-1517.

49. Polanía R, Woodford M, Ruff CC: Efficient coding of subjective value. Nat Neurosci 2019, 22:134-142.

50. Sanborn AN, Chater N: Bayesian brains without probabilities. Trends Cogn Sci 2016, 20:883-893.
This review discusses how Bayesian accounts of human cognition do not imply the explicit representation and use of probabilities. Rather, the authors describe the brain as a Bayesian sampler which performs approximate (yet asymptotically normative) computations based on limited cognitive resources. They show how this hypothesis can account for human biases in probabilistic reasoning.

51. Bhui R, Gershman SJ: Decision by sampling implements efficient coding of psychoeconomic functions. Psychol Rev 2018, 125:985-1001.

52. Heng JA, Woodford M, Polania R: Efficient sampling and noisy decisions. eLife 2020, 9:e54962.

53. Lieder F, Griffiths TL: Resource-rational analysis: understanding human cognition as the optimal use of limited computational resources. Behav Brain Sci 2019, 43:e1.

54. Tsetsos K, Moran R, Moreland J, Chater N, Usher M, Summerfield C: Economic irrationality is optimal during noisy decision making. Proc Natl Acad Sci U S A 2016, 113:3102-3107.
This study investigates the origin of economic irrationality during the sequential comparison of choice options. They find that humans integrate information in a ‘winner take all’ fashion which can explain the observed irrationality of human behavior. They further show that this seemingly suboptimal integration strategy mitigates optimally the negative effects of computation noise on performance.

55. Karlsson MP, Tervo DGR, Karpova AY: Network resets in medial prefrontal cortex mark the onset of behavioral uncertainty. Science 2012, 338:135-139.

56. Donoso M, Collins AGE, Koechlin E: Foundations of human reasoning in the prefrontal cortex. Science 2014, 344:1481-1486.

57. Usher M, Cohen JD, Servan-Schreiber D, Rajkowski J, Aston-Jones G: The role of locus coeruleus in the regulation of cognitive performance. Science 1999, 283:549-554.

58. Joshi S, Li Y, Kalwani RM, Gold JI: Relationships between pupil diameter and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. Neuron 2016, 89:221-234.
This study investigates the relation between pupil dilation and neural activity in the locus coeruleus and the cingulate cortex. They find that changes in pupil dilation can reflect neural activity in the locus coeruleus, and that pupil-linked arousal may coordinate neural activity in the cingulate cortex.

59. Joshi S, Gold JI: Pupil size as a window on neural substrates of cognition. Trends Cogn Sci 2020, 24:466-480.

60. Aston-Jones G, Cohen JD: An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci 2005, 28:403-450.

61. Nassar MR, Rumsey KM, Wilson RC, Parikh K, Heasly B, Gold JI: Rational regulation of learning dynamics by pupil-linked arousal systems. Nat Neurosci 2012, 15:1040-1046.

62. Filipowicz AL, Glaze CM, Kable JW, Gold JI: Pupil diameter encodes the idiosyncratic, cognitive complexity of belief updating. eLife 2020:9.

63. Joshi S, Gold JI: Context-dependent relationships between locus coeruleus firing patterns and coordinated neural activity in the anterior cingulate cortex. bioRxiv 2020 http://dx.doi.org/10.1101/2020.09.26.314831.

64. Pfeffer T, Avramiea A-E, Nolte G, Engel AK, Linkenkaer-Hansen K, Donner TH: Catecholamines alter the intrinsic variability of cortical population activity and perception. PLoS Biol 2018, 16:e2003453.
This study investigates the effects of catecholamines on the internal variability of large-scale neural activity and the perception of bistable visual stimuli. They find that low doses of atomoxetine, a noradrenaline reuptake inhibitor, increase the rate of spontaneous perceptual alternation and the temporal structure of ‘scale-free’ population activity across cortex.

65. Jepma M, Te Beek ET, Wagenmakers E-J, van Gerven JMA, Nieuwenhuis S: The role of the noradrenergic system in the exploration-exploitation trade-off: a psychopharmacological study. Front Hum Neurosci 2010, 4:170.

66. Findling C, Skvortsova V, Wyart V: A role for the noradrenergic system in the precision of reward-guided learning. Symposium on the Biology of Decision Making. 2019. (Oxford, UK), 9:33.

67. Shenhav A, Botvinick MM, Cohen JD: The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 2013, 79:217-240.

68. de Gee JW, Colizoli O, Kloosterman NA, Knapen T, Nieuwenhuis S, Donner TH: Dynamic modulation of decision biases by brainstem arousal systems. eLife 2017:6.

69. de Gee JW, Tsetsos K, Schwabe L, Urai AE, McCormick D, McGinley MJ, Donner TH: Pupil-linked phasic arousal predicts a reduction of choice bias across species and decision domains. eLife 2020:9.

70. Kloosterman NA, de Gee JW, Werkle-Bergner M, Lindenberger U, Garrett DD, Fahrenfort JJ: Humans strategically shift decision bias by flexibly adjusting sensory evidence accumulation. eLife 2019:8.

71. Kloosterman NA, Kosciessa JQ, Lindenberger U, Fahrenfort JJ, Garrett DD: Boosts in brain signal variability track liberal shifts in decision bias. eLife 2020:9.
This study investigates the relation between fluctuations in the entropy of frontal brain signals and adjustments in conservative biases during perceptual decision-making. They find that liberal shifts in decision biases are accompanied by increases in the variability of frontal brain signals. They propose that the regulation of neural variability may support the moment-to-moment adaptation of decision biases to environmental demands.

72. Glaze CM, Filipowicz ALS, Kable JW, Balasubramanian V, Gold JI: A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. Nat Hum Behav 2018, 2:213-224.

73. Tervo DGR, Proskurin M, Manakov M, Kabra M, Vollmer A, Branson K, Karpova AY: Behavioral variability through stochastic choice and its gating by anterior cingulate cortex. Cell 2014, 159:21-32.
This study investigates whether animals can purposefully generate behavioral variability in competitive social settings. They find that rats increase their behavioral variability when confronted with a computer-simulated competitor that is capable of predicting their upcoming choices, a mechanism reflected in the relation between anterior cingulate activity and behavior.

74. Belkaid M, Bousseyrol E, Durand-de Cuttoli R, Dongelmans M, Duranté EK, Ahmed Yahia T, Didienne S, Hanesse B, Come M, Mourot A et al.: Mice adaptively generate choice variability in a deterministic task. Commun Biol 2020, 3:34.

75. Meyniel F, Schlunegger D, Dehaene S: The sense of confidence during probabilistic learning: a normative account. PLoS Comput Biol 2015, 11:e1004305.

76. Meyniel F, Dehaene S: Brain networks for confidence weighting and hierarchical inference during probabilistic learning. Proc Natl Acad Sci U S A 2017, 114:E3859-E3868.

Current Opinion in Behavioral Sciences 2021, 38:124–132 www.sciencedirect.com
