Error Based and Reward Based Learning: April 2016
net/publication/308965897
1 author:
Wondimu W Teka
U.S. Food and Drug Administration
All content following this page was uploaded by Wondimu W Teka on 10 October 2016.
Research goal
• Develop a model that has both basal ganglia (BG) and cerebellum compartments
• Apply error-based and reward-based learning to the model
• Produce experimental results for
  • Control group (normal BG and cerebellum): both sensory and reward prediction errors are presented
  • Damaged BG: only sensory prediction errors are presented
  • Damaged cerebellum: only reward prediction errors are presented
• Propose model predictions
• Write the paper in parallel with the above research tasks
Cerebellum and basal ganglia are separately responsible:
Doya 1999/2000 proposal (Kenji Doya).
The basal ganglia are specialized for
reinforcement learning, which is
guided by the reward signal encoded
in the dopaminergic input from the
substantia nigra. The cerebellum is
specialized for supervised learning,
which is guided by the error signal
encoded in the climbing fiber input
from the inferior olive. The cerebral
cortex is specialized for unsupervised
learning, which is guided by the
statistical properties of the input
signal itself, but may also be regulated
by the ascending neuromodulatory inputs.
Doya, Kenji. "Complementary roles of basal ganglia and cerebellum in learning and motor control." Current Opinion in Neurobiology 10.6 (2000): 732-739.
Error-based and reward-based learning
Error-based learning (supervised learning):
• Sensory prediction error
• Minimize the error (error correction)
• Observed sensory feedback
• The magnitude and sign of the error are presented
• Action specification (move closer to the target)
• Visual feedback
• Cerebellum is responsible
• Learning is faster, forgetting is faster
• Sensory remapping
Reward-based learning (reinforcement learning):
• Reward prediction error
• Maximize reward
• Minimize punishment
• Success/failure
• Action selection (move around and explore; trial-and-error search)
• No visual feedback
• Basal ganglia are responsible
• Learning is slower, forgetting is slower too
• Reach variability is high
Error correction: Feedback of errors can be used to directly improve
performance. In supervised or error-based learning the error vector gives both
the magnitude and the direction of the error, and the learning system then shifts
subsequent performance in the opposite direction, with the intention to reduce
the error on subsequent trials. The subject has clear information about the sign
and magnitude of the error, and the subject uses this to minimize error.
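The error-correction rule described above can be sketched as a trial-by-trial update. This is only a minimal illustration, not the model proposed in these slides: the function name, the fixed learning rate, and the scalar (one-dimensional) error are all assumptions made for the example.

```python
def error_based_adaptation(perturbation, n_trials, learning_rate=0.2):
    """Trial-by-trial error correction using a signed error.

    perturbation  : constant offset added to every movement (e.g. degrees)
    learning_rate : fraction of the error corrected per trial (assumed value)
    Returns the signed error observed on each trial.
    """
    compensation = 0.0  # the learner's current correction
    errors = []
    for _ in range(n_trials):
        # Signed error relative to a target at 0: sign and magnitude are both known.
        error = perturbation + compensation
        errors.append(error)
        # Shift the next movement in the direction opposite to the error.
        compensation -= learning_rate * error
    return errors


errors = error_based_adaptation(perturbation=30.0, n_trials=20)
```

Under these assumptions the error shrinks by a constant factor of (1 - learning_rate) on every trial.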
Error-based learning to understand motor adaptation and motor disorders
Methods of experiment on motor learning and adaptation
The most common experimental methods for error-based and reward-based
learning are
• Visual perturbation (rotation)
• Force field perturbation
• Gradual visual perturbation
• Gradual force field perturbation
The errors are caused by the perturbation (external factor) or motor noise
(internal factor).
These experimental tasks involve the adjustment of an internal model to
compensate for an external perturbation.
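As a rough illustration of a visual rotation perturbation and of the two error sources named above, the sketch below rotates the cursor about the start position and adds Gaussian motor noise to the hand direction. The function names, the Gaussian noise model, and the parameter values are assumptions made for this example, not details taken from the cited experiments.

```python
import math
import random


def rotate_cursor(hand_x, hand_y, rotation_deg):
    """Rotate the hand position about the origin to get the displayed cursor."""
    theta = math.radians(rotation_deg)
    cursor_x = hand_x * math.cos(theta) - hand_y * math.sin(theta)
    cursor_y = hand_x * math.sin(theta) + hand_y * math.cos(theta)
    return cursor_x, cursor_y


def reach_error_deg(target_deg, rotation_deg, noise_sd_deg, rng):
    """Angular cursor error from an external rotation plus internal motor noise."""
    hand_deg = target_deg + rng.gauss(0.0, noise_sd_deg)  # internal factor
    cursor_deg = hand_deg + rotation_deg                   # external factor
    return cursor_deg - target_deg


rng = random.Random(0)
noisy_error = reach_error_deg(target_deg=0.0, rotation_deg=30.0,
                              noise_sd_deg=2.0, rng=rng)
```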
Sensory prediction error dominates reward prediction error
Although reward prediction error is useful for learning and adaptation, the
change in the motor commands is driven almost entirely by sensory prediction
errors when there is high quality sensory feedback.
Reward prediction error (RPE) is most useful when
sensory prediction error (SPE) is absent.
Learning with RPE is slower than learning with SPE.
Izawa, Jun, and Reza Shadmehr. "Learning from sensory and
reward prediction errors during motor adaptation." PLoS
Comput Biol 7.3 (2011): e1002012.
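To make the contrast with error-based learning concrete, here is a hedged sketch of a learner that only receives success/failure feedback and must explore. It is not the model of Izawa and Shadmehr: the moving success criterion, the update rule, and all parameter values are assumptions chosen for illustration.

```python
import random


def reward_based_adaptation(perturbation, n_trials, explore_sd=3.0,
                            learning_rate=0.5, value_rate=0.1, seed=0):
    """Learn from binary reward only, without access to the signed error."""
    rng = random.Random(seed)
    aim = 0.0                        # mean motor command
    expected_reward = 0.5            # reward prediction
    criterion = abs(perturbation)    # assumed moving threshold for success
    abs_errors, rpes = [], []
    for _ in range(n_trials):
        action = aim + rng.gauss(0.0, explore_sd)   # trial-and-error exploration
        abs_error = abs(action + perturbation)      # hidden from the learner
        reward = 1.0 if abs_error < criterion else 0.0
        rpe = reward - expected_reward              # reward prediction error
        # Move toward actions that did better than predicted, away from worse ones.
        aim += learning_rate * rpe * (action - aim)
        expected_reward += value_rate * rpe
        criterion += 0.1 * (abs_error - criterion)  # track recent performance
        abs_errors.append(abs_error)
        rpes.append(rpe)
    return abs_errors, rpes


abs_errors, rpes = reward_based_adaptation(perturbation=30.0, n_trials=200)
```

Because each update depends on a random exploratory step rather than on a signed error, progress in a sketch like this tends to be slower and more variable than in an error-correcting learner.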
Reward-based learning (RPE) shows a higher degree of retention
than error-based learning (SPE), i.e. aftereffect decay is slower.
Therrien, Amanda S., Daniel M. Wolpert, and Amy J. Bastian. "Effective reinforcement learning following cerebellar damage requires a balance between exploration and motor noise." Brain (2015): awv329.
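One simple way to picture slower aftereffect decay is a per-trial retention factor applied once feedback is removed. The sketch below is an assumption-laden illustration: the exponential form and the two retention values are hypothetical and are not taken from the cited study.

```python
def aftereffect_decay(initial_aftereffect, retention, n_trials):
    """Aftereffect on each no-feedback trial, decaying by `retention` per trial."""
    return [initial_aftereffect * retention ** trial for trial in range(n_trials)]


# Hypothetical retention factors: a larger value means slower forgetting.
error_based_decay = aftereffect_decay(20.0, retention=0.90, n_trials=30)
reward_based_decay = aftereffect_decay(20.0, retention=0.99, n_trials=30)
```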
Cerebellar patients show complete retention from reward-based learning (RBL), but
complete forgetting from error-based learning (EBL).
Therrien, Amanda S., Daniel M. Wolpert, and Amy J. Bastian. "Effective reinforcement learning following cerebellar damage requires a balance between exploration and motor noise." Brain (2015): awv329.
Reward-based learning (RPE) shows a higher degree of retention
than error-based learning (SPE), i.e. aftereffect decay is slower.
BE = binary error, VE = vector error: BE for RBL, NA for EBL, BE+VE for both.
Error clamp: a false, artificial correction in which the sensory error is 0 and reward is provided.
Reward caused greater memory retention, and punishment
led to faster learning
Reward-based feedback during adaptation
led subsequently to greater retention
when the directional feedback was fully
withdrawn (no vision). Previous work has
shown that positive reinforcement can
influence both online (retention across
trials) and offline (retention across time)
motor retention.
Galea, Joseph M., et al. "The dissociable
effects of punishment and reward on
motor learning." Nature neuroscience 18.4
(2015): 597-602.
Reward caused greater memory retention
Galea, Joseph M., et al. "The dissociable effects of punishment and reward on motor
learning." Nature neuroscience 18.4 (2015): 597-602.
Combination of reward and sensory (error) feedback accelerates
learning compared with either form of feedback alone.
Combination of reward and sensory (error) feedback increases
learning performance and minimizes reach variability.
Error-based learning (SPE) is very effective only for simple
learning, for example correcting distance and angle errors.
Reward-based learning (RPE) is important for complex
learning, for example correcting kinematic errors.
Supporting experimental studies are still being sought.
Baseline performance with damaged cerebellum
Cerebellar patients do not differ from the control group in baseline performance,
with or without sensory feedback.
Error variability may be high in cerebellar patients, i.e. a large standard deviation of
errors. Note: the baseline task is performed after training.
Henriques, Denise YP, et al. "The cerebellum is not necessary for visually driven
recalibration of hand proprioception." Neuropsychologia 64 (2014): 195-204.
Synofzik, Matthis, Axel Lindner, and Peter Thier. "The cerebellum updates predictions about the visual consequences of one's behavior." Current Biology 18.11 (2008): 814-818.
Cerebellar disorders show impaired motor learning
Cerebellar disorders impair motor learning in both visual and force field
adaptation. The cerebellum plays an important role in adaptation to visuomotor
(VM) and force field (FF) perturbations. Since the model lacks a cerebellum,
errors in the model are larger.
Learning despite a damaged cerebellum – gradual adaptation
Aftereffects following reaching with a gradually rotated cursor were
similar across the control and cerebellar patient groups.
The visual rotation increased gradually by 0.75 degrees per trial, up to a
maximum perturbation of 30 degrees. During aftereffect trials only the
target is visible.
Henriques, Denise YP, et al. "The cerebellum is not necessary for visually driven recalibration of hand proprioception." Neuropsychologia 64 (2014): 195-204.
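The gradual schedule described on this slide (0.75 degrees per trial up to 30 degrees) can be written out as a short sketch. The function name is hypothetical, and whether the first perturbed trial starts at 0 or at 0.75 degrees is an assumption; here it starts at 0.75.

```python
def gradual_rotation_schedule(n_trials, step_deg=0.75, max_deg=30.0):
    """Rotation applied on each trial: a linear ramp capped at max_deg."""
    return [min(max_deg, step_deg * (trial + 1)) for trial in range(n_trials)]


schedule = gradual_rotation_schedule(60)
```

With these values the ramp reaches the 30 degree maximum on the 40th trial and stays there.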
Learning despite a damaged cerebellum – gradual adaptation
Cerebellar patients and healthy controls both showed learning under error-based
feedback (the cursor was presented). Note: patients did not show an aftereffect when the visual
feedback and reward were removed, so the learning comes from online visual feedback
and is not a function of the cerebellum.
Effects of experimental methods on motor learning and adaptation
Different experimental results may contradict each other because of the type
of experiment. Example:
1. Cerebellar patients showed gradual visuomotor adaptation.
Method: cursor feedback was provided throughout the reaching movement.
Izawa, Jun, Sarah E. Criscimagna-Hemminger, and Reza Shadmehr. "Cerebellar contributions to reach adaptation and learning sensory consequences of action." The Journal of Neuroscience 32.12 (2012): 4230-4239.
2. Cerebellar patients have deficits in adapting their reaching movement to
a gradual visuomotor rotation.
Method: only endpoint feedback was provided.
Schlerf, John E., et al. "Individuals with cerebellar degeneration show similar adaptation deficits with large and small visuomotor errors." Journal of Neurophysiology 109.4 (2013): 1164-1173.
The difference between 1 and 2 may arise from online feedback correction.