Source: https://www.researchgate.net/publication/220689220


Fundamentals of Computational Neuroscience 2e

Thomas Trappenberg

December 11, 2009

Chapter 9: Modular networks, motor control, and reinforcement learning
Mixture of experts

[Figure: the input feeds n expert networks and a gating network; the gating network weights the expert outputs, and an integration network combines them into the final output.]

[Figure: A. The absolute function f(x) = abs(x). B. A mixture of experts for the absolute function.]
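The architecture can be sketched as a small program that learns f(x) = abs(x) with two linear experts and a softmax gating network, trained by gradient descent on the squared error. This is a minimal illustration, not the book's implementation; the parameterization, seed, and learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Expert i computes y_i = w_i * x + b_i; the gate weights the experts
# by g_i = softmax(v_i * x + c_i).  All parameter names are illustrative.
w = rng.normal(size=2); b = np.zeros(2)      # expert parameters
v = rng.normal(size=2); c = np.zeros(2)      # gating parameters
eta = 0.05                                   # learning rate (assumed)

def forward(x):
    g = np.exp(v * x + c); g = g / g.sum()   # gating probabilities
    y = w * x + b                            # expert outputs
    return g, y, g @ y                       # mixture output

for _ in range(5000):
    x = rng.uniform(-1, 1)
    g, y, out = forward(x)
    err = abs(x) - out                       # squared-error gradient signal
    w += eta * err * g * x                   # expert updates, weighted by gate
    b += eta * err * g
    grad_g = err * (y - out)                 # backprop through the softmax
    v += eta * grad_g * g * x
    c += eta * grad_g * g

for x in (-0.8, -0.2, 0.3, 0.9):
    print(f"x={x:+.1f}  model={forward(x)[2]:+.3f}  target={abs(x):.3f}")
```

With enough training the gate assigns one expert to each half of the input range, so two purely linear experts can jointly represent the non-linear absolute function.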
The ‘what-and-where’ task

[Figure: A. Model retina with a sample image. B. Hidden-to-output connectivity (hidden node # 1–35 versus output node # 1–15) without a bias towards short connections. C. The same connectivity with a bias towards short connections.]

Jacobs and Jordan (1992)


Coupled attractor networks

[Figure: A. Two groups of nodes (node group 1 and node group 2) with connections between the groups. B. The left–right universe with letters: paired binary patterns stored in the two modules.]
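A coupled pair of Hopfield-style attractor modules can be sketched as follows: pattern pairs are stored with full-strength Hebbian weights within each module and weights scaled by a relative intermodular strength g between them, so that cueing one module completes the associated pattern in the other. Module size, pattern count, and the value of g are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50                      # nodes per module (illustrative size)
P = 3                       # pattern pairs stored jointly
g = 0.3                     # relative intermodular strength (assumed value)

# Random +/-1 pattern pairs: index 0 for module 1, index 1 for module 2.
xi = rng.choice([-1, 1], size=(P, 2, N))

# Hebbian weights over the joint patterns, with the between-module
# blocks scaled down by g.
W = np.zeros((2 * N, 2 * N))
for p in range(P):
    full = np.concatenate([xi[p, 0], xi[p, 1]])
    W += np.outer(full, full)
W /= N
W[:N, N:] *= g
W[N:, :N] *= g
np.fill_diagonal(W, 0)

# Cue module 1 with a noisy version of pattern 0; module 2 starts random.
flips = rng.choice([1, -1], size=N, p=[0.9, 0.1])
s = np.concatenate([xi[0, 0] * flips, rng.choice([-1, 1], size=N)])
for _ in range(10):                       # synchronous sign updates
    s = np.where(W @ s >= 0, 1, -1)

overlap1 = (s[:N] @ xi[0, 0]) / N         # retrieval quality in module 1
overlap2 = (s[N:] @ xi[0, 1]) / N         # pattern completion in module 2
print(overlap1, overlap2)
```

Both overlaps should approach 1: the first module cleans up its noisy cue, and the coupling drives the second module into the paired pattern even though it received no direct input.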
Limit on modularity

[Figure: A. Load capacity α_c as a function of the relative intermodular strength g, for m = 1, 2, and 4 modules. B. Bounds on the relative intramodular strength g as a function of the number of modules m.]
Sequence learning

[Figure: A. Modular attractor model with an input pathway and weight matrices w^AA, w^AB, w^BA, and w^BB connecting modules A and B. B. Time evolution of the overlaps in modules A and B over time (in units of τ).]

Lawrence, Trappenberg and Fine (2006); Sommer and Wennekers (2005)


Working memory

[Figure: schematic of the interacting systems PFC, PMC, and HCMP.]

O’Reilly, Braver, and Cohen (1999)


Limit on working memory

[Figure: network activity (node number versus time) for A. one object, B. two objects, and C. four objects.]
Motor learning and control

[Figure: basic feedback controller. The desired state feeds a motor command generator; its motor command drives the controlled object, whose actual state (subject to disturbance) is read by the sensory system; afferent and re-afferent signals close the loop.]

Forward model controller

[Figure: the same loop augmented with a forward dynamic model and a forward output model that predict the sensory consequences of the motor command; the prediction is compared with the re-afferent signal.]

Inverse model controller

[Figure: an inverse model maps the desired state directly onto a motor command, which is combined with the command of the feedback loop.]
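The adaptive part of such a controller can be illustrated with feedback-error learning of a one-dimensional inverse model. This is a minimal sketch, not taken from the text; the plant gain, learning rate, and trial count are illustrative assumptions.

```python
# The plant maps a motor command u to a state y = a * u with unknown
# gain a.  The inverse model estimates 1/a and is adapted online from
# the sensory (re-afferent) error.
a = 2.0              # unknown plant gain (assumed)
a_inv = 0.1          # inverse-model estimate of 1/a, learned online
eta = 0.1            # learning rate (assumed)

for trial in range(200):
    y_desired = 1.0
    u = a_inv * y_desired            # inverse model issues the motor command
    y = a * u                        # actual state produced by the plant
    error = y_desired - y            # sensory error fed back to the model
    a_inv += eta * error * y_desired # adapt the inverse model

print(round(a_inv, 3))               # converges to 1/a = 0.5
```

After training, the inverse model produces the correct motor command in a single feedforward step, without waiting for delayed sensory feedback.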
Cerebellum

[Figure: cerebellar circuitry. The molecular layer contains stellate cells, basket cells, and parallel fibres; the Purkinje layer contains Purkinje neurons and Golgi cells; the granular layer contains granule cells. Climbing fibres arrive from the inferior olive; mossy fibres arrive from the spinal cord, external cuneate nucleus, reticular nuclei, and pontine nuclei. Output leaves via the intracerebellar and vestibular nuclei. Excitatory and inhibitory synapses are marked.]
Reinforcement learning
Basal ganglia

[Figure: A. Outline of basic basal ganglia anatomy: cerebral cortex, caudate nucleus, putamen, thalamus, subthalamic nucleus, globus pallidus, substantia nigra (pars compacta and pars reticulata), and superior colliculus. C. Recordings of SNc neurons and simulations (r̂) for stimulus A with no reward and for stimulus B followed by stimulus A with reward, shown for patterns 2–4 over 100 episodes.]
Temporal difference learning

[Figure: A. A linear predictor node with inputs r_1^in(t)…r_4^in(t) computing the prediction v(t). B. The temporal delta rule, comparing v(t) with v(t−1) through slow and fast pathways together with the reward r(t). C. The temporal difference rule, which additionally uses the discounted prediction γv(t).]
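The temporal difference rule of panel C can be sketched for a simple value-prediction task. The chain of states, placement of the reward, discount factor γ, and learning rate below are illustrative assumptions, not values from the slides.

```python
import numpy as np

# TD(0) value prediction on a 5-state chain with a reward of 1 at the
# end.  The update uses the reward-prediction error
#   delta = r + gamma * V(next) - V(current).
n_states, gamma, alpha = 5, 0.9, 0.1
V = np.zeros(n_states + 1)          # V[n_states] is the terminal value (0)

for episode in range(1000):
    for s in range(n_states):
        r = 1.0 if s == n_states - 1 else 0.0
        delta = r + gamma * V[s + 1] - V[s]   # reward-prediction error
        V[s] += alpha * delta

print(np.round(V[:n_states], 3))    # approaches [0.9^4, 0.9^3, 0.9^2, 0.9, 1]
```

The learned values discount the future reward by γ for every step of distance, which is exactly the property exploited by the dopamine-like prediction-error signal in panel C.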
Actor-critic and Q-learning

[Figure: B. Actor-critic model of the basal ganglia: cortical state/action coding feeds striatal striosomal (critic) and matrix (actor) modules; the SNc dopamine signal carries the reward prediction, and the pallidum and thalamus implement action selection; primary reinforcement enters the model. D. Corresponding Q-learning model of the basal ganglia with state and action channels.]
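The update underlying panel D can be illustrated with tabular Q-learning. The toy two-state task, reward scheme, and parameter values are illustrative assumptions, not the models shown in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: in either of 2 states, action 1 yields reward 1 and action 0
# yields 0; the next state is random.  Q-learning bootstraps off-policy
# from the maximum Q-value of the next state.
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
gamma, alpha, eps = 0.9, 0.1, 0.1   # discount, learning rate, exploration

s = 0
for step in range(20000):
    # Epsilon-greedy action selection.
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    r = 1.0 if a == 1 else 0.0
    s_next = rng.integers(n_states)
    # Off-policy TD error uses the max over next actions.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(np.round(Q, 2))
```

The Q-values approach their fixed point Q(s, 1) = 1/(1 − γ) = 10 and Q(s, 0) = γ/(1 − γ) = 9, so the greedy policy selects the rewarded action in every state.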


Actor-critic controller

[Figure: the feedback controller of the previous slides with the motor command generator acting as the actor; a critic provides a reinforcement signal that trains the actor, alongside the afferent and re-afferent sensory pathways and the disturbance acting on the controlled object.]
Further Readings
Robert A. Jacobs, Michael I. Jordan, and Andrew G. Barto (1991), Task decomposition through
competition in a modular connectionist architecture: the what and where tasks, in
Cognitive Science 15: 219–250.
Geoffrey Hinton (1999), Products of experts, in Proceedings of the Ninth International
Conference on Artificial Neural Networks, ICANN ’99, 1:1–6.
Yaneer Bar-Yam (1997), Dynamics of complex systems, Addison-Wesley.
Edmund T. Rolls and Simon M. Stringer (1999), A model of the interaction between mood and
memory, in Network: Computation in Neural Systems 12: 89–109.
N. J. Nilsson (1965), Learning machines: foundations of trainable pattern-classifying
systems, McGraw-Hill.
O. G. Selfridge (1958), Pandemonium: a paradigm of learning, in the mechanization of
thought processes, in Proceedings of a Symposium Held at the National Physical
Laboratory, November 1958, 511–27, London HMSO.
Marvin Minsky (1986), The society of mind, Simon & Schuster.
Akira Miyake and Priti Shah (eds.) (1999), Models of working memory, Cambridge University
Press.
Daniel M. Wolpert, R. Chris Miall, and Mitsuo Kawato (1998), Internal models in the
cerebellum, in Trends in Cognitive Sciences 2: 338–47.
Edmund T. Rolls and Alessandro Treves (1998), Neural networks and brain function, Oxford
University Press.
James C. Houk, Joel L. Davis, and David G. Beiser (eds.) (1995), Models of information
processing in the basal ganglia, MIT Press.
Richard S. Sutton and Andrew G. Barto (1998), Reinforcement learning: an introduction, MIT
Press.
Peter Dayan and Laurence F. Abbott (2001), Theoretical Neuroscience, MIT Press.