Levine Deep RL Lecture
Imitation Learning
Sergey Levine
[Figure: sensorimotor loop: perception → action (run away)]
End-to-end vision
standard computer vision: features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM) [Felzenszwalb ’08]
deep learning [Krizhevsky ’12]
End-to-end control
standard robotic control: observations (e.g. vision) → state estimation → modeling & prediction → motion planning → low-level controller (e.g. PD) → motor torques
deep sensorimotor learning: observations → motor torques
indirect supervision
actions have consequences
Contents
Imitation learning
Research frontiers
Terminology & notation
1. run away
2. ignore
3. pet
a bit of history…
Imitation Learning
training data → supervised learning
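The training data → supervised learning pipeline is commonly called behavioral cloning. As a minimal sketch, assuming a hypothetical 1-D task in which the expert's action is a linear function of the observation, a policy can be fit by ordinary least squares on (observation, action) pairs. The names `expert_action`, `collect_demonstrations`, `fit_linear_policy`, and the gain of -2.0 are illustrative assumptions and do not come from the lecture.

```python
import random


def expert_action(obs):
    # Hypothetical expert (an assumption for illustration):
    # a proportional controller that pushes the observation toward 0.
    return -2.0 * obs


def collect_demonstrations(n, seed=0):
    # Training data: observations paired with the expert's actions.
    rng = random.Random(seed)
    observations = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    actions = [expert_action(o) for o in observations]
    return observations, actions


def fit_linear_policy(observations, actions):
    # Supervised learning step: closed-form least squares
    # for the model action ≈ w * obs + b.
    n = len(observations)
    mean_o = sum(observations) / n
    mean_a = sum(actions) / n
    cov = sum((o - mean_o) * (a - mean_a)
              for o, a in zip(observations, actions))
    var = sum((o - mean_o) ** 2 for o in observations)
    w = cov / var
    b = mean_a - w * mean_o
    return w, b


def policy(obs, w, b):
    # The learned policy maps an observation directly to an action.
    return w * obs + b


observations, actions = collect_demonstrations(100)
w, b = fit_linear_policy(observations, actions)
```

Because the demonstrations here are noise-free and exactly linear, the fitted policy reproduces the expert on the training distribution. The sketch says nothing about what happens once the policy's own actions carry it to observations the expert never visited.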
stability
Learning from a stabilizing controller
Trajectory optimization
Probabilistic version
Probabilistic version (in pictures)
DAgger without Humans
[Figure: path replanned! (labels: new, old)]
[L. et al. NIPS ’14]
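DAgger (dataset aggregation) alternates between running the current policy, having a supervisor label the states that policy visited, and retraining on the aggregated data. The loop can be sketched as follows, under loudly stated assumptions: hypothetical 1-D dynamics, a saturated proportional controller standing in for the supervisor (in a "without humans" setting, an automated planner would plausibly fill this role), and a linear policy fit by least squares. None of these names or values come from the lecture. The original DAgger formulation can also mix expert and learner actions during rollouts; this sketch uses pure learner rollouts after the initial demonstrations.

```python
def expert_action(x):
    # Hypothetical supervisor (assumption): saturated proportional control.
    return max(-0.5, min(0.5, -x))


def step(x, a):
    # Hypothetical 1-D dynamics (assumption): the action shifts the state.
    return x + a


def fit_policy(data):
    # Least squares through the origin for the model a ≈ w * x.
    num = sum(x * a for x, a in data)
    den = sum(x * x for x, _ in data)
    return num / den if den > 0 else 0.0


def rollout(w, x0, horizon):
    # Run the learned policy and record the states it visits.
    states = []
    x = x0
    for _ in range(horizon):
        states.append(x)
        x = step(x, w * x)
    return states


def dagger(initial_states, iterations=5, horizon=10):
    # Iteration 0: collect demonstrations by running the supervisor.
    data = []
    for x0 in initial_states:
        x = x0
        for _ in range(horizon):
            a = expert_action(x)
            data.append((x, a))
            x = step(x, a)
    w = fit_policy(data)

    for _ in range(iterations):
        # Visit states under the learner's own policy,
        # then label those states with the supervisor's action.
        for x0 in initial_states:
            for x in rollout(w, x0, horizon):
                data.append((x, expert_action(x)))
        # Retrain on the aggregated dataset.
        w = fit_policy(data)
    return w, data


w, data = dagger([2.0, -1.5, 0.3])
```

The design point is that the labels come from the supervisor but the states come from the learner, so the training set covers the states the learned policy actually reaches.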
Learning on PR2
trajectory distribution(s)
end-to-end training
with N. Wagener and P. Abbeel; with V. Kumar and E. Todorov; with A. Gupta, C. Eppner, P. Abbeel
reinforcement learning without using the model
(the method)
ingredients for success in learning:
supervised learning: computation, algorithms, data
learning sensorimotor skills: computation, ~? algorithms, data
• 2-5 Hz update
• no prior knowledge
[Figure labels: object, bin; training, testing]
with J. Fu
Learning from Prior Experience
Learning what Success Means
Greg Kahn, Tianhao Zhang, Chelsea Finn, Trevor Darrell, Pieter Abbeel