Santiago Paternain, Miguel Calvo-Fullana ESE680-005 Reinforcement Learning 1
I Lectures
⇒ Tuesdays and Thursdays 9:00-10:30 at Moore 212
⇒ Miguel Calvo-Fullana and Santiago Paternain
I Office hours:
⇒ Clark Zhang: Monday 5pm-7pm GRASP conference room
clarkz@seas.upenn.edu
⇒ Kate Tolstaya: Wednesday 9am-11am 452C Walnut 3401
eig@seas.upenn.edu
⇒ Arbaaz Khan: On demand
arbaazk@seas.upenn.edu
I Canvas: http://canvas.upenn.edu/courses/1475618
⇒ Piazza
I State space S ⊂ R^4
⇒ Position, velocity, angle, angular velocity
I Action space A ⊂ R
⇒ Horizontal acceleration
I Dynamics of the system
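\[ \ddot{x}\cos\theta + \ell\,\ddot{\theta} = -g\sin\theta \]
\[ (m + m_p)\,\ddot{x} + m_p\,\ell\,\ddot{\theta}\cos\theta = F + m_p\,\ell\,\dot{\theta}^2\sin\theta \]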
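The two equations above are linear in the accelerations of the cart and the pole, so they can be solved at each instant and integrated numerically. The following is a minimal sketch, not the course's implementation: the parameter values, the function names `accelerations` and `euler_step`, the explicit Euler scheme, and the choice to treat the action as the force F from the second equation are all assumptions made for illustration.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
G = 9.81       # gravitational acceleration g
M_CART = 1.0   # cart mass m
M_POLE = 0.1   # pole mass m_p
LENGTH = 0.5   # pole length ell


def accelerations(state, force):
    """Solve the two dynamics equations for (x_ddot, theta_ddot)."""
    _, _, theta, theta_dot = state
    c, s = math.cos(theta), math.sin(theta)
    # Linear system  [a11 a12; a21 a22] [x_ddot, theta_ddot]^T = [b1, b2]^T
    a11, a12 = c, LENGTH
    a21, a22 = M_CART + M_POLE, M_POLE * LENGTH * c
    b1 = -G * s
    b2 = force + M_POLE * LENGTH * theta_dot ** 2 * s
    # det = -ell * (m + m_p * sin^2(theta)), nonzero whenever m > 0
    det = a11 * a22 - a12 * a21
    x_acc = (b1 * a22 - a12 * b2) / det       # Cramer's rule
    theta_acc = (a11 * b2 - a21 * b1) / det
    return x_acc, theta_acc


def euler_step(state, force, dt=0.01):
    """Advance the state (x, x_dot, theta, theta_dot) by one explicit Euler step."""
    x, x_dot, theta, theta_dot = state
    x_acc, theta_acc = accelerations(state, force)
    return (x + dt * x_dot,
            x_dot + dt * x_acc,
            theta + dt * theta_dot,
            theta_dot + dt * theta_acc)
```

Calling `euler_step(state, force)` repeatedly produces a trajectory in S. Explicit Euler is used only because it is short; a smaller `dt` or a higher-order integrator would be more accurate.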
I State space S: position and velocity; action space A: the force applied
I Goal is to reach the top of the mountain
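The slide does not give the dynamics of this example. As a rough illustration only, the sketch below uses the constants of the classic mountain-car formulation from Sutton and Barto's textbook; these constants, the clipping of the force to [-1, 1], the start state, and the names `step`, `swing_policy`, and `run` are assumptions and may differ from what the course uses.

```python
import math

# Assumed constants (classic textbook formulation, not stated on the slide).
X_MIN, X_MAX = -1.2, 0.5   # position bounds; the goal is at X_MAX
V_MAX = 0.07               # speed bound


def step(state, force):
    """One transition of the assumed mountain-car dynamics."""
    x, v = state
    force = max(-1.0, min(1.0, force))               # assumed action range
    v += 0.001 * force - 0.0025 * math.cos(3.0 * x)  # engine plus gravity
    v = max(-V_MAX, min(V_MAX, v))
    x += v
    if x <= X_MIN:                                   # left wall stops the car
        x, v = X_MIN, 0.0
    x = min(x, X_MAX)
    return (x, v), x >= X_MAX


def swing_policy(state):
    """Hand-written policy: push in the direction the car is already moving."""
    _, v = state
    return 1.0 if v >= 0.0 else -1.0


def run(policy, state=(-0.5, 0.0), max_steps=1000):
    """Return the number of steps needed to reach the goal, or None."""
    for t in range(1, max_steps + 1):
        state, done = step(state, policy(state))
        if done:
            return t
    return None
```

Here `swing_policy` is a fixed, hand-designed map from S to A, included only to show that the car has to build up momentum by swinging back and forth before it can climb.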
I Previous examples are in this form ⇒ But they have infinite states
I The learning problem is to find such a policy based on the rewards collected
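To make the idea of finding a policy from collected rewards concrete, here is a deliberately simplified sketch. It is a one-step toy problem with two states and two actions, so it ignores state transitions and long-term returns; the environment `sample_reward`, the names, and the sample count are all hypothetical and chosen only for illustration.

```python
import random

STATES = (0, 1)
ACTIONS = (0, 1)


def sample_reward(state, action, rng):
    """Hypothetical environment, unknown to the learner: noisy reward
    that is high when the action matches the state."""
    base = 1.0 if action == state else 0.0
    return base + rng.uniform(-0.1, 0.1)


def learn_policy(samples_per_pair=20, seed=0):
    """Try every (state, action) pair, average the collected rewards,
    and return the policy that picks the best-looking action per state."""
    rng = random.Random(seed)
    mean_reward = {}
    for s in STATES:
        for a in ACTIONS:
            rewards = [sample_reward(s, a, rng) for _ in range(samples_per_pair)]
            mean_reward[(s, a)] = sum(rewards) / len(rewards)
    # The policy is a map from S to A.
    return {s: max(ACTIONS, key=lambda a: mean_reward[(s, a)]) for s in STATES}


policy = learn_policy()
```

The learner never inspects how `sample_reward` works; it only sees reward samples, which is the sense in which the policy is learned from the rewards collected.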
1 https://www.youtube.com/watch?v=V1eYniJ0Rnk
Success of Reinforcement Learning
I Challenges
⇒ Imperfect information: Only part of the map is observed
⇒ Long term planning: Actions won’t pay off until end of game
⇒ Large action-space: 10^26 legal actions per step
I One week of training ⇒ Equivalent to 200 years of gameplay
Success of Reinforcement Learning
2 https://www.youtube.com/watch?v=yEOEqaEgu94
Success of Reinforcement Learning
3 https://www.youtube.com/watch?v=VCdxqn0fcnE
Success of Reinforcement Learning
4 https://www.youtube.com/watch?v=ZhsEKTo7V04
Not everything that shines is... RL
I Some of the most impressive robotic behaviors are achieved without RL [5]
5 https://www.youtube.com/watch?v=hSjKoEva5bg
Characteristics of Reinforcement Learning