Lecture 0

Reinforcement Learning: From Foundations to Deep Approaches
Lecture 00: Organization

Georgia Chalvatzaki & Davide Tateo


Department of Computer Science
TU Darmstadt
Summer Term 2024

G. Chalvatzaki & Davide Tateo · RL: Foundations to Deep · Summer Term 2024 1 / 25
1. Organizational Aspects

Outline

1. Organizational Aspects

2. Questions & Answers


Instructors

Georgia Chalvatzaki is a Full Professor (W3) and leads the Interactive Robot
Perception & Learning (PEARL) group at the Department of Computer Science of TU
Darmstadt. She joined TU Darmstadt in 2019 after obtaining a PhD in Engineering and
Computer Science from the National Technical University of Athens (NTUA), Greece.
Her main research interests are Robot Learning, Embodied AI, Robot Perception, and
Human-Robot Interaction.

Davide Tateo is a Research Group Leader at the Intelligent Autonomous Systems
Laboratory in the Computer Science Department of the Technical University of
Darmstadt. He received his M.Sc. degree in Computer Engineering at Politecnico di
Milano in 2014 and his Ph.D. in Information Technology from the same university in
2019. His main research interest is Robot Learning, focusing on high-speed motions,
safety, and interpretability.


Teaching Assistants (1/2)

Aryaman Reddi is a Ph.D. student at the LiteRL lab supervised by Professor Carlo
D’Eramo. His research is focused on designing algorithms for multi-agent
reinforcement learning through a game theoretical lens. During his Master’s at the
University of Cambridge, he studied deep Q-learning agents in control problems
under bounded rationality.

Théo Vincent is a Ph.D. student at DFKI and TU Darmstadt supervised by Dr. Boris
Belousov and Professor Jan Peters. He is currently working on off-policy
Reinforcement Learning methods. More information about his research can be found
through this link: https://bit.ly/4aN74I1. Prior to his Ph.D., Théo graduated from
a double master’s degree program between ENS Paris-Saclay and Ponts ParisTech.


Teaching Assistants (2/2)

Jiayun Li is a PhD student in the Pearl Lab, supervised by Professor Georgia
Chalvatzaki. His research revolves around model learning and learning for control.
Before starting his PhD, he completed his Master’s programme at the TU Berlin,
focusing on statistical machine learning and optimal control.

Elisa Alboni is a Ph.D. student at the University of Trento, supervised by Professor
Andrea Del Prete, and a visiting Ph.D. student in the Pearl Lab, supervised by Professor
Georgia Chalvatzaki. Her research focuses on combining Reinforcement Learning and
Trajectory Optimization. Prior to her Ph.D., she completed her Master’s programme in
Mechatronics Engineering at the University of Trento.

Yufeng Jin is a PhD student in the Pearl Lab, jointly supervised by Prof. Georgia
Chalvatzaki and Dr. Mathias Franzius at the Honda Research Institute Europe. The
focus of his research primarily revolves around object 6D pose estimation and
uncertainty analysis. Prior to his Ph.D., he completed his Master’s program in
Mechatronics and Computer Science at the Karlsruhe Institute of Technology.


Contacts — Instructors

Georgia Chalvatzaki
E-mail: georgia.chalvatzaki@tu-darmstadt.de
Office: Robert-Piloty building S2|02, room D203
Davide Tateo
E-mail: davide.tateo@tu-darmstadt.de
Office: Robert-Piloty building S2|02, room E303

Please use Moodle for Q&As


E-mail us only for important issues!


Contacts — Teaching Assistants

Aryaman Reddi
E-mail: aryaman.reddi@tu-darmstadt.de

Théo Vincent
E-mail: theovincentjourdat@gmail.com

Jiayun Li
E-mail: jiayun.li@tu-darmstadt.de

Elisa Alboni
E-mail: elisa.alboni@unitn.it

Yufeng Jin
E-mail: yufeng.jin@tu-darmstadt.de

Please use Moodle for Q&As


E-mail us only for important issues!


Website & Discussion Board

Moodle:
    Lecture slides
    Pointers to readings
    Homework assignments and hand-in
Discussion board on Moodle:
    Please use it to ask questions of public interest.
    You are encouraged to discuss with each other.
    However: Please do not share solutions or give strong hints about
    the solutions to the homework problems.
Asking questions via the discussion boards on Moodle is the
preferred form of communication with the course staff!


Course Language

The course will be held in English.

Why?
Essentially all reinforcement learning literature is in English.
Knowing the proper terminology is essential.
It is a good opportunity to improve your English skills.

Questions in e-mails, homework, and exams will be asked and answered
in English.


Organization (1/3)

Lectures
This course provides a deep theoretical and practical
understanding of Reinforcement Learning (RL).
It consists of twelve theoretical lectures, held in person every
Monday (see Moodle for the schedule),
and practical lectures, held on Zoom every Friday (see Moodle for
the schedule).
The topics are divided into two parts:
Part A: MDPs and Discrete RL
Part B: Continuous and Deep RL


Organization (2/3)

Exercises, homework, etc.
Each lecture provides self-test topics that you should practice to
keep up with the course.
Examples on specific problems will be provided during the lecture.
Practical (algorithmic implementation) examples will be given in
the online sessions.
Three assignments will be given.


Organization (3/3)

Q&As
During the course:
The online sessions will offer interactive problem solving.
After each practical online lecture, time will be reserved for
Q&As, so that the teaching assistants can answer questions that are
not easy to answer via Moodle.


Feedback: Essential for both sides...

We appreciate FEEDBACK!


Exam & Bonus Points from Homework

There will be a written exam!
It will be in English.
The format will be open questions; we will provide more information
at the end of the lecture period.
Date: planned for September 13, 2024, 14:00-16:30
There will also be a winter exam, and you can keep your
assignment grades.

Homework (1/3)
Homework is crucial for succeeding in the course and the final
exam!
There will be three assignments.
You will have to submit an individual report for each assignment.
For submission dates, check the schedule on Moodle!

Homework (2/3)

There will be three homework assignments.

Each assignment will contain:
Some open questions.
Problems that you will have to model and solve with one or more
RL algorithms.
You will have to write a report commenting on your results,
according to given instructions.
You will have to submit your code.
Copied implementations and/or reports will result in an immediate fail.
Each report is personal!
You are NOT allowed to use ChatGPT!
You are strongly advised to DO the assignments!


Homework (3/3)

Late-submission policy: Every day after the designated deadline
will lead to a reduction to 10% of the total original score.
Submissions three days later than the designated deadline will
not be considered.
Grade decomposition if you do the assignments:
Assignment 1: 10% (+ 3% Bonus)
Assignment 2: 10% (+ 3% Bonus)
Assignment 3: 10% (+ 3% Bonus)
Final exam: up to 70% (BUT you have to pass the exam to get the
Assignment grades!)
Final grade: max(exam+assignments, exam)
Final grade if you do not do any assignment: 100% from the
exam – The exam will NOT have ANY bonus questions
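
As a rough illustration of the grade decomposition above, the following Python sketch computes a final grade under stated assumptions. The function name final_grade, its parameters, and the 50% pass threshold are hypothetical and do not appear in the slides. The sketch also assumes that the exam is rescaled to 70 points when assignments are counted and that bonus points add on top without a cap. The rules announced on Moodle take precedence over this sketch.

```python
def final_grade(exam_fraction, assignment_points, pass_threshold=0.5):
    """Return a final grade in percentage points.

    exam_fraction: share of the exam points obtained, between 0 and 1.
    assignment_points: percentage points earned per assignment
        (at most 10 + 3 bonus each, according to the slides).
    pass_threshold: ASSUMPTION, the slides do not state the pass mark.
    """
    # Grade when the exam alone counts for 100%.
    exam_only = 100.0 * exam_fraction
    if exam_fraction < pass_threshold:
        # Assignment grades only count once the exam is passed.
        return exam_only
    # ASSUMPTION: the exam is rescaled to 70 points when assignments count.
    combined = 70.0 * exam_fraction + sum(assignment_points)
    # Final grade: max(exam + assignments, exam)
    return max(combined, exam_only)


if __name__ == "__main__":
    # 80% of the exam points and full marks (no bonus) on all assignments.
    print(final_grade(0.8, [10, 10, 10]))
    # Perfect exam without any assignment: the exam-only branch wins.
    print(final_grade(1.0, [0, 0, 0]))
```

Taking the maximum of the two branches means that handing in assignments can never lower the final grade in this sketch; it can only raise it once the exam is passed.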


Background Reading

We will add current papers & tutorials.

Standard background reading:
R. Sutton, A. Barto. Reinforcement Learning: An Introduction, MIT Press
(http://incompleteideas.net/book/RLbook2020.pdf)
(Strongly recommended)
Algorithms for Reinforcement Learning, Csaba Szepesvari.
(https://sites.ualberta.ca/~szepesva/papers/RLAlgsInMDPs.pdf)
Mathematics background for machine learning:
M.P. Deisenroth, A. Aldo Faisal, and C.S. Ong, Mathematics for Machine
Learning (2020), Cambridge University Press
https://mml-book.github.io/

MushroomRL: Reinforcement Learning library
https://github.com/MushroomRL/mushroom-rl


Background Reading

Other resources
Reinforcement Learning: State-of-the-Art, Marco Wiering and Martijn
van Otterlo, Eds.
https://link.springer.com/book/10.1007/978-3-642-27645-3
Bertsekas, Dimitri. Reinforcement learning and optimal control. Athena
Scientific, 2019.
https://web.mit.edu/dimitrib/www/RL_Frontmatter-SHORT-INTERNET-POSTED.pdf
Artificial Intelligence: A Modern Approach, Stuart J. Russell and Peter
Norvig. http://aima.cs.berkeley.edu/
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
https://www.deeplearningbook.org/


How does it fit in your course plan? (1/3)

VL Reinforcement Learning is a good advanced lecture that can be
combined with:
VL Lernende Roboter (aka Robot Learning)
IP Robot Learning
IP, VL Intelligent Robotic Manipulation
VL Introduction to Artificial Intelligence


How does it fit in your course plan? (2/3)

Related Classes:
Improve Foundations: Statistical Machine Learning (SoSe), Robot
Learning (WiSe), Deep Learning: Architectures and Methods (SoSe)

Theses: We regularly offer B.Sc. or M.Sc. Theses on RL-related topics.
Please contact us!


How does it fit in your course plan? (3/3)

B.Sc. / M.Sc. Informatik:


Visual & Interactive Computing (see Modulhandbuch)
If you are strongly interested in machine learning you should
take:
Statistical Machine Learning for V&IC credit
Data Mining and Machine Learning for WW&IV credit
Robot Learning for R&CCE credit
Computer Vision for V&IC credit

M.Sc. in Autonomous Systems & Robotics


M.Sc. in Visual Computing: Area “Computer Vision & ML”


Lectures Schedule

April 15, Introduction to RL, Lecturer: Georgia
April 22, Markov Decision Processes, Lecturer: Georgia
April 29, Dynamic programming, Lecturer: Georgia
May 6, Model-free prediction, Lecturer: Davide
May 13, Model-free control, Lecturer: Davide
May 27, Function approximation, Lecturer: Georgia
June 3, Deep Q-Learning, Lecturer: Georgia
June 10, Policy gradients, Lecturer: Davide
June 17, Deep Actor-Critic: Introduction, Lecturer: Davide
June 24, Deep Actor-Critic: On-policy, Lecturer: Davide
July 1, Deep Actor-Critic: Off-policy, Lecturer: Davide
July 8, Intrinsic Motivation & Model-based RL, Lecturer: Georgia


Practical Lectures Schedule


April 19, Introduction to MushroomRL library, TA: Aryaman Reddi
April 26, Coding an MDP, TA: Elisa Alboni
May 3, Dynamic Programming, TA: Yufeng Jin
May 10, Model-free Prediction (TD Evaluation & MC), TA: Elisa Alboni
May 17, Model-free Control (SARSA, Q Learning), TA: Yufeng Jin
First assignment announcement – Due: June 7, 23:59
May 31, Function Approximation, TA: Jiayun Li
June 7, Deep Q Networks, TA: Théo Vincent
June 14, Using the Cluster, TA: Aryaman Reddi
Second assignment announcement – Due: July 5, 23:59
June 21, Policy Gradient Performance Difference Lemma, TA: Jiayun Li
June 28, TRPO, PPO, TA: Théo Vincent
July 5, DDPG, TD3, SAC, TA: Théo Vincent
Third assignment announcement – Due: July 26, 23:59
July 12, Recap Lecture and Exam Demo, TA: Aryaman Reddi, All

2. Questions & Answers


Questions & Answers

Time for your questions!
