Lecture 0

Reinforcement Learning: From Foundations to Deep Approaches
Lecture 00: Organization
Georgia Chalvatzaki & Davide Tateo

Department of Computer Science
TU Darmstadt
Summer Term 2024
G. Chalvatzaki & Davide Tateo · RL: Foundations to Deep · Summer Term 2024
Outline
1. Organizational Aspects
2. Questions & Answers
Instructors
Georgia Chalvatzaki is a Full Professor (W3) and the leader of the Interactive Robot
Perception & Learning (PEARL) group at the Department of Computer Science of TU
Darmstadt. She joined TU Darmstadt in 2019 after obtaining a PhD in Engineering and
Computer Science from the National Technical University of Athens (NTUA), Greece.
Her main research interests are Robot Learning, Embodied AI, Robot Perception, and
Human-Robot Interaction.
Davide Tateo is a Research Group Leader at the Intelligent Autonomous Systems
Laboratory in the Computer Science Department of the Technical University of
Darmstadt. He received his M.Sc. degree in Computer Engineering at Politecnico di
Milano in 2014 and his Ph.D. in Information Technology from the same university in
2019. His main research interest is Robot Learning, focusing on high-speed motions,
safety, and interpretability.
Teaching Assistants (1/2)
Aryaman Reddi is a Ph.D. student at the LiteRL lab supervised by Professor Carlo
D’Eramo. His research is focused on designing algorithms for multi-agent
reinforcement learning through a game theoretical lens. During his Master’s at the
University of Cambridge, he studied deep Q-learning agents in control problems
under bounded rationality.
Théo Vincent is a Ph.D. student at DFKI and TU Darmstadt supervised by Dr. Boris
Belousov and Professor Jan Peters. He is currently working on off-policy
Reinforcement Learning methods. More information about his research can be found
through this link: https://bit.ly/4aN74I1. Prior to his Ph.D., Théo graduated from
a double master’s degree program between ENS Paris-Saclay and Ponts ParisTech.
Teaching Assistants (2/2)
Jiayun Li is a PhD student in the Pearl Lab, supervised by Professor Georgia
Chalvatzaki. His research revolves around model learning and learning for control.
Before starting his PhD, he completed his Master’s programme at the TU Berlin,
focusing on statistical machine learning and optimal control.
Elisa Alboni is a Ph.D. student at the University of Trento, supervised by Professor
Andrea Del Prete, and a visiting Ph.D. student in the Pearl Lab, supervised by Professor
Georgia Chalvatzaki. Her research focuses on combining Reinforcement Learning and
Trajectory Optimization. Prior to her Ph.D., she completed her Master’s programme in
Mechatronics Engineering at the University of Trento.
Yufeng Jin is a PhD student in the Pearl Lab, jointly supervised by Prof. Georgia
Chalvatzaki and Dr. Mathias Franzius at the Honda Research Institute Europe. The
focus of his research primarily revolves around object 6D pose estimation and
uncertainty analysis. Prior to his Ph.D., he completed his Master’s program in
Mechatronics and Computer Science at the Karlsruhe Institute of Technology.
Contacts — Instructors
Georgia Chalvatzaki
E-mail: georgia.chalvatzaki@tu-darmstadt.de
Office: Robert-Piloty building S2 | 02 room D203
Davide Tateo
E-mail: davide.tateo@tu-darmstadt.de
Office: Robert-Piloty building S2 | 02 room E303
Please use Moodle for Q&As

E-mail us only for important issues!
Contacts — Teaching Assistants
Aryaman Reddi
E-mail: aryaman.reddi@tu-darmstadt.de
Théo Vincent
E-mail: theovincentjourdat@gmail.com
Jiayun Li
E-mail: jiayun.li@tu-darmstadt.de
Elisa Alboni
E-mail: elisa.alboni@unitn.it
Yufeng Jin
E-mail: yufeng.jin@tu-darmstadt.de
Please use Moodle for Q&As

E-mail us only for important issues!
Website & Discussion Board
Moodle:
Lecture slides
Pointers to readings
Homework assignments + hand in
Discussion board on Moodle
Please use it to ask questions of public interest.
You are encouraged to discuss with each other.
However: Please do not share solutions or give strong hints about
the solutions to the homework problems.
Asking questions via the discussion boards on Moodle is the
preferred form of communication with the course staff!
Course Language
...will be in English
Why?
Essentially all reinforcement learning literature is in English.
Knowing the proper terminology is essential.
Good to improve your English skills.
Questions in e-mails, homework, and exams will be asked and answered
in English.
Organization (1/3)
Lectures
This course provides a deep theoretical and practical
understanding of Reinforcement Learning (RL).
It consists of twelve theoretical lectures, every Monday in-person
– see Moodle for schedule.
And practical lectures, every Friday on Zoom – see Moodle for
schedule.
We divided the topics into two parts:
Part A: MDPs and Discrete RL
Part B: Continuous and deep RL
Organization (2/3)
Exercises, homework, etc.

Each lecture provides self-test topics that you should practice to
keep up with the course
examples on specific problems will be provided during the lecture
practical (algorithmic implementation) examples will be given in
the online sessions
three assignments will be given
Organization (3/3)
Q&As during the course
The online sessions will include interactive problem solving.
After each practical online lecture, time is reserved for Q&As so
that the teaching assistants can answer questions that are not
easy to answer via Moodle.
Feedback: Essential for both sides...
We appreciate
FEEDBACK!
Exam & Bonus Points from Homework
There will be a written exam!

It will be in English
The format will be open questions – we will provide more info
at the end of the lecture period
Date: planned for September 13, 2024, 14:00-16:30
There will also be a winter exam, and you can keep your
assignment grades
Homework (1/3)
Homework is crucial for succeeding in the course and the final
exam!
There will be three assignments
You will have to submit an individual report for each assignment
For submission dates, check Moodle for the schedule!
Homework (2/3)
There will be 3 homework assignments.
Each assignment will contain:

Some open questions
Problems that you will have to model and solve with one or more
RL algorithms
You will have to write a report commenting on your results,
according to given instructions
You will have to submit your code
Copied implementations and/or reports will be failed outright.
Each report is personal!
You are NOT allowed to use ChatGPT!
Strongly advised to DO the assignments!
Homework (3/3)
Late-submission policy: Every day after the designated deadline
will lead to a reduction to 10% of the total original score.
Submissions three days later than the designated deadline will
not be considered.
Grade decomposition if you do the assignments:
Assignment 1: 10% (+ 3% Bonus)
Final exam: up to 70% (BUT you have to pass the exam to get the
Assignment grades!)
Final grade: max(exam+assignments, exam)
Final grade if you do not do any assignment: 100% from the
exam – The exam will NOT have ANY bonus questions
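As a rough illustration of the rule "Final grade: max(exam+assignments, exam)", the Python sketch below combines an exam result with assignment points. It rests on assumptions the slides do not confirm: the exam is taken to count 70% when combined with the assignments and 100% on its own, the pass decision is supplied as a flag because no pass threshold is stated, and the function name and arguments are hypothetical.

```python
def final_grade(exam_fraction, assignment_points, exam_passed):
    """Hypothetical sketch of the grade rule, in percentage points.

    exam_fraction: share of the exam answered correctly, in [0, 1].
    assignment_points: percentage points earned from assignments,
        including any bonus (the per-assignment split is not modelled).
    exam_passed: whether the exam counts as passed (threshold not
        stated in the slides, so the caller decides).
    """
    # Exam-only track: 100% of the grade comes from the exam.
    exam_only = 100.0 * exam_fraction

    # Assignment points only count if the exam itself is passed.
    if not exam_passed:
        return exam_only

    # Assumption: the exam is weighted 70% when assignments are added.
    with_assignments = 70.0 * exam_fraction + assignment_points

    # Assignments can only help, never hurt.
    return max(with_assignments, exam_only)
```

Under these assumptions, final_grade(0.5, 25.0, True) gives 60.0, while final_grade(0.5, 10.0, True) falls back to the exam-only value of 50.0.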
Background Reading
We will add current papers & tutorials.

Standard background reading:
R. Sutton, A. Barto. Reinforcement Learning - an Introduction, MIT Press
(http://incompleteideas.net/book/RLbook2020.pdf)
(Strongly recommended)
Algorithms for Reinforcement Learning, Csaba Szepesvari.
(https://sites.ualberta.ca/~szepesva/papers/RLAlgsInMDPs.pdf)
Mathematics background for machine learning:
M.P. Deisenroth, A. Aldo Faisal, and C.S. Ong, Mathematics for Machine
Learning (2020), Cambridge University Press
https://mml-book.github.io/
MushroomRL: Reinforcement Learning library

https://github.com/MushroomRL/mushroom-rl
Background Reading
Other resources
Reinforcement Learning: State-of-the-Art, Marco Wiering and Martijn
van Otterlo, Eds.
https://link.springer.com/book/10.1007/978-3-642-27645-3
Bertsekas, Dimitri. Reinforcement learning and optimal control. Athena
Scientific, 2019.
https://web.mit.edu/dimitrib/www/RL_Frontmatter-SHORT-INTERNET-POSTED.pdf
Artificial Intelligence: A Modern Approach, Stuart J. Russell and Peter
Norvig. http://aima.cs.berkeley.edu/
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
https://www.deeplearningbook.org/
How does it fit in your course plan? (1/3)
VL Reinforcement Learning is a good advanced lecture that can be
combined with:
VL Lernende Roboter (aka Robot Learning)
IP Robot Learning
IP, VL Intelligent Robotic Manipulation
VL Introduction to Artificial Intelligence
How does it fit in your course plan? (2/3)
Related Classes:
Improve Foundations: Statistical Machine Learning (SoSe), Robot
Learning (WiSe), Deep Learning: Architectures and Methods(SoSe)
Theses: We regularly offer B.Sc. or M.Sc. Theses on RL-related topics.

Please contact us!
How does it fit in your course plan? (3/3)
B.Sc. / M.Sc. Informatik:

Visual & Interactive Computing (see Modulhandbuch)
If you are strongly interested in machine learning you should
take:
Statistical Machine Learning for V&IC credit
Data Mining and Machine Learning for WW&IV credit
Robot Learning for R&CCE credit
Computer Vision for V&IC credit
M.Sc. in Autonomous Systems & Robotics

M.Sc. in Visual Computing: Area “Computer Vision & ML”
Lectures Schedule
April 15, Introduction to RL, Lecturer: Georgia

April 22, Markov Decision Processes, Lecturer: Georgia
April 29, Dynamic programming, Lecturer: Georgia
May 6, Model-free prediction, Lecturer: Davide
May 13, Model-free control, Lecturer: Davide
May 27, Function approximation, Lecturer: Georgia
June 3, Deep Q-Learning, Lecturer: Georgia
June 10, Policy gradients, Lecturer: Davide
June 17, Deep Actor-Critic: Introduction, Lecturer: Davide
June 24, Deep Actor-Critic: On-policy, Lecturer: Davide
July 1, Deep Actor-Critic: Off-policy, Lecturer: Davide
July 8, Intrinsic Motivation & Model-based RL, Lecturer: Georgia
Practical Lectures Schedule

April 19, Introduction to MushroomRL library, TA: Aryaman Reddi
April 26, Coding an MDP, TA: Elisa Alboni
May 3, Dynamic Programming, TA: Yufeng Jin
May 10, Model-free Prediction (TD Evaluation & MC), TA: Elisa Alboni
May 17, Model-free Control (SARSA, Q Learning), TA: Yufeng Jin
First assignment announcement – Due: June 7, 23:59
May 31, Function Approximation, TA: Jiayun Li
June 7, Deep Q Networks, TA: Théo Vincent
June 14, Using the Cluster, TA: Aryaman Reddi
Second assignment announcement – Due: July 5, 23:59
June 21, Policy Gradient & Performance Difference Lemma, TA: Jiayun Li
June 28, TRPO, PPO, TA: Théo Vincent
July 5, DDPG, TD3, SAC, TA: Théo Vincent
Third assignment announcement – Due: July 26, 23:59
July 12, Recap Lecture and Exam Demo, TA: Aryaman Reddi, All
Outline
Questions & Answers
Time for your questions!
