Theory and Methods for Reinforcement Learning
Weekly outline
Teaching Assistants:
- Luca Viano (Head TA)
- Ali Kavis
- Leello Dadi
- Thomas Pethick
- Fatih Sahin
- Pedro Abranches
If you have any questions, please email the head TA (Luca Viano, luca.viano@epfl.ch) or post them on the Moodle discussion board.
An overview of the course
MDPs; value and Q functions; value iteration, policy iteration; operator perspectives
Model-free policy-based and value-based methods; Monte Carlo (MC) method and temporal difference (TD) learning.
A brief description of the project (2-3 pages including references), which includes the following:
- the names of the project team members
- motivation of the project
- formal description of the problem and the goal
- references
- software and computational resources you will use
Primal and Dual LP, ALP, ALP with constraint sampling, primal dual methods, REPS.
Policy gradient methods
Policy gradient II: rates, gradient dominance property, distribution mismatch coefficients, natural policy gradient.
Policy gradient III: Natural policy gradient convergence bounds
Behavioral cloning, imitation learning, inverse reinforcement learning.
Markov Games 2: nonlinear programming, policy gradient
Deep RL, Actor Critic methods, DQN
Robust RL
Dear all,
please upload your final report by Friday, June 10th at 11:59 PM. Please double-check the submission instructions that we uploaded on Moodle during the first week: https://moodlearchive.epfl.ch/2021-2022/pluginfile.php/3075502/mod_assign/intro/syllabus-2022.pdf (page 4).
In particular, we expect between 6 and 8 pages in the NeurIPS template: https://neurips.cc/Conferences/2022/PaperInformation/StyleFiles
The required structure is:
- Abstract
- Introduction
- Related Work
- Approach
- Results
- Conclusion
- References
If you ran experiments, please attach your code as supplementary material: upload a single zip file containing the main report in PDF format and a folder named supplementary for the attached files.
You may also include an appendix as a separate PDF in the same zip file.
PS: Due to the bank holiday, there will be no class this week. The final class is on June 2nd, when you will give a 15-minute presentation of your project. There is no need to submit your slides at this stage.