Cours : Theory and Methods for Reinforcement Learning

Généralités

Tout replier Tout déplier

Announcements Forum
Course Logistics Fichier
Teaching Assistants:
Kamalaruban Parameswaran
Paul Rolland
Should you have any question, please send a mail to the head TA (Kamalaruban Parameswaran, email = kamalaruban.parameswaran@epfl.ch) or post a question on the Moodle discussion board.
Reading List:
Lectures 1-6:
[1] S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
Lecture 7:
[2] Mnih et al, Playing Atari with Deep Reinforcement Learning, arXiv, 2013.
[3] Wang et al, Dueling Network Architectures for Deep Reinforcement Learning, ICML, 2016.
[4] Hasselt et al, Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
[5] Schaul et al, Prioritized Experience Replay, ICLR, 2016.
[6] Hessel et al, Rainbow: Combining Improvements in Deep Reinforcement Learning, AAAI, 2018.
Lectures 8-9:
[7] Sutton et al, Policy Gradient Methods For Reinforcement learning with function approximation, NeurIPS, 2000.
[8] Kakade and Langford, Approximately Optimal Approximate Reinforcement Learning, ICML, 2002.
[9] Schulman et al, Trust Region Policy Optimization, ICML, 2015.
[10] Schulman et al, Proximal Policy Optimization Algorithms, arXiv, 2017.
[11] Silver et al, Deterministic Policy Gradient Algorithms, ICML, 2014.
Lecture 10:
[12] Lillicrap et al, Continuous Control With Deep Reinforcement Learning, ICLR, 2016.
[13] Fujimoto et al, Addressing Function Approximation Error in Actor-Critic Methods, ICML, 2018.
[14] Haarnoja et al, Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, ICML, 2018.
[15] Haarnoja et al, Soft Actor-Critic Algorithms and Applications, arXiv, 2018.
Lectures 11-12:
[16] Osa et al, An Algorithmic Perspective on Imitation Learning, Foundations and Trends in Robotics, 2018.
[17] Ng and Russell, Algorithms for Inverse Reinforcement Learning, ICML, 2000.
[18] Abbeel and Ng, Apprenticeship Learning via Inverse Reinforcement Learning, ICML, 2004.
[19] Ratliff et al, Maximum Margin Planning, ICML, 2006.
[20] Ziebart et al, Maximum Entropy Inverse Reinforcement Learning, AAAI, 2008.
[21] Ziebart, Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, Thesis, 2010.
[22] Syed and Schapire, A Game-Theoretic Approach to Apprenticeship Learning, NeurIPS, 2008.
[23] Ho et al, Model-Free Imitation Learning with Policy Optimization, ICML, 2016.
[24] Ho and Ermon, Generative Adversarial Imitation Learning, NeurIPS, 2016.

17 février - 23 février

Lecture 1 Fichier
Reading material:
Chapter 3 in S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.

24 février - 1 mars

Lecture 2 Fichier
Reading material:
Chapter 4 in S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.

2 mars - 8 mars

Lecture 3 Fichier
Reading material:
Chapter 5 in S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.

9 mars - 15 mars

Lecture 4 Fichier
Reading material:
Chapter 6 in S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
Project Proposal Devoir
A brief description of the project (1-2 page) which includes the following:

the names of the project team members

summary of the project and its importance

a reading list and directions to be explored

special computational resource requirements or licensing requirements (e.g., MuJoCo)

16 mars - 22 mars

Lecture 5 Fichier
Reading material:
Chapters 7 & 12 in S. Sutton, and G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.

13 avril - 19 avril

20 avril - 26 avril

Lecture 8 Fichier

Reading materials:
[9] Schulman et al, Trust Region Policy Optimization, ICML, 2015.
[10] Schulman et al, Proximal Policy Optimization Algorithms, arXiv, 2017.

27 avril - 3 mai

Lecture 9 Fichier

4 mai - 10 mai

Lecture 10 Fichier

11 mai - 17 mai

Lecture 11 Fichier

18 mai - 24 mai

25 mai - 31 mai

Final Project Report Devoir

Aperçu des semaines

Généralités

17 février - 23 février

24 février - 1 mars

2 mars - 8 mars

9 mars - 15 mars

16 mars - 22 mars

23 mars - 28 mars

30 mars - 5 avril

6 avril - 12 avril

13 avril - 19 avril

20 avril - 26 avril

27 avril - 3 mai

4 mai - 10 mai

11 mai - 17 mai

18 mai - 24 mai

25 mai - 31 mai