Theory and Methods for Reinforcement Learning
Weekly outline
Teaching Assistants:
- Kamalaruban Parameswaran
- Paul Rolland
Should you have any questions, please send an email to the head TA (Kamalaruban Parameswaran, kamalaruban.parameswaran@epfl.ch) or post a question on the Moodle discussion board.
Reading List:
Lectures 1-6:
[1] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
Lecture 7:
[2] Mnih et al., Playing Atari with Deep Reinforcement Learning, arXiv, 2013.
[3] Wang et al., Dueling Network Architectures for Deep Reinforcement Learning, ICML, 2016.
[4] van Hasselt et al., Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
[5] Schaul et al., Prioritized Experience Replay, ICLR, 2016.
[6] Hessel et al., Rainbow: Combining Improvements in Deep Reinforcement Learning, AAAI, 2018.
Lectures 8-9:
[7] Sutton et al., Policy Gradient Methods for Reinforcement Learning with Function Approximation, NeurIPS, 2000.
[8] Kakade and Langford, Approximately Optimal Approximate Reinforcement Learning, ICML, 2002.
[9] Schulman et al., Trust Region Policy Optimization, ICML, 2015.
[10] Schulman et al., Proximal Policy Optimization Algorithms, arXiv, 2017.
[11] Silver et al., Deterministic Policy Gradient Algorithms, ICML, 2014.
Lecture 10:
[12] Lillicrap et al., Continuous Control with Deep Reinforcement Learning, ICLR, 2016.
[13] Fujimoto et al., Addressing Function Approximation Error in Actor-Critic Methods, ICML, 2018.
[15] Haarnoja et al., Soft Actor-Critic Algorithms and Applications, arXiv, 2018.
Lectures 11-12:
[17] Ng and Russell, Algorithms for Inverse Reinforcement Learning, ICML, 2000.
[18] Abbeel and Ng, Apprenticeship Learning via Inverse Reinforcement Learning, ICML, 2004.
[19] Ratliff et al., Maximum Margin Planning, ICML, 2006.
[20] Ziebart et al., Maximum Entropy Inverse Reinforcement Learning, AAAI, 2008.
[22] Syed and Schapire, A Game-Theoretic Approach to Apprenticeship Learning, NeurIPS, 2008.
[23] Ho et al., Model-Free Imitation Learning with Policy Optimization, ICML, 2016.
[24] Ho and Ermon, Generative Adversarial Imitation Learning, NeurIPS, 2016.
Reading material:
- Chapter 3 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
Reading material:
- Chapter 4 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
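For orientation, Chapter 4 centers on dynamic programming with a known model. Below is a minimal sketch of value iteration, assuming a small tabular MDP given as NumPy arrays; the layout P[s, a, s'] and R[s, a] is an illustrative choice, not taken from the book.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s, a, s'] = transition probability, R[s, a] = expected reward."""
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * E[V(s')]
        Q = R + gamma * (P @ V)            # shape: (n_states, n_actions)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1) # value function and greedy policy
        V = V_new
```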
Reading material:
- Chapter 5 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
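As a companion to Chapter 5, here is a minimal sketch of first-visit Monte Carlo prediction; the episode format [(state, reward), ...] and the dict-based value table are illustrative assumptions.

```python
def mc_first_visit_update(V, counts, episode, gamma=0.99):
    """One first-visit MC update from a completed episode [(state, reward), ...]."""
    first_visit = {}
    for t, (s, _) in enumerate(episode):
        first_visit.setdefault(s, t)        # index of the first visit to s
    G = 0.0
    for t in reversed(range(len(episode))):
        s, r = episode[t]
        G = r + gamma * G                   # discounted return from time t
        if first_visit[s] == t:             # update only on the first visit
            counts[s] = counts.get(s, 0) + 1
            v = V.get(s, 0.0)
            V[s] = v + (G - v) / counts[s]  # incremental sample mean
    return V
```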
Reading material:
- Chapter 6 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
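Chapter 6 introduces temporal-difference control; a minimal sketch of the tabular Q-learning backup follows, where the Q-table indexing Q[s, a] is an illustrative assumption.

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One off-policy TD(0) backup toward the Bellman optimality target."""
    target = r + (0.0 if done else gamma * np.max(Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])  # move Q(s, a) toward the TD target
    return Q
```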
A brief description of the project (1-2 pages) which includes the following:
- the names of the project team members
- a summary of the project and its importance
- a reading list and directions to be explored
- special computational resource requirements or licensing requirements (e.g., MuJoCo)
Reading material:
- Chapters 7 & 12 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
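Chapters 7 and 12 cover n-step methods and eligibility traces. Here is a minimal sketch of backward-view TD(lambda) prediction with accumulating traces, assuming a tabular NumPy value array and transitions of the form (s, r, s_next, done).

```python
import numpy as np

def td_lambda_episode(V, transitions, alpha=0.1, gamma=0.99, lam=0.9):
    """Backward-view TD(lambda); transitions = [(s, r, s_next, done), ...]."""
    z = np.zeros_like(V)                    # eligibility trace vector
    for s, r, s_next, done in transitions:
        delta = r + (0.0 if done else gamma * V[s_next]) - V[s]  # TD error
        z *= gamma * lam                    # decay every trace
        z[s] += 1.0                         # accumulating trace for s
        V += alpha * delta * z              # credit all recently visited states
    return V
```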
Reading material:
- Chapter 8 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
[2] Mnih et al., Playing Atari with Deep Reinforcement Learning, arXiv, 2013.
[4] van Hasselt et al., Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
[5] Schaul et al., Prioritized Experience Replay, ICLR, 2016.
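To connect [2] and [4]: the Double DQN target decouples action selection (online network) from action evaluation (target network) to reduce overestimation bias. A minimal NumPy sketch follows, where q_online and q_target are assumed to be callables returning per-action values of shape (batch, n_actions).

```python
import numpy as np

def double_dqn_targets(q_online, q_target, rewards, next_states, dones, gamma=0.99):
    """Regression targets for a replayed batch, in the style of [4].
    dones is a float array with 1.0 where the episode terminated."""
    a_star = np.argmax(q_online(next_states), axis=1)   # online net selects
    q_next = q_target(next_states)[np.arange(len(a_star)), a_star]  # target net evaluates
    return rewards + gamma * (1.0 - dones) * q_next
```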
Reading material:
- Chapter 13 in R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.
[7] Sutton et al., Policy Gradient Methods for Reinforcement Learning with Function Approximation, NeurIPS, 2000.
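To accompany Chapter 13 and [7], a minimal sketch of REINFORCE with a tabular softmax policy, for which the gradient of log pi(a|s) has the closed form one_hot(a) - pi(.|s); the parameterization theta[s, a] is an illustrative assumption.

```python
import numpy as np

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

def reinforce_update(theta, episode, alpha=0.01, gamma=0.99):
    """theta[s, a] = action preferences; episode = [(s, a, r), ...]."""
    returns = np.zeros(len(episode))
    G = 0.0
    for t in reversed(range(len(episode))):
        G = episode[t][2] + gamma * G       # Monte Carlo return G_t
        returns[t] = G
    for t, (s, a, _) in enumerate(episode):
        pi = softmax(theta[s])
        grad_log_pi = -pi                   # gradient of log pi(a|s) w.r.t. theta[s]
        grad_log_pi[a] += 1.0               # ... equals one_hot(a) - pi
        theta[s] += alpha * (gamma ** t) * returns[t] * grad_log_pi
    return theta
```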