Course Logistics
Teaching Assistants:
- Kamalaruban Parameswaran
- Paul Rolland
Should you have any questions, please email the head TA (Kamalaruban Parameswaran, kamalaruban.parameswaran@epfl.ch) or post on the Moodle discussion board.
Reading List:
Lectures 1-6:
[1] R. Sutton and A. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018.
Lecture 7:
[2] Mnih et al., Playing Atari with Deep Reinforcement Learning, arXiv, 2013.
[3] Wang et al., Dueling Network Architectures for Deep Reinforcement Learning, ICML, 2016.
[4] van Hasselt et al., Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
[5] Schaul et al., Prioritized Experience Replay, ICLR, 2016.
[6] Hessel et al., Rainbow: Combining Improvements in Deep Reinforcement Learning, AAAI, 2018.
Lectures 8-9:
[8] Kakade and Langford, Approximately Optimal Approximate Reinforcement Learning, ICML, 2002.
[9] Schulman et al., Trust Region Policy Optimization, ICML, 2015.
[10] Schulman et al., Proximal Policy Optimization Algorithms, arXiv, 2017.
[11] Silver et al., Deterministic Policy Gradient Algorithms, ICML, 2014.
Lecture 10:
[12] Lillicrap et al., Continuous Control with Deep Reinforcement Learning, ICLR, 2016.
[13] Fujimoto et al., Addressing Function Approximation Error in Actor-Critic Methods, ICML, 2018.
[15] Haarnoja et al., Soft Actor-Critic Algorithms and Applications, arXiv, 2018.
Lectures 11-12:
[17] Ng and Russell, Algorithms for Inverse Reinforcement Learning, ICML, 2000.
[18] Abbeel and Ng, Apprenticeship Learning via Inverse Reinforcement Learning, ICML, 2004.
[19] Ratliff et al., Maximum Margin Planning, ICML, 2006.
[20] Ziebart et al., Maximum Entropy Inverse Reinforcement Learning, AAAI, 2008.
[22] Syed and Schapire, A Game-Theoretic Approach to Apprenticeship Learning, NeurIPS, 2008.
[23] Ho et al., Model-Free Imitation Learning with Policy Optimization, ICML, 2016.
[24] Ho and Ermon, Generative Adversarial Imitation Learning, NeurIPS, 2016.