Course Logistics
Teaching Assistants:
- Kamalaruban Parameswaran
- Paul Rolland
Should you have any questions, please email the head TA (Kamalaruban Parameswaran, kamalaruban.parameswaran@epfl.ch) or post on the Moodle discussion board.
Reading List:
Lectures 1-6:
[1] R. Sutton and A. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018.
Lecture 7:
[2] Mnih et al., Playing Atari with Deep Reinforcement Learning, arXiv, 2013.
[3] Wang et al., Dueling Network Architectures for Deep Reinforcement Learning, ICML, 2016.
[4] van Hasselt et al., Deep Reinforcement Learning with Double Q-learning, AAAI, 2016.
[5] Schaul et al., Prioritized Experience Replay, ICLR, 2016.
[6] Hessel et al., Rainbow: Combining Improvements in Deep Reinforcement Learning, AAAI, 2018.
Lectures 8-9:
[8] Kakade and Langford, Approximately Optimal Approximate Reinforcement Learning, ICML, 2002.
[9] Schulman et al., Trust Region Policy Optimization, ICML, 2015.
[10] Schulman et al., Proximal Policy Optimization Algorithms, arXiv, 2017.
[11] Silver et al., Deterministic Policy Gradient Algorithms, ICML, 2014.
Lecture 10:
[12] Lillicrap et al., Continuous Control with Deep Reinforcement Learning, ICLR, 2016.
[13] Fujimoto et al., Addressing Function Approximation Error in Actor-Critic Methods, ICML, 2018.
[15] Haarnoja et al., Soft Actor-Critic Algorithms and Applications, arXiv, 2018.
Lectures 11-12:
[17] Ng and Russell, Algorithms for Inverse Reinforcement Learning, ICML, 2000.
[18] Abbeel and Ng, Apprenticeship Learning via Inverse Reinforcement Learning, ICML, 2004.
[19] Ratliff et al., Maximum Margin Planning, ICML, 2006.
[20] Ziebart et al., Maximum Entropy Inverse Reinforcement Learning, AAAI, 2008.
[22] Syed and Schapire, A Game-Theoretic Approach to Apprenticeship Learning, NeurIPS, 2008.
[23] Ho et al., Model-Free Imitation Learning with Policy Optimization, ICML, 2016.
[24] Ho and Ermon, Generative Adversarial Imitation Learning, NeurIPS, 2016.