Yes, that's the idea. However, be careful: different parallel environments may have trajectories/episodes that end and reset at different timesteps, so the advantage computation should be robust to that (in particular, it should not bootstrap values across an episode boundary).
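As a minimal sketch of what that looks like in practice, here is a Generalized Advantage Estimation (GAE) pass over a rollout from several workers that masks the bootstrap term with the per-step done flags. The shapes and names (`rewards`, `values`, `dones`, `last_values`) are assumptions for illustration, not from any particular library:

```python
import numpy as np

def compute_gae(rewards, values, dones, last_values, gamma=0.99, lam=0.95):
    """GAE over a batch of parallel workers.

    rewards, values, dones: arrays of shape (T, N) for T steps and N workers.
    last_values: value estimates for the state after the last step, shape (N,).
    Episodes that terminate mid-rollout are handled by zeroing the bootstrap
    term via (1 - done), so advantages never leak across episode boundaries.
    """
    T, N = rewards.shape
    advantages = np.zeros((T, N), dtype=np.float32)
    gae = np.zeros(N, dtype=np.float32)
    next_values = last_values
    for t in reversed(range(T)):
        not_done = 1.0 - dones[t]                   # 0 where the episode ended at step t
        delta = rewards[t] + gamma * next_values * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae  # running sum resets at episode ends
        advantages[t] = gae
        next_values = values[t]
    returns = advantages + values
    return advantages, returns
```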
You also want to keep a global counter of the total environment steps used to train the agent, incremented after each rollout by the number of workers times the steps collected per worker.
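A rough sketch of that bookkeeping, with hypothetical names (`num_workers`, `steps_per_worker`, `max_env_steps`) chosen just for illustration:

```python
num_workers = 8          # parallel environments (assumed for illustration)
steps_per_worker = 128   # rollout length per worker per update
max_env_steps = 10_000_000
total_env_steps = 0      # global counter of environment steps consumed so far

while total_env_steps < max_env_steps:
    # collect one rollout of steps_per_worker steps from each worker,
    # then update the policy on that batch (details omitted)
    total_env_steps += num_workers * steps_per_worker
```

This counter is what you typically plot against (e.g., reward vs. environment steps) and use for step-based schedules such as learning-rate or entropy annealing.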