CS-456: Mountain car mini project

Mountain car mini project

1. You can either store the state from each step in a FIFO queue and compute the mean and standard deviation over the stored batch, or compute them online with a formula like new_average = old_average * (n-1)/n + new_value / n (and a similar formula for the variance).
2. You should multiply after the clamp.
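
The two options in point 1 can be sketched roughly as follows. This is only an illustration, not the project's reference solution: the class names `RunningStats` and `WindowStats`, the `eps` term, and the default queue length are assumptions made for the example. The online variant uses Welford's update for the variance, which is one common choice for the "similar formula" mentioned above.

```python
import math
from collections import deque


class RunningStats:
    """Online per-dimension mean and variance (Welford's update)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations from the mean

    def update(self, state):
        self.n += 1
        for i, x in enumerate(state):
            delta = x - self.mean[i]
            # Equivalent to old_average * (n-1)/n + new_value / n
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def std(self):
        # Population standard deviation; falls back to 1.0 before two samples
        if self.n < 2:
            return [1.0] * len(self.mean)
        return [math.sqrt(m2 / self.n) for m2 in self.m2]

    def normalize(self, state, eps=1e-8):
        # eps (assumed value) avoids division by zero for constant dimensions
        return [(x - m) / (s + eps)
                for x, m, s in zip(state, self.mean, self.std())]


class WindowStats:
    """Mean and std over the most recent states kept in a FIFO queue."""

    def __init__(self, maxlen=10_000):
        # deque drops the oldest state automatically once maxlen is reached
        self.buffer = deque(maxlen=maxlen)

    def update(self, state):
        self.buffer.append(tuple(state))

    def normalize(self, state, eps=1e-8):
        n = len(self.buffer)
        if n == 0:
            return list(state)  # nothing stored yet, return the state unchanged
        out = []
        for i, x in enumerate(state):
            column = [s[i] for s in self.buffer]
            mean = sum(column) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
            out.append((x - mean) / (std + eps))
        return out
```

In use, you would call `update(state)` once per environment step and `normalize(state)` wherever the normalized state is needed. The queue version only reflects recent states, while the online version averages over everything seen so far, so the two agree only while the queue has not yet overflowed.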