MP1 - DQN implementation

Re: MP1 - DQN implementation

by Lucas Louis Gruaz,
Hello,
In parts 3.1 to 3.3, you can implement either a single policy network or two networks (a policy network and a target network). Both approaches should work.
In part 3.4, we ask you to add two further networks to the agent, a predictor and a "target", to compute the RND reward. Do not confuse the RND target network with the DQN target network from parts 3.1 to 3.3. In total, you will therefore have either 3 networks (policy, RND predictor, RND target) or 4 networks (the same three plus the DQN target) in part 3.4.
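To make the network count concrete, here is a minimal, framework-free sketch of how such an agent could be organized. It is not the reference solution: the class name `RNDAgentSketch`, its attributes, and the use of tiny random linear maps in place of real neural networks are all assumptions made for illustration. In your own code each "network" would be a module from whichever deep learning library you use.

```python
import random


def random_linear(n_in, n_out, rng):
    """Hypothetical stand-in for a network: a random n_out x n_in weight matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, x):
    """Apply the linear map to an input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class RNDAgentSketch:
    """Illustrative container for the 3 or 4 networks (all names are assumptions)."""

    def __init__(self, obs_dim, n_actions, embed_dim=4, use_dqn_target=True, seed=0):
        rng = random.Random(seed)
        # Parts 3.1 to 3.3: the policy (Q) network.
        self.policy_net = random_linear(obs_dim, n_actions, rng)
        # Optional DQN target network: starts as a separate copy of the policy network.
        self.dqn_target_net = (
            [row[:] for row in self.policy_net] if use_dqn_target else None
        )
        # Part 3.4: the two RND networks. The RND target is random and never trained;
        # only the predictor would be trained to match its outputs.
        self.rnd_predictor = random_linear(obs_dim, embed_dim, rng)
        self.rnd_target = random_linear(obs_dim, embed_dim, rng)

    def networks(self):
        """Return the networks actually present, keyed by role."""
        nets = {
            "policy": self.policy_net,
            "rnd_predictor": self.rnd_predictor,
            "rnd_target": self.rnd_target,
        }
        if self.dqn_target_net is not None:
            nets["dqn_target"] = self.dqn_target_net
        return nets

    def rnd_reward(self, obs):
        """Intrinsic reward: mean squared error between predictor and RND target."""
        pred = forward(self.rnd_predictor, obs)
        target = forward(self.rnd_target, obs)
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


agent_3 = RNDAgentSketch(obs_dim=3, n_actions=2, use_dqn_target=False)
agent_4 = RNDAgentSketch(obs_dim=3, n_actions=2, use_dqn_target=True)
print(sorted(agent_3.networks()))
print(sorted(agent_4.networks()))
```

Keeping the two target networks under distinct names (here `dqn_target_net` and `rnd_target`) is one simple way to avoid mixing them up: the first is periodically synchronized with the policy network, while the second stays fixed after initialization.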