MP1 - DQN implementation

Re: MP1 - DQN implementation

by Lucas Louis Gruaz
Hello,
In parts 3.1 to 3.3, you can implement either a single policy network or two networks (a policy network and a target network). Both approaches should work.
In part 3.4, we ask you to add two additional networks (a predictor and a "target") to the agent in order to compute the RND reward. The target network of parts 3.1 to 3.3 should not be confused with the target network of RND: they are separate networks with different roles. In total, you will therefore have either 3 or 4 networks in part 3.4, depending on whether you used a DQN target network in the earlier parts.
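
To make the network count concrete, here is a minimal sketch of how an agent might hold these networks. It is only an illustration, not the reference solution: `TinyNet`, `Agent`, `use_dqn_target`, `embed_dim`, and `rnd_reward` are hypothetical names, each "network" is a single linear layer written in plain Python instead of a real deep learning library, and the intrinsic reward is assumed to be the mean squared error between the predictor and the fixed random RND target (the assignment may ask for a different normalization).

```python
import random


class TinyNet:
    """Hypothetical stand-in for a neural network: one linear layer, no bias."""

    def __init__(self, in_dim, out_dim, rng):
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def __call__(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]

    def copy_from(self, other):
        # Copy the weights so both networks start with identical outputs.
        self.w = [row[:] for row in other.w]


class Agent:
    def __init__(self, state_dim, n_actions, embed_dim=4,
                 use_dqn_target=True, seed=0):
        rng = random.Random(seed)

        # Parts 3.1 to 3.3: Q-value network(s).
        self.policy_net = TinyNet(state_dim, n_actions, rng)
        self.dqn_target_net = None
        if use_dqn_target:
            # Optional DQN target: a copy of the policy network.
            self.dqn_target_net = TinyNet(state_dim, n_actions, rng)
            self.dqn_target_net.copy_from(self.policy_net)

        # Part 3.4: RND networks, unrelated to the DQN target above.
        self.rnd_predictor = TinyNet(state_dim, embed_dim, rng)  # trained
        self.rnd_target = TinyNet(state_dim, embed_dim, rng)     # fixed, random

    def networks(self):
        """Return the networks this agent holds, keyed by role."""
        nets = {
            "policy": self.policy_net,
            "dqn_target": self.dqn_target_net,
            "rnd_predictor": self.rnd_predictor,
            "rnd_target": self.rnd_target,
        }
        return {name: net for name, net in nets.items() if net is not None}

    def rnd_reward(self, state):
        # Assumed intrinsic reward: mean squared prediction error.
        pred = self.rnd_predictor(state)
        target = self.rnd_target(state)
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    for flag in (False, True):
        agent = Agent(state_dim=3, n_actions=2, use_dqn_target=flag)
        print(flag, sorted(agent.networks()))
```

With `use_dqn_target=False` the agent holds 3 networks, and with `use_dqn_target=True` it holds 4. In both cases `rnd_target` is a separate object from `dqn_target_net`.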