DDPG

Re: DDPG

by Ali Bakly
Number of replies: 0
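[Not a TA] In DDPG, the action actually executed is the policy network's output plus exploration noise, so the data is collected by a behaviour policy that differs from the deterministic policy being learned. I think that makes it off-policy, which is why you can use a replay buffer. I would also be interested in hearing a TA's opinion on this.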
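
To make the point concrete, here is a rough sketch of how I picture it, not a full DDPG implementation. The names `actor`, `behavior_action`, `ReplayBuffer` and `td_target` are all made up for illustration, the "actor" is a fixed linear map standing in for a network, and plain Gaussian noise is assumed as the exploration noise (other noise processes are possible). Target networks and the gradient updates are left out.

```python
import random
from collections import deque


def actor(state, weight=0.5):
    # Hypothetical stand-in for the deterministic policy network mu(s).
    return weight * state


def behavior_action(state, noise_std=0.1, low=-1.0, high=1.0, rng=random):
    # Behaviour policy: deterministic actor output plus exploration noise
    # (Gaussian noise is an assumption here), clipped to the action bounds.
    noisy = actor(state) + rng.gauss(0.0, noise_std)
    return max(low, min(high, noisy))


class ReplayBuffer:
    # Fixed-size store of past transitions; the oldest are dropped first.
    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniformly sample transitions, regardless of which (older) version
        # of the behaviour policy generated them.
        return rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def td_target(reward, next_state, done, critic, gamma=0.99):
    # The target evaluates the noise-free actor at the next state, so it does
    # not depend on the noisy action the behaviour policy would have taken.
    next_action = actor(next_state)
    bootstrap = 0.0 if done else critic(next_state, next_action)
    return reward + gamma * bootstrap
```

The stored action comes from `behavior_action` (with noise), while `td_target` only ever calls `actor` (without noise). That mismatch between the data-collecting policy and the policy being evaluated is what I mean by off-policy, and it is why sampling old transitions from the buffer still makes sense.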