For Agent 1 (K=n=1) we implement stochasticity by zeroing out the reward with probability 0.9. Should we also apply this to the rest of the agents in the project, using the same masking probability of 0.9?
Thanks a lot for the question. Yes, stochasticity should be used throughout the project, starting from 3.2. Agent 1 without stochasticity is used to make sure the basic implementation is correct. Stochasticity is then necessary to observe the benefits of the additional components.
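For reference, the reward masking described in the question could look roughly like the sketch below. This is only an illustration, not the project's reference code: the function name `mask_reward` and the parameter `p_zero` are hypothetical, and it assumes the mask is sampled independently for each reward.

```python
import random

def mask_reward(reward, rng, p_zero=0.9):
    """Return 0.0 with probability p_zero, otherwise the unchanged reward.

    `mask_reward` and `p_zero` are hypothetical names for illustration.
    """
    # rng.random() is uniform on [0, 1), so this branch fires with
    # probability p_zero.
    if rng.random() < p_zero:
        return 0.0
    return reward

# Seeded generator so a run is reproducible.
rng = random.Random(0)

# Stochastic setting: about 10% of the rewards should survive.
masked = [mask_reward(1.0, rng, p_zero=0.9) for _ in range(10_000)]
kept_fraction = sum(masked) / len(masked)

# Deterministic setting (the Agent 1 sanity check): disable the mask.
unmasked = [mask_reward(1.0, rng, p_zero=0.0) for _ in range(5)]
```

Setting `p_zero=0.0` recovers the non-stochastic case used to check the basic implementation, so the same code path can produce both sets of plots.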
Hi,
Thanks for the answer! In that case, should we only include plots for Agent 1 with stochasticity?
No, both should be included. In particular, the value loss of agent 1 without stochasticity is an important success criterion.
I updated the initial reply with the comment: "Agent 1 without stochasticity is used to make sure the basic implementation is correct. Then to observe the benefits of the additional components, stochasticity is necessary."