CS-456: MP2 Clarification on reward stochasticity

Hello,

We would like to clarify that, starting from Section 3.2 of Miniproject 2, all agents should be trained with stochastic rewards.
