Zero-out the accumulated gradients

by Kilian Jérémy Thomas

Hello,

In the exercise session, one of the key steps of the CNN algorithms is "zeroing-out".

I couldn't find any reference to this term in the lecture. What is the purpose of it?

Thank you

In reply to Kilian Jérémy Thomas

Re: Zero-out the accumulated gradients

by Nicolas Talabot
Hi,

That is mostly a peculiarity of the PyTorch framework: the gradients computed at one iteration are not automatically discarded after they have been used for the gradient descent step, so we have to "zero them out" by hand.
If we don't, the gradients computed at the next iteration will be added on top of the previous ones instead of replacing them.

It is simply a way of telling PyTorch we don't care anymore about the computed gradients, so they can be discarded.
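For reference, here is a minimal sketch of what a typical PyTorch training iteration looks like (the model, loss, and data here are just placeholders to show where the zeroing-out happens):

import torch
import torch.nn as nn

# Placeholder model, loss, and optimizer, just to illustrate the loop structure.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(32, 10)  # dummy batch of inputs
y = torch.randn(32, 1)   # dummy targets

for iteration in range(100):
    optimizer.zero_grad()          # discard the gradients from the previous iteration
    prediction = model(x)
    loss = criterion(prediction, y)
    loss.backward()                # gradients are *added* into the .grad buffers of the parameters
    optimizer.step()               # gradient descent update using the freshly computed gradients

If you comment out the optimizer.zero_grad() line, the .grad buffers keep growing from one iteration to the next, so each update is effectively made with a sum of all past gradients rather than the current one.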