[HW3] Gradient Descent vs SGD

Re: [HW3] Gradient Descent vs SGD

by Firas Kanoun

Hey,

Thanks for the remark. Indeed, the gradient is computed at a single sample on each update, so the method used here is SGD rather than full gradient descent. Each update is therefore taken with respect to the loss on the i-th observation in the dataset only, not the whole dataset.

[Image: SGD update formula]
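
As a rough sketch of that update, in commonly used notation (the symbols $w$, $\gamma$ and $\mathcal{L}_i$ are assumptions here and may differ from the ones in the homework), one step reads $w^{(t+1)} = w^{(t)} - \gamma \nabla \mathcal{L}_i(w^{(t)})$, where $i$ is the index of the sampled observation. The snippet below illustrates this for a least-squares loss, which is also an assumption on my side; the function names are hypothetical and not taken from the HW3 template.

```python
import numpy as np


def full_gradient(w, X, y):
    """Gradient of the full loss (1 / 2N) * sum_n (x_n^T w - y_n)^2 (used by GD)."""
    return X.T @ (X @ w - y) / len(y)


def sample_gradient(w, X, y, i):
    """Gradient of the loss on the i-th observation only: 0.5 * (x_i^T w - y_i)^2."""
    return X[i] * (X[i] @ w - y[i])


def gd_step(w, X, y, gamma):
    """One gradient descent update: uses every observation."""
    return w - gamma * full_gradient(w, X, y)


def sgd_step(w, X, y, i, gamma):
    """One SGD update: uses the i-th observation only."""
    return w - gamma * sample_gradient(w, X, y, i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    w = np.zeros(3)
    for _ in range(2000):
        i = rng.integers(len(y))  # draw one observation uniformly at random
        w = sgd_step(w, X, y, i, gamma=0.05)
    print(w)
```

With these definitions, averaging `sample_gradient` over all i gives back `full_gradient`, which is the sense in which the single-sample gradient is a noisy estimate of the full one.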

Best,

Firas