[HW3] Stochastic gradient descent

by Armen Cédric Homberger -
Number of replies: 1

Hello,


In homework 3, we are supposed to update the weights of the vector w using stochastic gradient descent. I don't see how to do that (there's a formula at the end of the slides, but I don't see how to apply it). Could you please explain in more detail how to proceed?


Thanks,
Armen

In reply to Armen Cédric Homberger

Re: [HW3] Stochastic gradient descent

by Vidit Vidit -

Hi,
For SGD, we update the weights based on the loss computed from a single sample. So you first compute the derivative of the loss for that single sample, and then update the weights with this gradient.
So instead of the loss being summed over all the misclassified samples,
E(wb) = -\sum_{n \in M} (wb^T xb_n) l_n,
it is based only on the single sample:
E(wb) = -(wb^T xb_n) l_n.
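As a rough sketch of what this looks like in code (assuming wb is the augmented weight vector, xb_n the augmented input, and l_n a label in {-1, +1}; the function name, learning rate, and misclassification test `l * (wb @ xb) <= 0` are my own choices, not necessarily what the homework expects):

```python
import numpy as np

def sgd_perceptron_epoch(wb, X, labels, lr=1.0):
    """One SGD pass over the data: for each misclassified sample,
    take a gradient step on the single-sample loss E(wb) = -(wb^T xb) l."""
    for xb, l in zip(X, labels):
        if l * (wb @ xb) <= 0:       # sample is misclassified (or on the boundary)
            # gradient of E = -(wb^T xb) l with respect to wb is -l * xb,
            # so the descent step wb <- wb - lr * grad becomes:
            wb = wb + lr * l * xb
    return wb
```

You would typically repeat such passes (possibly shuffling the samples) until no sample is misclassified or an iteration limit is reached.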

Hope that helps

Vidit