[Spoiler] Mock Exam 2020 - Parameter in the regularization term

Re: [Spoiler] Mock Exam 2020 - Parameter in the regularization term

by Nicolas Talabot,
Usually, this regularization term appears in the total loss function (the overall objective we want to minimize). For logistic regression, the cross-entropy E(w) is what we minimize; if we also want to add such an L2 regularization, the problem becomes:

w* = argmin_w [ E(w) + lambda * L_reg(w) ]

where lambda is a constant that balances the two terms (similar in role to the "C" that appears in the SVM formulation, although C usually weights the data term, so a larger C means less regularization). Usually, the value of lambda has to be found by trial and error: you test several values and keep the one that gives the best performance on a validation set.
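To make this concrete, here is a minimal sketch of the idea in Python with NumPy. Everything in it is an assumption made for illustration: the toy data, the function names (total_loss, fit, accuracy), the candidate values of lambda, and the choice L_reg(w) = sum of w_i^2. It is not the course's reference implementation, just one way to try several values of lambda and keep the one that scores best on a validation set.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def total_loss(w, X, y, lam):
    """Cross-entropy E(w) plus lam * L_reg(w), assuming L_reg(w) = sum(w**2)."""
    p = sigmoid(X @ w)
    eps = 1e-12  # avoids log(0)
    cross_entropy = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return cross_entropy + lam * np.sum(w ** 2)

def fit(X, y, lam, lr=0.1, n_steps=500):
    """Plain gradient descent on the regularized loss, starting from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        # Gradient of the mean cross-entropy plus gradient of lam * sum(w**2)
        grad = X.T @ (sigmoid(X @ w) - y) / len(y) + 2.0 * lam * w
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return np.mean((sigmoid(X @ w) >= 0.5) == y)

# Toy data, made up for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + 0.3 * rng.normal(size=200) > 0).astype(float)
X_train, y_train = X[:150], y[:150]
X_val, y_val = X[150:], y[150:]

# Trial and error: train once per candidate, score on the validation set
candidates = [0.0, 0.001, 0.01, 0.1, 1.0]
scores = {lam: accuracy(fit(X_train, y_train, lam), X_val, y_val) for lam in candidates}
best_lam = max(scores, key=scores.get)
print(scores, best_lam)
```

The validation set is kept separate from the training set so that the chosen lambda is not simply the one that fits the training data best (which would tend to favor no regularization at all).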

For the exam examples above, I believe they simply set it to 1/D, where D is the dimension of w (if you look at the sums, the first one has 3 terms and the other has 6).
In that case, I think any value would be considered correct, since it is mostly the formula for the L2 loss that matters (well, based on the info I see here).
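As a small illustration of that guess, setting lambda = 1/D turns the L2 term into the mean of the squared weights. The function name and the weight values below are hypothetical and are not taken from the exam.

```python
import numpy as np

def l2_penalty_mean(w):
    """L2 term with lambda = 1/D: (1/D) * sum_i w_i^2, where D = len(w)."""
    w = np.asarray(w, dtype=float)
    return np.sum(w ** 2) / w.size

# Hypothetical weight vectors with 3 and 6 entries
print(l2_penalty_mean([1.0, 2.0, 3.0]))                 # (1 + 4 + 9) / 3
print(l2_penalty_mean([1.0, 1.0, 1.0, 1.0, 1.0, 1.0]))  # 6 / 6
```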


The number of terms after the polynomial expansion depends on the initial number of features, the maximum degree of the polynomial, and which terms are kept.
There is probably some formula to deduce the maximum possible number of terms from that, but I don't know it, and you are not expected to know it for this course.
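As a side note that goes beyond what the course expects: if every term is kept, the number of monomials of total degree at most d in n features is the binomial coefficient C(n + d, d), counting the constant term. The sketch below checks this count against a brute-force enumeration; the function names are made up for illustration, and an exam or library may keep fewer terms (for example, dropping the constant or the cross terms).

```python
from itertools import combinations_with_replacement
from math import comb

def count_poly_terms(n_features, max_degree, include_bias=True):
    """Number of monomials of total degree <= max_degree in n_features variables."""
    total = comb(n_features + max_degree, max_degree)
    return total if include_bias else total - 1

def enumerate_poly_terms(n_features, max_degree):
    """Each monomial as a tuple of feature indices; the empty tuple is the constant term."""
    return [
        combo
        for degree in range(max_degree + 1)
        for combo in combinations_with_replacement(range(n_features), degree)
    ]

# Two features, degree 2: 1, x1, x2, x1^2, x1*x2, x2^2
print(count_poly_terms(2, 2))
print(len(enumerate_poly_terms(2, 2)))
```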