Exam 2020 Spoiler

by Jérémy Valentin Barghorn -
Number of replies: 4

Hello,

I am having some trouble with the 2020 exam:


For this question we are asked to use logistic regression to solve the problem. Since logistic regression is a binary classifier, it can only output one of two labels, e.g. 1 or -1. So to me it is not possible to predict a salary or a disease. For the third option we have an image and its annotated image, and given a new image we want to predict the roads. Here we do have a classification problem, but with far more than 2 groups, so logistic regression will not be able to separate the data with a simple line.

So for me the only right answer is the last one: we have features and we have to predict whether the students pass (1) or not (-1).

The problem is that only answers 2 and 3 are correct, and I'm really confused about that. It would be great if someone could explain why I'm wrong!

--------------------------------------------------

I have another question concerning the feature expansion. If we do a polynomial feature expansion of degree 2 for a vector [x1, x2], we get [x1, x2, x1^2, x2^2, x1*x2]. I'm not sure how this formula is defined. Do we have to include the degree-1 terms? Do the mixed terms have to be multiplied by 2? Is there a formula in the course that I forgot?

And now if we compare a polynomial feature expansion with a polynomial kernel, we have to find the same expression:

so the polynomial kernel is defined as k(x_i, x_j) = (1 + x_i^T x_j)^2.
And here in the question we have to find phi such that k(x_i, x_j) = phi(x_i)^T phi(x_j). The right answer is the 3rd one. If we do the calculations we find this expansion. But for me this is absolutely not of the polynomial form, which should look like (1 + x_i^T x_j)^2.
So I'm pretty confused, and I was wondering whether I'm doing something completely wrong?

Thanks in advance for your help!

In reply to Jérémy Valentin Barghorn

Re: Exam 2020 Spoiler

by Nicolas Talabot -
Hello,

Regarding the logistic regression question:

Answer 2: this can be seen as a multi-class classification problem where the features are symptoms and the classes are the diseases. In this case, it is possible to use multi-class logistic regression (slide 48 of the linear classification lecture); see the sketch after this list.
Answer 3: this is binary classification: for each pixel, we want to classify it as being part of a road (1) or not (0). We can apply logistic regression multiple times to get a prediction for each pixel. For instance, given the input image, classify pixel (i,j) as road or not, then repeat this for each pixel.
Answer 4: we don't want to classify whether an individual student passes the course, but rather predict how many will. That is a regression task where we want to predict a number 0 <= P <= N (with N the maximum number of students).
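A minimal sketch of the multi-class idea from answer 2, assuming scikit-learn is available (the symptom/disease data below is made up for illustration):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((100, 5))          # 100 patients, 5 symptom features
y = rng.integers(0, 3, size=100)  # 3 possible diseases: classes 0, 1, 2

# scikit-learn's LogisticRegression handles more than two classes
# with a multinomial (softmax) formulation.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:5]))         # predicted disease for 5 patients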


For the polynomial feature expansion, the basic idea is that we include terms of degree d <= D, with D the maximum degree. So for the feature vector [x1 x2], this expansion to degree 2 would give:
x_tilde = [1, x1, x2, x1^2, x2^2, x1*x2]

This will be multiplied by a parameter vector w = [w0 w1 w2 w3 w4 w5] to give a polynomial of degree 2.
Assuming the w_i can take any values, multiplying any term in x_tilde by a constant will not change much.

So these are simply different formulations for obtaining polynomial features; there is not necessarily a unique form set in stone.
For instance, we could consider only the terms x_i^d to simplify.
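As a small illustration (my own sketch, not from the course), here is that degree-2 expansion in Python; multiplying it by a weight vector gives a degree-2 polynomial in x1 and x2:

import numpy as np

def poly_expand_deg2(x):
    # [x1, x2] -> [1, x1, x2, x1^2, x2^2, x1*x2]
    x1, x2 = x
    return np.array([1.0, x1, x2, x1**2, x2**2, x1 * x2])

x_tilde = poly_expand_deg2([2.0, 3.0])
w = np.array([1.0, 0.5, -0.5, 0.25, 0.25, 1.0])  # arbitrary weights
print(x_tilde @ w)  # value of a degree-2 polynomial at (2, 3)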

In the example that you give, assuming Xa = (a1, a2) and Xb = (b1, b2), we get:
(1 + Xa^T Xb)^2 = 1 + 2*a1*b1 + 2*a2*b2 + 2*a1*b1*a2*b2 + (a1*b1)^2 + (a2*b2)^2,
which is different from Za^T Zb.
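For reference, a quick numeric check (my own, with arbitrary values) that an expansion with sqrt(2) factors on the linear and cross terms reproduces the degree-2 polynomial kernel exactly:

import numpy as np

def phi(x):
    # [1, sqrt(2)*x1, sqrt(2)*x2, x1^2, x2^2, sqrt(2)*x1*x2]:
    # its inner product equals (1 + x^T y)^2.
    x1, x2 = x
    s2 = np.sqrt(2.0)
    return np.array([1.0, s2 * x1, s2 * x2, x1**2, x2**2, s2 * x1 * x2])

Xa, Xb = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print((1 + Xa @ Xb) ** 2)  # polynomial kernel on the original features
print(phi(Xa) @ phi(Xb))   # same value from the expanded features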
In reply to Nicolas Talabot

Re: Exam 2020 Spoiler

by Jérémy Valentin Barghorn -
Thanks a lot for your clear answer!

I still have a small question. In exercise set 7 we saw that an SVM trained with a linear kernel on polynomially expanded data is the same as an SVM trained with a polynomial kernel on the original data. This means that the polynomial feature expansion has to match the polynomial kernel, so there is a fixed rule for the polynomial feature expansion, no?
Can we conclude that the polynomial feature expansion in this case, for a vector xi = (x_i1, x_i2), is the same as the third answer mentioned in the exam question?

Thanks a lot for your help :)
In reply to Jérémy Valentin Barghorn

Re: Exam 2020 Spoiler

by Nicolas Talabot -
A linear kernel with polynomial features is indeed equivalent to a polynomial kernel on the original features, though not necessarily strictly equal. That will depend on the exact formulation of the kernel function and of the polynomial feature expansion.

The basic idea behind this feature expansion is to get a polynomial of the original features. Now, the actual formulation can differ from one use to another, but the important and common point between them is that the added features should be monomials like x1^d, x1^n * x2^m, etc., which give a polynomial when multiplied by a weight vector w.

For instance, in exercise set 7, question 2.3, we chose to omit the "interaction terms" of the form x1*x2 to simplify things. In that example we don't necessarily use all the possible polynomial features, but the result is still a form of polynomial feature expansion.
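A sketch of that simplification (my own code, mirroring the idea rather than the exercise's exact notation):

import numpy as np

def poly_expand_no_interactions(x, degree):
    # Keep a bias term and the pure powers x_i^d only;
    # interaction terms like x1*x2 are dropped.
    x = np.asarray(x, dtype=float)
    feats = [np.ones(1)]
    for d in range(1, degree + 1):
        feats.append(x ** d)
    return np.concatenate(feats)

print(poly_expand_no_interactions([2.0, 3.0], 2))  # [1. 2. 3. 4. 9.]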
In reply to Jérémy Valentin Barghorn

Re: Exam 2020 Spoiler

by Tianqu Kang -
Hi,

May I know where we can find the 2020 exam paper? I can only see the mock exam on Moodle. Many thanks!