Quiz 4

by Taha Zakariya,
Number of replies: 6

Hello,

Would it be possible to have a detailed solution for Quiz 4 (all questions that are not multiple choice and that involve some calculation)?

Thank you

In reply to Taha Zakariya

Re: Quiz 4

by Jean-Cédric Chappelier,
There are a bit too many (including all the randomized variants).
I can address that in the QAS (see https://moodlearchive.epfl.ch/2022-2023/mod/forum/discuss.php?d=85338), or here if you have a more precise (focused) question.
In reply to Jean-Cédric Chappelier

Re: Quiz 4

by Taha Zakariya,
I mean, if possible of course, it would be nice to have the solution to one of the versions, but with the whole methodology, so that it can easily be applied to the other versions.
In reply to Taha Zakariya

Re: Quiz 4

by Jean-Cédric Chappelier,
For which question, more precisely? (I won't do it for all of them.)
In reply to Jean-Cédric Chappelier

Re: Quiz 4

by Taha Zakariya,
In reply to Taha Zakariya

Re: Quiz 4

by Jean-Cédric Chappelier,
For question 4:
+ compute the product of the P(w|class) times the class prior (don't forget the prior!)
+ estimate the P(w|class) by maximum likelihood: the count of w in the class divided by the total number of words in the class

For your version of that question:
+ estimates: priors: Space = 3/5, Animal = 2/5
  likelihoods: Space:  Space: 4/10, Mouse: 1/10, Cat: 1/10, Dog: 4/10
               Animal: Space: 2/10, Mouse: 4/10, Cat: 3/10, Dog: 1/10

+ inference: P(D6, Space)  ∝ 4*4*1*1*3 = 48
             P(D6, Animal) ∝ 2*2*3*4*2 = 96
             P(D7, Space)  ∝ 1*1*4*4*3 = 48
             P(D7, Animal) ∝ 3*3*2*1*2 = 36
  (only the numerators matter here: the common factor 1/10^4 * 1/5 cancels when comparing the two classes)
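The whole pipeline (ML estimates, then comparing prior-times-likelihood products) can be sketched in a few lines of Python. The word lists for D6 and D7 below are inferred from the products quoted above and are an assumption, not the official quiz data:

```python
# Naive Bayes scoring for the question-4 example.
# Priors and likelihoods are the ML estimates quoted above; the
# contents of D6/D7 are reconstructed from the products (assumption).

likelihood = {
    "Space":  {"Space": 4/10, "Mouse": 1/10, "Cat": 1/10, "Dog": 4/10},
    "Animal": {"Space": 2/10, "Mouse": 4/10, "Cat": 3/10, "Dog": 1/10},
}
prior = {"Space": 3/5, "Animal": 2/5}

def score(words, cls):
    """Unnormalized joint P(doc, class): class prior times product of P(w|class)."""
    p = prior[cls]
    for w in words:
        p *= likelihood[cls][w]
    return p

D6 = ["Space", "Space", "Cat", "Mouse"]   # gives 4*4*1*1 (Space) and 2*2*3*4 (Animal)
D7 = ["Cat", "Cat", "Space", "Dog"]       # gives 1*1*4*4 (Space) and 3*3*2*1 (Animal)

for name, doc in (("D6", D6), ("D7", D7)):
    best = max(("Space", "Animal"), key=lambda c: score(doc, c))
    print(name, "->", best)  # D6 -> Animal, D7 -> Space
```

The scores match the numerators above up to the common factor 1/50000: for instance score(D6, "Animal") = 96/50000.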
In reply to Taha Zakariya

Re: Quiz 4

by Jean-Cédric Chappelier,
For question 16:
First of all, attention is a weighted average of the encoder hidden states, so wherever all the encoder hidden states share the same value in some component, the result has that same value in that component as well.
So in your version the answer is already [10, 10, ?].

To compute the last component, you have to compute the weights of the average, which are the softmax of the dot-products of the encoder hidden states with the decoder hidden state. In your case the second encoder state is only 0.1 higher than the first (in its last component), so the second dot-product is simply 0.1*6 = 0.6 higher than the first, which is 140+50+15 = 205 (thus the second is 205.6).
Here the softmax is of dimension 2, so you only have to compute one value (well, actually you were provided with some software to do the computation, but it's not really needed): exp(205) / [ exp(205) + exp(205.6) ] ≈ 0.35, thus the second weight is 1 - 0.35 = 0.65.

And thus the final component you are looking for is 0.35*2.5 + 0.65*2.6 = 2.565
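This can be checked with a small sketch. The hidden-state vectors below are reconstructed from the dot-product terms quoted above (encoder states [10, 10, 2.5] and [10, 10, 2.6], decoder state [14, 5, 6], so that 10*14 + 10*5 + 2.5*6 = 140+50+15 = 205); they are an assumption for illustration, not the official quiz data:

```python
import math

def attention(encoder_states, decoder_state):
    """Weighted average of encoder states; weights are the softmax of the dot-products."""
    scores = [sum(e * d for e, d in zip(enc, decoder_state))
              for enc in encoder_states]
    m = max(scores)  # shift before exponentiating, for numerical stability
    exps = [math.exp(s - m) for s in scores]
    weights = [x / sum(exps) for x in exps]
    return [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
            for i in range(len(encoder_states[0]))]

enc = [[10, 10, 2.5], [10, 10, 2.6]]  # dot-products with dec: 205 and 205.6
dec = [14, 5, 6]
print(attention(enc, dec))  # ~[10, 10, 2.5646]
```

With the exact softmax the weights are about 0.354 and 0.646, giving a last component of about 2.5646; rounding the weights to 0.35/0.65 as in the answer above gives 2.565.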