CS-456: Miniproject 2 | Moodle 18-19

Hello,

We have a question concerning the chatbot miniproject.

When creating the three different models in part 2, our output is a 3-dimensional array of shape (number of sentences, maxlen, vocabulary size). It is indeed a probability distribution over words, but one for each word position in the sentence. Is this a correct solution, or should we only get the probability for the last word of a sentence? We were also thinking of passing this output to another neural network to get proper transition probabilities.

Thanks a lot,

Best regards,

Robin Leurent and Alexis Mermet.

Re: Miniproject 2

by Florian François Colombo - Tuesday, 14 May 2019, 2:44 PM

Hi,

This approach is indeed the one suggested for MP2. That is, model the joint probability of each (ordered) word in a set of sentences.

Doing it this way, your loss is the sum over n of -log P(word[n+1] | H[n]), where H[n] = f(word[:n]) is the hidden state of your network after being presented with the first n words.
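To make the per-position loss concrete, here is a minimal NumPy sketch. It assumes the loss is the usual categorical cross-entropy, i.e. the sum of negative log-probabilities of the true next word, and that the model output has the shape described in the question. The function name sequence_nll and the optional padding mask are illustrative assumptions, not part of the project code.

```python
import numpy as np

def sequence_nll(probs, targets, mask=None):
    """Per-sentence sum over n of -log P(word[n+1] | H[n]).

    probs:   (num_sentences, maxlen, vocab_size), last axis sums to 1
    targets: (num_sentences, maxlen) integer id of the true next word
    mask:    optional (num_sentences, maxlen), 1 for real words, 0 for padding
             (an assumption: the thread does not say how padding is handled)
    """
    # Probability the model assigned to the true next word at each position.
    p_true = np.take_along_axis(probs, targets[..., None], axis=-1)[..., 0]
    # Small constant avoids log(0) for numerically zero probabilities.
    nll = -np.log(p_true + 1e-12)
    if mask is not None:
        nll = nll * mask
    # Sum over word positions, giving one loss value per sentence.
    return nll.sum(axis=1)
```

Every word position contributes one term, which is why the output keeps the maxlen axis instead of only the last word's distribution.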

I suggest you move forward to the generation part in order to clarify how your model works.
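As a rough illustration of how the generation part relates to this output, the sketch below samples one word at a time and only reads the distribution at the last position of the current prefix. It is an assumption about how generation could be done, not the project's reference solution: predict_fn is a hypothetical callable that returns an array of shape (len(words), vocab_size) for a list of word ids, and end_id is a hypothetical end-of-sentence token id.

```python
import numpy as np

def generate(predict_fn, prefix, max_len, end_id, rng=None):
    """Sample word ids one at a time until end_id or max_len is reached."""
    rng = np.random.default_rng() if rng is None else rng
    words = list(prefix)
    while len(words) < max_len:
        # Hypothetical model call: one distribution per position of the prefix.
        probs = predict_fn(words)
        # Only the last position predicts the word that is not yet known.
        next_dist = probs[-1]
        next_id = int(rng.choice(len(next_dist), p=next_dist))
        words.append(next_id)
        if next_id == end_id:
            break
    return words
```

During training all positions are used at once, while during generation only the last row matters at each step, so the same model output serves both purposes.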

Hope it helps!

Florian