Miniproject 2

by Robin Leurent -

Hello,

We have a question concerning the chatbot miniproject.
When creating the three different models in part 2, we output a 3-dimensional tensor of shape (number of sentences, maxlen, vocabulary size). It is indeed a word probability distribution, but one for each word position in the sentence. Is this a correct solution, or should we only get the probability for the last word of a sentence? We were also thinking of passing this output to another neural network to get proper transition probabilities.
Thanks a lot, 
Best regards,
Robin Leurent and Alexis Mermet.

In reply to Robin Leurent

Re: Miniproject 2

by Florian François Colombo -

Hi,

This is indeed the approach suggested for MP2. That is, model the joint probability of the (ordered) words in each sentence of the set.

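Doing it this way, the joint probability of a sentence factorizes into the product over n of P(word[n+1] | H[n]), where H[n] = f(word[:n]) is the hidden state of your network after being presented with the first n words. Your loss is then the sum over n of -log P(word[n+1] | H[n]).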
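
To make the per-position loss concrete, here is a minimal sketch in plain Python for a single sentence. The function name sequence_nll, the padding mask and the toy numbers are assumptions for illustration only, not part of the MP2 code. It takes one (maxlen, vocabulary size) slice of your output and sums -log P(word[n+1] | H[n]) over the non-padded positions.

```python
import math

def sequence_nll(probs, targets, mask):
    """Negative log-likelihood of one sentence.

    probs:   list of maxlen distributions over the vocabulary,
             probs[n] being the model's prediction for the next word
             after the words seen up to that position
    targets: list of maxlen integer ids, the word that actually came next
    mask:    list of maxlen flags, 1 for a real position, 0 for padding
    """
    total = 0.0
    for dist, target, keep in zip(probs, targets, mask):
        if keep:
            # Probability the model assigned to the true next word.
            total += -math.log(dist[target])
    return total

# Toy sentence: maxlen = 3, vocabulary of 3 words, last position is padding.
probs = [
    [0.5, 0.25, 0.25],
    [0.1, 0.8, 0.1],
    [1 / 3, 1 / 3, 1 / 3],
]
targets = [0, 1, 2]
mask = [1, 1, 0]

loss = sequence_nll(probs, targets, mask)
# exp(-loss) recovers the joint probability of the observed words.
joint = math.exp(-loss)
```

Here the two real positions contribute 0.5 and 0.8, so the joint probability is their product and the loss is its negative logarithm. For the whole 3-dimensional output you would sum (or average) this quantity over sentences.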

I suggest you move forward to the generation part in order to clarify how your model works.

Hope it helps!

Florian