ReadOut Layer in RNN


by Luca Viano,
Number of replies: 1

Dear TAs,

We started miniproject 2, but we are having trouble understanding what the readout layer between the recurrent layer and the final output layer should do.

Isn't an output layer with a softmax activation function after the SimpleRNN layer sufficient to get the word transition probabilities?

Thanks a lot in advance,

Luca

In reply to Luca Viano

Re: ReadOut Layer in RNN

by Florian François Colombo,

Hi all,

Having a softmax output is indeed sufficient to treat the output as a probability distribution over words.

The readout can be any (non-recurrent) processing of the recurrent unit activities. With a proper dimension and activation function, such an additional layer can improve the performance of your model, but it can also hurt performance if the dimension and activation are badly chosen.
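To make the idea concrete, here is a minimal NumPy sketch of one time step: a non-recurrent dense readout applied to the recurrent hidden state, followed by a softmax output over the vocabulary. All sizes (`vocab_size`, `hidden_size`, `readout_size`) and the tanh readout activation are illustrative assumptions, not part of the project specification; in Keras this would correspond to a `Dense` layer between `SimpleRNN` and the final softmax `Dense` layer.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
vocab_size, hidden_size, readout_size = 10, 8, 16
rng = np.random.default_rng(0)

h = rng.standard_normal(hidden_size)  # recurrent unit activities at one time step

# Readout: a non-recurrent dense layer on the hidden state (tanh chosen as an example).
W_r = 0.1 * rng.standard_normal((readout_size, hidden_size))
b_r = np.zeros(readout_size)
r = np.tanh(W_r @ h + b_r)

# Output layer with softmax -> probability distribution over words.
W_o = 0.1 * rng.standard_normal((vocab_size, readout_size))
b_o = np.zeros(vocab_size)
logits = W_o @ r + b_o
p = np.exp(logits - logits.max())  # shift for numerical stability
p /= p.sum()                       # p is a distribution over the vocabulary
```

Dropping the readout simply means feeding `h` directly into the output layer; both variants give valid word probabilities, so the readout is purely an extra (trainable) transformation.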

You are free to experiment with this readout layer as you like. If you want to try without it, that is fine as well.

Hope it helps!

Best,