MP2: Input words for Embedding Visualization

by Xiao Zhou -
Number of replies: 3

In the embedding visualization section, we are supposed to plot the 2D projection of the 200 most frequent words. According to the model, we should input a list of words. If I want to get the 2D vector of a single word after projection, what should the input to the model we've built be?

In reply to Xiao Zhou

Re: MP2: Input words for Embedding Visualization

by Florian François Colombo -

Hi, the input should be a layer that takes words (represented as unique integers) as input. It should be the same as for your model, or with a different size if you want to input more than maxlen words at a time: typically, create a model whose input layer has size number_of_possible_words and that outputs the embeddings of these words.

Please make sure to write the actual words on the embedding plot (t-SNE).
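
For example, something along these lines (a rough sketch only; `model`, `top_words`, and `top_indices` stand in for your own trained model and the 200 most frequent words from your preprocessing):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from tensorflow import keras

# Assumed to exist in your MP2 code:
#   model       - your trained model, with the Embedding layer as its first layer
#   top_words   - list of the 200 most frequent words (strings)
#   top_indices - their integer ids from your tokenizer
embedding_layer = model.layers[0]

# Small model: one integer word id in, that word's embedding vector out.
word_input = keras.Input(shape=(1,), dtype="int32")
embedding_model = keras.Model(word_input, embedding_layer(word_input))

word_ids = np.array(top_indices).reshape(-1, 1)
vectors = embedding_model.predict(word_ids).reshape(len(top_indices), -1)

# Project to 2D with t-SNE and write the actual word next to each point.
coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(vectors)
plt.figure(figsize=(10, 10))
plt.scatter(coords[:, 0], coords[:, 1], s=5)
for word, (x, y) in zip(top_words, coords):
    plt.annotate(word, (x, y), fontsize=8)
plt.show()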

Hope it helps,

Florian 

In reply to Florian François Colombo

Re: MP2: Input words for Embedding Visualization

by Xiao Zhou -

Thanks for your reply.

I know the input is words as integers. My question is: if I want to get the two t-SNE values for one single word, what should I input to get them? Should we input all the preprocessed sentences, get all the t-SNE values for that single word, and then average them?


In reply to Xiao Zhou

Re: MP2: Input words for Embedding Visualization

by Florian François Colombo -

The embedding weights should be the same for a given word in every sentence; it's only the recurrent layer that is context-dependent.

You can also directly plot the embedding layer weights.
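
For instance (again just a sketch, with `model`, `top_words`, and `top_indices` standing in for your own objects):

import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Row i of the embedding weight matrix is the embedding of the word with id i.
weights = model.layers[0].get_weights()[0]   # shape: (vocabulary_size, embedding_dim)

# Keep only the rows of the 200 most frequent words, project to 2D, and label.
coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(weights[top_indices])
plt.figure(figsize=(10, 10))
plt.scatter(coords[:, 0], coords[:, 1], s=5)
for word, (x, y) in zip(top_words, coords):
    plt.annotate(word, (x, y), fontsize=8)
plt.show()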