MP2 PART 1 generative model

by Lucas Zweili -
Number of replies: 4

Hi!

I have difficulty understanding the exact nature of the input and output of the model: X[:-1,:-1] and T[:-1,1:] respectively.

What do X and T represent exactly? Should T be a list of words? And should X be a list of vectors of the words preceding T, or just the single preceding word?

Then why the slices X[:-1,:-1] and T[:-1,1:]? And why cut off the last element of the dataset?

Some code is given (which is great, but...) and I tried to adapt mine to it; however, it is not well explained how things should be set up.


Thank you for your reply.

In reply to Lucas Zweili

Re: MP2 PART 1 generative model

by Florian François Colombo -

Hi Lucas,

As stated in the template, part of the project is for you to understand what you are doing. If you do, you can adapt/modify/change the template we gave you accordingly. To build that understanding, have a look at the Keras documentation and its examples on sequence learning.

Some related discussion on this forum can also help you:

https://moodlearchive.epfl.ch/2018-2019/mod/forum/discuss.php?d=17840

https://moodlearchive.epfl.ch/2018-2019/mod/forum/discuss.php?d=17853

https://moodlearchive.epfl.ch/2018-2019/mod/forum/discuss.php?d=17554

X is the input sequence of your model. In the first part, it should be all the filtered sentences encoded as sequences of integer tokens of controlled length (with START and END boundary tokens and truncation/padding).

T is the target sequence of your model. It should be the one-hot encoding of X shifted by one timestep (the target at word[n] is word[n+1]), hence the [:-1] and [1:].
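Concretely, something along these lines (a minimal sketch, not the template itself; `sequences`, `maxlen`, and `vocab_size` are illustrative names, and it assumes the sentences are already padded/truncated and wrapped in the boundary tokens):

    import numpy as np
    from keras.utils import to_categorical

    # `sequences`: integer-encoded sentences, shape (n_sentences, maxlen),
    # already padded/truncated and wrapped in START/END tokens
    X = sequences[:, :-1]                        # words 0 .. maxlen-2
    T = to_categorical(sequences[:, 1:],         # words 1 .. maxlen-1, shifted by one
                       num_classes=vocab_size)   # one-hot over the vocabulary

    # X: (n_sentences, maxlen-1)             -- input tokens
    # T: (n_sentences, maxlen-1, vocab_size) -- one-hot targets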

P.S.: Please sign up for the fraud detection interview before Sunday.


Hoping that this helps.

Kind regards,

Florian

In reply to Florian François Colombo

Re: MP2 PART 1 generative model

by Lucas Zweili -

Thank you for your reply.

I thought I understood, but I get an error that I cannot solve no matter what I try, and I found no solution on the Internet.

As suggested:

My output (T) is 3D: (#sentences, maxlen-1, vocabulary_size).

And my input (X) is 2D: (#sentences, maxlen-1); even reshaping it to (#sentences, maxlen-1, 1) doesn't work.

But I don't know how to make the model understand that it should output in 3D.


Here is the error output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 24)                0         
_________________________________________________________________
embedding (Embedding)        (None, 24, 128)           380032    
_________________________________________________________________
simple_rnn_20 (SimpleRNN)    (None, 64)                12352     
_________________________________________________________________
dense_16 (Dense)             (None, 50)                3250      
_________________________________________________________________
output (Dense)               (None, 24)                1224      
=================================================================
Total params: 396,858
Trainable params: 396,858
Non-trainable params: 0
_________________________________________________________________
(88100, 24) (88100, 24, 2969)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-9ae57f00edff> in <module>
     22                                     epochs=epochs,
     23                                     validation_split=validation_split,
---> 24                                     batch_size=batch_size).history
     25 
     26 #save

~\Anaconda3\envs\gpu\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    950             sample_weight=sample_weight,
    951             class_weight=class_weight,
--> 952             batch_size=batch_size)
    953         # Prepare validation data.
    954         do_validation = False

~\Anaconda3\envs\gpu\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    787                 feed_output_shapes,
    788                 check_batch_axis=False,  # Don't enforce the batch size.
--> 789                 exception_prefix='target')
    790 
    791             # Generate sample-wise weight values given the `sample_weight` and

~\Anaconda3\envs\gpu\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    126                         ': expected ' + names[i] + ' to have ' +
    127                         str(len(shape)) + ' dimensions, but got array '
--> 128                         'with shape ' + str(data_shape))
    129                 if not check_batch_axis:
    130                     data_shape = data_shape[1:]

ValueError: Error when checking target: expected output to have 2 dimensions, but got array with shape (88099, 24, 2969)
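(Reading the error together with the summary: simple_rnn_20 returns only its last hidden state, shape (None, 64), so everything after it is 2D, while the target is 3D. Below is a minimal sketch of a model whose output does match a 3D target; seq_len = 24 and vocab_size = 2969 are taken from the shapes printed above, and the other layer sizes are copied from the summary, not confirmed as the intended architecture.)

    from keras.layers import Input, Embedding, SimpleRNN, Dense, TimeDistributed
    from keras.models import Model

    seq_len = 24       # maxlen - 1, from the shapes above
    vocab_size = 2969  # vocabulary size, from the shapes above

    inp = Input(shape=(seq_len,), name='input')
    h = Embedding(vocab_size, 128, name='embedding')(inp)
    # return_sequences=True keeps one hidden state per timestep,
    # so the tensor stays 3D instead of collapsing to the last state
    h = SimpleRNN(64, return_sequences=True)(h)
    h = Dense(50, activation='relu')(h)
    # one softmax over the vocabulary at every timestep
    out = TimeDistributed(Dense(vocab_size, activation='softmax'),
                          name='output')(h)

    model = Model(inp, out)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    # model.summary() should now end with output shape (None, 24, 2969)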


In reply to Florian François Colombo

Re: MP2 PART 1 generative model

by Lukas Gelbmann -

A tip for anyone who finds that the one-hot encoded data T is taking too much memory: it is better to define T = X[:,:,np.newaxis] and then use the sparse_categorical_crossentropy loss instead of categorical_crossentropy. This doesn't affect the results, but saves a ton of memory.