
Keras, Embedding and LSTMs: getting a wrong shape error

I have a long text with 1,514,669 terms (26,791 unique words). I've created a dictionary with the unique words as keys and word indices as values:

 {'neighbors': 0,
 'prowlings': 1,
 'trapped': 2,
 'succeed': 3,
 'shrank': 4,
 'napkin': 5,
 'verdict': 6,
 'hosted': 7,
 'lists': 8,
 'meat': 9,
 'ation': 10,
 'captor': 11,
 'corking': 12,
 'keys': 13,
 'Sardinian': 14,
 'include': 15,
 'Tradable': 16,
 'princes': 17,
 'witnessed': 18,
 'rant': 19,
 ...}
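
For reference, a minimal sketch of how such a dictionary could be built (assuming ebooks_texts holds the raw text, as in the snippet below; the actual construction code isn't shown here):

# Hypothetical sketch: map every unique whitespace-separated token to an integer index.
unique_words = set(word for word in ebooks_texts.split(' ') if word != '')
word_to_index_dict = {word: index for index, word in enumerate(unique_words)}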

I created an input array with shape (1514669, 32) like this:

# Convert the text to a flat sequence of word indices.
rnn_inputs = [word_to_index_dict[each] for each in ebooks_texts.split(' ') if each != '']
# Targets are the inputs shifted by one word (wrapping the first word around to the end).
rnn_targets = rnn_inputs[1:] + [rnn_inputs[0]]

# Slice both sequences into overlapping windows of 32 words.
rnn_inputs = [rnn_inputs[i:i+32] for i in range(len(rnn_inputs)) if len(rnn_inputs[i:i+32]) == 32]
rnn_targets = [rnn_targets[i:i+32] for i in range(len(rnn_targets)) if len(rnn_targets[i:i+32]) == 32]

rnn_inputs = np.array(rnn_inputs)
rnn_targets = np.array(rnn_targets)

So each array row contains 32 words: the first row covers words 0-31, the second row words 1-32, and so on.

The point is to obtain a prediction for the next word.
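
To make the windowing concrete, here is a small illustration with a toy sequence and window size 3 instead of 32 (purely for clarity; it is not part of the actual code):

seq = [0, 1, 2, 3, 4]           # word indices
tgt = seq[1:] + [seq[0]]        # next-word targets, wrapped around: [1, 2, 3, 4, 0]

inputs  = [seq[i:i+3] for i in range(len(seq)) if len(seq[i:i+3]) == 3]
targets = [tgt[i:i+3] for i in range(len(tgt)) if len(tgt[i:i+3]) == 3]
# inputs  == [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
# targets == [[1, 2, 3], [2, 3, 4], [3, 4, 0]]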

The model architecture is:

model = Sequential()

model.add(Embedding(len(word_to_index_dict), 128, input_length=32))
model.add(LSTM(units=128, return_sequences=True))
model.add(Dense(len(word_to_index_dict), activation='softmax'))

model.summary()
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
checkpointer = ModelCheckpoint(filepath='models/best-weights.hdf5', verbose=1, save_best_only=True)
model.fit(rnn_inputs, rnn_targets, batch_size=1, epochs=1, validation_split=.2, callbacks=[checkpointer], verbose=1)

I'm getting the following summary and error:

Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 32, 128)           3429248   
_________________________________________________________________
lstm_1 (LSTM)                (None, 32, 128)           131584    
_________________________________________________________________
dense_1 (Dense)              (None, 32, 26791)         3456039   
=================================================================
Total params: 7,016,871
Trainable params: 7,016,871
Non-trainable params: 0
_________________________________________________________________
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-63ea81786e79> in <module>
    117 checkpointer = ModelCheckpoint(filepath='models/best-weights.hdf5', verbose=1, save_best_only=True)
    118 
--> 119 model.fit(rnn_inputs, rnn_targets, batch_size=1, epochs=1, validation_split=.2, callbacks=[checkpointer], verbose=1)
    120 

~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    950             sample_weight=sample_weight,
    951             class_weight=class_weight,
--> 952             batch_size=batch_size)
    953         # Prepare validation data.
    954         do_validation = False

~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    787                 feed_output_shapes,
    788                 check_batch_axis=False,  # Don't enforce the batch size.
--> 789                 exception_prefix='target')
    790 
    791             # Generate sample-wise weight values given the `sample_weight` and

~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    126                         ': expected ' + names[i] + ' to have ' +
    127                         str(len(shape)) + ' dimensions, but got array '
--> 128                         'with shape ' + str(data_shape))
    129                 if not check_batch_axis:
    130                     data_shape = data_shape[1:]

ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (1514669, 32)

I've searched Google and the docs and just can't find a solution for this error. Any ideas about what I'm doing wrong?

I'm using Python 3.6 and Ubuntu 18.

It looks like you didn't one-hot encode your targets. Your targets currently have shape (1514669, 32), but they should have shape (1514669, 32, vocab_size), with each of the 32 words per window one-hot encoded, in order to be compatible with your output layer.
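
A minimal sketch of that first option, assuming rnn_targets is the (1514669, 32) integer array from the question (note that a fully materialized one-hot array for a 26,791-word vocabulary is very large, so in practice you would likely encode batch by batch, e.g. in a generator):

from keras.utils import to_categorical

# One-hot encode each target index: (num_windows, 32) -> (num_windows, 32, vocab_size).
rnn_targets_onehot = to_categorical(rnn_targets, num_classes=len(word_to_index_dict))

model.fit(rnn_inputs, rnn_targets_onehot, batch_size=1, epochs=1,
          validation_split=.2, callbacks=[checkpointer], verbose=1)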

Alternatively, you can compile the model with sparse_categorical_crossentropy as the loss instead of categorical_crossentropy. In that case your targets should have shape (1514669, 32, 1) and don't need to be one-hot encoded.
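
A sketch of the sparse alternative (same assumptions about rnn_inputs and rnn_targets as above):

import numpy as np

# Keep integer targets, just add a trailing axis: (num_windows, 32) -> (num_windows, 32, 1).
rnn_targets_sparse = np.expand_dims(rnn_targets, axis=-1)

model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(rnn_inputs, rnn_targets_sparse, batch_size=1, epochs=1,
          validation_split=.2, callbacks=[checkpointer], verbose=1)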
