I have a long text with 1,514,669 terms (26,791 unique words). I've created a dictionary with the unique words as keys and word indices as values:
{'neighbors': 0,
'prowlings': 1,
'trapped': 2,
'succeed': 3,
'shrank': 4,
'napkin': 5,
'verdict': 6,
'hosted': 7,
'lists': 8,
'meat': 9,
'ation': 10,
'captor': 11,
'corking': 12,
'keys': 13,
'Sardinian': 14,
'include': 15,
'Tradable': 16,
'princes': 17,
'witnessed': 18,
'rant': 19,
...}
I created an input array with shape (1514669, 32) this way:
rnn_inputs = [word_to_index_dict[each] for each in ebooks_texts.split(' ') if each != '']
rnn_targets = rnn_inputs[1:] + [rnn_inputs[0]]
rnn_inputs = [rnn_inputs[i:i+32] for i in range(len(rnn_inputs)) if len(rnn_inputs[i:i+32]) == 32]
rnn_targets = [rnn_targets[i:i+32] for i in range(len(rnn_targets)) if len(rnn_targets[i:i+32]) == 32]
rnn_inputs = np.array(rnn_inputs)
rnn_targets = np.array(rnn_targets)
So each array row contains 32 words: the first row holds words 0–31, the second row words 1–32, and so on.
The goal is to predict the next word.
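For context, the windowing logic above can be checked on a toy index sequence (the short sequence and window of 3 below are illustrative stand-ins for the real corpus and window of 32):

```python
import numpy as np

# Toy check of the sliding-window construction, using a short
# index sequence instead of the real 1.5M-term corpus.
seq = [3, 1, 4, 1, 5, 9, 2, 6]
window = 3  # stands in for the real window of 32

# Next-word targets, wrapped around to the start like in the question.
targets = seq[1:] + [seq[0]]

# Keep only the windows that are full length.
inputs = [seq[i:i + window] for i in range(len(seq)) if len(seq[i:i + window]) == window]
targets = [targets[i:i + window] for i in range(len(targets)) if len(targets[i:i + window]) == window]

inputs = np.array(inputs)
targets = np.array(targets)

print(inputs.shape)       # (6, 3): one row per window position
print(inputs[0])          # [3 1 4] -> the first three indices
print(targets[0])         # [1 4 1] -> the same window shifted by one
```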
The model architecture is:
model = Sequential()
model.add(Embedding(len(word_to_index_dict), 128, input_length=32))
model.add(LSTM(units=128, return_sequences=True))
model.add(Dense(len(word_to_index_dict), activation='softmax'))
model.summary()
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics = ['accuracy'])
checkpointer = ModelCheckpoint(filepath='models/best-weights.hdf5', verbose=1, save_best_only=True)
model.fit(rnn_inputs, rnn_targets, batch_size=1, epochs=1, validation_split=.2, callbacks=[checkpointer], verbose=1)
I'm getting the following summary and error:
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 32, 128) 3429248
_________________________________________________________________
lstm_1 (LSTM) (None, 32, 128) 131584
_________________________________________________________________
dense_1 (Dense) (None, 32, 26791) 3456039
=================================================================
Total params: 7,016,871
Trainable params: 7,016,871
Non-trainable params: 0
_________________________________________________________________
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-63ea81786e79> in <module>
117 checkpointer = ModelCheckpoint(filepath='models/best-weights.hdf5', verbose=1, save_best_only=True)
118
--> 119 model.fit(rnn_inputs, rnn_targets, batch_size=1, epochs=1, validation_split=.2, callbacks=[checkpointer], verbose=1)
120
~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
950 sample_weight=sample_weight,
951 class_weight=class_weight,
--> 952 batch_size=batch_size)
953 # Prepare validation data.
954 do_validation = False
~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
787 feed_output_shapes,
788 check_batch_axis=False, # Don't enforce the batch size.
--> 789 exception_prefix='target')
790
791 # Generate sample-wise weight values given the `sample_weight` and
~/miniconda3/envs/tf-cpu/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
126 ': expected ' + names[i] + ' to have ' +
127 str(len(shape)) + ' dimensions, but got array '
--> 128 'with shape ' + str(data_shape))
129 if not check_batch_axis:
130 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (1514669, 32)
I'm looking on Google and Docs and just can't find a solution for my error. Any ideas about what I'm doing wrong?
I'm using python 3.6 and Ubuntu 18.
It looks like you didn't one-hot encode your targets. Your targets currently have shape (1514669, 32), but they should have shape (1514669, 32, vocab_size), where each of the 32 words per phrase is one-hot encoded, in order to be compatible with your output layer.

Alternatively, you can compile the model with sparse_categorical_crossentropy as the loss instead of categorical_crossentropy. In that case your targets should have shape (1514669, 32, 1) and don't need to be one-hot encoded.
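A minimal sketch of the sparse-loss route (the random stand-in targets below are illustrative; a full one-hot array of shape (1514669, 32, 26791) would be far too large to hold in memory, so keeping integer targets is usually the practical choice here):

```python
import numpy as np

# Stand-in integer targets with the same layout as in the question:
# (num_sequences, 32), values in [0, vocab_size).
vocab_size = 26791
rnn_targets = np.random.randint(0, vocab_size, size=(1000, 32))

# Add a trailing axis so the shape becomes (num_sequences, 32, 1),
# which is what Keras expects for a sparse loss on a sequence output.
rnn_targets = np.expand_dims(rnn_targets, axis=-1)
print(rnn_targets.shape)  # (1000, 32, 1)

# Then compile with the sparse loss instead of categorical_crossentropy:
# model.compile(optimizer='Adam',
#               loss='sparse_categorical_crossentropy',
#               metrics=['accuracy'])
```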