
Keras - Wrong input shape in LSTM dense layer

I am trying to build an LSTM text classifier using Keras.

This is the model structure:

model_word2vec = Sequential()
model_word2vec.add(Embedding(input_dim=vocabulary_dimension,
                             output_dim=embedding_dim,
                             weights=[word2vec_weights],
                             input_length=longest_sentence,
                             mask_zero=True,
                             trainable=False))
model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25, return_sequences=True))
model_word2vec.add(Dense(3, activation='softmax'))
model_word2vec.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])


results = model_word2vec.fit(X_tr_word2vec, y_tr_word2vec, validation_split=0.16, epochs=3, batch_size=128, verbose=0)

Where y_tr_word2vec is the one-hot encoded target variable with 3 classes.
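Note that a one-hot target with 3 classes is a 2-D array of shape (samples, 3), not 3-dimensional. A minimal sketch with hypothetical labels (equivalent to what keras.utils.to_categorical would produce):

```python
import numpy as np

# Hypothetical integer class labels for a 3-class problem
labels = np.array([0, 2, 1, 2])

# One-hot encode: each label becomes a row of length 3
y_one_hot = np.eye(3)[labels]

print(y_one_hot.shape)  # (4, 3) — two dimensions: (samples, classes)
```

So y_tr_word2vec with shape (15663, 3) is exactly this 2-D layout.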

When I run the code above, I get this error:

ValueError: Error when checking model target: expected dense_2 to have 3 dimensions, but got array with shape (15663, 3)

I suppose that the issue could be about y_tr_word2vec shape or the batch size dimension, but I'm not sure.

Update:

I have changed to return_sequences=False, converted y_tr_word2vec from one-hot to integer class labels, used 1 neuron in the dense layer, and switched from categorical_crossentropy to sparse_categorical_crossentropy.

Now I get this error: ValueError: invalid literal for int() with base 10: 'countess'.

Therefore I now suspect that, during fit(), something goes wrong with the input X_tr_word2vec, which contains the sentences.
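That error usually means the raw word strings were passed to fit() instead of integer indices. An Embedding layer expects each sentence as a sequence of vocabulary indices. A minimal sketch, with a hypothetical vocabulary dict (in practice this mapping comes from your tokenizer):

```python
# Hypothetical word-to-index vocabulary; 0 is reserved for padding
word_index = {"the": 1, "countess": 2, "smiled": 3}

sentence = ["the", "countess", "smiled"]

# Convert words to integer indices; unknown words map to 0
encoded = [word_index.get(w, 0) for w in sentence]

print(encoded)  # [1, 2, 3]
```

Feeding the raw string "countess" where Keras expects an integer index is what triggers the int() conversion error above.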

The problem is this code:

model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25, return_sequences=True))
model_word2vec.add(Dense(3, activation='softmax'))

You have set return_sequences=True, which means the LSTM returns a 3-D array (one output per timestep) to the dense layer, whereas the dense layer here expects 2-D data. So delete return_sequences=True:

model_word2vec.add(LSTM(units=embedding_dim, dropout=0.25, recurrent_dropout=0.25))
model_word2vec.add(Dense(3, activation='softmax'))
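The shape mismatch in the original error can be illustrated without running Keras. A sketch with numpy arrays standing in for the tensors, using the sample and class counts from the question (the timestep count is a hypothetical placeholder):

```python
import numpy as np

n_samples, timesteps, n_classes = 15663, 50, 3  # timesteps is hypothetical

# With return_sequences=True, the LSTM emits one vector per timestep,
# so the Dense(3) output is 3-D:
out_3d = np.zeros((n_samples, timesteps, n_classes))

# The one-hot targets, however, are 2-D:
y = np.zeros((n_samples, n_classes))

print(out_3d.ndim, y.ndim)  # 3 2 — the dimension mismatch in the error
```

With return_sequences=False the LSTM returns only its last output, shape (n_samples, units), and the Dense output becomes (n_samples, 3), matching the target.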

Why did you set return_sequences=True?
