
How to solve “logits and labels must have the same first dimension” error

I'm trying out different neural network architectures for a word-based NLP task.

So far I've used bidirectional, embedding, and GRU-based models, guided by this tutorial: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571 and it all worked out well. When I tried using an LSTM, however, I get an error saying:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

How can I solve this?

My source and target datasets consist of 7200 sample sentences. They are integer-tokenized and embedded. The source dataset is post-padded to match the length of the target dataset.

Here is my model and the relevant code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1], input_shape=X.shape[1:]))
lstm_model.add(LSTM(128, return_sequences=False, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))

lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = lstm_model.fit(X, Y, batch_size=32, callbacks=CALLBACK, epochs=100, validation_split=0.25)  # At this line the error is raised!

With the shapes:

  • X.shape = (7200, 147)
  • Y.shape = (7200, 147, 1)
  • src_vocab_size = 188
  • target_vocab_size = 186

I've already looked at similar questions on here and tried adding a Reshape layer

simple_lstm_model.add(Reshape((-1,)))

but this only causes the following error:

"TypeError: __int__ returned non-int (type NoneType)"

It's really strange, since I preprocess the dataset the same way for all models, and it works just fine for all of them except the one above.

You should pass return_sequences=True (and keep return_state=False) when calling the LSTM constructor.

In your snippet, the LSTM returns only its last output, instead of the sequence of outputs, one per input embedding. In theory, you could have spotted this from the error message:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

The logits should be three-dimensional: batch size × sequence length × number of classes. Your sequence length is 147, and indeed 32 × 147 = 4704, the number of your labels. This tells you that the sequence-length dimension disappeared from the logits.
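To make the shape bookkeeping concrete, here is a minimal sketch (plain NumPy, using the sizes from the question) of what the loss function sees in each case:

import numpy as np

batch_size, seq_len, n_classes = 32, 147, 186

# return_sequences=False: the Dense head only sees the LSTM's last
# output, so the logits cover a single timestep per sample:
logits_last = np.zeros((batch_size, n_classes))          # (32, 186)

# Y has shape (batch, seq_len, 1); sparse categorical cross-entropy
# flattens it to one integer label per timestep:
labels = np.zeros((batch_size, seq_len, 1)).reshape(-1)  # (4704,)

# 32 logit rows cannot be matched against 32 * 147 = 4704 labels:
assert labels.shape[0] == batch_size * seq_len == 4704

# return_sequences=True: the Dense head is applied at every timestep,
# so logits and labels now agree on the leading dimensions:
logits_seq = np.zeros((batch_size, seq_len, n_classes))  # (32, 147, 186)
assert logits_seq.shape[:2] == (batch_size, seq_len)

Note that with return_sequences=True no Reshape layer is needed: Keras applies a Dense layer to the last axis of a 3D input, producing per-timestep logits automatically.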

