How to solve the "logits and labels must have the same first dimension" error

Question

I'm trying out different neural network architectures for word-based NLP.

So far I've used bidirectional models, embedding models, and models with GRUs, guided by this tutorial: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571 and it all worked out well. When I tried using LSTMs, however, I got an error saying:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

How can I solve this?

My source and target datasets consist of 7200 sample sentences. They are integer-tokenized and embedded. The source dataset is post-padded to match the length of the target dataset.

Here is my model and the relevant code:

# Imports are assumed to come from tf.keras; the original post does not show them.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

lstm_model = Sequential()
lstm_model.add(Embedding(src_vocab_size, 128, input_length=X.shape[1], input_shape=X.shape[1:]))
lstm_model.add(LSTM(128, return_sequences=False, dropout=0.1, recurrent_dropout=0.1))
lstm_model.add(Dense(128, activation='relu'))
lstm_model.add(Dropout(0.5))
lstm_model.add(Dense(target_vocab_size, activation='softmax'))

lstm_model.compile(optimizer=Adam(0.002), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# The error is raised at this line.
history = lstm_model.fit(X, Y, batch_size=32, callbacks=CALLBACK, epochs=100, validation_split=0.25)

With the shapes:

X.shape = (7200, 147)
Y.shape = (7200, 147, 1)
src_vocab_size = 188
target_vocab_size = 186

I've already looked at similar questions on here and tried adding a Reshape layer

simple_lstm_model.add(Reshape((-1,)))

but this only causes the following error:

"TypeError: __int__ returned non-int (type NoneType)"

It's really weird, because I preprocess the dataset the same way for all models and every model works fine except the one above.

Answer 1

You should pass return_sequences=True and return_state=False (the default) when calling the LSTM constructor.
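As a rough sketch of that change, the model from the question could look like the following. The layer sizes and shapes are taken from the question; the `tensorflow.keras` imports, the `keras.Input` layer (used here instead of `input_length`), and the random stand-in data are assumptions made only so the snippet runs on its own.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Sizes taken from the question.
seq_len, src_vocab_size, target_vocab_size = 147, 188, 186

model = keras.Sequential([
    keras.Input(shape=(seq_len,)),
    layers.Embedding(src_vocab_size, 128),
    # return_sequences=True keeps one output per time step: (batch, 147, 128).
    layers.LSTM(128, return_sequences=True, dropout=0.1, recurrent_dropout=0.1),
    # Dense layers are applied to every time step independently.
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(target_vocab_size, activation="softmax"),
])
model.compile(optimizer=keras.optimizers.Adam(0.002),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data (assumption): same shapes as in the question, fewer samples.
X = np.random.randint(0, src_vocab_size, size=(32, seq_len))
Y = np.random.randint(0, target_vocab_size, size=(32, seq_len, 1))

output_shape = model.output_shape  # sequence dimension is preserved
history = model.fit(X, Y, batch_size=32, epochs=1, verbose=0)
print(output_shape)
```

With the sequence dimension kept, the model emits one class distribution per token, which matches the per-token labels in Y.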

In your snippet, the LSTM returns only its last hidden state instead of the sequence of hidden states, one per input embedding. In theory, you could have spotted it from the error message:

logits and labels must have the same first dimension, got logits shape [32,186] and labels shape [4704]

The logits should be three-dimensional: batch size × sequence length × number of classes. The sequence length is 147, and indeed 32 × 147 = 4704 (the number of labels in a batch). This tells you that the sequence-length dimension has disappeared from the logits.
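The arithmetic can be checked with a few lines of plain Python. This is only a simplified model of what the loss does, on the assumption that the sparse cross-entropy op flattens logits to (-1, num_classes) and labels to (-1,) before comparing their first dimensions; the helper name `flattened_first_dims` is made up for this illustration.

```python
import math

# Values from the question.
batch_size, seq_len, num_classes = 32, 147, 186

def flattened_first_dims(logits_shape, labels_shape):
    """Return the first dimensions after flattening:
    logits -> (-1, num_classes), labels -> (-1,)."""
    n_logits = math.prod(logits_shape[:-1])
    n_labels = math.prod(labels_shape)
    return n_logits, n_labels

labels_shape = (batch_size, seq_len, 1)

# return_sequences=False: the LSTM drops the time axis, so logits are (32, 186).
broken = flattened_first_dims((batch_size, num_classes), labels_shape)

# return_sequences=True: logits are (32, 147, 186).
fixed = flattened_first_dims((batch_size, seq_len, num_classes), labels_shape)

print(broken)  # (32, 4704) -> mismatch, as in the error message
print(fixed)   # (4704, 4704) -> first dimensions agree
```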

Answer 1: accepted, 2019-10-25 10:31:41
