TensorFlow ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))

Question

I'm new to machine learning and thought I'd start with Keras. Here I'm classifying movie reviews into three classes (positive as 1, neutral as 0 and negative as -1) using binary cross-entropy. When I try to wrap my Keras model with a TensorFlow estimator, I get an error.
The code is as follows:

import tensorflow as tf
import numpy as np
import pandas as pd
import numpy as K

csvfilename_train = 'train(cleaned).csv'
csvfilename_test = 'test(cleaned).csv'

# Read .csv files as pandas dataframes
df_train = pd.read_csv(csvfilename_train)
df_test = pd.read_csv(csvfilename_test)

train_sentences  = df_train['Comment'].values
test_sentences  = df_test['Comment'].values

# Extract labels from dataframes
train_labels = df_train['Sentiment'].values
test_labels = df_test['Sentiment'].values

vocab_size = 10000
embedding_dim = 16
max_length = 30
trunc_type = 'post'
oov_tok = '<OOV>'

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer(num_words = vocab_size, oov_token = oov_tok)
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences, maxlen = max_length, truncating = trunc_type)

test_sequences = tokenizer.texts_to_sequences(test_sentences)
test_padded = pad_sequences(test_sequences, maxlen = max_length)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length = max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation = 'relu'),
    tf.keras.layers.Dense(2, activation = 'sigmoid'),
])
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

num_epochs = 10
model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))

And the error is as follows:

---> 10 model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))

And finally this:

ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))

Answer 1

There are several issues with your code. The shape error itself arises because the final Dense layer emits 2 values per sample while each label is a single value, and binary cross-entropy requires both to have the same shape.

You are using the wrong loss function. Binary cross-entropy is meant for binary classification problems, but here you are doing multi-class classification (3 classes: positive, negative, neutral).
Using the sigmoid activation function in the last layer is wrong because sigmoid maps logit values to a range between 0 and 1, whereas your class labels are 0, 1 and -1. The network can therefore never output a negative value, and hence will never learn to predict the negative class.
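To make the point about sigmoid concrete, here is a minimal NumPy sketch. The `sigmoid` helper and the sample logits are made up for illustration and are not part of the question's code.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: squashes any real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits, ranging from strongly negative to strongly positive
logits = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
outputs = sigmoid(logits)

print(outputs)
# Every output lies strictly between 0 and 1, so a target of -1 is unreachable
print(outputs.min() > 0.0, outputs.max() < 1.0)
```

However negative the logit gets, the output only approaches 0 and never goes below it.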

The right approach is to treat this as a multi-class classification problem and use the categorical cross-entropy loss together with a softmax activation in your last Dense layer, which should have 3 units (one for each class). Note that the categorical cross-entropy loss requires one-hot encoded labels, whereas integer labels can be used with the sparse categorical cross-entropy loss.
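As a rough illustration of the one-hot versus integer label distinction, the sketch below computes both losses by hand in NumPy and checks that they agree. The helper functions and sample values are assumptions written for this example; they mimic the idea behind the Keras losses but are not the Keras implementations.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(one_hot_labels, probs):
    """Per-sample loss when labels are one-hot vectors."""
    return -(one_hot_labels * np.log(probs)).sum(axis=1)

def sparse_categorical_crossentropy(int_labels, probs):
    """Per-sample loss when labels are integer class indices."""
    return -np.log(probs[np.arange(len(int_labels)), int_labels])

# Hypothetical logits for 2 samples and 3 classes
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
probs = softmax(logits)

int_labels = np.array([0, 2])           # integer class indices
one_hot_labels = np.eye(3)[int_labels]  # the same labels, one-hot encoded

loss_one_hot = categorical_crossentropy(one_hot_labels, probs)
loss_sparse = sparse_categorical_crossentropy(int_labels, probs)
print(np.allclose(loss_one_hot, loss_sparse))
```

The two losses measure the same thing; they differ only in the label format they expect.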

Below is an example using categorical cross-entropy loss.

tf.keras.layers.Dense(3, activation = 'softmax')

Note the 3 changes:

The loss function is changed to categorical cross-entropy.
The number of units in the final Dense layer is 3.
One-hot encoding of the labels is required and can be done using tf.one_hot. Because tf.one_hot expects class indices 0 to 2 and encodes an index of -1 as an all-zero row, shift the labels from -1, 0, 1 to 0, 1, 2 first:
tf.one_hot(train_labels + 1, 3)
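Putting the three changes together, a minimal end-to-end sketch might look like the following. It is not the original poster's pipeline: the random `padded` and `train_labels` arrays are synthetic stand-ins for the tokenized CSV data, `num_classes` and `predicted_sentiment` are names introduced here, and `tf.keras.Input` is used in place of the `input_length` argument on the assumption that this is the safer choice across Keras versions.

```python
import numpy as np
import tensorflow as tf

vocab_size = 10000
embedding_dim = 16
max_length = 30
num_classes = 3  # negative, neutral, positive

# Synthetic stand-ins (assumption) for `padded` and `train_labels` in the question
rng = np.random.default_rng(0)
padded = rng.integers(0, vocab_size, size=(32, max_length))
train_labels = rng.integers(-1, 2, size=32)  # values in {-1, 0, 1}

# Shift labels to {0, 1, 2}; tf.one_hot would encode an index of -1 as all zeros
shifted_labels = train_labels + 1
one_hot_labels = tf.one_hot(shifted_labels, depth=num_classes).numpy()

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    # One unit per class; softmax turns the outputs into class probabilities
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Labels now have shape (None, 3), matching the model output
model.fit(padded, one_hot_labels, epochs=1, verbose=0)

# Map predicted class indices {0, 1, 2} back to sentiment values {-1, 0, 1}
probs = model.predict(padded, verbose=0)
predicted_sentiment = probs.argmax(axis=1) - 1
```

If you would rather keep integer labels, passing `shifted_labels` directly with `loss='sparse_categorical_crossentropy'` should work without the one-hot step.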

Answer 1 (8, accepted, 2020-08-13 00:43:20)