TensorFlow ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
I'm new to machine learning and thought I'd start with Keras. Here I'm classifying movie reviews into three classes (positive as 1, neutral as 0 and negative as -1) using binary cross-entropy. When I try to wrap my Keras model with a TensorFlow estimator, I get the error below.
The code is as follows:
import tensorflow as tf
import numpy as np
import pandas as pd
import numpy as K
csvfilename_train = 'train(cleaned).csv'
csvfilename_test = 'test(cleaned).csv'
# Read .csv files as pandas dataframes
df_train = pd.read_csv(csvfilename_train)
df_test = pd.read_csv(csvfilename_test)
train_sentences = df_train['Comment'].values
test_sentences = df_test['Comment'].values
# Extract labels from dataframes
train_labels = df_train['Sentiment'].values
test_labels = df_test['Sentiment'].values
vocab_size = 10000
embedding_dim = 16
max_length = 30
trunc_type = 'post'
oov_tok = '<OOV>'
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
tokenizer = Tokenizer(num_words = vocab_size, oov_token = oov_tok)
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences, maxlen = max_length, truncating = trunc_type)
test_sequences = tokenizer.texts_to_sequences(test_sentences)
test_padded = pad_sequences(test_sequences, maxlen = max_length)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length = max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation = 'relu'),
    tf.keras.layers.Dense(2, activation = 'sigmoid'),
])
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
num_epochs = 10
model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))
And the error is as follows:
---> 10 model.fit(padded, train_labels, epochs = num_epochs, validation_data = (test_padded, test_labels))
And finally this:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
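There are several issues with your code. Binary cross-entropy with a sigmoid output is meant for two-class problems, but you have three classes. In addition, your final Dense layer has 2 units, so the model outputs shape (None, 2) while each label is a single value with shape (None, 1), which is exactly the mismatch reported in the error.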
The right approach is to treat this as a multi-class classification problem: use the categorical cross-entropy loss together with a softmax activation in the last Dense layer, which should have 3 units (one for each class). Note that the categorical cross-entropy loss requires one-hot encoded labels, whereas the sparse categorical cross-entropy loss accepts integer labels (class indices starting at 0).
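To make the relationship between the two losses concrete, here is a small NumPy sketch. The probabilities and labels are made-up values for illustration, not output from the model in the question. It computes both losses by hand and shows that they agree when the one-hot rows and the integer indices describe the same classes.

```python
import numpy as np

# Hypothetical softmax outputs for 3 reviews over 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

int_labels = np.array([0, 1, 2])   # integer class indices
one_hot = np.eye(3)[int_labels]    # the same labels as one-hot rows, shape (3, 3)

# Categorical cross-entropy: sum over classes of -y * log(p).
cce = -(one_hot * np.log(probs)).sum(axis=1)

# Sparse categorical cross-entropy: take -log(p) of the true class directly.
scce = -np.log(probs[np.arange(len(int_labels)), int_labels])

print(np.allclose(cce, scce))  # True
```

The two variants differ only in the label format they expect, so the choice mostly comes down to which format your labels are already in.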
Below is an example using categorical cross-entropy loss.
tf.keras.layers.Dense(3, activation = 'softmax')
Note the 3 changes:
Loss function changed to categorical cross-entropy
No. of units in final Dense layer is 3
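One-hot encoding of labels is required and can be done using tf.one_hot. Because your labels are -1, 0 and 1, shift them to 0, 1 and 2 first; tf.one_hot encodes a negative index as an all-zero row, so the negative class would otherwise have no target.
tf.one_hot(train_labels + 1, 3)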
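Putting the pieces together, a minimal end-to-end sketch might look like the following. It uses randomly generated stand-ins for `padded` and `train_labels` rather than the CSV files from the question, and the names `one_hot_labels` and `predicted_sentiment` are introduced here for illustration only. The labels are shifted by 1 before encoding because `tf.one_hot` turns a negative index into an all-zero row. An explicit `tf.keras.Input` layer is used in place of the `input_length` argument, which is an assumption made for compatibility across Keras versions.

```python
import numpy as np
import tensorflow as tf

vocab_size, embedding_dim, max_length = 10000, 16, 30

# Synthetic stand-ins for the padded sequences and the -1/0/1 sentiment labels.
rng = np.random.default_rng(0)
padded = rng.integers(0, vocab_size, size=(64, max_length))
train_labels = rng.integers(-1, 2, size=64)   # values in {-1, 0, 1}

# Shift labels to 0/1/2, then one-hot encode them: shape (64, 3).
one_hot_labels = tf.one_hot(train_labels + 1, depth=3).numpy()

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),  # one unit per class
])
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(padded, one_hot_labels, epochs=1, verbose=0)

# Predictions are class probabilities; argmax gives 0/1/2, minus 1 maps back to -1/0/1.
probs = model.predict(padded[:4], verbose=0)
predicted_sentiment = probs.argmax(axis=1) - 1
```

If you would rather skip the one-hot step, passing `train_labels + 1` directly as the labels with `loss='sparse_categorical_crossentropy'` should work with the same model.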