Why am I getting this shape error in my loss function with my TensorFlow NN
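I am working on an NLP project to detect emotions from text. I am using this dataset: https://www.kaggle.com/datasets/praveengovi/emotions-dataset-for-nlp?select=train.txt
I keep getting this error:
logits and labels must have the same first dimension, got logits shape [16,6] and labels shape [96]
My batch size is 16, so the label shape is the right size, because I one-hot encoded the outputs and there are 6 possible classes (6*16 = 96). For some reason, the network is changing the shape of the labels, and I can't tell where this happens.
Here is my code: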
import numpy as np
import pandas as pd
import tensorflow as tf
import os
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from sklearn import preprocessing
from keras.utils.np_utils import to_categorical
from tensorflow.keras import layers
from tensorflow.keras import losses
training_size = 14000
val_size = 1000
BATCH_SIZE = 16
with open('/content/drive/MyDrive/KaggleDatasets/train.txt') as f:
    contents = f.readlines()
split_txt = []
for i in range(len(contents)):
    split_txt.append(contents[i].split(';'))
sentences = []
emotions = []
for i in range(len(contents)):
    sentences.append(split_txt[i][0])
    emotions.append(split_txt[i][1])
labels = np.array(emotions)
labels = labels.astype('str')
unique_labels = np.unique(labels)
print(unique_labels)
label_dict = {
    'anger\n': 0,
    'fear\n': 1,
    'joy\n': 2,
    'love\n': 3,
    'sadness\n': 4,
    'surprise\n': 5
}
#get labels from string to int
int_labels = []
for i in range(len(labels)):
    int_labels.append(label_dict[labels[i]])
catagorical_labels = np.array(to_categorical(int_labels, num_classes = (len(unique_labels))))
sentences=np.array(sentences)
x_train = sentences[0:training_size]
x_val = sentences[training_size:training_size+val_size]
x_test = sentences[val_size:]
y_train = catagorical_labels[0:training_size]
y_val = catagorical_labels[training_size:training_size+val_size]
y_test = catagorical_labels[val_size:]
tokenizer = Tokenizer(num_words=500, oov_token = "<00V>")
tokenizer.fit_on_texts(x_train)
word_index = tokenizer.word_index
training_sequences = tokenizer.texts_to_sequences(x_train)
training_padded = pad_sequences(training_sequences, padding='post')
val_sequences = tokenizer.texts_to_sequences(x_val)
val_padded = pad_sequences(val_sequences, padding='post')
test_sequences = tokenizer.texts_to_sequences(x_test)
test_padded = pad_sequences(test_sequences, padding='post')
train_ds = tf.data.Dataset.from_tensor_slices((training_padded, y_train))
val_ds = tf.data.Dataset.from_tensor_slices((val_padded, y_val))
test_ds = tf.data.Dataset.from_tensor_slices((test_padded, y_test))
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)
train_ds = train_ds.batch(batch_size=BATCH_SIZE)
val_ds = val_ds.batch(batch_size=BATCH_SIZE)
test_ds = test_ds.batch(batch_size=BATCH_SIZE)
vocab_size = len(word_index)
embed_dim = 32
max_length = training_padded.shape[1]
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim, input_length=max_length),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(6, activation='softmax')
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', patience=2, verbose=1),
    tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5, verbose=1),
]
epochs=3
history = model.fit(
    train_ds,
    epochs=epochs,
    validation_data=val_ds,
    callbacks=callbacks
)
*This is where I get the error, during the loss function computation.
In this case, your y_true and y_pred should have the same shape.
I think you need to reshape your labels into a [6, 16] tensor.
See the following documentation on this:
https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy#:~:text=Invokes%20the%20Loss%20instance
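To make the shapes in the error message concrete, here is a minimal NumPy-only sketch. It assumes that the sparse loss flattens whatever labels it receives, which is what the reported `labels shape [96]` suggests. The label values are made up for illustration. Note that the model output for one batch is [16, 6], so one-hot labels already match it row for row, while a sparse loss would expect one integer per row instead.

```python
import numpy as np

BATCH_SIZE = 16
NUM_CLASSES = 6

# Hypothetical integer class ids for one batch (values 0..5).
int_labels = np.arange(BATCH_SIZE) % NUM_CLASSES

# One-hot rows, the same shape that to_categorical produces for a batch.
one_hot_labels = np.eye(NUM_CLASSES, dtype=np.float32)[int_labels]

print(one_hot_labels.shape)              # (16, 6): matches the model output
print(one_hot_labels.reshape(-1).shape)  # (96,): the "labels shape [96]" in the error
print(int_labels.shape)                  # (16,): one integer id per example

# If the sparse loss is kept, integer ids can be recovered from one-hot rows.
recovered = one_hot_labels.argmax(axis=1)
```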
You need to use the CategoricalCrossentropy loss instead of SparseCategoricalCrossentropy.
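The difference between the two losses is only in the label format they expect. The sketch below re-implements both in plain NumPy (these helper functions are illustrative stand-ins, not the Keras implementations) and checks that one-hot labels with the categorical form give the same value as integer labels with the sparse form. In the question's model, this would presumably mean passing `tf.keras.losses.CategoricalCrossentropy()` to `model.compile`. Since the last layer already applies softmax, leaving `from_logits` at its default of `False` looks like the consistent choice.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(one_hot, probs):
    # one_hot and probs both have shape (batch, classes).
    return -(one_hot * np.log(probs)).sum(axis=1).mean()

def sparse_categorical_crossentropy(class_ids, probs):
    # class_ids has shape (batch,), probs has shape (batch, classes).
    picked = probs[np.arange(len(class_ids)), class_ids]
    return -np.log(picked).mean()

rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(16, 6)))  # stand-in for the model output
class_ids = rng.integers(0, 6, size=16)    # integer labels, shape (16,)
one_hot = np.eye(6)[class_ids]             # one-hot labels, shape (16, 6)

loss_one_hot = categorical_crossentropy(one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(class_ids, probs)
print(np.isclose(loss_one_hot, loss_sparse))
```

Either pairing should work: one-hot labels with the categorical loss, or integer labels (for example `int_labels` from the question, before `to_categorical`) with the sparse loss.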
Also, your padded validation sequences have a different length from your training sequences.
You can use the maxlen parameter to make them equal:
val_padded = pad_sequences(val_sequences, padding='post', maxlen=training_padded.shape[-1])
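To illustrate why the lengths differ, here is a small sketch with a hypothetical helper, `pad_post`, that mimics post-padding in plain NumPy. It is not the Keras function, and the token ids are made up. Without `maxlen`, each set is padded to its own longest sequence. The same alignment would presumably also be needed for `test_padded` in the question.

```python
import numpy as np

def pad_post(sequences, maxlen=None):
    """Hypothetical stand-in for pad_sequences(..., padding='post')."""
    if maxlen is None:
        # Default: pad to the longest sequence in this particular set.
        maxlen = max(len(seq) for seq in sequences)
    padded = np.zeros((len(sequences), maxlen), dtype=np.int32)
    for row, seq in enumerate(sequences):
        kept = seq[-maxlen:]  # keep the last maxlen tokens if too long
        padded[row, :len(kept)] = kept
    return padded

# Made-up token ids; the longest training sequence has 5 tokens,
# the longest validation sequence only 3.
train_sequences = [[4, 7, 2, 9, 5], [3, 8]]
val_sequences = [[6, 1, 2], [5]]

train_padded = pad_post(train_sequences)
val_unaligned = pad_post(val_sequences)
val_aligned = pad_post(val_sequences, maxlen=train_padded.shape[-1])

print(train_padded.shape, val_unaligned.shape, val_aligned.shape)
```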