Keras: How to one-hot encode logits to match labels for loss function
I'm trying to implement a perplexity loss function for my LSTM language model, but I'm getting the following error:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [32,3345] and labels shape [107040]
[[{{node loss_9/dense_10_loss/perplexity/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]
Now, I think the way to solve this is to one-hot encode my logits, but I'm not sure how to do that: I don't know how to access my logits, and I don't know what depth I should encode them to.
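(A side note on the shapes, based on my own arithmetic rather than anything stated in the question: the label shape 107040 is exactly 32 × 3345, which hints that the labels reaching the loss are flattened one-hot vectors, whereas SparseSoftmaxCrossEntropyWithLogits expects integer class indices of shape [32].)

```python
# Shapes copied from the error message; the interpretation that the
# labels are flattened one-hot vectors is a guess on my part.
batch_size, vocab_size = 32, 3345
labels_len = 107040
print(batch_size * vocab_size == labels_len)  # True
```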
My loss function looks like this:
import keras.losses
from keras import backend as K

def perplexity(y_true, y_pred):
    """
    The perplexity metric. Why isn't this part of Keras yet?!
    https://stackoverflow.com/questions/41881308/how-to-calculate-perplexity-of-rnn-in-tensorflow
    https://github.com/keras-team/keras/issues/8267
    """
    cross_entropy = K.sparse_categorical_crossentropy(y_true, y_pred)
    perplexity = K.exp(cross_entropy)
    return perplexity
I define my LSTM model as follows:
# define model (imports added for completeness)
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(vocab_size, 500, input_length=max_length-1))
model.add(LSTM(750))
model.add(Dense(vocab_size, activation='softmax'))
print(model.summary())
# compile network
model.compile(loss=perplexity, optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(X, y, epochs=150, verbose=2)
If you use sparse_categorical_crossentropy, your targets must be plain integer-encoded class indices (one integer per sample, not one-hot vectors):
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def perplexity(y_true, y_pred):
    cross_entropy = K.sparse_categorical_crossentropy(y_true, y_pred)
    perplexity = K.exp(cross_entropy)
    return perplexity

vocab_size = 10
X = np.random.uniform(0, 1, (1000, 10))
y = np.random.randint(0, vocab_size, 1000)  # integer-encoded targets

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=10))
model.add(Dense(vocab_size, activation='softmax'))
# compile network
model.compile(loss=perplexity, optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(X, y, epochs=10, verbose=2)
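Applied to the asker's situation: if the targets are currently one-hot encoded, they can be collapsed back to integer indices with argmax before feeding them to sparse_categorical_crossentropy. A minimal NumPy sketch (the example values are made up):

```python
import numpy as np

# Hypothetical one-hot targets: 4 samples over 5 classes
y_onehot = np.eye(5)[[0, 2, 4, 1]]    # shape (4, 5), one row per sample
# Recover integer class indices, as sparse_categorical_crossentropy expects
y_int = np.argmax(y_onehot, axis=-1)  # shape (4,)
print(y_int)  # [0 2 4 1]
```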
If your targets are one-hot encoded instead, just swap K.sparse_categorical_crossentropy for K.categorical_crossentropy:
import numpy as np
import pandas as pd
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def perplexity(y_true, y_pred):
    cross_entropy = K.categorical_crossentropy(y_true, y_pred)
    perplexity = K.exp(cross_entropy)
    return perplexity

vocab_size = 10
X = np.random.uniform(0, 1, (1000, 10))
y = pd.get_dummies(np.random.randint(0, vocab_size, 1000)).values  # one-hot targets

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=10))
model.add(Dense(vocab_size, activation='softmax'))
# compile network
model.compile(loss=perplexity, optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(X, y, epochs=10, verbose=2)
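pd.get_dummies is just one way to one-hot encode; a pure-NumPy sketch of the same encoding (my own alternative, avoiding the pandas dependency) indexes rows of the identity matrix:

```python
import numpy as np

vocab_size = 5
y_int = np.array([0, 2, 4, 1])
# Row i of the identity matrix is the one-hot vector for class i
y_onehot = np.eye(vocab_size)[y_int]
print(y_onehot.shape)  # (4, 5)
```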