
Tensorflow fit ValueError: Shape mismatch: The shape of labels (received (16640,)) should equal the shape of logits except for the last dimension

I created a tf.data dataset from tokenized text that was converted to sequences and then to numpy arrays:

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer()
tokenizer.fit_on_texts(bible_text)  # builds the word index
sequences = tokenizer.texts_to_sequences(bible_text)

##-->[[5, 1, 914, 32, 1352, 1, 214, 2, 1, 111],
## [2, 1, 111, 31, 252, 2091, 2, 1874, 2, 547, 31, 38, 1, 196, 3, 1, 899, 2, 1, 298, 3, 32, 878, 38, 1, 196, 3, 1, 266],
## [2, 32, 33, 79, 54, 16, 369, 2, 54, 31, 369], [2, 32, 215, 1, 369, 6, 17, 31, 156, 2, 32, 955, 1, 369, 34, 1, 547], ...]

sequences=pad_sequences(sequences, padding='post')

##-->[[   5    1  914   32 1352    1  214    2    1  111    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0    0    0    0    0    0    0    0    0
##     0    0    0    0    0    0]
##...]

word_index=tokenizer.word_index 

##for k,v in sorted(word_index.items(), key=operator.itemgetter(1))[:10]:
##   print (k,v)

##--> the 1
##and 2
##of 3
##to 4
##in 5
##that 6
##shall 7
##he 8
##lord 9
##his 10
##
##[...]

vocab_size = len(tokenizer.word_index) + 1

Then I build the input and target sequences:

input_sequences, target_sequences = sequences[:,:-1], sequences[:,1:]
seq_length=input_sequences.shape[1] ##-->89
num_verses=input_sequences.shape[0]

input_sequences=np.array(input_sequences)
target_sequences=np.array(target_sequences)
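The slicing above builds next-token targets: each target sequence is the input sequence shifted left by one step. A toy check with made-up token ids (not the actual bible_text data):

```python
import numpy as np

# two already-padded toy "verses"
seqs = np.array([[5, 1, 914, 32, 0],
                 [2, 1, 111, 0, 0]])

inp, tgt = seqs[:, :-1], seqs[:, 1:]
# at every position t, tgt[:, t] holds the token that follows inp[:, t]
print(inp[0])  # the first four tokens of verse 0
print(tgt[0])  # the same verse shifted one step to the left
```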

...and the dataset:

dataset= tf.data.Dataset.from_tensor_slices((input_sequences, target_sequences))
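Note that `from_tensor_slices` slices along the first axis, so each dataset element is one (input, target) verse pair rather than the whole matrix. A toy check with made-up arrays:

```python
import numpy as np
import tensorflow as tf

inp = np.arange(12).reshape(3, 4)  # 3 toy "verses", 4 tokens each
tgt = inp + 1                      # dummy shifted targets

ds = tf.data.Dataset.from_tensor_slices((inp, tgt))

# each element is a single row pair of shape ((4,), (4,))
x, y = next(iter(ds))
print(x.numpy(), y.numpy())
```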

Nothing seems particularly wrong with this dataset setup. Here is where I define the model:

EPOCHS=2
BATCH_SIZE=256
VAL_FRAC=0.2  
LSTM_UNITS=1024
DENSE_UNITS=vocab_size
EMBEDDING_DIM=256
BUFFER_SIZE=10000

len_val=int(num_verses*VAL_FRAC)

#build validation dataset
validation_dataset = dataset.take(len_val)
validation_dataset = (
    validation_dataset
    .shuffle(BUFFER_SIZE)
    .padded_batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))

#build training dataset
train_dataset = dataset.skip(len_val)
train_dataset = (
    train_dataset
    .shuffle(BUFFER_SIZE)
    .padded_batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))

#build the model: 2 stacked LSTM
print('Build model...')
model = tf.keras.Sequential()
model.add(Embedding(vocab_size, EMBEDDING_DIM))
model.add(LSTM(LSTM_UNITS, return_sequences=True, input_shape=(seq_length, vocab_size)))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(DENSE_UNITS))
model.add(Activation('softmax'))

loss=tf.losses.SparseCategoricalCrossentropy(from_logits=False)

model.compile(optimizer='adam',
              loss=loss,
              metrics=[
                  tf.keras.metrics.SparseCategoricalAccuracy()]
              )

model.summary()

I get the following error; it comes from the fit method:

ValueError: Shape mismatch: The shape of labels (received (16640,)) should equal the shape of logits except for the last dimension (received (256, 3067)).
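For what it's worth, the numbers in the error are consistent with per-timestep labels meeting a per-sequence prediction (assuming this run used a batch of 256 sequences of 65 tokens):

```python
# sizes taken from the error text
batch_size, timesteps, vocab_size = 256, 65, 3067

# the labels keep their time axis and get flattened: one label per timestep
assert batch_size * timesteps == 16640  # "shape of labels (received (16640,))"

# but with return_sequences=False on the last LSTM, the model emits a single
# prediction per sequence, so the logits are (batch_size, vocab_size)
logits_shape = (batch_size, vocab_size)  # "(received (256, 3067))"
```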

Any ideas what might be wrong?

EDIT

If I change to a categorical_crossentropy loss, I get:

   /usr/local/lib/python3.6/dist-packages/keras/backend.py:4839 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py:1161 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (256, 65) and (256, 3067) are incompatible

Your preprocessing steps look fine. Assuming you want to generate a sequence as your output (your targets are sequences), try adjusting your model as follows:

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(vocab_size, EMBEDDING_DIM))
model.add(tf.keras.layers.LSTM(LSTM_UNITS, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.LSTM(512, return_sequences=True))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(DENSE_UNITS, activation='softmax')))

Note that your last LSTM layer now returns the sequences again. The TimeDistributed layer simply applies a dense layer with a softmax activation function to every timestep i, computing a probability for every word in the vocabulary. The number of nodes used in these dense layers equals the vocabulary size, so that every word gets a fair chance of being predicted.
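As a quick sanity check (a minimal sketch with made-up toy sizes, not the asker's data), the corrected stack emits one softmax distribution per timestep, so SparseCategoricalCrossentropy lines up with integer targets of shape (batch, timesteps):

```python
import numpy as np
import tensorflow as tf

VOCAB, EMB, UNITS, T, BATCH = 50, 8, 16, 10, 4  # toy sizes

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, EMB),
    tf.keras.layers.LSTM(UNITS, return_sequences=True),
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(VOCAB, activation='softmax')),
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy())

x = np.random.randint(0, VOCAB, size=(BATCH, T))
y = np.random.randint(0, VOCAB, size=(BATCH, T))

out = model(x)
print(out.shape)            # one distribution over VOCAB per timestep
model.train_on_batch(x, y)  # the loss now computes without a shape mismatch
```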
