InvalidArgumentError: logits and labels must have the same first dimension seq2seq Tensorflow

I am getting this error in seq2seq.sequence_loss even though the first dimensions of logits and labels are the same, i.e. batchSize.

I have created a seq2seq model in TF 1.0. My loss function is as follows:

    logits  = self.decoder_logits_train
    targets = self.decoder_train_targets
    self.loss     = seq2seq.sequence_loss(logits=logits, targets=targets, weights=self.loss_weights)
    self.train_op = tf.train.AdamOptimizer().minimize(self.loss)

I am getting the following error when running my network during training:

InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [1280,150000] and labels shape [1536]
     [[Node: sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](sequence_loss/Reshape, sequence_loss/Reshape_1)]]

I confirmed the shapes of the logits and targets tensors as follows:

a,b = sess.run([model.decoder_logits_train, model.decoder_train_targets], feed_dict)
print(np.shape(a)) # (128, 10, 150000) which is (BatchSize, MaxSeqSize, Vocabsize)
print(np.shape(b)) # (128, 12) which is (BatchSize, Max length of seq including padding)

So, since the first dimensions of targets and logits are the same, why am I getting this error?

Interestingly, in the error you can observe that the dimension of logits is reported as (1280, 150000), which is (128 * 10, 150000), i.e. the product of the first two dimensions by vocab_size, and likewise for targets, i.e. (1536), which is (128 * 12), again the product of the first two dimensions.
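
A small sketch (using the shapes from the question) of where those flattened numbers come from: as the sequence_loss/Reshape ops in the traceback suggest, sequence_loss collapses the batch and time dimensions before the sparse cross-entropy op, so a mismatch in the time dimension shows up as a mismatch in the flattened first dimension.

# Shapes taken from the question, purely to illustrate the flattening
batch_size, logit_steps, target_steps, vocab_size = 128, 10, 12, 150000

# sequence_loss reshapes logits to [batch * time, vocab] and targets to [batch * time]
flat_logit_rows  = batch_size * logit_steps    # 1280  (first dim of the reshaped logits)
flat_target_rows = batch_size * target_steps   # 1536  (first dim of the reshaped targets)

print(flat_logit_rows, flat_target_rows)       # 1280 1536 -> exactly the shapes in the error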

Note: Tensorflow 1.0 CPU version

The error message seems to be a bit misleading, as you actually need the first and second dimensions to be the same. This is documented here:

logits: A Tensor of shape [batch_size, sequence_length, num_decoder_symbols] and dtype float. The logits correspond to the prediction across all classes at each timestep.

targets: A Tensor of shape [batch_size, sequence_length] and dtype int. The target represents the true class at each timestep.

This also makes sense: logits are probability vectors, while targets represent the real output, so they need to be of the same length.

Maybe your padding is wrong. If you pad _EOS onto the end of the target sequence, then max_length (the real length of the target sentence) should increase by 1, giving [batch, max_len+1]. Since you padded both _GO and _EOS, your target sentence length increases by 2, which makes it equal to 12.

I have read other people's NMT implementations; they only pad _EOS onto the target sentence, while _GO is used for the decoder input. Tell me if I'm wrong.
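
For illustration, here is a minimal sketch of that convention (the token ids _PAD, _GO and _EOS are made up, not taken from the question): the decoder input gets _GO prepended, the target gets _EOS appended, and both are then padded to the same fixed length, so logits and targets agree in the time dimension.

# Hypothetical token ids, purely to illustrate the padding convention
_PAD, _GO, _EOS = 0, 1, 2
sentence = [11, 12, 13]   # real target tokens, length 3
max_len  = 5              # fixed padded length used for the whole batch

# Decoder input: _GO + tokens, padded to max_len
decoder_input = [_GO] + sentence
decoder_input += [_PAD] * (max_len - len(decoder_input))   # [1, 11, 12, 13, 0]

# Target: tokens + _EOS, padded to max_len
target = sentence + [_EOS]
target += [_PAD] * (max_len - len(target))                 # [11, 12, 13, 2, 0]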

I had the same error as you, and I understood the problem:

The problem:

You run the decoder using these parameters:

  • targets are the decoder_inputs. They have length max_length because of padding. Shape: [batch_size, max_length]
  • sequence_length are the non-padded lengths of all the targets of your current batch. Shape: [batch_size]

Your logits, which are the output of tf.contrib.seq2seq.dynamic_decode, have shape:

[batch_size, longer_sequence_in_this_batch, n_classes]

where longer_sequence_in_this_batch is equal to tf.reduce_max(sequence_length).

So you have a problem when computing the loss, because you try to use both of these together:

  • your logits, whose time dimension is longer_sequence_in_this_batch
  • your targets, whose time dimension is max_length

Note that longer_sequence_in_this_batch <= max_length.
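
A tiny sketch with hypothetical per-example lengths, just to make the mismatch concrete:

import tensorflow as tf

# Hypothetical non-padded lengths of the target sequences in one batch
sequence_length = tf.constant([7, 10, 4, 9])

# The decoder stops at the longest real sequence in the batch
longer_sequence_in_this_batch = tf.reduce_max(sequence_length)   # 10

# If the targets were padded up-front to a fixed max_length of 12, the logits
# have 10 time steps while the targets have 12, and sequence_loss complains.
max_length = 12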

How to fix it:

You can simply apply some padding to your logits:

logits  = self.decoder_logits_train
targets = self.decoder_train_targets

paddings = [[0, 0], [0, max_length - tf.shape(logits)[1]], [0, 0]]
padded_logits = tf.pad(logits, paddings, 'CONSTANT', constant_values=0)

self.loss = seq2seq.sequence_loss(logits=padded_logits, targets=targets,
                                  weights=self.loss_weights)

Using this method, you ensure that your logits are padded like the targets and have dimension [batch_size, max_length, n_classes].

For more information about the pad function, see TensorFlow's documentation.
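
As a side note (not part of the original answer), an alternative sketch is to trim the targets and weights down to the decoded length instead of padding the logits; this assumes the target positions beyond the decoded length contain only padding, otherwise their tokens would be excluded from the loss.

logits  = self.decoder_logits_train    # [batch_size, decoded_len, n_classes]
targets = self.decoder_train_targets   # [batch_size, max_length]

decoded_len = tf.shape(logits)[1]

# Slice targets and weights to the number of time steps the decoder produced
trimmed_targets = targets[:, :decoded_len]
trimmed_weights = self.loss_weights[:, :decoded_len]

self.loss = seq2seq.sequence_loss(logits=logits, targets=trimmed_targets,
                                  weights=trimmed_weights)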
