tensorflow GradientDescentOptimizer: Incompatible shapes between op input and calculated input gradient

The model worked well before the optimization step. However, when I try to optimize my model, this error message shows up:

Incompatible shapes between op input and calculated input gradient. Forward operation: softmax_cross_entropy_with_logits_sg_12. Input index: 0. Original input shape: (16, 1). Calculated input gradient shape: (16, 16)

The following is my code.

import numpy as np
import tensorflow as tf

batch_size = 16
size = 400
# labels has shape (16,)
labels = tf.placeholder(tf.int32, batch_size)
doc_encode = tf.placeholder(tf.float32, [batch_size, size])

W1 = tf.Variable(np.random.rand(size, 100), dtype=tf.float32, name='W1')
b1 = tf.Variable(np.zeros((100)), dtype=tf.float32, name='b1')

W2 = tf.Variable(np.random.rand(100, 1), dtype=tf.float32, name='W2')
b2 = tf.Variable(np.zeros((1)), dtype=tf.float32, name='b2')

D1 = tf.nn.relu(tf.matmul(doc_encode, W1) + b1)
# D2 (the logits) has shape (16, 1)
D2 = tf.nn.selu(tf.matmul(D1, W2) + b2)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=D2))
optim = tf.train.GradientDescentOptimizer(0.01).minimize(
    cost, aggregation_method=tf.AggregationMethod.EXPERIMENTAL_TREE)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {labels: np.array([1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]),
            doc_encode: np.random.rand(batch_size, size)}
    _cost, _optim = sess.run([cost, optim], feed)

Correct the following things.

First,

Change the placeholders' input shapes to this:

X = tf.placeholder(tf.float32, shape=[None, 400])
Y = tf.placeholder(tf.float32, shape=[None, 1])

Why None? Because it leaves you free to feed a batch of any size. This is preferred because you want to use mini-batches while training, but at prediction or inference time you will generally feed a single example. Marking the dimension None takes care of that.
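
With a label placeholder of shape [None, 1], the flat label array from the question has to be fed as a column so that it matches the (16, 1) logits. A minimal sketch of that reshaping in plain Python follows; the helper names to_column and shape_of are made up for illustration and are not part of TensorFlow.

```python
def to_column(flat_labels):
    """Turn a flat list of labels into one-element rows, i.e. shape (N, 1)."""
    return [[float(v)] for v in flat_labels]


def shape_of(nested):
    """Return (rows, cols) for a rectangular list of lists."""
    return (len(nested), len(nested[0]))


# The labels fed in the question, shape (16,)
flat = [1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]

column = to_column(flat)
print(shape_of(column))  # (16, 1)
```

With NumPy the same thing is usually written as np.array(flat, dtype=np.float32).reshape(-1, 1), and the result can be fed to the label placeholder directly.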

Second,

Correct your weight initialization. np.random.rand draws values uniformly from [0, 1), so your weights are not negative, but with 400 inputs they produce very large pre-activations (on the order of 100 in the first layer). It is generally recommended to initialize with small positive values. (I see you are using relu as the activation. Its gradient is zero wherever its input is negative, so units that end up in that region are never updated by gradient descent; in other words, they become useless.)
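
As a rough illustration of the initialization point, the sketch below draws small positive weights and compares the resulting pre-activation of one unit with the relu gradient. It uses only the standard library; small_positive_init, relu_grad and the 0.01 scale are assumptions chosen for the example, not values taken from the question or from TensorFlow.

```python
import random


def small_positive_init(rows, cols, scale=0.01, seed=0):
    """Return a rows x cols list of lists drawn uniformly from [0, scale)."""
    rng = random.Random(seed)
    return [[rng.random() * scale for _ in range(cols)] for _ in range(rows)]


def relu_grad(pre_activation):
    """Derivative of relu with respect to its input: 1 if positive, else 0."""
    return 1.0 if pre_activation > 0 else 0.0


# Same layer size as W1 in the question: 400 inputs, 100 units
W1 = small_positive_init(400, 100)

# One random input row with values in [0, 1), like np.random.rand(size)
rng = random.Random(1)
x = [rng.random() for _ in range(400)]

# Pre-activation of the first hidden unit (bias is zero in the question)
pre = sum(x[i] * W1[i][0] for i in range(400))
print(pre, relu_grad(pre))
```

The NumPy counterpart for the question's code would be something like np.random.rand(size, 100) * 0.01. With the 0.01 scale the pre-activation stays below 4 instead of reaching the order of 100.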

Third,

Logits are the result you obtain from W2*x + b2. The tf.nn.softmax_cross.....(..) call already applies the softmax activation itself, so there is no need for SELU on the last layer.
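
To make the "softmax is applied for you" point concrete, here is a small plain-Python sketch of what a softmax cross-entropy over one row of raw logits computes. It is an illustration written for this answer, not TensorFlow's implementation, and the function name softmax_cross_entropy is made up here.

```python
import math


def softmax_cross_entropy(labels, logits):
    """Cross-entropy between one row of labels and softmax(one row of raw logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_softmax = [z - log_sum_exp for z in logits]
    return -sum(l * ls for l, ls in zip(labels, log_softmax))


# Two classes, one-hot label: the raw logits go in without any activation
two_class = softmax_cross_entropy([0.0, 1.0], [2.0, 0.5])

# A single output column, as in the question: softmax of one value is always 1
single_logit = softmax_cross_entropy([1.0], [3.7])

print(two_class, single_logit)
```

One thing this makes visible: with only one logit per example, the softmax is always 1 and the loss is 0 whatever the logit is, so nothing would be learned. For a 0/1 label with a single output, tf.nn.sigmoid_cross_entropy_with_logits may be the better fit; two output columns with one-hot labels would be another option.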

