
Tensorflow incompatible matrix size when using GradientTape

I am trying to run code that previously worked on tensorflow 2.2.0 on version 2.4.0-rc0 for Apple silicon (using python 3.8), but it is now generating the following error regarding the matrix dimensions:

tensorflow.python.framework.errors_impl.InvalidArgumentError: GetOutputShape: Matrix size-incompatible: In[0]: [256,4], In[1]: [4,400]

I am using nested gradient tapes to compute the gradient of my MLP model wrt the inputs (which form part of the loss), after which I compute the gradient of the loss wrt the trainable variables as below:

    def get_grad_and_loss(self, x, y):
        # Outer tape: records the full loss computation for gradients wrt the trainable weights.
        with tf.GradientTape(persistent=True) as gl_tape:
            gl_tape.watch(x)

            # Inner tape: records the forward pass for the gradient of the prediction wrt the inputs.
            with tf.GradientTape(persistent=True) as l_tape:
                l_tape.watch(x)
                y_pred = self.call(x)

            # d y_pred / d x, used below as a soft positivity constraint on the first input dimension.
            grad_mat = l_tape.gradient(y_pred, x)
            loss = (tf.reduce_mean(tf.math.square(y_pred - y[:, tf.newaxis]))
                    + tf.reduce_mean(tf.maximum(0, -1 * grad_mat[:, 0])))

        g = gl_tape.gradient(loss, self.trainable_weights)

        return g, loss

In words, I am computing the MSE and trying to force the sign of the gradient to be positive (as a soft constraint). I have read through the documentation on gradient tape and, as I understand it, setting persistent=True should allow me to recompute gradients freely. As a side note, my code works fine if I omit the nested gradient tape and simply use the MSE metric, so I don't think the issue lies anywhere else in the code. Any pointers would be much appreciated, thanks in advance :)
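(For reference, a minimal sketch of the persistent-tape behaviour being relied on here: with persistent=True a single tape can be queried for several different gradients. The toy tensor x below is purely illustrative and not the model input from the code above.)

    import tensorflow as tf

    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)                    # constants are not watched automatically
        y = tf.reduce_sum(tf.square(x))
        z = tf.reduce_sum(x ** 3)

    # persistent=True allows more than one gradient() call on the same tape
    dy_dx = tape.gradient(y, x)          # 2 * x
    dz_dx = tape.gradient(z, x)          # 3 * x ** 2
    del tape                             # free the tape's resources when done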

You seem to have some confusion over which gradient tape watches which variables. I suggest making sure that the tapes watch different variables; presently they both watch x. Most likely you need to add gl_tape.watch(self.trainable_weights). There are examples out there with two gradient tapes working together; check them out.
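As a point of reference, here is a minimal sketch of two tapes working together on a small hypothetical model; the model, x and y below are illustrative stand-ins, not taken from the question. Note that with the default watch_accessed_variables=True the outer tape records trainable variables automatically, so an explicit gl_tape.watch(self.trainable_weights) is harmless but usually optional; only the non-variable input x needs an explicit watch.

    import tensorflow as tf

    # Hypothetical stand-in for the MLP in the question: 4 inputs -> 1 output.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="tanh"),
        tf.keras.layers.Dense(1),
    ])

    x = tf.random.normal((256, 4))
    y = tf.random.normal((256,))

    with tf.GradientTape() as outer_tape:            # trainable variables are watched automatically
        with tf.GradientTape() as inner_tape:
            inner_tape.watch(x)                      # x is a plain tensor, so it must be watched
            y_pred = model(x)                        # shape (256, 1)
        grad_mat = inner_tape.gradient(y_pred, x)    # d y_pred / d x, shape (256, 4)

        mse = tf.reduce_mean(tf.square(y_pred - y[:, tf.newaxis]))
        penalty = tf.reduce_mean(tf.maximum(0.0, -grad_mat[:, 0]))  # soft positivity constraint
        loss = mse + penalty

    grads = outer_tape.gradient(loss, model.trainable_weights)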
