Tensorflow, Keras: How to set add_loss in Keras layer with stop gradient?
We know that we can use tf.stop_gradient(B) to prevent a variable B from being updated during backpropagation. But I have no idea how to stop gradients to B for one particular loss only.
To put it simply, assume our loss is:
loss = categorical_crossentropy + my_loss
where both categorical_crossentropy and my_loss depend on B. So, if we set stop gradient for B, both of them will treat B as a constant.
But how do I set stop gradient w.r.t. B for my_loss only, leaving categorical_crossentropy unchanged? Something like B = tf.stop_gradient(B, myloss).
My code for that would be:
my_loss = ...
B = tf.stop_gradient(B)
categorical_crossentropy = ...
loss = categorical_crossentropy + my_loss
Will that work? Or, how can I make that work?
Okay, if Q1 can be solved, my final question is how to do that in a custom layer. To be specific, assume we have a custom layer with trainable weights A and B, and its own loss my_loss for this layer only:
class My_Layer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(My_Layer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)
        self.B = self.add_weight(name='B', trainable=True)
        my_loss = self.w * self.B
        # tf.stop_gradient(self.w)
        self.add_loss(my_loss)
How do I make w trainable only by the model loss (MSE, crossentropy, etc.), and B trainable only by my_loss?
If I add that tf.stop_gradient(self.w), will that stop gradients to w for my_loss only, or for the final loss of the model?
Question 1
When you run y = tf.stop_gradient(x), you create a StopGradient operation whose input is x and output is y. This operation behaves like an identity, i.e. the value of x is the same as the value of y, except that gradients don't flow from y to x.
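To illustrate (a minimal TF 2.x sketch, assuming eager execution; the values here are made up):

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = tf.stop_gradient(x)  # y has the same value as x
    z = y * y                # every path from z to x runs through the StopGradient op

print(tape.gradient(z, x))   # -> None: no gradient flows back to x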
If you want gradients to flow to B only from some losses, you can simply do:
B_no_grad = tf.stop_gradient(B)
loss1 = get_loss(B) # B will be updated because of loss1
loss2 = get_loss(B_no_grad) # B will not be updated because of loss2
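Both terms can then be combined into a single objective, e.g. loss = loss1 + loss2: B still receives gradients through loss1, while the path through B_no_grad contributes nothing.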
Things should become clear when you think about the computation graph you are building. stop_gradient allows you to create an "identity" node for any tensor (not just a variable) that does not allow gradients to flow through it.
Question 2
I don't know how to do this when the model loss is specified with a string (e.g. model.compile(loss='categorical_crossentropy', ...)), because you don't control its construction. However, you can do it by adding losses using add_loss, or by building a model-level loss yourself from the model outputs. For the former, just create some losses using the plain variables and some using the *_no_grad versions, add them all with add_loss(), and compile your model with loss=None.
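A minimal sketch of that idea applied to the custom layer from the question (the layer name, scalar weight shapes, the mse loss, and the model wiring are illustrative assumptions, not code from the answer):

import tensorflow as tf
from tensorflow import keras

class MyLayer(keras.layers.Layer):
    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)  # scalar weight
        self.B = self.add_weight(name='B', trainable=True)  # scalar weight

    def call(self, inputs):
        # The layer's own loss uses a frozen copy of w, so it only trains B ...
        w_no_grad = tf.stop_gradient(self.w)
        self.add_loss(w_no_grad * self.B)
        # ... while the forward pass uses a frozen copy of B, so the
        # model-level loss (e.g. crossentropy, MSE) only trains w.
        return inputs * self.w + tf.stop_gradient(self.B)

inputs = keras.Input(shape=(4,))
outputs = MyLayer()(inputs)
model = keras.Model(inputs, outputs)
# The add_loss term is collected automatically and combined with the
# compile-time loss; compile with loss=None to train on add_loss terms alone.
model.compile(optimizer='adam', loss='mse')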