Tensorflow, Keras: How to set add_loss in Keras layer with stop gradient?

Question 1

We know that we can use tf.stop_gradient(B) to prevent a variable B from being updated during backpropagation. But I have no idea how to stop gradients to B for only a certain loss.

To put it simply, assume our loss is:

loss = categorical_crossentropy + my_loss

where both categorical_crossentropy and my_loss depend on B. So, if we set stop gradient for B, both of them will treat B as a constant.

But how do I stop the gradient of my_loss with respect to B only, leaving categorical_crossentropy unchanged? Something like B = tf.stop_gradient(B, my_loss).

My code for that would be:

my_loss = ...
B = tf.stop_gradient(B)
categorical_crossentropy = ...
loss = categorical_crossentropy + my_loss

Will that work? Or, how do I make it work?


Question 2

Okay, if Q1 can be solved, my final question is how to do that in a custom layer.

To be specific, assume we have a custom layer that has trainable weights w and B, and a loss my_loss of its own for this layer only.

import tensorflow as tf
from tensorflow import keras

class My_Layer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(My_Layer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)
        self.B = self.add_weight(name='B', trainable=True)
        my_loss = self.w * self.B  # this layer's own loss
        # tf.stop_gradient(self.w)
        self.add_loss(my_loss)

How do I make w trainable only by the model loss (MSE, crossentropy, etc.), and B trainable only by my_loss?

If I add that tf.stop_gradient(w), will that stop w for my_loss only, or for the final loss of the model?

Answer

Question 1

When you run y = tf.stop_gradient(x), you create a StopGradient operation whose input is x and whose output is y. This operation behaves like an identity, i.e. the value of y is the same as the value of x, except that gradients don't flow from y to x.

If you want gradients to flow to B from only some of the losses, you can simply do:

B_no_grad = tf.stop_gradient(B)
loss1 = get_loss(B)  # B will be updated because of loss1
loss2 = get_loss(B_no_grad)   # B will not be updated because of loss2 

Things should become clear when you think about the computation graph you are building. stop_gradient allows you to create an "identity" node for any tensor (not just a variable) that does not let gradients flow through it.
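
To make this concrete, here is a minimal runnable sketch (assuming TF 2.x eager execution; the quadratic losses stand in for the real ones purely for illustration):

import tensorflow as tf

B = tf.Variable(2.0)

with tf.GradientTape(persistent=True) as tape:
    B_no_grad = tf.stop_gradient(B)
    loss1 = tf.square(B)          # gradient path to B is intact
    loss2 = tf.square(B_no_grad)  # B enters only as a constant
    total = loss1 + loss2

print(tape.gradient(loss1, B))  # tf.Tensor(4.0, ...): d(B^2)/dB = 2B
print(tape.gradient(loss2, B))  # None: stop_gradient cut the only path
print(tape.gradient(total, B))  # tf.Tensor(4.0, ...): only loss1 contributes

An optimizer minimizing total would therefore update B based on loss1 alone, which is exactly the selective behavior asked about in Question 1.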

Question 2

I don't know how to do this while using a model loss that you specify with a string (e.g. model.compile(loss='categorical_crossentropy', ...)), because you don't control its construction. However, you can do it either by adding losses with add_loss, or by building a model-level loss yourself from the model outputs. For the former, just create some losses using the plain variables and some using their *_no_grad versions, add them all with add_loss(), and compile your model with loss=None.
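
Putting this together for the layer in Question 2, here is a minimal sketch (assuming TF 2.x Keras; the add_loss-in-call pattern, the inputs * self.w output, and the surrounding model are illustrative assumptions, not the asker's actual setup):

import tensorflow as tf
from tensorflow import keras

class My_Layer(keras.layers.Layer):
    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)
        self.B = self.add_weight(name='B', trainable=True)

    def call(self, inputs):
        # my_loss sees w through stop_gradient, so it can only update B
        self.add_loss(tf.stop_gradient(self.w) * self.B)
        # the output sees B through stop_gradient, so the compiled model
        # loss (MSE, crossentropy, etc.) can only update w
        return inputs * self.w + tf.stop_gradient(self.B)

# hypothetical usage: the layer's add_loss term is collected automatically
# and added to the compiled loss during training
model = keras.Sequential([keras.Input(shape=(4,)), My_Layer()])
model.compile(optimizer='adam', loss='mse')

With this arrangement, gradients from the compiled loss reach only w, and gradients from the layer's own loss reach only B, which answers both parts of Question 2.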
