Tensorflow, Keras: How to set add_loss in Keras layer with stop gradient?
We know that we can use tf.stop_gradient(B) to prevent a variable B from being updated during backpropagation. But I have no idea how to stop gradients to B for one particular loss only.
To put it simply, assume our loss is:
loss = categorical_crossentropy + my_loss
where both categorical_crossentropy and my_loss depend on B. So, if we set stop gradient for B, both of them will treat B as a constant.
But how do I set stop gradient w.r.t. B for my_loss only, leaving categorical_crossentropy unchanged? Something like B = tf.stop_gradient(B, myloss).
My code for that would be:
my_loss = ...
B = tf.stop_gradient(B)
categorical_crossentropy = ...
loss = categorical_crossentropy + my_loss
Will that work? Or, how can I make that work?
Okay, if Q1 can be solved, my final question is how to do that in a custom layer. To be specific, assume we have a custom layer with trainable weights A and B, and its own loss my_loss for this layer only:
class My_Layer(keras.layers.Layer):
    def __init__(self, **kwargs):
        super(My_Layer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)
        self.B = self.add_weight(name='B', trainable=True)
        my_loss = self.w * self.B
        # tf.stop_gradient(self.w)
        self.add_loss(my_loss)
How do I make w trainable only by the model loss (MSE, crossentropy, etc.), and B trainable only by my_loss?
If I add that tf.stop_gradient(self.w), will that stop gradients to w for my_loss only, or for the final loss of the model?
Question 1
When you run y = tf.stop_gradient(x), you create a StopGradient operation whose input is x and output is y. This operation behaves like an identity, i.e. the value of x is the same as the value of y, except that gradients don't flow from y to x.
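To illustrate (a minimal TF 2.x sketch, assuming eager execution; the values here are made up):

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = tf.stop_gradient(x)  # y has the same value as x
    z = y * y                # every path from z to x runs through the StopGradient op

print(tape.gradient(z, x))   # -> None: no gradient flows back to x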
If you want gradients to flow to B only from some losses, you can simply do:
B_no_grad = tf.stop_gradient(B)
loss1 = get_loss(B) # B will be updated because of loss1
loss2 = get_loss(B_no_grad) # B will not be updated because of loss2
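Both terms can then be combined into a single objective, e.g. loss = loss1 + loss2: B still receives gradients through loss1, while the path through B_no_grad contributes nothing.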
Things should become clear when you think about the computation graph you are building. stop_gradient allows you to create an "identity" node for any tensor (not just a variable) that does not allow gradients to flow through it.
Question 2
I don't know how to do this when the model loss is specified with a string (e.g. model.compile(loss='categorical_crossentropy', ...)), because you don't control its construction. However, you can do it by adding losses using add_loss, or by building a model-level loss yourself from the model outputs. For the former, just create some losses using the plain variables and some using the *_no_grad versions, add them all with add_loss(), and compile your model with loss=None.
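A minimal sketch of that idea applied to the custom layer from the question (the layer name, scalar weight shapes, the mse loss, and the model wiring are illustrative assumptions, not code from the answer):

import tensorflow as tf
from tensorflow import keras

class MyLayer(keras.layers.Layer):
    def build(self, input_shape):
        self.w = self.add_weight(name='w', trainable=True)  # scalar weight
        self.B = self.add_weight(name='B', trainable=True)  # scalar weight

    def call(self, inputs):
        # The layer's own loss uses a frozen copy of w, so it only trains B ...
        w_no_grad = tf.stop_gradient(self.w)
        self.add_loss(w_no_grad * self.B)
        # ... while the forward pass uses a frozen copy of B, so the
        # model-level loss (e.g. crossentropy, MSE) only trains w.
        return inputs * self.w + tf.stop_gradient(self.B)

inputs = keras.Input(shape=(4,))
outputs = MyLayer()(inputs)
model = keras.Model(inputs, outputs)
# The add_loss term is collected automatically and combined with the
# compile-time loss; compile with loss=None to train on add_loss terms alone.
model.compile(optimizer='adam', loss='mse')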