
How to take the trainable parameters into a loss function in TensorFlow.Keras

I'm trying to implement a loss function whose calculation requires the variables of convolutional layers. The official documentation gives one method for involving variables in the loss function:

If this is not the case for your loss (if, for example, your loss references a Variable of one of the model's layers), you can wrap your loss in a zero-argument lambda. These losses are not tracked as part of the model's topology since they can't be serialized.

import tensorflow as tf

inputs = tf.keras.Input(shape=(10,))
# Keep a handle to the layer; the output tensor `x` has no `kernel` attribute.
d = tf.keras.layers.Dense(10)
x = d(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
# Weight regularization.
model.add_loss(lambda: tf.reduce_mean(d.kernel))
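
Accessing model.losses invokes the wrapped lambda, so the returned tensor reflects the current value of d.kernel (a quick check, continuing the snippet above):

# The zero-argument callable is evaluated each time the losses are collected.
print(model.losses)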

However, this just adds a simple regularizer to the model. Is there a way to implement a more complicated regularizer that involves calculations between variables in different layers? And what if a trainable variable is added to the regularizer as well?

You can add arbitrarily complex loss functions using the add_loss API. Here is an example that adds a loss using the weights of two different layers.

import tensorflow as tf

print('TensorFlow:', tf.__version__)

inp = tf.keras.Input(shape=[10])
x = tf.keras.layers.Dense(16)(inp)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(4)(x)
out = tf.keras.layers.Dense(1)(x)

model = tf.keras.Model(inputs=[inp], outputs=[out])
model.summary()


def custom_loss(weight_a, weight_b):
    # Returns a zero-argument callable, as add_loss requires for losses
    # that reference variables.
    def _custom_loss():
        # This can include any arbitrary logic
        loss = tf.norm(weight_a) + tf.norm(weight_b)
        return loss
    return _custom_loss

# model.layers[0] is the InputLayer, so these are the kernels of the
# Dense(32) and Dense(4) layers.
weight_a = model.layers[2].kernel
weight_b = model.layers[3].kernel

model.add_loss(custom_loss(weight_a, weight_b))


print('\nlosses:', model.losses)

Output:

TensorFlow: 2.3.0-dev20200611
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 10)]              0         
_________________________________________________________________
dense (Dense)                (None, 16)                176       
_________________________________________________________________
dense_1 (Dense)              (None, 32)                544       
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 132       
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 5         
=================================================================
Total params: 857
Trainable params: 857
Non-trainable params: 0
_________________________________________________________________

losses: [<tf.Tensor: shape=(), dtype=float32, numpy=7.3701963>]
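
Keras folds every term in model.losses into the total training loss, so the regularizer participates in optimization. A minimal sketch to verify this, continuing from the code above (the random data and optimizer here are only illustrative):

import numpy as np

# Fitting with 'mse' also shrinks the norms of the two kernels,
# because model.losses is added to the compiled loss.
x_train = np.random.rand(100, 10).astype('float32')
y_train = np.random.rand(100, 1).astype('float32')

model.compile(optimizer='sgd', loss='mse')
model.fit(x_train, y_train, epochs=2, verbose=2)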

Inspired by @Srihari Humbarwadi, I found a way to implement a complicated regularizer that involves:

  • a trainable parameter added to the regularizer loss
  • custom calculations between weights in different layers

The idea is to construct a subclassed model:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense


class Pseudo_Model(Model):
    def __init__(self, **kwargs):
        super(Pseudo_Model, self).__init__(**kwargs)
        self.dense1 = Dense(16)
        self.dense2 = Dense(4)
        self.dense3 = Dense(2)
        # Extra trainable parameter used only in the regularizer loss below.
        self.a = tf.Variable(shape=(1,), initial_value=tf.ones(shape=(1,)))

    def call(self, inputs, training=True, mask=None):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.dense3(x)

        return x

The model is built as follows:

    sub_model = Pseudo_Model(name='sub_model')
    inputs = Input(shape=(32,))
    outputs = sub_model(inputs)
    model = Model(inputs, outputs)
    model.summary()
    model.get_layer('sub_model').summary()

The structure of the model:

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 32)]              0         
_________________________________________________________________
sub_model (Pseudo_Model)     (None, 2)                 607       
=================================================================
Total params: 607
Trainable params: 607
Non-trainable params: 0
_________________________________________________________________
Model: "sub_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 16)                528       
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 68        
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 10        
=================================================================
Total params: 607
Trainable params: 607
Non-trainable params: 0
_________________________________________________________________

Then define the loss function as @Srihari Humbarwadi mentioned, only now adding the new trainable parameter a:

def custom_loss(weight_a, weight_b, a):
    def _custom_loss():
        # This can include any arbitrary logic
        loss = a * tf.norm(weight_a) + tf.norm(weight_b)
        return loss

    return _custom_loss

The loss is added to the model through the add_loss() API:

    a_ = model.get_layer('sub_model').a
    # sub_model.layers[0] and [1] are dense1 (16 units) and dense2 (4 units).
    weighta = model.get_layer('sub_model').layers[0].kernel
    weightb = model.get_layer('sub_model').layers[1].kernel
    model.get_layer('sub_model').add_loss(custom_loss(weighta, weightb, a_))

    print(model.losses)
    #[<tf.Tensor: id=116, shape=(1,), dtype=float32, numpy=array([7.2659254], dtype=float32)>]

Then I create a fake dataset to test it:

    fake_data = np.random.rand(1000, 32)
    fake_labels = np.random.rand(1000, 2)
    model.compile(optimizer=tf.keras.optimizers.SGD(), loss='mse')
    model.fit(x=fake_data, y=fake_labels, epochs=5)

    print(model.get_layer(name='sub_model').a)

As you can see, the variable and the loss are updated:

Train on 1000 samples
Epoch 1/5
2020-06-19 19:21:02.475464: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
1000/1000 - 1s - loss: 3.9039
Epoch 2/5
1000/1000 - 0s - loss: -3.0905e+00
Epoch 3/5
1000/1000 - 0s - loss: -1.2103e+01
Epoch 4/5
1000/1000 - 0s - loss: -2.6855e+01
Epoch 5/5
1000/1000 - 0s - loss: -5.3408e+01
<tf.Variable 'Variable:0' shape=(1,) dtype=float32, numpy=array([-8.13609], dtype=float32)>

Process finished with exit code 0

But still, this is a really hacky method. Note that the loss decreases without bound: a is unconstrained, so the optimizer can drive a * tf.norm(weight_a) arbitrarily far below zero. I don't know whether there is a more elegant and stable way to achieve the same thing.
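
One possible alternative (a sketch of my own, not from the answer above) is to register the regularizer inside the subclassed model's call() and to pass the trainable coefficient through tf.nn.softplus, which keeps it positive so the loss is bounded below:

class ConstrainedModel(Model):
    def __init__(self, **kwargs):
        super(ConstrainedModel, self).__init__(**kwargs)
        self.dense1 = Dense(16)
        self.dense2 = Dense(4)
        self.dense3 = Dense(2)
        # Unconstrained raw parameter; softplus maps it to a positive value.
        self.a_raw = tf.Variable(initial_value=tf.zeros(shape=(1,)))

    def call(self, inputs, training=None, mask=None):
        x = self.dense1(inputs)
        x = self.dense2(x)
        x = self.dense3(x)
        # Registering the loss inside call() re-evaluates it on every
        # forward pass; the positive coefficient keeps it bounded below.
        a = tf.nn.softplus(self.a_raw)
        self.add_loss(a * tf.norm(self.dense1.kernel) + tf.norm(self.dense2.kernel))
        return x

Even then, the optimizer will simply drive the coefficient toward zero (the cheapest way to shrink a * tf.norm(weight_a) is to shrink a), so a learnable regularization weight only makes sense when some other part of the objective depends on it.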
