
TensorFlow 2.0 How to get trainable variables from tf.keras.layers layers, like Conv2D or Dense

I have been trying to get the trainable variables from my layers and can't figure out a way to make it work. So here is what I have tried:

I have tried accessing the kernel and bias attributes of the Dense or Conv2D objects directly, but to no avail. The error I get is "Dense object has no attribute 'kernel'".

trainable_variables.append(conv_layer.kernel)
trainable_variables.append(conv_layer.bias)

Similarly, I have tried using the trainable_variables attribute in the following way:

trainable_variables.extend(conv_layer.trainable_variables)

From what I know, this is supposed to return a list of two variables: the weight and the bias. However, what I get is an empty list.
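Here is a minimal, self-contained reproduction of what I am seeing (the layer parameters are arbitrary, not from my actual network):

import tensorflow as tf

conv_layer = tf.keras.layers.Conv2D(filters=8, kernel_size=3)
print(conv_layer.trainable_variables)  # prints [] -- the list is empty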

Any idea how to get the variables from layers in TensorFlow 2.0? I want to be able to later feed those variables to an optimizer, in a way similar to the following:

gradients = tape.gradient(loss, trainable_variables)
optimizer.apply_gradients(zip(gradients, trainable_variables))

Edit: Here is part of my current code, to serve as an example and help answer the question (I hope it is readable):

from tensorflow.keras.layers import Dense, Conv2D, Conv2DTranspose, Reshape, Flatten

... 

class Network:
    def __init__(self, params):
        weights_initializer = tf.initializers.GlorotUniform(seed=params["seed"])
        bias_initializer = tf.initializers.Constant(0.0)

        self.trainable_variables = []

        self.conv_layers = []
        self.conv_activations = []
        self.create_conv_layers(params, weights_initializer, bias_initializer)

        self.flatten_layer = Flatten()

        self.dense_layers = []
        self.dense_activations = []
        self.create_dense_layers(params, weights_initializer, bias_initializer)

        self.output_layer = Dense(1, kernel_initializer=weights_initializer, bias_initializer=bias_initializer)
        # These two lines fail: "Dense object has no attribute 'kernel'"
        self.trainable_variables.append(self.output_layer.kernel)
        self.trainable_variables.append(self.output_layer.bias)

    def create_conv_layers(self, params, weight_init, bias_init):
        nconv = len(params['stride'])
        for i in range(nconv):
            conv_layer = Conv2D(filters=params["nfilter"][i],
                                kernel_size=params["shape"][i], kernel_initializer=weight_init,
                                kernel_regularizer=spectral_norm,
                                use_bias=True, bias_initializer=bias_init,
                                strides=params["stride"][i],
                                padding="same", )
            self.conv_layers.append(conv_layer)
            self.trainable_variables.append(conv_layer.kernel)
            self.trainable_variables.append(conv_layer.bias)
            self.conv_activations.append(params["activation"])

As you can see, I am trying to gather all my trainable variables into a list attribute called trainable_variables. However, as I mentioned, this code does not work, because I get an error when trying to access the kernel and bias attributes of those layer objects.

Ok, so I think I found the problem.

The trainable variables were not available until I used the given layer object: Keras layers create their variables lazily, in the layer's build() method, which only runs the first time the layer is called on an input. After running my forward pass, I could retrieve attributes of the tf.keras.layers.Layer object such as trainable_variables and weights.

However, before my forward pass I received an empty list. To make things a little bit more clear:

with tf.GradientTape() as tape:
    print(self.dense_layers[0].trainable_variables)  # [] -- the layer has not been built yet
    self.forward_pass(X)
    self.compute_loss()
    print(self.dense_layers[0].trainable_variables)  # now contains the kernel and bias

In the code above, the trainable_variables attribute is an empty list before self.forward_pass executes. Right after it, however, I could retrieve the kernel and bias arrays.
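As a standalone illustration of this lazy behavior (the layer and input shapes below are arbitrary, not from my network):

import tensorflow as tf

layer = tf.keras.layers.Dense(10)
print(layer.trainable_variables)  # [] -- the variables do not exist yet

# Calling the layer on an input triggers build(), which creates the kernel and bias
_ = layer(tf.zeros((1, 4)))
print([v.name for v in layer.trainable_variables])
# ['dense/kernel:0', 'dense/bias:0']

Alternatively, calling layer.build(input_shape) explicitly creates the variables without running a forward pass.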

Let me start by using a simple model as an example, to make it easier to explain and understand.

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(1, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(1, (3, 3), activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(3, activation='softmax'))

When using a gradient tape, you pass model.trainable_weights, which returns the weights and biases of the entire model, to tape.gradient, and then use the optimizer to apply the gradients.
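For example, one training step could look like the sketch below, which reuses the model defined above (loss_fn, optimizer, x_batch, y_batch, and train_step are placeholder names for this illustration):

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

def train_step(x_batch, y_batch):
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    # Compute gradients for every trainable variable in the model and apply them
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss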

If you print model.trainable_weights, you will get the output below. I removed the actual weight and bias values for readability.

[<tf.Variable 'conv2d/kernel:0' shape=(3, 3, 3, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d_1/kernel:0' shape=(3, 3, 1, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d_1/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense/kernel:0' shape=(169, 10) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense/bias:0' shape=(10,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_1/kernel:0' shape=(10, 10) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_2/kernel:0' shape=(10, 3) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'dense_2/bias:0' shape=(3,) dtype=float32, numpy=array([...], dtype=float32)>]

As you can see, each layer's kernel and bias are output as a flat list. This is the same list you pass to the gradient tape. If you want to train just a specific layer, you can slice the list and get the weights you want to train.

model.trainable_weights[0:2] # Get the first conv layer weights at index 0 and bias at index 1.

This will output only the first conv layer's weights and biases.

[<tf.Variable 'conv2d/kernel:0' shape=(3, 3, 3, 1) dtype=float32, numpy=array([...], dtype=float32)>,
 <tf.Variable 'conv2d/bias:0' shape=(1,) dtype=float32, numpy=array([...], dtype=float32)>]
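Putting the two pieces together, a sketch of updating only that one layer (reusing the placeholder names from the training step above) might look like this:

def train_first_conv_only(x_batch, y_batch):
    # Only the first conv layer's kernel and bias receive updates;
    # every other variable in the model is left untouched
    first_conv_vars = model.trainable_weights[0:2]
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    gradients = tape.gradient(loss, first_conv_vars)
    optimizer.apply_gradients(zip(gradients, first_conv_vars))
    return loss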
