Tensorflow 2 Keras Nested Model Subclassing - Total parameters zero
I am trying to implement a simple model subclassing inspired by the VGG network.
So here is the code:
class ConvMax(tf.keras.Model):
    def __init__(self, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super(ConvMax, self).__init__()
        self.conv = tf.keras.layers.Conv2D(filters, kernel_size, padding='same', activation=activation)
        self.maxpool = tf.keras.layers.MaxPool2D((pool_size, pool_size))

    def call(self, input_tensor):
        x = self.conv(input_tensor)
        x = self.maxpool(x)
        return x
class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            vars(self)[f'convMax_{i}'] = ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation)

    def call(self, input_tensor):
        # Connect the first layer
        x = vars(self)['convMax_0'](input_tensor)

        # Connect the existing layers
        for i in range(1, self.repetitions):
            x = vars(self)[f'convMax_{i}'](x)

        # return the last layer
        return x
But when I build the network to look at the summary, here is what I get:
model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = RepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0
_________________________________________________________________
repeated_conv_max (RepeatedC (None, 4, 4, 4)           0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
The total params are zero.
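Checking the tracked variables directly confirms it (assuming the model built above; the layer name is the one shown in the summary):

# Nothing is registered on the outer model or on the subclassed block,
# even though the Conv2D layers inside were built during the forward pass.
print(len(model.trainable_variables))                          # 0
print(model.get_layer('repeated_conv_max').trainable_weights)  # []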
However, when I try:
model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = ConvMax()(model_input)
x = ConvMax()(x)
x = ConvMax()(x)
x = ConvMax()(x)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0
_________________________________________________________________
conv_max (ConvMax)           (None, 32, 32, 4)         112
_________________________________________________________________
conv_max_1 (ConvMax)         (None, 16, 16, 4)         148
_________________________________________________________________
conv_max_2 (ConvMax)         (None, 8, 8, 4)           148
_________________________________________________________________
conv_max_3 (ConvMax)         (None, 4, 4, 4)           148
=================================================================
Total params: 556
Trainable params: 556
Non-trainable params: 0
_________________________________________________________________
It shows the correct total params.
Do you know what the problem is? Why, with the two-level subclassing, is the parameter count 0? Will it affect training?
Thank you...
The problem is not with Keras but with the way you are initializing the layers in RepeatedConvMax.

TLDR: don't use vars to dynamically create and retrieve attributes; use setattr and getattr instead.
To solve the problem, you simply have to replace the vars(self)[...] assignments and lookups with setattr and getattr. vars(self) gives you direct access to the instance dictionary, so assigning into it bypasses __setattr__. tf.keras.Model relies on an overridden __setattr__ to track sub-layers and their variables, so layers created this way are never registered with the model and their weights never show up in the summary. setattr and getattr go through the normal attribute machinery, so the ConvMax blocks are tracked as expected.
If you define your class like this, everything works as expected:
class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            setattr(self, f'convMax_{i}', ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation))

    def call(self, input_tensor, training=None, mask=None):
        # Connect the first layer
        x = getattr(self, 'convMax_0')(input_tensor)

        # Connect the existing layers
        for i in range(1, self.repetitions):
            print(f"Layer {i}")
            x = getattr(self, f'convMax_{i}')(x)
            print(x)

        # return the last layer
        return x
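Rebuilding the same functional model on top of this fixed class should now report the expected parameters (556 with the default arguments, matching the manually stacked version above):

model_input = tf.keras.layers.Input(shape=(64, 64, 3), name="input_layer")
x = RepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()  # Total params: 556, all trainable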
Don't add your layers using vars. Instead, loop over the number of layers you want, add them to a tf.keras.Sequential object, and do your forward pass through that.
Refactored class:
class RefactoredRepeatedConvMax(tf.keras.models.Model):
    def __init__(self,
                 repetitions=4,
                 filters=4,
                 kernel_size=3,
                 pool_size=2,
                 activation="relu"):
        super().__init__()
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        self.conv_layers = tf.keras.Sequential()
        for _ in range(self.repetitions):
            self.conv_layers.add(ConvMax(
                self.filters,
                self.kernel_size,
                self.pool_size,
                self.activation))

    def call(self, x):
        return self.conv_layers(x)
Model:
model_input = tf.keras.layers.Input(shape=(64, 64, 3), name="input_layer")
x = RefactoredRepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
# Model: "model"
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# input_layer (InputLayer) [(None, 64, 64, 3)] 0
# _________________________________________________________________
# refactored_repeated_conv_max (None, 4, 4, 4) 556
# =================================================================
# Total params: 556
# Trainable params: 556
# Non-trainable params: 0
# _________________________________________________________________
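An equivalent alternative, if you want to keep explicit references to each block, is to store the ConvMax layers in a plain Python list attribute; Keras also tracks layers held inside lists that are assigned through normal attribute access. A minimal sketch (ListRepeatedConvMax is just an illustrative name):

class ListRepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super().__init__()
        # The list is assigned via normal attribute access (not vars()),
        # so the ConvMax blocks inside it are tracked and their weights counted.
        self.blocks = [ConvMax(filters, kernel_size, pool_size, activation)
                       for _ in range(repetitions)]

    def call(self, x):
        for block in self.blocks:
            x = block(x)
        return x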