Tensorflow 2 Keras Nested Model Subclassing - Total parameters zero
I am trying to implement a simple model subclassing inspired by the VGG network.
So here is the code:
class ConvMax(tf.keras.Model):
    def __init__(self, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super(ConvMax, self).__init__()
        self.conv = tf.keras.layers.Conv2D(filters, kernel_size, padding='same', activation=activation)
        self.maxpool = tf.keras.layers.MaxPool2D((pool_size, pool_size))

    def call(self, input_tensor):
        x = self.conv(input_tensor)
        x = self.maxpool(x)
        return x
class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            vars(self)[f'convMax_{i}'] = ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation)

    def call(self, input_tensor):
        # Connect the first layer
        x = vars(self)['convMax_0'](input_tensor)

        # Connect the existing layers
        for i in range(1, self.repetitions):
            x = vars(self)[f'convMax_{i}'](x)

        # return the last layer
        return x
But when I build the network to look at the summary, here is what I get:
model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = RepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0
_________________________________________________________________
repeated_conv_max (RepeatedC (None, 4, 4, 4)           0
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________
The total params are zero.
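Checking the tracked variables directly confirms it (assuming the model built above; the layer name is the one shown in the summary):

# Nothing is registered on the outer model or on the subclassed block,
# even though the Conv2D layers inside were built during the forward pass.
print(len(model.trainable_variables))                          # 0
print(model.get_layer('repeated_conv_max').trainable_weights)  # []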
However, when I try:
model_input = tf.keras.layers.Input(shape=(64,64,3,), name="input_layer")
x = ConvMax()(model_input)
x = ConvMax()(x)
x = ConvMax()(x)
x = ConvMax()(x)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_layer (InputLayer)     [(None, 64, 64, 3)]       0
_________________________________________________________________
conv_max (ConvMax)           (None, 32, 32, 4)         112
_________________________________________________________________
conv_max_1 (ConvMax)         (None, 16, 16, 4)         148
_________________________________________________________________
conv_max_2 (ConvMax)         (None, 8, 8, 4)           148
_________________________________________________________________
conv_max_3 (ConvMax)         (None, 4, 4, 4)           148
=================================================================
Total params: 556
Trainable params: 556
Non-trainable params: 0
_________________________________________________________________
It shows the correct total params.
Do you know what the problem is? Why, with the two-level subclassing, is the parameter count 0? Will it affect training?
Thank you...
The problem is not with Keras but with the way you are initializing the layers in RepeatedConvMax.

TLDR: don't use vars to dynamically create and retrieve attributes; use setattr and getattr instead.
To solve the problem, you simply have to replace the vars(self)[...] assignments and lookups with setattr and getattr. vars(self) gives you direct access to the instance dictionary, so assigning into it bypasses __setattr__. tf.keras.Model relies on an overridden __setattr__ to track sub-layers and their variables, so layers created this way are never registered with the model and their weights never show up in the summary. setattr and getattr go through the normal attribute machinery, so the ConvMax blocks are tracked as expected.
If you define your class like this, everything works as expected:
class RepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu', **kwargs):
        super(RepeatedConvMax, self).__init__(**kwargs)
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        # Define a repeated ConvMax
        for i in range(self.repetitions):
            # Define a ConvMax layer, specifying filters, kernel_size, pool_size.
            setattr(self, f'convMax_{i}', ConvMax(self.filters, self.kernel_size, self.pool_size, self.activation))

    def call(self, input_tensor, training=None, mask=None):
        # Connect the first layer
        x = getattr(self, 'convMax_0')(input_tensor)

        # Connect the existing layers
        for i in range(1, self.repetitions):
            print(f"Layer {i}")
            x = getattr(self, f'convMax_{i}')(x)
            print(x)

        # return the last layer
        return x
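Rebuilding the same functional model on top of this fixed class should now report the expected parameters (556 with the default arguments, matching the manually stacked version above):

model_input = tf.keras.layers.Input(shape=(64, 64, 3), name="input_layer")
x = RepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()  # Total params: 556, all trainable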
Don't add your layers using vars. Instead, loop over the number of layers you want, add them to a tf.keras.Sequential object, and do your forward pass through that.
Refactored class:
class RefactoredRepeatedConvMax(tf.keras.models.Model):
    def __init__(self,
                 repetitions=4,
                 filters=4,
                 kernel_size=3,
                 pool_size=2,
                 activation="relu"):
        super().__init__()
        self.repetitions = repetitions
        self.filters = filters
        self.kernel_size = kernel_size
        self.pool_size = pool_size
        self.activation = activation

        self.conv_layers = tf.keras.Sequential()
        for _ in range(self.repetitions):
            self.conv_layers.add(ConvMax(
                self.filters,
                self.kernel_size,
                self.pool_size,
                self.activation))

    def call(self, x):
        return self.conv_layers(x)
Model:
model_input = tf.keras.layers.Input(shape=(64, 64, 3), name="input_layer")
x = RefactoredRepeatedConvMax()(model_input)
model = tf.keras.Model(inputs=model_input, outputs=x)
model.summary()
# Model: "model"
# _________________________________________________________________
# Layer (type) Output Shape Param #
# =================================================================
# input_layer (InputLayer) [(None, 64, 64, 3)] 0
# _________________________________________________________________
# refactored_repeated_conv_max (None, 4, 4, 4) 556
# =================================================================
# Total params: 556
# Trainable params: 556
# Non-trainable params: 0
# _________________________________________________________________
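An equivalent alternative, if you want to keep explicit references to each block, is to store the ConvMax layers in a plain Python list attribute; Keras also tracks layers held inside lists that are assigned through normal attribute access. A minimal sketch (ListRepeatedConvMax is just an illustrative name):

class ListRepeatedConvMax(tf.keras.Model):
    def __init__(self, repetitions=4, filters=4, kernel_size=3, pool_size=2, activation='relu'):
        super().__init__()
        # The list is assigned via normal attribute access (not vars()),
        # so the ConvMax blocks inside it are tracked and their weights counted.
        self.blocks = [ConvMax(filters, kernel_size, pool_size, activation)
                       for _ in range(repetitions)]

    def call(self, x):
        for block in self.blocks:
            x = block(x)
        return x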