
CNN Negative Number of Parameters

I am trying to build a CNN model with Keras. When I add two blocks of Conv2D and MaxPooling2D, everything is normal. However, once the third block is added (as shown in the code), the number of trainable parameters becomes negative. Any idea how this can happen?

import keras
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = keras.models.Sequential()

# # # First Block
model.add(Conv2D(filters=16, kernel_size=(5, 5), padding='valid', input_shape=(157, 462, 14), activation = 'tanh' ))
model.add(MaxPooling2D( (2,2) ))

# # # Second Block     
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='valid', activation = 'tanh'))
model.add(MaxPooling2D( (2, 2) ))

# # # Third Block   
model.add(Conv2D(filters=64, kernel_size=(5, 5), padding='valid', activation = 'tanh'))
model.add(MaxPooling2D( (2, 2) ))

model.add(Flatten())
model.add(Dense(157 * 462))
model.compile(loss='mean_squared_error',
              optimizer=keras.optimizers.Adamax(),
              metrics=['mean_absolute_error'])

print(model.summary())

The result of this code is the following:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 153, 458, 16)      5616      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 76, 229, 16)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 225, 32)       12832     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 112, 32)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 108, 64)       51264     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 16, 54, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 55296)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 72534)             -284054698
=================================================================
Total params: -283,984,986
Trainable params: -283,984,986
Non-trainable params: 0
_________________________________________________________________
None

Yes, of course. Your Dense layer has a weight matrix of size 55296 x 72534, which contains 4,010,840,064 numbers, that is, about 4,010 million parameters.
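To make the arithmetic concrete, here is a minimal sketch that recomputes every "Param #" value in the summary from the layer shapes. The helper names `conv2d_params` and `dense_params` are made up for illustration and are not part of Keras; the formulas are the usual weights-plus-biases counts.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter (hypothetical helper)."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per output unit (hypothetical helper)."""
    return inputs * units + units


conv1 = conv2d_params(5, 5, 14, 16)   # input has 14 channels
conv2 = conv2d_params(5, 5, 16, 32)
conv3 = conv2d_params(5, 5, 32, 64)

flat = 16 * 54 * 64                   # Flatten output of the last pooling layer
units = 157 * 462                     # size of the Dense layer
dense = dense_params(flat, units)

total = conv1 + conv2 + conv3 + dense

print("conv layers:", conv1, conv2, conv3)
print("dense weights:", flat * units)
print("dense params (with bias):", dense)
print("total params:", total)
```

The three convolution layers together hold fewer than 70,000 parameters, so almost the entire count comes from the single Dense layer.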

Somewhere in the Keras code the number of parameters is stored as an int32, which limits the largest value it can hold to 2^31 - 1 = 2147483647. Your 4010 million parameters exceed that limit, so the count overflows into the negative range of the integer.
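The wraparound can be reproduced in plain Python. This is only a sketch of two's-complement 32-bit arithmetic, not the actual Keras code path; `wrap_int32` is a hypothetical helper written for this illustration.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def wrap_int32(n):
    """Reduce an arbitrary Python int to its signed 32-bit two's-complement value."""
    return (n + 2**31) % 2**32 - 2**31


dense_params = 55296 * 72534 + 72534   # weights + biases of the Dense layer
total_params = 4010982310              # true total across all layers

print(dense_params > INT32_MAX)        # the count does not fit in int32
print(wrap_int32(dense_params))        # wrapped Dense count
print(wrap_int32(total_params))        # wrapped total
```

Under this assumption the wrapped values come out as -284054698 and -283984986, which are the numbers shown in the first summary.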

I would recommend not building a model with such a large number of parameters; you would not be able to train it anyway without a huge amount of RAM.
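A rough back-of-the-envelope estimate of what "huge" means here, assuming float32 weights (the Keras default). The factor of 4 for training is an assumption for illustration: one copy each for the weights and their gradients, plus two per-parameter state arrays for an Adam-style optimizer such as Adamax. Activations and framework overhead are ignored.

```python
BYTES_PER_FLOAT32 = 4
total_params = 4_010_982_310           # true total from the model summary

# Memory needed just to hold the weights once.
weights_gib = total_params * BYTES_PER_FLOAT32 / 2**30

# Assumption: weights + gradients + two optimizer state arrays = 4 copies.
COPIES_DURING_TRAINING = 4
training_gib = COPIES_DURING_TRAINING * weights_gib

print(f"weights only: {weights_gib:.1f} GiB")
print(f"rough training footprint: {training_gib:.1f} GiB")
```

That is roughly 15 GiB for the weights alone and on the order of 60 GiB during training under these assumptions.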

The problem is that you are running your code on a CPU, which keeps the Keras backend (TensorFlow or Theano) from working properly. I was able to run your code perfectly with a GPU in Google Colab, and this is what I got:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 153, 458, 16)      5616      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 76, 229, 16)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 225, 32)       12832     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 112, 32)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 108, 64)       51264     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 16, 54, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 55296)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 72534)             4010912598
=================================================================
Total params: 4,010,982,310
Trainable params: 4,010,982,310
Non-trainable params: 0

I recommend using a GPU to train such a huge network.

Hope this helps!
