
How does input_shape in keras.applications work?

I have been through the Keras documentation but I am still unable to figure out how the input_shape parameter works, and why it does not change the number of parameters for my DenseNet model when I pass it my custom input shape. An example:

import keras
from keras import applications
from keras.layers import Conv3D, MaxPool3D, Flatten, Dense
from keras.layers import Dropout, Input, BatchNormalization
from keras import Model

# define model 1
INPUT_SHAPE = (224, 224, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 224, 224, 1)       0         
_________________________________________________________________
densenet121 (Model)          (None, 1024)              7031232   
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 2050      
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________



# define model 2
INPUT_SHAPE = (512, 512, 1) # used to define the input size to the model
n_output_units = 2
activation_fn = 'sigmoid'
densenet_121_model = applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=INPUT_SHAPE, pooling='avg')
inputs = Input(shape=INPUT_SHAPE, name='input')
model_base = densenet_121_model(inputs)
output = Dense(units=n_output_units, activation=activation_fn)(model_base)
model = Model(inputs=inputs, outputs=output)
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 512, 512, 1)       0         
_________________________________________________________________
densenet121 (Model)          (None, 1024)              7031232   
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 2050      
=================================================================
Total params: 7,033,282
Trainable params: 6,949,634
Non-trainable params: 83,648
_________________________________________________________________

Ideally, with an increase in the input shape, the number of parameters should increase; however, as you can see, they stay exactly the same. My questions are thus:

  1. Why does the number of parameters not change with a change in input_shape?
  2. I have only defined one channel in my input_shape; what would happen to my model training in this scenario? The documentation says the following:

input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format)). It should have exactly 3 input channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.

However, when I run the model with this configuration, it runs without any problems. Could there be something that I am missing?

Using Keras 2.2.4 with TensorFlow 1.12.0 as backend.

1. In the convolutional layers, the input size does not influence the number of weights, because the number of weights is determined by the kernel dimensions and the number of input and output channels. A larger input size leads to a larger output size, but not to more weights.

This means that the output of the convolutional layers of the second model will be larger than for the first model, which would increase the number of weights in a following dense layer. However, if you look at the architecture of DenseNet, you will notice that with pooling='avg' a global average pooling layer comes after all the convolutional layers, which averages all the values of each output channel. That's why the output of DenseNet will be of size 1024, whatever the input shape.
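To see why the convolutional weight count is independent of the spatial input size, here is a small framework-free sketch. The helper `conv2d_param_count` is a hypothetical function (not a Keras API) that just evaluates the standard parameter formula for a 2D convolution; note that the spatial dimensions never appear in it.

```python
# Hypothetical helper: parameter count of a single Conv2D layer.
# weights = kernel_h * kernel_w * in_channels * out_channels (+ one bias per filter)
# The spatial input size (224x224 vs 512x512) never enters the formula.
def conv2d_param_count(kernel_h, kernel_w, in_channels, out_channels, use_bias=True):
    params = kernel_h * kernel_w * in_channels * out_channels
    if use_bias:
        params += out_channels
    return params

# Example: a 7x7 convolution taking 1 input channel to 64 output channels,
# as in the first DenseNet conv when built with a single-channel input.
print(conv2d_param_count(7, 7, 1, 64))  # 3200, for any image size
```

Feeding a larger image only changes the size of the feature maps the kernels slide over, not the kernels themselves.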

2. Yes, the model will still work. I'm not entirely sure about that, but my guess is that the single channel will be broadcast (duplicated) to fill all three channels. That's at least how these things are usually handled (see for example tensorflow or numpy). (In this case, since weights=None, Keras can simply build the first convolution with a single input channel; the three-channel requirement mainly matters when loading pretrained ImageNet weights.)

DenseNet is composed of two parts: the convolution part and the global pooling part.

The number of trainable weights in the convolution part doesn't depend on the input shape.

Usually, a classification network employs fully connected layers to infer the classification; in DenseNet, however, global pooling is used, which doesn't introduce any trainable weights.

Therefore, the input shape doesn't affect the number of weights of the entire network.
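The global pooling step can be illustrated without any framework. The `global_avg_pool` function below is a hypothetical pure-Python stand-in for a global average pooling layer: each channel's H x W feature map collapses to a single mean, so the pooled output length equals the channel count no matter how large the spatial dimensions are.

```python
# Hypothetical pure-Python sketch of global average pooling.
# feature_map: H rows, each a list of W pixels, each a list of C channel values.
def global_avg_pool(feature_map):
    h = len(feature_map)
    w = len(feature_map[0])
    c = len(feature_map[0][0])
    pooled = [0.0] * c
    for row in feature_map:
        for pixel in row:
            for ch, val in enumerate(pixel):
                pooled[ch] += val
    return [v / (h * w) for v in pooled]

# Larger inputs yield larger feature maps, but the pooled vector length
# is always the channel count (1024 for DenseNet121's final features).
small = [[[1.0] * 1024 for _ in range(7)] for _ in range(7)]    # e.g. from a 224x224 input
large = [[[1.0] * 1024 for _ in range(16)] for _ in range(16)]  # e.g. from a 512x512 input
print(len(global_avg_pool(small)), len(global_avg_pool(large)))  # 1024 1024
```

Because the dense classification head only ever sees this fixed-length 1024-vector, its weight count (1024 * 2 + 2 = 2050 here) is also independent of the input shape.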
