
Output shapes and parameters of a CNN with Keras

I have difficulty understanding the output shapes and number of parameters of layers in a Keras CNN model.

Let's take this toy example:

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D

model = Sequential()
model.add(Conv1D(7, kernel_size=40, activation="relu", input_shape=(60, 1)))
model.add(Conv1D(10, kernel_size=16, activation="relu"))
model.add(MaxPooling1D(pool_size=3))
model.summary()

The output is:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_17 (Conv1D)           (None, 21, 7)             287       
_________________________________________________________________
conv1d_18 (Conv1D)           (None, 6, 10)             1130      
_________________________________________________________________
max_pooling1d_11 (MaxPooling (None, 2, 10)             0         
=================================================================
Total params: 1,417
Trainable params: 1,417
Non-trainable params: 0
_________________________________________________________________

For the first Conv1D layer, there are 7 filters, each producing an output of size (60 - 40 + 1) = 21. The number of parameters is (40 + 1) * 7 = 287, which takes the bias into account. So I'm OK with that.
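(For reference, here is a minimal sketch of those two formulas in Python, assuming the Keras defaults of "valid" padding and stride 1; the helper names are made up for illustration.)

def conv1d_output_length(input_length, kernel_size, stride=1):
    # "valid" padding: the kernel must fit entirely inside the input
    return (input_length - kernel_size) // stride + 1

def conv1d_param_count(kernel_size, input_channels, filters):
    # each filter has kernel_size * input_channels weights, plus one bias
    return (kernel_size * input_channels + 1) * filters

print(conv1d_output_length(60, 40))   # 21
print(conv1d_param_count(40, 1, 7))   # 287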

But on which dimension does the second Conv1D layer operate? I guess that the output size of each filter is 21 - 16 + 1 = 6, but I don't understand by what operation we go from 7 to 10 in the last dimension. Nor do I understand how the number of parameters is computed.

Finally, I don't understand the output shape of the MaxPooling1D layer, since I would expect the output size to be 6 - 3 + 1 = 4 and not 2. How is it computed?

... but I don't understand by what operation we go from 7 to 10 in the last dimension.

By the same operation that took it from 1 to 7 in the first layer: each convolution filter is applied over the whole last axis (i.e. dimension) of its input and produces a single number at each application window. There are 10 filters in the second convolution layer, so 10 values are generated per window, and hence the last axis of the output has size 10 (the same reasoning applies to the first convolution layer as well).
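One way to see this concretely is to look at the kernel shape of the second layer. A minimal sketch, assuming the model from the question has already been built (the `model` variable and the random input window below are only for illustration):

import numpy as np

# `model` is the Sequential model defined in the question
kernel, bias = model.layers[1].get_weights()
print(kernel.shape)   # (16, 7, 10): (kernel_size, input_channels, filters)
print(bias.shape)     # (10,): one bias per filter

# Each of the 10 filters is a (16, 7) block: at every window position it is
# multiplied elementwise with a (16, 7) slice of the input and summed to a
# single number (before the ReLU), so the last axis of the output has size 10.
window = np.random.rand(16, 7)                            # one application window
one_value = (window * kernel[:, :, 0]).sum() + bias[0]    # a single scalar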

Nor do I understand how the number of parameters is computed.

There are 10 filters. As mentioned above, each filter is applied over the whole last axis, so it must have a width of 7 (i.e. the last-axis size of its input), and the kernel size is 16. So we have: 10 * (16 * 7) + 10 (1 bias per filter) = 1130.
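As a cross-check, the per-layer parameter counts can be read directly from Keras. A sketch assuming the model from the question has been built (the layer name suffixes in the comments may differ on your machine):

# `model` is the Sequential model defined in the question
for layer in model.layers:
    print(layer.name, layer.count_params())

# Expected output (name suffixes may differ):
# conv1d_17          287   = (40 * 1 + 1) * 7
# conv1d_18         1130   = (16 * 7 + 1) * 10
# max_pooling1d_11     0   pooling layers have no trainable weights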

Finally, I don't understand the output shape of the MaxPooling1D layer, since I would expect the output size to be 6 - 3 + 1 = 4 and not 2. How is it computed?

The stride of a 1D pooling layer is by default equal to the pool_size. Therefore, applied to a sequence of length 6, a pooling layer of size 3 has only 2 application windows.
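A minimal sketch of that output-length computation, assuming the Keras defaults ("valid" padding and strides equal to pool_size; the helper name is made up):

def pool1d_output_length(input_length, pool_size, strides=None):
    # Keras default: strides=None means strides == pool_size
    strides = strides or pool_size
    return (input_length - pool_size) // strides + 1

print(pool1d_output_length(6, 3))             # 2: windows [0:3] and [3:6]
print(pool1d_output_length(6, 3, strides=1))  # 4: the value the question expected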

Note: You may also find this relevant answer on how 1D convolutions work useful.
