How to understand the output shape of a convolutional layer

Question

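I am a bit confused about the output shape of a convolutional layer. For example, as shown in the figure, a 6 x 6 x 3 image convolved with 2 filters gives a final output of 4 x 4 x 2, so the three color channels are merged into one layer. But in some networks the color channels still seem to be kept after the convolutional layer. For example, with model.add(Conv2D(32, kernel_size=5,strides=1, activation=None, input_shape=(128,128,3))), the shape printed for the conv2d layer is (5, 5, 3, 32). The problem is that I do not see any specific code that says whether the color channels are kept or not.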

Answer 1

In the example image posted by the OP, the input has size 6 x 6 x 3 (input_dim=6, channel_in=3) and there are 2 filters of size 3 x 3 (filter_size=3). The spatial dimension can be computed as (input_dim - filter_size + 2 * padding) / stride + 1 = (6 - 3 + 2 * 0)/1 + 1 = 4 (where padding=0 and stride=1).
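The formula above can be checked with a few lines of Python. This is only a minimal sketch: the helper name conv_output_dim is made up for illustration, and it assumes the usual convention of flooring when the stride does not divide evenly.

```python
def conv_output_dim(input_dim, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution along one axis.

    Hypothetical helper for illustration; floor division is assumed
    for strides that do not divide the numerator evenly.
    """
    return (input_dim - filter_size + 2 * padding) // stride + 1


# 6 x 6 input with a 3 x 3 filter (the OP's first example)
print(conv_output_dim(6, 3))      # 4

# 128 x 128 input with a 5 x 5 filter (the OP's second example)
print(conv_output_dim(128, 5))    # 124
```

Note that the number of input channels does not appear anywhere in this formula: it only affects the depth of each filter, not the spatial size of the output.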

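Hence a 4 x 4 feature map. The operation a standard CNN layer uses to compute one element of this feature map is the same as that of a fully connected layer: a weighted sum over every value in the patch, across all channels, plus a bias. Consider the example filter and image patch below (from CS231n):

The output element is then computed as: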

import numpy as np

# filter weights of size 3 x 3 x 3
w0 = np.array([
    [[0., -1., 0.],
     [1., -1., 0.],
     [0., -1., 0.]],
    [[0., 1., -1.],
     [-1., 1., 0.],
     [1., -1., 0.]],
    [[-1., 0., 0.],
     [0., -1., -1.],
     [1., -1., 0.]]
])
# bias value for the filter
b0 = 1

# an input image patch 3 x 3 x 3
x_patch = np.array([
    [[0., 0., 0.],
     [0., 2., 1.],
     [0., 1., 1.]],
    [[0., 0., 0.],
     [0., 0., 1.],
     [0., 0., 1.]],
    [[0., 0., 0.],
     [0., 0., 0.],
     [0., 0., 2.]]
])

# define the operation for each channel
>>> op = lambda xs, ws: np.sum(xs*ws)
>>> op(x_patch[:, :, 0], w0[:, :, 0]) # channel 1
0.0
>>> op(x_patch[:, :, 1], w0[:, :, 1]) # channel 2
-3.0
>>> op(x_patch[:, :, 2], w0[:, :, 2]) # channel 3
0.0

# add the values for each channel (this is where 
# channel dimension is summed over) plus the bias
>>> 0.0 + (-3.0) + 0.0 + b0
-2.0

# or simply
>>> np.sum(x_patch * w0) + b0
-2.0

This is the usual case for CNNs. By contrast, in a depthwise convolution the channel dimension is kept as it is.

TensorFlow provides a separate implementation for each of them, tf.keras.layers.Conv2D and tf.keras.layers.DepthwiseConv2D, so you can use whichever fits your application.
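To make the difference concrete, here is a naive NumPy sketch of both operations on a 6 x 6 x 3 input. It is not how TensorFlow implements either layer; the function names conv2d_standard and conv2d_depthwise are made up for illustration, and it assumes no padding, stride 1, and one depthwise filter per channel.

```python
import numpy as np


def conv2d_standard(x, w, b):
    """x: (H, W, C_in), w: (k, k, C_in, C_out), b: (C_out,)."""
    k, _, _, c_out = w.shape
    h_out, w_out = x.shape[0] - k + 1, x.shape[1] - k + 1
    y = np.zeros((h_out, w_out, c_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + k, j:j + k, :]  # shape (k, k, C_in)
            for f in range(c_out):
                # Sum over height, width AND channels: one value per filter.
                y[i, j, f] = np.sum(patch * w[:, :, :, f]) + b[f]
    return y


def conv2d_depthwise(x, w):
    """x: (H, W, C), w: (k, k, C). Each channel is filtered on its own."""
    k = w.shape[0]
    h_out, w_out = x.shape[0] - k + 1, x.shape[1] - k + 1
    y = np.zeros((h_out, w_out, x.shape[2]))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i:i + k, j:j + k, :]
            # Sum over height and width only: the channel axis is kept.
            y[i, j, :] = np.sum(patch * w, axis=(0, 1))
    return y


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6, 3))

y_std = conv2d_standard(x, rng.normal(size=(3, 3, 3, 2)), np.zeros(2))
y_dw = conv2d_depthwise(x, rng.normal(size=(3, 3, 3)))

print(y_std.shape)  # (4, 4, 2): last axis is the number of filters
print(y_dw.shape)   # (4, 4, 3): last axis is the number of input channels
```

In the standard case the last axis of the output counts filters, not colors, which is why nothing in the Conv2D call needs to say whether the color channels are kept.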

For your second example (using tf v2.9.0), I could not reproduce an output size of 5 x 5 x 3 x 32:

import tensorflow as tf

# The inputs are 128 x 128 RGB images with 
# `data_format=channels_last` (by default) and 
# the batch size is 4.
>>> input_shape = (4, 128, 128, 3)
>>> x = tf.random.normal(input_shape)
>>> y = tf.keras.layers.Conv2D(
 32, 
 kernel_size=5, 
 strides=1, 
 activation=None, 
 input_shape=(128, 128, 3)
)(x)
>>> print(y.shape)
(4, 124, 124, 32)

The example code is slightly adapted from the example in the official documentation.

Answer 2

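# Imports are not shown in the original post; these are assumed.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                     MaxPool2D, Dropout, Flatten, Dense)

model = Sequential()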

model.add(Conv2D(32, kernel_size=5,strides=1, activation=None, input_shape=(128,128,3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPool2D(2,2))
model.add(Dropout(0.2))

model.add(Flatten())
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(units=1))
model.add(Activation('sigmoid'))


for layer in model.layers:
    # check for convolutional layer
    if 'conv' not in layer.name:
        continue
    # get filter weights
    filters, biases = layer.get_weights()
    print(layer.name, filters.shape)

So when I print the conv layer's output shape, it shows conv2d_46 (5, 5, 3, 32). When I print the summary, the shape shown is different. What is None?

Layer (type)                                  Output Shape            Param #
conv2d_45 (Conv2D)                            (None, 124, 124, 32)    2432
batch_normalization_38 (BatchNormalization)   (None, 124, 124, 32)    128
activation_36 (Activation)                    (None, 124, 124, 32)    0
max_pooling2d_17 (MaxPooling2D)               (None, 62, 62, 32)      0
dropout_26 (Dropout)                          (None, 62, 62, 32)      0
flatten_11 (Flatten)                          (None, 123008)          0
dense_23 (Dense)                              (None, 64)              7872576
dropout_27 (Dropout)                          (None, 64)              0
dense_24 (Dense)                              (None, 1)               65
activation_37 (Activation)                    (None, 1)               0
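The two printouts describe different things, which may be the source of the confusion. layer.get_weights() returns the layer's parameters, so (5, 5, 3, 32) is the kernel shape (kernel height, kernel width, input channels, filters), not the output shape. The summary lists output shapes, where None is the batch dimension, which is left unspecified until data is fed in. The sketch below recomputes the numbers in the summary with plain arithmetic; the helper names are made up for illustration and are not part of Keras.

```python
def conv2d_kernel_shape(kernel_size, in_channels, filters):
    # Layout of a Conv2D kernel: (height, width, input channels, filters)
    return (kernel_size, kernel_size, in_channels, filters)


def conv2d_param_count(kernel_size, in_channels, filters):
    # One weight per kernel entry, plus one bias per filter
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_param_count(inputs, units):
    # One weight per (input, unit) pair, plus one bias per unit
    return inputs * units + units


kernel_shape = conv2d_kernel_shape(5, 3, 32)
conv_params = conv2d_param_count(5, 3, 32)

conv_side = (128 - 5) // 1 + 1       # spatial size after the 5 x 5 convolution
pooled_side = conv_side // 2         # spatial size after 2 x 2 max pooling
flat_size = pooled_side * pooled_side * 32

bn_params = 4 * 32                   # gamma, beta, moving mean, moving variance
dense1_params = dense_param_count(flat_size, 64)
dense2_params = dense_param_count(64, 1)

print(kernel_shape)    # (5, 5, 3, 32)
print(conv_params)     # 2432
print(conv_side)       # 124
print(pooled_side)     # 62
print(flat_size)       # 123008
print(bn_params)       # 128
print(dense1_params)   # 7872576
print(dense2_params)   # 65
```

The 3 in the kernel shape shows that every one of the 32 filters spans all three color channels, and those channels are summed away when each output value is computed, leaving 32 output channels.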

Answer 1
1 2022-07-26 10:48:59

Answer 2
0 2022-07-27 13:43:37
