Meaning of parameters in torch.nn.Conv2d

Question

In lecture 7 of the fastai course Cutting Edge Deep Learning for Coders:

 self.conv1 = nn.Conv2d(3,10,kernel_size = 5,stride=1,padding=2)

Does 10 mean the number of filters, or the number of activations the filters will produce?

Answer 1

Here is what the documentation gives:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

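Parameters

in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
kernel_size (int or tuple) – Size of the convolving kernel
stride (int or tuple, optional) – Stride of the convolution. (Default: 1)
padding (int or tuple, optional) – Zero-padding added to both sides of the input. (Default: 0)
padding_mode (string, optional) – Type of padding. (Default: 'zeros')
dilation (int or tuple, optional) – Spacing between kernel elements. (Default: 1)
groups (int, optional) – Number of blocked connections from input channels to output channels. (Default: 1)
bias (bool, optional) – If True, adds a learnable bias to the output. (Default: True)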
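To see how kernel_size, stride, padding and dilation interact, here is a minimal sketch of the output-size formula given in the PyTorch Conv2d documentation, applied to one spatial dimension. The helper name conv2d_out_size is made up for this illustration and is not part of PyTorch.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (height or width).

    Hypothetical helper, not a PyTorch function. It follows the formula
    from the Conv2d docs:
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The layer from the question: kernel_size=5, stride=1, padding=2 on a 28-pixel side.
print(conv2d_out_size(28, kernel_size=5, stride=1, padding=2))  # 28

# Same kernel without padding: the output shrinks.
print(conv2d_out_size(28, kernel_size=5))  # 24

# Same kernel with stride=2 and padding=2: the output is halved.
print(conv2d_out_size(28, kernel_size=5, stride=2, padding=2))  # 14
```

With kernel_size=5, stride=1 and padding=2 the spatial size is preserved, which is why the layer in the question maps a 28x28 input to a 28x28 output.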

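This URL has a helpful visualization of the process.

So for an image with 3 channels (a color image), in_channels of the first layer is 3. For a black-and-white image it should be 1. Some satellite images should have 4.

out_channels is the number of filters, and you can set it arbitrarily.

Let's create an example to "prove" this.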

import torch
import torch.nn as nn
c = nn.Conv2d(1,3, stride = 1, kernel_size=(4,5))
print(c.weight.shape)
print(c.weight)

Out:

torch.Size([3, 1, 4, 5])
Parameter containing:
tensor([[[[ 0.1571,  0.0723,  0.0900,  0.1573,  0.0537],
          [-0.1213,  0.0579,  0.0009, -0.1750,  0.1616],
          [-0.0427,  0.1968,  0.1861, -0.1787, -0.2035],
          [-0.0796,  0.1741, -0.2231,  0.2020, -0.1762]]],


        [[[ 0.1811,  0.0660,  0.1653,  0.0605,  0.0417],
          [ 0.1885, -0.0440, -0.1638,  0.1429, -0.0606],
          [-0.1395, -0.1202,  0.0498,  0.0432, -0.1132],
          [-0.2073,  0.1480, -0.1296, -0.1661, -0.0633]]],


        [[[ 0.0435, -0.2017,  0.0676, -0.0711, -0.1972],
          [ 0.0968, -0.1157,  0.1012,  0.0863, -0.1844],
          [-0.2080, -0.1355, -0.1842, -0.0017, -0.2123],
          [-0.1495, -0.2196,  0.1811,  0.1672, -0.1817]]]], requires_grad=True)

If we change the number of out_channels,

c = nn.Conv2d(1,5, stride = 1, kernel_size=(4,5))
print(c.weight.shape) # torch.Size([5, 1, 4, 5])

we get 5 filters, each 4x5, since that is our kernel size. If we set 2 input channels (some images may have only 2 channels),

c = nn.Conv2d(2,5, stride = 1, kernel_size=(4,5))
print(c.weight.shape) # torch.Size([5, 2, 4, 5])

each of our filters will have 2 channels.
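The weight shapes above are (out_channels, in_channels, kernel height, kernel width), so the number of learnable parameters can be counted by hand. The following is a rough sketch assuming groups=1; conv2d_param_count is a hypothetical helper written for this answer, not a PyTorch function.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Count learnable parameters of a Conv2d layer, assuming groups=1.

    Hypothetical helper: one filter has in_channels * kh * kw weights,
    there are out_channels filters, plus one bias per filter if enabled.
    """
    if isinstance(kernel_size, int):
        kh, kw = kernel_size, kernel_size
    else:
        kh, kw = kernel_size
    weights = out_channels * in_channels * kh * kw
    return weights + (out_channels if bias else 0)


print(conv2d_param_count(1, 3, (4, 5)))  # 63  = 3*1*4*5 + 3
print(conv2d_param_count(2, 5, (4, 5)))  # 205 = 5*2*4*5 + 5
print(conv2d_param_count(3, 10, 5))      # 760 = 10*3*5*5 + 10
```

The last call corresponds to the layer from the question, nn.Conv2d(3, 10, kernel_size=5, ...): 10 filters, each with 3 channels of 5x5 weights, plus 10 biases.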

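I think they took their terminology from this book, and since the book does not call them filters, they did not use that term.

So you are right; filters are what the conv layer learns, and the number of filters is the number of out channels. They are set randomly at the start.

The number of activations is calculated from bs (the batch size) and the image dimensions: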
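A small sketch of the random initialization, assuming a standard PyTorch install: two layers built with identical arguments start with different filters, while seeding the global random generator with torch.manual_seed before construction makes the initial filters repeatable. The seed value 0 is arbitrary.

```python
import torch
import torch.nn as nn

# Two layers built with the same arguments draw different random filters.
a = nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2)
b = nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2)
print(a.weight.shape)                    # 10 filters, each 3 x 5 x 5
print(torch.equal(a.weight, b.weight))   # expected to be False

# Seeding the global RNG right before construction repeats the same draw.
torch.manual_seed(0)
c = nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2)
torch.manual_seed(0)
d = nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2)
print(torch.equal(c.weight, d.weight))   # same seed, same initial filters
```

Training then updates these starting values through backpropagation.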

bs=16
x = torch.randn(bs, 3, 28, 28)
c = nn.Conv2d(3,10,kernel_size=5,stride=1,padding=2)
out = c(x)
print(out.nelement()) #125440 number of activations
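The 125440 above can also be reproduced without running the layer. This is a minimal sketch in plain Python, assuming dilation=1 and a square kernel; the function name num_activations is invented for this illustration.

```python
def num_activations(bs, out_channels, height, width, kernel_size, stride=1, padding=0):
    """Number of elements in the Conv2d output, assuming dilation=1.

    Hypothetical helper: the output has shape
    (bs, out_channels, out_h, out_w), so the count is the product.
    """
    out_h = (height + 2 * padding - (kernel_size - 1) - 1) // stride + 1
    out_w = (width + 2 * padding - (kernel_size - 1) - 1) // stride + 1
    return bs * out_channels * out_h * out_w


# Batch of 16 images, 28x28, through nn.Conv2d(3, 10, kernel_size=5, stride=1, padding=2).
print(num_activations(16, 10, 28, 28, kernel_size=5, stride=1, padding=2))  # 125440
```

Note that in_channels does not appear in the count: it affects the size of each filter, not the size of the output.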

Answer 2

Check the docs: https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d. You have 3 in_channels and 10 out_channels, so, @thefifthjack005, these 10 out_channels are the filters, also known as features.

Answer 1
55 2019-06-20 12:16:57

Answer 2
2 2019-06-20 12:05:43
