I am struggling to work out how to calculate the dimensions for the fully connected layer. I am inputting images of size 448x448 with a batch size of 16. Below is the code for my convolutional layers:
import torch.nn as nn

class ConvolutionalNet(nn.Module):
    def __init__(self, num_classes=182):
        super().__init__()
        # Each block: 5x5 conv (size-preserving) -> batch norm -> ReLU -> 2x2 max pool
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(32, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer5 = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
I want to add a fully connected layer:
        self.fc = nn.Linear(?, num_classes)
Would anyone be able to explain the best way to go about calculating this? Also, if I have multiple fully connected layers (e.g. self.fc2, self.fc3), would the second parameter always equal the number of classes? I am new to coding and finding it hard to wrap my head around this.
The conv layers don't change the width/height of the features since you've set padding equal to (kernel_size - 1) / 2. Max pooling with kernel_size = stride = 2 will decrease the width/height by a factor of 2 (rounded down if the input size is odd).
Using 448 as input width/height, the output width/height will be 448 // 2 // 2 // 2 // 2 // 2 = 448 / 32 = 14 (where // is the floor-divide operator).
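The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: conv_out is a hypothetical helper name, and it assumes the usual output-size formula for convolution and pooling windows with dilation 1, floor((n + 2*padding - kernel_size) / stride) + 1.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    # Output width/height of a conv or pooling window (assumes dilation 1).
    return (size + 2 * padding - kernel_size) // stride + 1

size = 448
for _ in range(5):  # five conv + max-pool blocks, as in the question
    size = conv_out(size, kernel_size=5, stride=1, padding=2)  # conv keeps the size
    size = conv_out(size, kernel_size=2, stride=2)              # pool halves it
print(size)
```

Since both width and height are 448, the same value applies to each spatial dimension.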
The number of channels is fully determined by the last conv layer, which outputs 64 channels.
Therefore you will have a [B, 64, 14, 14] shaped tensor, so the Linear layer should have in_features = 64*14*14 = 12544.
Note you'll need to flatten the input beforehand, something like:
        self.layer6 = nn.Sequential(
            nn.Flatten(),
            nn.Linear(12544, num_classes)
        )
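If you'd rather not do the arithmetic by hand, one common approach is to push a dummy tensor through the conv layers and read off the flattened size. The sketch below rebuilds the five blocks from the question with a hypothetical conv_block helper (not part of the original code) and assumes a 3-channel 448x448 input.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Same pattern as each layer in the question: conv -> BN -> ReLU -> max pool.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=5, stride=1, padding=2),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride=2),
    )

features = nn.Sequential(
    conv_block(3, 16), conv_block(16, 32), conv_block(32, 32),
    conv_block(32, 64), conv_block(64, 64),
)

features.eval()  # use running stats so batch norm accepts a batch of one
with torch.no_grad():
    out = features(torch.zeros(1, 3, 448, 448))  # dummy image, batch size 1

feature_shape = tuple(out.shape)       # (batch, channels, height, width)
in_features = out.flatten(1).shape[1]  # channels * height * width
print(feature_shape, in_features)
```

The batch size does not affect in_features, so a batch of one is enough for the check; the resulting value is what you would pass as the first argument of nn.Linear.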