Pytorch Conv1D gives different size to ConvTranspose1d

Question

I am trying to build a basic, shallow CNN auto-encoder for 1D time series data in PyTorch / PyTorch Lightning.

Currently, my encoding block is:

import torch
import torch.nn as nn

class encodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
                
        self.conv1d_1 = nn.Conv1d(1, 64, kernel_size=32)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(64)
        self.maxpool = nn.MaxPool1d(kernel_size=2, stride=2, return_indices=True)
        self.fc = nn.Linear(64, 4)

    def forward(self, x):
        cnn_out1 = self.conv1d_1(x)
        norm_out1 = self.batchnorm(cnn_out1)
        relu_out1 = self.relu(norm_out1)
        maxpool_out, indices = self.maxpool(relu_out1)
        gap_out = torch.mean(maxpool_out, dim = 2)
        fc_out = self.relu(self.fc(gap_out))
        return fc_out, indices

And my decoding block is:

class decodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
                
        self.Tconv1d_1 = nn.ConvTranspose1d(64, 1, kernel_size=32, output_padding=1)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(1)
        self.maxunpool = nn.MaxUnpool1d(kernel_size=2, stride=2)
        self.upsamp = nn.Upsample(size=59, mode='nearest')
        self.fc = nn.Linear(4, 64)

    def forward(self, x, indices):
        fc_out = self.fc(x)
        relu_out = self.relu(fc_out)
        relu_out = relu_out.unsqueeze(dim = 2)
        upsamp_out = self.upsamp(relu_out)
        maxpool_out = self.maxunpool(upsamp_out, indices)
        cnnT_out = self.Tconv1d_1(maxpool_out)
        norm_out = self.batchnorm(cnnT_out)
        relu_out = self.relu(norm_out)            
        return relu_out

However, looking at the outputs:

Input size: torch.Size([1, 1, 150])
Conv1D out size: torch.Size([1, 64, 119])
Maxpool out size: torch.Size([1, 64, 59])
Global average pooling out size: torch.Size([1, 64])
Encoder dense out size: torch.Size([1, 4])
...
Decoder input: torch.Size([1, 4])
Decoder dense out size: torch.Size([1, 64])
Unsqueeze out size: torch.Size([1, 64, 1])
Upsample out size: torch.Size([1, 64, 59])
Decoder maxunpool out size: torch.Size([1, 64, 118])
Transpose Conv out size: torch.Size([1, 1, 149])

The outputs of the MaxUnpool1d and ConvTranspose1d layers (lengths 118 and 149) do not match the corresponding encoder sizes (119 after Conv1d and 150 at the input).

I have two questions that I was hoping to get some help on:

1. Why are the dimensions wrong?
2. Is there a better way to "reverse" the global average pooling than the upsampling procedure I have used?

Answer 1

1. Regarding input and output shapes:
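PyTorch's documentation gives an explicit formula relating input and output lengths for each layer. For convolution (Conv1d):

    L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)

Similarly for pooling (MaxPool1d), where stride defaults to kernel_size:

    L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)

For transposed convolution (ConvTranspose1d):

    L_out = (L_in - 1)*stride - 2*padding + dilation*(kernel_size - 1) + output_padding + 1

And for unpooling (MaxUnpool1d):

    L_out = (L_in - 1)*stride - 2*padding + kernel_size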

Make sure your padding and output_padding values add up to the proper output shape.
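As a rough check, the documented length formulas can be written as plain Python and traced through the layers in the question. The function names below are hypothetical helpers, not part of PyTorch, and the trace assumes output_padding=0 in the transposed convolution, since that is the value that reproduces the reported length of 149.

```python
# Hypothetical helpers that mirror the output-length formulas in the PyTorch docs.

def conv1d_out(l_in, kernel_size, stride=1, padding=0, dilation=1):
    # Integer floor division plays the role of floor() in the documented formula.
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def max_pool1d_out(l_in, kernel_size, stride=None, padding=0, dilation=1):
    # MaxPool1d uses stride = kernel_size when no stride is given.
    stride = kernel_size if stride is None else stride
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def max_unpool1d_out(l_in, kernel_size, stride=None, padding=0):
    stride = kernel_size if stride is None else stride
    return (l_in - 1) * stride - 2 * padding + kernel_size


def conv_transpose1d_out(l_in, kernel_size, stride=1, padding=0,
                         output_padding=0, dilation=1):
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Trace the sizes from the question (input length 150).
conv_len = conv1d_out(150, kernel_size=32)                   # 119
pool_len = max_pool1d_out(conv_len, kernel_size=2, stride=2)  # 59 (floor drops one sample)
unpool_len = max_unpool1d_out(pool_len, kernel_size=2, stride=2)  # 118
deconv_len = conv_transpose1d_out(unpool_len, kernel_size=32)     # 149

# If the unpooling step restores the true pre-pool length (119),
# the transposed convolution lands back on 150.
deconv_fixed = conv_transpose1d_out(conv_len, kernel_size=32)     # 150

print(conv_len, pool_len, unpool_len, deconv_len, deconv_fixed)
```

The trace shows where the sample is lost: pooling an odd length (119) floors to 59, and unpooling 59 can only infer 118. MaxUnpool1d's forward accepts an optional output_size argument for exactly this ambiguity, so passing the Conv1d output size there is one way to recover 119 and, after the transposed convolution, 150.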

2. Is there a better way?
Transposed convolution has its faults, as you already noticed. It also tends to produce "checkerboard artifacts".

One solution is to use pixel shuffle: that is, predict twice the number of channels for each low-resolution point, and then split them into two points, each with the desired number of features.
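A minimal sketch of that idea for 1D signals is shown below. PyTorch's built-in nn.PixelShuffle targets 2D feature maps, so the helper pixel_shuffle_1d and the module PixelShuffleUpsample1d here are hypothetical names written for illustration; the kernel size of 3 is also an assumption.

```python
import torch
import torch.nn as nn


def pixel_shuffle_1d(x: torch.Tensor, upscale_factor: int) -> torch.Tensor:
    """Rearrange (N, C * r, L) into (N, C, L * r). Hypothetical helper."""
    n, c_times_r, length = x.shape
    if c_times_r % upscale_factor != 0:
        raise ValueError("channel count must be divisible by upscale_factor")
    c = c_times_r // upscale_factor
    # Split channels into (C, r), then interleave the r copies along the length axis.
    x = x.reshape(n, c, upscale_factor, length)
    x = x.permute(0, 1, 3, 2)
    return x.reshape(n, c, length * upscale_factor)


class PixelShuffleUpsample1d(nn.Module):
    """Conv1d that predicts r times the channels, followed by a 1D pixel shuffle."""

    def __init__(self, in_channels, out_channels, upscale_factor=2, kernel_size=3):
        super().__init__()
        self.upscale_factor = upscale_factor
        # An odd kernel with padding = kernel_size // 2 keeps the length unchanged.
        self.conv = nn.Conv1d(in_channels, out_channels * upscale_factor,
                              kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        return pixel_shuffle_1d(self.conv(x), self.upscale_factor)


up = PixelShuffleUpsample1d(64, 64, upscale_factor=2)
y = up(torch.randn(1, 64, 59))
print(y.shape)  # torch.Size([1, 64, 118])
```

Note that this doubles the length exactly, so an odd target such as 119 would still need padding or a final resize.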

Alternatively, you can interpolate from the low resolution to the higher one using a fixed method, and then apply regular convolutions to the upsampled vectors. If you choose this path, you might consider using ResizeRight instead of PyTorch's interpolate, as it handles edge cases better.
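The interpolate-then-convolve route could look roughly like the sketch below. It uses torch.nn.functional.interpolate as a stand-in for the fixed resize step (not ResizeRight), and the module name ResizeConv1d, the linear mode, and the kernel size are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResizeConv1d(nn.Module):
    """Fixed interpolation to a target length, followed by a regular Conv1d."""

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # An odd kernel with padding = kernel_size // 2 preserves the length.
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x, size):
        # Resize to exactly `size` samples, so odd target lengths are not a problem.
        x = F.interpolate(x, size=size, mode="linear", align_corners=False)
        return self.conv(x)


decoder_head = ResizeConv1d(64, 1)
recon = decoder_head(torch.randn(1, 64, 59), size=150)
print(recon.shape)  # torch.Size([1, 1, 150])
```

Because the target length is passed explicitly, the output can be made to match the original input length of 150 without relying on output_padding.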

1 answers

solution1
0 2021-11-10 15:39:19