Pytorch Conv1D gives different size to ConvTranspose1d
I am trying to build a basic/shallow CNN auto-encoder for 1D time series data in pytorch/pytorch-lightning.
Currently, my encoding block is:
import torch
import torch.nn as nn

class encodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1d_1 = nn.Conv1d(1, 64, kernel_size=32)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(64)
        # return_indices=True so the decoder can reuse the pooling indices
        self.maxpool = nn.MaxPool1d(kernel_size=2, stride=2, return_indices=True)
        self.fc = nn.Linear(64, 4)

    def forward(self, x):
        cnn_out1 = self.conv1d_1(x)
        norm_out1 = self.batchnorm(cnn_out1)
        relu_out1 = self.relu(norm_out1)
        maxpool_out, indices = self.maxpool(relu_out1)
        # global average pooling over the length dimension
        gap_out = torch.mean(maxpool_out, dim=2)
        fc_out = self.relu(self.fc(gap_out))
        return fc_out, indices
And my decoding block is:
class decodingBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.Tconv1d_1 = nn.ConvTranspose1d(64, 1, kernel_size=32, output_padding=1)
        self.relu = nn.ReLU()
        self.batchnorm = nn.BatchNorm1d(1)
        self.maxunpool = nn.MaxUnpool1d(kernel_size=2, stride=2)
        self.upsamp = nn.Upsample(size=59, mode='nearest')
        self.fc = nn.Linear(4, 64)

    def forward(self, x, indices):
        fc_out = self.fc(x)
        relu_out = self.relu(fc_out)
        relu_out = relu_out.unsqueeze(dim=2)
        upsamp_out = self.upsamp(relu_out)
        maxpool_out = self.maxunpool(upsamp_out, indices)
        cnnT_out = self.Tconv1d_1(maxpool_out)
        norm_out = self.batchnorm(cnnT_out)
        relu_out = self.relu(norm_out)
        return relu_out
However, looking at the outputs:
Input size: torch.Size([1, 1, 150])
Conv1D out size: torch.Size([1, 64, 119])
Maxpool out size: torch.Size([1, 64, 59])
Global average pooling out size: torch.Size([1, 64])
Encoder dense out size: torch.Size([1, 4])
...
Decoder input: torch.Size([1, 4])
Decoder dense out size: torch.Size([1, 64])
Unsqueeze out size: torch.Size([1, 64, 1])
Upsample out size: torch.Size([1, 64, 59])
Decoder maxunpool out size: torch.Size([1, 64, 118])
Transpose Conv out size: torch.Size([1, 1, 149])
The outputs from the MaxUnpool1d and ConvTranspose1d layers are not the expected dimension.
I have two questions that I was hoping to get some help on:
1. Regarding input and output shapes:
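pytorch's doc has the explicit formula relating input and output sizes. For convolution:
$$L_{out} = \left\lfloor \frac{L_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1 \right\rfloor$$
For transposed convolution:
$$L_{out} = (L_{in} - 1) \times \text{stride} - 2 \times \text{padding} + \text{dilation} \times (\text{kernel\_size} - 1) + \text{output\_padding} + 1$$
Make sure your padding and output_padding values add up to the proper output shape.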
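To see where the missing sample goes, the documented size formulas can be evaluated by hand. The sketch below is plain Python; the helper names are made up for illustration, and the pooling helpers assume padding 0 and dilation 1. It uses output_padding=0 for the transposed convolution, which is the setting that reproduces the 149 reported in the question.

```python
def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    # Conv1d output length formula (floor division implements the floor)
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

def conv_transpose1d_out_len(l_in, kernel_size, stride=1, padding=0,
                             output_padding=0, dilation=1):
    # ConvTranspose1d output length formula
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

def max_pool1d_out_len(l_in, kernel_size, stride):
    # MaxPool1d with padding=0, dilation=1 (assumed here)
    return (l_in - kernel_size) // stride + 1

def max_unpool1d_out_len(l_in, kernel_size, stride):
    # MaxUnpool1d default output length with padding=0 (assumed here)
    return (l_in - 1) * stride + kernel_size

after_conv = conv1d_out_len(150, kernel_size=32)                       # 119
after_pool = max_pool1d_out_len(after_conv, 2, 2)                      # 59
after_unpool = max_unpool1d_out_len(after_pool, 2, 2)                  # 118
after_tconv = conv_transpose1d_out_len(after_unpool, kernel_size=32)   # 149

# If the unpooled length were 119 instead of 118, the shapes would line up:
fixed = conv_transpose1d_out_len(after_conv, kernel_size=32)           # 150
print(after_conv, after_pool, after_unpool, after_tconv, fixed)
```

The sample is lost in the pooling step: an odd length of 119 is floored to 59, so the default unpooled length is 118. One way to recover it is to pass the optional output_size argument when calling MaxUnpool1d, for example the size of the tensor that went into the max pool.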
2. Is there a better way?
Transposed convolution has its faults, as you already noticed. It also tends to produce "checkerboard artifacts".
One solution is to use pixelshuffle: that is, predict for each low-res point twice the number of channels, and then split them into two points with the desired number of features.
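A rough sketch of the pixelshuffle idea in 1D is given here. torch.nn.PixelShuffle is written for 2D image tensors, so this example does the rearrangement by hand with a reshape; the class name PixelShuffle1d and the layer sizes are hypothetical choices for illustration, not something taken from the question.

```python
import torch
import torch.nn as nn

class PixelShuffle1d(nn.Module):
    """Hypothetical helper: rearranges (N, C*r, L) into (N, C, L*r)."""

    def __init__(self, upscale_factor):
        super().__init__()
        self.r = upscale_factor

    def forward(self, x):
        n, c_times_r, length = x.shape
        c = c_times_r // self.r
        x = x.reshape(n, c, self.r, length)   # split channels into (C, r)
        x = x.permute(0, 1, 3, 2)             # (N, C, L, r)
        return x.reshape(n, c, length * self.r)  # interleave r values per point

# A regular convolution predicts twice the channels for each low-res point...
conv = nn.Conv1d(64, 2 * 64, kernel_size=3, padding=1)
shuffle = PixelShuffle1d(upscale_factor=2)

low_res = torch.randn(1, 64, 59)
high_res = shuffle(conv(low_res))   # ...which are then split into two points
print(high_res.shape)               # torch.Size([1, 64, 118])

# Small check of the interleaving order on a tensor with known values
demo = shuffle(torch.arange(6.0).reshape(1, 2, 3))
```

Because every output point comes from exactly one input point, the uneven kernel overlap that causes checkerboard patterns in transposed convolutions does not occur here.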
Alternatively, you can interpolate using a fixed method from the low resolution to the higher one. Apply regular convolutions to the upsampled vectors. If you choose this path, you might consider using ResizeRight instead of pytorch's interpolate - it has better handling of edge cases.
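The interpolate-then-convolve route might look like the sketch below. It uses torch.nn.functional.interpolate (not ResizeRight), and the block name UpsampleConv1d, the kernel size, and the linear mode are assumptions made for the example. Since the target length is passed explicitly, the output size no longer depends on padding arithmetic.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpsampleConv1d(nn.Module):
    """Hypothetical block: fixed interpolation followed by a regular conv."""

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # An odd kernel with padding = kernel_size // 2 keeps the length unchanged
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, x, out_len):
        # Resize to the exact target length, then let the conv refine it
        x = F.interpolate(x, size=out_len, mode="linear", align_corners=False)
        return self.conv(x)

block = UpsampleConv1d(64, 1)
low_res = torch.randn(1, 64, 59)
out = block(low_res, out_len=150)
print(out.shape)  # torch.Size([1, 1, 150])
```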