
How to control output dimensions of PyTorch ConvTranspose1d?


I'm currently building a convolutional encoder-decoder network in PyTorch, using Conv1d layers for the encoder and ConvTranspose1d layers for the decoder. Unfortunately, the output dimensions of the decoder do not match those of the encoder.

How can I ensure the decoder shapes match the encoder shapes?

The code:

## Building the neural network
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

class Net(nn.Module):
    def __init__(self):
      super(Net, self).__init__()

      
      self.conv11 = nn.Conv1d(1, 12, kernel_size=(8,13), stride=1)  # note: the tuple kernel sizes make these layers act on both the 8-row and 129-column axes of the input
      self.relu11 = nn.ReLU(inplace=False)
      self.batch11 = nn.BatchNorm2d(12)
      self.conv12 = nn.Conv1d(12, 16, (1,11), 1)
      self.relu12 = nn.ReLU(inplace=False)
      self.batch12 = nn.BatchNorm2d(16)
      self.conv13 = nn.Conv1d(16, 20, (1,9), 1)
      self.relu13 = nn.ReLU(inplace=False)
      self.batch13 = nn.BatchNorm2d(20)
      self.conv14 = nn.Conv1d(20, 24, (1,7), 1)
      self.relu14 = nn.ReLU(inplace=False)
      self.batch14 = nn.BatchNorm2d(24)
      self.conv15 = nn.Conv1d(24, 32, (1,7), 1)
      self.relu15 = nn.ReLU(inplace=False)
      self.batch15 = nn.BatchNorm2d(32)

      # ConvTranspose explained: https://medium.com/@marsxiang/convolutions-transposed-and-deconvolution-6430c358a5b6
      self.conv25 = nn.ConvTranspose1d(32, 24, (1,7), 1)
      self.relu25 = nn.ReLU(inplace=False)
      self.batch25 = nn.BatchNorm2d(24)
      self.conv24 = nn.ConvTranspose1d(24, 20, (1,9), 1) ### Problem Layer
      self.relu24 = nn.ReLU(inplace=False)
      self.batch24 = nn.BatchNorm2d(20)
      self.conv23 = nn.ConvTranspose1d(20, 16, (1,11), 1) ### Problem Layer
      self.relu23 = nn.ReLU(inplace=False)
      self.batch23 = nn.BatchNorm2d(16)
      self.conv22 = nn.ConvTranspose1d(16, 12, (1,13), 1) ### Problem Layer
      self.relu22 = nn.ReLU(inplace=False) 
      self.batch22 = nn.BatchNorm2d(12)
      self.conv21 = nn.ConvTranspose1d(12, 1, (1,129), 1)

    def forward(self, x):
      print("Forward pass")
      print(x.shape)
      x = self.batch11(self.relu11(self.conv11(x))) #First Layer
      print("Encoder")
      print(x.shape)
      x = self.batch12(self.relu12(self.conv12(x)))
      print(x.shape)
      x = self.batch13(self.relu13(self.conv13(x)))
      print(x.shape)
      x = self.batch14(self.relu14(self.conv14(x)))
      print(x.shape)
      shape14 = x.shape
      x = self.batch15(self.relu15(self.conv15(x)))
      print("Latent Space")
      print(x.shape)
      x = self.batch25(self.relu25(self.conv25(x)))
      print("Decoder")
      print(x.shape)
      x = self.batch24(self.relu24(self.conv24(x))) ### Problem Layer
      print("Problem Layer")
      print(x.shape)
      x = self.batch23(self.relu23(self.conv23(x))) ### Problem Layer
      print("Problem Layer")
      print(x.shape)
      x = self.batch22(self.relu22(self.conv22(x))) ### Problem Layer
      print(x.shape)
      x = self.conv21(x)
      print("Output Layer")
      print(x.shape)
      return x

net = Net()
print(net)

Creating dummy data and calculating a forward pass of the network:

test_samples = np.random.rand(5,8,129) ##Dummy data
Z_samples = test_samples
print(Z_samples.shape)
print(Z_samples[0,:,:].shape)
inp = torch.from_numpy(Z_samples[0,:,:]).float()
print(inp.shape)
inp = torch.unsqueeze(inp, 0)
inp = torch.unsqueeze(inp, 0)
print(inp.shape)
out = net(inp)
print("Out Shape")
print(out.shape)

Console output of the above block:

(5, 8, 129)
(8, 129)
torch.Size([8, 129])
torch.Size([1, 1, 8, 129])
Forward pass
torch.Size([1, 1, 8, 129])
Encoder
torch.Size([1, 12, 1, 117])
torch.Size([1, 16, 1, 107])
torch.Size([1, 20, 1, 99])
torch.Size([1, 24, 1, 93])
Latent Space
torch.Size([1, 32, 1, 87])
Decoder
torch.Size([1, 24, 1, 93])  # Remark: This Layer-Output is fine
Problem Layer
torch.Size([1, 20, 1, 101]) # Remark: Here the last dimension should be 99 instead of 101
Problem Layer
torch.Size([1, 16, 1, 111]) # Remark: Here the last dimension should be 107 instead of 111
torch.Size([1, 12, 1, 123]) # Remark: Here the last dimension should be 117 instead of 123
Output Layer
torch.Size([1, 1, 1, 251]) # Remark: Here the last dimension should be 129 instead of 251
Out Shape
torch.Size([1, 1, 1, 251])
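
As a side check (stride 1, no padding, so each Conv1d shrinks the last dimension by kernel_size - 1 and each ConvTranspose1d grows it by the same amount), the sizes above can be reproduced by hand; this is just length arithmetic, not part of the model:

# Conv1d:          L_out = L_in - (kernel - 1)
# ConvTranspose1d: L_out = L_in + (kernel - 1)
L = 129
for k in [13, 11, 9, 7, 7]:      # encoder kernels (last dimension)
    L -= k - 1
print(L)                         # 87, the latent-space length above
for k in [7, 9, 11, 13, 129]:    # decoder kernels (last dimension)
    L += k - 1
print(L)                         # 251, not 129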

I found this thread recommending the use of the "output_size" argument of ConvTranspose1d in the forward pass. If I do so, I get an IndexError (shown in the image below).

[Image: IndexError when using the output_size argument of ConvTranspose1d in the forward pass]
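
For what it's worth, a minimal standalone sketch of how output_size is normally used (the layer and sizes here are illustrative, not from the network above): it only resolves the ambiguity that stride > 1 introduces, so with the stride-1 layers above it cannot change the output length, and the tuple kernel sizes make those layers effectively 2-D, which likely contributes to the indexing error:

import torch
import torch.nn as nn

# With stride > 1 several output lengths are valid, and output_size picks one.
deconv = nn.ConvTranspose1d(16, 8, kernel_size=4, stride=2)
x = torch.randn(1, 16, 50)
y = deconv(x, output_size=[103])   # valid lengths here are 102 or 103
print(y.shape)                     # torch.Size([1, 8, 103])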

To make "conv - transposed_conv" pair preserve input shape, conv and transposed_conv should have same parameters, so, each (spatial) shape-changing conv must be paired with equally parametrized transposed_conv (well, channels less restricted then spatial parameters(kernel, stride, padding) ), yours are not.为了使“conv - transposed_conv”对保留输入形状,conv 和 transposed_conv 应该具有相同的参数,因此,每个(空间)形状变化的 conv 必须与同样参数化的 transposed_conv 配对(好吧,通道比空间参数(内核,步幅)限制更少, padding) ), 你的不是。

Setting up the transposed convolutions like this:

    self.conv25 = nn.ConvTranspose1d(32, 24, (1,7), 1)   # mirrors conv15 (kernel (1,7))
    self.relu25 = nn.ReLU(inplace=False)
    self.batch25 = nn.BatchNorm2d(24)
    self.conv24 = nn.ConvTranspose1d(24, 20, (1,7), 1)   # mirrors conv14 (kernel (1,7))
    self.relu24 = nn.ReLU(inplace=False)
    self.batch24 = nn.BatchNorm2d(20)
    self.conv23 = nn.ConvTranspose1d(20, 16, (1,9), 1)   # mirrors conv13 (kernel (1,9))
    self.relu23 = nn.ReLU(inplace=False)
    self.batch23 = nn.BatchNorm2d(16)
    self.conv22 = nn.ConvTranspose1d(16, 12, (1,11), 1)  # mirrors conv12 (kernel (1,11))
    self.relu22 = nn.ReLU(inplace=False)
    self.batch22 = nn.BatchNorm2d(12)
    self.conv21 = nn.ConvTranspose1d(12, 1, (8,13), 1)   # mirrors conv11 (kernel (8,13))

the result shape comes out right: torch.Size([1, 1, 8, 129]).

If you need some independent latent-space subnet, make it preserve its input shape too (as a whole).
