
How to get a 5-Dimensional output after torch.nn.Conv2d layer in PyTorch?

I am working on a project based on the OpenPose research paper that I read two weeks ago. In it, the model is supposed to give a 5-dimensional output. For example, torch.nn.Conv2d gives a 4-D output of the following shape: (Batch_size, n_channels, input_width, input_height) . What I need is an output of the following shape: (Batch_size, n_channels, input_width, input_height, 2) . Here 2 is a fixed number that never changes. It is there because each entry is a 2-dimensional vector: for each channel at every pixel position there are 2 values, hence the added dimension.

What would be the best way to do this? I thought about having 2 separate branches, one for each of the vector values, but the network is very deep and I would like to be as computationally efficient as possible.

So you are effectively looking to compute feature maps which are interpreted as 2-dimensional vectors. Unless there is something fancy math-wise happening there, you are probably fine with just having twice as many output channels: (batch_size, n_channels * 2, width, height) , and then reshaping it as

output5d = output4d.reshape(
      output4d.shape[0],
      output4d.shape[1] // 2,  # integer division: reshape needs int sizes, "/" would give a float
      2,
      output4d.shape[2],
      output4d.shape[3]
)

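which gives you a shape of (batch_size, n_channels, 2, width, height) . If you really want to have 2 as the last dimension, you can use permute :

output5d = output5d.permute(0, 1, 3, 4, 2)

(Note that transpose(2, 4) would also put 2 last, but it swaps that axis with height, so the result would be (batch_size, n_channels, height, width, 2) with the two spatial axes exchanged.) If there is no strong argument in favor of this layout, though, I would suggest you keep 2 in the third position: permute only returns a non-contiguous view, so later operations that need contiguous memory have to copy the tensor, which costs a bit of performance.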
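As a rough end-to-end sketch of the approach described in this answer, the snippet below runs a single convolution with twice as many output channels and then splits the channel axis into vector components. The sizes (batch_size, in_channels, n_channels, height, width) and the kernel settings are made-up values for illustration only and are not taken from OpenPose. Note also that PyTorch itself orders the spatial axes as height then width, so the sketch uses that order.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only so that every axis is easy to tell apart.
batch_size, in_channels, n_channels, height, width = 4, 3, 19, 32, 48

# A single convolution produces both vector components at once:
# it simply has 2 * n_channels output channels.
conv = nn.Conv2d(in_channels, 2 * n_channels, kernel_size=3, padding=1)

x = torch.randn(batch_size, in_channels, height, width)
output4d = conv(x)  # shape: (batch_size, 2 * n_channels, height, width)

# Split the channel axis into (n_channels, 2).
# Channels 2k and 2k + 1 of output4d become the two components of vector k.
output5d = output4d.reshape(
    output4d.shape[0],
    output4d.shape[1] // 2,
    2,
    output4d.shape[2],
    output4d.shape[3],
)

# Optional: move the component axis to the end while keeping the
# spatial axes in their original order. This returns a view, not a copy.
vectors_last = output5d.permute(0, 1, 3, 4, 2)

print(output5d.shape)      # (batch_size, n_channels, 2, height, width)
print(vectors_last.shape)  # (batch_size, n_channels, height, width, 2)
print(vectors_last.is_contiguous())
```

With this pairing, a loss that treats the two components separately can index output5d[:, :, 0] and output5d[:, :, 1] directly, without ever moving the axis of size 2 to the end.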


 