
Masked kernel in convolution in PyTorch

Let's say the input is:

x = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)

and a 2D convolution layer is:

layer = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)
layer.weight.data[:] = 1.
layer.bias.data[:] = 0.

Normally, passing x to layer gives:

>>> layer(x)
tensor([[[[10., 18.],
          [42., 50.]]]], grad_fn=<MkldnnConvolutionBackward>)

Given 4 mask filters, how can the kernel be masked at each step of the convolution? For example, the picture in the original post shows 4 filters (white: True, black: False), one per kernel position, for the given x with stride=2 and kernel_size=2.

The output should be:

tensor([[[[5., 15.],
          [30., 40.]]]], grad_fn=<MkldnnConvolutionBackward>)

PS: All masks come from missing pixels in the 2D input array, so the 4 masks above are actually a single matrix with the same shape as the input.
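To make the relation between the full-size mask and the 4 per-window masks concrete, here is a minimal sketch. The mask values are an assumption, chosen to be consistent with the expected output above; `F.unfold` is used only to view the mask window by window.

```python
import torch
import torch.nn.functional as F

# Assumed full-size mask (1 = valid pixel, 0 = missing pixel),
# same spatial shape as the 4x4 input.
mask = torch.tensor([[1., 1., 1., 0.],
                     [1., 0., 1., 1.],
                     [1., 1., 0., 1.],
                     [0., 1., 1., 1.]]).reshape(1, 1, 4, 4)

# F.unfold returns shape (N, C*k*k, L): one column per kernel position.
windows = F.unfold(mask, kernel_size=2, stride=2)            # (1, 4, 4)

# Rearrange into L separate 2x2 masks, in row-major window order
# (top-left, top-right, bottom-left, bottom-right).
per_window_masks = windows.transpose(1, 2).reshape(4, 2, 2)

for i, m in enumerate(per_window_masks):
    print(f"window {i}:\n{m}")
```

Each 2x2 slice is the mask that applies to the kernel at that step, so the 4 masks and the single 4x4 matrix carry the same information.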

I think you are looking for partial convolution from Nvidia research.

A more detailed description is given in their ECCV 2018 paper, Image Inpainting for Irregular Holes Using Partial Convolutions.

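I found the solution.

Element-wise multiplication of the input and the mask before feeding the result to Conv2d is enough. Because the mask has the same shape as the input and each pixel is covered by exactly one kernel position here (stride equals kernel_size), zeroing a pixel has the same effect as zeroing the kernel weight that would multiply it. Masking the input is much easier than masking the kernel itself: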

mask = torch.tensor([[[1, 1, 1, 0]], [[1, 0, 1, 1]], [[1, 1, 0, 1]], [[0, 1, 1, 1]]], dtype=torch.float, requires_grad=False).reshape(1, 1, 4, 4)

>>> layer(torch.mul(x, mask))
tensor([[[[5., 15.],
          [30., 40.]]]], grad_fn=<MkldnnConvolutionBackward>)
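As a sanity check, the following self-contained sketch compares the two routes: masking the input before the convolution, and explicitly masking the kernel at every window. The second route is written with `F.unfold` and is only an illustration for this single-output-channel case, not a general implementation.

```python
import torch
import torch.nn.functional as F

x = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 0, 1, 1],
                     [1, 1, 0, 1],
                     [0, 1, 1, 1]], dtype=torch.float).reshape(1, 1, 4, 4)

layer = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=2, stride=2)
with torch.no_grad():
    layer.weight.fill_(1.)
    layer.bias.fill_(0.)

# Route 1: mask the input, then convolve as usual.
out_masked_input = layer(x * mask)

# Route 2: mask the kernel separately for every window.
x_cols = F.unfold(x, kernel_size=2, stride=2)      # (1, 4, 4): one column per window
m_cols = F.unfold(mask, kernel_size=2, stride=2)   # (1, 4, 4): per-window masks
w = layer.weight.reshape(1, -1, 1)                 # (1, 4, 1): flattened kernel
masked_kernels = w * m_cols                        # a differently masked kernel per window
out_masked_kernel = (masked_kernels * x_cols).sum(dim=1).reshape(1, 1, 2, 2) + layer.bias

print(out_masked_input)
print(torch.allclose(out_masked_input, out_masked_kernel))
```

Both routes compute the sum of weight * mask * pixel over each window, which is why the simpler input-masking version is sufficient.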

PS: Thanks to @Shai, I got the idea from the partial convolution described in the paper mentioned above. However, partial convolution does some extra manipulation of the output: it defines a mask ratio and, I guess, weights the final output based on it.
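For reference, the re-weighting can be sketched roughly as follows. This is a simplified reading of the paper's formula (scale each output by window size divided by the number of valid pixels in that window, and output zero where a window has no valid pixel), not Nvidia's actual implementation. The function name `partial_conv2d` and the single-channel mask layout are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def partial_conv2d(x, mask, weight, bias=None, stride=1):
    """Simplified partial convolution (hypothetical helper, sketch only).

    x:      (N, C, H, W) input
    mask:   (N, 1, H, W) float mask, 1 = valid pixel, 0 = missing pixel
    weight: (out_channels, C, kH, kW) convolution kernel
    """
    kh, kw = weight.shape[-2:]
    ones = torch.ones(1, 1, kh, kw, dtype=mask.dtype, device=mask.device)

    # Number of valid pixels under each kernel position.
    valid = F.conv2d(mask, ones, stride=stride)

    # Mask ratio: window size / valid count, and 0 where nothing is valid.
    ratio = torch.where(valid > 0,
                        (kh * kw) / valid.clamp(min=1),
                        torch.zeros_like(valid))

    # Convolve the masked input without bias, then re-weight.
    out = F.conv2d(x * mask, weight, bias=None, stride=stride) * ratio
    if bias is not None:
        # Bias is only added where the window saw at least one valid pixel.
        out = out + bias.view(1, -1, 1, 1) * (valid > 0)

    # Updated mask: a location is valid if its window had any valid pixel.
    new_mask = (valid > 0).to(mask.dtype)
    return out, new_mask

x = torch.arange(16, dtype=torch.float).reshape(1, 1, 4, 4)
mask = torch.tensor([[1, 1, 1, 0],
                     [1, 0, 1, 1],
                     [1, 1, 0, 1],
                     [0, 1, 1, 1]], dtype=torch.float).reshape(1, 1, 4, 4)
weight = torch.ones(1, 1, 2, 2)

out, new_mask = partial_conv2d(x, mask, weight, stride=2)
print(out)
print(new_mask)
```

In this example every window has 3 valid pixels out of 4, so each value of the plain masked output (5, 15, 30, 40) is scaled by 4/3. If that scaling is not wanted, the plain element-wise masking above is the simpler choice.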
