How does Pytorch's "Fold" and "Unfold" work?

I've gone through the official doc. I'm having a hard time understanding what this function is used for and how it works. Can someone explain this in layman's terms?

unfold and fold are used to facilitate "sliding window" operations (like convolutions).
Suppose you want to apply a function foo to every 5x5 window in a feature map/image:

from torch.nn import functional as f
windows = f.unfold(x, kernel_size=5)

Now windows has shape (batch, 5*5*x.size(1), num_windows), and you can apply foo to windows:

processed = foo(windows)

Now you need to "fold" processed back to the original size of x:

out = f.fold(processed, x.shape[-2:], kernel_size=5)

You need to take care of padding and kernel_size, which may affect your ability to "fold" processed back to the size of x.
Moreover, fold sums over overlapping elements, so you might want to divide the output of fold by the number of windows covering each position. With stride 1 that number equals the patch size only for positions far enough from the border; near the border fewer windows overlap.
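
The round trip above can be sketched end to end. This is a minimal sketch, not a definitive recipe: foo here is a hypothetical identity function standing in for whatever per-window processing you need, the input shape (2, 3, 6, 6) is an arbitrary assumption, and the overlap counts are obtained by folding an unfolded tensor of ones rather than by dividing by a constant patch size.

```python
import torch
from torch.nn import functional as F

def foo(windows):
    # Hypothetical placeholder: leaves every window unchanged.
    return windows

# Assumed example input: batch of 2, 3 channels, 6x6 spatial size.
x = torch.arange(2 * 3 * 6 * 6, dtype=torch.float32).reshape(2, 3, 6, 6)

windows = F.unfold(x, kernel_size=5)            # shape (2, 5*5*3, 4)
processed = foo(windows)

# fold sums the contributions of all windows that cover a position.
summed = F.fold(processed, x.shape[-2:], kernel_size=5)

# Per-position overlap counts: fold the unfolded windows of an all-ones tensor.
ones = torch.ones_like(x)
counts = F.fold(F.unfold(ones, kernel_size=5), x.shape[-2:], kernel_size=5)

out = summed / counts
print(windows.shape)           # torch.Size([2, 75, 4])
print(torch.allclose(out, x))  # True, because foo is the identity
```

Because foo is the identity, dividing by counts recovers x exactly; with a real foo the same division averages the overlapping window outputs.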

The tensor method unfold imagines a tensor as a longer tensor with repeated columns/rows of values "folded" on top of each other, which is then "unfolded":

  • size determines how large the folds are
  • step determines how often it is folded

E.g. for a 2x5 tensor, unfolding it with step=1 and patch size=2 across dim=1:

>>> x = torch.tensor([[1,2,3,4,5],
...                   [6,7,8,9,10]])
>>> x.unfold(1,2,1)
tensor([[[ 1,  2], [ 2,  3], [ 3,  4], [ 4,  5]],
        [[ 6,  7], [ 7,  8], [ 8,  9], [ 9, 10]]])


fold is roughly the opposite of this operation, but "overlapping" values are summed in the output.
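
To see the summing of overlaps concretely, here is a small sketch on the same 2x5 values. It uses torch.nn.functional.unfold and fold, which work on (batch, channels, height, width) input, so the tensor is reshaped to (1, 1, 2, 5) and converted to float; the kernel_size=(1, 2) is my assumption to mimic windows of size 2 along the last dimension.

```python
import torch
from torch.nn import functional as F

x = torch.tensor([[1., 2., 3., 4., 5.],
                  [6., 7., 8., 9., 10.]])
x4d = x.reshape(1, 1, 2, 5)  # add batch and channel dimensions

# Windows of height 1 and width 2, stride 1: 4 windows per row, 8 in total.
cols = F.unfold(x4d, kernel_size=(1, 2))
print(cols.shape)  # torch.Size([1, 2, 8])

# Folding back sums every value once per window that contained it.
back = F.fold(cols, output_size=(2, 5), kernel_size=(1, 2))
print(back[0, 0])
# tensor([[ 1.,  4.,  6.,  8.,  5.],
#         [ 6., 14., 16., 18., 10.]])
```

The first and last columns appear in only one window and come back unchanged, while the three middle columns appear in two windows each and come back doubled.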

One-dimensional unfolding is easy:

import torch
x = torch.arange(1, 9).float()
print(x)
# dimension, size, step
print(x.unfold(0, 2, 1))
print(x.unfold(0, 3, 2))

Out:

tensor([1., 2., 3., 4., 5., 6., 7., 8.])
tensor([[1., 2.],
        [2., 3.],
        [3., 4.],
        [4., 5.],
        [5., 6.],
        [6., 7.],
        [7., 8.]])
tensor([[1., 2., 3.],
        [3., 4., 5.],
        [5., 6., 7.]])
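
The number of rows in each result follows from the length, size and step. As a small sketch (num_windows is a hypothetical helper name, not part of PyTorch), the count is (length - size) // step + 1, which also explains why the value 8 is missing from the second result: no full window of size 3 starts at a step that reaches it.

```python
import torch

def num_windows(length, size, step):
    # Hypothetical helper: how many full windows fit along one dimension.
    return (length - size) // step + 1

x = torch.arange(1, 9).float()  # 8 elements

for size, step in [(2, 1), (3, 2)]:
    out = x.unfold(0, size, step)
    # The first dimension of the result is the window count.
    print(size, step, tuple(out.shape), num_windows(x.numel(), size, step))
# 2 1 (7, 2) 7
# 3 2 (3, 3) 3
```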

Two-dimensional unfolding (also called patching):

import torch
patch=(3,3)
x=torch.arange(16).float()
print(x, x.shape)
x2d = x.reshape(1,1,4,4)
print(x2d, x2d.shape)
h,w = patch
c=x2d.size(1)
print(c) # channels
# unfold(dimension, size, step)
r = x2d.unfold(2,h,1).unfold(3,w,1).transpose(1,3).reshape(-1, c, h, w)
print(r.shape)
print(r) # result

Out:

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
        14., 15.]) torch.Size([16])
tensor([[[[ 0.,  1.,  2.,  3.],
          [ 4.,  5.,  6.,  7.],
          [ 8.,  9., 10., 11.],
          [12., 13., 14., 15.]]]]) torch.Size([1, 1, 4, 4])
1
torch.Size([4, 1, 3, 3])

tensor([[[[ 0.,  1.,  2.],
          [ 4.,  5.,  6.],
          [ 8.,  9., 10.]]],


        [[[ 4.,  5.,  6.],
          [ 8.,  9., 10.],
          [12., 13., 14.]]],


        [[[ 1.,  2.,  3.],
          [ 5.,  6.,  7.],
          [ 9., 10., 11.]]],


        [[[ 5.,  6.,  7.],
          [ 9., 10., 11.],
          [13., 14., 15.]]]])
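
Note that the transpose(1,3) above lists the patches column by column (the second patch is the one directly below the first). For comparison, here is a sketch relating the chained tensor unfold to torch.nn.functional.unfold, which flattens each patch into a column and orders the patches row by row. The permute order is my own choice to line the two layouts up.

```python
import torch
from torch.nn import functional as F

x2d = torch.arange(16).float().reshape(1, 1, 4, 4)
n, c = x2d.shape[:2]
h, w = 3, 3

# Chained Tensor.unfold: shape (n, c, rows_out, cols_out, h, w) = (1, 1, 2, 2, 3, 3)
patches = x2d.unfold(2, h, 1).unfold(3, w, 1)

# Move the patch contents next to the channel dim, then flatten:
# (n, c, h, w, rows_out, cols_out) -> (n, c*h*w, rows_out*cols_out)
manual = patches.permute(0, 1, 4, 5, 2, 3).reshape(n, c * h * w, -1)

builtin = F.unfold(x2d, kernel_size=(h, w))  # shape (1, 9, 4)

print(builtin.shape)                  # torch.Size([1, 9, 4])
print(torch.equal(manual, builtin))   # True
```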


Since there are no answers with 4-D tensors and nn.functional.unfold() only accepts 4-D tensors, I would like to explain this.

Assume the input tensor is of shape (batch_size, channels, height, width). I have taken an example where batch_size = 1, channels = 2, height = 3, width = 3.


kernel_size = 2, which is nothing but a 2x2 kernel.
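
For the shapes just described, a minimal sketch looks like this. The input values (0 to 17 via arange) are my own assumption so that the output is easy to trace; the original example's values were shown only in an image.

```python
import torch
from torch.nn import functional as F

# batch_size = 1, channels = 2, height = 3, width = 3 (values assumed for illustration)
x = torch.arange(18, dtype=torch.float32).reshape(1, 2, 3, 3)

cols = F.unfold(x, kernel_size=2)

# channels * 2 * 2 = 8 values per window, and (3-2+1) * (3-2+1) = 4 windows
print(cols.shape)      # torch.Size([1, 8, 4])

# First column: the top-left 2x2 window of channel 0, then of channel 1.
print(cols[0, :, 0])   # tensor([ 0.,  1.,  3.,  4.,  9., 10., 12., 13.])
```

Each column holds one window position, with the values of all channels for that window stacked on top of each other.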
