Getting the output shape of a deconvolution layer using tf.nn.conv2d_transpose in TensorFlow

Question

According to this paper, the output shape is N + H - 1, where N is the input height or width and H is the kernel height or width. This is the obvious inverse of convolution. This tutorial gives a formula for the output shape of a convolution, (W−F+2P)/S+1, where W is the input size, F the filter size, P the padding size, and S the stride. But in TensorFlow there are test cases like:

  strides = [1, 2, 2, 1]

  # Input, output: [batch, height, width, depth]
  x_shape = [2, 6, 4, 3]
  y_shape = [2, 12, 8, 2]

  # Filter: [kernel_height, kernel_width, output_depth, input_depth]
  f_shape = [3, 3, 2, 3]

So we plug y_shape, f_shape and x_shape into the formula (W−F+2P)/S+1 to calculate the padding size P. From (12 - 3 + 2P) / 2 + 1 = 6 we get P = 0.5, which is not an integer. How does deconvolution work in TensorFlow?
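The arithmetic in the question can be reproduced with a small sketch. The helper names `conv_output_size` and `symmetric_padding_for` are made up for this illustration and are not TensorFlow functions; exact fractions are used so that the non-integer padding is visible.

```python
from fractions import Fraction


def conv_output_size(w, f, p, s):
    """Convolution output size (W - F + 2P) / S + 1, with the same padding P on both sides."""
    return Fraction(w - f + 2 * p) / s + 1


def symmetric_padding_for(w, f, s, out):
    """Solve (W - F + 2P) / S + 1 = out for P."""
    return Fraction((out - 1) * s - w + f, 2)


# Height from the test case: W = 12 (y_shape), F = 3, S = 2, expected output 6 (x_shape).
p = symmetric_padding_for(w=12, f=3, s=2, out=6)
print(p)                              # 1/2
print(conv_output_size(12, 3, p, 2))  # 6
```

A padding of 1/2 per side corresponds to one padded pixel in total, which cannot be split evenly between the two sides.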

Answer 1

For deconvolution,

output_size = strides * (input_size-1) + kernel_size - 2*padding

where strides, input_size, kernel_size and padding are integers, and padding is zero for 'valid'.
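As a rough check of this formula, the sketch below evaluates it for the shapes in the question and compares it with a naive 1-D transposed convolution. Both `deconv_output_size` and `transposed_conv1d` are hypothetical helpers written for this example; the second is a plain-Python illustration, not how TensorFlow implements the op.

```python
def deconv_output_size(input_size, kernel_size, stride, padding=0):
    """output_size = stride * (input_size - 1) + kernel_size - 2 * padding."""
    return stride * (input_size - 1) + kernel_size - 2 * padding


def transposed_conv1d(x, kernel, stride):
    """Naive unpadded 1-D transposed convolution.

    Each input value adds a scaled copy of the kernel to the output,
    starting at offset i * stride.
    """
    out = {}
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            pos = i * stride + j
            out[pos] = out.get(pos, 0.0) + value * weight
    return [out.get(k, 0.0) for k in range(max(out) + 1)]


print(deconv_output_size(6, 3, 2))  # 13
print(deconv_output_size(4, 3, 2))  # 9
print(len(transposed_conv1d([1.0] * 6, [1.0] * 3, stride=2)))  # 13
```

With zero padding the formula gives 13 x 9 for a 6 x 4 input, which matches the y_shape of the [2, 13, 9, 2] test case rather than the 12 x 8 one.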

Answer 2

The formula for the output size from the tutorial assumes that the padding P is the same before and after the image (left and right, or top and bottom). Under that assumption, the span over which the kernel can be placed is: W (size of the image) - F (size of the kernel) + P (additional padding before) + P (additional padding after).

But TensorFlow also handles the situation where you need to pad more pixels on one side than on the other so that the kernels fit correctly. You can read more about the padding strategies ("SAME" and "VALID") in the docs. The test you're talking about uses the "SAME" strategy: with "VALID" and no padding, a 6 x 4 input would produce a 13 x 9 output rather than 12 x 8.
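To make the asymmetric case concrete, here is a minimal sketch that enumerates the kernel start positions when the padding before and after can differ. `kernel_positions` is a hypothetical helper, and the choice of putting the extra pixel after the image is an assumption for the example.

```python
def kernel_positions(w, f, s, pad_before=0, pad_after=0):
    """Start offsets (in padded coordinates) where a kernel of size f fits with stride s."""
    padded = w + pad_before + pad_after
    return list(range(0, padded - f + 1, s))


# Height 12, kernel 3, stride 2.
print(len(kernel_positions(12, 3, 2)))               # 5
print(len(kernel_positions(12, 3, 2, pad_after=1)))  # 6
```

One extra padded pixel on a single side is enough to reach the 6 positions expected by the test case, which is why no integer symmetric P exists.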

Answer 3

This discussion is really helpful; I'll just add some information. padding='SAME' can also give the bottom and right sides one additional padded pixel. According to the TensorFlow documentation, the test case below

strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

uses padding='SAME'. We can interpret padding='SAME' as:

(W−F+pad_along_height)/S+1 = out_height,
(W−F+pad_along_width)/S+1 = out_width.

So (12 - 3 + pad_along_height) / 2 + 1 = 6, and we get pad_along_height = 1. Then pad_top = pad_along_height / 2 = 1/2 = 0 (integer division) and pad_bottom = pad_along_height - pad_top = 1.
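The same numbers fall out of the 'SAME' padding rule as described in the TensorFlow documentation. The function below is a sketch of that rule for one spatial dimension; the name `same_padding` is made up here, and `in_size` refers to the input of the forward convolution (the y_shape side of the test case).

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) for padding='SAME' in one dimension."""
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_total // 2          # top or left; gets the smaller half
    pad_after = pad_total - pad_before   # bottom or right; gets the extra pixel
    return out_size, pad_before, pad_after


print(same_padding(12, 3, 2))  # (6, 0, 1) -> out_height, pad_top, pad_bottom
print(same_padding(8, 3, 2))   # (4, 0, 1) -> out_width, pad_left, pad_right
```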

As for padding='VALID', as the name suggests, the filter is applied only at valid positions, that is, where it fits entirely inside the original input, so no zero padding is added outside the image region. For example, in the test case below,

strides = [1, 2, 2, 1]

# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 13, 9, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

The output shape of conv2d is

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
           = ceil(float(13 - 3 + 1) / float(2)) = ceil(11/2) = 6
           = (W−F)/S + 1.

Because (W−F)/S+1 = (13-3)/2+1 = 6 is an integer, we don't need to add zero pixels around the border of the image, so pad_top and pad_left, as defined in the padding section of the TensorFlow documentation, are both 0.
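One way to tie the two test cases together is to check that the forward convolution of y_shape gives back x_shape under the stated padding mode. The sketch below does this with the output-size rules quoted above; `forward_conv_size` and `is_consistent_transpose_shape` are hypothetical names, and only the height and width dimensions are checked.

```python
import math


def forward_conv_size(in_size, filter_size, stride, padding):
    """Output size of a forward convolution in one dimension."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def is_consistent_transpose_shape(x_hw, y_hw, f_hw, strides_hw, padding):
    """True if convolving a y_hw-sized image yields x_hw in every spatial dimension."""
    return all(
        forward_conv_size(y, f, s, padding) == x
        for x, y, f, s in zip(x_hw, y_hw, f_hw, strides_hw)
    )


print(is_consistent_transpose_shape((6, 4), (12, 8), (3, 3), (2, 2), "SAME"))   # True
print(is_consistent_transpose_shape((6, 4), (13, 9), (3, 3), (2, 2), "VALID"))  # True
print(is_consistent_transpose_shape((6, 4), (12, 8), (3, 3), (2, 2), "VALID"))  # False
```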

Problem description

3 answers

Answer 1
7 2018-03-18 12:18:37

Answer 2
3 2016-03-15 18:16:18

Answer 3
1 2017-01-01 14:54:48
