
Getting the output shape of deconvolution layer using tf.nn.conv2d_transpose in tensorflow

According to this paper, the output shape is N + H - 1, where N is the input height or width and H is the kernel height or width. This is the obvious inverse process of convolution. This tutorial gives a formula to calculate the output shape of a convolution, which is (W−F+2P)/S+1, where W is the input size, F the filter size, P the padding size and S the stride. But in TensorFlow, there are test cases like:

  strides = [1, 2, 2, 1]

  # Input, output: [batch, height, width, depth]
  x_shape = [2, 6, 4, 3]
  y_shape = [2, 12, 8, 2]

  # Filter: [kernel_height, kernel_width, output_depth, input_depth]
  f_shape = [3, 3, 2, 3]

So we use y_shape, f_shape and x_shape with the formula (W−F+2P)/S+1 to calculate the padding size P. From (12 - 3 + 2P) / 2 + 1 = 6, we get P = 0.5, which is not an integer. How does deconvolution work in TensorFlow?
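The arithmetic in the question can be reproduced with a few lines of Python. This is only a sketch: the helper name `solve_symmetric_padding` is hypothetical, and it simply rearranges (W−F+2P)/S+1 = out for P, assuming the same padding on both sides.

```python
from fractions import Fraction


def solve_symmetric_padding(w, f, s, out):
    """Solve (w - f + 2p) / s + 1 == out for p (hypothetical helper)."""
    # Rearranged: p = ((out - 1) * s - w + f) / 2
    return Fraction((out - 1) * s - w + f, 2)


# Height from the test case: W = 12 (y_shape), F = 3, S = 2, out = 6 (x_shape)
p = solve_symmetric_padding(w=12, f=3, s=2, out=6)
print(p)  # 1/2
```

The fractional result suggests that a single symmetric P cannot describe this test case.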

For deconvolution,

output_size = strides * (input_size-1) + kernel_size - 2*padding

strides, input_size, kernel_size and padding are integers; padding is zero for 'VALID'.
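As a quick check, the formula above can be written as a small Python function. The name `deconv_output_size` is made up for illustration, and `padding=0` stands for the 'VALID' case described in this answer.

```python
def deconv_output_size(input_size, kernel_size, stride, padding=0):
    """output_size = stride * (input_size - 1) + kernel_size - 2 * padding."""
    return stride * (input_size - 1) + kernel_size - 2 * padding


# 6 x 4 input, 3 x 3 kernel, stride 2, zero padding
print(deconv_output_size(6, 3, 2))  # 13
print(deconv_output_size(4, 3, 2))  # 9
```

With zero padding the 6 x 4 input maps to 13 x 9, not 12 x 8.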

The formula for the output size from the tutorial assumes that the padding P is the same before and after the image (left and right, or top and bottom). Then the room in which the kernel can slide is: W (size of the image) - F (size of the kernel) + P (additional padding before) + P (additional padding after). Dividing that by the stride S and adding 1 for the starting position gives the number of places in which you put the kernel.

But TensorFlow also handles the situation where you need to pad more pixels on one side than on the other so that the kernels fit correctly. You can read more about the strategies for choosing the padding ("SAME" and "VALID") in the docs. The test you're talking about uses the "VALID" method.

This discussion is really helpful. Just to add some additional information: padding='SAME' can also let the bottom and right sides get one additional padded pixel. According to the TensorFlow documentation, the test case below

strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

is using padding='SAME'. We can interpret padding='SAME' as:

(W−F+pad_along_height)/S+1 = out_height,
(W−F+pad_along_width)/S+1 = out_width.

So (12 - 3 + pad_along_height) / 2 + 1 = 6, and we get pad_along_height = 1. Then pad_top = pad_along_height / 2 = 1 / 2 = 0 (integer division) and pad_bottom = pad_along_height - pad_top = 1.
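The same numbers fall out of a short Python sketch of the 'SAME' rule. It assumes the output size is ceil(in / stride) and the total padding is max((out - 1) * stride + filter - in, 0), with any odd pixel going to the bottom or right. The function name `same_padding` is hypothetical.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) for one spatial dimension."""
    out_size = math.ceil(in_size / stride)
    pad_along = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_along // 2          # integer division, as in the text
    pad_after = pad_along - pad_before   # the extra pixel goes after
    return out_size, pad_before, pad_after


print(same_padding(12, 3, 2))  # (6, 0, 1) -> height: pad_top=0, pad_bottom=1
print(same_padding(8, 3, 2))   # (4, 0, 1) -> width: pad_left=0, pad_right=1
```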

As for padding='VALID', as the name suggests, padding is used only when it is appropriate. We first assume that the number of padded pixels is 0; if that does not work, we add zero padding for any value outside the original input image region. For example, take the test case below:

strides = [1, 2, 2, 1]

# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 13, 9, 2]

# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]

The output shape of conv2d is

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
           = ceil(float(13 - 3 + 1) / float(2)) = ceil(11/2) = 6
           = (W−F)/S + 1.

Because (W−F)/S+1 = (13-3)/2+1 = 6 is an integer, we don't need to add zero pixels around the border of the image, and pad_top = 1/2 and pad_left = 1/2 from the padding='VALID' section of the TensorFlow documentation are both 0.
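A minimal Python check of the 'VALID' output-size rule for this test case, assuming stride 2 in both spatial dimensions as given by strides = [1, 2, 2, 1]. The helper name `valid_conv_output_size` is made up for this sketch.

```python
import math


def valid_conv_output_size(in_size, filter_size, stride):
    """ceil((in_size - filter_size + 1) / stride), with no padding added."""
    return math.ceil((in_size - filter_size + 1) / stride)


# y_shape is 13 x 9 and the filter is 3 x 3, so we expect x_shape 6 x 4
print(valid_conv_output_size(13, 3, 2))  # 6
print(valid_conv_output_size(9, 3, 2))   # 4
```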
