Getting the output shape of a deconvolution layer using tf.nn.conv2d_transpose in TensorFlow
According to this paper, the output shape is N + H - 1, where N is the input height or width and H is the kernel height or width. This is the obvious inverse of convolution. This tutorial gives a formula for the output shape of a convolution, (W−F+2P)/S+1, where W is the input size, F the filter size, P the padding size, and S the stride. But in TensorFlow there are test cases like:
strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]
# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]
So we use y_shape, f_shape and x_shape with the formula (W−F+2P)/S+1 to calculate the padding size P. From (12 - 3 + 2P) / 2 + 1 = 6 we get P = 0.5, which is not an integer. How does deconvolution work in TensorFlow?
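The arithmetic in the question can be checked with a short sketch. The helper name symmetric_padding is hypothetical; it simply solves (W−F+2P)/S+1 = out for P under the tutorial's assumption that the same padding is applied on both sides.

```python
from fractions import Fraction


def symmetric_padding(in_size, filter_size, stride, out_size):
    """Solve (W - F + 2P) / S + 1 = out_size for P, as an exact fraction.

    Hypothetical helper: assumes equal padding P on both sides.
    """
    return Fraction((out_size - 1) * stride - in_size + filter_size, 2)


# Height from the test case: W = 12, F = 3, S = 2, output = 6.
p = symmetric_padding(12, 3, 2, 6)
print(p)  # 1/2
```

A non-integer result like this suggests that the symmetric-padding assumption does not hold for this test case.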
For deconvolution,
output_size = strides * (input_size-1) + kernel_size - 2*padding
where strides, input_size, kernel_size and padding are integers, and padding is zero for 'VALID'.
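As a quick check, the formula can be written as a small function and applied to the shapes from the question. This is a minimal sketch; the function name deconv_output_size is an assumption and not part of TensorFlow.

```python
def deconv_output_size(input_size, kernel_size, stride, padding=0):
    """Transposed-convolution output size along one spatial dimension.

    Hypothetical helper implementing
    output_size = strides * (input_size - 1) + kernel_size - 2 * padding.
    """
    return stride * (input_size - 1) + kernel_size - 2 * padding


# Shapes from the question: 6x4 input, 3x3 kernel, stride 2.
# padding=0 corresponds to 'VALID'.
height = deconv_output_size(6, 3, 2)
width = deconv_output_size(4, 3, 2)
print(height, width)  # 13 9
```

With zero padding this gives 13 x 9 for the spatial dimensions rather than 12 x 8.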
The formula for the output size from the tutorial assumes that the padding P is the same before and after the image (left and right, or top and bottom). The range over which the kernel can slide is then W (size of the image) - F (size of the kernel) + P (additional padding before) + P (additional padding after); dividing by the stride S and adding 1 gives the number of places in which you can put the kernel.
But TensorFlow also handles the situation where you need to pad more pixels on one side than on the other so that the kernels fit correctly. You can read more about the strategies for choosing the padding ("SAME" and "VALID") in the docs. The test you're talking about uses the "VALID" method.
This discussion is really helpful; I'll just add some additional information. padding='SAME' can also give the bottom and right sides one additional padded pixel. According to the TensorFlow documentation, the test case below
strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 12, 8, 2]
# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]
uses padding='SAME'. We can interpret padding='SAME' as:
(W−F+pad_along_height)/S+1 = out_height,
(W−F+pad_along_width)/S+1 = out_width.
So (12 - 3 + pad_along_height) / 2 + 1 = 6, and we get pad_along_height=1. Then pad_top=pad_along_height/2 = 1/2 = 0 (integer division) and pad_bottom=pad_along_height-pad_top=1.
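The same computation can be sketched in plain Python. The helper name same_padding is hypothetical; it assumes the usual 'SAME' rule that the output size is ceil(input / stride) and that any odd pixel of padding goes to the bottom or right.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) along one dimension.

    Hypothetical helper, assuming out_size = ceil(in_size / stride)
    and that the extra pixel of odd padding goes after the input.
    """
    out_size = math.ceil(in_size / stride)
    pad_along = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = pad_along // 2  # integer division, as in the text
    pad_after = pad_along - pad_before
    return out_size, pad_before, pad_after


# Height and width of y_shape = [2, 12, 8, 2] with a 3x3 filter, stride 2.
print(same_padding(12, 3, 2))  # (6, 0, 1)
print(same_padding(8, 3, 2))   # (4, 0, 1)
```

Both dimensions map back to the 6 x 4 input, with a single padded pixel at the bottom and right.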
As for padding='VALID', as the name suggests, only valid kernel positions are used: the number of padded pixels is 0, and any input rows or columns that the kernel cannot fully cover are dropped rather than padded. For example, in the test case below,
strides = [1, 2, 2, 1]
# Input, output: [batch, height, width, depth]
x_shape = [2, 6, 4, 3]
y_shape = [2, 13, 9, 2]
# Filter: [kernel_height, kernel_width, output_depth, input_depth]
f_shape = [3, 3, 2, 3]
The output shape of conv2d is
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
= ceil(float(13 - 3 + 1) / float(2)) = ceil(11/2) = 6
= (W−F)/S + 1.
Because (W−F)/S+1 = (13-3)/2+1 = 6 is an integer, we don't need to add zero pixels around the border of the image, and pad_top and pad_left from the padding='VALID' section of the TensorFlow documentation are both 0.
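A minimal sketch of the 'VALID' output-size formula, using a hypothetical helper valid_conv_output_size, confirms that the 13 x 9 output of the transposed convolution maps back to the 6 x 4 input:

```python
import math


def valid_conv_output_size(in_size, filter_size, stride):
    """conv2d output size along one dimension with padding='VALID'.

    Hypothetical helper implementing
    ceil((in_size - filter_size + 1) / stride).
    """
    return math.ceil((in_size - filter_size + 1) / stride)


# y_shape = [2, 13, 9, 2] with a 3x3 filter and stride 2.
print(valid_conv_output_size(13, 3, 2))  # 6
print(valid_conv_output_size(9, 3, 2))   # 4
```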