Why does conv2d in TensorFlow give an output with the same shape as the input?
According to this Deep Learning course (http://cs231n.github.io/convolutional-networks/#conv), if an input x with shape [W, W] (where W = width = height) goes through a convolutional layer with filter shape [F, F] and stride S, the layer will return an output with shape [(W-F)/S + 1, (W-F)/S + 1].
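As a quick sanity check, the course formula can be sketched in a few lines of plain Python (this is just the arithmetic from the notes, not TensorFlow's behavior):

```python
# Sketch of the cs231n output-size formula for a conv layer with no padding,
# assuming (W - F) is divisible by S.
def conv_output_size(W, F, S):
    """Output width/height for input W x W, filter F x F, stride S."""
    return (W - F) // S + 1

# The MNIST case from this question: 28x28 input, 5x5 filter, stride 1.
print(conv_output_size(28, 5, 1))  # 24
```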
However, when I try to follow the TensorFlow tutorial (https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html), the function tf.nn.conv2d(inputs, filter, stride) seems to behave differently.
No matter how I change my filter size, conv2d always returns a value with the same shape as the input.
In my case, I am using the MNIST dataset, where every image has size [28, 28] (ignoring channel_num = 1),
but after defining the first conv1 layer, I used conv1.get_shape() to inspect its output, and it gives me [28, 28, num_of_filters]. Why is this? I thought the return value should follow the formula above.
Appendix: Code snippet
# reshape x from 2d to 4d
x_image = tf.reshape(x, [-1, 28, 28, 1])  # [num_samples, width, height, channel_num]

## define the shape of weights and bias
w_shape = [5, 5, 1, 32]  # patch_w, patch_h, in_channel, output_num (out_channel)
b_shape = [32]           # bias only needs to match output_num

## init weights of the conv1 layer
W_conv1 = weight_variable(w_shape)
b_conv1 = bias_variable(b_shape)

## first layer: x_image -> conv1/relu -> pool1
# Our convolutions use a stride of one
# and are zero padded
# so that the output is the same size as the input
h_conv1 = tf.nn.relu(
    conv2d(x_image, W_conv1) + b_conv1
)
print 'conv1.shape=', h_conv1.get_shape()
## conv1.shape= (?, 28, 28, 32)
## I thought conv1.shape should be (?, (28-5)/1+1, (28-5)/1+1, 32) = (?, 24, 24, 32)
h_pool1 = max_pool_2x2(h_conv1)  # outputs 32 feature maps
print 'pool1.shape=', h_pool1.get_shape()  ## pool1.shape= (?, 14, 14, 32)
conv2d has a parameter called padding, see here.
If you set padding to "VALID", the output will satisfy your formula. It defaults to "SAME", which zero-pads the input (the same as adding a border of zeroes around the image) so that the output keeps the same shape as the input.
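You can see the two paddings side by side with a quick check (a sketch using the current tf.nn.conv2d signature, which takes the padding string directly; the tutorial's conv2d helper hard-codes padding='SAME'):

```python
import tensorflow as tf

# Dummy input and filter matching the question: 28x28x1 image, 5x5x1x32 filter.
x = tf.random.normal([1, 28, 28, 1])
w = tf.random.normal([5, 5, 1, 32])

same = tf.nn.conv2d(x, w, strides=1, padding='SAME')
valid = tf.nn.conv2d(x, w, strides=1, padding='VALID')

print(same.shape)   # (1, 28, 28, 32) -- padded, shape preserved
print(valid.shape)  # (1, 24, 24, 32) -- matches (28-5)/1 + 1 = 24
```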
It depends on the padding parameter. 'SAME' keeps the output at WxW (assuming stride = 1); 'VALID' shrinks the output to (W-F+1)x(W-F+1).
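For arbitrary strides, TensorFlow's documented rules for both paddings reduce to two ceiling formulas, which can be sketched without TensorFlow at all (this is just the shape arithmetic, under the assumption of stride dividing as documented):

```python
import math

def out_size(W, F, S, padding):
    """Output width/height per TensorFlow's documented padding rules."""
    if padding == 'VALID':
        return math.ceil((W - F + 1) / S)   # no padding: only full windows fit
    if padding == 'SAME':
        return math.ceil(W / S)             # zero-padded: depends only on stride
    raise ValueError(padding)

print(out_size(28, 5, 1, 'SAME'))   # 28 -- the conv1 shape in the question
print(out_size(28, 5, 1, 'VALID'))  # 24 -- what the cs231n formula predicts
print(out_size(28, 2, 2, 'SAME'))   # 14 -- the 2x2/stride-2 max-pool after it
```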