I'm reading material from the TensorFlow website:
https://www.tensorflow.org/tutorials/layers
Suppose we have 10 greyscale 28x28 pixel images, and the first convolutional layer (32 5x5 filters) followed by pooling turns them into 10*32*14*14 data.
So, if we apply a second convolutional layer (let's say 64 5x5 filters, as in the link), do we apply these filters to each channel of each image and get 10*32*64*14*14 data?
Yes and no. You do apply the filters to each channel of each image, but you don't get 10*32*64*14*14 output dimensions. The dimensionality of the output is going to be 10*64*14*14, because the layer specifies 64 output channels per image: the contributions from the 32 input channels are summed into each output channel rather than kept separate. In turn, the weights used for this convolution will have size 32*64*5*5 (one 5-by-5 kernel per input channel for each of the 64 output channels).
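To check the shapes described above, here is a minimal NumPy sketch rather than the tutorial's TensorFlow code. It assumes a channels-first layout, random data, stride 1, and zero padding of 2 pixels so that the 14x14 size is preserved; the variable names are only illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Assumed input: 10 images, 32 channels each, 14x14 pixels (channels-first).
x = rng.standard_normal((10, 32, 14, 14))
# Weights in the 32*64*5*5 layout from the answer: (in, out, height, width).
w = rng.standard_normal((32, 64, 5, 5))

# Zero-pad height and width by 2 so a 5x5 filter keeps the 14x14 size.
x_pad = np.pad(x, ((0, 0), (0, 0), (2, 2), (2, 2)))

# Every 5x5 patch of every channel: shape (10, 32, 14, 14, 5, 5).
windows = sliding_window_view(x_pad, (5, 5), axis=(2, 3))

# Multiply patches by kernels and sum over the input channels (c) and the
# kernel positions (i, j). Like most deep learning libraries, this is a
# cross-correlation (the kernel is not flipped).
out = np.einsum("nchwij,coij->nohw", windows, w)

print(out.shape)  # (10, 64, 14, 14)
print(w.size)     # 51200 weights = 32 * 64 * 5 * 5
```

The sum over the index `c` is what removes the factor of 32 from the output: each of the 64 output channels mixes all 32 input channels.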
No. If you convolve and pad (ignoring the batch size) a 14x14x32 volume with a set of 64 5x5 filters, you'll end up with a 14x14x64 output volume.
Every single convolutional filter is convolved along the whole input depth, so each 5x5 filter is effectively 5x5x32. Thus, your 14x14x32 input volume is convolved with the first 5x5 filter, and the output is a 14x14x1 feature map.
Then the second 5x5 filter of the stack of 64 filters is convolved with the same input volume. The same operation is done for each one of the 64 filters, and the resulting feature maps are stacked, forming your 14x14x64 output volume.
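The filter-by-filter procedure can be sketched in NumPy as follows. This is an illustration under assumptions, not how TensorFlow implements it: a single image in channels-last layout, random values, stride 1, and zero padding of 2 pixels. All names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Assumed input volume for one image: 14x14 pixels, 32 channels.
volume = rng.standard_normal((14, 14, 32))
# 64 filters, each spanning the full input depth: 5x5x32.
filters = rng.standard_normal((64, 5, 5, 32))

# Zero-pad height and width by 2 so the output stays 14x14.
padded = np.pad(volume, ((2, 2), (2, 2), (0, 0)))
# All 5x5 patches across all channels: shape (14, 14, 32, 5, 5).
windows = sliding_window_view(padded, (5, 5), axis=(0, 1))

feature_maps = []
for f in filters:
    # One filter slides over the whole volume and sums over its
    # 5x5 window and all 32 channels, giving one 14x14 feature map.
    fmap = np.einsum("hwcij,ijc->hw", windows, f)
    feature_maps.append(fmap)

# Stack the 64 feature maps along the depth axis.
output = np.stack(feature_maps, axis=-1)
print(output.shape)  # (14, 14, 64)
```

Each pass through the loop corresponds to one filter from the stack, and `np.stack` plays the role of stacking the feature maps into the output volume.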