
Understanding the effect of average pooling on resolution in a convolutional neural network

I am working with colorization code on the CIFAR-10 dataset and I came across this line:

downsize_module = nn.Sequential(nn.AvgPool2d(2), nn.AvgPool2d(2), nn.Upsample(scale_factor=2), nn.Upsample(scale_factor=2))

Average pooling is applied twice, so what is the resolution of the output image?

Here is my understanding:

For example, with an 8*8 input image, the 1st average pool (2*2) gives a 4*4 output and the 2nd average pool (2*2) gives a 2*2 output.
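A quick way to check this arithmetic is to trace the edge length through each layer. The sketch below is plain Python that only mirrors the shape rules, assuming PyTorch's defaults (for nn.AvgPool2d, stride equal to the kernel size and no padding). The helper names avg_pool_out, upsample_out and trace are made up for illustration and are not part of the colorization code.

```python
def avg_pool_out(size, kernel=2):
    """Edge length after nn.AvgPool2d(kernel), assuming stride == kernel and no padding."""
    return (size - kernel) // kernel + 1


def upsample_out(size, scale=2):
    """Edge length after nn.Upsample(scale_factor=scale) with an integer scale."""
    return size * scale


def trace(size):
    """Return the edge length at the input and after each of the four layers."""
    sizes = [size]
    for step in (avg_pool_out, avg_pool_out, upsample_out, upsample_out):
        size = step(size)
        sizes.append(size)
    return sizes


print(trace(8))   # [8, 4, 2, 4, 8]
print(trace(32))  # [32, 16, 8, 16, 32]  (CIFAR-10 images are 32*32)
```

Note that the two nn.Upsample layers bring the tensor back to its original edge length, so the module's output has the same shape as its input. Only the amount of detail corresponds to the pooled image (2*2 in the example, 8*8 for CIFAR-10).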

So the output has 1/16 as many pixels as the input (4 instead of 64), while each side is 1/4 as long (2 instead of 8).

Which is the correct way to say it: 1/16 in terms of pixels, or 1/4?

You are giving the same information in both cases; you just have to specify which unit you are talking about: the total number of pixels, or the number of pixels along an edge. If your image is square, the total number of pixels scales as the square of the number of pixels along an edge. When resolution refers to the total number of pixels (e.g. in photography), it's common to see something like: Resolution = 10.4Mp or 3.2kp. When talking about screens, on the other hand, the resolution is usually the height of the image in pixels: Resolution = 1080 means an image of 1920x1080 pixels.
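To make the relation between the two units concrete, here is a minimal sketch that computes both ratios for a square image. The function name resolution_ratios is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def resolution_ratios(in_edge, out_edge):
    """Return (edge ratio, total-pixel ratio) for square images."""
    edge_ratio = Fraction(out_edge, in_edge)
    pixel_ratio = edge_ratio ** 2  # total pixels scale as the square of the edge
    return edge_ratio, pixel_ratio


edge, pixels = resolution_ratios(8, 2)
print(edge, pixels)  # 1/4 1/16
```

Both numbers describe the same downsizing; "1/4" is per edge and "1/16" is per total pixel count.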
