
Implementing an image-pyramid with Keras

To get scale invariance (i.e. to detect objects at any scale) in my CNN model, I want to implement Image Pyramids. As the article explains, when creating an image pyramid the image is subjected to repeated smoothing and subsampling.

I am implementing a CNN in Keras. Is there a way to implement image pyramids with Keras? I read an SO post that says to use AveragePooling2D to achieve the pyramid effect.

Is that even correct? How could an AveragePooling2D layer give the pyramid effect?

One CNN architecture that achieves your goal is the U-Net, originally introduced in this paper.

It uses a sequence of convolutional and pooling layers to create a pyramid of feature maps. Note that it is not an image pyramid of the input image; the idea is to learn what is useful at different scales rather than to feed the pyramid in directly. A minimal sketch of that pooling pyramid is shown below.
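Here is a minimal sketch (not the full U-Net) of the convolution-plus-pooling "pyramid" idea in Keras. The input shape, filter counts, and layer names are arbitrary assumptions for illustration; each pooling step halves the spatial resolution, so the network produces feature maps at several scales.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(256, 256, 3))

# Scale 1: full resolution (256x256)
c1 = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
p1 = layers.MaxPooling2D(2)(c1)                 # -> 128x128

# Scale 2: half resolution
c2 = layers.Conv2D(64, 3, padding="same", activation="relu")(p1)
p2 = layers.MaxPooling2D(2)(c2)                 # -> 64x64

# Scale 3: quarter resolution
c3 = layers.Conv2D(128, 3, padding="same", activation="relu")(p2)

# Expose the feature maps at all three scales as outputs
model = models.Model(inputs, [c1, c2, c3])
model.summary()
```

In the actual U-Net these multi-scale feature maps are later upsampled and concatenated in a decoder path; the sketch only shows the downsampling (pyramid) side.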

Now, think about how AveragePooling2D works: you take a patch of the original image, replace it with its average, and then move to the next patch. This is exactly what you describe for generating an image pyramid: smoothing is achieved by the averaging, and replacing each patch with a single pixel is the downsampling.
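If you do want an explicit pyramid of the input image itself, a minimal sketch is to apply AveragePooling2D repeatedly to the input. This is an assumption about your setup (input size and number of levels are placeholders); each pyramid level could then be fed into its own branch of the network.

```python
from tensorflow.keras import layers, models

image = layers.Input(shape=(256, 256, 3))

# Repeated average pooling: smooth each 2x2 patch and keep its mean,
# which is the smooth-then-subsample step of an image pyramid.
level0 = image                                             # 256x256, original
level1 = layers.AveragePooling2D(pool_size=2)(level0)      # 128x128
level2 = layers.AveragePooling2D(pool_size=2)(level1)      # 64x64

# A model that outputs the three pyramid levels
pyramid = models.Model(image, [level0, level1, level2])
pyramid.summary()
```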

