
Input image size CNN Tensorflow

I am feeding my CNN with images of size 2048x2048. However, I accidentally forgot to change the size of the input:

img_width, img_height = 32, 32
input_shape = (3, img_width, img_height)

I was wondering why the training still runs even though the input images are larger than 32x32 pixels? Or does the CNN only recognize part of each image?

Thanks in advance.

Most implementations of convolutional layers will operate on images of any size. For 2D convolutions with padding="SAME", the output size of any convolutional layer will be [batch_size, input_height // stride[0], input_width // stride[1], filters] (valid padding just means you lose a few pixels on each boundary, depending on the kernel size).
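A minimal sketch of this behaviour (assuming TensorFlow 2.x with the Keras API; the filter count, stride, and sizes here are illustrative, not from the question):

import tensorflow as tf

# One Conv2D layer with padding="same" and stride 2; the kernel is built on the
# first call and then reused on an input with a different spatial size.
conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3, strides=2, padding="same")

for size in (32, 2048):
    x = tf.zeros((1, size, size, 3))  # [batch, height, width, channels]
    y = conv(x)
    print(size, y.shape)  # (1, size // 2, size // 2, 16) in both cases

The same layer runs on 32x32 and 2048x2048 inputs; only the output's spatial dimensions change.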

Similarly, the number of weights in the kernel is filters_in * filters_out * np.prod(kernel_size), which is independent of the input size (the bias is just filters_out).
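To make that concrete, here is a sketch (again assuming TensorFlow 2.x / Keras) that builds a layer without fixing the spatial size and inspects its weights:

import tensorflow as tf

conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3, padding="same")
conv.build(input_shape=(None, None, None, 3))  # 3 input channels, any spatial size
kernel, bias = conv.weights
print(kernel.shape)  # (3, 3, 3, 16): (kernel_h, kernel_w, filters_in, filters_out)
print(bias.shape)    # (16,) -- one bias per output filter
# 3 * 3 * 3 * 16 = 432 kernel weights, whatever the input image size is.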

To do tasks such as classification, most CNNs use a spatial pooling layer at the end (e.g. tf.reduce_mean(features, axis=(1, 2))) which reduces the features of any input image batch to [batch_size, n_filters_out] regardless of input size, followed by some dense layers to regress to logits.
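A short sketch of such a head (the 128 feature channels and 10 classes are assumptions for illustration, not from the question):

import tensorflow as tf

features = tf.zeros((4, 64, 64, 128))           # conv features for a batch of 4
pooled = tf.reduce_mean(features, axis=(1, 2))  # global average pool -> (4, 128)
logits = tf.keras.layers.Dense(10)(pooled)      # dense head -> (4, 10)
print(pooled.shape, logits.shape)

Whatever the spatial size of features, the pooled tensor is always [batch_size, n_filters_out], so the dense layers never see the input resolution.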
