
CNN with a dynamic input shape

Hello everyone!

Since I am trying to build a fully convolutional neural network that converts grayscale images to RGB images, I was wondering whether I could train and test the model on images of different sizes (different resolutions and aspect ratios). Normally you would just downsample or upsample, which I do not want to do. I heard that this might be possible with a fully convolutional neural network, but I still have no clue what the code would look like. Could you help me out with some code?

Why is this a problem?

As I said, the input image should not be downsampled, because I am not classifying anything. I want to produce a new image with the same size as the input image, so no resolution should be lost.

Code for a fixed input shape:

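from tensorflow.keras import layers, models

model = models.Sequential()
# Input is fixed to 28x28 single-channel (grayscale) images
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...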

with images of size 28*28 px

How I thought it might work:

model = models.Sequential()
# None leaves the height and width unspecified; only the channel count is fixed
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(None, None, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...

with images of different sizes

  • image1 = 256*300
  • image2 = 1024*800
  • image3 = 500*400

Here is an example of an autoencoder that converts grayscale images to RGB images, but this one has a fixed input shape.


*I am using TF 2.0 Alpha

I figured out that a convolutional neural network does not care about the input shape at all, because the same kernels slide over whatever height and width they are given. What it cares about are kernel size, stride and padding. For example, setting kernel size = 3, stride = 1, padding = 1 does not change the tensor's height and width. When it comes to pooling, it must be ensured that padding is added as well, which is called half/same padding ( http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html ). Thus, it is possible to make a fully convolutional autoencoder that is able to process images of different sizes.
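To make this concrete, here is a minimal sketch rather than a tested solution. The helpers conv_out, pool_same_out and autoencoder_out are hypothetical names that just apply the usual size formulas along one axis. build_colorizer is an assumed Keras layout (filter counts, depth and the sigmoid output are my own choices, not taken from the linked example) that leaves height and width as None and uses padding="same" everywhere.

```python
import math


def conv_out(n, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_same_out(n, pool=2):
    """Output length of pooling with 'same' padding and stride == pool size."""
    return math.ceil(n / pool)


def autoencoder_out(n, depth=2, pool=2):
    """Length after `depth` conv+pool steps, then `depth` conv+upsample steps."""
    for _ in range(depth):
        n = pool_same_out(conv_out(n), pool)
    for _ in range(depth):
        n = conv_out(n) * pool
    return n


def build_colorizer(depth=2):
    """Hypothetical fully convolutional grayscale -> RGB autoencoder."""
    # Imported lazily so the arithmetic helpers above work without TensorFlow.
    from tensorflow.keras import layers, models

    inputs = layers.Input(shape=(None, None, 1))  # height and width left open
    x = inputs
    for i in range(depth):  # encoder: conv keeps the size, pooling halves it
        x = layers.Conv2D(32 * 2**i, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2, padding="same")(x)
    for i in reversed(range(depth)):  # decoder: upsampling doubles the size
        x = layers.Conv2D(32 * 2**i, 3, padding="same", activation="relu")(x)
        x = layers.UpSampling2D(2)(x)
    # Three output channels (RGB), values squashed into [0, 1]
    outputs = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(x)
    return models.Model(inputs, outputs)


# Sizes from the question: each axis should come back unchanged.
for h, w in [(256, 300), (1024, 800), (500, 400)]:
    print((h, w), "->", (autoencoder_out(h), autoencoder_out(w)))
```

With two pooling steps, the output matches the input only when height and width are multiples of 4. That holds for all three sizes in the question, whereas a side of length 30 would come back as 32 and would need cropping or padding. Also keep in mind that all images within one batch must share the same shape, so differently sized images have to be fed one per batch or grouped by size.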
