CNN with dynamic input shape

Question

Hello everyone!

Since I am trying to build a fully convolutional neural network that converts grayscale images to RGB images, I was wondering whether I could train and test the model on images of different sizes (different resolutions and aspect ratios). Normally you would just downsample or upsample, which I do not want to do. I heard that this might be possible with a fully convolutional network, but I still have no clue what the code would look like. Could you help me out with some code?

Why is this a problem?

As I said, the input image should not be downsampled, because I am not classifying anything. I want to produce a new image with the same size as the input image, so nothing should be lost along the way.

Code for a fixed input shape:

from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...

with images of size 28*28 px

How I thought it might work:

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(None, None, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...

with images of different sizes:

image1 = 256*300
image2 = 1024*800
image3 = 500*400

Here is an example of an autoencoder that converts grayscale images to RGB images, but it has a fixed input shape.

*I am using TF 2.0 Alpha

Answer 1

I figured out that a convolutional layer does not care about the spatial size of its input: its weights depend only on the kernel size and the number of channels, not on the image height or width. What determines the output size are kernel size, stride and padding. For example, setting kernel size = 3, stride = 1, padding = 1 does not change the tensor's spatial shape. When it comes to pooling, you have to make sure that padding is added as well; this is called half/same padding ( http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html ). Thus, it is possible to build a fully convolutional autoencoder that can process images of different sizes.
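To make this concrete, here is a minimal sketch of what such a model might look like in Keras. It is not taken from the linked autoencoder example. The layer sizes, the sigmoid output (which assumes RGB targets scaled to [0, 1]) and the helper name build_colorizer are all assumptions chosen for illustration. The two points that matter are leaving height and width as None in the input shape and using padding='same' everywhere.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models


def build_colorizer():
    """Small fully convolutional encoder/decoder: grayscale in, RGB out.

    Hypothetical architecture for illustration only.
    """
    return models.Sequential([
        # Height and width are left as None so any image size is accepted.
        tf.keras.Input(shape=(None, None, 1)),
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D((2, 2), padding='same'),   # halves H and W
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.UpSampling2D((2, 2)),                   # doubles H and W
        # Assumption: RGB targets are scaled to [0, 1], hence sigmoid.
        layers.Conv2D(3, (3, 3), activation='sigmoid', padding='same'),
    ])


model = build_colorizer()
for height, width in [(256, 300), (500, 400)]:
    gray = np.zeros((1, height, width, 1), dtype=np.float32)
    rgb = model(gray)
    print(gray.shape, '->', tuple(rgb.shape))
```

Two caveats with this sketch. With one 2x2 pooling step, the output only matches the input exactly when height and width are even; for an odd dimension, 'same' pooling rounds up and the upsampled result comes out one pixel larger, so such images would need to be padded or cropped. Also, images of different sizes cannot be stacked into one batch tensor, so training would have to use a batch size of 1 or group images of equal size.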

Solution 1
1 accepted, 2019-11-12 14:50:42
