CNN with a dynamic input shape
Hello everyone!
I am trying to build a fully convolutional neural network that converts grayscale images to RGB images, and I was wondering whether I could train and test the model on images of different sizes (different pixel dimensions and aspect ratios). Normally you would just downsample or upsample, which I do not want to do. I heard that this might be possible with a fully convolutional neural network, but I still have no clue what the code would look like. Could you help me out with some code?
Why is this a problem?
As I said, the input image should not be downsampled, because I am not classifying anything. I want to produce a new image with the same size as the input image, so there should not be any loss of image information.
Code for a fixed input shape:
from tensorflow.keras import layers, models

model = models.Sequential()
# Height and width are fixed to 28 x 28, with a single grayscale channel
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...
with images of size 28*28 px
How I thought it might work:
model = models.Sequential()
# None leaves height and width unspecified; only the channel count is fixed
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(None, None, 1)))
model.add(layers.MaxPooling2D((2, 2)))
...
with images of different sizes
Here is an example of an autoencoder which converts grayscale images to RGB images, but this one has a fixed input shape.
I am using TF 2.0 Alpha.
I figured out that a convolutional neural network does not care at all about the input shape. What it cares about are kernel size, stride and padding. For example, setting kernel size = 3, stride = 1, padding = 1 does not change the height and width of the tensor. When it comes to pooling, it must be ensured that padding is added as well, which is called half/same padding ( http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html ). A pooling layer with stride 2 still halves the height and width, so a later upsampling step has to bring the tensor back to the input size.
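The shape arithmetic behind this can be checked with a few lines of plain Python. This is only a sketch: the helper names `conv_output_size` and `same_pool_output_size` are made up for illustration, the first follows the usual output-size formula floor((n + 2p - k) / s) + 1, and the second assumes the Keras/TensorFlow convention that "same" padding yields ceil(n / s).

```python
import math


def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def same_pool_output_size(n, stride):
    """Output length of a pooling layer with 'same' padding (assumed: ceil(n / stride))."""
    return math.ceil(n / stride)


for n in (28, 37, 100):
    conv = conv_output_size(n, kernel=3, stride=1, padding=1)  # size is preserved
    pool = same_pool_output_size(n, stride=2)                  # size is halved, rounded up
    print(f"input {n}: conv -> {conv}, pool -> {pool}")
```

With kernel size 3, stride 1 and padding 1 the convolution returns the input length for any n, which is why the layer itself does not depend on the image size. The pooling result depends on n, so odd sizes need some care when upsampling back.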
Thus, it is possible to build a fully convolutional autoencoder that can process images of different sizes.
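A minimal sketch of such a model in Keras might look like the following. It is not the code from the linked example: the function name `build_colorizer`, the number of layers and the filter counts are assumptions chosen for illustration, and pooling is left out on purpose so that the output always has exactly the input height and width.

```python
import numpy as np
from tensorflow.keras import layers, models


def build_colorizer():
    """Hypothetical fully convolutional grayscale-to-RGB model (illustrative layer sizes)."""
    # Height and width are None, so any image size is accepted; 1 channel = grayscale
    inputs = layers.Input(shape=(None, None, 1))
    # padding="same" with stride 1 keeps height and width unchanged
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(inputs)
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(x)
    # Three output channels for RGB, scaled to [0, 1] by the sigmoid
    outputs = layers.Conv2D(3, (3, 3), padding="same", activation="sigmoid")(x)
    return models.Model(inputs, outputs)


model = build_colorizer()

# The same model handles images of different sizes and aspect ratios
for height, width in [(28, 28), (37, 53)]:
    gray = np.random.rand(1, height, width, 1).astype("float32")
    rgb = model.predict(gray, verbose=0)
    print(gray.shape, "->", rgb.shape)
```

All images inside one batch still have to share the same size, so for training on mixed sizes one option is a batch size of 1 or grouping images by size. If pooling and upsampling layers are added, the output only matches the input size when height and width are divisible by the total downsampling factor; otherwise the images would need padding or cropping.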