Keras CNN with different image sizes

Question

I'm trying to train a CNN on the VOC2012 dataset. My project requires black-and-white (single-channel) data, so I extracted the R channel of each image. So far so good. The trouble is that the images are of different sizes, so I can't figure out how to pass them to the model. I compiled my model and then created my mini-batches of size 32 as below (where X_train and Y_train are the paths to the files).

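import matplotlib.pyplot as plt
import numpy as np

X, Y = [], []

# Load each input image and add a trailing channel axis: (height, width, 1)
for x in X_train:
    img = plt.imread(x)
    img = img.reshape(*(img.shape), 1)
    X.append(img)

# Same for the target images
for y in Y_train:
    img = plt.imread(y)
    img = img.reshape(*(img.shape), 1)
    Y.append(img)

# np.array() cannot stack images of different sizes into one 4-D batch
model.train_on_batch(np.array(X), np.array(Y))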

However, I suspect that because the images are all of different sizes, the NumPy array has shape (32,) rather than (32, height, width, 1) as I'd expect. How do I take care of this?

Answer 1

According to some sources (Quora, Cross Validated), it is indeed possible to train at least some architectures with varying input sizes.
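As a rough illustration of what "some architectures" can mean, here is a minimal sketch of a fully convolutional Keras model whose input height and width are left as None, trained one image at a time. The layer sizes, the loss, and the random stand-in data are my own assumptions, not something taken from the question or the linked answers, and it assumes each target has the same size as its input.

```python
import numpy as np
from tensorflow import keras

# Fully convolutional model: height and width are None, so every batch
# may have its own spatial size. Layer sizes are illustrative assumptions.
inputs = keras.Input(shape=(None, None, 1))
x = keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
outputs = keras.layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stand-in data (hypothetical): three image/target pairs of different sizes.
rng = np.random.default_rng(0)
sizes = [(20, 30), (25, 25), (32, 18)]
pairs = [
    (rng.random((h, w, 1)).astype("float32"),
     rng.random((h, w, 1)).astype("float32"))
    for h, w in sizes
]

# Batch size 1: add a leading batch axis and train on each image alone.
for img, target in pairs:
    loss = model.train_on_batch(img[np.newaxis], target[np.newaxis])
```

Because there are no Dense or Flatten layers and the convolutions use "same" padding, the output keeps the spatial size of whatever image is fed in. A model that ends in a Dense layer would need something like global pooling before it to accept varying sizes.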

To collect images of varying sizes, you can simply use a Python list of NumPy arrays, or an ndarray of dtype object. For the training step, the Quora answer mentions that either only a batch size of 1 can be used, or several images can be grouped into a batch based on their sizes. Zero padding could also be used to make the images evenly sized, but I can't say much about the validity of that approach.
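The zero-padding idea mentioned above could be sketched roughly like this. The helper name pad_to_common_size is hypothetical, and padding at the bottom and right edges is just one possible choice; whether padded borders hurt training is not something this sketch addresses.

```python
import numpy as np

def pad_to_common_size(images):
    """Zero-pad 2-D arrays (bottom/right) to the largest height and width.

    Returns one array of shape (n, max_height, max_width, 1).
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    batch = np.zeros((len(images), max_h, max_w, 1), dtype=np.float32)
    for i, img in enumerate(images):
        h, w = img.shape
        batch[i, :h, :w, 0] = img  # original pixels go in the top-left corner
    return batch

# Ten single-channel "images" with different sizes, from (5, 10) to (14, 19)
images = [np.ones((i + 5, i + 10)) for i in range(10)]
batch = pad_to_common_size(images)
print(batch.shape)  # (10, 14, 19, 1)
```

The result is a regular 4-D array, so it can be passed to train_on_batch directly. The targets would need the same padding so that inputs and outputs stay aligned.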

Best of luck in your research!

Example code for illustration:

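import numpy as np

# Generate 10 "images" with different sizes, kept in a plain Python list
images = [np.zeros((i+5, i+10)) for i in range(10)]

# Or as an ndarray of dtype object (newer NumPy versions raise an error
# for ragged input unless dtype=object is given)
images = np.array([np.zeros((i+5, i+10)) for i in range(10)], dtype=object)

# Or an empty object array to append to
images = np.array([], dtype=object)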
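Grouping images by size, as the Quora answer suggests, might look something like the following. This is only a sketch: group_by_shape is a hypothetical helper, and it only batches images whose height and width match exactly.

```python
from collections import defaultdict

import numpy as np

def group_by_shape(images):
    """Map each distinct (height, width) to an array of shape (n, height, width, 1)."""
    buckets = defaultdict(list)
    for img in images:
        buckets[img.shape].append(img)
    # Images within one bucket share a shape, so they stack into a regular array.
    return {
        shape: np.stack(group)[..., np.newaxis]
        for shape, group in buckets.items()
    }

images = [np.zeros((5, 10)), np.zeros((6, 11)), np.zeros((5, 10))]
batches = group_by_shape(images)
for shape, batch in batches.items():
    print(shape, batch.shape)
```

Each value in batches is a regular 4-D array and can be passed to train_on_batch on its own. With a dataset where few images share an exact size, many buckets will hold a single image, which is effectively the batch-size-1 case.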

Answer 1
0 accepted 2018-09-08 18:25:07
