ImageDataGenerator() for CNN with input and output as an Image

I'm trying to train a model that learns a mapping like this:

Grayscale Image -> Coloured Image

But the dataset can't be loaded into RAM all at once as X and Y, because it is far too large.

I looked into ImageDataGenerator(), but the documentation didn't give me a clear answer on how to make it work here.

Summary:

Input Shape = (2048, 2048, 1)

Output Shape = (2048, 2048, 2)

Training Dataset = 17,000 images

Validation Dataset = 1,000 images
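For a sense of scale, here is a rough estimate of the memory the training set would need if X and Y were held in RAM. It assumes the arrays are stored as float32, which the question does not state; uint8 storage would be four times smaller but still far beyond typical RAM.

```python
# Back-of-the-envelope RAM estimate, assuming float32 storage (an assumption).
BYTES_PER_FLOAT32 = 4
HEIGHT, WIDTH = 2048, 2048


def dataset_gib(num_images, channels):
    """Size in GiB of num_images arrays of shape (HEIGHT, WIDTH, channels)."""
    return num_images * HEIGHT * WIDTH * channels * BYTES_PER_FLOAT32 / 2**30


x_train_gib = dataset_gib(17_000, channels=1)  # grayscale inputs
y_train_gib = dataset_gib(17_000, channels=2)  # two-channel targets
total_gib = x_train_gib + y_train_gib
print(f"X: {x_train_gib:.1f} GiB, Y: {y_train_gib:.1f} GiB, total: {total_gib:.1f} GiB")
```

Under that assumption the training set alone comes to roughly 800 GiB, so the images have to be streamed from disk in batches.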

Here's the structure of the model I'm trying to train:

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 2048, 2048,  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 2048, 2048, 1 160         input_1[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu (LeakyReLU)         (None, 2048, 2048, 1 0           conv2d[0][0]                     
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 2048, 2048, 3 4640        leaky_re_lu[0][0]                
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 2048, 2048, 3 128         leaky_re_lu_1[0][0]              
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 1024, 1024, 3 0           batch_normalization[0][0]        
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 1024, 1024, 6 18496       max_pooling2d[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU)       (None, 1024, 1024, 6 0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 1024, 1024, 6 256         leaky_re_lu_2[0][0]              
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 512, 512, 64) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 512, 512, 128 73856       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU)       (None, 512, 512, 128 0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 512, 512, 128 512         leaky_re_lu_3[0][0]              
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 512, 512, 256 295168      batch_normalization_2[0][0]      
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU)       (None, 512, 512, 256 0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 512, 512, 256 1024        leaky_re_lu_4[0][0]              
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D)    (None, 1024, 1024, 2 0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 1024, 1024, 1 295040      up_sampling2d[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU)       (None, 1024, 1024, 1 0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 1024, 1024, 1 512         leaky_re_lu_5[0][0]              
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)  (None, 2048, 2048, 1 0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 2048, 2048, 6 73792       up_sampling2d_1[0][0]            
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_6[0][0]                   
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 2048, 2048, 6 0           leaky_re_lu_6[0][0]              
                                                                 input_1[0][0]                    
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 2048, 2048, 6 37504       concatenate[0][0]                
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU)       (None, 2048, 2048, 6 0           conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 2048, 2048, 6 256         leaky_re_lu_7[0][0]              
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 2048, 2048, 3 18464       batch_normalization_5[0][0]      
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU)       (None, 2048, 2048, 3 0           conv2d_8[0][0]                   
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 2048, 2048, 2 578         leaky_re_lu_8[0][0]              
==================================================================================================
Total params: 820,386
Trainable params: 819,042
Non-trainable params: 1,344
__________________________________________________________________________________________________

I'll post my comment here with some example code:

You need to train the model in batches. For example, instead of feeding 500 images in a single step, you can split them into 10 batches of 50 images each, i.e. 10 steps per epoch. That way only 50 images are in memory at any time. Set the batch size, and set shuffle to True so that each batch contains a different mix of images.
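To make the batch arithmetic concrete: the number of batches (steps) needed to see every image once is the dataset size divided by the batch size, rounded up. A minimal sketch using the dataset sizes from the question and the batch size of 50 from this answer; the helper name is made up for illustration.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to pass over num_images once."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(17_000, batch_size=50)
val_steps = steps_per_epoch(1_000, batch_size=50)
print(train_steps, val_steps)
```

At 2048 x 2048 pixels a batch of 50 may itself be too large for GPU memory once activations are counted, so the batch size will likely need to be much smaller in practice.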

If your images are in a directory, the comment above translates into something like this:

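from keras_preprocessing.image import ImageDataGenerator

train_path = "path/to/train"  # placeholder: directory that contains the image subfolders
target_size = (2048, 2048)    # (height, width), taken from the question

preprocessing_images = ImageDataGenerator()

train_generator = preprocessing_images.flow_from_directory(
    train_path,
    target_size=target_size,
    batch_size=50,
    # "categorical" yields one-hot class labels, which suits a classifier with a 2-unit output layer;
    # for image-to-image training, use class_mode=None and build the targets yourself
    class_mode="categorical",
    shuffle=True,
    color_mode="grayscale",
    seed=1234567890,
)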

There you have a generator that yields batches of 50 images. I have not tailored the arguments to your specific problem; you need to change target_size and the rest. Note that this is an infinite generator: it will keep returning batches of 50 images indefinitely. You can wrap it in a tf.data.Dataset or specify the steps per epoch; there are several ways. I hope it helps. Work with it a bit, and if you still have problems I will elaborate on the answer (I am busy right now).
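For an image-to-image task, each batch has to pair an input image with a target image rather than a class label. One way to get that is a plain Python generator, sketched below with NumPy only. The names paired_batches, load_pair and fake_load_pair are hypothetical, and the loader is a stand-in: a real one would read a file and return the grayscale input together with its two-channel target.

```python
import numpy as np


def paired_batches(paths, load_pair, batch_size, shuffle=True, seed=1234567890):
    """Yield (inputs, targets) batches forever, loading one batch at a time.

    load_pair is a user-supplied function (hypothetical here) that maps a
    path to an (input, target) pair of arrays, e.g. (H, W, 1) and (H, W, 2).
    """
    rng = np.random.default_rng(seed)
    indices = np.arange(len(paths))
    while True:  # infinite, like the Keras directory iterator
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch_idx = indices[start:start + batch_size]
            pairs = [load_pair(paths[i]) for i in batch_idx]
            inputs = np.stack([p[0] for p in pairs]).astype("float32")
            targets = np.stack([p[1] for p in pairs]).astype("float32")
            yield inputs, targets


def fake_load_pair(path):
    """Stand-in loader for the demo: returns small zero-filled arrays."""
    return np.zeros((8, 8, 1)), np.zeros((8, 8, 2))


demo_paths = [f"img_{i}.png" for i in range(10)]
batches = paired_batches(demo_paths, fake_load_pair, batch_size=4)
first_inputs, first_targets = next(batches)
print(first_inputs.shape, first_targets.shape)
```

Because the generator never stops, Keras needs to be told how long an epoch is, for example model.fit(batches, steps_per_epoch=3, epochs=10), where model stands for your compiled Keras model.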

That would be easiest with a custom training loop.

def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)

Here, my data iterator cycles through all the color pictures, but each batch is converted to grayscale before being passed to the model. Then the RGB output of the model is compared to the original RGB image. You will have to use the argument class_mode=None in flow_from_directory. I used tf.image.rgb_to_grayscale to convert the RGB images to grayscale.
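The pairing idea can be illustrated without TensorFlow. The NumPy sketch below derives the grayscale input from each color batch and keeps the color batch as the target. The weights are the standard luma coefficients, which to my knowledge are the ones tf.image.rgb_to_grayscale applies; the function name to_training_pair is made up for this example.

```python
import numpy as np

# Standard luma weights for R, G and B (assumed to match tf.image.rgb_to_grayscale).
RGB_WEIGHTS = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)


def to_training_pair(rgb_batch):
    """Return (grayscale_input, color_target) for a batch shaped (N, H, W, 3)."""
    # Weighted sum over the channel axis, then restore a trailing channel axis.
    gray = np.tensordot(rgb_batch, RGB_WEIGHTS, axes=([-1], [0]))[..., np.newaxis]
    return gray, rgb_batch


rgb = np.random.default_rng(0).random((4, 32, 32, 3), dtype=np.float32)
gray, target = to_training_pair(rgb)
print(gray.shape, target.shape)
```

Only color images need to be stored on disk this way; the grayscale inputs are computed on the fly for each batch.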

Full example:

import os
import tensorflow as tf

# Enable memory growth only when a GPU is actually present
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

os.chdir(r'catsanddogs')

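# Rescale pixels to [0, 1] so they are valid targets for binary cross-entropy
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
iterator = generator.flow_from_directory(
    directory='.',
    target_size=(32, 32),
    batch_size=4,
    class_mode=None)  # no labels: the color images themselves are the targets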

encoder = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 1)),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Dense(16)
])

decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, input_shape=[16]),
    # Sigmoid keeps outputs in [0, 1], as BinaryCrossentropy expects by default
    tf.keras.layers.Dense(32 * 32 * 3, activation='sigmoid'),
    tf.keras.layers.Reshape([32, 32, 3])
])


autoencoder = tf.keras.Sequential([encoder, decoder])

loss_object = tf.losses.BinaryCrossentropy()

reconstruction_loss = tf.metrics.Mean(name='reconstruction_loss')

optimizer = tf.optimizers.Adam()


def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)

        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)

    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))

    reconstruction_loss(loss)


if __name__ == '__main__':
    template = 'Epoch {:2} Reconstruction Loss {:.4f}'
    for epoch in range(50):
        reconstruction_loss.reset_states()
        # The iterator loops forever, so stop after one full pass (len(iterator) batches)
        for _ in range(len(iterator)):
            reconstruct(next(iterator))
        print(template.format(epoch + 1, reconstruction_loss.result()))
