ImageDataGenerator() for CNN with input and output as an Image
I'm looking for a way to train a model on a mapping like this:
Grayscale Image -> Coloured Image
But the dataset can't all be loaded into RAM as X and Y, for obvious reasons.
I looked up ImageDataGenerator(), but the documentation didn't give me a clear answer on how to make it work here.
Summary:
Input Shape = (2048, 2048, 1)
Output Shape = (2048, 2048, 2)
Training Dataset = 17,000 images
Validation Dataset = 1,000 images
Here's the structure of the model I'm trying to train:
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 2048, 2048, 0
__________________________________________________________________________________________________
conv2d (Conv2D) (None, 2048, 2048, 1 160 input_1[0][0]
__________________________________________________________________________________________________
leaky_re_lu (LeakyReLU) (None, 2048, 2048, 1 0 conv2d[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 2048, 2048, 3 4640 leaky_re_lu[0][0]
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 2048, 2048, 3 0 conv2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 2048, 2048, 3 128 leaky_re_lu_1[0][0]
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 1024, 1024, 3 0 batch_normalization[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 1024, 1024, 6 18496 max_pooling2d[0][0]
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 1024, 1024, 6 0 conv2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 1024, 1024, 6 256 leaky_re_lu_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 512, 512, 64) 0 batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 512, 512, 128 73856 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 512, 512, 128 0 conv2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 512, 512, 128 512 leaky_re_lu_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 512, 512, 256 295168 batch_normalization_2[0][0]
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU) (None, 512, 512, 256 0 conv2d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 512, 512, 256 1024 leaky_re_lu_4[0][0]
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D) (None, 1024, 1024, 2 0 batch_normalization_3[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 1024, 1024, 1 295040 up_sampling2d[0][0]
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU) (None, 1024, 1024, 1 0 conv2d_5[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 1024, 1024, 1 512 leaky_re_lu_5[0][0]
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D) (None, 2048, 2048, 1 0 batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 2048, 2048, 6 73792 up_sampling2d_1[0][0]
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU) (None, 2048, 2048, 6 0 conv2d_6[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 2048, 2048, 6 0 leaky_re_lu_6[0][0]
input_1[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D) (None, 2048, 2048, 6 37504 concatenate[0][0]
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU) (None, 2048, 2048, 6 0 conv2d_7[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 2048, 2048, 6 256 leaky_re_lu_7[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 2048, 2048, 3 18464 batch_normalization_5[0][0]
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU) (None, 2048, 2048, 3 0 conv2d_8[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 2048, 2048, 2 578 leaky_re_lu_8[0][0]
==================================================================================================
Total params: 820,386
Trainable params: 819,042
Non-trainable params: 1,344
__________________________________________________________________________________________________
I am going to post my comment with some example code:
You need to train the model in batches. For example, if you want to use 500 images in one epoch, you can split them into 10 batches of 50 images each. That way you only load 50 images into memory at a time. You have to set the batch size, and set shuffle to True so that the batches contain different images in each epoch.
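To put rough numbers on why batching matters here, the following is a back-of-the-envelope sketch using the shapes from the question. It assumes pixels are held as float32 (4 bytes each), which the question does not state, and it counts only the image data, not the model's activations.

```python
# Back-of-the-envelope memory estimate for the shapes in the question.
# Assumption: every pixel value is stored as float32 (4 bytes).
BYTES_PER_VALUE = 4
HEIGHT, WIDTH = 2048, 2048
INPUT_CHANNELS, OUTPUT_CHANNELS = 1, 2


def pair_bytes():
    """Bytes needed for one (input, target) image pair."""
    values = HEIGHT * WIDTH * (INPUT_CHANNELS + OUTPUT_CHANNELS)
    return values * BYTES_PER_VALUE


def gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / 2**30


full_dataset = 17_000 * pair_bytes()  # every training pair held at once
one_batch = 50 * pair_bytes()         # a single batch of 50 pairs

print(f"one pair:     {pair_bytes() / 2**20:.0f} MiB")
print(f"full dataset: {gib(full_dataset):.1f} GiB")
print(f"batch of 50:  {gib(one_batch):.2f} GiB")
```

Under these assumptions one pair takes 48 MiB, the full training set roughly 797 GiB, and a batch of 50 about 2.3 GiB, so with images this large a smaller batch size than 50 may well be necessary.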
If you have your images in a directory, the comment above translates into something like this:
from keras_preprocessing.image import ImageDataGenerator

# Placeholder values: replace them with your own dataset path and image size.
train_path = "path/to/train"
target_size = (2048, 2048)

preprocessing_images = ImageDataGenerator()
train_generator = preprocessing_images.flow_from_directory(
    train_path,
    target_size=target_size,
    batch_size=50,
    class_mode="categorical",  # classes are provided in categorical format for a 2-unit output layer
    shuffle=True,
    color_mode="grayscale",
    seed=1234567890
)
There you have a generator that yields batches of 50 images. I have not set the arguments for your specific problem, so you need to change target_size and the rest. Note that this is an infinite generator: it will keep returning batches of 50 images forever. You can wrap it in a tf.data.Dataset or specify the number of steps per epoch; there are several ways. I hope it helps. Work with it a bit, and if you still have problems I will elaborate on the answer (I am busy right now).
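As a rough illustration of the "steps per epoch" idea, the sketch below computes how many batches make up one pass over the data and cuts an endless generator off after that many. The helper names and the `endless_batches` stand-in are hypothetical; a real Keras iterator would take the stand-in's place.

```python
import itertools
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def one_epoch(batch_generator, num_samples, batch_size):
    """Take exactly one epoch's worth of batches from an endless generator."""
    return itertools.islice(batch_generator, steps_per_epoch(num_samples, batch_size))


def endless_batches(batch_size):
    """Hypothetical stand-in for a Keras iterator: yields dummy batches forever."""
    for batch_index in itertools.count():
        yield [batch_index] * batch_size


train_steps = steps_per_epoch(17_000, 50)  # training set size from the question
val_steps = steps_per_epoch(1_000, 50)     # validation set size from the question
batches = list(one_epoch(endless_batches(50), 17_000, 50))
print(train_steps, val_steps, len(batches))
```

With Keras, the same numbers would typically be passed to `model.fit` as `steps_per_epoch=train_steps` and `validation_steps=val_steps` rather than slicing the generator by hand.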
That would be easiest with a custom training loop.
def reconstruct(colored_inputs):
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)
        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)
    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))
    reconstruction_loss(loss)
Here, my data iterator cycles through all the color pictures, but each batch is converted to grayscale before being passed to the model. The RGB output of the model is then compared with the original RGB image. You will have to use the argument class_mode=None in flow_from_directory. I used tf.image.rgb_to_grayscale to convert from RGB to grayscale.
Full example:
import os

import tensorflow as tf

# Let TensorFlow allocate GPU memory on demand (skipped when no GPU is present).
physical_devices = tf.config.list_physical_devices('GPU')
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)

os.chdir(r'catsanddogs')

# class_mode=None makes the iterator yield only images, without labels.
# rescale maps pixel values to [0, 1], the range binary cross-entropy expects.
generator = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1. / 255)
iterator = generator.flow_from_directory(
    target_size=(32, 32),
    directory='.',
    batch_size=4,
    class_mode=None)

# Grayscale (32, 32, 1) input is compressed to a 16-dimensional code.
encoder = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 1)),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Dense(16)
])

# The code is expanded back to an RGB (32, 32, 3) image.
# The sigmoid keeps outputs in [0, 1] so they match the rescaled targets.
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, input_shape=[16]),
    tf.keras.layers.Dense(32 * 32 * 3, activation='sigmoid'),
    tf.keras.layers.Reshape([32, 32, 3])
])

autoencoder = tf.keras.Sequential([encoder, decoder])

loss_object = tf.losses.BinaryCrossentropy()
reconstruction_loss = tf.metrics.Mean(name='reconstruction_loss')
optimizer = tf.optimizers.Adam()


def reconstruct(colored_inputs):
    """Run one training step: grayscale in, color out, compared with the original."""
    with tf.GradientTape() as tape:
        grayscale_inputs = tf.image.rgb_to_grayscale(colored_inputs)
        out = autoencoder(grayscale_inputs)
        loss = loss_object(colored_inputs, out)
    gradients = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(gradients, autoencoder.trainable_variables))
    reconstruction_loss(loss)


if __name__ == '__main__':
    template = 'Epoch {:2} Reconstruction Loss {:.4f}'
    for epoch in range(50):
        reconstruction_loss.reset_states()
        # The iterator never stops on its own, so draw len(iterator) batches per epoch.
        for _ in range(len(iterator)):
            reconstruct(next(iterator))
        print(template.format(epoch + 1, reconstruction_loss.result()))
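If you would rather not write the training loop yourself, the same pairing idea can be sketched as a wrapper generator that turns each color batch into a (grayscale input, color target) pair. This is only a sketch: the function names are hypothetical, NumPy stands in for TensorFlow, random arrays stand in for the Keras iterator, and the luma weights are the ones I believe tf.image.rgb_to_grayscale uses.

```python
import numpy as np

# Luma weights; assumed to match those used by tf.image.rgb_to_grayscale.
RGB_TO_GRAY = np.array([0.2989, 0.5870, 0.1140], dtype=np.float32)


def to_grayscale(color_batch):
    """Convert (batch, height, width, 3) RGB to (batch, height, width, 1)."""
    gray = np.tensordot(color_batch, RGB_TO_GRAY, axes=([-1], [0]))
    return gray[..., np.newaxis]


def grayscale_color_pairs(color_batches):
    """Yield (grayscale_input, color_target) pairs from batches of RGB images."""
    for color_batch in color_batches:
        color_batch = np.asarray(color_batch, dtype=np.float32)
        yield to_grayscale(color_batch), color_batch


# Stand-in for the Keras iterator: two random batches of four 32x32 RGB images.
rng = np.random.default_rng(0)
fake_batches = [rng.random((4, 32, 32, 3), dtype=np.float32) for _ in range(2)]

for gray, color in grayscale_color_pairs(fake_batches):
    print(gray.shape, color.shape)
```

A generator like this could in principle be handed to `model.fit` together with `steps_per_epoch`, since the underlying Keras iterator is endless. For the 2-channel target in the question, the target side of the pair would need a different conversion than the plain RGB copy used here.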