GPU out of memory when training convolutional neural network on Tensorflow
I am training a convolutional neural network on a set of about 9000 images (300x500) with TensorFlow 1.9 on a GTX 1080 Ti, but I run out of memory every time. I get a warning that the process exceeds 10% of system memory, and after a few minutes the process is killed. My code is below.
import tensorflow as tf
from os import listdir
train_path = '/media/NewVolume/colorizer/img/train/'
col_train_path = '/media/NewVolume/colorizer/img/colored/train/'
val_path = '/media/NewVolume/colorizer/img/val/'
col_val_path = '/media/NewVolume/colorizer/img/colored/val/'
def load_image(image_file):
    image = tf.read_file(image_file)
    image = tf.image.decode_jpeg(image)
    return image
train_dataset = []
col_train_dataset = []
val_dataset = []
col_val_dataset = []
for i in listdir(train_path):
    train_dataset.append(load_image(train_path + i))
    col_train_dataset.append(load_image(col_train_path + i))
for i in listdir(val_path):
    val_dataset.append(load_image(val_path + i))
    col_val_dataset.append(load_image(col_val_path + i))
train_dataset = tf.stack(train_dataset)
col_train_dataset = tf.stack(col_train_dataset)
val_dataset = tf.stack(val_dataset)
col_val_dataset = tf.stack(col_val_dataset)
input1 = tf.placeholder(tf.float32, [None, 300, 500, 1])
color = tf.placeholder(tf.float32, [None, 300, 500, 3])
#MODEL
conv1 = tf.layers.conv2d(inputs = input1, filters = 8, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
pool1 = tf.layers.max_pooling2d(inputs = conv1, pool_size=[2, 2], strides=2)
conv2 = tf.layers.conv2d(inputs = pool1, filters = 16, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
pool2 = tf.layers.max_pooling2d(inputs = conv2, pool_size=[2, 2], strides=2)
conv3 = tf.layers.conv2d(inputs = pool2, filters = 32, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
pool3 = tf.layers.max_pooling2d(inputs = conv3, pool_size=[2, 2], strides=2)
flat = tf.layers.flatten(inputs = pool3)
dense = tf.layers.dense(flat, 2432, activation = tf.nn.relu)
reshaped = tf.reshape(dense, [tf.shape(dense)[0],38, 64, 1])
conv_trans1 = tf.layers.conv2d_transpose(inputs = reshaped, filters = 32, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
upsample1 = tf.image.resize_nearest_neighbor(conv_trans1, (2*tf.shape(conv_trans1)[1],2*tf.shape(conv_trans1)[2]))
conv_trans2 = tf.layers.conv2d_transpose(inputs = upsample1, filters = 16, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
upsample2 = tf.image.resize_nearest_neighbor(conv_trans2, (2*tf.shape(conv_trans2)[1],2*tf.shape(conv_trans2)[2]))
conv_trans3 = tf.layers.conv2d_transpose(inputs = upsample2, filters = 8, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
upsample3 = tf.image.resize_nearest_neighbor(conv_trans3, (2*tf.shape(conv_trans3)[1],2*tf.shape(conv_trans3)[2]))
conv_trans4 = tf.layers.conv2d_transpose(inputs = upsample3, filters = 3, kernel_size=[5, 5], activation=tf.nn.relu, padding = 'same')
reshaped2 = tf.reshape(dense, [tf.shape(conv_trans4)[0],300,500,3])
#TRAINING
loss = tf.losses.mean_squared_error(color, reshaped2)
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)
EPOCHS = 10
BATCH_SIZE = 3
dataset = tf.data.Dataset.from_tensor_slices((train_dataset,col_train_dataset)).repeat().batch(BATCH_SIZE)
iterator = dataset.make_one_shot_iterator()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(EPOCHS):
        x, y = iterator.get_next()
        _, loss_value = sess.run([train_step, loss], feed_dict={input1: x.eval(session=sess), color: y.eval(session=sess)})
        print("Iter: {}, Loss: {:.4f}".format(i, loss_value))
I think your problem is in the following code:
def load_image(image_file):
    image = tf.read_file(image_file)
    image = tf.image.decode_jpeg(image)
    return image
...
for i in listdir(train_path):
    train_dataset.append(load_image(train_path + i))
    col_train_dataset.append(load_image(col_train_path + i))
You are using TF tensor operations as if they were regular code, but what you actually get are nodes on the graph, which can only be evaluated inside a session. In this case, you are trying to load every image from both the training and validation datasets into GPU memory (since your session runs on the GPU). My guess is that your images take up more memory than your GPU has.
There are several solutions to this problem. You could make the tf.read_file operation part of the graph and pass each batch of image file names through the feed dict in the training loop. You could build a proper input pipeline, in which the file names, batching, and loading of the file data are all handled in the graph. Or you could use an external library to load the images into numpy arrays and feed the numpy arrays to the graph.