
Tensorflow out of memory error while running autoencoders

I am running a Google Colab notebook to explore autoencoders for removing noise from data, and I keep getting an OOM error from TensorFlow. My code is as follows:

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import EarlyStopping

def denoising_autoencoder():
  model = Sequential()
  model.add(keras.layers.Input(shape=(420, 540, 1)))
  model.add(keras.layers.Conv2D(48, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.Conv2D(72, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.Conv2D(144, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.BatchNormalization())
  model.add(keras.layers.MaxPooling2D((2, 2), padding='same'))
  model.add(keras.layers.Dropout(0.5))
  model.add(keras.layers.Conv2D(144, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.Conv2D(72, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.Conv2D(48, (3, 3), activation='relu', padding='same'))
  model.add(keras.layers.BatchNormalization())
  model.add(keras.layers.UpSampling2D((2, 2)))
  model.add(keras.layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same'))
  return model

model = denoising_autoencoder()
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_absolute_error'])
callback = EarlyStopping(monitor='loss', patience=30)
history = model.fit(X_train, Y_train, validation_data=(X_val, Y_val), epochs=30, batch_size=24, verbose=0, callbacks=[callback])

And the error I get is as follows:

ResourceExhaustedError:  OOM when allocating tensor with shape[24,48,420,540] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node gradient_tape/sequential/conv2d_6/Conv2D/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_train_function_1289]
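The shape in the message is the single activation tensor TensorFlow failed to allocate: batch 24 × 48 channels × 420 × 540 spatial, stored as float32 (4 bytes per value). A quick back-of-envelope check of its size:

```python
# Size of the tensor named in the OOM message: shape [24, 48, 420, 540], float32.
batch, channels, height, width = 24, 48, 420, 540
bytes_per_float32 = 4
tensor_bytes = batch * channels * height * width * bytes_per_float32
print(f"{tensor_bytes / 2**30:.2f} GiB")  # roughly 0.97 GiB for ONE activation tensor
```

Since backprop keeps activations like this for every conv layer, plus gradients of similar size, the total demand easily exceeds a typical Colab GPU's memory; halving `batch_size` halves each of these tensors.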

I have also tried the following (based on reading some other posts regarding this issue):

gpu_options = tf.compat.v1.GPUOptions(allow_growth=True)
session = tf.compat.v1.InteractiveSession(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options))
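On TF 2.x, which Colab runs by default, the idiomatic equivalent of the `compat.v1` session snippet above is to enable memory growth on the physical GPUs before any tensors are placed on them; a minimal sketch:

```python
import tensorflow as tf

# Must run before any GPU work, or TF raises RuntimeError.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```

Note that `allow_growth` only changes how TF reserves memory (incrementally instead of grabbing it all up front); if the model genuinely needs more memory than the GPU has, the OOM remains, so reducing `batch_size` (e.g. 24 → 8) is usually the more effective fix.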

It could be because the numbers are floats, which can consume a lot of memory. You can try quantization, a technique that mitigates this problem. Here's the documentation .
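As an illustration of the saving quantization targets (not the quantization API itself): casting the tensor from the OOM message from float32 to float16 halves its footprint, and full int8 quantization (e.g. TensorFlow Lite post-training quantization) cuts it by 4x:

```python
import numpy as np

# Footprint of the OOM tensor [24, 48, 420, 540] at different precisions.
shape = (24, 48, 420, 540)
n = int(np.prod(shape))
for dtype in (np.float32, np.float16, np.int8):
    print(np.dtype(dtype).name, n * np.dtype(dtype).itemsize, "bytes")
```

One caveat: TFLite quantization applies to inference, while the OOM here happens during training; the analogous training-time lever is mixed precision, `tf.keras.mixed_precision.set_global_policy('mixed_float16')`.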
