
Tensorflow leaking memory at every call to session.run() with finalized graph

I am trying to use the tf.data API to feed variable-size image data (LxLx2) to my model, but I notice that I am leaking memory at every iteration. I would expect memory use to be determined by the largest image in the dataset; however, I can see that memory use increases even when processing an image smaller than the maximum size seen so far.

Leaking memory over 100 iterations
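
As a rough sketch of how per-iteration memory can be tracked (using psutil, which is an assumption here and not something the original question shows), one can log the process resident set size once per sess.run() call:

import psutil

def rss_megabytes():
    # Resident set size of the current process, in MB
    return psutil.Process().memory_info().rss / 1e6

# Logging rss_megabytes() after every sess.run() call produces the kind of
# per-iteration curve referenced above.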

When I directly gather the processed features instead of computing the neural network activations, the memory does not seem to leak.

Expected memory use (forgoing NN computation)

It seems that the most common cause of this type of problem is dynamically adding nodes to the graph, but I call graph.finalize() prior to the iteration and do not catch any error.
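
As a minimal sketch of what finalize() guards against (the placeholder graph below is a hypothetical example, not the code from this question): once the graph is finalized, any attempt to create a new op raises a RuntimeError, so a leak caused by graph growth would be caught immediately.

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])
y = 2.0 * x

tf.get_default_graph().finalize()

with tf.Session() as sess:
    for batch in ([1.0], [2.0, 3.0]):
        # Safe: no new ops are created inside the loop.
        print(sess.run(y, feed_dict={x: batch}))
        # A typical leak pattern would be calling e.g. tf.constant(batch) here,
        # which a finalized graph rejects with
        # "RuntimeError: Graph is finalized and cannot be modified."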

I am using Python 3.5.4 and TensorFlow 1.10, running the computation on the CPU only.

import tensorflow as tf
from sys import argv

# Data preparation
def record_parser(value):
    keys_to_features = {
        'seq_length': tf.VarLenFeature(dtype=tf.int64),
        'seq_feat': tf.VarLenFeature(dtype=tf.float32)
        }
    parsed = tf.parse_single_example(value, keys_to_features)
    length_ = tf.reshape(parsed['seq_length'].values, [])
    i32_len = tf.cast(length_, dtype=tf.int32)
    features_ = tf.reshape(parsed['seq_feat'].values, [i32_len, i32_len, 2])
    return features_

graph = tf.get_default_graph()
dataset_ = tf.data.TFRecordDataset(argv[1])
dataset_ = dataset_.map(lambda value: record_parser(value))
dataset_ = dataset_.batch(1)
iterator = dataset_.make_one_shot_iterator()
features = iterator.get_next()

# NN part
nn0 = tf.layers.conv2d(features, filters=64, kernel_size=15, padding='SAME',
                       activation=tf.nn.relu)
nn = tf.layers.dense(nn0, units=100, activation=tf.nn.relu)
prediction = tf.layers.dense(nn, 17, activation=None)

var_init_op = tf.group(
                tf.global_variables_initializer(),
                tf.local_variables_initializer()
                )
graph.finalize()

# Iterating over samples
with tf.Session() as sess:
    sess.run(var_init_op)
    for i in range(100):
        out_loss = sess.run(prediction)
        # out_loss = sess.run(features)  # gathering features only does not leak
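
For completeness, a toy TFRecord file compatible with record_parser above could be produced as follows (the file name, sizes, and random content are illustrative assumptions, not part of the original question); the resulting file can then be passed to the script as its first argument.

import numpy as np
import tensorflow as tf

with tf.python_io.TFRecordWriter('toy.tfrecords') as writer:
    for L in (8, 16, 12):  # variable-size LxLx2 examples
        feat = np.random.rand(L, L, 2).astype(np.float32).ravel()
        example = tf.train.Example(features=tf.train.Features(feature={
            'seq_length': tf.train.Feature(int64_list=tf.train.Int64List(value=[L])),
            'seq_feat': tf.train.Feature(float_list=tf.train.FloatList(value=feat)),
        }))
        writer.write(example.SerializeToString())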

Mentioning the answer here for the benefit of the community.

The memory-leak issue when using the tf.data API in TensorFlow 1.10 is resolved by upgrading to TensorFlow 1.13.
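
A quick way to confirm which version is actually running (the exact 1.13.x patch release below is an assumption; any pip-managed install would do), for example after pip install --upgrade tensorflow==1.13.1:

import tensorflow as tf
print(tf.__version__)  # should report 1.13.x after the upgrade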
