
Memory leak using TensorFlow for Java

The following test code leaks memory:

private static final float[] X = new float[]{1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0};

public void testTensorFlowMemory() {
    // create a graph and session
    try (Graph g = new Graph(); Session s = new Session(g)) {
        // create a placeholder x and a const for the dimension to do a cumulative sum along
        Output x = g.opBuilder("Placeholder", "x").setAttr("dtype", DataType.FLOAT).build().output(0);
        Output dims = g.opBuilder("Const", "dims").setAttr("dtype", DataType.INT32).setAttr("value", Tensor.create(0)).build().output(0);
        Output y = g.opBuilder("Cumsum", "y").addInput(x).addInput(dims).build().output(0);
        // loop a bunch to test memory usage
        for (int i=0; i<10000000; i++){
            // create a tensor from X
            Tensor tx = Tensor.create(X);
            // run the graph and fetch the resulting y tensor
            Tensor ty = s.runner().feed("x", tx).fetch("y").run().get(0);
            // close the tensors to release their resources
            tx.close();
            ty.close();
        }

        System.out.println("non-threaded test finished");
    }
}

Is there something obvious I'm doing wrong? The basic flow is to create a graph and a session on that graph, then create a placeholder and a constant in order to do a cumulative sum over a tensor fed in as x. After running the resulting y operation, I close both the x and y tensors to free their memory resources.

What I've found so far:

  • This is not a Java object memory problem. The heap does not grow, and other memory in the JVM is not growing, according to jvisualvm. It doesn't appear to be a JVM memory leak according to Java's Native Memory Tracking.
  • The close operations do help: without them, memory grows by leaps and bounds. With them in place it still grows pretty fast, but not nearly as much as without them.
  • The cumsum operator is not important; it happens with sum and other operators as well.
  • It happens on macOS with TF 1.1, and on CentOS 7 with TF 1.1 and 1.2_rc0.
  • Commenting out the Tensor ty lines removes the leak, so it appears to be in there.

Any ideas? Thanks! Also, here's a GitHub project that demonstrates this issue with both a threaded test (to grow the memory faster) and an unthreaded test (to show it's not due to threading). It uses Maven and can be run simply with:

mvn test

I believe there is indeed a leak (in particular, a missing TF_DeleteStatus corresponding to an allocation in the JNI code). Thanks for the detailed instructions to reproduce.

I'd encourage you to file an issue at http://github.com/tensorflow/tensorflow/issues, and hopefully it will be fixed before the final 1.2 release.

(Relatedly, you also have a leak outside the loop, since the Tensor object created by Tensor.create(0) is never closed.)
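Since Tensor implements AutoCloseable in the Java API, the simplest fix for that outer leak is try-with-resources. The sketch below is self-contained: NativeTensor and DimsLeakFix are hypothetical stand-ins (with a simulated handle counter) rather than the real org.tensorflow classes, purely to illustrate the pattern:

```java
// Stand-in for org.tensorflow.Tensor: a native-backed resource that must be
// explicitly released. The real Tensor also implements AutoCloseable.
class NativeTensor implements AutoCloseable {
    static int liveHandles = 0;                      // unreleased native allocations
    NativeTensor() { liveHandles++; }                // simulate a native allocation
    @Override public void close() { liveHandles--; } // simulate freeing it
}

public class DimsLeakFix {
    public static void main(String[] args) {
        // Leaky pattern: the tensor handed to setAttr("value", ...) is never closed.
        NativeTensor dimsLeaked = new NativeTensor();
        // ... opBuilder("Const", "dims").setAttr("value", dimsLeaked) ...

        // Fixed pattern: close() is guaranteed, even if graph construction throws.
        try (NativeTensor dims = new NativeTensor()) {
            // ... opBuilder("Const", "dims").setAttr("value", dims) ...
        }

        System.out.println("unreleased handles: " + NativeTensor.liveHandles);
        // prints "unreleased handles: 1" -- only the un-closed tensor remains
    }
}
```

The same scoping works for the real dims tensor: create it in a try-with-resources block that spans the Const op construction, since the graph copies the value and does not need the Java-side tensor afterwards.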

UPDATE: This was fixed, and 1.2.0-rc1 should no longer have this problem.
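Since the reproduction project builds with Maven, picking up the fixed release is just a version bump. A sketch, assuming the standard org.tensorflow:tensorflow coordinates used by the 1.x Java releases (if the rc is not published to Maven Central, building the Java bindings from source at that tag works instead):

```xml
<!-- pom.xml: bump the TensorFlow Java dependency to a release containing
     the JNI TF_DeleteStatus fix (1.2.0-rc1 or later). -->
<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>tensorflow</artifactId>
  <version>1.2.0-rc1</version>
</dependency>
```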
