
TensorFlow GPU application crashes Jupyter notebook kernel

We are running TensorFlow applications on a GPU from multiple Jupyter notebooks. Every once in a while one of the runs crashes its notebook, with only the notification "The kernel has crashed...".

When we placed the code into a plain Python .py file instead, the stderr output was:

F tensorflow/core/kernels/conv_ops_3d.cc:369] Check failed:   stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted

In another run, stderr reported:

F tensorflow/core/common_runtime/gpu/gpu_util.cc:296] GPU->CPU Memcpy failed

The problem is that the TensorFlow processes are grabbing a lot of memory. On Linux you can run top to see what is going on. On our machine, top showed each TensorFlow process holding 0.55t (over half a terabyte of virtual memory)!

When you run the process inside a Jupyter notebook and do not shut down the notebook, the kernel does not release the memory. At some point a new process will be unable to allocate memory and will die; if that process is running inside a notebook, all you see is that the kernel has died.
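As a quick sanity check from inside a notebook, you can inspect how much memory the kernel process itself holds. This is a minimal sketch using the third-party psutil package (our own addition, not something from the original setup):

import os
import psutil  # third-party: pip install psutil

proc = psutil.Process(os.getpid())  # the notebook kernel process
mem = proc.memory_info()
# rss = resident memory; vms = virtual memory (the huge "0.55t" top shows)
print(f"RSS: {mem.rss / 2**30:.2f} GiB, VMS: {mem.vms / 2**30:.2f} GiB")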

Can anyone help with this?

One suggestion is to place the following snippet before you import tensorflow:

import os
# Hide all GPUs from TensorFlow; must run before `import tensorflow`
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

Added after @Nicolas' comment:

Yes, this disables the GPU entirely, which is not what is wanted.
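If the GPU must stay enabled, an alternative worth trying is to stop TensorFlow from pre-allocating the whole card, so several notebooks can share it. A minimal sketch, assuming the TensorFlow 1.x API that the error paths above point to:

import tensorflow as tf

# Allocate GPU memory on demand instead of grabbing everything up front,
# and cap this process at a fraction of total GPU memory (0.3 is illustrative).
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.3
sess = tf.Session(config=config)

On TensorFlow 2.x the equivalent is calling tf.config.experimental.set_memory_growth(gpu, True) for each device returned by tf.config.experimental.list_physical_devices('GPU').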
