
Jupyter notebook running in Docker on a remote server: Keras not using GPU

I'm setting up a Jupyter notebook running on a remote server, but my code appears not to be using the GPU. It looks like TensorFlow is identifying the GPU, but Keras is somehow missing it. Is there something in my setup process causing this?

I installed nvidia-docker via the GitHub instructions:

# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker 
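
For reference, a basic sanity check of the toolkit install is to run nvidia-smi from inside a throwaway CUDA container. This is just a sketch; it assumes Docker 19.03+ (for the --gpus flag) and a CUDA base image tag that is available for your setup:

$ docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi

If that prints the same table as nvidia-smi on the host, the container runtime itself can reach the GPU.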

I'm ssh'ing into my server:

ssh me@serverstuff

And then on the server running:

docker run -it -p 9999:9999 --name mycontainer -v /mydata:/mycontainer/mydata ufoym/deepo bash
jupyter notebook --ip 0.0.0.0 --port 9999 --no-browser --allow-root

And then opening up a new command prompt on my desktop and running:

ssh -N -f -L localhost:9999:serverstuff:9999 me@serverstuff

Then I sign in, open localhost:9999 in my browser, and log in successfully with the provided token.

But when I run DL training in my notebook, it is slow enough that it doesn't appear to be using the GPU.

!nvidia-smi

gives:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.86       Driver Version: 430.86       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730     WDDM  | 00000000:01:00.0 N/A |                  N/A |
| 25%   41C    P8    N/A /  N/A |    551MiB /  2048MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

and

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

gives:

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7106107654095923441
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 13064397814134284140
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 14665046342845873047
physical_device_desc: "device: XLA_GPU device"
]

and

from keras import backend as K
K.tensorflow_backend._get_available_gpus()

gives:

[]
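
For what it's worth, one more check along the same lines (a sketch, assuming TF 1.x, which the tensorflow_backend call implies):

import tensorflow as tf

# Counts only plain GPU devices, not XLA_GPU, so presumably
# this also reports False here, matching the empty Keras list above.
print(tf.test.is_gpu_available())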

Try installing another image. I also had problems with custom images, so I went with a direct NVIDIA image:

docker pull nvcr.io/nvidia/tensorflow:19.08-py3

There are other versions as well; you can check them out here.
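
To actually use the GPU from that image, the container still has to be started with GPU access. A minimal sketch, assuming Docker 19.03+ for the --gpus flag (older setups would use --runtime=nvidia from nvidia-docker2), with the port and volume paths simply carried over from the question:

docker run --gpus all -it -p 9999:9999 --name mycontainer -v /mydata:/mycontainer/mydata nvcr.io/nvidia/tensorflow:19.08-py3 bash

After that, the same jupyter notebook command as before should work inside the container, assuming the image ships Jupyter.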
