I'm setting up a Jupyter notebook on a remote server, but my code does not appear to be using the GPU. It looks like TensorFlow identifies the GPU but Keras somehow misses it. Is something in my setup process causing this?
I installed nvidia-docker following the GitHub instructions:
# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
I'm ssh'ing into my server:
ssh me@serverstuff
Then, on the server, I run:
docker run -it -p 9999:9999 --name mycontainer -v /mydata:/mycontainer/mydata ufoym/deepo bash
jupyter notebook --ip 0.0.0.0 --port 9999 --no-browser --allow-root
Then I open a new command prompt on my desktop and run:
ssh -N -f -L localhost:9999:serverstuff:9999 me@serverstuff
Then I open localhost:9999 in my browser and log in successfully with the provided token.
But when I run deep learning training in my notebook, it is slow enough that it doesn't seem to be using the GPU.
!nvidia-smi
gives:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.86       Driver Version: 430.86       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730     WDDM  | 00000000:01:00.0 N/A |                  N/A |
| 25%   41C    P8    N/A /  N/A |    551MiB /  2048MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+
and
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
gives:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7106107654095923441
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 13064397814134284140
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 14665046342845873047
physical_device_desc: "device: XLA_GPU device"
]
and
from keras import backend as K
K.tensorflow_backend._get_available_gpus()
gives:
[]
Try a different image. I also had problems with custom images, so I went with an image directly from NVIDIA:
docker pull nvcr.io/nvidia/tensorflow:19.08-py3
There are other versions as well; you can check them out here
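Whichever image you end up with, it may help to check the device list for a plain "GPU" entry rather than just any entry with "GPU" in its name. Below is a minimal sketch of that check, not an official TensorFlow or Keras API. The helper names `physical_gpu_names` and `list_tensorflow_devices`, and the `Device` stand-in type, are my own and exist only for illustration. The stand-in records copy the names and types from the listing in the question, so the example runs even without TensorFlow installed.

```python
from collections import namedtuple


def physical_gpu_names(devices):
    """Return the names of devices whose device_type is exactly "GPU".

    XLA_GPU entries are deliberately ignored: the listing in the question
    has an XLA_GPU entry but no plain GPU entry, and Keras reports [].
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_devices():
    """Return TensorFlow's local devices, or [] if TensorFlow is not installed."""
    try:
        # Same call as used in the question.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return device_lib.list_local_devices()


# Hypothetical stand-in records mirroring the device listing from the question.
Device = namedtuple("Device", ["name", "device_type"])
question_devices = [
    Device("/device:CPU:0", "CPU"),
    Device("/device:XLA_CPU:0", "XLA_CPU"),
    Device("/device:XLA_GPU:0", "XLA_GPU"),
]

if __name__ == "__main__":
    # The question's listing contains no plain "GPU" device.
    print(physical_gpu_names(question_devices))
```

In the notebook you would call `physical_gpu_names(list_tensorflow_devices())`. If that is still empty with the new image, one more thing worth checking is whether the container was started with GPU access at all: with nvidia-container-toolkit on Docker 19.03 or later, that is normally requested with the `--gpus all` flag, which the `docker run` command in the question does not pass.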