
Keras with TensorFlow backend not using GPU

I built the GPU version of the Docker image https://github.com/floydhub/dl-docker with Keras version 2.0.0 and TensorFlow version 0.12.1. I then ran the MNIST tutorial https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py but realized that Keras is not using the GPU. Below is the output I get:

root@b79b8a57fb1f:~/sharedfolder# python test.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-09-06 16:26:54.866833: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866863: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866870: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

Can anyone tell me whether some settings need to be made before Keras will use the GPU? I am very new to all of this, so do let me know if I need to provide more information.

I have installed the prerequisites as mentioned on the page.

I am able to launch the Docker image:

docker run -it -p 8888:8888 -p 6006:6006 -v /sharedfolder:/root/sharedfolder floydhub/dl-docker:cpu bash
  • GPU Version Only: Install Nvidia drivers on your machine either from Nvidia directly or follow the instructions here. Note that you don't have to install CUDA or cuDNN. These are included in the Docker container.

I am able to run the last step:

cv@cv-P15SM:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.66  Mon May  1 15:29:16 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
  • GPU Version Only: Install nvidia-docker: https://github.com/NVIDIA/nvidia-docker, following the instructions here. This will install a replacement for the docker CLI. It takes care of setting up the Nvidia host driver environment inside the Docker containers and a few other things.

I am able to run the step here:

# Test nvidia-smi
cv@cv-P15SM:~$ nvidia-docker run --rm nvidia/cuda nvidia-smi

Thu Sep  7 00:33:06 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780M    Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   55C    P0    N/A /  N/A |    310MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+

I am also able to run the nvidia-docker command to launch a GPU-enabled image.

What I have tried

I have tried the suggestions below:

  1. Check if you have completed step 9 of this tutorial ( https://github.com/ignaciorlando/skinner/wiki/Keras-and-TensorFlow-installation ). Note: Your file paths may be completely different inside that Docker image, so you'll have to locate them somehow.

I appended the suggested lines to my bashrc and have verified that the bashrc file is updated.

echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda-8.0' >> ~/.bashrc
  2. Add the following lines to my Python file:

    import os
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see issue #152
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"

Unfortunately, neither step, applied separately or together, solved the issue. Keras is still running with the CPU version of TensorFlow as its backend. However, I might have found the cause. I checked my TensorFlow version with the following commands and found two installations.

This is the CPU version:

root@08b5fff06800:~# pip show tensorflow
Name: tensorflow
Version: 1.3.0
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: tensorflow-tensorboard, six, protobuf, mock, numpy, backports.weakref, wheel

And this is the GPU version:

root@08b5fff06800:~# pip show tensorflow-gpu
Name: tensorflow-gpu
Version: 0.12.1
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: mock, numpy, protobuf, wheel, six

Interestingly, the output shows that Keras is using TensorFlow version 1.3.0, which is the CPU version, and not 0.12.1, the GPU version:

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

import tensorflow as tf
print('Tensorflow: ', tf.__version__)

Output:

root@08b5fff06800:~/sharedfolder# python test.py
Using TensorFlow backend.
Tensorflow:  1.3.0

I guess now I need to figure out how to have Keras use the GPU version of TensorFlow.

It is never a good idea to have both the tensorflow and tensorflow-gpu packages installed side by side (the one time it happened to me accidentally, Keras was using the CPU version).

"I guess now I need to figure out how to have keras use the gpu version of tensorflow."

You should simply remove both packages from your system and then reinstall tensorflow-gpu [UPDATED after comment]:

pip uninstall tensorflow tensorflow-gpu
pip install tensorflow-gpu

Moreover, it is puzzling that you seem to be using the floydhub/dl-docker:cpu container, while according to the instructions you should be using the floydhub/dl-docker:gpu one...
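After reinstalling, one way to confirm the result is to ask TensorFlow which devices it can see. This is only a rough sketch: list_gpu_devices is a helper name made up for this example, and it assumes that tensorflow.python.client.device_lib is available in your TensorFlow version (it is not part of the documented public API, so treat that as an assumption).

```python
def list_gpu_devices():
    """Return the names of the GPU devices TensorFlow reports, or [] if none."""
    try:
        # Assumption: device_lib is importable in the installed TensorFlow.
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not importable in this environment")
        return []
    devices = device_lib.list_local_devices()
    # Keep only entries whose device type is GPU (CPU entries are always listed).
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_gpu_devices()
    print("GPU devices:", gpus if gpus else "none found")
```

If the list comes back empty inside the container, the CPU build is probably still the one being imported, or the container was not started through nvidia-docker.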

I had a similar issue: Keras didn't use my GPU. I had installed tensorflow-gpu into conda according to the instructions, but after installing Keras the GPU was simply not listed as an available device. I realized that installing Keras adds the tensorflow package! So I had both the tensorflow and tensorflow-gpu packages. I found that there is a keras-gpu package available. After completely uninstalling keras, tensorflow and tensorflow-gpu, and then installing tensorflow-gpu and keras-gpu, the problem was solved.
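To spot this situation without reading through the whole package list, something like the following might help. It is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata) and a pip-managed install; installed_versions and has_conflict are names invented for this example. On Python 2.7, as in the question, running pip show for each package gives the same information.

```python
from importlib import metadata  # standard library from Python 3.8 on

# Distributions that both provide the same importable `tensorflow` module.
TF_DISTRIBUTIONS = ("tensorflow", "tensorflow-gpu")


def installed_versions(names=TF_DISTRIBUTIONS):
    """Map each installed distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed, so leave it out of the result
    return found


def has_conflict(found):
    """True when more than one TensorFlow distribution is installed."""
    return len(found) > 1


if __name__ == "__main__":
    found = installed_versions()
    print(found)
    if has_conflict(found):
        print("Conflict: uninstall all of these, then install only one.")
```

For the setup in the question, the mapping would contain both tensorflow 1.3.0 and tensorflow-gpu 0.12.1, which is exactly the conflict described above.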

In the future, you can try using virtual environments to keep the CPU and GPU versions of TensorFlow separate, for example:

conda create --name tensorflow python=3.5
activate tensorflow
pip install tensorflow

And, for the GPU version:

conda create --name tensorflow-gpu python=3.5
activate tensorflow-gpu
pip install tensorflow-gpu
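With several environments around, it is easy to run a script in the wrong one. A small check like the one below, run from inside the activated environment, shows which interpreter is in use and where the tensorflow module would be loaded from, without actually importing it. This is just a sketch; describe_environment is a hypothetical helper name, not something provided by conda or TensorFlow.

```python
import importlib.util
import sys


def describe_environment(module_name="tensorflow"):
    """Report the interpreter in use and where `module_name` would load from."""
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "->", value)
```

If module_location points outside the environment's prefix, or is None, the script is not picking up the TensorFlow installed in that environment.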

This worked for me: install tensorflow v2.2.0 with pip install tensorflow==2.2.0, and also remove tensorflow-gpu if it is present.

