
Tensorflow-2.3.0 does not detect GPU

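I am using Ubuntu 20.04. I upgraded Tensorflow-2.2.0 to Tensorflow-2.3.0. With version 2.2.0, tensorflow used the GPU without problems. After upgrading to 2.3.0, it no longer detects the GPU.

I have seen this link from stackoverflow, where the problem was the cuDNN version. But I have the required cuDNN version.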

me_sajied@Kunai:~$ apt list | grep cudnn

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libcudnn7-dev/now 7.6.5.32-1+cuda10.1 amd64 [installed,local]
libcudnn7/now 7.6.5.32-1+cuda10.1 amd64 [installed,local]

I also have all the other required software in the required versions.

CUDA

me_sajied@Kunai:~$ apt list | grep cuda-toolkit

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cuda-toolkit-10-0/unknown 10.0.130-1 amd64
cuda-toolkit-10-1/unknown,now 10.1.243-1 amd64 [installed,automatic]
cuda-toolkit-10-2/unknown 10.2.89-1 amd64
cuda-toolkit-11-0/unknown,unknown 11.0.3-1 amd64
nvidia-cuda-toolkit-gcc/focal 10.1.243-3 amd64
nvidia-cuda-toolkit/focal 10.1.243-3 amd64

Python

me_sajied@Kunai:~$ python3 --version
Python 3.8.2

Environment

LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64"

Logs

me_sajied@Kunai:~$ python3
Python 3.8.2 (default, Jul 16 2020, 14:00:26) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-13 21:28:37.387327: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
>>> 
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-09-13 21:28:48.806385: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-13 21:28:48.836251: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2699905000 Hz
2020-09-13 21:28:48.836637: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3fde5f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-13 21:28:48.836685: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-13 21:28:48.840030: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-13 21:28:48.882190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-13 21:28:48.882582: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x408bd90 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-13 21:28:48.882606: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce 930MX, Compute Capability 5.0
2020-09-13 21:28:48.882796: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-13 21:28:48.883151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce 930MX computeCapability: 5.0
coreClock: 1.0195GHz coreCount: 3 deviceMemorySize: 1.96GiB deviceMemoryBandwidth: 14.92GiB/s
2020-09-13 21:28:48.883196: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-13 21:28:48.883415: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64
2020-09-13 21:28:48.885196: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-13 21:28:48.885544: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-13 21:28:48.887160: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-13 21:28:48.888134: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-13 21:28:48.891565: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-13 21:28:48.891603: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-09-13 21:28:48.891625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-13 21:28:48.891632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-09-13 21:28:48.891639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
False
>>> 

Add this to your ~/.bashrc:

export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64

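If your lib64 folder is in a different location, adjust the path accordingly.

As a side note, if you want to switch between multiple CUDA versions frequently, you can also set the environment variable directly in the terminal for a single command, for example: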
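To check whether a given directory list actually contains the library TensorFlow complains about, a small script can help. This is only a minimal sketch: the function name find_library is hypothetical, and it inspects only the colon-separated LD_LIBRARY_PATH entries. The real dynamic loader also consults the ldconfig cache and default system directories, which this sketch does not cover.

```python
import os
from pathlib import Path


def find_library(name, search_path=None):
    """Return the entries of a colon-separated search path that contain `name`.

    If `search_path` is None, the current LD_LIBRARY_PATH is used.
    Hypothetical helper: it does not emulate the full dynamic loader.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in search_path.split(":"):
        if not entry:
            continue  # skip empty components such as a trailing colon
        if (Path(entry) / name).exists():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # libcublas.so.10 is the library reported as missing in the log above.
    print(find_library("libcublas.so.10"))
```

An empty list means none of the LD_LIBRARY_PATH directories holds the file, which matches the "cannot open shared object file" message in the log.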

LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64 python myprogram_which_needs_10_1.py

Then, to switch to a different version, just change the path in front of the command.
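The same per-command override can be done from Python with the standard subprocess module. The sketch below is an illustration under assumptions: run_with_library_path is a hypothetical helper name, and the probe command only echoes the variable instead of running a real TensorFlow program.

```python
import os
import subprocess
import sys


def run_with_library_path(command, lib_dir):
    """Run `command` with LD_LIBRARY_PATH set to `lib_dir` for that process only."""
    env = dict(os.environ)            # copy, so the parent environment is untouched
    env["LD_LIBRARY_PATH"] = lib_dir  # override only for the child process
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # The child process prints the value it received.
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('LD_LIBRARY_PATH'))"]
    result = run_with_library_path(probe, "/usr/local/cuda-10.1/lib64")
    print(result.stdout.strip())
```

Because the variable is changed only in the copied environment, other programs started from the same shell keep whatever CUDA version they were using.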

2020-09-13 21:28:48.883415: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory;

In my case, this was caused by apt upgrade installing the CUDA 10.2 builds of libcublas10 and libcublas-dev.
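When the log is long, it can be easier to extract the failing libraries automatically. The following is a minimal sketch that assumes the message format shown in the log above ("Could not load dynamic library '...'"); the function name missing_libraries is hypothetical, and other TensorFlow versions may word the message differently.

```python
import re

# Matches the warning emitted by dso_loader.cc in the log above.
_MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the library names TensorFlow failed to load, in order, without duplicates."""
    names = []
    for name in _MISSING.findall(log_text):
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    sample = (
        "W tensorflow/stream_executor/platform/default/dso_loader.cc:59] "
        "Could not load dynamic library 'libcublas.so.10'; dlerror: "
        "libcublas.so.10: cannot open shared object file: No such file or directory"
    )
    print(missing_libraries(sample))
```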

My solution to this problem is as follows.

  • My environment is based on NVIDIA's CUDA repository.
$ sudo apt install --reinstall libcublas10=10.2.1.243-1 libcublas-dev=10.2.1.243-1

Then hold the packages so that apt no longer offers them as upgrade candidates.

$ sudo apt-mark hold libcublas10
$ sudo apt-mark hold libcublas-dev
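
After reinstalling and holding the packages, GPU detection can be re-checked with tf.config.list_physical_devices('GPU'), the call the deprecation warning in the log recommends. The sketch below wraps it in a hypothetical helper, detected_gpus, which returns None when TensorFlow is not installed so the script fails gracefully; that fallback is an assumption of this example, not part of TensorFlow.

```python
import importlib.util


def detected_gpus():
    """Return TensorFlow's list of physical GPU devices, or None if TensorFlow is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return None  # TensorFlow is not installed in this interpreter
    import tensorflow as tf
    return tf.config.list_physical_devices("GPU")


if __name__ == "__main__":
    gpus = detected_gpus()
    if gpus is None:
        print("TensorFlow is not installed")
    elif gpus:
        print("GPUs detected:", gpus)
    else:
        print("No GPU detected; check the log for libraries that failed to load")
```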

