Using CUDA8 in theano

I have installed CUDA8 and theano. When I import theano, it searches for CUDA7.5 instead of CUDA8. How can I tell theano to use CUDA8 instead of CUDA7.5?

My system has only CUDA8; it is not a mixed CUDA environment (i.e. one with both CUDA7.5 and CUDA8 installed).

Here is the output of nvidia-smi:

$ nvidia-smi 
Sat Feb  4 11:32:30 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   54C    P0    22W /  N/A |      0MiB /  3016MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |

Here is the output of nvcc -V:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

When I import theano in ipython, it fails to run in GPU mode with an error saying it cannot find libcudart.so.7.5:

Python 3.6.0 (default, Jan 16 2017, 12:12:55) 
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import theano
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcudart.so.7.5: cannot open shared object file: No such file or directory
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available  (error: cuda unavailable)
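
The "cannot open shared object file" message comes from the dynamic loader, so it can be checked outside theano. The following is a minimal sketch, assuming Python 3 on Linux; the helper name `can_load` is hypothetical and not part of theano.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


# Example use (results depend on the machine):
#   can_load("libcudart.so.7.5")
#   can_load("libcudart.so.8.0")
```

If the 8.0 runtime loads and the 7.5 one does not, the loader setup is probably fine and the stale reference is inside the compiled module itself.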

Here is the content of my .theanorc:

[global]
floatX = float32
device = gpu0
cuda.root = /opt/cuda
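
One quick sanity check is to confirm which libcudart files actually live under the configured `cuda.root`. This is a rough sketch, assuming the file has the `[global]` layout shown above and that the runtime libraries sit in a `lib64` subdirectory; the function names are hypothetical.

```python
import configparser
import glob
import os


def cuda_root_from_theanorc(text):
    """Return the cuda.root value from .theanorc text, or None if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get("global", "cuda.root", fallback=None)


def cudart_in(root):
    """List libcudart shared objects found under <root>/lib64 (assumed layout)."""
    return sorted(glob.glob(os.path.join(root, "lib64", "libcudart.so*")))


# Example use:
#   with open(os.path.expanduser("~/.theanorc")) as fh:
#       root = cuda_root_from_theanorc(fh.read())
#   print(root, cudart_in(root) if root else [])
```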

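I tried building theano from source after uninstalling the previous installation, and that does not work either. I ran theano-cache clean / theano-cache purge, and I also manually deleted the contents of the .theano directory.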

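With more debugging, I found that the error comes from https://github.com/Theano/Theano/blob/8b9f73365e4932f1c005a0a37b907d28985fbc5f/theano/gof/cmodule.py#L302

nvcc_compiler tries to load cuda_ndarray.so from cuda_ndarray in the theano cache.

The compilation phase of mod.cu runs without errors.

In this case, the linker is pointing to the wrong libcudart: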

readelf -a cuda_ndarray.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libcublas.so.8.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpython3.6m.so.1.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.7.5]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
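
The mismatch in the listing above (libcublas at 8.0, libcudart at 7.5) can also be spotted programmatically. This is a small sketch that parses text in the `readelf` format shown above; the function names are hypothetical, and the `libcu*` name pattern is an assumption about how the CUDA libraries are named.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libcudart.so.7.5]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]")


def needed_libraries(readelf_output):
    """Return the sonames listed as NEEDED in readelf output."""
    return NEEDED_RE.findall(readelf_output)


def cuda_versions(libs):
    """Map each libcu* base name to its version suffix, e.g. libcudart -> 7.5."""
    versions = {}
    for lib in libs:
        m = re.fullmatch(r"(libcu\w+)\.so\.(\d+(?:\.\d+)*)", lib)
        if m:
            versions[m.group(1)] = m.group(2)
    return versions


# Example use:
#   import subprocess
#   out = subprocess.run(["readelf", "-d", "cuda_ndarray.so"],
#                        capture_output=True, text=True).stdout
#   print(cuda_versions(needed_libraries(out)))
```

More than one distinct version among the values suggests the module was linked against libraries from different CUDA releases.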

I assume ldconfig is caching the cuda libraries:

$ sudo ldconfig -v | grep -e 'cuda\|blas'
/opt/cuda/lib64:
    libcublas.so.8.0 -> libcublas.so.8.0.45
    libcudart.so.8.0 -> libcudart.so.8.0.44
    libnvblas.so.8.0 -> libnvblas.so.8.0.44
/opt/cuda/nvvm/lib64:
    libcuda.so.1 -> libcuda.so.375.26
    libblas.so.3 -> libblas.so.3.7.0
    libicudata.so.58 -> libicudata.so.58.2
    libopenblas.so.0 -> libopenblas.so
    libicudata.so.58 -> libicudata.so.58.1
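
Because the `grep` above also drops the directory header lines that do not match, it is hard to tell which directory each entry belongs to. An alternative is to read the loader cache with `ldconfig -p`, which prints the resolved path of every entry. The sketch below parses that format; the function name and the default prefix are hypothetical choices.

```python
import re

# Matches `ldconfig -p` lines such as:
#   libcudart.so.8.0 (libc6,x86-64) => /opt/cuda/lib64/libcudart.so.8.0
LINE_RE = re.compile(r"^\s*(?P<soname>\S+)\s+\((?P<abi>[^)]*)\)\s+=>\s+(?P<path>\S+)\s*$")


def cache_entries(ldconfig_p_output, prefix="libcudart"):
    """Return (soname, path) pairs from `ldconfig -p` output whose soname starts with prefix."""
    entries = []
    for line in ldconfig_p_output.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("soname").startswith(prefix):
            entries.append((m.group("soname"), m.group("path")))
    return entries


# Example use:
#   import subprocess
#   out = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True).stdout
#   print(cache_entries(out))
```

If no 7.5 entry appears in the cache, the reference to libcudart.so.7.5 was most likely fixed at link time and does not come from ldconfig.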

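After digging further into my problem, I reworked my original question and posted it here as "nvcc is picking wrong libcudart library", which got my problem solved.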
