
Using CUDA8 in theano

I have a working installation of CUDA8 and have installed theano. When I import theano, it searches for CUDA7.5 instead of CUDA8. How can I tell theano to use CUDA8 instead of CUDA7.5?

My system only has CUDA8; it is not a mixed CUDA environment (i.e. one with both CUDA7.5 and CUDA8 installed).

Here is the output of nvidia-smi:

$ nvidia-smi 
Sat Feb  4 11:32:30 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   54C    P0    22W /  N/A |      0MiB /  3016MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |

Here is the output of nvcc -V:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
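The release reported by nvcc is the version that the compiled module is expected to link against, so it helps to extract it programmatically before comparing it with anything else. The following is a minimal sketch; the function name `parse_nvcc_release` and the sample text are illustrative and not part of theano or CUDA.

```python
import re

def parse_nvcc_release(nvcc_output):
    """Return the CUDA release string (such as '8.0') found in `nvcc -V` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None

# Sample text copied from the `nvcc -V` output shown in the question.
SAMPLE_NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
"""

print(parse_nvcc_release(SAMPLE_NVCC_OUTPUT))
```

On a live system the same function could be fed the captured output of `nvcc -V`, for example through `subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout`.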

When I import theano in ipython, it fails to run in GPU mode with an error saying that it can't find libcudart.so.7.5:

Python 3.6.0 (default, Jan 16 2017, 12:12:55) 
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import theano
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcudart.so.7.5: cannot open shared object file: No such file or directory
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available  (error: cuda unavailable)
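The error comes from the dynamic loader, so one way to reproduce it outside theano is to ask the loader for each runtime version directly. This is only a sketch: `can_load` is a hypothetical helper, and the two sonames are the ones mentioned in this question.

```python
import ctypes

def can_load(soname):
    """Return True if the dynamic loader can resolve and open `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True

# Sonames taken from the question; results depend on the machine.
for name in ("libcudart.so.8.0", "libcudart.so.7.5"):
    print(name, "loadable:", can_load(name))
```

On a system like the one described here, the expectation would be that the 8.0 runtime loads and the 7.5 runtime does not, which matches the message theano prints.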

Here are the contents of my .theanorc:

[global]
floatX = float32
device = gpu0
cuda.root = /opt/cuda
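To confirm that `cuda.root` points at a directory that really contains the CUDA 8 runtime, the configuration can be read back and the `lib64` directory listed. The sketch below assumes the usual `<cuda.root>/lib64` layout; the helper names `read_cuda_root` and `cudart_libs_under` are made up for this example.

```python
import configparser
import glob
import os

def read_cuda_root(theanorc_text):
    """Return the cuda.root value from .theanorc text, or None if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(theanorc_text)
    return parser.get("global", "cuda.root", fallback=None)

def cudart_libs_under(cuda_root):
    """List libcudart files in <cuda_root>/lib64 (directory layout is an assumption)."""
    pattern = os.path.join(cuda_root, "lib64", "libcudart.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))

# Same settings as the .theanorc shown in the question.
SAMPLE_THEANORC = """[global]
floatX = float32
device = gpu0
cuda.root = /opt/cuda
"""

root = read_cuda_root(SAMPLE_THEANORC)
print(root, cudart_libs_under(root))
```

If the listing shows only 8.0 files, the configuration itself is probably not the source of the 7.5 reference.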

I tried building theano from source after uninstalling the previous installation, but that did not work either. I also cleared the theano cache with theano-cache clean / theano-cache purge and manually deleted the contents of the .theano directory, which did not help either.

With more debugging, I found that the error is raised here: https://github.com/Theano/Theano/blob/8b9f73365e4932f1c005a0a37b907d28985fbc5f/theano/gof/cmodule.py#L302

It happens when nvcc_compiler tries to load cuda_ndarray.so from cuda_ndarray in the theano cache.

The compilation phase for mod.cu runs without error.

In this case the linker is pointing to the wrong libcudart:

$ readelf -a cuda_ndarray.so | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libcublas.so.8.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpython3.6m.so.1.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.7.5]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
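The NEEDED entries can be checked automatically against the release that nvcc reports. The following sketch parses `readelf` text and flags any libcudart dependency that does not match an expected release; the function names and the `"8.0"` argument are assumptions made for illustration.

```python
import re

# Matches lines such as: 0x...01 (NEEDED)   Shared library: [libcudart.so.7.5]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")

def needed_libraries(readelf_output):
    """Return the shared libraries listed as NEEDED in `readelf` output."""
    return NEEDED_RE.findall(readelf_output)

def mismatched_cudart(needed, expected_release):
    """Return libcudart entries whose version differs from `expected_release`."""
    expected = "libcudart.so." + expected_release
    return [lib for lib in needed
            if lib.startswith("libcudart.so.") and lib != expected]

# A few lines copied from the readelf output in the question.
SAMPLE_READELF = """
 0x0000000000000001 (NEEDED)             Shared library: [libcublas.so.8.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpython3.6m.so.1.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.7.5]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
"""

print(mismatched_cudart(needed_libraries(SAMPLE_READELF), "8.0"))
```

Here libcublas is recorded as 8.0 while libcudart is recorded as 7.5, so the mismatch is specific to the runtime library rather than to the toolkit as a whole.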

I assume ldconfig is caching the CUDA libraries properly:

$ sudo ldconfig -v | grep -e 'cuda\|blas'
/opt/cuda/lib64:
    libcublas.so.8.0 -> libcublas.so.8.0.45
    libcudart.so.8.0 -> libcudart.so.8.0.44
    libnvblas.so.8.0 -> libnvblas.so.8.0.44
/opt/cuda/nvvm/lib64:
    libcuda.so.1 -> libcuda.so.375.26
    libblas.so.3 -> libblas.so.3.7.0
    libicudata.so.58 -> libicudata.so.58.2
    libopenblas.so.0 -> libopenblas.so
    libicudata.so.58 -> libicudata.so.58.1
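Because the `grep` filter above drops directory headers that do not contain "cuda" or "blas", some entries appear under the wrong directory. Grouping the unfiltered `ldconfig -v` output by directory gives a clearer picture of which path provides each libcudart. This is a rough sketch under the assumption that directory headers start at column 0 and library entries are indented; `parse_ldconfig_verbose` and `providers` are hypothetical names.

```python
def parse_ldconfig_verbose(text):
    """Map each directory to {soname: target} from `ldconfig -v` style text."""
    by_dir = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Directory header such as "/opt/cuda/lib64:"; keep the path only.
            current = line.split(":", 1)[0]
            by_dir.setdefault(current, {})
        elif "->" in line and current is not None:
            soname, target = (part.strip() for part in line.split("->", 1))
            by_dir[current][soname] = target
    return by_dir

def providers(by_dir, prefix):
    """Return (directory, soname) pairs whose soname starts with `prefix`."""
    return [(directory, soname)
            for directory, libs in by_dir.items()
            for soname in libs
            if soname.startswith(prefix)]

# Entries copied from the ldconfig output in the question.
SAMPLE_LDCONFIG = (
    "/opt/cuda/lib64:\n"
    "    libcublas.so.8.0 -> libcublas.so.8.0.45\n"
    "    libcudart.so.8.0 -> libcudart.so.8.0.44\n"
    "    libnvblas.so.8.0 -> libnvblas.so.8.0.44\n"
)

print(providers(parse_ldconfig_verbose(SAMPLE_LDCONFIG), "libcudart.so"))
```

If this reports only an 8.0 runtime, the 7.5 reference recorded in cuda_ndarray.so most likely originates at link time rather than from the loader cache.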

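After digging further into the problem, I reframed my original question and posted it here as "nvcc is picking the wrong libcudart library" in order to get it solved.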
