Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file

I am trying to import the Theano library on an AWS instance in order to use the GPU. I have written a Python script that uses boto to automate the AWS setup: it opens an SSH connection to the instance from my local machine and then starts a bash script that runs "python -c 'import theano'" to initialize the GPU. However, I get the following error:

ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file: No such file or directory

When I import the theano module directly in the instance's command shell, it starts using the GPU automatically:

Using gpu device 0: GRID K520 (CNMeM is disabled)

I guess I am missing some other import that has to be made when importing through my automation Python script. What could the solution be?

I will try to answer this clearly and concisely, because I could not find a really good answer for people who are new to Unix or are not familiar with compilation and linking.

The problem has to do with dynamic linking, and it can be solved in two ways. The first is to set the LD_LIBRARY_PATH environment variable. Assuming CUDA is installed in /usr/local/cuda/, add the following to your environment file, /etc/environment:

LD_LIBRARY_PATH=/usr/local/cuda/lib64/

Or simply in your .bashrc:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64/
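When the import is launched over SSH by an automation script, the remote command may not read the same startup files as an interactive shell, so a variable exported in .bashrc may not be visible there. One way to rule this out is to set the variable explicitly in the command that gets sent. The following is a minimal Python 3 sketch under that assumption; `build_remote_command` is a hypothetical helper, and how the string is actually sent (boto, an SSH library, plain `ssh`) is left out.

```python
import shlex


def build_remote_command(command, lib_dir="/usr/local/cuda/lib64"):
    """Prefix a shell command with an explicit LD_LIBRARY_PATH assignment.

    Any value already set on the remote side is kept after the CUDA directory.
    """
    # ${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} expands to ":<old value>" only
    # when the variable is already set and non-empty in the remote shell.
    keep_existing = "${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    assignment = "LD_LIBRARY_PATH=" + shlex.quote(lib_dir) + keep_existing
    return assignment + " " + command


if __name__ == "__main__":
    # Hypothetical usage: this string is what the automation script would
    # hand to its SSH client instead of the bare python command.
    print(build_remote_command("python -c 'import theano'"))
```

The assignment applies only to that one command, so nothing on the instance needs to be edited to try it.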

Setting LD_LIBRARY_PATH is not recommended by Unix gurus (I am not one; I have just read this on the internet, and I follow the Linux gurus). The solution I found is simple: modify the list of paths that the Linux dynamic linker searches for libraries by default. To do that (you have to be root):

cd /etc/ld.so.conf.d/

Then pick a file to edit, for example:

vi libc.conf 

Inside this file, add the path to the CUDA lib64 directory:

/usr/local/cuda/lib64/

The file should then look something like this:

# libc default configuration

/usr/local/lib

/usr/local/cuda/lib64/

And then run:

sudo ldconfig
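To confirm that the linker cache picked up the new directory, the output of `ldconfig -p` can be searched for the library named in the error. The following is a small Python 3 sketch; both function names are hypothetical, the sample listing in the usage part is made up for illustration, and `ldconfig` may live in /sbin and not be on the PATH of a normal user.

```python
import subprocess


def find_in_ldconfig_cache(cache_listing, library_name):
    """Return the paths that an `ldconfig -p` listing gives for library_name."""
    paths = []
    for line in cache_listing.splitlines():
        # Entries look like: "\tlibfoo.so.1 (libc6,x86-64) => /path/libfoo.so.1"
        name, sep, path = line.strip().partition(" => ")
        if sep and name.split(" ", 1)[0] == library_name:
            paths.append(path)
    return paths


def cublas_paths(library_name="libcublas.so.7.5"):
    """Query the live linker cache (Linux only; needs ldconfig on PATH)."""
    listing = subprocess.check_output(["ldconfig", "-p"]).decode()
    return find_in_ldconfig_cache(listing, library_name)


if __name__ == "__main__":
    # Made-up sample listing, used only to show what the parser extracts.
    sample = (
        "\tlibcublas.so.7.5 (libc6,x86-64) => /usr/local/cuda/lib64/libcublas.so.7.5\n"
        "\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6\n"
    )
    print(find_in_ldconfig_cache(sample, "libcublas.so.7.5"))
```

An empty list from `cublas_paths()` would suggest that the directory was not added to the cache or that `ldconfig` has not been re-run.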

I hope this answer helps people who are just starting to program, or who use high-level languages such as Python with C code underneath (as Theano does), and who are not familiar with compilation and linking.

I faced the same error on Ubuntu 16.04 with CUDA 7.5 and found the following solution:

  1. CUDA 7.5 doesn't support the default g++ version. Install a supported version and make it the default:

     sudo apt-get install g++-4.9
     sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
     sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
     sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
     sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10
     sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
     sudo update-alternatives --set cc /usr/bin/gcc
     sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
     sudo update-alternatives --set c++ /usr/bin/g++
  2. Work around a glibc bug: create .theanorc in your home directory with the following settings:

     [global]
     device=gpu
     floatX=float32

     [nvcc]
     flags=-D_FORCE_INLINES
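Since the instance in the question is set up by a script, the .theanorc settings above could also be generated rather than typed by hand. The following is a minimal Python 3 sketch using the standard `configparser` module; `render_theanorc` is a hypothetical helper, and writing the result to ~/.theanorc on the instance is left to the caller.

```python
import configparser
import io


def render_theanorc(device="gpu", float_x="float32", nvcc_flags="-D_FORCE_INLINES"):
    """Return .theanorc text with a [global] and an [nvcc] section."""
    parser = configparser.ConfigParser()
    # Keep option names such as floatX exactly as written; by default
    # configparser would lowercase them.
    parser.optionxform = str
    parser["global"] = {"device": device, "floatX": float_x}
    parser["nvcc"] = {"flags": nvcc_flags}
    buffer = io.StringIO()
    # Write "key=value" without spaces, matching the settings listed above.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_theanorc())
```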

And don't forget to check the environment variables: PATH should contain your CUDA bin directory, and CUDA_HOME should point to the CUDA installation directory.

I added them to my .bashrc file this way:

export PATH="/usr/local/cuda/bin:$PATH"
export CUDA_HOME="/usr/local/cuda"
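The check on the two variables can be sketched in Python 3 so that it runs in the same session that later imports Theano. `check_cuda_env` is a hypothetical helper, and the default /usr/local/cuda/bin is only the location assumed in this answer.

```python
import os


def check_cuda_env(env, cuda_bin="/usr/local/cuda/bin"):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    # Compare PATH entries without trailing slashes so "/x/bin/" matches "/x/bin".
    path_entries = [p.rstrip("/") for p in env.get("PATH", "").split(os.pathsep)]
    if cuda_bin.rstrip("/") not in path_entries:
        problems.append("PATH does not contain " + cuda_bin)
    if not env.get("CUDA_HOME"):
        problems.append("CUDA_HOME is not set")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(os.environ):
        print(problem)
```

Running it both in an interactive shell and through the automated session would show whether the two environments differ.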

I had a similar problem recently and spent ages figuring out what was going wrong (to the point that I corrupted my Linux install and had to do a fresh install).

A potential solution for this error is to delete the .theano/ directory that is (possibly) located in your home directory:

sudo rm -rf ~/.theano

To prevent this error from happening again, do not run your scripts as the root user (i.e., run them without sudo).

Running a script as root creates the hidden directory with root ownership, making it inaccessible to processes run by other users.
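Before deleting anything, it is possible to check whether the directory really belongs to another user. The following is a minimal Python 3 sketch for Unix-like systems; `owned_by_current_user` is a hypothetical helper, and ~/.theano is the location assumed in this answer.

```python
import os


def owned_by_current_user(path):
    """Return True if path exists and belongs to the user running this script."""
    try:
        return os.stat(path).st_uid == os.getuid()
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    cache_dir = os.path.expanduser("~/.theano")
    if os.path.exists(cache_dir) and not owned_by_current_user(cache_dir):
        print(cache_dir + " belongs to another user; Theano may be unable to write to it")
```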

At Kumar's suggestion, I ran:

sudo ldconfig /usr/local/cuda/lib64

And it magically started working. Thanks, Kumar!
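Whichever fix is applied, a quick way to see whether it took effect is to ask the dynamic loader for the library directly, without going through Theano. The following is a minimal Python 3 sketch using the standard `ctypes` module; `can_load` is a hypothetical helper, and the library name is the one from the error message in the question.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open library_name in this process."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # libcublas.so.7.5 is the file named in the error message.
    name = "libcublas.so.7.5"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```

The result depends on the environment of the process that runs it, so it is most informative when run the same way as the failing import, for example inside the automated SSH session.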


