
Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory

I just updated my graphics card drivers with:

sudo apt install nvidia-driver-470
sudo apt install cuda-drivers-470

I installed them this way because they were being held back when I ran sudo apt upgrade. I then mistakenly ran sudo apt autoremove to clean up old packages. After restarting my computer so the new drivers would be set up properly, I could no longer use GPU acceleration with TensorFlow.

import tensorflow as tf
tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-12-07 16:52:01.771391: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-07 16:52:01.807283: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 16:52:01.807973: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.808017: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.808048: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.856391: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.856466: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-12-07 16:52:01.857601: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
False

Have you installed the CUDA toolkit? The error indicates that version 11 of the CUDA libraries cannot be found. The problem could also be that the installed cudatoolkit and cuDNN versions are incompatible with your TensorFlow version.
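To confirm which of the libraries named in the warnings are actually missing, one option is to try loading each of them by name, which is roughly what the failing dlopen calls in the log do. The following is a minimal sketch using only the Python standard library; the function name check_cuda_libs is hypothetical, and the library list is copied from the log in the question.

```python
import ctypes

# Sonames copied from the "Could not load dynamic library" warnings in the question.
CUDA_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcusolver.so.11",
    "libcusparse.so.11",
]


def check_cuda_libs(names=CUDA_LIBS):
    """Return {soname: bool}, True when the library can be loaded (hypothetical helper)."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # asks the dynamic linker to locate and load the library
            results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_cuda_libs().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any library reported as MISSING here cannot be found by the dynamic linker with the current environment, so TensorFlow will not find it either.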

If you have already installed the correct version of the toolkit, go directly to step 5. (You can check the version with the command nvcc --version.)
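The version check can also be scripted. This is a sketch that assumes nvcc prints a line containing "release X.Y", as recent toolkits do; the helper names parse_nvcc_release and installed_cuda_release are hypothetical.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the 'X.Y' from a 'release X.Y' fragment in nvcc --version output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Return the toolkit release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", installed_cuda_release())
```

A result of None means either that nvcc is not on PATH (the toolkit is missing, or step 6 is not done yet) or that the output did not match the assumed format.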

  1. Download the installer from https://developer.nvidia.com/cuda-11-4-4-download-archive?target_os=Linux (this version is compatible with the driver nvidia-470 you installed). The next steps are specific to the runfile option.

  2. As you have already installed the NVIDIA driver, press Continue if the installer warns about an existing driver installation.


  3. Accept the terms.


  4. Again, as you have already installed the driver, deselect the Driver option and press Install.


  5. Now you need to configure the paths for binaries and libraries. Use the find command to search for nvcc and libcublas.so.*:

     sudo find / -name 'nvcc'            # Path to binaries
     sudo find / -name 'libcublas.so.*'  # Path to libraries

  6. Finally, add the following lines to the end of ~/.profile, adjusted to the paths you found above. On my system, CUDA was installed in /usr/local/cuda-11.4.

     if [ -d "/usr/local/cuda-11.4" ]; then
         PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
         LD_LIBRARY_PATH=/usr/local/cuda-11.4/targets/x86_64-linux/lib/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
         export PATH LD_LIBRARY_PATH  # make sure child processes (e.g. python) inherit both
     fi

If updating ~/.profile doesn't work, try updating .bashrc or .zshrc instead (if you use zsh rather than bash).

  7. Restart the computer.
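Steps 5 and 6 above can also be checked from Python. The sketch below searches a directory tree for nvcc and libcublas.so.* and reports whether a given directory is listed in PATH or LD_LIBRARY_PATH. The helper names are hypothetical, and the default root of /usr/local is an assumption based on the install location mentioned in step 6; pass "/" to mirror the find commands, at the cost of a much slower search.

```python
import os
from pathlib import Path


def find_cuda_paths(root="/usr/local"):
    """Search under root for nvcc and libcublas.so.*; return their parent directories."""
    root = Path(root)
    bin_dirs = sorted({str(p.parent) for p in root.rglob("nvcc") if p.is_file()})
    lib_dirs = sorted({str(p.parent) for p in root.rglob("libcublas.so.*")})
    return bin_dirs, lib_dirs


def on_search_path(directory, env_var):
    """True if directory appears in the os.pathsep-separated variable env_var."""
    entries = os.environ.get(env_var, "").split(os.pathsep)
    target = os.path.normpath(directory)
    # normpath drops trailing slashes, so ".../lib/" and ".../lib" compare equal
    return any(entry and os.path.normpath(entry) == target for entry in entries)
```

After logging in again, on_search_path(lib_dir, "LD_LIBRARY_PATH") should return True for the library directory you added in step 6; if it returns False, the profile file was probably not read by your shell.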

You can create symlinks inside the /usr/lib/x86_64-linux-gnu directory. I found it with:

$ whereis libcudart
libcudart: /usr/lib/x86_64-linux-gnu/libcudart.so /usr/share/man/man7/libcudart.7.gz

In this folder you can find other versions of the CUDA libraries. Change into it and create symlinks like this; the specific versions you link to may be slightly different.

$ sudo ln -s libcublas.so.10.2.1.243 libcublas.so.11
$ sudo ln -s libcublasLt.so.10.2.1.243 libcublasLt.so.11
$ sudo ln -s libcusolver.so.10.2.0.243 libcusolver.so.11
$ sudo ln -s libcusparse.so.10.3.0.243 libcusparse.so.11

Now your GPU should be detected.

>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2021-12-07 17:07:26.914296: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-07 17:07:26.950731: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.029687: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.030421: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.325218: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.325642: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.326022: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-07 17:07:27.326408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 9280 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:06:00.0, compute capability: 8.6
True

This method works because these CUDA libraries are similar enough that even NVIDIA often builds them with symlinks. If TensorFlow is looking for libcublas.so.11, you can create a file with that name that simply points to another version of libcublas that is already installed.
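Before or after creating such links, it can help to see which real file each library name in the directory resolves to. This is a small sketch using only the standard library; the function name resolve_library_links is hypothetical.

```python
import os
from pathlib import Path


def resolve_library_links(directory, prefix):
    """Map each entry in directory starting with prefix to the file it resolves to."""
    links = {}
    for entry in sorted(Path(directory).glob(prefix + "*")):
        # realpath follows the whole symlink chain down to the actual file
        links[entry.name] = os.path.basename(os.path.realpath(entry))
    return links


if __name__ == "__main__":
    found = resolve_library_links("/usr/lib/x86_64-linux-gnu", "libcublas")
    for name, target in found.items():
        print(f"{name} -> {target}")
```

An entry that maps to a different file name is a symlink; an entry that maps to itself is a real library file (or a dangling link whose target does not exist).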
