
Tensorflow: Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file

I've been trying to run TensorFlow on my GPU for several days, but I haven't been able to get it working.

I know there are several similar questions, but I've tried everything I found and nothing worked, which is why I'm writing this question:

How to install libcusolver.so.11

https://stackoverflow.com/a/67642774/15098668

I've installed driver 460.106.00 and CUDA 11.2 for the NVIDIA GeForce RTX 3090:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    On   | 00000000:08:00.0  On |                  N/A |
| 33%   26C    P8    22W / 350W |    282MiB / 24260MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1264      G   /usr/lib/xorg/Xorg                 59MiB |
|    0   N/A  N/A      3349      G   /usr/lib/xorg/Xorg                124MiB |
|    0   N/A  N/A      3508      G   /usr/bin/gnome-shell               77MiB |
|    0   N/A  N/A      6384      G   /usr/lib/firefox/firefox            4MiB |
+-----------------------------------------------------------------------------+

The cuDNN version:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 1

And the GCC compiler:

gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

I've also added LD_LIBRARY_PATH to ~/.bashrc:

# Nvidia cuda toolkit
export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda
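
One thing worth checking here, offered as an assumption rather than a confirmed cause: variables exported in ~/.bashrc are only picked up by processes started from an interactive bash shell, so a Python interpreter launched some other way (for example from an IDE started from the desktop) may not see them. A minimal sketch that reports what the running interpreter actually sees; the helper name `cuda_env_report` is made up for illustration:

```python
import os


def cuda_env_report(environ=None):
    """Return (path, exists) pairs for each LD_LIBRARY_PATH entry visible to this process."""
    environ = os.environ if environ is None else environ
    raw = environ.get("LD_LIBRARY_PATH", "")
    # Empty entries (e.g. from a leading or trailing ':') are skipped.
    entries = [p for p in raw.split(":") if p]
    return [(p, os.path.isdir(p)) for p in entries]


if __name__ == "__main__":
    report = cuda_env_report()
    if not report:
        print("LD_LIBRARY_PATH is empty or unset in this process")
    for path, exists in report:
        print(f"{path}: {'exists' if exists else 'MISSING'}")
```

Running this with the same interpreter and launcher that runs TensorFlow shows whether /usr/local/cuda-11.2/lib64 reaches the process at all.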

I've tried several tensorflow and tensorflow-gpu versions, from 2.4 to 2.7, but every one of them fails with:

2022-01-24 21:28:43.206834: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

or

2022-01-24 21:28:44.087779: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087827: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublas.so.11'; dlerror: libcublas.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087858: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcublasLt.so.11'; dlerror: libcublasLt.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087891: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087921: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087947: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2022-01-24 21:28:44.087975: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
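
These warnings come from a failed `dlopen` of each named library. As a rough way to reproduce that outside TensorFlow, the sketch below tries to load the same names with `ctypes.CDLL`, which also goes through the dynamic loader and its search path. The library names are copied from the log above; the helper name `try_load` is hypothetical:

```python
import ctypes

# Library names copied from the warnings above.
LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
]


def try_load(names):
    """Map each library name to None if loading succeeds, else to the error text."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, err in try_load(LIBS).items():
        print(f"{name}: {'OK' if err is None else err}")
```

If every name fails here too, the problem is most likely with the loader's search path or the installed files rather than with TensorFlow itself.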

Thanks in advance, I don't know what else to try...

After trying a lot of things, I created a new conda environment and installed tensorflow-gpu, as I didn't care about the TF version:

conda install tensorflow-gpu -c anaconda

It installed all the following packages:

    package                    |            build
    ---------------------------|-----------------
    _tflow_select-2.1.0        |              gpu           2 KB  anaconda
    absl-py-0.10.0             |           py38_0         170 KB  anaconda
    aiohttp-3.6.3              |   py38h7b6447c_0         622 KB  anaconda
    astunparse-1.6.3           |             py_0          17 KB  anaconda
    async-timeout-3.0.1        |           py38_0          12 KB  anaconda
    attrs-20.2.0               |             py_0          41 KB  anaconda
    blas-1.0                   |              mkl           6 KB  anaconda
    blinker-1.4                |           py38_0          21 KB  anaconda
    brotlipy-0.7.0             |py38h7b6447c_1000         349 KB  anaconda
    c-ares-1.16.1              |       h7b6447c_0         112 KB  anaconda
    ca-certificates-2020.10.14 |                0         128 KB  anaconda
    cachetools-4.1.1           |             py_0          12 KB  anaconda
    certifi-2020.6.20          |           py38_0         160 KB  anaconda
    cffi-1.14.0                |   py38h2e261b9_0         228 KB  anaconda
    chardet-3.0.4              |        py38_1003         170 KB  anaconda
    click-7.1.2                |             py_0          67 KB  anaconda
    cryptography-3.1.1         |   py38h1ba5d50_0         618 KB  anaconda
    cudatoolkit-10.1.243       |       h6bb024c_0       513.2 MB  anaconda
    cudnn-7.6.5                |       cuda10.1_0       250.6 MB  anaconda
    cupti-10.1.168             |                0         1.7 MB  anaconda
    gast-0.3.3                 |             py_0          14 KB  anaconda
    google-auth-1.22.1         |             py_0          62 KB  anaconda
    google-auth-oauthlib-0.4.1 |             py_2          21 KB  anaconda
    google-pasta-0.2.0         |             py_0          44 KB  anaconda
    grpcio-1.31.0              |   py38hf8bcb03_0         2.3 MB  anaconda
    h5py-2.10.0                |   py38hd6299e0_1         1.1 MB  anaconda
    hdf5-1.10.6                |       hb1b8bf9_0         4.8 MB  anaconda
    idna-2.10                  |             py_0          56 KB  anaconda
    importlib-metadata-2.0.0   |             py_1          35 KB  anaconda
    intel-openmp-2020.2        |              254         947 KB  anaconda
    keras-preprocessing-1.1.0  |             py_1          36 KB  anaconda
    libgfortran-ng-7.3.0       |       hdf63c60_0         1.3 MB  anaconda
    libprotobuf-3.13.0.1       |       hd408876_0         2.3 MB  anaconda
    markdown-3.3.2             |           py38_0         123 KB  anaconda
    mkl-2019.4                 |              243       204.1 MB  anaconda
    mkl-service-2.3.0          |   py38he904b0f_0          68 KB  anaconda
    mkl_fft-1.2.0              |   py38h23d657b_0         173 KB  anaconda
    mkl_random-1.1.0           |   py38h962f231_0         398 KB  anaconda
    multidict-4.7.6            |   py38h7b6447c_1          72 KB  anaconda
    numpy-1.19.1               |   py38hbc911f0_0          20 KB  anaconda
    numpy-base-1.19.1          |   py38hfa32c7d_0         5.3 MB  anaconda
    oauthlib-3.1.0             |             py_0          88 KB  anaconda
    openssl-1.1.1h             |       h7b6447c_0         3.8 MB  anaconda
    opt_einsum-3.1.0           |             py_0          54 KB  anaconda
    protobuf-3.13.0.1          |   py38he6710b0_1         702 KB  anaconda
    pyasn1-0.4.8               |             py_0          58 KB  anaconda
    pyasn1-modules-0.2.8       |             py_0          67 KB  anaconda
    pycparser-2.20             |             py_2          94 KB  anaconda
    pyjwt-1.7.1                |           py38_0          32 KB  anaconda
    pyopenssl-19.1.0           |             py_1          47 KB  anaconda
    pysocks-1.7.1              |           py38_0          27 KB  anaconda
    requests-2.24.0            |             py_0          54 KB  anaconda
    requests-oauthlib-1.3.0    |             py_0          22 KB  anaconda
    rsa-4.6                    |             py_0          26 KB  anaconda
    scipy-1.5.2                |   py38h0b6359f_0        18.7 MB  anaconda
    six-1.15.0                 |             py_0          13 KB  anaconda
    tensorboard-2.2.1          |     pyh532a8cf_0         2.5 MB  anaconda
    tensorboard-plugin-wit-1.6.0|             py_0         663 KB  anaconda
    tensorflow-2.2.0           |gpu_py38hb782248_0           4 KB  anaconda
    tensorflow-base-2.2.0      |gpu_py38h83e3d50_0       421.3 MB  anaconda
    tensorflow-estimator-2.2.0 |     pyh208ff02_0         276 KB  anaconda
    tensorflow-gpu-2.2.0       |       h0d30ee6_0           2 KB  anaconda
    termcolor-1.1.0            |           py38_1           8 KB  anaconda
    urllib3-1.25.11            |             py_0          93 KB  anaconda
    werkzeug-1.0.1             |             py_0         243 KB  anaconda
    wrapt-1.12.1               |   py38h7b6447c_1          50 KB  anaconda
    yarl-1.6.2                 |   py38h7b6447c_0         142 KB  anaconda
    zipp-3.3.1                 |             py_0          11 KB  anaconda
    ------------------------------------------------------------
                                           Total:        1.41 GB

Including cudatoolkit and cudnn...
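
To see which CUDA libraries conda placed inside the environment, something like the following sketch may help. It assumes the environment is activated (so `CONDA_PREFIX` is set) and that shared libraries live under `$CONDA_PREFIX/lib`, the usual Linux layout; the function name `conda_cuda_libs` is made up for illustration:

```python
import glob
import os


def conda_cuda_libs(prefix=None):
    """List the libcu*.so* file names under <prefix>/lib, sorted by name."""
    prefix = prefix or os.environ.get("CONDA_PREFIX")
    if not prefix:
        # Not inside an activated conda environment.
        return []
    pattern = os.path.join(prefix, "lib", "libcu*.so*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


if __name__ == "__main__":
    for name in conda_cuda_libs():
        print(name)
```

The versions in the listed file names can then be compared with the library names TensorFlow reports in its log.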

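After that, TF detected the NVIDIA card. I'm not sure why, but presumably because conda installed its own cudatoolkit 10.1 and cudnn 7.6.5 inside the environment, matching this TensorFlow 2.2 build: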

2022-01-25 09:37:52.865587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-01-25 09:37:52.902796: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.903487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:08:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2022-01-25 09:37:52.903637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2022-01-25 09:37:52.904633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2022-01-25 09:37:52.905878: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2022-01-25 09:37:52.906023: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2022-01-25 09:37:52.907115: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2022-01-25 09:37:52.907719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2022-01-25 09:37:52.910042: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2022-01-25 09:37:52.910137: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.911078: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-25 09:37:52.911707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
Num GPUs Available:  1

Process finished with exit code 0
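
For reference, a check that prints a line like the "Num GPUs Available" one above can be sketched as follows. The exact script used is not shown in the post, so this is an assumption; it relies on `tf.config.list_physical_devices`, and returns `None` when TensorFlow is not installed:

```python
def count_gpus():
    """Return the number of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return len(tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    print("Num GPUs Available: ", count_gpus())
```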
