
CUDA and cuDNN version conflict with TensorFlow 2.4.1

Could someone share a working TF 2.4.1 environment on an RTX 30x0 card? Specifically, I'd like to know the NVIDIA driver, CUDA, and cuDNN versions, and how TF 2.4.1 was installed.

I am struggling to install TensorFlow 2.4.1 on a PC with the following configuration:

OS: Ubuntu 20.04 (the version does not matter)
CPU: Ryzen 5600X
GPU: RTX 3070

I know that TF 2.4.1 requires CUDA 11.0 with cuDNN 8.0.4, according to the following page.

https://www.tensorflow.org/install/gpu

However, NVIDIA driver version 457 is the first version that supports the RTX 3070, and the latest version at the moment is 460. So I installed version 460, and 'nvidia-smi' prints the following in the terminal. (As you may know, installing this driver does NOT install CUDA 11.2.)

| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |

The support matrix on the NVIDIA site below says that CUDA 11.0 goes with driver version 450, which I take to mean that it cannot be used with the RTX 3070. For the RTX 3070, I think CUDA 11.2 or later is the right choice, since it goes with driver version 460.

https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html

After installing CUDA 11.2 with cuDNN 8.1.0, I installed TensorFlow with pip in a pyenv environment. However, TF runs on the CPU. I confirmed this with 'tf.config.list_physical_devices('GPU')', which returned '[]', as I expected. As long as TF 2.4.1's restrictions on the NVIDIA environment apply, does this mean I cannot set up a TF environment on this PC?

I tried many combinations, each on a fresh installation of Ubuntu. One attempt to install driver version 450 failed with a dpkg error, but I will try again.

Thank you all. My issue is solved; here is what I did.

After installing Ubuntu 20.04, install the latest NVIDIA driver. (This may be unnecessary, since the CUDA installer can install a driver as well.)

sudo ubuntu-drivers devices
sudo apt-get install --no-install-recommends nvidia-driver-460
nvidia-smi
sudo reboot
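The banner that nvidia-smi prints can also be read from a script. The following is only a minimal Python sketch, not part of the steps I actually ran; the function name parse_smi_banner is hypothetical, and the sample line is the one quoted in the question. Note that the CUDA version in the banner is the highest CUDA version the driver supports, not an installed toolkit.

```python
import re


def parse_smi_banner(line):
    """Return (driver_version, cuda_version) strings from an nvidia-smi banner line."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", line)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", line)
    if driver is None or cuda is None:
        raise ValueError("not an nvidia-smi banner line")
    return driver.group(1), cuda.group(1)


# Sample banner line copied from the question.
banner = "| NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2 |"
driver_version, max_cuda = parse_smi_banner(banner)
print(driver_version, max_cuda)
```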

Install CUDA 11.0, which TF 2.4.1 requires. This time I used the runfile (local) installer instead of the deb (local) one, which had failed for me before. https://developer.nvidia.com/cuda-11.0-update1-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=2004&target_type=runfilelocal

sudo apt update
wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
sudo sh cuda_11.0.3_450.51.06_linux.run

*The warning 'Existing package manager installation of the driver found.' appeared at the line above, so I aborted the installer and ran the following instead.

sudo sh ./cuda_11.0.3_450.51.06_linux.run --toolkit --silent --override
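The toolkit-only install keeps the driver that is already on the machine. NVIDIA drivers are backward compatible with older CUDA toolkits, so the 460 driver can run CUDA 11.0. As a rough sanity check, the installed driver can be compared with the driver version embedded in the runfile name. This is a conservative heuristic of my own, sketched in Python, and not NVIDIA's official minimum-driver table; the helper names are hypothetical.

```python
import re


def version_tuple(text):
    """Turn '460.73.01' into (460, 73, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def bundled_driver(runfile_name):
    """Extract the driver version from a name like cuda_<toolkit>_<driver>_linux.run."""
    match = re.fullmatch(r"cuda_([\d.]+)_([\d.]+)_linux\.run", runfile_name)
    if match is None:
        raise ValueError(f"unexpected runfile name: {runfile_name}")
    return match.group(2)


def driver_is_new_enough(installed_driver, runfile_name):
    """True if the installed driver is at least as new as the bundled one."""
    return version_tuple(installed_driver) >= version_tuple(bundled_driver(runfile_name))


print(driver_is_new_enough("460.73.01", "cuda_11.0.3_450.51.06_linux.run"))
```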

Confirm that /usr/local/cuda-11.0/bin exists, then edit ~/.bashrc.

vim ~/.bashrc

Add the following two lines to ~/.bashrc.

export PATH=/usr/local/cuda-11.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Reload ~/.bashrc.

source ~/.bashrc
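To confirm that the new shell really picked up both variables, something like the following Python sketch could be run from it. The helper names has_entry and check_environment are hypothetical; the directories are the ones exported above.

```python
import os

CUDA_HOME = "/usr/local/cuda-11.0"


def has_entry(path_value, directory):
    """True if `directory` is one of the entries of a PATH-style string."""
    entries = [entry.rstrip("/") for entry in path_value.split(os.pathsep) if entry]
    return directory.rstrip("/") in entries


def check_environment(environ):
    """Report whether the CUDA bin and lib64 directories are on the search paths."""
    return {
        "PATH": has_entry(environ.get("PATH", ""), CUDA_HOME + "/bin"),
        "LD_LIBRARY_PATH": has_entry(environ.get("LD_LIBRARY_PATH", ""), CUDA_HOME + "/lib64"),
    }


print(check_environment(os.environ))
```

Both values should be True in a shell that has sourced the edited ~/.bashrc.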

Install cuDNN. Download the tarball in advance from https://developer.nvidia.com/rdp/cudnn-archive

cd Downloads
tar -xzvf cudnn-11.0-linux-x64-v8.0.4.30.tgz
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
sudo apt update
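To double-check which cuDNN version ended up under /usr/local/cuda, the version macros can be read from the copied header. This is a minimal Python sketch assuming the cuDNN 8 layout, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn_version.h; the function name cudnn_version is hypothetical.

```python
import re
from pathlib import Path

# Location the header was copied to in the step above.
HEADER = Path("/usr/local/cuda/include/cudnn_version.h")


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn_version.h."""
    found = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            rf"^\s*#\s*define\s+CUDNN_{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError(f"CUDNN_{name} not found in header")
        found[name] = int(match.group(1))
    return found["MAJOR"], found["MINOR"], found["PATCHLEVEL"]


if HEADER.exists():
    print(cudnn_version(HEADER.read_text()))
else:
    print(f"{HEADER} not found")
```

For the tarball used here, the expected result is (8, 0, 4).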

Install TF 2.4.1 as instructed at https://www.tensorflow.org/install/pip?hl=ja.

sudo apt install python3-dev python3-pip python3-venv
cd Documents/ML/tf-test1
python3 -m venv test1
source test1/bin/activate
pip install --upgrade pip
pip list
pip install --upgrade tensorflow==2.4.1

Check that TF is properly installed.

import tensorflow as tf
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))
print(tf.test.gpu_device_name())
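If the GPU list comes back empty, it can help to see which CUDA and cuDNN versions the installed wheel was built against. The sketch below uses tf.sysconfig.get_build_info(), which is available from TF 2.3 on; the key names 'cuda_version' and 'cudnn_version' and the helper names are assumptions on my part, so treat the output as a hint rather than a definitive check.

```python
def summarize(build_info):
    """Pick the CUDA and cuDNN versions out of a build-info mapping (None if absent)."""
    return build_info.get("cuda_version"), build_info.get("cudnn_version")


def report():
    """Print the versions the installed TensorFlow wheel was built against."""
    import tensorflow as tf  # imported lazily so the helper above works without TF

    cuda, cudnn = summarize(tf.sysconfig.get_build_info())
    print("built for CUDA:", cuda)
    print("built for cuDNN:", cudnn)
```

Calling report() inside the virtual environment should show CUDA 11.0 and cuDNN 8 for a TF 2.4.1 wheel if the assumptions above hold.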

When I ran my CNN model, fit() failed with 'Failed to get convolution algorithm. This is probably because cuDNN failed to initialize'. I then did the following, as I always do.

export TF_FORCE_GPU_ALLOW_GROWTH=true
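The same setting can be applied from inside a Python script instead of the shell. This is a sketch of two alternatives, not what I actually ran: setting the variable before TensorFlow is imported, or calling tf.config.experimental.set_memory_growth for each GPU before any GPU work starts. The function name enable_memory_growth is hypothetical.

```python
import os

# Option 1: set the variable before TensorFlow is imported, so it is
# already in place when the GPU is initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def enable_memory_growth():
    """Option 2: ask TensorFlow to grow GPU memory on demand for every visible GPU."""
    import tensorflow as tf  # imported lazily, after the variable above is set

    for gpu in tf.config.list_physical_devices("GPU"):
        # Must be called before the GPU has been initialized by any op.
        tf.config.experimental.set_memory_growth(gpu, True)
```

Either option alone should be enough; the environment variable has the advantage of not requiring any change to the model code.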

That resolved it.

I have Ubuntu 20.04 like you. What finally worked for me, so that TensorFlow could see the GPUs, was tensorflow-gpu 2.2.0 with CUDA 10.1 and cuDNN 7.6, all with NVIDIA driver version 460.73.01.
