
Tensorflow: Cuda compute capability 3.0. The minimum required Cuda capability is 3.5

I am installing TensorFlow from source (following the docs).

CUDA driver version:

nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 7.5, V7.5.17
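
(For reference, this banner is presumably the output of the compiler's version query:)

nvcc --version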

When I run the following command:

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

it gives me the following error:

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:118] Found device 0 with properties: 
name: GeForce GT 640
major: 3 minor: 0 memoryClockRate (GHz) 0.9015
pciBusID 0000:05:00.0
Total memory: 2.00GiB
Free memory: 1.98GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:138] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:148] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:843] Ignoring gpu device (device: 0, name: GeForce GT 640, pci bus id: 0000:05:00.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
F tensorflow/cc/tutorials/example_trainer.cc:128] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'Cast': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
     [[Node: Cast = Cast[DstT=DT_FLOAT, SrcT=DT_INT32, _device="/gpu:0"](Const)]])
Aborted (core dumped)

Do I need a different GPU to run this?

I have installed TensorFlow version 1.8. It recommends CUDA 9.0. I am using a GTX 650M card, which has CUDA compute capability 3.0, and it now works great. The OS is Ubuntu 18.04. The detailed steps are below:

Install dependencies

I have included ffmpeg and some related packages for my OpenCV 3.4 build; skip them if you don't need them. Run the following commands:

sudo apt-get update 
sudo apt-get dist-upgrade -y
sudo apt-get autoremove -y
sudo apt-get upgrade
sudo add-apt-repository ppa:jonathonf/ffmpeg-3 -y
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install ffmpeg -y
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev -y
sudo apt-get install python-dev libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev -y
sudo apt-get install libxvidcore-dev libx264-dev -y
sudo apt-get install unzip qtbase5-dev python-dev python3-dev python-numpy python3-numpy -y
sudo apt-get install libopencv-dev libgtk-3-dev libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev -y
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev -y
sudo apt-get install libv4l-dev libtbb-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev -y
sudo apt-get install libvorbis-dev libxvidcore-dev v4l-utils vtk6 -y
sudo apt-get install liblapacke-dev libopenblas-dev libgdal-dev checkinstall -y
sudo apt-get install libgtk-3-dev -y
sudo apt-get install libatlas-base-dev gfortran -y
sudo apt-get install qt-sdk -y
sudo apt-get install python2.7-dev python3.5-dev python-tk -y
sudo apt-get install cython libgflags-dev -y
sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-eng -y 
sudo apt-get install tesseract-ocr-ell -y
sudo apt-get install gstreamer1.0-python3-plugin-loader -y
sudo apt-get install libdc1394-22-dev -y
sudo apt-get install openjdk-8-jdk
sudo apt-get install pkg-config zip g++-6 gcc-6 zlib1g-dev unzip  git
sudo wget https://bootstrap.pypa.io/get-pip.py
sudo python get-pip.py
sudo pip install -U pip
sudo pip install -U numpy
sudo pip install -U pandas
sudo pip install -U wheel
sudo pip install -U six
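
As a quick sanity check before moving on (my own addition, not part of the original answer), you can confirm that pip and the basic Python packages installed cleanly:

pip --version
python -c "import numpy, pandas, six, wheel; print(numpy.__version__)"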

Install the NVIDIA driver

Run the following commands:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-390 -y

Reboot and run the following command; it should give you details like those in the screenshot from the original post (image not reproduced here).
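
(The command itself did not survive in this copy of the post; it is almost certainly nvidia-smi, which prints the installed driver version and the GPUs it detects:)

nvidia-smi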

gcc-6 and g++-6 check.

CUDA 9.0 needs gcc-6 and g++-6, so run the following commands:

cd /usr/bin 
sudo rm -rf gcc gcc-ar gcc-nm gcc-ranlib g++
sudo ln -s gcc-6 gcc
sudo ln -s gcc-ar-6 gcc-ar
sudo ln -s gcc-nm-6 gcc-nm
sudo ln -s gcc-ranlib-6 gcc-ranlib
sudo ln -s g++-6 g++
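
To verify the switch took effect (my own check, not from the original post), both compilers should now report version 6.x:

gcc --version
g++ --version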

Install CUDA 9.0

Go to https://developer.nvidia.com/cuda-90-download-archive and select the options: Linux -> x86_64 -> Ubuntu -> 17.04 -> deb (local). Download the main file and the two patches. Then run the following commands:

sudo dpkg -i cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Navigate to the first patch on your PC and double-click it; it will execute automatically. Do the same for the second patch.

Add the following lines to your ~/.bashrc file and restart:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
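
After reopening the shell (or sourcing ~/.bashrc), a quick way to confirm the toolkit is on the path (my addition, not from the original post) is:

source ~/.bashrc
nvcc --version          # should report release 9.0
echo $LD_LIBRARY_PATH   # should contain /usr/local/cuda-9.0/lib64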

Install cuDNN 7.1.4 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/cudnn and extract it into your Downloads folder. The download requires an NVIDIA developer login (registration is free). Then run the following commands:

cd ~/Downloads/cudnn-9.0-linux-x64-v7.1/cuda
sudo cp include/* /usr/local/cuda/include/
sudo cp lib64/libcudnn.so.7.1.4 lib64/libcudnn_static.a /usr/local/cuda/lib64/
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libcudnn.so.7.1.4 libcudnn.so.7
sudo ln -s libcudnn.so.7 libcudnn.so
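
To confirm which cuDNN version the toolkit will pick up (my own check, assuming the headers were copied as above), run:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2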

Install NCCL 2.2.12 for CUDA 9.0

Download the tar file from https://developer.nvidia.com/nccl and extract it into your Downloads folder. The download requires an NVIDIA developer login (registration is free). Then run the following commands:

sudo mkdir -p /usr/local/cuda/nccl/lib /usr/local/cuda/nccl/include
cd ~/Downloads/nccl-repo-ubuntu1604-2.2.12-ga-cuda9.0_1-1_amd64/
sudo cp *.txt /usr/local/cuda/nccl
sudo cp include/*.h /usr/include/
sudo cp lib/libnccl.so.2.1.15 lib/libnccl_static.a /usr/lib/x86_64-linux-gnu/
sudo ln -s /usr/include/nccl.h /usr/local/cuda/nccl/include/nccl.h
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libnccl.so.2.1.15 libnccl.so.2
sudo ln -s libnccl.so.2 libnccl.so
for i in libnccl*; do sudo ln -s /usr/lib/x86_64-linux-gnu/$i /usr/local/cuda/nccl/lib/$i; done
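
To double-check that the loader can resolve NCCL (my addition, not from the original post), refresh the cache and list the links:

sudo ldconfig
ldconfig -p | grep libnccl
ls -l /usr/local/cuda/nccl/lib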

Install Bazel (the manual Bazel install is the one that worked for me; see https://docs.bazel.build/versions/master/install-ubuntu.html#install-with-installer-ubuntu)

Download "bazel-0.13.1-installer-darwin-x86_64.sh" from https://github.com/bazelbuild/bazel/releases and run the following commands:

chmod +x bazel-0.13.1-installer-darwin-x86_64.sh
./bazel-0.13.1-installer-darwin-x86_64.sh --user
export PATH="$PATH:$HOME/bin"
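
Verify the installation (my addition); the reported version should match the installer you downloaded:

bazel version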

Compile TensorFlow

We will compile with CUDA, with XLA JIT (oh yes) and with jemalloc as malloc support, so we answer yes to those. Run the following commands and answer the queries as described in the configure run below:

git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout r1.8
./configure
You have bazel 0.13.0 installed.
Please specify the location of python. [Default is /usr/bin/python]:
Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.4
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.2.12
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/usr/local/cuda/nccl
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.0]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/x86_64-linux-gnu-gcc-7]: /usr/bin/gcc-6
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
 --config=mkl          # Build with MKL support.

 --config=monolithic   # Config for mostly static monolithic build.

Configuration finished

Now to compile TensorFlow, run the command below. It consumes a lot of RAM and takes time. If you have plenty of RAM you can remove "--local_resources 2048,.5,1.0" from the line below; otherwise, this works on 2 GB of RAM.

bazel build --config=opt --config=cuda --local_resources 2048,.5,1.0 //tensorflow/tools/pip_package:build_pip_package

Once compilation is done you will see output confirming success, as shown in the screenshot in the original post (image not reproduced here).

Build the wheel file by running:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install the generated wheel file with pip:

sudo pip install /tmp/tensorflow_pkg/tensorflow*.whl

Now you can run TensorFlow and explore the devices it sees; the original post shows this in an IPython terminal (screenshot not reproduced here).
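
A roughly equivalent check from a shell (my sketch, using the TF 1.x API) is:

python -c "import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available())"
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"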

In Anaconda, tensorflow-gpu=1.12 with cudatoolkit=9.0 is compatible with GPUs of compute capability 3.0. Here are the commands for creating a new environment and installing the libraries needed for 3.0 GPUs.

conda create -n tf-gpu
conda activate tf-gpu
conda install tensorflow-gpu=1.12
conda install cudatoolkit=9.0

Then you can try it as follows.

python
import tensorflow as tf
tf.Session()

This is my output:

name: GeForce GT 650M major: 3 minor: 0 memoryClockRate(GHz): 0.95
pciBusID: 0000:01:00.0
totalMemory: 3.94GiB freeMemory: 3.26GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-12-09 13:26:12.050152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-09 13:26:12.050199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2019-12-09 13:26:12.050222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2019-12-09 13:26:12.050481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2989 MB memory) -> physical GPU (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0, compute capability: 3.0)

Enjoy!

@Taako, sorry for the late reply. I did not save the compiled wheel file shown above. However, here is a new one for tensorflow 1.9. Hope this helps you enough. Please make sure of the following details used for the build.

Tensorflow: 1.9, CUDA toolkit: 9.2, CUDNN: 7.1.4, NCCL: 2.2.13

Here is the link to the wheel file: wheel file

Thanks for the WHL! I'm finally able to use TF now, since my laptop only supports compute 3.0. I couldn't get it to compile by following your instructions on a fresh install of Ubuntu 18.04, and wanted to point out the following:

  • In your "dependencies" section, libjasper is no longer available on its own, ffmpeg is no longer available from the repository you listed, and libtiff5-dev is no longer available (I think there is a newer version). I understand this is mostly for the OpenCV stuff, which I also use. You also repeat a few packages, e.g. git and unzip.
  • In your "Nvidia driver" section, I don't think that driver is in the repository; at least I couldn't pull it. With the WHL file you built, I'm using the 418 driver from the Nvidia website, which seems to work fine.
  • In your "Install cuDNN 7.1.4 for CUDA 9.0" section, you "cd /usr/lib/x86_64-linux-gnu", but the files are in /usr/local/cuda. Is that correct? I'd guess the links at least have to point at the cuda folder.
  • In the "Install NCCL 2.2.12 for CUDA 9.0" section, you use 2.2.12, but your command lines all refer to 2.1.15.
  • In your Bazel install section, you say to use the Bazel Darwin installer, but I believe that is for Mac. I think you need the Bazel Linux installer.

Thanks again for all the work you put into this!

PS: I was able to build Tensorflow 1.12 by following these instructions, doing a git checkout of Tensorflow 1.12 and installing keras_applications and keras_preprocessing via pip, using CUDA 9.2, CUDNN 7.1.4 and NCCL 2.2.13, with Bazel 0.15.0. Someone pointed out that CUDA 9.0 cannot be compiled with gcc6/g++6; apparently 9.2 can.

For Tensorflow 2.1.0

I was able to manage this on Windows by compiling TF 2.1.0 from source. The TF 2.2.0 build failed because of XLA, even with all the XLA flags disabled for bazel. Also be careful with newer Python versions - I ran into some strange errors with the prebuilt pip packages on Python 3.8, so I used Python 3.6 to get around that.

One warning - I only started using the library a few hours after the build finished. A simple model that trains for just a few seconds worked fine, but training a basic convolutional network failed after 0 or 1 epochs due to CUDA errors. Your mileage may vary.
