
tensorflow Mac OS gpu support

According to https://www.tensorflow.org/install/install_mac:

Note: As of version 1.2, TensorFlow no longer provides GPU support on Mac OS X.

However, I want to run an eGPU setup such as an Akitio Node with a GTX 1080 Ti via Thunderbolt 3.

What steps are required to get this setup to work? So far I know that

are required. What else is needed to get CUDA / tensorflow to work?

I wrote a little tutorial on compiling TensorFlow 1.2 with GPU support on macOS. I think it's customary to copy relevant parts to SO, so here it goes:

  1. If you haven't used a TensorFlow-GPU set-up before, I suggest first setting everything up with TensorFlow 1.0 or 1.1, where you can still do pip install tensorflow-gpu. Once you get that working, the same CUDA set-up will also work when you compile TensorFlow yourself. If you have an external GPU, YellowPillow's answer (or mine) might help you get things set up.
  2. Follow the official tutorial "Installing TensorFlow from Sources", but substitute git checkout r1.0 with git checkout r1.2. When doing ./configure, pay attention to the Python library path: it sometimes suggests an incorrect one. I chose the default options in most cases, except for: Python library path, CUDA support and compute capability. Don't use Clang as the CUDA compiler: this will lead to the error "Inconsistent crosstool configuration; no toolchain corresponding to 'local_darwin' found for cpu 'darwin'." Using /usr/bin/gcc as your compiler will actually use the Clang that comes with macOS / Xcode. Below is my full configuration.
  3. TensorFlow 1.2 expects a C library called OpenMP, which is not available in the current Apple Clang. It should speed up multithreaded TensorFlow on multi-CPU machines, but TensorFlow will also compile without it. We could try to build TensorFlow with gcc 4 (which I didn't manage to do), or simply remove the line that includes OpenMP from the build file. In my case I commented out line 98 of tensorflow/third_party/gpus/cuda/BUILD.tpl, which contained linkopts = ["-lgomp"] (but the location of the line might change between versions). Some people had issues with zmuldefs, but I assume that was with earlier versions; thanks to udnaan for pointing out that it's OK to comment out these lines.
  4. I had some problems building with the latest bazel 0.5.3, so I reverted to 0.4.5, which I already had installed. Some discussion in a GitHub issue mentioned that bazel 0.5.2 also didn't have the problem.
  5. Now build with bazel and finish the installation as instructed by the official install guide. On my 3.2 GHz iMac this took about 37 minutes.
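Step 3 above (commenting out the OpenMP link option) can also be done with a few lines of Python instead of editing by hand. This is only a sketch: it assumes the offending line is the one containing the string -lgomp, and the sample text is a made-up stand-in, not the real contents of BUILD.tpl.

```python
def comment_out_lgomp(build_text: str) -> str:
    """Return build_text with every line mentioning -lgomp commented out."""
    patched = []
    for line in build_text.splitlines(keepends=True):
        # Leave lines that are already comments alone, so running twice is harmless.
        if "-lgomp" in line and not line.lstrip().startswith("#"):
            line = "# " + line
        patched.append(line)
    return "".join(patched)


# Hypothetical stand-in for the relevant part of the build file.
sample = 'first_line\n    linkopts = ["-lgomp"],\nlast_line\n'
print(comment_out_lgomp(sample))
```

To apply it for real you would read tensorflow/third_party/gpus/cuda/BUILD.tpl, pass its text through the function and write the result back; keep a backup of the original file first.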

Using python library path: /Users/m/code/3rd/conda/envs/p3gpu/lib/python3.6/site-packages

Do you wish to build TensorFlow with MKL support? [y/N] N

No MKL support will be enabled for TensorFlow

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]

No Google Cloud Platform support will be enabled for TensorFlow

Do you wish to build TensorFlow with Hadoop File System support? [y/N]

No Hadoop File System support will be enabled for TensorFlow

Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]

No XLA support will be enabled for TensorFlow

Do you wish to build TensorFlow with VERBS support? [y/N]

No VERBS support will be enabled for TensorFlow

Do you wish to build TensorFlow with OpenCL support? [y/N]

No OpenCL support will be enabled for TensorFlow

Do you wish to build TensorFlow with CUDA support? [y/N] y

CUDA support will be enabled for TensorFlow

Do you want to use clang as CUDA compiler? [y/N]

nvcc will be used as CUDA compiler

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:

Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Please specify the cuDNN version you want to use. [Leave empty to use system default]:

Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.

You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.

Please note that each additional compute capability significantly increases your build time and binary size.

[Default is: "3.5,5.2"]: 6.1

INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.

Configuration finished

Assuming that you have already set up your eGPU box and connected the TB3 cable from the eGPU to your TB3 port:

1. Download the automate-eGPU script and run it

curl -o ~/Desktop/automate-eGPU.sh \
  https://raw.githubusercontent.com/goalque/automate-eGPU/master/automate-eGPU.sh \
  && chmod +x ~/Desktop/automate-eGPU.sh && cd ~/Desktop \
  && sudo ./automate-eGPU.sh

You might get an error saying:

"Boot into recovery partition and type: csrutil disable"

In that case, restart your computer and hold down cmd + R while it restarts to enter recovery mode. Then open the Terminal in recovery mode and type:

csrutil disable

Then restart your computer and re-run the automate-eGPU.sh script.
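Before re-running the script you can confirm that System Integrity Protection is really off. The sketch below interprets the text printed by csrutil status; the exact wording it looks for ("status: disabled") is an assumption about that command's output, so treat the check as a convenience rather than a guarantee.

```python
import subprocess
import sys


def sip_is_disabled(status_output: str) -> bool:
    """Interpret the text printed by `csrutil status`."""
    # Assumed format: "System Integrity Protection status: disabled."
    return "status: disabled" in status_output.lower()


if sys.platform == "darwin":
    # Only meaningful on macOS, where the csrutil command exists.
    result = subprocess.run(["csrutil", "status"], capture_output=True, text=True)
    print("SIP disabled:", sip_is_disabled(result.stdout))
```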

2. Downloading and installing CUDA

Run the cuda_8.0.61_mac.dmg file and follow the installer through to the end. Afterwards you will need to set the paths.

Go to your Terminal and type:

vim ~/.bash_profile

Or open whichever file you store your environment variables in, and then add these three lines:

export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH="$CUDA_HOME/lib:$CUDA_HOME:$CUDA_HOME/extras/CUPTI/lib"
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
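After opening a new Terminal you may want to check that the variables point at directories that actually exist. A minimal sketch, assuming only the variable names used in the three lines above (the helper name is made up for illustration):

```python
import os


def missing_cuda_paths(env):
    """Return problems found in the CUDA-related environment variables."""
    if not env.get("CUDA_HOME"):
        return ["CUDA_HOME is not set"]
    missing = []
    # DYLD_LIBRARY_PATH is a colon-separated list of directories.
    for entry in env.get("DYLD_LIBRARY_PATH", "").split(":"):
        if entry and not os.path.isdir(entry):
            missing.append(entry)
    return missing


print(missing_cuda_paths(os.environ))
```

An empty list means every listed directory exists; it does not prove that the libraries inside them are the right versions.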

3. Downloading and installing cuDNN

Downloading cuDNN is a bit more troublesome: you have to sign up as an Nvidia developer before you can download it. Make sure to download cuDNN v5.1 Library for OSX, as it's the one that Tensorflow v1.1 expects. Note that we can't use Tensorflow v1.2, as it has no GPU support for Macs :((


This downloads an archive called cudnn-8.0-osx-x64-v5.1.tgz. Extract it, which creates a folder called cuda, and cd into that folder using the terminal. Assuming that the folder is in Downloads:

Open terminal and type:

cd ~/Downloads/cuda

Now we need to copy the cuDNN files to where CUDA is stored:

sudo cp include/* /usr/local/cuda/include/
sudo cp lib/* /usr/local/cuda/lib/
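To confirm that the copy worked, you can look for the cuDNN header and libraries under the CUDA directory. This is a sketch under the assumption that the header is named cudnn.h and the library file names start with libcudnn; adjust the patterns if your archive differs.

```python
from pathlib import Path


def cudnn_files(cuda_home):
    """Return (header_present, sorted library file names) under cuda_home."""
    root = Path(cuda_home)
    header_present = (root / "include" / "cudnn.h").is_file()
    libraries = sorted(p.name for p in (root / "lib").glob("libcudnn*"))
    return header_present, libraries


print(cudnn_files("/usr/local/cuda"))
```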

4. Now install Tensorflow-GPU v1.1 in your conda/virtualenv

Since I use conda, I created a new environment using Terminal:

conda create -n egpu python=3
source activate egpu
pip install tensorflow-gpu # should install version 1.1
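Since everything here depends on getting a release older than 1.2, it is worth checking which version pip actually installed. A small sketch (the helper name is hypothetical; the 1.2 cut-off comes from the install note quoted in the question):

```python
def predates_gpu_removal(version: str) -> bool:
    """True if this TensorFlow version is older than 1.2."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (1, 2)


try:
    import tensorflow as tf
    print(tf.__version__, predates_gpu_removal(tf.__version__))
except ImportError:
    print("tensorflow is not installed in this environment")
```

If the check prints False, pinning the version explicitly when installing is the usual fix.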

5. Verify that it works

First restart your computer, then:

In the terminal, type python and enter:

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

If TensorFlow can see the GPU, this should run with no problem. If it can't, you will get a stack trace (just a bunch of error messages) that should include:

Cannot assign a device to node 'MatMul': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process

If not, then you're done, congrats! I just got mine set up today and it's working perfectly :)
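For reference, the value the verification snippet should print can be worked out without a GPU. The sketch below multiplies the same 2x3 and 3x2 matrices in plain Python, so you can compare it with what sess.run(c) returns (TensorFlow formats the output differently, but the numbers should agree).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape [3, 2]
print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```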

I could finally make it work with the following setup

Hardware

Software versions

  • macOS Sierra Version 10.12.6
  • GPU Driver Version: 10.18.5 (378.05.05.25f01)
  • CUDA Driver Version: 8.0.61
  • cuDNN v5.1 (Jan 20, 2017), for CUDA 8.0: need to register and download
  • tensorflow-gpu 1.0.0
  • Keras 2.0.8

I wrote a gist with the procedure:

https://gist.github.com/jganzabal/8e59e3b0f59642dd0b5f2e4de03c7687

Here is my solution to install an e-gpu on a mac. TensorFlow doesn't support tensorflow-gpu on macOS anymore, so there are definitely better approaches to get it working:

My configuration:

  • iMac 27", late 2012
  • Akitio Node
  • GTX 1080 Ti
  • 3 screens: one of them connected to the GTX 1080 Ti and the others plugged directly into the Mac.

Advantages of a Windows Boot Camp installation:

  • You can use pip to install tensorflow-gpu.
  • Good GTX 1080 Ti support (downloadable display driver)

Howto:

  • Install Windows 10 with Boot Camp. Do not connect the Akitio Node for the moment.
  • Download and install the display driver for your GPU from the NVIDIA download page.
  • Install Visual Studio
    • If you want to use CUDA 9.x you can install Visual Studio 2017
    • Otherwise install Visual Studio 2015
  • Install CUDA and cuDNN
    • Note that the tensorflow-gpu version must match your CUDA and cuDNN versions. See the available tensorflow releases here.
    • After the CUDA installation you can move the unpacked cuDNN files to the CUDA folder at: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0. Move the lib files to the lib folder, the bin files to the bin folder and the include files to the include folder.
  • Install Python 3.5+
    • You need a 64-bit version to install tensorflow-gpu with pip
    • Python 2.7 won't work.
  • Install tensorflow with pip:

Command:

pip install tensorflow-gpu==1.5.0rc0
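The version matching mentioned in the steps above can be kept as a small lookup table. The sketch below lists the CUDA / cuDNN pairs as I recall them from TensorFlow's published tested build configurations; treat the table as an assumption and verify it against the release notes for the version you install. Only the 1.0 and 1.1 rows (CUDA 8.0 with cuDNN 5.1) are stated in the answers on this page, and the 1.5 row agrees with the v9.0 CUDA folder used above.

```python
# Assumed (CUDA, cuDNN) pairs per tensorflow-gpu minor release; verify before relying on them.
TESTED_CONFIGS = {
    "1.0": ("8.0", "5.1"),
    "1.1": ("8.0", "5.1"),
    "1.2": ("8.0", "5.1"),
    "1.3": ("8.0", "6"),
    "1.4": ("8.0", "6"),
    "1.5": ("9.0", "7"),
}


def required_cuda_cudnn(tf_version):
    """Return (cuda, cudnn) for a version string like '1.5.0rc0', or None if unknown."""
    key = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(key)


print(required_cuda_cudnn("1.5.0rc0"))  # ('9.0', '7')
```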

Check your installation

The display driver has been installed correctly when a screen plugged into the GTX 1080 Ti card works.

Call C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe to check whether your video card is available for CUDA.

Execute the following tensorflow command to see available devices:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()
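The list returned above includes the CPU as well, so it can help to filter it down to GPU entries. A minimal sketch, assuming each returned entry exposes name and device_type attributes (the helper name is made up):

```python
def gpu_names(devices):
    """Return the names of the entries whose device_type is GPU."""
    return [d.name for d in devices if d.device_type == "GPU"]


try:
    from tensorflow.python.client import device_lib
    print(gpu_names(device_lib.list_local_devices()))
except ImportError:
    print("tensorflow is not installed in this environment")
```

An empty list means TensorFlow only sees the CPU, which usually points at a driver, CUDA or cuDNN mismatch.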

Troubleshooting and hints:

  • Windows will want to update your GTX 1080 driver. Never allow that, because you won't be able to start up your computer again! A black screen with moving dots will appear before you can log in to Windows. Game over! Only use the display driver from the NVIDIA download page.
  • If Windows no longer starts, hold the alt key at startup to boot into OSX and reinstall Windows.

Ubuntu solution:

I couldn't find a working solution, but here are some approaches:

It seems that my GTX 680 (iMac) and my GTX 1080 Ti won't work together. Ubuntu could not be started anymore after installing the display driver via apt-get (see "Ubuntu not starting anymore"). Try downloading the official display driver from the NVIDIA download page instead.

OSX solution: TensorFlow GPU is only supported up to tensorflow 1.1. I tried to install a newer version but couldn't build tensorflow-gpu with CUDA support. Here are some approaches:

  • Install OSX Sierra to use the e-gpu script. High Sierra won't work (as of Jan 13, 2018). Downgrade to Sierra by deleting all your partitions, then press Command + R at startup to load the internet recovery. Don't forget to back up your data first.
  • Install the e-gpu script.
  • If tensorflow-gpu 1.1 is enough for you, you can just install it via pip; otherwise you need to build your pip package with bazel.

Conclusion: The Windows installation is easier than the OSX or Ubuntu installation, because the display drivers work properly and tensorflow does not have to be built yourself. Always check the software versions you use. They must match exactly.

I hope this will help you!
