
Support for Nvidia CUDA Toolkit 9.2

Why is tensorflow-gpu bound to a specific version of Nvidia's CUDA Toolkit? The current version appears to look for 9.0 specifically and will not work with anything newer. For example, I installed the latest Toolkit, 9.2, and added it to my PATH, but tensorflow-gpu does not work with it and complains that it is looking for 9.0.

I can understand major version updates not being supported, but a minor release?

That's a good question. According to NVidia's website,

The CUDA driver is backward compatible, meaning that applications compiled against a particular version of the CUDA will continue to work on subsequent (later) driver releases.

So technically, it should not be a problem to support later iterations of a CUDA driver. And in practice, you will find working unofficial pre-built binaries for later versions of CUDA and cuDNN on the net [1], [2]. Even easier to install, the tensorflow-gpu package from conda currently comes bundled with CUDA 9.2.

When asked about this, a dev answered,

The answer to why is driver issues in the ones required by 9.1, not many new features we need in cuda 9.1, and a few more minor issues.

So the reason looks rather vague. He might mean that CUDA 9.1 (and 9.2) require graphics card drivers that are a bit too recent to be convenient, but that is an uneducated guess.

If NVidia is right about binary compatibility, you could try simply renaming or linking your CUDA 9.2 libraries as CUDA 9.0 libraries, and it should work. But I would save all my work before attempting this... and the fact that people go as far as recompiling tensorflow to support later CUDA versions may be a hint about how this could end.
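For what it's worth, the linking idea can be sketched in Python as below. This is only an illustration under assumptions: the helper name alias_library and the "compat" directory are made up for the example, the demo runs on an empty placeholder file rather than a real CUDA install, and nothing here guarantees that a 9.2 library is actually ABI-compatible with what a 9.0 build expects.

```python
import os
import tempfile
from pathlib import Path


def alias_library(real_lib, alias_dir, alias_name):
    """Create alias_dir/alias_name as a symlink pointing at real_lib.

    Hypothetical helper: it only creates the link. Whether the program
    works with the aliased library is a separate (and doubtful) matter.
    """
    real_lib = Path(real_lib).resolve()
    if not real_lib.is_file():
        raise FileNotFoundError(real_lib)
    alias_dir = Path(alias_dir)
    alias_dir.mkdir(parents=True, exist_ok=True)
    alias = alias_dir / alias_name
    # Replace a stale alias left over from an earlier attempt.
    if alias.is_symlink() or alias.exists():
        alias.unlink()
    alias.symlink_to(real_lib)
    return alias


# Demonstration on a placeholder file, so no real CUDA install is touched.
with tempfile.TemporaryDirectory() as tmp:
    fake_lib = Path(tmp) / "libcudart.so.9.2"
    fake_lib.write_bytes(b"")
    demo_alias = alias_library(fake_lib, Path(tmp) / "compat", "libcudart.so.9.0")
    demo_target = os.path.basename(os.readlink(demo_alias))
    print(demo_alias.name, "->", demo_target)
```

Keeping the aliases in a separate directory (instead of renaming files inside the CUDA install) makes the experiment easy to undo. The directory would still have to be on LD_LIBRARY_PATH before Python starts, since the dynamic loader reads that variable at process start-up.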

When you download TF, you download a pre-built binary. During the build, TF is linked against a specific version of CUDA, so you cannot use it with a different CUDA version.

If you want to work with a newer (or sometimes older) version of CUDA, you will need to install TF from source (check how here). Or, if you really don't want to build it yourself, check these repos; other people publish TF binaries for specific configurations. A few examples:

For your convenience, here are the CUDA + cuDNN versions required by each prebuilt Tensorflow version:

(I only list the TF versions that I have worked with; older TF versions may use older versions of CUDA as well.)

  • Before TF v1.5: CUDA 8.0 and cuDNN 6.
  • From TF v1.5 onward: prebuilt binaries are built against CUDA 9 and cuDNN 7.
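The list above can be turned into a small lookup. This is a minimal sketch that encodes only the two cases listed here; the function name required_cuda is made up for the example, and the "9.0" value assumes the prebuilt binaries want exactly the 9.0 runtime, as reported elsewhere in this thread. Releases outside the range discussed here are not covered.

```python
def required_cuda(tf_version):
    """Return (cuda, cudnn) expected by the official prebuilt TF binary.

    Covers only the versions discussed in this thread: anything before
    1.5 maps to CUDA 8.0 / cuDNN 6, and 1.5 or later maps to
    CUDA 9.0 / cuDNN 7.
    """
    # Compare numerically: as strings, "1.10" would sort before "1.5".
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    if (major, minor) < (1, 5):
        return ("8.0", "6")
    return ("9.0", "7")


for version in ("1.4.1", "1.5.0", "1.10.0"):
    cuda, cudnn = required_cuda(version)
    print(f"TF {version}: CUDA {cuda}, cuDNN {cudnn}")
```

The tuple comparison is the one detail worth noting: version strings have to be compared as numbers, otherwise 1.10 would be treated as older than 1.5.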

The issue is not with the NVIDIA drivers but with Tensorflow itself. I spent an hour trying to make it work, and finally realized that if you download the pre-built binary from googleapi.com, it is hard-coded to load libcudart.so.9.0! If you have both CUDA 9.0 and 9.2 installed, tensorflow will work (but it is actually loading the dynamic libraries from 9.0). (BTW, I installed TF using anaconda.)
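A quick way to see which CUDA runtime versions the dynamic loader can actually find is to try opening each versioned library name with ctypes. This is a rough sketch for Linux, not anything Tensorflow provides: the function name probe_cudart and the list of versions are just assumptions for the example.

```python
import ctypes


def probe_cudart(versions=("9.0", "9.1", "9.2")):
    """Map each libcudart soname to True if the dynamic loader can open it."""
    found = {}
    for version in versions:
        soname = f"libcudart.so.{version}"
        try:
            # CDLL asks the loader to resolve the name the same way it would
            # for any other process (LD_LIBRARY_PATH, ldconfig cache, ...).
            ctypes.CDLL(soname)
            found[soname] = True
        except OSError:
            found[soname] = False
    return found


if __name__ == "__main__":
    for soname, ok in probe_cudart().items():
        print(f"{soname}: {'loadable' if ok else 'not found'}")
```

If libcudart.so.9.0 shows up as not found while libcudart.so.9.2 is loadable, that matches the situation described above: the 9.2 install is fine, but the prebuilt binary asks for the 9.0 name specifically.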

A cleaner approach is to build TF from source. It's not too complicated.


 