How to force cubin file generation for a higher compute version

In the samples provided with CUDA 6.0, I'm running the following compile command, which fails with this error output:

foo@foo:/usr/local/cuda-6.0/samples/0_Simple/cdpSimpleQuicksort$ nvcc --cubin -I../../common/inc cdpSimpleQuicksort.cu
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
cdpSimpleQuicksort.cu(105): error: calling a __global__ function("cdp_simple_quicksort") from a __global__ function("cdp_simple_quicksort") is only allowed on the compute_35 architecture or above

cdpSimpleQuicksort.cu(114): error: calling a __global__ function("cdp_simple_quicksort") from a __global__ function("cdp_simple_quicksort") is only allowed on the compute_35 architecture or above

2 errors detected in the compilation of "/tmp/tmpxft_0000241a_00000000-6_cdpSimpleQuicksort.cpp1.ii".

I then changed the command to the following, which fails with a different error:

foo@foo:/usr/local/cuda-6.0/samples/0_Simple/cdpSimpleQuicksort$ nvcc --cubin -I../../common/inc -gencode arch=compute_35,code=sm_35 cdpSimpleQuicksort.cu
cdpSimpleQuicksort.cu(105): error: kernel launch from __device__ or __global__ functions requires separate compilation mode

cdpSimpleQuicksort.cu(114): error: kernel launch from __device__ or __global__ functions requires separate compilation mode

2 errors detected in the compilation of "/tmp/tmpxft_000024f3_00000000-6_cdpSimpleQuicksort.cpp1.ii".

Does this have anything to do with the fact that the machine I'm on only has compute capability 2.1, so the build tools are blocking me? What's the resolution? I'm not finding anything in the documentation that clearly addresses this error.

I looked at this question, and that one, but a link to documentation is simply not helping. I need to know how to modify the compile command.

Look at the makefile that comes with the cdpSimpleQuicksort project. It shows the additional switches needed to compile it, because the sample uses CUDA dynamic parallelism (which is essentially what the second set of errors is about). Go back and study that makefile, and see if you can figure out how to combine the compile commands there with --cubin.

The Reader's Digest version is that this should compile without error:

nvcc --cubin -rdc=true -I../../common/inc -arch=sm_35 cdpSimpleQuicksort.cu

Having said all that, you should be able to compile for whatever kind of target you want, regardless of the GPU installed in the build machine. You won't be able to run CDP code on a cc 2.1 device, though, since dynamic parallelism requires compute capability 3.5 or above.
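To make the two error messages concrete, here is a minimal sketch of a kernel that launches another kernel, which is the pattern cdpSimpleQuicksort.cu uses at lines 105 and 114. The kernel names `parent` and `child` and the file name `cdp_sketch.cu` are made up for illustration and are not part of the CUDA sample. The `-lcudadevrt` switch in the comment is what I'd expect to need when linking a full executable rather than producing a cubin; check it against the sample's makefile.

```cuda
// cdp_sketch.cu (hypothetical file name): minimal CUDA dynamic parallelism example.
//
// Assumed build line for an executable:
//   nvcc -arch=sm_35 -rdc=true cdp_sketch.cu -o cdp_sketch -lcudadevrt
// For a cubin only, as in the question:
//   nvcc --cubin -rdc=true -arch=sm_35 cdp_sketch.cu
#include <cstdio>

// Child kernel, launched from the device.
__global__ void child(int parent_block)
{
    printf("child thread %d launched by parent block %d\n",
           threadIdx.x, parent_block);
}

// Parent kernel. The launch below is a __global__ function called from a
// __global__ function: without -arch=sm_35 (or higher) you get the first
// error, and without -rdc=true you get the "separate compilation mode" error.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
    }
}

int main()
{
    parent<<<2, 32>>>();

    // Wait for the parent grid and all child grids to finish.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        // On a cc 2.1 device this is where the failure shows up at run time.
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

This compiles on a machine with any GPU (or none), since nvcc targets the architecture you name on the command line. Only running the resulting binary needs a cc 3.5 or newer device.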

See also the CDP documentation.
