
Effect of Visual Studio compiler settings on performance of CUDA kernels

I see roughly a 3-4x difference in the computation time of the same CUDA kernel when it is compiled on two different machines. Both builds are run on the same machine and the same GPU device, so the obvious explanation is a difference in compiler settings. Although there is no single perfect setting and the tuning has to be customized for each kernel, is there any clear guideline that helps in choosing the right settings? I use Visual Studio 2010. Thank you.

  1. Compile in release mode, not debug mode, if you want the fastest performance. The -G switch passed to the nvcc compiler (device debug) will usually have a negative effect on GPU code performance.
  2. It's generally recommended to select the right architecture for the GPU you are compiling for. For example, if you have a cc 2.1 (compute capability 2.1) GPU, make sure that setting (sm_21, in the GPU code settings) is being passed to the compiler. There are some counterexamples (e.g. code compiled for cc 2.0 sometimes seems to run faster), but as a general recommendation this is the best choice.
  3. Use the latest version of CUDA (the compiler). This is especially important when using the GPU libraries (CUFFT, CUBLAS, etc.). (Yes, this is not really a compiler setting.)
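
To check which settings actually matter for a given kernel, it can help to time the same kernel built with each set of flags. The following is only a minimal sketch under stated assumptions: the `saxpy` kernel, the file name `saxpy.cu`, and the problem size are hypothetical stand-ins for the real kernel, and the build lines in the leading comment assume an older toolkit that still accepts `sm_21`.

```cuda
// Two builds to compare (assumed command lines; adjust the architecture to your GPU):
//   release: nvcc -O3 -gencode arch=compute_20,code=sm_21 -o saxpy_release saxpy.cu
//   debug:   nvcc -g -G -gencode arch=compute_20,code=sm_21 -o saxpy_debug saxpy.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; replace it with the kernel under test.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i] + y[i];
    }
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *x = NULL;
    float *y = NULL;
    cudaMalloc(&x, bytes);
    cudaMalloc(&y, bytes);
    cudaMemset(x, 0, bytes);
    cudaMemset(y, 0, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Warm-up launch so that one-time initialization cost is not timed.
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time a single launch with CUDA events recorded on the default stream.
    cudaEventRecord(start);
    saxpy<<<blocks, threads>>>(n, 2.0f, x, y);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Running both executables on the same machine and GPU and comparing the printed times shows how much of the gap comes from the build flags alone. In Visual Studio, the full nvcc command line that a project really uses appears in the build output, which makes it possible to compare the two machines' settings flag by flag.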
