CUDA code compiles on Linux but not on Windows (Visual Studio 2012)

I'm developing a program that uses CUDA Toolkit version 10.1, and I'm using Visual Studio 2012. I work on Windows but share the code with a Linux user. All the code works fine on both platforms, except for a few lines that compile on Linux but not on Windows, so I have to change those lines every time. I would like to avoid this. Since the code compiles fine on Linux, I assume the reason it fails on Windows is not the code itself but some Visual Studio setting or similar. Can you help me? The lines in question are:

int n_devices = 0;
cudaGetDeviceCount(&n_devices);
cudaDeviceProp props[n_devices];

On the last line I get the error:

error: expression must have a constant value

I can fix this error by defining const int n_devices = 1; and commenting out the call cudaGetDeviceCount(&n_devices);. It works because I already know the number of devices, but it is clearly a worse solution than the original code.

The other problem is that I have a utils.cuh file that defines two const values:

const float PI = 3.141592654f;
const float EPS = 1e-3f;

I use these two values in the utils.cu file, and at compile time I get the errors:

error: "PI" is undefined in device code

error: "EPS" is undefined in device code

I can fix this by declaring the two values this way instead:

#define PI 3.141592654f
#define EPS 1e-3f

So even though I can work around both problems, I would really like to keep the code in its original form (since it works on Linux). Could this be a problem related to the compiler version? I really don't know what the reason could be.

You won't be able to fix either of these problems just by changing compiler versions or anything like that.

The first issue is described here and here. It has nothing to do with CUDA, except insofar as CUDA makes use of the host compiler. The code you have shown uses a VLA (variable length array), which is part of the C99 standard but not part of any C++ standard. CUDA is primarily implemented based on C++ and uses the C++ host compiler to compile host code, which is what you have shown. On Windows it uses the Microsoft compiler for that. So the Microsoft compiler is correct to disallow the VLA, and there is no way to avoid this AFAIK. Your code works on Linux because there nvcc uses the g++ host compiler, which allows (in a non-standard-compliant way) the use of a VLA in C++ host code.

I don't know of any method to achieve cross-platform compatibility here that doesn't involve some change to your code. But a small amount of (C or) C++ programming skill can provide a solution that should work on both Linux and Windows:

int n_devices = 0;
cudaGetDeviceCount(&n_devices);
cudaDeviceProp *props = new cudaDeviceProp[n_devices];

(If you wanted to use a C-compliant method, you could use malloc in a similar fashion. Either way, remember to release the array with delete[] or free when you are done with it.)
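As a variation on the new[] approach above, a std::vector sized at run time avoids both the VLA and the manual delete[]. The following is only a sketch under stated assumptions: query_device_props is a hypothetical helper name, and the #else branch contains minimal stand-ins (not the real CUDA runtime) so that the snippet also builds with a plain C++ compiler when nvcc is not available.

```cpp
#include <cassert>
#include <vector>

#ifdef __CUDACC__
#include <cuda_runtime.h>   // real CUDA runtime API when compiled by nvcc as a .cu file
#else
// Stand-ins (assumption): they mimic only the parts of the CUDA runtime API
// used below, so the sketch can be built without the CUDA toolkit.
enum cudaError_t { cudaSuccess = 0 };
struct cudaDeviceProp { char name[256]; };
inline cudaError_t cudaGetDeviceCount(int* count) { *count = 2; return cudaSuccess; }
inline cudaError_t cudaGetDeviceProperties(cudaDeviceProp* prop, int device) {
    prop->name[0] = static_cast<char>('A' + device);  // fake one-letter device name
    prop->name[1] = '\0';
    return cudaSuccess;
}
#endif

// Hypothetical helper: returns one cudaDeviceProp per visible device.
// The vector is sized at run time, which is standard C++ (no VLA needed).
inline std::vector<cudaDeviceProp> query_device_props() {
    int n_devices = 0;
    if (cudaGetDeviceCount(&n_devices) != cudaSuccess) {
        n_devices = 0;  // treat a failed query as "no devices"
    }
    assert(n_devices >= 0);

    std::vector<cudaDeviceProp> props(n_devices);
    for (int i = 0; i < n_devices; ++i) {
        cudaGetDeviceProperties(&props[i], i);
    }
    return props;  // storage is released automatically when the vector is destroyed
}
```

A caller would write something like std::vector<cudaDeviceProp> props = query_device_props(); and then index props[i] exactly as with the array in the question. Since std::vector is part of standard C++, this should compile the same way under g++ and the Microsoft compiler.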

The second issue is a limitation of CUDA; it is documented here.

Here too, I don't know of any cross-platform way to address this that involves no changes to your code.

You have already identified one possible workaround that works in a cross-platform way on both Linux and Windows:

#define PI 3.141592654f
#define EPS 1e-3f
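If typed constants are preferred over macros, another option (not from the original code, only a sketch) is to wrap each value in a small function that is callable from both host and device code. The names UTILS_HD, pi, eps, deg_to_rad and nearly_equal are hypothetical. The macro guard makes the same header usable from plain C++ files that are not compiled by nvcc.

```cpp
#include <cassert>

// When compiled by nvcc as CUDA source, mark the functions as callable from
// both host and device code; otherwise the annotation expands to nothing.
#ifdef __CUDACC__
#define UTILS_HD __host__ __device__
#else
#define UTILS_HD
#endif

// Typed replacements for the PI and EPS constants from utils.cuh.
UTILS_HD inline float pi()  { return 3.141592654f; }
UTILS_HD inline float eps() { return 1e-3f; }

// Hypothetical helpers showing how the constants would be used.
UTILS_HD inline float deg_to_rad(float deg) {
    return deg * pi() / 180.0f;
}

UTILS_HD inline bool nearly_equal(float a, float b) {
    float diff = (a > b) ? (a - b) : (b - a);  // absolute difference
    return diff < eps();
}
```

Inside a kernel or a __device__ function, pi() and eps() would then replace PI and EPS. The trade-off is a slightly noisier call syntax in exchange for type checking and no preprocessor substitution.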

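Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.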
