CUDA和课程

Question

I've searched all over for some insight on how exactly to use classes with CUDA, and while there is a general consensus that it can be done and apparently is being done by people, I've had a hard time finding out how to actually do it. 我一直在搜索有关如何使用CUDA的类的一些见解，虽然人们普遍认为它可以完成并且显然是由人完成，但我很难找到实际的方法。做到这一点。

I have a class which implements a basic bitset with operator overloading and the like. 我有一个类，它通过运算符重载等实现基本的bitset。 I need to be able to instantiate objects of this class on both the host and the device, copy between the two, etc. Do I define this class in a .cu? 我需要能够在主机和设备上实例化此类的对象，在两者之间进行复制等。我是否在.cu中定义了这个类？ If so, how do I use it in my host-side C++ code? 如果是这样，我如何在我的主机端C ++代码中使用它？ The functions of the class do not need to access special CUDA variables like threadId; 该类的函数不需要访问特殊的CUDA变量，如threadId; it just needs to be able to be used host and device side. 它只需要能够用于主机和设备端。

Thanks for any help, and if I'm approaching this in completely the wrong way, I'd love to hear alternatives. 感谢您的帮助，如果我以完全错误的方式接近这一点，我很想听听替代方案。

Answer 1

Define the class in a header that you #include, just like in C++. 在#include的头文件中定义类，就像在C ++中一样。

Any method that must be called from device code should be defined with both __device__ and __host__ declspecs, including the constructor and destructor if you plan to use new / delete on the device (note new / delete require CUDA 4.0 and a compute capability 2.0 or higher GPU). 必须使用__device__和__host__ declspecs定义必须从设备代码调用的任何方法，包括构造函数和析构函数（如果您计划在设备上使用new / delete （注意new / delete需要CUDA 4.0和计算能力2.0或更高版本） GPU）。

You probably want to define a macro like 您可能想要定义一个宏

#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

Then use this macro on your member functions 然后在您的成员函数上使用此宏

class Foo {
public:
    CUDA_CALLABLE_MEMBER Foo() {}
    CUDA_CALLABLE_MEMBER ~Foo() {}
    CUDA_CALLABLE_MEMBER void aMethod() {}
};

The reason for this is that only the CUDA compiler knows __device__ and __host__ -- your host C++ compiler will raise an error. 原因是只有CUDA编译器知道__device__和__host__ - 您的主机C ++编译器将引发错误。

Edit: Note __CUDACC__ is defined by NVCC when it is compiling CUDA files . 编辑：注意__CUDACC__由NVCC在编译CUDA文件时定义。 This can be either when compiling a .cu file with NVCC or when compiling any file with the command line option -x cu . 这可以在使用NVCC编译.cu文件时，也可以在使用命令行选项-x cu编译任何文件时使用。

Answer 2

Another good resource for this question are some of the code examples that come with the CUDA toolkit. 这个问题的另一个好资源是CUDA工具包附带的一些代码示例。 Within these code samples you can find examples of just about any thing you could imagine. 在这些代码示例中，您可以找到几乎可以想象的任何事物的示例。 One that is pertinent to your question is the quadtree.cu file. 与您的问题相关的是quadtree.cu文件。 Best of luck. 祝你好运。

CUDA和课程

问题描述

2 个解决方案

解决方案1
54 已采纳 2011-08-08 06:58:31

解决方案2
3 2013-11-13 22:21:40

CUDA和课程

问题描述

2 个解决方案

解决方案1 54 已采纳 2011-08-08 06:58:31

解决方案2 3 2013-11-13 22:21:40

解决方案1
54 已采纳 2011-08-08 06:58:31

解决方案2
3 2013-11-13 22:21:40