
CUDA error: “dynamic initialization is not supported for __device__, __constant__ and __shared__ variables”

I'm trying to statically initialize read-only std::map variables in GPU memory as follows:

// EXAMPLE 1:
using namespace std;

// first attempt: __device__ extern const
__device__ extern const map<char, const char*> BYTES_TO_WORDS = {
{0xB0, "zero"}, {0xB1, "one"}, {0xB2, "two"}, {0xB3, "three"}};

// second attempt: __constant__ static
enum class Color{RED, GREEN, BLUE};
enum class Device{PC, TABLET, PHONE};

__constant__ static map<Color, Device> COLORS_TO_THINGS = {
{Color::RED,Device::PC},{Color::GREEN,Device::TABLET},{Color::BLUE,Device::PHONE}};

But I'm getting the following error:

dynamic initialization is not supported for __device__, __constant__ and __shared__ variables

I'm confused because I don't get this error when I try something like this:

// EXAMPLE 2:
__device__ extern int PLAIN_ARRAY[] = {1, 2, 3, 4, 5};

I just want to be able to create and initialize a read-only std::map and access it from both CPU and GPU code. I would appreciate it if you could tell me how to do it properly.

EDIT: It was pointed out to me that the standard library is not supported in device code. But the error I'm getting seems to suggest that it is rather a memory management issue.

Initializing a C++ object such as a std::map involves calling its constructor at runtime. The C++11 syntax you are using to initialize your maps is a form of list initialization, which calls the std::initializer_list overload of std::map's constructor. Your example with PLAIN_ARRAY does not call any constructors: it is a form of aggregate initialization that only initializes some ints by value, and initializing an int does not require a constructor call.

In CUDA, it is not possible to use any kind of dynamic initialization with global variables stored on the GPU, such as __device__ and __constant__ variables. This means the initial value of the object must be known at compile time, rather than produced at runtime by a constructor call.
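The difference between the two kinds of initialization can be sketched as follows. This is a minimal illustration, not code from the question: the names KeyValue, TABLE, Doubled and DOUBLED are hypothetical, and the empty __device__ fallback macro is only there so the snippet also compiles with a plain host C++ compiler.

```cpp
// Fallback so this sketch also compiles with a plain host C++ compiler;
// under nvcc the real CUDA specifier is used.
#ifndef __CUDACC__
#define __device__
#endif

// Hypothetical aggregate: no user-declared constructor.
struct KeyValue {
    unsigned char key;
    int value;
};

// Accepted: every initializer is a constant expression, so no code
// has to run at startup to produce the initial value.
__device__ const KeyValue TABLE[] = {{0xB0, 0}, {0xB1, 1}, {0xB2, 2}};

// Hypothetical class whose constructor has to execute code.
struct Doubled {
    int n;
    explicit Doubled(int start) : n(start * 2) {}
};

// Expected to be rejected by nvcc with the "dynamic initialization"
// error, because a non-empty constructor would have to run at startup:
// __device__ Doubled DOUBLED(21);
```

std::map falls into the second category, since its initializer_list constructor allocates nodes at runtime.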

Another issue is that even in contexts where you can call constructors in device code, you wouldn't be able to call the constructor of std::map. Being part of the C++ standard library, it has no __device__ constructor, nor does it have any other __device__ member functions, so it can only be used from host code. The CUDA runtime does not define any kind of device functionality for C++ STL classes. Even if you manage to cudaMemcpy() a std::map from host memory to GPU memory, you won't be able to use the object. Firstly, all its member functions are __host__ functions with no __device__ counterparts. Secondly, a std::map internally contains pointer member variables referring to dynamically allocated host memory, and those will not be valid memory addresses on the GPU.

An alternative would be to use plain arrays of structs instead of maps, for example:

__device__
const struct {
    unsigned char byte;
    const char word[10];
} BYTES_TO_WORDS[] = {
    {0xB0, "zero"},
    {0xB1, "one"},
    {0xB2, "two"},
    {0xB3, "three"}
};

However, unlike with std::map, you will have to implement the key lookup manually.
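A manual lookup could look roughly like the sketch below. It assumes the struct is given a name (ByteWord here, which is hypothetical, as are find_word and SAMPLE_TABLE) so that a pointer to the table can be passed around. The function is marked __host__ __device__ so the same linear search can be called from a kernel with the __device__ array, or from CPU code with a host array.

```cpp
// Fallback so this sketch also compiles with a plain host C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical named version of the struct used for the table.
struct ByteWord {
    unsigned char byte;
    char word[10];
};

// Linear search: returns the word stored for `key`, or nullptr if the
// key is not in the table. Fine for small tables; a table kept sorted
// by key could use binary search instead.
__host__ __device__ inline const char* find_word(const ByteWord* table,
                                                 int count,
                                                 unsigned char key) {
    for (int i = 0; i < count; ++i) {
        if (table[i].byte == key) {
            return table[i].word;
        }
    }
    return nullptr;
}

// Host-side sample table, used here only to exercise the lookup.
// Inside a kernel you would pass the __device__ array instead.
const ByteWord SAMPLE_TABLE[] = {
    {0xB0, "zero"},
    {0xB1, "one"},
    {0xB2, "two"},
    {0xB3, "three"}
};
```

For a handful of entries the linear scan is likely cheaper than anything more elaborate.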


I just want to be able to create and initialize a read-only std::map and access it from both CPU and GPU code.

Unfortunately, this is not trivial, since you can't define a variable as both __device__ and __host__. To access a __device__ variable from host code, you would have to use cudaMemcpyFromSymbol(), which is quite awkward compared to just accessing a variable normally. Therefore you may end up having to define your constants in host memory and then copy them to device memory:

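// members are non-const so the device array can be left uninitialized
struct byte_word {
    unsigned char byte;
    char word[10];
};

// host copy, readable from CPU code
const byte_word BYTES_TO_WORDS[] = {
    {0xB0, "zero"},
    // ...
};

// uninitialized device array of the same size
__device__
byte_word DEV_BYTES_TO_WORDS[sizeof BYTES_TO_WORDS / sizeof(byte_word)];

// call once at startup to populate `DEV_BYTES_TO_WORDS`
// from `BYTES_TO_WORDS`, before any kernel reads the table
cudaError_t upload_bytes_to_words() {
    return cudaMemcpyToSymbol(DEV_BYTES_TO_WORDS, BYTES_TO_WORDS,
                              sizeof BYTES_TO_WORDS);
}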

An alternative would be to use a preprocessor define to effectively copy and paste the same initializer across both arrays, rather than copying the data over at runtime. In any case, two separate arrays are required.
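The preprocessor approach might be sketched like this. The macro name BYTES_TO_WORDS_INIT is made up for illustration, and the empty __device__ fallback is only there so the snippet also builds with a plain host compiler.

```cpp
// Fallback so this sketch also compiles with a plain host C++ compiler.
#ifndef __CUDACC__
#define __device__
#endif

struct byte_word {
    unsigned char byte;
    char word[10];
};

// Hypothetical macro holding the single copy of the initializer.
#define BYTES_TO_WORDS_INIT { \
    {0xB0, "zero"},           \
    {0xB1, "one"},            \
    {0xB2, "two"},            \
    {0xB3, "three"}           \
}

// Host copy, readable from CPU code.
const byte_word BYTES_TO_WORDS[] = BYTES_TO_WORDS_INIT;

// Device copy, readable from kernels. Both arrays are initialized with
// constant expressions, so no runtime copy is needed.
__device__ const byte_word DEV_BYTES_TO_WORDS[] = BYTES_TO_WORDS_INIT;
```

The data is still stored twice, but the entries are written only once in the source, so the two arrays cannot drift apart.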


 