
CUDA: Copying multiple arrays of structs with cudaMemcpy

Suppose a struct X holds some primitives and an array of Y structs:

typedef struct 
{ 
   int a;    
   Y** y;
} X;

An instance X1 of X is initialized on the host and then copied with cudaMemcpy to an instance X2 of X in device memory.

This works fine for all the primitives in X (such as int a), but cudaMemcpy seems to flatten any double pointer into a single pointer, which causes out-of-bounds errors wherever the struct arrays in X (such as y) are accessed.

In this case, am I supposed to use another memcpy function, such as cudaMemcpy2D or cudaMemcpyArrayToArray?

Suggestions are much appreciated. Thanks!

Edit

The natural approach (as in "that's what I'd do if it were just C") to copying an array of structures would be to cudaMalloc the array and then cudaMalloc and initialize each element separately, e.g.:

X** h_x;
X** d_x;
int num_x;

cudaMalloc((void**)&d_x, sizeof(X)*num_x);

int i=0;
for(;i<num_x;i++)
{
    cudaMalloc((void**)d_x[i], sizeof(X));
    cudaMemcpy(&d_x[i], &h_x[i], sizeof(X), cudaMemcpyHostToDevice);
}

However, the cudaMalloc inside the for loop crashes. I confess I'm not yet comfortable with how pointers are used in CUDA functions, so perhaps I got the cudaMalloc and cudaMemcpy parameters wrong?

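cudaMemcpy copies one contiguous block of bytes from the host to the device, and cudaMemcpy2D and cudaMemcpyArrayToArray likewise copy raw memory regions (pitched rows and CUDA arrays, respectively). None of them follows pointers stored inside the data, so a pointer member such as y arrives on the device still holding a host address.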
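As for the crash in the question's loop: d_x holds a device address, so evaluating d_x[i] on the host dereferences device memory. If you do want one allocation per element, a possible fix is to keep the device pointers in a host-side table and copy that table over afterwards. The following is only a sketch under assumptions: X is reduced to its int a field (the Y** member is left out), and h_ptrs, sum_a and pointer_table_demo are hypothetical names, not from the question.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Simplified stand-in for the question's X (assumption: no nested pointers).
struct X { int a; };

// Reads every element through the device-side pointer table and sums field a.
__global__ void sum_a(X** xs, int n, int* out)
{
    int s = 0;
    for (int i = 0; i < n; ++i) s += xs[i]->a;
    *out = s;
}

// Returns the sum computed on the device, or -1 if any CUDA call fails.
int pointer_table_demo(int num_x)
{
    X*  h_x    = (X*)malloc(sizeof(X) * num_x);    // host data
    X** h_ptrs = (X**)malloc(sizeof(X*) * num_x);  // host copy of the device pointers
    X** d_x    = nullptr;                          // device-side pointer table
    int* d_out = nullptr;
    int h_out  = -1;
    for (int i = 0; i < num_x; ++i) { h_x[i].a = i + 1; h_ptrs[i] = nullptr; }

    // The table stores pointers, so its size is sizeof(X*) * num_x.
    bool ok = cudaMalloc((void**)&d_x, sizeof(X*) * num_x) == cudaSuccess
           && cudaMalloc((void**)&d_out, sizeof(int)) == cudaSuccess;

    for (int i = 0; ok && i < num_x; ++i) {
        // Allocate into a slot that lives in host memory; d_x is never indexed on the host.
        ok = cudaMalloc((void**)&h_ptrs[i], sizeof(X)) == cudaSuccess
          && cudaMemcpy(h_ptrs[i], &h_x[i], sizeof(X), cudaMemcpyHostToDevice) == cudaSuccess;
    }

    // Publish the table of device pointers to the device in one copy.
    ok = ok && cudaMemcpy(d_x, h_ptrs, sizeof(X*) * num_x, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        sum_a<<<1, 1>>>(d_x, num_x, d_out);
        ok = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts null pointers, so partial failures are cleaned up too.
    for (int i = 0; i < num_x; ++i) cudaFree(h_ptrs[i]);
    cudaFree(d_x);
    cudaFree(d_out);
    free(h_ptrs);
    free(h_x);
    return ok ? h_out : -1;
}
```

On a machine with a working CUDA device, calling pointer_table_demo(4) from main should return 10 (1 + 2 + 3 + 4). This costs one cudaMalloc and one cudaMemcpy per element, which is why a single contiguous buffer is usually preferable.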

You have to pack all your data into an intermediate contiguous buffer on the host and send that buffer to the device.
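One way the packing could look, as a rough sketch rather than a definitive implementation: replace the Y** member with an offset and a count into a single contiguous Y array, so no host pointer ever crosses to the device. The question does not define Y, so the one-field Y below is an assumption, and XFlat, y_offset, y_count, sum_y and flat_buffer_demo are hypothetical names.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Assumption: the question does not define Y; a single int field stands in for it.
struct Y { int v; };

// Hypothetical flattened form of X: an offset and a count replace the Y** member.
struct XFlat { int a; int y_offset; int y_count; };

// Each thread sums the Y values that belong to one X.
__global__ void sum_y(const XFlat* xs, const Y* ys, int n, int* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int s = 0;
    for (int k = 0; k < xs[i].y_count; ++k) s += ys[xs[i].y_offset + k].v;
    out[i] = s;
}

// Packs num_x structs with ys_per_x Y elements each (both assumed positive),
// copies them to the device, and returns the total of all per-X sums, or -1 on error.
int flat_buffer_demo(int num_x, int ys_per_x)
{
    // Pack on the host: one contiguous XFlat array and one contiguous Y array.
    std::vector<XFlat> h_xs(num_x);
    std::vector<Y> h_ys;
    for (int i = 0; i < num_x; ++i) {
        h_xs[i].a = i;
        h_xs[i].y_offset = (int)h_ys.size();
        h_xs[i].y_count = ys_per_x;
        for (int k = 0; k < ys_per_x; ++k) h_ys.push_back(Y{1});
    }

    size_t xs_bytes  = sizeof(XFlat) * h_xs.size();
    size_t ys_bytes  = sizeof(Y) * h_ys.size();
    size_t out_bytes = sizeof(int) * num_x;
    std::vector<int> h_out(num_x, 0);

    XFlat* d_xs = nullptr;
    Y* d_ys = nullptr;
    int* d_out = nullptr;
    bool ok = cudaMalloc((void**)&d_xs, xs_bytes) == cudaSuccess
           && cudaMalloc((void**)&d_ys, ys_bytes) == cudaSuccess
           && cudaMalloc((void**)&d_out, out_bytes) == cudaSuccess;

    // One cudaMemcpy per contiguous buffer; the offsets remain valid on the device.
    ok = ok && cudaMemcpy(d_xs, h_xs.data(), xs_bytes, cudaMemcpyHostToDevice) == cudaSuccess
            && cudaMemcpy(d_ys, h_ys.data(), ys_bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 128;
        int blocks = (num_x + threads - 1) / threads;
        sum_y<<<blocks, threads>>>(d_xs, d_ys, num_x, d_out);
        ok = cudaMemcpy(h_out.data(), d_out, out_bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_xs);
    cudaFree(d_ys);
    cudaFree(d_out);
    if (!ok) return -1;

    int total = 0;
    for (int s : h_out) total += s;
    return total;
}
```

With a working CUDA device, flat_buffer_demo(3, 4) should return 12, since each of the 12 packed Y elements carries the value 1. Storing offsets instead of pointers means the same buffers can be copied in either direction without any pointer fix-up.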
