简体   繁体   中英

CUDA Copying multiple arrays of structs with cudaMemcpy

Suppose a struct X with some primitives and an array of Y structs:

typedef struct 
{ 
   int a;    
   Y** y;
} X;

An instance X1 of X is initialized at the host, and then copied to an instance X2 of X, on the device memory, through cudaMemcpy.

This works fine for all the primitives in X (such as int a), but cudaMemcpy seems to flatten any double pointer into a single pointer, thus causing out of bounds exceptions wherever there's an access to the struct arrays in X (such as y).

In this case am I supposed to use another memcpy function, such as cudaMemcpy2D or cudaMemcpyArrayToArray?

Suggestions are much appreciated. Thanks!

edit

The natural approach (as in "that's what I'd do if it were just C) towards copying an array of structures would be to cudaMalloc the array and then cudaMalloc and initialize each element separately, eg:

X** h_x;
X** d_x;
int num_x;

cudaMalloc((void**)&d_x, sizeof(X)*num_x);

int i=0;
for(;i<num_x;i++)
{
    cudaMalloc((void**)d_x[i], sizeof(X));
    cudaMemcpy(&d_x[i], &h_x[i], sizeof(X), cudaMemcpyHostToDevice);
}

However, the for's cudaMalloc generates a crash. I confess I'm not yet comfortable with the usage of pointers in Cuda functions, so perhaps I screwed up with the cudaMalloc and cudaMemcpy parameters?

cudaMemcpy , cudaMemcpy2D and cudaMemcpyArrayToArray all copy from a contiguous memory region in the host to a contiguous memory region on the device.

You have to copy all your data in an intermediary contiguous buffer you send to the device.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM