CUDA: Copy inherited class object to device
I have a Parent class and an inherited Child class:
class Parent {};
class Child : public Parent {};
There are a couple of child classes that inherit from Parent, but for simplicity I have included only one. These inherited classes are necessary for the project I am working on. I also have an object of another class, which I wish to copy onto the device:
class CopyClass {
public:
    Parent ** par;
};
Note that Parent ** par; is there because I need a list of Child objects, but which child type it will hold (and the length of the list) is unknown at compile time. Here is my attempt at copying a CopyClass object onto the device:
int length = 5;

//Instantiate object on the CPU
CopyClass cpuClass;
cpuClass.par = new Parent*[length];
for(int i = 0; i < length; ++i) cpuClass.par[i] = new Child;

//Copy object onto GPU
CopyClass * gpuClass;
cudaMalloc(&gpuClass,sizeof(CopyClass));
cudaMemcpy(gpuClass,&cpuClass,sizeof(CopyClass),cudaMemcpyHostToDevice);

//Copy dynamically allocated variables to GPU
Parent ** d_par;
d_par = new Parent*[length];
for(int i = 0; i < length; ++i) {
    cudaMalloc(&d_par[i],sizeof(Child));
    printf("\tCopying data\n");
    cudaMemcpy(d_par[i],cpuClass.par[i],sizeof(Child),cudaMemcpyHostToDevice);
}

//SIGSEGV returned during following operation
cudaMemcpy(gpuClass->par,d_par,length*sizeof(void*),cudaMemcpyHostToDevice);
I have seen several similar questions (here, here, here, here, and here), but either I couldn't understand the problem they were having, or it didn't seem to fit this particular issue.
I know that the segmentation fault I am getting is because gpuClass->par is on the device, and cudaMemcpy does not allow device pointers. However, I see no other way to "insert" the pointer into the gpuClass object.
The ways I can see to solve this are:
1) Flatten my data structure. However, I don't know how to do this with the inherited class functionality that I want.
2) Instantiate gpuClass originally on the GPU, which I don't know how to do, or
3) I have seen in one of the solutions that you can use cudaMemcpy to copy the address of your dynamically allocated list into an object, but once again, I don't know how to do that (specifically, copying a device pointer to the location of another device pointer).
Any help would be greatly appreciated.
In your first related link I give 5 steps for the object-based deep-copy sequence, but this case is complicated by the fact that you are doing a double-pointer version of the example given in that link. The complexity associated with a double-pointer deep copy is such that the usual recommendation is to avoid it (i.e. flatten).
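As one way to picture the "flatten" recommendation, here is a minimal host-side sketch in plain C++ (which nvcc also accepts). Everything in it is an assumption made for illustration and not part of the question: the names ChildKind, FlatItem, weight, total_weight and make_items are hypothetical, and the per-type behaviour is invented. The idea is that instead of a Parent ** pointing at separately allocated objects, all elements live in one contiguous array of pointer-free records, and a tag field stands in for "which child is this".

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical tag recording which child type an element represents.
enum class ChildKind : int { A = 0, B = 1 };

// One flat record per object: no pointers and no virtual functions,
// so the bytes mean the same thing wherever they are copied.
struct FlatItem {
    ChildKind kind;
    int my_id;
};
static_assert(std::is_trivially_copyable<FlatItem>::value,
              "FlatItem must be safe to copy byte-for-byte");

// Behaviour that would otherwise be a virtual function, selected by the tag.
// The two formulas are placeholders for illustration only.
inline int weight(const FlatItem &item) {
    switch (item.kind) {
        case ChildKind::A: return item.my_id;
        case ChildKind::B: return 10 * item.my_id;
    }
    return 0;
}

// Works on a raw pointer plus a count, the same shape a kernel would receive.
inline int total_weight(const FlatItem *items, std::size_t count) {
    assert(items != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += weight(items[i]);
    return total;
}

// Builds `length` items with ids 1..length, alternating between the two kinds.
inline std::vector<FlatItem> make_items(int length) {
    std::vector<FlatItem> items;
    for (int i = 0; i < length; ++i) {
        FlatItem item;
        item.kind = (i % 2 == 0) ? ChildKind::A : ChildKind::B;
        item.my_id = i + 1;
        items.push_back(item);
    }
    return items;
}
```

With a layout like this, the element array itself needs only one device allocation and one cudaMemcpy of count * sizeof(FlatItem) bytes, with no per-element pointer fixups. The cost is that type-specific behaviour is dispatched on the tag by hand instead of through inheritance.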
The first fix we need to make to your code is to properly handle the d_par array. You need to make a corresponding allocation on the device to hold the array associated with d_par. The array associated with d_par has storage for 5 object pointers. You've allocated host-side storage for it (with new), but nowhere are you doing a device-side allocation for it. (I'm not talking about the d_par pointer itself; I'm talking about what it points to, which is an array of 5 pointers.)
The second fix we need to make is to adjust the fixup of the par pointer itself (as opposed to what it points to) in the top-level device-side object. You've attempted to combine both of these into a single step, but that won't work.
Here's a modified version of your code that seems to work correctly with the above changes:
$ cat t29.cu
#include <stdio.h>

class Parent {public: int my_id;};
class Child : public Parent {};

class CopyClass {
public:
    Parent ** par;
};

const int length = 5;

__global__ void test_kernel(CopyClass *my_class){
    for (int i = 0; i < length; i++)
        printf("object: %d, id: %d\n", i, my_class->par[i]->my_id);
}

int main(){
    //Instantiate object on the CPU
    CopyClass cpuClass;
    cpuClass.par = new Parent*[length];
    for(int i = 0; i < length; ++i) {
        cpuClass.par[i] = new Child;
        cpuClass.par[i]->my_id = i+1;} // so we can prove that things are working

    //Allocate storage for object onto GPU and copy host object to device
    CopyClass * gpuClass;
    cudaMalloc(&gpuClass,sizeof(CopyClass));
    cudaMemcpy(gpuClass,&cpuClass,sizeof(CopyClass),cudaMemcpyHostToDevice);

    //Copy dynamically allocated child objects to GPU
    Parent ** d_par;
    d_par = new Parent*[length];
    for(int i = 0; i < length; ++i) {
        cudaMalloc(&d_par[i],sizeof(Child));
        printf("\tCopying data\n");
        cudaMemcpy(d_par[i],cpuClass.par[i],sizeof(Child),cudaMemcpyHostToDevice);
    }

    //Copy the d_par array itself to the device
    Parent ** td_par;
    cudaMalloc(&td_par, length * sizeof(Parent *));
    cudaMemcpy(td_par, d_par, length * sizeof(Parent *), cudaMemcpyHostToDevice);

    //copy *pointer value* of td_par to appropriate location in top level object
    cudaMemcpy(&(gpuClass->par),&(td_par),sizeof(Parent **),cudaMemcpyHostToDevice);

    test_kernel<<<1,1>>>(gpuClass);
    cudaDeviceSynchronize();
    return 0;
}
$ nvcc -arch=sm_61 -o t29 t29.cu
$ cuda-memcheck ./t29
========= CUDA-MEMCHECK
Copying data
Copying data
Copying data
Copying data
Copying data
object: 0, id: 1
object: 1, id: 2
object: 2, id: 3
object: 3, id: 4
object: 4, id: 5
========= ERROR SUMMARY: 0 errors
$