cudaMemcpy returns cudaErrorInvalidValue on copying vector<cv::Point3f>

cudaMemcpy returns cudaErrorInvalidValue when I copy a vector onto the device. I have tried passing "&input", "&input[0]", ... I always get the same error and don't understand why.

Can you copy a vector using cudaMemcpy, or do I need to copy the contents of that vector into a new array first?

void computeDepthChangeMap(unsigned char* depthChangeMap, size_t size, std::vector<cv::Point3f>* input, float dcf, int width, int height) {
    unsigned char* dev_depthChangeMap = 0;
    float* dev_dcf = 0;
    int* dev_wdt = 0;
    int arraySize = size;
    cv::Point3f* dev_input = 0;
    cudaError_t cudaStatus;

    cudaStatus = cudaSetDevice(0);
    cudaStatus = cudaMalloc((void**)&dev_depthChangeMap, size);
    cudaStatus = cudaMalloc((void**)&dev_input, size);
    cudaStatus = cudaMalloc((void**)&dev_dcf, sizeof(float));
    cudaStatus = cudaMalloc((void**)&dev_wdt, sizeof(int));

    cudaStatus = cudaMemcpy(dev_depthChangeMap, depthChangeMap, size, cudaMemcpyHostToDevice);
    cudaStatus = cudaMemcpy(dev_wdt, &width, sizeof(int), cudaMemcpyHostToDevice);
    cudaStatus = cudaMemcpy(dev_dcf, &dcf, sizeof(float), cudaMemcpyHostToDevice);
    cudaStatus = cudaMemcpy(dev_input, &input[0], sizeof(cv::Point3f)*size, cudaMemcpyHostToDevice);

    // cudaStatus returns cudaErrorInvalidValue >> PROBLEM HERE <<

    dim3 threadsPerBlock(8, 8); //init x, y
    dim3 numBlocks(width / threadsPerBlock.x, height / threadsPerBlock.y);

    addKernel <<<numBlocks, threadsPerBlock >>>(dev_depthChangeMap, dev_dcf, dev_input, dev_wdt);


    cudaStatus = cudaGetLastError();   
    cudaStatus = cudaDeviceSynchronize();
    cudaStatus = cudaMemcpy(depthChangeMap, dev_depthChangeMap, size, cudaMemcpyDeviceToHost);
}

__global__ void addKernel(unsigned char* dev_depthChangeMap, float* dcf, cv::Point3f* inp, int* wdt)
{
    register int row_idx = (blockIdx.x * blockDim.x) + threadIdx.x;
    register int col_idx = (blockIdx.y * blockDim.y) + threadIdx.y;
    register int idx = row_idx * (*wdt) + col_idx;

    register float depth = inp[idx].z;
    register float depthR = inp[idx + 1].z;
    register float depthD = inp[idx + *wdt].z;

    //and so on

}

Yes, you can copy from a std::vector using cudaMemcpy.

You don't have your sizes set up correctly:

void computeDepthChangeMap(unsigned char* depthChangeMap, size_t size, std::vector<cv::Point3f>* input, float dcf, int width, int height) {

...
cudaStatus = cudaMalloc((void**)&dev_input, size);
                                            ^^^^

cudaStatus = cudaMemcpy(dev_input, &input[0], sizeof(cv::Point3f)*size, cudaMemcpyHostToDevice);
                                                     ^^^^^^^^^^^^^^^^^

These size parameters should all be in bytes. You can't copy sizeof(cv::Point3f)*size bytes of data into an allocation of only size bytes.
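To make the byte arithmetic concrete, here is a minimal host-only sketch. It assumes a hypothetical stand-in struct Point3f (three floats) in place of cv::Point3f so it compiles without OpenCV, and the helper names byteCount and copyFits are made up for illustration. The CUDA calls appear only in comments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Point3f (assumed to be three floats),
// so this sketch compiles without OpenCV.
struct Point3f { float x, y, z; };

// Number of bytes occupied by all elements of the vector.
// The same value would be passed to both the allocation and the copy, e.g.:
//   cudaMalloc((void**)&dev_input, byteCount(points));
//   cudaMemcpy(dev_input, points.data(), byteCount(points), cudaMemcpyHostToDevice);
std::size_t byteCount(const std::vector<Point3f>& points) {
    return points.size() * sizeof(Point3f);
}

// True when a copy of copyBytes fits inside an allocation of allocBytes.
bool copyFits(std::size_t allocBytes, std::size_t copyBytes) {
    return copyBytes <= allocBytes;
}
```

With 100 points, an allocation of 100 bytes is too small for a copy of 100 * sizeof(Point3f) bytes, which matches the mismatch marked in the two lines above.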

Also, it seems that your function parameter is a pointer to a vector:

std::vector<cv::Point3f>* input,

Based on the code you have shown, this is probably not what you want. With a pointer parameter, &input[0] is the address of the vector object itself, not of its elements. You probably either want to pass the vector by value:

std::vector<cv::Point3f> input,

or, more likely, by reference:

std::vector<cv::Point3f> &input,

Since you haven't shown how you intend to call this function, it's not possible to be entirely sure what is best here.
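As a rough illustration of why the parameter type matters for the host pointer handed to cudaMemcpy, the host-only sketch below compares the two cases. Point3f is again a hypothetical stand-in for cv::Point3f, and the three function names are invented for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Point3f (assumed to be three floats).
struct Point3f { float x, y, z; };

// With a pointer parameter, input[0] is the vector object itself,
// so &input[0] is the address of the vector, not of its elements.
const void* addressFromPointerIndexing(const std::vector<Point3f>* input) {
    return &input[0];
}

// Element storage when the vector is passed by pointer.
const Point3f* elementsFromPointer(const std::vector<Point3f>* input) {
    assert(input != nullptr);
    return input->data();
}

// Element storage when the vector is passed by reference:
// here &input[0] does point at the first element (vector must be non-empty).
const Point3f* elementsFromReference(const std::vector<Point3f>& input) {
    assert(!input.empty());
    return &input[0];
}
```

If the pointer signature has to stay, input->data() (or &(*input)[0]) would be the host address to pass; with the by-reference signature, &input[0] or input.data() both work.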
