OpenCV - copy GpuMat into CUDA device data

Question

I am trying to copy the data in a cv::cuda::GpuMat to a uint8_t* variable which is to be used in a kernel.

The GpuMat contains image data of resolution 752x480 and of type CV_8UC1. Below is the sample code:

uint8_t *imgPtr;
cv::Mat left, downloadedLeft;
cv::cuda::GpuMat gpuLeft;

left = imread("leftview.jpg", cv::IMREAD_GRAYSCALE);
gpuLeft.upload(left);

cudaMalloc((void **)&imgPtr, sizeof(uint8_t)*gpuLeft.rows*gpuLeft.cols);
cudaMemcpyAsync(imgPtr, gpuLeft.ptr<uint8_t>(), sizeof(uint8_t)*gpuLeft.rows*gpuLeft.cols, cudaMemcpyDeviceToDevice);

// following code is just for testing and visualization...
cv::cuda::GpuMat gpuImg(left.rows, left.cols, left.type(), imgPtr);
gpuImg.download(downloadedLeft);
imshow ("test", downloadedLeft);
waitKey(0);

But the output is not as expected. The input and output images are shown below, respectively.

INPUT

OUTPUT

I have tried giving the cv::Mat source to cudaMemcpy instead, and that seems to work fine. The issue seems to be with the combination of cv::cuda::GpuMat and cudaMemcpy. A similar issue is discussed here.

Also, if the image width is 256 or 512, the program seems to work fine.

What is it that I am missing? What should be done for the 752x480 image to work properly?

Answer 1

OpenCV GpuMat uses strided storage (so the image is not stored contiguously in memory). In short, your example fails for most cases because

1. You don't copy the whole image to the CUDA memory allocation, and
2. You don't correctly specify the memory layout when you create the second GpuMat instance from the GPU pointer.

By my reading of the documentation, you probably want something like this:
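To make the first point concrete without needing a GPU, here is a minimal host-side sketch of a strided layout. The PitchedImage struct and the functions makePitched, naiveCopy and rowWiseCopy are hypothetical names invented for this illustration and are not part of OpenCV or CUDA. The sketch assumes an 8-bit single-channel image whose rows are padded out to `step` bytes, which is the situation described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Host-side stand-in for a pitched image (hypothetical type, not an OpenCV class):
// each row occupies `step` bytes, of which only the first `cols` bytes are pixels.
struct PitchedImage {
    int rows;
    int cols;
    std::size_t step;                // row pitch in bytes (>= cols)
    std::vector<std::uint8_t> data;  // rows * step bytes
};

// Build an image where pixel (r, c) = (r * 31 + c) % 251 and padding bytes are 0xFF.
PitchedImage makePitched(int rows, int cols, std::size_t step) {
    assert(step >= static_cast<std::size_t>(cols));
    PitchedImage img{rows, cols, step,
                     std::vector<std::uint8_t>(rows * step, 0xFF)};
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            img.data[r * step + c] = static_cast<std::uint8_t>((r * 31 + c) % 251);
    return img;
}

// What the question's code does: copy rows * cols bytes as if rows were contiguous.
std::vector<std::uint8_t> naiveCopy(const PitchedImage& img) {
    std::vector<std::uint8_t> out(static_cast<std::size_t>(img.rows) * img.cols);
    std::memcpy(out.data(), img.data.data(), out.size());
    return out;
}

// Pitch-aware copy: take `cols` bytes from the start of every `step`-byte row.
std::vector<std::uint8_t> rowWiseCopy(const PitchedImage& img) {
    std::vector<std::uint8_t> out(static_cast<std::size_t>(img.rows) * img.cols);
    for (int r = 0; r < img.rows; ++r)
        std::memcpy(out.data() + static_cast<std::size_t>(r) * img.cols,
                    img.data.data() + r * img.step,
                    img.cols);
    return out;
}
```

With a 6-pixel-wide image stored at a step of 8, naiveCopy pulls padding bytes into the image and truncates the last rows, which is the kind of skewed output the question describes. When step equals cols, the two functions return identical buffers.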

uint8_t *imgPtr;
cv::Mat left, downloadedLeft;
cv::cuda::GpuMat gpuLeft;

left = imread("leftview.jpg", cv::IMREAD_GRAYSCALE);
gpuLeft.upload(left);

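// Allocate and copy rows * step bytes so that the per-row padding is included.
cudaMalloc((void **)&imgPtr, gpuLeft.rows*gpuLeft.step);
cudaMemcpyAsync(imgPtr, gpuLeft.ptr<uint8_t>(), gpuLeft.rows*gpuLeft.step, cudaMemcpyDeviceToDevice);

// following code is just for testing and visualization...
// Pass the source step so the new GpuMat reads the buffer with the same row pitch.
cv::cuda::GpuMat gpuImg(left.rows, left.cols, left.type(), imgPtr, gpuLeft.step);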
gpuImg.download(downloadedLeft);
imshow ("test", downloadedLeft);
waitKey(0);

[Written by someone who has never used OpenCV, not compiled or tested, use at own risk]
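If the kernel expects a tightly packed buffer of rows * cols bytes rather than a strided one, an alternative is a 2D copy that drops the padding row by row. The sketch below is a host-side analogue only: copy2D and packRows are hypothetical helpers written for this illustration. The parameter order of copy2D deliberately mirrors that of cudaMemcpy2D in the CUDA runtime (dst, dpitch, src, spitch, width in bytes, height), so on the device the corresponding call would presumably use gpuLeft.step as the source pitch and the packed row size as the destination pitch. That device-side variant is an assumption and has not been compiled or tested here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Host-side analogue of a 2D pitched copy (hypothetical helper).
// Copies `height` rows of `widthBytes` bytes each; source and destination
// rows start `spitch` and `dpitch` bytes apart, respectively.
void copy2D(std::uint8_t* dst, std::size_t dpitch,
            const std::uint8_t* src, std::size_t spitch,
            std::size_t widthBytes, std::size_t height) {
    assert(dpitch >= widthBytes && spitch >= widthBytes);
    for (std::size_t r = 0; r < height; ++r)
        std::memcpy(dst + r * dpitch, src + r * spitch, widthBytes);
}

// Pack a pitched 8-bit single-channel image into a tight rows * cols buffer.
std::vector<std::uint8_t> packRows(const std::vector<std::uint8_t>& pitched,
                                   std::size_t step,
                                   std::size_t rows,
                                   std::size_t cols) {
    // The last row only needs `cols` valid bytes, not a full `step`.
    assert(rows == 0 || pitched.size() >= (rows - 1) * step + cols);
    std::vector<std::uint8_t> packed(rows * cols);
    // Destination pitch equals cols, so the output has no padding.
    copy2D(packed.data(), cols, pitched.data(), step, cols, rows);
    return packed;
}
```

The trade-off is that the packed buffer can be indexed as row * cols + col inside the kernel, whereas the strided copy shown in the answer must be indexed as row * step + col.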

The only time your code would work correctly would be when the row pitch of the GpuMat was serendipitously the same as the number of columns times the size of the type stored in the matrix. This is likely to happen for images whose widths are round powers of two.
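The arithmetic behind that coincidence can be sketched as follows. The alignment of 256 bytes used in the usage note is purely an assumption chosen because it is consistent with the widths reported in the question; the actual pitch is decided by the CUDA allocator and may differ between devices. paddedStep and isContiguous are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Round a row size in bytes up to the next multiple of `alignment`.
// The alignment is an assumed value; the real pitch comes from the allocator.
std::size_t paddedStep(std::size_t rowBytes, std::size_t alignment) {
    assert(alignment > 0);
    return (rowBytes + alignment - 1) / alignment * alignment;
}

// True when the padded step equals the row size, i.e. a flat copy of
// rows * rowBytes bytes would cover the image exactly.
bool isContiguous(std::size_t rowBytes, std::size_t alignment) {
    return paddedStep(rowBytes, alignment) == rowBytes;
}
```

Under the assumed 256-byte alignment, a 752-byte row is padded to 768 bytes, so the flat copy breaks, while 256-byte and 512-byte rows need no padding. In real code, comparing gpuLeft.step against gpuLeft.cols * gpuLeft.elemSize() is a more reliable check than guessing the alignment.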
