I use a TX1 board with L4T 28.1.
I compiled OpenCV on the board with -DWITH_CUDA=ON and the CUDA 8.0 toolkit.
When I try to use OpenCV functions that run on the GPU, I get errors.
For example, I declare a GpuMat:
GpuMat TestGpuMat(480, 640, CV_16UC1, 0x55);
and get a segmentation fault at runtime in cv::cuda::GpuMat::create().
I can allocate the same matrix if I provide memory that is already allocated:
cudaMallocManaged((void**)&MyBuf, 640*480*sizeof(unsigned short));
GpuMat TestGpuMat(480, 640, CV_16UC1, MyBuf);
In that case the allocation works, but when I pass the GpuMat to the cuda::warpAffine function I get the following exception:
OpenCVError: Gpu API call (invalid argument) in setTo
Any suggestions?
This code works:
cudaMallocManaged((void**)&dptr,w*h*sizeof(unsigned short));
cudaMemset(dptr,128,sizeof(unsigned short)*w*h);
//cudaDeviceSynchronize();
dptr[(h/2)*w + w/2] = 255; // centre pixel: row h/2, column w/2
cuda::GpuMat d_img(h,w,CV_16UC1,dptr);
Mat h_warp = getRotationMatrix2D(Point2f(w/2, h/2), -45.f, 1);
cuda::GpuMat d_res;
cuda::warpAffine(d_img, d_res, h_warp, d_img.size());
Mat h_res;
d_res.download(h_res);
imshow("window",h_res);
waitKey(0);
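The host-side write into the managed buffer above relies on row-major indexing: in a tightly packed image of width w, the element at row r and column c sits at linear index r*w + c. A minimal sketch of that arithmetic follows; packedIndex is a hypothetical helper introduced only for illustration and is not part of OpenCV or CUDA.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCV or CUDA function):
// linear element index of (row, col) in a tightly packed,
// row-major image that is `width` elements wide.
inline std::size_t packedIndex(std::size_t row, std::size_t col,
                               std::size_t width) {
    assert(col < width);  // the column must lie inside the row
    return row * width + col;
}
```

For the 640x480 image in the question, packedIndex(480/2, 640/2, 640) gives the centre pixel.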
You may also try cudaMalloc() or cudaMallocPitch() instead of cudaMallocManaged(). In general, managed memory is a bit harder to handle: it needs explicit synchronization whenever the CPU and GPU might access it concurrently. If you don't know how a function is implemented, start your trials with non-managed allocations.
unsigned short* dptr;
size_t pitch;
cudaMallocPitch((void**)&dptr,&pitch,w*sizeof(unsigned short),h);
cuda::GpuMat d_img(h,w,CV_16UC1,dptr,pitch); // step is given in bytes, as returned by cudaMallocPitch
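Both cudaMallocPitch() and the GpuMat step parameter express the row stride in bytes, not in elements, so the pitch can be handed over unchanged. The sketch below shows how an element is addressed in such a pitched buffer; pitchedByteOffset is a hypothetical helper, and the pitch value of 1536 mentioned afterwards is an assumed example, since the real value is chosen by the driver.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCV or CUDA function):
// byte offset of element (row, col) in a pitched allocation.
// `pitchBytes` is the row stride in bytes, `elemSize` is the
// size of one element in bytes.
inline std::size_t pitchedByteOffset(std::size_t row, std::size_t col,
                                     std::size_t pitchBytes,
                                     std::size_t elemSize) {
    // The element must fit inside one padded row.
    assert((col + 1) * elemSize <= pitchBytes);
    return row * pitchBytes + col * elemSize;
}
```

With an assumed pitch of 1536 bytes for a 640-pixel row of unsigned short (1280 bytes of data plus padding), row 1 starts at byte 1536 rather than 1280. Dividing the pitch by the element size before passing it as the step would make each row appear to start too early.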