
CUDA/OpenGL interoperability: cudaErrorMemoryAllocation error on cudaGraphicsGLRegisterBuffer

I am getting a seemingly random CUDA memory allocation error when using cudaGraphicsGLRegisterBuffer(). I have a fairly large OpenGL PBO that is shared between OpenGL and CUDA. The PBO is created as follows:

GLuint buffer;
glGenBuffers(1, &buffer);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, buffer);
// rows and cols are both 5000, so this allocates rows * cols * 4 bytes (~95 MB)
glBufferData(GL_PIXEL_UNPACK_BUFFER, rows * cols * 4, NULL, GL_DYNAMIC_COPY);
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

The buffer is quite large (rows and cols are both 5000), but it allocates fine on my GPU. I share it between OpenGL and CUDA through a small class that manages the CUDA graphics resource:

class CudaPBOGraphicsResource
{
public:
    CudaPBOGraphicsResource(GLuint pbo_id);
    ~CudaPBOGraphicsResource();
    inline cudaGraphicsResource_t resource() const { return _cgr; }
private:
    cudaGraphicsResource_t          _cgr;
};

CudaPBOGraphicsResource::CudaPBOGraphicsResource(GLuint pbo_id)
{
    checkCudaErrors(cudaGraphicsGLRegisterBuffer(&_cgr, pbo_id,
                    cudaGraphicsRegisterFlagsNone));
    checkCudaErrors(cudaGraphicsMapResources(1, &_cgr, 0));
}

CudaPBOGraphicsResource::~CudaPBOGraphicsResource()
{
    if (_cgr) {
        checkCudaErrors(cudaGraphicsUnmapResources(1, &_cgr, 0));
    }
}

Now I do the OpenGL/CUDA interoperability as follows (this block runs every time I need the buffer in CUDA):

{
    CudaPBOGraphicsResource input_cpgr(pbo_id);
    uchar4 *input_ptr = 0;
    size_t num_bytes;
    checkCudaErrors(cudaGraphicsResourceGetMappedPointer((void **)&input_ptr,
                    &num_bytes, input_cpgr.resource()));

    call_my_kernel(input_ptr);
}
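
For context, call_my_kernel() is currently little more than a stub. A minimal sketch of the kind of kernel it would launch (the kernel body, names, and launch configuration here are illustrative assumptions, not my actual code):

// Illustrative placeholder: a trivial kernel writing into the mapped PBO.
// The real call_my_kernel() currently does essentially nothing.
__global__ void fill_kernel(uchar4 *dst, int rows, int cols)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < cols && y < rows)
        dst[y * cols + x] = make_uchar4(128, 128, 128, 255);
}

void call_my_kernel(uchar4 *input_ptr)
{
    const int rows = 5000, cols = 5000;  // matches the PBO allocation above
    dim3 block(16, 16);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    fill_kernel<<<grid, block>>>(input_ptr, rows, cols);
}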

This runs on my inputs for a while, but after some time it crashes with:

CUDA error code=2(cudaErrorMemoryAllocation) 
                 "cudaGraphicsGLRegisterBuffer(&_cgr, pbo_id, 
                  cudaGraphicsRegisterFlagsNone)" 
Segmentation fault

I am not sure why any memory allocation is going on at all, since I thought the buffer was shared rather than copied. I added cudaDeviceSynchronize() after the kernel call, but the error persists. My call_my_kernel() function is now doing pretty much nothing, so there are no other CUDA calls that could raise this error!
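
(Concretely, the check I added after the launch looks roughly like this, a sketch using the checkCudaErrors helper from the CUDA samples:)

call_my_kernel(input_ptr);
checkCudaErrors(cudaGetLastError());       // reports launch/configuration errors
checkCudaErrors(cudaDeviceSynchronize());  // reports errors raised while the kernel ran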

I am using CUDA 7 on Linux with a Quadro K4000 card.

EDIT: I updated the driver to the latest 346.72 version and the error still happens. It also does not depend on the kernel call; just calling cudaGraphicsGLRegisterBuffer() repeatedly seems to leak memory on the GPU. Running nvidia-smi while the program is running shows GPU memory usage climbing steadily. I am still at a loss as to why any copying is happening...

OK, I found the answer to my conundrum, and I hope it helps anyone else using CUDA and OpenGL together.

The problem was that I was calling:

checkCudaErrors(cudaGraphicsGLRegisterBuffer(&_cgr, pbo_id,
                cudaGraphicsRegisterFlagsNone));

every time. The registration actually needs to happen only once; after that, I only need to map/unmap the _cgr object each time I use the buffer.
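
In other words, registration is a one-time operation (and should eventually be paired with cudaGraphicsUnregisterResource()), while mapping and unmapping can happen on every use. A sketch of how the class can be restructured (the map()/unmap() method names are my own choice, not a library API):

class CudaPBOGraphicsResource
{
public:
    explicit CudaPBOGraphicsResource(GLuint pbo_id)
    {
        // Register exactly once for the lifetime of the PBO.
        checkCudaErrors(cudaGraphicsGLRegisterBuffer(&_cgr, pbo_id,
                        cudaGraphicsRegisterFlagsNone));
    }

    ~CudaPBOGraphicsResource()
    {
        // Unregister once, releasing the CUDA-side bookkeeping.
        checkCudaErrors(cudaGraphicsUnregisterResource(_cgr));
    }

    uchar4 *map()
    {
        // Map per use; this is cheap compared to re-registering.
        checkCudaErrors(cudaGraphicsMapResources(1, &_cgr, 0));
        uchar4 *ptr = 0;
        size_t num_bytes = 0;
        checkCudaErrors(cudaGraphicsResourceGetMappedPointer(
                        (void **)&ptr, &num_bytes, _cgr));
        return ptr;
    }

    void unmap()
    {
        checkCudaErrors(cudaGraphicsUnmapResources(1, &_cgr, 0));
    }

private:
    cudaGraphicsResource_t _cgr;
};

With this, the per-use code becomes:

uchar4 *input_ptr = input_cpgr.map();
call_my_kernel(input_ptr);
input_cpgr.unmap();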
