简体   繁体   中英

OpenCL/OpenGL interoperability texture segfault

I'm trying to use OpenCL with OpenGL interop. to compute pathtracing algorithm on GPU and then draw GL texture to quad. Works as intended on Intel CPU but when I try to run in on GTX 970 there's segfault on unlocking that GL texture. Dunno if that's the cause or the running kernel. I'll let the code speak for itself. I'm using OpenCL C++ wrapper btw.

GL texture creation

glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, framebuffer);
glBindTexture(GL_TEXTURE_2D, 0); //Unbind texture

CL texture allocation

m_textureCL = cl::ImageGL(m_context,
        CL_MEM_READ_WRITE, 
        GL_TEXTURE_2D,
        0,
        texture,
        &errCode);

RunKernel function

//-----------------------------------------------------------------------------
//  Lock texture
//-----------------------------------------------------------------------------
std::vector<cl::Memory> glObjects; //Create vector of GL objects to lock
glObjects.push_back(m_textureCL); //Add created CL texture buffer
glFlush(); //Flush GL queue

errCode = m_cmdQueue.enqueueAcquireGLObjects(&glObjects, NULL, NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error locking texture" << errCode << std::endl;
    return errCode;
}
//-----------------------------------------------------------------------------

//-----------------------------------------------------------------------------
//  Run queue
//-----------------------------------------------------------------------------
errCode = m_cmdQueue.enqueueNDRangeKernel(
        m_kernel,
        cl::NullRange,
        cl::NDRange(height*width),
        cl::NullRange,
        NULL,
        NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error running queue: " << errCode << std::endl;
    return errCode;
}
//---------------------------------------


//-----------------------------------------------------------------------------
//  Unlock
//-----------------------------------------------------------------------------
errCode = m_cmdQueue.enqueueReleaseGLObjects(&glObjects, NULL, NULL);
if(errCode != CL_SUCCESS) {
    std::cerr << "Error unlocking texture: " << errCode << std::endl;
    return errCode;
} <<------ Here's where segfault occurs, can't get past this point

Kernel function def.

__kernel void RadianceGPU (
    __write_only image2d_t texture,
    other_stuff...)

Writing to texture in kernel

write_imagef(
        texture,
        (int2)(x, height-y-1),
        (float4)(
            clamp(framebuffer[id].x, 0.0f, 1.0f),
            clamp(framebuffer[id].y, 0.0f, 1.0f),
            clamp(framebuffer[id].z, 0.0f, 1.0f),
            1.0f) * 1.0f);

Interesting is that write_imagef() works despite the texture being UNSIGNED_BYTE.

EDIT: So I finally figured out what caused the problem. It was setting wrong display while creating CL properties. I just pasted there window from GLFW, which causes problem on Nvidia drivers. You need to use glxGetCurrentDisplay or glfwGetX11Display. This fixes the segfault.

I´m not sure this is your problem but I´ll give it a shot anyway.

You have not synchronized access to glObjects in a portable way. From OpenCL 1.1:

Prior to calling clEnqueueAcquireGLObjects, the application must ensure that any pending GL operations which access the objects specified in mem_objects have completed. This may be accomplished portably by issuing and waiting for completion of a glFinish command on all GL contexts with pending references to these objects. Implementations may offer more efficient synchronization methods; for example on some platforms calling glFlush may be sufficient, or synchronization may be implicit within a thread, or there may be vendor - specific extensions that enable placing a fence in the GL command stream and waiting for completion of that fence in the CL command queue. Note that no synchronization methods other than glFinish are portable between OpenGL implementations at this time.

Basically glFinish is required for portable behavior.

In the paragraph below the already quoted there is more info that might be of interest:

Similarly, after calling clEnqueueReleaseGLObjects, the application is responsible for ensuring that any pending OpenCL operations which access the objects specified in mem_objects have completed prior to executing subsequent GL commands which reference these objects. This may be accomplished portably by calling clWaitForEvents with the event object returned by clEnqueueRelease GL Objects, or by calling clFinish. As above, some implementations may offer more efficient methods.

Here is a link to the document quoted from: https://www.khronos.org/registry/cl/specs/opencl-1.1.pdf#nameddest=section-9.8.6

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM