
ArrayFire: function with an OpenCL kernel called from main function

The code is the following (taken from http://arrayfire.org/docs/interop_opencl.htm):

Single main function

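// 1. Include the ArrayFire and ArrayFire/OpenCL interop headers
#include <arrayfire.h>
#include <af/opencl.h>

int main() {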
    size_t length = 10;
    // Create ArrayFire array objects:
    af::array A = af::randu(length, f32);
    af::array B = af::constant(0, length, f32);
    // ... additional ArrayFire operations here
    // 2. Obtain the device, context, and queue used by ArrayFire
    static cl_context af_context = afcl::getContext();
    static cl_device_id af_device_id = afcl::getDeviceId();
    static cl_command_queue af_queue = afcl::getQueue();
    // 3. Obtain cl_mem references to af::array objects
    cl_mem * d_A = A.device<cl_mem>();
    cl_mem * d_B = B.device<cl_mem>();
    // 4. Load, build, and use your kernels.
    //    For the sake of readability, we have omitted error checking.
    int status = CL_SUCCESS;
    // A simple copy kernel, uses C++11 syntax for multi-line strings.
    const char * kernel_name = "copy_kernel";
    const char * source = R"(
        void __kernel
        copy_kernel(__global float * gA, __global float * gB)
        {
            int id = get_global_id(0);
            gB[id] = gA[id];
        }
    )";
    // Create the program, build the executable, and extract the entry point
    // for the kernel.
    cl_program program = clCreateProgramWithSource(af_context, 1, &source, NULL, &status);
    status = clBuildProgram(program, 1, &af_device_id, NULL, NULL, NULL);
    cl_kernel kernel = clCreateKernel(program, kernel_name, &status);
    // Set arguments and launch your kernels
    clSetKernelArg(kernel, 0, sizeof(cl_mem), d_A);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), d_B);
    clEnqueueNDRangeKernel(af_queue, kernel, 1, NULL, &length, NULL, 0, NULL, NULL);
    // 5. Return control of af::array memory to ArrayFire
    A.unlock();
    B.unlock();
    // ... resume ArrayFire operations
    // Because the device pointers, d_A and d_B, were returned to ArrayFire's
    // control by the unlock function, there is no need to free them using
    // clReleaseMemObject()
    return 0;
}

This works well: the final values of B match those of A, i.e. af_print(B) prints the same values as A. But when I split the code into separate functions as follows:

Separate functions

arraycopy function

#include <arrayfire.h>
#include <af/opencl.h>

void arraycopy(af::array A, af::array B, size_t length) {
    // 2. Obtain the device, context, and queue used by ArrayFire   
    static cl_context af_context = afcl::getContext();
    static cl_device_id af_device_id = afcl::getDeviceId();
    static cl_command_queue af_queue = afcl::getQueue();
    // 3. Obtain cl_mem references to af::array objects
    cl_mem * d_A = A.device<cl_mem>();
    cl_mem * d_B = B.device<cl_mem>();
    // 4. Load, build, and use your kernels.
    //    For the sake of readability, we have omitted error checking.
    int status = CL_SUCCESS;
    // A simple copy kernel, uses C++11 syntax for multi-line strings.
    const char * kernel_name = "copy_kernel";
    const char * source = R"(
        void __kernel
        copy_kernel(__global float * gA, __global float * gB)
        {
            int id = get_global_id(0);
            gB[id] = gA[id];
        }
    )";
    // Create the program, build the executable, and extract the entry point
    // for the kernel.
    cl_program program = clCreateProgramWithSource(af_context, 1, &source, NULL, &status);
    status = clBuildProgram(program, 1, &af_device_id, NULL, NULL, NULL);
    cl_kernel kernel = clCreateKernel(program, kernel_name, &status);
    // Set arguments and launch your kernels
    clSetKernelArg(kernel, 0, sizeof(cl_mem), d_A);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), d_B);
    clEnqueueNDRangeKernel(af_queue, kernel, 1, NULL, &length, NULL, 0, NULL, NULL);
    // 5. Return control of af::array memory to ArrayFire
    A.unlock();
    B.unlock();
    // ... resume ArrayFire operations
    // Because the device pointers, d_A and d_B, were returned to ArrayFire's
    // control by the unlock function, there is no need to free them using
    // clReleaseMemObject()
}

main function

int main()
{
    size_t length = 10;
    af::array A = af::randu(length, f32);
    af::array B = af::constant(0, length, f32);
    arraycopy(A, B, length);
    af_print(B); // does not match A
}

The final values of B are unchanged. Why is this happening, and what should I do to make it work? Thanks in advance.

You pass the af::array objects into arraycopy by value, not by reference, so A and B in main remain unchanged regardless of what you do inside arraycopy. Pass B by reference instead: af::array &B in the parameter list. I'd also recommend passing A by const reference as a habit, to avoid unnecessary copies (const af::array &A).

The behavior you are seeing comes from reference counting. It is not a bug, though; it is consistent with how C++ itself behaves.

When an af::array object is created by assignment or an equivalent operation, only the metadata is copied; the new object keeps a shared pointer to the same underlying data.

In the version of your code that uses a function, B is passed by value. Internally, the B inside arraycopy is therefore a copy of the metadata of the B in main, and it shares the pointer to the data of main's B. At this point, if the user calls device to fetch the pointer, we assume it is for writing to the memory behind that pointer. So when device is called on an array object whose shared pointer has a reference count > 1, we make a copy of the original array (B from main) and return a pointer to that new memory. If you call af_print(B) inside arraycopy, you will see the correct values. This is essentially copy-on-write: since B is passed by value, the modifications made to B inside arraycopy are not visible in main.

I said at the start that this falls in line with C++ behavior because an object such as B has to be passed by reference if a function needs to modify it. Passing it by value only changes the value inside the function, which is exactly how ArrayFire handles af::array objects.

Hope that clears the confusion.

Pradeep. ArrayFire Dev Team.

