I have an array of uint8_t with about 2,000,000 elements. I need to do some calculations on these values, but after I call the kernel and copy the modified values back, the output array contains only zeros.
I create the arrays like this; rows and columns are int.
uint8_t arrayIn[rows * columns];
uint8_t arrayOut[rows * columns];
I create the cl_mem objects and copy the input array data into the input buffer.
arrayInMem = clCreateBuffer(context, CL_MEM_READ_ONLY, rows * columns * sizeof(uint8_t), NULL, &err);
arrayOutMem = clCreateBuffer(context, CL_MEM_WRITE_ONLY, rows * columns * sizeof(uint8_t), NULL, &err);
err = clEnqueueWriteBuffer(img_cmd_queue, arrayInMem, CL_TRUE, 0, rows * columns * sizeof(uint8_t), arrayIn, 0, NULL, NULL);
I set the kernel arguments like this.
err = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&arrayInMem);
err = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&arrayOutMem);
I read the modified array back to the host.
err = clEnqueueReadBuffer(img_cmd_queue, arrayOutMem, CL_TRUE, 0, MEM_SIZE * sizeof(uint8_t), arrayOut, 0, NULL, NULL);
The kernel signature looks like this:
__kernel void calculate(__global uchar * arrayInKernel, __global uchar * arrayOutKernel){
//do some calculation like this eg.
//int gid = get_global_id(0);
//arrayOutKernel[gid] = 2 * arrayInKernel[gid];
}
Could somebody help? What am I missing?
Your code is fine, assuming MEM_SIZE = rows * columns. The argument order in clEnqueueReadBuffer is also correct.
I could imagine that you forgot to call clFinish(img_cmd_queue); after clEnqueueWriteBuffer, clEnqueueNDRangeKernel and clEnqueueReadBuffer, and before you check the results in arrayOut. All these commands end up in a queue, and without clFinish the queue may be executed after you have checked the results.
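Once the queue has finished, one way to tell whether arrayOut really holds the kernel's output is to compare it against the same calculation done on the host. The following is only a minimal sketch: it assumes the kernel body is the commented-out example from the question (arrayOutKernel[gid] = 2 * arrayInKernel[gid]), and the names reference_calculate and count_mismatches are hypothetical helpers, not part of OpenCL.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side reference for the example calculation in the question
 * (assumed kernel body): out[i] = 2 * in[i]. The product is computed
 * as int and then truncated to 8 bits, as a store to uchar would be. */
void reference_calculate(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = (uint8_t)(2 * in[i]);
    }
}

/* Returns how many of the n elements differ between the two arrays. */
size_t count_mismatches(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

After clFinish(img_cmd_queue); one could fill a second host array with reference_calculate(arrayIn, expected, n) and call count_mismatches(arrayOut, expected, n), where expected and n are assumed names for a host buffer and for rows * columns. A non-zero count with an all-zero arrayOut would suggest the kernel never ran or the read did not complete, in which case the err values returned by the earlier calls are worth checking.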