
There are two loop statements; how can I write them in an OpenCL kernel?

I have two pairs of nested loops, for example:

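for (int i = 0; i < 1000; i++)
    for (int j = 0; j < 1000; j++)
    {
        marrytemp = 0;  // start a fresh sum for each (i, j)
        for (int k = i * 5; k < i * 5 + 5; k++)
            for (int l = j * 5; l < j * 5 + 5; l++)  // the inner loop advances l, not j
            {
                marrytemp = A[i] + B[j] + marrytemp;
            }
        marry[i][j] = marrytemp;
    }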

How can I write this in an OpenCL kernel?

Write the kernel to handle the two inner loops (k, l), then enqueue it as a 2D kernel whose global size covers the ranges of i and j. Each work-item then computes one (i, j) element.

Edit to add an outline of the kernel:

The kernel would be something along the lines of:

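__kernel void innerLoop(__global float* A, __global float* B, __global float* marry)
{
    int i = get_global_id(1);        // outer loop index i (dimension 1)
    int j = get_global_id(0);        // outer loop index j (dimension 0)
    int width = get_global_size(0);  // number of j values, used to flatten (i, j)
    float marrytemp = 0.0f;
    for(int k=i*5;k<i*5+5;k++)
    {
        for(int l=j*5;l<j*5+5;l++)
        {
            marrytemp=A[i]+B[j]+marrytemp;
        }
    }
    marry[i*width+j]=marrytemp;      // row-major store into the flat output buffer
}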

And then it would be called something like:

// A, B and marry are assumed to be cl_mem buffers created beforehand
clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&A);
clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&B);
clSetKernelArg(kernel, 2, sizeof(cl_mem), (void *)&marry);

// dimension 0 is j, dimension 1 is i; 1000 each to match the outer loops
size_t global_item_size[] = {1000, 1000};
clEnqueueNDRangeKernel(command_queue, kernel, 2, NULL, global_item_size, NULL, 0, NULL, NULL);

Both of these need additional support code (such as creating command_queue and kernel) and have not been compiled. They are just to give you an idea of how to split your four nested loops into an OpenCL kernel.
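
Since neither snippet has been compiled, it may help to have a plain C reference to compare the device output against. The following is only a sketch under assumptions: the names work_item, reference_ndrange and TILE are made up for illustration, the output is assumed to be a flat row-major float array, and the loop body is taken as written in the question (A[i] + B[j] added once per (k, l) pair). The two outer loops stand in for the 2D global range, and work_item plays the role of the kernel body for a single (i, j).

```c
#include <assert.h>
#include <stddef.h>

#define TILE 5  /* trip count of each inner loop (k and l) in the question */

/* What a single work-item would compute for one (i, j) pair. */
float work_item(const float *A, const float *B, size_t i, size_t j)
{
    float sum = 0.0f;
    for (size_t k = i * TILE; k < i * TILE + TILE; k++) {
        for (size_t l = j * TILE; l < j * TILE + TILE; l++) {
            sum += A[i] + B[j];
        }
    }
    return sum;
}

/* CPU stand-in for the 2D NDRange: one work_item call per (i, j).
 * rows corresponds to global size in dimension 1 (i),
 * cols to global size in dimension 0 (j). */
void reference_ndrange(const float *A, const float *B, float *out,
                       size_t rows, size_t cols)
{
    assert(A != NULL && B != NULL && out != NULL);
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            out[i * cols + j] = work_item(A, B, i, j);  /* row-major store */
        }
    }
}
```

Calling reference_ndrange(A, B, out, 1000, 1000) on the host and comparing out with the buffer read back from the device is one way to check that the index mapping in the kernel is right.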
