How can I use an OpenCL kernel as an accumulator?

Question

    __kernel void cl_test(__global int* Number)
    {
       int id = get_global_id(0);
       if (id%5==0)
       {
           Number[0]++;
       }
       if (id%10==0)
       {
           Number[1]++;
       }
    }

As you can see, this is a very simple OpenCL test kernel. What I want is to count how many numbers in a range are divisible by 5 and how many are divisible by 10.

Here is the problem: the work-items are not fully independent, because they all update the same Number[0] and Number[1]. As a result, I can't get the correct result by reading Number[0] or Number[1].

Is there any solution like a "global variable" in C++?

Thanks!

Answer 1

You need to use atomic operations.

    __kernel void cl_test(__global int* Number)
    {
       int id = get_global_id(0);
       if (id%5==0)
       {
           atomic_inc(Number);
       }
       if (id%10==0)
       {
           atomic_inc(&Number[1]);
       }
    }

You should use atomic operations as sparingly as possible: they tend to be rather slow, precisely because they guarantee a correct result when many work-items update the same location.
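One common way to cut down on atomic operations is to let each work-group build a partial count first and then touch the shared counters once per group. The sketch below is a sequential model of that arithmetic in plain C, not kernel code. The function name `count_by_groups` and its parameters are hypothetical, and it assumes global IDs run from 0 to n-1.

```c
#include <stddef.h>

/* Sequential host-side model of per-group accumulation.
 * Hypothetical helper for illustration; it is not part of the OpenCL API.
 * Fills totals[0] (multiples of 5) and totals[1] (multiples of 10) for
 * ids 0..n-1 and returns how many updates hit the shared counters. */
size_t count_by_groups(size_t n, size_t group_size, int totals[2])
{
    size_t shared_updates = 0;

    totals[0] = 0;
    totals[1] = 0;
    if (group_size == 0) {
        return 0; /* nothing sensible to do with empty groups */
    }

    for (size_t start = 0; start < n; start += group_size) {
        size_t end = (n - start < group_size) ? n : start + group_size;
        int partial[2] = {0, 0}; /* stands in for per-group counters */

        for (size_t id = start; id < end; ++id) {
            if (id % 5 == 0)  partial[0]++;
            if (id % 10 == 0) partial[1]++;
        }

        /* One update per counter per group. In a kernel this is where
         * the global atomic operation would go. */
        totals[0] += partial[0];
        totals[1] += partial[1];
        shared_updates += 2;
    }
    return shared_updates;
}
```

For 1000 IDs in groups of 64, the model performs 32 shared updates, whereas the atomic kernel performs one global atomic per matching work-item. In a real kernel the per-group counts would still need local atomics or a reduction inside the work-group, so the saving depends on the device.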

Answer 2

An atomic add will solve the summing problem:

    __kernel void cl_test(__global int* Number)
    {
       int id = get_global_id(0);
       if (id%5==0)
       {
           atomic_add( Number, 1 );
       }
       if (id%10==0)
       {
           atomic_add( Number +1, 1 );
       }
    }
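To check on the host that a kernel like the ones above produced the right totals, the expected counts can be computed in closed form. This is a minimal sketch in plain C. The helper names `count_multiples` and `counts_match` are hypothetical, and it assumes the global IDs run from 0 to n-1 (no global offset) and that the host set both counters to zero before launching the kernel.

```c
#include <stddef.h>

/* Number of ids in [0, n) that are divisible by k.
 * Note that id 0 counts as a multiple, matching id % k == 0 in the kernel. */
size_t count_multiples(size_t n, size_t k)
{
    if (k == 0) {
        return 0; /* division by zero is undefined; treat as no matches */
    }
    return (n + k - 1) / k;
}

/* Returns 1 if the two counters read back from the device match the
 * expected totals for n work-items, 0 otherwise. */
int counts_match(const int number[2], size_t n)
{
    if (number[0] < 0 || number[1] < 0) {
        return 0;
    }
    return (size_t)number[0] == count_multiples(n, 5) &&
           (size_t)number[1] == count_multiples(n, 10);
}
```

With 1000 work-items the expected values are 200 for Number[0] and 100 for Number[1]. The non-atomic kernel from the question will usually report smaller numbers because concurrent increments overwrite each other.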

Answer 1: 4, accepted, 2014-01-07 13:05:09

Answer 2: 2, 2014-01-07 13:05:43