CUDA: checking for duplicate values and adding two values

Question

I have two groups of arrays:

a1 a2 a3 a4 a5 a6 a7 a8 <= name it as key1
b1 b2 b3 b4 b5 b6 b7 b8 <= val1
c1 c2 c3 c4 c5 c6 c7 c8

and

d1 d2 d3 d4 d5 d6 d7 d8 <= key2
e1 e2 e3 e4 e5 e6 e7 e8 <= val2
f1 f2 f3 f4 f5 f6 f7 f8

The key arrays a1,...,an and d1,...,dn are sorted and may contain repeated values, i.e. their values might look like 1 1 2 3 4 6 7 7 7 ... For each tuple (di, ei), I want to check whether it is equal to any tuple (aj, bj). If it is (di == aj and ei == bj), then I have to combine fi and cj using some function, e.g. addition, and store the result in fi.

First, is it possible to solve this efficiently using zip iterators and transformations in the Thrust library?

Second, the simplest method I can imagine is to count the number of occurrences of each key (ai), do a prefix sum, and use both to get the start and end index of each key. Then, for each di, use those counts to iterate through the corresponding indices, check whether ei equals val1 at that index, and perform the transformation.

For example, if I have

1 1 2 3 5 6 7
2 3 4 5 2 4 6
2 4 5 6 7 8 5

as the first group of arrays, I count the occurrences of 1, 2, 3, 4, 5, 6, 7, ...:

2 1 1 0 1 1 1 <=name it as count

and then do a prefix sum to get:

2 3 4 4 5 6 7  <= name it as cumsum

and use this to do:

for each element (dj, ej),                        // count and cumsum are indexed by key value
    for i in (cumsum[dj] - count[dj]) to cumsum[dj] - 1:
        if ej == val1[i] then performAddition;    // fj = fj + ci
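The counting scheme above can be sketched on the host in plain C++. This is only an illustration under stated assumptions: the function name `combine_by_counting` and the `max_key` parameter are hypothetical, the keys are assumed to be integers in [0, max_key], and "combine" is taken to be addition. On the GPU, the outer loop over j would typically become one thread per element of the second table.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for each (key2[j], val2[j]), scan the run of equal keys
// in the first table and add c[i] into f[j] wherever val1[i] matches val2[j].
// Assumes key1 is sorted and every key1 entry lies in [0, max_key].
void combine_by_counting(const std::vector<int>& key1,
                         const std::vector<int>& val1,
                         const std::vector<int>& c,
                         const std::vector<int>& key2,
                         const std::vector<int>& val2,
                         std::vector<int>& f,
                         int max_key) {
    // count[k] = number of occurrences of key k in key1
    std::vector<int> count(max_key + 1, 0);
    for (int k : key1) ++count[k];

    // cumsum[k] = inclusive prefix sum of count,
    // i.e. one past the last index holding key k in key1
    std::vector<int> cumsum(max_key + 1, 0);
    int running = 0;
    for (int k = 0; k <= max_key; ++k) {
        running += count[k];
        cumsum[k] = running;
    }

    for (std::size_t j = 0; j < key2.size(); ++j) {
        const int k = key2[j];
        if (k < 0 || k > max_key) continue;  // key cannot be in the first table
        // Half-open index range [cumsum[k] - count[k], cumsum[k]) holds key k
        for (int i = cumsum[k] - count[k]; i < cumsum[k]; ++i) {
            if (val1[i] == val2[j]) f[j] += c[i];
        }
    }
}
```

The inner loop length equals the number of repeats of a key, which is where the uneven per-thread work would come from in a one-thread-per-element kernel.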

My fear is that, since the threads do not all perform the same amount of work, this will lead to warp divergence and poor performance.

Answer 1

You could treat your data as two key-value tables, Table1: (a,b) -> c and Table2: (d,e) -> f, where the pairs (a,b) and (d,e) are keys and c, f are values.

Then your problem simplifies to

foreach key in Table2
  if key in Table1
    Table2[key] += Table1[key]

Suppose a and b have limited ranges and are non-negative, such as unsigned char. A simple way to combine a and b into one key is

unsigned short key = (unsigned short)(a) * 256 + b;
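As a small illustration of this packing, the sketch below wraps it in functions and shows how the two components can be recovered. The names `pack_key`, `key_high` and `key_low` are hypothetical and not part of any library.

```cpp
#include <cstdint>

// Pack two 8-bit components into one 16-bit key:
// a goes into the high byte, b into the low byte (same as a * 256 + b).
inline std::uint16_t pack_key(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>((static_cast<unsigned>(a) << 8) | b);
}

// Recover the first component (a) from a packed key.
inline std::uint8_t key_high(std::uint16_t key) {
    return static_cast<std::uint8_t>(key >> 8);
}

// Recover the second component (b) from a packed key.
inline std::uint8_t key_low(std::uint16_t key) {
    return static_cast<std::uint8_t>(key & 0xFFu);
}
```

Because a occupies the high byte, comparing packed keys orders the pairs first by a and then by b, so data sorted by (a, b) stays sorted after packing.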

If the range of key is still not too large, as in the above example, you could create your Table1 as

int Table1[65536];

Checking whether key is in Table1 becomes

if (Table1[key] != INVALID_VALUE)
  ....
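Putting the pieces together, a host-side sketch of the dense-table lookup might look like the following. The function names `build_table1` and `add_matches` are hypothetical, the choice of sentinel for `INVALID_VALUE` is an assumption (it must never occur as a real value of c), and each (a, b) pair is assumed to occur at most once in the first table.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: this sentinel never appears as a real value in c.
constexpr int INVALID_VALUE = std::numeric_limits<int>::min();

// Build a dense table with one slot per possible packed (a, b) key.
// Assumes each (a, b) pair occurs at most once; a later duplicate
// would overwrite the earlier entry.
std::vector<int> build_table1(const std::vector<std::uint8_t>& a,
                              const std::vector<std::uint8_t>& b,
                              const std::vector<int>& c) {
    std::vector<int> table(65536, INVALID_VALUE);
    for (std::size_t i = 0; i < a.size(); ++i) {
        table[a[i] * 256 + b[i]] = c[i];
    }
    return table;
}

// For every (d, e) key that is present in the table,
// add the stored value into the matching f entry.
void add_matches(const std::vector<int>& table1,
                 const std::vector<std::uint8_t>& d,
                 const std::vector<std::uint8_t>& e,
                 std::vector<int>& f) {
    for (std::size_t j = 0; j < d.size(); ++j) {
        const int value = table1[d[j] * 256 + e[j]];
        if (value != INVALID_VALUE) f[j] += value;
    }
}
```

Each lookup is a single indexed read with no inner loop, so every element does the same amount of work regardless of how often a key repeats.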

With all these restrictions, an implementation with Thrust should be very simple.

A similar combining method could still be used if a and b have a larger range, like int.
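For 32-bit components, one possible packing is into a single 64-bit key, sketched below. The name `pack_key64` is hypothetical. The lexicographic ordering of the packed key matches that of the pair only for non-negative inputs, which is the case assumed here.

```cpp
#include <cstdint>

// Pack two 32-bit components into one 64-bit key:
// a goes into the high 32 bits, b into the low 32 bits.
// Each component is first reinterpreted as unsigned so that
// sign extension cannot spill into the other half.
inline std::uint64_t pack_key64(std::int32_t a, std::int32_t b) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(a)) << 32)
         | static_cast<std::uint64_t>(static_cast<std::uint32_t>(b));
}
```

A key this wide cannot index a dense array, so it is only useful as a single comparable key for sorting or searching.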

But if the range of key is too large, you have to use the method suggested by Robert Crovella.

Solution 1
2 2016-07-05 21:22:40
