atomicInc() is not working

I tried the program below, which uses atomicInc().

__global__ void ker(int *count)
{
    int n=1;
    int x = atomicInc ((unsigned int *)&count[0],n);
    CUPRINTF("In kernel count is %d\n",count[0]);
}

int main()
{
    int hitCount[1];
    int *hitCount_d;

    hitCount[0]=1;
    cudaMalloc((void **)&hitCount_d,1*sizeof(int));

    cudaMemcpy(&hitCount_d[0],&hitCount[0],1*sizeof(int),cudaMemcpyHostToDevice);

    ker<<<1,4>>>(hitCount_d);

    cudaMemcpy(&hitCount[0],&hitCount_d[0],1*sizeof(int),cudaMemcpyDeviceToHost);

    printf("count is %d\n",hitCount[0]);
  return 0;
}

The output is:

In kernel count is 1
In kernel count is 1
In kernel count is 1
In kernel count is 1

count is 1

I don't understand why it is not incrementing. Can anyone help?

According to the documentation, atomicInc works as follows. For this call:

atomicInc ((unsigned int *)&count[0],n);

it computes:

((count[0] >= n) ? 0 : (count[0]+1))

and stores the result back in count[0]. The return value is the old value of count[0], from before the update.

(If you're not sure what the ? operator does: it is the C conditional operator, and a ? b : c evaluates to b when a is true and to c otherwise.)

Since you've passed n = 1, and count[0] starts out at 1, atomicInc never actually increments count[0] beyond 1.

If you want to see it increment beyond 1, pass a larger value for n.

The variable n acts as a "rollover value" for the incrementing process. Once the variable being incremented reaches the value of n, the next atomicInc resets it to zero.
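To make the rollover behaviour concrete, here is a minimal sketch that runs the same 4-thread launch twice, once with the original limit of 1 and once with a larger limit. The kernel name inc_kernel and the helper run_atomic_inc are hypothetical names chosen for illustration, the counter is declared unsigned int to match atomicInc's signature, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread performs exactly one atomicInc on the shared counter.
__global__ void inc_kernel(unsigned int *count, unsigned int limit)
{
    atomicInc(count, limit);
}

// Hypothetical helper: initialise the counter to `start`, launch one block
// of `threads` threads, and return the final counter value.
// Error checking is omitted to keep the sketch short.
unsigned int run_atomic_inc(unsigned int start, unsigned int limit, int threads)
{
    unsigned int *count_d = nullptr;
    unsigned int result = 0;

    cudaMalloc((void **)&count_d, sizeof(unsigned int));
    cudaMemcpy(count_d, &start, sizeof(unsigned int), cudaMemcpyHostToDevice);

    inc_kernel<<<1, threads>>>(count_d, limit);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, count_d, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(count_d);
    return result;
}

int main()
{
    // Limit 1: the counter only toggles between 0 and 1.
    printf("limit 1:   count is %u\n", run_atomic_inc(1, 1, 4));
    // Limit 100: four increments starting from 1 never reach the rollover.
    printf("limit 100: count is %u\n", run_atomic_inc(1, 100, 4));
    return 0;
}
```

With the larger limit, four increments starting from 1 should leave the counter at 5, while the limit of 1 leaves it at 1 as in the question.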

Although you haven't asked the question, you might ask, "Why do I never see a value of zero, if I am hitting the rollover value?"

To answer this, remember that all 4 of your threads belong to the same warp and execute in lockstep. All 4 of them execute the atomicInc instruction before any of them executes the subsequent print statement.

So we have count[0] starting out at 1, and then:

  1. The first thread to execute the atomic resets it to zero.
  2. The next thread increments it to 1.
  3. The third thread resets it to zero.
  4. The fourth and final thread increments it to 1.

Only then do all 4 threads print the value, so each of them reads 1.
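The intermediate values in the steps above can be observed by recording what atomicInc returns (the old value) instead of re-reading count[0] afterwards. The following is a sketch under that assumption; trace_kernel, trace_atomic_inc and the seen array are hypothetical names for illustration, and error checking is omitted. Which thread observes which value is not specified, only how many threads observe each.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread stores the value atomicInc returned, i.e. the counter's
// value immediately before that thread's own update.
__global__ void trace_kernel(unsigned int *count, unsigned int limit,
                             unsigned int *seen)
{
    seen[threadIdx.x] = atomicInc(count, limit);
}

// Hypothetical helper: returns the old value observed by each thread.
// Error checking is omitted to keep the sketch short.
std::vector<unsigned int> trace_atomic_inc(unsigned int start,
                                           unsigned int limit, int threads)
{
    unsigned int *count_d = nullptr;
    unsigned int *seen_d = nullptr;
    std::vector<unsigned int> seen(threads);

    cudaMalloc((void **)&count_d, sizeof(unsigned int));
    cudaMalloc((void **)&seen_d, threads * sizeof(unsigned int));
    cudaMemcpy(count_d, &start, sizeof(unsigned int), cudaMemcpyHostToDevice);

    trace_kernel<<<1, threads>>>(count_d, limit, seen_d);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(seen.data(), seen_d, threads * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(count_d);
    cudaFree(seen_d);
    return seen;
}

int main()
{
    // Same setup as the question: counter starts at 1, limit is 1, 4 threads.
    std::vector<unsigned int> seen = trace_atomic_inc(1, 1, 4);
    for (size_t i = 0; i < seen.size(); ++i)
        printf("thread %zu saw old value %u\n", i, seen[i]);
    return 0;
}
```

With the question's setup, two of the four threads should report an old value of 1 and two should report 0, which is the zero that never shows up when every thread prints count[0] after all the atomics have completed.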

As another experiment, try launching 5 threads instead of 4, and see if you can predict what value will be printed.

ker<<<1,5>>>(hitCount_d);

As @talonmies indicated in the comments, if you swap your atomicInc for an atomicAdd:

int x = atomicAdd ((unsigned int *)&count[0],n);

you'll get the results you were probably expecting.
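For reference, a minimal sketch of the program rewritten around atomicAdd might look like the following. It assumes device-side printf is available in place of the CUPRINTF macro from the question, uses the int overload of atomicAdd so no pointer cast is needed, and omits error checking; add_kernel and run_atomic_add are hypothetical names for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds n to the counter; atomicAdd has no rollover value.
__global__ void add_kernel(int *count, int n)
{
    int old = atomicAdd(&count[0], n);
    // `old` is the counter value just before this thread's addition.
    printf("thread %d saw old value %d\n", (int)threadIdx.x, old);
}

// Hypothetical helper: initialise the counter to `start`, launch one block
// of `threads` threads, and return the final counter value.
// Error checking is omitted to keep the sketch short.
int run_atomic_add(int start, int n, int threads)
{
    int *count_d = nullptr;
    int result = 0;

    cudaMalloc((void **)&count_d, sizeof(int));
    cudaMemcpy(count_d, &start, sizeof(int), cudaMemcpyHostToDevice);

    add_kernel<<<1, threads>>>(count_d, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, count_d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(count_d);
    return result;
}

int main()
{
    // Same setup as the question: counter starts at 1, 4 threads each add 1.
    printf("count is %d\n", run_atomic_add(1, 1, 4));
    return 0;
}
```

Starting from 1 with 4 threads each adding 1, the final count should be 5.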
