atomicInc() is not working
I tried the program below, which uses atomicInc().
__global__ void ker(int *count)
{
    int n=1;
    int x = atomicInc ((unsigned int *)&count[0],n);
    CUPRINTF("In kernel count is %d\n",count[0]);
}

int main()
{
    int hitCount[1];
    int *hitCount_d;
    hitCount[0]=1;
    cudaMalloc((void **)&hitCount_d,1*sizeof(int));
    cudaMemcpy(&hitCount_d[0],&hitCount[0],1*sizeof(int),cudaMemcpyHostToDevice);
    ker<<<1,4>>>(hitCount_d);
    cudaMemcpy(&hitCount[0],&hitCount_d[0],1*sizeof(int),cudaMemcpyDeviceToHost);
    printf("count is %d\n",hitCount[0]);
    return 0;
}
Output is:
In kernel count is 1
In kernel count is 1
In kernel count is 1
In kernel count is 1
count is 1
I don't understand why it is not incrementing. Can anyone help?
According to the documentation, for the following call:
atomicInc ((unsigned int *)&count[0],n);
atomicInc computes:
((count[0] >= n) ? 0 : (count[0]+1))
and stores the result back in count[0].
(The ?: here is the C conditional operator: the expression yields 0 when count[0] >= n, and count[0]+1 otherwise.)
Since you've passed n = 1, and count[0] starts out at 1, atomicInc never actually increments the variable count[0] beyond 1.
If you want to see it increment beyond 1, pass a larger value for n.
The variable n acts as a "rollover value" for the incrementing process. When the variable being incremented reaches the value of n, the next atomicInc resets it to zero.
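To make the rollover visible in order, here is a minimal sketch in which a single thread calls atomicInc several times and records the value each call returns (atomicInc returns the old value it replaced). The names rolloverDemo and runRollover, the starting value 1, and the limit 3 are assumptions chosen for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread calls atomicInc `steps` times and records each returned
// (old) value, so the wrap-around can be read off in call order.
__global__ void rolloverDemo(unsigned int *counter, unsigned int limit,
                             unsigned int *oldValues, int steps)
{
    for (int i = 0; i < steps; ++i) {
        oldValues[i] = atomicInc(counter, limit);
    }
}

// Hypothetical host helper: runs the kernel, fills oldValues on the host,
// and returns the final counter value.
unsigned int runRollover(unsigned int start, unsigned int limit,
                         unsigned int *oldValues, int steps)
{
    unsigned int *counter_d = nullptr;
    unsigned int *old_d = nullptr;
    cudaMalloc(&counter_d, sizeof(unsigned int));
    cudaMalloc(&old_d, steps * sizeof(unsigned int));
    cudaMemcpy(counter_d, &start, sizeof(unsigned int),
               cudaMemcpyHostToDevice);

    rolloverDemo<<<1, 1>>>(counter_d, limit, old_d, steps);

    // cudaMemcpy blocks until the kernel above has finished.
    unsigned int finalValue = 0;
    cudaMemcpy(&finalValue, counter_d, sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaMemcpy(oldValues, old_d, steps * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(counter_d);
    cudaFree(old_d);
    return finalValue;
}

int main()
{
    const int steps = 6;
    unsigned int oldValues[steps];
    unsigned int finalValue = runRollover(1, 3, oldValues, steps);
    for (int i = 0; i < steps; ++i) {
        printf("call %d returned %u\n", i, oldValues[i]);
    }
    printf("final counter is %u\n", finalValue);
    return 0;
}
```

With a start of 1 and a limit of 3, the calls return 1, 2, 3, 0, 1, 2: the counter climbs to the limit, the next call resets it to zero, and counting resumes from there.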
Although you haven't asked the question, you might ask, "Why do I never see a value of zero, if I am hitting the rollover value?"
To answer this, you must remember that all 4 of your threads belong to the same warp and are executing in lockstep. All 4 of them execute the atomicInc instruction before any of them executes the subsequent print statement.
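Therefore we have a variable count[0] which starts out at 1.
The first thread to execute the atomic resets it to 0, the second increments it to 1, the third resets it to 0, and the fourth increments it back to 1.
Then all 4 threads print out the value, which is 1.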
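The sequence can also be observed without relying on print timing. In this sketch, each thread stores the value its own atomicInc returned, and the host reads everything back after the kernel has finished. The names recordOld and runThreads are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread performs one atomicInc and keeps the old value it replaced.
__global__ void recordOld(unsigned int *counter, unsigned int limit,
                          unsigned int *oldValues)
{
    oldValues[threadIdx.x] = atomicInc(counter, limit);
}

// Hypothetical host helper: launches one block of `threads` threads,
// fills oldValues (one entry per thread), and returns the final counter.
unsigned int runThreads(int threads, unsigned int start, unsigned int limit,
                        unsigned int *oldValues)
{
    unsigned int *counter_d = nullptr;
    unsigned int *old_d = nullptr;
    cudaMalloc(&counter_d, sizeof(unsigned int));
    cudaMalloc(&old_d, threads * sizeof(unsigned int));
    cudaMemcpy(counter_d, &start, sizeof(unsigned int),
               cudaMemcpyHostToDevice);

    recordOld<<<1, threads>>>(counter_d, limit, old_d);

    // cudaMemcpy blocks until the kernel above has finished.
    unsigned int finalValue = 0;
    cudaMemcpy(&finalValue, counter_d, sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaMemcpy(oldValues, old_d, threads * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(counter_d);
    cudaFree(old_d);
    return finalValue;
}

int main()
{
    const int threads = 4;
    unsigned int oldValues[threads];
    unsigned int finalValue = runThreads(threads, 1, 1, oldValues);
    for (int i = 0; i < threads; ++i) {
        printf("thread %d saw old value %u\n", i, oldValues[i]);
    }
    printf("final count is %u\n", finalValue);
    return 0;
}
```

With 4 threads, a start of 1, and a limit of 1, two threads see an old value of 1 and two see 0, and the final count is 1. Which thread sees which value is not specified, because the hardware decides the order in which the atomics are applied.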
As another experiment, try launching 5 threads instead of 4, and see if you can predict what value will be printed.
ker<<<1,5>>>(hitCount_d);
As @talonmies indicated in the comments, if you swap your atomicInc for an atomicAdd:
int x = atomicAdd ((unsigned int *)&count[0],n);
You'll get the results that you were probably expecting.
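For reference, a complete version of the atomicAdd variant might look like the sketch below. It is not the original poster's program: the names addKernel and runAdd are hypothetical, the counter is used directly as an int so no cast is needed, the in-kernel print is dropped, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// atomicAdd has no rollover limit: every thread adds 1 to the counter.
__global__ void addKernel(int *count)
{
    atomicAdd(&count[0], 1);
}

// Hypothetical host helper: launches one block of `threads` threads
// and returns the final counter value.
int runAdd(int threads, int start)
{
    int *count_d = nullptr;
    cudaMalloc(&count_d, sizeof(int));
    cudaMemcpy(count_d, &start, sizeof(int), cudaMemcpyHostToDevice);

    addKernel<<<1, threads>>>(count_d);

    // cudaMemcpy blocks until the kernel above has finished.
    int finalValue = 0;
    cudaMemcpy(&finalValue, count_d, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(count_d);
    return finalValue;
}

int main()
{
    printf("count is %d\n", runAdd(4, 1));
    return 0;
}
```

Starting from 1 with 4 threads, this prints "count is 5", since each thread contributes exactly one increment.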