
CUDA global atomic operations across concurrent kernel executions

My CUDA application performs an associative reduction over a volume. Essentially, each thread computes values that are atomically added to overlapping locations of the same output buffer in global memory.

Is it possible to launch this kernel concurrently with different input parameters and the same output buffer? In other words, every kernel instance would share the same global buffer and write to it atomically.

All kernels are running on the same GPU.

Yes, it's possible. Atomic operations to global memory are device-wide. They are atomic with respect to any code running on the device.
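As a rough illustration, here is a minimal sketch of two launches of the same kernel sharing one output buffer. The kernel `accumulate`, the helper `run_demo`, and the sizes are hypothetical names and values chosen for this example, not taken from the question. The launches go into two separate streams so that the hardware is allowed to overlap them; whether they actually run concurrently depends on the device and on resource availability, but the final sums are correct either way because each `atomicAdd` is atomic device-wide.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Hypothetical kernel: each of the n threads atomically adds `value`
// to one of `out_len` overlapping slots in global memory.
__global__ void accumulate(int *out, int out_len, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(&out[i % out_len], value);
    }
}

// Launches the kernel twice, on two streams, with different inputs and
// the same output buffer. Returns the sum of all output slots.
long long run_demo(int n, int out_len)
{
    int *d_out = nullptr;
    CHECK(cudaMalloc(&d_out, out_len * sizeof(int)));
    CHECK(cudaMemset(d_out, 0, out_len * sizeof(int)));
    CHECK(cudaDeviceSynchronize());  // buffer is zeroed before any launch

    cudaStream_t s1, s2;
    CHECK(cudaStreamCreate(&s1));
    CHECK(cudaStreamCreate(&s2));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    accumulate<<<blocks, threads, 0, s1>>>(d_out, out_len, n, 1);
    accumulate<<<blocks, threads, 0, s2>>>(d_out, out_len, n, 2);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // wait for both streams to finish

    std::vector<int> h_out(out_len);
    CHECK(cudaMemcpy(h_out.data(), d_out, out_len * sizeof(int),
                     cudaMemcpyDeviceToHost));

    long long total = 0;
    for (int v : h_out) total += v;

    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaStreamDestroy(s2));
    CHECK(cudaFree(d_out));
    return total;
}

int main()
{
    const int n = 1 << 20;
    long long total = run_demo(n, 64);
    // Each launch contributes n * value, so the expected total is 3 * n.
    std::printf("total = %lld, expected = %lld\n", total, 3LL * n);
    return total == 3LL * n ? 0 : 1;
}
```

The sketch uses integer adds so the result is exactly reproducible. With floating-point `atomicAdd` the updates are still atomic, but the order in which they are applied can differ between runs, so the low-order bits of the sums may vary.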
