CUDA global atomic operations across concurrent kernel executions

Original post dated 2019-08-10 02:35:51. Tags: cuda, atomic, cuda-streams, gpu-atomics

My CUDA application performs an associative reduction over a volume. Essentially, each thread computes values that are atomically added to overlapping locations of the same output buffer in global memory.

Is it possible to launch this kernel concurrently with different input parameters and the same output buffer? In other words, each kernel would share the same global buffer and write to it atomically.

All kernels are running on the same GPU.

1 Answer

Yes, it's possible. Atomic operations on global memory are device-wide: they are atomic with respect to any code running on the device, including other kernels executing at the same time.
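To make this concrete, here is a minimal sketch of two launches of the same kernel in separate streams, each with its own input and both adding into one shared output buffer. Everything named here is hypothetical and not from the question: the `accumulate` kernel, the `i % numBins` mapping used to force overlapping writes, the `check` helper, and the buffer sizes. Whether the two launches actually overlap in time depends on the device and on available resources, but the result should be correct either way.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread adds one input value into a bin of the
// shared output buffer. Many threads, from either launch, hit the same bin.
__global__ void accumulate(const int* in, int n, int* out, int numBins)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(&out[i % numBins], in[i]);  // device-wide atomic on global memory
    }
}

// Abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Runs the kernel once per stream on different inputs and the same output.
// Returns true if every bin holds the sum expected from both launches.
bool runConcurrentAccumulate(int n, int numBins)
{
    std::vector<int> hostA(n, 1), hostB(n, 2);  // assumed test inputs
    size_t inBytes = n * sizeof(int), outBytes = numBins * sizeof(int);

    int *inA = nullptr, *inB = nullptr, *out = nullptr;
    check(cudaMalloc(&inA, inBytes), "cudaMalloc inA");
    check(cudaMalloc(&inB, inBytes), "cudaMalloc inB");
    check(cudaMalloc(&out, outBytes), "cudaMalloc out");
    check(cudaMemcpy(inA, hostA.data(), inBytes, cudaMemcpyHostToDevice), "copy inA");
    check(cudaMemcpy(inB, hostB.data(), inBytes, cudaMemcpyHostToDevice), "copy inB");
    check(cudaMemset(out, 0, outBytes), "cudaMemset out");
    check(cudaDeviceSynchronize(), "sync before launch");  // output is zeroed first

    cudaStream_t s0, s1;
    check(cudaStreamCreate(&s0), "cudaStreamCreate s0");
    check(cudaStreamCreate(&s1), "cudaStreamCreate s1");

    int threads = 256, blocks = (n + threads - 1) / threads;
    accumulate<<<blocks, threads, 0, s0>>>(inA, n, out, numBins);
    accumulate<<<blocks, threads, 0, s1>>>(inB, n, out, numBins);
    check(cudaGetLastError(), "kernel launch");
    check(cudaDeviceSynchronize(), "sync after launch");  // wait for both streams

    std::vector<int> hostOut(numBins);
    check(cudaMemcpy(hostOut.data(), out, outBytes, cudaMemcpyDeviceToHost), "copy out");

    // Bin b receives (1 + 2) for every index i in [0, n) with i % numBins == b.
    bool ok = true;
    for (int b = 0; b < numBins; ++b) {
        int count = n / numBins + (b < n % numBins ? 1 : 0);
        if (hostOut[b] != 3 * count) ok = false;
    }

    cudaStreamDestroy(s0);
    cudaStreamDestroy(s1);
    cudaFree(inA);
    cudaFree(inB);
    cudaFree(out);
    return ok;
}

int main()
{
    bool ok = runConcurrentAccumulate(1 << 20, 64);
    std::printf("%s\n", ok ? "all bins match" : "MISMATCH");
    return ok ? 0 : 1;
}
```

The order in which the two kernels' additions land is not defined. That does not matter for integer addition, but with floating-point `atomicAdd` the rounding of the final sums can vary from run to run.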

