
CUDA: need for synchronization when reading a device memory variable

I am running an iterative program in CUDA that runs until convergence. As said in this SO post (Are cuda kernel calls synchronous or asynchronous), from the point of view of the CPU, CUDA kernel launches are asynchronous.

In my program, one of the kernels checks for convergence and returns a boolean value for the host to read. I wanted to know whether I need to call

cudaDeviceSynchronize()

before reading the boolean value?

It depends on how you are returning the boolean value to the CPU. Are you using cudaMemcpy? If so, you don't have to call cudaDeviceSynchronize(), since cudaMemcpy blocks until the kernel has finished executing (assuming the kernel and the copy both use the default stream) and then copies the data from the GPU to the CPU.
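To make the pattern concrete, here is a minimal sketch of a convergence loop that relies on a blocking cudaMemcpy instead of cudaDeviceSynchronize(). The kernel `relax_and_check`, the host function `iterate_until_converged`, and the "halve every element" update are hypothetical placeholders, not the asker's code. The sketch assumes everything runs in the default stream and that the plain cudaMemcpy is used, not cudaMemcpyAsync.

```cuda
#include <cuda_runtime.h>

// Hypothetical iteration step: halve every element, and clear the flag if
// any element is still above the tolerance after the update.
__global__ void relax_and_check(float* data, int n, float tol, bool* converged)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 0.5f;
        if (data[i] > tol) {
            // Many threads may write here, but they all store the same value.
            *converged = false;
        }
    }
}

// Runs the kernel until it reports convergence or max_iters is reached.
// Returns the number of iterations performed, or -1 on a CUDA error.
int iterate_until_converged(float* h_data, int n, float tol, int max_iters)
{
    float* d_data = nullptr;
    bool* d_converged = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_converged, sizeof(bool)) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }
    cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    bool h_converged = false;
    int iters = 0;
    cudaError_t err = cudaSuccess;

    while (!h_converged && iters < max_iters) {
        // Reset the device flag to "converged" before each launch.
        h_converged = true;
        cudaMemcpy(d_converged, &h_converged, sizeof(bool), cudaMemcpyHostToDevice);

        // The launch itself returns to the host immediately.
        relax_and_check<<<blocks, threads>>>(d_data, n, tol, d_converged);

        // No cudaDeviceSynchronize() here: this blocking copy is queued behind
        // the kernel in the default stream and returns only once the flag has
        // arrived on the host.
        err = cudaMemcpy(&h_converged, d_converged, sizeof(bool), cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) break;
        ++iters;
    }

    if (err == cudaSuccess) {
        err = cudaMemcpy(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_converged);
    cudaFree(d_data);
    return (err == cudaSuccess) ? iters : -1;
}
```

If the flag were read some other way, for example through cudaMemcpyAsync on a non-default stream or through mapped host memory, the copy would no longer wait for the kernel implicitly, and an explicit synchronization call would be needed before the host reads the value.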
