CUDA graph stream capture with thrust::reduce

When I try to capture stream execution to build a CUDA graph, a call to thrust::reduce causes the runtime error cudaErrorStreamCaptureUnsupported: operation not permitted when stream is capturing. I have tried returning the reduction result to both host and device variables, and I am calling the reduction in the proper stream by means of thrust::cuda::par.on(stream). Is there any way I can add the execution of thrust functions to CUDA graphs?

Thrust's reduction is a blocking operation on the host side: thrust::reduce returns its result to the host, so the call has to wait for the reduction kernel to finish, and waiting on a stream is not permitted while that stream is capturing. I am assuming that you use the result of the reduction as a parameter to one of the kernels that follow. In that case the captured graph cannot be instantiated as written, because it depends on a host-side variable whose value is not available until the reduction kernel has finished executing. As a solution, you can try adding a host node to your graph that returns the result of the reduction.
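One way to sketch the host-node idea is shown below. Note that this is a swapped-in technique, not something from the question: it replaces thrust::reduce with cub::DeviceReduce::Sum, which takes caller-provided temporary storage and, to my understanding, only enqueues work on the given stream, so it can be captured. The result is copied to pinned host memory inside the graph and a host node reads it. The names graph_sum and consume_result are hypothetical, and cudaGraphInstantiateWithFlags assumes CUDA 11.4 or newer.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cub/cub.cuh>

#define CUDA_CHECK(call)                                              \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      std::fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_), \
                   __FILE__, __LINE__);                               \
      std::exit(EXIT_FAILURE);                                        \
    }                                                                 \
  } while (0)

// Hypothetical host callback. It becomes a host node in the graph and runs
// after the device-to-host copy captured before it. It must not call the CUDA API.
void CUDART_CB consume_result(void* user_data) {
  const float* sum = static_cast<const float*>(user_data);
  std::printf("sum = %f\n", *sum);
}

// Hypothetical helper: sums h_in on the GPU through a captured CUDA graph.
float graph_sum(const std::vector<float>& h_in) {
  assert(!h_in.empty());
  const int n = static_cast<int>(h_in.size());
  float *d_in = nullptr, *d_out = nullptr, *h_out = nullptr;
  void* d_temp = nullptr;
  size_t temp_bytes = 0;
  cudaStream_t stream;
  CUDA_CHECK(cudaStreamCreate(&stream));

  // Allocations and the input upload happen before capture starts.
  CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(float)));
  CUDA_CHECK(cudaMalloc(&d_out, sizeof(float)));
  CUDA_CHECK(cudaMallocHost(&h_out, sizeof(float)));  // pinned host memory
  CUDA_CHECK(cudaMemcpy(d_in, h_in.data(), n * sizeof(float),
                        cudaMemcpyHostToDevice));

  // With a null temp pointer, CUB only reports the scratch size it needs.
  CUDA_CHECK(cub::DeviceReduce::Sum(nullptr, temp_bytes, d_in, d_out, n, stream));
  CUDA_CHECK(cudaMalloc(&d_temp, temp_bytes));

  // Capture: reduction, copy of the result to the host, then the host node.
  cudaGraph_t graph;
  cudaGraphExec_t exec;
  CUDA_CHECK(cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal));
  CUDA_CHECK(cub::DeviceReduce::Sum(d_temp, temp_bytes, d_in, d_out, n, stream));
  CUDA_CHECK(cudaMemcpyAsync(h_out, d_out, sizeof(float),
                             cudaMemcpyDeviceToHost, stream));
  CUDA_CHECK(cudaLaunchHostFunc(stream, consume_result, h_out));
  CUDA_CHECK(cudaStreamEndCapture(stream, &graph));

  // Assumes CUDA 11.4+ for cudaGraphInstantiateWithFlags.
  CUDA_CHECK(cudaGraphInstantiateWithFlags(&exec, graph, 0));
  CUDA_CHECK(cudaGraphLaunch(exec, stream));
  CUDA_CHECK(cudaStreamSynchronize(stream));
  const float result = *h_out;

  CUDA_CHECK(cudaGraphExecDestroy(exec));
  CUDA_CHECK(cudaGraphDestroy(graph));
  CUDA_CHECK(cudaFree(d_temp));
  CUDA_CHECK(cudaFree(d_out));
  CUDA_CHECK(cudaFree(d_in));
  CUDA_CHECK(cudaFreeHost(h_out));
  CUDA_CHECK(cudaStreamDestroy(stream));
  return result;
}
```

If a later kernel in the same graph needs the sum, it is usually simpler to pass it the device pointer (d_out here) and read the value on the device, so that no host-side value has to exist at capture time.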
