
PyCuda executing Thrust using Streams

I'm trying to adapt the code found here: https://wiki.tiker.net/PyCuda/Examples/ThrustInterop to use CUDA streams.

(Please excuse that I'm new to C++ and have only a few weeks of experience with CUDA.)

My main attempt, and the sticking point, has been to adjust the NVCC function so that it receives a CUDA stream as an argument and passes it to the Thrust call:

nvcc_function = FunctionBody(
   FunctionDeclaration(Value('void', 'my_sort'),
                       [Value('CUdeviceptr', 'input_ptr'),
                        Value('int', 'length'),
                        Value('cudaStream_t','stream')]),
   Block([Statement('thrust::device_ptr<float> thrust_ptr((float*)input_ptr)'),
          Statement('thrust::sort(thrust::cuda::par.on(stream),thrust_ptr, thrust_ptr+length)')]))

I'm getting the error "'cudaStream_t' has not been declared" (referring to the NVCC function argument).

I've tried adding 'cuda_runtime.h' to both the host and device include lists, but to no avail.

I am not familiar with PyCUDA or Thrust, but I am familiar with CUDA. One possibility that comes to mind is that, for some reason, "cuda_runtime.h" is not actually being included despite being specified. Are you sure that the PyCUDA framework reports an error when it cannot find a specific include?

Another thing that caught my attention is that you are using CUdeviceptr, which is part of the driver API, whereas cudaStream_t is part of the runtime API, which operates at a different level.

From the NVIDIA documentation, it seems that the driver API equivalent type would be CUstream. Source: http://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__STREAM.html#group__CUDA__STREAM

So the problem might lie in mixing APIs from different levels. As I said, though, I am not familiar with the exact framework you're using; these are just suggestions that may or may not turn out to be useful.
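To make the CUstream suggestion a bit more concrete, here is a minimal sketch of what the generated C++ for my_sort might look like with the driver API stream type. It is untested against the actual codepy/PyCUDA toolchain. The helper render_my_sort is hypothetical (it is not part of PyCUDA or codepy) and only builds the source text with plain Python, so the parameter type can be swapped and inspected. The include list is also an assumption on my part: cuda.h is the driver API header that declares CUdeviceptr and CUstream, and thrust/system/cuda/execution_policy.h is where I would expect thrust::cuda::par to come from.

```python
# Hypothetical helper, not part of PyCUDA or codepy: it only builds the C++
# source text so the stream parameter type can be swapped and inspected.
def render_my_sort(stream_type="CUstream"):
    # Assumed include list; adjust to whatever the real build needs.
    includes = [
        "#include <cuda.h>",  # driver API types: CUdeviceptr, CUstream
        "#include <thrust/device_ptr.h>",
        "#include <thrust/sort.h>",
        "#include <thrust/system/cuda/execution_policy.h>",  # thrust::cuda::par
    ]
    # Same function as in the question, with only the stream type changed.
    body = [
        "void my_sort(CUdeviceptr input_ptr, int length, %s stream)" % stream_type,
        "{",
        "  thrust::device_ptr<float> thrust_ptr((float*)input_ptr);",
        "  thrust::sort(thrust::cuda::par.on(stream), thrust_ptr, thrust_ptr + length);",
        "}",
    ]
    return "\n".join(includes + [""] + body) + "\n"


if __name__ == "__main__":
    # Print the driver-API variant for inspection.
    print(render_my_sort())
```

In the CUDA headers both CUstream and cudaStream_t are typedefs for the same underlying pointer type, so passing a CUstream to par.on(stream) should be accepted by the compiler. Whether this removes the "has not been declared" error depends on which headers the framework actually emits into the generated file, so comparing the generated source against the output of this sketch may be a useful first check.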

Good luck with debugging! 祝您调试顺利!
