
Is there a maximum number of streams in CUDA?

Is there a maximum number of streams that can be created in CUDA?

To clarify, I mean CUDA streams, as in the streams that let you execute kernels and memory operations.

There is no realistic limit on the number of streams you can create (at least thousands). However, there is a limit on the number of streams you can use effectively to achieve concurrency.
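As a rough way to check the creation claim on your own machine, here is a minimal sketch that creates a large number of streams and reports where, if anywhere, creation fails. The count of 2000 is an arbitrary value chosen for illustration, not a documented limit.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

int main() {
    // Assumption: 2000 is an arbitrary count picked for this experiment.
    const int kNumStreams = 2000;
    std::vector<cudaStream_t> streams(kNumStreams);

    // Create streams one by one and stop at the first failure.
    int created = 0;
    for (; created < kNumStreams; ++created) {
        cudaError_t err = cudaStreamCreate(&streams[created]);
        if (err != cudaSuccess) {
            std::printf("cudaStreamCreate failed at stream %d: %s\n",
                        created, cudaGetErrorString(err));
            break;
        }
    }
    std::printf("Created %d streams\n", created);

    // Release only the streams that were actually created.
    for (int i = 0; i < created; ++i) {
        cudaStreamDestroy(streams[i]);
    }
    return 0;
}
```

Creating a stream says nothing about whether its work runs concurrently with other streams; that depends on the hardware.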

On Fermi, the architecture supports 16-way concurrent kernel launches, but there is only a single connection from the host to the GPU. So even if you have 16 CUDA streams, they eventually get funneled into one hardware queue. This can create false data dependencies and limit the amount of concurrency you can easily get.

With Kepler, the number of connections between the host and the GPU is now 32 (instead of one on Fermi). With the new Hyper-Q technology, it is now much easier to keep the GPU busy with concurrent work.
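To see which situation applies to a given GPU, you can query its properties at runtime. The sketch below prints the compute capability and whether the device reports support for concurrent kernels. It does not report the number of hardware queues, which the runtime does not expose through `cudaDeviceProp`; mapping compute capability 2.x to Fermi and 3.x to Kepler is left to the reader.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        std::printf("No CUDA device found\n");
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        // major.minor is the compute capability (2.x = Fermi, 3.x = Kepler).
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        // Nonzero if the device can run multiple kernels at the same time.
        std::printf("  concurrentKernels: %d\n", prop.concurrentKernels);
        // Number of copy engines available for async memory transfers.
        std::printf("  asyncEngineCount:  %d\n", prop.asyncEngineCount);
    }
    return 0;
}
```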

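I haven't seen a limit in any documentation, but that doesn't mean all streams will execute at the same time, since that is bounded by hard hardware limits (multiprocessors, registers, etc.).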

According to this NVIDIA presentation, the maximum is 16 streams (on Fermi): http://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf

To clarify, I've successfully created more than 16 streams, but I think the hardware can only support 16 concurrent kernels, so the excess streams are wasted in terms of concurrency.
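One way to test this on your own hardware is to launch one small kernel per stream and watch how total time changes as the stream count grows. This is a rough sketch, not a rigorous benchmark: the `spin` kernel, the `timeLaunches` helper, the cycle count, and the stream counts are all hypothetical choices made for illustration.

```cuda
#include <chrono>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical busy-wait kernel: spins for roughly `cycles` GPU clock ticks.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Launches one spin kernel per stream and returns host wall time in ms.
static double timeLaunches(int numStreams, long long cycles) {
    std::vector<cudaStream_t> streams(numStreams);
    for (auto& s : streams) cudaStreamCreate(&s);

    auto t0 = std::chrono::steady_clock::now();
    for (auto& s : streams) spin<<<1, 1, 0, s>>>(cycles);
    cudaDeviceSynchronize();  // wait for every stream to finish
    auto t1 = std::chrono::steady_clock::now();

    for (auto& s : streams) cudaStreamDestroy(s);
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    const long long kCycles = 100000000LL;  // arbitrary spin length
    timeLaunches(1, kCycles);               // warm-up, result discarded

    const int counts[] = {1, 8, 16, 32, 64};
    for (int n : counts) {
        std::printf("%2d streams: %.1f ms\n", n, timeLaunches(n, kCycles));
    }
    return 0;
}
```

If the kernels overlap, the time should stay close to the single-stream time; once it starts growing with the stream count, the extra streams are probably no longer adding concurrency.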

Kepler is probably different.
