
Is it possible to manually set the SMs used for one CUDA stream?

By default, a kernel will use all available SMs of the device (given enough blocks). However, I now have two streams, one compute-intense and one memory-intense, and I want to limit the maximum number of SMs each stream may use (after setting the cap, a kernel in that stream would use at most that many SMs, e.g. 20 SMs for the compute-intense stream and 4 SMs for the memory-intense one). Is it possible to do so, and if so, which API should I use?

In short, no, there is no way to do what you envisage.

The CUDA execution model doesn't provide that sort of granularity, and that isn't an accident. By abstracting away that level of scheduling and work distribution, it means that (within reason) any code you can run on the smallest GPU of a given architecture can also run on the largest without any modification. That is important from a portability and interoperability point of view.
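To make the point concrete, here is a minimal sketch of the two-stream setup described in the question (the kernel names, data sizes, and launch configurations are illustrative, not from the original post). Both kernels launch enough blocks to span the whole device; the hardware scheduler alone decides which SMs each block lands on, and the stream handle carries no parameter that would cap SM usage:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Toy stand-in for the compute-intense work: arithmetic-heavy per element.
__global__ void computeKernel(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];
        for (int k = 0; k < 1000; ++k) v = v * 1.0001f + 0.5f;
        x[i] = v;
    }
}

// Toy stand-in for the memory-intense work: bandwidth-bound copy.
__global__ void memoryKernel(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMalloc(&c, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);   // note: no field or flag here selects SMs
    cudaStreamCreate(&s2);

    // Both launches may occupy any SM on the device; blocks from the two
    // streams are interleaved across SMs entirely at the scheduler's
    // discretion, with no per-stream SM cap available to the programmer.
    const int threads = 256, blocks = (n + threads - 1) / threads;
    computeKernel<<<blocks, threads, 0, s1>>>(a, n);
    memoryKernel<<<blocks, threads, 0, s2>>>(b, c, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The closest standard knob is relative scheduling hints such as `cudaStreamCreateWithPriority`, but that influences the order in which pending blocks are dispatched, not which or how many SMs a stream's kernels occupy.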

