
Kernel launch specifying the stream, but with default shared memory size

I need to specify the stream for a kernel launch in CUDA. The kernel uses some shared memory whose size is defined in the kernel code.

#include <cstdint>

static const int cBlockSize = 256;

__global__ void fooKernel(void* param)
{
    // 16 words of 32 bits shared by 256 threads
    __shared__ uint32_t words[cBlockSize/16];
    // implementation follows, using 2 bits of shared memory per thread
}

However, the shared memory size parameter comes before the stream parameter in a kernel launch expression. So how do I tell CUDA to use the shared memory size specified by the kernel code and ignore what is in the launch code?

fooKernel<<<N/cBlockSize, cBlockSize, /* What to put here? */, stream>>>(param);

Obviously, I would like to avoid duplicating code by putting (cBlockSize/16)*sizeof(uint32_t) there again. In reality the expression is more complex.

Statically allocated and dynamically allocated shared memory are treated separately in many respects.
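To make the distinction concrete, here is a minimal sketch of a kernel that uses both kinds of shared memory. The kernel name mixedKernel, the array names, and the sizes are hypothetical and not part of the question. The static array is sized at compile time, and its footprint can be read back with cudaFuncGetAttributes; the dynamic array gets its size from the third launch parameter.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

static const int kBlockSize = 256;  // hypothetical block size

__global__ void mixedKernel(uint32_t* out)
{
    __shared__ uint32_t fixedWords[16];         // static: size fixed at compile time
    extern __shared__ uint32_t dynamicWords[];  // dynamic: size supplied at launch

    if (threadIdx.x < 16) {
        fixedWords[threadIdx.x] = threadIdx.x;
    }
    dynamicWords[threadIdx.x] = threadIdx.x;
    __syncthreads();

    out[blockIdx.x * blockDim.x + threadIdx.x] =
        fixedWords[threadIdx.x % 16] + dynamicWords[threadIdx.x];
}

int main()
{
    // The static footprint is a property of the compiled kernel.
    cudaFuncAttributes attr;
    cudaFuncGetAttributes(&attr, mixedKernel);
    printf("static shared bytes: %zu\n", attr.sharedSizeBytes);

    uint32_t* dOut = nullptr;
    cudaMalloc(&dOut, kBlockSize * sizeof(uint32_t));

    // Only the dynamic array is sized here: one word per thread.
    size_t dynamicBytes = kBlockSize * sizeof(uint32_t);
    mixedKernel<<<1, kBlockSize, dynamicBytes>>>(dOut);

    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(dOut);
    return 0;
}
```

The launch parameter never mentions the 16 static words; the runtime accounts for them on its own.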

If you do not use dynamically allocated shared memory, it is safe to pass the default value of zero as the third kernel launch parameter, regardless of how much statically allocated shared memory the kernel uses. The third parameter specifies only the dynamic amount, which is added on top of whatever the kernel declares statically.
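Applied to the question, the launch might look like the sketch below. The kernel body, the output buffer, and the value of N are assumptions made only so the example is complete; the relevant part is the literal 0 in the third launch slot, followed by the stream.

```cuda
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

static const int cBlockSize = 256;

__global__ void fooKernel(uint32_t* out)
{
    // Static shared memory: its size is recorded in the compiled kernel.
    __shared__ uint32_t words[cBlockSize / 16];

    // Hypothetical body: the first 16 threads each fill one word.
    if (threadIdx.x < cBlockSize / 16) {
        words[threadIdx.x] = threadIdx.x;
    }
    __syncthreads();

    out[blockIdx.x * blockDim.x + threadIdx.x] =
        words[threadIdx.x % (cBlockSize / 16)];
}

int main()
{
    const int N = 1024;  // assumed to be a multiple of cBlockSize
    uint32_t* dOut = nullptr;
    cudaMalloc(&dOut, N * sizeof(uint32_t));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Third argument is the dynamic shared memory size in bytes.
    // No dynamic shared memory is used, so pass 0.
    fooKernel<<<N / cBlockSize, cBlockSize, 0, stream>>>(dOut);

    cudaError_t err = cudaStreamSynchronize(stream);
    printf("status: %s\n", cudaGetErrorString(err));

    cudaStreamDestroy(stream);
    cudaFree(dOut);
    return 0;
}
```

Because the size of words is never repeated at the launch site, changing the declaration inside the kernel requires no change to the launch expression.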
