Numba cuda dynamic shared memory: more than one type?
I am aware that I can create a dynamic shared memory array for a numba-compiled CUDA kernel by passing the size in bytes as the fourth argument to the kernel launch:
...
foo_kernel[grid, block, stream, shared_bytes](...)
...
@cuda.jit
def foo_kernel(...) -> None:
a = cuda.shared.array(0, nb.int32)
From here, I can slice a if I want to treat it as several arrays.
However, what if I want to have two arrays of different dtypes? Can I do something like:
...
a = cuda.shared.array(0, nb.int32)
b = cuda.shared.array(0, nb.float32)
...
and then slice b so that the values I access don't overlap with a?
Aha -- some googling finds: https://curiouscoding.nl/posts/numba-cuda-speedup/#v15-dynamic-shared-memory
That confirms that different dtypes are indeed supported, using the trick I guessed (as shown above): both declarations view the same dynamic shared allocation, so slicing them at the right offsets keeps the regions disjoint.
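The partitioning arithmetic can be sketched host-side with only the standard library: one flat byte buffer (playing the role of the single dynamic shared allocation) is carved into a non-overlapping int32 region and float32 region via memoryview casts. This is an analog for illustration, not numba API; the helper name partition_shared and the sizes n_ints / n_floats are made up here.

```python
def partition_shared(buf, n_ints, n_floats):
    """Carve one flat byte buffer into two non-overlapping typed views,
    mirroring how a single dynamic shared memory allocation can back
    both an int32-typed array and a float32-typed array in a kernel."""
    int_bytes = n_ints * 4  # int32 elements are 4 bytes each
    a = buf[:int_bytes].cast('i')  # int32 view starting at offset 0
    # float32 view starting right after a's bytes -- no overlap
    b = buf[int_bytes:int_bytes + n_floats * 4].cast('f')
    return a, b

# One allocation sized for both arrays, like shared_bytes in the launch.
n_ints, n_floats = 4, 3
raw = bytearray(n_ints * 4 + n_floats * 4)
a, b = partition_shared(memoryview(raw), n_ints, n_floats)
a[0] = 42
b[0] = 1.5
print(a[0], b[0])  # writes land in separate regions of the same buffer
```

Because int32 and float32 happen to be the same width, the in-kernel version of this trick only needs element-index offsets; with mixed element sizes you would compute the split point in bytes, as above.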