
Numba cuda dynamic shared memory: more than one type?

I am aware that I can create a dynamic shared memory array for a numba-compiled CUDA kernel by passing the size in bytes as the fourth argument to the kernel call:

    ...
    foo_kernel[grid, block, stream, shared_bytes](...)
    ...

    @cuda.jit
    def foo_kernel(...) -> None:
        a = cuda.shared.array(0, nb.int32)

From here, I can slice a if I want to treat it as several arrays.
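As a host-side illustration of that slicing (using a numpy array in place of `cuda.shared.array`, with made-up sizes `n1`/`n2`), the sub-arrays are just non-overlapping views of the one backing allocation:

```python
import numpy as np

# Sketch: one shared allocation of n1 + n2 int32 values, treated as two
# logical arrays via slicing. Slices are views, so writes land in `a`.
n1, n2 = 32, 96
a = np.zeros(n1 + n2, dtype=np.int32)  # stands in for the dynamic shared array
first = a[:n1]        # first logical array
second = a[n1:]       # second logical array, non-overlapping with the first
first[:] = 1
second[:] = 2
```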

However, what if I want two arrays of different dtypes? Can I do something like:

    ...
    a = cuda.shared.array(0, nb.int32)
    b = cuda.shared.array(0, nb.float32)
    ...

and then slice b so that the values I access don't overlap with a?

Aha -- some googling finds: https://curiouscoding.nl/posts/numba-cuda-speedup/#v15-dynamic-shared-memory

Which confirms that different dtypes are indeed supported, using the trick I guessed (as shown in the snippet above): both declarations alias the same dynamic shared memory base, so you slice each at non-overlapping byte offsets.
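The byte-offset arithmetic can be sketched on the host with numpy views over one raw buffer, which mirrors how the two `cuda.shared.array(0, ...)` declarations alias the same dynamic shared allocation. The element counts `N_INT`/`N_FLT` are assumptions for illustration; `shared_bytes` is what would be passed as the fourth launch argument:

```python
import numpy as np

# Partition sizes (assumed): how many int32 slots for `a`, float32 slots for `b`.
N_INT = 128
N_FLT = 64

# Total dynamic shared memory to request at kernel launch (both dtypes are 4 bytes).
shared_bytes = N_INT * 4 + N_FLT * 4

# One raw buffer stands in for the dynamic shared memory region.
raw = np.zeros(shared_bytes, dtype=np.uint8)

# `a` occupies the first N_INT int32 values; `b` is the float32 region after it.
a = raw.view(np.int32)[:N_INT]
b = raw[N_INT * 4:].view(np.float32)[:N_FLT]

a[:] = 7      # writes land in the first N_INT * 4 bytes only
b[:] = 1.5    # writes land in the remaining bytes, never overlapping `a`
```

On the device the same offsets apply, except the slicing is done on the two `cuda.shared.array(0, ...)` views directly; mind alignment if you mix dtypes of different sizes (place the wider type first).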
