Numba cuda dynamic shared memory: more than one type?
I am aware that I can create a dynamic shared memory array for a numba-compiled CUDA kernel by passing the size in bytes as the fourth argument to the kernel launch:
...
foo_kernel[grid, block, stream, shared_bytes](...)
...
@cuda.jit
def foo_kernel(...) -> None:
a = cuda.shared.array(0, nb.int32)
From here, I can slice a if I want to treat it as several arrays.
However, what if I want to have two arrays of different dtypes? Can I do something like:
...
a = cuda.shared.array(0, nb.int32)
b = cuda.shared.array(0, nb.float32)
...
and then slice b so that the values I access don't overlap with a?
Aha -- some googling finds: https://curiouscoding.nl/posts/numba-cuda-speedup/#v15-dynamic-shared-memory
That confirms that different dtypes are indeed supported, using the trick I guessed (as shown above): both declarations view the same dynamic shared allocation, so slicing them at the right offsets keeps the regions disjoint.
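The partitioning arithmetic can be sketched host-side with only the standard library: one flat byte buffer (playing the role of the single dynamic shared allocation) is carved into a non-overlapping int32 region and float32 region via memoryview casts. This is an analog for illustration, not numba API; the helper name partition_shared and the sizes n_ints / n_floats are made up here.

```python
def partition_shared(buf, n_ints, n_floats):
    """Carve one flat byte buffer into two non-overlapping typed views,
    mirroring how a single dynamic shared memory allocation can back
    both an int32-typed array and a float32-typed array in a kernel."""
    int_bytes = n_ints * 4  # int32 elements are 4 bytes each
    a = buf[:int_bytes].cast('i')  # int32 view starting at offset 0
    # float32 view starting right after a's bytes -- no overlap
    b = buf[int_bytes:int_bytes + n_floats * 4].cast('f')
    return a, b

# One allocation sized for both arrays, like shared_bytes in the launch.
n_ints, n_floats = 4, 3
raw = bytearray(n_ints * 4 + n_floats * 4)
a, b = partition_shared(memoryview(raw), n_ints, n_floats)
a[0] = 42
b[0] = 1.5
print(a[0], b[0])  # writes land in separate regions of the same buffer
```

Because int32 and float32 happen to be the same width, the in-kernel version of this trick only needs element-index offsets; with mixed element sizes you would compute the split point in bytes, as above.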