In Numba, how to copy an array into constant memory when targeting CUDA?

I have sample code that illustrates the issue:

import numpy as np
from numba import cuda, types
import configs


def main():
    arr = np.empty(0, dtype=np.uint8)

    stream = cuda.stream()
    d_arr = cuda.to_device(arr, stream=stream)
    kernel[configs.BLOCK_COUNT, configs.THREAD_COUNT, stream](d_arr)


@cuda.jit(types.void(
    types.Array(types.uint8, 1, 'C'),
), debug=configs.CUDA_DEBUG)
def kernel(d_arr):
    arr = cuda.const.array_like(d_arr)


if __name__ == "__main__":
    main()

When I run this code with cuda-memcheck, I get:

numba.errors.ConstantInferenceError: Failed in nopython mode pipeline (step: nopython rewrites)
Constant inference not possible for: arg(0, name=d_arr)

This seems to indicate that the array I passed in was not a constant, so it could not be copied to constant memory. Is that the case? If so, how can I copy an array that was given to a kernel as input into constant memory?

You can't populate constant memory from an array that was passed to the kernel as an argument. Such an array is already on the device, and device code cannot write to constant memory.

Constant memory can only be written from host code, and the constant-array syntax (cuda.const.array_like) expects its argument to be a host array.

Here is an example:

$ cat t32.py
import numpy as np
from numba import cuda, types, int32, int64

a = np.ones(3,dtype=np.int32)
@cuda.jit
def generate_mutants(b):
    c_a = cuda.const.array_like(a)
    b[0] = c_a[0]

if __name__ == "__main__":
    b = np.zeros(3,dtype=np.int32)
    generate_mutants[1, 1](b)
    print(b)
$ python t32.py
[1 0 0]
$

Note that the implementation of constant memory in Numba CUDA has some behavioral differences compared to what is possible with CUDA C/C++; this issue highlights some of them.
