CUDA-Python: How to launch a CUDA kernel in Python (Numba 0.25)?

Question

Could you please help me understand how to write CUDA kernels in Python? AFAIK, numba.vectorize can run on cuda, cpu, or parallel (multi-CPU), depending on the target argument. But target='cuda' requires setting up CUDA kernels.

The main issue is that many examples and answers on the Internet refer to the deprecated NumbaPro library, so outdated wikis and similar resources are hard to follow, especially for a newbie.

I have:

latest Anaconda (v2)
latest Numba (v0.25)
latest CUDA toolkit (v7)

Here is the error I'm getting:

numba.cuda.cudadrv.driver.CudaAPIError: 1 Call to cuLaunchKernel results in CUDA_ERROR_INVALID_VALUE

import numpy as np
import time

from numba import vectorize, cuda

@vectorize(['float32(float32, float32)'], target='cuda')
def VectorAdd(a, b):
    return a + b

def main():
    N = 32000000

    A = np.ones(N, dtype=np.float32)
    B = np.ones(N, dtype=np.float32)

    start = time.time()
    C = VectorAdd(A, B)
    vector_add_time = time.time() - start

    print("C[:5] = " + str(C[:5]))
    print("C[-5:] = " + str(C[-5:]))

    print("VectorAdd took %f seconds" % vector_add_time)

if __name__ == '__main__':
    main()

Answer 1

The code, as posted, is correct and runs without error on a Python 2 NumbaPro/Accelerate system.

The system used to run the code probably had limited capacity and was hitting either a display driver watchdog timeout or an out-of-memory error with 32-million-element vectors. Reducing the size of the input data allowed the code to run correctly.
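To put the memory explanation in numbers, here is a rough back-of-the-envelope sketch. It assumes the two float32 inputs and the float32 output all sit on the device at the same time, which is a simplification; Numba's real allocations may differ. The helper name vector_add_footprint_bytes is made up for illustration and is not part of Numba.

```python
# Rough device-memory estimate for the VectorAdd call in the question.
# Assumption: two input arrays plus one output array are resident on the GPU
# at the same time; Numba's actual allocations may differ.

FLOAT32_BYTES = 4  # size of one np.float32 element


def vector_add_footprint_bytes(n_elements, n_arrays=3, itemsize=FLOAT32_BYTES):
    """Return the bytes needed to hold n_arrays arrays of n_elements each."""
    return n_elements * n_arrays * itemsize


if __name__ == "__main__":
    # Compare the original size with a ten-times-smaller input.
    for n in (32000000, 3200000):
        mib = vector_add_footprint_bytes(n) / (1024.0 * 1024.0)
        print("N = %d -> about %.1f MiB" % (n, mib))
```

Under that assumption, N = 32000000 needs 384,000,000 bytes for the three arrays, which can be a tight fit on a small GPU that also drives the display.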

[This answer was assembled from comments and added as a community wiki entry to get this question off the unanswered list]
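If the full 32 million elements are still needed, one possible way to reduce the size of each call is to process the vectors in pieces. The answer above does not describe this approach; it is only a sketch, and chunk_bounds and apply_in_chunks are hypothetical helpers, not Numba functions. Whether it avoids the error on a given system is untested.

```python
# Sketch: split a long vector into fixed-size pieces so each call handles
# fewer elements. chunk_bounds and apply_in_chunks are hypothetical helpers,
# not part of Numba.


def chunk_bounds(n_elements, chunk_size):
    """Yield (start, stop) index pairs that cover range(n_elements)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_elements, chunk_size):
        yield start, min(start + chunk_size, n_elements)


def apply_in_chunks(func, a, b, out, chunk_size):
    """Call func on matching slices of a and b, writing results into out."""
    for start, stop in chunk_bounds(len(a), chunk_size):
        out[start:stop] = func(a[start:stop], b[start:stop])
    return out


if __name__ == "__main__":
    # Plain-Python stand-in for the GPU ufunc, so the sketch runs anywhere.
    def add_lists(x, y):
        return [xi + yi for xi, yi in zip(x, y)]

    a = [1.0] * 10
    b = [1.0] * 10
    print(apply_in_chunks(add_lists, a, b, [0.0] * 10, chunk_size=4))
```

With the code from the question, the call would look something like apply_in_chunks(VectorAdd, A, B, np.empty_like(A), 4000000), where the chunk size of 4000000 is an arbitrary choice to tune for the hardware.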
