How to execute a kernel function on an array of strings in Numba CUDA?

Question

I have an array of strings that I read from a file, and I want to compare each line of the file to a specific string. The file is large (about 200 MB of lines).

I have followed this tutorial, https://nyu-cds.github.io/python-numba/05-cuda/, but it doesn't show exactly how to deal with an array of strings/characters.

import numpy as np
from numba import cuda



@cuda.jit
def my_kernel(io_array):

    tx = cuda.threadIdx.x

    ty = cuda.blockIdx.x

    bw = cuda.blockDim.x

    pos = tx + ty * bw
    if pos < io_array.size:  # Check array boundaries
        io_array[pos]   # i want here to compare each line of the string array to a specific line

def main():
    a = open("test.txt", 'r')  # open file in read mode

    print("the file contains:")
    data = country = np.array(a.read())


    # Set the number of threads in a block
    threadsperblock = 32

    # Calculate the number of thread blocks in the grid
    blockspergrid = (data.size + (threadsperblock - 1)) // threadsperblock

    # Now start the kernel
    my_kernel[blockspergrid, threadsperblock](data)


    # Print the result
    print(data)

if __name__ == '__main__':
        main()

I have two problems.

First: how do I send the sentence (string) that I want to compare each line of my file against to the kernel function (in the io_array, without affecting the thread computation)?

Second: how do I deal with a string array? I get this error when I run the code above:

this error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: typing of intrinsic-call at test2.py (18)

File "test2.py", line 18:
def my_kernel(io_array):
    <source elided>
    if pos < io_array.size:  # Check array boundaries
        io_array[pos]   # do the computation

PS: I'm new to CUDA and have just started learning it.

Answer 1

First of all, this:

data = country = np.array(a.read())

doesn't do what you think it does. It does not yield a numpy array that you can index like this:

io_array[pos]

If you don't believe me, just try that in ordinary Python code with something like:

print(data[0])

and you'll get an error. If you want help with that, just ask your question on the python or numpy tag.
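To make the failure concrete, here is a minimal check in plain NumPy (no GPU needed). The sample text is made up for illustration and stands in for the result of a.read(). The point is only that wrapping a single Python string in np.array gives a zero-dimensional array, not one element per line or per character.

```python
import numpy as np

# Stand-in for a.read(): the whole file as one Python string (made-up sample text).
text = "the quick brown fox\njumped over the lazy dog\n"

data = np.array(text)
print(data.ndim)   # number of dimensions of the resulting array
print(data.size)   # number of elements in the resulting array

try:
    data[0]
    indexing_failed = False
except IndexError:
    # A zero-dimensional array cannot be indexed with an integer.
    indexing_failed = True
print(indexing_failed)
```

Because the array holds one string object rather than numeric elements, it is also not something a CUDA kernel can index element by element.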

So we need a different method to load the string data from disk. For simplicity, I choose to use numpy.fromfile(). This method will require that all lines in your file are of the same width. I like that concept. There's more information you would have to describe if you want to handle lines of varying lengths.

If we start out that way, we can load the data as an array of bytes, and use that:
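If your file does not already have equal-width lines, one possible way to get there is to pad every line to the width of the longest one before handing the bytes to the GPU. The sketch below is only an illustration of that idea, not part of the original method: pad_lines is a hypothetical helper, the "." fill byte is an arbitrary choice, and the sample text is made up. Unlike the flat, newline-terminated layout used in this answer, it drops the newlines and returns a 2D array with one line per row.

```python
import numpy as np

def pad_lines(raw: bytes, fill: bytes = b".") -> np.ndarray:
    """Return a 2D uint8 array with one padded line per row (newlines dropped)."""
    lines = raw.splitlines()
    width = max(len(line) for line in lines)
    # Right-pad each line with the fill byte so all rows have the same width.
    padded = b"".join(line.ljust(width, fill) for line in lines)
    return np.frombuffer(padded, dtype=np.uint8).reshape(len(lines), width)

# Made-up sample with lines of different lengths.
raw = b"the quick brown fox\njumped over the lazy dog\nrepeatedly\n"
table = pad_lines(raw)
print(table.shape)
print(table[2].tobytes())
```

The fill byte must not be one that can legitimately end a line in your data, otherwise padded and unpadded lines become indistinguishable.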

$ cat test.txt
the quick brown fox.............
jumped over the lazy dog........
repeatedly......................
$ cat t43.py
import numpy as np
from numba import cuda

@cuda.jit
def my_kernel(str_array, check_str, length, lines, result):

    col,line = cuda.grid(2)
    pos = (line*(length+1))+col
    if col < length and line < lines:  # Check array boundaries
        if str_array[pos] != check_str[col]:
            result[line] = 0

def main():
    a = np.fromfile("test.txt", dtype=np.byte)
    print("the file contains:")
    print(a)
    print("array length is:")
    print(a.shape[0])
    print("the check string is:")
    b = a[33:65]
    print(b)
    i = 0
    while a[i] != 10:
        i=i+1
    line_length = i
    print("line length is:")
    print(line_length)
    print("number of lines is:")
    line_count = a.shape[0]//(line_length+1)  # integer division, so this stays an int on Python 3
    print(line_count)
    res = np.ones(line_count)
    # Set the number of threads in a block
    threadsperblock = (32,32)

    # Calculate the number of thread blocks in the grid
    blocks_x = (line_length//32)+1
    blocks_y = (line_count//32)+1
    blockspergrid = (blocks_x,blocks_y)
    # Now start the kernel
    my_kernel[blockspergrid, threadsperblock](a, b, line_length, line_count, res)


    # Print the result
    print("matching lines (match = 1):")
    print(res)

if __name__ == '__main__':
        main()
$ python t43.py
the file contains:
[116 104 101  32 113 117 105  99 107  32  98 114 111 119 110  32 102 111
 120  46  46  46  46  46  46  46  46  46  46  46  46  46  10 106 117 109
 112 101 100  32 111 118 101 114  32 116 104 101  32 108  97 122 121  32
 100 111 103  46  46  46  46  46  46  46  46  10 114 101 112 101  97 116
 101 100 108 121  46  46  46  46  46  46  46  46  46  46  46  46  46  46
  46  46  46  46  46  46  46  46  10]
array length is:
99
the check string is:
[106 117 109 112 101 100  32 111 118 101 114  32 116 104 101  32 108  97
 122 121  32 100 111 103  46  46  46  46  46  46  46  46]
line length is:
32
number of lines is:
3
matching lines (match = 1):
[ 0.  1.  0.]
$
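The listing takes its check string from the file itself (a[33:65]). For the first part of the question, passing your own sentence, one option is to encode the Python string to bytes and wrap it in an array of the same dtype, which should then be usable as the check_str argument in place of b. The sketch below shows that conversion together with a CPU-only NumPy version of the same per-line comparison, which can serve as a reference when checking the kernel's result. encode_check_string and match_lines_cpu are hypothetical helper names, ASCII text is assumed, and the sample data is rebuilt in memory rather than read from test.txt.

```python
import numpy as np

def encode_check_string(s: str) -> np.ndarray:
    """Encode an ASCII string as a byte array (dtype=np.byte, as in the listing)."""
    # copy() gives the array its own writable memory instead of a read-only view.
    return np.frombuffer(s.encode("ascii"), dtype=np.byte).copy()

def match_lines_cpu(flat: np.ndarray, check: np.ndarray) -> np.ndarray:
    """CPU reference: 1.0 where a line equals check, else 0.0.

    Assumes every line has len(check) bytes followed by one newline byte.
    """
    width = check.shape[0]
    rows = flat.reshape(-1, width + 1)[:, :width]  # drop the newline column
    return (rows == check).all(axis=1).astype(np.float64)

# Rebuild the three 32-character lines from test.txt in memory.
lines = ["the quick brown fox", "jumped over the lazy dog", "repeatedly"]
text = "".join(line.ljust(32, ".") + "\n" for line in lines)
a = np.frombuffer(text.encode("ascii"), dtype=np.byte)

b = encode_check_string("jumped over the lazy dog".ljust(32, "."))
res = match_lines_cpu(a, b)
print(res)  # only the second line should be flagged as a match
```

The check string has to be padded to the same width as the lines in the file, since the kernel compares it column by column.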

Answer 1: 2 votes, accepted, 2019-05-02 02:58:08