How can I speed up Python's dictionary with Numba

I need to store a few cells of an array of Boolean values. At first I used NumPy, but when the arrays started to take a lot of memory, I had the idea of storing only the non-zero elements in a dictionary with tuples as keys (because tuples are hashable). For example: {(0, 0, 0): True, (1, 2, 3): True} (these are two cells of a "3D array" with indices 0,0,0 and 1,2,3, but the number of dimensions is not known in advance and is defined when I run my algorithm). It helped a lot, because the non-zero cells fill just a small part of the array.
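
As a minimal sketch of the storage scheme described above, the snippet below keeps only the occupied cells in a plain dict keyed by index tuples. The helper names `set_cell` and `has_cell` are illustrative and do not appear in the original post; since the stored value is always True, a `set` of tuples would probably work just as well.

```python
# Sparse storage of occupied cells: only non-zero cells appear in the dict.
# set_cell / has_cell are hypothetical helper names used for illustration.

def set_cell(area_dict, index):
    """Mark the cell at `index` (any sequence of ints) as occupied."""
    area_dict[tuple(index)] = True

def has_cell(area_dict, index):
    """Return True if the cell at `index` is occupied."""
    return tuple(index) in area_dict

area = {}
set_cell(area, [0, 0, 0])
set_cell(area, [1, 2, 3])

print(has_cell(area, [1, 2, 3]))  # True
print(has_cell(area, [4, 4, 4]))  # False
print(len(area))                  # 2
```

The number of dimensions is never declared anywhere: it is simply the length of the tuples that end up as keys, which is exactly what makes this convenient in plain Python and awkward under a compiler that needs fixed types.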

To write values to this dict and read them back I need to use loops:

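import numpy as np

def fill_cells(indices, area_dict):
    # Mark every row of `indices` as an occupied cell.
    for i in indices:
        area_dict[tuple(i)] = 1

def get_cells(indices, area_dict):
    # Return a Boolean array: True where the row of `indices` is occupied.
    n = len(indices)
    out = np.zeros(n, dtype=np.bool_)
    for i in range(n):
        out[i] = tuple(indices[i]) in area_dict
    return out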

Now I need to speed it up with Numba. Numba doesn't support Python's native dict(), so I used numba.typed.Dict. The problem is that Numba wants to know the size of the keys when the functions are defined, so I can't even create the dictionary (the length of the keys is not known in advance and is defined when I call the function):

from numba import njit

@njit
def make_dict(n):
    out = {(0,)*n:True}
    return out

Numba can't infer the type of the dictionary keys and reports this error:

Compilation is falling back to object mode WITH looplifting enabled because Function "make_dict" failed type inference due to: Invalid use of Function(<built-in function mul>) with argument(s) of type(s): (tuple(int64 x 1), int64)

If I replace n with a concrete number in the function, it works. I worked around it with this trick:

n = 10
s = '@njit\ndef make_dict():\n\tout = {(0,)*%s:True}\n\treturn out' % n
exec(s)

But I think this is a wrong and inefficient way. And I still need to use my fill_cells and get_cells functions with the @njit decorator, but Numba returns the same error because I'm trying to create a tuple from a NumPy array in these functions.

I understand the fundamental limitations of Numba (and of compilation in general), but maybe there is some way to speed up these functions, or maybe you have another solution to my cell-storing problem?

Final solution:

The main problem was that Numba needs to know the length of a tuple when the function that creates it is defined. The trick is to redefine the function each time. I needed to generate a string with code that defines the function and run it with exec():

n = 10
s = '@njit\ndef arr_to_tuple(a):\n\treturn (' + ''.join('a[%i],' % i for i in range(n)) + ')'
exec(s)

After that I can call arr_to_tuple(a) to create tuples that can be used in other @njit-decorated functions.

For example, creating an empty dictionary with tuple keys, which is needed to solve the problem:
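
To make the code-generation step easier to inspect, here is a minimal pure-Python sketch of the same idea. It assumes a hypothetical wrapper `make_arr_to_tuple` (not part of the original post) and leaves out the `@njit` decorator so that it runs without Numba installed; in the post the generated source is additionally prefixed with `@njit`.

```python
# Build the source of a fixed-length tuple constructor for a given n.
# make_arr_to_tuple is a hypothetical wrapper used for illustration only.
# The "@njit" line from the post is omitted so this runs without Numba.

def make_arr_to_tuple(n):
    """Return (function, source) for a length-n sequence-to-tuple converter."""
    items = "".join("a[%d], " % i for i in range(n))
    source = "def arr_to_tuple(a):\n    return (" + items + ")\n"
    namespace = {}
    exec(source, namespace)  # defines arr_to_tuple inside `namespace`
    return namespace["arr_to_tuple"], source

arr_to_tuple, source = make_arr_to_tuple(3)
print(source)
print(arr_to_tuple([7, 8, 9]))  # (7, 8, 9)
```

Running `exec` into an explicit namespace dict rather than the module globals is a design choice of this sketch: it keeps the generated function from silently overwriting an earlier definition with a different n.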

@njit
def make_empty_dict():
    # The array length must match the n used to generate arr_to_tuple.
    tpl = arr_to_tuple(np.array([0]*10))
    out = {tpl:True}
    del out[tpl]
    return out

I write one element into the dictionary because that is one of the ways Numba can infer its types.

Also, I need the fill_cells and get_cells functions described in the question. This is how I've rewritten them with Numba:

Writing elements. I just changed tuple() to arr_to_tuple():

@njit
def fill_cells_nb(indices, area_dict):
    for i in range(len(indices)):
        area_dict[arr_to_tuple(indices[i])] = True

Getting elements from the dictionary required some rather awkward code:

@njit
def get_cells_nb(indices, area_dict):
    n = len(indices)
    out = np.zeros(n, dtype=np.bool_)
    for i in range(n):
        len_before = len(area_dict)
        tpl = arr_to_tuple(indices[i])
        area_dict[tpl] = True  # temporary insert; adds no key if tpl exists
        len_after = len(area_dict)
        if len_before == len_after:
            out[i] = True  # length unchanged: tpl was already a key
        else:
            del area_dict[tpl]  # tpl was new: undo the insert
    return out

My version of Numba (0.46) doesn't support the in operator (__contains__) or the try-except construct. If your version supports them, you can write a more "regular" solution.

So when I want to check whether an element with some index exists in the dictionary, I record the dictionary's length and then write something into the dictionary under that index. If the length changed, I conclude that the element did not exist, and I delete it again. Otherwise the element exists. It looks like a very slow solution, but it isn't.
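
The length-based membership check can be tried in isolation with an ordinary Python dict. The function name `contains_via_len` is hypothetical and used only for this sketch. Note that the temporary write overwrites the value of an existing key, which is harmless here only because every stored value is True.

```python
# Membership test built from len(), item assignment and del only.
# contains_via_len is a hypothetical name used for illustration.

def contains_via_len(d, key):
    """Return True if `key` is in `d`, leaving the set of keys unchanged."""
    len_before = len(d)
    d[key] = True              # adds a key only if `key` was absent
    if len(d) == len_before:   # length unchanged: key was already present
        return True
    del d[key]                 # key was new: undo the temporary insert
    return False

cells = {(0, 0, 0): True, (1, 2, 3): True}
print(contains_via_len(cells, (1, 2, 3)))  # True
print(contains_via_len(cells, (9, 9, 9)))  # False
print(len(cells))                          # 2
```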

Speed test:

The solutions work surprisingly fast. I tested them with %timeit against optimized native-Python code:

  1. arr_to_tuple() is 5 times faster than the regular tuple() function
  2. get_cells with Numba is 3 times faster for one element and 40 times faster for big arrays of elements, compared to get_cells written in native Python
  3. fill_cells with Numba is 4 times faster for one element and 40 times faster for big arrays of elements, compared to fill_cells written in native Python
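
Outside IPython, where %timeit is not available, a comparable measurement can be taken with the standard timeit module. The sketch below uses a hypothetical helper `best_time_per_call` and times only the built-in tuple(), so it does not reproduce the figures above. When timing a Numba-compiled function this way, it should be called once beforehand so that compilation time is not included.

```python
import timeit

# best_time_per_call is a hypothetical helper used for illustration.
def best_time_per_call(func, arg, number=10000, repeat=5):
    """Return the best observed time per call of func(arg), in seconds."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number

data = list(range(10))
t = best_time_per_call(tuple, data)
print("tuple(): %.3g s per call" % t)
```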
