How to speed up Python dictionaries with Numba

Question

I need to store a few cells in an array of Boolean values. At first I used NumPy, but when the arrays started to take a lot of memory, I got the idea of storing only the non-zero elements in a dictionary with tuples as keys (because tuples are hashable). For example: {(0, 0, 0): True, (1, 2, 3): True} (these are two cells in a "3D array" with indices 0,0,0 and 1,2,3, but the number of dimensions is unknown in advance and is defined when I run my algorithm). It helped a lot, because the non-zero cells fill just a small part of the array.

To write values to this dict and read them back, I need to use loops:

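import numpy as np

def fill_cells(indices, area_dict):
    # Mark every cell listed in indices as set.
    for i in indices:
        area_dict[tuple(i)] = 1

def get_cells(indices, area_dict):
    # Return a Boolean array: True where the cell is present in the dict.
    n = len(indices)
    out = np.zeros(n, dtype=np.bool_)  # np.bool_ instead of the deprecated np.bool alias
    for i in range(n):
        out[i] = tuple(indices[i]) in area_dict
    return out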

Now I need to speed it up with Numba. Numba doesn't support Python's native dict(), so I used numba.typed.Dict. The problem is that Numba wants to know the size of the keys when the functions are defined, so I can't even create the dictionary (the length of the keys is unknown in advance and is defined when I call the function):

@njit
def make_dict(n):
    out = {(0,)*n:True}
    return out

Numba can't infer the type of the dictionary keys and reports this error:

Compilation is falling back to object mode WITH looplifting enabled because Function "make_dict" failed type inference due to: Invalid use of Function(<built-in function mul>) with argument(s) of type(s): (tuple(int64 x 1), int64)

If I change n to a concrete number in the function, it works. I solved it with this trick:

n = 10
s = '@njit\ndef make_dict():\n\tout = {(0,)*%s:True}\n\treturn out' % n
exec(s)

But I think this is a wrong and inefficient way. And I still need to use my fill_cells and get_cells functions with the @njit decorator, but Numba returns the same error because I'm trying to create a tuple from a NumPy array in these functions.

I understand the fundamental limitations of Numba (and of compilation in general), but maybe there is some way to speed up these functions, or maybe you have another solution to my cell-storing problem?

Answer 1

Final solution:

The main problem was that Numba needs to know the length of a tuple when the function that creates it is defined. The trick is to redefine the function each time. I needed to generate a string with code that defines the function and run it with exec():

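from numba import njit

n = 10  # tuple length, known only at run time
s = '@njit\ndef arr_to_tuple(a):\n\treturn (' + ''.join('a[%i],' % i for i in range(n)) + ')'
exec(s)  # defines arr_to_tuple(a) in the current namespace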

After that I can call arr_to_tuple(a) to create tuples that can be used in other @njit-decorated functions.
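
The code-generation step can also be wrapped in a small factory so that the generated function lives in its own namespace instead of the module globals. The following is only a sketch of that idea in plain Python: make_arr_to_tuple and its decorator parameter are hypothetical names, not part of Numba. The sketch runs without Numba; passing numba.njit as the decorator is assumed to give the same result as the '@njit' line in the generated string above.

```python
def make_arr_to_tuple(n, decorator=None):
    """Build arr_to_tuple(a) for a fixed length n by generating its source."""
    # The generated body spells out every index, e.g. "(a[0], a[1], a[2], )",
    # so the tuple length is a literal part of the function's code.
    items = "".join("a[%d], " % i for i in range(n))
    source = "def arr_to_tuple(a):\n    return (%s)" % items
    namespace = {}
    exec(source, namespace)  # define the function in a private namespace
    func = namespace["arr_to_tuple"]
    # `decorator` is a hypothetical hook: pass numba.njit here to compile it.
    return decorator(func) if decorator is not None else func


arr_to_tuple = make_arr_to_tuple(3)
print(arr_to_tuple([4, 5, 6]))  # (4, 5, 6)
```

Calling the factory again with a different n gives a separate function, so several tuple lengths can coexist in one program.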

For example, here is how to create an empty dictionary with tuple keys, which is needed to solve the problem:

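@njit
def make_empty_dict():
    # The array length must match the n used to generate arr_to_tuple.
    tpl = arr_to_tuple(np.zeros(n, dtype=np.int64))
    out = {tpl:True}  # one entry lets Numba infer the key and value types
    del out[tpl]      # remove it again so the dictionary is empty
    return out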

I write one element to the dictionary because that is one of the ways Numba can infer its types.

I also need the fill_cells and get_cells functions described in the question. This is how I rewrote them with Numba:

Writing elements. I just changed tuple() to arr_to_tuple():

@njit
def fill_cells_nb(indices, area_dict):
    for i in range(len(indices)):
        area_dict[arr_to_tuple(indices[i])] = True

Getting elements from the dictionary required some rather creepy code:

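@njit
def get_cells_nb(indices, area_dict):
    n = len(indices)
    out = np.zeros(n, dtype=np.bool_)
    for i in range(n):
        len_before = len(area_dict)
        tpl = arr_to_tuple(indices[i])
        area_dict[tpl] = True  # insert (or overwrite) the key
        len_after = len(area_dict)
        if len_before == len_after:
            # The length did not change, so the key was already present.
            out[i] = True
        else:
            # The key was new: remove it to leave the dictionary unchanged.
            del area_dict[tpl]
    return out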

My version of Numba (0.46) doesn't support the in operator (__contains__) or the try-except construct. If your version supports them, you can write a more "regular" solution.

So when I want to check whether an element with some index exists in the dictionary, I record the dictionary's length and then write something to the dictionary at that index. If the length changed, I conclude that the element didn't exist (and delete it again). Otherwise the element exists. It looks like a very slow solution, but it's not.
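
The length-comparison check described above can be illustrated in plain Python, without Numba. This is a minimal sketch of the same logic; contains_by_length is a hypothetical helper name introduced only for illustration.

```python
def contains_by_length(d, key):
    """Report whether key is in d using only len(), assignment and del."""
    len_before = len(d)
    d[key] = True              # overwrites an existing key or adds a new one
    if len(d) == len_before:   # no growth: the key was already there
        return True
    del d[key]                 # growth: the key was new, so undo the insert
    return False


cells = {(0, 0, 0): True, (1, 2, 3): True}
print(contains_by_length(cells, (1, 2, 3)))  # True
print(contains_by_length(cells, (9, 9, 9)))  # False
print(len(cells))                            # 2
```

Note that the probe overwrites the stored value with True, so the approach is only safe here because every value in the dictionary is True anyway.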

Speed test:

The solutions work surprisingly fast. I tested them with %timeit against optimized native-Python code:

arr_to_tuple() is 5 times faster than the regular tuple() function
get_cells with Numba is 3 times faster for one element and 40 times faster for big arrays of elements compared to get_cells written in native Python
fill_cells with Numba is 4 times faster for one element and 40 times faster for big arrays of elements compared to fill_cells written in native Python

Answer 1: accepted, 2019-12-28 14:36:03
