Faster for-loops with arrays in Python

import numpy as np

N, M = 1000, 4000000
a = np.random.uniform(0, 1, (N, M))
k = np.random.randint(0, N, (N, M))

out = np.zeros((N, M))
for i in range(N):
    for j in range(M):
        out[k[i, j], j] += a[i, j]

I work with very long for-loops; %%timeit on the above, with pass replacing the operation, yields:

1min 19s ± 663 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
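For reference, the same kind of measurement can be reproduced at a smaller scale outside a notebook. This is a minimal sketch using the standard-library timeit module; the function name empty_loops and the reduced sizes are assumptions chosen so it finishes quickly, not the sizes from the question.

```python
import timeit


def empty_loops(n, m):
    """Run the nested loops with the body replaced by `pass`."""
    for i in range(n):
        for j in range(m):
            pass


# Reduced sizes (assumption): 100 * 4000 = 400,000 iterations per run.
n, m = 100, 4000
runs = 5
elapsed = timeit.timeit(lambda: empty_loops(n, m), number=runs) / runs
print(f"{elapsed * 1e9 / (n * m):.1f} ns per iteration (interpreter overhead only)")
```

The per-iteration figure is pure interpreter overhead, since the loop body does no array work at all.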

This is unacceptable in context (C++ took 6.5 sec). There's no reason for the above to be done with Python objects; arrays have well-defined types. Implementing this in C/C++ as an extension is overkill on both the developer and user ends; I'm just passing arrays to loop over and do arithmetic on.

Is there a way to tell NumPy "move this logic to C", or another library that can handle nested loops involving only arrays? I seek it for the general case, not workarounds for this specific example (but if you have one I can open a separate Q&A).

This is basically the idea behind Numba. Not as fast as C, but it can get close. It uses a JIT compiler to compile Python code to machine code, and it's compatible with many NumPy functions. (You can find all the details in the docs.)

import numpy as np
from numba import njit


@njit
def f(N, M):
    a = np.random.uniform(0, 1, (N, M))
    k = np.random.randint(0, N, (N, M))

    out = np.zeros((N, M))
    for i in range(N):
        for j in range(M):
            out[k[i, j], j] += a[i, j]
    return out


def f_python(N, M):
    a = np.random.uniform(0, 1, (N, M))
    k = np.random.randint(0, N, (N, M))

    out = np.zeros((N, M))
    for i in range(N):
        for j in range(M):
            out[k[i, j], j] += a[i, j]
    return out

Pure Python:

%%timeit

N, M = 100, 4000
f_python(M, N)

338 ms ± 12.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With Numba:

%%timeit

N, M = 100, 4000
f(M, N)

12 ms ± 534 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
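Note that both functions above also time the random-number generation, and that the first call to an @njit function includes compilation. Below is a minimal sketch that separates these, assuming a hypothetical kernel name scatter_add that takes the arrays as arguments. The try/except fallback is my addition so that the snippet still runs (uncompiled) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback (assumption): run uncompiled if Numba is missing
    def njit(func):
        return func


@njit
def scatter_add(a, k, out):
    # Same nested loop as in f, but operating on arrays passed in by the caller.
    n, m = a.shape
    for i in range(n):
        for j in range(m):
            out[k[i, j], j] += a[i, j]
    return out


N, M = 100, 4000
a = np.random.uniform(0, 1, (N, M))
k = np.random.randint(0, N, (N, M))

# The first call triggers JIT compilation; keep it out of any timing.
result = scatter_add(a, k, np.zeros((N, M)))

# Each column of the output only redistributes that column of `a`,
# so the column sums must agree.
assert np.allclose(result.sum(axis=0), a.sum(axis=0))
```

After the warm-up call, timing scatter_add(a, k, np.zeros((N, M))) measures the compiled loop plus the output allocation, without the random-number generation.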
