How do I make my Python code for summing much faster?

The code below sums the defined function starting from a lower limit of 2. I vary the upper limit of the sum because I want to find out where the sum converges. I found that the bigger the upper limit, the slower the code runs. How do I make this code run fast for any large upper limit value? The code is as follows:

from math import log

def h(i):
    x = (-1)**(i+1)          # alternating sign
    y = 1000000000-(i-1)     # weight that depends on the upper limit
    z = (log(i))**20
    return x*y*z

gx = sum(h(i) for i in range(2, 1000000000+1))
d_gx = gx/1000000000
print(d_gx)

Numba is a Python library for just-in-time (JIT) compilation of pure Python code, without the need for external compilation steps.

The two important functions I'll introduce here are numba.njit and numba.vectorize, which are both decorators. njit compiles a pure function to machine code (provided it uses only the subset of Python and NumPy that Numba supports), and vectorize makes a function operate on both scalars and ndarrays.
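As a minimal sketch of how the two decorators are applied: calling njit(h) or vectorize(h) directly has the same effect as the njit()(h) and vectorize()(h) forms used in the session in this answer. The names h_jit and h_ufunc are illustrative, the sign is computed with a parity test instead of (-1)**(i+1) purely for readability, and the import fallback is an assumption added so the sketch still runs, uncompiled, where Numba is not installed.

```python
from math import log

import numpy as np

try:
    from numba import njit, vectorize
except ImportError:
    # Assumption: fall back to uncompiled stand-ins so the sketch still runs.
    njit = lambda func: func
    vectorize = np.vectorize

N = 1_000_000_000  # the constant from the question


def h(i):
    sign = -1.0 if i % 2 == 0 else 1.0   # same value as (-1)**(i+1)
    return sign * (N - (i - 1)) * log(i) ** 20


h_jit = njit(h)         # compiled on first call, takes one scalar at a time
h_ufunc = vectorize(h)  # accepts a scalar or a whole ndarray

print(h_jit(3))
print(h_ufunc(np.arange(2, 6)))
```

With Numba installed, the first call to each wrapped function includes compilation time, so benchmarks should measure later calls.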

In [1]: from numba import vectorize, njit; from math import log

In [2]: def h(i):
   ...:     x = (-1)**(i+1)
   ...:     y = 1000000000-(i-1)
   ...:     z = (log(i))**20
   ...:     return x*y*z
   ...:

In [3]: %timeit sum(h(i) for i in range (2, 1000000+1))
646 ms ± 9.16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As you can see here, the naive implementation of your function takes an average of 646 ms on a reduced portion of the input space (an upper limit of 1,000,000 instead of 1,000,000,000). We can improve this by jitting your function:

In [4]: jit_h = njit()(h)

In [5]: %timeit sum(jit_h(i) for i in range (2, 1000000+1))
179 ms ± 3.88 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

We've trimmed this down to 179 ms, which is a large improvement over the original 646 ms (roughly 3.6 times faster). Because Python-level for-loops are slow, we can try to vectorize the operation using numpy arrays as inputs:

In [6]: vectorize_h = vectorize()(h)

In [7]: %timeit sum(vectorize_h(i) for i in range (2, 1000000+1))
657 ms ± 4.55 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

As expected, vectorizing the function permits passing scalars, but calling it one scalar at a time doesn't improve performance; in fact, it's a bit slower than the naive version (657 ms versus 646 ms). What if we operate on an entire numpy array?

In [8]: import numpy as np

In [9]: %timeit sum(vectorize_h(np.arange(2,1000000+1)))
149 ms ± 1.78 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
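To illustrate what operating on a whole array means, the same formula can also be written with plain NumPy operations. Note that this swaps NumPy's own ufuncs in for Numba's vectorize, so it is a related technique rather than the one timed above, and no timing is claimed for it. The name h_array and the parameter n are hypothetical.

```python
import numpy as np


def h_array(i, n=1_000_000_000):
    """Evaluate the question's h for every element of i in one pass."""
    i = np.asarray(i, dtype=np.int64)
    sign = np.where(i % 2 == 0, -1.0, 1.0)   # same value as (-1)**(i+1)
    return sign * (n - (i - 1)) * np.log(i) ** 20


values = h_array(np.arange(2, 1_000_000 + 1))
print(values.shape)   # one result per input element
print(values[:3])
```

Each arithmetic step here loops over the array in compiled code, which is why passing one large array is so much cheaper than making a million separate scalar calls.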

Finally, what if we replace the sum builtin with the numpy ndarray sum method?

In [10]: %timeit vectorize_h(np.arange(2,1000000+1)).sum() 
17.2 ms ± 207 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
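The gap between the last two timings comes from where the loop runs: the builtin sum iterates over the array in Python, one element at a time, whereas ndarray.sum performs the reduction inside NumPy's compiled code. A small sketch on stand-in data (the array terms is illustrative, not the question's values) suggests that both give the same total up to floating-point rounding.

```python
import numpy as np

terms = np.sqrt(np.arange(1, 10_001, dtype=np.float64))  # illustrative data

python_total = sum(terms)   # Python-level loop over 10,000 NumPy scalars
numpy_total = terms.sum()   # single reduction in compiled code

# The two can differ in the last few bits because NumPy may add the
# elements in a different order than a left-to-right Python loop.
print(np.isclose(python_total, numpy_total))
```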

A further reduction down to 17.2 ms, a huge improvement over the naive implementation (roughly 38 times faster than the original 646 ms).
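One further option, not timed in this answer, is to move the whole loop inside a single jitted function so that no intermediate array is created. This matters for the original upper limit of 1,000,000,000, where each 64-bit array of that length would occupy about 8 GB. The sketch below is only an illustration under stated assumptions: the function name alternating_sum and its parameters are hypothetical, and the import fallback is there only so the code still runs, uncompiled, without Numba.

```python
from math import log

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the same loop as ordinary Python.
    njit = lambda func: func


@njit
def alternating_sum(n, upper):
    """Sum (-1)**(i+1) * (n - (i - 1)) * log(i)**20 for i = 2..upper."""
    total = 0.0
    for i in range(2, upper + 1):
        sign = -1.0 if i % 2 == 0 else 1.0
        total += sign * (n - (i - 1)) * log(i) ** 20
    return total


# Small upper limit for a quick check; the question uses upper = n.
n = 1_000_000_000
print(alternating_sum(n, 100_000) / n)
```

Because the terms alternate in sign and grow very large, the result is sensitive to floating-point rounding, so small differences between this loop and the array-based versions are to be expected.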
