
Improve performance of operation on numpy trigonometric functions

I have a rather large code base which I need to optimize. After some profiling with time.time(), I've found that the line that takes up the most processing time (it is executed thousands of times) is this one:

A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

where all the variables can be randomly defined with:

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

Is there a way to improve the performance of the calculation of A? As I'm already using numpy, I'm pretty much out of ideas.

By using the product-to-sum trig identities, you can reduce the number of trig function calls. In the following, func1 and func2 compute the same value, but func2 makes fewer calls to trig functions (three instead of five).

import numpy as np

def func1(a, b, c, d):
    A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)
    return A

def func2(a, b, c, d):
    s = np.sin(c - d)
    A = 0.5*((1 - s)*np.cos(a - b) + (1 + s)*np.cos(a + b))
    return A
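
The rewrite follows from cos(a)cos(b) = (cos(a - b) + cos(a + b))/2 and sin(a)sin(b) = (cos(a - b) - cos(a + b))/2. Substituting both into the original expression and collecting terms gives the form used in func2. A quick numerical check, as a minimal sketch (the fixed seed and the names same and max_abs_diff are my own additions, not part of the answer):

```python
import numpy as np

def func1(a, b, c, d):
    # Direct form: five trig evaluations per element.
    return np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

def func2(a, b, c, d):
    # Product-to-sum form: three trig evaluations per element.
    s = np.sin(c - d)
    return 0.5 * ((1 - s) * np.cos(a - b) + (1 + s) * np.cos(a + b))

# Assumption: a seeded generator, only so the check is repeatable.
rng = np.random.default_rng(0)
N = 5000
a = rng.uniform(0., 10., N)
b = rng.uniform(0., 50., N)
c = rng.uniform(0., 30., N)
d = rng.uniform(0., 25., N)

# The two forms should agree up to floating-point rounding.
same = np.allclose(func1(a, b, c, d), func2(a, b, c, d))
max_abs_diff = np.max(np.abs(func1(a, b, c, d) - func2(a, b, c, d)))
print(same, max_abs_diff)
```

The results are not bit-identical, since the operations are evaluated in a different order, so compare with a tolerance rather than with ==.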

Here's a timing comparison with N = 5000:

In [48]: %timeit func1(a, b, c, d)
1000 loops, best of 3: 374 µs per loop

In [49]: %timeit func2(a, b, c, d)
1000 loops, best of 3: 241 µs per loop
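
Outside IPython, a similar comparison can be run with the standard timeit module. This is only a sketch: the helper best_time and its repeat/number settings are my own choices, and the absolute numbers will depend on the machine and the numpy build.

```python
import timeit
import numpy as np

def func1(a, b, c, d):
    return np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

def func2(a, b, c, d):
    s = np.sin(c - d)
    return 0.5 * ((1 - s) * np.cos(a - b) + (1 + s) * np.cos(a + b))

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

def best_time(func, repeat=5, number=200):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(a, b, c, d))
    return min(timer.repeat(repeat=repeat, number=number)) / number

t1 = best_time(func1)
t2 = best_time(func2)
print(f"func1: {t1 * 1e6:.0f} us per call")
print(f"func2: {t2 * 1e6:.0f} us per call")
```

Taking the minimum over several repeats mirrors what %timeit reports as "best of 3".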

Did you try to use a Python accelerator like Numba, Cython, Pythran or something else?

I did some tests with Pythran. Here are the results:

Original code:

  • Python + numpy: 1000 loops, best of 3: 1.43 msec per loop
  • Pythran: 1000 loops, best of 3: 777 usec per loop
  • Pythran + SIMD: 1000 loops, best of 3: 488 usec per loop

Code provided by Warren:

  • Python + numpy: 1000 loops, best of 3: 1.05 msec per loop
  • Pythran: 1000 loops, best of 3: 646 usec per loop
  • Pythran + SIMD: 1000 loops, best of 3: 425 usec per loop

This was done with N = 5000.
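
To put the figures above side by side, here is a small sketch that turns them into speedups relative to the original Python + numpy timing. The dictionary layout and key names are my own; the numbers are the ones reported above.

```python
# Timings reported above, in microseconds per loop (N = 5000).
timings_us = {
    "original": {"numpy": 1430.0, "pythran": 777.0, "pythran_simd": 488.0},
    "warren": {"numpy": 1050.0, "pythran": 646.0, "pythran_simd": 425.0},
}

# Baseline: the original expression run with plain Python + numpy.
baseline = timings_us["original"]["numpy"]

speedups = {
    (code, backend): baseline / t
    for code, row in timings_us.items()
    for backend, t in row.items()
}

for (code, backend), factor in speedups.items():
    print(f"{code:8s} {backend:13s} {factor:.2f}x")
```

The two optimizations stack: the trig rewrite and the compiled SIMD build each help on their own, and combining them gives the largest gain in this measurement.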

Update:

Here is the code:

# pythran export func1(float[], float[], float[], float[])
# pythran export func2(float[], float[], float[], float[])
import numpy as np

def func1(a, b, c, d):
    A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)
    return A

def func2(a, b, c, d):
    s = np.sin(c - d)
    A = 0.5*((1 - s)*np.cos(a - b) + (1 + s)*np.cos(a + b))
    return A

And the command line:

$ pythran test.py  # Default compilation
$ pythran test.py -march=native -DUSE_BOOST_SIMD  # Pythran with code vectorization
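
I only measured Pythran, but for comparison here is roughly how the same kernel might look with Numba, one of the other accelerators mentioned. This is an untimed sketch, not something I benchmarked: the function name func2_loop is made up, and the ImportError fallback is just there so the script still runs (slowly) when Numba is not installed.

```python
import math
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, fall back to plain Python
    # so the example remains runnable (but unaccelerated).
    def njit(func):
        return func

@njit
def func2_loop(a, b, c, d):
    # Explicit loop version of func2: a single pass, no temporary arrays.
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        s = math.sin(c[i] - d[i])
        out[i] = 0.5 * ((1.0 - s) * math.cos(a[i] - b[i])
                        + (1.0 + s) * math.cos(a[i] + b[i]))
    return out

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

# Compare against the original numpy expression.
reference = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)
result = func2_loop(a, b, c, d)
matches = np.allclose(result, reference)
print(matches)
```

With Numba the first call includes compilation time, so call the function once before timing it.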


 