numpy.vectorize: Why is it so slow?

Question

The expit function, in scipy.special, is a vectorized sigmoid function. It computes 1 / (1+e^(-x)), which is complicated, probably involving a Taylor series.

I learned about the "fast sigmoid", 1 / (1 + abs(x)), which ought to be much faster, but the built-in expit function massively outperforms it, even when I hand it as a lambda expression to numpy.vectorize.

Here's one way to test them:

import numpy as np
from scipy.special import expit
data = np.random.rand(1000000)

The built-in, complicated sigmoid is fast:

%prun expit(data)

3 function calls in 0.064 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.064    0.064    0.064    0.064 <string>:1(<module>)
     1    0.000    0.000    0.064    0.064 {built-in method builtins.exec}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The simpler sigmoid is around 30 times slower (1.992 s versus 0.064 s in these runs):

%prun np.vectorize( lambda x: (x / (1 + abs(x)) + 1) / 2 )(data)

2000023 function calls in 1.992 seconds

Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1000001    1.123    0.000    1.294    0.000 <string>:1(<lambda>)
      1    0.558    0.558    1.950    1.950 function_base.py:2276(_vectorize_call)
1000001    0.170    0.000    0.170    0.000 {built-in method builtins.abs}
      4    0.098    0.025    0.098    0.025 {built-in method numpy.core.multiarray.array}
      1    0.041    0.041    1.991    1.991 function_base.py:2190(__call__)
      1    0.000    0.000    0.068    0.068 function_base.py:2284(<listcomp>)
      1    0.000    0.000    1.992    1.992 {built-in method builtins.exec}
      1    0.000    0.000    1.991    1.991 <string>:1(<module>)
      1    0.000    0.000    0.000    0.000 function_base.py:2220(_get_ufunc_and_otypes)
      1    0.000    0.000    0.000    0.000 function_base.py:2162(__init__)
      1    0.000    0.000    0.000    0.000 function_base.py:2242(<listcomp>)
      2    0.000    0.000    0.000    0.000 numeric.py:414(asarray)
      1    0.000    0.000    0.000    0.000 {built-in method numpy.core.umath.frompyfunc}
      1    0.000    0.000    0.000    0.000 function_base.py:2266(<listcomp>)
      2    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
      1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
      1    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
      1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Answer 1

I'll just quote the vectorize docstring: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."
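To make the docstring's point concrete, here is a minimal sketch comparing np.vectorize with an explicit Python loop over the same scalar function. The name fast_sigmoid_scalar is made up for this illustration; the formula is the one from the question's lambda.

```python
import numpy as np

def fast_sigmoid_scalar(x):
    # Scalar version of the formula used in the question's lambda.
    return (x / (1 + abs(x)) + 1) / 2

data = np.random.rand(1000)

# np.vectorize calls the Python function once per element ...
via_vectorize = np.vectorize(fast_sigmoid_scalar)(data)

# ... which is roughly what this explicit loop does.
via_loop = np.array([fast_sigmoid_scalar(v) for v in data])

# Both approaches give the same values; neither avoids the per-element
# Python function call that dominates the profile in the question.
assert np.allclose(via_vectorize, via_loop)
```

This matches the profile in the question, where the lambda and the built-in abs are each called about a million times.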

You want to make 1/(1 + abs(x)) fast. numpy has a function called numpy.abs (also called numpy.absolute; they are different names for the same object). It computes the absolute value of each element of its argument, and it does this in C code, so it is fast. Also, the Python built-in function abs knows how to dispatch arguments to objects that have the method __abs__, which numpy arrays do, so you can also use Python's abs() to compute the element-wise absolute value of a numpy array. In the following, though, I'll use np.abs.
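The two claims in the paragraph above can be checked directly. This is a small sketch; the variable names are chosen only for this illustration.

```python
import numpy as np

x = np.array([-2, -1.5, 0, 5, 10])

# np.abs and np.absolute are two names for the same ufunc object.
same_object = np.abs is np.absolute

# The built-in abs() calls x.__abs__(), which works element-wise on arrays.
builtin_result = abs(x)
numpy_result = np.abs(x)

print(same_object)
print(np.array_equal(builtin_result, numpy_result))
```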

Here's an example of the use of np.abs:

In [25]: x = np.array([-2, -1.5, 0, 5, 10])

In [26]: np.abs(x)
Out[26]: array([  2. ,   1.5,   0. ,   5. ,  10. ])

Here's a comparison of the performance of scipy.special.expit and 1.0/(1 + np.abs(x)) for a large array x:

In [34]: from scipy.special import expit

In [35]: x = np.random.randn(100000)

In [36]: %timeit expit(x)
1000 loops, best of 3: 771 µs per loop

In [37]: %timeit 1.0/(1 + np.abs(x))
1000 loops, best of 3: 279 µs per loop

So 1.0/(1 + np.abs(x)) is quite a bit faster than expit(x).
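The same idea applies to the exact expression from the question. The sketch below writes it as whole-array arithmetic and times it against the np.vectorize version using the standard timeit module. The function name fast_sigmoid is an assumption for this example, and the timings will vary by machine, so none are quoted here.

```python
import timeit
import numpy as np

def fast_sigmoid(x):
    # Whole-array arithmetic: each operation runs in NumPy's compiled loops.
    return (x / (1 + np.abs(x)) + 1) / 2

x = np.random.randn(100000)

# The question's approach: a scalar lambda wrapped in np.vectorize.
slow = np.vectorize(lambda v: (v / (1 + abs(v)) + 1) / 2)

# Both compute the same values.
assert np.allclose(fast_sigmoid(x), slow(x))

t_fast = timeit.timeit(lambda: fast_sigmoid(x), number=10)
t_slow = timeit.timeit(lambda: slow(x), number=10)
print(f"array expression: {t_fast:.4f} s, np.vectorize: {t_slow:.4f} s")
```

Only the abs call had to change to np.abs; the rest of the expression already works element-wise on arrays.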

Answer 1
6 2016-12-05 02:43:29
