Why does numpy.apply_along_axis seem to be slower than a Python loop?

I'm confused about when numpy's numpy.apply_along_axis() function will outperform a simple Python loop. For example, consider a matrix with many rows, where you wish to compute the sum of each row:

import numpy as np

x = np.ones([100000, 3])
sums1 = np.array([np.sum(x[i,:]) for i in range(x.shape[0])])  # explicit Python loop over rows
sums2 = np.apply_along_axis(np.sum, 1, x)  # same row sums via apply_along_axis

Here I am even using a built-in numpy function, np.sum, and yet calculating sums1 (Python loop) takes less than 400ms while calculating sums2 (apply_along_axis) takes over 2000ms (NumPy 1.6.1 on Windows). For comparison, R's rowMeans function can often do this in less than 20ms (I'm pretty sure it's calling C code), while the similar R function apply() can do it in about 600ms.

np.sum takes an axis parameter, so you can compute the row sums directly:

sums3 = np.sum(x, axis=1)

This is much faster than the two methods you posted, as the timings show:

$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.apply_along_axis(np.sum, 1, x)"
1 loops, best of 1: 3.21 sec per loop

$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.array([np.sum(x[i,:]) for i in range(x.shape[0])])"
1 loops, best of 1: 712 msec per loop

$ python -m timeit -n 1 -r 1 -s "import numpy as np;x=np.ones([100000,3])" "np.sum(x, axis=1)"
1 loops, best of 1: 1.81 msec per loop
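
For anyone who would rather reproduce the comparison from a single script than from three shell commands, here is a minimal sketch using the standard timeit module. The helper names loop_sum, apply_sum, and axis_sum are made up for illustration and are not part of NumPy. Absolute numbers will vary with machine and NumPy version, so the script only prints them and makes no claim about what you will see.

```python
import timeit

import numpy as np

x = np.ones([100000, 3])


def loop_sum(a):
    # Explicit Python loop: one np.sum call per row.
    return np.array([np.sum(a[i, :]) for i in range(a.shape[0])])


def apply_sum(a):
    # Generic helper: calls np.sum once per 1-D slice along axis 1.
    return np.apply_along_axis(np.sum, 1, a)


def axis_sum(a):
    # Single vectorized call: the row loop runs inside NumPy.
    return np.sum(a, axis=1)


candidates = (loop_sum, apply_sum, axis_sum)

# Check that all three approaches agree before timing them.
results = {f.__name__: f(x) for f in candidates}

timings = {}
for f in candidates:
    # Best of 3 single runs, in seconds.
    timings[f.__name__] = min(timeit.repeat(lambda: f(x), number=1, repeat=3))
    print(f"{f.__name__}: {timings[f.__name__] * 1000:.2f} ms")
```

Taking the minimum of a few repeats is a common way to reduce noise from other processes; the -n 1 -r 1 runs above measure a single execution only.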

(As for why apply_along_axis is slower: I don't know for certain, but probably because the function is written in pure Python and is much more generic, which leaves fewer optimization opportunities than the array version has.)
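
To make that guess concrete, here is a rough sketch of what a pure-Python "apply along axis" has to do. This is not NumPy's actual source. apply_along_axis_2d is a hypothetical, simplified stand-in that assumes 2-D input and a function returning a scalar, whereas the real function also handles N-dimensional arrays and array-valued results, which adds more bookkeeping per slice.

```python
import numpy as np


def apply_along_axis_2d(func, axis, arr):
    """Hypothetical simplified model: 2-D input, scalar-returning func only."""
    arr = np.asarray(arr)
    if arr.ndim != 2 or axis not in (0, 1):
        raise ValueError("this sketch only supports 2-D arrays and axis 0 or 1")
    # Arrange the data so that iterating yields the 1-D slices along `axis`.
    slices = arr if axis == 1 else arr.T
    # One Python-level function call per slice: this loop is the overhead.
    return np.array([func(s) for s in slices])


x = np.arange(12.0).reshape(4, 3)
row_sums = apply_along_axis_2d(np.sum, 1, x)
col_sums = apply_along_axis_2d(np.sum, 0, x)
```

With 100000 rows of 3 elements each, a loop like this makes 100000 separate calls that each do almost no work, so the per-call overhead dominates. np.sum(x, axis=1) makes one call and does the looping in compiled code.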
