
Python: applying function to each column of an array

I need to apply a function to each column of a numpy array. I can't apply it element by element; it has to operate on whole columns, because each column taken as a whole represents one piece of information.

import numpy as np
C = np.random.normal(0, 1, (500, 30))

Is the following the most efficient way to do this (using np.sum for illustration)?

C2 = [np.sum(C[:, i]) for i in range(C.shape[1])]

The actual array C is 500x4000, and the function I am applying to each column is time-consuming.

You can try np.apply_along_axis:

In [21]: A = np.array([[1,2,3],[4,5,6]])

In [22]: A
Out[22]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [23]: np.apply_along_axis(np.sum, 0, A)
Out[23]: array([5, 7, 9])

In [24]: np.apply_along_axis(np.sum, 1, A)
Out[24]: array([ 6, 15])
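
To illustrate how this carries over to a function that needs a whole column at once, here is a minimal sketch. The `column_range` function is a hypothetical stand-in for the time-consuming function in the question, not something the asker posted. For reductions that NumPy already provides, such as np.sum, passing `axis=0` directly is usually the simpler route.

```python
import numpy as np

def column_range(col):
    # Hypothetical per-column function: it needs the entire column,
    # so it cannot be applied element by element.
    return col.max() - col.min()

A = np.array([[1, 2, 3], [4, 5, 6]])

# apply_along_axis passes each 1-D slice along axis 0 (each column) to the function.
per_column = np.apply_along_axis(column_range, 0, A)

# Built-in reductions accept an axis argument, so no wrapper is needed.
column_sums = np.sum(A, axis=0)

print(per_column)   # one value per column
print(column_sums)  # matches np.apply_along_axis(np.sum, 0, A)
```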

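Iterating over the columns via the transpose appears to be faster than both approaches above (in the timings below, roughly 80% of the time of the indexed loop and under half the time of np.apply_along_axis):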

[ np.sum(row) for row in C.T ]

It is also more Pythonic. For reference, these are the timeit results:

>>> from timeit import timeit
>>> setup = 'import numpy as np; C = np.random.normal(0, 1, (500, 30))'
>>> print(timeit('[ np.sum( C[ :, i ] )  for i in range( 0, 30) ]', setup=setup, number=1000))
0.418906474798
>>> print(timeit('[ np.sum(row) for row in C.T ]', setup=setup, number=1000))
0.345153254432
>>> print(timeit('np.apply_along_axis(np.sum, 0, C)', setup=setup, number=1000))
0.732931300891
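
The timings above were taken on a 500x30 array, so it may be worth repeating the comparison at the 500x4000 size mentioned in the question. The following is a rough sketch of how that could be done; the function names, the seed, and the run count of 10 are arbitrary choices for illustration, and the numbers will vary by machine and NumPy version.

```python
import numpy as np
from timeit import timeit

rng = np.random.default_rng(0)          # fixed seed, chosen arbitrarily
C = rng.normal(0, 1, (500, 4000))       # array size mentioned in the question

def indexed_loop(C):
    # Index each column explicitly.
    return [np.sum(C[:, i]) for i in range(C.shape[1])]

def transpose_loop(C):
    # Iterating over C.T yields the columns of C one at a time.
    return [np.sum(col) for col in C.T]

def along_axis(C):
    # Let NumPy hand each column to the function.
    return np.apply_along_axis(np.sum, 0, C)

# All three approaches should agree up to floating-point rounding.
reference = indexed_loop(C)
assert np.allclose(reference, transpose_loop(C))
assert np.allclose(reference, along_axis(C))

for fn in (indexed_loop, transpose_loop, along_axis):
    seconds = timeit(lambda: fn(C), number=10)
    print(f"{fn.__name__}: {seconds:.4f} s for 10 runs")
```

With a genuinely expensive per-column function, the cost of the function itself is likely to dominate, so the difference between these looping styles may matter less than it does with np.sum.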

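Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this post, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.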

 