Apply a function to each row of a numpy array

Question

I have a (16000000, 5) numpy array, and I want to apply this function to each row.

def f(row):
    # Returns a row of the same length.
    return [row[0] + 0.5*row[1], row[2] + 0.5*row[3], row[3] - 0.5*row[2], row[4] - 0.5*row[3], row[4] + 1]

Vectorizing it would be slow.

I tried this:

np.column_stack((arr[:,0]+0.5*arr[:,1],arr[:,2]+0.5*arr[:,3],arr[:,3]-0.5*arr[:,2],arr[:,4]-0.5*arr[:,3],arr[:,4]+1))

but I get a MemoryError.

What is the fastest way to do this?

Answer 1

In [104]: arr=np.random.rand(1000000,5)
In [105]: %timeit a=np.column_stack((arr[:,0]+0.5*arr[:,1],arr[:,2]+0.5*arr[:,3],arr[:,3]-0.5*arr[:,2],arr[:,4]-0.5*arr[:,3],arr[:,4]+1))
10 loops, best of 3: 86.3 ms per loop

In [106]: %timeit a2=map(f,arr)
1 loops, best of 3: 10.2 s per loop


In [98]: a2=map(f,arr)

In [99]: %timeit a2=map(f,arr)
100 loops, best of 3: 10.5 ms per loop

In [100]: np.all(a==a2)
Out[100]: True
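
The session above appears to come from Python 2, where map returns a list. A minimal sketch of the same equality check under Python 3 follows, assuming a small array so that the row-by-row version finishes quickly. The array size, the seed and the name `same` are illustrative choices, not from the original session.

```python
import numpy as np

def f(row):
    # Returns a row of the same length.
    return [row[0] + 0.5 * row[1],
            row[2] + 0.5 * row[3],
            row[3] - 0.5 * row[2],
            row[4] - 0.5 * row[3],
            row[4] + 1]

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
arr = rng.random((1000, 5))     # small on purpose; the row-wise loop is slow

# Column-wise version from the question.
a = np.column_stack((arr[:, 0] + 0.5 * arr[:, 1],
                     arr[:, 2] + 0.5 * arr[:, 3],
                     arr[:, 3] - 0.5 * arr[:, 2],
                     arr[:, 4] - 0.5 * arr[:, 3],
                     arr[:, 4] + 1))

# In Python 3, map is lazy, so materialize it before comparing.
a2 = np.array(list(map(f, arr)))

# allclose rather than == in case of tiny floating-point differences.
same = bool(np.allclose(a, a2))
print(same)
```

Note that timing a bare map(f, arr) under Python 3 would only measure the creation of the iterator, not the row-by-row work.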

Answer 2

You're better off preallocating the result and splitting the operations into separate lines; you don't gain anything here in terms of readability or speed by using column_stack.

result = np.zeros_like(arr)
result[:, 0] = arr[:, 0] + .5 * arr[:, 1]
result[:, 1] = arr[:, 2] + .5 * arr[:, 3]
result[:, 2] = arr[:, 3] - .5 * arr[:, 2]
result[:, 3] = arr[:, 4] - .5 * arr[:, 3]
result[:, 4] = arr[:, 4] + 1
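
Each right-hand side above still builds full-length temporary columns before the assignment. If memory is the bottleneck, one possible variation is to write directly into the preallocated columns with the ufunc `out=` argument. This is a rough sketch, not a measured fix: the function name `transform` is hypothetical, it assumes `arr` has a floating-point dtype, and whether it avoids the MemoryError depends on the machine.

```python
import numpy as np

def transform(arr):
    """Column-wise equivalent of f, written into a preallocated result.

    Assumes arr is a 2-D floating-point array with 5 columns.
    """
    result = np.empty_like(arr)

    # result[:, 0] = arr[:, 0] + 0.5 * arr[:, 1]
    np.multiply(arr[:, 1], 0.5, out=result[:, 0])
    result[:, 0] += arr[:, 0]

    # result[:, 1] = arr[:, 2] + 0.5 * arr[:, 3]
    np.multiply(arr[:, 3], 0.5, out=result[:, 1])
    result[:, 1] += arr[:, 2]

    # result[:, 2] = arr[:, 3] - 0.5 * arr[:, 2]
    np.multiply(arr[:, 2], -0.5, out=result[:, 2])
    result[:, 2] += arr[:, 3]

    # result[:, 3] = arr[:, 4] - 0.5 * arr[:, 3]
    np.multiply(arr[:, 3], -0.5, out=result[:, 3])
    result[:, 3] += arr[:, 4]

    # result[:, 4] = arr[:, 4] + 1
    np.add(arr[:, 4], 1, out=result[:, 4])

    return result

# Tiny usage example on a 2 x 5 array.
arr = np.arange(10, dtype=np.float64).reshape(2, 5)
result = transform(arr)
print(result)
```

The input is only read, never modified, so the peak extra memory should be roughly one array the size of `arr`.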

2 answers

Answer 1
2 2013-04-26 22:53:34

Answer 2
2 accepted 2013-04-26 23:03:21
