NumPy: improving fancy indexing on arrays

Question

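I am looking for faster fancy indexing in NumPy. The code I am running slows down at np.take(). I tried np.reshape() with order=F/C, with no improvement. Python's operator module works well without the double transpose, but with the transposes it is equal to np.take().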

import operator
import numpy as np

p    = np.random.randn(3500, 51)
rows = np.arange(p.shape[0])   # every row index, 0..3499
cols = np.asarray([1,2,3,4,5,6,7,8,9,10,15,20,25,30,40,50])

%timeit p[rows][:, cols]
%timeit p.take(cols, axis = 1 )
%timeit np.asarray(operator.itemgetter(*cols)(p.T)).T

1000 loops, best of 3: 301 µs per loop
10000 loops, best of 3: 132 µs per loop
10000 loops, best of 3: 135 µs per loop
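The `%timeit` lines above are IPython magics and will not run in a plain Python script. As a rough sketch of the same comparison using the standard-library `timeit` module (the `candidates` and `timings` names and the repeat counts are illustrative choices, not part of the original question):

```python
import operator
import timeit

import numpy as np

p = np.random.randn(3500, 51)
rows = np.arange(p.shape[0])
cols = np.asarray([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50])

# Illustrative labels mapped to the three expressions from the question.
candidates = {
    "p[rows][:, cols]": lambda: p[rows][:, cols],
    "p.take(cols, axis=1)": lambda: p.take(cols, axis=1),
    "itemgetter + double transpose": lambda: np.asarray(
        operator.itemgetter(*cols)(p.T)
    ).T,
}

timings = {}
for label, fn in candidates.items():
    # Best of 3 repeats of 100 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=100, repeat=3))
    timings[label] = best / 100 * 1e6
    print(f"{label}: {timings[label]:.1f} µs per call")
```

Absolute numbers depend on the machine and the NumPy version, so only the relative ordering is worth comparing.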

Answer 1

Testing a few options:

In [3]: p[rows][:,cols].shape
Out[3]: (3500, 16)
In [4]: p[rows[:,None],cols].shape
Out[4]: (3500, 16)
In [5]: p[:,cols].shape
Out[5]: (3500, 16)
In [6]: p.take(cols,axis=1).shape
Out[6]: (3500, 16)
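The shapes match, but it may be worth confirming that the four expressions also return the same values. A minimal check, assuming the same `p`, `rows`, and `cols` as in the question (the seeded generator is only there to make the run repeatable):

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.standard_normal((3500, 51))
rows = np.arange(p.shape[0])
cols = np.asarray([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50])

a = p[rows][:, cols]          # two indexing steps, with an intermediate copy
b = p[rows[:, None], cols]    # (3500, 1) index broadcast against (16,)
c = p[:, cols]                # slice for rows, index array for columns
d = p.take(cols, axis=1)      # take along the column axis

same = all(np.array_equal(a, x) for x in (b, c, d))
print(a.shape, same)  # (3500, 16) True
```

Because `rows` selects every row in order, it adds work without changing the result.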

Time tests: plain p[:,cols] is the fastest. Use slices wherever possible.

In [7]: timeit p[rows][:,cols].shape
100 loops, best of 3: 2.78 ms per loop
In [8]: timeit p.take(cols,axis=1).shape
1000 loops, best of 3: 739 µs per loop
In [9]: timeit p[rows[:,None],cols].shape
1000 loops, best of 3: 1.43 ms per loop
In [10]: timeit p[:,cols].shape
1000 loops, best of 3: 649 µs per loop
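One likely reason slices help is that basic slicing returns a view of the original data, while indexing with an integer array always makes a copy. The sketch below illustrates that difference with `np.shares_memory`; the contiguous column range `1:11` is an example chosen here, not something the answer itself uses:

```python
import numpy as np

p = np.random.randn(3500, 51)
rows = np.arange(p.shape[0])

full_slice = p[:]                     # basic slice over all rows: a view
fancy_rows = p[rows]                  # integer-array index over all rows: a copy
col_slice = p[:, 1:11]                # contiguous columns via a slice: a view
col_fancy = p[:, np.arange(1, 11)]    # same columns via an index array: a copy

print(np.shares_memory(p, full_slice))   # True
print(np.shares_memory(p, fancy_rows))   # False
print(np.shares_memory(p, col_slice))    # True
print(np.shares_memory(p, col_fancy))    # False

# Both column selections hold the same values; only the memory handling differs.
print(np.array_equal(col_slice, col_fancy))  # True
```

This is why `p[rows][:, cols]` is the slowest variant: `p[rows]` copies the whole array before the columns are selected.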

I have seen itemgetter used with lists, but not with arrays. It is a class that iterates over a set of indices. These two lines do the same thing:

In [23]: timeit np.asarray(operator.itemgetter(*cols)(p.T)).T.shape
1000 loops, best of 3: 738 µs per loop
In [24]: timeit np.array([p.T[c] for c in cols]).T.shape
1000 loops, best of 3: 748 µs per loop

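Note that p.T[c] is the same as p.T[c,:] or p[:,c].T. With relatively few columns, and because it skips the advanced indexing with rows, its time is close to that of p[:,cols].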
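The equivalences in the note above can be checked directly. This is a small sketch assuming the same `p` and `cols` as in the question; the variable names are illustrative:

```python
import operator

import numpy as np

p = np.random.randn(3500, 51)
cols = np.asarray([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50])

c = cols[0]
# Row c of the transpose is column c of p (a 1-D array of length 3500).
single_ok = (
    np.array_equal(p.T[c], p.T[c, :])
    and np.array_equal(p.T[c], p[:, c].T)
)

# itemgetter and the list comprehension both pick rows of p.T one at a time.
via_getter = np.asarray(operator.itemgetter(*cols)(p.T)).T
via_loop = np.array([p.T[k] for k in cols]).T
direct = p[:, cols]

print(single_ok)                          # True
print(np.array_equal(via_getter, direct)) # True
print(np.array_equal(via_loop, direct))   # True
```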

Answer 1 (2, accepted), 2016-08-19 19:29:58
