Fast indexed dot product for numpy / scipy

Question

I'm using numpy to do linear algebra, and I want fast subset-indexed dot products and other linear operations.

With large matrices, an indexing solution such as A[:,subset].dot(x[subset]) can take longer than multiplying by the full matrix.

import numpy as np
A = np.random.randn(1000,10000)
x = np.random.randn(10000,1)
subset = np.sort(np.random.randint(0,10000,500))

Timings show that sub-indexing is faster when the columns form one contiguous block.

%timeit A.dot(x)
100 loops, best of 3: 4.19 ms per loop

%timeit A[:,subset].dot(x[subset])
100 loops, best of 3: 7.36 ms per loop

%timeit A[:,:500].dot(x[:500])
1000 loops, best of 3: 1.75 ms per loop

Even then, the speedup falls short of what I would expect (20x faster, since only 500 of the 10000 columns are used).
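
One likely contributor to the gap, sketched here under reduced array sizes chosen only for illustration: indexing with an integer array (A[:,subset]) is NumPy advanced indexing and always copies the selected columns into a new array, whereas a plain slice (A[:,:500]) is a view onto the existing memory. The check below uses np.shares_memory to show the difference.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 1000))   # smaller than in the question, for illustration
subset = np.sort(rng.choice(1000, size=50, replace=False))

block = A[:, :50]       # basic slicing: a view, no data is copied
picked = A[:, subset]   # advanced (integer-array) indexing: a fresh copy

view_shares = np.shares_memory(A, block)
copy_shares = np.shares_memory(A, picked)
print(view_shares, copy_shares)
```

So the indexed product pays for gathering 500 scattered columns before the multiplication even starts, while the block slice does not.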

Does anyone know of a library/module that allows this kind of fast operation through numpy or scipy?

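For now, I'm using cython to write a fast column-indexed dot product through the cblas library. But for more complex operations (pseudo-inverse, or sub-indexed least-squares solving), I'm not sure I can reach a good speedup.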

Thanks!

Answer 1

Well, this is faster.

%timeit A.dot(x)
#4.67 ms

%%timeit
y = np.zeros_like(x)
y[subset]=x[subset]
d = A.dot(y)
#4.77ms

%timeit c = A[:,subset].dot(x[subset])
#7.21ms

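And the two results agree: np.allclose(d, c) is True, provided subset contains no repeated indices (A[:,subset] counts a repeated column once per occurrence, while the zero-padded y counts it only once).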
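
A self-contained check of that agreement, as a minimal sketch: the helper name masked_dot is hypothetical, the array sizes are reduced for illustration, the indices are drawn without replacement, and the comparison uses np.allclose rather than exact equality because the two products may sum their terms in a different order.

```python
import numpy as np

def masked_dot(A, x, subset):
    """Compute A[:, subset] @ x[subset] through a full product with a zero-padded x.

    Assumes `subset` contains no repeated indices.
    """
    y = np.zeros_like(x)
    y[subset] = x[subset]   # keep only the selected entries of x
    return A.dot(y)

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 2000))
x = rng.standard_normal((2000, 1))
# Unique, sorted column indices (np.random.randint could return duplicates).
subset = np.sort(rng.choice(2000, size=100, replace=False))

d = masked_dot(A, x, subset)
c = A[:, subset].dot(x[subset])
same = np.allclose(d, c)
print(same)
```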

Note that how fast this is depends on the input. With subset = np.array([1,2,3]), my solution takes almost the same time as before, while the last solution takes only 46 microseconds.

Basically, this is faster only if the size of subset is not much smaller than the size of x.
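
To see where the crossover lies on a given machine, the two approaches can be timed over several subset sizes. This is only a sketch: the function names, array sizes, and subset sizes are assumptions made for illustration, and the absolute numbers depend on the hardware and the BLAS library numpy is linked against.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_cols = 200, 2000            # illustrative sizes, smaller than in the question
A = rng.standard_normal((n_rows, n_cols))
x = rng.standard_normal((n_cols, 1))

def zero_padded(subset):
    # Full-size product with the unselected entries of x set to zero.
    y = np.zeros_like(x)
    y[subset] = x[subset]
    return A.dot(y)

def fancy_indexed(subset):
    # Copy the selected columns, then do a smaller product.
    return A[:, subset].dot(x[subset])

timings = {}
for size in (3, 100, 1000):
    subset = np.sort(rng.choice(n_cols, size=size, replace=False))
    t_pad = min(timeit.repeat(lambda: zero_padded(subset), number=20, repeat=3))
    t_idx = min(timeit.repeat(lambda: fancy_indexed(subset), number=20, repeat=3))
    timings[size] = (t_pad, t_idx)
    print(f"size={size:5d}  zero-padded: {t_pad:.5f}s  fancy-indexed: {t_idx:.5f}s")
```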

Solution 1
0 2015-01-22 12:09:12
