Fast numpy fancy indexing

My code for slicing a numpy array (via fancy indexing) is very slow. It is currently a bottleneck in my program.

a.shape
(3218, 6)

ts = time.time(); a[rows][:, cols]; te = time.time(); print('%.8f' % (te-ts));
0.00200009

What is the correct numpy call to get an array consisting of the subset of rows 'rows' and columns 'cols' of the matrix a? (In fact, I need the transpose of this result.)

Let me try to summarize the excellent answers by Jaime and TheodrosZelleke and mix in some comments.

  1. Advanced (fancy) indexing always returns a copy, never a view.
  2. a[rows][:,cols] implies two fancy indexing operations, so an intermediate copy a[rows] is created and discarded. Handy and readable, but not very efficient. Moreover, beware that [:,cols] usually generates a Fortran-contiguous copy from a C-contiguous source.
  3. a[rows.reshape(-1,1),cols] is a single advanced indexing expression, based on the fact that rows.reshape(-1,1) and cols are broadcast to the shape of the intended result.
  4. A common experience is that indexing into a flattened array can be more efficient than fancy indexing, so another approach is

     indx = rows.reshape(-1,1)*a.shape[1] + cols
     a.take(indx)

    or

     a.take(indx.flat).reshape(rows.size,cols.size)
  5. Efficiency will depend on memory access patterns and on whether the starting array is C-contiguous or Fortran-contiguous, so experimentation is needed.

  6. Use fancy indexing only if really needed: basic slicing a[rstart:rstop:rstep, cstart:cstop:cstep] returns a view (although not a contiguous one) and should be faster!
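
The points above can be checked on a small array. This is only a minimal sketch: the array a and the index arrays rows and cols below are made-up test data, not the ones from the question. It compares the three indexing forms and uses np.shares_memory to show the copy/view distinction.

```python
import numpy as np

# Hypothetical small test data, chosen only for illustration.
a = np.arange(20.0).reshape(5, 4)
rows = np.array([0, 2, 4])
cols = np.array([1, 3])

# Two chained fancy-indexing steps (builds an intermediate copy a[rows]).
chained = a[rows][:, cols]

# One fancy-indexing step: index arrays of shape (3, 1) and (2,)
# broadcast to the (3, 2) shape of the result.
single = a[rows.reshape(-1, 1), cols]

# Flat indexing: turn each (row, col) pair into an offset into the
# flattened array, then let take() pick the elements.
indx = rows.reshape(-1, 1) * a.shape[1] + cols
flat = a.take(indx)

assert np.array_equal(chained, single)
assert np.array_equal(chained, flat)

# Fancy indexing gives a copy, basic slicing gives a view.
fancy_shares = np.shares_memory(a, single)           # False
basic_shares = np.shares_memory(a, a[0:5:2, 1:4:2])  # True
```

Which form is fastest still has to be measured on the real data, as point 5 says.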

To my surprise, this rather lengthy expression, which first calculates linear 1D indices, is more than 50% faster than the consecutive array indexing presented in the question:

(a.ravel()[(
   cols + (rows * a.shape[1]).reshape((-1,1))
   ).ravel()]).reshape(rows.size, cols.size)
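
The reason this works is that in row-major (C) order, element (r, c) of an array with ncols columns sits at flat offset r * ncols + c. Here is a tiny worked sketch with made-up values, where a is built so that every element equals its own flat offset:

```python
import numpy as np

# Made-up example: a[r, c] == r * 4 + c by construction.
a = np.arange(12).reshape(3, 4)
rows = np.array([2, 0])
cols = np.array([1, 3])

# A column vector of row offsets plus a 1D array of column offsets
# broadcasts to a (rows.size, cols.size) grid of flat indices.
flat_idx = cols + (rows * a.shape[1]).reshape((-1, 1))
# flat_idx is [[9, 11], [1, 3]]

result = a.ravel()[flat_idx.ravel()].reshape(rows.size, cols.size)
assert np.array_equal(result, a[rows][:, cols])
```

Note that a.ravel() only avoids a copy when a is C-contiguous; for other layouts it has to copy the whole array first, which would probably eat into the gain.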

UPDATE: OP updated the description of the shape of the initial array. With the updated size the speedup is now above 99%:

In [93]: a = np.random.randn(3218, 1415)

In [94]: rows = np.random.randint(a.shape[0], size=2000)

In [95]: cols = np.random.randint(a.shape[1], size=6)

In [96]: timeit a[rows][:, cols]
10 loops, best of 3: 186 ms per loop

In [97]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 1.56 ms per loop

INITIAL ANSWER: Here is the transcript:

In [79]: a = np.random.randn(3218, 6)
In [80]: a.shape
Out[80]: (3218, 6)

In [81]: rows = np.random.randint(a.shape[0], size=2000)
In [82]: cols = np.array([1,3,4,5])

Time method 1:

In [83]: timeit a[rows][:, cols]
1000 loops, best of 3: 1.26 ms per loop

Time method 2:

In [84]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 568 us per loop

Check that the results are actually the same:

In [85]: result1 = a[rows][:, cols]
In [86]: result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)

In [87]: np.sum(result1 - result2)
Out[87]: 0.0
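
A sum of differences equal to 0.0 is a fairly weak check, since positive and negative differences could in principle cancel out. A stricter element-wise comparison could look like the sketch below. It uses freshly generated data with the same shapes as in the transcript, and the seed is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
a = rng.standard_normal((3218, 6))
rows = rng.integers(a.shape[0], size=2000)
cols = np.array([1, 3, 4, 5])

result1 = a[rows][:, cols]
result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1, 1))).ravel()]
           ).reshape(rows.size, cols.size)

# Both methods only copy values, so an exact element-wise match is expected.
same = np.array_equal(result1, result2)
print(same)
```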

You can get some speed-up if you slice using fancy indexing and broadcasting:

from __future__ import division
import numpy as np

def slice_1(a, rs, cs):
    return a[rs][:, cs]

def slice_2(a, rs, cs):
    return a[rs[:, None], cs]

>>> rows, cols = 3218, 6
>>> rs = np.unique(np.random.randint(0, rows, size=(rows//2,)))
>>> cs = np.unique(np.random.randint(0, cols, size=(cols//2,)))
>>> a = np.random.rand(rows, cols)
>>> import timeit
>>> print(timeit.timeit('slice_1(a, rs, cs)',
                        'from __main__ import slice_1, a, rs, cs',
                        number=1000))
0.24083110865
>>> print(timeit.timeit('slice_2(a, rs, cs)',
                        'from __main__ import slice_2, a, rs, cs',
                        number=1000))
0.206566124519

If you think in terms of percentages, doing something 15% faster is always good, but on my system, for the size of your array, this takes 40 us less to do the slicing, and it is hard to believe that an operation taking 240 us will be your bottleneck.

Using np.ix_ you can get a similar speed to ravel/reshape, but with code that is clearer:

a = np.random.randn(3218, 1415)
rows = np.random.randint(a.shape[0], size=2000)
cols = np.random.randint(a.shape[1], size=6)

%timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
#101 µs ± 2.36 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


%timeit ix_ = np.ix_(rows, cols); a[ix_]
#135 µs ± 7.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

ix_ = np.ix_(rows, cols)
result1 = a[ix_]
result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)

np.sum(result1 - result2)
0.0
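
For reference, np.ix_ just builds the broadcastable index arrays for you: a column vector for the rows and a row vector for the columns. Below is a small sketch with made-up data. It also shows one possible way to get the transposed result that the question asks for, without a separate transpose step on the output:

```python
import numpy as np

# Made-up example data.
a = np.arange(20).reshape(5, 4)
rows = np.array([4, 1, 1])
cols = np.array([0, 2])

row_idx, col_idx = np.ix_(rows, cols)
# row_idx has shape (3, 1), col_idx has shape (1, 2)

sub = a[row_idx, col_idx]
assert np.array_equal(sub, a[rows[:, None], cols])

# Transposed result: swap the roles of rows and cols and index a.T,
# so that sub_t[j, i] == a[rows[i], cols[j]].
sub_t = a.T[np.ix_(cols, rows)]
assert np.array_equal(sub_t, sub.T)
```

Whether the transposed form is faster than calling .T on the result would need to be timed on the actual arrays.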
