
Slicing n-dimensional numpy array using list of indices

Say I have a 3-dimensional numpy array:

import numpy as np

np.random.seed(1145)
A = np.random.random((5,5,5))

and I have two lists of indices corresponding to the 2nd and 3rd dimensions:

second = [1,2]
third = [3,4]

and I want to select the elements in the numpy array corresponding to

A[:][second][third]

so the shape of the sliced array would be (5,2,2) and

A[:][second][third].flatten()

would be equivalent to:

In [226]:

for i in range(5):
    for j in second:
        for k in third:
            print(A[i][j][k])

0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658

Is there a way to slice a numpy array in this way? So far, when I try A[:][second][third] I get IndexError: index 3 is out of bounds for axis 0 with size 2, because the [:] for the first dimension seems to be ignored.

Numpy uses multiple indexing, so instead of A[1][2][3], you can--and should--use A[1,2,3].
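To see why the chained form from the question raises an error, here is a small sketch (the names step0, step1, failed and same_scalar are made up for illustration). Each pair of brackets is applied to the result of the previous one, so every index ends up acting on axis 0:

```python
import numpy as np

A = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

step0 = A[:]           # a view of the whole array, shape (5, 5, 5)
step1 = step0[second]  # picks entries 1 and 2 along axis 0, shape (2, 5, 5)

try:
    step1[third]       # axis 0 now has size 2, so index 3 is out of bounds
    failed = False
except IndexError:
    failed = True

# With scalar indices the chained and the multi-index forms agree,
# which is why A[1][2][3] appears to work.
same_scalar = A[1][2][3] == A[1, 2, 3]
```

So the [:] is not ignored; it simply returns the whole array, and the following brackets start again from the first axis.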

You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
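A short sketch of that zip-like behaviour, using an arange array so the values are easy to follow (the names z, pairs and expected are only illustrative):

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# The two lists are paired element by element: (1, 3) and (2, 4).
z = a[:, second, third]

pairs = list(zip(second, third))
# Build the same result column by column from the zipped pairs.
expected = np.stack([a[:, j, k] for j, k in pairs], axis=1)
```

Here z has shape (5, 2) and matches expected, rather than the (5, 2, 2) block the question asks for.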

What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second, into a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).

For example:

In [8]: import numpy as np

In [9]: a = np.arange(125).reshape(5,5,5)

In [10]: second = [1,2]

In [11]: third = [3,4]

In [12]: s = a[:, np.array(second).reshape(-1,1), third]

In [13]: s.shape
Out[13]: (5, 2, 2)
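
As a sanity check, the sketch below compares the broadcast index against explicit loops like the ones in the question. It uses np.newaxis instead of reshape(-1, 1), which should be an equivalent spelling here; the names col and loops are illustrative only:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)
second = [1, 2]
third = [3, 4]

# Column vector of shape (2, 1); broadcast against a length-2 list -> (2, 2).
col = np.array(second)[:, np.newaxis]
s = a[:, col, third]

# Reference result built with explicit loops, as in the question.
loops = np.array([[[a[i, j, k] for k in third]
                   for j in second]
                  for i in range(a.shape[0])])
```

Both s and loops have shape (5, 2, 2) and hold the same values.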

Note that, in this specific example, the values in second and third are consecutive. If that is typical, you can simply use slices:

In [14]: s2 = a[:, 1:3, 3:5]

In [15]: s2.shape
Out[15]: (5, 2, 2)

In [16]: np.all(s == s2)
Out[16]: True

There are a couple of very important differences between those two methods.

  • The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
  • In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change to either a or s2 changes both.
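
The copy-versus-view difference can be checked directly. This is a minimal sketch using np.shares_memory; the variable names other than a, s and s2 are made up for the illustration:

```python
import numpy as np

a = np.arange(125).reshape(5, 5, 5)

s = a[:, np.array([1, 2]).reshape(-1, 1), [3, 4]]   # fancy indexing: a copy
s2 = a[:, 1:3, 3:5]                                  # slices only: a view

copy_shares = np.shares_memory(a, s)     # the copy has its own buffer
view_shares = np.shares_memory(a, s2)    # the view reuses a's buffer

s[0, 0, 0] = -1            # writes to the copy only
after_copy_write = a[0, 1, 3]

s2[0, 0, 0] = -2           # writes through to a
after_view_write = a[0, 1, 3]
```

If the result needs to be independent of a, calling .copy() on the sliced array is the usual way to get that.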

One way would be to use np.ix_:

>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True

The downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
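
One possible shape for such a wrapper is sketched below. The function name take_outer and its dict-based argument are hypothetical, chosen only for this example; axes that are not mentioned are filled in with their full range:

```python
import numpy as np

def take_outer(arr, selections):
    """Hypothetical helper: pick index lists along some axes.

    selections maps an axis number to a list of indices. Axes that are
    not listed are kept whole.
    """
    per_axis = [selections.get(axis, np.arange(n))
                for axis, n in enumerate(arr.shape)]
    return arr[np.ix_(*per_axis)]

A = np.arange(125).reshape(5, 5, 5)
out = take_outer(A, {1: [1, 2], 2: [3, 4]})
```

Like the np.ix_ call above, this goes through fancy indexing, so the result is a copy of the selected data.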

I think there are three problems with your approach:

  1. Both second and third should be slices
  2. Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
  3. Instead of A[:][second][third], you should use A[:,second,third]

Try this:

>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))                       
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482,  0.80820122,  0.64878266,  0.62689481,  0.01298507,
        0.42112921,  0.23104051,  0.34601169,  0.24838564,  0.66162209,
        0.96115751,  0.07338851,  0.33109539,  0.55168356,  0.33925748,
        0.2353348 ,  0.91254398,  0.44692211,  0.60975602,  0.64610556])
