Using numpy.where() to iterate through a matrix

There's something about numpy.where() I do not understand:

Let's say I have a 2D numpy ndarray:

import numpy as np
twodim =  np.array([[1, 2, 3, 4],  [1, 6, 7, 8], [1, 1, 1, 12],  [17, 3, 15, 16], [17, 3, 18, 18]])

Now I would like to create a function which "checks" this numpy array against a variety of conditions.

array([[ 1,  2,  3,  4],
       [ 1,  6,  7,  8],
       [ 1,  1,  1, 12],
       [17,  3, 15, 16],
       [17,  3, 18, 18]])

For example, which entries in this array are (A) even, (B) greater than 7, and (C) divisible by 3?

I would like to use numpy.where() for this, iterate through each entry of the array, and finally find the elements which match all conditions (if such an entry exists):

   even_entries = np.where(twodim % 2 == 0)
   greater_seven = np.where(twodim > 7 )
   divisible_three = np.where(twodim % 3 == 0)

How does one do this? I am not sure how to iterate through Booleans...

I could access the indices of the matrix (i,j) via

np.argwhere(even_entries)

We could do something like

import numpy as np
twodim =  np.array([[1, 2, 3, 4],  [1, 6, 7, 8], [1, 1, 1, 12],  [17, 3, 15, 16], [17, 3, 18, 18]])
even_entries = np.where(twodim % 2 == 0)
greater_seven = np.where(twodim > 7 )
divisible_three = np.where(twodim % 3 == 0)
for row in even_entries:
    for item in row:
        if item: #equivalent to `if item == True`
                for row in greater_seven:
                    for item in row:
                        if item: #equivalent to `if item == True`
                            for row in divisible_three:
                                for item in row:
                                    if item: #equivalent to `if item == True`
                                        # something like print(np.argwhere())

Any advice?

EDIT1: Great ideas below. As @hpaulj mentions, "Your tests produce a boolean matrix of the same shape as twodim". This is a problem I keep running into as I toy around: not all conditionals produce matrices of the same shape as my starting matrix. For instance, let's say I'm checking whether each array element has a matching element to its left or right (i.e. horizontally):

twodim[:, :-1] == twodim[:, 1:]

That results in a (5,3) Boolean array, whereas our original matrix is a (5,4) array:

array([[False, False, False],
       [False, False, False],
       [ True,  True, False],
       [False, False, False],
       [False, False,  True]], dtype=bool)

If we do the same vertically, that results in a (4,4) Boolean array, whereas the original matrix is (5,4):

twodim[:-1] == twodim[1:]

array([[ True, False, False, False],
       [ True, False, False, False],
       [False, False, False, False],
       [ True,  True, False, False]], dtype=bool) 

If we wished to know which entries have both vertical and horizontal pairs, it is non-trivial to figure out which dimension we are in.

Your tests produce a boolean matrix of the same shape as twodim:

In [487]: mask3 = twodim%3==0
In [488]: mask3
Out[488]: 
array([[False, False,  True, False],
       [False,  True, False, False],
       [False, False, False,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True]], dtype=bool)

As other answers noted, you can combine tests logically, with "and" and "or" (for boolean arrays, the element-wise & and | operators).
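A minimal sketch of such logical combinations, using the array from the question. The names even, big, div3, all_three and any_of are made up here for illustration. Each comparison needs its own parentheses when written inline, because & and | bind more tightly than == and >.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])

# Each test gives a boolean array with the same shape as twodim.
even = twodim % 2 == 0
big = twodim > 7
div3 = twodim % 3 == 0

all_three = even & big & div3   # element-wise AND
any_of = even | big | div3      # element-wise OR

print(twodim[all_three])        # [12 18 18]
print(twodim[~any_of])          # elements failing every test: [1 1 7 1 1 1]
```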

np.where is the same as np.nonzero (in this use), and just returns the coordinates of the True values, as a tuple of 2 arrays.

In [489]: np.nonzero(mask3)
Out[489]: 
(array([0, 1, 2, 3, 3, 4, 4, 4], dtype=int32),
 array([2, 1, 3, 1, 2, 1, 2, 3], dtype=int32))
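As a small illustration of the "in this use" caveat: with a single argument, np.where behaves like np.nonzero, whereas the three-argument form np.where(condition, x, y) does something different and returns a full-size array. The variable names below are illustrative only.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])
mask3 = twodim % 3 == 0

# One-argument form: a tuple (row indices, column indices) of the True values.
rows, cols = np.where(mask3)
nz_rows, nz_cols = np.nonzero(mask3)
print(rows.tolist())   # [0, 1, 2, 3, 3, 4, 4, 4]
print(cols.tolist())   # [2, 1, 3, 1, 2, 1, 2, 3]
print(np.array_equal(rows, nz_rows) and np.array_equal(cols, nz_cols))  # True

# Three-argument form: picks from twodim where the mask is True, else 0.
kept = np.where(mask3, twodim, 0)
print(kept.shape)      # (5, 4)
```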

argwhere returns the same values, but as a transposed 2d array.

In [490]: np.argwhere(mask3)
Out[490]: 
array([[0, 2],
       [1, 1],
       [2, 3],
       [3, 1],
       [3, 2],
       [4, 1],
       [4, 2],
       [4, 3]], dtype=int32)
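The relationship between the two results can be checked directly. The sketch below (variable names are illustrative) also shows one way to get values back out of an argwhere result, by splitting it into its row and column columns.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])
mask3 = twodim % 3 == 0

idx = np.argwhere(mask3)                 # shape (8, 2): one (row, col) pair per True value
same = np.transpose(np.nonzero(mask3))   # the nonzero tuple, stacked and transposed
print(np.array_equal(idx, same))         # True

# Splitting the two columns gives a pair of index arrays that can index twodim.
values = twodim[idx[:, 0], idx[:, 1]]
print(values)                            # [ 3  6 12  3 15  3 18 18]
```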

Both the mask and the tuple can be used to index your array directly:

In [494]: twodim[mask3]
Out[494]: array([ 3,  6, 12,  3, 15,  3, 18, 18])
In [495]: twodim[np.nonzero(mask3)]
Out[495]: array([ 3,  6, 12,  3, 15,  3, 18, 18])

The argwhere result can't be used directly for indexing, but may be more suitable for iteration, especially if you want the indexes as well as the values:

In [496]: for i,j in np.argwhere(mask3):
   .....:     print(i,j,twodim[i,j])
   .....:     
0 2 3
1 1 6
2 3 12
3 1 3
3 2 15
4 1 3
4 2 18
4 3 18

The same thing with where requires a zip:

for i,j in zip(*np.nonzero(mask3)): print(i,j,twodim[i,j])

BUT in general in numpy we try to avoid iteration. If you can use twodim[mask] directly your code will be much faster.
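To make that concrete, here is a rough sketch comparing an explicit loop with direct mask indexing for two small tasks (summing the selected values, and zeroing them in a copy). The tasks and variable names are examples chosen here, not something from the question.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])
mask3 = twodim % 3 == 0

# Loop version: visit every selected (i, j) pair.
loop_total = 0
for i, j in np.argwhere(mask3):
    loop_total += twodim[i, j]

# Vectorised version: index with the mask, then reduce.
vec_total = twodim[mask3].sum()
print(loop_total, vec_total)   # 78 78

# A mask can also be used on the left-hand side of an assignment.
zeroed = twodim.copy()
zeroed[mask3] = 0
print(zeroed[4])               # [17  0  0  0]
```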

Logical combinations of the boolean masks are easier to produce than combinations of the where indices. To use the indices I'd probably resort to set operations (union, intersect, difference).
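A sketch of what the index-based route might look like, next to the mask-based one. The helper index_set is hypothetical, written here only to turn a mask into a Python set of (row, col) tuples.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])
even = twodim % 2 == 0
big = twodim > 7

def index_set(mask):
    """Hypothetical helper: the True positions of a mask as a set of (row, col) tuples."""
    return set(map(tuple, np.argwhere(mask).tolist()))

# Index route: build a set per test, then intersect the sets.
via_sets = index_set(even) & index_set(big)

# Mask route: combine the boolean arrays first, then look up the positions once.
via_mask = index_set(even & big)

print(sorted(via_sets))       # [(1, 3), (2, 3), (3, 3), (4, 2), (4, 3)]
print(via_sets == via_mask)   # True
```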


As for a reduced size test, you have to decide how that maps on to the original array (and other tests). e.g.

A (5,3) mask (difference between columns):

In [505]: dmask=np.diff(twodim, 1).astype(bool)
In [506]: dmask
Out[506]: 
array([[ True,  True,  True],
       [ True,  True,  True],
       [False, False,  True],
       [ True,  True,  True],
       [ True,  True, False]], dtype=bool)

It can index 3 columns of the original array:

In [507]: twodim[:,:-1][dmask]
Out[507]: array([ 1,  2,  3,  1,  6,  7,  1, 17,  3, 15, 17,  3])
In [508]: twodim[:,1:][dmask]
Out[508]: array([ 2,  3,  4,  6,  7,  8, 12,  3, 15, 16,  3, 18])

It can also be combined with 3 columns of another mask:

In [509]: dmask & mask3[:,:-1]
Out[509]: 
array([[False, False,  True],
       [False,  True, False],
       [False, False, False],
       [False,  True,  True],
       [False,  True, False]], dtype=bool)

It is still easier to combine tests in the boolean array form than with where indices.

If you want to find where all three conditions are satisfied:
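For the horizontal/vertical pair tests from the question's edit, one possible mapping (an assumption about what is wanted, not the only choice) is to mark an element True if it equals either neighbour along that axis. The reduced masks are OR-ed into two shifted slices of a full-size mask; h_full, v_full and both are illustrative names.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])

h_eq = twodim[:, :-1] == twodim[:, 1:]   # (5, 3): equal to the element on the right
v_eq = twodim[:-1] == twodim[1:]         # (4, 4): equal to the element below

# Assumed mapping: both members of a matching pair get marked.
h_full = np.zeros(twodim.shape, dtype=bool)
h_full[:, :-1] |= h_eq                   # left member of each horizontal pair
h_full[:, 1:] |= h_eq                    # right member of each horizontal pair

v_full = np.zeros(twodim.shape, dtype=bool)
v_full[:-1] |= v_eq                      # upper member of each vertical pair
v_full[1:] |= v_eq                       # lower member of each vertical pair

both = h_full & v_full                   # now the shapes agree: (5, 4)
print(np.argwhere(both).tolist())        # [[2, 0]]
```

With this mapping the only element that has both a horizontal and a vertical match is twodim[2, 0], which equals its right neighbour and the element above it.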

import numpy as np
twodim =  np.array([[1, 2, 3, 4],  [1, 6, 7, 8], [1, 1, 1, 12],  [17, 3, 15, 16], [17, 3, 18, 18]])

mask = (twodim % 2 == 0) & (twodim > 7) & (twodim % 3 == 0)

print(twodim[mask])

[12 18 18]

Not sure what you want in the end: whether all elements in a row must satisfy the condition (so that you find those rows), or whether you want the individual elements.

import numpy as np
twodim =  np.array([[1, 2, 3, 4],  [1, 6, 7, 8], [1, 1, 1, 12],  [17, 3, 15, 16], [17, 3, 18, 18]])
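condition = (twodim % 2 == 0) & (twodim > 7) & (twodim % 3 == 0)
location = np.argwhere(condition)  # (row, col) pairs where all three tests hold

for i in location:
    # print the index pair and the matching value, all on one line
    print(i, twodim[i[0], i[1]], end=" ")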

>>> [2 3] 12 [4 2] 18 [4 3] 18
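If whole rows are wanted rather than individual elements, the same boolean array can be reduced along axis 1. This is a sketch under that assumption; rows_any and rows_all are names introduced here.

```python
import numpy as np

twodim = np.array([[1, 2, 3, 4], [1, 6, 7, 8], [1, 1, 1, 12],
                   [17, 3, 15, 16], [17, 3, 18, 18]])
condition = (twodim % 2 == 0) & (twodim > 7) & (twodim % 3 == 0)

rows_any = condition.any(axis=1)   # rows with at least one matching element
rows_all = condition.all(axis=1)   # rows where every element matches

print(np.flatnonzero(rows_any))    # [2 4]
print(twodim[rows_any])            # the two rows containing 12 and 18
print(rows_all.any())              # False: no row matches in every column
```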
