
Split NumPy array according to values in the array (a condition)

I have an array:

    arr = [(1,1,1), (1,1,2), (1,1,3), (1,1,4)...(35,1,22),(35,1,23)]

I want to split my array according to the third value in each tuple. I want each third value of 1 to be the start of a new array. The results should be:

    [(1,1,1), (1,1,2),...(1,1,35)][(1,2,1), (1,2,2),...(1,2,46)]

and so on. I know numpy.split should do the trick, but I'm lost as to how to write the condition for the split.

Here's a quick idea, working with a 1d array. It can be easily extended to work with your 2d array:

In [385]: x=np.arange(10)

In [386]: I=np.where(x%3==0)

In [387]: I
Out[387]: (array([0, 3, 6, 9]),)

In [389]: np.split(x,I[0])
Out[389]: 
[array([], dtype=float64),
 array([0, 1, 2]),
 array([3, 4, 5]),
 array([6, 7, 8]),
 array([9])]

The key is to use where to find the indices at which you want split to act.


For a 2d arr

First make a sample 2d array, with something interesting in the 3rd column:

In [390]: arr=np.ones((10,3))
In [391]: arr[:,2]=np.arange(10)
In [392]: arr
Out[392]: 
array([[ 1.,  1.,  0.],
       [ 1.,  1.,  1.],
       ...
       [ 1.,  1.,  9.]])

Then use the same where and boolean test to find the indexes to split on:

In [393]: I=np.where(arr[:,2]%3==0)

In [395]: np.split(arr,I[0])
Out[395]: 
[array([], dtype=float64),
 array([[ 1.,  1.,  0.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  2.]]),
 array([[ 1.,  1.,  3.],
       [ 1.,  1.,  4.],
       [ 1.,  1.,  5.]]),
 array([[ 1.,  1.,  6.],
       [ 1.,  1.,  7.],
       [ 1.,  1.,  8.]]),
 array([[ 1.,  1.,  9.]])]
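
One way to sanity-check the split above is to confirm that every non-empty piece starts at a row matching the condition, and that the pieces stitch back into the original array. This is only a minimal sketch, assuming the same sample array as in the session above; the names split_points, pieces, starts and rebuilt are mine.

```python
import numpy as np

# Rebuild the sample array from the session above: third column is 0..9
arr = np.ones((10, 3))
arr[:, 2] = np.arange(10)

# Row indices where the third column is a multiple of 3
split_points = np.where(arr[:, 2] % 3 == 0)[0]
pieces = np.split(arr, split_points)

# Third-column value of the first row of each non-empty piece
starts = [int(p[0, 2]) for p in pieces if len(p) > 0]
print(starts)  # [0, 3, 6, 9]

# Stacking the non-empty pieces back together recovers the original array
rebuilt = np.concatenate([p for p in pieces if len(p) > 0])
print(np.array_equal(rebuilt, arr))  # True
```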

I cannot think of any numpy functions or tricks to do this. A simple solution using a for loop would be:

In [48]: arr = [(1,1,1), (1,1,2), (1,1,3), (1,1,4),(1,2,1),(1,2,2),(1,2,3),(1,3,1),(1,3,2),(1,3,3),(1,3,4),(1,3,5)]

In [49]: result = []

In [50]: for i in arr:
   ....:     if i[2] == 1:
   ....:         tempres = []
   ....:         result.append(tempres)
   ....:     tempres.append(i)
   ....:

In [51]: result
Out[51]:
[[(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4)],
 [(1, 2, 1), (1, 2, 2), (1, 2, 3)],
 [(1, 3, 1), (1, 3, 2), (1, 3, 3), (1, 3, 4), (1, 3, 5)]]
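
Note that the loop above relies on the very first element having a third value of 1; otherwise tempres does not exist yet when the first append runs. Here is a hedged sketch of the same idea wrapped in a function that also opens a group for any leading elements. The function name split_on_start and its start_value parameter are hypothetical, not part of the answer above.

```python
def split_on_start(items, start_value=1):
    """Group tuples so each group begins where the third value equals start_value."""
    result = []
    for item in items:
        # Open a new group at every start marker, and also for leading
        # items that appear before the first marker.
        if item[2] == start_value or not result:
            result.append([])
        result[-1].append(item)
    return result


arr = [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4),
       (1, 2, 1), (1, 2, 2), (1, 2, 3),
       (1, 3, 1), (1, 3, 2), (1, 3, 3), (1, 3, 4), (1, 3, 5)]

groups = split_on_start(arr)
print([len(g) for g in groups])  # [4, 3, 5]
```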

From looking at the documentation, it seems that specifying the indices to split on will work best. For your specific example, the following works if arr is already a 2-dimensional numpy array:

np.split(arr, np.where(arr[:,2] == 1)[0])

arr[:,2] returns an array of the 3rd entry in each tuple. The colon says to take every row and the 2 says to take the 3rd column, which is the 3rd component.
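
Column indexing like arr[:,2] works only on a NumPy array; a plain Python list of tuples, as written in the question, would need converting first. A small sketch, assuming the data starts out as a list of tuples (the sample values are shortened from the loop answer above, and the names tuples and third are illustrative):

```python
import numpy as np

# Sample data in the question's format: a plain list of 3-tuples
tuples = [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2)]

# A list does not accept the [:, 2] index, so convert to a 2-D array first
arr = np.array(tuples)
print(arr.shape)  # (5, 3)

# Every row, column index 2: the third value of each tuple
third = arr[:, 2]
print(third.tolist())  # [1, 2, 3, 1, 2]
```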

We then use np.where to return all the places where the 3rd coordinate is a 1. We have to do np.where()[0] to get at the array of locations directly.
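
To see why the [0] is needed: called with just a condition, np.where returns a tuple holding one index array per dimension, so a 1-D input yields a one-element tuple. The sketch below uses made-up values for the third column, and it also shows np.flatnonzero, which I am suggesting as an equivalent for 1-D conditions; it is not part of the original answer.

```python
import numpy as np

# Hypothetical third-column values, for illustration only
third = np.array([1, 2, 3, 1, 2, 1])

where_result = np.where(third == 1)
print(type(where_result).__name__, len(where_result))  # tuple 1

# Unwrap the tuple to get the plain array of positions
positions = where_result[0]
print(positions.tolist())  # [0, 3, 5]

# For a 1-D condition, np.flatnonzero gives the same positions directly
same = np.flatnonzero(third == 1)
print(same.tolist())  # [0, 3, 5]
```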

We then plug the indices we've found, where the 3rd coordinate is 1, into np.split, which splits at the desired locations.

Note that because the first entry has a 1 in the 3rd coordinate, it will split before the first entry. This gives us one extra "split" array, which is empty.
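
If that empty leading array is unwanted, one option is to skip a split index of 0 before calling np.split. This is a sketch under the assumption that arr is a 2-D array; the sample data comes from the loop answer above, and the names starts and groups are illustrative.

```python
import numpy as np

arr = np.array([(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4),
                (1, 2, 1), (1, 2, 2), (1, 2, 3),
                (1, 3, 1), (1, 3, 2), (1, 3, 3), (1, 3, 4), (1, 3, 5)])

# Positions where the third column is 1
starts = np.where(arr[:, 2] == 1)[0]

# Splitting at index 0 would produce an empty first piece, so keep only
# the positive indices
groups = np.split(arr, starts[starts > 0])

print([len(g) for g in groups])  # [4, 3, 5]
print(groups[1].tolist())        # [[1, 2, 1], [1, 2, 2], [1, 2, 3]]
```

Filtering with starts > 0 rather than always slicing off the first index keeps the result correct even when the data does not begin with a 1 in the third column.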
