
Filter numpy ndarray with another ndarray, row by row

I have two numpy ndarrays.

The first contains x and y values:

xy_arr = [[ 736190.125         1130.        ]
 [ 736190.16666667    1130.        ]
 [ 736190.20833333    1130.        ]
...,
 [ 736190.375         1140.        ]
 [ 736190.41666667    1140.        ]
 [ 736190.45833333    1140.        ]
 [ 736190.5           1140.        ]]

The second has x, y and index values and is much bigger than the first:

xyind_arr = [[  7.35964000e+05   1.02000000e+03   0.00000000e+00]
 [  7.35964042e+05   1.02000000e+03   1.00000000e+00]
 [  7.35964083e+05   1.02000000e+03   2.00000000e+00]
 ..., 
 [  7.36613397e+05   1.09500000e+03   3.07730000e+04]
 [  7.36613404e+05   1.10000000e+03   3.07740000e+04]
 [  7.36613411e+05   1.10500000e+03   3.07750000e+04]]

I want to keep all rows of xyind_arr whose x and y values both match a row of xy_arr, like:

(xyind_arr[:,0] == xy_arr[:,0]) and (xyind_arr[:,1] == xy_arr[:,1])

My code:

sub_array = xyind_arr[((xyind_arr[:, 0] == xy_arr[:, 0]) &
                       (xyind_arr[:, 1] == xy_arr[:, 1]))]

It only works if xy_arr has a single row. For example:

import numpy as np

xy_arr = np.array([[56, 400]])
xyind_arr = np.array([[5, 6, 0],[8, 12, 1],[9, 17, 2],[56, 400, 3],[23, 89, 4]])

sub_array = xyind_arr[((xyind_arr[:, 0] == xy_arr[:, 0]) &
                       (xyind_arr[:, 1] == xy_arr[:, 1]))]

print(sub_array)

Result OK:

[[ 56 400   3]]

But with

xy_arr = np.array([[5, 6],[8, 12],[23, 89]])

The result is

[]

And I expected

[[5, 6, 0],[8, 12, 1],[23, 89, 4]]

Is there a clean numpy method to obtain this filtered sub-array?


Edit:

In the end I dropped the numpy solution and used Python's set():

    xy_arr_set = set(map(tuple, xy_arr))
    xyind_arr_set = set(map(tuple, xyind_arr))
    for x, y, ind in xyind_arr_set:
        if (x,y) in xy_arr_set:
            "do what i need"

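As a runnable variant of the set() idea above, here is a minimal sketch that builds a boolean mask instead of looping over a set of rows, so the original row order of xyind_arr is kept. The names xy_set and mask are introduced here for illustration only, and the small integer arrays are the ones from the example in the question. With float coordinates the membership test is an exact comparison, so it assumes both arrays hold bit-identical values.

```python
import numpy as np

xy_arr = np.array([[5, 6], [8, 12], [23, 89]])
xyind_arr = np.array([[5, 6, 0], [8, 12, 1], [9, 17, 2], [56, 400, 3], [23, 89, 4]])

# Build the lookup set once, from the smaller array.
xy_set = set(map(tuple, xy_arr.tolist()))

# One boolean per row of xyind_arr: True when its (x, y) pair is in the set.
mask = np.array([(x, y) in xy_set for x, y in xyind_arr[:, :2].tolist()],
                dtype=bool)

sub_array = xyind_arr[mask]
print(sub_array)
```
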
There is numpy.isin, but it tests membership element by element against an array of scalars; it does not compare tuples. You could use it to find all rows of Array1 where the 0th column entry is in the 0th column of Array2 and the 1st column entry is in the 1st column of Array2. But this is different from your task, because there is no guarantee that both the 0th and the 1st entry were found in the same row of Array2.
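To make the difference concrete, here is a small sketch with made-up data (the arrays and the name col_mask are illustrative, not from the question). The row [5, 12, 1] passes the column-wise test even though the pair (5, 12) is not a row of xy_arr, because 5 and 12 come from two different rows.

```python
import numpy as np

xy_arr = np.array([[5, 6], [8, 12]])
xyind_arr = np.array([[5, 6, 0], [5, 12, 1], [8, 12, 2]])

# Column-wise membership: each column is tested independently.
col_mask = (np.isin(xyind_arr[:, 0], xy_arr[:, 0]) &
            np.isin(xyind_arr[:, 1], xy_arr[:, 1]))

# All three rows are selected, including the false positive [5, 12, 1].
print(xyind_arr[col_mask])
```
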

Since xyind_arr is much larger, I think it should be acceptable to loop over the smaller array xy_arr, applying one xy_arr row as a filter at a time, and concatenate the results. For this to work, the rows of xy_arr must be unique (otherwise matching rows would appear more than once in the result), so it is better to ensure that first:

xy_arr = np.unique(xy_arr, axis=0)    
sub_array = np.concatenate([xyind_arr[(xyind_arr[:, 0] == xy_arr[k, 0]) &
                            (xyind_arr[:, 1] == xy_arr[k, 1])]
                            for k in np.arange(xy_arr.shape[0])], axis=0) 

Note: the order of rows will not be preserved.
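If row order matters, one possible alternative (not part of the answer above, only a sketch) is to compare every row of xyind_arr against every row of xy_arr with broadcasting and reduce to a single boolean mask. The names pair_equal and mask are illustrative. The intermediate array has shape (N, M, 2), so this assumes xy_arr is small enough for that to fit in memory, and it uses exact equality, which may need care with float coordinates.

```python
import numpy as np

xy_arr = np.array([[5, 6], [8, 12], [23, 89]])
xyind_arr = np.array([[5, 6, 0], [8, 12, 1], [9, 17, 2], [56, 400, 3], [23, 89, 4]])

# Shapes (N, 1, 2) and (1, M, 2) broadcast to (N, M, 2):
# pair_equal[i, j, c] is True when column c of row i matches row j of xy_arr.
pair_equal = xyind_arr[:, None, :2] == xy_arr[None, :, :]

# A row is kept when both columns match (all over axis 2)
# for at least one row of xy_arr (any over axis 1).
mask = pair_equal.all(axis=2).any(axis=1)

sub_array = xyind_arr[mask]
print(sub_array)
```

Because the result is a mask over xyind_arr, the selected rows keep their original order and duplicate rows in xy_arr do not produce duplicated output.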
