
Removing points from list if distance between 2 points is below a certain threshold

I have a list of points and I want to keep points only if the distance between them is greater than a certain threshold. Starting from the first point: if the distance between the first point and the second is less than the threshold, I remove the second point and then compute the distance between the first point and the third. If this distance is also less than the threshold, I compare the first and fourth points. Otherwise, I move on to the distance between the third and fourth points, and so on.

So for example, if the threshold is 2 and I have

list = [1, 2, 5, 6, 10]

then I would expect

new_list = [1, 5, 10]

Thank you!

Not a fancy one-liner, but you can just iterate over the values in the list and append each one to a new list if its distance from the last value in the new list (accessed with [-1]) is at least the threshold:

lst = range(10)
diff = 3

new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:
        new.append(n)

Afterwards, new is [0, 3, 6, 9].
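As a quick check against the example in the question, the same loop can be wrapped in a function. This is only a sketch: the name `filter_by_distance` and its parameters are made up for illustration and do not appear in the answer.

```python
def filter_by_distance(values, threshold):
    """Keep a value only if it is at least `threshold` away from the last kept value."""
    kept = []
    for v in values:
        # The first value is always kept; later ones are compared to the last kept value.
        if not kept or abs(v - kept[-1]) >= threshold:
            kept.append(v)
    return kept


result = filter_by_distance([1, 2, 5, 6, 10], 2)
print(result)  # [1, 5, 10]
```

Note that a value exactly at the threshold distance is kept, because the question only asks to remove points whose distance is below the threshold.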


Concerning your comment "What if I had instead a list of coordinates (x,y)?": In this case you do exactly the same thing, except that instead of just comparing the numbers, you have to find the Euclidean distance between two points. So, assuming lst is a list of (x,y) pairs, the condition in the loop becomes:

if not new or ((n[0]-new[-1][0])**2 + (n[1]-new[-1][1])**2)**.5 >= diff:
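Put together as a full loop, the coordinate version might look like the sketch below. It assumes Python 3.8 or newer so that `math.dist` can stand in for the hand-written square root; the sample `points` list is an assumption chosen for illustration.

```python
import math

points = [(i, i) for i in range(10)]  # assumed sample data: (0, 0), (1, 1), ...
diff = 3

new = []
for p in points:
    # math.dist returns the Euclidean distance between two points
    if not new or math.dist(p, new[-1]) >= diff:
        new.append(p)

print(new)  # [(0, 0), (3, 3), (6, 6), (9, 9)]
```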

Alternatively, you can convert your (x,y) pairs into complex numbers. For those, basic operations such as addition, subtraction and absolute value are already defined, so you can just use the above code again.

lst = [complex(x,y) for x,y in lst]

new = []
for n in lst:
    if not new or abs(n - new[-1]) >= diff:  # same as in the first version
        new.append(n)
print(new)

Now, new is a list of complex numbers representing the points: [0j, (3+3j), (6+6j), (9+9j)]
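If you need (x,y) tuples again afterwards, the real and imaginary parts can be read back out. A small sketch, using the result shown above as input (the name `points` is just illustrative):

```python
new = [0j, (3+3j), (6+6j), (9+9j)]  # result of the complex-number version

# .real and .imag are floats, so the coordinates come back as floats
points = [(z.real, z.imag) for z in new]
print(points)  # [(0.0, 0.0), (3.0, 3.0), (6.0, 6.0), (9.0, 9.0)]
```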

While the solution by tobias_k works, it is not the most efficient (in my opinion, but I may be overlooking something). It is based on list order and does not consider that the element that is close (within the threshold) to the largest number of other elements should be eliminated last. The element with the fewest such connections (or proximities) should be considered and checked first. The approach I suggest will likely retain the maximum number of points that lie outside the specified threshold from the other elements in the given list. It works very well for lists of vectors, and therefore for x,y or x,y,z coordinates. If, however, you intend to use this solution with a list of scalars, you can simply include this line in the code: orig_list=np.array(orig_list)[:,np.newaxis].tolist()
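To illustrate what that conversion line does, here is a small sketch with an assumed list of scalars (the values are taken from the question's example): each scalar becomes a one-element vector.

```python
import numpy as np

orig_list = [1, 2, 5, 6, 10]  # assumed scalar input

# Adding a new axis turns shape (5,) into (5, 1), i.e. one-element vectors
orig_list = np.array(orig_list)[:, np.newaxis].tolist()
print(orig_list)  # [[1], [2], [5], [6], [10]]
```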

Please see the solution below:

import numpy as np

thresh = 2.0

orig_list=[[1,2], [5,6], ...]

nsamp = len(orig_list)
arr_matrix = np.array(orig_list)
distance_matrix = np.zeros([nsamp, nsamp], dtype=float)  # np.float was removed in NumPy 1.24; use the builtin float

for ii in range(nsamp):
    distance_matrix[:, ii] = np.apply_along_axis(lambda x: np.linalg.norm(np.array(x)-np.array(arr_matrix[ii, :])),
                                                              1,
                                                              arr_matrix)


n_proxim = np.apply_along_axis(lambda x: np.count_nonzero(x < thresh),
                               0,
                               distance_matrix)

idx = np.argsort(n_proxim).tolist()
idx_out = list()

for ii in idx:
    for jj in range(ii+1):
        if ii not in idx_out:
            if distance_matrix[ii, jj] < thresh:
                if ii != jj:
                    idx_out.append(jj)

pop_idx = sorted(np.unique(idx_out).tolist(),
                 reverse=True)

for pop_id in pop_idx:
    orig_list.pop(pop_id)

nsamp = len(orig_list)
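The pairwise distance matrix and the per-point proximity counts can probably also be computed without the explicit loop and `np.apply_along_axis`, using NumPy broadcasting. This is a sketch of that one step only, not a replacement for the whole answer; `sample_points`, `dist` and `n_close` are names and data assumed for illustration.

```python
import numpy as np

thresh = 2.0
sample_points = [[1, 2], [5, 6], [1.5, 2.5]]  # hypothetical data

arr = np.array(sample_points, dtype=float)

# Shape (n, 1, d) minus shape (1, n, d) broadcasts to all pairwise differences (n, n, d)
diffs = arr[:, np.newaxis, :] - arr[np.newaxis, :, :]
dist = np.linalg.norm(diffs, axis=-1)  # (n, n) Euclidean distance matrix

# Number of points within the threshold of each point; as in the code above,
# the count includes the point itself because the diagonal is zero
n_close = np.count_nonzero(dist < thresh, axis=0)
print(n_close)  # [2 1 2]
```

This builds an intermediate array of shape (n, n, d), so for very large lists memory use may become a concern.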
