Find distance to nearest neighbor in 2d array

I have a 2D array of points and, for each (x, y) point, I want to find the distance to its nearest neighbor as fast as possible.

I can do this using scipy.spatial.distance.cdist:

import numpy as np
from scipy.spatial.distance import cdist

# Random data
data = np.random.uniform(0., 1., (1000, 2))
# Distance between the array and itself
dists = cdist(data, data)
# Sort each row in ascending order (in place)
dists.sort()
# Take column 1 of each row: column 0 is always 0,
# the distance of a point to itself
nn_dist = dists[:, 1]

This works, but I feel like it's too much work, and KDTree should be able to handle this, but I'm not sure how. I'm not interested in the coordinates of the nearest neighbor; I just want the distance (and for it to be as fast as possible).
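As a side note on the cdist version above: it sorts every row in full even though only the second-smallest value per row is needed. A minimal sketch of a lighter variant, assuming NumPy's `np.partition` is acceptable here (this is a swapped-in technique, not part of the original question), could look like the following. The seeded generator `rng` is only there to make the example reproducible.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Seeded random data so the example is reproducible (assumption for illustration)
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, (1000, 2))

# Full pairwise distance matrix, shape (1000, 1000)
dists = cdist(data, data)

# np.partition puts the value of rank 1 (the second smallest) at index 1
# of each row without fully sorting the row
nn_dist = np.partition(dists, 1, axis=1)[:, 1]
```

This still builds the full pairwise distance matrix, so memory use grows quadratically with the number of points.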

KDTree can do this, and the process is almost the same as with cdist: build the tree, query the two nearest points for every point, and keep the second distance. In the timings below, though, cdist is much faster than KDTree. And as pointed out in the comments, cKDTree is faster still:

import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import KDTree
from scipy.spatial import cKDTree
import timeit

# Random data
data = np.random.uniform(0., 1., (1000, 2))

def scipy_method():
    # Distance between the array and itself
    dists = cdist(data, data)
    # Sort each row in ascending order (in place)
    dists.sort()
    # Take column 1 of each row: column 0 is always 0,
    # the distance of a point to itself
    nn_dist = dists[:, 1]
    return nn_dist

def KDTree_method():
    # You have to build the tree before you can query it.
    tree = KDTree(data)
    # Ask for the two closest points, since the closest one
    # is always the point itself. query returns (distances, indices).
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist

def cKDTree_method():
    # Same steps as KDTree_method, using the C implementation
    tree = cKDTree(data)
    dists = tree.query(data, 2)
    nn_dist = dists[0][:, 1]
    return nn_dist

print(timeit.timeit('cKDTree_method()', number=100, globals=globals()))
print(timeit.timeit('scipy_method()', number=100, globals=globals()))
print(timeit.timeit('KDTree_method()', number=100, globals=globals()))

Output (total seconds for 100 runs each, in the order cKDTree, cdist, KDTree):

0.34952507635557595
7.904083715193579
20.765962179145546

Once again, the very unneeded proof that C is awesome!
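To check that the tree-based approach returns the same distances as the brute-force one, a short sanity test along these lines may help. It is only a sketch: the seeded generator `rng` and the variable names are assumptions made for illustration, and it unpacks the `(distances, indices)` tuple returned by `query` instead of indexing it.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import cdist

# Seeded random data so the comparison is reproducible (assumption for illustration)
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, (1000, 2))

# Brute force: full distance matrix, sort each row, take column 1
dists = cdist(data, data)
dists.sort()
nn_brute = dists[:, 1]

# Tree: query the two nearest points; column 0 is the point itself
tree = cKDTree(data)
distances, indices = tree.query(data, k=2)
nn_tree = distances[:, 1]

# Both approaches should agree up to floating-point error
print(np.allclose(nn_brute, nn_tree))
```

Note that from SciPy 1.6.0 onward, KDTree is implemented on top of cKDTree, so the gap between the two in the timings above may not reproduce on current versions.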
