
Find closest/similar value (vector) inside a matrix

Let's say I have the following NumPy matrix (simplified):

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

And now I want to get the vector from the matrix closest to a "search" vector:

search_vec = np.array([3, 3])

What I have done is the following:

min_dist = None
result_vec = None
for ref_vec in matrix:
    # The Euclidean norm is already non-negative, so no abs() is needed
    distance = np.linalg.norm(search_vec - ref_vec)
    print(ref_vec, distance)
    if min_dist is None or min_dist > distance:
        min_dist = distance
        result_vec = ref_vec

This works, but is there a native NumPy solution that does it more efficiently? My problem is that the bigger the matrix becomes, the slower the whole process gets. Are there other solutions that handle this in a more elegant and efficient way?

Approach #1

We can use SciPy's Cython-powered kd-tree for quick nearest-neighbor lookup, which is efficient in both memory and runtime:

In [276]: from scipy.spatial import cKDTree

In [277]: matrix[cKDTree(matrix).query(search_vec, k=1)[1]]
Out[277]: array([2, 2])
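
The kd-tree pays off mostly when the same matrix is searched many times, because the tree only has to be built once. The following is a minimal sketch under that assumption; the `queries` array and its values are made up for illustration and are not part of the original question. In recent SciPy releases, `scipy.spatial.KDTree` should behave the same way as `cKDTree`.

```python
import numpy as np
from scipy.spatial import cKDTree

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

# Build the tree once and reuse it for every lookup
tree = cKDTree(matrix)

# Hypothetical batch of search vectors, one per row
queries = np.array([[3.0, 3.0],
                    [5.4, 5.4],
                    [0.0, 0.0]])

# With k=1, query returns one distance and one row index per search vector
dist, idx = tree.query(queries, k=1)
closest = matrix[idx]
print(closest)
```

Each row of `closest` is the matrix row nearest to the corresponding row of `queries`.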

Approach #2

With SciPy's cdist:

In [286]: from scipy.spatial.distance import cdist

In [287]: matrix[cdist(matrix, np.atleast_2d(search_vec)).argmin()]
Out[287]: array([2, 2])
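
The same call extends to several search vectors at once, since cdist computes all pairwise distances between two sets of rows. This is a small sketch, assuming a made-up `queries` array that is not in the original post. Note that the full distance table has one entry per (query, matrix row) pair, so memory grows with both sizes.

```python
import numpy as np
from scipy.spatial.distance import cdist

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

# Hypothetical search vectors, one per row
queries = np.array([[3.0, 3.0],
                    [5.4, 5.4]])

# dists[i, j] is the Euclidean distance between queries[i] and matrix[j]
dists = cdist(queries, matrix)

# For each query, pick the matrix row with the smallest distance
closest = matrix[dists.argmin(axis=1)]
print(closest)
```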

Approach #3

With scikit-learn's NearestNeighbors:

from sklearn.neighbors import NearestNeighbors

nbrs = NearestNeighbors(n_neighbors=1).fit(matrix)
closest_vec = matrix[nbrs.kneighbors(np.atleast_2d(search_vec))[1][0,0]]

Approach #4

With scikit-learn's KDTree:

from sklearn.neighbors import KDTree
kdt = KDTree(matrix, metric='euclidean')
cv = matrix[kdt.query(np.atleast_2d(search_vec), k=1, return_distance=False)[0,0]]

Approach #5

Following the wiki of the eucl_dist package (disclaimer: I am its author), we could leverage matrix multiplication. Here d holds the squared Euclidean distances, which have the same argmin as the distances themselves:

M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i', matrix, matrix) + np.inner(search_vec, search_vec) - 2*M
closest_vec = matrix[d.argmin()]
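
The expression above relies on the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. As a sanity check, the sketch below compares it with the direct vectorized form of the loop from the question, using np.linalg.norm with axis=1. The names `direct` and `sq` are chosen here for illustration only.

```python
import numpy as np

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])
search_vec = np.array([3, 3])

# Direct vectorized Euclidean distance from every row to search_vec
direct = np.linalg.norm(matrix - search_vec, axis=1)

# Squared distances via the expansion ||a||^2 + ||b||^2 - 2 a.b
sq = (np.einsum('ij,ij->i', matrix, matrix)
      + np.inner(search_vec, search_vec)
      - 2 * matrix.dot(search_vec))

# Both forms agree, so they select the same closest row
print(np.allclose(direct ** 2, sq))
closest_direct = matrix[direct.argmin()]
closest_expanded = matrix[sq.argmin()]
print(closest_direct, closest_expanded)
```

For small matrices the direct form is probably the simplest drop-in replacement for the loop. The expanded form avoids materializing the difference array, which may matter for large inputs.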


 