In Python, how can I compare a NumPy array to each row of a matrix to choose the row that is most similar to the vector?
For instance, if I have a 1D array [91, 119, 161, 203, 259] and a 2D array [[90,120,160,200,260], [95,115,165,204,255]], how can I determine which row of the latter is a better match for the former?
In this case, it would be the first row, because most of its values are off by only 1, whereas most values in the second row are off by 4. Also, the RMSE between the vector and the first row is 1.612, whereas for the second row it is 3.606.
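The RMSE figures quoted above can be reproduced with a short sketch that relies on NumPy broadcasting. The variable names (v, matrix, rmse, best_row) are chosen here for illustration only.

```python
import numpy as np

v = np.array([91, 119, 161, 203, 259])
matrix = np.array([[90, 120, 160, 200, 260], [95, 115, 165, 204, 255]])

# Broadcasting subtracts v from every row of matrix at once;
# axis=1 averages the squared errors within each row.
rmse = np.sqrt(np.mean((matrix - v) ** 2, axis=1))

# Index of the row with the smallest RMSE.
best_row = int(np.argmin(rmse))
print(rmse.round(3), best_row)
```

Rounded to three decimals, rmse holds 1.612 and 3.606, so best_row is 0 (the first row).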
There are many distance metrics, but cosine distance is a good measure of vector similarity. You can use scipy.spatial.distance.cosine to compute the cosine distance between two vectors. You'll want the row with the smallest cosine distance.
In code:
import numpy as np
from scipy.spatial.distance import cosine
v = np.array([91, 119, 161, 203, 259])
matrix = np.array([[90,120,160,200,260], [95,115,165,204,255]])
assert np.argmin([cosine(v, row) for row in matrix]) == 0
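For a matrix with many rows, the Python-level loop can be replaced by a single call to scipy.spatial.distance.cdist, which computes all pairwise distances between two sets of row vectors. The following is a minimal sketch of that alternative, not part of the original answer; the names distances and best_row are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

v = np.array([91, 119, 161, 203, 259])
matrix = np.array([[90, 120, 160, 200, 260], [95, 115, 165, 204, 255]])

# cdist expects two 2D arrays, so v is reshaped into a one-row matrix.
# The result has shape (1, n_rows); [0] flattens it to one distance per row.
distances = cdist(v[None, :], matrix, metric="cosine")[0]

# Index of the row with the smallest cosine distance to v.
best_row = int(np.argmin(distances))
print(best_row)
```

Note that cosine distance compares direction only and ignores magnitude, so a row equal to 2 * v would have a distance of (numerically) zero. If absolute differences matter, as in the RMSE comparison from the question, passing metric="euclidean" instead may be the better fit.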