
How to find a vector from a list of vectors that is nearest to all other vectors using Python?

I have a list of vectors as a numpy array.

[[ 1., 0., 0.],
 [ 0., 1., 2.] ...]

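They all have the same dimension. How can I find which vector is nearest, in the vector space, to all the other vectors in the array? Is there a scipy or sklearn function that computes this?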

Update

By "nearest" I mean cosine and Euclidean distance.

Update 2

Suppose I have 4 vectors (a, b, c, d), and the cosine distances between them are:

a,b = 0.2

a,c = 0.9

a,d = 0.7

b,c = 0.5

b,d = 0.75

c,d = 0.8

So for each vector I get:

{
    'a': [1, 0.2, 0.9, 0.7],
    'b': [0.2, 1, 0.5, 0.75],
    'c': [0.9, 0.5, 1, 0.8],
    'd': [0.7, 0.75, 0.8, 1]
}

Can I say that vector d is the one most similar to a, b, and c?
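One way to check this is to sum each vector's pairwise values against the others. The sketch below rebuilds the table from the six pairwise values listed above. It assumes the diagonal should be ignored (set to 0 here), and it shows both readings, because a self-value of 1 suggests similarities while the question calls them distances. The names `labels`, `pairwise`, and `totals` are illustrative only.

```python
import numpy as np

labels = ["a", "b", "c", "d"]

# Symmetric matrix built from the six pairwise values in the question.
# Assumption: the diagonal is set to 0 so a vector's value against
# itself does not influence its total.
pairwise = np.array([
    [0.0, 0.2, 0.9, 0.7],
    [0.2, 0.0, 0.5, 0.75],
    [0.9, 0.5, 0.0, 0.8],
    [0.7, 0.75, 0.8, 0.0],
])

# Total of each row: how each vector relates to all the others.
totals = pairwise.sum(axis=1)

# If the values are similarities, the largest total wins.
most_similar = labels[int(np.argmax(totals))]
# If the values are distances, the smallest total wins.
nearest = labels[int(np.argmin(totals))]

print(dict(zip(labels, totals.round(2).tolist())))
print("most similar (similarity reading):", most_similar)
print("nearest (distance reading):", nearest)
```

Under the similarity reading d has the largest total, so it would be the most similar to the others. Under the distance reading b has the smallest total, so it would be the nearest.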

You can brute-force it like this. Note that this is O(n^2) and will get slow for large n.

import numpy as np

def cost_function(v1, v2):
    """Returns the square of the distance between vectors v1 and v2."""
    diff = np.subtract(v1, v2)
    # You may want to take the square root here
    return np.dot(diff, diff)

n_vectors = 5
vectors = np.random.rand(n_vectors, 3)

min_i = -1
min_cost = float('inf')
for i in range(n_vectors):
    # Total cost from vector i to every vector (the j == i term contributes 0)
    sum_cost = 0.0
    for j in range(n_vectors):
        sum_cost += cost_function(vectors[i, :], vectors[j, :])
    # Keep the vector with the lowest total cost seen so far
    if sum_cost < min_cost:
        min_i = i
        min_cost = sum_cost
    print('{} at {}: {:.3f}'.format(i, vectors[i, :], sum_cost))
print('Lowest cost point is {} at {}: {:.3f}'.format(min_i, vectors[min_i, :], min_cost))
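As for a scipy function: there does not seem to be a single call that returns this vector directly, but `scipy.spatial.distance.cdist` computes the full pairwise distance matrix, which reduces the loop above to a row sum and an `argmin`. The following is a minimal sketch rather than a definitive implementation. The helper name `most_central_index` and the sample array are made up for illustration. It is still O(n^2) in time and memory.

```python
import numpy as np
from scipy.spatial.distance import cdist

def most_central_index(vectors, metric="euclidean"):
    """Return the index of the row with the smallest summed distance to all rows.

    Hypothetical helper, not part of scipy. `metric` is passed straight to cdist.
    """
    # (n, n) matrix of pairwise distances between the rows of `vectors`
    distances = cdist(vectors, vectors, metric=metric)
    return int(np.argmin(distances.sum(axis=1)))

# Sample data for illustration only
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 2.0],
    [0.5, 0.5, 1.0],
])

print(most_central_index(vectors, metric="euclidean"))
# Cosine distance is undefined for all-zero vectors, so this assumes none are present
print(most_central_index(vectors, metric="cosine"))
```

Passing `metric="sqeuclidean"` corresponds to the squared-distance cost used in the loop above. Euclidean and cosine distance can pick different vectors, so the choice of metric has to match what "nearest" means for your data.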
