
Adding Euclidean distance to a matrix

Let me explain what I have to develop.

I have to write a function that receives two matrices with the same number of columns but possibly different numbers of rows.

In other words, we have two matrices of row vectors with the same dimension d but different numbers of vectors, N and M.

I have to compute the Euclidean distance between every vector of the first matrix and every vector of the second, and store the results in a third matrix that holds all of these pairwise distances.
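To make the expected result concrete, here is a small worked example in plain Python 3 (the code in this question is Python 2). The helper name `euclidean` and the toy inputs are assumptions for illustration only; they are not part of the original code.

```python
import math

def euclidean(a, b):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Hypothetical toy inputs: 2 vectors in x, 3 vectors in y, all of dimension 2.
x = [[0, 0], [3, 4]]
y = [[0, 0], [3, 4], [6, 8]]

# D[i][j] holds the distance between x[i] and y[j],
# so D has len(x) rows and len(y) columns.
D = [[euclidean(a, b) for b in y] for a in x]
print(D)  # [[0.0, 5.0, 10.0], [5.0, 0.0, 5.0]]
```

The result has one row per vector of x and one column per vector of y, which is the N x M layout the function is supposed to fill.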

This is the code I have developed:

import math
import numpy as np

def compute_distances(x, y):
    # Dimension:
    N, d = x.shape
    M, d_ = y.shape

    # The dimension should be the same
    if d != d_:
        print "Dimensions of x and y do not match, cannot compute the distances..."
        return None

    # Calculate distance with loops:
    D = np.zeros((N, M))
    i = 0
    j = 0
    for v1 in x:
        for v2 in y:
            if j != M:
                D[i, j] = math.sqrt(sum([(xi - yi)**2 for xi, yi in zip(v1, v2)]))
                # print "[", i, ",", j, "]"
                j = j + 1
            else:
                j = 0
        i = i + 1

    print D

This function receives the two matrices and builds a matrix that should hold the Euclidean distances between the vectors of x and y.

The problem is that I do not know how to assign each computed Euclidean distance to its correct position in the new matrix D.

My main function has the following structure:

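import numpy as np
from scipy.spatial.distance import cdist
# "da" is the module that contains compute_distances (its import is not shown)

n = 1000
m = 700
d = 10

x = np.random.randn(n, d)
y = np.random.randn(m, d)

print "x shape =", x.shape
print "y shape =", y.shape

D_bucle = da.compute_distances(x, y)
D_cdist = cdist(x, y)

print np.max(np.abs(D_cdist - D_bucle))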

D_cdist computes the Euclidean distances using an efficient method. D_bucle should hold the same values, computed with inefficient loop code, but I am not getting the expected result.

I think the problem is in how I build the distance matrix D: it is not filled correctly, so the results are wrong.
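For reference, a loop-free version can also be sketched with plain NumPy broadcasting. This is only an illustration of one efficient approach, written for Python 3; it is not a claim about how cdist works internally, and the name `pairwise_distances` is hypothetical.

```python
import numpy as np

def pairwise_distances(x, y):
    # x has shape (N, d) and y has shape (M, d). Adding the extra axes makes
    # the subtraction broadcast to shape (N, M, d): diff[i, j] = x[i] - y[j].
    diff = x[:, np.newaxis, :] - y[np.newaxis, :, :]
    # Sum the squared differences over the last axis, then take the root.
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical toy inputs: 2 vectors against 1 vector, dimension 2.
x = np.array([[1.0, 2.0], [4.0, 6.0]])
y = np.array([[1.0, 2.0]])
D = pairwise_distances(x, y)
print(D.shape)  # (2, 1)
```

The intermediate array has N * M * d entries (7 million floats for the sizes used here), so this trades memory for speed.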

Updated! I have just updated my code. At first I did not know how to assign the distance for each pair of vectors to the matrix D. Now I know how to assign it, but only the first row of D matches the result of the cdist function.
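One way to investigate why only the first row matches is to trace which cells the counters i and j actually write. The sketch below (Python 3) mirrors the control flow of compute_distances but records indices instead of distances. The helper name `trace_writes` and the sizes N = 3, M = 3 are assumptions for illustration.

```python
def trace_writes(N, M):
    # Same counter logic as compute_distances, but record
    # (row written, column written, true index of v2 in y).
    writes = []
    i = 0
    j = 0
    for _ in range(N):
        for k in range(M):  # k is the real position of v2 in y
            if j != M:
                writes.append((i, j, k))
                j = j + 1
            else:
                j = 0  # this pair is skipped, nothing is stored
        i = i + 1
    return writes

for i, j, k in trace_writes(3, 3):
    status = "ok" if j == k else "misplaced"
    print("D[%d,%d] <- dist(x[%d], y[%d])  %s" % (i, j, i, k, status))
```

In this trace, row 0 lines up because j starts at 0. Since j is never reset at the start of a new row, from the second row on the column index no longer matches the position of v2 in y, and some pairs are skipped entirely.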

I don't fully understand what you're asking, but I do see one problem which may explain your results:

for v1 in x:
  for v2 in y:
    D = math.sqrt(sum([(xi-yi)**2 for xi,yi in zip(v1,v2)]))

You are overwriting the value of D on each of the N×M passes through this loop. When you're done, D contains only the distance from the last comparison. You might need something like D[i,j] = math.sqrt(...
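Applied to the updated code in the question, a minimal sketch of that idea is to take both indices from enumerate, so that j restarts at 0 for every row of x and no manual counters are needed. This is written for Python 3. Raising ValueError instead of printing a message, and returning D instead of printing it, are my assumptions rather than part of the original code.

```python
import math
import numpy as np

def compute_distances(x, y):
    N, d = x.shape
    M, d_ = y.shape
    # Assumption: fail loudly when the vector dimensions differ.
    if d != d_:
        raise ValueError("x and y must have the same number of columns")

    D = np.zeros((N, M))
    for i, v1 in enumerate(x):      # i is the row index in x
        for j, v2 in enumerate(y):  # j restarts at 0 for every v1
            D[i, j] = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(v1, v2)))
    return D

# Small check on random data (the sizes are arbitrary).
rng = np.random.RandomState(0)
x = rng.randn(5, 3)
y = rng.randn(4, 3)
D = compute_distances(x, y)
print(D.shape)  # (5, 4)
```

With the function returning D, the comparison against cdist in the main script should then give a maximum absolute difference close to zero.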
