Adding Euclidean distance to a matrix

Question

Here is what I need to develop.

I have to write a function that receives two matrices with the same number of columns but possibly a different number of rows.

In other words, we have two sets of vectors with the same dimension d but a different number of vectors (N and M).

I have to calculate the Euclidean distance between every vector of the first matrix and every vector of the second, and store the results in a third matrix that holds the distance for each pair of vectors.

This is the code I have developed:

def compute_distances(x, y):
    # Dimension:
    N, d = x.shape
    M, d_ = y.shape

    # The dimension should be the same
    if d != d_:
        print "Dimensions of x and y do not match, cannot compute the distances..."
        return None

    # Calculate distance with loops:
    D = np.zeros((N, M))
    i = 0
    j = 0
    for v1 in x:
       for v2 in y:
            if(j != M):
                D[i,j] = math.sqrt(sum([(xi-yi)**2 for xi,yi in zip(v1,v2)]))
            #print "[",i,",",j,"]"
                j = j + 1
            else:
                j = 0
       i = i + 1;

    print D

This function receives the two matrices and creates a matrix D that should hold the Euclidean distances between the vectors of x and y.

The problem is that I do not know how to store each computed Euclidean distance at the correct position of the new matrix D.

My main function has the following structure:

n = 1000
m = 700
d = 10

x = np.random.randn(n, d)
y = np.random.randn(m, d)

print "x shape =", x.shape
print "y shape =", y.shape

D_bucle = da.compute_distances(x, y)
D_cdist = cdist(x, y)

print np.max(np.abs(D_cdist - D_bucle))

D_cdist is the Euclidean distance matrix computed with an efficient method (cdist). D_bucle should give the same result, computed with inefficient loop-based code, but I am not getting the result I expect.

I think the problem is in how I fill my distance matrix D, which makes the results incorrect.

Update: I have updated my code. At first I did not know how to assign the distance for each pair of vectors to the right position in D. Now I know how to assign it, but only the first row of D matches the result of the cdist function.

Answer 1

I don't fully understand what you're asking, but I do see one problem that may explain your results:

for v1 in x:
  for v2 in y:
    D = math.sqrt(sum([(xi-yi)**2 for xi,yi in zip(v1,v2)]))

You are overwriting the value of D on each of the NxM passes through this loop. When you're done, D only contains the distance from the last comparison. You might need something like D[i,j] = math.sqrt(...)
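
For the updated code in the question, the manual j counter looks like the cause of only the first row matching: after the first row j equals M, so the next pass through the inner loop takes the else branch, resets j and skips one distance, which shifts the rest of that row. The function also prints D without returning it, so the caller receives None. Below is a minimal sketch of one way to avoid both issues by letting enumerate supply the indices. It is written for Python 3, and raising ValueError on mismatched dimensions is my own choice rather than something from the question.

```python
import math

import numpy as np


def compute_distances(x, y):
    """Return the N x M matrix of Euclidean distances between rows of x and y."""
    N, d = x.shape
    M, d_ = y.shape

    # Both sets of vectors must have the same dimension.
    # (Raising here is an assumption; the question prints a message and returns None.)
    if d != d_:
        raise ValueError("x and y must have the same number of columns")

    D = np.zeros((N, M))
    # enumerate() supplies the row index i and the column index j directly,
    # so there are no counters to increment or reset by hand.
    for i, v1 in enumerate(x):
        for j, v2 in enumerate(y):
            D[i, j] = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(v1, v2)))

    # Return the matrix so the caller can compare it with cdist(x, y).
    return D
```

With a version like this, the check np.max(np.abs(D_cdist - D_bucle)) from the question's main script should come out close to zero, up to floating-point rounding.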

solution1
0 2017-11-27 23:59:12