
Calculate euclidean distance between vectors with cluster medoids

I have an array consisting of 3 vectors that represent 3 objects:

X2=array([[ 5.43840675, -1.05259078, -0.21793506,  8.56686818, -2.58056957,
        -0.07310339, -0.31181501,  0.02696586],
       [ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ],
       [ 5.93081714, -1.52272427,  0.40706477,  8.56256569, -3.216366  ,
        -0.0108426 , -0.57434619, -0.18952662]])

model1 = KMedoids(n_clusters=2, random_state=0).fit(X2)
    

The cluster labels for them are [1, 0, 0].

The medoids are:

medoids=array([[ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ],
       [ 5.43840675, -1.05259078, -0.21793506,  8.56686818, -2.58056957,
        -0.07310339, -0.31181501,  0.02696586]])
    

I want to calculate the distance from each object in X2 to the medoid of each cluster (0, 1). For example, for object [1] and cluster 0:

 X2[1]=([ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ])
medoids[0]=[ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
            -0.04897565, -0.34271698, -0.0339766 ]

The distance (a) should be zero, since the two vectors are identical:

        a=euclidean_distances(X2[1].reshape(-1, 1), X2[model1.medoid_indices_][0].reshape(-1, 1))
        

Any idea what can be the issue?

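The euclidean distance function is working as expected. euclidean_distances treats each row of its input as one sample, and reshape(-1, 1) turns each 8-component vector into 8 rows with a single feature each. The call therefore returns an 8x8 matrix holding the distance between every pair of components, not one distance between the two vectors. Because both inputs contain the same values here, that matrix is symmetric with zeros on the diagonal. To get a single distance, keep each vector as one row, for example with reshape(1, -1).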

import numpy as np
from sklearn_extra.cluster import KMedoids
from sklearn.metrics.pairwise import euclidean_distances


X2=np.array([[ 5.43840675, -1.05259078, -0.21793506,  8.56686818, -2.58056957,
        -0.07310339, -0.31181501,  0.02696586],
       [ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ],
       [ 5.93081714, -1.52272427,  0.40706477,  8.56256569, -3.216366  ,
        -0.0108426 , -0.57434619, -0.18952662]])

model1 = KMedoids(n_clusters=2, random_state=0).fit(X2)

medoids=np.array([[ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ],
       [ 5.43840675, -1.05259078, -0.21793506,  8.56686818, -2.58056957,
        -0.07310339, -0.31181501,  0.02696586]])

X2[1]=([ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
        -0.04897565, -0.34271698, -0.0339766 ])

medoids[0]=[ 5.72318296, -0.99665473, -0.14540062,  8.32051008, -3.36201189,
            -0.04897565, -0.34271698, -0.0339766 ]

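# reshape(-1, 1) gives shape (8, 1): 8 samples with 1 feature each
a = X2[1].reshape(-1, 1)
b = X2[model1.medoid_indices_][0].reshape(-1, 1)

# dist(x, y) = sqrt(dot(x, x) - 2 * dot(x, y) + dot(y, y))
dist = euclidean_distances(a, b)
print(dist)  # 8x8 matrix: one entry per pair of components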

This is what you would see:

[[ 0.          6.71983769  5.86858358  2.59732712  9.08519485  5.77215861
   6.06589994  5.75715956]
 [ 6.71983769  0.          0.85125411  9.31716481  2.36535716  0.94767908
   0.65393775  0.96267813]
 [ 5.86858358  0.85125411  0.          8.4659107   3.21661127  0.09642497
   0.19731636  0.11142402]
 [ 2.59732712  9.31716481  8.4659107   0.         11.68252197  8.36948573
   8.66322706  8.35448668]
 [ 9.08519485  2.36535716  3.21661127 11.68252197  0.          3.31303624
   3.01929491  3.32803529]
 [ 5.77215861  0.94767908  0.09642497  8.36948573  3.31303624  0.
   0.29374133  0.01499905]
 [ 6.06589994  0.65393775  0.19731636  8.66322706  3.01929491  0.29374133
   0.          0.30874038]
 [ 5.75715956  0.96267813  0.11142402  8.35448668  3.32803529  0.01499905
   0.30874038  0.        ]]

