I have run a k-means algorithm using sklearn.cluster.KMeans, saving the results in the object kmeans_results.
I can do cl_centers = kmeans_results.cluster_centers_ to obtain the cluster centers.
cl_centers looks like this:
array([[0.69332691, 0.9118433 , 0.14215727, 0.00903798],
[0.41407049, 0.95964501, 0.19565154, 0.03157038],
[0.88239715, 0.65602688, 0.20304053, 0.01066663],
[0.65413307, 0.92372214, 0.36504241, 0.03482278]])
I would like to calculate the pairwise distances between these 4 points and pick the smallest one, together with the "labels" of the two points involved (where a label is just the array index).
The ideal output would be something like:
"The smallest distance is x, and it occurs between cluster 0 and cluster 3"
By "distance" I mean Euclidean distance.
Is there a Pythonic way of doing this?
You can try scipy.spatial.distance.pdist(your_array), which gives you the pairwise distances between the points as a condensed 1-D array (Euclidean by default). Then take the minimum of that array.
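To make the shape of the pdist result concrete, here is a minimal sketch using the centers from the question. The variable names condensed and square are only illustrative; squareform is the SciPy helper that expands the condensed array into a full symmetric matrix if you prefer to look at it that way.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

cl_centers = np.array([[0.69332691, 0.9118433 , 0.14215727, 0.00903798],
                       [0.41407049, 0.95964501, 0.19565154, 0.03157038],
                       [0.88239715, 0.65602688, 0.20304053, 0.01066663],
                       [0.65413307, 0.92372214, 0.36504241, 0.03482278]])

# pdist returns one distance per unordered pair: n*(n-1)/2 entries for n points
condensed = pdist(cl_centers)

# squareform expands that into an n-by-n symmetric matrix with a zero diagonal
square = squareform(condensed)

print(condensed.shape, square.shape)
print(condensed.min())
```

Taking the minimum of the condensed array avoids having to mask the zero diagonal of the square matrix.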
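The solution to your problem consists of 2 parts: computing the pairwise distances, and mapping the smallest one back to the row indices of the cl_centers array. As @zelenov aleksey suggested for the first part, scipy.spatial.distance.pdist will calculate the pairwise distances. For the second part, you can create a list of index pairs with itertools.combinations, which yields them in the same order as pdist, and select the pair at the position of the minimum.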
The following will give you the ideal output you stated in your question:
import numpy as np
from scipy.spatial.distance import pdist
import itertools as it
centers_arr = np.array([[0.69332691, 0.9118433 , 0.14215727, 0.00903798],
[0.41407049, 0.95964501, 0.19565154, 0.03157038],
[0.88239715, 0.65602688, 0.20304053, 0.01066663],
[0.65413307, 0.92372214, 0.36504241, 0.03482278]])
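# Index pairs (0, 1), (0, 2), ... in the same order pdist uses for its output
pairs = list(it.combinations(range(len(centers_arr)), 2))
# Condensed array of Euclidean distances, one entry per pair
d = pdist(centers_arr)
print("The smallest distance is {}, and it occurs between cluster {} and cluster {}".format(d.min(), *pairs[d.argmin()]))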
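If you would rather not build the list of pairs, a possible variant is to recover the indices with np.triu_indices, which enumerates the upper triangle in the same row-by-row order as pdist. This is only a sketch: closest_pair is a hypothetical helper name, not part of SciPy or scikit-learn.

```python
import numpy as np
from scipy.spatial.distance import pdist

def closest_pair(centers):
    """Return (distance, i, j) for the two closest rows of `centers`.

    Hypothetical helper for illustration; assumes at least two rows.
    """
    centers = np.asarray(centers)
    d = pdist(centers)                               # condensed Euclidean distances
    rows, cols = np.triu_indices(len(centers), k=1)  # same ordering as pdist
    k = d.argmin()                                   # position of the smallest distance
    return d[k], int(rows[k]), int(cols[k])

centers_arr = np.array([[0.69332691, 0.9118433 , 0.14215727, 0.00903798],
                        [0.41407049, 0.95964501, 0.19565154, 0.03157038],
                        [0.88239715, 0.65602688, 0.20304053, 0.01066663],
                        [0.65413307, 0.92372214, 0.36504241, 0.03482278]])

dist, i, j = closest_pair(centers_arr)
print("The smallest distance is {}, and it occurs between cluster {} and cluster {}".format(dist, i, j))
```

With real k-means output you would pass kmeans_results.cluster_centers_ instead of the hard-coded array.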