简体   繁体   中英

Finding the two clusters with smallest distance between a list of clusters

I have a list of solution clusters in the following format:

Input:

test = []

# solutions, centroid
test.append([[3,5],4])
test.append([[2,8],5])
test.append([[1,3],2])
test.append([[5,9],7])

Out: [[[3, 5], 4], [[2, 8], 5], [[1, 3], 2], [[5, 9], 7]]

How would I return the union of the 2 clusters with the smallest centroid distance out of all clusters?

Goal Output: [[2,8,3,5],4.5]

The order of the solutions in the union cluster is not relevant.

I have spent a considerable amount of time trying to come up with a solution with multiple loops to no avail.

You can do with two for-loop :

res = []
for t1 in test:
    for t2 in test:
        if t1 != t2:
            res.append([t1[0]+t2[0], (t1[1]+t2[1])/2])
  
# Or as one-line          
res = [[(t1[0]+ t2[0]), (t1[1]+t2[1])/2] for t1 in test for t2 in test if t1 != t2]

Output:

>>> sorted(res, key=lambda x: x[1])[0]
[[3, 5, 1, 3], 3.0]

>>> sorted(res, key=lambda x: x[1])
[[[3, 5, 1, 3], 3.0],
 [[1, 3, 3, 5], 3.0],
 [[2, 8, 1, 3], 3.5],
 [[1, 3, 2, 8], 3.5],
 [[3, 5, 2, 8], 4.5],
 [[2, 8, 3, 5], 4.5],
 [[1, 3, 5, 9], 4.5],
 [[5, 9, 1, 3], 4.5],
 [[3, 5, 5, 9], 5.5],
 [[5, 9, 3, 5], 5.5],
 [[2, 8, 5, 9], 6.0],
 [[5, 9, 2, 8], 6.0]]

You can calculate all distances (essentially all combinations 2 by 2) and sort them by smallest distance, them join the lists (the code below assumes test as you defined in your question):

from itertools import combinations

best_pair = sorted([(abs(el1[1] - el2[1]), el1, el2) for el1, el2 in combinations(test, 2)])[0]
result = [best_pair[1][0] + best_pair[2][0], abs(best_pair[1][1] + best_pair[2][1])/2]

There are optimisations that can be made if you have too many clusters, as the algorithm above is quadratic in time and memory.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM