
NumPy: vectorize sum of distances to a set of points

I'm trying to implement a k-medoids clustering algorithm in Python/NumPy. As part of this algorithm, I have to compute the sum of distances from objects to their "medoids" (cluster representatives).

I have: a distance matrix on five points

import numpy as np

n_samples = 5
D = np.array([[ 0.        ,  3.04959014,  4.74341649,  3.72424489,  6.70298441],
              [ 3.04959014,  0.        ,  5.38516481,  4.52216762,  6.16846821],
              [ 4.74341649,  5.38516481,  0.        ,  1.02469508,  8.23711114],
              [ 3.72424489,  4.52216762,  1.02469508,  0.        ,  7.69025357],
              [ 6.70298441,  6.16846821,  8.23711114,  7.69025357,  0.        ]])

a set of initial medoids

medoids = np.array([0, 3])

and the cluster memberships

cl = np.array([0, 0, 1, 1, 0])

I can compute the required sum using

>>> sum(D[i, medoids[cl[i]]] for i in range(n_samples))
10.777269622938899

but that uses a Python loop. Am I missing some kind of vectorized idiom for computing this sum?

How about:

In [17]: D[np.arange(n_samples),medoids[cl]].sum()
Out[17]: 10.777269629999999
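To spell out why that one-liner works, here is a minimal sketch that splits it into steps and checks it against the loop version. The names `assigned`, `dists`, `total` and `loop_total` are mine, added only for illustration; everything else comes from the question.

```python
import numpy as np

# Distance matrix, medoids and memberships exactly as given in the question.
D = np.array([[0.        , 3.04959014, 4.74341649, 3.72424489, 6.70298441],
              [3.04959014, 0.        , 5.38516481, 4.52216762, 6.16846821],
              [4.74341649, 5.38516481, 0.        , 1.02469508, 8.23711114],
              [3.72424489, 4.52216762, 1.02469508, 0.        , 7.69025357],
              [6.70298441, 6.16846821, 8.23711114, 7.69025357, 0.        ]])
medoids = np.array([0, 3])
cl = np.array([0, 0, 1, 1, 0])
n_samples = D.shape[0]

# Step 1: indexing medoids with cl gives, for every sample, the index of
# the medoid of its own cluster (here 0, 0, 3, 3, 0).
assigned = medoids[cl]

# Step 2: two integer index arrays of equal length are paired element-wise,
# so this picks D[i, assigned[i]] for each i without a Python loop.
dists = D[np.arange(n_samples), assigned]
total = dists.sum()

# Sanity check against the explicit loop from the question.
loop_total = sum(D[i, medoids[cl[i]]] for i in range(n_samples))
assert np.isclose(total, loop_total)
```

The point to watch is that `D[np.arange(n_samples), assigned]` is not the same as `D[:, assigned]`: the former returns one entry per row, while the latter returns a full `(n_samples, n_samples)` block of columns.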
