简体   繁体   English

如何使用PHP从距离矩阵中获取聚类?

[英]How can I get clusters from distance matrix, using PHP?

I have distance matrix as two-dimensional array, like this: 我将距离矩阵作为二维数组,如下所示:

距离矩阵

So, I need to find clusters, of elements with its help. 所以,我需要在它的帮助下找到元素集群。 I can do it, using hierarchic clusterization, like k-means. 我可以使用层次聚类来完成它,就像k-means一样。 I have found such example here PHP K-Means 我在PHP K-Means中找到了这样的例子

How can I convert my two-dimensional array into array of points, listed in this example? 如何将我的二维数组转换为此示例中列出的点数组?

$points = [
[80,55],[86,59],[19,85],[41,47],[57,58],
[76,22],[94,60],[13,93],[90,48],[52,54],
[62,46],[88,44],[85,24],[63,14],[51,40],
[75,31],[86,62],[81,95],[47,22],[43,95],
[71,19],[17,65],[69,21],[59,60],[59,12],
[15,22],[49,93],[56,35],[18,20],[39,59],
[50,15],[81,36],[67,62],[32,15],[75,65],
[10,47],[75,18],[13,45],[30,62],[95,79],
[64,11],[92,14],[94,49],[39,13],[60,68],
[62,10],[74,44],[37,42],[97,60],[47,73],
];

First: a nitpick: k-Means is not a hierarchical clustering algorithm, see https://www.quora.com/What-is-the-difference-between-k-means-and-hierarchical-clustering for details o the difference. 第一:挑剔:k-Means不是一种层次聚类算法,详情请见https://www.quora.com/What-is-the-difference-between-k-means-and-hierarchical-clustering

Second: you don't want to convert a distance matrix back to the points it originated from as you take a step back. 第二:当您退后一步时,您不希望将距离矩阵转换回它起源的点。 Sadly the k-Means implementation you linked only has an API that allows you to enter raw coordinates and assumes Euclidean distance, therefore you have some possibilities, depending on your requirements: 可悲的是,您链接的k-Means实现只有一个API,允许您输入原始坐标并假设欧几里德距离,因此您有一些可能性,具体取决于您的要求:

  1. Where do you get the distance matrix from? 你从哪里得到距离矩阵? If it is possible, get the raw coordinates (and make sure the distance measure is euclidean distance) and use the library you linked. 如果可能,获取原始坐标(并确保距离度量是欧氏距离)并使用您链接的库。

  2. Override the Point class in the library you linked: specifically the getDistanceWith method to return values from your matrix 覆盖您链接的库中的Point类:特别是getDistanceWith方法,用于从矩阵返回值

  3. If you only need to calculate the cluster once, use python and sklearn. 如果您只需要计算一次集群,请使用python和sklearn。 This library does exactly what you want. 这个库完全符合你的要求。 Especially: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html 特别是: https//docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.cluster.hierarchy.linkage.html

  4. Write your own code: clustering is quite an easy topic and therefore it is a nice coding exercise. 编写自己的代码:集群是一个非常简单的主题,因此它是一个很好的编码练习。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM