
Choosing the number of clusters in k-means

Original post: 2010-11-20 07:33:19 | Tags: algorithm, matlab

I want to cluster a large sample of data, and I am using the kmeans function in MATLAB to do it. The problem is that it returns a matrix with all the data assigned to however many clusters I specify.

How can I know which number of clusters is optimal?

I thought that getting an equal number of elements in each cluster would be optimal, but this never happens. Instead, it will go on clustering the data for any number I put in.

Please help.

1 answer

I read up on this, and I think an answer could be the following. In kmeans we partition the data around the cluster means, so in theory the best result would be one where each partition holds an equal number of data points.

I used kmeans++, which is generally a better choice than plain kmeans because it spreads out the initial centers instead of picking them all uniformly at random, and then iterated over the number of partitions until the partition sizes were almost equal. This was only approximate: for 3 clusters I got sizes 2180, 729, 1219, and for 4 clusters I got 30, 2422, 1556, 120, so I chose 3 as my final answer.
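The procedure described in this answer can be sketched roughly as follows. This is a minimal illustration in plain Python (standard library only), not MATLAB's kmeans and not the answerer's actual code. The function names (`kmeanspp_init`, `kmeans`, `cluster_sizes`, `imbalance`) and the synthetic data are hypothetical, and the largest-to-smallest size ratio is just one assumed way to quantify "almost equal".

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """k-means++ seeding: the first center is drawn uniformly, each later
    center with probability proportional to its squared distance from the
    nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a center already
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def kmeans(points, k, rng, max_iter=100):
    """Lloyd iterations started from k-means++ seeds."""
    centers = kmeanspp_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

def cluster_sizes(labels, k):
    """Number of points assigned to each of the k clusters."""
    return [labels.count(j) for j in range(k)]

def imbalance(sizes):
    """Largest cluster size divided by the smallest (1.0 = perfectly even)."""
    smallest = min(sizes)
    return float("inf") if smallest == 0 else max(sizes) / smallest

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: three 2-D blobs of 100 points each.
    blob_centers = [(0.0, 0.0), (6.0, 0.0), (3.0, 6.0)]
    data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in blob_centers for _ in range(100)]
    for k in range(2, 6):
        labels, _ = kmeans(data, k, rng)
        sizes = cluster_sizes(labels, k)
        print(k, sizes, round(imbalance(sizes), 2))
```

Applied to the sizes reported in the answer, `imbalance([2180, 729, 1219])` is about 2.99 and `imbalance([30, 2422, 1556, 120])` is about 80.73, which matches the choice of 3 over 4 under this heuristic.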


