
Clustering of data

I have a 2-dimensional dataset in MATLAB with several points (say 100), each having an x and a y coordinate. I need to cluster these points around some predefined points (say 5) according to the nearest neighbour (Euclidean distance). However, each predefined point has a limit on the number of points associated with it. For example, predefined point 1 should have a cluster of 20 points from the dataset, the second should have 10, the third should have 30, and so on, without overlapping, and every point should be classified. Is there any function that I can use to do this? In normal clustering, I cannot define the size of an individual cluster. Thank you in advance.

You can use knnsearch in MATLAB to find the nearest neighbours: https://ch.mathworks.com/help/stats/knnsearch.html

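To do so, pass each predefined point as the reference (query) point and request as many nearest points as its cluster should hold. The non-overlap requirement needs to be addressed in a second step, because the same data point can be returned as a neighbour of more than one predefined point.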
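A minimal sketch of these two steps, not a definitive implementation. The data, the predefined points, and the last two cluster sizes (25 and 15) are made-up values for illustration, and the sizes are assumed to add up to the number of data points. The second step uses a simple greedy rule (closest pair first), which is one possible way to resolve the overlap and is my own choice, not something knnsearch does for you.

```matlab
rng(0);                          % reproducible example data (assumed)
X   = rand(100, 2);              % 100 data points, one (x, y) pair per row
C   = rand(5, 2);                % 5 predefined points (assumed values)
cap = [20 10 30 25 15];          % cluster sizes; assumed to sum to size(X, 1)

% Step 1: for every predefined point, rank all data points by distance.
% Row j of idx lists data-point indices from nearest to farthest for
% predefined point j; dist holds the matching Euclidean distances.
[idx, dist] = knnsearch(X, C, 'K', size(X, 1));

% Step 2: resolve overlaps greedily. Visit all (predefined point, data
% point) pairs from the smallest distance upwards and accept a pair only
% if the data point is still free and the cluster is not yet full.
centerId   = repmat((1:size(C, 1))', 1, size(X, 1));  % same shape as idx
[~, order] = sort(dist(:));
pairCenter = centerId(order);
pairPoint  = idx(order);

label = zeros(size(X, 1), 1);    % 0 means "not yet assigned"
count = zeros(1, size(C, 1));    % points assigned to each cluster so far
for p = 1:numel(order)
    j = pairCenter(p);
    i = pairPoint(p);
    if label(i) == 0 && count(j) < cap(j)
        label(i) = j;
        count(j) = count(j) + 1;
    end
end
```

Afterwards label(i) holds the cluster of data point i, so X(label == 1, :) gives the 20 points of the first cluster. The greedy rule guarantees the requested sizes and no overlap, but it does not necessarily minimise the total distance. If that matters, the problem can be treated as an assignment problem instead, for example with matchpairs on a cost matrix in which each predefined point is repeated once per slot.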
