
Assign descriptors to cluster centers after creating clusters using VLFeat

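I am clustering my data with k-means, but instead of the standard algorithm I use an approximate nearest neighbor (ANN) algorithm to speed up the sample-to-center comparisons. This is easily done with: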

[clusterCenters, trainAssignments] = vl_kmeans(trainDescriptors, clusterCount, 'Algorithm', 'ANN', 'MaxNumComparisons', ceil(clusterCount / 50));

When I run this code, the variable 'trainDescriptors' is clustered, and each descriptor is assigned to one of the 'clusterCenters' using ANN.

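I also have another variable, 'testDescriptors', that I want to assign to the same cluster centers. This assignment must be done with the same method that was used for 'trainDescriptors', but AFAIK the vl_kmeans function does not return the tree it builds for fast assignment.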

So my question is: is it possible to assign 'testDescriptors' to 'clusterCenters' in the same way that vl_kmeans assigns 'trainDescriptors' to 'clusterCenters', and if so, how?

OK, I have figured it out. It can be done as follows:

clusterCount = 1024;
datasetTrain = single(rand(128, 100000)); 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 1 - cluster train data and get train assignments
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

[clusterCenters, trainAssignments_actual] = vl_kmeans(datasetTrain, clusterCount, ...
    'Algorithm', 'ANN', ...
    'Distance', 'l2', ...
    'NumRepetitions', 1, ...
    'NumTrees', 3, ...
    'MaxNumComparisons', ceil(clusterCount / 50) ...
);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 2 - build a kd-tree forest on the cluster centers and re-assign train data
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

forest = vl_kdtreebuild(clusterCenters, ...
    'Distance', 'l2', ...
    'NumTrees', 3 ...
);

trainAssignments_expected = vl_kdtreequery(forest, clusterCenters, datasetTrain);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 3 - validate second assignment
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

validation = isequal(trainAssignments_actual, trainAssignments_expected);

In step 2, I build a new kd-tree forest from the cluster centers and then assign the data to the centers again. It gives valid results.
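The same approach should carry over to the test descriptors from the question: build the forest on the cluster centers once, then query it with the new data. Below is a minimal sketch, assuming VLFeat is already on the MATLAB path. The random matrices and the reduced sizes are placeholders chosen for illustration. Passing 'MaxComparisons' with the same budget as the k-means call is an assumption about how to mirror the ANN setting, not something the answer above does: without it, vl_kdtreequery uses its default of 0, which means an unbounded (exact) search.

```matlab
% Assumes VLFeat is installed and vl_setup has been run.
clusterCount     = 64;                        % placeholder, smaller than above
trainDescriptors = single(rand(128, 5000));   % placeholder training data
testDescriptors  = single(rand(128, 1000));   % placeholder test data (same class and dimension)

% Cluster the training data with the ANN variant of k-means.
clusterCenters = vl_kmeans(trainDescriptors, clusterCount, ...
    'Algorithm', 'ANN', ...
    'NumTrees', 3, ...
    'MaxNumComparisons', ceil(clusterCount / 50));

% Build a kd-tree forest over the centers (not over the descriptors).
forest = vl_kdtreebuild(clusterCenters, ...
    'Distance', 'l2', ...
    'NumTrees', 3);

% Query the forest with the test descriptors. The second argument must be
% the same matrix that was passed to vl_kdtreebuild.
% Assumption: reuse the k-means comparison budget for the query.
[testAssignments, testSqDistances] = vl_kdtreequery(forest, clusterCenters, ...
    testDescriptors, ...
    'MaxComparisons', ceil(clusterCount / 50));
```

Here testAssignments is a 1-by-N row of cluster indices, one per column of testDescriptors, and testSqDistances holds the corresponding squared L2 distances. Since ANN search is approximate and the forest is rebuilt separately from the one used inside vl_kmeans, individual assignments may differ from what the internal tree would have produced, so a strict isequal check is not guaranteed to pass on every run.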

