
Bag of Visual Words in OpenCV

Original post: 2013-10-24 09:46:05. Tags: opencv, computer-vision, k-means, surf, feature-extraction

I am using BOW in OpenCV to cluster features of variable size. However, one thing is not clear from the OpenCV documentation, and I have been unable to find the answer to this question:

Assume: dictionary size = 100.

I use SURF to compute the features, and each image has a descriptor matrix of variable size, e.g. 128 x 34, 128 x 63, etc. In BOW each of them is clustered, and I get a fixed descriptor size of 128 x 100 for an image. I know 100 is the number of cluster centers created using k-means clustering.

But I am confused: if an image has 128 x 63 descriptors, then how can it be clustered into 100 clusters? That is impossible using k-means UNLESS I convert the descriptor matrix to 1D. Won't converting to 1D lose the valid 128-dimensional information of each keypoint?

I need to know how the descriptor matrix is manipulated to get 100 cluster centers from only 63 features.

1 Answer

Think of it like this.

You have 10 cluster means in total and 6 features for the current image. The first 3 of those features are closest to the 5th mean, and the remaining 3 are closest to the 7th, 8th, and 9th means respectively. Then your feature vector will be [0, 0, 0, 0, 3, 0, 1, 1, 1, 0], or a normalized version of it. It is 10-dimensional, which equals the number of cluster means. So you could create a 100000-dimensional vector from 63 features if you wanted.
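The counting step in that example can be sketched in a few lines of plain Python. This is only an illustration, not OpenCV code: the helper name `bow_histogram` and its `normalize` flag are made up here, and the "5th, 7th, 8th, 9th" means are written as 0-based indices 4, 6, 7, 8.

```python
def bow_histogram(assignments, vocab_size, normalize=False):
    """Count how many features were assigned to each cluster mean.

    `assignments` holds one 0-based cluster index per feature.
    (Hypothetical helper for illustration; not part of OpenCV.)
    """
    hist = [0] * vocab_size
    for idx in assignments:
        hist[idx] += 1
    if normalize:
        total = sum(hist)
        if total:
            hist = [count / total for count in hist]
    return hist


# 6 features: three nearest to the 5th mean, one each to the 7th, 8th, 9th.
assignments = [4, 4, 4, 6, 7, 8]
hist = bow_histogram(assignments, vocab_size=10)
print(hist)  # [0, 0, 0, 0, 3, 0, 1, 1, 1, 0]
```

The length of the result depends only on `vocab_size`, never on how many features the image has, which is why 6 features can fill a 10-bin (or 100000-bin) vector.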

But I still think something is wrong, because after you apply BOW your feature should be 1x100, not 128x100. Your cluster means are 128x1, and you are assigning your 128x1 features (you have 34 128x1 features for the first image, 63 128x1 features for the second image, etc.) to those means. So basically you are assigning 34 or 63 features to 100 means, and your result should be 1x100.
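To make the shapes concrete, here is a rough library-free sketch of the assignment step, assuming the 100 cluster means were already learned beforehand (typically from descriptors pooled over many training images, not from a single image). The random numbers stand in for real SURF descriptors, and the function names are hypothetical, not OpenCV API.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(descriptor, centers):
    """Index of the cluster mean closest to one 128-d descriptor."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(descriptor, centers[i]))


def encode_image(descriptors, centers):
    """Turn a variable number of descriptors into one fixed-length histogram."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1
    return hist


rng = random.Random(0)
DIM, VOCAB = 128, 100

# Stand-ins: 100 precomputed cluster means and two images' descriptors.
centers = [[rng.random() for _ in range(DIM)] for _ in range(VOCAB)]
image_a = [[rng.random() for _ in range(DIM)] for _ in range(34)]
image_b = [[rng.random() for _ in range(DIM)] for _ in range(63)]

hist_a = encode_image(image_a, centers)
hist_b = encode_image(image_b, centers)
print(len(hist_a), sum(hist_a))  # 100 34
print(len(hist_b), sum(hist_b))  # 100 63
```

Both images end up with a 100-bin vector even though they contributed 34 and 63 descriptors. Each descriptor keeps all 128 dimensions while it is being compared against the means; nothing is flattened to 1D.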