Bag of words training and testing opencv, matlab

Original post: 2012-07-22 17:47:40. Tags: matlab, opencv, image-processing, matlab-cvst, object-recognition

I'm implementing Bag of Words in OpenCV using SIFT features in order to classify a specific dataset. So far, I have been able to cluster the descriptors and generate the vocabulary. As far as I know, I now have to train an SVM, but I have some questions that I'm really confused about. The major problem is the concept behind the implementation. These are my questions:

1- When I extract the features and then create the vocabulary, should I extract the features for all the objects (let's say 5 objects) and put them in one file, so that a single vocabulary file holds all the words? And how will I separate them later on when I do the classification?

2- How do I implement the SVM? I know the functions that are used in OpenCV, but how do I use them?

3- I can do the work in MATLAB, by which I mean the implementation of the SVM training, but is there any code available that can guide me through my work? I have seen the code used by Andrea Vedaldi, here, but he works with only one class at a time, and he does not show how to create the .mat file that he uses in his exercises. None of the other implementations I could find use SVM. So, can you guide me on this point too?

Thank you

2 answers

Local features

When you work with SIFT, you usually want to extract local features. What does that mean? You have your image, and in this image you locate points from which you extract local feature vectors. A local feature vector is just a vector of numerical values that describes the visual information of the image region from which it was extracted. Although the number of local feature vectors that you can extract from image A does not need to be the same as the number you can extract from image B, the number of components of a local feature vector (i.e. its dimensionality) is always the same.
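To make the shape of the data concrete, here is a small Python sketch. It does not run SIFT; `fake_descriptors` is a hypothetical helper that only produces placeholder vectors, and the length 128 is the usual SIFT descriptor size.

```python
# Toy stand-ins for SIFT output: each image yields a different number of
# descriptors, but every descriptor has the same length (128 for SIFT).
DESCRIPTOR_DIM = 128

def fake_descriptors(count, dim=DESCRIPTOR_DIM):
    """Hypothetical helper: returns `count` placeholder vectors of length `dim`."""
    return [[float((i + j) % 7) for j in range(dim)] for i in range(count)]

image_a = fake_descriptors(5)   # pretend image A produced 5 keypoints
image_b = fake_descriptors(12)  # pretend image B produced 12 keypoints

for name, descs in (("A", image_a), ("B", image_b)):
    dims = {len(d) for d in descs}  # set of distinct descriptor lengths
    print(name, "descriptors:", len(descs), "dimensionality:", dims)
```

The counts differ per image, while the set of descriptor lengths contains a single value.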

Now, if you want to use your local feature vectors to classify images, you have a problem. In traditional image classification, each image is described by a global feature vector which, in the context of machine learning, can be seen as a set of numerical attributes. However, when you extract a set of local feature vectors, you don't have the global representation of each image that image classification requires. A technique that can be employed to solve this problem is the bag of words, also known as bag of visual words (BoW).

Bag of visual words

Here's the (very) simplified BoW algorithm:

1. Extract the SIFT local feature vectors from your set of images;
2. Put all these local feature vectors into a single set. At this point you don't even need to store which image each local feature vector was extracted from;
3. Apply a clustering algorithm (e.g. k-means) over the set of local feature vectors in order to find centroid coordinates, and assign an id to each centroid. This set of centroids will be your vocabulary;
4. The global feature vector will be a histogram that counts how many times each centroid occurred in each image. To compute the histogram, find the nearest centroid for each local feature vector.
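The steps above can be sketched in plain Python. This is only an illustration under some assumptions: the descriptors are toy 2-D points rather than real SIFT vectors, the function names are made up for this example, and the k-means initialization simply picks evenly spaced input vectors so the run is reproducible (real implementations normally use random or k-means++ seeding).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v):
    """Index (id) of the centroid closest to vector v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means; returns k centroids (the vocabulary)."""
    # Assumption: deterministic init from evenly spaced input vectors.
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def bow_histogram(centroids, descriptors):
    """Count how many descriptors of one image fall on each centroid."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest(centroids, d)] += 1
    return hist

# Step 1: toy "local features" per image (2-D instead of 128-D).
images = {
    "img1": [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1]],
    "img2": [[5.2, 4.9], [4.8, 5.0], [5.1, 5.2], [0.1, 0.1]],
}
# Step 2: pool every descriptor into one set, ignoring its source image.
all_descriptors = [d for descs in images.values() for d in descs]
# Step 3: cluster to obtain the vocabulary.
vocabulary = kmeans(all_descriptors, k=2)
# Step 4: one histogram (global feature vector) per image.
histograms = {name: bow_histogram(vocabulary, descs)
              for name, descs in images.items()}
print(histograms)  # {'img1': [2, 1], 'img2': [1, 3]}
```

Note that the vocabulary is built once from all images of all classes. The class information only comes back in later, when each image's histogram is paired with its label for training.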

Image Classification

Here I am assuming that your problem is the following:

You have as input a set of labeled images and a set of unlabeled images, to each of which you want to assign a label based on its visual appearance. Suppose your problem is to classify landscape photographs. Your image labels could be, for example, "mountains", "beach" or "forest".

The global feature vector extracted from each image (i.e. its bag of visual words) can be seen as a set of numerical attributes. This set of numerical attributes, representing the visual characteristics of each image, together with the corresponding image labels, can be used to train a classifier. For example, you could use data mining software such as Weka, which has an implementation of SVM, known as SMO, to solve your problem.
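To show how histograms and labels fit together during training, here is a rough Python sketch of a multi-class linear SVM. It is not Weka's SMO and not OpenCV's SVM: it minimizes the hinge loss with a small L2 penalty by plain subgradient descent and handles several classes with a one-vs-rest scheme. The histograms are invented toy values, normalized to sum to 1 (an assumption of this example), and all function names and hyperparameters are illustrative.

```python
def score(model, x):
    """Signed decision value w.x + b of one binary linear model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.001, lr=0.1, epochs=500):
    """Binary linear SVM (labels +1/-1): subgradient descent on hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score((w, b), xi) < 1   # inside the margin?
            w = [wj * (1 - lr * lam) for wj in w]   # L2 shrinkage step
            if violated:                            # hinge subgradient step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def train_one_vs_rest(X, labels):
    """One binary model per class: that class (+1) against all others (-1)."""
    return {c: train_linear_svm(X, [1 if l == c else -1 for l in labels])
            for c in sorted(set(labels))}

def predict(models, x):
    """Pick the class whose binary model gives the highest score."""
    return max(models, key=lambda c: score(models[c], x))

# Toy BoW histograms over a 3-word vocabulary, with their image labels.
train_histograms = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],   # mountains
    [0.1, 0.8, 0.1], [0.2, 0.7, 0.1],   # beach
    [0.1, 0.1, 0.8], [0.1, 0.2, 0.7],   # forest
]
train_labels = ["mountains", "mountains", "beach", "beach", "forest", "forest"]

models = train_one_vs_rest(train_histograms, train_labels)

# Unlabeled images are described with the same vocabulary, then classified.
test_histograms = [[0.75, 0.15, 0.1], [0.1, 0.15, 0.75]]
predictions = [predict(models, h) for h in test_histograms]
print(predictions)
```

The point for the question about separating classes is that the vocabulary is shared, while the labels only enter at this stage, as the second argument of the training call.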

Basically, you only have to format the global feature vectors and corresponding image labels according to the ARFF file format, which is essentially a CSV of global feature vectors, each followed by its image label.
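As a rough illustration, the following Python sketch formats a few histograms as ARFF text. Besides the CSV-like data rows, an ARFF file starts with a short header that declares the relation and each attribute. The relation name, attribute names (`word0`, `word1`, ...) and histogram values are made up for this example.

```python
def to_arff(relation, histograms, labels):
    """Format BoW histograms and their labels as ARFF text."""
    n_words = len(histograms[0])
    classes = sorted(set(labels))
    lines = ["@RELATION " + relation, ""]
    # One numeric attribute per visual word in the vocabulary.
    for i in range(n_words):
        lines.append("@ATTRIBUTE word%d NUMERIC" % i)
    # The class attribute is nominal: it lists every possible label.
    lines.append("@ATTRIBUTE class {%s}" % ",".join(classes))
    lines += ["", "@DATA"]
    # Data section: histogram values followed by the image label.
    for hist, label in zip(histograms, labels):
        lines.append(",".join(str(v) for v in hist) + "," + label)
    return "\n".join(lines) + "\n"

arff_text = to_arff(
    "landscapes",
    [[2, 1, 0], [0, 3, 1], [1, 0, 4]],
    ["mountains", "beach", "forest"],
)
print(arff_text)
```

Writing `arff_text` to a file with the `.arff` extension should give something Weka can open directly.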

Here's a very good article introducing the Bag of Words model for classification using OpenCV v2.2: http://app-solut.com/blog/2011/07/the-bag-of-words-model-in-opencv-2-2/

A follow-up article on using the Normal Bayes Classifier for image categorization: http://app-solut.com/blog/2011/07/using-the-normal-bayes-classifier-for-image-categorization-in-opencv/

A ~200-line code demo on the Caltech-256 dataset is also available: http://code.google.com/p/open-cv-bow-demo/downloads/detail?name=bowdemo.tar.gz&can=2&q=

Here's something to get an intuitive feel for the process of image classification: http://www.robots.ox.ac.uk/~vgg/share/practical-image-classification.htm

It really helped me clarify a lot of questions. I hope it helps someone. :)