
Haar Cascade Classifier Training Problems

Posted 2012-02-14 12:53:48. Tags: image, image-processing, opencv, computer-vision

I've created multiple Haar cascade classifiers for faces. I used a different number of positives and negatives each time.

For example,

1st classifier: 5000 positives and 3000 negatives
2nd classifier: 3000 positives and 3000 negatives (deleted 2000 redundant/similar images)

The efficiency of both classifiers was almost the same...

Problems:

1. Is there a method to delete all redundant images from my database before training?
2. What are the ideal lighting and background conditions for training a classifier?
3. How many images in the database are considered ideal for best performance, or does it depend on the type of data in the set?

Regards,

Saleh...

1 Answer

All the best for your work.

Answers:

1. How did you delete the redundant images when training the second classifier? I can't give you an exact solution. One option: pick a simple Haar feature and compute the feature vectors (say F1 and F2) for two images. If the correlation between F1 and F2 is high (above some threshold), the images are similar. You will have to test this; if it works, please let me know.
2. It depends on the application. If the classifier will be used in a scenario with changing illumination and backgrounds, then images with those variations should be included in the training set.
3. The training database should contain many images (typically thousands). What matters is the variation among the images in appearance, illumination, shadows, etc. Variation in the database makes the classifier more robust.
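To make the idea in answer 1 concrete, here is a rough, untested-on-real-data sketch in plain Python. It computes a grid of two-rectangle Haar-like responses per image via an integral image, then drops any image whose feature vector has a Pearson correlation near 1 with one already kept. Everything named here (`haar_features`, `drop_near_duplicates`, the `cell` size, the `0.98` threshold) is my own assumption for illustration, not something from OpenCV or from the question. Images are assumed to be equally sized grayscale grids given as lists of rows; in practice you would load and resize them with an image library first.

```python
from math import sqrt

def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_features(img, cell=4):
    """Two-rectangle (left minus right) Haar-like responses on a regular grid.

    `cell` is an assumed block size; tune it to your training window.
    """
    ii = integral_image(img)
    h, w = len(img), len(img[0])
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - 2 * cell + 1, cell):
            left = rect_sum(ii, x, y, cell, cell)
            right = rect_sum(ii, x + cell, y, cell, cell)
            feats.append(left - right)
    return feats

def correlation(f1, f2):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(f1)
    m1, m2 = sum(f1) / n, sum(f2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(f1, f2))
    var1 = sum((a - m1) ** 2 for a in f1)
    var2 = sum((b - m2) ** 2 for b in f2)
    if var1 == 0 or var2 == 0:
        return 0.0
    return cov / sqrt(var1 * var2)

def drop_near_duplicates(images, threshold=0.98):
    """Return indices of images to keep; `threshold` is an assumed cut-off."""
    kept, kept_feats = [], []
    for idx, img in enumerate(images):
        f = haar_features(img)
        # Keep the image only if it is not highly correlated with any kept one.
        if all(correlation(f, g) < threshold for g in kept_feats):
            kept.append(idx)
            kept_feats.append(f)
    return kept

# Tiny synthetic check: b is a brighter copy of a, c has a different structure.
a = [[x * x for x in range(16)] for y in range(16)]
b = [[v + 10 for v in row] for row in a]
c = [[x * y for x in range(16)] for y in range(16)]
print(drop_near_duplicates([a, b, c]))  # [0, 2]
```

Two caveats. The comparison is quadratic in the number of kept images, so for thousands of positives you may want to bucket images first. Also, these left-minus-right features ignore a uniform brightness offset (which is why `b` is dropped above), but they can also give a high correlation for images that are not really duplicates, so check a sample of what gets removed before trusting any threshold.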