
Advice to consider when training a robust cascade classifier?

Posted 2013-07-03 09:31:36. Tags: algorithm / opencv / machine-learning / computer-vision / classification

I'm training a cascade classifier to detect animals in images. Unfortunately, my false positive rate is quite high (extremely high with Haar and LBP features, acceptable with HOG). I'm wondering how I could improve my classifier.

Here are my questions:

How many training samples are necessary for robust detection? I've read somewhere that 4000 positive and 800 negative samples are needed. Is that a good estimate?
How different should the training samples be? Is there a way to quantify image difference in order to include or exclude possible 'duplicate' data?
How should I deal with occluded objects? Should I train only on the part of the animal that is visible, or should I rather pick my ROI (region of interest) so that the average ROI is fairly constant?
Regarding occluded objects: animals have legs, arms, tails, heads, etc. Since some body parts tend to be occluded quite often, does it make sense to select the 'torso' as the ROI?
Should I try to downscale my images and train on smaller image sizes? Could this improve things?

I'm open to any pointers here!

1 answer

4000 positives to 800 negatives is a bad ratio. You need to train your system on as many negative samples as possible, since AdaBoost (the core algorithm behind Haar-like feature selection) depends heavily on them. Using 4000 positives and 10000 negatives would be a good improvement.
Detecting "animals" is a hard problem. Detection is already a difficult decision problem, and a broad range of classes increases the complexity further. Start with cats. Build a system that detects cats, then apply the same approach to dogs. Have, say, 40 systems, each detecting a different animal, and combine them for your purpose later on.
For training, do not use occluded objects as positives. For example, if you want to detect frontal faces, train on frontal faces that vary only in position and orientation, without any other object in front of them.
Downscaling is not important, since the Haar classifier itself downscales everything to its training window size (24x24 by default). Watch the whole Viola-Jones presentation when you have enough time.
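
To make the sample-ratio and window-size points concrete, here is a minimal sketch that only assembles a command line for OpenCV's opencv_traincascade tool, without running it. The flag names follow that tool as documented for OpenCV 2.4/3.x. The file names, the helper name traincascade_args, and the rule "at least as many negatives as positives" are assumptions made for illustration, not OpenCV requirements.

```python
def traincascade_args(vec_file, bg_file, out_dir, num_pos, num_neg,
                      feature_type="HAAR", width=24, height=24, num_stages=20):
    """Build an opencv_traincascade argument list (nothing is executed here)."""
    # Assumption taken from the advice above, not enforced by OpenCV itself:
    # refuse setups with fewer negatives than positives (e.g. 4000 / 800).
    if num_neg < num_pos:
        raise ValueError("expected at least as many negatives as positives")
    return [
        "opencv_traincascade",
        "-data", out_dir,              # directory where the cascade is written
        "-vec", vec_file,              # positives packed by opencv_createsamples
        "-bg", bg_file,                # text file listing negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-featureType", feature_type,  # HAAR or LBP
        "-w", str(width),              # training window width
        "-h", str(height),             # training window height
    ]


# Hypothetical file names, using the 4000 / 10000 split suggested above.
args = traincascade_args("cats.vec", "negatives.txt", "cascade_out", 4000, 10000)
print(" ".join(args))
```

The -w and -h values must match the ones used when the positive samples were packed, which is why resizing the source images by hand beforehand buys little.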
Good luck.
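
The one-detector-per-species suggestion above could be wired together roughly as follows. This is a sketch under stated assumptions: detect_all is a hypothetical helper, and each detector is modelled as a plain callable returning (x, y, w, h) boxes, so the example runs without OpenCV or trained cascades.

```python
def detect_all(image, detectors):
    """Run every single-species detector and label the boxes it returns.

    `detectors` maps a label (e.g. "cat") to a callable that takes an image
    and returns an iterable of (x, y, w, h) boxes.
    """
    results = []
    for label, detect in detectors.items():
        for (x, y, w, h) in detect(image):
            results.append((label, (int(x), int(y), int(w), int(h))))
    return results


# Stand-in detectors (hypothetical) so the sketch is runnable as-is.
fake_detectors = {
    "cat": lambda img: [(10, 20, 48, 48)],
    "dog": lambda img: [],
}
print(detect_all(None, fake_detectors))
```

With real cascades, each callable could wrap cv2.CascadeClassifier(path).detectMultiScale(gray), where path points to a cascade file trained for that species (the file itself is assumed here).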