
Training a cascade classifier to detect a piece of paper

Posted 2015-10-27 18:16:18 | Tags: opencv, object, detection, cascade

I am trying to train a cascade classifier to detect a sheet of paper (8.5" x 11") on a window (a real house window). The goal is to measure the width and height of the window using the paper as a reference. Once I have detected the paper, I can get the window's width and height through a simple ratio calculation, since the paper has a fixed size.
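The ratio calculation mentioned above can be sketched roughly as follows. This is only an illustration: the pixel measurements are made-up values, the function name is hypothetical, and it assumes the paper lies flat on the pane and the photo is taken head-on, so a single scale applies across the window.

```python
# Physical size of the reference sheet in inches (8.5" x 11", from the question).
PAPER_W_IN = 8.5
PAPER_H_IN = 11.0


def window_size_inches(paper_px, window_px):
    """Estimate the real window size from pixel measurements.

    paper_px and window_px are (width, height) tuples in pixels.
    Assumes no perspective distortion between paper and window.
    """
    paper_w_px, paper_h_px = paper_px
    window_w_px, window_h_px = window_px
    # Inches per pixel, estimated separately along each axis.
    scale_x = PAPER_W_IN / paper_w_px
    scale_y = PAPER_H_IN / paper_h_px
    return window_w_px * scale_x, window_h_px * scale_y


# Hypothetical measurements: paper found at 85x110 px, window spans 360x480 px.
width_in, height_in = window_size_inches((85, 110), (360, 480))
print(f"window is about {width_in:.1f} x {height_in:.1f} inches")
```

If the photo is taken at an angle, the two per-axis scales will disagree, which is a cheap hint that a perspective correction is needed before trusting the numbers.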

I first tried plain paper, with no luck. The classifier could detect the paper, but it also picked up random objects. It wasn't reliable, and the training took only 31 minutes. Positive samples were generated from 15 different cropped pictures of the paper. Negative samples are 300x300 window images. Parameters: -numStages 1 -nsplits 2 -minHitRate 0.995 -maxFalseAlarmRate 0.9 -numPos 400 -numNeg 400 -w 62 -h 80

Now I am trying to detect the same size of paper, but with an object printed on it to provide some pattern. I printed a big Android logo and tried to train the cascade classifier to detect it. Here are my parameters: -numStages 1 -nsplits 2 -minHitRate 0.995 -maxFalseAlarmRate 0.9 -numPos 890 -numNeg 890 -w 62 -h 80 (negative images are 150x150 pixels)

I got better results than with plain paper. When I feed some of the positive samples (generated by opencv_createsamples) to the cascade classifier, it detects the paper (printed Android logo) with high accuracy. The problem occurs when I input a real image (a picture of a window with the Android paper on it): the classifier does not detect the paper at all.

Note that when I input the real image, I resize it to 150x150, so the object to detect (the paper) becomes even smaller (around 31x40). I tried setting the minimum size parameter of detectMultiScale to 31x40.
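A quick sanity check of the sizes involved is sketched below. The original photo size and the paper's size in it are made-up values chosen to reproduce the 31x40 figure, and the helper names are hypothetical. The comparison rests on an assumption worth verifying against the OpenCV documentation: that the detector only looks for objects at the trained window size (-w 62 -h 80 here) or larger.

```python
def scaled_size(obj_size, image_size, target_size):
    """Return the (width, height) of an object after resizing the image."""
    obj_w, obj_h = obj_size
    img_w, img_h = image_size
    tgt_w, tgt_h = target_size
    return round(obj_w * tgt_w / img_w), round(obj_h * tgt_h / img_h)


def at_least_window(obj_size, train_window):
    """True if the object is no smaller than the training window on both axes."""
    return obj_size[0] >= train_window[0] and obj_size[1] >= train_window[1]


TRAIN_WINDOW = (62, 80)  # -w 62 -h 80 from the training parameters

# Hypothetical original: a 600x600 photo in which the paper spans 124x160 px.
paper = scaled_size((124, 160), (600, 600), (150, 150))
print("paper after resize:", paper)
print("large enough for the training window:", at_least_window(paper, TRAIN_WINDOW))
```

Under that assumption, a 31x40 object falls below the 62x80 window, so lowering the minimum size in detectMultiScale would not help; keeping the input image larger would.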

Also, when I try to increase the number of stages, I get a 'required leaf false alarm rate achieved' message no matter how much I experiment with the -minHitRate and -maxFalseAlarmRate parameters, even when both are set to very low values (0.3 and 0.3 respectively).

Do you have any suggestions? What else do you think I should try? I am thinking of retraining the system with a more complex pattern; would that help? I just need some opinions, because I have been training my classifier for 3 weeks, with more than 50 attempts using different parameters and image sizes. I am tired and have run out of ideas to try...

Thanks in advance.

1 Answer

Keep the following points in mind while training and you should get good results:

Specify only the parameters that you can't do without, such as

-numPos
-numNeg

Use the default values for the other parameters, such as

-minHitRate
-maxFalseAlarmRate
-weightTrimRate
-maxDepth
-maxWeakCount

Once you have generated a classifier successfully, you can play around with the other values.
Get a good number of original positive and negative samples rather than generating them from a handful of images with opencv_createsamples; training the classifier on the same samples again and again does not improve its accuracy. Also note that -numPos is not the total number of positive samples in the .vec file. It is the number of positive samples fed into each stage of classifier training, so it should be somewhat less than the total number of positive samples.
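How much less is "somewhat less"? A rule of thumb often quoted in OpenCV discussions (not part of this answer, so treat it as an assumption) is that the .vec file should hold at least numPos + (numStages - 1) * (1 - minHitRate) * numPos + S samples, where S is the number of positives that get rejected as background straight away. The sketch below solves that for numPos; the function name and the example numbers are hypothetical.

```python
import math


def max_num_pos(vec_count, num_stages, min_hit_rate, background_like=0):
    """Largest -numPos consistent with the rule of thumb

        vec_count >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + S

    background_like plays the role of S and is a guess, not a measured value.
    """
    usable = vec_count - background_like
    per_positive = 1 + (num_stages - 1) * (1 - min_hit_rate)
    return math.floor(usable / per_positive)


# Hypothetical .vec file with 1000 samples, 20 stages, minHitRate 0.995.
print(max_num_pos(1000, 20, 0.995))
```

With a single stage the formula allows numPos to equal the sample count, which is why the shortfall only shows up once more stages are requested.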
Cascade classifiers chain a series of weak classifiers to give good classification at low computational cost. It is therefore very important to train your classifier through a sufficient number of stages; otherwise it will not work.
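To see why the number of stages matters, note that the per-stage rates compound: a window has to pass every stage to be reported. A minimal sketch, assuming the stages behave independently (an idealization), compares the question's single-stage setup with what I understand to be the opencv_traincascade defaults (20 stages, 0.995 hit rate, 0.5 false alarm rate):

```python
def cascade_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Overall (hit rate, false alarm rate) bounds for a cascade,
    assuming each stage independently meets its per-stage targets."""
    return min_hit_rate ** num_stages, max_false_alarm_rate ** num_stages


# Question's setup: 1 stage, minHitRate 0.995, maxFalseAlarmRate 0.9.
hit_1, fa_1 = cascade_rates(1, 0.995, 0.9)
# Assumed defaults: 20 stages, minHitRate 0.995, maxFalseAlarmRate 0.5.
hit_20, fa_20 = cascade_rates(20, 0.995, 0.5)

print(f" 1 stage : hit >= {hit_1:.3f}, false alarm <= {fa_1:.2e}")
print(f"20 stages: hit >= {hit_20:.3f}, false alarm <= {fa_20:.2e}")
```

A single stage that is allowed to pass 90% of negative windows barely filters anything, which would be consistent with the random detections described in the question.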
Check the amount of free memory (RAM) on your system and set the following parameters accordingly:

-precalcValBufSize
-precalcIdxBufSize

So, if you have 1 GB of free memory, you can split it in half between the two buffers. Keep in mind that you should not consume all of the free memory; otherwise your system might become unstable or the training might terminate prematurely for lack of memory.
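One way to pick the two values is sketched below. The 25% reserve is an arbitrary assumption, not a recommendation from OpenCV, and the helper name is hypothetical; both flags take sizes in MB.

```python
def buffer_sizes_mb(free_mb, reserve_fraction=0.25):
    """Split free memory evenly between the two precalc buffers,
    holding back reserve_fraction so the system is not starved."""
    usable = int(free_mb * (1 - reserve_fraction))
    half = usable // 2
    return half, half


# Hypothetical machine with 1024 MB of free RAM.
val_mb, idx_mb = buffer_sizes_mb(1024)
args = ["-precalcValBufSize", str(val_mb), "-precalcIdxBufSize", str(idx_mb)]
print(" ".join(args))
```

The printed flags can be appended to the opencv_traincascade command line alongside -numPos and -numNeg.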