
Face Features Detection Using OpenCV Haar-cascades

Original post: 2017-03-17 01:54:05 · Tags: java, opencv, face-detection, haar-classifier, eye-detection

I am using Java with the OpenCV library to detect the face, eyes and mouth using a laptop camera.

What I have done so far:

Capture video frames using a VideoCapture object.
Detect the face using Haar-Cascades.
Divide the face region into a top region and a bottom region.
Search for the eyes inside the top region.
Search for the mouth inside the bottom region.
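The region split in the steps above can be sketched in plain Java as follows. This is only an illustration: the question does not say where the boundary between the two regions lies, so the half-and-half split is an assumption, and `FaceRegions` and `Region` are hypothetical names. With OpenCV you would typically use `org.opencv.core.Rect` and crop with `Mat.submat(Rect)` instead.

```java
/** Splits a detected face rectangle into an upper and a lower search region. */
final class FaceRegions {

    /** Minimal rectangle type used for illustration only. */
    static final class Region {
        final int x, y, width, height;

        Region(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Upper half of the face: the area to search for eyes (assumed 50/50 split). */
    static Region topHalf(Region face) {
        return new Region(face.x, face.y, face.width, face.height / 2);
    }

    /** Lower half of the face: the area to search for the mouth. */
    static Region bottomHalf(Region face) {
        int topHeight = face.height / 2;
        // The lower region takes the remaining rows so odd heights lose no pixels.
        return new Region(face.x, face.y + topHeight, face.width, face.height - topHeight);
    }
}
```

Detections found inside a sub-region are relative to that sub-region, so its `x` and `y` offsets have to be added back before drawing on the full frame.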

The problem I am facing:

At first the video runs normally, and then it suddenly becomes slower.

Main questions:

Do higher camera resolutions work better for Haar-Cascades?
Do I have to capture video frames at a certain scale, for example 100 px x 100 px?
Do Haar-Cascades work better on gray-scale images?
Do different lighting conditions make a difference?
What exactly does the method detectMultiScale(params) do?
If I want to go further and analyze Eye Blinking, Eye Closure Duration, Mouth Yawning, Head Nodding and Head Orientation to detect fatigue (drowsiness) using a Support Vector Machine, any advice?

Your help is appreciated!

1 Answer

The following article gives an overview of what goes on under the hood; I highly recommend reading it.

Do higher camera resolutions work better for Haar-Cascades?

Not necessarily. cascade.detectMultiScale has parameters, such as minSize and maxSize, for handling different input widths and heights. They are optional, but if you control the input image size you can tune them to get more robust detections. If you set minSize to a small value and leave maxSize unset, detection will work on both small and high-resolution images, but performance will suffer. If you are wondering why high-res and low-res images then make so little difference, consider that cascade.detectMultiScale internally scales the image down to lower resolutions for speed. That is why defining maxSize and minSize matters: it avoids unnecessary iterations.
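To see why minSize and maxSize save work, here is a simplified model of a multi-scale search in plain Java. It is a sketch under stated assumptions, not OpenCV's actual implementation: it assumes a square training window (24 px is used in the test values only as an example), grows the window by `scaleFactor` at each step, and simply counts how many window sizes would be evaluated. `ScalePyramid` and `countScales` are hypothetical names.

```java
/** Simplified model of how many scales a multi-scale detector has to visit. */
final class ScalePyramid {

    /**
     * Counts the window sizes a multi-scale search would evaluate.
     *
     * @param baseWindow  side of the square window the cascade was trained on (assumed)
     * @param scaleFactor growth of the window between two scales, must be > 1
     * @param minSize     smallest object side to look for, 0 for no limit
     * @param maxSize     largest object side to look for, 0 for no limit
     */
    static int countScales(int imageWidth, int imageHeight, int baseWindow,
                           double scaleFactor, int minSize, int maxSize) {
        if (baseWindow <= 0 || scaleFactor <= 1.0) {
            throw new IllegalArgumentException("baseWindow must be > 0 and scaleFactor > 1");
        }
        int count = 0;
        for (double factor = 1.0; ; factor *= scaleFactor) {
            int window = (int) Math.round(baseWindow * factor);
            // Stop once the search window no longer fits inside the image.
            if (window > imageWidth || window > imageHeight) break;
            // Stop once the window is larger than the largest object of interest.
            if (maxSize > 0 && window > maxSize) break;
            // Skip scales that are too small to be interesting.
            if (minSize > 0 && window < minSize) continue;
            count++;
        }
        return count;
    }
}
```

Calling `countScales(640, 480, 24, 1.1, 0, 0)` and then `countScales(640, 480, 24, 1.1, 100, 300)` shows the restricted search visiting noticeably fewer scales. In OpenCV's Java API the corresponding arguments are passed to `CascadeClassifier.detectMultiScale(image, objects, scaleFactor, minNeighbors, flags, minSize, maxSize)`.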

Do I have to capture video frames at a certain scale, for example 100 px x 100 px?

This mainly depends on the parameters you pass to cascade.detectMultiScale. Personally, I think 100 x 100 is too small for detecting smaller faces in the frame, because some features are lost completely when the frame is resized to smaller dimensions, and cascade.detectMultiScale depends heavily on the gradients or features in the input image.

But if the face is the major part of the input frame and there are no smaller faces in the background, you can use 100 x 100. I tested some sample faces of size 100 x 100 and it worked pretty well. Otherwise, a width of 300 - 400 px should work well. You will still need to tune the parameters to get good accuracy.
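A small helper for picking the downscaled frame size could look like the sketch below. It only computes dimensions; the names `FrameScaler` and `scaleToWidth` are hypothetical, and the choice never to upscale is an assumption. With OpenCV the resize itself would typically be done with `Imgproc.resize(src, dst, new Size(width, height))`.

```java
/** Computes a detection-friendly frame size while keeping the aspect ratio. */
final class FrameScaler {

    /**
     * Returns {width, height} scaled so that the width equals targetWidth.
     * Frames that are already narrower than targetWidth are left unchanged.
     */
    static int[] scaleToWidth(int width, int height, int targetWidth) {
        if (width <= 0 || height <= 0 || targetWidth <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        if (targetWidth >= width) {
            // Upscaling adds no detail, so keep the original size.
            return new int[] {width, height};
        }
        int scaledHeight = (int) Math.round((double) height * targetWidth / width);
        return new int[] {targetWidth, Math.max(1, scaledHeight)};
    }
}
```

For example, a 1280 x 720 frame with a target width of 320 gives 320 x 180. Rectangles detected on the small frame then have to be multiplied by `width / targetWidth` to map them back onto the original frame.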

Do Haar-Cascades work better on gray-scale images?

They work only on gray-scale images: the features are computed from pixel intensities, not from color.
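For illustration, this is roughly what a gray-scale conversion does to each pixel. The sketch uses the standard ITU-R BT.601 luma weights, which is the weighting OpenCV documents for its RGB-to-gray conversion; the class and method names are hypothetical, and pixels are assumed to be packed as 0xRRGGBB.

```java
/** Converts color pixels to 8-bit intensities using BT.601 luma weights. */
final class GrayScale {

    /** Weighted sum of the three channels, each in the range 0..255. */
    static int luminance(int r, int g, int b) {
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Converts packed 0xRRGGBB pixels to gray values in the range 0..255. */
    static int[] toGray(int[] rgbPixels) {
        int[] gray = new int[rgbPixels.length];
        for (int i = 0; i < rgbPixels.length; i++) {
            int r = (rgbPixels[i] >> 16) & 0xFF;
            int g = (rgbPixels[i] >> 8) & 0xFF;
            int b = rgbPixels[i] & 0xFF;
            gray[i] = luminance(r, g, b);
        }
        return gray;
    }
}
```

After this step only intensity differences remain, which is all the cascade looks at.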

If you read the first part of the article, you will see that face detection consists of detecting many binary patterns in the image. This comes from the Viola-Jones paper, which is the basis of this algorithm.
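As a rough illustration of such a pattern, the sketch below computes one two-rectangle Haar-like feature with an integral image (summed-area table), the technique the Viola-Jones approach relies on to make rectangle sums cheap. It is a minimal sketch, not OpenCV's code: `HaarFeatureSketch` and its methods are hypothetical names, and a real cascade evaluates many such features at learned positions with learned thresholds.

```java
/** One two-rectangle Haar-like feature evaluated through an integral image. */
final class HaarFeatureSketch {

    /** Builds a summed-area table with one extra row and column of zeros. */
    static long[][] integralImage(int[][] gray) {
        int h = gray.length;
        int w = gray[0].length;
        long[][] ii = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                ii[y + 1][x + 1] = gray[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
            }
        }
        return ii;
    }

    /** Sum of the pixels in columns [x, x+w) and rows [y, y+h), using four lookups. */
    static long rectSum(long[][] ii, int x, int y, int w, int h) {
        return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
    }

    /** Upper-half sum minus lower-half sum: positive when the top is brighter. */
    static long twoRectVertical(long[][] ii, int x, int y, int w, int h) {
        int half = h / 2;
        return rectSum(ii, x, y, w, half) - rectSum(ii, x, y + half, w, half);
    }
}
```

A bright band above a dark band (for instance forehead above eyebrows) gives a large positive value, while a uniform patch gives zero.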

Do different lighting conditions make a difference?

Maybe in some cases, but Haar features are largely lighting-invariant.

If by different lighting conditions you mean taking images under green or red light, that may not affect detection: Haar features depend on the gray-scale image and are independent of the RGB color of the input image. Detection depends mainly on the gradients/features in the input image. As long as there are enough gradient differences in the input image (for example, the eyebrows having lower intensity than the forehead), it will work fine.

But consider an input image with backlight or very low ambient light. In that case some prominent features may not be found, and the face may go undetected.
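One common preprocessing step for low-contrast frames, which the answer does not mention and which may or may not help in a given scene, is histogram equalization of the gray-scale image before detection. OpenCV offers it as `Imgproc.equalizeHist(src, dst)`; the plain-Java sketch below only illustrates the idea, with hypothetical names and input values assumed to lie in 0..255.

```java
/** Stretches the intensity distribution of an 8-bit gray-scale image. */
final class HistogramEqualization {

    /** Returns a new array in which the used gray levels are spread over 0..255. */
    static int[] equalize(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v]++;
        }
        // Cumulative distribution of the gray levels.
        int[] cdf = new int[256];
        int running = 0;
        int cdfMin = 0;
        for (int i = 0; i < 256; i++) {
            running += hist[i];
            cdf[i] = running;
            if (cdfMin == 0 && running > 0) {
                cdfMin = running; // count of the darkest level that occurs
            }
        }
        int total = gray.length;
        int[] out = new int[total];
        if (total == cdfMin) {
            // Empty or single-level image: nothing to stretch.
            System.arraycopy(gray, 0, out, 0, total);
            return out;
        }
        for (int i = 0; i < total; i++) {
            out[i] = (int) Math.round(255.0 * (cdf[gray[i]] - cdfMin) / (total - cdfMin));
        }
        return out;
    }
}
```

An input whose values all sit between 100 and 102 comes out spanning the full 0..255 range, which makes the intensity differences that Haar features depend on larger.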

What exactly does the method detectMultiScale(params) do?

I guess that if you have read the article by now, you already know this well.

If I want to go further and analyze Eye Blinking, Eye Closure Duration, Mouth Yawning, Head Nodding and Head Orientation to detect fatigue (drowsiness) using a Support Vector Machine, any advice?

No, I wouldn't suggest performing this kind of gesture detection with SVM, as running 10 different cascades to determine the current facial state would be extremely slow. Instead, I recommend using a facial landmark detection framework such as Dlib. You may want to look at other frameworks as well, because dlib's model is nearly 100 MB, which may not suit your needs if you want to port it to a mobile device. So the key is **Facial Landmark Detection**: once you have the full face labelled, you can draw conclusions such as whether the mouth is open or the eyes are blinking, and it works in real time, so your video processing won't suffer much.
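Once landmarks are available, one measure often used for blink and eye-closure detection is the eye aspect ratio (EAR): the eye's height relative to its width, computed from six eye landmarks. The answer does not name this measure, so treat the sketch below as one possible approach. The landmark ordering, the class and method names, and any threshold you pass in are assumptions for illustration, not part of Dlib's API.

```java
/** Eye aspect ratio and closure duration from per-frame eye landmarks. */
final class EyeAspectRatio {

    /** Euclidean distance between two {x, y} points. */
    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /**
     * EAR for six {x, y} landmarks ordered p1..p6 (assumed ordering):
     * p1 and p4 are the eye corners, p2 and p3 lie on the upper lid,
     * p6 and p5 lie on the lower lid below p2 and p3.
     */
    static double ear(double[][] eye) {
        double vertical = dist(eye[1], eye[5]) + dist(eye[2], eye[4]);
        double horizontal = dist(eye[0], eye[3]);
        return vertical / (2.0 * horizontal);
    }

    /** Longest run of consecutive frames whose EAR is below the threshold. */
    static int longestClosedRun(double[] earPerFrame, double threshold) {
        int longest = 0;
        int current = 0;
        for (double value : earPerFrame) {
            current = value < threshold ? current + 1 : 0;
            longest = Math.max(longest, current);
        }
        return longest;
    }
}
```

The EAR drops towards zero as the eye closes, so a short dip suggests a blink, while a long run of low values (divided by the frame rate to get seconds) gives the eye closure duration mentioned in the question. The threshold has to be tuned per camera and subject.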