
What's wrong with this webcam face detection?

Dlib has a really handy, fast and efficient object detection routine, and I wanted to make a cool face tracking example similar to the example here.

OpenCV, which is widely supported, has a VideoCapture module that is fairly quick (about a fifth of a second per snapshot, compared with a second or more for calling up some external program that wakes up the webcam and fetches a picture). I added this to the face detector Python example in Dlib.

If you directly show and process the OpenCV VideoCapture output it looks odd, because OpenCV apparently stores pixels in BGR rather than RGB order. After adjusting this, it works, but slowly:

from __future__ import division
import sys

import dlib
from skimage import io


detector = dlib.get_frontal_face_detector()
win = dlib.image_window()

if len( sys.argv[1:] ) == 0:
    from cv2 import VideoCapture
    from time import time

    cam = VideoCapture(0)  #set the port of the camera as before

    while True:
        start = time()
        retval, image = cam.read()  # returns a True boolean and the image if the read succeeds

        for row in image:
            for px in row:
                #rgb expected... but the array is bgr?
                r = px[2]
                px[2] = px[0]
                px[0] = r
        #import matplotlib.pyplot as plt
        #plt.imshow(image)
        #plt.show()

        print( "readimage: " + str( time() - start ) )

        start = time()
        dets = detector(image, 1)
        print "your faces: %f" % len(dets)
        for i, d in enumerate( dets ):
            print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
                i, d.left(), d.top(), d.right(), d.bottom()))
            print("from left: {}".format( ( (d.left() + d.right()) / 2 ) / len(image[0]) ))
            print("from top: {}".format( ( (d.top() + d.bottom()) / 2 ) /len(image)) )
        print( "process: " + str( time() - start ) )

        start = time()
        win.clear_overlay()
        win.set_image(image)
        win.add_overlay(dets)

        print( "show: " + str( time() - start ) )
        #dlib.hit_enter_to_continue()



for f in sys.argv[1:]:
    print("Processing file: {}".format(f))
    img = io.imread(f)
    # The 1 in the second argument indicates that we should upsample the image
    # 1 time.  This will make everything bigger and allow us to detect more
    # faces.
    dets = detector(img, 1)
    print("Number of faces detected: {}".format(len(dets)))
    for i, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            i, d.left(), d.top(), d.right(), d.bottom()))

    win.clear_overlay()
    win.set_image(img)
    win.add_overlay(dets)
    dlib.hit_enter_to_continue()


# Finally, if you really want to you can ask the detector to tell you the score
# for each detection.  The score is bigger for more confident detections.
# Also, the idx tells you which of the face sub-detectors matched.  This can be
# used to broadly identify faces in different orientations.
if (len(sys.argv[1:]) > 0):
    img = io.imread(sys.argv[1])
    dets, scores, idx = detector.run(img, 1)
    for i, d in enumerate(dets):
        print("Detection {}, score: {}, face_type:{}".format(
            d, scores[i], idx[i]))
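
As an aside, the per-pixel color swap in the loop above can be done in one vectorized step. A minimal sketch, assuming image is the BGR array returned by cam.read() and detector is the dlib detector created above:

import numpy as np

# Reverse the channel axis: BGR -> RGB without a Python loop.
# dlib prefers a contiguous array, so materialize the reversed view.
rgb = np.ascontiguousarray(image[:, :, ::-1])
dets = detector(rgb, 1)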

From the output of the timings in this program, it seems grabbing the picture and processing it each take about a fifth of a second, so you would think it should show one or two updates per second. However, if you raise your hand, it only shows up in the webcam view after 5 seconds or so!

Is there some sort of internal cache keeping it from grabbing the latest webcam image? Could I adjust or multi-thread the webcam input process to fix the lag? This is on an Intel i5 with 16 GB RAM.

Update

According to here, it seems that read() grabs the video frame by frame. That would explain the lag: each read() returns the next queued frame, so the display only catches up after it has worked through all the frames that were buffered while it was processing. I wonder if there is an option to set the frame rate, or to make it drop frames and just take a snapshot of whatever the webcam sees at the moment of the read? http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_gui/py_video_display/py_video_display.html#capture-video-from-camera
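
One workaround worth trying, sketched below, is to keep OpenCV's capture buffer small (the buffer-size property is only honored by some backends), or to discard the queued frames cheaply with grab() and decode only the newest one with retrieve():

import cv2

cam = cv2.VideoCapture(0)

# Ask the backend to keep only one frame in its buffer (not honored by every backend).
cam.set(cv2.CAP_PROP_BUFFERSIZE, 1)

# Alternatively, drain stale frames without decoding them, then decode the latest.
for _ in range(5):              # 5 is a guess at how many frames may be queued
    cam.grab()                  # grab() fetches a frame but skips the decode step
retval, image = cam.retrieve()  # decode only the most recently grabbed frame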

I feel your pain. I actually recently worked with that webcam script (multiple iterations; substantially edited). I got it to work really well, I think. So that you can see what I did, I created a GitHub Gist with the details (code; HTML readme file; sample output):

https://gist.github.com/victoriastuart/8092a3dd7e97ab57ede7614251bf5cbd

Maybe the problem is the extra argument that is passed to the detector. As described here,

dots = detector(frame, 1)

Should be changed to

dots = detector(frame)

To avoid the extra work. This works for me, but at the same time there is a new problem: frames are processed too fast.
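
For reference, that second argument is the number of times dlib upsamples the image before detecting (as the comments in the script above note); upsampling once roughly quadruples the number of pixels the detector has to scan, which is why dropping it speeds things up. A quick timing sketch, reusing detector and frame from the snippet above:

from time import time

start = time()
dets = detector(frame)       # default: no upsampling
print("no upsampling: %.3f s, %d faces" % (time() - start, len(dets)))

start = time()
dets = detector(frame, 1)    # upsample once: ~4x the pixels to scan
print("1x upsampling: %.3f s, %d faces" % (time() - start, len(dets)))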

If you want to show a frame read by OpenCV, you can do it with the cv2.imshow() function without any need to change the color order. On the other hand, if you still want to show the picture in matplotlib, then you can't avoid swapping the channels, for example like this:

b, g, r = cv2.split(img)
img = cv2.merge((r, g, b))  # reassemble in RGB order for matplotlib
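
A shorter equivalent, assuming img is the BGR frame from cam.read(), is OpenCV's built-in color conversion:

import cv2

# One call that swaps BGR -> RGB for display in matplotlib (or for dlib).
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)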

That's the only thing I can help you with for now =)

I tried multithreading, and it was just as slow. Then I multithreaded with only the .read() call in the thread, no processing and no thread locking, and it worked quite fast: maybe a second or so of delay, not 3 or 5. See http://www.pyimagesearch.com/2015/12/21/increasing-webcam-fps-with-python-and-opencv/

from __future__ import division
import sys
from time import time, sleep
import threading

import dlib
from skimage import io


detector = dlib.get_frontal_face_detector()
win = dlib.image_window()

class webCamGrabber( threading.Thread ):
    def __init__( self ):
        threading.Thread.__init__( self )
        #Lock for when you can read/write self.image:
        #self.imageLock = threading.Lock()
        self.image = False

        from cv2 import VideoCapture, cv
        from time import time

        self.cam = VideoCapture(0)  #set the port of the camera as before
        #self.cam.set(cv.CV_CAP_PROP_FPS, 1)


    def run( self ):
        while True:
            start = time()
            #self.imageLock.acquire()
            retval, self.image = self.cam.read()  # returns a True boolean and the image if the read succeeds

            print( type( self.image) )
            #import matplotlib.pyplot as plt
            #plt.imshow(image)
            #plt.show()

            #print( "readimage: " + str( time() - start ) )
            #sleep(0.1)

if len( sys.argv[1:] ) == 0:

    #Start webcam reader thread:
    camThread = webCamGrabber()
    camThread.start()

    #Setup window for results
    detector = dlib.get_frontal_face_detector()
    win = dlib.image_window()

    while True:
        #camThread.imageLock.acquire()
        if camThread.image is not False:
            print( "enter")
            start = time()

            myimage = camThread.image
            for row in myimage:
                for px in row:
                    #rgb expected... but the array is bgr?
                    r = px[2]
                    px[2] = px[0]
                    px[0] = r


            dets = detector( myimage, 0)
            #camThread.imageLock.release()
            print "your faces: %f" % len(dets)
            for i, d in enumerate( dets ):
                print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
                    i, d.left(), d.top(), d.right(), d.bottom()))
                print("from left: {}".format( ( (d.left() + d.right()) / 2 ) / len(camThread.image[0]) ))
                print("from top: {}".format( ( (d.top() + d.bottom()) / 2 ) /len(camThread.image)) )
            print( "process: " + str( time() - start ) )

            start = time()
            win.clear_overlay()
            win.set_image(myimage)
            win.add_overlay(dets)

            print( "show: " + str( time() - start ) )
            #dlib.hit_enter_to_continue()



for f in sys.argv[1:]:
    print("Processing file: {}".format(f))
    img = io.imread(f)
    # The 1 in the second argument indicates that we should upsample the image
    # 1 time.  This will make everything bigger and allow us to detect more
    # faces.
    dets = detector(img, 1)
    print("Number of faces detected: {}".format(len(dets)))
    for i, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            i, d.left(), d.top(), d.right(), d.bottom()))

    win.clear_overlay()
    win.set_image(img)
    win.add_overlay(dets)
    dlib.hit_enter_to_continue()


# Finally, if you really want to you can ask the detector to tell you the score
# for each detection.  The score is bigger for more confident detections.
# Also, the idx tells you which of the face sub-detectors matched.  This can be
# used to broadly identify faces in different orientations.
if (len(sys.argv[1:]) > 0):
    img = io.imread(sys.argv[1])
    dets, scores, idx = detector.run(img, 1)
    for i, d in enumerate(dets):
        print("Detection {}, score: {}, face_type:{}".format(
            d, scores[i], idx[i]))
