
Unexpected results when passing webcam frames to object detection

Background

I have a well-trained SSD 320x320 TensorFlow model from the TensorFlow Model Zoo. The results look good: the training log shows a low loss, and the eval log shows that 7 out of 9 test images were detected successfully. The model was trained on GPU and saved as checkpoint ckpt3 .

The goal is to detect when a person "likes" with their hand.

Problem

Loading a model from its last checkpoint works well, and I achieved detection with the following function:

    def test1(self):
        # Works great
        for img_path in glob.glob(os.path.join("test_dir", "*.jpg")):
            plt.figure()
            plt.imshow(self.get_image_np_with_detections(
                self._load_image_into_numpy_array(img_path)))
            plt.show()

# Note that get_image_np_with_detections() is the detection @tf.function(),
# exactly as written in the TensorFlow documentation, with no changes.
# _load_image_into_numpy_array() simply returns np.array(Image.open(path))
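For reference, here is a minimal sketch of what such an image-loading helper typically looks like (my actual method may differ slightly); the key point is that PIL decodes images in RGB channel order:

```python
from PIL import Image
import numpy as np

def load_image_into_numpy_array(path_or_img) -> np.ndarray:
    # PIL decodes images into RGB channel order, which is what
    # TF Object Detection API models expect.
    img = path_or_img if isinstance(path_or_img, Image.Image) else Image.open(path_or_img)
    return np.array(img.convert("RGB"))

# Demo with an in-memory image instead of a file on disk:
arr = load_image_into_numpy_array(Image.new("RGB", (4, 3), (10, 20, 30)))
print(arr.shape)            # (3, 4, 3): height x width x channels
print(arr[0, 0].tolist())   # [10, 20, 30]
```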

Object detection in images was successfully achieved in test1 . The problem is that I fail to detect an object in webcam frames .

From another function, which opens my webcam, I call the same detection function on each frame. This function fails: not a single green detection box appears on the screen.

    def open_webcam(self):
        # Doesn't show detection green boxes at all
        cap = cv2.VideoCapture(0)
        while cap.isOpened():
            ret, image_np = cap.read()
            if not ret:
                break
            im_detected = self.get_image_np_with_detections(image_np)
            cv2.imshow('object detection', cv2.resize(im_detected, (800, 600)))
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        # release, destroy...

Where is the problem?

While debugging, I saved screenshots from my webcam while running the open_webcam() function (one screenshot every 1-2 seconds). The screenshots were saved into test_dir and then processed by test1 . That test was successful: all the screenshots were marked with a green detection box (hand-like-sign).

This test indicates that the problem lies in the way I pass frames to the detection function, since all the frames were detected successfully via the test1 approach, but not in real time. To summarize:

  • I failed to detect a like-sign in a webcam frame (real time).
  • I saved each frame into test_dir with a unique id.
  • I managed to detect the like-sign after opening the saved jpg in test1() (9/10 screenshots).

I have tried to...

  • pass the frames as a numpy array, with no luck.
  • expand the dimensions as mentioned in the TF documentation, again with no luck.
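For completeness, the dimension expansion I tried follows the usual pattern from the TF object detection tutorial (sketched here with numpy only; in the real pipeline the batch would then be wrapped with `tf.convert_to_tensor`):

```python
import numpy as np

# A single frame comes in as (H, W, 3); the detection function expects
# a leading batch dimension: (1, H, W, 3).
frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a webcam frame
input_batch = np.expand_dims(frame, axis=0)
print(input_batch.shape)  # (1, 480, 640, 3)
```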

Note that...

  • I have only 1 label, which is Like (hand-sign).
  • I used around 25 training images and 9 test images.
  • As mentioned, the model works great when opening saved jpg files, and the eval report looks good.
  • Python is 3.7 , TensorFlow is 2.7 , OpenCV is 4.5.5 .

Thanks in advance!

The TensorFlow model was most likely trained on RGB images, while cv2 captures frames in BGR order. Try

image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)

Also, the model may have been trained on normalized images, so if converting BGR to RGB doesn't help, try

image_np = image_np / 255.
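Both transformations can be sanity-checked without a model. For a 3-channel frame, reversing the last axis of a numpy array is equivalent to `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)`, and dividing by 255 rescales pixels to [0, 1] (a sketch; whether normalization is actually needed depends on the preprocessing baked into the exported model):

```python
import numpy as np

bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # one pure-blue pixel, BGR order
rgb = bgr[..., ::-1]                             # same effect as cv2.COLOR_BGR2RGB
print(rgb[0, 0].tolist())   # [0, 0, 255] -> blue now in the RGB position

normalized = rgb.astype(np.float32) / 255.0      # rescale to [0, 1]
print(normalized.max())     # 1.0
```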
