
TensorFlow 2 Object Detection Counting API for tutorial

I've racked my brain trying to customize the TensorFlow "object detection using webcam" tutorial so that it counts how many objects are detected for each class. I trained my custom detection model from the efficientdet_d0_coco17_tpu-32 model, and I am using the 'detect_from_webcam.py' tutorial script. Detection works and the classifications are displayed on the screen. Now I would like to display how many objects of each class are detected.

I have looked at and tried the TensorFlow object counting API (Counting_API), but I can't work out how to integrate it with my custom trained model.

Forgive me if this is a silly question; I am just starting out with Python and machine learning in general. Thanks in advance for your help!

I am using TensorFlow 2.4.1 and Python 3.7.0.

Can anyone help me or point me to what I would need to add to count the detected objects?

This is the command I pass to the script using CMD:

python detect_from_webcam.py -m research\object_detection\inference_graph\saved_model -l research\object_detection\Training\labelmap.pbtxt

This is the script:

import numpy as np
import argparse
import tensorflow as tf
import cv2
import pathlib

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
from api import object_counting_api
from utils import backbone
# patch tf1 into `utils.ops`
utils_ops.tf = tf.compat.v1

# Patch the location of gfile
tf.gfile = tf.io.gfile


def load_model(model_path):
    model = tf.saved_model.load(model_path)
    return model


def run_inference_for_single_image(model, image):
    image = np.asarray(image)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis,...]
    
    # Run inference
    output_dict = model(input_tensor)

    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key: value[0, :num_detections].numpy()
                   for key, value in output_dict.items()}
    output_dict['num_detections'] = num_detections
    #print(num_detections)
    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)
    
    # Handle models with masks:
    if 'detection_masks' in output_dict:
        # Reframe the bbox mask to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                                    output_dict['detection_masks'], output_dict['detection_boxes'],
                                    image.shape[0], image.shape[1])      
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
    
    return output_dict


def run_inference(model, category_index, cap):
    
    while True:
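        ret, image_np = cap.read()
        if not ret:
            # No frame was returned by the webcam; clean up and stop.
            cap.release()
            cv2.destroyAllWindows()
            break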
        
        # Actual detection.
        output_dict = run_inference_for_single_image(model, image_np)
        # Visualization of the results of a detection.
        vis_util.visualize_boxes_and_labels_on_image_array(
            image_np,
            output_dict['detection_boxes'],
            output_dict['detection_classes'],
            output_dict['detection_scores'],
            category_index,
            instance_masks=output_dict.get('detection_masks_reframed', None),
            use_normalized_coordinates=True,
            line_thickness=8)
           
        cv2.imshow('object_detection', cv2.resize(image_np, (1920, 1080)))
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cap.release()
            cv2.destroyAllWindows()
            break


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Detect objects inside webcam videostream')
    parser.add_argument('-m', '--model', type=str, required=True, help='Model Path')
    parser.add_argument('-l', '--labelmap', type=str, required=True, help='Path to Labelmap')
    args = parser.parse_args()

    detection_model = load_model(args.model)
    category_index = label_map_util.create_category_index_from_labelmap(args.labelmap, use_display_name=True)
    
    cap = cv2.VideoCapture(0)
    run_inference(detection_model, category_index, cap)

You can count objects in an image using single_image_object_counting.py from the TensorFlow object counting API. Just replace ssd_mobilenet_v1_coco_2018_01_28 with your own model containing the inference graph.

You can refer to the code shown below:

from api import object_counting_api
from utils import backbone

input_video = "image.jpg"
# MODEL_DIR is the name of your own model directory containing the inference graph
detection_graph, category_index = backbone.set_model(MODEL_DIR)

is_color_recognition_enabled = False  # set it to True to enable color prediction for the detected objects

# targeted objects counting
result = object_counting_api.single_image_object_counting(input_video, detection_graph, category_index, is_color_recognition_enabled)

print(result)

For more details you can refer here.

Note: This answer doesn't write the detection count on the image or video; it only computes the detection count as a single value.

After reviewing a lot of Python code, I managed to get just the detection count for a given class:

threshold = 0.5
labels = ["dog"]  # class names to count; set to None to count every class
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
detection_count = 0
output_dict = run_inference_for_single_image(model, image_np)

for i, score in enumerate(output_dict['detection_scores']):
    # Count the detection only if its score is above the threshold
    # and its class matches one of the expected classes.
    class_name = category_index[output_dict['detection_classes'][i]]['name']
    if score > threshold and (labels is None or class_name in labels):
        detection_count += 1

With the detection count value ready to use, you could add it to an image or video.
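Since the question asks for a count per class rather than for one class, here is a minimal sketch of that, assuming output_dict and category_index have the same shape as in the question's script. The helper names count_detections_per_class and draw_counts are made up for illustration and are not part of the Object Detection API or the counting API; the text position, font and color passed to cv2.putText are arbitrary choices.

```python
from collections import Counter


def count_detections_per_class(output_dict, category_index, threshold=0.5):
    """Return a Counter mapping class name -> number of detections above threshold."""
    counts = Counter()
    for class_id, score in zip(output_dict['detection_classes'],
                               output_dict['detection_scores']):
        if score <= threshold:
            continue
        # Fall back to the numeric id if the label map has no entry for it.
        entry = category_index.get(int(class_id), {})
        counts[entry.get('name', str(int(class_id)))] += 1
    return counts


def draw_counts(image_np, counts, origin=(10, 30), line_height=30):
    """Write one 'name: count' line per class onto the frame, in place."""
    import cv2  # imported here so the counting helper has no OpenCV dependency
    x, y = origin
    for row, (name, count) in enumerate(sorted(counts.items())):
        cv2.putText(image_np, f'{name}: {count}', (x, y + row * line_height),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    return image_np


# Example with made-up detections (no model needed):
sample_output = {
    'detection_classes': [1, 1, 2, 1],
    'detection_scores': [0.9, 0.7, 0.8, 0.3],
}
sample_index = {1: {'id': 1, 'name': 'dog'}, 2: {'id': 2, 'name': 'cat'}}
sample_counts = count_detections_per_class(sample_output, sample_index)
print(dict(sample_counts))
```

In the webcam loop, one option would be to call count_detections_per_class(output_dict, category_index) right after run_inference_for_single_image, and then draw_counts(image_np, counts) just before cv2.imshow. Using the same threshold as the visualization step (its min_score_thresh, which I believe defaults to 0.5) should keep the counts in line with the boxes drawn on screen.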

I will share the entire code when it's ready. It is based on this:

https://colab.research.google.com/github/tensorflow/models/blob/master/research/object_detection/colab_tutorials/object_detection_tutorial.ipynb
