
Tensorflow Object detection API: Print detected class as output to terminal

I have a simple question, but I can't figure out how to do it. I am using the TF Object Detection API to detect objects in images. It is working fine: given an image, it draws the bounding box with a label and the confidence score of the class it thinks it has detected. My question is: how can I print the detected class (as a string) and the score to the terminal, i.e. not just on the image but as output to the terminal too?

Below is the code responsible for the image detection:

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # The array-based representation of the image will be used later to
      # prepare the result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represents the level of confidence for each of the objects.
      # Score is shown on the result image, together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8, min_score_thresh=.2)
      plt.figure(figsize=IMAGE_SIZE)
      plt.imshow(image_np)
      plt.show()

Thanks in advance; this is my first post on Stack Overflow, so please go easy on me.

Well, that's very easy. The classes are encoded in category_index, which is a dict, so you could do something like this:

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # The array-based representation of the image will be used later to
      # prepare the result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represents the level of confidence for each of the objects.
      # The score is shown on the result image, together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})

      # Here, output the category as a string and the score to the terminal.
      print([category_index.get(i) for i in classes[0]])
      print(scores)
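
For reference, here is a slightly fuller sketch of that idea: squeeze away the batch dimension, keep only detections above a score threshold, and print the human-readable name from category_index. This is a minimal sketch, assuming category_index was built with label_map_util.create_category_index (so each entry is a dict with a 'name' key) and that it sits right after the sess.run call above; the 0.5 threshold is just an example value.

    # Minimal sketch: print "name: score" for each detection above a threshold.
    # Assumes category_index maps integer class ids to dicts with a 'name' key
    # (the format produced by label_map_util.create_category_index).
    min_score_thresh = 0.5  # example value, tune to taste
    for cls, score in zip(np.squeeze(classes).astype(np.int32),
                          np.squeeze(scores)):
        if score > min_score_thresh:
            name = category_index[cls]['name'] if cls in category_index else 'N/A'
            print('{}: {:.0%}'.format(name, score))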

Simply go to the utils directory in the object_detection folder and open the script visualization_utils.py. You will find a function named visualize_boxes_and_labels_on_image_array; add a print statement at the end of the function to print the variable class_name (print(class_name)). Now run your code and see the magic.

Dat and Omar, I have a basic question. When we print the array, it contains the top 100 scores and classes. Of these, only 2 to 3 are actually displayed in the output image (with bounding boxes and accuracy). How can I subset only those values that are actually displayed in the output image? Is it possible, or do we need to set a fixed accuracy threshold (and risk losing some objects that are displayed in the output image)?

Below is the code to fix your problem. TF version 1.12.0; I used a webcam to test.

In ..\models\research\object_detection\utils\visualization_utils.py, go to def visualize_boxes_and_labels_on_image_array and fix the for loop.

Print display_str right after display_str gets defined (line 21, I believe). If you print at the end of the for loop, you will get a "class_name referenced before assignment" error: whenever an object wasn't detected through my camera feed, I received this error if I added the print statement at the bottom, as Ravish suggested.

  for i in range(min(max_boxes_to_draw, boxes.shape[0])):
    if scores is None or scores[i] > min_score_thresh:
      box = tuple(boxes[i].tolist())
      if instance_masks is not None:
        box_to_instance_masks_map[box] = instance_masks[i]
      if instance_boundaries is not None:
        box_to_instance_boundaries_map[box] = instance_boundaries[i]
      if keypoints is not None:
        box_to_keypoints_map[box].extend(keypoints[i])
      if scores is None:
        box_to_color_map[box] = groundtruth_box_visualization_color
      else:
        display_str = ''
        if not skip_labels:
          if not agnostic_mode:
            if classes[i] in category_index.keys():
              class_name = category_index[classes[i]]['name']
            else:
              class_name = 'N/A'
            display_str = str(class_name)
            print(display_str)
        if not skip_scores:
          if not display_str:
            display_str = '{}%'.format(int(100*scores[i]))
          else:
            display_str = '{}: {}%'.format(display_str, int(100*scores[i]))
        box_to_display_str_map[box].append(display_str)
        if agnostic_mode:
          box_to_color_map[box] = 'DarkOrange'
        else:
          box_to_color_map[box] = STANDARD_COLORS[
              classes[i] % len(STANDARD_COLORS)]
    # print(class_name)  -- doesn't work here: class_name referenced before assignment
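
If you would rather keep a single summary print after the loop, one way to avoid the referenced-before-assignment error is to collect the labels while the loop runs and print them once at the end. A minimal sketch of that alternative, meant to live inside the same function; detected_names is a hypothetical name, not part of the original code:

    # Hedged alternative: gather labels during the loop, print once afterwards,
    # so nothing is referenced before assignment on frames with no detections.
    # detected_names is a hypothetical local, not part of the original function.
    detected_names = []
    for i in range(min(max_boxes_to_draw, boxes.shape[0])):
        if scores is not None and scores[i] > min_score_thresh:
            if classes[i] in category_index.keys():
                class_name = category_index[classes[i]]['name']
            else:
                class_name = 'N/A'
            detected_names.append('{}: {}%'.format(class_name, int(100 * scores[i])))
    print(detected_names)  # prints [] when nothing was detected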

I was confused at first as well: I got over 100 boxes even though only one was drawn on my image. I agree with all the answers. Here is my easy copy-and-paste solution for your inference.py:

    #assume you've got this in your inference.py
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)

    # This is the way I'm getting my coordinates
    boxes = output_dict['detection_boxes']
    max_boxes_to_draw = boxes.shape[0]
    scores = output_dict['detection_scores']
    min_score_thresh=.5
    for i in range(min(max_boxes_to_draw, boxes.shape[0])):
        if scores is None or scores[i] > min_score_thresh:
            # boxes[i] is the box which will be drawn
            print ("This box is gonna get used", boxes[i])
