
Coordinates of bounding box in tensorflow

I want the coordinates of the predicted bounding boxes from a TensorFlow model.
I am using the object detection script from here.
After following some answers on Stack Overflow, I modified the last detection block to:

for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  width, height = image.size
  print(width,height)
  ymin = output_dict['detection_boxes'][5][0]*height
  xmin = output_dict['detection_boxes'][5][1]*width
  ymax = output_dict['detection_boxes'][5][2]*height
  xmax = output_dict['detection_boxes'][5][3]*width
  #print(output_dict['detection_boxes'][0])
  print (xmin,ymin)
  print (xmax,ymax)

But there are 100 tuples in output_dict['detection_boxes'].
There are 100 tuples even for images where nothing is detected.

What I want are the coordinates of all the bounding boxes for a single image.

If you look at the pipeline.config file of the model you are using, you can see that in a few places the maximum number of boxes is set to 100. For example, in the ssd_mobilenet_v1 config file (the model used in the demo notebook), you can see it below:

post_processing {
  batch_non_max_suppression {
    ...
    max_detections_per_class: 100
    max_total_detections: 100
  }
}

This is also the default for the input readers (for both training and evaluation). You can change these values, but that is only relevant if you are training or evaluating. If you want to run inference without retraining the model, you can simply take the pre-trained model (again, e.g. ssd_mobilenet_v1) and export it yourself, using the --config_override argument to change the NMS values I mentioned above.
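For illustration only, the override passed via --config_override could look roughly like the following text proto; the limit of 10 is just a hypothetical value, and the nesting mirrors the ssd_mobilenet_v1 pipeline.config shown above:

model {
  ssd {
    post_processing {
      batch_non_max_suppression {
        ...
        max_detections_per_class: 10
        max_total_detections: 10
      }
    }
  }
}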

After the expand_dims line, you can add the code below. The filtered_boxes variable will give you the bounding boxes whose prediction score is greater than 0.5.

  # Run detection; these tensors are assumed to have been fetched from the
  # graph beforehand (e.g. with detection_graph.get_tensor_by_name).
  (boxes, scores, classes, num) = sess.run(
      [detection_boxes, detection_scores, detection_classes, num_detections],
      feed_dict={image_tensor: image_np_expanded})
  # Keep only detections with a valid class id (1-90 for COCO) and a score > 0.5.
  indexes = []
  for i in range(classes.size):
    if classes[0][i] in range(1, 91) and scores[0][i] > 0.5:
      indexes.append(i)
  filtered_boxes = boxes[0][indexes, ...]
  filtered_scores = scores[0][indexes, ...]
  filtered_classes = classes[0][indexes, ...]
  # Unique class ids among the filtered detections.
  filtered_classes = [int(i) for i in set(filtered_classes)]
for image_path in TEST_IMAGE_PATHS:
  image_np = cv2.imread(image_path)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)

# if using cv2 to load the image (note: cv2 arrays are shaped (height, width, channels))
(im_height, im_width) = image_np.shape[:2]

ymin = output_dict['detection_boxes'][0][0]*im_height
xmin = output_dict['detection_boxes'][0][1]*im_width
ymax = output_dict['detection_boxes'][0][2]*im_height
xmax = output_dict['detection_boxes'][0][3]*im_width

Using the code above, you will get the desired bounding-box coordinates for the detection with the maximum score, which sits at the 0th position indicated by the first square bracket.
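To tie this back to the question (getting the coordinates of all boxes for one image), here is a minimal sketch that reuses the question's helpers (load_image_into_numpy_array, run_inference_for_single_image, detection_graph) and prints every detection above a score threshold in pixel coordinates; the threshold and image_path are assumptions taken from the question's loop:

from PIL import Image

SCORE_THRESHOLD = 0.5  # keep only reasonably confident detections

image = Image.open(image_path)            # image_path as in the question's loop
image_np = load_image_into_numpy_array(image)
output_dict = run_inference_for_single_image(image_np, detection_graph)

width, height = image.size
for box, score, cls in zip(output_dict['detection_boxes'],
                           output_dict['detection_scores'],
                           output_dict['detection_classes']):
    if score < SCORE_THRESHOLD:
        continue
    ymin, xmin, ymax, xmax = box          # normalized [0, 1] coordinates
    print(int(cls), float(score),
          (xmin * width, ymin * height, xmax * width, ymax * height))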
