Get the bounding box coordinates in the TensorFlow object detection API tutorial
I want the coordinates of the bounding boxes predicted by a TensorFlow model.
I am using the object detection script from here.
Following some answers on Stack Overflow, I modified the last detection block to:
for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Scale the normalized coordinates of one detection (index 5) back to pixels.
    width, height = image.size
    print(width, height)
    ymin = output_dict['detection_boxes'][5][0] * height
    xmin = output_dict['detection_boxes'][5][1] * width
    ymax = output_dict['detection_boxes'][5][2] * height
    xmax = output_dict['detection_boxes'][5][3] * width
    print(xmin, ymin)
    print(xmax, ymax)
But there are 100 tuples in output_dict['detection_boxes'], even for images where nothing can be detected.
What I want are the coordinates of all the bounding boxes for a single image.
If you look at the pipeline.config file of the model you are using, you can see that in several places the maximum number of boxes is set to 100. For example, in the config file of ssd_mobilenet_v1, which is the model used in the demo notebook, you can see it here:
post_processing {
  batch_non_max_suppression {
    ...
    max_detections_per_class: 100
    max_total_detections: 100
  }
}
These are also the default values of the input readers (for both training and evaluation). You can change them, but that is only relevant if you are training or evaluating. If you want to run inference without retraining the model, you can simply take a pre-trained model (again, e.g. ssd_mobilenet_v1) and export it yourself, using the --config_override parameter to change the non-max-suppression values I mentioned above.
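As a sketch of that export step: the paths below are placeholders, and the override lowers both NMS limits to 20. The script and its flags come from the TensorFlow Object Detection API; adjust the nesting of the override to your model type (here an SSD).

```shell
# Hypothetical paths; point these at your own config and checkpoint.
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/ssd_mobilenet_v1.config \
    --trained_checkpoint_prefix path/to/model.ckpt \
    --output_directory exported_model \
    --config_override "model { ssd { post_processing { \
        batch_non_max_suppression { \
          max_detections_per_class: 20 \
          max_total_detections: 20 } } } }"
```

The exported model will then return at most 20 detections per image instead of 100.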
After the expand_dims line, you can add this code. The filtered_boxes variable will give the bounding boxes whose prediction score is greater than 0.5.
(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_np_expanded})
indexes = []
for i in range(classes.size):
    # Keep detections with a valid COCO class id (1-90) and score above 0.5.
    if classes[0][i] in range(1, 91) and scores[0][i] > 0.5:
        indexes.append(i)
filtered_boxes = boxes[0][indexes, ...]
filtered_scores = scores[0][indexes, ...]
filtered_classes = classes[0][indexes, ...]
filtered_classes = [int(i) for i in set(filtered_classes)]
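The same filtering can be done without an index loop using a NumPy boolean mask. The sketch below stands alone: the arrays are dummy stand-ins for the real sess.run outputs, whose shapes are assumed to be [1, 100, 4] for boxes and [1, 100] for scores and classes.

```python
import numpy as np

# Dummy stand-ins for the sess.run outputs (batch of 1, four "detections").
boxes = np.array([[[0.1, 0.1, 0.5, 0.5],
                   [0.2, 0.2, 0.6, 0.6],
                   [0.0, 0.0, 1.0, 1.0],
                   [0.3, 0.3, 0.4, 0.4]]])
scores = np.array([[0.9, 0.6, 0.3, 0.1]])
classes = np.array([[1.0, 18.0, 1.0, 62.0]])

# Boolean mask over the 100-detection axis instead of an explicit loop.
keep = scores[0] > 0.5
filtered_boxes = boxes[0][keep]
filtered_scores = scores[0][keep]
filtered_classes = classes[0][keep].astype(int)

print(filtered_boxes.shape)
print(filtered_classes)
```

Because the detections are already sorted by score, the mask keeps a leading slice of the arrays, but the mask form also works if you add extra conditions such as a class filter.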
for image_path in TEST_IMAGE_PATHS:
    image_np = cv2.imread(image_path)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    # cv2 loads images as (height, width, channels).
    (im_height, im_width) = image_np.shape[:2]
    ymin = output_dict['detection_boxes'][0][0] * im_height
    xmin = output_dict['detection_boxes'][0][1] * im_width
    ymax = output_dict['detection_boxes'][0][2] * im_height
    xmax = output_dict['detection_boxes'][0][3] * im_width
With the code above, you will get the bounding box coordinates of the detection with the highest score, which sits at index 0 because detections are returned sorted by score in descending order.
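To get pixel coordinates for all boxes of one image rather than just index 0, the normalized [ymin, xmin, ymax, xmax] rows can be scaled in a single step. A minimal sketch with dummy values (in the real script, the boxes and scores come from output_dict and the size from image_np.shape[:2]):

```python
import numpy as np

# Dummy stand-ins for output_dict['detection_boxes'] / ['detection_scores'].
detection_boxes = np.array([[0.1, 0.2, 0.5, 0.8],    # [ymin, xmin, ymax, xmax], normalized
                            [0.0, 0.0, 0.25, 0.5]])
detection_scores = np.array([0.95, 0.7])
im_height, im_width = 400, 1000  # e.g. image_np.shape[:2] with cv2

# Keep confident detections, then scale each row by [height, width, height, width].
keep = detection_scores > 0.5
pixel_boxes = detection_boxes[keep] * np.array([im_height, im_width, im_height, im_width])

for ymin, xmin, ymax, xmax in pixel_boxes:
    print(xmin, ymin, xmax, ymax)
```

This yields one pixel-coordinate row per confident detection, which is exactly the "coordinates of all bounding boxes for a single image" the question asks for.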