Saving detected objects in a dataframe: tensorflow object_detection

Question

I am running the typical code from the tensorflow object_detection GitHub repository: https://github.com/tensorflow/models/tree/master/research/object_detection

Specifically, the "object_detection_tutorial.ipynb" file. The main loop is this section:

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Define input and output Tensors for detection_graph
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represents the level of confidence for each of the objects.
    # Score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      # Actual detection.
      (boxes, scores, classes, num) = sess.run(
          [detection_boxes, detection_scores, detection_classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8)
      plt.figure(figsize=IMAGE_SIZE)
      plt.imshow(image_np)

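I am just looking for some advice on the best way to actually save what was identified in the images into a dataframe, which would ideally store the class of each object detected in each image.

Any help would be appreciated (: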

Answer 1

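Well, I guess this is a bit late, but I am working on this now. I went through the same pain for a few days and finally got something working. Restricting myself to your code snippet, I added a few pieces and got this: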

# Initialize hitlist
hitf = open("hitlist.csv",'w')
hitf.write('image,class,score,bb0,bb1,bb2,bb3\n')
hitlim = 0.5

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Define input and output Tensors for detection_graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for each of the objects.
        # Score is shown on the result image, together with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')

        for i, image_path in enumerate(TEST_IMAGE_PATHS):
          image = Image.open(image_path)
          # The array-based representation of the image will be used later in order to prepare the
          # result image with boxes and labels on it.
          image_np = load_image_into_numpy_array(image)
          # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
          image_np_expanded = np.expand_dims(image_np, axis=0)
          # Actual detection.
          (boxes, scores, classes, num) = sess.run(
              [detection_boxes, detection_scores, detection_classes, num_detections],
              feed_dict={image_tensor: image_np_expanded})

          # Write the results to the hitlist: one line per hit with score >= hitlim.
          # Each batch holds a single image, so the batch index is always 0;
          # i is the position of the image in TEST_IMAGE_PATHS.
          fname = "image" + str(i)
          nprehit = scores.shape[1]  # 2nd array dimension
          for j in range(nprehit):
              score = scores[0][j]
              if score >= hitlim:
                  classid = int(classes[0][j])  # class ids come back as floats
                  classname = category_index[classid]["name"]
                  sscore = str(score)
                  bbox = boxes[0][j]
                  b0 = str(bbox[0])
                  b1 = str(bbox[1])
                  b2 = str(bbox[2])
                  b3 = str(bbox[3])
                  line = ",".join([fname, classname, sscore, b0, b1, b2, b3])
                  hitf.write(line + "\n")

          # Visualization of the results of a detection.
          vis_util.visualize_boxes_and_labels_on_image_array(
              image_np,
              np.squeeze(boxes),
              np.squeeze(classes).astype(np.int32),
              np.squeeze(scores),
              category_index,
              use_normalized_coordinates=True,
              line_thickness=8)
          plt.figure(figsize=IMAGE_SIZE)
          plt.imshow(image_np)                      

# close hitlist
hitf.flush()
hitf.close()

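Notes:

The added code has three parts: one initializes hitlist.csv, one adds a line for each "pre-hit" above the 0.5 confidence limit, and one closes the file.
It is intentionally not very "pythonic"; it uses simple and obvious constructs to illustrate what is going on. Except for ",".join(...), which I like so much that I couldn't resist.
The number of pre-hits can be found by looking at the second dimension of the scores or of the class ids.
The class ids that come back are floats, even though you mostly need them as integers. The conversion is easy.
There might be some small copy-and-paste errors in here, since I don't really have an MVE (Minimal Verifiable Example).
I am working with the rfcn_resnet101_coco_2017_11_08 object detection model instead of ssd_mobilenet_v1_coco_2017_11_17, so my hit list and scores are a bit different (worse, actually).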
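
The points in these notes (pre-hit count taken from the second dimension of the scores, float class ids cast to integers, a 0.5 limit) can also be sketched without the explicit loop, going straight to a pandas DataFrame as the question asks. This is only a sketch under assumptions: the arrays are made-up stand-ins with the shapes that sess.run returns for one image, and hits_to_dataframe and the tiny category_index are hypothetical names for illustration, not part of the object_detection API.

```python
import numpy as np
import pandas as pd


def hits_to_dataframe(fname, boxes, scores, classes, category_index, hitlim=0.5):
    """Return one row per detection whose score is >= hitlim.

    boxes has shape (1, N, 4); scores and classes have shape (1, N),
    i.e. a batch containing a single image.
    """
    boxes = np.squeeze(boxes, axis=0)
    scores = np.squeeze(scores, axis=0)
    classes = np.squeeze(classes, axis=0).astype(np.int32)  # ids arrive as floats
    keep = scores >= hitlim  # boolean mask replaces the per-detection loop
    n_hits = int(keep.sum())
    return pd.DataFrame({
        "image": [fname] * n_hits,
        "class": [category_index[int(c)]["name"] for c in classes[keep]],
        "score": scores[keep],
        "bb0": boxes[keep, 0],
        "bb1": boxes[keep, 1],
        "bb2": boxes[keep, 2],
        "bb3": boxes[keep, 3],
    })


# Made-up stand-ins for the arrays returned by sess.run (illustration only).
boxes = np.array([[[0.1, 0.2, 0.3, 0.4],
                   [0.5, 0.5, 0.9, 0.9],
                   [0.0, 0.0, 1.0, 1.0]]], dtype=np.float32)
scores = np.array([[0.9, 0.6, 0.2]], dtype=np.float32)
classes = np.array([[38.0, 1.0, 1.0]], dtype=np.float32)
category_index = {1: {"id": 1, "name": "person"}, 38: {"id": 38, "name": "kite"}}

df = hits_to_dataframe("image0", boxes, scores, classes, category_index)
print(df)
```

Inside the image loop, one such frame per image could be appended to a list and combined with pd.concat once the loop ends.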

Here is what the csv looks like:

image,class,score,bb0,bb1,bb2,bb3
image0,kite,0.997912,0.086756825,0.43700624,0.1691603,0.4966739
image0,person,0.9968072,0.7714941,0.15771112,0.945292,0.20014654
image0,person,0.9858992,0.67766637,0.08734644,0.8385928,0.12563995
image0,kite,0.9683157,0.26249793,0.20640253,0.31359094,0.2257214
image0,kite,0.8578382,0.3803091,0.42938906,0.40701985,0.4453904
image0,person,0.85244817,0.5692219,0.06266626,0.6282138,0.0788657
image0,kite,0.7622662,0.38192448,0.42580333,0.4104231,0.442965
image0,person,0.6722884,0.578461,0.022049228,0.6197509,0.036917627
image0,kite,0.6671517,0.43708095,0.80048573,0.47312954,0.8156846
image0,person,0.6577289,0.5996533,0.13272598,0.63358027,0.1430584
image0,kite,0.5893124,0.3790631,0.3451705,0.39845183,0.35965574
image0,person,0.51051,0.57377476,0.025907507,0.6221084,0.04294989

For this image (from the ipython notebook, but with a different object detection model).
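
Since the question asks for a dataframe, a CSV in this format can be loaded with pandas. A minimal sketch, assuming pandas is installed: it reads the first four rows shown above from an in-memory string so that it runs on its own, whereas in practice you would pass the path of hitlist.csv to pd.read_csv.

```python
import io

import pandas as pd

# First rows of the hit list shown above, inlined so the example is self-contained.
csv_text = """image,class,score,bb0,bb1,bb2,bb3
image0,kite,0.997912,0.086756825,0.43700624,0.1691603,0.4966739
image0,person,0.9968072,0.7714941,0.15771112,0.945292,0.20014654
image0,person,0.9858992,0.67766637,0.08734644,0.8385928,0.12563995
image0,kite,0.9683157,0.26249793,0.20640253,0.31359094,0.2257214
"""

hits = pd.read_csv(io.StringIO(csv_text))

# Number of detections per image and class.
counts = hits.groupby(["image", "class"]).size()
print(counts)
```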

Answer 2

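I think the syntax of the described object_detection.py file has changed a bit. I adjusted the described answer slightly for the new syntax:

This is the position you should find in your code: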

   # Actual detection.
      output_dict = run_inference_for_single_image(image_np, detection_graph)

Then this can be added:

  # store boxes in dataframe!
  # Counting the scores >= 0.1 only works as a cut-off index if the scores
  # are sorted in descending order (assumed here).
  cut_off_scores = len(list(filter(lambda x: x >= 0.1, output_dict['detection_scores'])))
  detect_scores = []
  detect_classes = []
  detect_ymin = []
  detect_xmin = []
  detect_ymax = []
  detect_xmax = []
  for j in range(cut_off_scores):
      detect_scores.append(output_dict['detection_scores'][j])
      detect_classes.append(output_dict['detection_classes'][j])
      # Assumption: ymin, xmin, ymax, xmax:
      boxes = output_dict['detection_boxes'][j]
      detect_ymin.append(boxes[0])
      detect_xmin.append(boxes[1])
      detect_ymax.append(boxes[2])
      detect_xmax.append(boxes[3])
  # Assumption: your files are named image1, image2, etc.,
  # and n is the number of the current image, set by the surrounding loop.
  Identifier = ("image" + str(n))
  Id_list = [Identifier] * cut_off_scores
  # Build the dataframe once, after all detections have been collected.
  Detected_objects = pd.DataFrame(
    {'Image': Id_list,
     'Score': detect_scores,
     'Class': detect_classes,
     'Ymin':  detect_ymin,
     'Xmin':  detect_xmin,
     'Ymax':  detect_ymax,
     'Xmax':  detect_xmax
    })
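
The dataframe built here describes a single image. To collect every image in one table, one option is to build a frame per image and concatenate them at the end. The following is a sketch under assumptions: output_dict_to_frame is a hypothetical helper, and the two output_dict values are made-up stand-ins that only carry the keys used above (in the notebook they would come from run_inference_for_single_image).

```python
import numpy as np
import pandas as pd


def output_dict_to_frame(output_dict, image_id, min_score=0.1):
    """Turn one image's detections into a DataFrame, keeping scores >= min_score."""
    scores = np.asarray(output_dict["detection_scores"])
    keep = scores >= min_score
    classes = np.asarray(output_dict["detection_classes"])[keep]
    # Assumption, as above: box order is ymin, xmin, ymax, xmax.
    boxes = np.asarray(output_dict["detection_boxes"])[keep]
    return pd.DataFrame({
        "Image": [image_id] * int(keep.sum()),
        "Score": scores[keep],
        "Class": classes,
        "Ymin": boxes[:, 0],
        "Xmin": boxes[:, 1],
        "Ymax": boxes[:, 2],
        "Xmax": boxes[:, 3],
    })


# Made-up results for two images (illustration only).
fake_outputs = [
    {"detection_scores": np.array([0.9, 0.05]),
     "detection_classes": np.array([1, 38]),
     "detection_boxes": np.array([[0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 1.0, 1.0]])},
    {"detection_scores": np.array([0.8, 0.4]),
     "detection_classes": np.array([38, 1]),
     "detection_boxes": np.array([[0.2, 0.2, 0.6, 0.6], [0.5, 0.1, 0.9, 0.3]])},
]

frames = [output_dict_to_frame(od, "image" + str(n))
          for n, od in enumerate(fake_outputs, start=1)]
all_detections = pd.concat(frames, ignore_index=True)
print(all_detections)
```

Filtering with a boolean mask does not depend on the scores being sorted, which the counting approach does.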

Answer 3

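I tried both of the methods above and neither worked for me. Mike Wise's code has a small bug: the value of i is missing. And in User27074's code there is a problem with appending the xmax, xmin, etc. values.

I tried a simple piece of code that works. The coordinates of the detected objects come back as fractions of the image size, so they need to be multiplied by the height and width of the image.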

detected_boxes = []
h = image_height = 500 #Change accordingly
w = image_width = 500 #change accordingly
#Columns' format 'ymin','xmin','ymax', 'xmax', 'class', 'Detection score'
for i, box in enumerate(np.squeeze(boxes)):
    if (np.squeeze(scores)[i] > 0.85):
        box[0] = int(box[0] * h)
        box[1] = int(box[1] * w)
        box[2] = int(box[2] * h)
        box[3] = int(box[3] * w)
        box = np.append(box, np.squeeze(classes)[i])
        box = np.append(box, np.squeeze(scores)[i]*100)
        detected_boxes.append(box)
np.savetxt('detection_coordinates.csv', detected_boxes, fmt='%i', delimiter=',')
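
The scaling step in this answer (normalized box coordinates multiplied by the image height and width) can also be applied to all boxes at once. A minimal sketch, assuming the boxes are in ymin, xmin, ymax, xmax order as in the column comment above; to_pixel_boxes is a hypothetical helper and the input values are made up.

```python
import numpy as np


def to_pixel_boxes(norm_boxes, image_height, image_width):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to integer pixel coordinates."""
    scale = np.array([image_height, image_width, image_height, image_width])
    # The multiplication creates a new array, so the normalized input is not modified.
    # astype(int) truncates toward zero, like int() in the loop above.
    return (np.asarray(norm_boxes) * scale).astype(int)


# Made-up normalized boxes for a 400 x 600 (height x width) image.
norm_boxes = np.array([[0.125, 0.25, 0.5, 0.75],
                       [0.25, 0.5, 0.75, 1.0]])
pixel_boxes = to_pixel_boxes(norm_boxes, image_height=400, image_width=600)
print(pixel_boxes)
```

Unlike the in-place assignment to box[0] through box[3], this leaves the array returned by the model untouched, which matters if it is passed to the visualization step afterwards.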

3 solutions

Solution 1
2 2018-04-24 19:59:36

Solution 2
2 2018-06-11 08:49:17

Solution 3
0 2019-10-02 16:45:00
