
How can I calculate the F1-score and other classification metrics from a faster-RCNN? (object detection in PyTorch)

I have been struggling with this and am finding it hard to understand how to calculate the F1 score in an object detection task.

Ideally I would like to know the false positives, true positives, false negatives and true negatives for every target in the image (it is a binary problem, with the object in the image as one class and the background as the other).

Finally I would also like to extract the false positive bounding boxes from the image. I am not sure if this is efficient, but I would save the image names and bbox predictions, along with whether they are false positives etc., to a numpy file.

I currently have this set up with a batch size of 1, so I can apply a non-maximum suppression algorithm to each image:

import torch
import torchvision

def apply_nms(orig_prediction, iou_thresh=0.3):

    # torchvision returns the indices of the bboxes to keep
    keep = torchvision.ops.nms(orig_prediction['boxes'], orig_prediction['scores'], iou_thresh)

    # keep only the surviving boxes, scores and labels
    final_prediction = orig_prediction
    final_prediction['boxes'] = final_prediction['boxes'][keep]
    final_prediction['scores'] = final_prediction['scores'][keep]
    final_prediction['labels'] = final_prediction['labels'][keep]

    return final_prediction


cpu_device = torch.device("cpu")
model.eval()
with torch.no_grad():
  for images, targets in valid_data_loader:
    images = list(img.to(device) for img in images)
    outputs = model(images)
    outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
    predictions = apply_nms(outputs[0], iou_thresh=0.3)

Any ideas on how I can determine the classification metrics above and the F1 score?

I have come across this line in the evaluation code provided with torchvision and am wondering whether it would help me move forward:

res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}

The use of the terms precision, recall and F1 score in object detection is a little confusing, because these metrics were originally defined for binary evaluation tasks (e.g. classification). In any case, in object detection they have a slightly different meaning:

Let:

TP - the set of predicted objects that were successfully matched to a ground-truth object (above the IoU threshold of whichever dataset you are using, usually 0.5 or 0.7)
FP - the set of predicted objects that were not successfully matched to a ground-truth object
FN - the set of ground-truth objects that were not successfully matched to a predicted object

Precision: TP / (TP + FP)
Recall:    TP / (TP + FN)
F1:        2*Precision*Recall /(Precision + Recall)

You can find many implementations of the matching step (matching ground-truth and predicted objects), usually provided with the datasets used for evaluation, or you can implement it yourself. I would suggest the py-motmetrics repository.
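As an illustrative sketch of such a matching step (not the py-motmetrics API), the following greedy one-to-one assignment by IoU for a single image assumes predictions are already sorted by descending confidence and that boxes are (x1, y1, x2, y2) tensors; the function name match_detections and the 0.5 threshold are just placeholders:

import torchvision

def match_detections(pred_boxes, gt_boxes, iou_thresh=0.5):
    # pred_boxes: [N, 4] tensor sorted by descending confidence, gt_boxes: [M, 4] tensor
    if len(pred_boxes) == 0 or len(gt_boxes) == 0:
        return 0, len(pred_boxes), len(gt_boxes)          # tp, fp, fn

    ious = torchvision.ops.box_iou(pred_boxes, gt_boxes)  # [N, M] pairwise IoU matrix
    matched_gt = set()
    tp = 0
    for i in range(len(pred_boxes)):
        best_iou, best_j = ious[i].max(dim=0)
        # count a true positive only if the overlap is high enough and the gt box is still unmatched
        if best_iou >= iou_thresh and best_j.item() not in matched_gt:
            tp += 1
            matched_gt.add(best_j.item())
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    return tp, fp, fn

Precision, recall and F1 then follow from the TP/FP/FN counts accumulated over the whole validation set, using the formulas above.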

A simple implementation of the IoU calculation might look like this:

def iou(a, b):
    """
    Description
    -----------
    Calculates intersection over union for a single pair of boxes a and b

    Parameters
    ----------
    a : tensor of size [4]
        bounding box in (x1, y1, x2, y2) format
    b : tensor of size [4]
        bounding box in (x1, y1, x2, y2) format

    Returns
    -------
    iou - float between [0,1]
        iou for a and b
    """

    area_a = (a[2]-a[0]) * (a[3]-a[1])
    area_b = (b[2]-b[0]) * (b[3]-b[1])

    minx = max(a[0], b[0])
    maxx = min(a[2], b[2])
    miny = max(a[1], b[1])
    maxy = min(a[3], b[3])

    intersection = max(0, maxx-minx) * max(0, maxy-miny)
    union = area_a + area_b - intersection
    iou = intersection/union

    return iou
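As a quick sanity check, with two made-up boxes:

import torch

box_a = torch.tensor([10.0, 10.0, 50.0, 50.0])  # (x1, y1, x2, y2)
box_b = torch.tensor([30.0, 30.0, 70.0, 70.0])
print(iou(box_a, box_b))  # intersection 400 / union 2800 ≈ 0.143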

So I have implemented the F1 score to be calculated globally, that is, for the entire dataset.

The implementation below gives an example of determining the F1 score for a validation set.

The outputs of the model are in dictionary format, so we need to put them into lists of the following form:

predicted_boxes (list): [[train_index, class_prediction, prob_score, x1, y1, x2, y2],[],...[]]

train_index: index of the image that the particular bbox comes from
class_prediction: integer value representing the class prediction
prob_score: output objectness score for the bbox
x1, y1, x2, y2: (x1, y1) and (x2, y2) bbox coordinates

gt_boxes (list): [[train_index, class_prediction, prob_score, x1, y1, x2, y2],[],...[]]

where prob_score is just 1 for the ground-truth inputs (it could really be anything, as long as that dimension is specified and filled in).
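A rough sketch of that conversion for one image, assuming a batch size of 1 and the same dictionary keys as in the loop below (the helper name build_rows is made up):

def build_rows(train_index, boxes, labels, scores=None):
    # convert one image's boxes into [train_index, class_prediction, prob_score, x1, y1, x2, y2] rows
    rows = []
    for i, (box, label) in enumerate(zip(boxes, labels)):
        prob_score = scores[i].item() if scores is not None else 1  # ground truth just gets 1
        rows.append([train_index, label.item(), prob_score,
                     box[0].item(), box[1].item(), box[2].item(), box[3].item()])
    return rows

# e.g. predicted_boxes = build_rows(image_id, predictions['boxes'], predictions['labels'], predictions['scores'])
#      gt_boxes        = build_rows(image_id, targets[0]['boxes'], targets[0]['labels'])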

IoU is also implemented in torchvision, which makes everything a lot easier.
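For reference, torchvision.ops.box_iou takes an [N, 4] and an [M, 4] tensor of (x1, y1, x2, y2) boxes and returns the [N, M] matrix of pairwise IoUs (the boxes here are made-up values):

import torch
import torchvision

preds = torch.tensor([[10.0, 10.0, 50.0, 50.0]])        # one predicted box
gts   = torch.tensor([[30.0, 30.0, 70.0, 70.0],
                      [100.0, 100.0, 120.0, 120.0]])    # two ground-truth boxes
print(torchvision.ops.box_iou(preds, gts))              # tensor([[0.1429, 0.0000]])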

I hope this helps others, as I couldn't find another implementation of the F1 score for object detection anywhere else.

import math
import torch
import torchvision
from collections import Counter

# IoU threshold for matching a detection to a ground-truth box
# (value assumed here; it is not set elsewhere in this snippet)
iou_threshold = 0.5
cpu_device = torch.device("cpu")

model_test.eval()
with torch.no_grad():
  global_tp = []
  global_fp = []
  global_gt = []


  valid_df_unique = get_unique(valid_df['image_id'])

  for images, targets in valid_data_loader:
    images = list(img.to(device) for img in images)
    outputs = model_test(images)
    outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
    predictions = apply_nms(outputs[0], iou_thresh=0.1)


    # looping through each class
    for c in range(num_classes):
      # detections (list): predicted_boxes that are class c
      detections = []
      # ground_truths (list): gt_boxes that are class c
      ground_truths = []

    

      for b,la,s in zip(predictions['boxes'], predictions['labels'],predictions['scores']): 
        updated_detection_array = [targets[0]['image_id'].item(), la.item(), s.item(), b[0].item(),b[1].item(),b[2].item(),b[3].item()]
        if la.item() == c:
          detections.append(updated_detection_array)

      for b,la in zip(targets[0]['boxes'], targets[0]['labels']): 
        updated_gt_array = [targets[0]['image_id'].item(), la.item(), 1, b[0].item(),b[1].item(),b[2].item(),b[3].item()]
        if la.item() == c:
          ground_truths.append(updated_gt_array)
          global_gt.append(updated_gt_array)


    
      # use Counter to create a dictionary where key is image # and value
      # is the # of bboxes in the given image
      amount_bboxes = Counter([gt[0] for gt in ground_truths])

      # goal: keep track of the gt bboxes we have already "detected" with prior predicted bboxes
      # key: image #
      # value: tensor of 0's (size is equal to # of bboxes in the given image)
      for key, value in amount_bboxes.items():
        amount_bboxes[key] = torch.zeros(value)

      # sort the detections by probability score, highest first
      detections.sort(key = lambda x: x[2], reverse = True)
      
      true_Positives = torch.zeros(len(detections))
      false_Positives = torch.zeros(len(detections))
      total_gt_bboxes = len(ground_truths)

      false_positives_frame = []
      true_positives_frame = []


      # iterate through all detections in given class c
      for detection_index, detection in enumerate(detections):
        # detection[0] indicates image #
        # ground_truth_image: the gt bbox's that are in same image as detection
        ground_truth_image = [bbox for bbox in ground_truths if bbox[0] == detection[0]]

        # num_gt_boxes: number of ground truth boxes in given image
        num_gt_boxes = len(ground_truth_image)
        best_iou = 0
        best_gt_index = 0


        for index, gt in enumerate(ground_truth_image):
          
          iou = torchvision.ops.box_iou(torch.tensor(detection[3:]).unsqueeze(0), 
                                        torch.tensor(gt[3:]).unsqueeze(0))
          
          if iou > best_iou:
            best_iou = iou
            best_gt_index = index

        if best_iou > iou_threshold:
          # check if gt_bbox with best_iou was already covered by previous detection with higher confidence score
          # amount_bboxes[detection[0]][best_gt_index] == 0 if not discovered yet, 1 otherwise
          if amount_bboxes[detection[0]][best_gt_index] == 0:
            true_Positives[detection_index] = 1
            amount_bboxes[detection[0]][best_gt_index] = 1  # mark this gt box as matched
            true_positives_frame.append(detection)
            global_tp.append(detection)

          else:
            false_Positives[detection_index] = 1
            false_positives_frame.append(detection)
            global_fp.append(detection)
        else:
          false_Positives[detection_index] = 1
          false_positives_frame.append(detection)
          global_fp.append(detection)


# remove nan values from ground truth list as list contains every mitosis image row entry (including images with no targets)
global_gt_updated = []
for gt in global_gt:
  if math.isnan(gt[3]) == False:
    global_gt_updated.append(gt)


global_fn = len(global_gt_updated) - len(global_tp)

precision = len(global_tp)/ (len(global_tp)+ len(global_fp))
recall = len(global_tp)/ (len(global_tp) + global_fn)

f1_score =  2* (precision * recall)/ (precision + recall)

print(len(global_tp))
print(recall)
print(precision)
print(f1_score)
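Since each entry in global_fp already holds the image index, class, score and box coordinates, the false-positive boxes can be dumped to a numpy file as the question suggests (a minimal sketch; the filename is arbitrary):

import numpy as np

# rows: [image_index, class_prediction, prob_score, x1, y1, x2, y2]
np.save('false_positive_boxes.npy', np.array(global_fp))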
