Why are the average precision and average recall so low when doing object detection with TensorFlow?

Question

I am following this link to learn object detection on Windows 10.

I use Anaconda (Python 3.6) and TensorFlow 1.12.0.

I prepared 400 pictures and divided them into two classes (stones and cars).

Then I used these commands to train:

cd E:\\test\\models-master\\research\\object_detection

python model_main.py --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --model_dir=training/ --num_train_steps=10000

The content of ssd_mobilenet_v1_coco.config:

# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 2
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  #fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  #from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 1000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path:'data/train.record'
  }
  label_map_path:'data/side_vehicle.pbtxt'
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: 'data/test.record'
  }
  label_map_path: 'data/side_vehicle.pbtxt'
  shuffle: false
  num_readers: 1
}
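
For reference, the comment in the train_config above says that the step limit effectively bypasses the learning rate schedule. A minimal sketch of the usual exponential decay formula, initial_lr * decay_factor ** (step / decay_steps), shows this for the values in this config. The function name `decayed_lr` is hypothetical, and whether the exponent is rounded down (`staircase`) is an assumption about how the schedule is applied. In both cases the rate barely moves within 10000 steps.

```python
def decayed_lr(step, initial_lr=0.004, decay_steps=800720,
               decay_factor=0.95, staircase=True):
    """Exponential decay: initial_lr * decay_factor ** (step / decay_steps).

    Defaults are copied from the config above; `staircase` is an assumption.
    """
    if staircase:
        exponent = step // decay_steps  # whole decay periods only
    else:
        exponent = step / decay_steps   # smooth decay
    return initial_lr * decay_factor ** exponent


# Compare both variants at a few points of a 10000-step run.
for step in (0, 1000, 6000, 10000):
    print(step, decayed_lr(step), decayed_lr(step, staircase=False))
```

With decay_steps at 800720, the first real drop would only come long after training stops, so the run uses a nearly constant rate of 0.004.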

It has now trained for 6000 steps, but the average precision and average recall are nowhere near 1, as the following pictures show:

The PyCharm terminal outputs this information:

How can I increase the average precision and average recall?

Answer 1

I found where the problem was.

First, there were too few pictures, so I increased the number of pictures to 4000.

Second, I should fine-tune from a pretrained checkpoint. I added four lines to the pipeline config file ssd_mobilenet_v1_coco.config:

  fine_tune_checkpoint: "ssd_inception_v2_coco_2018_01_28/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  from_detection_checkpoint: true
  load_all_detection_checkpoint_vars: true

Third, check the labels again to see whether any pictures are mislabeled. This check is very important.
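
Part of the label check can be automated. The sketch below assumes the annotations are stored as Pascal VOC style XML files (the format written by tools such as labelImg); the question does not say which format was used. The class names in `ALLOWED` and the function name `check_annotation` are hypothetical and should be replaced with the names in your own label map. It only catches mechanical errors such as unknown class names or impossible boxes. A box drawn around the wrong object still has to be found by looking at the pictures.

```python
import xml.etree.ElementTree as ET

# Assumption: replace with the class names from your own label map.
ALLOWED = {"stone", "car"}


def check_annotation(xml_text, allowed=ALLOWED):
    """Return a list of problems found in one Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    problems = []
    objects = root.findall("object")
    if not objects:
        problems.append("no objects labeled")
    for obj in objects:
        name = obj.findtext("name")
        xmin, ymin, xmax, ymax = (
            int(float(obj.findtext("bndbox/" + tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if name not in allowed:
            problems.append(f"unknown class {name!r}")
        # A valid box has positive size and lies inside the image.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append(f"bad box for {name!r}: {(xmin, ymin, xmax, ymax)}")
    return problems


# Made-up sample: the second object has a wrong class and an impossible box.
sample = """<annotation>
  <size><width>300</width><height>300</height></size>
  <object><name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>120</xmax><ymax>90</ymax></bndbox>
  </object>
  <object><name>cat</name>
    <bndbox><xmin>50</xmin><ymin>60</ymin><xmax>40</xmax><ymax>350</ymax></bndbox>
  </object>
</annotation>"""

for problem in check_annotation(sample):
    print(problem)
```

Running the same function over every annotation file before building train.record and test.record gives a quick list of pictures to review by hand.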

After these changes, the average precision and average recall increased.

An mAP of 1 is not the goal to aim for. Reaching about 0.48 is already a very strong result.
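
One reason an mAP close to 1 is unrealistic: a detection counts as correct only when its box overlaps a ground-truth box enough, measured by intersection over union (IoU). COCO-style average precision additionally averages over ten IoU thresholds from 0.5 to 0.95. That the metrics in the question are COCO-style is an assumption here. The sketch below uses a hypothetical helper `iou` and made-up boxes to show how a small localization error already fails most of the stricter thresholds.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # Clamp at zero so boxes that do not overlap give an empty intersection.
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up example: the prediction is shifted by 10 pixels in both directions.
truth = (0, 0, 100, 100)
pred = (10, 10, 110, 110)
score = iou(truth, pred)

# COCO-style thresholds: 0.50, 0.55, ..., 0.95.
thresholds = [0.5 + 0.05 * i for i in range(10)]
hits = [t for t in thresholds if score >= t]
print(round(score, 3), len(hits))
```

Here the shifted box has an IoU of about 0.68 and passes only 4 of the 10 thresholds, even though it covers most of the object.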
