TensorFlow 2 Object Detection API - Official Models: cannot change other parameters through the params_override argument

Question

When using the object detection models from the TensorFlow official models (Model Zoo), there is an argument named params_override. According to the code here ( https://github.com/tensorflow/models/blob/master/official/modeling/hyperparams/params_dict.py ), a ParamsDict is first created from a given set of default parameters, and certain parameters are then overridden.

The example given in the README.md of the RetinaNet model ( https://github.com/tensorflow/models/tree/master/official/vision/detection ) is:

--params_override="{ type: retinanet, train: { checkpoint: { path: ${RESNET_CHECKPOINT?}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN?} }, eval: { val_json_file: ${VAL_JSON_FILE?}, eval_file_pattern: ${EVAL_FILE_PATTERN?} } }"

The keys used in that example (type, train.checkpoint.path, train.checkpoint.prefix, train.train_file_pattern, eval.val_json_file, eval.eval_file_pattern) work fine. However, when I try to change any other parameter, I get an error such as:

KeyError: 'The key `num_classes:2` does not exist. To extend the existing keys, use `override` with `is_strict` = False.'

This happens for every parameter I try to change beyond the ones in the example. Other attempts include architecture{...,num_classes:2}, eval:{...,eval_samples:100}, and train:{...,total_steps:100, batch_size:8}, all of which give the same "The key does not exist" error.

Here is the params.yaml file that is written out with the default settings:

anchor:
  anchor_size: 4.0
  aspect_ratios: [1.0, 2.0, 0.5]
  num_scales: 3
architecture:
  backbone: resnet
  max_level: 7
  min_level: 3
  multilevel_features: fpn
  num_classes: 91
  parser: retinanet_parser
  use_bfloat16: false
enable_summary: true
eval:
  batch_size: 8
  eval_dataset_type: tfrecord
  eval_file_pattern: annotations\test.record
  eval_samples: 5000
  eval_timeout: null
  input_sharding: true
  min_eval_interval: 180
  num_images_to_visualize: 0
  num_steps_per_eval: 1000
  type: box
  use_json_file: true
  val_json_file: ''
fpn:
  fpn_feat_dims: 256
  use_batch_norm: true
  use_separable_conv: false
isolate_session_state: false
model_dir: probe_detection_models\v1
norm_activation:
  activation: relu
  batch_norm_epsilon: 0.0001
  batch_norm_momentum: 0.997
  batch_norm_trainable: true
  use_sync_bn: false
postprocess:
  max_total_size: 100
  nms_iou_threshold: 0.5
  pre_nms_num_boxes: 5000
  score_threshold: 0.05
  use_batched_nms: false
predict:
  batch_size: 8
resnet:
  resnet_depth: 50
retinanet_head:
  num_convs: 4
  num_filters: 256
  use_separable_conv: false
retinanet_loss:
  box_loss_weight: 50
  focal_loss_alpha: 0.25
  focal_loss_gamma: 1.5
  huber_loss_delta: 0.1
retinanet_parser:
  aug_rand_hflip: true
  aug_scale_max: 1.0
  aug_scale_min: 1.0
  autoaugment_policy_name: v0
  match_threshold: 0.5
  max_num_instances: 100
  num_channels: 3
  output_size: [640, 640]
  skip_crowd_during_training: true
  unmatched_threshold: 0.5
  use_autoaugment: false
spinenet:
  model_id: '49'
strategy_config:
  all_reduce_alg: null
  distribution_strategy: one_device
  num_gpus: 1
  num_packs: 1
  task_index: -1
  tpu: Quadro
  worker_hosts: null
strategy_type: one_device
train:
  batch_size: 64
  checkpoint:
    path: resnet50-2018-02-07
    prefix: resnet50/
  frozen_variable_prefix: ''
  gradient_clip_norm: 0.0
  input_partition_dims: null
  input_sharding: false
  iterations_per_loop: 100
  l2_weight_decay: 0.0001
  learning_rate:
    init_learning_rate: 0.08
    learning_rate_levels: [0.008, 0.0008]
    learning_rate_steps: [15000, 20000]
    type: step
    warmup_learning_rate: 0.0067
    warmup_steps: 500
  num_cores_per_replica: null
  optimizer:
    momentum: 0.9
    nesterov: true
    type: momentum
  regularization_variable_regex: .*(kernel|weight):0$
  total_steps: 22500
  train_dataset_type: tfrecord
  train_file_pattern: annotations\train.record
  transpose_input: false
type: retinanet
use_tpu: false

How can I change parameters other than the ones shown in the --params_override example?

Appendix - full error message:

Traceback (most recent call last):
  File "models/official/vision/detection/main.py", line 265, in <module>
    app.run(main)
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\absl\app.py", line 303, in run
    _run_main(main, args)
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "models/official/vision/detection/main.py", line 260, in main
    run()
  File "models/official/vision/detection/main.py", line 185, in run
    params = params_dict.override_params_dict(
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\official\modeling\hyperparams\params_dict.py", line 443, in override_params_dict
    params.override(params_dict, is_strict)
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\official\modeling\hyperparams\params_dict.py", line 166, in override
    self._override(override_params, is_strict)  # pylint: disable=protected-access
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\official\modeling\hyperparams\params_dict.py", line 183, in _override
    self.__dict__[k]._override(v, is_strict)  # pylint: disable=protected-access
  File "C:\Users\212765830\AppData\Local\Continuum\anaconda3\envs\tf-official\lib\site-packages\official\modeling\hyperparams\params_dict.py", line 176, in _override
    raise KeyError('The key `{}` does not exist. '
KeyError: 'The key `total_steps:100` does not exist. To extend the existing keys, use `override` with `is_strict` = False.'

Answer 1

I found additional instructions further down in the README.

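1) You can create a YAML config file, e.g. my_retinanet.yaml, and pass it to the command with --config_file. The file specifies the parameters to override and should include at least the fields shown in the config file further down.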

python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu="${TPU_NAME?}" \
  --model_dir="${MODEL_DIR?}" \
  --mode=train \
  --config_file="my_retinanet.yaml"

The config file is:

# my_retinanet.yaml
type: 'retinanet'
train:
  train_file_pattern: <path to the TFRecord training data>
eval:
  eval_file_pattern: <path to the TFRecord validation data>
  val_json_file: <path to the validation annotation JSON file>
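The same file can presumably carry the extra keys the question tried to change, since architecture.num_classes, train.total_steps, train.batch_size and eval.eval_samples all appear in the default params.yaml dump above. A minimal sketch of building such a file body follows. `to_block_yaml` is a hypothetical helper written for this illustration, not part of the official models, and it only handles nested dicts of plain scalars.

```python
def to_block_yaml(mapping, indent=0):
    """Render a nested dict of plain scalars as block-style YAML lines.

    Hypothetical helper for illustration only; it does no quoting or escaping.
    """
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Nested section: emit the key, then its children indented by two.
            lines.append(f"{pad}{key}:")
            lines.extend(to_block_yaml(value, indent + 2))
        else:
            # Note the space after the colon, which YAML needs to split key and value.
            lines.append(f"{pad}{key}: {value}")
    return lines


# Override values taken from the question; every key exists in the default dump.
overrides = {
    "type": "retinanet",
    "architecture": {"num_classes": 2},
    "train": {"total_steps": 100, "batch_size": 8},
    "eval": {"eval_samples": 100},
}

yaml_text = "\n".join(to_block_yaml(overrides)) + "\n"
print(yaml_text)
```

Saving the printed text as my_retinanet.yaml, together with the file-pattern fields listed above, and passing it with --config_file should then apply these overrides, assuming the config file is checked against the same default keys.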

Or 2) you can use an inline configuration (YAML or JSON format):

python3 ~/models/official/vision/detection/main.py \
  --model_dir=<model folder> \
  --strategy_type=one_device \
  --num_gpus=1 \
  --mode=train \
  --params_override="eval:
 eval_file_pattern: <Eval TFRecord file pattern>
 batch_size: 8
 val_json_file: <COCO format groundtruth JSON file>
predict:
 predict_batch_size: 8
architecture:
 use_bfloat16: False
train:
 total_steps: 1
 batch_size: 8
 train_file_pattern: <Eval TFRecord file pattern>
use_tpu: False
"

For some reason, the inline JSON format did not work for me, but the my_retinanet.yaml file did.
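One possible explanation is hinted at by the error text in the question: the rejected key is reported as `total_steps:100`, with the value attached. In YAML, a colon only separates a plain key from its value when it is followed by a space, so an inline entry written as total_steps:100 is likely read as a single key with no value. The sketch below illustrates how a strict override would then reject it. `strict_override` is a simplified, hypothetical stand-in, not the actual ParamsDict implementation.

```python
def strict_override(defaults, overrides):
    """Recursively apply overrides, rejecting keys that are absent from defaults.

    Simplified stand-in for a strict override; not the library's real code.
    """
    for key, value in overrides.items():
        if key not in defaults:
            raise KeyError(f"The key `{key}` does not exist.")
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            strict_override(defaults[key], value)
        else:
            defaults[key] = value
    return defaults


# A small slice of the default config from the question.
defaults = {"train": {"total_steps": 22500, "batch_size": 64}}

# Well-formed override ("total_steps: 100" with a space): key and value are separate.
strict_override(defaults, {"train": {"total_steps": 100}})

# What "total_steps:100" (no space) would look like after parsing, under the
# assumption above: one key that has swallowed its value.
try:
    strict_override(defaults, {"train": {"total_steps:100": None}})
except KeyError as err:
    message = err.args[0]
    print(message)
```

If this is indeed the cause, adding a space after each colon in the inline override (for example total_steps: 100) should let the key match.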
