Error while exporting inference graph in TensorFlow 1.15

Question

I'm encountering the following error when trying to export my trained model (trained for 1500 steps):

Traceback (most recent call last):
  File "export_inference_graph.py", line 150, in <module>
    tf.app.run()
  File "C:\Users\USERNAME\anaconda3\envs\test123\lib\site-packages\tensorflow_core\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\USERNAME\anaconda3\envs\test123\lib\site-packages\absl\app.py", line 312, in run
    _run_main(main, args)
  File "C:\Users\USERNAME\anaconda3\envs\test123\lib\site-packages\absl\app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "export_inference_graph.py", line 146, in main
    write_inference_graph=FLAGS.write_inference_graph)
  File "C:\Users\USERNAME\Desktop\DirectML_Tensorflow_Library\Tensorflow\workspace\training_demo\object_detection\exporter.py", line 455, in export_inference_graph
    write_inference_graph=write_inference_graph)
  File "C:\Users\USERNAME\Desktop\DirectML_Tensorflow_Library\Tensorflow\workspace\training_demo\object_detection\exporter.py", line 384, in _export_inference_graph
    trained_checkpoint_prefix=checkpoint_to_use)
  File "C:\Users\USERNAME\Desktop\DirectML_Tensorflow_Library\Tensorflow\workspace\training_demo\object_detection\exporter.py", line 295, in write_graph_and_checkpoint
    saver.restore(sess, trained_checkpoint_prefix)
  File "C:\Users\USERNAME\anaconda3\envs\test123\lib\site-packages\tensorflow_core\python\training\saver.py", line 1282, in restore
    checkpoint_prefix)
    
ValueError: The passed save_path is not a valid checkpoint: C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\my_models\\my_ssd_mobilenet_v1_coco_2018_01_28\\checkpoints\\model.ckpt-1500

This is the pipeline config I used when setting up the model:

model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_mobilenet_v1"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.99999989895e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.0299999993294
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.999700009823
          center: true
          scale: true
          epsilon: 0.0010000000475
          train: true
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 3.99999989895e-05
            }
          }
          initializer {
            truncated_normal_initializer {
              mean: 0.0
              stddev: 0.0299999993294
            }
          }
          activation: RELU_6
          batch_norm {
            decay: 0.999700009823
            center: true
            scale: true
            epsilon: 0.0010000000475
            train: true
          }
        }
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.800000011921
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.20000000298
        max_scale: 0.949999988079
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.333299994469
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 0.300000011921
        iou_threshold: 0.600000023842
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.990000009537
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
  }
}
train_config {
  batch_size: 12
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    rms_prop_optimizer {
      learning_rate {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.00400000018999
          decay_steps: 800720
          decay_factor: 0.949999988079
        }
      }
      momentum_optimizer_value: 0.899999976158
      decay: 0.899999976158
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\pre-trained-model\\ssd_mobilenet_v1_coco_2018_01_28\\model.ckpt"
  from_detection_checkpoint: true
  num_steps: 1500
}
train_input_reader {
  label_map_path: "C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\annotations\\label_map.pbtxt"
  tf_record_input_reader {
    input_path: "C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\annotations\\train.record"
  }
}
eval_config {
  num_examples: 540
  max_evals: 10
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\annotations\\label_map.pbtxt"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "C:\\Users\\USERNAME\\Desktop\\DirectML_Tensorflow_Library\\Tensorflow\\workspace\\training_demo\\annotations\\test.record"
  }
}

I've also trained on the same dataset with some other models from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md#coco-trained-models, and the same error occurs.

My checkpoints folder contains "model.ckpt-1500.data-00000-of-00001", "model.ckpt-1500.index", "model.ckpt-1500.meta" and "checkpoint". Inside "checkpoint", the entry reads model_checkpoint_path: "model.ckpt-1500".

So the checkpoint exists, but it is not recognized as being a valid checkpoint when I try exporting it.
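One way to double-check the files behind a checkpoint prefix is a small standard-library script like the one below. This is only a rough sketch: the helper names describe_checkpoint and looks_restorable are made up for illustration, and the check (an ".index" file plus at least one ".data-*" shard) is an approximation of what a restorable checkpoint looks like on disk, not TensorFlow's own validation.

```python
import glob
import os


def describe_checkpoint(prefix):
    """Report which files of a TF1-style checkpoint exist for `prefix`.

    `prefix` is the path without any extension, e.g. ".../model.ckpt-1500".
    (Hypothetical helper, not part of TensorFlow.)
    """
    # glob.escape protects against special characters in directory names.
    shards = sorted(glob.glob(glob.escape(prefix) + ".data-*"))
    return {
        "index": os.path.isfile(prefix + ".index"),
        "meta": os.path.isfile(prefix + ".meta"),
        "data_shards": [os.path.basename(p) for p in shards],
    }


def looks_restorable(prefix):
    """Heuristic: an index file and at least one data shard are present."""
    info = describe_checkpoint(prefix)
    return bool(info["index"] and info["data_shards"])


if __name__ == "__main__":
    # Path copied from the error message; adjust it to your own setup.
    prefix = (
        r"C:\Users\USERNAME\Desktop\DirectML_Tensorflow_Library\Tensorflow"
        r"\workspace\training_demo\my_models"
        r"\my_ssd_mobilenet_v1_coco_2018_01_28\checkpoints\model.ckpt-1500"
    )
    print(describe_checkpoint(prefix))
    print("looks restorable:", looks_restorable(prefix))
```

If the script reports a missing index file or no data shards, the prefix passed to the export script does not match what is actually on disk.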

Answer 1

I've solved my issue, and I'm posting my answer here in case it helps someone else who runs into the same problem when setting up tensorflow-directml 1.15.5 and using pretrained models from the TF1 detection model zoo.

Go into anaconda3 --> envs --> [name of your tensorflow environment; in my case test123] --> Lib --> site-packages --> tensorflow_estimator --> python --> estimator --> run_config.py

Change save_checkpoints_steps = None to save_checkpoints_steps = 100 (or any number of your choice). For this to work, you also need to change save_checkpoints_secs = 600 to save_checkpoints_secs = None, because only one of the two options may be set at a time.

So the problem was that "run_config.py" was not set up to save my checkpoints by step count.
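After retraining with the changed setting, it may help to confirm which steps were actually written to the checkpoints folder. The sketch below uses only the standard library; saved_steps is a hypothetical helper, and it assumes the files follow the "model.ckpt-<step>.index" naming shown in the question.

```python
import os
import re


def saved_steps(checkpoint_dir, basename="model.ckpt"):
    """Return the sorted step numbers that have an index file on disk.

    (Hypothetical helper; assumes "<basename>-<step>.index" file names.)
    """
    pattern = re.compile(re.escape(basename) + r"-(\d+)\.index$")
    steps = []
    for name in os.listdir(checkpoint_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


if __name__ == "__main__":
    # Replace "checkpoints" with the path to your own checkpoints folder.
    print(saved_steps("checkpoints"))
```

Note that the Estimator keeps only the most recent few checkpoints by default, so older steps may be missing from the list even when saving works as intended.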
