I have just started working with TensorFlow in Python. I am trying to train a Single Shot Detector (SSD) with TensorFlow on the Pascal VOC dataset. Creating the TFRecords and running evaluation with the trained model VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt both worked without errors. However, when I try to train on the Pascal VOC 2007 or 2012 dataset starting from the pre-trained model ssd_300_vgg.ckpt, I get the following error:
2017-08-25 20:03:03.001268: I tensorflow/core/common_runtime gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro M5000M, pci bus id: 0000:01:00.0)
INFO:tensorflow:Error reported to Coordinator: <type 'exceptions.ValueError'>, Can't load save_path when it is None.
Traceback (most recent call last):
File "train_ssd_network.py", line 391, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train_ssd_network.py", line 387, in main
sync_optimizer=None)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 738, in train
master, start_standard_services=False, config=session_config) as sess:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 965, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 793, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 954, in managed_session
start_standard_services=start_standard_services)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/supervisor.py", line 709, in prepare_or_wait_for_session
init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/session_manager.py", line 281, in prepare_session
init_fn(sess)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/variables.py", line 660, in callback
saver.restore(session, model_path)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1558, in restore
raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
I am using the following script to fine-tune the model:
DATASET_DIR=./tfrecords
TRAIN_DIR=./logs/
CHECKPOINT_PATH=./checkpoints/ssd_300_vgg.ckpt
python train_ssd_network.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2012 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--checkpoint_path=${CHECKPOINT_PATH} \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=10
The model ssd_300_vgg.ckpt is stored in ./checkpoints.
Please let me know if anyone has a solution.
Three suggestions:
1. Check the path when restoring the model (model_path must point to the .meta file):
saver = tf.train.import_meta_graph(model_path)
2. Check the path when restoring the checkpoint (cur_dir is the directory that holds the checkpoint files):
saver.restore(sess, tf.train.latest_checkpoint(cur_dir))
3. Check the parameters when saving the model:
saver = tf.train.Saver(save_relative_paths=True)
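One way to act on the first two suggestions is to look at what is actually on disk before calling restore. The sketch below uses only the Python standard library and does not call TensorFlow. describe_checkpoint_path is a hypothetical helper name, and the file-name patterns it checks (prefix.index plus prefix.data-*, or a single file named exactly like the prefix) are the usual TensorFlow checkpoint layouts, assumed here rather than taken from the question.

```python
import os


def describe_checkpoint_path(path):
    """Classify what `path` refers to on disk (hypothetical helper).

    Returns "directory", "v2_prefix", "v1_file" or "missing".
    """
    if os.path.isdir(path):
        # A directory is not a checkpoint by itself.
        return "directory"

    parent = os.path.dirname(path) or "."
    base = os.path.basename(path)
    names = os.listdir(parent) if os.path.isdir(parent) else []

    # Assumed newer layout: <prefix>.index plus <prefix>.data-XXXXX-of-YYYYY
    has_index = (base + ".index") in names
    has_data = any(name.startswith(base + ".data-") for name in names)
    if has_index and has_data:
        return "v2_prefix"

    # Assumed older layout: one file named exactly like the prefix.
    if base in names:
        return "v1_file"

    return "missing"


if __name__ == "__main__":
    # Path taken from the script in the question.
    print(describe_checkpoint_path("./checkpoints/ssd_300_vgg.ckpt"))
```

A result of "directory" or "missing" for the value passed as --checkpoint_path would be consistent with the error above, since tf.train.latest_checkpoint returns None when it finds no checkpoint in a directory.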
CHECKPOINT_PATH=./checkpoints/ssd_300_vgg.ckpt/ssd_300_vgg.ckpt
I was having the same problem even though I was providing the correct path.
I was passing the path like this:
sess = tf.Session()
saver = tf.train.import_meta_graph('model_dir/model.meta')
restore = saver.restore(sess,tf.train.latest_checkpoint('model_dir/'))
But I was still getting the error, so I opened the checkpoint file in a text editor. The path recorded in the checkpoint file was wrong, which is why the model could not be loaded.
So if you are getting the same error, open the checkpoint file and check the path it contains.
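The manual check described above can be scripted. This is a minimal sketch, assuming the checkpoint file is the usual small text file with a line of the form model_checkpoint_path: "..." and that relative entries are resolved against the model directory. The function names and the 'model_dir' argument are placeholders for illustration, not part of any TensorFlow API.

```python
import os
import re

# Matches the line that records the most recent checkpoint prefix.
_ENTRY = re.compile(r'^model_checkpoint_path:\s*"(.*)"\s*$')


def latest_prefix_from_index(model_dir):
    """Return the prefix recorded in model_dir/checkpoint, or None."""
    index_file = os.path.join(model_dir, "checkpoint")
    if not os.path.isfile(index_file):
        return None
    with open(index_file) as handle:
        for line in handle:
            match = _ENTRY.match(line.strip())
            if match:
                recorded = match.group(1)
                # Assumption: relative entries are relative to model_dir.
                if not os.path.isabs(recorded):
                    recorded = os.path.join(model_dir, recorded)
                return recorded
    return None


def recorded_checkpoint_exists(model_dir):
    """True if the recorded prefix has a matching file on disk."""
    prefix = latest_prefix_from_index(model_dir)
    if prefix is None:
        return False
    # Either <prefix>.index (newer layout) or <prefix> itself (older layout).
    return os.path.isfile(prefix + ".index") or os.path.isfile(prefix)


if __name__ == "__main__":
    print(latest_prefix_from_index("model_dir"))
    print(recorded_checkpoint_exists("model_dir"))
```

If the first call prints a path that does not exist on your machine (for example an absolute path from the computer where the model was trained), the entry in the checkpoint file is the thing to fix.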
Check your path; you may be pointing to a file that does not exist.
What someone reading through these answers might actually want:
saver.restore(sess, tf.train.latest_checkpoint("directorytosavedmodel/./"))
In other words, with the ./ appended, the lookup works in the directory where the model is saved. (I was reading through this thread thinking: I just want to restore a model, not save one before I restore.)
It means that the model files are not there, so saver.restore cannot read them.
Sometimes (as in my case) you clone a repo but forget to download the pretrained model and place it in the path that saver.restore points to.
(If you save the models yourself in your code, see @Maikefer's answer.)