Error converting FaceNet model into ONNX format

System information

  • OS Platform and Distribution: Linux Ubuntu 19.10
  • Tensorflow Version: 1.15
  • Python version: 3.7

Issue

I downloaded a TensorFlow model of FaceNet from this page, and I'm trying to convert it from a .pb into a .onnx file; however, it raises the following error:

To Reproduce

root@xesk-VirtualBox:/home/xesk/Desktop# python -m tf2onnx.convert --saved-model home/xesk/Desktop/2s/20180402-114759/20180402-114759.pb --output model.onnx

    2020-08-03 20:18:05.081538: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
    2020-08-03 20:18:05.081680: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    2020-08-03 20:18:07,431 - WARNING - '--tag' not specified for saved_model. Using --tag serve
    Traceback (most recent call last):
      File "/usr/lib/python3.8/runpy.py", line 193, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.8/runpy.py", line 86, in _run_code
        exec(code, run_globals)
      File "/usr/local/lib/python3.8/dist-packages/tf2onnx/convert.py", line 171, in <module>
        main()
      File "/usr/local/lib/python3.8/dist-packages/tf2onnx/convert.py", line 131, in main
        graph_def, inputs, outputs = tf_loader.from_saved_model(
      File "/usr/local/lib/python3.8/dist-packages/tf2onnx/tf_loader.py", line 288, in from_saved_model
        _from_saved_model_v2(model_path, input_names, output_names, tag, signatures, concrete_function)
      File "/usr/local/lib/python3.8/dist-packages/tf2onnx/tf_loader.py", line 247, in _from_saved_model_v2
        imported = tf.saved_model.load(model_path, tags=tag)  # pylint: disable=no-value-for-parameter
      File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/load.py", line 603, in load
        return load_internal(export_dir, tags, options)
      File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/load.py", line 614, in load_internal
        loader_impl.parse_saved_model_with_debug_info(export_dir))
      File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 56, in parse_saved_model_with_debug_info
        saved_model = _parse_saved_model(export_dir)
      File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/saved_model/loader_impl.py", line 110, in parse_saved_model
        raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
    OSError: SavedModel file does not exist at: home/xesk/Desktop/2s/20180402-114759/20180402-114759.pb/{saved_model.pbtxt|saved_model.pb}

Additional context

I'm not running any CUDA or similar, only CPU. The downloaded model is 20180402-114759. It's the first time I'm working with these tools, and I'm a bit of a beginner in this AI world, so I might be missing something obvious. Of course, I checked the path and the command syntax several times. Might it be something to do with the format of the files I downloaded?

EDIT

Following Venkatesh Wadawadagi's answer, I'm going for Option 1. Renaming the .meta file solved the problem of the script not recognising it.

The script now runs more or less correctly and finishes creating the export_dir directory, with export_dir > 0 > variables subfolders. However, they are empty.

The console output is this:

xesk@xesk:~/Desktop/UP2S/ACROMEGALLY/20180402-114759$ python3 ./pb2sm
2020-08-10 16:02:26.128846: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-08-10 16:02:26.129114: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2020-08-10 16:02:26.129137: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (xesk): /proc/driver/nvidia/version does not exist
2020-08-10 16:02:26.129501: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-08-10 16:02:26.139076: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2592000000 Hz
2020-08-10 16:02:26.139506: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x44018d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-10 16:02:26.139520: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/queue_runner_impl.py:391: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
2020-08-10 16:02:32.681265: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17676288 exceeds 10% of system memory.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value InceptionResnetV1/Block8/Branch_0/Conv2d_1x1/BatchNorm/beta/Adam
     [[{{node save/SaveV2_1}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./pb2sm", line 17, in <module>
    strip_default_attrs=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/saved_model/builder_impl.py", line 595, in add_meta_graph_and_variables
    saver.save(sess, variables_path, write_meta_graph=False, write_state=False)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 1193, in save
    raise exc
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 1176, in save
    {self.saver_def.filename_tensor_name: checkpoint_file})
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value InceptionResnetV1/Block8/Branch_0/Conv2d_1x1/BatchNorm/beta/Adam
     [[node save/SaveV2_1 (defined at /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'save/SaveV2_1':
  File "./pb2sm", line 17, in <module>
    strip_default_attrs=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/saved_model/builder_impl.py", line 589, in add_meta_graph_and_variables
    saver = self._maybe_create_saver(saver)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/saved_model/builder_impl.py", line 227, in _maybe_create_saver
    allow_empty=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 828, in __init__
    self.build()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 840, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 878, in _build
    build_restore=build_restore)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 499, in _build_internal
    save_tensor = self._AddShardedSaveOps(filename_tensor, per_device)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 291, in _AddShardedSaveOps
    return self._AddShardedSaveOpsForV2(filename_tensor, per_device)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 265, in _AddShardedSaveOpsForV2
    sharded_saves.append(self._AddSaveOps(sharded_filename, saveables))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 206, in _AddSaveOps
    save = self.save_op(filename_tensor, saveables)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/training/saver.py", line 122, in save_op
    tensors)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 1946, in save_v2
    name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Is it possible that I'm missing some library I need to install? It seems to have something to do with some CUDA implementation, which I don't have. Could that be the cause?

The command you're using:

python -m tf2onnx.convert --saved-model home/xesk/Desktop/2s/20180402-114759/20180402-114759.pb --output model.onnx

Note that the FaceNet trained model you're using only ships a frozen graph (.pb file) and a checkpoint (.ckpt), and does not have the SavedModel that your command is looking for.

So basically you are passing the path to the .pb file of the frozen graph, which is different from the .pb file of a SavedModel (which you don't have). A SavedModel has a variables folder alongside its saved_model.pb file.

That's why the error:

OSError: SavedModel file does not exist

Read more about SavedModel here.
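To confirm this quickly before running the converter, here is a minimal check (my own sketch, assuming a TF build where tf.saved_model.contains_saved_model is available, as in 1.15 and 2.x); it shows that the downloaded directory is not a SavedModel:

import os
import tensorflow as tf

# Path to the downloaded frozen graph (adjust as needed)
frozen_pb = 'home/xesk/Desktop/2s/20180402-114759/20180402-114759.pb'
model_dir = os.path.dirname(frozen_pb)

# True only if the directory contains saved_model.pb or saved_model.pbtxt
print(tf.saved_model.contains_saved_model(model_dir))  # False for this download

# The .pb that ships with FaceNet is just a frozen GraphDef
print(os.path.isfile(frozen_pb))  # True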

To proceed with ONNX conversion, you have two options:

  1. Convert checkpoint to SavedModel:

Use the following code for that:

import os
import tensorflow as tf

trained_checkpoint_prefix = 'model-20180402-114759.ckpt-275'
export_dir = os.path.join('export_dir', '0')

graph = tf.Graph()
with tf.compat.v1.Session(graph=graph) as sess:
    # Restore from checkpoint
    loader = tf.compat.v1.train.import_meta_graph(trained_checkpoint_prefix + '.meta')
    loader.restore(sess, trained_checkpoint_prefix)

    # Export checkpoint to SavedModel
    builder = tf.compat.v1.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(sess,
                                         [tf.saved_model.TRAINING, tf.saved_model.SERVING],
                                         strip_default_attrs=True)
    builder.save() 

Note: the .data, .index and .meta files should have the same prefix for this code to work, so rename the .meta file:

mv model-20180402-114759.meta model-20180402-114759.ckpt-275.meta
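After renaming, a quick sanity check (my own sketch, not part of the original answer) confirms that the checkpoint prefix used above is readable:

import glob
import tensorflow as tf

prefix = 'model-20180402-114759.ckpt-275'

# Expect the .meta, .index and .data-00000-of-00001 files to share this prefix
print(glob.glob(prefix + '.*'))

# A readable checkpoint returns (variable name, shape) pairs
print(tf.train.list_variables(prefix)[:5])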

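Once the script finishes, export_dir/0 should contain saved_model.pb and a populated variables folder, and the conversion command can then be pointed at that directory instead of the frozen .pb, for example:

python -m tf2onnx.convert --saved-model export_dir/0 --output model.onnx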

  2. Make use of the .ckpt file or the frozen graph .pb for ONNX conversion

From checkpoint format:

python -m tf2onnx.convert --checkpoint tensorflow-model-meta-file-path --output model.onnx --inputs input0:0,input1:0 --outputs output0:0

From graphdef/frozen-graph format:

python -m tf2onnx.convert --graphdef tensorflow-model-graphdef-file --output model.onnx --inputs input0:0,input1:0 --outputs output0:0

If your TensorFlow model is in a format other than SavedModel, then you need to provide the inputs and outputs of the model graph.

From this:

If your model is in checkpoint or graphdef format and you do not know the input and output nodes of the model, you can use the summarize_graph TensorFlow utility. The summarize_graph tool does need to be downloaded and built from source. If you have the option of going to your model provider and obtaining the model in saved model format, then we recommend doing so.
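If building summarize_graph is inconvenient, a rough alternative (my own sketch, not from the linked docs) is to load the frozen graph in Python and list likely input and output nodes; placeholders are usually the inputs, and nodes that nothing else consumes are candidate outputs (append :0 to the names for tf2onnx):

import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('20180402-114759.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Placeholders are the graph's feedable inputs
inputs = [n.name for n in graph_def.node if n.op == 'Placeholder']

# Nodes that no other node consumes are candidate outputs (heuristic)
consumed = {i.split(':')[0].lstrip('^') for n in graph_def.node for i in n.input}
outputs = [n.name for n in graph_def.node
           if n.name not in consumed and n.op not in ('Const', 'Placeholder', 'NoOp', 'Assert')]

print('possible inputs :', inputs)
print('possible outputs:', outputs)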

I have encountered a similar error. In my case, I made the mistake of giving the .pb file instead of path/to/savedmodel, which should be the path to the directory containing saved_model.pb. So assuming your 20180402-114759.pb is in the directory home/xesk/Desktop/2s/20180402-114759, the command should be:

python -m tf2onnx.convert --saved-model home/xesk/Desktop/2s/20180402-114759 --output model.onnx

Please refer to Getting Started Converting TensorFlow to ONNX and Using the SavedModel format for more information.
