
How can I use a Cloud TPU with TensorFlow Lite Model Maker?

I'm training an object detection model (EfficientDet-Lite) using TensorFlow Lite Model Maker in Colab and I'd like to use a Cloud TPU. I have all the images in a GCS bucket and provide a CSV file. When I call object_detector.create I get the following error:

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in shape(self)
   1196         # `_tensor_shape` is declared and defined in the definition of
   1197         # `EagerTensor`, in C.
-> 1198         self._tensor_shape = tensor_shape.TensorShape(self._shape_tuple())
   1199       except core._NotOkStatusException as e:
   1200         six.raise_from(core._status_to_exception(e.code, e.message), None)

InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables: Unimplemented: File system scheme '[local]' not implemented (file: '/tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables')

That looks like it's trying to process some local files on the Cloud TPU, which doesn't work...

The gist of what I'm doing is:

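import tensorflow as tf
from tflite_model_maker import object_detector

# drive_dir, csv_name, MODEL_SPEC, epochs and batch_size are assumed to be defined elsewhere in the notebook.
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
train_data, validation_data, test_data = object_detector.DataLoader.from_csv(
    drive_dir + csv_name,
    images_dir="images" if not tpu else None,
    cache_dir=drive_dir + "cub_cache",
)
# Build the EfficientDet-Lite spec with the TPU strategy.
spec = MODEL_SPEC(tflite_max_detections=10, strategy='tpu', tpu=tpu.master(), gcp_project="xxx")
# This is the call that raises the InvalidArgumentError shown above.
model = object_detector.create(train_data=train_data,
                               model_spec=spec,
                               validation_data=validation_data,
                               epochs=epochs,
                               batch_size=batch_size,
                               train_whole_model=True)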

I can't find any example with Model Maker that uses Cloud TPU.

Edit: the error seems to occur when the EfficientDet model gets loaded, so somehow Model Maker must be pointing to a local file that doesn't work for Cloud TPU?

Yeah, the error is happening with TF Hub, and it seems to be a well-known issue. Basically, TF Hub loading tries to use a local cache, which the TPU doesn't have access to (and which Colab doesn't even provide). Check out https://github.com/tensorflow/hub/issues/604, which should get you past this error.
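As a rough sketch of the kind of workaround involved: TF Hub's caching documentation describes two environment variables, TFHUB_CACHE_DIR and TFHUB_MODEL_LOAD_FORMAT, that steer it away from the local /tmp cache. The helper name configure_tfhub_loading and the bucket path in the usage comment are hypothetical, and whether Model Maker picks these settings up in your setup is an assumption you would need to verify.

```python
import os


def configure_tfhub_loading(gcs_cache_dir=None):
    """Steer TF Hub away from the local /tmp cache (hypothetical helper).

    Call this before any TF Hub model is loaded.
    """
    if gcs_cache_dir is not None:
        # Option 1: cache downloaded models in a bucket the TPU can read.
        if not gcs_cache_dir.startswith("gs://"):
            raise ValueError("expected a gs:// path, got: " + gcs_cache_dir)
        os.environ["TFHUB_CACHE_DIR"] = gcs_cache_dir
        return "cache_dir"
    # Option 2: read uncompressed models directly from remote storage.
    os.environ["TFHUB_MODEL_LOAD_FORMAT"] = "UNCOMPRESSED"
    return "uncompressed"


# Example usage (the bucket name is a placeholder):
# configure_tfhub_loading("gs://my-bucket/tfhub_cache")
```

Either setting has to be in place before the model spec is created, since that is when the TF Hub module gets loaded.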

  1. Download the model you would like to train from TF Hub (replace X with a value from 0 to 4): https://tfhub.dev/tensorflow/efficientdet/liteX/feature-vector/1
  2. Extract the package twice, until you get to the "keras_metadata.pb" and "saved_model.pb" files and the "variables" folder
  3. Upload these files and folders to a Google Cloud Storage bucket
  4. Pass the uri argument to model_spec.get (https://www.tensorflow.org/lite/tutorials/model_maker_object_detection), pointing to the bucket folder (in gs:// format)
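The last step above might look roughly like this. It is a minimal sketch, assuming the extracted model files were uploaded under a bucket folder of your choosing and that model_spec.get forwards extra keyword arguments to the EfficientDet spec. The helper names gcs_model_uri and build_tpu_spec, the bucket name and the folder path are all hypothetical.

```python
def gcs_model_uri(bucket, folder):
    """Return the gs:// URI of the folder holding saved_model.pb and variables/."""
    return "gs://" + bucket.strip("/") + "/" + folder.strip("/")


def build_tpu_spec(model_uri, tpu_address, gcp_project):
    """Create an EfficientDet-Lite0 spec that loads its backbone from GCS."""
    # Imported lazily so the URI helper works without Model Maker installed.
    from tflite_model_maker import model_spec

    return model_spec.get(
        "efficientdet_lite0",
        uri=model_uri,  # the gs:// folder instead of the default tfhub.dev URL
        strategy="tpu",
        tpu=tpu_address,
        gcp_project=gcp_project,
    )


# Example usage in Colab (placeholder names):
# import tensorflow as tf
# tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
# uri = gcs_model_uri("my-bucket", "tfhub/efficientdet_lite0")
# spec = build_tpu_spec(uri, tpu.master(), "xxx")
```

The resulting spec would then be passed to object_detector.create as model_spec, exactly as in the question.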


 