
Converting an SSD object detection model to TFLite and quantizing it from float to uint8 for the Edge TPU

I am having problems converting an SSD object detection model into a uint8 TFLite model for the Edge TPU.

I have searched different forums, Stack Overflow threads and GitHub issues, and as far as I know I am following the right steps. Something must be wrong in my Jupyter notebook, since I can't achieve my goal.

I am sharing my steps below, as laid out in a Jupyter notebook. I think that will make things clearer.

#!/usr/bin/env python
# coding: utf-8

Set-up

This step clones the repository. If you have already done it, you can skip this step.

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
  !git clone --depth 1 https://github.com/tensorflow/models

Imports

Needed step: this just performs the imports.

import matplotlib
import matplotlib.pyplot as plt
import pathlib
import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

import tensorflow as tf
import tensorflow_datasets as tfds


from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
#from object_detection.utils import colab_utils
from object_detection.utils import config_util
from object_detection.builders import model_builder

%matplotlib inline

Downloading a friendly model

For TFLite, SSD networks are recommended. I downloaded the following object detection model. It works with 320x320 images.
!tflite_convert --saved_model_dir=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model --output_file=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model.tflite

List of the strings used to add the correct label to each box.

PATH_TO_LABELS = '/home/jose/codeWorkspace-2.4.1/tf_2.4.1/models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

Export and run with TFLite

Model conversion

In this step I convert the pb saved model to .tflite.

!tflite_convert --saved_model_dir=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model --output_file=/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model.tflite

Model Quantization (From float to uint8)

Once the model is converted, I need to quantize it. The original model takes a float tensor as input. Since I want to run it on an Edge TPU, I need the input and output tensors to be uint8.
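As a minimal sketch of what "uint8 input" means in practice: TFLite quantization is an affine mapping, real = scale * (q - zero_point). The helper name `quantize` and the scale and zero point below are illustrative assumptions for an input normalized to [-1, 1]; the real values come from the converted model's input details.

```python
import numpy as np

def quantize(real, scale, zero_point):
    """Map float values to uint8: q = round(real / scale) + zero_point, clipped to 0..255."""
    q = np.round(np.asarray(real, dtype=np.float32) / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

# Illustrative parameters (assumed, not read from a model).
scale, zero_point = 2.0 / 255.0, 127

x = np.array([-0.5, 0.0, 0.5], dtype=np.float32)
q = quantize(x, scale, zero_point)

# Mapping back shows the rounding error stays within one quantization step.
restored = scale * (q.astype(np.float32) - zero_point)
print(q, restored)
```

With parameters like these, the quantized value is almost the original 0..255 pixel value, which is why a uint8 image can usually be fed to such a model with little or no extra preprocessing.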

Generating a calibration data set.

def representative_dataset_gen():
    folder = "/home/jose/codeWorkspace-2.4.1/tf_2.4.1/images_ssd_mb2_2"
    image_size = 320
    raw_test_data = []

    files = glob.glob(folder+'/*.jpeg')
    for file in files:
        image = Image.open(file)
        image = image.convert("RGB")
        image = image.resize((image_size, image_size))
        # Scale the pixel values to the range [-1, 1]
        image = (2.0 / 255.0) * np.float32(image) - 1.0
        #image = np.asarray(image).astype(np.float32)
        # Add the batch dimension: (320, 320, 3) -> (1, 320, 320, 3)
        image = image[np.newaxis,:,:,:]
        raw_test_data.append(image)

    for data in raw_test_data:
        yield [data]
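The contract the converter relies on can be sketched without touching the file system: each yielded item is a list holding one float32 array of shape (1, 320, 320, 3). The names `to_model_input` and `representative_dataset` and the stand-in images are hypothetical; this is an illustration of the shapes and value range, not a replacement for calibrating on real images.

```python
import numpy as np

def to_model_input(pixels):
    """Scale an HxWx3 uint8 image to float32 in [-1, 1] and add a batch axis."""
    scaled = (2.0 / 255.0) * pixels.astype(np.float32) - 1.0
    return scaled[np.newaxis, ...].astype(np.float32)

def representative_dataset(images):
    """Yield one single-element list per calibration image."""
    for pixels in images:
        yield [to_model_input(pixels)]

# Stand-in images (assumption); real calibration should use data typical of deployment.
fake_images = [np.full((320, 320, 3), v, dtype=np.uint8) for v in (0, 128, 255)]
samples = list(representative_dataset(fake_images))
print(samples[0][0].shape, samples[0][0].dtype)
```

Yielding lazily like this also avoids holding every calibration image in memory at once.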

(DO NOT RUN THIS ONE). It is the same step as above, but with random values.

If you don't have a dataset, you can also feed randomly generated values as if they were images. This is the code I used to do so:

####THIS IS A RANDOM-GENERATED DATASET####
def representative_dataset_gen():
    for _ in range(320):
        data = np.random.rand(1, 320, 320, 3)
        yield [data.astype(np.float32)]

Calling the model converter

converter = tf.lite.TFLiteConverter.from_saved_model('/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.allow_custom_ops = True
converter.representative_dataset = representative_dataset_gen
tflite_model = converter.convert()

WARNINGS:

The conversion step returns a warning.

WARNING:absl:For model inputs containing unsupported operations which cannot be quantized, the inference_input_type attribute will default to the original type.
WARNING:absl:For model outputs containing unsupported operations which cannot be quantized, the inference_output_type attribute will default to the original type.

This makes me think the conversion is not correct.

Saving the model

output_path = '/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite'
with open(output_path, 'wb') as w:
    w.write(tflite_model)
print("tflite convert complete. - {}".format(output_path))

Tests

Test 1: Get TensorFlow version

I read that using the nightly build is recommended for this. In my case, the version is 2.6.0.

 print(tf.version.VERSION)

Test 2: Get input/output tensor details

interpreter = tf.lite.Interpreter(model_path="/home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite")
interpreter.allocate_tensors()
print(interpreter.get_input_details())
print("@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@")
print(interpreter.get_output_details())

Test 2 Results:

I get the following info:

[{'name': 'serving_default_input:0', 'index': 0, 'shape': array([ 1, 320, 320, 3], dtype=int32), 'shape_signature': array([ 1, 320, 320, 3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.007843137718737125, 127), 'quantization_parameters': {'scales': array([0.00784314], dtype=float32), 'zero_points': array([127], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[{'name': 'StatefulPartitionedCall:31', 'index': 377, 'shape': array([ 1, 10, 4], dtype=int32), 'shape_signature': array([ 1, 10, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:32', 'index': 378, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:33', 'index': 379, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}},
 {'name': 'StatefulPartitionedCall:34', 'index': 380, 'shape': array([1], dtype=int32), 'shape_signature': array([1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

So I think it is not being quantized correctly.
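One way to read the details above at a glance is to look at each tensor's dtype together with its 'quantization' pair. In that output, the input is uint8 with a non-zero scale, while the four outputs are float32 and report (0.0, 0). The helper below is a hypothetical sketch (the name `describe_tensor` is mine) that works on dictionaries shaped like the entries returned by get_input_details() and get_output_details(), trimmed here to the fields it uses.

```python
import numpy as np

def describe_tensor(detail):
    """Summarize dtype and quantization state of one tensor-details entry."""
    scale, zero_point = detail["quantization"]
    dtype = np.dtype(detail["dtype"]).name
    # In the output above, tensors without quantization parameters report a scale of 0.0.
    if scale != 0:
        state = f"quantized (scale={scale:.6f}, zero_point={zero_point})"
    else:
        state = "not quantized"
    return f"{detail['name']}: {dtype}, {state}"

# Entries copied from the printed details, trimmed to the relevant fields.
input_detail = {"name": "serving_default_input:0", "dtype": np.uint8,
                "quantization": (0.007843137718737125, 127)}
output_detail = {"name": "StatefulPartitionedCall:31", "dtype": np.float32,
                 "quantization": (0.0, 0)}

for d in (input_detail, output_detail):
    print(describe_tensor(d))
```

Float outputs of this kind would be consistent with the converter warning that inference_output_type falls back to the original type.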

Converting the generated model to EdgeTPU

!edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite

jose@jose-VirtualBox:~/python-envs$ edgetpu_compiler -s /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Edge TPU Compiler version 15.0.340273435

Model compiled successfully in 1136 ms.

Input model: /home/jose/codeWorkspace-2.4.1/tf_2.4.1/tflite/model_full_integer_quant.tflite
Input size: 3.70MiB
Output model: model_full_integer_quant_edgetpu.tflite
Output size: 4.21MiB
On-chip memory used for caching model parameters: 3.42MiB
On-chip memory remaining for caching model parameters: 4.31MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 162
Operation log: model_full_integer_quant_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 112
Number of operations that will run on CPU: 50

Operator            Count   Status

LOGISTIC            1       Operation is otherwise supported, but not mapped due to some unspecified limitation
DEPTHWISE_CONV_2D   14      More than one subgraph is not supported
DEPTHWISE_CONV_2D   37      Mapped to Edge TPU
QUANTIZE            1       Mapped to Edge TPU
QUANTIZE            4       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONV_2D             58      Mapped to Edge TPU
CONV_2D             14      More than one subgraph is not supported
DEQUANTIZE          1       Operation is working on an unsupported data type
DEQUANTIZE          1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CUSTOM              1       Operation is working on an unsupported data type
ADD                 2       More than one subgraph is not supported
ADD                 10      Mapped to Edge TPU
CONCATENATION       1       Operation is otherwise supported, but not mapped due to some unspecified limitation
CONCATENATION       1       More than one subgraph is not supported
RESHAPE             2       Operation is otherwise supported, but not mapped due to some unspecified limitation
RESHAPE             6       Mapped to Edge TPU
RESHAPE             4       More than one subgraph is not supported
PACK                4       Tensor has unsupported rank (up to 3 innermost dimensions mapped)

The Jupyter notebook I prepared can be found at the following link: https://github.com/jagumiel/Artificial-Intelligence/blob/main/tensorflow-scripts/Step-by-step-explaining-problems.ipynb

Is there any step I am missing? Why is my conversion not working?

Thank you very much in advance.

As @JaesungChung answered, the conversion process itself is done correctly.

My problem was in the application that was running the .tflite model. I quantized my model output to uint8, so I had to rescale the values I obtained to get the right results.

For example, I got 10 objects because I was requesting all detected objects with a score above 0.5. My results were not scaled, so a detected object's score could easily be 104. I had to rescale that number by dividing it by 255.

The same happened when drawing my results, so I had to divide the box values in the same way and then multiply them by the image height and width.
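The rescaling described here can be sketched as follows. Dividing by 255 corresponds to an assumed scale of 1/255 and zero point of 0; in general the two values should be read from the 'quantization' field of each output tensor. The function names, the raw values, and the [ymin, xmin, ymax, xmax] box layout are assumptions made for illustration.

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """real = scale * (q - zero_point); a scale of 0.0 means the tensor is already float."""
    q = np.asarray(q, dtype=np.float32)
    return q if scale == 0 else scale * (q - zero_point)

def boxes_to_pixels(boxes, height, width):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates."""
    factors = np.array([height, width, height, width], dtype=np.float32)
    return np.asarray(boxes, dtype=np.float32) * factors

# Hypothetical raw uint8 outputs, with an assumed scale of 1/255 and zero point of 0.
raw_scores = np.array([230, 104, 12], dtype=np.uint8)
raw_boxes = np.array([[51, 51, 204, 153]], dtype=np.uint8)
scale, zero_point = 1.0 / 255.0, 0

scores = dequantize(raw_scores, scale, zero_point)
keep = scores > 0.5  # 104 / 255 is about 0.41, so it no longer passes the threshold
boxes = boxes_to_pixels(dequantize(raw_boxes, scale, zero_point), height=320, width=320)
print(scores.round(3), keep, boxes.round(1))
```

Comparing raw uint8 scores directly against 0.5 lets almost every detection through, which matches the symptom of always getting 10 objects.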
