
TFLite Quantized Model still outputs floats

I have a CNN that already works, but now it is necessary to deploy it on some specific hardware. For that, I've been told to quantize the model, since the hardware can only use integer operations.

I read a good solution here: How to make sure that TFLite Interpreter is only using int8 operations?

And I wrote this code to make it work:

import tensorflow as tf

model_file = "models/my_cnn.h5"

# load the trained Keras model
model = tf.keras.models.load_model(model_file, custom_objects={'tf': tf}, compile=False)

# convert to a quantized TFLite model
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint16  # or tf.uint8
converter.inference_output_type = tf.uint16  # or tf.uint8
qmodel = converter.convert()
with open('thales.tflite', 'wb') as f:
    f.write(qmodel)

interpreter = tf.lite.Interpreter(model_content=qmodel)
interpreter.allocate_tensors()

# inspect input/output tensor details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)

# predict
image = read_image("test.png")

interpreter.set_tensor(input_details[0]['index'], image)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
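(The snippet relies on a representative_dataset generator that is not shown above. A minimal sketch of what it might look like, assuming the model's 1x160x160x3 float32 input and using random placeholder data that should really be replaced by a few hundred real, preprocessed training images:

import numpy as np

def representative_dataset():
    # Placeholder calibration data; in practice yield real input samples
    # with the same shape and preprocessing as at inference time.
    for _ in range(100):
        yield [np.random.rand(1, 160, 160, 3).astype(np.float32)]
)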

When we look at the printed output we can see, first, the details:

input_details

[{'name': 'input_1', 'index': 87, 'shape': array([  1, 160, 160,   3], dtype=int32), 'shape_signature': array([  1, 160, 160,   3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

output_details

[{'name': 'Identity', 'index': 88, 'shape': array([  1, 160, 160,   1], dtype=int32), 'shape_signature': array([  1, 160, 160,   1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

And the output of the quantized model is:

...
[[0.        ]
[0.        ]
[0.        ]
...
[0.00390625]
[0.00390625]
[0.00390625]]

[[0.        ]
[0.        ]
[0.        ]
...
[0.00390625]
[0.00390625]
[0.00390625]]]]

So, I have several problems here:

  1. In the input/output details we can see that the input/output layers are int32, but I specified uint16 in the code.

  2. Also in the input/output details we can see that "float32" appears several times as the dtype, and I don't understand why.

  3. Finally, the biggest problem is that the output contains float numbers, which should not happen. So it looks like the model is not really converted to integers.

How can I really quantize my CNN, and why is this code not working?

converter.inference_input_type and converter.inference_output_type support only tf.int8 or tf.uint8, not tf.uint16.
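A minimal corrected sketch of the conversion and inference, assuming a TensorFlow version where uint8 I/O is supported for the Keras converter (2.3 or later) and reusing the representative_dataset and read_image helpers from the question: with integer input/output you also have to quantize the input and dequantize the output by hand using the (scale, zero_point) pair reported in the tensor details.

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("models/my_cnn.h5", custom_objects={'tf': tf}, compile=False)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # tf.int8 also works
converter.inference_output_type = tf.uint8
qmodel = converter.convert()

interpreter = tf.lite.Interpreter(model_content=qmodel)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# With full-integer I/O the dtype should now be numpy.uint8 and
# 'quantization' should hold a non-trivial (scale, zero_point) pair.
print(input_details[0]['dtype'], input_details[0]['quantization'])

# Quantize the float image before feeding it to the interpreter.
in_scale, in_zero = input_details[0]['quantization']
image = read_image("test.png")  # float32, shape (1, 160, 160, 3)
q_image = np.clip(np.round(image / in_scale + in_zero), 0, 255).astype(np.uint8)
interpreter.set_tensor(input_details[0]['index'], q_image)
interpreter.invoke()

# The raw output is uint8; dequantize it if float values are needed downstream.
out_scale, out_zero = output_details[0]['quantization']
q_out = interpreter.get_tensor(output_details[0]['index'])
output_data = (q_out.astype(np.float32) - out_zero) * out_scale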
