
Operation types in the full-integer quantization method in TensorFlow Lite

I want to apply post-training quantization (full integer) with the TensorFlow Model Optimization package to a pre-trained model (LeNet-5): https://www.tensorflow.org/model_optimization/guide/quantization/post_training

    import tensorflow as tf
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

    # LeNet-5 for 28x28 grayscale MNIST images
    model = Sequential(name='LeNet5')
    model.add(tf.keras.layers.InputLayer(input_shape=(28, 28)))
    model.add(tf.keras.layers.Reshape(target_shape=(28, 28, 1)))
    model.add(Conv2D(6, kernel_size=(5, 5), strides=(1, 1),
                     activation='tanh', padding='same'))
    model.add(AveragePooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    model.add(Conv2D(16, kernel_size=(5, 5), strides=(1, 1),
                     activation='tanh', padding='valid'))
    model.add(AveragePooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))
    model.add(Flatten())
    model.add(Dense(120, activation='tanh'))
    model.add(Dense(84, activation='tanh'))
    model.add(Dense(10, activation='softmax'))
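
For completeness, a minimal training sketch so the converter quantizes trained weights (the optimizer, loss, and epoch count here are illustrative, not from the original post):

    # Train LeNet-5 on MNIST before conversion.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))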

and I have applied full-integer post-training quantization using this code:

    mnist_train, _ = tf.keras.datasets.mnist.load_data()
    images = tf.cast(mnist_train[0], tf.float32) / 255.0
    mnist_ds = tf.data.Dataset.from_tensor_slices(images).batch(1)

    def representative_data_gen():
        for input_value in mnist_ds.take(100):
            yield [input_value]

    # Create the converter from the Keras model defined above.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
    converter.representative_dataset = representative_data_gen

    converter.allow_custom_ops = True
    converter.target_spec.supported_types = [tf.int8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    full_integer_quantization_model = converter.convert()
    open("tflite_model.tflite", "wb").write(full_integer_quantization_model)

It works fine in terms of accuracy, but when I print the data type of each operation (conv, activation, bias), I see that some operations are in int32 instead of int8.
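
One way to inspect the per-tensor types (a sketch; the post does not show the exact inspection code) is via the interpreter's tensor details:

    # List the dtype of every tensor in the converted graph; weights and
    # activations appear as int8, while bias tensors appear as int32.
    interpreter = tf.lite.Interpreter(model_content=full_integer_quantization_model)
    interpreter.allocate_tensors()

    for detail in interpreter.get_tensor_details():
        print(detail['name'], detail['dtype'])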

I don't understand why this happens.

How does TFLite decide to run some ops in int32 and others in int8?

Is it possible to control this in TFLite (is there an option for it) and perform all operations in int8?

Have you taken a look at the quantization spec, https://www.tensorflow.org/lite/performance/quantization_spec ?

Bias values have 32-bit width. Per the spec, biases are quantized to int32 rather than int8 because they are added to the int32 accumulator of the conv / fully-connected multiply-accumulate; their scale is fixed to input_scale * weight_scale and their zero-point to 0.
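
This can be verified on the converted model (a sketch; `full_integer_quantization_model` is the buffer produced by the question's code):

    import numpy as np

    interpreter = tf.lite.Interpreter(model_content=full_integer_quantization_model)
    interpreter.allocate_tensors()

    # Print the quantization parameters of the int32 (bias) tensors;
    # each zero-point is 0 and each scale is input_scale * weight_scale.
    for detail in interpreter.get_tensor_details():
        params = detail['quantization_parameters']
        if detail['dtype'] == np.int32 and params['scales'].size > 0:
            print(detail['name'], params['scales'], params['zero_points'])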
