
How to merge ReLU after quantization aware training

I have a network which contains Conv2D layers followed by ReLU activations, declared as such:

x = layers.Conv2D(self.hparams['channels_count'], kernel_size=(4,1))(x)
x = layers.ReLU()(x)

And it is ported to TFLite with the following representation:

[Image: Basic TFLite network without Q-aware training]

However, after performing quantization-aware training on the network and porting it again, the ReLU layers are now explicit in the graph:

[Image: TFLite network after Q-aware training]

This results in them being processed separately on the target instead of during the evaluation of the Conv2D kernel, inducing a 10% performance loss in my overall network.

Declaring the activation with the following implicit syntax does not produce the problem:

x = layers.Conv2D(self.hparams['channels_count'], kernel_size=(4,1), activation='relu')(x)

[Image: Basic TFLite network with implicit ReLU activation]

[Image: TFLite network with implicit ReLU after Q-aware training]

However, this restricts the network to basic ReLU activation, whereas I would like to use ReLU6, which cannot be declared in this way.

Is this a TFLite issue? If not, is there a way to prevent the ReLU layer from being split? Or alternatively, is there a way to manually merge the ReLU layers back into the Conv2D layers after the quantization-aware training?

Edit: QA training code:

def learn_qaware(self):
    # Wrap the whole model with fake-quantization nodes
    # (tfmot = tensorflow_model_optimization)
    quantize_model = tfmot.quantization.keras.quantize_model
    self.model = quantize_model(self.model)

    training_generator = SCDataGenerator(self.training_set)
    validate_generator = SCDataGenerator(self.validate_set)

    self.model.compile(
        optimizer=self.configure_optimizers(qa_learn=True),
        loss=self.get_LLP_loss(),
        metrics=self.get_metrics(),
        run_eagerly=config['eager_mode'],
    )
    self.model.fit(
        training_generator,
        epochs=self.hparams['max_epochs'],
        batch_size=1,
        shuffle=self.hparams['shuffle_curves'],
        validation_data=validate_generator,
        callbacks=self.get_callbacks(qa_learn=True),
    )
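
As a side note, the doubled-up structure can be seen on the wrapped model itself before conversion. A minimal sketch (model and shapes illustrative, not my actual network) that prints one quantize wrapper per layer, with the standalone ReLU wrapped separately from the Conv2D:

import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_model_optimization as tfmot

inputs = tf.keras.Input(shape=(16, 1, 4))
x = layers.Conv2D(8, kernel_size=(4, 1))(inputs)
x = layers.ReLU()(x)
model = tf.keras.Model(inputs, x)

q_model = tfmot.quantization.keras.quantize_model(model)
for layer in q_model.layers:
    # A standalone ReLU gets its own wrapper (e.g. "quant_re_lu"),
    # separate from "quant_conv2d", so it stays an explicit op in TFLite.
    print(layer.name, type(layer).__name__)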

Quantized TFLite model generation code:

def tflite_convert(classifier):
    output_file = get_tflite_filename(classifier.model_path)

    # Convert the model to the TensorFlow Lite format without quantization
    saved_shape = classifier.model.input.shape.as_list()
    fixed_shape = saved_shape.copy()  # copy, so that restoring saved_shape below actually works
    fixed_shape[0] = 1
    classifier.model.input.set_shape(fixed_shape)  # Force batch size to 1 for generation
    converter = tf.lite.TFLiteConverter.from_keras_model(classifier.model)
    classifier.model.input.set_shape(saved_shape)

    # Set the optimization flag.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Enforce integer-only quantization
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    if config['eager_mode']:
        tf.executing_eagerly()  # note: this only queries eager mode; it does not enable it

    # Provide a representative dataset to ensure we quantize correctly.
    def representative_dataset():
        for x in classifier.validate_set.get_all_inputs():
            rs = x.reshape(1, x.shape[0], 1, 1).astype(np.float32)
            yield [rs]

    converter.representative_dataset = representative_dataset
    model_tflite = converter.convert()

    # Save the model to disk
    with open(output_file, "wb") as f:
        f.write(model_tflite)

    return TFLite_model(output_file)
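
To sanity-check the converted file, the TFLite interpreter can be run directly; a minimal sketch (the file name and zero-valued input are illustrative):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.zeros(inp['shape'], dtype=np.int8)  # int8 I/O as enforced by the converter flags
interpreter.set_tensor(inp['index'], x)
interpreter.invoke()
print(interpreter.get_tensor(out['index']))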

You can pass activation=tf.nn.relu6 to use ReLU6 activation.
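
For example, mirroring the Conv2D declaration from the question (the activation argument accepts any callable, not just string names):

x = layers.Conv2D(self.hparams['channels_count'], kernel_size=(4,1), activation=tf.nn.relu6)(x)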

I have found a workaround which works by instantiating a non-trained version of the model, then copying over the weights from the quantization-aware trained model before converting to TFLite.

This seems like quite a hack, so I'm still on the lookout for a cleaner solution.

Code for the workaround:

def dequantize(self):
    # Instantiate a fresh float (non-quantized) copy of the model if needed
    if not hasattr(self, 'fp_model') or not self.fp_model:
        self.fp_model = self.get_default_model()

    def find_layer_in_model(name, model):
        for layer in model.layers:
            if layer.name == name:
                return layer
        return None

    def find_weight_group_in_layer(name, layer):
        # bug fix: search the layer passed in, not the enclosing quant_layer
        for weight_group in layer.trainable_weights:
            if weight_group.name == name:
                return weight_group
        return None

    for layer in self.fp_model.layers:
        if 'input' in layer.name or 'quantize_layer' in layer.name:
            continue

        QUANT_TAG = "quant_"
        quant_layer = find_layer_in_model(QUANT_TAG + layer.name, self.model)
        if quant_layer is None:
            raise RuntimeError('Failed to match layer ' + layer.name)

        for i, weight_group in enumerate(layer.trainable_weights):
            quant_weight_group = find_weight_group_in_layer(QUANT_TAG + weight_group.name, quant_layer)
            if quant_weight_group is None:
                quant_weight_group = find_weight_group_in_layer(weight_group.name, quant_layer)
                if quant_weight_group is None:
                    raise RuntimeError('Failed to match weight group ' + weight_group.name)

            # Copy the QAT-trained values into the float model's weights
            layer.trainable_weights[i].assign(quant_weight_group)

    self.model = self.fp_model
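
With this in place, the overall flow looks something like the following (sketch; names are the ones used in the code above):

classifier.learn_qaware()                  # quantization-aware training
classifier.dequantize()                    # copy QAT weights back into the float model
tflite_model = tflite_convert(classifier)  # convert with int8 quantization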
