
Tensorflow Callback as Custom Metric for CTC

To report more metrics while training my model (written in TensorFlow 2.1.0), such as the Character Error Rate (CER) and Word Error Rate (WER), I created a callback that I pass to the model's fit function. It computes the CER and WER at the end of each epoch.

This is my second choice: I wanted to create a custom metric for this, but you can only use Keras backend functionality for custom metrics. Does anyone have advice on how to convert the callback below into a custom metric (which can then be calculated during training on the validation and/or training data)?

Some roadblocks I encountered are:

  • Failure to convert the K.ctc_decode result to a sparse tensor
  • How can you calculate a distance like the edit distance using the Keras backend?
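
import editdistance
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

class Metrics(tf.keras.callbacks.Callback):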
    def __init__(self, valid_data, steps):
        """
        valid_data is a TFRecordDataset with batches of 100 elements per batch, shuffled and repeated infinitely.
        steps defines the number of batches per epoch
        """
        super(Metrics, self).__init__()
        self.valid_data = valid_data
        self.steps = steps

    def on_train_begin(self, logs={}):
        self.cer = []
        self.wer = []
        
    def on_epoch_end(self, epoch, logs={}):

        imgs = []
        labels = []
        for idx, (img, label) in enumerate(self.valid_data.as_numpy_iterator()):
            if idx >= self.steps:
                break
            imgs.append(img)
            labels.extend(label)

        imgs = np.array(imgs)
        labels = np.array(labels)

        out = self.model.predict((batch for batch in imgs))        
        input_length = len(max(out, key=len))

        out = np.asarray(out)
        out_len = np.asarray([input_length for _ in range(len(out))])

        decode, log = K.ctc_decode(out,
                                    out_len,
                                    greedy=True)

        decode = [[[int(p) for p in x if p != -1] for x in y] for y in decode][0]

        for (pred, lab) in zip(decode, labels):
        
            dist = editdistance.eval(pred, lab)
            self.cer.append(dist / (max(len(pred), len(lab))))
            self.wer.append(not np.array_equal(pred, lab))

        
        print("Mean CER: {}".format(np.mean([self.cer], axis=1)[0]))
        print("Mean WER: {}".format(np.mean([self.wer], axis=1)[0]))

Solved in TF 2.3.1, but this should apply to earlier 2.x versions as well.

Some remarks:

  • Information on how to properly implement a TensorFlow custom metric is scarce. The question uses a callback to implement the metric. I believe this makes epochs longer, because the metric is computed as an explicit extra step in on_epoch_end. Implementing it as a subclass of tensorflow.keras.metrics.Metric seems the right way, and yields results while the epoch is ongoing (if verbose is set accordingly).
  • The edit distance for the CER is easily calculated with tf.edit_distance (which operates on sparse tensors). The result can then be used to calculate the WER with some TF logic.
  • Alas, I have yet to find out how to implement both the CER and WER in one metric (the two classes share a lot of duplicate code). If anyone knows how to do so, please contact me.
  • Custom metrics can simply be added when compiling your TF model: self.model.compile(optimizer=opt, loss=loss, metrics=[CERMetric(), WERMetric()])

import tensorflow as tf
from tensorflow.keras import backend as K

class CERMetric(tf.keras.metrics.Metric):
    """
    A custom Keras metric to compute the Character Error Rate
    """
    def __init__(self, name='CER_metric', **kwargs):
        super(CERMetric, self).__init__(name=name, **kwargs)
        self.cer_accumulator = self.add_weight(name="total_cer", initializer="zeros")
        self.counter = self.add_weight(name="cer_count", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        input_shape = K.shape(y_pred)
        input_length = tf.ones(shape=input_shape[0]) * K.cast(input_shape[1], 'float32')

        decode, log = K.ctc_decode(y_pred,
                                    input_length,
                                    greedy=True)

        decode = K.ctc_label_dense_to_sparse(decode[0], K.cast(input_length, 'int32'))
        y_true_sparse = K.ctc_label_dense_to_sparse(y_true, K.cast(input_length, 'int32'))

        decode = tf.sparse.retain(decode, tf.not_equal(decode.values, -1))
        distance = tf.edit_distance(decode, y_true_sparse, normalize=True)

        self.cer_accumulator.assign_add(tf.reduce_sum(distance))
        # Cast the dynamic batch size so it matches the float32 counter.
        self.counter.assign_add(tf.cast(tf.shape(y_true)[0], tf.float32))

    def result(self):
        return tf.math.divide_no_nan(self.cer_accumulator, self.counter)

    def reset_states(self):
        self.cer_accumulator.assign(0.0)
        self.counter.assign(0.0)

class WERMetric(tf.keras.metrics.Metric):
    """
    A custom Keras metric to compute the Word Error Rate
    """
    def __init__(self, name='WER_metric', **kwargs):
        super(WERMetric, self).__init__(name=name, **kwargs)
        self.wer_accumulator = self.add_weight(name="total_wer", initializer="zeros")
        self.counter = self.add_weight(name="wer_count", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        input_shape = K.shape(y_pred)
        input_length = tf.ones(shape=input_shape[0]) * K.cast(input_shape[1], 'float32')

        decode, log = K.ctc_decode(y_pred,
                                    input_length,
                                    greedy=True)

        decode = K.ctc_label_dense_to_sparse(decode[0], K.cast(input_length, 'int32'))
        y_true_sparse = K.ctc_label_dense_to_sparse(y_true, K.cast(input_length, 'int32'))

        decode = tf.sparse.retain(decode, tf.not_equal(decode.values, -1))
        distance = tf.edit_distance(decode, y_true_sparse, normalize=True)
        
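        # Samples with a non-zero distance contain at least one error.
        incorrect_words_amount = tf.reduce_sum(tf.cast(tf.not_equal(distance, 0), tf.float32))

        self.wer_accumulator.assign_add(incorrect_words_amount)
        # Cast the dynamic batch size so it matches the float32 counter.
        self.counter.assign_add(tf.cast(tf.shape(y_true)[0], tf.float32))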

    def result(self):
        return tf.math.divide_no_nan(self.wer_accumulator, self.counter)

    def reset_states(self):
        self.wer_accumulator.assign(0.0)
        self.counter.assign(0.0)
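
The core step shared by both metrics above can be tried in isolation. The following is a minimal sketch, not part of the original answer: the toy tensors, the to_sparse helper, and the use of -1 as padding for the labels are assumptions made purely for illustration. It shows how tf.edit_distance operates on sparse tensors and how both rates follow from the per-sample distances.

```python
import tensorflow as tf

# Hypothetical decoded predictions, padded with -1 the way ctc_decode pads its output.
decoded = tf.constant([[1, 2, 3], [4, 5, -1]], dtype=tf.int64)
# Hypothetical ground-truth labels; -1 padding is an assumption for this example only.
labels = tf.constant([[1, 2, 4], [4, 5, -1]], dtype=tf.int64)


def to_sparse(dense: tf.Tensor, padding: int = -1) -> tf.SparseTensor:
    """Keep every entry that is not padding and return it as a SparseTensor."""
    indices = tf.where(tf.not_equal(dense, padding))
    values = tf.gather_nd(dense, indices)
    return tf.SparseTensor(indices, values, tf.shape(dense, out_type=tf.int64))


# One normalized distance per sample: number of edits divided by the truth length.
distance = tf.edit_distance(to_sparse(decoded), to_sparse(labels), normalize=True)

cer = tf.reduce_mean(distance)
# A sample counts as wrong as soon as its distance is non-zero.
wer = tf.reduce_mean(tf.cast(tf.not_equal(distance, 0), tf.float32))

print(distance.numpy(), cer.numpy(), wer.numpy())
```

Here the first sample needs one substitution over three labels and the second matches exactly, so the per-sample distances are 1/3 and 0.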


Hey, this solution really helped me a lot. TensorFlow 2.10 has since been released, so I wrote a combined WER and CER metric for that version. Here is the final working code:

import tensorflow as tf

class CWERMetric(tf.keras.metrics.Metric):
    """ A custom TensorFlow metric to compute the Character Error Rate and the Word Error Rate
    """
    def __init__(self, name='CWER', **kwargs):
        super(CWERMetric, self).__init__(name=name, **kwargs)
        self.cer_accumulator = tf.Variable(0.0, name="cer_accumulator", dtype=tf.float32)
        self.wer_accumulator = tf.Variable(0.0, name="wer_accumulator", dtype=tf.float32)
        self.counter = tf.Variable(0, name="counter", dtype=tf.int32)

    def update_state(self, y_true, y_pred, sample_weight=None):
        input_shape = tf.keras.backend.shape(y_pred)

        input_length = tf.ones(shape=input_shape[0], dtype='int32') * tf.cast(input_shape[1], 'int32')

        decode, log = tf.keras.backend.ctc_decode(y_pred, input_length, greedy=True)

        decode = tf.keras.backend.ctc_label_dense_to_sparse(decode[0], input_length)
        y_true_sparse = tf.cast(tf.keras.backend.ctc_label_dense_to_sparse(y_true, input_length), "int64")

        decode = tf.sparse.retain(decode, tf.not_equal(decode.values, -1))
        distance = tf.edit_distance(decode, y_true_sparse, normalize=True)

        # Samples with a non-zero distance contain at least one error.
        incorrect_words_amount = tf.reduce_sum(tf.cast(tf.not_equal(distance, 0), tf.float32))

        self.wer_accumulator.assign_add(incorrect_words_amount)
        self.cer_accumulator.assign_add(tf.reduce_sum(distance))
        self.counter.assign_add(len(y_true))

    def result(self):
        return {
                "CER": tf.math.divide_no_nan(self.cer_accumulator, tf.cast(self.counter, tf.float32)),
                "WER": tf.math.divide_no_nan(self.wer_accumulator, tf.cast(self.counter, tf.float32))
        }

I still need to verify that it calculates CER and WER correctly. If I find that something is missing, I'll update this.
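
One way to do such a check is to compare the metric against a plain-Python reference on a handful of samples. The sketch below is an assumption-laden illustration, not part of the original answer: the helper names levenshtein and reference_cer_wer are hypothetical, targets are assumed to be non-empty and already stripped of padding, and the "WER" follows the definition used by the metric above (the fraction of samples with a non-zero distance).

```python
from typing import List, Sequence, Tuple


def levenshtein(a: Sequence[int], b: Sequence[int]) -> int:
    """Classic dynamic-programming edit distance between two label sequences."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def reference_cer_wer(
    predictions: List[Sequence[int]], targets: List[Sequence[int]]
) -> Tuple[float, float]:
    """Return (mean normalized edit distance, fraction of imperfect samples).

    Each distance is divided by the target length, which is how
    tf.edit_distance normalizes when normalize=True.
    Assumes every target is non-empty.
    """
    distances = [levenshtein(p, t) / len(t) for p, t in zip(predictions, targets)]
    cer = sum(distances) / len(distances)
    wer = sum(d != 0 for d in distances) / len(distances)
    return cer, wer


# Toy data: one substitution, one exact match, one extra label.
preds = [[1, 2, 3], [4, 5], [7, 8, 9, 9]]
truth = [[1, 2, 4], [4, 5], [7, 8, 9]]
cer, wer = reference_cer_wer(preds, truth)
print(f"CER={cer:.4f} WER={wer:.4f}")
```

Feeding the same decoded predictions and labels through the metric and through this reference should give matching values. A mismatch would point at the padding or length handling inside update_state.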

 