
Implementing CTC loss in Keras

Suppose you have a basic model similar to this:

from tensorflow.keras import layers, Model

input_layer = layers.Input(shape=(50, 20))
# The Dense layer has to be applied to the input tensor.
layer = layers.Dense(123, activation='relu')(input_layer)
layer = layers.LSTM(128, return_sequences=True)(layer)
outputs = layers.Dense(20, activation='softmax')(layer)
model = Model(input_layer, outputs)

How would you implement CTC loss? I tried something from the Keras OCR code tutorial, like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Model

class CTCLayer(layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        # Compute the training-time loss value and add it
        # to the layer using `self.add_loss()`.
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")

        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # At test time, just return the computed predictions
        return y_pred

labels = layers.Input(shape=(None,), dtype="float32")
input_layer = layers.Input(shape=(50, 20))
layer = layers.Dense(123, activation='relu')(input_layer)
layer = layers.LSTM(128, return_sequences = True)(layer)
outputs = layers.Dense(20, activation='softmax')(layer)
output = CTCLayer()(labels,outputs)
model = Model(input_layer,outputs)

However, when it came to the model.fit part, it started to fall apart because I did not know how to feed the model the "labels" input layer. I think the approach in the tutorial is quite ambiguous, so what would be a better and more efficient way to implement the CTC loss?

The only thing you are doing wrong is the model creation: model = Model(input_layer, outputs) should be model = Model([input_layer, labels], output). That said, if you don't want to have two inputs, you can also compile the model with tf.nn.ctc_loss as the loss.
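To make the two-input fix concrete, here is a minimal sketch of how the labels can be passed to model.fit as a second input array. It rests on several assumptions that are not in the question: the data is random dummy data, all label sequences have the same length (no padding), and the last class index is the CTC blank. It also swaps keras.backend.ctc_batch_cost for tf.nn.ctc_loss, which expects raw logits, so the final Dense layer has no softmax here. The names features, labels and ctc_loss are only illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 20  # assumption: 19 symbols + 1 blank (last index)


class CTCLayer(layers.Layer):
    """Adds the CTC loss via add_loss() and passes the predictions through."""

    def call(self, y_true, y_pred):
        batch = tf.shape(y_pred)[0]
        # Assumption: every sample uses all time steps and all label positions.
        logit_length = tf.fill([batch], tf.shape(y_pred)[1])
        label_length = tf.fill([batch], tf.shape(y_true)[1])
        loss = tf.nn.ctc_loss(
            labels=tf.cast(y_true, tf.int32),
            logits=y_pred,
            label_length=label_length,
            logit_length=logit_length,
            logits_time_major=False,
            blank_index=-1,  # use the last class as the blank
        )
        self.add_loss(tf.reduce_mean(loss))
        return y_pred


labels = layers.Input(shape=(None,), dtype="float32", name="labels")
input_layer = layers.Input(shape=(50, 20), name="features")
x = layers.Dense(123, activation="relu")(input_layer)
x = layers.LSTM(128, return_sequences=True)(x)
logits = layers.Dense(NUM_CLASSES)(x)  # no softmax: tf.nn.ctc_loss takes logits
output = CTCLayer(name="ctc_loss")(labels, logits)

# Both Input layers are model inputs; the loss comes from the CTC layer,
# so compile() gets no loss argument. XLA is switched off only to keep
# the sketch portable.
model = Model([input_layer, labels], output)
model.compile(optimizer="adam", jit_compile=False)

# Dummy data: 8 sequences of 50 frames, 5 labels each in [0, NUM_CLASSES - 2].
X = np.random.rand(8, 50, 20).astype("float32")
Y = np.random.randint(0, NUM_CLASSES - 1, size=(8, 5)).astype("float32")

# The labels go in as the second input; there is no separate y argument.
history = model.fit([X, Y], epochs=1, batch_size=4, verbose=0)
```

For inference you would typically build a second model such as Model(input_layer, logits) that shares the trained layers, so no labels are needed at prediction time. The single-input alternative mentioned above is the snippet that follows.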

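def my_loss_fn(y_true, y_pred):
    # y_true_length and y_pred_length are placeholders here; they have to be
    # computed per batch, as the CTC layer does for label_length and input_length.
    loss_value = tf.nn.ctc_loss(y_true, y_pred, y_true_length, y_pred_length,
                                logits_time_major=False)
    return tf.reduce_mean(loss_value, axis=-1)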

model.compile(optimizer='adam', loss=my_loss_fn)

Something like this. Note that the code above is not tested, and you still need to compute the y_pred and y_true lengths, which you can do the same way as in the CTC layer.
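One way to fill in those lengths is sketched below. It is only an illustration under stated assumptions: labels are padded with a hypothetical value PAD = -1, every time step of the input counts as valid, the last class index is the blank, and the model outputs raw logits (no softmax), since tf.nn.ctc_loss applies its own normalization. The name ctc_loss_fn and the dummy data are made up for the example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 20  # assumption: 19 symbols + 1 blank (last index)
PAD = -1          # hypothetical padding value for short label sequences


def ctc_loss_fn(y_true, y_pred):
    """Per-sample CTC loss; y_pred holds logits of shape (batch, time, classes)."""
    y_true = tf.cast(y_true, tf.int32)
    batch = tf.shape(y_pred)[0]
    # Assumption: all time steps of every sample are valid.
    logit_length = tf.fill([batch], tf.shape(y_pred)[1])
    # Label length = number of non-padding entries in each row.
    is_label = tf.not_equal(y_true, PAD)
    label_length = tf.reduce_sum(tf.cast(is_label, tf.int32), axis=-1)
    # Replace padding by a valid index; positions past label_length
    # should not contribute to the loss.
    labels = tf.where(is_label, y_true, tf.zeros_like(y_true))
    return tf.nn.ctc_loss(
        labels=labels,
        logits=y_pred,
        label_length=label_length,
        logit_length=logit_length,
        logits_time_major=False,
        blank_index=-1,  # use the last class as the blank
    )


inputs = layers.Input(shape=(50, 20))
x = layers.Dense(123, activation="relu")(inputs)
x = layers.LSTM(128, return_sequences=True)(x)
logits = layers.Dense(NUM_CLASSES)(x)  # logits, not softmax probabilities
model = Model(inputs, logits)
# XLA is switched off only to keep the sketch portable.
model.compile(optimizer="adam", loss=ctc_loss_fn, jit_compile=False)

# Dummy data: labels of true length 4, padded with PAD to length 6.
X = np.random.rand(8, 50, 20).astype("float32")
Y = np.full((8, 6), PAD, dtype="int32")
Y[:, :4] = np.random.randint(0, NUM_CLASSES - 1, size=(8, 4))

history = model.fit(X, Y, epochs=1, batch_size=4, verbose=0)
```

Returning the per-sample loss vector and letting Keras average it over the batch is a design choice of this sketch; taking tf.reduce_mean inside the function, as in the snippet above, should work as well.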
