
How is the categorical_crossentropy implemented in Keras?

I'm trying to apply the concept of distillation: basically, training a new, smaller network to do the same job as the original one, but with less computation.

I have the softmax outputs for every sample instead of the logits.

My question is: how is the categorical cross-entropy loss function implemented? Does it take the maximum value of the original labels and multiply it by the predicted value at the same index, or does it sum over all of the logits (one-hot encoding), as the formula says:

loss = -Σᵢ targetᵢ · log(outputᵢ)   (summed over all classes i)
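To make the two readings concrete, here is that formula written out in plain NumPy (just the equation above, not the Keras code):

import numpy as np

def categorical_crossentropy_np(target, output):
    # -sum_i target_i * log(output_i), summed over the class axis.
    # Note: with a one-hot target, the sum collapses to -log of the
    # probability predicted for the true class.
    return -np.sum(target * np.log(output), axis=-1)

target = np.array([[0., 1., 0.]])     # one-hot label for class 1
output = np.array([[0.1, 0.7, 0.2]])  # softmax output, rows sum to 1
print(categorical_crossentropy_np(target, output))  # ≈ [0.3567] == -log(0.7)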

I see that you used the tensorflow tag, so I guess this is the backend you are using?

def categorical_crossentropy(output, target, from_logits=False):
    """Categorical crossentropy between an output tensor and a target tensor.

    # Arguments
        output: A tensor resulting from a softmax
            (unless `from_logits` is True, in which
            case `output` is expected to be the logits).
        target: A tensor of the same shape as `output`.
        from_logits: Boolean, whether `output` is the
            result of a softmax, or is a tensor of logits.

    # Returns
        Output tensor.
    """

This code comes from the Keras source code. Looking directly at the code should answer all your questions :) If you need more info, just ask!
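By the way, in current tf.keras the same computation is exposed as tf.keras.losses.categorical_crossentropy; here is a quick sketch of the from_logits switch (assuming TF 2.x, where the target argument comes first):

import numpy as np
import tensorflow as tf

target = np.array([[0., 1., 0.]], dtype=np.float32)
logits = np.array([[1.0, 3.0, 0.5]], dtype=np.float32)

# passing raw logits...
loss_logits = tf.keras.losses.categorical_crossentropy(
    target, logits, from_logits=True)

# ...or softmax probabilities; both give the same loss
probs = tf.nn.softmax(logits)
loss_probs = tf.keras.losses.categorical_crossentropy(target, probs)

print(loss_logits.numpy(), loss_probs.numpy())  # ≈ [0.197] [0.197]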

EDIT:

Here is the code that interests you:

# Note: tf.nn.softmax_cross_entropy_with_logits
# expects logits, Keras expects probabilities.
if not from_logits:
    # scale preds so that the class probas of each sample sum to 1
    output /= tf.reduce_sum(output,
                            reduction_indices=len(output.get_shape()) - 1,
                            keep_dims=True)
    # manual computation of crossentropy
    epsilon = _to_tensor(_EPSILON, output.dtype.base_dtype)
    output = tf.clip_by_value(output, epsilon, 1. - epsilon)
    return - tf.reduce_sum(target * tf.log(output),
                           reduction_indices=len(output.get_shape()) - 1)

If you look at the return statement, you can see that it sums over the last axis (the classes), exactly the summation in your formula... :)
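To see it with numbers, a minimal NumPy sketch mirroring the source above (using soft targets, as you would have in distillation; 1e-7 stands in for _EPSILON):

import numpy as np

# soft targets (as in distillation) and predictions, batch of 2
target = np.array([[0.7, 0.2, 0.1],
                   [0.0, 1.0, 0.0]])
output = np.array([[0.5, 0.3, 0.2],
                   [0.1, 0.8, 0.1]])

eps = 1e-7  # stand-in for _EPSILON
output = np.clip(output, eps, 1. - eps)

# one loss value per sample: the sum runs over the class axis,
# covering every class, not just the argmax
loss = -np.sum(target * np.log(output), axis=-1)
print(loss)  # ≈ [0.887 0.223]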

As an answer to "Do you happen to know what the epsilon and tf.clip_by_value is doing?":
it is ensuring that output != 0, because tf.log(0) returns -inf, which would turn the loss into inf/NaN.
(I don't have enough points to comment, but thought I'd contribute.)
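A quick NumPy sketch of what goes wrong without the clipping (again with 1e-7 standing in for _EPSILON):

import numpy as np

target = np.array([0., 1., 0.])
output = np.array([0.6, 0.0, 0.4])  # probability 0 for the true class

# without clipping: log(0) = -inf for the true class, so the loss blows up
print(-np.sum(target * np.log(output)))  # inf (plus a runtime warning)

# with clipping, the loss stays finite
eps = 1e-7
clipped = np.clip(output, eps, 1. - eps)
print(-np.sum(target * np.log(clipped)))  # ≈ 16.12, i.e. -log(1e-7)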

