Why are the cross-entropy losses of MNIST classification using TensorFlow's Estimator high-level API and the raw API different in scale?
I am reading some TensorFlow example code, and I find that the loss in the CNN-using-Estimator-API example and the loss in the raw CNN example are really different in scale, even though both use the same loss function:
The former is loss_op = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits_train, labels=tf.cast(labels, dtype=tf.int32))), which uses integer (not one-hot) labels.
The latter is loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y)), which uses one-hot vector labels.
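To check that the two loss ops themselves are equivalent, here is a minimal sketch (assuming TensorFlow 1.x, as in the question; the tensor shapes and values are made up for illustration). Given the same logits, the sparse version fed integer class ids and the dense version fed the corresponding one-hot vectors compute the same value:

# Minimal sketch (TensorFlow 1.x assumed): the two loss ops agree on the
# same logits, so the scale difference must come from elsewhere.
import numpy as np
import tensorflow as tf

logits = tf.constant(np.random.randn(4, 10), dtype=tf.float32)  # batch of 4, 10 classes
int_labels = tf.constant([3, 1, 7, 0], dtype=tf.int32)          # integer class ids
onehot_labels = tf.one_hot(int_labels, depth=10)                # same labels, one-hot

sparse_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=int_labels))
dense_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=onehot_labels))

with tf.Session() as sess:
    a, b = sess.run([sparse_loss, dense_loss])
    print(a, b)  # identical up to float rounding

So the scale difference between the two examples cannot come from the choice of loss function itself.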
Why is the former loss roughly in the range 0 ~ 2.39026, while the latter loss is much bigger?
Update: I know now. It is because the variable initializers are different: the default for tf.layers.* is not tf.random_normal(). As for the larger loss, it is because of the internal handling of log(0) in softmax_cross_entropy_with_logits; I think the lower loss is more accurate, since log(1e-5) = -11.
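For illustration, here is a hedged sketch (again assuming TensorFlow 1.x; the shapes and the stand-in weight variables are my own, not from the example code) of how the initializer alone changes the initial loss scale: tf.random_normal weights (stddev 1.0 by default) produce large logits, so the softmax puts near-zero probability on the true class and the cross entropy -log(p) blows up, while the Glorot uniform default of tf.layers.* keeps the softmax near uniform, so the initial loss sits near ln(10) ≈ 2.3 for 10 classes:

# Sketch (TensorFlow 1.x assumed): same data and loss, different initializers.
import numpy as np
import tensorflow as tf

x = tf.constant(np.random.randn(32, 784), dtype=tf.float32)  # fake MNIST batch
labels = tf.constant(np.random.randint(0, 10, size=32), dtype=tf.int32)

# Raw-API style: weights drawn from tf.random_normal (stddev defaults to 1.0).
w_big = tf.Variable(tf.random_normal([784, 10]))
# tf.layers style: Glorot/Xavier uniform is the default kernel initializer.
w_small = tf.Variable(tf.glorot_uniform_initializer()([784, 10]))

def mean_xent(w):
    logits = tf.matmul(x, w)
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(mean_xent(w_big)))    # typically tens: large logits
    print(sess.run(mean_xent(w_small)))  # near ln(10) ≈ 2.3: near-uniform softmax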