I have a convolutional neural network that I recently refactored to use TensorFlow's Estimator API, largely following this tutorial. However, during training, the metrics I have added to the EstimatorSpec are not displayed on TensorBoard and do not seem to be evaluated in tfdbg, even though the name scope and the metrics are present in the graph written to TensorBoard.
The relevant bits of the model_fn are as follows:
...
...
predictions = tf.placeholder(tf.float32, [num_classes], name="predictions")
...
with tf.name_scope("metrics"):
    predictions_rounded = tf.round(predictions)
    accuracy = tf.metrics.accuracy(input_y, predictions_rounded, name='accuracy')
    precision = tf.metrics.precision(input_y, predictions_rounded, name='precision')
    recall = tf.metrics.recall(input_y, predictions_rounded, name='recall')

if mode == tf.estimator.ModeKeys.PREDICT:
    spec = tf.estimator.EstimatorSpec(mode=mode,
                                      predictions=predictions)
elif mode == tf.estimator.ModeKeys.TRAIN:
    ...
    # if we're doing softmax vs sigmoid, we have different metrics
    if cross_entropy == CrossEntropyType.SOFTMAX:
        metrics = {
            'accuracy': accuracy,
            'precision': precision,
            'recall': recall
        }
    elif cross_entropy == CrossEntropyType.SIGMOID:
        metrics = {
            'precision': precision,
            'recall': recall
        }
    else:
        raise NotImplementedError("Unrecognized cross entropy function: {}\t Available types are: SOFTMAX, SIGMOID".format(cross_entropy))
    spec = tf.estimator.EstimatorSpec(mode=mode,
                                      loss=loss,
                                      train_op=train_op,
                                      eval_metric_ops=metrics)
else:
    raise NotImplementedError('ModeKey provided is not supported: {}'.format(mode))
return spec
Anyone have any thoughts on why these aren't getting written? I'm using TensorFlow 1.7 and Python 3.5. I've tried adding them explicitly via tf.summary.scalar, and while they do get into TensorBoard that way, they're never updated after the first pass through the graph.
The metrics API has a twist to it. Let's take tf.metrics.accuracy as an example (all of tf.metrics.* work the same way): it returns 2 values, the accuracy metric and an update_op, and this looks like your first mistake. You should have something like this:
accuracy, update_op = tf.metrics.accuracy(input_y, predictions_rounded, name='accuracy')
accuracy is just the value as you'd expect it to be calculated. However, notice that you might want to compute accuracy across multiple calls to sess.run, for example when you compute the accuracy of a large test set that doesn't all fit in memory. That's where update_op comes in: it accrues the results so that when you ask for accuracy it gives you a running tally.
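The running-tally behaviour can be sketched in plain Python. This is a toy stand-in for what tf.metrics.accuracy does internally with two local variables (a correct-count and a total count); the class and method names here are made up for illustration, not TensorFlow API:

```python
class StreamingAccuracy:
    """Toy model of tf.metrics.accuracy: a running total/count pair,
    analogous to the metric's two local variables."""

    def __init__(self):
        self.total = 0  # correct predictions seen so far
        self.count = 0  # examples seen so far

    def update_op(self, labels, predictions):
        """Like running the metric's update_op in sess.run:
        folds one batch into the running tally."""
        self.total += sum(1 for l, p in zip(labels, predictions) if l == p)
        self.count += len(labels)

    def value(self):
        """Like fetching the accuracy tensor: reads the tally
        without changing it."""
        return self.total / self.count if self.count else 0.0


acc = StreamingAccuracy()
acc.update_op([1, 0, 1], [1, 1, 1])  # batch 1: 2 of 3 correct
acc.update_op([0, 0], [0, 0])        # batch 2: 2 of 2 correct
print(acc.value())                   # 0.8, accumulated over both batches
```

If you only ever fetch the value and never run update_op, the tally never moves, which is exactly the "never updated after the first pass" symptom.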
update_op has no dependencies, so you either need to run it explicitly in sess.run or add a dependency. For example, you might set it to depend on the cost function, so that whenever the cost function is computed update_op is computed too (causing the running tally for accuracy to be updated):
with tf.control_dependencies([cost]):  # control_dependencies takes a list
    updates = tf.group(update_op, other_update_ops, ...)
You can reset the value of the metrics with the local variable initializer:
sess.run(tf.local_variables_initializer())
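Resetting is just zeroing the metric's local variables so a fresh tally can start (e.g. between epochs, or between evaluation runs). A plain-Python sketch of that idea, with made-up names, for a running-mean metric:

```python
# Toy running-mean metric backed by two "local variables",
# analogous to how tf.metrics stores its state.
state = {"total": 0.0, "count": 0}

def update_op(values):
    # Fold a batch into the running state.
    state["total"] += sum(values)
    state["count"] += len(values)

def value():
    return state["total"] / state["count"] if state["count"] else 0.0

def local_variables_initializer():
    # Rough analogue of sess.run(tf.local_variables_initializer()):
    # restore every metric's local variables to their initial values.
    state["total"] = 0.0
    state["count"] = 0

update_op([1.0, 3.0])
print(value())                 # 2.0
local_variables_initializer()  # start a fresh tally
update_op([10.0])
print(value())                 # 10.0, earlier batches are forgotten
```

Note that in TensorFlow this resets all local variables, so any other streaming metrics in the graph are cleared at the same time.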
You will need to add accuracy to TensorBoard with tf.summary.scalar('accuracy', accuracy) as you mentioned you'd tried (though it appears you were passing the (value, update_op) tuple rather than the value tensor).