Regularly evaluating large test sets during Convolutional Neural Network training
I created a small convolutional neural network with TensorFlow that I want to train. During training, I want to record several metrics. One of them is the accuracy on a test set that is independent of the training set.
The MNIST example shows me how to do it:
# Train the model, and also write summaries.
# Every 10th step, measure test-set accuracy, and write test summaries
# All other steps, run train_step on training data, & add training summaries

def feed_dict(train):
    """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
    if train or FLAGS.fake_data:
        xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
        k = FLAGS.dropout
    else:
        xs, ys = mnist.test.images, mnist.test.labels
        k = 1.0
    return {x: xs, y_: ys, keep_prob: k}

for i in range(FLAGS.max_steps):
    if i % 10 == 0:  # Record summaries and test-set accuracy
        summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
        test_writer.add_summary(summary, i)
        print('Accuracy at step %s: %s' % (i, acc))
    else:  # Record train set summaries, and train
        summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
        train_writer.add_summary(summary, i)
It feeds the whole test set into the evaluation every 10 steps and prints the resulting accuracy.
That's pretty cool and all, but my test set is quite a bit larger. I have about 2000 "images" of dimension 30x30x30x8, so feeding the whole dataset into the evaluation at once would blow up both my main memory and the GPU memory.
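For a rough sense of scale, the raw input alone can be estimated as below. This is only a back-of-the-envelope sketch: it assumes float32 storage (4 bytes per value), which the post does not state, and the intermediate activations of the network would come on top of this.

```python
# Rough size of the raw test-set input.
# Assumption: values are stored as float32; the post does not say which dtype is used.
num_examples = 2000
example_shape = (30, 30, 30, 8)
bytes_per_value = 4  # float32

values_per_example = 1
for dim in example_shape:
    values_per_example *= dim

total_bytes = num_examples * values_per_example * bytes_per_value

print("values per example:", values_per_example)
print("raw input size: %.2f GB" % (total_bytes / 1e9))
```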
As a workaround, I have this:
accuracy = mymodel.accuracy(logits, label_placeholder)
test_accuracy_placeholder = tf.placeholder(tf.float32, name="test_accuracy")
test_summary = tf.scalar_summary("accuracy", test_accuracy_placeholder)

# training loop
for batch_idx, batch in enumerate(batches_in_trainset):
    # do training here
    ...
    # check accuracy every 10 batches
    if batch_idx % 10 == 0:
        test_accuracies = []  # start with empty accuracy list
        # inner testing loop
        for test_batch_idx in range(batches_in_testset):
            # get testset batch
            labels, images = testset.next_batch()
            # make feed dict
            test_feed_dict = {
                # ...
            }
            # calculate accuracy
            test_accuracy_val = sess.run(accuracy, feed_dict=test_feed_dict)
            # append accuracy to the list of test accuracies
            test_accuracies.append(test_accuracy_val)
        # "calculate" and log the average accuracy over all test batches
        summary_str = sess.run(test_summary,
                               feed_dict={
                                   test_accuracy_placeholder: sum(test_accuracies) / len(test_accuracies)})
        # pass the step so the point lands at the right place in TensorBoard
        test_writer.add_summary(summary_str, batch_idx)
Basically, I first collect the accuracies of all test set batches, and then I feed their average into a second (disconnected) graph that writes it out as a summary.
This "kind of" works, in the sense that I am indeed able to calculate the test set accuracy at the required intervals.
However, it feels very awkward and has the serious drawback that I cannot record anything other than the test set accuracy.
For example, I would also like to record the loss function value on the whole test set, the activation histograms on the whole test set, and maybe some other variables.
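On the histogram part: a histogram over the whole test set does not strictly need all activations in memory at once, as long as the bin edges are fixed up front, because per-batch counts can simply be added. Below is a minimal pure-Python sketch of that idea. The bin range, bin count, and toy activation values are made up for illustration, and this is not how TensorBoard's own histogram summaries are built.

```python
def batch_histogram(values, lo, hi, num_bins):
    """Count values into num_bins equal-width bins over [lo, hi].

    Values outside the range are clamped into the first or last bin.
    """
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), num_bins - 1)
        counts[idx] += 1
    return counts


def merge_histograms(total, batch_counts):
    """Add per-batch counts onto the running totals (same bin edges required)."""
    return [t + b for t, b in zip(total, batch_counts)]


# Toy "activations" from three test batches (hypothetical values).
batches = [
    [0.1, 0.2, 0.9],
    [0.5, 0.55, 1.0],
    [0.0, 0.75],
]
lo, hi, num_bins = 0.0, 1.0, 4

total_counts = [0] * num_bins
for activations in batches:
    counts = batch_histogram(activations, lo, hi, num_bins)
    total_counts = merge_histograms(total_counts, counts)

# Histogramming everything in one go gives the same counts.
all_values = [v for batch in batches for v in batch]
one_shot_counts = batch_histogram(all_values, lo, hi, num_bins)
print(total_counts, one_shot_counts)
```

The catch is that the bin range has to be chosen before seeing the data, so it needs to be wide enough for the activations in question.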
Preferably, this should work just like in the MNIST example. Check out the TensorBoard demo here: https://www.tensorflow.org/tensorboard/index.html#events
In this summary, all variables and metrics are evaluated on both the test and the training set. I want that too! But I want it without somehow feeding the complete test set into my model at once.
It looks like this functionality was added with the streaming metric evaluation ops (contrib):
https://www.tensorflow.org/api_guides/python/contrib.metrics
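The idea behind streaming metrics is that each metric keeps a running total and a count, is updated once per test batch, and is read out once at the end. The following is a plain-Python sketch of that pattern, not the tf.contrib.metrics API itself; the `StreamingMean` class and the per-batch numbers are hypothetical.

```python
class StreamingMean:
    """Running per-example mean, updated one batch at a time."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, batch_mean, batch_size):
        # Weight each batch by its size so a smaller last batch
        # does not distort the overall mean.
        self.total += batch_mean * batch_size
        self.count += batch_size

    def result(self):
        if self.count == 0:
            raise ValueError("no batches seen yet")
        return self.total / self.count


# One accumulator per metric; more can be added without extra placeholders.
metrics = {"accuracy": StreamingMean(), "loss": StreamingMean()}

# Hypothetical per-batch results: (batch_size, accuracy, loss).
test_batches = [(64, 0.75, 0.5), (64, 0.5, 1.0), (32, 1.0, 0.25)]

for batch_size, acc, loss in test_batches:
    metrics["accuracy"].update(acc, batch_size)
    metrics["loss"].update(loss, batch_size)

final = {name: m.result() for name, m in metrics.items()}
print(final)
```

With these toy numbers the plain mean of the three batch accuracies would be 0.75, while the per-example accuracy is 0.7, which is why the sketch weights by batch size. Each entry in `final` could then be logged as its own scalar at the current training step.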