How can I stop a training job in tensorflow?

I'm following this tutorial to train my own object detector ( https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html ). As far as I can see, it doesn't explain how or when to stop a training job. Can you please help me with this? I've been training my model for almost 24 hours, and my total loss is about 2.

Loss is a relative value: unlike accuracy, it does not map directly onto how well the model performs, so a value of 2 on its own does not provide much insight. Check whether the loss is still decreasing; if it is, you can keep training the model for more steps.
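One way to make "is the loss still decreasing" concrete is to compare the average loss over the most recent steps with the average over the steps before them. The sketch below is only an illustration: the function name `loss_has_plateaued`, the window size, and the 1% threshold are assumptions chosen for the example, not part of the tutorial or of the TensorFlow Object Detection API. It expects a plain list of loss values that you have collected yourself, for example read off the training log.

```python
def loss_has_plateaued(losses, window=100, min_rel_improvement=0.01):
    """Return True when the mean loss over the latest `window` values is
    less than `min_rel_improvement` (as a fraction) below the mean of the
    `window` values before them. Name and defaults are hypothetical."""
    if len(losses) < 2 * window:
        return False  # not enough history to judge yet
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous <= 0:
        return True  # nothing left to improve on
    return (previous - recent) / previous < min_rel_improvement


# Example: a steadily falling loss versus a loss stuck at 2.0
falling = [10.0 - 0.04 * i for i in range(200)]
stuck = [2.0] * 200
print(loss_has_plateaued(falling))  # still improving
print(loss_has_plateaued(stuck))    # flat, so training could be stopped
```

Averaging over a window rather than comparing single steps matters because the per-step loss is noisy; the right window and threshold depend on your dataset and model.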

If your question is how to set the number of epochs, that is configured in the *.config file. You can edit the config file to change the values for batch size and number of steps.

Number of epochs trained = (num_steps * batch size) / Number of images in training set

*One epoch is when the ENTIRE dataset is passed forward and backward through the neural network exactly ONCE.
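The relationship between steps and epochs can be worked through in a few lines. Each training step processes one batch, so one epoch takes (number of training images / batch size) steps, and the number of epochs is the step count divided by that. The function name and the numbers below are made up for illustration; substitute the batch size and number of steps from your own config file and the size of your own training set.

```python
def epochs_trained(num_steps, batch_size, num_train_images):
    """Convert a step count into the number of passes over the dataset."""
    # One step consumes one batch, so this many steps cover the dataset once.
    steps_per_epoch = num_train_images / batch_size
    return num_steps / steps_per_epoch


# Hypothetical values: 25,000 steps, batch size 8, 1,000 training images.
print(epochs_trained(num_steps=25000, batch_size=8, num_train_images=1000))
# prints 200.0
```

Read the other way round, this also gives the number of steps to request for a target epoch count: target epochs * (number of training images / batch size).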

