Is it possible in TF/Keras to save the best model AFTER X epochs?
My models run really fast, but they seem to slow down because I'm saving the best model (to load in another process); I'm noticing that the saving process itself slows down training. In the early stages of fitting, nearly every iteration is an improvement, so each one triggers a save and adds more and more latency.
I wonder if there is a way to save the best model AFTER X epochs, or to save it in the background, so that training isn't delayed by saving too often?
For clarity, this is how I'm running ModelCheckpoint in Keras/TF2:
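from tensorflow.keras.callbacks import ModelCheckpoint

filepath = "BestModel.hdf5"
# save only when the training loss improves on the best value seen so far
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
# fit the model
model.fit(x, y, epochs=40, batch_size=50, callbacks=callbacks_list)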
You can use the save_freq argument of the ModelCheckpoint callback to control how often the model is saved. By default it is set to 'epoch', which means the model is saved at the end of each epoch; however, it can also be set to an integer, which is the number of batches that must pass between saves. Here is the relevant part of the documentation for reference:
save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using integer, the callback saves the model at end of this many batches. If the Model is compiled with experimental_steps_per_execution=N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'.
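Since an integer save_freq counts batches rather than epochs, one way to keep saves aligned to epoch boundaries is to set it to a multiple of the number of batches per epoch. The sketch below assumes tf.keras 2.x and in-memory arrays passed to model.fit (so the last partial batch is kept). The helper names batches_for_epochs and build_checkpoint and the every_n_epochs parameter are made up for illustration and are not part of Keras.

```python
import math


def batches_for_epochs(n_samples: int, batch_size: int, epochs: int) -> int:
    """Return how many training batches make up `epochs` full epochs."""
    # Assumption: the final partial batch is kept, so one epoch is
    # ceil(n_samples / batch_size) batches.
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


def build_checkpoint(n_samples, batch_size, every_n_epochs, filepath="BestModel.hdf5"):
    """Hypothetical helper: a ModelCheckpoint that is only evaluated every N epochs."""
    # Imported lazily so the arithmetic helper above works without TensorFlow.
    from tensorflow.keras.callbacks import ModelCheckpoint

    return ModelCheckpoint(
        filepath,
        monitor="loss",
        verbose=1,
        save_best_only=True,
        mode="min",
        # Integer save_freq is measured in batches, not epochs.
        save_freq=batches_for_epochs(n_samples, batch_size, every_n_epochs),
    )


# Possible usage with the question's settings (x, y and model come from the question):
# checkpoint = build_checkpoint(len(x), batch_size=50, every_n_epochs=5)
# model.fit(x, y, epochs=40, batch_size=50, callbacks=[checkpoint])
```

Note that this checks the loss only once every N epochs rather than waiting X epochs and then checking each one, so it reduces how often a save can happen but may miss a better epoch in between. It also only suits metrics available at the end of a training batch, such as 'loss'; validation metrics are computed at the end of an epoch.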