
Is it possible in TF/Keras to save the best model AFTER X epochs?

My models train really fast, but training seems to slow down because I'm saving the best model (to load it in another process), and I've noticed that the saving itself is what slows things down. In the early stages of fitting almost every iteration is an improvement, so a save is triggered nearly every epoch and the latency keeps adding up.

Is there a way to save the best model only AFTER X epochs, or to save it in the background, so that training isn't delayed by saving too often?

For clarity, this is how I'm running ModelCheckpoint in Keras/TF2:

from tensorflow.keras.callbacks import ModelCheckpoint

filepath = "BestModel.hdf5"
# Save only when the monitored training loss improves
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
# fit the model
model.fit(x, y, epochs=40, batch_size=50, callbacks=callbacks_list)

You can use the save_freq argument of the ModelCheckpoint callback to control how often the model is saved. By default it is set to 'epoch', which means the model is saved at the end of each epoch; however, it can also be set to an integer, which is the number of batches that must pass between saves. Here is the relevant part of the documentation for reference:

save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using integer, the callback saves the model at end of this many batches. If the Model is compiled with experimental_steps_per_execution=N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'.
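Since an integer save_freq counts batches rather than epochs, checking for a new best model only every X epochs means converting epochs to batches first. Below is a minimal sketch of that arithmetic. The helper names batches_per_epoch and save_freq_every_n_epochs are made up for illustration and are not part of Keras. It assumes fit() is fed in-memory arrays, so one epoch is ceil(n_samples / batch_size) batches. The dataset size of 1000 is also an assumption, since the question does not state it.

```python
import math


def batches_per_epoch(n_samples: int, batch_size: int) -> int:
    """Batches in one epoch, assuming the final partial batch is kept."""
    if n_samples <= 0 or batch_size <= 0:
        raise ValueError("n_samples and batch_size must be positive")
    return math.ceil(n_samples / batch_size)


def save_freq_every_n_epochs(n_samples: int, batch_size: int, n_epochs: int) -> int:
    """Integer save_freq (in batches) that lines up with every n_epochs epochs."""
    if n_epochs <= 0:
        raise ValueError("n_epochs must be positive")
    return n_epochs * batches_per_epoch(n_samples, batch_size)


# Hypothetical numbers: 1000 training samples, batch_size=50 as in the question,
# and a check for a new best model every 5 epochs.
save_freq = save_freq_every_n_epochs(n_samples=1000, batch_size=50, n_epochs=5)
print(save_freq)

# The value would then be passed to the callback from the question, e.g.:
# checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1,
#                              save_best_only=True, mode='min',
#                              save_freq=save_freq)
```

Keeping save_freq a whole multiple of the batches per epoch keeps saves aligned with epoch boundaries, which avoids the less reliable metric the documentation warns about. As far as I can tell, validation metrics such as val_loss are not yet available when a batch-based save is evaluated, so this fits best with a training metric like the 'loss' monitored in the question.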

