Keras: correctly saving checkpoints when continuing training for extra epochs (initial_epoch)
ModelCheckpoint works great when I train a new model: it saves checkpoints exactly the way I want them saved. The problem arises when I decide to train the same model for n more epochs. The epoch counter gets reset to 0, which produces checkpoint names like the following:
/checkpoints
checkpoint-01-0.24.h5
checkpoint-02-0.34.h5
checkpoint-03-0.37.h5
.
.
checkpoint-m-0.68.h5
checkpoint-01-0.71.h5
checkpoint-02-0.73.h5
checkpoint-03-0.74.h5
.
.
checkpoint-n-0.85.h5
As you can see, the epoch numbers start over. What I would like is to take the total number of epochs from the previous runs and add it to the new epoch numbers, to get something like this:
checkpoint-(m + 01)-0.71.h5
checkpoint-(m + 02)-0.73.h5
checkpoint-(m + 03)-0.74.h5
.
.
checkpoint-(m + n)-0.85.h5
As you can read in the documentation of the .fit() function, there is a parameter that does exactly that:
initial_epoch: epoch at which to start training (useful for resuming a previous training run)
So just add:
model.fit(..., initial_epoch=m)
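where, as in your example, m is the number of epochs already completed, so the first new checkpoint is numbered m + 1. Note that when initial_epoch is set, epochs is the index of the final epoch rather than the number of epochs to run, so to train n more epochs pass epochs=m + n.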
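If m is not known when training resumes, one option is to recover it from the checkpoint files already on disk. The following is a minimal sketch, not part of Keras: last_saved_epoch is a hypothetical helper, and the filename pattern checkpoint-<epoch>-<metric>.h5 is an assumption inferred from the listing in the question.

```python
import re
from pathlib import Path

# Assumed filename pattern, inferred from the listing in the question:
# checkpoint-<epoch>-<metric>.h5, e.g. checkpoint-03-0.37.h5
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)-\d+\.\d+\.h5$")


def last_saved_epoch(checkpoint_dir):
    """Return the highest epoch number found in checkpoint filenames, or 0 if none.

    Hypothetical helper for illustration; it only inspects filenames.
    """
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return 0
    epochs = [
        int(match.group(1))
        for path in directory.iterdir()
        if (match := CHECKPOINT_RE.match(path.name))
    ]
    return max(epochs, default=0)


# Usage sketch (model, x, y and checkpoint_callback are assumed to exist already):
# m = last_saved_epoch("checkpoints")
# n = 10  # number of additional epochs to train
# model.fit(x, y, epochs=m + n, initial_epoch=m, callbacks=[checkpoint_callback])
```

Reading m from the filenames keeps the numbering continuous across restarts without storing the epoch count anywhere else, but it only works as long as the checkpoint files follow the assumed naming scheme.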
I hope this helps :)