
Keras correctly saving checkpoints after extra epochs continuing training - initial epoch

ModelCheckpoint works well when I train a new model: it saves checkpoints the way I want. The problem arises when I train the same model for n more epochs. The epoch counter is reset to 0, which produces checkpoint names like the following:

/checkpoints
    checkpoint-01-0.24.h5
    checkpoint-02-0.34.h5
    checkpoint-03-0.37.h5
              .
              .
    checkpoint-m-0.68.h5
    checkpoint-01-0.71.h5
    checkpoint-02-0.73.h5
    checkpoint-03-0.74.h5
              .
              .
    checkpoint-n-0.85.h5

As you can see, the epoch numbers start over. What I would like is to take the total number of epochs from the previous runs and add the new epochs to it, to get something like this:

    checkpoint-(m + 01)-0.71.h5
    checkpoint-(m + 02)-0.73.h5
    checkpoint-(m + 03)-0.74.h5
              .
              .
    checkpoint-(m + n)-0.85.h5

As the documentation of the .fit() function says, there is a parameter that does exactly that:

initial_epoch: epoch at which to start training (useful for resuming a previous training run)
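To make the numbering concrete, here is a minimal pure-Python sketch of the file names that result, assuming Keras numbering: fit() runs the 0-based epoch indices from initial_epoch up to epochs - 1, and ModelCheckpoint fills the name template with epoch + 1. The helper checkpoint_names, the values of m and n, and the simplified template without a metric are illustrative assumptions, not part of Keras.

```python
def checkpoint_names(template, initial_epoch, epochs):
    """Return the file names a Keras-style checkpoint callback would write.

    Assumption: mirrors Keras numbering, where fit() runs the 0-based
    epoch indices initial_epoch .. epochs - 1 and the callback formats
    the template with epoch + 1.
    """
    return [template.format(epoch=e + 1) for e in range(initial_epoch, epochs)]


m, n = 3, 2  # hypothetical: 3 epochs already trained, 2 more wanted
template = "checkpoint-{epoch:02d}.h5"  # simplified: no metric in the name

# First run: epochs 1..m.
first_run = checkpoint_names(template, initial_epoch=0, epochs=m)

# Resumed run: `epochs` is the final epoch index, so pass m + n, not n.
resumed = checkpoint_names(template, initial_epoch=m, epochs=m + n)

print(first_run)
print(resumed)
```

With these values the resumed run continues at checkpoint-04.h5 instead of starting again at checkpoint-01.h5.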

so just add:

model.fit(..., initial_epoch=m)

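where, as in your example, m is the number of epochs already completed, so training resumes at epoch m + 1. Note that the epochs argument is then the index of the final epoch rather than the number of additional epochs, so training n more epochs means passing epochs=m + n.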
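If m is not tracked anywhere, one option is to recover it from the checkpoint files themselves. The sketch below assumes the naming scheme seen in the listing above (checkpoint-<epoch>-<metric>.h5) and that a checkpoint was written after every epoch. With something like save_best_only=True, the highest saved epoch can be lower than the number of epochs actually trained. The helper name last_saved_epoch and the regular expression are hypothetical, not part of Keras.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from the listing: checkpoint-<epoch>-<metric>.h5
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)-\d+\.\d+\.h5$")


def last_saved_epoch(checkpoint_dir):
    """Return the highest epoch number found in checkpoint file names, or 0."""
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return 0  # nothing saved yet, so training starts from scratch
    epochs = [
        int(match.group(1))
        for path in directory.iterdir()
        if (match := CHECKPOINT_RE.match(path.name))
    ]
    return max(epochs, default=0)


# Hypothetical usage; `model`, `x`, `y`, `checkpoint_cb` and `n` come from your own code:
# m = last_saved_epoch("checkpoints")
# model.fit(x, y, initial_epoch=m, epochs=m + n, callbacks=[checkpoint_cb])
```

Returning 0 when the directory is missing or empty means the same call works for both a fresh run and a resumed one.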

I hope this helps :)
