
PyTorch: What's the purpose of saving the optimizer state?

PyTorch is capable of saving and loading the state of an optimizer. An example is shown in the PyTorch tutorial. I'm currently just saving and loading the model state, but not the optimizer. So what's the point of saving and loading the optimizer state, besides not having to remember the optimizer's params such as the learning rate? And what's contained in the optimizer state?

I believe that saving the optimizer's state is an important aspect of logging and reproducibility. It stores many details about the optimizer's settings: the kind of optimizer used, learning rate, weight decay, type of scheduler used (I find this very useful personally), etc. Moreover, it can be used in a similar fashion to loading pre-trained weights into your current model via .load_state_dict(), so that you can pass a stored optimizer setting/configuration into your current optimizer using the same method: optimizer.load_state_dict(some_good_optimizer.state_dict()).
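
As a small sketch of that last point (hypothetical model and optimizer names, just to illustrate the mechanism): a freshly constructed optimizer can take over the settings of a previously tuned one through the same state_dict round-trip.

    import torch

    model = torch.nn.Linear(4, 2)  # hypothetical toy model
    some_good_optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

    # A fresh optimizer over the same parameters, then copy the stored configuration into it
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
    optimizer.load_state_dict(some_good_optimizer.state_dict())

    print(optimizer.param_groups[0]['lr'], optimizer.param_groups[0]['momentum'])  # 0.05 0.9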

You should save the optimizer state if you want to resume model training later. This is especially true if Adam is your optimizer. Adam is an adaptive learning rate method, which means it computes individual learning rates for various parameters; those per-parameter statistics live in the optimizer state, so they are lost if you only save the model.
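
To see concretely what that state contains, here is a minimal sketch with a hypothetical toy model; after one optimizer step, Adam's state_dict holds both its hyperparameters and its per-parameter running moments.

    import torch

    model = torch.nn.Linear(4, 2)  # hypothetical toy model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # One training step so Adam populates its per-parameter running statistics
    loss = model(torch.randn(8, 4)).sum()
    loss.backward()
    optimizer.step()

    sd = optimizer.state_dict()
    print(sd.keys())              # dict_keys(['state', 'param_groups'])
    print(sd['param_groups'][0])  # lr, betas, eps, weight_decay, ...
    print(sd['state'][0].keys())  # step, exp_avg, exp_avg_sq (Adam's moment estimates)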

It is not required if you only want to use the saved model for inference.

However, it's best practice to save both the model state and the optimizer state. You can also save the loss history and other running metrics if you want to plot them later.

I'd do it like this:

    torch.save({
            'epoch': epochs,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'train_loss_history': loss_history,
            }, PATH)
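
And the matching load-side sketch to resume training later (assuming the same PATH and variable names as above):

    checkpoint = torch.load(PATH)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epochs = checkpoint['epoch']
    loss_history = checkpoint['train_loss_history']

    model.train()  # switch back to training mode before resuming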
