After building a model in Python, how do I save the model so I can shut down my computer and work on it the next day?
Say I have finished building a regression model in Python, as below:
from sklearn import linear_model
model = linear_model.LogisticRegression().fit(X_train, Y_train)
How do I save the "model" that I have built so I can shut down my computer and continue working the next day without having to rerun the code to get the "model" again?
I am asking because my dataset is very large, and refitting the model from scratch takes a long time.
This is a serialisation problem, and a very simple way to solve it is to use the pickle module. The following snippets show how you can save and load a Python object.
To save:
import pickle
with open("YOUR_FILE_NAME_HERE.pkl", 'wb') as file:
    pickle.dump(model, file)
To load:
# Import all your relevant libraries first
from sklearn import linear_model
...
import pickle
with open("YOUR_FILE_NAME_HERE.pkl", 'rb') as file:
    model = pickle.load(file)
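As a minimal sketch of the full round trip, the two snippets above can be wrapped into a pair of helpers. The names save_object and load_object, and the dictionary standing in for a fitted model, are illustrative assumptions and not part of the original answer; a real scikit-learn estimator would be passed in the same way.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialise obj to path with pickle (hypothetical helper)."""
    with open(path, "wb") as file:
        pickle.dump(obj, file)


def load_object(path):
    """Load a pickled object back from path (hypothetical helper)."""
    with open(path, "rb") as file:
        return pickle.load(file)


# Stand-in for a fitted model (assumption, for illustration only).
stand_in_model = {"coef": [0.5, -1.25], "intercept": 0.1}

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")
    save_object(stand_in_model, model_path)
    restored = load_object(model_path)

# The restored object is an equal but separate copy of the original.
print(restored == stand_in_model)
```

Two practical caveats: only unpickle files you trust, since loading a pickle can execute arbitrary code, and it is safest to load a saved model with the same library versions that were used to create it.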
The idea is essentially to create a file representation of your object (model) and save it to disk in a way that can be interpreted and loaded on demand. This can be achieved in many different ways, but the simplest method in Python is to use pickle, which creates a binary representation of your object and the objects it refers to. Classes and functions are recorded by module and name and are not copied into the file, which is why the same libraries must be available when you load the model.
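A small experiment can make the "binary representation" more concrete. The sketch below uses fractions.Fraction from the standard library as a stand-in for a model object (an assumption made purely for illustration) and inspects the bytes that pickle produces. The class shows up in the payload by module and name, while the code of its methods does not, so the defining library has to be importable at load time.

```python
import pickle
from fractions import Fraction

# Stand-in for a model object (assumption, for illustration only).
original = Fraction(3, 4)

# dumps() returns the same bytes that dump() would write to a file.
payload = pickle.dumps(original)

# The defining module and class are referenced by name in the payload...
has_module_name = b"fractions" in payload
has_class_name = b"Fraction" in payload

# ...but the method implementations are not stored in it.
has_method_name = b"limit_denominator" in payload

# Loading works because the fractions module can be imported here.
restored = pickle.loads(payload)

print(has_module_name, has_class_name, has_method_name)
print(restored == original)
```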
For further reading, consult the official pickle documentation; for a better understanding of serialisation, a general reference on the topic is also worth reading.