
Is it feasible to make predictions without retraining the model every time, just by calling the equation of my trained model on the test dataset?

Posted 2020-11-30 06:56:31 · python

I am running a linear model and a random forest model. Every time, I have to process a huge training dataset to build the model before I can use it to predict the test dataset. Is it possible to use only the equation of the model instead of rerunning the whole program, since retraining makes predicting the test dataset take a long time?

1 Answer

Sure, you can use just the equation for your linear model. You only need its coefficients and bias (intercept). How you access them depends on the framework you are using.

For example, see the coef_ and intercept_ attributes in the sklearn documentation.
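As a rough illustration of reusing only the equation, the sketch below applies stored coefficients and an intercept to new rows in plain Python. The numbers and the names predict_linear, params, and test_rows are made up for the example; with sklearn the values would presumably come from the fitted model's coef_ and intercept_, converted to plain lists and floats.

```python
import json

def predict_linear(coefficients, intercept, rows):
    """Apply y = intercept + sum(coef_i * x_i) to each row of features."""
    return [
        intercept + sum(c * x for c, x in zip(coefficients, row))
        for row in rows
    ]

# Hypothetical values standing in for a fitted model's parameters
# (assumption: in sklearn these would be read from coef_ and intercept_).
params = {"coefficients": [2.0, -1.0], "intercept": 0.5}

# The parameters are plain numbers, so they can be stored as JSON text
# and reloaded later without retraining. A round trip through a string
# is used here instead of a real file to keep the sketch short.
restored = json.loads(json.dumps(params))

# Hypothetical test rows, one list of feature values per sample.
test_rows = [[1.0, 3.0], [0.0, 0.0]]
predictions = predict_linear(
    restored["coefficients"], restored["intercept"], test_rows
)
print(predictions)
```

Because only a handful of numbers are stored, prediction costs one multiply-and-add per feature and never touches the training data.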

Saving and then reusing the Random Forest model is a lot trickier, because a random forest is an ensemble of decision trees and has no single equation to copy out.

The universal solution is:

Train your model.
Serialize it to a file with pickle.
Whenever you need to make predictions, deserialize the model from the file and use it.
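The three steps above can be sketched with the standard pickle module as follows. The helper names save_model and load_model, the file name model.pkl, and the dictionary standing in for a trained model are all assumptions made for illustration; in practice the fitted estimator object would be passed instead.

```python
import os
import pickle
import tempfile

def save_model(model, path):
    """Serialize a trained model object to a file."""
    with open(path, "wb") as f:
        pickle.dump(model, f)

def load_model(path):
    """Deserialize a model object from a file.

    Only unpickle files you trust: loading a pickle can execute arbitrary code.
    """
    with open(path, "rb") as f:
        return pickle.load(f)

# Stand-in for a trained model (assumption: a real script would pass
# the fitted estimator here, e.g. a random forest object).
trained_model = {"kind": "stand-in for a fitted estimator", "n_trees": 100}

# Hypothetical location; a temporary directory keeps the sketch self-contained.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

save_model(trained_model, model_path)   # run once, right after training
loaded_model = load_model(model_path)   # run in every prediction script
print(loaded_model == trained_model)
```

With a real estimator, the prediction script would then call the loaded object's predict method on the test data. It is generally safest to load the file with the same library version that was used to save it.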

More information is available on how to serialize a model with pickle or joblib.

Also, different frameworks usually have built-in interfaces for model serialization.