How to save/load an optimized GPy regression model

Question

I'm trying to save my optimized Gaussian process model for use in a different script. My current line of thinking is to store the model information in a JSON file, using GPy's built-in to_dict and from_dict functions. Something along the lines of:

import GPy
import numpy as np
import json

X = np.random.uniform(-3.,3.,(20,1))
Y = np.sin(X) + np.random.randn(20,1)*0.05
kernel = GPy.kern.RBF(input_dim=1, variance=1., lengthscale=1.)

m = GPy.models.GPRegression(X, Y, kernel)

m.optimize(messages=True)
m.optimize_restarts(num_restarts = 10)

jt = json.dumps(m.to_dict(save_data=False), indent=4)
with open("j-test.json", 'w') as file:
    file.write(jt)

This step works with no issues, but I run into problems when I try to load the model information using:

with open("j-test.json", 'r') as file:
    d = json.load(file)  # d is a dictionary

m2 = GPy.models.GPClassification.from_dict(d, data=None)

which gives me an assertion error because "data is not None", which it is -- or at least I think so.

I'm really new to GPy and to working with JSON, so I'm not sure where I've gone astray. I looked into the documentation, but it is a bit vague and I couldn't find an example of these functions in use. Is there a step or concept that I missed? Also, is this the best way to store and reload my model? Any help with this would be greatly appreciated! Thanks!

Answer 1

The pickle module is your friend here!

import pickle
with open('save.pkl', 'wb') as file:
    pickle.dump(m, file)

You can load it back in a later script with:

with open('save.pkl', 'rb') as file:
    loaded_model = pickle.load(file)
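
The two snippets above can be wrapped into a pair of small helpers. This is only a minimal sketch and not part of GPy: `save_model` and `load_model` are hypothetical names, and they accept any picklable object, so the optimized model `m` from the question could be passed in directly. Keep in mind that unpickling can execute arbitrary code, so only load files you trust, and that the loading script needs GPy to be importable (ideally the same version that wrote the file).

```python
import pickle


def save_model(model, path):
    """Serialize any picklable object (e.g. a fitted GPy model) to `path`."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load an object written by save_model. Only use this on trusted files."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

With the model from the question, this would be used as `save_model(m, 'save.pkl')` in one script and `loaded_model = load_model('save.pkl')` in the other.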

Answer 2

Pickle is not the recommended method for this. See here, in the section towards the end. The example given there is reproduced below.

# let X, Y be data loaded above
# Model creation:
m = GPy.models.GPRegression(X, Y)
m.optimize()
# 1: Saving a model:
np.save('model_save.npy', m.param_array)
# 2: loading a model
# Model creation, without initialization:
m_load = GPy.models.GPRegression(X, Y, initialize=False)
m_load.update_model(False) # do not call the underlying expensive algebra on load
m_load.initialize_parameter() # Initialize the parameters (connect the parameters up)
m_load[:] = np.load('model_save.npy') # Load the parameters
m_load.update_model(True) # Call the algebra only once
print(m_load)
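
Note that `param_array` holds only the flat vector of parameter values, so the loading script has to rebuild the model with the same kernel structure and the same X, Y before the values are assigned. The load steps above can be sketched as a reusable function. `save_params` and `load_params` are hypothetical names, not GPy functions, and the shape check is an addition that assumes `param_array` is populated once `initialize_parameter()` has run.

```python
import numpy as np


def save_params(model, path):
    """Write the model's flat parameter vector to `path` (should end in .npy)."""
    np.save(path, model.param_array)


def load_params(model, path):
    """Copy saved parameter values into a model built with initialize=False."""
    params = np.load(path)
    model.update_model(False)     # hold off the expensive algebra while loading
    model.initialize_parameter()  # connect the parameters up
    if params.shape != model.param_array.shape:
        # Assumed sanity check: a mismatch suggests a different kernel or model structure.
        raise ValueError(
            f"expected parameters of shape {model.param_array.shape}, got {params.shape}"
        )
    model[:] = params             # load the parameter values
    model.update_model(True)      # run the algebra once
    return model
```

In the loading script this would be called as `load_params(GPy.models.GPRegression(X, Y, initialize=False), 'model_save.npy')`, with X and Y being the same training data that was used when saving.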

Answer 1
3, accepted, 2020-10-27 15:44:52

Answer 2
0, 2021-10-28 15:49:19
