How to Save/Load Optimized GPy Regression Model

I'm trying to save my optimized Gaussian process model for use in a different script. My current line of thinking is to store the model information in a JSON file, using GPy's built-in to_dict and from_dict functions. Something along the lines of:

import GPy
import numpy as np
import json

X = np.random.uniform(-3.,3.,(20,1))
Y = np.sin(X) + np.random.randn(20,1)*0.05
kernel = GPy.kern.RBF(input_dim=1, variance=1., lengthscale=1.)

m = GPy.models.GPRegression(X, Y, kernel)

m.optimize(messages=True)
m.optimize_restarts(num_restarts = 10)

jt = json.dumps(m.to_dict(save_data=False), indent=4)
with open("j-test.json", 'w') as file:
    file.write(jt)

This step works with no issues, but I run into problems when I try to load the model information using:

with open("j-test.json", 'r') as file:
    d = json.load(file)  # d is a dictionary

m2 = GPy.models.GPClassification.from_dict(d, data=None)

which gives me an assertion error because "data is not None", even though it is None, or at least I think so.

I'm really new to GPy and to working with JSON, so I'm not sure where I've gone astray. I tried looking into the documentation, but it is a bit vague and I couldn't find an example of its use. Is there a step or concept that I missed? Also, is this the best way to store and reload my model? Any help with this would be greatly appreciated! Thanks!

The module pickle is your friend here!

import pickle
with open('save.pkl', 'wb') as file:
    pickle.dump(m, file)

You can load it back in a later script with:

with open('save.pkl', 'rb') as file:
    loaded_model = pickle.load(file)

Pickle is not the recommended way to do this. See here, in the section towards the end. The example given there is as follows:

# let X, Y be data loaded above
# Model creation:
m = GPy.models.GPRegression(X, Y)
m.optimize()
# 1: Saving a model:
np.save('model_save.npy', m.param_array)
# 2: loading a model
# Model creation, without initialization:
m_load = GPy.models.GPRegression(X, Y, initialize=False)
m_load.update_model(False) # do not call the underlying expensive algebra on load
m_load.initialize_parameter() # Initialize the parameters (connect the parameters up)
m_load[:] = np.load('model_save.npy') # Load the parameters
m_load.update_model(True) # Call the algebra only once
print(m_load)
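
On the assertion error in the question: Python raises AssertionError when the asserted condition is false, so a failed check on "data is not None" means data was None, not the other way around. Since the dictionary was written with save_data=False it holds no X and Y, so the loader presumably needs them handed back through data. The sketch below only illustrates that logic. load_from_dict_stub is a hypothetical stand-in, not a GPy function, and the dictionary keys are assumptions rather than GPy's documented format.

```python
import json


def load_from_dict_stub(d, data=None):
    """Hypothetical stand-in for a model loader; NOT GPy's actual code."""
    if d["X"] is None or d["Y"] is None:
        # The dictionary has no training data, so the caller must supply it.
        assert data is not None, "data is not None"
        d = dict(d, X=data[0], Y=data[1])
    return d


# Mimic a dictionary saved without training data and round-tripped via JSON.
saved = json.loads(json.dumps({"name": "gp_regression", "X": None, "Y": None}))

X = [[0.0], [1.0]]   # placeholder inputs
Y = [[0.0], [0.84]]  # placeholder targets

try:
    load_from_dict_stub(saved, data=None)
    failed = False
except AssertionError:
    failed = True  # same failure mode as described in the question

# Supplying the data as an (X, Y) pair satisfies the check.
restored = load_from_dict_stub(saved, data=(X, Y))
```

If GPy's loader behaves this way, the thing to try would be from_dict(d, data=(X, Y)) with the original training arrays; the (X, Y) tuple form is an assumption to check against your GPy version. Note also that the question's snippet calls from_dict through GPClassification although the saved model is a GPRegression, so calling it through GPRegression would at least be less confusing.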


 