
Import H2O model to Python

I have a model built in H2O (say, a GLM model). Now I want to import that model into Python to use it in other apps.

How can I do it?

Try this:

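import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

# connect to (or start) a local H2O cluster
h2o.init()

# build the model (params is a placeholder for your own arguments)
model = H2ODeepLearningEstimator(params)
model.train(params)

# save the model in H2O's binary format; returns the full path of the saved file
model_path = h2o.save_model(model=model, path="/tmp/mymodel", force=True)

print(model_path)
# /tmp/mymodel/DeepLearning_model_python_1441838096933

# load the model back into the running H2O cluster
saved_model = h2o.load_model(model_path)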

You need to export the model as a MOJO or POJO (prefer MOJO if your algorithm supports it). This is a Java object, so you need to use Java to run it. There are lots of options for how to do this:

http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html

(BTW, the R API recently added h2o.predict_json(), which converts the arguments to JSON and makes the Java call for you. There appears to be nothing equivalent in the Python API yet, but if you read the R code you'll see it is not doing anything complex: it just runs a shell command.)
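The shell-command approach mentioned above could be sketched in Python roughly as follows. This is a minimal sketch, not the R function's actual implementation: it assumes the hex.genmodel.tools.PredictCsv class shipped in h2o-genmodel.jar and its --mojo/--input/--output options, which you should check against your H2O version. The helper names and all file names are hypothetical.

```python
import subprocess
from typing import List


def build_mojo_predict_command(mojo_zip: str, input_csv: str, output_csv: str,
                               genmodel_jar: str = "h2o-genmodel.jar") -> List[str]:
    """Assemble a java command line that scores a CSV file with a MOJO.

    Assumption: the PredictCsv tool and its flags as found in h2o-genmodel.jar.
    """
    return [
        "java", "-cp", genmodel_jar,
        "hex.genmodel.tools.PredictCsv",
        "--mojo", mojo_zip,
        "--input", input_csv,
        "--output", output_csv,
    ]


def score_csv_with_mojo(mojo_zip: str, input_csv: str, output_csv: str,
                        genmodel_jar: str = "h2o-genmodel.jar") -> str:
    """Run the command; raises CalledProcessError if Java exits non-zero."""
    cmd = build_mojo_predict_command(mojo_zip, input_csv, output_csv, genmodel_jar)
    subprocess.run(cmd, check=True)
    return output_csv


# Show the command without running it (file names are placeholders).
print(" ".join(build_mojo_predict_command(
    "model.zip", "new_observations.csv", "predictions.csv")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.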

The other alternative is to stick with running the H2O server and use it from Python. In that case you just save your model (a binary format) and then load it back into the H2O cluster each time you want to make predictions: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/save-and-load-model.html

The downside of this approach is that the binary format is always tied to the H2O version. So if you upgrade H2O, you cannot use your saved models any more.
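Because a binary model is tied to the H2O version that saved it, one way to catch a mismatch early is to record the version next to the model file and compare it before loading. The sketch below is only an illustration of that idea: the sidecar file convention and both helper names are hypothetical, not part of H2O.

```python
import json
from pathlib import Path


def write_version_sidecar(model_path: str, h2o_version: str) -> Path:
    """Write '<model_path>.version.json' recording the H2O version used to save."""
    sidecar = Path(str(model_path) + ".version.json")
    sidecar.write_text(json.dumps({"h2o_version": h2o_version}))
    return sidecar


def saved_with_same_version(model_path: str, current_version: str) -> bool:
    """Return True only if a sidecar exists and its version matches exactly."""
    sidecar = Path(str(model_path) + ".version.json")
    if not sidecar.exists():
        return False
    recorded = json.loads(sidecar.read_text())["h2o_version"]
    return recorded == current_version
```

With H2O installed, you would pass h2o.__version__ as the version string: call write_version_sidecar(model_path, h2o.__version__) right after h2o.save_model(), and check saved_with_same_version(model_path, h2o.__version__) before h2o.load_model().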

Newer versions of H2O have the ability to import MOJOs via the Python API:

# re-import saved MOJO
imported_model = h2o.import_mojo(path)

new_observations = h2o.import_file(path='new_observations.csv')
predictions = imported_model.predict(new_observations)

Caution: a MOJO cannot be re-imported into Python in older H2O versions, which lack the h2o.import_mojo() function.

So h2o.save_model() seems to have lost its role: we can use just my_model.save_mojo() (notice it's not an h2o function but a method of the model object), as these files can be used not just for deploying Java apps but in Python as well (in fact, this still uses a Python-Java bridge internally).
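Putting the two calls together, a save-and-reimport round trip might look like the sketch below. It assumes an H2O version recent enough to provide h2o.import_mojo() and a running cluster (h2o.init()). The function name and the h2o_module parameter are hypothetical; the parameter exists only so the helper can be exercised without a live cluster.

```python
def mojo_round_trip(model, directory, h2o_module=None):
    """Save `model` as a MOJO under `directory`, then import it back.

    `model` is any trained H2O model whose algorithm supports MOJO export.
    """
    if h2o_module is None:
        # Imported lazily; requires the h2o package and a running cluster.
        import h2o as h2o_module
    # save_mojo() is a method of the model object and returns the saved file path
    mojo_path = model.save_mojo(path=directory, force=True)
    # import_mojo() is a module-level function; it returns a model usable for predict()
    return h2o_module.import_mojo(mojo_path)
```

The returned model can then be used as in the snippet above, for example imported_model.predict(new_observations).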
