Deploy an Amazon SageMaker-generated XGBoost model in an R environment

I'm trying to deploy an XGBoost model, trained using Amazon SageMaker, in an R environment. The SageMaker-generated model is stored as a Python pickle object.

Using the {reticulate} package in R, I'm able to import the model into R. However, using the model locally in R gives very different predictions from those obtained by using the model directly on Amazon SageMaker, on the same testing dataset. I suspect there might be issues converting an XGBoost model stored by Python into a model usable in R. Here is the relevant code I used to make the conversion:

library(reticulate)
library(xgboost)

model <- py_load_object("sagemaker-model")
# save the model locally, to be reloaded into R
model$save_model("local-model")
model_R <- xgb.load("local-model")

The reason I first save the "sagemaker-model" locally and then use R to read it back is that I want to use native xgboost in R to make predictions, and not rely on reticulate for predictions. However, the predictions are clearly not correct.

The problem is that in Python, xgboost requires an np.array as input, so you have to convert the input with the xgb.DMatrix function.

Something like this:

dtrain <- xgb.DMatrix(train$data, label=train$label)
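
To see the same idea applied at prediction time, here is a minimal sketch. It does not use the actual SageMaker model: the agaricus demo data shipped with the xgboost R package stands in for the real training and test sets, and the small booster trained here (with made-up settings) stands in for "sagemaker-model". The file path and the `.json` extension are also assumptions, and JSON model files need a reasonably recent xgboost version. The sketch saves the booster, reloads it with xgb.load, wraps the test features in an xgb.DMatrix, and compares predictions before and after the round trip.

```r
library(xgboost)

# Stand-in data: the agaricus demo set bundled with the xgboost package.
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

# Stand-in for the SageMaker-trained booster (hypothetical settings).
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
model <- xgb.train(
  params = list(objective = "binary:logistic", max_depth = 2),
  data = dtrain,
  nrounds = 2
)

# Save and reload, mirroring save_model() followed by xgb.load().
model_path <- tempfile(fileext = ".json")
xgb.save(model, model_path)
model_R <- xgb.load(model_path)

# Wrap the test features in an xgb.DMatrix before predicting.
dtest <- xgb.DMatrix(agaricus.test$data)
preds_original <- predict(model, dtest)
preds_reloaded <- predict(model_R, dtest)

# Largest absolute difference between the two sets of predictions.
print(max(abs(preds_original - preds_reloaded)))
```

If the difference is negligible here but the real model still disagrees with SageMaker, the save and reload step is probably not the cause, and the test matrix itself (column order, missing-value encoding) would be the next thing to check.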
