How do I make a scikit-learn classifier accept a 2D array as input for predictions?
So I've built a model on mixed data types, following the recommended example from the scikit-learn docs that uses a ColumnTransformer to build the classifier.
https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#sphx-glr-auto-examples-compose-plot-column-transformer-mixed-types-py
Since the input comes from a CSV and is converted to a pandas DataFrame, X_train, X_test, y_train and y_test all end up as pandas objects too. Passing X_test to the clf.predict() function works fine, and I receive the predictions.
However, I want to host this model on Google Cloud ML Engine, which accepts a 2D array of instances in its prediction request API. How do I make my classifier accept an array of inputs rather than a DataFrame? I realize this may be fairly trivial, but I'm struggling to find a solution.
To make your classifier compatible with Google Cloud Machine Learning Engine (CMLE), you'll need to separate the preprocessor and the LogisticRegression classifier instead of combining them in one pipeline. The preprocessing logic then runs on the client side, and only the standalone classifier is hosted on CMLE.
After reading in the CSV file and defining the numeric and categorical transformers, you'll need to modify the training code as follows:
...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Build the preprocessor on its own instead of as a pipeline step.
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# The classifier is trained on the already-transformed 2D array.
model = LogisticRegression(solver='lbfgs')
X_train_transformed = preprocessor.fit_transform(X_train)
model.fit(X_train_transformed, y_train)

# Apply the same fitted preprocessor to the test set before scoring.
print("model score: %.3f" % model.score(preprocessor.transform(X_test), y_test))
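To see the split end to end, here is a minimal, self-contained sketch. The toy DataFrame, its column names, and the transformer definitions are assumptions standing in for the CSV and the transformers from the docs example; they are not part of the original answer. It also converts a possibly sparse output of the ColumnTransformer to a dense array, since one-hot encoding can produce a SciPy sparse matrix.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the CSV from the question.
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, np.nan, 54.0, 2.0, 27.0],
    "fare": [7.25, 71.28, 7.92, 53.1, 8.46, 51.86, 21.07, 11.13],
    "sex": ["male", "female", "female", "female",
            "male", "male", "male", "female"],
    "survived": [0, 1, 1, 1, 0, 0, 0, 1],
})
numeric_features = ["age", "fare"]
categorical_features = ["sex"]
X = df[numeric_features + categorical_features]
y = df["survived"]

# Assumed transformer definitions, modelled on the docs example.
numeric_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])
categorical_transformer = OneHotEncoder(handle_unknown="ignore")

preprocessor = ColumnTransformer(transformers=[
    ("num", numeric_transformer, numeric_features),
    ("cat", categorical_transformer, categorical_features),
])

X_transformed = preprocessor.fit_transform(X)
# The result may be a SciPy sparse matrix; make it dense if so.
if hasattr(X_transformed, "toarray"):
    X_transformed = X_transformed.toarray()

# The standalone classifier only ever sees plain numeric 2D input.
model = LogisticRegression(solver="lbfgs")
model.fit(X_transformed, y)

# A plain list of lists (what a JSON request carries) is accepted too.
instances = X_transformed.tolist()
predictions = model.predict(instances)
```

Because the model was fitted on the transformed array, it no longer depends on DataFrame column names, which is what makes a plain 2D array of instances usable at prediction time.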
You can export the model (using either pickle or joblib) and deploy it on CMLE. When constructing your JSON request to CMLE for prediction, you'll first need to preprocess your DataFrame into a 2D array using:
preprocessor.transform(X_test)
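The export and request steps could look roughly like the sketch below. The toy data, the temporary export directory, and the file name preprocessor.joblib are assumptions for illustration. The file name model.joblib and the {"instances": [...]} request body follow my understanding of what CMLE expects for scikit-learn models, so check both against the current Google Cloud documentation before relying on them.

```python
import json
import os
import tempfile

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy training data.
X = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0],
    "sex": ["male", "female", "female", "female", "male", "male"],
})
y = [0, 1, 1, 1, 0, 0]

preprocessor = ColumnTransformer(transformers=[
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
])


def to_dense(matrix):
    """Return a dense array whether or not the input is sparse."""
    return matrix.toarray() if hasattr(matrix, "toarray") else matrix


model = LogisticRegression(solver="lbfgs")
model.fit(to_dense(preprocessor.fit_transform(X)), y)

# Export both objects: the model is what gets deployed, the
# preprocessor stays with the client (file name is an assumption).
export_dir = tempfile.mkdtemp()
model_path = os.path.join(export_dir, "model.joblib")
preprocessor_path = os.path.join(export_dir, "preprocessor.joblib")
joblib.dump(model, model_path)
joblib.dump(preprocessor, preprocessor_path)

# Client side: load the fitted preprocessor and transform new rows.
client_preprocessor = joblib.load(preprocessor_path)
new_rows = pd.DataFrame({"age": [30.0, 8.0], "sex": ["female", "male"]})
instances = to_dense(client_preprocessor.transform(new_rows)).tolist()

# JSON body carrying one inner list per instance.
request_body = json.dumps({"instances": instances})
```

The client must use the same fitted preprocessor that was used at training time; refitting it on new data would change the scaling and the one-hot column order, and the hosted model's predictions would no longer be meaningful.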