
How to convert custom pipeline (categorical get_dummies) with convert_coreml?

I'm trying to save a custom sklearn pipeline as an ONNX model, but I'm getting errors in the process.

Sample code:

import copy

import numpy as np
import pandas as pd
from IPython.display import display

from sklearn import svm
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from winmltools import convert_coreml
# https://github.com/pandas-dev/pandas/issues/8918

class MyEncoder(TransformerMixin, BaseEstimator):

    def __init__(self, columns=None):
        self.columns = columns

    def transform(self, X, y=None, **kwargs):
        # np.float was removed in NumPy 1.24; the builtin float is equivalent
        return pd.get_dummies(X, dtype=float, columns=['ID'])

    def fit(self, X, y=None, **kwargs):
        return self

# data
X = pd.DataFrame([[100, 1.1, 3.1], [200, 4.1, 5.1], [100, 4.1, 2.1]], columns=['ID', 'X1', 'X2'])
Y = pd.Series([3, 2, 4])

# check transform
df = MyEncoder().transform(X)
display(df)

# create pipeline
pipe = Pipeline( steps=[('categorical', MyEncoder()), ('classifier', svm.SVR())] )
print(type(pipe), MyEncoder().transform(X).dtypes, '\n')

# prepare models
svm_toy  = svm.SVR()
svm_toy.fit(X,Y)
pipe_toy = copy.deepcopy(pipe).fit(X, Y)

# save onnx

# no problem here
initial_type = [('X', FloatTensorType( [None, X.shape[1]] ) ) ] 
onx = convert_sklearn(svm_toy, initial_types=initial_type  )

# something goes wrong...
initial_type = [('X', FloatTensorType( [None, X.shape[1]] ) ) ] 
onx = convert_sklearn(pipe_toy, initial_types=initial_type  )

The simple conversion goes well:

# no problem here
initial_type = [('X', FloatTensorType( [None, X.shape[1]] ) ) ] 
onx = convert_sklearn(svm_toy, initial_types=initial_type  )

But the pipeline conversion fails:

# something goes wrong...
initial_type = [('X', FloatTensorType( [None, X.shape[1]] ) ) ] 
onx = convert_sklearn(pipe_toy, initial_types=initial_type  )

with the following error:

MissingShapeCalculator: Unable to find a shape calculator for type ''.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter
implemented in sklearn-onnx. If the converted is implemented
in another library, you need to register
the converted so that it can be used by sklearn-onnx (function
update_registered_converter). If the model is not yet covered
by sklearn-onnx, you may raise an issue to
https://github.com/onnx/sklearn-onnx/issues
to get the converter implemented or even contribute to the
project. If the model is a custom model, a new converter must
be implemented. Examples can be found in the gallery.

Am I missing something with the customized pipeline and get_dummies?

Custom transformers, i.e. ones for which sklearn-onnx has no built-in converter, need extra information before they can be converted to ONNX. You need to write a shape calculator and a converter function for your transformer, and then register the transformer together with these two functions (the error message names update_registered_converter for this). See more in the documentation.
