How do I pickle individual steps in sklearn's Pipeline?

Question

I am using Pipeline from sklearn to classify text.

In this example Pipeline, the steps are a FeatureUnion, which wraps a TfidfVectorizer together with some custom features, followed by a classifier. I then fit the training data and run the prediction:

from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']
Y = [1, 2]
X_dev = ['another sentence']

# classifier
LinearSVC1 = LinearSVC(tol=1e-4, C=0.1)

# CustomFeatures is my own transformer class (definition not shown here)
pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 3), max_features=4000)),
        ('custom_features', CustomFeatures()),
    ])),
    ('clf', LinearSVC1),
])

pipeline.fit(X, Y)
y_pred = pipeline.predict(X_dev)

# etc.

Here I need to pickle the TfidfVectorizer step and leave custom_features unpickled, since I am still experimenting with them. The idea is to make the pipeline faster by pickling the tfidf step.

I know I can pickle the whole Pipeline with joblib.dump, but how do I pickle individual steps?

Answer 1

To pickle the TfidfVectorizer, you could use the following (dump_path is any writable file path of your choice):

import joblib
joblib.dump(pipeline.steps[0][1].transformer_list[0][1], dump_path)

or:

joblib.dump(pipeline.get_params()['features__tfidf'], dump_path)

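To load the dumped object, replace the whole entry in transformer_list. Each entry is a (name, transformer) tuple, and tuples do not support item assignment, so assign a new tuple rather than its second element:

pipeline.steps[0][1].transformer_list[0] = ('tfidf', joblib.load(dump_path))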
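Putting the dump and load steps together, a minimal end-to-end sketch might look like the following. It is an illustration, not the asker's actual setup: LengthFeatures is a hypothetical stand-in for the CustomFeatures class that the question does not show, the temporary file path is made up, and joblib is assumed to be installed as a standalone package.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC


class LengthFeatures(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the question's CustomFeatures."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # One dense column: the character length of each document.
        return np.array([[len(text)] for text in X], dtype=float)


X = ['I am a sentence', 'an example']
Y = [1, 2]
X_dev = ['another sentence']

pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 3), max_features=4000)),
        ('custom_features', LengthFeatures()),
    ])),
    ('clf', LinearSVC(tol=1e-4, C=0.1)),
])
pipeline.fit(X, Y)

# Dump only the fitted tfidf step (file location is an assumption).
dump_path = os.path.join(tempfile.mkdtemp(), 'tfidf.joblib')
union = pipeline.named_steps['features']
fitted_tfidf = union.transformer_list[0][1]
joblib.dump(fitted_tfidf, dump_path)

# Load it back and swap it in by replacing the whole (name, transformer) tuple.
loaded_tfidf = joblib.load(dump_path)
union.transformer_list[0] = ('tfidf', loaded_tfidf)

y_pred = pipeline.predict(X_dev)
```

Note that calling pipeline.fit again refits every step, including the loaded vectorizer, so a swapped-in vectorizer only saves work on the predict and transform paths.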

Unfortunately, you can't use set_params, the inverse of get_params, to insert the estimator by name. You will be able to if the changes in PR #1769 (enable setting pipeline components as parameters) are ever merged!
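The limitation above describes scikit-learn as it stood when this answer was written. Later releases appear to accept nested step names in set_params, so on a reasonably recent install the swap can be sketched by name as below. Treat this as an assumption about your installed version rather than a guarantee; the temporary file path is made up for the example.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']

# Fit the vectorizer once and dump it (file location is an assumption).
dump_path = os.path.join(tempfile.mkdtemp(), 'tfidf.joblib')
joblib.dump(
    TfidfVectorizer(ngram_range=(1, 3), max_features=4000).fit(X),
    dump_path,
)

# Build the pipeline with an unfitted placeholder in the 'tfidf' slot.
pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer()),
    ])),
    ('clf', LinearSVC(tol=1e-4, C=0.1)),
])

# Replace the nested step by its parameter name.
loaded_tfidf = joblib.load(dump_path)
pipeline.set_params(features__tfidf=loaded_tfidf)

restored = pipeline.get_params()['features__tfidf']
```

If set_params raises an error about an invalid parameter on your version, fall back to assigning the (name, transformer) tuple in transformer_list directly.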

Answer 1: 3, accepted, 2016-03-29 01:15:00
