Why doesn't fit_transform work in this sklearn Pipeline example?

I am new to sklearn Pipeline and am following some sample code. I saw in other examples that we can call pipeline.fit_transform(train_X), so I tried the same thing on the pipeline here with pipeline.fit_transform(X), but it gave me an error:

return self.fit(X, **fit_params).transform(X)
TypeError: fit() takes exactly 3 arguments (2 given)

If I remove the svm part and define the pipeline as pipeline = Pipeline([("features", combined_features)]), I still see the error.

Does anyone know why fit_transform doesn't work here?

from sklearn.pipeline import Pipeline, FeatureUnion
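# GridSearchCV lives in sklearn.model_selection; the old sklearn.grid_search module was removed
from sklearn.model_selection import GridSearchCV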

from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest

iris = load_iris()

X, y = iris.data, iris.target

# This dataset is way too high-dimensional. Better do PCA:
pca = PCA(n_components=2)

# Maybe some original features were good, too?
selection = SelectKBest(k=1)

# Build estimator from PCA and Univariate selection:

combined_features = FeatureUnion([("pca", pca), ("univ_select", selection)])

# Use combined features to transform dataset:
X_features = combined_features.fit(X, y).transform(X)

svm = SVC(kernel="linear")

# Do grid search over k, n_components and C:

pipeline = Pipeline([("features", combined_features), ("svm", svm)])

param_grid = dict(features__pca__n_components=[1, 2, 3],
                  features__univ_select__k=[1, 2],
                  svm__C=[0.1, 1, 10])

grid_search = GridSearchCV(pipeline, param_grid=param_grid, verbose=10)
grid_search.fit(X, y)
print(grid_search.best_estimator_)

You get an error in the above example because you also need to pass the labels to your pipeline: call pipeline.fit_transform(X, y) rather than pipeline.fit_transform(X). The last step in your pipeline is a classifier, SVC, and the fit method of a classifier requires the labels as a mandatory argument, because classification algorithms use the labels to train the weights in your classifier. Note that fit_transform also needs the final step to implement transform, which SVC does not, so with the svm step in place call pipeline.fit(X, y) instead.
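As a minimal sketch of passing the labels through the full pipeline, the snippet below rebuilds the pipeline from the question and trains it with fit(X, y). It assumes a reasonably recent scikit-learn release (load_iris(return_X_y=True) is used only for brevity) and uses fit followed by predict because SVC is the last step.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Same structure as in the question: PCA and univariate selection side by side.
combined_features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("univ_select", SelectKBest(k=1)),
])
pipeline = Pipeline([
    ("features", combined_features),
    ("svm", SVC(kernel="linear")),
])

# The labels are forwarded to each step's fit, including SelectKBest and SVC.
pipeline.fit(X, y)

# Once fitted, only X is needed to predict.
predictions = pipeline.predict(X)
print(predictions.shape)
```

The same (X, y) pair is what grid_search.fit(X, y) hands to the pipeline internally, which is why the grid search in the question runs without this error.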

Similarly, even if you remove the SVC, you still get an error because SelectKBest scores each feature against the labels, so its fit method also requires both X and y.
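The features-only case can be checked with a short sketch like the one below. It assumes a reasonably recent scikit-learn release; the exact exception type raised when the labels are missing appears to vary between versions (the question shows a TypeError, newer releases may raise a ValueError), so the sketch catches both. The names feature_pipeline and failed_without_labels are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion, Pipeline

X, y = load_iris(return_X_y=True)

combined_features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("univ_select", SelectKBest(k=1)),
])
feature_pipeline = Pipeline([("features", combined_features)])

# With labels, every step can be fitted and the data transformed in one call.
X_features = feature_pipeline.fit_transform(X, y)
print(X_features.shape)  # 2 PCA components + 1 selected feature -> (150, 3)

# Without labels, SelectKBest cannot score the features, so fitting fails.
try:
    feature_pipeline.fit_transform(X)
    failed_without_labels = False
except (TypeError, ValueError) as exc:
    failed_without_labels = True
    print(type(exc).__name__, exc)
```

Because the last step here is a transformer, fit_transform(X, y) returns the combined feature matrix directly.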
