Using sklearn Linear Regression and PCA in a single Pipeline
I have a Pandas data frame with 20 numeric features and a numeric response column. I would like to first apply PCA to bring the dimensionality down to 10 and then run Linear Regression to predict the numeric response. I can currently do this in two steps:
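from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: scale the features and project them onto 10 principal components
pipeline = Pipeline([('scaling', StandardScaler()),
                     ('pca', PCA(n_components=10, whiten=True))])
newDF = pipeline.fit_transform(numericDF)
# Step 2: fit the regression on the transformed features
Y = df["Response"]
model = LinearRegression()
model.fit(newDF, Y)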
Is there a way to include Linear Regression in the above pipeline? I ask because fit_transform is not supported by LinearRegression, and fit_predict can't be used with PCA. How can I run PCA and then Linear Regression in the same pipeline?
The order of the pipeline steps matters. Only the last step may be a predictor that implements predict(); every earlier step must be a transformer that implements fit() and transform(), which is what fit_transform() relies on. Calling fit() on the pipeline runs fit_transform() on each intermediate step in turn and then fit() on the final estimator. This is also the logical order: you first transform your features and then apply a predictive classification or regression model.
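Y = df["Response"]
X=...
pipeline = Pipeline([('scaling', StandardScaler()),
                     ('pca', PCA(n_components=10, whiten=True)),
                     ('regr', LinearRegression())])
# LinearRegression has no fit_predict, so fit the whole pipeline, then predict
pipeline.fit(numericDF, Y)
predictions = pipeline.predict(numericDF)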
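For anyone who wants to try this end to end, here is a minimal self-contained sketch. The data is synthetic and purely an assumption standing in for the question's frame (200 random rows, hypothetical column names f0..f19); only the names numericDF, Y and "Response" come from the question. It also shows how the fitted steps can be reached by name through named_steps.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the question's data (an assumption, not the real frame):
# 20 numeric features plus a numeric response.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 20))
df = pd.DataFrame(features, columns=[f"f{i}" for i in range(20)])
df["Response"] = features @ rng.normal(size=20) + rng.normal(scale=0.1, size=200)

numericDF = df.drop(columns=["Response"])
Y = df["Response"]

# Transformers first, the regressor last.
pipeline = Pipeline([
    ("scaling", StandardScaler()),
    ("pca", PCA(n_components=10, whiten=True)),
    ("regr", LinearRegression()),
])

# fit() scales, projects to 10 components, then fits the regression.
pipeline.fit(numericDF, Y)
# predict() applies the same scaling and projection before predicting.
predictions = pipeline.predict(numericDF)

# Fitted steps are reachable by the names given in the Pipeline.
explained = pipeline.named_steps["pca"].explained_variance_ratio_
n_coefs = pipeline.named_steps["regr"].coef_.shape[0]
```

Because the regression only ever sees the 10 PCA components, it ends up with 10 coefficients rather than 20. In real use you would probably fit on a training split and call predict on held-out rows instead of predicting on the training data as done here.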