How do I pass the output of SelectKBest to the cross_val_score function?

Question

I have the following code here:

I am trying to select the 20 best features from my dataset and then compute a cross-validated score with a Random Forest classifier. However, once I've run SelectKBest I receive two outputs, X_train_selected and X_test_selected, and it's not immediately obvious to me how to pass these to the cross_val_score function.

Answer 1

You don't need to split the data into train and test sets for cross_val_score; the function takes care of the splitting itself. When passing the features, pass the complete feature set, not X_train and X_test.

First separate the target variable:

target = df['result']

Then run SelectKBest and get the column names as you did, but this time, instead of splitting X into train and test, pass the selected columns as a single dataset like this:

X = clean_df[colnames_selected]
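
The question's code is not reproduced on this page, so the following is only a rough sketch of how `colnames_selected` might be obtained. The synthetic data, the column names `f0`, `f1`, ... and the choice of `f_classif` as the score function are assumptions for illustration; `clean_df`, `target` and `colnames_selected` follow the names used in this answer.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Hypothetical stand-in for the asker's data: 200 rows, 50 features, binary target.
X_raw, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)
clean_df = pd.DataFrame(X_raw, columns=[f"f{i}" for i in range(50)])
target = pd.Series(y, name="result")

# Score every feature against the target and keep the 20 best.
selector = SelectKBest(score_func=f_classif, k=20)
selector.fit(clean_df, target)

# get_support() returns a boolean mask over the input columns.
colnames_selected = clean_df.columns[selector.get_support()]
X = clean_df[colnames_selected]
print(X.shape)
```

Indexing the DataFrame by column name keeps `X` as a DataFrame with readable headers, whereas `selector.transform(clean_df)` would by default return a plain NumPy array.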

Then pass X and target to cross_val_score:

scores = cross_val_score(forest, X, target, cv=10)
print("Reduced features: mean of the scores: {:.2f}".format(scores.mean()))

The whole point of the function is to perform cross-validation on the dataset and return the scores obtained with the estimator provided.
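
To make "the function takes care of it" more concrete, here is a small sketch that compares one call to `cross_val_score` with a hand-written loop over the folds. The synthetic data and the `n_estimators=25` setting are assumptions chosen to keep the example quick; they are not from the question.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Hypothetical data standing in for X and target from the answer.
X, target = make_classification(n_samples=200, n_features=20, random_state=0)
forest = RandomForestClassifier(n_estimators=25, random_state=0)

# One call: cross_val_score splits, fits and scores internally.
scores = cross_val_score(forest, X, target, cv=10)

# Roughly the same work done by hand. For a classifier and an integer cv,
# scikit-learn uses unshuffled stratified folds.
manual_scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=10).split(X, target):
    model = clone(forest).fit(X[train_idx], target[train_idx])
    manual_scores.append(model.score(X[test_idx], target[test_idx]))

print(len(scores))  # one score per fold
print(np.allclose(scores, manual_scores))
```

Each fold gets its own train/test split, which is why no separate X_train and X_test need to be prepared beforehand.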

You can also use a pipeline to do this whole process in one go, as in this example.
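
The linked example is not reproduced on this page. As a minimal sketch of the pipeline idea, assuming synthetic data and `f_classif` as the score function (neither is from the question), the selector and the forest can be chained and the whole pipeline handed to `cross_val_score`:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Hypothetical data: the full, unselected feature set and the target.
X_all, target = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)

# Step names "select" and "forest" are arbitrary labels.
pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=20)),
    ("forest", RandomForestClassifier(n_estimators=25, random_state=0)),
])

# The pipeline is treated as a single estimator, so it is passed directly.
scores = cross_val_score(pipe, X_all, target, cv=10)
print("Pipeline: mean of the scores: {:.2f}".format(scores.mean()))
```

With this arrangement the feature selection is refit on the training part of every fold, so the held-out part does not influence which 20 features are chosen.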

Answer 1 (-1, 2018-06-24 10:25:03)
