Selecting best features for ML
Is there any way to extract the best features from the data? Right now, I am using `SelectKBest` from sklearn. With it, I have to specify the number k of best features to be selected. Is there any way in which I don't have to specify the number of features to be extracted, and instead extract all the useful features?
from sklearn.feature_selection import SelectKBest, chi2

test = SelectKBest(score_func=chi2, k=4)
You can use "all" instead of a number:
test = SelectKBest(score_func=chi2, k="all")
From the documentation:

k : int or "all", optional, default=10
    Number of top features to select. The "all" option bypasses selection, for use in a parameter search.
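With k="all" the scores and p-values are still computed, so you can apply your own threshold instead of fixing a count up front. A minimal sketch on the iris dataset (the 0.05 p-value cutoff is an assumption, not part of the original answer):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

iris = load_iris()

# k="all" keeps every feature but still computes chi2 scores/p-values
selector = SelectKBest(score_func=chi2, k="all")
X_scored = selector.fit_transform(iris.data, iris.target)

# keep only the features whose p-value is below 0.05 (assumed cutoff)
useful = selector.pvalues_ < 0.05
X_useful = iris.data[:, useful]
print(X_useful.shape)
```

This way the number of selected features follows from the data rather than from a hand-picked k.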
There are many ways to select features; you can find them on the wiki. I think the best feature selection method is a deep understanding of the features themselves, but usually we have a hard time understanding them.
Maybe you can use 5-fold cross-validation to build a feature importance ranking, and then select the important features from it.
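The cross-validated ranking idea could be sketched like this, assuming a random forest as the scoring model (the answer does not name one) and averaging its importances over the 5 folds:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

iris = load_iris()
X, y = iris.data, iris.target

# average feature importances over 5 folds
importances = np.zeros(X.shape[1])
for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    importances += model.feature_importances_
importances /= 5

# rank features from most to least important
ranking = np.argsort(importances)[::-1]
print([iris.feature_names[i] for i in ranking])
```

Averaging over folds makes the ranking less sensitive to any single train/test split.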
You can also use an embedded method to select them, like this:
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel

# feature selection with GBDT as the base model
iris = load_iris()
SelectFromModel(GradientBoostingClassifier()).fit_transform(iris.data, iris.target)
It's worth noting that you cannot delete a feature that seems useless on its own, because it may be related to other features. So feature selection is a greedy search process, which is often time-consuming.
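One way to automate that greedy search, and also answer the original question of not fixing the feature count, is recursive feature elimination with cross-validation. A sketch using `RFECV` with a logistic regression base model (the choice of estimator is an assumption):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

iris = load_iris()

# RFECV greedily removes features and uses 5-fold CV to decide
# how many to keep, so k never has to be specified
selector = RFECV(LogisticRegression(max_iter=1000), cv=5)
selector.fit(iris.data, iris.target)

print(selector.n_features_)  # number of features chosen automatically
print(selector.support_)     # boolean mask of the kept features
```

Because each elimination step refits the model and cross-validates, this is exactly the kind of greedy, time-consuming search described above, so expect it to scale poorly to very wide datasets.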