
How to do RFECV in scikit-learn with KFold, not StratifiedKFold?

from sklearn.cross_validation import StratifiedKFold, KFold
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=StratifiedKFold(y, 10),
              scoring='accuracy')
rfecv.fit(X, y)

is an example of doing RFECV with StratifiedKFold. The question is how to do RFECV with a normal KFold?

cv=KFold(y, 10) is not the answer, since KFold and StratifiedKFold take and return completely different values.

KFold(len(y), n_folds=n_folds) is the answer. So, for 10-fold cross-validation it would be:

rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=KFold(len(y), n_folds=10),
              scoring='accuracy')
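
Note that in scikit-learn 0.18 and later the splitters moved to sklearn.model_selection and no longer take y (or the number of samples) in the constructor; RFECV takes the splitter object directly through cv. A minimal sketch of the same idea under the newer API, assuming X and y are already defined:

from sklearn.model_selection import KFold
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

# KFold now only takes the number of splits; the data is passed at fit time.
rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=KFold(n_splits=10),
              scoring='accuracy')
rfecv.fit(X, y)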

You could create your own CV strategy manually that mimics whatever KFold does:

def createCV():
    '''Return a hand-built CV split: a list of (train_indices, test_indices) tuples,
    where the first element of each tuple is the training set and the second is the test set.
    '''
    custom_cv = [([0, 1, 2, 3, 4, 5, 6], [7]),
                 ([0, 1, 2, 3, 4, 5], [6]),
                 ([0, 1, 2, 3, 4], [5]),
                 ([0, 1, 2, 3], [4]),
                 ([0, 1, 2], [3])]
    return custom_cv

manual_cv = createCV()
rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=manual_cv,
              scoring='accuracy')

You could even take what KFold would give you and rearrange it inside createCV to suit your CV needs, as in the sketch below.
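
For instance, a minimal sketch (using the same old-style sklearn.cross_validation API as the rest of this page; the n_samples and n_folds parameter names are just illustrative) that collects the (train, test) index pairs KFold yields so they can be reordered or filtered before being passed as cv:

from sklearn.cross_validation import KFold

def createCV(n_samples, n_folds=5):
    # Collect the (train_indices, test_indices) pairs that KFold yields;
    # rearrange, drop, or merge folds here as needed before returning the list.
    folds = [(train, test) for train, test in KFold(n_samples, n_folds=n_folds)]
    return folds

manual_cv = createCV(len(y), n_folds=10)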
