
Python: How to produce reproducible results in stacked model

After so much trial and error I have finally managed to build my own stacked model, but I am unable to produce the same accuracy every time. I know I have to initialize the random_state parameter to some value, but even after explicitly setting random_state to a fixed value before calling the class method I still get different results on each run.

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.model_selection import KFold

class Stacking(BaseEstimator, ClassifierMixin):
    def __init__(self, BaseModels, MetaModel, nfolds = 3, seed = 1):
        self.BaseModels = BaseModels
        self.MetaModel = MetaModel
        self.nfolds = nfolds
        self.seed = np.random.seed(seed)  # <---- This fixed my error. thanks to foladev.

    def fit(self, X, y):
        # one list of fold-wise fitted clones per base model
        self.BaseModels_ = [list() for model in self.BaseModels]
        self.MetaModel_ = clone(self.MetaModel)
        # note: random_state only has an effect on KFold when shuffle=True
        kf = KFold(n_splits = self.nfolds, shuffle = False, random_state = 6)
        out_of_fold_preds = np.zeros((X.shape[0], len(self.BaseModels_)))

        for index, model in enumerate(self.BaseModels):
            for train_index, out_of_fold_index in kf.split(X, y):
                instance = clone(model)
                self.BaseModels_[index].append(instance)
                instance.fit(X[train_index], y[train_index])

                # out-of-fold predictions become the meta model's training features
                preds = instance.predict(X[out_of_fold_index])
                out_of_fold_preds[out_of_fold_index, index] = preds
                #print(model, preds, out_of_fold_preds.shape)
        self.MetaModel_.fit(out_of_fold_preds, y)
        return self

I am using LogisticRegression, SGDClassifier, and RandomForestClassifier as my base models and XGBoost as my meta model. random_state is present in all the models but only takes effect on the base models.
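
As a minimal sketch of that setup (the hyperparameter and seed values here are placeholders, not taken from the post), the base models with an explicit random_state might look like this:

from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.ensemble import RandomForestClassifier

# random_state pins the randomness inside each base model; values are illustrative
base_models = [
    LogisticRegression(random_state=1),
    SGDClassifier(random_state=1),
    RandomForestClassifier(n_estimators=100, random_state=1),
]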

I get the error "__init__() got an unexpected keyword argument 'random_state'" when random_state is passed to XGBClassifier.

Please note, I have tried initializing random_state before calling the class, and I have tried altering shuffle in KFold. Also, how can I initialize parameters inside the class method?

From the API, it looks like XGBClassifier uses seed:

xgboost.XGBClassifier(
    max_depth=3, 
    learning_rate=0.1, 
    n_estimators=100, 
    silent=True, 
    objective='binary:logistic', 
    booster='gbtree', 
    n_jobs=1, 
    nthread=None, 
    gamma=0, 
    min_child_weight=1, 
    max_delta_step=0, 
    subsample=1, 
    colsample_bytree=1, 
    colsample_bylevel=1, 
    reg_alpha=0, 
    reg_lambda=1, 
    scale_pos_weight=1, 
    base_score=0.5, 
    random_state=0, 
    seed=None, 
    missing=None, 
    **kwargs
)
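
For example (a hedged sketch, assuming an xgboost release whose constructor only recognizes seed; recent releases accept random_state directly, and the other hyperparameters here are placeholders), the meta model could be seeded like this:

from xgboost import XGBClassifier

# older xgboost releases take seed= rather than random_state=
meta_model = XGBClassifier(
    max_depth=3,
    learning_rate=0.1,
    n_estimators=100,
    objective='binary:logistic',
    seed=1,
)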

May I ask why you do not set a class-level seed and apply it to all the models?
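
A minimal sketch of that suggestion, assuming every estimator exposes either random_state or seed through get_params/set_params (the class name SeededStacking and the helper _apply_seed are illustrative, not part of the original code):

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.model_selection import KFold

class SeededStacking(BaseEstimator, ClassifierMixin):
    def __init__(self, BaseModels, MetaModel, nfolds=3, seed=1):
        self.BaseModels = BaseModels
        self.MetaModel = MetaModel
        self.nfolds = nfolds
        self.seed = seed  # store the value itself, not np.random.seed(...)

    def _apply_seed(self, estimator):
        # push the class-level seed into whichever parameter the estimator exposes
        params = estimator.get_params()
        if "random_state" in params:
            estimator.set_params(random_state=self.seed)
        elif "seed" in params:  # e.g. older xgboost releases
            estimator.set_params(seed=self.seed)
        return estimator

    def fit(self, X, y):
        self.BaseModels_ = [list() for _ in self.BaseModels]
        self.MetaModel_ = self._apply_seed(clone(self.MetaModel))
        # random_state only matters for KFold when shuffle=True
        kf = KFold(n_splits=self.nfolds, shuffle=True, random_state=self.seed)
        out_of_fold_preds = np.zeros((X.shape[0], len(self.BaseModels_)))

        for index, model in enumerate(self.BaseModels):
            for train_index, oof_index in kf.split(X, y):
                instance = self._apply_seed(clone(model))
                self.BaseModels_[index].append(instance)
                instance.fit(X[train_index], y[train_index])
                out_of_fold_preds[oof_index, index] = instance.predict(X[oof_index])

        self.MetaModel_.fit(out_of_fold_preds, y)
        return self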
