
Unable to run predict_proba of StackNet Classifier

Unable to run predict_proba for StackNet Classifier.

I constructed a StackNet Classifier as below:

Level 0 : XGBClassifier, GradientBoostingRegressor, CatBoostClassifier
Level 1 : XGBClassifier

The model fit was successful. But when I tried to run model.predict_proba(X_train_prep), I got the following error:

ValueError: feature_names mismatch

I don't think this is an issue with the dataset; it worked well with the individual classifiers. I would appreciate your help with the StackNet Classifier.


# Specify model tree for StackNet
models = [[xgb_clf, gbrt_clf, cat_clf], # Level 0
          [xgb_clf]] # Level 1

# Specify parameters for stacked model and begin training
model = StackNetClassifier(models, 
                           metric="auc", 
                           folds=2,
                           restacking=False,
                           use_retraining=True,
                           use_proba=True, # To use predict_proba after training
                           random_state=seed,
                           n_jobs=-1, 
                           verbose=1)
train_preds = model.predict_proba(X_train_prep)[:, 1]

~\Anaconda3\lib\concurrent\futures\_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

~\Anaconda3\lib\concurrent\futures\_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

ValueError: feature_names mismatch:

I found my answer in the following GitHub issue: https://github.com/dmlc/xgboost/issues/2334

Real Issue

Inside the fit function, X_train is converted to a numpy.ndarray, and that array is what gets passed to the "predict_proba" method. But when we call model.predict_proba(X_train) ourselves, the input is still a DataFrame. Hence the mismatch when running "predict_proba".

Solution

The StackNet wrapper around "predict_proba" should convert the input dataset into a numpy.ndarray, or we can convert the dataset into an array ourselves before passing it to "predict_proba":

X_matrix = X_train.as_matrix()
train_preds = model.predict_proba(X_matrix)[:, 1]

Note that as_matrix() was removed in pandas 1.0; on current versions use X_train.to_numpy() or X_train.values instead.
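The conversion can also be wrapped in a small helper so it is not forgotten at call sites. The following is only a minimal sketch: predict_proba_on_array and EchoModel are hypothetical names introduced here for illustration, and EchoModel is a stand-in for the fitted StackNetClassifier so that the snippet runs without StackNet or XGBoost installed. It assumes only that the model exposes predict_proba and accepts a NumPy array.

```python
import numpy as np
import pandas as pd


def predict_proba_on_array(model, X):
    """Call model.predict_proba on a plain NumPy array (hypothetical helper).

    np.asarray drops DataFrame column names, so the model receives the same
    kind of nameless array that, per the answer above, fit() used internally.
    """
    return model.predict_proba(np.asarray(X))


class EchoModel:
    """Hypothetical stand-in for the fitted StackNetClassifier (illustration only)."""

    def predict_proba(self, X):
        # Record what the model actually received, then return dummy probabilities.
        self.received_type = type(X)
        p = np.full(len(X), 0.5)
        return np.column_stack([1 - p, p])


# Made-up data, used only to show that the DataFrame is converted on the way in.
X_train_prep = pd.DataFrame({"age": [31, 45, 27], "income": [40.0, 72.5, 55.0]})
stand_in = EchoModel()
train_preds = predict_proba_on_array(stand_in, X_train_prep)[:, 1]
print(stand_in.received_type.__name__, train_preds.shape)
```

With the real model, the call would presumably be predict_proba_on_array(model, X_train_prep)[:, 1]. np.asarray is used here rather than to_numpy() so that the helper also accepts input that is already an array.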
