Unable to run predict_proba for StackNet classifier

Question

Unable to run predict_proba for StackNet classifier.

I constructed a StackNet classifier as follows:

Level 0: XGBClassifier, GradientBoostingRegressor, CatBoostClassifier
Level 1: XGBClassifier

The model fit is successful. But when I tried to run model.predict_proba(X_train_prep),

I got the following error:

ValueError: feature_names mismatch

I don't think it's an issue with the dataset; it worked well with the individual classifiers. I'd appreciate your help with the StackNet classifier.


# Specify model tree for StackNet
models = [[xgb_clf, gbrt_clf, cat_clf], # Level 0
          [xgb_clf]] # Level 1

# Specify parameters for stacked model and begin training
model = StackNetClassifier(models, 
                           metric="auc", 
                           folds=2,
                           restacking=False,
                           use_retraining=True,
                           use_proba=True, # To use predict_proba after training
                           random_state=seed,
                           n_jobs=-1, 
                           verbose=1)
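# Fitting completes without error (the fit call is not shown in the post);
# the line below is the one that raises the ValueError.
train_preds = model.predict_proba(X_train_prep)[:, 1]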

~\Anaconda3\lib\concurrent\futures\_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

~\Anaconda3\lib\concurrent\futures\_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

ValueError: feature_names mismatch:

Answer 1

I found my answer in the following GitHub issue: https://github.com/dmlc/xgboost/issues/2334

Real Issue

Inside the fit function, X_train is converted to a numpy.ndarray before it is passed to the "predict_proba" method, so the models are trained without column names. When we then call model.predict_proba(X_train) ourselves, the input is still a DataFrame and still carries its column names. Hence the feature_names mismatch when running "predict_proba".
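
To make the mechanism concrete, here is a minimal sketch using a toy model. The class `NameCheckingModel` and the column names `age` and `income` are made up for illustration; the class is not XGBoost or StackNet code, it only imitates the idea of remembering feature names at fit time and checking them at prediction time.

```python
import numpy as np
import pandas as pd


class NameCheckingModel:
    """Toy stand-in (not XGBoost itself) that remembers feature names at fit time."""

    @staticmethod
    def _names(X):
        # A DataFrame carries its column names; a bare ndarray gets generic ones.
        if isinstance(X, pd.DataFrame):
            return [str(c) for c in X.columns]
        return [f"f{i}" for i in range(np.asarray(X).shape[1])]

    def fit(self, X, y):
        self.feature_names_ = self._names(X)
        return self

    def predict_proba(self, X):
        if self._names(X) != self.feature_names_:
            raise ValueError("feature_names mismatch")
        return np.full((len(X), 2), 0.5)  # dummy probabilities


X_df = pd.DataFrame({"age": [1.0, 2.0, 3.0], "income": [4.0, 5.0, 6.0]})
y = np.array([0, 1, 0])

# Fit on the bare array, as the answer describes happening inside fit.
toy = NameCheckingModel().fit(X_df.to_numpy(), y)

try:
    toy.predict_proba(X_df)  # DataFrame input: names are "age", "income"
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

probs = toy.predict_proba(X_df.to_numpy())  # ndarray input: names match again
```

In this sketch the DataFrame call fails and the ndarray call succeeds, which is the same pattern as in the question.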

Solution

Either the StackNet wrapper of "predict_proba" should convert the input dataset into a numpy.ndarray itself, or

we can convert the dataset into an array ourselves before passing it to "predict_proba":

X_matrix = X_train.values  # originally X_train.as_matrix(), which was removed in pandas 1.0
train_preds = model.predict_proba(X_matrix)[:, 1]
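
The first option could also be approximated on the caller's side without touching StackNet. The sketch below is one possible way to do that; `ArrayInputWrapper` is a hypothetical helper name, not part of pystacknet, and it assumes only that the wrapped object exposes `predict_proba`.

```python
import numpy as np


class ArrayInputWrapper:
    """Hypothetical helper: strips column names before delegating to a fitted model."""

    def __init__(self, model):
        self.model = model

    def predict_proba(self, X):
        # np.asarray turns a DataFrame (or a list of rows) into a plain ndarray
        # and returns an existing ndarray unchanged.
        return self.model.predict_proba(np.asarray(X))
```

With the fitted model from the question, this would be called as ArrayInputWrapper(model).predict_proba(X_train_prep)[:, 1]. Note that a DataFrame with mixed column types becomes an object-dtype array under np.asarray, so this assumes the features are already numeric.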

Answer 1
0 2019-08-21 16:36:11