How can I solve the ValueError (feature mismatch) in the XGBoost classifier?

For my work, I split the data and then applied oversampling (because the class distribution is imbalanced) and feature selection. I want to use the XGBoost classifier, but I get the following error.

ValueError                                Traceback (most recent call last)
<ipython-input-16-ace98cb7898f> in <module>()
      5 model.fit(X_train, y_train)
      6 # make predictions for test data
----> 7 y_pred = model.predict(X_test)
      8 predictions = [round(value) for value in y_pred]
      9 # evaluate predictions

2 frames
/usr/local/lib/python3.7/dist-packages/xgboost/core.py in _validate_features(self, data)
   1688 
   1689                 raise ValueError(msg.format(self.feature_names,
-> 1690                                             data.feature_names))
   1691 
   1692     def get_split_value_histogram(self, feature, fmap='', bins=None, as_pandas=True):

ValueError: feature_names mismatch.

Below is the code:

X_train, X_test, y_train, y_test = train_test_split(
     features, label, test_size=0.50, random_state=42)

oversample = SMOTE()
X_train, y_train = oversample.fit_resample(X_train, y_train)
estimator = LogisticRegression()

selector = RFE(estimator, n_features_to_select=5, step=1)
selector = selector.fit(X_train, y_train)

model = XGBClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

How can I solve the error, given that oversampling and feature selection always have to happen after splitting the data?

You have used the feature selector on the training data only, which is the main reason for the feature mismatch. You can match the features by applying the same fitted selector instance to the test data as well.

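selector = RFE(estimator, n_features_to_select=5, step=1)
# RFE is supervised, so fitting it needs the labels as well
X_train = selector.fit_transform(X_train, y_train)
# Reuse the fitted selector so the test data keeps the same columns
X_test = selector.transform(X_test)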
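
If you would rather keep the column names, a minimal sketch along the following lines may help. It assumes `features` is a pandas DataFrame; the synthetic data, the `col_i` column names and the `_sel` variable names are made up for illustration and are not from the question. The oversampling step is left out to keep the sketch small. It fits the selector on the training split only and then selects the same named columns from both splits, so the two frames should present identical feature names to the model.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's `features` and `label` (assumption).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=42
)
features = pd.DataFrame(X, columns=[f"col_{i}" for i in range(10)])
label = pd.Series(y)

X_train, X_test, y_train, y_test = train_test_split(
    features, label, test_size=0.50, random_state=42
)

# Oversampling of the training split would go here (omitted in this sketch).

# Fit the selector on the training split only; RFE needs the labels.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5, step=1)
selector.fit(X_train, y_train)

# Boolean mask of kept features -> names of the kept columns.
selected = X_train.columns[selector.get_support()]

# Take exactly the same columns from both splits.
X_train_sel = X_train[selected]
X_test_sel = X_test[selected]

print(list(selected))
print(X_train_sel.shape, X_test_sel.shape)
```

`X_train_sel` and `y_train` can then be passed to `model.fit`, and `X_test_sel` to `model.predict`. If your oversampler returns a NumPy array instead of a DataFrame, the `.columns` lookup above will not work; in that case the `selector.transform` approach shown earlier is the simpler route, since it treats both splits the same way.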
