XGBoost Classifier: unexpected prediction shape in eval_metric

Question

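I have a problem with my XGBoost classifier. I want to use a custom function to evaluate my model. If I use the XGB regressor instead of the classifier, I can use this function for eval_metric. I have 3 labels, let's call them y1, y2, y3, and the predictions that fit passes to my eval_metric function have shape (n, 3), where n = len(testdata). So instead of one prediction per test sample I get 3 values. If I use the regressor, everything is fine: during fitting, my eval_metric gets predictions of shape (n, 1), one per sample. I hope my problem is clear. Here is some source code as an illustration: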

import numpy as np
from xgboost import XGBClassifier

clf = XGBClassifier()  # renamed: `class` is a reserved word in Python
# Placeholders that describe the data; they are not real arrays.
X_train = "train data with shape (m, 144)"
y_train = "train labels with shape (m, 1)"
X_test = "test data with shape (120000, 144)"
y_test = "test labels with shape (120000, 1)"

def eval_function(predicted, true):
    print(np.shape(predicted))  # This will be shape (120000, 3) and not (120000, 1) as expected
    print(np.shape(true))  # This will be shape (120000, 1) as expected
    return 1

# The keyword argument is eval_set (not evals_set).
clf.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric=eval_function)

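The problem is that fit passes my "eval_function" an array of shape (120000, 3) instead of (120000, 1), which is what the regressor does and what I expected. Maybe it is because I have 3 different labels? What can I do to get the actual predicted labels for my custom evaluation?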

Answer 1

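According to the XGBoost documentation, the prediction output follows these rules:

...For multi-class classification problem, XGBoost builds one tree for each class and the trees for each class are called a "group" of trees, so output dimension may change due to used model. After 1.4 release, we added a new parameter called strict_shape, one can set it to True to indicate a more restricted output is desired...

...In the case of multi-class with multi:softprob, the number of columns equals the number of classes...

The full documentation is here: XGBoost prediction documentation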

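Basically, XGBoost uses a one-vs-rest approach: for each class it asks "is it this class or another one?", and it keeps the class with the best "probability" (yes, that is a shortcut term).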
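As a rough illustration of "keep the class with the best probability", the sketch below assumes the multi:softprob objective, where the per-class raw scores of each row are normalized with a softmax so that every row sums to 1. The `margins` values are made up for the example; nothing here calls XGBoost itself.

```python
import numpy as np

def softmax(margins):
    """Turn per-class raw scores of shape (n, n_classes) into row-wise probabilities."""
    margins = np.asarray(margins, dtype=float)
    # Subtract the row maximum first for numerical stability.
    shifted = margins - margins.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Made-up raw scores for 2 samples and 3 classes.
margins = np.array([[2.0, 0.5, -1.0],
                    [0.1, 0.2, 3.0]])
probs = softmax(margins)

print(probs.shape)           # one column per class
print(probs.sum(axis=1))     # each row sums to 1 (up to rounding)
print(probs.argmax(axis=1))  # index of the most probable class per row
```

This is why a 3-class classifier hands the metric three columns per sample: one probability per class rather than a single value.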

That covers the dimension part of the question.
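To get the actual predicted labels inside a custom metric, one option is to take the argmax over the class columns. The following is a minimal sketch, not the library's own method: `to_labels` and `error_rate` are hypothetical names, and the `(y_true, y_pred)` argument order is an assumption. Depending on the XGBoost version, the callable may instead receive the predictions first and a DMatrix second (whose labels come from `get_label()`), so adapt the wrapper to the signature your version expects.

```python
import numpy as np

def to_labels(predicted):
    """Collapse classifier output to one predicted class per row."""
    predicted = np.asarray(predicted)
    if predicted.ndim == 2 and predicted.shape[1] > 1:
        # One column per class: pick the column with the highest score.
        return predicted.argmax(axis=1)
    # Shape (n,) or (n, 1): assumed to already hold one class label per row.
    return predicted.reshape(-1)

def error_rate(y_true, y_pred):
    """Hypothetical custom metric: fraction of misclassified rows."""
    labels = to_labels(y_pred)
    y_true = np.asarray(y_true).reshape(-1)
    return float(np.mean(labels != y_true))

# Made-up probabilities for 4 samples and 3 classes, with labels of shape (4, 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1]])
y = np.array([[0], [2], [1], [1]])

print(to_labels(probs))      # predicted class index per row
print(error_rate(y, probs))  # share of rows where prediction and label differ
```

Handling both the 2-D and the single-column case in one helper lets the same metric body accept either output shape without branching at the call site.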

But note that it makes no sense to use the same metric to evaluate both a regressor and a classifier.
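To make that point concrete, here is a small sketch contrasting a typical regression metric with a typical classification metric. Both functions are illustrative helpers written for this example (they are not XGBoost APIs), and the sample values are made up. RMSE compares one continuous value per row, while multi-class log loss needs the full (n, n_classes) probability matrix.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Regression metric: expects one continuous prediction per row."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Classification metric: expects probabilities of shape (n, n_classes)."""
    y_true = np.asarray(y_true, dtype=int).reshape(-1)
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    # Probability assigned to the true class of each row.
    picked = probs[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(picked)))

reg_score = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
clf_score = multiclass_log_loss([0, 1], [[0.5, 0.25, 0.25],
                                         [0.25, 0.5, 0.25]])
print(reg_score)
print(clf_score)
```

The two functions cannot be swapped: RMSE would flatten the probability matrix into the wrong number of values, and log loss has no class columns to index in a regressor's output.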

Solution 1
0 2021-11-02 10:39:01
