Missing parts of Confusion Matrix in XGBoost in R

I am trying to get a confusion matrix from my XGBoost model and compute the accuracy. However, my confusion matrix is incomplete: the FALSE row is missing entirely, and it looks like this:

y_pred   0   1
  TRUE 526 482

Therefore, I cannot compute the accuracy. Here is my code:

# Splitting the dataset into the training set and test set
dataset$Good.Bad.Stock = factor(dataset$Good.Bad.Stock, levels = c(0,1))
training_set = dataset[1:2740,]
test_set = dataset[2741:3748,]
data = as.factor(as.character(training_set$Good.Bad.Stock))
data = replace(training_set$Good.Bad.Stock, is.na(training_set$Good.Bad.Stock), 0)
data

# Fitting XGBoost to the Training set
classifier_XGB = xgboost(data = as.matrix(training_set[-63]), 
                     label = data, 
                     nrounds = 15,                      
                     objective = "binary:logistic")

# Predicting the Test set results
pred_data=as.matrix(test_set[-63])
y_pred = predict(classifier_XGB, pred_data)
y_pred = (y_pred > 0.5)

# Making the Confusion Matrix
cm_XGB = table(y_pred, test_set$Good.Bad.Stock)
cm_XGB

# Evaluate Model
accuracy_XGB = (cm_XGB[1,1] + cm_XGB[2,2]) / (cm_XGB[1,1] + cm_XGB[2,2] + cm_XGB[1,2] + cm_XGB[2,1])
print(accuracy_XGB)

Thank you for the help!

I didn't run the code, but I suspect the problem is in:

y_pred = (y_pred > 0.5)

Print y_pred before running that line, and you will probably see a vector of all 1s, or probabilities that are all above 0.5.
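
A minimal base-R sketch of that check, using made-up probabilities in place of the real predict() output (the vectors `probs` and `actual` are hypothetical stand-ins, not the asker's data). It also illustrates why the table loses a row: table() only lists values that actually occur, so a logical vector that is all TRUE yields a single row. Converting the predictions to a factor with both levels declared keeps the empty row, so the accuracy can still be computed.

```r
# Hypothetical stand-ins for predict(classifier_XGB, pred_data) and the true labels
probs  <- c(0.91, 0.77, 0.63, 0.88, 0.59, 0.95)
actual <- factor(c(0, 1, 1, 0, 1, 0), levels = c(0, 1))

# Inspect the raw probabilities before thresholding
print(summary(probs))

# Thresholding gives a logical vector; here every value is TRUE
pred_lgl <- probs > 0.5
cm_incomplete <- table(pred_lgl, actual)   # only a TRUE row

# Declare both levels so the class that was never predicted still gets a row
pred_fac <- factor(as.integer(pred_lgl), levels = c(0, 1))
cm_full <- table(predicted = pred_fac, actual = actual)
print(cm_full)

# Accuracy: correct predictions (the diagonal) over all predictions
accuracy <- sum(diag(cm_full)) / sum(cm_full)
print(accuracy)
```

Using sum(diag(cm)) / sum(cm) instead of indexing cm[2,2] directly avoids a subscript error when a row is missing, but a complete table does not fix the model itself: predictions that are all on one side of the threshold still need to be investigated.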

This is probably caused by a badly configured model (read more about the xgb parameters) or by a highly imbalanced dataset (although the test set does not appear to be imbalanced).

Edited: Of course, you have to be sure that your response variable is typed as a factor. You should also set the objective function to binary. As I said, I highly recommend that you keep reading introductory posts about xgb:
https://www.analyticsvidhya.com/blog/2016/01/xgboost-algorithm-easy-steps/
https://cran.r-project.org/web/packages/xgboost/vignettes/discoverYourData.html
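
On the label type, one thing worth checking (this is an assumption about the cause, since the question's data is not available): calling as.numeric() on a factor returns its internal level codes (1 and 2 here), not the original 0/1 values, whereas the binary:logistic objective expects labels in the range [0, 1]. A small base-R sketch with a hypothetical vector `stock`:

```r
# Hypothetical label column with a missing value, similar to the question's data
stock <- factor(c(0, 1, NA, 1, 0), levels = c(0, 1))

# as.numeric() on a factor returns the level codes, not the values
codes <- as.numeric(stock)                  # 1 2 NA 2 1

# Going through character recovers the original 0/1 values
label <- as.numeric(as.character(stock))    # 0 1 NA 1 0

# Replace missing labels with 0, mirroring the question's replace() call
label[is.na(label)] <- 0
print(label)
```

A numeric vector like `label` could then be passed as the label argument, while the factor version remains convenient for building the confusion matrix.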
