Why doesn't my logistic regression model output a factor with 2 levels? (Error: `data` and `reference` should be factors with the same levels.)

From reading similar questions, I understand the problem: yhat.logisticReg is not a factor with 2 levels, while training.prepped$TARGET_FLAG is. I assume this could be fixed either in the model or in the prediction step, so that yhat.logisticReg becomes a factor with 2 levels. How can I do this?

logisticReg = glm(TARGET_FLAG ~ .,
                  data = training.prepped,
                  family = binomial())
yhat.logisticReg = predict(logisticReg, training.prepped, type = "response")
confusionMatrix(yhat.logisticReg, training.prepped$TARGET_FLAG)

Error: `data` and `reference` should be factors with the same levels.
str(training.prepped$TARGET_FLAG)
Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 2 2 1 ...

str(yhat.logisticReg)
 Named num [1:8161] 0.1656 0.2792 0.3717 0.0894 0.272 ...
 - attr(*, "names")= chr [1:8161] "1" "2" "3" "4" ...

predict() with type = "response" returns probabilities rather than classes, so you may need to choose a threshold first and then convert the real-valued predictions into binary values, e.g.:

a <- c(0.2, 0.7, 0.4)
threshold <- 0.5
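# Fix the levels explicitly so both "0" and "1" exist even if one class is never predicted
binary_a <- factor(as.numeric(a > threshold), levels = c(0, 1))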

str(binary_a)
Factor w/ 2 levels "0","1": 1 2 1
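
The thresholding step above can be applied to the fitted probabilities of a logistic regression in the same way. The following is a minimal sketch, not the asker's actual pipeline: `toy`, `x1`, and `x2` are hypothetical stand-ins for `training.prepped` and its predictors, and the cutoff of 0.5 is an assumption, not a recommended value. It uses base R only, so it runs without caret.

```r
# Hypothetical stand-in for training.prepped: a small simulated data set.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$TARGET_FLAG <- factor(
  rbinom(n, 1, plogis(0.8 * toy$x1 - 0.5 * toy$x2)),
  levels = c(0, 1)
)

fit <- glm(TARGET_FLAG ~ ., data = toy, family = binomial())

# type = "response" gives numeric probabilities of the second level ("1").
prob <- predict(fit, toy, type = "response")

# Cut at 0.5 (an assumed threshold) and reuse the reference's levels,
# so the prediction and the reference are factors with identical levels.
pred <- factor(as.integer(prob > 0.5), levels = levels(toy$TARGET_FLAG))

# Base-R confusion table: rows are predictions, columns are observed classes.
print(table(Predicted = pred, Actual = toy$TARGET_FLAG))
```

Once both objects are factors with the same levels, as `pred` and `toy$TARGET_FLAG` are here, they are in the form that confusionMatrix expects for its `data` and `reference` arguments.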

The caret library provides the confusionMatrix function, which implements several metrics. Its `overall` element gives you the accuracy. If you want another metric, check whether it is implemented and read it from the result in the same way.

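library(caret)
acc = c()
# Try every predicted probability as a candidate threshold
for(value in yhat.logisticReg)
{
  # Predict class 1 when the probability exceeds the candidate threshold;
  # fixing the levels keeps both classes present even if only one is predicted
  predictions <- factor(ifelse(yhat.logisticReg <= value, 0, 1), levels = c(0, 1))
  # Compare against the true labels, not against the probabilities
  confusion_matrix = confusionMatrix(predictions, training.prepped$TARGET_FLAG)
  acc = c(acc, confusion_matrix$overall["Accuracy"])
}

best_acc = max(acc)
best_threshold  = yhat.logisticReg[which.max(acc)]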
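
The loop above calls confusionMatrix once per observation, which may be slow for thousands of rows. As a rough alternative sketch, a coarse grid of candidate thresholds can be scored with base R alone. The data here is simulated, and `actual`, `prob`, `thresholds`, and `accuracy_at` are hypothetical names introduced for illustration. A threshold tuned on the training data is likely to look better than it will on new data.

```r
# Simulated labels and scores standing in for TARGET_FLAG and yhat.logisticReg.
set.seed(42)
n <- 500
x <- rnorm(n)
actual <- factor(rbinom(n, 1, plogis(1.5 * x)), levels = c(0, 1))
prob <- plogis(1.5 * x + rnorm(n, sd = 0.5))

# Candidate thresholds on an assumed grid; adjust the step as needed.
thresholds <- seq(0.05, 0.95, by = 0.05)

# Accuracy at one threshold: share of predictions equal to the true label.
accuracy_at <- function(t) {
  pred <- factor(as.integer(prob > t), levels = levels(actual))
  mean(pred == actual)
}

acc <- vapply(thresholds, accuracy_at, numeric(1))

best_acc <- max(acc)
best_threshold <- thresholds[which.max(acc)]
print(c(threshold = best_threshold, accuracy = best_acc))
```

Accuracy is only one possible criterion. With imbalanced classes, a metric such as sensitivity or balanced accuracy may be a more useful target for the same sweep.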


