Why isn't my logistic regression model outputting a factor of 2 levels? (Error: `data` and `reference` should be factors with the same levels.)

Question

From reading similar questions, I know the problem is that yhat.logisticReg isn't a factor of 2 levels, while training.prepped$TARGET_FLAG is. I assume the issue could be fixed by changing my model or the prediction so that yhat.logisticReg is a factor of 2 levels. How can I do this?

logisticReg = glm(TARGET_FLAG ~ .,
                  data = training.prepped,
                  family = binomial())
yhat.logisticReg = predict(logisticReg, training.prepped, type = "response")
confusionMatrix(yhat.logisticReg, training.prepped$TARGET_FLAG)

Error: `data` and `reference` should be factors with the same levels.

str(training.prepped$TARGET_FLAG)
Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 2 2 1 ...

str(yhat.logisticReg)
 Named num [1:8161] 0.1656 0.2792 0.3717 0.0894 0.272 ...
 - attr(*, "names")= chr [1:8161] "1" "2" "3" "4" ...

Answer 1

You may need to choose a threshold first and then convert your real-valued predictions into binary values, e.g.:

a <- c(0.2, 0.7, 0.4)
threshold <- 0.5
binary_a <- factor(as.numeric(a>threshold))

str(binary_a)
Factor w/ 2 levels "0","1": 1 2 1
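Applied to the question's setup, the same idea might look like the sketch below. The data frame `toy` is a synthetic stand-in for training.prepped (it is not from the question), and the 0.5 cutoff is only an assumed default. Setting `levels` explicitly keeps both levels in the prediction even when one class is never predicted, which is what confusionMatrix checks for.

```r
# Synthetic stand-in for training.prepped (hypothetical data, not from the question).
set.seed(1)
n <- 200
x <- rnorm(n)
toy <- data.frame(
  x = x,
  TARGET_FLAG = factor(rbinom(n, 1, plogis(0.8 * x)), levels = c(0, 1))
)

fit <- glm(TARGET_FLAG ~ ., data = toy, family = binomial())
prob <- predict(fit, toy, type = "response")  # numeric probabilities, not a factor

# Assumed cutoff; 0.5 is a common default, not something the question specifies.
threshold <- 0.5

# Build a factor with exactly the levels of the reference, even if one class
# is never predicted.
pred <- factor(as.integer(prob > threshold), levels = levels(toy$TARGET_FLAG))

# Base-R confusion table (rows: predicted, columns: actual).
print(table(predicted = pred, actual = toy$TARGET_FLAG))

# With caret installed, the call from the question now receives two factors
# with the same levels.
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(pred, toy$TARGET_FLAG))
}
```

The first argument to confusionMatrix is the prediction and the second is the reference, so the thresholded factor goes first.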

Answer 2

The caret library provides the confusionMatrix function, which implements several metrics. Its overall element gives you the accuracy. If you want another metric, you can check whether it is implemented and read it from the result in the same way.

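library(caret)

# Try each predicted probability as a candidate threshold and record the accuracy.
acc = c()
for(value in yhat.logisticReg)
{
  # Convert probabilities to a factor with the same levels as the reference ("0", "1").
  predictions <- factor(ifelse(yhat.logisticReg <= value, 0, 1), levels = c(0, 1))
  # Compare predictions against the true labels, not against the probabilities.
  confusion_matrix = confusionMatrix(predictions, training.prepped$TARGET_FLAG)
  acc = c(acc, confusion_matrix$overall["Accuracy"])
}

# Keep the threshold with the highest accuracy.
best_acc = max(acc)
best_threshold  = yhat.logisticReg[which.max(acc)]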
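Looping over all 8161 predicted values calls confusionMatrix once per observation. If accuracy is the only metric needed, a lighter variant is to scan a fixed grid of cutoffs in base R. This is a sketch under assumptions: `prob` and `actual` are simulated stand-ins for yhat.logisticReg and training.prepped$TARGET_FLAG, and the grid spacing is arbitrary.

```r
# Hypothetical probabilities and labels standing in for yhat.logisticReg
# and training.prepped$TARGET_FLAG.
set.seed(42)
actual <- factor(rbinom(500, 1, 0.3), levels = c(0, 1))
prob <- ifelse(actual == "1", rbeta(500, 3, 2), rbeta(500, 2, 3))

# Assumed grid of candidate cutoffs.
thresholds <- seq(0.05, 0.95, by = 0.05)

# Accuracy at each cutoff: share of observations where the thresholded
# prediction equals the true label.
acc <- sapply(thresholds, function(cutoff) {
  pred <- factor(as.integer(prob > cutoff), levels = levels(actual))
  mean(pred == actual)
})

best_threshold <- thresholds[which.max(acc)]
best_acc <- max(acc)
```

A threshold chosen on the training data tends to look better than it will on new data, so a held-out set is the safer place to pick it.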
