Error: `data` and `reference` should be factors with the same levels. Confusion matrix for Logistic Regression
Why isn't my logistic regression model output a factor of 2 levels? (Error: `data` and `reference` should be factors with the same levels.)
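From reading similar questions, I understand that the problem is that yhat.logisticReg is not a factor with 2 levels, while training.prepped$TARGET_FLAG is. I think this can be fixed either by changing my model or in the prediction step, so that yhat.logisticReg becomes a factor with 2 levels. How can I do that?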
logisticReg = glm(TARGET_FLAG ~ .,
                  data = training.prepped,
                  family = binomial())
yhat.logisticReg = predict(logisticReg, training.prepped, type = "response")
confusionMatrix(yhat.logisticReg, training.prepped$TARGET_FLAG)
Error: `data` and `reference` should be factors with the same levels.
str(training.prepped$TARGET_FLAG)
Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 2 2 1 ...
str(yhat.logisticReg)
Named num [1:8161] 0.1656 0.2792 0.3717 0.0894 0.272 ...
- attr(*, "names")= chr [1:8161] "1" "2" "3" "4" ...
You probably need to choose a threshold first and then convert your real-valued predictions into binary values, for example:
a <- c(0.2, 0.7, 0.4)
threshold <- 0.5
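# Fix the levels explicitly so both "0" and "1" exist even if every value falls on one side.
binary_a <- factor(as.numeric(a > threshold), levels = c(0, 1))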
str(binary_a)
Factor w/ 2 levels "0","1": 1 2 1
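Applied to a fitted logistic regression, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the built-in mtcars data stands in for training.prepped (which is not available here), am plays the role of TARGET_FLAG, and the 0.5 cutoff is an arbitrary choice rather than a recommended value.

```r
library(caret)

# Hypothetical stand-in for training.prepped: mtcars with `am` as a 2-level factor target.
df <- mtcars[, c("am", "hp", "wt")]
df$am <- factor(df$am, levels = c(0, 1))

fit <- glm(am ~ hp + wt, data = df, family = binomial())

# type = "response" returns numeric probabilities, not a factor.
prob <- predict(fit, df, type = "response")

# Assumed cutoff; convert probabilities to a factor with the same levels as the target.
threshold <- 0.5
pred <- factor(as.numeric(prob > threshold), levels = levels(df$am))

# Both arguments are now factors with identical levels, so the call no longer errors.
cm <- confusionMatrix(pred, df$am)
cm$overall["Accuracy"]
```

Taking the levels from the reference factor keeps the two arguments aligned even when every prediction falls on one side of the cutoff.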
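The caret library has a confusionMatrix method that already implements several metrics. Indexing the overall element of the result gives you the accuracy. If you want a different metric, you can check whether it is already implemented and read it from the result in the same way.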
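library(caret)
# Try every predicted probability as a candidate threshold and record the accuracy.
acc = c()
for(value in yhat.logisticReg)
{
  # Predict 1 above the candidate threshold; fix the levels so both classes always exist.
  predictions <- factor(ifelse(yhat.logisticReg <= value, 0, 1), levels = c(0, 1))
  # Compare against the observed labels, not against the probabilities themselves.
  confusion_matrix = confusionMatrix(predictions, training.prepped$TARGET_FLAG)
  acc = c(acc, confusion_matrix$overall["Accuracy"])
}
best_acc = max(acc)
best_threshold = yhat.logisticReg[which.max(acc)]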
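If calling confusionMatrix once per prediction is too slow, the threshold search can probably be done in base R alone. The sketch below is one possible variant, not the answer's method: mtcars again stands in for the real data, and the fixed grid of cutoffs is an assumption (the loop above uses the predicted probabilities themselves as candidates).

```r
# Hypothetical stand-in data: mtcars with `am` as the 0/1 target.
df <- mtcars[, c("am", "hp", "wt")]
fit <- glm(am ~ hp + wt, data = df, family = binomial())
prob <- predict(fit, df, type = "response")

# Candidate cutoffs on a fixed grid (an assumption made for this sketch).
thresholds <- seq(0.05, 0.95, by = 0.05)

# Accuracy = share of rows where the thresholded prediction equals the observed class.
acc <- sapply(thresholds, function(t) mean(as.numeric(prob > t) == df$am))

best_threshold <- thresholds[which.max(acc)]
best_acc <- max(acc)
```

Note that a threshold tuned on the training data tends to give an optimistic accuracy; a held-out set would give a fairer estimate.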