Error when plotting a multiclass ROC curve in R

Question

I have built an SVM predictor that classifies samples into one of three groups: "good", "bad" or "ok". However, the test dataset only contains samples classed as "good" or "bad". I get an error when I try to use multi_roc, and I'm not sure of the best way to solve it. My example is below:

library(tidymodels)
library(mlbench)
library(multiROC)
data(Ionosphere)

# preprocess dataset
Ionosphere <- Ionosphere %>% select(-V1, -V2)

# split into training and test data
ion_split <- initial_split(Ionosphere, prop = 3/5)

ion_train <- training(ion_split)
ion_test <- testing(ion_split) 

# making an artificial third class in the training set for this example
ion_train[,33] <- as.character(ion_train[,33])
ion_train[1:7,33] <- "ok"
ion_train[,33] <- as.factor(ion_train[,33])

# make a recipe
iono_rec <-
  recipe(Class ~ ., data = ion_train)  %>%
  step_normalize(all_predictors()) 

# build the model and workflow
svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

svm_workflow <-
  workflow() %>%
  add_recipe(iono_rec) %>%
  add_model(svm_mod)

# run model tuning
set.seed(35)
recipe_res <-
  svm_workflow %>% 
  tune_grid(
    resamples = bootstraps(ion_train, times = 2),
    metrics = metric_set(roc_auc),
    control = control_grid(verbose = TRUE, save_pred = TRUE)
  )

# choose the best model, finalise workflow
best_mod <- recipe_res %>% select_best(metric = "roc_auc")
final_wf <- finalize_workflow(svm_workflow, best_mod)
final_mod <- final_wf %>% fit(ion_train)

predict_res <- predict(
        final_mod,
        ion_test,
        type = "prob")


results <- predict_res %>% 
    cbind(ion_test$Class) %>%
    dplyr::rename(
        bad_pred_svm = .pred_bad,
        good_pred_svm = .pred_good,
        ok_pred_svm = .pred_ok,
        class = `ion_test$Class`
    ) %>%
    mutate(
        bad_true = ifelse(class == "bad", 1, 0),
        good_true = ifelse(class == "good", 1, 0),
        ok_true = ifelse(class == "ok", 1, 0)
    ) %>%
    dplyr::select(-class)

This produces a results data frame that looks like this:

  bad_pred_svm good_pred_svm ok_pred_svm bad_true good_true ok_true
1   0.01166109    0.92349066  0.06484826        0         1       0
2   0.82937620    0.07576908  0.09485472        1         0       0
3   0.05858563    0.88043189  0.06098248        0         1       0
4   0.91602211    0.04624037  0.03773753        1         0       0
5   0.91841475    0.04407115  0.03751410        1         0       0
6   0.01014520    0.94295540  0.04689940        0         1       0

When I try to pass this to multi_roc, I get an error:

multi_roc_svm <- multi_roc(results, force_diag = TRUE)

Error in approx(res_sp[[i]][[j]], res_se[[i]][[j]], all_sp, yleft = 1,  : 
  need at least two non-NA values to interpolate
In addition: Warning messages:
1: In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values
2: In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values

I'm 99% sure this error occurs because there are no samples of the "ok" class in my test data frame, but I don't know how to get around it. Could I plot the multiclass ROC curve by hand?
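The suspicion above can be checked directly by counting the positive examples in each one-hot `_true` column. The following is a minimal base-R sketch, assuming a data frame laid out like `results`; the `results_toy` values are made up for illustration and are not output from the model.

```r
# Toy stand-in for the `results` data frame (hypothetical values).
results_toy <- data.frame(
  bad_pred_svm  = c(0.01, 0.83, 0.06, 0.92),
  good_pred_svm = c(0.92, 0.08, 0.88, 0.05),
  ok_pred_svm   = c(0.07, 0.09, 0.06, 0.03),
  bad_true      = c(0, 1, 0, 1),
  good_true     = c(1, 0, 1, 0),
  ok_true       = c(0, 0, 0, 0)
)

# Count the positive examples in each one-hot "_true" column.
true_cols <- grep("_true$", names(results_toy), value = TRUE)
n_pos <- colSums(results_toy[true_cols])
print(n_pos)

# A class with zero positives has no defined true positive rate,
# so a one-vs-rest ROC curve cannot be computed for it.
empty_classes <- sub("_true$", "", names(n_pos)[n_pos == 0])
print(empty_classes)
```

If `empty_classes` is non-empty, any one-vs-rest ROC routine has nothing to interpolate for those classes, which would be consistent with the `approx()` error shown above.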

Answer 1

I don't know which package multi_roc() comes from, but the tidymodels solution is pretty easy.

If you just want to get the ROC AUC value from the multiclass ROC curve, you can use the yardstick function roc_auc():

> predict_res %>% 
+     bind_cols(ion_test) %>% 
+     # or roc_curve(Class, .pred_bad)
+     roc_auc(Class, .pred_bad)
# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 roc_auc binary         0.976
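If a hand-built curve is wanted, as the question asks, one-vs-rest ROC points can be computed in base R. This is only a rough sketch under stated assumptions: the helper names `roc_points`, `auc_trapezoid` and `plot_curves` are hypothetical, the `toy` data frame is invented and merely mimics the column layout of `results`, and classes with no positive (or no negative) samples are skipped instead of being interpolated.

```r
# One-vs-rest ROC points for one class.
# truth: 0/1 vector (1 = sample belongs to the class); score: predicted probability.
roc_points <- function(truth, score) {
  stopifnot(length(truth) == length(score))
  n_pos <- sum(truth == 1)
  n_neg <- sum(truth == 0)
  if (n_pos == 0 || n_neg == 0) return(NULL)  # curve is undefined for this class
  thresholds <- sort(unique(score), decreasing = TRUE)
  tpr <- vapply(thresholds, function(t) sum(score >= t & truth == 1) / n_pos, numeric(1))
  fpr <- vapply(thresholds, function(t) sum(score >= t & truth == 0) / n_neg, numeric(1))
  data.frame(fpr = c(0, fpr), tpr = c(0, tpr))  # start the curve at (0, 0)
}

# Area under the curve by the trapezoidal rule.
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Draw all available curves on one base-graphics plot.
plot_curves <- function(curves) {
  plot(c(0, 1), c(0, 1), type = "l", lty = 2,
       xlab = "False positive rate", ylab = "True positive rate")
  for (i in seq_along(curves)) {
    lines(curves[[i]]$fpr, curves[[i]]$tpr, col = i + 1)
  }
  legend("bottomright", legend = names(curves),
         col = seq_along(curves) + 1, lty = 1)
}

# Invented data with the same column naming as `results`; "ok" has no positives.
toy <- data.frame(
  bad_pred_svm  = c(0.90, 0.80, 0.30, 0.60, 0.20, 0.10),
  good_pred_svm = c(0.05, 0.10, 0.60, 0.30, 0.70, 0.80),
  ok_pred_svm   = c(0.05, 0.10, 0.10, 0.10, 0.10, 0.10),
  bad_true      = c(1, 1, 1, 0, 0, 0),
  good_true     = c(0, 0, 0, 1, 1, 1),
  ok_true       = c(0, 0, 0, 0, 0, 0)
)

classes <- c("bad", "good", "ok")
curves <- lapply(classes, function(cl) {
  roc_points(toy[[paste0(cl, "_true")]], toy[[paste0(cl, "_pred_svm")]])
})
names(curves) <- classes
curves <- Filter(Negate(is.null), curves)  # drops "ok", which has no positives

aucs <- vapply(curves, auc_trapezoid, numeric(1))
print(aucs)  # 8/9 for both "bad" and "good" on this toy data
```

Calling `plot_curves(curves)` then draws the remaining curves against the chance diagonal. Skipping the empty class is a design choice here. Whether a macro- or micro-average over only the observed classes is meaningful depends on how the result will be reported.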

Answer 1 (2021-05-11 18:11:52)