ROC in Multiclass kNN

I'm trying to run some ROC analysis on a multiclass kNN model and dataset.

So far I have this code for the kNN model. It works well.
X_train_new is a dataset with 131 numeric variables (columns) and 7210 observations.
Y_train is the outcome variable, which I have as a factor. It is a dataset with only 1 column (activity) and 7210 observations (there are 6 possible factor levels).

library(caret)

# 10-fold cross-validation
ctrl <- trainControl(method = "cv",
                     number = 10)

# kNN with k fixed at 5, evaluated on accuracy
model2 <- train(X_train_new,
                Y_train$activity,
                method    = "knn",
                tuneGrid  = expand.grid(k = 5),
                trControl = ctrl,
                metric    = "Accuracy"
)

X_test_new is a dataset with 131 numeric variables (columns) and 3089 observations.
Y_test is the outcome variable, which I have as a factor. It is a dataset with only 1 column and 3089 observations (there are 6 possible factor levels).

I run the predict function:

knnPredict_test <- predict(model2 , newdata = X_test_new )

I would like to do some ROC analysis on each class vs. all the others. I'm trying:

a = multiclass.roc ( Y_test$activity, knnPredict_test )

knnPredict_test is a vector with the predicted classes:

knnPredict_test <- predict(model2 ,newdata = X_test_new )
> length(knnPredict_test)
[1] 3089
> glimpse(knnPredict_test)
 Factor w/ 6 levels "laying","sitting",..: 2 1 5 1 3 2 4 5 3 2 ...

This is the error I'm getting:

Error in roc.default(response, predictor, levels = X, percent = percent,  :   
Predictor must be numeric or ordered.

To get the ROC, you need a numeric prediction. However, by default predict will give you the predicted classes, which is a factor and triggers the error above. Use type = "prob" to get class probabilities instead.

Here is a reproducible example which has the same error.

library(caret)

knnFit <- train(
  Species ~ .,
  data = iris,
  method = "knn"
)

predictions_bad <- predict(knnFit)

pROC::multiclass.roc(iris$Species, predictions_bad)
#> Error in roc.default(response, predictor, levels = X, percent = percent,  : 
#>   Predictor must be numeric or ordered.

Using type = "prob" fixes the error.

predictions_good <- predict(knnFit, type = "prob")

pROC::multiclass.roc(iris$Species, predictions_good)
#> Call:
#> multiclass.roc.default(response = iris$Species, predictor = predictions_good)
#> 
#> Data: multivariate predictor predictions_good with 3 levels of iris$Species: setosa, versicolor, virginica.
#> Multi-class area under the curve: 0.9981
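
The call above reports a single averaged multi-class AUC. Since the question asks for each class vs. all the others, here is a minimal sketch of a one-vs-rest analysis on the same iris example. It assumes the caret and pROC packages are installed; the names probs and ovr_auc and the seed value are illustrative and not part of the original answer.

```r
library(caret)
library(pROC)

set.seed(42)  # resampling inside train() is random; the seed value is arbitrary

# Refit the same kNN model as in the example above
knnFit <- train(Species ~ ., data = iris, method = "knn")

# Data frame with one column of class probabilities per factor level
probs <- predict(knnFit, newdata = iris, type = "prob")

# One-vs-rest: treat each class in turn as the positive class (1)
# and all other classes as the negative class (0)
ovr_auc <- sapply(levels(iris$Species), function(cls) {
  is_cls  <- as.integer(iris$Species == cls)
  roc_cls <- roc(response = is_cls, predictor = probs[[cls]],
                 levels = c(0, 1), direction = "<", quiet = TRUE)
  as.numeric(auc(roc_cls))
})

print(ovr_auc)
```

For the model in the question, the same pattern should apply with predict(model2, newdata = X_test_new, type = "prob") as the probabilities and Y_test$activity as the response. Replacing the sapply body with a call that returns roc_cls itself would give one curve object per class, which can then be passed to plot().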
