Writing a function in R to plot ROC curve using pROC

I'm trying to write a function to plot ROC curves for the different scoring systems I use to predict an outcome.

I have a dataframe data_all, with columns "score_1" and "Threshold.2000". I generate a ROC curve as desired with the following:

plot.roc(data_all$Threshold.2000, data_all$score_1)

My goal is to generate a ROC curve for a number of different outcomes (e.g. Threshold.1000) and scores (score_1, score_2, etc.), but I am initially trying to set it up just for different scores. My function is as follows:

roc_plot <- function(dataframe_of_interest, score_of_interest) {
    plot.roc(dataframe_of_interest$Threshold.2000, dataframe_of_interest$score_of_interest)
}

I get the following error: Error in roc.default(x, predictor, plot = TRUE, ...): No valid data provided.

I'd be very grateful if someone could spot why my function doesn't work. I'm a Python coder and new-ish to R, and haven't had much luck trying a number of different things. Thanks very much.

EDIT: Here is the same example with mtcars so it's reproducible:

library(pROC)
data(mtcars)
plot.roc(mtcars$vs, mtcars$mpg) # --> makes correct graph
roc_plot <- function(dataframe_of_interest, score_of_interest) {
    plot.roc(dataframe_of_interest$mpg, dataframe_of_interest$score_of_interest)
}

Calling roc_plot(mtcars, vs) then gives: Error in roc.default(x, predictor, plot = TRUE, ...): No valid data provided.

Here's one solution that works as desired (i.e. lets the user specify different values for score_of_interest):

library(pROC)
data(mtcars)

plot.roc(mtcars$vs, mtcars$mpg) # --> makes correct graph

# expects `score_of_interest` to be a string!!!
roc_plot <- function(dataframe_of_interest, score_of_interest) {
    plot.roc(dataframe_of_interest$vs, dataframe_of_interest[, score_of_interest])
}

roc_plot(mtcars, 'mpg')
roc_plot(mtcars, 'cyl')

Note that your error did not result from an incorrect column name; it resulted from an incorrect use of the data.frame class, specifically the $ operator. Notice what happens with a simpler function:

foo <- function(x, col_name) {
    head(x$col_name)
}
foo(mtcars, mpg)
## NULL

This returns NULL. The $ operator does not evaluate col_name; it looks for a column literally named col_name, and mtcars has no such column. So in your original function, when you tried to supply plot.roc with dataframe_of_interest$score_of_interest, you were actually feeding plot.roc a NULL.
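The difference can be checked directly in base R. This is only a small sketch; the names col_name, via_dollar, and via_brackets are made up for illustration. It contrasts $, which takes its right-hand side literally, with [[ ]], which evaluates it first:

```r
data(mtcars)

col_name <- "mpg"

# `$` does not evaluate col_name: it looks for a column literally named "col_name"
via_dollar <- mtcars$col_name

# `[[` evaluates col_name first, so this extracts the mpg column
via_brackets <- mtcars[[col_name]]

is.null(via_dollar)
is.numeric(via_brackets)
```

Both checks should come back TRUE, which matches the NULL that reached plot.roc in the original function.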

There are several ways to extract a column from a data.frame by the column name when that name is stored in an object (which is what you're doing when you pass it as an argument to a function). Perhaps the easiest way is to remember that a data.frame is like a 2D array-type object, so we can use the familiar object[i, j] syntax: we ask for all rows and specify the column by name, e.g., mtcars[, 'mpg']. This still works if we assign the string 'mpg' to an object:

x <- 'mpg'
mtcars[, x]
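If the name might be misspelled or passed unquoted, it can help to fail early with a clearer message than pROC's "No valid data provided". A minimal sketch, assuming a hypothetical helper called get_column (not part of pROC or base R) that uses [[ ]] as another way to extract a column by name:

```r
# Hypothetical helper: extract a column by name, failing early with a clear message
get_column <- function(df, col_name) {
  if (!is.character(col_name) || length(col_name) != 1) {
    stop("`col_name` must be a single string, e.g. 'mpg'")
  }
  if (!col_name %in% names(df)) {
    stop("Column '", col_name, "' not found in the data frame")
  }
  df[[col_name]]
}

x <- "mpg"
head(get_column(mtcars, x))
```

Inside a plotting function, get_column(dataframe_of_interest, score_of_interest) could then stand in for dataframe_of_interest[, score_of_interest].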

So that's how I produced my solution. Going a step further, it's not hard to imagine how you could supply both a score_of_interest and a threshold_of_interest:

roc_plot2 <- function(dataframe_of_interest, threshold_of_interest, score_of_interest) {
    plot.roc(dataframe_of_interest[, threshold_of_interest], 
             dataframe_of_interest[, score_of_interest])
}

roc_plot2(mtcars, 'vs', 'mpg')
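To cover the original goal of several outcomes and several scores, the same idea can be wrapped in a loop. The following is a rough sketch rather than a definitive implementation: roc_plot_all is a hypothetical name, and it assumes every outcome column is binary (in mtcars, vs and am are). With the data from the question, the call might look like roc_plot_all(data_all, c('Threshold.1000', 'Threshold.2000'), c('score_1', 'score_2')).

```r
# Hypothetical helper: draw one ROC curve per (outcome, score) pair
roc_plot_all <- function(df, outcomes, scores) {
  # Every combination of outcome and score column names, kept as strings
  combos <- expand.grid(outcome = outcomes, score = scores,
                        stringsAsFactors = FALSE)
  for (i in seq_len(nrow(combos))) {
    pROC::plot.roc(df[[combos$outcome[i]]], df[[combos$score[i]]],
                   main = paste(combos$score[i], "vs", combos$outcome[i]))
  }
  invisible(combos)  # return the pairs that were plotted
}

# Only run the example when pROC is installed
if (requireNamespace("pROC", quietly = TRUE)) {
  plotted <- roc_plot_all(mtcars, c("vs", "am"), c("mpg", "cyl"))
}
```

Each curve is drawn as a separate plot; the returned data frame lists which outcome was paired with which score.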
