Calculating multiple ROC curves in R using a for loop and pROC package. What variable to use in the predictor field?

I am using the pROC package and I want to compute multiple ROC curves in a for loop. My variables are column names stored as strings in a vector, and I want pROC to read that vector sequentially and use each string in the "predictor" field, which seems to accept text/characters. However, I cannot pass the variable correctly, as I am getting the error:

'predictor' argument should be the name of the column, optionally quoted.

Here is example code using the aSAH dataset:

ROCvector <- c("s100b", "ndka")
for (i in seq_along(ROCvector)) {
    a <- ROCvector[i]
    pROC_obj <- roc(data=aSAH, outcome, as.character(a))

    #code for output/print#

}

I have tried passing just "a", and using the functions print() or get(), without success. Writing the variable name manually (with or without quotes) works, of course. Is there something I am missing about the type of variable I should use in the predictor field?

By passing data=aSAH as the first argument, you are triggering non-standard evaluation (NSE) of the arguments, dplyr-style. Therefore you cannot simply pass the column name in a variable. Note the inconsistency with outcome, which you pass unquoted and which looks like a variable (but isn't). Fortunately, functions with NSE in dplyr come with an equivalent function with standard evaluation, whose name ends with _. The pROC package follows this convention. You should usually use those if you are programming with column names.

Long story short, you should use the roc_ function instead, which accepts character strings as column names (don't forget to quote "outcome"):

pROC_obj <- roc_(data=aSAH, "outcome", as.character(a))

A slightly more idiomatic version of your code would be:

for (predictor in ROCvector) {
    pROC_obj <- roc_(data=aSAH, "outcome", predictor)
}
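
The loop above overwrites pROC_obj on every iteration. If the curves are needed afterwards, one option is to collect them in a named list. The following is only a minimal sketch: it assumes pROC is installed and uses its bundled aSAH dataset, and the name roc_list is introduced here for illustration.

```r
library(pROC)
data(aSAH)

ROCvector <- c("s100b", "ndka")

# Build one ROC object per column name; roc_() takes the column names as strings
roc_list <- lapply(ROCvector, function(predictor) {
    roc_(data = aSAH, "outcome", predictor)
})
names(roc_list) <- ROCvector

# Extract the AUC of each curve as a plain numeric vector
auc_values <- sapply(roc_list, function(r) as.numeric(auc(r)))
print(auc_values)
```

Each curve can then be retrieved by name, for example roc_list[["s100b"]], and passed to plot() or other pROC functions.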

roc can also accept a formula, so we can use paste0 and as.formula to create one, i.e.:

library(pROC)
ROCvector<- c("s100b","ndka")
for (i in seq_along(ROCvector)){
    a<-ROCvector[i]
    pROC_obj <- roc(as.formula(paste0("outcome~",a)), data=aSAH)
    print(pROC_obj)
    #code for output/print#

}
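
As a variant of the loop above, base R's reformulate() builds the same kind of formula from strings without pasting text together. This is a sketch under the same assumptions (pROC installed, aSAH dataset); the variable name f is illustrative only.

```r
library(pROC)
data(aSAH)

ROCvector <- c("s100b", "ndka")
for (predictor in ROCvector) {
    # reformulate("s100b", response = "outcome") yields the formula outcome ~ s100b
    f <- reformulate(predictor, response = "outcome")
    pROC_obj <- roc(f, data = aSAH)
    print(auc(pROC_obj))
}
```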

To get the original call, i.e. one without the paste0 expression in it, which you can use later for downstream calculations, use eval and bquote:

pROC_obj <- eval(bquote(roc(.(as.formula(paste0("outcome~",a))), data=aSAH)))
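
To see what bquote contributes here, the call can be built without evaluating it. This small sketch needs only base R; the names f and cl are introduced for illustration.

```r
a <- "s100b"
f <- as.formula(paste0("outcome~", a))

# bquote() substitutes the value of .(f) into the expression, so the
# unevaluated call contains the formula itself rather than the paste0() code
cl <- bquote(roc(.(f), data = aSAH))
print(cl)

# eval(cl) would then run roc() on that call (requires pROC and aSAH)
```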
