[英]How do I write a simple function that incorporates a function from an R package?
This runs fine when I specify everything, but just trying to generalize it a bit with "score" and "outcome" and it fails (see the end).当我指定所有内容时,它运行良好,但只是试图用“分数”和“结果”来概括它,但它失败了(见最后)。 Any idea how to do this?
知道怎么做吗? (I have the indices thing because I want to bootstrap this later)
(我有索引的东西,因为我想稍后引导它)
library(PRROC)
df <- iris %>% filter(Species != "virginica") %>% mutate(outcome_versi = ifelse(Species == "versicolor", 1, 0)) %>% select(Sepal.Length, outcome_versi)
#Iris single AUC
fc <- function(data, indices){
d <- data[indices,]
versi.y <- d %>% filter(outcome_versi == 1) %>% select(Sepal.Length)
versi.n <- d %>% filter(outcome_versi == 0)%>% select(Sepal.Length)
prroc.sepal.length <-pr.curve(scores.class0 = versi.y$Sepal.Length, scores.class1 = versi.n$Sepal.Length, curve=T)
return(prroc.sepal.length$auc.integral)
}
fc(df)
#AUC = 0.94
#Iris single AUC - functionalized
fcf <- function(score, outcome, data, indices){
d <- data[indices,]
test.pos <- d %>% filter(outcome==1) %>% select(score)
test.neg <- d %>% filter(outcome==0) %>% select(score)
prroc.test <-pr.curve(scores.class0 = test.pos$score, scores.class1 = test.neg$score, curve=T)
return(prroc.test$auc.integral)
}
fcf(data=df, score=Sepal.Length, outcome = outcome_versi)
#Error: 'outcome' not found```
Does this work?这行得通吗?
fcf <- function(score, outcome, data, indices){
d <- data[indices,]
test.pos <- d %>% filter(outcome==1) %>% select(all_of(score))
test.neg <- d %>% filter(outcome==0) %>% select(all_of(score))
prroc.test <-pr.curve(scores.class0 = test.pos$score, scores.class1 = test.neg$score, curve=T)
return(prroc.test$auc.integral)
}
fcf(data=df, score='Sepal.Length', outcome = 'outcome_versi')
I don't have the required package to test.我没有测试所需的 package。 But I assume it's because you've asked for a column in the df but that isn't a variable by itself.
但我认为这是因为您要求在 df 中提供一列,但这本身并不是一个变量。
NB if you have an older version of dplyr
you might need to make use of rlang
quasiquotation注意,如果您有旧版本的
dplyr
,您可能需要使用rlang
quasiquotation
As I mentioned yesterday, this is a standard NSe problem, which is [almost] always encountered when programming in the tidyverse.正如我昨天提到的,这是一个标准的 NSe 问题,在 tidyverse 中编程时[几乎] 总是会遇到。 The problem is caused by the fact that tidyverse allows you to write, for example,
问题是由于 tidyverse 允许您编写,例如,
iris %>% filter(Sepal.Length < 6)
All other things being equal, at the time the function is called, the object Sepal.Length
does not exist, but no error is thrown and the code works "as expected".在所有其他条件相同的情况下,在调用 function 时,object
Sepal.Length
不存在,但不会引发错误并且代码“按预期”工作。
Here's how I deal with this in your situation.这是我在你的情况下如何处理这个问题。 Note that I have removed the
condition
parameter to the function, because I feel this is more naturally handled by a call to filter
earlier in the pipe and I have moved data
/ d
to be the first parameter of the function so that it fits more naturally into a pipe.请注意,我已将
condition
参数删除到 function,因为我觉得这更自然地通过在 pipe 中调用filter
来处理,并且我已将data
/ d
移动为 ZC1C425268E68384F1AB40 的第一个参数,因此它更自然地适合进入 pipe。
Also, I don't have the PRROC
package, so have commened out the call to it inside the function, and replaced the original return value accordingly.另外,我没有
PRROC
package,所以在 function 中取消了对它的调用,并相应地替换了原始返回值。 Simply make the obvious changes to get the functionality you need.只需进行明显的更改即可获得所需的功能。 The solution to the NSE issue does not depend on access to
PRROC
. NSE 问题的解决方案不依赖于对
PRROC
的访问。
library(magrittr)
library(dplyr)
fcf <- function(d, score=Sepal.Length, outcome = outcome_versi){
qScore <- enquo(score)
qOutcome <- enquo(outcome)
test.pos <- d %>% filter(!! qOutcome == 1) %>% select(!! qScore)
test.neg <- d %>% filter(!! qOutcome == 0) %>% select(!! qScore)
# prroc.test <-pr.curve(scores.class0 = test.pos$score, scores.class1 = test.neg$score, curve=T)
# return(prroc.test$auc.integral)
return(list("pos"=test.pos, "neg"=test.neg))
}
# as_tibble simply to improve formatting
as_tibble(iris) %>%
mutate(outcome_versi = ifelse(Species == "versicolor", 1, 0)) %>%
fcf()
$pos
# A tibble: 50 × 1
Sepal.Length
<dbl>
1 7
2 6.4
3 6.9
4 5.5
5 6.5
6 5.7
7 6.3
8 4.9
9 6.6
10 5.2
# … with 40 more rows
$neg
# A tibble: 100 × 1
Sepal.Length
<dbl>
1 5.1
2 4.9
3 4.7
4 4.6
5 5
6 5.4
7 4.6
8 5
9 4.4
10 4.9
# … with 90 more rows
And similarly,同样,
set.seed(123)
as_tibble(iris) %>%
mutate(
outcome_versi = ifelse(Species == "versicolor", 1, 0),
RandomOutcome=runif(nrow(.)) > 0.5
) %>%
filter(Sepal.Length < 6) %>%
fcf(score=Petal.Width, outcome=RandomOutcome)
$pos
# A tibble: 40 × 1
Petal.Width
<dbl>
1 0.2
2 0.2
3 0.2
4 0.3
5 0.2
6 0.2
7 0.2
8 0.1
9 0.1
10 0.4
# … with 30 more rows
$neg
# A tibble: 43 × 1
Petal.Width
<dbl>
1 0.2
2 0.2
3 0.4
4 0.1
5 0.2
6 0.2
7 0.4
8 0.3
9 0.3
10 0.2
# … with 33 more rows
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.