Call function from the global environment with implicit dataframe variables (from the calling env?) inside dplyr::summarise or mutate
dplyr summarise function not working with global environment variables
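I am trying to write a function that computes the proportion of one column (outcome) given the value of another column. The code is as follows: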
library(dplyr)  # %>%, group_by(), summarize()
library(tidyr)  # spread()

thresh_measure <- function(data, indicator, thresh_value)
{
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n=sum(indicator <= thresh_value)) %>% spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)
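The summarize call seems to be at fault: the function as written returns all zeros. When I change it to the following, with the column name hard-coded, it works: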
thresh_measure <- function(data, indicator, thresh_value)
{
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n=sum(pass_rate <= thresh_value)) %>% spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)
I tried setting the value through .GlobalEnv, and I also detached every library except dplyr, but it still does not work.
You have to deal with the name of the column that you pass as an argument. For example (there is surely a better way):
thresh_measure <- function(data, indicator, thresh_value)
{
  d1 <- data
  # rename the requested column to the literal name used inside summarize()
  names(d1)[names(d1)==indicator] <- "indicator"
  d1 <- d1 %>%
    group_by(class_number, outcome) %>%
    summarize(n=sum(indicator <= thresh_value)) %>% spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
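To see why the original version returns zeros, it may help to look at the comparison on its own. The sketch below is base R only and reuses the argument names from the question. Inside the function, indicator holds the string 'pass_rate' rather than the column, so R coerces the number to a string and compares two strings. In the usual locales, where digits sort before letters, the result is FALSE, and summing it gives 0 for every group.

```r
# Inside the original function, `indicator` is the string "pass_rate",
# not the pass_rate column, so the comparison is string versus number.
indicator <- "pass_rate"
thresh_value <- 0.8

# R coerces 0.8 to "0.8" and compares the two strings
comparison <- indicator <= thresh_value

print(comparison)       # a single logical value, not one value per row
print(sum(comparison))  # the count that summarize() would store in n
```

Renaming the column to "indicator", as in the answer above, makes the name inside summarize() resolve to a real column again.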
Two working alternatives:
# alternative I: pass the column name as a string
thresh_measure <- function(data, indicator, thresh_value)
{
  # turn the string into a symbol, then unquote it with !!
  # (!! is the current spelling of the deprecated UQ())
  ind_quo <- rlang::sym(indicator)
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n=sum(!!ind_quo <= thresh_value)) %>% spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)
# alternative II: pass the column as a bare (unquoted) name
thresh_measure <- function(data, indicator, thresh_value)
{
  # capture the argument as a quosure, then unquote it with !!
  # (!! is the current spelling of the deprecated UQ())
  ind_quo <- enquo(indicator)
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n=sum(!!ind_quo <= thresh_value)) %>% spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, pass_rate, 0.8)
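Recent dplyr releases also offer shorter spellings of these two alternatives: the .data pronoun for a column name given as a string, and the {{ }} (embrace) operator for a bare column name. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for the .groups argument) and rlang 0.4.0 or later (for {{ }}). The toy data frame and the function names thresh_measure_chr and thresh_measure_sym are made up for illustration; only the column names come from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; only the column names come from the question
df <- data.frame(
  class_number = c(1, 1, 1, 2, 2, 2),
  outcome      = c("fail", "pass", "pass", "fail", "fail", "pass"),
  pass_rate    = c(0.5, 0.9, 0.7, 0.6, 0.85, 0.95)
)

# Column name passed as a string: look it up with the .data pronoun
thresh_measure_chr <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(.data[[indicator]] <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

# Column passed as a bare name: embrace the argument with {{ }}
thresh_measure_sym <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum({{ indicator }} <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

res_chr <- thresh_measure_chr(df, "pass_rate", 0.8)
res_sym <- thresh_measure_sym(df, pass_rate, 0.8)
print(res_chr)
print(res_sym)
```

Both calls should give the same table; which form to prefer depends mainly on whether callers already have the column name as a string.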