dplyr summarize function does not work with global environment variables

Question

I'm trying to write a function that calculates the proportion of one column (outcome) given the values of another column. The code looks like this:

thresh_measure <- function(data, indicator, thresh_value)
{
   d1 <- data %>% 
    group_by(class_number, outcome) %>%
    summarize(n=sum(indicator <= thresh_value)) %>% spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

Something seems to go wrong in the summarise step: the function as written returns all 0s. When I hard-code the column name like this, it works:

thresh_measure <- function(data, indicator, thresh_value)
{
   d1 <- data %>% 
    group_by(class_number, outcome) %>%
    summarize(n=sum(pass_rate <= thresh_value)) %>% spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

I've tried using .GlobalEnv to set the value, and I've also detached all libraries except dplyr, but it still isn't working.

Answer 1

You have to deal with the fact that the column name is passed in as a parameter. For example (better ways certainly exist):

thresh_measure <- function(data, indicator, thresh_value)
{
  d1 <- data
  # rename the column whose name is stored in `indicator` to the literal name "indicator"
  names(d1)[names(d1) == indicator] <- "indicator"
  d1 <- d1 %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(indicator <= thresh_value)) %>% spread(outcome, n)

  d1$thresh_value <- thresh_value
  return(d1)
}
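Why the original version returns all 0s can be seen in base R alone. The sketch below uses made-up values; only the names `indicator`, `thresh_value` and `pass_rate` come from the question. Inside `summarize`, `indicator` is not a column of the data, so it resolves to the function argument, which holds the string 'pass_rate'. Comparing a string with a number coerces the number to text, and in typical locales "pass_rate" does not sort before "0.8", so the comparison is FALSE and the sum is 0.

```r
# Illustrative values only; the real data frame is not shown in the question.
indicator    <- "pass_rate"        # what the function argument actually holds
thresh_value <- 0.8
pass_rate    <- c(0.5, 0.9, 0.7)   # hypothetical column contents

# String vs. number: 0.8 is coerced to "0.8" and the two are compared as text.
string_cmp <- indicator <= thresh_value

# Column vs. number: an element-wise numeric comparison.
column_cmp <- pass_rate <= thresh_value

zero_count <- sum(string_cmp)   # what the question's first version computes per group
real_count <- sum(column_cmp)   # what was intended
```

This is also why the renaming trick above works: once a column is literally called "indicator", the name inside `summarize` refers to the data rather than to the string argument.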

Answer 2

Two alternative ways that should work:

# alternative I
thresh_measure <- function(data, indicator, thresh_value)
{
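    # turn the column-name string into a symbol
    ind_quo <- rlang::sym(indicator)
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        # `!!` is the current spelling of UQ(); the parentheses keep it tied to ind_quo
        summarize(n=sum((!!ind_quo) <= thresh_value)) %>% spread(outcome, n)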
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

# alternative II
thresh_measure <- function(data, indicator, thresh_value)
{
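    # capture the bare column name that the caller typed (note: no quotes in the call below)
    ind_quo <- enquo(indicator)
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        # `!!` is the current spelling of UQ(); the parentheses keep it tied to ind_quo
        summarize(n=sum((!!ind_quo) <= thresh_value)) %>% spread(outcome, n)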
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, pass_rate, 0.8)
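A third option, not taken from either answer, is the `.data` pronoun, which looks a column up by a name held in a string. The sketch below is a minimal illustration under stated assumptions: the toy data frame and the function name `thresh_count` are hypothetical, the `.groups` argument assumes dplyr 1.0.0 or later, and the `spread` step from the question is left out to keep the example dplyr-only.

```r
library(dplyr)

# Hypothetical toy data; only the column names follow the question.
df <- data.frame(
  class_number = c(1, 1, 1, 2, 2, 2),
  outcome      = c("a", "a", "b", "a", "b", "b"),
  pass_rate    = c(0.5, 0.9, 0.7, 0.85, 0.6, 0.95)
)

# thresh_count is an illustrative name, not part of the original post.
thresh_count <- function(data, indicator, thresh_value) {
  data %>%
    group_by(class_number, outcome) %>%
    # .data[[indicator]] fetches the column named by the string in `indicator`
    summarize(n = sum(.data[[indicator]] <= thresh_value), .groups = "drop") %>%
    mutate(thresh_value = thresh_value)
}

res <- thresh_count(df, "pass_rate", 0.8)
```

The call keeps the quoted column name from the question ('pass_rate'), so callers do not need to change how they invoke the function.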

2 answers

Answer 1: 0, accepted, 2018-02-01 14:01:44
Answer 2: 0, 2018-02-01 14:45:58
