dplyr summarise function not working with global environment variables

I'm trying to write a function that calculates the proportion of one column (outcome) given the values of another column. The code looks like this:

thresh_measure <- function(data, indicator, thresh_value)
{
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        summarize(n = sum(indicator <= thresh_value)) %>%
        spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

Something seems to go wrong inside summarise: the function above returns all 0s. When I hard-code the column name instead of using the indicator argument, it works:

thresh_measure <- function(data, indicator, thresh_value)
{
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        summarize(n = sum(pass_rate <= thresh_value)) %>%
        spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

I've tried using .GlobalEnv to set the value, and I've also detached all libraries except dplyr, but it still isn't working.

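Inside summarize(), indicator is not looked up as a column: it is the function argument, which holds the string 'pass_rate'. The comparison is therefore made between that string and thresh_value, which is FALSE here, so every sum is 0. You have to turn the column name passed as a parameter into an actual column reference. For example (better ways certainly exist), you can rename the column first: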

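thresh_measure <- function(data, indicator, thresh_value)
{
  d1 <- data
  # Rename the column whose name is stored in `indicator` to the literal
  # name "indicator", so that summarize() finds a column with that name
  names(d1)[names(d1) == indicator] <- "indicator"
  d1 <- d1 %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(indicator <= thresh_value)) %>%
    spread(outcome, n)

  d1$thresh_value <- thresh_value
  return(d1)
}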
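To see why the version in the question returns 0 for every group, the comparison can be reproduced outside dplyr. This is a minimal base R sketch; the pass_rate values are made up for illustration and are not the asker's data.

```r
# Inside summarize(), `indicator` is the function argument,
# i.e. the string "pass_rate", not the column of that name.
indicator <- "pass_rate"
thresh_value <- 0.8

# Comparing a string with a number coerces the number to a string,
# so this is the string comparison "pass_rate" <= "0.8".
string_cmp <- indicator <= thresh_value
n_string <- sum(string_cmp)

# Hypothetical values standing in for the real pass_rate column
pass_rate <- c(0.5, 0.9, 0.7)
n_column <- sum(pass_rate <= thresh_value)

print(string_cmp)
print(n_string)
print(n_column)
```

The string comparison yields a single FALSE in the usual locales (digits sort before letters), while the comparison against the actual column counts the rows at or below the threshold.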

Two alternative ways that should work:

# alternative I: pass the column name as a string
thresh_measure <- function(data, indicator, thresh_value)
{
    # Turn the string into a symbol that refers to the column
    ind_sym <- rlang::sym(indicator)
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        # `!!` is the operator form of UQ(); it unquotes the symbol
        summarize(n = sum(!!ind_sym <= thresh_value)) %>%
        spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, 'pass_rate', 0.8)

# alternative II: pass the column name unquoted
thresh_measure <- function(data, indicator, thresh_value)
{
    # Capture the expression the caller supplied (here: pass_rate)
    ind_quo <- enquo(indicator)
    d1 <- data %>%
        group_by(class_number, outcome) %>%
        # `!!` is the operator form of UQ(); it unquotes the quosure
        summarize(n = sum(!!ind_quo <= thresh_value)) %>%
        spread(outcome, n)
    d1$thresh_value <- thresh_value
    return(d1)
}

final_test <- thresh_measure(df, pass_rate, 0.8)
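With more recent versions of the tidyverse, the same two ideas can probably be written more compactly: the .data pronoun for a column name held in a string, and the {{ }} (embrace) operator for an unquoted column name. The sketch below assumes dplyr 1.0.0 or later (for the .groups argument) and rlang 0.4.0 or later (for {{ }}). The function names thresh_measure_str and thresh_measure_bare and the example data frame are hypothetical, chosen only for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical example data with the columns used in the question
df <- data.frame(
  class_number = c(1, 1, 1, 2, 2, 2),
  outcome      = c("fail", "pass", "pass", "fail", "fail", "pass"),
  pass_rate    = c(0.5, 0.9, 0.7, 0.6, 0.85, 0.95)
)

# Column name passed as a string: look it up with the .data pronoun
thresh_measure_str <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(.data[[indicator]] <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

# Column name passed unquoted: forward it with the embrace operator
thresh_measure_bare <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum({{ indicator }} <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

res_str  <- thresh_measure_str(df, "pass_rate", 0.8)
res_bare <- thresh_measure_bare(df, pass_rate, 0.8)
print(res_str)
```

Both functions should return the same table, one row per class_number with one count column per outcome value.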
