
How to use UQ (!!) on the RHS of an assignment involving base R functions when using dplyr?

I would like to use variables with varying names (i.e. names built from the function's arguments) in calculations on the RHS of an assignment. Base R functions (min, max, etc.) interpret the result of the unquoting operator !! as a string, which is not what I want. How do I tell R to work with variable names rather than strings? There is no problem when using !! with dplyr verbs such as select, mutate, etc.


library(dplyr)

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5),
  b = sample(5)
)

my_s_mutate <- function(df, group_var, expr) {

  group_var <- enquo(group_var)
  expr <- enquo(expr)
  mean_name <- paste0("mean_", quo_name(expr))
  sum_name <- paste0("sum_", quo_name(expr))

  df %>%
    group_by(!! group_var) %>%
    mutate(
         !! mean_name := mean(!! expr),
         !! sum_name := sum(!! expr),
         diff = !! sum_name - !! mean_name
  )
}

my_s_mutate(df, g1, a)

This code gives an error because it evaluates the difference between two strings. A similar problem occurs when using min or max.

 Error in "sum_a" - "mean_a" : non-numeric argument to binary operator 

Any ideas on how to solve this?

You can use this function:

my_s_mutate <- function(df, group_var, expr) {
  expr1 <- enquo(expr)
  mean_name <- paste0("mean_", quo_name(expr1))
  sum_name <- paste0("sum_", quo_name(expr1))
  
  df %>%
    group_by({{group_var}}) %>%
    mutate(
      !! mean_name := mean({{expr}}),
      !! sum_name := sum({{expr}}),
      diff = .data[[sum_name]] - .data[[mean_name]]
    )
}

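To work with variable names passed as arguments, use {{ }}. To use strings as new column names, use !! together with := . To refer to an existing column by a name held in a string, use .data[[ ]]. In the original code, !! sum_name unquoted a plain string, so the expression became "sum_a" - "mean_a", which is why the subtraction failed.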
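As an alternative to .data, the string can be converted to a symbol before unquoting. The sketch below is only an illustration of that idea, not part of the original answer: the data frame `toy` and the function name `my_s_mutate_sym` are made up for the example, and it assumes the dplyr and rlang packages are installed.

```r
library(dplyr)

# Hypothetical toy data, fixed so the result is reproducible
toy <- tibble(g = c(1, 1, 2), a = c(1, 5, 4))

# Hypothetical variant that uses rlang::sym() instead of .data
my_s_mutate_sym <- function(df, group_var, expr) {
  expr_name <- rlang::as_label(enquo(expr))
  mean_name <- paste0("mean_", expr_name)
  sum_name <- paste0("sum_", expr_name)

  df %>%
    group_by({{ group_var }}) %>%
    mutate(
      !!mean_name := mean({{ expr }}),
      !!sum_name := sum({{ expr }}),
      # sym() turns the string "sum_a" into the symbol sum_a, so the
      # subtraction sees columns rather than strings; the parentheses
      # make the scope of each !! explicit
      diff = (!!rlang::sym(sum_name)) - (!!rlang::sym(mean_name))
    ) %>%
    ungroup()
}

res_sym <- my_s_mutate_sym(toy, g, a)
```

Both approaches look up the same columns. .data[[name]] is usually easier to read, while sym() is handy when a symbol is needed elsewhere in the expression.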

my_s_mutate(df, g1, a)

# A tibble: 5 x 7
# Groups:   g1 [2]
#     g1    g2     a     b mean_a sum_a  diff
#  <dbl> <dbl> <int> <int>  <dbl> <int> <dbl>
#1     1     1     1     2      3     6     3
#2     1     2     5     3      3     6     3
#3     2     1     2     4      3     9     6
#4     2     2     4     5      3     9     6
#5     2     1     3     1      3     9     6
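The new column names can also be built without paste0(), using the glue-style "{{ }}" syntax on the left-hand side of := . This is a minimal sketch, assuming a reasonably recent rlang and dplyr (the name interpolation is not available in old releases). `toy` and `my_s_mutate_glue` are hypothetical names used only here. Because the new names never exist as strings in this version, diff is recomputed from the input column instead of looked up by name.

```r
library(dplyr)

# Hypothetical toy data, fixed so the result is reproducible
toy <- tibble(g = c(1, 1, 2), a = c(1, 5, 4))

# Hypothetical variant that interpolates the argument into the column names
my_s_mutate_glue <- function(df, group_var, expr) {
  df %>%
    group_by({{ group_var }}) %>%
    mutate(
      # "mean_{{ expr }}" becomes "mean_a" when expr is a
      "mean_{{ expr }}" := mean({{ expr }}),
      "sum_{{ expr }}" := sum({{ expr }}),
      # recomputed directly, since no name strings are kept around
      diff = sum({{ expr }}) - mean({{ expr }})
    ) %>%
    ungroup()
}

res_glue <- my_s_mutate_glue(toy, g, a)
```

If the derived columns must be referenced by name later in the same mutate(), keeping the names in strings and using .data[[ ]] as in the answer is the simpler route.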
