
Creating new variable with dplyr::mutate based on multiple conditions and corresponding variable names passed by string vector (or tidyselect)

I'm pretty sure this has been discussed before, but I'm struggling to put the problem into words. For example, I'd like to create this data frame...

iris %>%
    mutate(has_petal_1.4 = Petal.Length == 1.4 | Petal.Width == 1.4,
           width_greater_1 = Sepal.Width > 1 & Petal.Width > 1)

...without having to name the variables in the conditions explicitly. Is there a way to pass the variable names using a string vector? Unfortunately, this doesn't seem to work:

varsel <- c('Petal.Length', 'Petal.Width')
iris %>%
  mutate(has_petal_1.4 = 1.4 %in% c(!!! syms(varsel)))
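A likely reason the attempt above does not give a per-row answer: after unquoting, the expression is `1.4 %in% c(Petal.Length, Petal.Width)`, and `c()` pools both columns into one long vector. The base R sketch below illustrates this under that assumption; the names `pooled`, `whole_table` and `per_row` are made up for the illustration.

```r
# Base R sketch: what the unquoted expression boils down to.
varsel <- c("Petal.Length", "Petal.Width")

# c() concatenates both columns into one vector of 300 values ...
pooled <- unlist(iris[varsel], use.names = FALSE)

# ... so %in% answers "is 1.4 anywhere in the data?" with a single value,
# which mutate() would then recycle to every row.
whole_table <- 1.4 %in% pooled

# The intended row-wise result, written out explicitly for comparison.
per_row <- iris$Petal.Length == 1.4 | iris$Petal.Width == 1.4
```

Here `whole_table` has length 1, whereas `per_row` has one value per row of `iris`.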

Moreover, I wonder whether there is a solution using tidyselect within the mutate() function. So far, I have used the new and handy across() function to mutate multiple variables. Is it possible to use it for conditions as well? Here is another example that doesn't work:

iris %>%
  mutate(has_petal_1.4 = across(c(starts_with('Petal')), function(x) {1.4 %in% x}))

Any help is highly appreciated.

There are multiple ways; one option is c_across:

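library(dplyr) # >= 1.0.0
varsel <- c('Petal.Length', 'Petal.Width') # as defined in the question
iris %>% 
    rowwise() %>% 
    # all_of() marks varsel as an external character vector of column names
    mutate(has_petal_1.4 = any(c_across(all_of(varsel)) == 1.4),
           width_greater_1 = all(c_across(ends_with('Width')) > 1)) %>%
    ungroup()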
# A tibble: 150 x 7
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species has_petal_1.4 width_greater_1
#          <dbl>       <dbl>        <dbl>       <dbl> <fct>   <lgl>         <lgl>          
# 1          5.1         3.5          1.4         0.2 setosa  TRUE          FALSE          
# 2          4.9         3            1.4         0.2 setosa  TRUE          FALSE          
# 3          4.7         3.2          1.3         0.2 setosa  FALSE         FALSE          
# 4          4.6         3.1          1.5         0.2 setosa  FALSE         FALSE          
# 5          5           3.6          1.4         0.2 setosa  TRUE          FALSE          
# 6          5.4         3.9          1.7         0.4 setosa  FALSE         FALSE          
# 7          4.6         3.4          1.4         0.3 setosa  TRUE          FALSE          
# 8          5           3.4          1.5         0.2 setosa  FALSE         FALSE          
# 9          4.4         2.9          1.4         0.2 setosa  TRUE          FALSE          
#10          4.9         3.1          1.5         0.1 setosa  FALSE         FALSE          
# … with 140 more rows

Or a faster option with rowSums:

iris %>%
    # the `.` placeholder relies on the magrittr pipe (%>%)
    mutate(has_petal_1.4 =  rowSums(select(., all_of(varsel)) == 1.4) > 0,
           width_greater_1 = rowSums(select(., ends_with('Width')) > 1) == 2)
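A further possibility, not part of the original answer: assuming dplyr >= 1.0.4 is installed, `if_any()` and `if_all()` accept tidyselect helpers and combine the per-column conditions row by row, without `rowwise()`. A minimal sketch follows; `res` is just an illustrative name.

```r
library(dplyr)  # assumes dplyr >= 1.0.4, where if_any()/if_all() were added

varsel <- c("Petal.Length", "Petal.Width")

res <- iris %>%
  mutate(
    # TRUE where at least one of the selected columns equals 1.4
    has_petal_1.4   = if_any(all_of(varsel), ~ .x == 1.4),
    # TRUE where every column ending in "Width" is greater than 1
    width_greater_1 = if_all(ends_with("Width"), ~ .x > 1)
  )

head(res)
```

This keeps the selection inside tidyselect, so the string vector and the `ends_with()` helper can be used side by side.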
