
Use mutate and case_when on a group of variables

I am trying to make a new variable that depends on a few conditions. Here is an example of data similar to mine:

df <- read.table(text="
color     num_1   shape      num_2   season    num_3     num_4  
red        1      triangle    4       Fall      2          8
blue       5      square      4       Summer    8          1
green      3      square      11      Summer    4          1
red        3      circle      2       Summer    1          5
red        7      triangle    6       Winter    7          9
blue       9      square      2       Fall      7          4", header = TRUE)

I want to use mutate and case_when to make a new variable. For example, if color is "red" and any of the "num" columns is less than 3, the new variable's value would be "yes"; or if color is "blue" and any of the "num" columns is less than 5, the new variable would be "yes".

color     num_1   shape      num_2   season    num_3     num_4     new_var
red        1      triangle    4       Fall      2          8         yes 
blue       5      square      4       Summer    8          1         yes
green      3      square      11      Summer    4          1         no
red        3      circle      2       Summer    1          5         yes
red        7      triangle    6       Winter    7          9         no
blue       9      square      2       Fall      7          4         yes

I think I can do something like:


df <- df %>%
 mutate(new_var=case_when(
   color=="red" & c(2,4,6,7) < 3 ~ "Yes",
   color=="blue" & c(2,4,6,7) < 5 ~ "Yes" ,
   TRUE~"No"))

But I don't know if it is possible to choose the columns by position like this. Any advice would be great!

You can't use raw column indexes like that, but you can use if_any:

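library(dplyr)  # if_any() requires dplyr >= 1.0.4

df %>% 
  mutate(
    new_var = case_when(
      # red rows: "Yes" if any num_* column is below 3
      color == "red" & if_any(starts_with("num"), ~ . < 3) ~ "Yes",
      # blue rows: "Yes" if any num_* column is below 5
      color == "blue" & if_any(starts_with("num"), ~ . < 5) ~ "Yes",
      # everything else
      TRUE ~ "No")
  )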
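On the part of the question about positions: a bare `c(2,4,6,7) < 3` just compares four numbers with 3, but tidyselect also accepts integer positions as a column selection. Below is a minimal sketch, assuming the data frame is ungrouped and the num columns really sit at positions 2, 4, 6 and 7; the name `by_position` is only illustrative.

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = "
color num_1 shape    num_2 season num_3 num_4
red   1     triangle 4     Fall   2     8
blue  5     square   4     Summer 8     1
green 3     square   11    Summer 4     1
red   3     circle   2     Summer 1     5
red   7     triangle 6     Winter 7     9
blue  9     square   2     Fall   7     4", header = TRUE)

# Assumption: columns 2, 4, 6 and 7 are num_1 ... num_4
by_position <- df %>%
  mutate(
    new_var = case_when(
      color == "red"  & if_any(c(2, 4, 6, 7), ~ .x < 3) ~ "Yes",
      color == "blue" & if_any(c(2, 4, 6, 7), ~ .x < 5) ~ "Yes",
      TRUE ~ "No"
    )
  )

by_position$new_var
```

Selecting by name (as with starts_with) is usually more robust, since positions silently point at different columns if the layout of the data changes.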

The functions across, if_any, and if_all are related: each lets you use the tidyselect helpers to work with multiple columns at once. if_any and if_all (available from dplyr 1.0.4) return one logical value per row, which is why they can be used directly inside a case_when condition.
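To illustrate how the three functions differ, here is a small sketch on a made-up data frame (`toy`, `flags`, and `doubled` are hypothetical names, not part of the question's data). if_any and if_all collapse the selected columns into one logical value per row, while across applies a function to each selected column.

```r
library(dplyr)

# Small hypothetical data frame, for illustration only
toy <- data.frame(
  id    = c("a", "b", "c"),
  num_1 = c(1, 5, 9),
  num_2 = c(2, 1, 8)
)

flags <- toy %>%
  mutate(
    any_small = if_any(starts_with("num"), ~ .x < 3),  # at least one num_* below 3
    all_small = if_all(starts_with("num"), ~ .x < 3)   # every num_* below 3
  )

# across() transforms each selected column in place
doubled <- toy %>%
  mutate(across(starts_with("num"), ~ .x * 2))

flags
doubled
```

Here row "b" has one value below 3 and one above, so any_small is TRUE and all_small is FALSE for it.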
