
How to group by, then create a new data frame whose values depend on each group's first row and on whether a particular value appears in the group

I have a simple data frame like this:

df <- data.frame(x=c(1,1,3,3,2,2,2,1),
                 y=c('a','b','a','b','e','a','d','c'))


I want to group by x and create a new data frame with two columns, 'x' and 'test'. The value of 'test' for each group depends on these conditions:

  • If the first row of the group has y == 'a' and 'c' appears anywhere among the group's y values, then 'test' = 1, else 0

  • If the first row of the group has y == 'e' and 'd' appears anywhere among the group's y values, then 'test' = 1, else 0

So the expected outcome would be as below

[image: expected output]

Thank you very much.

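library(dplyr)

df %>%
  group_by(x) %>%
  # 1 if the group starts with "a" and contains "c",
  # or starts with "e" and contains "d"; otherwise 0
  summarise(test = ((first(y) == "a" && any(y == "c")) ||
                    (first(y) == "e" && any(y == "d"))) * 1L)

Alternatively, with mutate() followed by summarize() (the native pipe |> needs R 4.1 or later):

library(dplyr)
library(stringr)

df |>
  group_by(x) |>
  # flag the first row of each group when it meets either condition
  mutate(test = (row_number() == 1 & y == "a" & any(str_detect(y, "c"))) |
           (row_number() == 1 & y == "e" & any(str_detect(y, "d")))) |>
  # at most one row per group is TRUE, so the sum is 0 or 1
  summarize(test = sum(test))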
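
For anyone who prefers not to load dplyr, the same rule can be sketched in base R. This is only an illustration of the conditions stated in the question: the `rules` vector and the `check_group()` helper are hypothetical names introduced here and do not come from either answer. The sketch assumes a group whose first value has no rule should get 0.

```r
# Same data as in the question
df <- data.frame(x = c(1, 1, 3, 3, 2, 2, 2, 1),
                 y = c("a", "b", "a", "b", "e", "a", "d", "c"))

# Hypothetical lookup: name = required first value of the group,
# value = the value that must also appear somewhere in that group
rules <- c(a = "c", e = "d")

# Hypothetical helper: returns 1L if the group satisfies its rule, else 0L
check_group <- function(y, rules) {
  first_val <- y[1]
  # Assumption: no rule for this first value means test = 0
  if (!first_val %in% names(rules)) return(0L)
  as.integer(rules[[first_val]] %in% y)
}

# Split y by x (row order within each group is preserved)
groups <- split(as.character(df$y), df$x)

result <- data.frame(
  x = as.numeric(names(groups)),
  test = vapply(groups, check_group, integer(1), rules = rules),
  row.names = NULL
)
result
#   x test
# 1 1    1
# 2 2    1
# 3 3    0
```

Keeping the conditions in a named vector means a further pair (for example, a new first value and its required companion) could be added without touching the logic.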
