
dplyr: filter a sequence of rows (in one column)

A dummy data frame:

id_family <- c(1, 1, 2, 2, 3, 3)
people <- c("male", "female", "male", "female", "male", "children")

dataset <- data.frame(id_family, people)
dataset

This gives:

id_family   people
1           male            
1           female          
2           male            
2           female          
3           male            
3           children

What I want: to keep only the families whose rows form the sequence "male" followed by "female".

Expected result: only families 1 and 2 are kept.

id_family   people
1           male            
1           female          
2           male            
2           female          

I tried dplyr's lag()/lead() functions without success:

dataset2 <- dataset %>%
  filter(people == "male", lead(people) == "female")

We can use all() after grouping by family, keeping the families that contain both "male" and "female":

library(dplyr)
dataset %>%
  group_by(id_family) %>%
  # keep a family only if both values occur somewhere in the group
  filter(all(c("male", "female") %in% people))
# A tibble: 4 x 2
# Groups: id_family [2]
#  id_family people
#      <dbl> <fctr>
#1         1   male
#2         1 female
#3         2   male
#4         2 female
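
To see why this works, it can help to look at what the condition evaluates to for each family. A minimal sketch that rebuilds the question's data; the names has_both and group_check are only illustrative choices:

```r
library(dplyr)

# Rebuild the example data from the question
dataset <- data.frame(
  id_family = c(1, 1, 2, 2, 3, 3),
  people = c("male", "female", "male", "female", "male", "children")
)

# One row per family: TRUE when both "male" and "female" occur in the group
group_check <- dataset %>%
  group_by(id_family) %>%
  summarise(has_both = all(c("male", "female") %in% people))

group_check$has_both
# [1]  TRUE  TRUE FALSE
```

Inside filter(), that single TRUE or FALSE is recycled across every row of the group, so a family is kept or dropped as a whole. Note that this check ignores row order and would also keep a family that has extra members besides "male" and "female".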

Or, as per the OP's comments, if the order matters:

dataset %>%
  group_by(id_family) %>%
  # exactly two rows per family, "male" first and "female" last
  filter(first(people) == "male", last(people) == "female", n() == 2)
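
The lead() attempt in the question returns only the "male" rows, because its condition is true only on the first row of each pair. If you would rather keep that approach, one possible variant is to test both rows of the pair within each family. This is a sketch, assuming you want every adjacent "male" then "female" pair, even in families with more than two rows; the name pairs is illustrative:

```r
library(dplyr)

# Rebuild the example data from the question
dataset <- data.frame(
  id_family = c(1, 1, 2, 2, 3, 3),
  people = c("male", "female", "male", "female", "male", "children")
)

pairs <- dataset %>%
  group_by(id_family) %>%
  filter(
    (people == "male" & lead(people) == "female") |  # first row of a pair
      (people == "female" & lag(people) == "male")   # second row of a pair
  ) %>%
  ungroup()

pairs$id_family
# [1] 1 1 2 2
```

Because of group_by(), lead() and lag() never look across family boundaries; at the edges of a group they return NA, and filter() drops rows whose condition is not TRUE.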
