简体   繁体   中英

I want to filter group id's specific conditions meeting on both column and some row value in r

I have sample data and I want to filter the number of id's never had sup status while on type ==N , meaning I only choose id with status == unsup while before switching type, and then the number id's who switched from N to P .

  • for example id==1 never had status==sup while on type==N , So I need to count id 1. Then I want to check this id also whether switching to P . But id 2 is not eligible for selected because it have sup status while on type==N .

  • id's 2, 5, and id 7 will not be eligible for as they had status == sup , while on status N and id 7 was on NA only while on N .

data <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,6,6,6,6,6,7,7,7),
                   type=c('N','N','N', 'N', 'P','P','N','N','N', 'I', 'I','N','N','N',
 'N', 'N','N','N','N', 'O', 'O','N','N','N', 'O','N','N','N', 'P', 'P', 'N','N','P'), 
status=c(NA,'unsup',NA,'unsup',NA,'sup',NA,NA,'sup',NA,'sup','unsup',NA,'unsup',NA,
'unsup','unsup',NA,'unsup',NA,'sup','sup',NA,NA,'unsup',NA,'unsup','unsup','unsup','sup', NA, NA, 'sup'))

Expected output

1.

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   3    N  unsup
8   3    N   <NA>
9   3    N  unsup
10  3    N   <NA>
11  3    N  unsup
12  4    N  unsup
13  4    N   <NA>
14  4    N  unsup
15  4    O   <NA>
16  4    O    sup
17  6    N   <NA>
18  6    N  unsup
19  6    N  unsup
20  6    P  unsup
21  6    P    sup

Then of which, id's switched to P are:

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   6    N   <NA>
8   6    N  unsup
9   6    N  unsup
10  6    P  unsup
11  6    P    sup

For the first case, after grouping by 'id', filter any 'id's not having status value as 'sup' and type as 'N' and those ids having any non-NA value for status where type is 'N'

library(dplyr)
data1 <- data %>% 
  group_by(id) %>%
  filter((!any((status %in% 'sup' & type == 'N'), na.rm = TRUE))& 
      any(!is.na(status[type == "N"]))) %>% 
  ungroup

-output

data1
# A tibble: 21 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     3 N     unsup 
 8     3 N     <NA>  
 9     3 N     unsup 
10     3 N     <NA>  
# … with 11 more rows

From the subset data, we can filter again after checking for any case where there is a type value of 'N' and the next value ( lead ) is 'P' for each 'id'

data1 %>% 
  group_by(id) %>%
  filter(any(type== "N" & lead(type) == "P", na.rm = TRUE)) %>% 
  ungroup
# A tibble: 11 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     6 N     <NA>  
 8     6 N     unsup 
 9     6 N     unsup 
10     6 P     unsup 
11     6 P     sup   

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM