简体   繁体   English

我想过滤组 id 在 r 中的列和某些行值上满足的特定条件

[英]I want to filter group id's specific conditions meeting on both column and some row value in r

I have sample data and I want to filter the number of id's never had sup status while on type ==N , meaning I only choose id with status == unsup while before switching type, and then the number id's who switched from N to P .我有样本数据,我想过滤在type ==N时从未有过sup状态的id's数量,这意味着我只在切换类型之前选择status == unsup的 id,然后选择从N to P的数字 id .

  • for example id==1 never had status==sup while on type==N , So I need to count id 1. Then I want to check this id also whether switching to P .例如id==1type==N上从来没有status==sup ,所以我需要计算 id 1。然后我还想检查这个 id 是否切换到P But id 2 is not eligible for selected because it have sup status while on type==N .但是 id 2 不符合 selected 条件,因为它在type==N时具有sup状态。

  • id's 2, 5, and id 7 will not be eligible for as they had status == sup , while on status N and id 7 was on NA only while on N . id 的 2、5 和 id 7 将不符合条件,因为它们具有status == sup ,而在状态N和 id 7 仅在N时处于NA上。

data <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,6,6,6,6,6,7,7,7),
                   type=c('N','N','N', 'N', 'P','P','N','N','N', 'I', 'I','N','N','N',
 'N', 'N','N','N','N', 'O', 'O','N','N','N', 'O','N','N','N', 'P', 'P', 'N','N','P'), 
status=c(NA,'unsup',NA,'unsup',NA,'sup',NA,NA,'sup',NA,'sup','unsup',NA,'unsup',NA,
'unsup','unsup',NA,'unsup',NA,'sup','sup',NA,NA,'unsup',NA,'unsup','unsup','unsup','sup', NA, NA, 'sup'))

Expected output预期 output

1. 1.

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   3    N  unsup
8   3    N   <NA>
9   3    N  unsup
10  3    N   <NA>
11  3    N  unsup
12  4    N  unsup
13  4    N   <NA>
14  4    N  unsup
15  4    O   <NA>
16  4    O    sup
17  6    N   <NA>
18  6    N  unsup
19  6    N  unsup
20  6    P  unsup
21  6    P    sup

Then of which, id's switched to P are:那么其中,id切换到P的是:

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   6    N   <NA>
8   6    N  unsup
9   6    N  unsup
10  6    P  unsup
11  6    P    sup

For the first case, after grouping by 'id', filter any 'id's not having status value as 'sup' and type as 'N' and those ids having any non-NA value for status where type is 'N'对于第一种情况,在按“id”分组后, filter任何不具有“sup” status值的“id”并type “N”,以及那些具有任何非 NA 值的status ,其中type为“N”

library(dplyr)
data1 <- data %>% 
  group_by(id) %>%
  filter((!any((status %in% 'sup' & type == 'N'), na.rm = TRUE))& 
      any(!is.na(status[type == "N"]))) %>% 
  ungroup

-output -输出

data1
# A tibble: 21 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     3 N     unsup 
 8     3 N     <NA>  
 9     3 N     unsup 
10     3 N     <NA>  
# … with 11 more rows

From the subset data, we can filter again after checking for any case where there is a type value of 'N' and the next value ( lead ) is 'P' for each 'id'从子集数据中,我们可以在检查type值为“N”且下一个值( lead )是每个“id”的“P”的any情况后再次filter

data1 %>% 
  group_by(id) %>%
  filter(any(type== "N" & lead(type) == "P", na.rm = TRUE)) %>% 
  ungroup
# A tibble: 11 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     6 N     <NA>  
 8     6 N     unsup 
 9     6 N     unsup 
10     6 P     unsup 
11     6 P     sup   

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM