
Filter groups in a dataframe that contain all elements of a list in R

I have a list such as the_list <- c("SP1","SP2")

And I have a dataframe such as:

Groups Names 
G1     SP1
G1     SP2
G1     SP3
G1     SP4
G1     SP5
G2     SP1
G2     SP4
G2     SP5
G3     SP6
G3     SP7
G3     SP8
G4     SP1
G4     SP2
G4     SP7 

And I would like to only keep Groups where ALL elements in the_list are present within the Names column.

I should then get:

   Groups Names 
    G1     SP1
    G1     SP2
    G1     SP3
    G1     SP4
    G1     SP5
    G4     SP1
    G4     SP2
    G4     SP7 

So far I tried:

df <- df %>%
  group_by(Groups) %>%
  filter(all(Names %in% c('SP1','SP2'))) 

You almost have it. The problem is that the current syntax is asking "are all the values in the column 'Names' in c('SP1','SP2')?" instead of "are all the values in c('SP1','SP2') in the column 'Names'?".
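To see the difference concretely, here is a small base R sketch using the Names values of group G1 from the question. The variable names `g1_names` and `wanted` are made up for this illustration.

```r
g1_names <- c("SP1", "SP2", "SP3", "SP4", "SP5")  # Names in group G1
wanted   <- c("SP1", "SP2")

# One logical per element of g1_names: is each name one of the wanted values?
names_in_wanted <- g1_names %in% wanted   # TRUE TRUE FALSE FALSE FALSE

# One logical per element of wanted: is each wanted value among the names?
wanted_in_names <- wanted %in% g1_names   # TRUE TRUE

all(names_in_wanted)   # FALSE: G1 has names other than SP1 and SP2
all(wanted_in_names)   # TRUE: G1 contains both SP1 and SP2
```

The first form only keeps groups made up exclusively of the wanted values, while the second keeps groups that contain every wanted value, whatever else they hold.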

So you just need to swap the left- and right-hand sides of %in%, like this:

df %>%
  group_by(Groups) %>%
  filter(all(c('SP1','SP2') %in% Names))

And that will give you:

# # A tibble: 8 x 2
# # Groups:   Groups [2]
# Groups Names
# <chr>  <chr>
# 1 G1     SP1  
# 2 G1     SP2  
# 3 G1     SP3  
# 4 G1     SP4  
# 5 G1     SP5  
# 6 G4     SP1  
# 7 G4     SP2  
# 8 G4     SP7
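For completeness, here is a self-contained sketch that rebuilds the example data and filters with the `the_list` variable instead of a hard-coded vector. It assumes `the_list` is meant to be c("SP1","SP2"), as in the attempted code, and that dplyr is installed. The name `result` and the final `ungroup()` are my additions, not part of the original answer.

```r
library(dplyr)

# Assumption: the values to look for are SP1 and SP2
the_list <- c("SP1", "SP2")

# Example data from the question
df <- data.frame(
  Groups = c(rep("G1", 5), rep("G2", 3), rep("G3", 3), rep("G4", 3)),
  Names  = c("SP1", "SP2", "SP3", "SP4", "SP5",
             "SP1", "SP4", "SP5",
             "SP6", "SP7", "SP8",
             "SP1", "SP2", "SP7"),
  stringsAsFactors = FALSE
)

# Keep only the groups whose Names contain every element of the_list
result <- df %>%
  group_by(Groups) %>%
  filter(all(the_list %in% Names)) %>%
  ungroup()

unique(result$Groups)   # "G1" "G4"
nrow(result)            # 8
```

Dropping the grouping with `ungroup()` is optional; it just keeps later operations on `result` from being applied per group by accident.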

