
Subset data frame for all factor levels which contain all values in a vector

I'm trying to subset a data.frame in R to get all the factor levels that contain every value of a vector in a certain column. For example:

dt=data.frame(fact=c(rep("a",3),rep("b",3),rep("c",3)),val=c(1,2,3,2,3,4,3,4,5))

which looks like:

  fact val
1    a   1
2    a   2
3    a   3
4    b   2
5    b   3
6    b   4
7    c   3
8    c   4
9    c   5

Now, the vector is vec=c(1,2). I would like the function to return only "a", because this level of column 'fact' contains both 1 and 2 in column 'val' (level "b" contains only the value 2, and level "c" contains neither). In reality, the vector can contain any number of elements.

This seems pretty basic, but I can't find an answer.

We group by 'fact' and keep a group only if all the values in 'vec' appear in that group's 'val' column:

library(dplyr)
dt %>% 
  group_by(fact) %>% 
  filter(all(vec %in% val))
# A tibble: 3 x 2
# Groups:   fact [1]
#  fact    val
#  <fct> <dbl>
#1 a         1
#2 a         2
#3 a         3
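
If only the level names are wanted (the question asks for "a" rather than the matching rows), the same all(vec %in% val) test can be run per group in base R. The following is a minimal sketch, not part of the original answer; the helper name levels_containing_all is hypothetical.

```r
# Data and vector from the question
dt <- data.frame(
  fact = c(rep("a", 3), rep("b", 3), rep("c", 3)),
  val  = c(1, 2, 3, 2, 3, 4, 3, 4, 5)
)
vec <- c(1, 2)

# Hypothetical helper: names of the groups whose values
# include every element of `wanted`
levels_containing_all <- function(values, groups, wanted) {
  by_group <- split(values, groups)   # one vector of values per level
  hit <- vapply(by_group, function(v) all(wanted %in% v), logical(1))
  names(hit)[hit]                     # keep only the levels that passed
}

res <- levels_containing_all(dt$val, dt$fact, vec)
print(res)
```

Because the test is all(wanted %in% v), the vector can have any number of elements; a level is returned only when none of them is missing from its values.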
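Alternatively, in base R, tapply evaluates vec %in% val within each level of 'fact', and sapply reduces each result with all:

sapply(tapply(dt$val, dt$fact, `%in%`, x=vec), all)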

gives

    a     b     c 
 TRUE FALSE FALSE 

Store this logical vector in a new variable, say keep, and use the names of its TRUE entries to subset:

dtsub <- dt[dt$fact %in% names(keep)[keep], ]
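
The per-level test and the row subset can also be combined in one step with ave(), which broadcasts a group-level result back to every row of that group. This is a sketch of an alternative, not the answerer's method; the variable names row_flag and dtsub_ave are made up for illustration.

```r
# Data and vector from the question
dt <- data.frame(
  fact = c(rep("a", 3), rep("b", 3), rep("c", 3)),
  val  = c(1, 2, 3, 2, 3, 4, 3, 4, 5)
)
vec <- c(1, 2)

# For each level of fact, test whether every element of vec occurs in val.
# ave() recycles the single TRUE/FALSE over the rows of the group; since
# dt$val is numeric, the result is stored as 1/0.
row_flag <- ave(dt$val, dt$fact, FUN = function(v) all(vec %in% v))

# Keep the rows whose level passed the test
dtsub_ave <- dt[row_flag == 1, ]
print(dtsub_ave)
```

This returns the three rows of level "a" as an ordinary data.frame, without splitting and recombining the data.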
