I'm trying to subset a data.frame in R to get all the factor levels that contain all the values of a vector in a certain column. For example:
dt=data.frame(fact=c(rep("a",3),rep("b",3),rep("c",3)),val=c(1,2,3,2,3,4,3,4,5))
Now, the vector is: vec=c(1,2)
I would like the function to return only "a", because this level of column 'fact' contains both 1 and 2 in column 'val' (level "b" contains only the value 2, and level "c" contains neither). In reality, the vector can contain any number of elements.
This seems pretty basic, but I can't find an answer.
We group by 'fact' and filter, keeping a group only if all the 'vec' values are in its 'val' column:
library(dplyr)
dt %>%
  group_by(fact) %>%
  filter(all(vec %in% val))
# A tibble: 3 x 2
# Groups: fact [1]
# fact val
# <fct> <dbl>
#1 a 1
#2 a 2
#3 a 3
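If only the level names are wanted rather than the matching rows, the same grouped test can be collapsed to one flag per level. This is a minimal sketch, assuming dplyr is installed; the names `has_all` and `lvls` are made up for illustration and are not part of the answer above.

```r
library(dplyr)

# Example data from the question
dt <- data.frame(fact = rep(c("a", "b", "c"), each = 3),
                 val  = c(1, 2, 3, 2, 3, 4, 3, 4, 5))
vec <- c(1, 2)

# One row per level, flagging whether every element of vec occurs in val
lvls <- dt %>%
  group_by(fact) %>%
  summarise(has_all = all(vec %in% val)) %>%  # has_all: hypothetical column name
  filter(has_all) %>%
  pull(fact)

# as.character() so the result is the same whether fact is a factor or a character column
as.character(lvls)
# [1] "a"
```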
In base R, the same check can be run for each level with tapply:
sapply(tapply(dt$val, dt$fact, `%in%`, x=vec), all)
which gives
a b c
TRUE FALSE FALSE
Store this logical vector in a new variable, say keep. The matching levels are then names(keep)[keep], and the data can be subset like this:
dtsub <- dt[dt$fact %in% names(keep)[keep], ]
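The base R steps can be put together as one self-contained sketch. It swaps the tapply call for split(), which yields the same per-level test, and it selects by level name so that it also works when zero or several levels match. The variable name `matching` is made up for illustration.

```r
# Example data from the question
dt <- data.frame(fact = rep(c("a", "b", "c"), each = 3),
                 val  = c(1, 2, 3, 2, 3, 4, 3, 4, 5))
vec <- c(1, 2)

# Named logical vector: does each level contain every element of vec?
keep <- sapply(split(dt$val, dt$fact), function(v) all(vec %in% v))

# Names of the levels that pass the test ("matching" is a hypothetical name)
matching <- names(keep)[keep]

# Rows belonging to those levels
dtsub <- dt[dt$fact %in% matching, ]
```

Here `matching` is "a" and `dtsub` holds the three rows of level "a".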