简体   繁体   中英

Filter factor levels in R using dplyr

This is the glimpse() of my dataframe DF:

Observations: 221184
Variables:
$ Epsilon    (fctr) 96002.txt, 96002.txt, 96004.txt, 96004.txt, 96005.txt, 960...
$ Value   (int) 61914, 61887, 61680, 61649, 61776, 61800, 61753, 61725, 616...

I want to filter (remove) all the observations with the first two levels of Epsilon using dplyr.

I mean:

DF %>% filter(Epsilon != "96002.txt" & Epsilon != "96004.txt")

However, I don't want to use the string values (ie, "96002.txt" and "96004.txt") but the level orders (ie, 1 and 2), because it should be a general instruction independent of the level values.

You can easily convert a factor into an integer and then use conditions on it. Just replace your filter statement with:

 filter(as.integer(Epsilon)>2)

More generally, if you have a vector of indices level you want to eliminate, you can try:

 #some random levels we don't want
 nonWantedLevels<-c(5,6,9,12,13)
 #just the filter part
 filter(!as.integer(Epsilon) %in% nonWantedLevels)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM