Filter factor levels in R using dplyr

Question

This is the glimpse() of my dataframe DF:

Observations: 221184
Variables:
$ Epsilon    (fctr) 96002.txt, 96002.txt, 96004.txt, 96004.txt, 96005.txt, 960...
$ Value   (int) 61914, 61887, 61680, 61649, 61776, 61800, 61753, 61725, 616...

I want to filter (remove) all the observations with the first two levels of Epsilon using dplyr.

I mean:

DF %>% filter(Epsilon != "96002.txt" & Epsilon != "96004.txt")

However, I don't want to use the string values (i.e., "96002.txt" and "96004.txt") but the level positions (i.e., 1 and 2), because this should be a general instruction that does not depend on the level values.

Answer 1

You can convert a factor to its integer level codes with as.integer() and then apply conditions to those codes. Just replace your filter statement with:

 filter(as.integer(Epsilon) > 2)
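To make the behaviour concrete, here is a minimal sketch on a toy data frame. The data below is made up for illustration (it is not the asker's DF), and the droplevels() step at the end is an optional extra that the answer itself does not mention.

```r
library(dplyr)

# Hypothetical toy data standing in for DF (values invented for illustration)
DF <- data.frame(
  Epsilon = factor(c("96002.txt", "96002.txt", "96004.txt", "96005.txt", "96007.txt")),
  Value   = c(61914L, 61887L, 61680L, 61776L, 61800L)
)

# as.integer() on a factor returns the level codes (position of each
# value in levels(DF$Epsilon)), not the labels themselves
codes <- as.integer(DF$Epsilon)

# Keep only rows whose level code is above 2, i.e. drop the first two levels
kept <- DF %>% filter(as.integer(Epsilon) > 2)

# Optional: the filtered factor still lists the unused levels;
# droplevels() removes them
kept <- droplevels(kept)
levels(kept$Epsilon)
```

Note that the level order is whatever levels(DF$Epsilon) reports (alphabetical by default), so "the first two levels" means the first two entries of that vector, not the first two values that appear in the data.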

More generally, if you have a vector of level indices you want to eliminate, you can try:

 # some arbitrary level indices we don't want
 nonWantedLevels <- c(5, 6, 9, 12, 13)
 # just the filter part
 filter(!(as.integer(Epsilon) %in% nonWantedLevels))
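As a rough illustration of the index-vector approach, the sketch below runs it on a small invented data frame and compares it with an equivalent variant that first translates the positions into labels via levels(). The data, the variable names by_code, by_label and nonWantedLabels, and the second variant are assumptions added for illustration, not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data (invented for illustration)
DF <- data.frame(
  Epsilon = factor(c("a.txt", "b.txt", "c.txt", "d.txt", "b.txt")),
  Value   = 1:5
)

# Positions of the levels to drop, independent of their labels
nonWantedLevels <- c(1, 2)

# Variant 1: compare the integer level codes against the unwanted positions
by_code <- DF %>% filter(!(as.integer(Epsilon) %in% nonWantedLevels))

# Variant 2: look up the labels at those positions, then compare labels
nonWantedLabels <- levels(DF$Epsilon)[nonWantedLevels]
by_label <- DF %>% filter(!(Epsilon %in% nonWantedLabels))

# Both variants should keep the same rows
identical(by_code$Value, by_label$Value)
```

The parentheses around the %in% expression are not strictly required, since R parses !x %in% y as !(x %in% y), but they make the intent easier to read.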

solution1
17 ACCEPTED 2015-05-05 11:52:55
