Filter factor levels in R using dplyr

Question

This is the glimpse() of my data frame DF:

Observations: 221184
Variables:
$ Epsilon    (fctr) 96002.txt, 96002.txt, 96004.txt, 96004.txt, 96005.txt, 960...
$ Value   (int) 61914, 61887, 61680, 61649, 61776, 61800, 61753, 61725, 616...

I want to filter out (remove) all the observations that belong to the first two levels of Epsilon, using dplyr.

I mean:

DF %>% filter(Epsilon != "96002.txt" & Epsilon != "96004.txt")

However, I don't want to use the string values (i.e., "96002.txt" and "96004.txt") but the level positions (i.e., 1 and 2), because this should be a general instruction that does not depend on the level values.

Answer 1

You can easily convert a factor into an integer and then apply conditions to it. Just replace your filter statement with:

 filter(as.integer(Epsilon) > 2)
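
As a minimal sketch of why this works, the example below builds a small stand-in for DF (the rows are made up for illustration and only mimic the columns shown in the question). as.integer() on a factor returns each value's position in levels(), so the comparison removes whatever the first two levels happen to be. The final droplevels() call is an optional extra step, not part of the original answer: after filtering, the removed levels are still listed in levels() until they are dropped.

```r
library(dplyr)

# Toy stand-in for DF; these rows are invented for illustration only
DF <- data.frame(
  Epsilon = factor(c("96002.txt", "96002.txt", "96004.txt",
                     "96005.txt", "96007.txt")),
  Value   = c(61914L, 61887L, 61680L, 61776L, 61800L)
)

# as.integer() on a factor gives the index of each value within levels(Epsilon)
level_index <- as.integer(DF$Epsilon)

# Keep only the rows whose level index is greater than 2
kept <- DF %>% filter(as.integer(Epsilon) > 2)

# Optional: drop the now-unused levels from the factor
kept <- droplevels(kept)
levels(kept$Epsilon)
```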

More generally, if you have a vector of level indices you want to eliminate, you can try:

 # some arbitrary level indices we don't want
 nonWantedLevels <- c(5, 6, 9, 12, 13)
 # just the filter part
 filter(!(as.integer(Epsilon) %in% nonWantedLevels))
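
To check the general form on something runnable, here is a rough sketch using a hypothetical data frame with 15 levels (the file names and values are invented, not taken from the question). It also shows an equivalent way to express the same selection: looking the indices up in levels() to get the level names. That second variant is an assumption about what might be convenient, not something the answer above proposes.

```r
library(dplyr)

# Hypothetical data: 15 levels with two rows each (names invented for the example)
DF <- data.frame(
  Epsilon = factor(rep(sprintf("file%02d.txt", 1:15), each = 2)),
  Value   = seq_len(30)
)

# Level indices to remove, as in the answer above
nonWantedLevels <- c(5, 6, 9, 12, 13)

# Filter by level index
by_index <- DF %>% filter(!(as.integer(Epsilon) %in% nonWantedLevels))

# Equivalent filter by level name, obtained by indexing into levels()
unwanted_names <- levels(DF$Epsilon)[nonWantedLevels]
by_name <- DF %>% filter(!(Epsilon %in% unwanted_names))

# Both approaches should keep the same rows
all(by_index$Value == by_name$Value)
```

Note that the indices refer to the current order of levels(Epsilon), so releveling or dropping levels beforehand changes which rows are removed.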

Answer 1: score 17, accepted, 2015-05-05 11:52:55
