
Filter rows based on values of multiple columns in R

Here is the data set; say its name is DS.

       Abc    Def   Ghi
1      41     190   67
2      36     118   72
3      12     149   74
4      18     313   62
5      NA      NA   56
6      28      NA   66
7      23     299   65
8      19      99   59
9       8      19   61
10     NA     194   69
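
For anyone who wants to reproduce this, the table above can be rebuilt as a data frame roughly as follows. This is only a sketch: it assumes all three columns are numeric and that the leading 1 to 10 are the default row names.

```r
# Rebuild the sample data shown above; NA marks the missing values.
DS <- data.frame(
  Abc = c(41, 36, 12, 18, NA, 28, 23, 19, 8, NA),
  Def = c(190, 118, 149, 313, NA, NA, 299, 99, 19, 194),
  Ghi = c(67, 72, 74, 62, 56, 66, 65, 59, 61, 69)
)
print(DS)
```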

How can I get a new dataset DSS in which the value of column Abc is greater than 25 and the value of column Def is greater than 100? It should also ignore any row in which the value of at least one column is NA.

I have tried a few options but wasn't successful. Your help is appreciated.

There are multiple ways of doing it. I have given 5 methods, and the first 4 methods are faster than the subset function.

R Code:

# Method 1: rows where the condition is NA come back as all-NA rows; na.omit() drops them
DS_Filtered <- na.omit(DS[(DS$Abc > 25 & DS$Def > 100), ])
# Method 2: which() returns only the TRUE positions, so it also ignores NA
DS_Filtered <- DS[which(DS$Abc > 25 & DS$Def > 100), ]
# Method 3: same as Method 1, with each condition in its own parentheses
DS_Filtered <- na.omit(DS[(DS$Abc > 25) & (DS$Def > 100), ])

# Method 4: using the dplyr package; filter() drops rows where the condition is NA
library(dplyr)
DS_Filtered <- filter(DS, Abc > 25, Def > 100)
DS_Filtered <- DS %>% filter(Abc > 25 & Def > 100)

# Method 5: subset() by default ignores rows where the condition is NA
DS_Filtered <- subset(DS, Abc > 25 & Def > 100)
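
As a quick check, the sketch below runs the base R variants on the sample data from the question, using the question's threshold of 25. The last variant, built on complete.cases(), is my own addition and not one of the five methods. It drops a row when any column is NA, including a column such as Ghi that is not part of the condition, whereas which() and subset() only drop rows where the condition itself evaluates to NA. The names cond, m1, m2, m5 and m6 are just illustrative.

```r
# Sample data from the question, rebuilt so the example is self-contained.
DS <- data.frame(
  Abc = c(41, 36, 12, 18, NA, 28, 23, 19, 8, NA),
  Def = c(190, 118, 149, 313, NA, NA, 299, 99, 19, 194),
  Ghi = c(67, 72, 74, 62, 56, 66, 65, 59, 61, 69)
)

# TRUE, FALSE, or NA per row; NA when a missing value leaves the result undetermined.
cond <- DS$Abc > 25 & DS$Def > 100

m1 <- na.omit(DS[cond, ])               # Method 1: index, then drop the all-NA rows
m2 <- DS[which(cond), ]                 # Method 2: keep only positions where cond is TRUE
m5 <- subset(DS, Abc > 25 & Def > 100)  # Method 5: subset() treats NA as FALSE

# Extra variant (an assumption about what "ignore any row with NA" should mean):
# require every column to be non-missing, not just the ones in the condition.
m6 <- DS[complete.cases(DS) & cond, ]

print(m6)
```

On this sample data all four results contain the same two rows, row 1 (41, 190, 67) and row 2 (36, 118, 72). They would only differ if a column outside the condition contained NA in a row that otherwise passes.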
