R - Filter dataframe to include only rows where the column count meets a criterion
Assume this dataframe:
country <- c('USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'Canada')
number <- c(1:9)
df <- data.frame(country, number)
I want to subset only the rows where the country count is greater than 4 or less than 2. So in this case, it would return:
country number
USA 1
USA 2
USA 3
USA 4
USA 5
Canada 9
I am able to make it work with this:
library(dplyr)
totalcounts <- filter(count(df, country), n > 4 | n < 2) # giving me a df of the country and count
for (i in seq_len(nrow(totalcounts))) {
  # code in here that rbinds rows as it matches
}
But I feel there has to be an easier way. I haven't gotten the grasp of sapply and such yet, so I feel like I'm missing something here. It seems like I am going the long way around when there is probably already something in place that does this.
Here is a base R option using subset + ave
subset(df, !ave(number, country, FUN = function(x) length(x) %in% 2:4))
or a shorter version (thanks, @Onyambu)
subset(df,!ave(number,country,FUN = length) %in% 2:4)
which gives
country number
1 USA 1
2 USA 2
3 USA 3
4 USA 4
5 USA 5
9 Canada 9
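To see why the ave call works, it may help to look at the intermediate vector it produces. The following is a minimal sketch in base R; the names group_size and keep are illustrative and not part of the original answer.

```r
country <- c('USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'Canada')
number <- 1:9
df <- data.frame(country, number)

# ave() applies FUN within each country group and returns a vector
# aligned with the rows of df, so every row carries its group's size.
group_size <- ave(df$number, df$country, FUN = length)
print(group_size)

# Keep the rows whose group size falls outside 2..4
# (group_size and keep are illustrative names).
keep <- !(group_size %in% 2:4)
print(df[keep, ])
```

Since the counts are whole numbers, `!(group_size %in% 2:4)` selects the same rows as `group_size > 4 | group_size < 2`; the latter form may read more directly against the original requirement.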
We can group by country and filter on the group size
library(dplyr)
df %>%
group_by(country) %>%
filter(n() > 4|n() < 2)
# A tibble: 6 x 2
# Groups: country [2]
# country number
# <chr> <int>
#1 USA 1
#2 USA 2
#3 USA 3
#4 USA 4
#5 USA 5
#6 Canada 9
Another option is to create a column of counts with add_count and then filter
df %>%
add_count(country) %>%
filter(n > 4|n < 2) %>%
select(-n)
Or do a join if we use count
df %>%
count(country) %>%
filter(n >4 |n <2) %>%
select(country) %>%
inner_join(df, by = "country")
Base R option using table:
tab <- table(df$country)
subset(df, country %in% names(tab[tab > 4 | tab < 2]))
# country number
#1 USA 1
#2 USA 2
#3 USA 3
#4 USA 4
#5 USA 5
#9 Canada 9
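If the same kind of filter is needed with different thresholds, the table approach could be wrapped in a small function. This is only a sketch: keep_groups_by_size and its arguments are hypothetical names, not something from base R or dplyr.

```r
country <- c('USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'Canada')
number <- 1:9
df <- data.frame(country, number)

# Hypothetical helper: keep the rows of `data` whose group (defined by
# column `col`) has a size for which `pred` returns TRUE.
keep_groups_by_size <- function(data, col, pred) {
  tab <- table(data[[col]])
  counts <- as.vector(tab)               # plain integer counts per group
  wanted <- names(tab)[pred(counts)]     # group labels that pass the test
  data[data[[col]] %in% wanted, , drop = FALSE]
}

result <- keep_groups_by_size(df, "country", function(n) n > 4 | n < 2)
print(result)
```

Passing the condition as a function keeps the thresholds out of the helper, so the same code can serve other cut-offs.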