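Remove group based on count of another column (R)
I have a data frame and want to keep only the groups that have at least 2 cases with a car. A second output should keep the groups that have at least 2 cases without a car: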
Group = c('a','a','a','b','b','b','c','c','c','c')
Car = c(1,1,0,0,0,0,1,0,0,0) # 1 = Have car, 0 = No car
df = data.frame(Group,Car)
df$Group = factor(df$Group)
df$Car = factor(df$Car)
   Group Car
1      a   1
2      a   1
3      a   0
4      b   0
5      b   0
6      b   0
7      c   1
8      c   0
9      c   0
10     c   0
The output should be:
Group Car
a 1
a 1
a 0
Second output:
Group Car
b 0
b 0
b 0
c 1
c 0
c 0
c 0
I have a very large dataset. Please help. Thanks!
Case 1: groups with at least 2 cases that have a car
library(dplyr)
df %>%
group_by(Group) %>%
filter(sum(Car) > 1)
# Group Car
# <fct> <dbl>
#1 a 1
#2 a 1
#3 a 0
Or with base R ave:
subset(df, ave(Car, Group, FUN = sum) > 1)
And with data.table:
library(data.table)
setDT(df)[, if (sum(.SD) > 1) .SD, by = Group]
Case 2: groups with at least 2 cases that have no car
df %>%
group_by(Group) %>%
filter(sum(Car == 0) > 1)
# Group Car
# <fct> <dbl>
#1 b 0
#2 b 0
#3 b 0
#4 c 1
#5 c 0
#6 c 0
#7 c 0
And with base R ave:
subset(df, ave(Car == 0, Group, FUN = sum) > 1)
With data.table:
setDT(df)[, if (sum(.SD == 0) > 1) .SD, by = Group]
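A note on the data.table versions above: sum(.SD) adds up every non-grouping column, which works here only because Car is the only one. The following is a minimal sketch of the same filters that refer to Car by name, assuming the real data has extra columns. The Year column and its values are hypothetical and only serve as an illustration.

```r
library(data.table)

# Sample data from the question plus a hypothetical extra column (Year)
dt <- data.table(
  Group = c('a','a','a','b','b','b','c','c','c','c'),
  Car   = c(1,1,0,0,0,0,1,0,0,0),
  Year  = 2001:2010   # hypothetical, not in the original data
)

# Case 1: keep groups with at least 2 rows where Car == 1
with_car <- dt[, if (sum(Car == 1) > 1) .SD, by = Group]

# Case 2: keep groups with at least 2 rows where Car == 0
without_car <- dt[, if (sum(Car == 0) > 1) .SD, by = Group]

with_car
without_car
```

Naming the column keeps the condition independent of whatever other columns the table carries.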
Data
Group = c('a','a','a','b','b','b','c','c','c','c')
Car = c(1,1,0,0,0,0,1,0,0,0)
df = data.frame(Group,Car)
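Note that this data keeps Car numeric, whereas the question converts it to a factor, and sum() is not defined for factors. If the real column is a factor, one possible approach (a sketch, with df_fct and res_a as illustrative names) is to convert it back before filtering:

```r
# Sample data as in the question, with Car stored as a factor
Group <- c('a','a','a','b','b','b','c','c','c','c')
Car <- c(1,1,0,0,0,0,1,0,0,0)
df_fct <- data.frame(Group = factor(Group), Car = factor(Car))

# as.numeric() on a factor returns the internal level codes (1, 2),
# so go through as.character() to recover the original 0/1 values
df_fct$Car <- as.numeric(as.character(df_fct$Car))

# The base R filter from Case 1 now works on the converted column
res_a <- subset(df_fct, ave(Car, Group, FUN = sum) > 1)
res_a
```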
We can also get both datasets at once in a list, using split in a single step:
lst1 <- split(df, df$Group %in% names(which(rowsum(df$Car, df$Group)[,1] >= 2)))
lst1
#$`FALSE`
# Group Car
#4 b 0
#5 b 0
#6 b 0
#7 c 1
#8 c 0
#9 c 0
#10 c 0
#$`TRUE`
# Group Car
#1 a 1
#2 a 1
#3 a 0
If we need to extract the list elements, use [[:
lst1[[1]]
lst1[[2]]
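One caveat: split partitions the rows on the car count alone, so its FALSE element matches the second output only when no group has both 2 or more cars and 2 or more rows without a car. The sketch below applies the two conditions separately. Group 'd' is a hypothetical addition to the sample data, and df2, n_car, n_no_car and lst2 are illustrative names.

```r
# Sample data plus a hypothetical group 'd' with 2 cars and 2 rows without one
Group <- c('a','a','a','b','b','b','c','c','c','c','d','d','d','d')
Car <- c(1,1,0,0,0,0,1,0,0,0,1,1,0,0)
df2 <- data.frame(Group, Car)

# Per-row counts of "has car" and "no car" within each row's group
n_car <- ave(as.integer(df2$Car == 1), df2$Group, FUN = sum)
n_no_car <- ave(as.integer(df2$Car == 0), df2$Group, FUN = sum)

# Each list element applies its own condition, so a group can appear in both
lst2 <- list(
  with_car = df2[n_car >= 2, ],
  without_car = df2[n_no_car >= 2, ]
)

unique(as.character(lst2$with_car$Group))     # "a" "d"
unique(as.character(lst2$without_car$Group))  # "b" "c" "d"
```

With split, group 'd' would land only in the TRUE element, since each row goes to exactly one part.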