Removing rows after a certain value in R
I have a data frame in R:
df <- data.frame(a=c(1,1,1,2,2,5,5,5,5,5,6,6), b=c(0,1,0,0,0,0,0,1,0,0,0,1))
Within each group of duplicated values of a, I want to remove the rows where b equals 0 that occur after a row where b equals 1.
So the output I am looking for is:
df.out <- data.frame(a=c(1,1,2,2,5,5,5,6,6), b=c(0,1,0,0,0,0,1,0,1))
Is there a way to do this in R?
This should do the trick:
ind = intersect(which(df$b==0), which(df$b==1)+1)
df.out = df[-ind,]
which(df$b==1) returns the row indices of df where b==1. Add one to these and intersect the result with the indices where b==0. Note that this only drops a 0 that directly follows a 1 and does not take a into account, so on the sample data row 10 (a=5, b=0) is still kept.
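As a quick check, the sketch below runs the index approach on the sample data and compares it with the desired df.out. It also shows one possible group-aware variant; the helper name drop_after_one is hypothetical and not part of the original answer.

```r
df <- data.frame(a = c(1,1,1,2,2,5,5,5,5,5,6,6),
                 b = c(0,1,0,0,0,0,0,1,0,0,0,1))
df.out <- data.frame(a = c(1,1,2,2,5,5,5,6,6),
                     b = c(0,1,0,0,0,0,1,0,1))

# Index approach from the answer above: drops only a 0 directly after a 1
ind <- intersect(which(df$b == 0), which(df$b == 1) + 1)
res <- df[-ind, ]
nrow(res)   # one row more than df.out, because row 10 survives

# Hypothetical group-aware variant (an assumption, not the answerer's code):
# within each value of a, flag rows where a 1 has already been seen,
# then drop the flagged rows whose b is 0
drop_after_one <- function(d) {
  seen <- ave(as.numeric(d$b == 1), d$a, FUN = cumsum) > 0
  d[!(seen & d$b == 0), ]
}
res2 <- drop_after_one(df)
```

Also keep in mind that df[-ind, ] returns zero rows when ind is empty, so a logical filter like the one in drop_after_one is the safer pattern.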
How about:
df[ ave(df$b, df$a, FUN=function(x) x>=cummax(x))==1, ]
#    a b
# 1  1 0
# 2  1 1
# 4  2 0
# 5  2 0
# 6  5 0
# 7  5 0
# 8  5 1
# 11 6 0
# 12 6 1
Here we use ave to look within each level of a, and we test whether we've seen a 1 yet with cummax. A row is kept as long as its b is at least the running maximum for its group, so any 0 that comes after a 1 is dropped.
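To make the one-liner easier to follow, here is the same idea split into steps. This is only an illustrative breakdown; the intermediate names running_max and keep are not from the original answer.

```r
df <- data.frame(a = c(1,1,1,2,2,5,5,5,5,5,6,6),
                 b = c(0,1,0,0,0,0,0,1,0,0,0,1))

# Running maximum of b within each value of a
running_max <- ave(df$b, df$a, FUN = cummax)

# Keep a row while b has not fallen below the running maximum;
# a 0 that follows a 1 in the same group fails this test
keep <- df$b >= running_max

# Inspect the intermediate columns side by side
cbind(df, running_max, keep)

# Same rows as the one-liner above
df[keep, ]
```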