
Removing rows after a certain value in R

I have a data frame in R:

df <- data.frame(a=c(1,1,1,2,2,5,5,5,5,5,6,6), b=c(0,1,0,0,0,0,0,1,0,0,0,1))

I want to remove the rows where b equals 0 and that come after a row where b equals 1, within each group of duplicated a values.

So the output I am looking for is:

df.out <- data.frame(a=c(1,1,2,2,5,5,5,6,6), b=c(0,1,0,0,0,0,1,0,1))

Is there a way to do this in R?

This should do the trick:

ind = intersect(which(df$b==0), which(df$b==1)+1)
df.out = df[-ind,]

which(df$b==1) returns the row indices of df where b==1. Add one to these and intersect with the indices where b==0; the result is the set of zeros that come directly after a 1.
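A small sketch of what the index computation yields on the sample data from the question, with the intermediate vectors named for readability (the names ones, zeros, and df.next are illustrative, not from the answer). It also guards the case where nothing matches, since negative indexing with an empty vector would select zero rows rather than keep everything.

```r
df <- data.frame(a = c(1,1,1,2,2,5,5,5,5,5,6,6),
                 b = c(0,1,0,0,0,0,0,1,0,0,0,1))

ones  <- which(df$b == 1)            # positions where b is 1
zeros <- which(df$b == 0)            # positions where b is 0
ind   <- intersect(zeros, ones + 1)  # zeros that sit directly after a 1

# Drop those rows; skip the subsetting when ind is empty,
# because df[-integer(0), ] would return a data frame with no rows
df.next <- if (length(ind) > 0) df[-ind, ] else df
```

On the sample data ind holds rows 3 and 9, so row 10 (the second 0 after the 1 in the a == 5 group) is still present. The approach looks only one row ahead and does not take a into account.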

How about:

df[ ave(df$b, df$a, FUN=function(x) x>=cummax(x))==1, ]

#        a b
#     1  1 0
#     2  1 1
#     4  2 0
#     5  2 0
#     6  5 0
#     7  5 0
#     8  5 1
#     11 6 0
#     12 6 1

Here we use ave to look within each level of a, and we use cummax to test whether we've already seen a 1 in that level.
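The same idea can be spelled out in two steps. This is only a sketch, assuming b never holds anything other than 0 or 1; the names seen, keep, and result are illustrative and not from the answer.

```r
df <- data.frame(a = c(1,1,1,2,2,5,5,5,5,5,6,6),
                 b = c(0,1,0,0,0,0,0,1,0,0,0,1))

# Running maximum of b within each group of a:
# it becomes 1 from the first row where b is 1 and stays 1 afterwards
seen <- ave(df$b, df$a, FUN = cummax)

# Keep a row unless its b is 0 and a 1 has already appeared in its group
keep <- !(df$b == 0 & seen == 1)

result <- df[keep, ]
```

Naming the running maximum separately makes the rule explicit: a row is dropped only when its b is 0 and a 1 has already occurred within its a group.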
