Delete rows in dataframe based on values in multiple previous rows/columns

Question

I have the following dataframe:

I would like to delete rows for which there is a 1 in a previous row of column z with the same values in columns x and y. For example, for Row 10, I want to search Rows 1:9 for a row in which x = "b", y = "c", and z equals 1. If such a row exists in Rows 1:9, I want to delete Row 10.

Therefore, the resulting dataframe would remove rows 4, 5, 10, 11, and 12:

Answer 1

We can do this with data.table

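library(data.table)
# Within each (x, y) group, flag every row from the first drop of z from 1
# to a non-1 value onward, then remove the flagged row indices (.I).
setDT(df1)[-df1[, .I[cummin(c(0, diff(z==1)))<0], .(x, y)]$V1]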
#    x y z
# 1: a c 0
# 2: a c 0
# 3: a c 1
# 4: b c 0
# 5: b c 0
# 6: b c 0
# 7: b c 1
# 8: a d 0
# 9: a d 0
#10: a d 0
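
To see why the per-group expression flags the right rows, it may help to run its pieces on a single group in base R. This is only a sketch: the vector z below is a made-up example for one (x, y) group, not the asker's data, and the names is_one, step and flag are introduced here for illustration.

```r
# Hypothetical z values for one (x, y) group; not taken from the question.
z <- c(0, 0, 1, 0, 0)

is_one <- z == 1               # FALSE FALSE  TRUE FALSE FALSE
step   <- c(0, diff(is_one))   # 0  0  1 -1  0  (-1 marks a drop from 1 to non-1)
flag   <- cummin(step) < 0     # FALSE FALSE FALSE  TRUE  TRUE

which(flag)                    # 4 5: positions that would be dropped
```

One thing worth checking against your own data: a group such as c(1, 1, 0) only flags position 3, because the second 1 does not produce a drop. If consecutive 1s can occur within a group, the result may differ from the rule stated in the question.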

Answer 2

Here is a base R method that uses ave to apply a function within groups, interaction to construct the groups from x and y, and a bit of logical manipulation in an anonymous function. as.logical converts the output of ave, which is a vector of 1s and 0s, into a logical vector that is used for subsetting.

The anonymous function c(1,head(cummin(i != 1), -1)) returns a 1 for the first element of each group, since that row is always kept. For the remaining elements, i != 1 tests whether each value is not 1, and cummin takes the cumulative minimum, so once a 1 appears every later element becomes 0. head(..., -1) drops the final element, and prepending the 1 shifts the result down by one position, so each row is judged only by the rows before it.

df[as.logical(ave(df$z, interaction(df$x, df$y),
                  FUN=function(i) c(1,head(cummin(i != 1), -1)))), ]
   x y z
1  a c 0
2  a c 0
3  a c 1
6  b c 0
7  b c 0
8  b c 0
9  b c 1
13 a d 0
14 a d 0
15 a d 0
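
As a rough check of the logic, the rule can be run on a small reconstructed input. This is a sketch under assumptions: x, y and the kept rows follow the output shown above, but the z values of the dropped rows (4, 5, 10, 11, 12) are guessed to be 0, and the helper name keep_until_first_one is introduced here only for readability.

```r
# Hypothetical input: x, y and the kept rows follow the output shown above;
# the z values of the dropped rows (4, 5, 10, 11, 12) are assumed to be 0.
df <- data.frame(
  x = c(rep("a", 5), rep("b", 7), rep("a", 3)),
  y = c(rep("c", 12), rep("d", 3)),
  z = c(0, 0, 1, 0, 0,  0, 0, 0, 1, 0, 0, 0,  0, 0, 0)
)

# The per-group rule, written out as a named function for readability.
keep_until_first_one <- function(i) {
  no_one_yet <- cummin(i != 1)   # 1 until the first 1 appears, 0 from then on
  c(1, head(no_one_yet, -1))     # shift by one so the row holding the 1 is kept
}

keep_until_first_one(c(0, 0, 1, 0, 0))   # 1 1 1 0 0

keep <- as.logical(ave(df$z, interaction(df$x, df$y), FUN = keep_until_first_one))
result <- df[keep, ]
rownames(result)   # "1" "2" "3" "6" "7" "8" "9" "13" "14" "15"
```

Because the vector is shifted by one position, the row that contains the first 1 is kept and only the rows after it in the same group are dropped.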

Answer 3

I am not sure I understand your question, but if you want to delete all rows where z = 1, you can get the indices of the rows to keep with

which(nameofdataframe$z != 1)

If you want to combine several conditions, you can use & like this:

which(nameofdataframe$z != 1 & nameofdataframe$x == "b")
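
Note that which returns row indices rather than a data frame, so the indices still have to be used for subsetting. A minimal sketch, assuming a small made-up data frame (the values and the names idx and kept are hypothetical):

```r
# Hypothetical data frame, used only to illustrate the calls above.
nameofdataframe <- data.frame(
  x = c("a", "b", "b", "a"),
  z = c(0, 1, 0, 1),
  stringsAsFactors = FALSE
)

idx <- which(nameofdataframe$z != 1)   # row indices where z is not 1
idx                                    # 1 3

kept <- nameofdataframe[idx, ]         # subset to those rows
kept$x                                 # "a" "b"
```

This filters on each row's own values only. It does not look at earlier rows, so on its own it does not implement the rule described in the question.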

Hope this helps!

3 answers

solution1
3 ACCEPTED 2017-04-28 15:40:02

solution2
2 2017-04-28 15:57:24

solution3
0 2017-04-28 15:37:19
