
R: filter rows based on a condition in one column

I have a dataframe:

a<-c(1,1,1,1,1,1,1,1,1,1,1)
b<-c(100,100,100,100,100,100,100,100,100,100,100)
c<-c(1,3,1,1,3,1,1,3,1,1,3)
d<-c(3400,3403,3407,3408,3412,3423,3434,3436,3445,3454,3645)
df<-data.frame(d,b,c,a)
df
      d   b c a
1  3400 100 1 1
2  3403 100 3 1
3  3407 100 1 1
4  3408 100 1 1
5  3412 100 3 1
6  3423 100 1 1
7  3434 100 1 1
8  3436 100 3 1
9  3445 100 1 1
10 3454 100 1 1
11 3645 100 3 1

I want to extract every pair of consecutive rows that meets the following conditions: column c is 3 in the first row and 1 in the second row, and the difference in column d between the two rows is less than 10. The expected output for this example is:

      d   b c a
2  3403 100 3 1
3  3407 100 1 1
8  3436 100 3 1
9  3445 100 1 1

I tried the following:

filter(df,first(c)==3,nth(c,2)==1,any(diff(d) < 10))

but for some reason, it does not work. Thanks for your help!
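A likely reason the attempt returns no rows: first(c), nth(c, 2) and any(diff(d) < 10) each collapse the whole column to a single value, so filter() receives three scalars that are recycled to every row instead of a row-by-row test. A minimal sketch that evaluates the three pieces on the example data (the names cond1 to cond3 and res are only for illustration):

```r
library(dplyr)

# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  d = c(3400, 3403, 3407, 3408, 3412, 3423, 3434, 3436, 3445, 3454, 3645),
  b = 100,
  c = c(1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3),
  a = 1
)

# Each expression summarises the whole column into one value
cond1 <- first(df$c) == 3        # the first value of c is 1, so FALSE
cond2 <- nth(df$c, 2) == 1       # the second value of c is 3, so FALSE
cond3 <- any(diff(df$d) < 10)    # TRUE: some neighbouring rows are < 10 apart

# filter() recycles these scalars to every row, so no row passes
res <- filter(df, cond1, cond2, cond3)
nrow(res)
```

A row-wise test needs vectorised helpers such as lead() or lag(), which compare each row with its neighbour.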

You can first find the indices of the first row of each pair using which (lead() comes from dplyr; the NA it produces for the last row is dropped by which):

library(dplyr)
inds <- which(df$c == 3 & lead(df$c) == 1 & lead(df$d) - df$d < 10)

and then subset your data frame on those indices together with the row after each one:

df[sort(unique(c(inds, inds + 1))),]
     d   b c a
2 3403 100 3 1
3 3407 100 1 1
8 3436 100 3 1
9 3445 100 1 1
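If you would rather not load dplyr just for lead(), the same pair search can be sketched in base R by comparing each row with its successor through explicit indexing. The names i, first_rows and res are illustrative and not part of the answer above; the data frame is rebuilt so the snippet runs on its own:

```r
df <- data.frame(
  d = c(3400, 3403, 3407, 3408, 3412, 3423, 3434, 3436, 3445, 3454, 3645),
  b = 100,
  c = c(1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3),
  a = 1
)

# Candidate first rows: every row except the last, which has no successor
i <- seq_len(nrow(df) - 1)

# Keep i where row i has c == 3, row i + 1 has c == 1, and d rises by < 10
first_rows <- i[df$c[i] == 3 & df$c[i + 1] == 1 & df$d[i + 1] - df$d[i] < 10]

# Return each matching row together with the row that follows it
res <- df[sort(c(first_rows, first_rows + 1)), ]
res
```

Because the last row is never used as a candidate, no NA handling is needed here.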

Alternatively, you can do:

library(dplyr)
df1 <- df %>%                                        # first row of each pair
  filter(c == 3 & lead(c) == 1 & lead(d) - d < 10)
df2 <- df %>%                                        # second row of each pair
  filter(lag(c) == 3 & c == 1 & d - lag(d) < 10)
arrange(rbind(df1, df2), d)                          # bind the two together and arrange by d
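The two filters can also be folded into a single pass, which keeps the original row order and so does not rely on d being sorted. This is a sketch rather than part of the answer above; is_first is a helper column invented for the example:

```r
library(dplyr)

df <- data.frame(
  d = c(3400, 3403, 3407, 3408, 3412, 3423, 3434, 3436, 3445, 3454, 3645),
  b = 100,
  c = c(1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3),
  a = 1
)

res <- df %>%
  mutate(
    # TRUE where this row starts a valid pair; NA on the last row (no successor)
    is_first = c == 3 & lead(c) == 1 & lead(d) - d < 10,
    # Treat the NA on the last row as "not a pair start"
    is_first = coalesce(is_first, FALSE)
  ) %>%
  # Keep each pair start and the row directly after it
  filter(is_first | lag(is_first, default = FALSE)) %>%
  select(-is_first)
res
```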

The following code is more involved, but it produces the expected result.

library(dplyr)

df %>%
  mutate(flag = cumsum(c == 3)) %>%
  group_by(flag) %>%
  slice_head(n = 2) %>%
  filter(n() > 1) %>%
  mutate(flag = flag*(diff(d) < 10)) %>%
  ungroup() %>%
  filter(flag > 0) %>%
  select(-flag)
## A tibble: 4 x 4
#      d     b     c     a
#  <dbl> <dbl> <dbl> <dbl>
#1  3403   100     3     1
#2  3407   100     1     1
#3  3436   100     3     1
#4  3445   100     1     1
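To see why this pipeline works, it can help to look at the intermediate flag column on its own. The sketch below (the name steps is just for illustration) shows that cumsum(c == 3) starts a new group at every row where c is 3, so the first two rows of each group form a candidate pair:

```r
library(dplyr)

df <- data.frame(
  d = c(3400, 3403, 3407, 3408, 3412, 3423, 3434, 3436, 3445, 3454, 3645),
  b = 100,
  c = c(1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3),
  a = 1
)

# The running count of 3s gives every row a group id
steps <- df %>% mutate(flag = cumsum(c == 3))
steps$flag
# 0 1 1 1 2 2 2 3 3 3 4
```

Group 0 (the row before the first 3) and group 4 (a trailing 3 with no partner) contain a single row and are dropped by filter(n() > 1); the remaining groups are kept only when diff(d) < 10. Note that the pipeline never tests c == 1 for the second row, so it assumes c only takes the values 1 and 3, as in the example.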


 