Using dplyr to find the first occurrence of a value only after another value

Question

I would like to find the first row of a value, but only when it occurs after another value. I have a time-series data set of bird nest box use, and for each box I would like to filter to the row where the box is first vacated after being occupied. Here's a simplified example of the data:

# A tibble: 20 x 3
   NestID Date       Status  
   <chr>  <date>     <chr>   
 1 WA18   2019-02-01 Empty   
 2 WA18   2019-02-02 Empty   
 3 WA18   2019-02-03 Empty   
 4 WA18   2019-02-04 Occupied
 5 WA18   2019-02-05 Occupied
 6 WA18   2019-02-06 Occupied
 7 WA18   2019-02-07 Empty   
 8 WA18   2019-02-08 Empty 

dat <- structure(list(NestID = c("WA18", "WA18", "WA18", "WA18", "WA18", 
    "WA18", "WA18", "WA18", "WA18", "WA20", "WA20", "WA20", "WA20", 
    "WA20", "WA20", "WA20", "WA20", "WA20", "WA20", "WA20"), Date = structure(c(17928, 
    17929, 17930, 17931, 17932, 17933, 17934, 17935, 17936, 17555, 
    17556, 17557, 17558, 17559, 17560, 17561, 17562, 17563, 17564, 
    17565), class = "Date"), Status = c("Empty", "Empty", "Empty", 
    "Occupied", "Occupied", "Occupied", "Empty", "Empty", "Empty", 
    "Empty", "Empty", "Empty", "Empty", "Empty", "Empty", "Occupied", 
    "Occupied", "Empty", "Empty", "Empty")), class = c("tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -20L))

So for nest WA18, I want to filter to the row where the date is 2019-02-07 (the first date on which the box is recorded as empty after being occupied). I'm not sure of the best way to index that row, but I would like to use dplyr to do so.

Answer 1

You can use lag() to get the value of the preceding row within each group:

library(dplyr)

dat %>%
  group_by(NestID) %>%
  # lag(Status) is NA on each group's first row, and filter() drops NA rows
  filter(Status == "Empty" &
           lag(Status) == "Occupied")


#    NestID Date       Status
#    <chr>  <date>     <chr> 
#  1 WA18   2019-02-07 Empty 
#  2 WA20   2018-02-01 Empty
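
Note that this filter keeps every Occupied-to-Empty transition, so a box that is occupied and vacated more than once would return several rows. Below is a minimal sketch of one way to keep only the first such row per nest. It uses a small made-up data frame (`toy`, hypothetical, not the question's data) and assumes that "first" means earliest by Date:

```r
library(dplyr)

# Hypothetical toy data: box "A" is vacated twice, box "B" once.
toy <- data.frame(
  NestID = c(rep("A", 6), rep("B", 3)),
  Date   = as.Date("2019-02-01") + c(0:5, 10:12),
  Status = c("Empty", "Occupied", "Empty", "Occupied", "Empty", "Empty",
             "Occupied", "Occupied", "Empty"),
  stringsAsFactors = FALSE
)

first_vacated <- toy %>%
  group_by(NestID) %>%
  arrange(Date, .by_group = TRUE) %>%   # make sure lag() sees rows in time order
  filter(Status == "Empty", lag(Status) == "Occupied") %>%
  slice(1) %>%                          # keep only the first transition per nest
  ungroup()

first_vacated
```

Nests that are never vacated after being occupied simply produce no row, because slice(1) on an empty group returns nothing.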

Answer 2

With data.table:

library(data.table)

setDT(dat)[, .SD[Status == "Empty" & shift(Status) == "Occupied"], by = NestID]

Output:

   NestID       Date Status
1:   WA18 2019-02-07  Empty
2:   WA20 2018-02-01  Empty
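
If a box can be vacated more than once, the data.table expression returns one row per transition. Here is a hedged sketch of restricting it to the first transition per nest, again on made-up data (`toy_dt` is hypothetical) and assuming that ordering by Date defines "first":

```r
library(data.table)

# Hypothetical toy data: box "A" is vacated twice, box "B" once.
toy_dt <- data.table(
  NestID = c(rep("A", 6), rep("B", 3)),
  Date   = as.Date("2019-02-01") + c(0:5, 10:12),
  Status = c("Empty", "Occupied", "Empty", "Occupied", "Empty", "Empty",
             "Occupied", "Occupied", "Empty")
)

# Sort by reference so shift() compares each row with the previous date.
setorder(toy_dt, NestID, Date)

# head(..., 1) keeps the first matching row and yields zero rows
# (rather than a row of NAs) for a nest with no transition.
first_vacated_dt <- toy_dt[
  ,
  head(.SD[Status == "Empty" & shift(Status) == "Occupied"], 1),
  by = NestID
]

first_vacated_dt
```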
