
Find first occurrence of a value only after another value using dplyr

I would like to find the first row with a given value, but only when it occurs after another value. I have a time-series data set of bird nest box use, and for each box I would like to filter to the row where the box is first vacated after being occupied. Here's a simplified example of the data:

# A tibble: 20 x 3
   NestID Date       Status  
   <chr>  <date>     <chr>   
 1 WA18   2019-02-01 Empty   
 2 WA18   2019-02-02 Empty   
 3 WA18   2019-02-03 Empty   
 4 WA18   2019-02-04 Occupied
 5 WA18   2019-02-05 Occupied
 6 WA18   2019-02-06 Occupied
 7 WA18   2019-02-07 Empty   
 8 WA18   2019-02-08 Empty 

dat <- structure(list(NestID = c("WA18", "WA18", "WA18", "WA18", "WA18", 
    "WA18", "WA18", "WA18", "WA18", "WA20", "WA20", "WA20", "WA20", 
    "WA20", "WA20", "WA20", "WA20", "WA20", "WA20", "WA20"), Date = structure(c(17928, 
    17929, 17930, 17931, 17932, 17933, 17934, 17935, 17936, 17555, 
    17556, 17557, 17558, 17559, 17560, 17561, 17562, 17563, 17564, 
    17565), class = "Date"), Status = c("Empty", "Empty", "Empty", 
    "Occupied", "Occupied", "Occupied", "Empty", "Empty", "Empty", 
    "Empty", "Empty", "Empty", "Empty", "Empty", "Empty", "Occupied", 
    "Occupied", "Empty", "Empty", "Empty")), class = c("tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -20L))

So for nest WA18, I want to filter to the row where the date is 2019-02-07 (the first date the box is recorded as empty after being occupied). I'm not sure what the best way to index that row is, but I would like to use dplyr to do so.

You can use lag() to get the value of the preceding row within each group:

library(dplyr)

dat %>%
  group_by(NestID) %>%
  # keep rows that are Empty and whose previous row in the same nest was Occupied
  filter(Status == "Empty" &
           lag(Status) == "Occupied")


#    NestID Date       Status
#    <chr>  <date>     <chr> 
#  1 WA18   2019-02-07 Empty 
#  2 WA20   2018-02-01 Empty 
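
Note that this filter keeps every Empty row that directly follows an Occupied row, so a box that is occupied and vacated more than once would return several rows. Below is a minimal sketch of restricting the result to the first such row per box. It assumes dplyr 1.0.0 or later (for `slice_head()`), and the data (`toy`, nest "WA99") is made up for illustration and is not part of the question's data set:

```r
library(dplyr)

# Hypothetical nest box with two occupied/vacated cycles (illustration only)
toy <- tibble(
  NestID = "WA99",
  Date   = as.Date("2019-03-01") + 0:6,
  Status = c("Empty", "Occupied", "Empty", "Empty",
             "Occupied", "Occupied", "Empty")
)

# lag() shifts the vector down by one position; the first element becomes NA
lag(toy$Status)

first_vacated <- toy %>%
  arrange(NestID, Date) %>%        # make sure rows are in date order
  group_by(NestID) %>%
  filter(Status == "Empty", lag(Status) == "Occupied") %>%
  slice_head(n = 1) %>%            # keep only the first transition per nest
  ungroup()

first_vacated
```

Because `filter()` drops rows where the condition is `NA`, the first row of each group (where `lag()` returns `NA`) is never selected.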

With data.table:

library(data.table)

setDT(dat)[, .SD[Status == "Empty" & shift(Status) == "Occupied"], by = NestID]

Output:

   NestID       Date Status
1:   WA18 2019-02-07  Empty
2:   WA20 2018-02-01  Empty
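
As with any filter on consecutive rows, the data.table expression returns every Empty row that follows an Occupied row, not only the first one. Below is a rough sketch of keeping just the first match per nest by wrapping the subset in `head(..., 1)`. The table `toy_dt` and nest "WA99" are invented for illustration, and the rows are assumed to be sortable by `Date`:

```r
library(data.table)

# Hypothetical nest box with two occupied/vacated cycles (illustration only)
toy_dt <- data.table(
  NestID = "WA99",
  Date   = as.Date("2019-03-01") + 0:6,
  Status = c("Empty", "Occupied", "Empty", "Empty",
             "Occupied", "Occupied", "Empty")
)

# Sort by nest and date so that shift() really looks at the previous day
setorder(toy_dt, NestID, Date)

# shift() is data.table's lag; head(..., 1) keeps the first match per nest
# and returns zero rows (rather than an NA row) when a nest has no match
first_vacated_dt <- toy_dt[
  ,
  head(.SD[Status == "Empty" & shift(Status) == "Occupied"], 1),
  by = NestID
]

first_vacated_dt
```

Keep in mind that `setDT()` in the answer above converts `dat` to a data.table by reference, so the original object itself is modified.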


