
How to identify the records that belong to a certain time interval when I know the start and end records of that interval? (R)

So, here is my problem. I have a dataset of locations of radio-tagged hummingbirds I've been following as part of my thesis. As you might imagine, they fly fast, so there were intervals when I lost track of them until I eventually found them again. Now I am trying to identify the segments where the bird was followed continuously (i.e., the intervals between “Lost” periods).

    ID  Type        TimeStart   TimeEnd     Limiter Starter Ender
    1   Observed    6:45:00     6:45:00     NO      Start   End 
    2   Lost        6:45:00     5:31:00     YES     NO      NO  
    3   Observed    5:31:00     5:31:00     NO      Start   NO  
    4   Observed    9:48:00     9:48:00     NO      NO      NO  
    5   Observed    10:02:00    10:02:00    NO      NO      NO  
    6   Observed    10:18:00    10:18:00    NO      NO      NO  
    7   Observed    11:00:00    11:00:00    NO      NO      NO  
    8   Observed    13:15:00    13:15:00    NO      NO      NO  
    9   Observed    13:34:00    13:34:00    NO      NO      NO  
    10  Observed    13:43:00    13:43:00    NO      NO      NO  
    11  Observed    13:52:00    13:52:00    NO      NO      NO  
    12  Observed    14:25:00    14:25:00    NO      NO      NO  
    13  Observed    14:46:00    14:46:00    NO      NO      End 
    14  Lost        14:46:00    10:47:00    YES     NO      NO  
    15  Observed    10:47:00    10:47:00    NO      Start   NO  
    16  Observed    10:57:00    11:00:00    NO      NO      NO  
    17  Observed    11:10:00    11:10:00    NO      NO      NO  
    18  Observed    11:19:00    11:27:55    NO      NO      NO  
    19  Observed    11:28:05    11:32:00    NO      NO      NO  
    20  Observed    11:45:00    12:09:00    NO      NO      NO  
    21  Observed    11:51:00    11:51:00    NO      NO      NO  
    22  Observed    12:11:00    12:11:00    NO      NO      NO  
    23  Observed    13:15:00    13:15:00    NO      NO      End 
    24  Lost        13:15:00    7:53:00     YES     NO      NO  
    25  Observed    7:53:00     7:53:00     NO      Start   NO  
    26  Observed    8:48:00     8:48:00     NO      NO      NO  
    27  Observed    9:25:00     9:25:00     NO      NO      NO  
    28  Observed    9:26:00     9:26:00     NO      NO      NO  
    29  Observed    9:32:00     9:33:25     NO      NO      NO  
    30  Observed    9:33:35     9:33:35     NO      NO      NO  
    31  Observed    9:42:00     9:42:00     NO      NO      NO  
    32  Observed    9:44:00     9:44:00     NO      NO      NO  
    33  Observed    9:48:00     9:48:00     NO      NO      NO  
    34  Observed    9:48:30     9:48:30     NO      NO      NO  
    35  Observed    9:51:00     9:51:00     NO      NO      NO  
    36  Observed    9:54:00     9:54:00     NO      NO      NO  
    37  Observed    9:55:00     9:55:00     NO      NO      NO  
    38  Observed    9:57:00     10:01:00    NO      NO      NO  
    39  Observed    10:02:00    10:02:00    NO      NO      NO  
    40  Observed    10:04:00    10:04:00    NO      NO      NO  
    41  Observed    10:06:00    10:06:00    NO      NO      NO  
    42  Observed    10:20:00    10:33:00    NO      NO      NO  
    43  Observed    10:34:00    10:34:00    NO      NO      NO  
    44  Observed    10:39:00    10:39:00    NO      NO      End 

Note: When a row has both a “Start” and an “End”, it is because that non-lost period consists of that single record.

I was able to identify the records that start or end these “non-lost” periods (in the columns “Starter” and “Ender”), but now I want to label each period with a unique identifier (period A, B, C or 1, 2, 3, etc.). Ideally, the identifier would be the ID of the record that starts the period (i.e., ID[Starter == "Start"]).

I'm looking for something like this:

    ID  Type        TimeStart   TimeEnd     Limiter Starter Ender   Period
    1   Observed    6:45:00     6:45:00     NO      Start   End     1
    2   Lost        6:45:00     5:31:00     YES     NO      NO      Lost    
    3   Observed    5:31:00     5:31:00     NO      Start   NO      3
    4   Observed    9:48:00     9:48:00     NO      NO      NO      3
    5   Observed    10:02:00    10:02:00    NO      NO      NO      3
    6   Observed    10:18:00    10:18:00    NO      NO      NO      3
    7   Observed    11:00:00    11:00:00    NO      NO      NO      3
    8   Observed    13:15:00    13:15:00    NO      NO      NO      3
    9   Observed    13:34:00    13:34:00    NO      NO      NO      3
    10  Observed    13:43:00    13:43:00    NO      NO      NO      3
    11  Observed    13:52:00    13:52:00    NO      NO      NO      3
    12  Observed    14:25:00    14:25:00    NO      NO      NO      3
    13  Observed    14:46:00    14:46:00    NO      NO      End     3
    14  Lost        14:46:00    10:47:00    YES     NO      NO      Lost    
    15  Observed    10:47:00    10:47:00    NO      Start   NO      15
    16  Observed    10:57:00    11:00:00    NO      NO      NO      15
    17  Observed    11:10:00    11:10:00    NO      NO      NO      15
    18  Observed    11:19:00    11:27:55    NO      NO      NO      15
    19  Observed    11:28:05    11:32:00    NO      NO      NO      15
    20  Observed    11:45:00    12:09:00    NO      NO      NO      15
    21  Observed    11:51:00    11:51:00    NO      NO      NO      15
    22  Observed    12:11:00    12:11:00    NO      NO      NO      15
    23  Observed    13:15:00    13:15:00    NO      NO      End     15
    24  Lost        13:15:00    7:53:00     YES     NO      NO      Lost    

Would this be too hard to do in R?

Thanks!

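> # Rebuild only the marker columns from the question (44 rows).
> # Note: Limiter is set to "Yes" here, while the question's data uses "YES".
> d <- data.frame(Limiter = rep("NO", 44), Starter = rep("NO", 44), Ender = rep("NO", 44), stringsAsFactors = FALSE)
> d$Starter[c(1, 3, 15, 25)] <- "Start"
> d$Ender[c(1, 13, 23, 44)] <- "End"
> d$Limiter[c(2, 14, 24)] <- "Yes"
> # cumsum() counts the "Start" rows seen so far; indexing which() with that count
> # carries the row number of the most recent "Start" forward. Limiter rows become "Lost".
> d$Period <- ifelse(d$Limiter == "Yes", "Lost", which(d$Starter == "Start")[cumsum(d$Starter == "Start")])
> d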
       Limiter Starter Ender Period
1       NO   Start   End      1
2      Yes      NO    NO   Lost
3       NO   Start    NO      3
4       NO      NO    NO      3
5       NO      NO    NO      3
6       NO      NO    NO      3
7       NO      NO    NO      3
8       NO      NO    NO      3
9       NO      NO    NO      3
10      NO      NO    NO      3
11      NO      NO    NO      3
12      NO      NO    NO      3
13      NO      NO   End      3
14     Yes      NO    NO   Lost
15      NO   Start    NO     15
16      NO      NO    NO     15
17      NO      NO    NO     15
18      NO      NO    NO     15
19      NO      NO    NO     15
20      NO      NO    NO     15
21      NO      NO    NO     15
22      NO      NO    NO     15
23      NO      NO   End     15
24     Yes      NO    NO   Lost
25      NO   Start    NO     25
26      NO      NO    NO     25
27      NO      NO    NO     25
28      NO      NO    NO     25
29      NO      NO    NO     25
30      NO      NO    NO     25
31      NO      NO    NO     25
32      NO      NO    NO     25
33      NO      NO    NO     25
34      NO      NO    NO     25
35      NO      NO    NO     25
36      NO      NO    NO     25
37      NO      NO    NO     25
38      NO      NO    NO     25
39      NO      NO    NO     25
40      NO      NO    NO     25
41      NO      NO    NO     25
42      NO      NO    NO     25
43      NO      NO    NO     25
44      NO      NO   End     25
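
The transcript above labels periods by row number, which matches ID only because the IDs in the question run 1, 2, 3, ... in row order. As a minimal sketch of the same cumsum() idea applied to the question's own columns, the following takes the label from ID and marks lost rows via Type. The eight-row data frame is a made-up toy example, not the real dataset, and the guard for rows before the first “Start” is an added assumption about how such rows should be handled (they are left as NA).

```r
# Hypothetical toy data using the question's column names (not the real dataset).
d <- data.frame(
  ID      = 1:8,
  Type    = c("Observed", "Lost", "Observed", "Observed",
              "Observed", "Lost", "Observed", "Observed"),
  Starter = c("Start", "NO", "Start", "NO", "NO", "NO", "Start", "NO"),
  stringsAsFactors = FALSE
)

is_start  <- d$Starter == "Start"
start_ids <- d$ID[is_start]     # IDs of the records that open a period
grp       <- cumsum(is_start)   # number of periods begun up to each row

# Carry the most recent start ID forward. Rows before the first "Start"
# have grp == 0 and stay NA (an assumption, the question has no such rows).
period <- rep(NA_character_, nrow(d))
has_start <- grp > 0
period[has_start] <- as.character(start_ids[grp[has_start]])

# Lost rows get their own label, as in the desired output.
period[d$Type == "Lost"] <- "Lost"
d$Period <- period
```

On this toy data, Period comes out as 1, Lost, 3, 3, 3, Lost, 7, 7. Because the label is looked up in ID rather than taken from the row position, the sketch should still work if the data frame has been subset or the IDs are not consecutive.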
