Combining rows that meet a condition in R

Question

I have a standard data frame in which individuals perform a certain behavior over a period of time. When an incident occurs within 50 seconds of the previous incident (Delay <= 50), I would like to combine it with the previous incident. That is, each resulting incident would have either a Delay of NA (first incident) or a Delay > 50. The Start time would then be the start time of the first incident in the group (the one with Delay NA or > 50), and the End time would be that of the last incident with Delay <= 50 (see the example data below). I would also like the sum of X1 within the combined incidents. Hopefully the data below clarifies exactly what I am looking for.

Original Data:

ID          Incident    Start   End     X1   Delay
Person A    1           747     748     735  NA
Person A    2           868     882     384  120
Person A    3           998     999     354  116
Person A    4           1057    1059    382  58
Person A    5           1063    1064    138  4
Person A    6           1077    1078    138  13
Person A    7           1412    1413    384  334
Person B    1           739     740     387  NA
Person B    2           742     743     132  2
Person B    3           760     761     386  17
Person B    4           768     769     731  7
Person B    5           835     835     894  66
Person B    6           838     839     891  3
Person B    7           925     926     385  86

Desired Data:

ID          Iteration   Start   End     X1      Delay
Person A    1           747     748     735     NA
Person A    2           868     882     384     120
Person A    3           998     999     354     116
Person A    4           1057    1078    658     58
Person A    5           1412    1413    384     334
Person B    1           739     769     1636    NA
Person B    2           835     839     1785    66
Person B    3           925     926     385     86

I have tried multiple things; the issue is that I can't just aggregate by ID, because the same person might have several separate incidents.

Thanks, and let me know if you need any more information.

Answer 1

I think you have a mistake in your desired result table. Line 5 should be Person A.

Here's a way to do that with dplyr. The rationale is that we first combine incidents using cumsum: whenever a delay is > 50 or NA, the incident number is increased by one. Then, we summarise on this new incident column.
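To see how the cumsum step numbers the combined incidents, here is a small sketch using only Person A's Delay values from the question. The variable names `delay`, `starts_new` and `group` are illustrative and do not appear in the original answer.

```r
# Delay values for Person A, copied from the question's data
delay <- c(NA, 120, 116, 58, 4, 13, 334)

# TRUE where a row starts a new combined incident:
# either the first incident (NA) or more than 50 seconds after the previous one
starts_new <- delay > 50 | is.na(delay)

# The running count of TRUE values gives each row its combined-incident number
group <- cumsum(starts_new)
print(group)
# [1] 1 2 3 4 4 4 5
```

Rows 4 to 6 share the number 4, so they end up in the same combined incident once the data is grouped on that number.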

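library(dplyr)

df %>%
  group_by(ID) %>%
  # start a new incident number whenever Delay is NA or exceeds 50
  mutate(Incident = cumsum(Delay > 50 | is.na(Delay))) %>%
  group_by(ID, Incident) %>%
  # collapse each combined incident into a single row
  summarise(Start = first(Start), End = last(End), X1 = sum(X1), Delay = first(Delay))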

       ID Incident Start   End    X1 Delay
    <chr>    <int> <int> <int> <int> <int>
1 PersonA        1   747   748   735    NA
2 PersonA        2   868   882   384   120
3 PersonA        3   998   999   354   116
4 PersonA        4  1057  1078   658    58
5 PersonA        5  1412  1413   384   334
6 PersonB        1   739   769  1636    NA
7 PersonB        2   835   839  1785    66
8 PersonB        3   925   926   385    86
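
For readers who would rather not depend on dplyr, the same grouping idea can probably be expressed in base R as well. The following is only a sketch of an alternative, not part of the original answer; the column name `Group` and the helper variables `starts_new`, `pieces` and `result` are assumptions introduced for illustration.

```r
# Same data as in the question, read into a data frame
df <- read.table(text = "ID Incident Start End X1 Delay
PersonA 1 747 748 735 NA
PersonA 2 868 882 384 120
PersonA 3 998 999 354 116
PersonA 4 1057 1059 382 58
PersonA 5 1063 1064 138 4
PersonA 6 1077 1078 138 13
PersonA 7 1412 1413 384 334
PersonB 1 739 740 387 NA
PersonB 2 742 743 132 2
PersonB 3 760 761 386 17
PersonB 4 768 769 731 7
PersonB 5 835 835 894 66
PersonB 6 838 839 891 3
PersonB 7 925 926 385 86", header = TRUE, stringsAsFactors = FALSE)

# Number the combined incidents within each person (cumsum per ID)
starts_new <- df$Delay > 50 | is.na(df$Delay)
df$Group <- ave(as.integer(starts_new), df$ID, FUN = cumsum)

# Split into one piece per (ID, Group) and collapse each piece to a single row
pieces <- split(df, list(df$ID, df$Group), drop = TRUE)
result <- do.call(rbind, lapply(pieces, function(p) {
  data.frame(ID = p$ID[1], Incident = p$Group[1],
             Start = p$Start[1], End = p$End[nrow(p)],
             X1 = sum(p$X1), Delay = p$Delay[1])
}))

# Restore the ID / Incident ordering and drop the split-generated row names
result <- result[order(result$ID, result$Incident), ]
rownames(result) <- NULL
print(result)
```

This relies on the rows already being in time order within each ID, as they are in the question's data.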

Data

df <- read.table(text="ID  Incident  Start  End X1 Delay
PersonA    1           747     748     735  NA
PersonA    2           868     882     384  120
PersonA    3           998     999     354  116
PersonA    4           1057    1059    382  58
PersonA    5           1063    1064    138  4
PersonA    6           1077    1078    138  13
PersonA    7           1412    1413    384  334
PersonB    1           739     740     387  NA
PersonB    2           742     743     132  2
PersonB    3           760     761     386  17
PersonB    4           768     769     731  7
PersonB    5           835     835     894  66
PersonB    6           838     839     891  3
PersonB    7           925     926     385  86",header=TRUE,stringsAsFactors=FALSE)

Answer 1
0 2017-06-19 22:24:03
