How to transform time codes to turn codes with dplyr if the dataframe includes more than one event per code
I want to transform time codes like these:
library(lubridate)
library(tidyverse)
df_time <- tibble(time = c(ymd_hms("2020_01_01 00:00:01"),
                           ymd_hms("2020_01_01 00:00:02"),
                           ymd_hms("2020_01_01 00:00:03"),
                           ymd_hms("2020_01_01 00:00:04"),
                           ymd_hms("2020_01_01 00:00:05"),
                           ymd_hms("2020_01_01 00:00:06"),
                           ymd_hms("2020_01_01 00:00:07"),
                           ymd_hms("2020_01_01 00:00:08"),
                           ymd_hms("2020_01_01 00:00:09"),
                           ymd_hms("2020_01_01 00:00:10")),
                  a = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 0),
                  b = c(0, 0, 1, 1, 0, 1, 1, 1, 0, 0))
resulting in
> df_time
# A tibble: 10 x 3
   time                    a     b
   <dttm>              <dbl> <dbl>
 1 2020-01-01 00:00:01     0     0
 2 2020-01-01 00:00:02     1     0
 3 2020-01-01 00:00:03     1     1
 4 2020-01-01 00:00:04     1     1
 5 2020-01-01 00:00:05     1     0
 6 2020-01-01 00:00:06     0     1
 7 2020-01-01 00:00:07     0     1
 8 2020-01-01 00:00:08     1     1
 9 2020-01-01 00:00:09     1     0
10 2020-01-01 00:00:10     0     0
into turn codes (also known as event codes or "start-stop data"). The result should look like the following data frame:
df_turn <- tibble(start = c(ymd_hms("2020_01_01 00:00:02"),
                            ymd_hms("2020_01_01 00:00:03"),
                            ymd_hms("2020_01_01 00:00:06"),
                            ymd_hms("2020_01_01 00:00:08")),
                  end = c(ymd_hms("2020_01_01 00:00:05"),
                          ymd_hms("2020_01_01 00:00:04"),
                          ymd_hms("2020_01_01 00:00:08"),
                          ymd_hms("2020_01_01 00:00:09")),
                  code = c("a", "b", "b", "a"))
> df_turn
# A tibble: 4 x 3
  start               end                 code
  <dttm>              <dttm>              <chr>
1 2020-01-01 00:00:02 2020-01-01 00:00:05 a
2 2020-01-01 00:00:03 2020-01-01 00:00:04 b
3 2020-01-01 00:00:06 2020-01-01 00:00:08 b
4 2020-01-01 00:00:08 2020-01-01 00:00:09 a
This great post, "how to transform time codes into turn codes", provides a solution for one event per code, but not for more than one.
Thanks!
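I'll offer this solution, adapted from a similar task. The idea is to reshape the data to long format, flag each switch from 0 to 1 within a code, and number the events with a cumulative sum of those flags: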
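df_time %>%
  # one row per time stamp and code
  pivot_longer(-time) %>%
  group_by(name) %>%
  # +1 marks a 0 -> 1 switch; default = 0 also covers a code that starts at 1
  mutate(tmp = value - lag(value, default = 0)) %>%
  filter(value == 1) %>%
  # the running count of switches numbers the events within each code
  mutate(tmp = cumsum(tmp)) %>%
  group_by(name, tmp) %>%
  summarise(start = range(time)[1],
            end = range(time)[2])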
# A tibble: 4 x 4
# Groups:   name [2]
  name    tmp start               end
  <chr> <dbl> <dttm>              <dttm>
1 a         1 2020-01-01 00:00:02 2020-01-01 00:00:05
2 a         2 2020-01-01 00:00:08 2020-01-01 00:00:09
3 b         1 2020-01-01 00:00:03 2020-01-01 00:00:04
4 b         2 2020-01-01 00:00:06 2020-01-01 00:00:08
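The result above has the right intervals but not quite the shape asked for in the question (columns start, end, code, ordered by start time). As a minimal sketch of one way to get there, the variant below uses the same run-numbering idea; the names `run_id` and `turns` are my own illustrative choices and do not come from the original posts, and the example data is rebuilt compactly under the assumption that the time stamps are one second apart.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Same example data as in the question, built more compactly
# (adding an integer vector to a POSIXct value adds seconds).
df_time <- tibble(
  time = ymd_hms("2020-01-01 00:00:00") + 1:10,
  a = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 0),
  b = c(0, 0, 1, 1, 0, 1, 1, 1, 0, 0)
)

turns <- df_time %>%
  pivot_longer(-time, names_to = "code") %>%
  group_by(code) %>%
  # run_id (hypothetical name) goes up by one at every 0 -> 1 switch
  mutate(run_id = cumsum(value == 1 & lag(value, default = 0) == 0)) %>%
  filter(value == 1) %>%
  group_by(code, run_id) %>%
  summarise(start = min(time), end = max(time), .groups = "drop") %>%
  arrange(start) %>%
  select(start, end, code)

turns
```

Like the solution above, this treats an event that is still running at the last time stamp as ending at that time stamp, which may or may not be what you want for open-ended events.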