Let's say I have a data frame like this:
dt <-
  data.frame(
    date = as.Date(
      c("2022-01-01", "2022-01-03", "2022-01-05", "2022-01-06", "2022-01-07", "2022-02-01", "2022-02-01"))
  )
I would like to calculate sequences of dates where the difftime between the first and the last date in a sequence is less than or equal to 2 days. Once a sequence reaches its last possible day, I would like to create sequences from all the following dates.
In other words: the dataset, and therefore the first sequence, starts with 2022-01-01, so it will be marked with 0. 2022-01-03 will be marked with 1 because it is part of the sequence that started on 2022-01-01.
2022-01-05 can't be marked with 0 because the difftime between 2022-01-01 and 2022-01-05 is greater than 2 days. This date is the beginning of a new sequence, and all following dates where the difftime is less than or equal to 2 days (2022-01-06 and 2022-01-07) will be marked with 0.
Similarly with 2022-02-01 (please note that there could be duplicate dates in the dataset).
I prefer a dplyr solution, but if you can come up with another one, I would really appreciate your help.
Expected result:
result <-
  data.frame(
    date = as.Date(
      c("2022-01-01", "2022-01-03", "2022-01-05", "2022-01-06", "2022-01-07", "2022-02-01", "2022-02-01")),
    flag = c(0, 1, 1, 0, 0, 1, 0)
  )
We may use diff to get the difference between adjacent 'date' values, convert it to a logical vector (> 1), and coerce the logical to binary with + or as.integer. The leading 0 covers the first row, which has no previous date to compare against.
library(dplyr)
dt <- dt %>%
  mutate(flag = +(c(0, diff(date) > 1)))
-output
dt
        date flag
1 2022-01-01    0
2 2022-01-03    1
3 2022-01-05    1
4 2022-01-06    0
5 2022-01-07    0
6 2022-02-01    1
7 2022-02-01    0
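To make the steps of the diff approach easier to follow, here is a minimal base R sketch that spells out each intermediate value and uses as.integer instead of the unary +. The variable names (dates, gaps, is_gap) are made up for this illustration and are not part of the original answer.

```r
# Same dates as in the question
dates <- as.Date(c("2022-01-01", "2022-01-03", "2022-01-05", "2022-01-06",
                   "2022-01-07", "2022-02-01", "2022-02-01"))

# Step 1: difference between adjacent dates, in days (2, 2, 1, 1, 25, 0)
gaps <- diff(dates)

# Step 2: TRUE where the gap to the previous date is more than 1 day
is_gap <- gaps > 1

# Step 3: prepend FALSE for the first row (no previous date),
# then coerce the logical vector to 0/1
flag <- as.integer(c(FALSE, is_gap))

print(data.frame(date = dates, flag = flag))
```

Note that as.integer returns an integer vector, whereas the `+(c(0, ...))` form returns a double. The printed values are the same.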
Or with lag and difftime
dt %>%
  mutate(flag = +(difftime(date, lag(date, default = first(date)),
                           units = "days") > 1))
        date flag
1 2022-01-01    0
2 2022-01-03    1
3 2022-01-05    1
4 2022-01-06    0
5 2022-01-07    0
6 2022-02-01    1
7 2022-02-01    0
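Since the question is open to solutions outside dplyr, the lag-and-difftime idea could also be sketched in base R. This is only an illustration under the same assumption as above (a flag of 1 whenever the gap to the previous row exceeds 1 day). The helper vector prev is a hypothetical name that stands in for dplyr's lag with default = first(date).

```r
dt <- data.frame(
  date = as.Date(c("2022-01-01", "2022-01-03", "2022-01-05", "2022-01-06",
                   "2022-01-07", "2022-02-01", "2022-02-01"))
)

# Previous date for every row; the first row is compared with itself,
# so its difference is 0 and its flag is 0
prev <- c(dt$date[1], head(dt$date, -1))

# 1 where the gap to the previous date is more than 1 day, else 0
dt$flag <- as.integer(difftime(dt$date, prev, units = "days") > 1)

print(dt)
```

Both versions compare each date only with the row directly before it, so they assume the data frame is already sorted by date.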