
Determine if a weekday is between two dates in R

[Update]

In another thread, @Frank's answer solved this problem. This question has been marked as a duplicate of another question.


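[Question]

I am writing a function in R to test whether a weekday falls between two dates. This is what I have, but I don't think the solution is elegant. Is there a more mathematical way to do this?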

library(data.table) ## wday is a function in this package
isDayIn <- function(weekday, date1, date2) {
  if (weekday<1 | weekday>7) stop("weekday must be an integer from 1 to 7.")
  date1 <- as.Date(date1)
  date2 <- as.Date(date2)
  output <- weekday %in% unique(wday(seq.Date(date1, date2, by=1)))
  return(output)
}

## 2015-08-02 is a Sunday and 2015-08-03 is a Monday
isDayIn(1, "2015-08-02", "2015-08-03")
> TRUE
isDayIn(7, "2015-08-02", "2015-08-03")
> FALSE

Note: the function wday counts from Sunday to Saturday, so Sunday maps to the integer 1 and Saturday maps to the integer 7.
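
To make the numbering concrete, the sketch below compares three conventions on the two dates from the question. It uses only base R; the variable names are illustrative, and the Sunday = 1 value is reconstructed from `%w` rather than by calling `data.table`'s `wday`.

```r
# Two dates from the question: a Sunday and a Monday.
d <- as.Date(c("2015-08-02", "2015-08-03"))

# %w: Sunday = 0, ..., Saturday = 6 (returned as character, so convert).
w0 <- as.integer(format(d, "%w"))

# %u: ISO 8601 numbering, Monday = 1, ..., Sunday = 7.
iso <- as.integer(format(d, "%u"))

# Sunday = 1, ..., Saturday = 7, the convention described in the note above.
sun1 <- w0 + 1L

print(data.frame(date = d, w0 = w0, iso = iso, sun1 = sun1))
```

The Sunday comes out as 0, 7 and 1 under the three schemes, so a weekday argument only has meaning once the convention is fixed.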

Another option, using base R functions:

isDayIn <- function(weekday, date1, date2) {
  if (weekday<1 | weekday>7) stop("weekday must be an integer from 1 to 7.")
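  # %w counts Sunday as 0 ... Saturday as 6, so add 1 to match the Sunday = 1 convention
  days <- as.integer(strftime(seq(as.Date(date1), as.Date(date2), by="day"), format="%w")) + 1
  weekday %in% days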
}

isDayIn(1, "2015-08-02", "2015-08-03")
[1] TRUE
isDayIn(7, "2015-08-02", "2015-08-03")
[1] FALSE

I think your solution is fine. But here is a quick fix:

isDayIn <- function(weekday, date1, date2) {
  if (weekday<1 | weekday>7) stop("weekday must be an integer from 1 to 7.")
  require(lubridate)
  date1 <- as.Date(date1)
  date2 <- as.Date(date2)
  if (as.integer(date2 - date1) >= 7) {
    return(TRUE) # by default
  } else {
    return(weekday %in% wday(seq.Date(date1, date2, by=1)))
  }
}

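There are already good solutions, but none of them avoids generating the full sequence of days. I tried to find a solution that only compares weekday numbers (and week numbers). Internally it uses Monday as the first day of the week, but the parameter startWithSunday makes it possible to treat Sunday as day 1. An alternative would be to switch between strftime's %V and %U, but this approach seemed more straightforward to me.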

isDayIn1 <- function(weekday, date1, date2, startWithSunday = FALSE) {

  if (weekday < 1 | weekday > 7) stop("weekday must be an integer from 1 to 7.")

  if(startWithSunday) {
    weekday <- (weekday - 2) %% 7 + 1  # Sunday-first to ISO: 1 (Sun) -> 7, 2 (Mon) -> 1, ..., 7 (Sat) -> 6
  }

  dates <- sort(as.Date(c(date1, date2)))

  if (dates[2] - dates[1] >= 7) return(TRUE)

  weeks <- strftime(dates, "%V")
  days  <- as.integer(strftime(dates, "%u"))  # ISO weekday numbers, compared numerically below

  if (weeks[1] == weeks[2]) { # Dates are in the same week.
    return(weekday >= days[1] & weekday <= days[2])
  } else { # Different weeks.
    return(weekday >= days[1] | weekday <= days[2])
  }
}

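The function looks like a lot of code for such a small task, but most of it is just preparation; the actual work is done in the two return statements. The trick is to distinguish the case where the dates fall in the same week from the case where they fall in different weeks, because that determines which comparison we should make.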
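
The same-week/different-week split can also be folded into a single modular comparison, which is closer to the "more mathematical" approach the question asks for. The sketch below is an alternative, not the code from this answer: the name `isDayInMod` is made up here, and it assumes ISO numbering (Monday = 1, Sunday = 7) and an inclusive date range.

```r
# Hypothetical helper: is `weekday` (ISO, Monday = 1 ... Sunday = 7)
# hit by any date in the inclusive range [date1, date2]?
isDayInMod <- function(weekday, date1, date2) {
  if (weekday < 1 | weekday > 7) stop("weekday must be an integer from 1 to 7.")
  dates <- sort(as.Date(c(date1, date2)))
  span  <- as.integer(dates[2] - dates[1])      # whole days between the two dates
  first <- as.integer(format(dates[1], "%u"))   # ISO weekday of the start date
  # Days to wait from the start date until `weekday` next occurs (0 to 6).
  offset <- (weekday - first) %% 7
  offset <= span
}

isDayInMod(7, "2015-08-02", "2015-08-03")  # Sunday in Sun-Mon
isDayInMod(6, "2015-08-02", "2015-08-03")  # Saturday in Sun-Mon
isDayInMod(1, "2015-08-08", "2015-08-11")  # Monday in Sat-Tue
```

Because the offset never exceeds 6, any range whose end is six or more days after its start returns TRUE without a separate full-week check.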

To check that isDayIn1 does its job, I wrote this small wrapper function:

niceTests <- function(weekday, date1, date2, startWithSunday = FALSE) {

  date1 <- as.Date(date1)
  date2 <- as.Date(date2)

  fmt <- "%a, %y-%m-%d (week %V)"
  if (startWithSunday) {
    fmt <- "%a, %y-%m-%d (week %U)"
  }
  print(sprintf("Date1: %s, Date2: %s, Diff.: %d. Range contains day #%d: %s",
                strftime(date1, fmt),
                strftime(date2, fmt),
                abs(date2 - date1),
                weekday,
                as.character(isDayIn1(weekday, date1, date2, startWithSunday))
                ))
}

Here is the first batch of tests. Note that startWithSunday defaults to FALSE, so weekday 1 here means Monday.

niceTests(7, "2015-08-02", "2015-08-03") # from question (Sunday in Su-Mo)
niceTests(6, "2015-08-02", "2015-08-03") # from question (Saturday in Su-Mo)
niceTests(1, "2015-08-02", "2015-08-09") # Full week or more.
niceTests(1, "2015-08-02", "2015-08-10") # Full week or more.

niceTests(1, "2015-08-05", "2015-08-07") # Same week. (Wednesday - Friday)
niceTests(2, "2015-08-05", "2015-08-07") # Same week.
niceTests(3, "2015-08-05", "2015-08-07") # Same week.
niceTests(4, "2015-08-05", "2015-08-07") # Same week.
niceTests(5, "2015-08-05", "2015-08-07") # Same week.
niceTests(6, "2015-08-05", "2015-08-07") # Same week.
niceTests(7, "2015-08-05", "2015-08-07") # Same week.

niceTests(1, "2015-08-08", "2015-08-11") # Across weeks. (Saturday - Tuesday)
niceTests(2, "2015-08-08", "2015-08-11") # Across weeks.
niceTests(3, "2015-08-08", "2015-08-11") # Across weeks.
niceTests(4, "2015-08-08", "2015-08-11") # Across weeks.
niceTests(5, "2015-08-08", "2015-08-11") # Across weeks.
niceTests(6, "2015-08-08", "2015-08-11") # Across weeks.
niceTests(7, "2015-08-08", "2015-08-11") # Across weeks.

Output:

[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-03 (week 32), Diff.: 1. Range contains day #7: TRUE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-03 (week 32), Diff.: 1. Range contains day #6: FALSE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Sun, 15-08-09 (week 32), Diff.: 7. Range contains day #1: TRUE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-10 (week 33), Diff.: 8. Range contains day #1: TRUE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #1: FALSE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #2: FALSE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #3: TRUE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #4: TRUE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #5: TRUE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #6: FALSE"
[1] "Date1: Wed, 15-08-05 (week 32), Date2: Fri, 15-08-07 (week 32), Diff.: 2. Range contains day #7: FALSE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #1: TRUE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #2: TRUE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #3: FALSE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #4: FALSE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #5: FALSE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #6: TRUE"
[1] "Date1: Sat, 15-08-08 (week 32), Date2: Tue, 15-08-11 (week 33), Diff.: 3. Range contains day #7: TRUE"

Finally, the tests with startWithSunday = TRUE, where day 1 is Sunday:

print("Now: Start with Sunday!")

niceTests(1, "2015-08-02", "2015-08-03", startWithSunday = TRUE) # from question (Sunday in Su-Mo)
niceTests(7, "2015-08-02", "2015-08-03", startWithSunday = TRUE) # from question (Saturday in Su-Mo)
niceTests(1, "2015-08-02", "2015-08-09", startWithSunday = TRUE) # Full week or more.
niceTests(1, "2015-08-02", "2015-08-10", startWithSunday = TRUE) # Full week or more.

niceTests(1, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week. (Wednesday - Friday)
niceTests(2, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.
niceTests(3, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.
niceTests(4, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.
niceTests(5, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.
niceTests(6, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.
niceTests(7, "2015-08-05", "2015-08-07", startWithSunday = TRUE) # Same week.

niceTests(1, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks. (Saturday - Tuesday)
niceTests(2, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.
niceTests(3, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.
niceTests(4, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.
niceTests(5, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.
niceTests(6, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.
niceTests(7, "2015-08-08", "2015-08-11", startWithSunday = TRUE) # Across weeks.

Output:

[1] "Now: Start with Sunday!"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-03 (week 31), Diff.: 1. Range contains day #1: TRUE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-03 (week 31), Diff.: 1. Range contains day #7: FALSE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Sun, 15-08-09 (week 32), Diff.: 7. Range contains day #1: TRUE"
[1] "Date1: Sun, 15-08-02 (week 31), Date2: Mon, 15-08-10 (week 32), Diff.: 8. Range contains day #1: TRUE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #1: FALSE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #2: FALSE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #3: FALSE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #4: TRUE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #5: TRUE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #6: TRUE"
[1] "Date1: Wed, 15-08-05 (week 31), Date2: Fri, 15-08-07 (week 31), Diff.: 2. Range contains day #7: FALSE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #1: TRUE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #2: TRUE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #3: TRUE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #4: FALSE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #5: FALSE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #6: FALSE"
[1] "Date1: Sat, 15-08-08 (week 31), Date2: Tue, 15-08-11 (week 32), Diff.: 3. Range contains day #7: TRUE"

I wrote a vectorized version of @CL's answer, which is also more general:

#' Check if a weekday is within an interval
#' 
#' @param wday Day of week (integer 1-7)
#' @param from Date. Can be a vector.
#' @param to Date. Same length as `from` and must be greater than `from`.
#' @param week_start 1 = Monday. 7 = Sunday
#' 
wday_in_interval = function(wday, from, to, week_start = 1) {
  if (wday < 1 | wday > 7) 
    stop("wday must be an integer from 1 to 7.")
  if (week_start != 1)
    wday = 1 + (((wday - 2) + week_start) %% 7)  # Translate wday to week_start = 1 (ISO standard)
  if (any(from > to, na.rm = TRUE))
    stop("`from` must come before `to`")
  
  # If the interval spans a week or more, it trivially contains any weekday
  over_a_week = difftime(to, from, units = "days") >= 7
  
  # Check if weekday is both smaller/greater than "from" and "to"
  days_from = as.numeric(strftime(from, "%u"))
  days_to = as.numeric(strftime(to, "%u"))
  contains_weekday = ifelse(
    strftime(from, "%V") == strftime(to, "%V"),  # Dates are in the same week?
    yes = wday >= days_from & wday <= days_to,
    no = wday >= days_from | wday <= days_to  # Interval wraps around the end of the week
  )
  
  return(over_a_week | contains_weekday)
}
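
The `week_start` translation is the least obvious line of the function, so here is a small check of that formula on its own. `to_iso` is a hypothetical name used only for this illustration; it is not part of the answer's code.

```r
# Hypothetical helper isolating the translation used in wday_in_interval():
# convert a day number counted from `week_start` into ISO numbering
# (Monday = 1, ..., Sunday = 7).
to_iso <- function(wday, week_start = 1) {
  1 + ((wday - 2 + week_start) %% 7)
}

to_iso(1:7, week_start = 1)  # weeks start on Monday: numbers are unchanged
to_iso(1:7, week_start = 7)  # weeks start on Sunday: day 1 becomes ISO 7
```

With `week_start = 7` the sequence 1..7 maps to 7, 1, 2, 3, 4, 5, 6, which is the rotation the function needs before comparing against `%u` values.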

For example, suppose we want to detect the intervals in a time series that overlap with the weekend. We run wday_in_interval for Saturday and Sunday:

library(dplyr)
tibble::tibble(
  timestamp = seq(as.POSIXct("2020-09-03 00:00"), as.POSIXct("2020-09-08 12:00"), length.out = 10),
  overlaps_saturday = wday_in_interval(6, from = lag(timestamp), to = timestamp),
  overlaps_sunday = wday_in_interval(7, from = lag(timestamp), to = timestamp),
  overlaps_weekend = overlaps_saturday | overlaps_sunday
)

Result:

# A tibble: 10 x 4
   timestamp           overlaps_saturday overlaps_sunday overlaps_weekend
   <dttm>              <lgl>             <lgl>           <lgl>           
 1 2020-09-03 00:00:00 NA                NA              NA              
 2 2020-09-03 14:40:00 FALSE             FALSE           FALSE           
 3 2020-09-04 05:20:00 FALSE             FALSE           FALSE           
 4 2020-09-04 20:00:00 FALSE             FALSE           FALSE           
 5 2020-09-05 10:40:00 TRUE              FALSE           TRUE            
 6 2020-09-06 01:20:00 TRUE              TRUE            TRUE            
 7 2020-09-06 16:00:00 FALSE             TRUE            TRUE            
 8 2020-09-07 06:40:00 FALSE             TRUE            TRUE            
 9 2020-09-07 21:20:00 FALSE             FALSE           FALSE           
10 2020-09-08 12:00:00 FALSE             FALSE           FALSE  

On my mid-range laptop, it processes 250,000 rows in about 3 seconds.
