简体   繁体   中英

Count number of days with date range in one table and days criteria in another table

My one table has data with date mentioned in last two columns:

dat<- data.frame(a = c(rep("x",3)),
                 date1=c(seq(as.Date("2018-01-01"), as.Date("2018-01-3"), 1)), 
                 date2=c(seq(as.Date("2018-01-08"), as.Date("2018-01-10"), 1)))

a date1 date2

1 x 2018-01-01 2018-01-08

2 x 2018-01-02 2018-01-09

3 x 2018-01-03 2018-01-10

My another table has what kind of day each day is

cal <- data.frame(dt = c(seq(as.Date("2018-01-01"), as.Date("2018-01-10"),1)),
                  day = c(rep("workday",5), rep("holiday",1), rep("weekend",4)))

How to get number of days in table 1(dat) as anew column such that the it counts only the workday that falls in the range mentioned in column 2 and column 3?

Example output with 4 columns. The last column is the number of workdays for the date range in previous two columns

  a      date1      date2     countdown
1 x 2018-01-01   2018-01-08           5
2 x 2018-01-02   2018-01-09           4
3 x 2018-01-03   2018-01-10           3

data.table solution

library( data.table )
#set data to data.table format
setDT(dat); setDT(cal)
setkey(dat, date1, date2 )
dat[dat, 
    N := { val = cal[ day == "workday" & dt >= i.date1 & dt <= i.date2 ]
           list( nrow( val ) ) }, 
    by = .EACHI ]

#    a      date1      date2 N
# 1: x 2018-01-01 2018-01-08 5
# 2: x 2018-01-02 2018-01-09 4
# 3: x 2018-01-03 2018-01-10 3

update data.table::foverlaps() solution

library( data.table )
#set data to data.table format
setDT(dat); setDT(cal)
#create dummy date
cal[,dt2 := dt]
#set keys
setkey( dat, date1, date2 )
setkey( cal, dt, dt2 )
#overlap join
ans <- foverlaps( dat, cal )
#summarise
ans[, .( countdown = uniqueN( dt[day == "workday"] ) ), by = .(a, date1, date2)][]
#    a      date1      date2 countdown
# 1: x 2018-01-01 2018-01-08         5
# 2: x 2018-01-02 2018-01-09         4
# 3: x 2018-01-03 2018-01-10         3

A way using tidyverse functions :

  1. Create a sequence of days between date1 and date2

  2. Get the data in long format

  3. Left join data the above data with cal dataframe

  4. Calculate number of workdays for each row.

library(dplyr)

dat %>%
  mutate(row = row_number(),
         dt = purrr::map2(date1, date2, seq, by = '1 day')) %>%
  tidyr::unnest(dt) %>%
  left_join(cal, by = 'dt') %>%
  group_by(row, a, date1, date2) %>%
  summarise(countdown = sum(day == 'workday')) %>%
  ungroup() %>%
  select(-row)

#   a     date1      date2      countdown
#  <chr> <date>     <date>         <int>
#1 x     2018-01-01 2018-01-08         5
#2 x     2018-01-02 2018-01-09         4
#3 x     2018-01-03 2018-01-10         3

A base R option

within(
  dat,
  countdown <- sapply(
    1:nrow(dat),
    function(k) sum(cal$day == "workday" & !is.na(cut(cal$dt, c(date1[k], date2[k]))))
  )
)

giving

  a      date1      date2 countdown
1 x 2018-01-01 2018-01-08         5
2 x 2018-01-02 2018-01-09         4
3 x 2018-01-03 2018-01-10         3

additional solution

# v1
df %>% 
  rowwise() %>% 
  mutate(int_date = list(seq(date1, date2, "1 day"))) %>% 
  unnest(int_date) %>% 
  left_join(cal, by = c("int_date" = "dt")) %>% 
  filter(day == "workday") %>% 
  group_by(a, date1, date2) %>% 
  count

# v2
df %>% 
  rowwise() %>% 
  mutate(int_date = list(seq(date1, date2, "1 day")),
         out = sum(unlist(int_date) %in% cal$dt[cal$day == "workday"])) %>% 
  select(-int_date)

# v3 (using @Ronak Shah hint with a `map` )
df %>% 
  mutate(int_date = map2(date1, date2, seq, "1 day"),
         out = map_dbl(int_date, ~ sum(.x %in% cal$dt[cal$day == "workday"]))) %>% 
  select(-int_date)


# A tibble: 3 x 4
# Rowwise: 
  a     date1      date2        out
  <chr> <date>     <date>     <int>
1 x     2018-01-01 2018-01-08     5
2 x     2018-01-02 2018-01-09     4
3 x     2018-01-03 2018-01-10     3

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM