简体   繁体   中英

Count observations at any given date

I'm working with a dataset similar to the following:

df <- data.frame(type = c("A", "A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "D"), 
                 start_date = as.Date(c("2010-02-01", "2011-03-15", "2011-09-15", "2015-01-01", "2015-05-15", "2009-01-01", "2015-07-14", "2016-06-30", "2012-01-15", "2010-04-05", "2010-08-01", "2012-04-01"), format = "%Y-%m-%d"), 
                 end_date = as.Date(c("2010-12-31", "2011-07-31", "2014-04-04", "2015-02-15", "2016-12-15", "2013-02-16", "2015-12-31", "2016-12-31", "2015-09-17", "2010-04-10", "2010-09-30", "2013-12-31"), format = "%Y-%m-%d"))

I would like to count the number of observations at any given date.

Expected output

Essentially, I want my results to show that there is only one type until 2010-02-01, then two until 2010-04-05, then three until 2010-04-10 etc, ie one column with the date (one row per day) and one column with the count of type .

date count_of_type
2009-01-01 1
2009-01-02 1
2009-01-03 1
...
2010-01-31 1
2010-02-01 2
2010-02-02 2
...
2010-04-04 2
2010-04-05 3
2010-04-06 3
2010-04-07 3
2010-04-08 3
2010-04-09 3
2010-04-10 2
2010-04-11 2
...

I thought this would be would be easy to do, but can't figure it out... any idea?

Cheers,

An option would be to transmute by taking the corresponding seq uence of 'start_date', 'end_date' by day and then get the count

library(tidyverse)
df %>% 
   transmute(date = map2(start_date, end_date, seq, by = '1 day')) %>% 
   unnest %>% 
   count(date)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM