Padding around dates in R to add missing/blank months?

The padr R package vignette describes the package's functions for padding dates and times.

I often tally events in data frames (i.e. with dplyr::count()) and then need to plot the occurrences over a period of, say, 1 year. When I count the events in a low-volume data frame, I'll often get single-row results, like this:

library(tidyverse)
library(lubridate)
library(padr)
df <- tibble(col1 = as.Date("2018-10-01"), col2 = "g", col3 = 5)

#> # A tibble: 1 x 3
#>   col1       col2   col3
#>   <date>     <chr> <dbl>
#> 1 2018-10-01 g         5

Plotting this with ggplot on a monthly basis over a period of a year requires a data frame of 12 rows, one per month. It basically needs to look like this:

#> # A tibble: 12 x 3
#>    col1       col2   col3
#>    <date>     <chr> <dbl>
#>  1 2018-01-01 NA        0
#>  2 2018-02-01 NA        0
#>  3 2018-03-01 NA        0
#>  4 2018-04-01 NA        0
#>  5 2018-05-01 NA        0
#>  6 2018-06-01 NA        0
#>  7 2018-07-01 NA        0
#>  8 2018-08-01 NA        0
#>  9 2018-09-01 NA        0
#> 10 2018-10-01 g         5
#> 11 2018-11-01 NA        0
#> 12 2018-12-01 NA        0

Perhaps padr can do this with some combination of its thicken() and pad() functions. My attempts are shown below; neither line 3 nor line 4 constructs the data frame shown directly above.

How do I construct the data frame directly above, using padr, lubridate, the tidyverse, data.table, base R, or any other way you please? Manually entering each month is not an option, if that needs to be said. Thank you.

df %>% 
  thicken("year") %>% 
  # pad(by = "col1") %>%       # line 3
  # pad(by = "col1_year") %>%  # line 4
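  print()

One approach is to build the full sequence of month starts for the year of the observation, then merge the original data onto it so that every month is kept:

library(lubridate)
library(tidyverse)

df <- tibble(col1 = as.Date("2018-10-01"), col2 = "g", col3 = 5)

# Year of the (single) observation
my_year <- year(df$col1[1])

# One row per month of that year, each dated the first of the month
df2 <- tibble(col1 = seq(ymd(paste0(my_year, "-01-01")),
                         ymd(paste0(my_year, "-12-01")),
                         by = "1 month"))

# Keep every month from df2; months with no events get col3 = 0
df3 <- merge(df, df2, by = "col1", all.y = TRUE) %>%
  mutate(col3 = replace_na(col3, 0))

df3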
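
As a possible alternative to the merge above, the same padding can probably be done in one step with tidyr::complete(), which adds rows for values of a column that are missing from the data. This is only a sketch, not the method from the answer above: it assumes the year to pad is the one containing the earliest date in col1, and the names year_start, all_months and padded are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

df <- tibble(col1 = as.Date("2018-10-01"), col2 = "g", col3 = 5)

# Assumption: pad the calendar year that contains the earliest observation
year_start <- floor_date(min(df$col1), unit = "year")

# Twelve month starts, beginning at year_start
all_months <- seq(year_start, by = "month", length.out = 12)

# Add a row for every month missing from df; col3 is filled with 0 on
# the added rows, and col2 is left as NA
padded <- df %>%
  complete(col1 = all_months, fill = list(col3 = 0))

padded
```

Unlike merge(), this keeps the result as a tibble. If the data covered several groups (for example several values of col2), complete() would need the grouping column listed as well, which is not shown here.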
