简体   繁体   中英

R: convert long to wide format filling out missing dates

I'm reshaping data from the hour registration of my company, to fit a certain format. I've modified the input to look like this:

   employee project month day hours
1         A  16-001     9   9     5
2         B  16-001     9  29     1
3         A  16-001     9   3     5
4         B  16-001     9  28     2
5         A  16-002     9   8     6
6         B  16-002     9   9     4
7         A  16-002    10  25     6
8         B  16-002    10  21     8
9         A  overig    10   6     6
10        B  overig    10  17     7
11        A  overig    10   9     1
12        B  overig    10  10     7

#reproducicle data:  
df <- data.frame(employee = rep(c("A","B"),6),project=rep(c("16-001","16-002","overig"), each=4), month=rep(c(9,10),each=6),day=sample(1:30,12,replace=T), hours=sample(1:8,12,replace=T))

#Now, I need to move this to a cross table: 
res <- ftable(xtabs(hours~month+employee+project+day, aggregate(hours~month+employee+project+day, data=df, FUN=sum)))

#And put this cross table in a data.frame (for export to csv)
library(reshape2) 
df_res <- dcast(as.data.frame(res), as.formula(paste(paste(names(attr(res, "row.vars")), collapse="+"), "~", paste(names(attr(res, "col.vars"))))))

df_res

   month employee project 3 6 8 9 10 17 21 25 28 29
1      9        A  16-001 5 0 0 5  0  0  0  0  0  0
2      9        A  16-002 0 0 6 0  0  0  0  0  0  0
3      9        A  overig 0 0 0 0  0  0  0  0  0  0
4      9        B  16-001 0 0 0 0  0  0  0  0  2  1
5      9        B  16-002 0 0 0 4  0  0  0  0  0  0
6      9        B  overig 0 0 0 0  0  0  0  0  0  0
7     10        A  16-001 0 0 0 0  0  0  0  0  0  0
8     10        A  16-002 0 0 0 0  0  0  0  6  0  0
9     10        A  overig 0 6 0 1  0  0  0  0  0  0
10    10        B  16-001 0 0 0 0  0  0  0  0  0  0
11    10        B  16-002 0 0 0 0  0  0  8  0  0  0
12    10        B  overig 0 0 0 0  7  7  0  0  0  0

I'm not sure this is the best way, but the format is good now. However, I need to have ALL de days as columns, not only the days that were in my data.frame (so 31 columns, preferably with dates that don't exist (like sep 31) with NA and the rest as 0. Any suggestions how to get that?

I think this is an acceptable solution and it will handle leap years too (for bonus points). Still taking advantage of tidyr::spread() 's nice factor fill behavior with drop = F , but now using the function lubridate::days_in_month() to spread only but so far. Here we go:

library(tidyr)
library(lubridate)
library(purrr)

df$year <- 2016 
df$num_in_month <- ymd(paste(df$year, df$month, df$day)) %>%
    days_in_month()

df %>% split(.$month) %>%
    map(~mutate(., day = factor(day, levels = 1:unique(num_in_month)))) %>%
    map(~spread(., key = day, value = hours, fill = 0, drop = F)) %>%
    bind_rows() %>%
    select(-num_in_month)

  employee project month year 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
1        A  16-001     9 2016 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  2  0  8  0  0 NA
2        A  16-002     9 2016 0 0 0 0 5 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 NA
3        B  16-001     9 2016 0 0 7 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  5  0  0  0 NA
4        B  16-002     9 2016 0 0 0 0 0 0 0 0 0  0  0  0  0  0  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 NA
5        A  16-002    10 2016 1 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
6        A  overig    10 2016 0 4 0 0 0 0 0 0 0  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
7        B  16-002    10 2016 0 0 0 0 0 0 0 0 0  0  0  0  7  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
8        B  overig    10 2016 0 0 0 0 6 0 0 0 0  0  0  8  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Cheers

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM