I have data on worker pay and some workers are paid monthly and others weekly. I would like to combine the data into a panel by worker and week (of year). To do that, I need to expand the monthly rows.
The data look like:
library(tidyverse)
library(lubridate)  # ymd() and week()
pay_data <- tibble(worker="Jim", start=ymd("2020-1-3"), end=ymd("2020-2-2"), rate=10, hours=50, wages=rate*hours) %>%
  mutate(f_week=week(start), l_week=week(end))
# A tibble: 1 x 8
worker start end rate hours wages f_week l_week
<chr> <date> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Jim 2020-01-03 2020-02-02 10 50 500 1 5
Is there a way to use complete, fill or any other dplyr function to get the data to look like the below?
# A tibble: 5 x 5
worker week rate hours wage
<chr> <int> <dbl> <dbl> <dbl>
1 Jim 1 10 50 500
2 Jim 2 10 50 500
3 Jim 3 10 50 500
4 Jim 4 10 50 500
5 Jim 5 10 50 500
(I would then of course divide the amounts to put them all in common units).
Thanks!
A tidyverse approach making use of tidyr::separate_rows may look like so. To make the data more interesting I added data for a second worker.
library(tidyverse)
tbl %>%
  rowwise() %>%
  mutate(weeks = paste(seq(f_week, l_week, by = 1), collapse = ", ")) %>%
  ungroup() %>%
  separate_rows(weeks) %>%
  select(-ends_with("_week"), -start, -end)
#> # A tibble: 13 x 5
#> worker rate hours wages weeks
#> <chr> <int> <int> <int> <chr>
#> 1 Jim 10 50 500 1
#> 2 Jim 10 50 500 2
#> 3 Jim 10 50 500 3
#> 4 Jim 10 50 500 4
#> 5 Jim 10 50 500 5
#> 6 John 20 100 1000 1
#> 7 John 20 100 1000 2
#> 8 John 20 100 1000 3
#> 9 John 20 100 1000 4
#> 10 John 20 100 1000 5
#> 11 John 20 100 1000 6
#> 12 John 20 100 1000 7
#> 13 John 20 100 1000 8
DATA
tbl <- read.table(text="worker start end rate hours wages f_week l_week
1 Jim 2020-01-03 2020-02-02 10 50 500 1 5
2 John 2020-01-03 2020-02-02 20 100 1000 1 8", header = TRUE)
tbl
#> worker start end rate hours wages f_week l_week
#> 1 Jim 2020-01-03 2020-02-02 10 50 500 1 5
#> 2 John 2020-01-03 2020-02-02 20 100 1000 1 8
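The weeks column in the output above is character because it is built by pasting and then splitting. A small variation, assuming integer week numbers are wanted as in the question's desired output: separate_rows() has a convert argument that runs type conversion on the split values. The data frame is typed in directly here (without the start and end columns) so the sketch runs on its own, and the name expanded is only illustrative.

```r
library(dplyr)
library(tidyr)

# Two-worker data, reduced to the columns the expansion needs
tbl <- data.frame(worker = c("Jim", "John"),
                  rate   = c(10L, 20L),
                  hours  = c(50L, 100L),
                  wages  = c(500L, 1000L),
                  f_week = c(1L, 1L),
                  l_week = c(5L, 8L))

expanded <- tbl %>%
  rowwise() %>%
  # Build one "1, 2, ..., n" string per row
  mutate(weeks = paste(seq(f_week, l_week, by = 1), collapse = ", ")) %>%
  ungroup() %>%
  # convert = TRUE turns the split strings back into integers
  separate_rows(weeks, convert = TRUE) %>%
  select(-ends_with("_week"))
```

With convert = TRUE the weeks column can be used directly for joins or arithmetic without a separate as.integer() step.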
Another tidyverse way would be:
library(tidyverse)
pay_data %>%
  mutate(week = map2(f_week, l_week, seq)) %>%
  unnest(week) %>%
  select(worker, rate:wages, week)
# worker rate hours wages week
# <chr> <dbl> <dbl> <dbl> <int>
#1 Jim 10 50 500 1
#2 Jim 10 50 500 2
#3 Jim 10 50 500 3
#4 Jim 10 50 500 4
#5 Jim 10 50 500 5
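The question mentions dividing the amounts afterwards to put everything in common units. A minimal sketch of that step, assuming the period totals are simply split evenly across the weeks the pay period touches. The even split and the n_weeks helper column are assumptions, not something stated in the question, and the week numbers are typed in directly so the example does not depend on lubridate.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same shape as pay_data in the question, with week numbers entered by hand
pay_data <- tibble(worker = "Jim", rate = 10, hours = 50, wages = 500,
                   f_week = 1, l_week = 5)

weekly <- pay_data %>%
  mutate(n_weeks = l_week - f_week + 1,           # weeks covered by the period
         week = map2(f_week, l_week, seq)) %>%    # list column: one sequence per row
  unnest(week) %>%                                # one row per worker and week
  mutate(hours = hours / n_weeks,                 # even split (assumption)
         wages = wages / n_weeks) %>%
  select(worker, week, rate, hours, wages)
```

If the first or last week is only partly covered by the pay period, an even split will over- or under-allocate those weeks; weighting by the number of days in each week would be one alternative.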
Try this base R approach. As written, it assumes a single row whose pay period starts in week 1:
#Code
pay_data <- pay_data[rep(seq_len(nrow(pay_data)), unique(pay_data$l_week)),
c('worker','rate','hours','wages')]
pay_data$week <- 1:nrow(pay_data)
Output:
# A tibble: 5 x 5
worker rate hours wages week
<chr> <dbl> <dbl> <dbl> <int>
1 Jim 10 50 500 1
2 Jim 10 50 500 2
3 Jim 10 50 500 3
4 Jim 10 50 500 4
5 Jim 10 50 500 5
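With several rows, or a pay period that does not start in week 1, the repeat count and the week numbers have to be computed per row. A minimal base R sketch of that generalisation, assuming the f_week and l_week columns from the question and using the two-worker data from the first answer, typed in directly (the names n_weeks and expanded are only illustrative):

```r
# Two-worker data, reduced to the columns the expansion needs
tbl <- data.frame(worker = c("Jim", "John"),
                  rate   = c(10, 20),
                  hours  = c(50, 100),
                  wages  = c(500, 1000),
                  f_week = c(1, 1),
                  l_week = c(5, 8))

# Number of weeks each row covers, used as a per-row repeat count
n_weeks <- tbl$l_week - tbl$f_week + 1

# Repeat each row index n_weeks times and keep the payload columns
expanded <- tbl[rep(seq_len(nrow(tbl)), times = n_weeks),
                c("worker", "rate", "hours", "wages")]

# Week numbers run from f_week to l_week within each original row
expanded$week <- unlist(Map(seq, tbl$f_week, tbl$l_week))
rownames(expanded) <- NULL
```

Passing a vector to times makes rep() repeat each row index by its own count, so Jim gets 5 rows and John gets 8.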