How can I divide values in a column by a specific row in R using dplyr?

I am trying to create a new column, cookies produced per hour, but I get an error saying that "cookies produced by hour" must be size 1, not 0, and that it can't be recycled to size 1. I do still get a new column, but the calculations are a bit off.

Sample Data:

working day  working week  franchise_id  measurement   amount
01-01-2020   01-01-2020    1             Cookies made  100
01-01-2020   01-01-2020    1             Hours         1
...
03-12-2020   03-11-2020    1             Cookies made  200
03-13-2020   03-11-2020    1             Hours         5

Code:

cookie_data %>% group_by(franchise_id) %>% mutate(cookiesperhour = amount/amount[measurement=="Hours"]) %>% ungroup()

Any help is greatly appreciated.

I think this is an ideal setting to use pivot_wider to reorient your data:

library(tidyverse)

df %>% 
  tidyr::pivot_wider(id_cols = c(`working week`, franchise_id),
                     names_from = measurement,
                     values_from = amount) %>%
  dplyr::mutate(cookiesperhour = `Cookies made`/Hours)

Output

  `working week` franchise_id `Cookies made` Hours cookiesperhour
  <chr>                 <dbl>          <dbl> <dbl>          <dbl>
1 01-01-2020                1            100     1            100
2 03-11-2020                1            200     5             40
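
The wide layout also sidesteps what seems to go wrong in the original attempt. As a rough sketch in base R (using only the sample amounts, treated as a single franchise_id group): when grouping by franchise_id alone, amount[measurement == "Hours"] returns every Hours value in the group, and R recycles that shorter vector across the rows. A group with no Hours row would give a zero-length result, which is probably where the "not 0" part of the error comes from.

```r
# Sample values for one franchise_id group (taken from the sample data)
amount      <- c(100, 1, 200, 5)
measurement <- c("Cookies made", "Hours", "Cookies made", "Hours")

# There are two Hours rows in the group, so the divisor has length 2, not 1
hours <- amount[measurement == "Hours"]
print(hours)

# R silently recycles the divisor: 100/1, 1/5, 200/1, 5/5
recycled <- amount / hours
print(recycled)

# Hypothetical group with no Hours row: the divisor is empty,
# so the division returns a zero-length vector
no_hours <- c(100)[c("Cookies made") == "Hours"]
print(length(c(100) / no_hours))
```

Only the first and third values are divided by the Hours value from their own week, which would explain results that look "a bit off".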

If you need to keep the long structure, you could do the following, as long as:

  1. There are only two measurements per franchise_id and working week.
  2. The first measurement in each group is "Cookies made" and the last is "Hours". To be safe you could pre-sort to make sure that is the case, but given all these caveats you can see why this might not be the best way to go.

df %>% 
  dplyr::group_by(franchise_id, `working week`) %>% 
  dplyr::mutate(cookiesperhour = first(amount) / last(amount)) %>% 
  dplyr::ungroup()

Output

  `working day` `working week` franchise_id measurement  amount cookiesperhour
  <chr>         <chr>                 <dbl> <chr>         <dbl>          <dbl>
1 01-01-2020    01-01-2020                1 Cookies made    100            100
2 01-01-2020    01-01-2020                1 Hours             1            100
3 03-12-2020    03-11-2020                1 Cookies made    200             40
4 03-13-2020    03-11-2020                1 Hours             5             40
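
If relying on row order feels too fragile, one possible variation (a sketch, not part of the approach above) is to look up each value by its measurement label within the group. The helper columns hours and cookies are names made up for this illustration, and the data frame is rebuilt here only so the snippet runs on its own.

```r
library(dplyr)

# Same sample data as in the Data section, rebuilt so the snippet is self-contained
df <- data.frame(
  `working day`  = c("01-01-2020", "01-01-2020", "03-12-2020", "03-13-2020"),
  `working week` = c("01-01-2020", "01-01-2020", "03-11-2020", "03-11-2020"),
  franchise_id   = c(1, 1, 1, 1),
  measurement    = c("Cookies made", "Hours", "Cookies made", "Hours"),
  amount         = c(100, 1, 200, 5),
  check.names    = FALSE
)

result <- df %>%
  group_by(franchise_id, `working week`) %>%
  mutate(
    # match() returns the position of the first matching label, or NA if
    # the label is absent, so each lookup always has length 1
    hours          = amount[match("Hours", measurement)],
    cookies        = amount[match("Cookies made", measurement)],
    cookiesperhour = cookies / hours
  ) %>%
  ungroup()

print(result)
```

With this version the order of rows inside a group does not matter, and a group that lacks an Hours row ends up with NA in cookiesperhour instead of raising a size error. It still assumes at most one row per measurement in each franchise_id and working week.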

Data

df <- structure(list(`working day` = c("01-01-2020", "01-01-2020", 
"03-12-2020", "03-13-2020"), `working week` = c("01-01-2020", 
"01-01-2020", "03-11-2020", "03-11-2020"), franchise_id = c(1, 
1, 1, 1), measurement = c("Cookies made", "Hours", "Cookies made", 
"Hours"), amount = c(100, 1, 200, 5)), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"))
