简体   繁体   中英

How to subtract one row value from another and assign the value to a different row all in the same column, by group, for multiple columns in R

I have a strangely structured data frame with a grouping variable as a factor, followed by a character variable identifying expected vs. actual values for a series of ongoing days (where the actual values are recorded each day in a new column), as well as cumulatively, and I'm trying to calculate the difference between the expected and the actual values.

However, given the structure of the data, I'm struggling to figure out how to do this without getting into a whole mess of reshaping the data only to have to put it back together like this.

Is there a way I can calculate the differences, for each group, across all 'daily' columns, and then store the output in each 'original' column?

Sample Data

df1 <- structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), Variable = c("Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative", "Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative", "Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative"), Day1 = c("Sunday", "1", "1", "3", "3", NA, NA, "Sunday", "1", "1", "1", "1", NA, NA, "Sunday", "8", "8", "5", "5", NA, NA), Day2 = c("Monday", "4", "3", "7", "4", NA, NA, "Monday", "2", "1", "3", "2", NA, NA, "Monday", "17", "9", "10", "5", NA, NA), Day3 = c("Tuesday", "6", "2", "10", "3", NA, NA, "Tuesday", "3", "1", "4", "1", NA, NA, "Tuesday", "29", "12", "14", "4", NA, NA), Day4 = c("Wednesday", "11", "5", "12", "2", NA, NA, "Wednesday", "4", "1", "5", "1", NA, NA, "Wednesday", "32", "3", "17", "3", NA, NA)), row.names = c(NA, -21L), class = "data.frame")
df1

What I would like to do is fill in the Diff.Daily cells in each "Day" column (ie Day1 , Day2 , etc.) by taking the difference between the Actual.Daily and Expected.Daily values for each Group (A, B, C), and then do similarly for Diff.Cumulative (ie Actual.Cumulative - Expected.Cumulative ).

Desired Output

df2 <- structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"), Variable = c("Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative", "Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative", "Day", "Actual.Cumulative", "Actual.Daily", "Expected.Cumulative", "Expected.Daily", "Diff.Daily", "Diff.Cumulative"), Day1 = c("Sunday", "1", "1", "3", "3", "-2", "-2", "Sunday", "1", "1", "1", "1", "0", "0", "Sunday", "8", "8", "5", "5", "3", "3"), Day2 = c("Monday", "4", "3", "7", "4", "-1", "-3", "Monday", "2", "1", "3", "2", "-1", "-1", "Monday", "17", "9", "10", "5", "4", "7"), Day3 = c("Tuesday", "6", "2", "10", "3", "-1", "-4", "Tuesday", "3", "1", "4", "1", "0", "-1", "Tuesday", "29", "12", "14", "4", "8", "15"), Day4 = c("Wednesday", "11", "5", "12", "2", "3", "-1", "Wednesday", "4", "1", "5", "1", "0", "-1", "Wednesday", "32", "3", "17", "3", "0", "15")), row.names = c(NA, -21L), class = "data.frame")
df2    

Here you can see for Group A on Day 1, Diff.Daily = -2 as expected from Actual.Daily of 1 minus an Expected.Daily of 3.

My Attempts

As the "Day" columns are of character class because of their mixed nature, I know I need to wrap the calculations with as.numeric() , and I can calculate the values individually for each column:

Diff.Cumulative <- as.numeric(df1[df1$Variable == "Actual.Cumulative", "Day1"]) - as.numeric(df1[df1$Variable == "Expected.Cumulative", "Day1"])

But I'm struggling with how to expand this to all "Day" columns and to fit it into the current data frame structure. I came across a similar but ultimately different question.

Here's a way to do this with the help of a helper function.

library(dplyr)

change_values <- function(sub1, sub2, to, var, value) {
  x <- suppressWarnings(as.numeric(value))
  x[match(to, var)] <- x[match(sub1, var)] - x[match(sub2, var)]
  x[1] <- value[1]
  x
}


df1 %>%
  group_by(Group) %>%
  mutate(across(starts_with('Day'), ~change_values('Actual.Daily', 
        'Expected.Daily', 'Diff.Daily', Variable, .)), 
         across(starts_with('Day'), ~change_values('Actual.Cumulative', 
        'Expected.Cumulative', 'Diff.Cumulative', Variable, .))) %>%
  ungroup

This returns -

#   Group            Variable   Day1   Day2    Day3      Day4
#1      A                 Day Sunday Monday Tuesday Wednesday
#2      A   Actual.Cumulative      1      4       6        11
#3      A        Actual.Daily      1      3       2         5
#4      A Expected.Cumulative      3      7      10        12
#5      A      Expected.Daily      3      4       3         2
#6      A          Diff.Daily     -2     -1      -1         3
#7      A     Diff.Cumulative     -2     -3      -4        -1
#8      B                 Day Sunday Monday Tuesday Wednesday
#9      B   Actual.Cumulative      1      2       3         4
#10     B        Actual.Daily      1      1       1         1
#11     B Expected.Cumulative      1      3       4         5
#12     B      Expected.Daily      1      2       1         1
#13     B          Diff.Daily      0     -1       0         0
#14     B     Diff.Cumulative      0     -1      -1        -1
#15     C                 Day Sunday Monday Tuesday Wednesday
#16     C   Actual.Cumulative      8     17      29        32
#17     C        Actual.Daily      8      9      12         3
#18     C Expected.Cumulative      5     10      14        17
#19     C      Expected.Daily      5      5       4         3
#20     C          Diff.Daily      3      4       8         0
#21     C     Diff.Cumulative      3      7      15        15

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM