Summing Multiple Variables using dplyr

I am trying to sum multiple variables over multiple subjects in a data set. I know how to do this using the plyr package, but given the length of the data set, the number of variables, and the number of different rolling sums I want to compute (2-day, 3-day, 4-day, etc.), I was wondering whether someone knows a more time-efficient way to complete this task in dplyr.

My data is similar to this:

Subjects <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
Day <- c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5)
variable.A <- rnorm(n = length(Day), mean = 20, sd = 5)
variable.B <- rnorm(n = length(Day), mean = 50, sd = 15)
variable.C <- rnorm(n = length(Day), mean = 100, sd = 33)
dat <- data.frame(Subjects, Day, variable.A, variable.B, variable.C)
dat



   Subjects Day variable.A variable.B variable.C
1         1   1   20.17676   72.44022   56.69915
2         1   2   14.11462   46.28473  117.00864
3         1   3   15.30440   72.43752   93.17489
4         1   4   13.72422   66.76744  101.26422
5         1   5   21.97695   69.50480  102.61979
6         2   1   14.45742   32.69106   82.37268
7         2   2   33.37783   65.06782   97.17744
8         2   3   13.57833   26.37183   89.38218
9         2   4   23.01717   55.83446  147.85362
10        2   5   14.06008   32.00396   48.73060
11        3   1   14.57199   60.29746   87.07977
12        3   2   15.77413   77.04517  132.17910
13        3   3   30.05661   30.62220  171.35998
14        3   4   24.65348   53.96450   74.99875
15        3   5   26.93699   57.06393   36.81901

An example of the code I tried was this:

library(plyr)
library(RcppRoll)
summarize <- ddply(dat, "Subjects", mutate,
    Two.Day.Roll.A = roll_sum(variable.A, 2, align = "right", fill = NA),
    Two.Day.Roll.B = roll_sum(variable.B, 2, align = "right", fill = NA),
    Two.Day.Roll.C = roll_sum(variable.C, 2, align = "right", fill = NA))

   Subjects Day variable.A variable.B variable.C Two.Day.Roll.A Two.Day.Roll.B Two.Day.Roll.C
1         1   1  15.324798   24.83074  137.48853             NA             NA             NA
2         1   2  12.112943   58.86094   86.87454       27.43774       83.69168       224.3631
3         1   3  16.179328   57.95450   68.71333       28.29227      116.81544       155.5879
4         1   4  15.319750   38.13721   79.43194       31.49908       96.09171       148.1453
5         1   5  21.791452   61.99368  134.30205       37.11120      100.13089       213.7340
6         2   1  10.937461   63.83164   95.04865             NA             NA             NA
7         2   2  14.642376   79.12452  107.13699       25.57984      142.95616       202.1856
8         2   3  17.519905   52.75490  100.62811       32.16228      131.87942       207.7651
9         2   4  23.190371   37.56950  179.72763       40.71028       90.32440       280.3557
10        2   5  13.729350   46.95616   72.14179       36.91972       84.52566       251.8694
11        3   1   9.609171   74.51140  130.90005             NA             NA             NA
12        3   2  27.542897   14.36222  133.87630       37.15207       88.87363       264.7763
13        3   3  18.750015   60.46183  130.44314       46.29291       74.82405       264.3194
14        3   4  17.461882   52.65797  176.30620       36.21190      113.11979       306.7493
15        3   5  31.244564   62.41614   78.82916       48.70645      115.07411       255.1354

This works well enough, but as I said, the original data has many more columns, and I want to go on to compute 3-day sums, 4-day sums, etc. over all of those variables. Also, my original data contains some NAs, so is there a way to handle those?

I have tried using the mutate_each() function from the dplyr package but can't seem to get the syntax right.

Thank you.

Here's the dplyr version:

library(dplyr)
library(RcppRoll)
dat %>% group_by(Subjects) %>% 
        mutate_each(funs(roll_sum(., 2, align = "right", fill=NA)), -Subjects, -Day)
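
In more recent dplyr releases, mutate_each() and funs() are deprecated in favour of across(). The following is a minimal sketch of the same idea under that assumption, extended to several window widths. add_roll() is a hypothetical helper introduced here, the Roll2. and Roll3. column prefixes are arbitrary, and set.seed(1) is added only so the example is reproducible. na.rm = TRUE is one possible way to treat the NAs mentioned in the question: it drops missing values inside each window instead of returning NA, which may or may not be what you want.

```r
library(dplyr)
library(RcppRoll)

# Example data in the same shape as the question (seed is an addition).
set.seed(1)
dat <- data.frame(
  Subjects   = rep(1:3, each = 5),
  Day        = rep(1:5, times = 3),
  variable.A = rnorm(15, mean = 20, sd = 5),
  variable.B = rnorm(15, mean = 50, sd = 15),
  variable.C = rnorm(15, mean = 100, sd = 33)
)

# Hypothetical helper: add right-aligned k-day rolling sums for every
# column whose name starts with "variable". It respects any grouping
# already present on df.
add_roll <- function(df, k) {
  df %>%
    mutate(across(
      starts_with("variable"),
      ~ roll_sum(.x, n = k, align = "right", fill = NA, na.rm = TRUE),
      .names = paste0("Roll", k, ".{.col}")
    ))
}

result <- dat %>%
  group_by(Subjects) %>%
  add_roll(2) %>%   # 2-day sums: Roll2.variable.A, ...
  add_roll(3) %>%   # 3-day sums: Roll3.variable.A, ...
  ungroup()

result
```

Because the new columns start with "Roll", starts_with("variable") does not select them again on the second call, so the 3-day sums are computed from the original variables only. Further window widths can be added with more add_roll() calls.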

