
R data.table: aggregate by multiple columns while retaining all columns

I would like to use data.table to perform an aggregation and return output identical to the SQL query below:

sqldf("select *,
              sum(x) over (partition by year, month, day) as x_agg,
              sum(y) over (partition by year, month, day) as y_agg
       from table")

I tried the code below, but I'd rather not have to list every column:

datatable[, list(col1,
                col2,
                ...
                coln,
                x_agg = sum(x),
                y_agg = sum(y)),
             by = .(year, month, day)]

The easiest way is to copy the data.table, since you want all the columns returned in a new data.table anyway, and then append the x_agg and y_agg columns to the copy with := grouped by year, month and day:

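library(data.table)

# Example data: 40 rows; y (20 values) and z (10 values) are recycled by data.frame()
dt <- data.frame(x = rnorm(40), y = rnorm(20), z = rnorm(10),
                 year = rep(2019:2020, times = 2, each = 10),
                 month = rep(1:4, 10), day = rep(1:4, 10))

setDT(dt)  # convert to a data.table by reference

dt2 <- copy(dt)      # work on a copy so dt itself is left unchanged
cols <- c("x", "y")  # columns to aggregate

# Add <col>_agg = per-group sum for each column in cols; all existing columns are kept
dt2[, paste0(cols, "_agg") := lapply(.SD, sum),
    .SDcols = cols, by = .(year, month, day)][]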
            x           y          z year month day      x_agg        y_agg
 1:  0.52378890  0.19143318 -0.3387854 2019     1   1 -0.1709390 -2.967623395
 2: -0.35158261  1.62461341 -0.9818403 2019     2   2 -3.6556367  5.940791892
 3:  1.29391093 -0.73192766 -2.5227705 2019     3   3  2.1449165 -0.009080778
 4:  1.15131966 -0.96903745 -0.5124389 2019     4   4  2.7530336 -1.763717065
 5: -0.97305571 -1.16620834  0.8567205 2019     1   1 -0.1709390 -2.967623395
 6: -1.73289458  1.74064829 -0.7019242 2019     2   2 -3.6556367  5.940791892
 7:  0.14822163  0.72738728 -1.4267469 2019     3   3  2.1449165 -0.009080778
 8: -0.17853639  0.08717892  2.0463365 2019     4   4  2.7530336 -1.763717065
 9:  0.43857404 -0.50903654 -0.6887948 2019     1   1 -0.1709390 -2.967623395
10:  0.56904083 -0.39486575 -0.1134194 2019     2   2 -3.6556367  5.940791892
11:  0.54823107 -0.28118769 -0.3387854 2020     3   3  1.3975639 -5.470426871
12:  1.12885306 -0.80344406 -0.9818403 2020     4   4  2.5982909 -3.062138945
13:  0.98747699  0.72247033 -2.5227705 2020     1   1  2.4807741  0.134137894
14: -2.60859806 -1.37195721 -0.5124389 2020     2   2 -0.8401949 -2.285724235
15: -0.44170249 -1.47594529  0.8567205 2020     3   3  1.3975639 -5.470426871
16:  0.02994275  0.01272509 -0.7019242 2020     4   4  2.5982909 -3.062138945
17: -0.11760158 -0.65540139 -1.4267469 2020     1   1  2.4807741  0.134137894
18:  0.87222687  0.22909510  2.0463365 2020     2   2 -0.8401949 -2.285724235
19:  0.33379209 -0.97808045 -0.6887948 2020     3   3  1.3975639 -5.470426871
20: -0.70379104 -0.74035050 -0.1134194 2020     4   4  2.5982909 -3.062138945
21:  0.22151323  0.19143318 -0.3387854 2019     1   1 -0.1709390 -2.967623395
22: -0.91018028  1.62461341 -0.9818403 2019     2   2 -3.6556367  5.940791892
23: -0.05931458 -0.73192766 -2.5227705 2019     3   3  2.1449165 -0.009080778
24:  0.51606540 -0.96903745 -0.5124389 2019     4   4  2.7530336 -1.763717065
25: -0.81728153 -1.16620834  0.8567205 2019     1   1 -0.1709390 -2.967623395
26: -1.43174995  1.74064829 -0.7019242 2019     2   2 -3.6556367  5.940791892
27:  0.76209854  0.72738728 -1.4267469 2019     3   3  2.1449165 -0.009080778
28:  1.26418496  0.08717892  2.0463365 2019     4   4  2.7530336 -1.763717065
29:  0.43552206 -0.50903654 -0.6887948 2019     1   1 -0.1709390 -2.967623395
30:  0.20172988 -0.39486575 -0.1134194 2019     2   2 -3.6556367  5.940791892
31:  0.21270847 -0.28118769 -0.3387854 2020     3   3  1.3975639 -5.470426871
32:  1.21382327 -0.80344406 -0.9818403 2020     4   4  2.5982909 -3.062138945
33:  0.41322214  0.72247033 -2.5227705 2020     1   1  2.4807741  0.134137894
34:  0.09986465 -1.37195721 -0.5124389 2020     2   2 -0.8401949 -2.285724235
35: -0.09185291 -1.47594529  0.8567205 2020     3   3  1.3975639 -5.470426871
36:  0.13209497  0.01272509 -0.7019242 2020     4   4  2.5982909 -3.062138945
37:  1.19767652 -0.65540139 -1.4267469 2020     1   1  2.4807741  0.134137894
38:  0.79631162  0.22909510  2.0463365 2020     2   2 -0.8401949 -2.285724235
39:  0.83638763 -0.97808045 -0.6887948 2020     3   3  1.3975639 -5.470426871
40:  0.79736792 -0.74035050 -0.1134194 2020     4   4  2.5982909 -3.062138945
              x           y          z year month day      x_agg        y_agg
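
Because the example above uses random numbers, the group sums are hard to verify by eye. The following is a minimal sketch of the same technique on a small hand-made table; the id column and all values are made up for illustration and are not from the original post. It assumes the data.table package is installed.

```r
library(data.table)

# Small hand-made table so the group sums can be checked by eye
# (hypothetical values; column names mirror the question).
dt <- data.table(
  id    = 1:6,
  year  = c(2019, 2019, 2019, 2020, 2020, 2020),
  month = c(1, 1, 2, 1, 1, 1),
  day   = c(1, 1, 2, 1, 1, 1),
  x     = c(1, 2, 3, 4, 5, 6),
  y     = c(10, 20, 30, 40, 50, 60)
)

cols <- c("x", "y")

# `:=` adds the per-group sums as new columns and keeps every existing column
# and row, much like sum(...) over (partition by ...) in the SQL query.
# Note that it modifies dt by reference.
dt[, paste0(cols, "_agg") := lapply(.SD, sum),
   .SDcols = cols, by = .(year, month, day)]

print(dt)
```

Rows 1 and 2 share the group (2019, 1, 1), so both get x_agg = 3 and y_agg = 30; row 3 is alone in its group; rows 4 to 6 get x_agg = 15 and y_agg = 150. Since := updates the table in place, calling copy() first, as in the answer, is only needed when the original table must stay unchanged.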
