R data.table: aggregate by multiple columns while retaining all columns
I want to use data.table to perform an aggregation that returns the same output as the SQL query below:
sqldf("select *,
  sum(x) over (partition by year, month, day) as x_agg,
  sum(y) over (partition by year, month, day) as y_agg
from table")
I tried the code below, but I don't want to list every column:
datatable[, list(col1,
                 col2,
                 ...
                 coln,
                 x_agg = sum(x),
                 y_agg = sum(y)),
          by = .(year, month, day)]
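The simplest approach is to copy the data.table, since you want all columns returned in a new data.table, and then append the columns x_agg and y_agg. Assigning with := together with by writes each group's sum to every row of that group, so no rows or columns are dropped.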
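library(data.table)
# Example data; y and z are shorter than x, so data.frame() recycles them to 40 rows
dt <- data.frame(x = rnorm(40), y = rnorm(20), z = rnorm(10),
                 year = rep(2019:2020, times = 2, each = 10),
                 month = rep(1:4, 10), day = rep(1:4, 10))
setDT(dt)            # convert to data.table by reference
dt2 <- copy(dt)      # work on a copy so dt is left unchanged
cols <- c("x", "y")  # columns to sum (called cols to avoid masking base::names)
# Add x_agg and y_agg: each row gets the sum over its (year, month, day) group
dt2[, paste0(cols, "_agg") := lapply(.SD, sum),
    .SDcols = cols, by = .(year, month, day)][]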
x y z year month day x_agg y_agg
1: 0.52378890 0.19143318 -0.3387854 2019 1 1 -0.1709390 -2.967623395
2: -0.35158261 1.62461341 -0.9818403 2019 2 2 -3.6556367 5.940791892
3: 1.29391093 -0.73192766 -2.5227705 2019 3 3 2.1449165 -0.009080778
4: 1.15131966 -0.96903745 -0.5124389 2019 4 4 2.7530336 -1.763717065
5: -0.97305571 -1.16620834 0.8567205 2019 1 1 -0.1709390 -2.967623395
6: -1.73289458 1.74064829 -0.7019242 2019 2 2 -3.6556367 5.940791892
7: 0.14822163 0.72738728 -1.4267469 2019 3 3 2.1449165 -0.009080778
8: -0.17853639 0.08717892 2.0463365 2019 4 4 2.7530336 -1.763717065
9: 0.43857404 -0.50903654 -0.6887948 2019 1 1 -0.1709390 -2.967623395
10: 0.56904083 -0.39486575 -0.1134194 2019 2 2 -3.6556367 5.940791892
11: 0.54823107 -0.28118769 -0.3387854 2020 3 3 1.3975639 -5.470426871
12: 1.12885306 -0.80344406 -0.9818403 2020 4 4 2.5982909 -3.062138945
13: 0.98747699 0.72247033 -2.5227705 2020 1 1 2.4807741 0.134137894
14: -2.60859806 -1.37195721 -0.5124389 2020 2 2 -0.8401949 -2.285724235
15: -0.44170249 -1.47594529 0.8567205 2020 3 3 1.3975639 -5.470426871
16: 0.02994275 0.01272509 -0.7019242 2020 4 4 2.5982909 -3.062138945
17: -0.11760158 -0.65540139 -1.4267469 2020 1 1 2.4807741 0.134137894
18: 0.87222687 0.22909510 2.0463365 2020 2 2 -0.8401949 -2.285724235
19: 0.33379209 -0.97808045 -0.6887948 2020 3 3 1.3975639 -5.470426871
20: -0.70379104 -0.74035050 -0.1134194 2020 4 4 2.5982909 -3.062138945
21: 0.22151323 0.19143318 -0.3387854 2019 1 1 -0.1709390 -2.967623395
22: -0.91018028 1.62461341 -0.9818403 2019 2 2 -3.6556367 5.940791892
23: -0.05931458 -0.73192766 -2.5227705 2019 3 3 2.1449165 -0.009080778
24: 0.51606540 -0.96903745 -0.5124389 2019 4 4 2.7530336 -1.763717065
25: -0.81728153 -1.16620834 0.8567205 2019 1 1 -0.1709390 -2.967623395
26: -1.43174995 1.74064829 -0.7019242 2019 2 2 -3.6556367 5.940791892
27: 0.76209854 0.72738728 -1.4267469 2019 3 3 2.1449165 -0.009080778
28: 1.26418496 0.08717892 2.0463365 2019 4 4 2.7530336 -1.763717065
29: 0.43552206 -0.50903654 -0.6887948 2019 1 1 -0.1709390 -2.967623395
30: 0.20172988 -0.39486575 -0.1134194 2019 2 2 -3.6556367 5.940791892
31: 0.21270847 -0.28118769 -0.3387854 2020 3 3 1.3975639 -5.470426871
32: 1.21382327 -0.80344406 -0.9818403 2020 4 4 2.5982909 -3.062138945
33: 0.41322214 0.72247033 -2.5227705 2020 1 1 2.4807741 0.134137894
34: 0.09986465 -1.37195721 -0.5124389 2020 2 2 -0.8401949 -2.285724235
35: -0.09185291 -1.47594529 0.8567205 2020 3 3 1.3975639 -5.470426871
36: 0.13209497 0.01272509 -0.7019242 2020 4 4 2.5982909 -3.062138945
37: 1.19767652 -0.65540139 -1.4267469 2020 1 1 2.4807741 0.134137894
38: 0.79631162 0.22909510 2.0463365 2020 2 2 -0.8401949 -2.285724235
39: 0.83638763 -0.97808045 -0.6887948 2020 3 3 1.3975639 -5.470426871
40: 0.79736792 -0.74035050 -0.1134194 2020 4 4 2.5982909 -3.062138945
x y z year month day x_agg y_agg
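Because the example above uses rnorm(), its numbers change on every run. As a rough sketch, the same pattern can be checked by hand on a small table with fixed values. The table toy and its contents are made up for illustration and are not part of the original answer.

```r
library(data.table)

# Hypothetical toy data with fixed values so the sums can be verified by hand
toy <- data.table(
  year  = c(2019, 2019, 2019, 2020),
  month = c(1, 1, 2, 1),
  day   = c(1, 1, 2, 1),
  x     = c(1, 2, 3, 4),
  y     = c(10, 20, 30, 40),
  z     = c("a", "b", "c", "d")
)

cols <- c("x", "y")  # columns to aggregate

# copy() leaves toy untouched; := with by adds the group sums to every row
res <- copy(toy)[, paste0(cols, "_agg") := lapply(.SD, sum),
                 .SDcols = cols, by = .(year, month, day)]

# Rows 1 and 2 share (2019, 1, 1), so both get x_agg = 3 and y_agg = 30;
# rows 3 and 4 are alone in their groups and keep their own values
res[]
```

If changing the original table in place is acceptable, the copy() call can probably be dropped, since := modifies a data.table by reference.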