[英]Computation of average of a column on rolling in R data.table
library(data.table)
DT <- data.table(
N = 1:16,
x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)
對於每個 x 值,我想得到一列,其值是 z 相對於 y 的平均值,例如
N x y z Roll Mean
1 11 1 53 NA
2 11 2 71 53
3 11 3 27 62
4 11 4 64 50.33333333
5 11 4 43 53.75
6 11 4 62 51.25
7 11 4 61 49
8 11 4 85 57.5
9 21 1 44 NA
10 21 2 56 44
11 21 3 23 50
12 21 4 37 41
13 21 4 31 40
14 21 4 48 36.75
15 21 4 80 34.75
16 21 4 38 49
例如當 N=2 x=11, y=2 滾動平均值 = 前 1 次的平均值 z 項 = mean(53) 當 N=3, x=11, y=3 滾動平均值 = 前 1 次和 2 次的平均值z term = mean(53&71) 當 N=4 x=11, y=4 滾動平均值 = 前 1、2 和 3 個 z 術語的平均值 = mean(53,71,27) 然后 N = 5 到 8 我必須獲得前 4 個值的平均值。 我寫了代碼
DT[, RollingAvg := frollapply(z,4, mean), .(x)]
給出 output
N x y z RollingAvg
1: 1 11 1 53 NA
2: 2 11 1 71 NA
3: 3 11 1 27 NA
4: 4 11 1 64 53.75
5: 5 11 1 43 51.25
6: 6 11 1 62 49.00
7: 7 11 1 61 57.50
8: 8 11 1 85 62.75
9: 9 21 1 44 NA
10: 10 21 1 56 NA
11: 11 21 1 23 NA
12: 12 21 1 37 40.00
13: 13 21 1 31 36.75
14: 14 21 1 48 34.75
15: 15 21 1 80 49.00
16: 16 21 1 38 49.25
我怎樣才能得到正確的輸出
我們可以使用帶有partial = TRUE
的rollapply
library(zoo)
library(data.table)
DT[, RollingAvg := shift(rollapply(z, 4, mean,
partial = TRUE, align = 'right')), by = x]
-輸出
> DT
N x y z RollingAvg
<int> <num> <num> <num> <num>
1: 1 11 1 53.00 NA
2: 2 11 2 71.00 53.00000
3: 3 11 3 27.00 62.00000
4: 4 11 4 64.00 50.33333
5: 5 11 4 43.00 53.75000
6: 6 11 4 62.00 51.25000
7: 7 11 4 61.00 49.00000
8: 8 11 1 85.00 57.50000
9: 9 21 2 44.00 NA
10: 10 21 3 56.00 44.00000
11: 11 21 4 23.00 50.00000
12: 12 21 4 37.00 41.00000
13: 13 21 4 31.00 40.00000
14: 14 21 4 48.00 36.75000
15: 15 21 1 80.38 34.75000
16: 16 21 2 53.00 49.09500
此外,如果我們將n
指定為值向量,則frollmean
可以具有adaptive
選項
DT[, RollingAvg := shift(frollmean(z, rep(1:4, c(1, 1, 1, .N-3)),
adaptive = TRUE)), by = x]
-輸出
> DT
N x y z RollingAvg
<int> <num> <num> <num> <num>
1: 1 11 1 53.00 NA
2: 2 11 2 71.00 53.00000
3: 3 11 3 27.00 62.00000
4: 4 11 4 64.00 50.33333
5: 5 11 4 43.00 53.75000
6: 6 11 4 62.00 51.25000
7: 7 11 4 61.00 49.00000
8: 8 11 1 85.00 57.50000
9: 9 21 2 44.00 NA
10: 10 21 3 56.00 44.00000
11: 11 21 4 23.00 50.00000
12: 12 21 4 37.00 41.00000
13: 13 21 4 31.00 40.00000
14: 14 21 4 48.00 36.75000
15: 15 21 1 80.38 34.75000
16: 16 21 2 53.00 49.09500
DT[, rm := shift(as.data.frame(frollmean(z, 1:4))[cbind(1:.N, y)])]
# OR
DT[, rm := shift(unlist(frollmean(z, 1:4))[.I + (y-1)*.N])]
DT[y == 1L, rm := NA_real_]
# N x y z rm
# <int> <num> <num> <num> <num>
# 1: 1 11 1 53 NA
# 2: 2 11 2 71 53.00000
# 3: 3 11 3 27 62.00000
# 4: 4 11 4 64 50.33333
# 5: 5 11 4 43 53.75000
# 6: 6 11 4 62 51.25000
# 7: 7 11 4 61 49.00000
# 8: 8 11 4 85 57.50000
# 9: 9 21 1 44 NA
# 10: 10 21 2 56 44.00000
# 11: 11 21 3 23 50.00000
# 12: 12 21 4 37 41.00000
# 13: 13 21 4 31 40.00000
# 14: 14 21 4 48 36.75000
# 15: 15 21 4 80 34.75000
# 16: 16 21 4 38 49.00000
正確重現的數據:
DT <- data.table(
N = 1:16,
x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.