Computing a rolling average of a column in R data.table

library(data.table)
DT <- data.table(
  N = 1:16,
  x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21), 
  y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4), 
  z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)

For each value of x, I want to get a column whose values are the rolling mean of z with respect to y, e.g.

N   x   y   z   Roll Mean 
1   11  1   53  NA
2   11  2   71  53
3   11  3   27  62
4   11  4   64  50.33333333
5   11  4   43  53.75
6   11  4   62  51.25
7   11  4   61  49
8   11  4   85  57.5
9   21  1   44  NA
10  21  2   56  44
11  21  3   23  50
12  21  4   37  41
13  21  4   31  40
14  21  4   48  36.75
15  21  4   80  34.75
16  21  4   38  49

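For example:
when N=2, x=11, y=2: rolling mean = mean of the previous 1 z term = mean(53);
when N=3, x=11, y=3: rolling mean = mean of the previous 2 z terms = mean(c(53, 71));
when N=4, x=11, y=4: rolling mean = mean of the previous 3 z terms = mean(c(53, 71, 27)).
Then for N = 5 to 8 I have to obtain the mean of the previous 4 values. I have written this code: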

DT[, RollingAvg := frollapply(z,4, mean), .(x)] 

which gives the output

    N  x y  z RollingAvg
1:  1 11 1 53         NA
2:  2 11 2 71         NA
3:  3 11 3 27         NA
4:  4 11 4 64      53.75
5:  5 11 4 43      51.25
6:  6 11 4 62      49.00
7:  7 11 4 61      57.50
8:  8 11 4 85      62.75
9:  9 21 1 44         NA
10: 10 21 2 56         NA
11: 11 21 3 23         NA
12: 12 21 4 37      40.00
13: 13 21 4 31      36.75
14: 14 21 4 48      34.75
15: 15 21 4 80      49.00
16: 16 21 4 38      49.25

How can I get the correct output?

We can use rollapply (from zoo) with partial = TRUE, then shift the result by one row

library(zoo)
library(data.table)
DT[, RollingAvg := shift(rollapply(z, 4, mean, 
    partial = TRUE, align = 'right')), by = x]

-output

> DT
        N     x     y     z RollingAvg
    <int> <num> <num> <num>      <num>
 1:     1    11     1    53         NA
 2:     2    11     2    71   53.00000
 3:     3    11     3    27   62.00000
 4:     4    11     4    64   50.33333
 5:     5    11     4    43   53.75000
 6:     6    11     4    62   51.25000
 7:     7    11     4    61   49.00000
 8:     8    11     4    85   57.50000
 9:     9    21     1    44         NA
10:    10    21     2    56   44.00000
11:    11    21     3    23   50.00000
12:    12    21     4    37   41.00000
13:    13    21     4    31   40.00000
14:    14    21     4    48   36.75000
15:    15    21     4    80   34.75000
16:    16    21     4    38   49.00000
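To see what partial = TRUE and shift() each contribute, here is a small sketch on the first five z values of the x = 11 group. The names z_demo, partial_means and lagged_means are made up for illustration and are not part of the answer.

```r
library(zoo)
library(data.table)

# Illustrative input: the first five z values for x = 11
z_demo <- c(53, 71, 27, 64, 43)

# Right-aligned window of width 4; partial = TRUE lets the window
# start with 1, 2, 3 elements instead of returning NA
partial_means <- rollapply(z_demo, 4, mean, partial = TRUE, align = "right")
# 53, 62, 50.33333, 53.75, 51.25

# shift() lags the result by one row, so each row only
# averages the values that come before it
lagged_means <- shift(partial_means)
# NA, 53, 62, 50.33333, 53.75
```

Without the shift(), each mean would include the current row's z, which is what the attempt in the question does.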

Also, frollmean has an adaptive option that can be used if we specify n as a vector of window sizes, one per row

DT[, RollingAvg := shift(frollmean(z, rep(1:4, c(1, 1, 1, .N-3)), 
      adaptive = TRUE)), by = x]

-output

> DT
        N     x     y     z RollingAvg
    <int> <num> <num> <num>      <num>
 1:     1    11     1    53         NA
 2:     2    11     2    71   53.00000
 3:     3    11     3    27   62.00000
 4:     4    11     4    64   50.33333
 5:     5    11     4    43   53.75000
 6:     6    11     4    62   51.25000
 7:     7    11     4    61   49.00000
 8:     8    11     4    85   57.50000
 9:     9    21     1    44         NA
10:    10    21     2    56   44.00000
11:    11    21     3    23   50.00000
12:    12    21     4    37   41.00000
13:    13    21     4    31   40.00000
14:    14    21     4    48   36.75000
15:    15    21     4    80   34.75000
16:    16    21     4    38   49.00000
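The adaptive version depends entirely on the window-size vector, so it may help to look at it on its own. The sketch below also tries pmin(seq_len(.N), 4L) as another way to build the same vector. That variant is my assumption, not part of the answer; it should also cope with groups of fewer than three rows, where rep(1:4, c(1, 1, 1, .N - 3)) would be given a negative count. The names widths_rep, widths_pmin and RollingAvg2 are illustrative.

```r
library(data.table)

DT <- data.table(
  N = 1:16,
  x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
  y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
  z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)

# Window sizes for a group of 8 rows, built as in the answer
widths_rep <- rep(1:4, c(1, 1, 1, 8 - 3))   # 1 2 3 4 4 4 4 4

# The same sizes, built by capping the row position at 4
widths_pmin <- pmin(seq_len(8), 4L)         # 1 2 3 4 4 4 4 4

# Adaptive rolling mean per group, then lag by one row
DT[, RollingAvg2 := shift(frollmean(z, pmin(seq_len(.N), 4L), adaptive = TRUE)),
   by = x]
```

With the data above, RollingAvg2 should match the Roll Mean column requested in the question.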
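
# frollmean(z, 1:4) returns one vector per window width (1 to 4);
# for each row, pick the mean whose width equals y, then lag by one row
DT[, rm := shift(as.data.frame(frollmean(z, 1:4))[cbind(1:.N, y)])]
# OR: the same lookup via a flat index into the unlisted result
DT[, rm := shift(unlist(frollmean(z, 1:4))[.I + (y-1)*.N])]
# the first row of each group (y == 1) has no previous values
DT[y == 1L, rm := NA_real_]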
#         N     x     y     z       rm
#     <int> <num> <num> <num>    <num>
#  1:     1    11     1    53       NA
#  2:     2    11     2    71 53.00000
#  3:     3    11     3    27 62.00000
#  4:     4    11     4    64 50.33333
#  5:     5    11     4    43 53.75000
#  6:     6    11     4    62 51.25000
#  7:     7    11     4    61 49.00000
#  8:     8    11     4    85 57.50000
#  9:     9    21     1    44       NA
# 10:    10    21     2    56 44.00000
# 11:    11    21     3    23 50.00000
# 12:    12    21     4    37 41.00000
# 13:    13    21     4    31 40.00000
# 14:    14    21     4    48 36.75000
# 15:    15    21     4    80 34.75000
# 16:    16    21     4    38 49.00000
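The lookup step above can be hard to picture, so here is a sketch of it on a short vector. It uses a plain matrix built with do.call(cbind, ...) instead of as.data.frame, which is my substitution for readability; z_demo, y_demo, means_by_width and picked are illustrative names.

```r
library(data.table)

# Illustrative input: first five rows of the x = 11 group
z_demo <- c(53, 71, 27, 64, 43)
y_demo <- c(1, 2, 3, 4, 4)

# One column per window width (1 to 4); incomplete windows are NA
means_by_width <- do.call(cbind, frollmean(z_demo, 1:4))

# Two-column matrix index: (row i, column y[i]) picks, for each row,
# the rolling mean whose window width equals y
picked <- means_by_width[cbind(seq_along(z_demo), y_demo)]
# 53, 62, 50.33333, 53.75, 51.25

# Lag by one row so each value covers only the preceding rows
shift(picked)
# NA, 53, 62, 50.33333, 53.75
```

Because y never exceeds the row's position within its group, the lookup never lands on one of the NA cells of an incomplete window.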

Data, correctly reproduced:

DT <- data.table(
  N = 1:16,
  x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21), 
  y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4), 
  z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)
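As a sanity check on this data, the target column can also be computed straight from the definition in the question (the mean of up to four previous z values within each x group). This is only a reference sketch for verifying results, not a recommended approach; prev_mean and Reference are hypothetical names.

```r
library(data.table)

DT <- data.table(
  N = 1:16,
  x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
  y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
  z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)

# Hypothetical helper: mean of up to `width` values before position i,
# NA for the first element, which has no previous values
prev_mean <- function(z, width = 4L) {
  vapply(seq_along(z), function(i) {
    if (i == 1L) return(NA_real_)
    mean(z[max(1L, i - width):(i - 1L)])
  }, numeric(1))
}

DT[, Reference := prev_mean(z), by = x]
```

Any of the rolling-mean columns computed earlier can then be compared against Reference with all.equal().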
