Creating a new column by applying a Reduce function to the rows of a data frame in R

Question

I have a data frame that contains IDs, dates, and observed returns. It looks something like this:

df <- data.frame(
  ID = gl(3, 10, labels = c("A", "B", "C")),
  # rep(2006:2015, 3) repeats the ten years once per ID
  Date = factor(rep(2006:2015, 3)),
  lr = runif(30, -0.01, 0.01))

Now I want to use the following function to compute the vector of exponential moving averages for each ID and add the results as a new column to my original data frame:

Emean <- function(x, lambda) {
    # a is the running average so far, b is the next observation
    ema <- function(a, b) lambda * a + (1 - lambda) * b
    Reduce(ema, x, accumulate = TRUE)
}

So I want the resulting data frame to have the columns ID, Date, lr, and mlr. The last column (mlr) will be calculated using the function above. Sorry for the loose notation, but this is the formula:

mlr_t = lambda * mlr_{t-1} + (1 - lambda) * lr_t

The subscript '_t' denotes the time.
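To make the recursion concrete, here is a minimal sketch on a toy vector. The values in `lr` are hypothetical, and starting from mlr_0 = 0 is an assumption (the formula above does not fix the starting value). It shows that `Reduce` without an initial value starts the average at lr_1, while passing `init = 0` starts it at 0.

```r
lambda <- 0.5
lr <- c(0.2, 0.4, 0.8)  # hypothetical returns for one ID

# a is the previous average (mlr_{t-1}), b is the current return (lr_t)
ema <- function(a, b) lambda * a + (1 - lambda) * b

# Without an initial value, the first output is lr_1 itself
ema_no_init <- Reduce(ema, lr, accumulate = TRUE)

# Assuming mlr_0 = 0: pass it as init, then drop it from the output
ema_init0 <- Reduce(ema, lr, 0, accumulate = TRUE)[-1]
```

Both results have the same length as `lr`, so either can be stored next to it as a column.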

As I said, I want to apply my function to the rows grouped by ID and add the result as a column to this data frame. The output of 'Reduce' cannot be added directly to the data frame, so I have to manipulate it in several steps, which is extremely time-consuming in R.

I need a computationally efficient way to do this. The actual data set has more than 100K IDs and more than 250 dates for each ID.

Answer 1

Since

mlr_0 = 0
mlr_1 = 0 + (1-lambda)*lr_1
mlr_2 = lambda * mlr_1 + (1-lambda)*lr_2
      = lambda * (1-lambda) * lr_1 + (1-lambda)*lr_2
mlr_3 = lambda * mlr_2 + (1-lambda)*lr_3
      = lambda^2 * (1-lambda) * lr_1 + lambda * (1-lambda) * lr_2 + (1-lambda)*lr_3
...
mlr_t = lambda^(t-1) * (1-lambda) * lr_1 + lambda^(t-2) * (1-lambda) * lr_2 + ...
      = \Sum_{i=1}^{t} lambda^(t-i) * (1-lambda)*lr_i
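As a quick sanity check of the derivation above, the following sketch compares the recursion (with mlr_0 = 0) against the closed-form sum on a toy vector. The values in `lr` and the names `mlr_rec` and `mlr_closed` are made up for illustration.

```r
lambda <- 0.5
lr <- c(0.2, 0.4, 0.8, 0.1)  # hypothetical returns for one ID

# Recursion: mlr_t = lambda * mlr_{t-1} + (1 - lambda) * lr_t, with mlr_0 = 0
mlr_rec <- numeric(length(lr))
prev <- 0
for (t in seq_along(lr)) {
  prev <- lambda * prev + (1 - lambda) * lr[t]
  mlr_rec[t] <- prev
}

# Closed form: sum over i = 1..t of lambda^(t - i) * (1 - lambda) * lr_i
mlr_closed <- vapply(seq_along(lr), function(t) {
  i <- seq_len(t)
  sum(lambda^(t - i) * (1 - lambda) * lr[i])
}, numeric(1))

# TRUE when both agree up to floating-point tolerance
all.equal(mlr_rec, mlr_closed)
```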

you can do something like this (using data.table):

library(data.table)
setDT(df)
lambda <- 0.5
# Weights lambda^(i-1), ..., lambda^1, lambda^0 for the first i terms
l <- function(i, lambda) { lambda^(i - seq_len(i)) }

# Multiplies element-wise and sums up the terms for row x.
# The leading 0 stands for mlr_0, so row x uses lr_1 .. lr_(x-1).
my_fun <- function(x, lr, lambda) {
  sum((1 - lambda) * c(0, lr)[1:x] * l(x, lambda))
}

# Evaluate my_fun at every row position within each ID
df[, vapply(seq_len(.N), my_fun, numeric(1), lr, lambda), by = ID]
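The question asks for the result as a new column named mlr. One possible way to get there, sketched here under assumptions, is data.table's `:=` operator combined with `Reduce` and an initial value of 0. The helper name `ema_from_zero` and the table name `dt` are hypothetical. Unlike `my_fun`, which puts mlr_0 = 0 on the first row of each ID, this variant stores mlr_t on row t.

```r
library(data.table)

set.seed(42)
dt <- data.table(
  ID = gl(3, 10, labels = c("A", "B", "C")),
  Date = factor(rep(2006:2015, 3)),
  lr = runif(30, -0.01, 0.01)
)
lambda <- 0.5

# Recursion mlr_t = lambda * mlr_{t-1} + (1 - lambda) * lr_t, assuming mlr_0 = 0
ema_from_zero <- function(x, lambda) {
  step <- function(prev, cur) lambda * prev + (1 - lambda) * cur
  # init = 0 is prepended to the accumulated output, so drop it again
  Reduce(step, x, 0, accumulate = TRUE)[-1]
}

# := adds the column in place; by = ID restarts the recursion for each ID
dt[, mlr := ema_from_zero(lr, lambda), by = ID]
```

Because `:=` modifies the table by reference, no intermediate copies or merge steps are needed.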

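Results in (with set.seed(42); the values shown correspond to lr drawn from runif(30), i.e. uniform on [0, 1], and the first row of each ID is the starting value mlr_0 = 0):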

    ID        V1
 1:  A 0.0000000
 2:  A 0.4574030
 3:  A 0.6972392
 4:  A 0.4916894
 5:  A 0.6610685
 6:  A 0.6514070
 7:  A 0.5852515
 8:  A 0.6609199
 9:  A 0.3977932
10:  A 0.5273928
11:  B 0.0000000
12:  B 0.2288709
...
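For a data set with many IDs, a base R alternative may be worth trying. The sketch below is not part of the approach above: it swaps in `stats::filter` with `method = "recursive"`, which computes y_t = x_t + lambda * y_{t-1} starting from zeros by default, and uses `ave()` to apply it within each ID. The names `df2` and `ema_filter` are hypothetical, and mlr_0 = 0 is assumed as in the derivation.

```r
set.seed(42)
df2 <- data.frame(
  ID = gl(3, 10, labels = c("A", "B", "C")),
  Date = factor(rep(2006:2015, 3)),
  lr = runif(30, -0.01, 0.01)
)
lambda <- 0.5

# Feeding (1 - lambda) * x into the recursive filter yields
# mlr_t = lambda * mlr_{t-1} + (1 - lambda) * lr_t with mlr_0 = 0
ema_filter <- function(x, lambda) {
  as.numeric(stats::filter((1 - lambda) * x, lambda, method = "recursive"))
}

# ave() applies the function per ID and returns a vector aligned with the rows
df2$mlr <- ave(df2$lr, df2$ID, FUN = function(x) ema_filter(x, lambda))
```

The recursive filter does one pass per group, whereas the closed-form sum recomputes all earlier terms for every row, so this version should scale better with the number of dates per ID.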

Answer 1 (accepted, 2015-10-23 14:33:59)
