Using apply() with a custom function and a second data frame

I'm trying to generate predicted values from a large number of model simulations and I'm having a hard time doing it simply. I suspect I need something from the apply() family, but I can't figure out the syntax. Maybe my knowledge of apply() is weak, or maybe my function is wrong. Any suggestions?

Suppose I have the following coefficients from six model simulations:

coef <- data.frame(intercept=c(2,3,5,7,2,1),
                   b1 = c(.2,.5,.6,.7,.9,.4),
                   b2 = c(10,11,12,11,9,10))

I want to compute the predicted value (the linear combination) for every pairing of a row above with a row of the following data frame:

df <- data.frame(age = c(50,20,19, 42),
                 height = c(60,72,79, 66))

using the following model equation:

coef$intercept + coef$b1*df$age + coef$b2*df$height

Done right, I should get the following 24 values (one row per simulation, one column per row of df):

612.0   726.0   795.8   670.4
688.0   805.0   881.5   750.0
755.0   881.0   964.4   822.2
702.0   813.0   889.3   762.4
587.0   668.0   730.1   633.8
621.0   729.0   798.6   677.8

To get the above, I tried the following function with apply():

equation <-  function(...) coef$intercept + coef$b1*df$age + coef$b2*df$height
result <- apply(df, 1, equation)

...but I don't get the correct answer. The "result" object just repeats the correct diagonals. I also get this message:

> Warning messages:
> 1: In coef$b1 * df$age :
>   longer object length is not a multiple of shorter object length
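
The warning comes from R's recycling rule, and the repeated values come from the fact that equation() never uses the row that apply() passes in. A minimal sketch that reproduces both with the data from the question (the names recycled and attempt are only illustrative, and suppressWarnings() is used here just to keep the output quiet):

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# coef$b1 has 6 elements and df$age has 4, so R recycles df$age as
# 50, 20, 19, 42, 50, 20 and warns that 6 is not a multiple of 4.
recycled <- suppressWarnings(coef$b1 * df$age)
length(recycled)

# The function ignores its argument, so each of the 4 calls returns the
# same 6 values and apply() binds them into 4 identical columns.
equation <- function(...) coef$intercept + coef$b1 * df$age + coef$b2 * df$height
attempt <- suppressWarnings(apply(df, 1, equation))
dim(attempt)
```

Each element k of that repeated column pairs coefficient row k with data row ((k - 1) %% 4) + 1, which is why only the "diagonal" entries of the expected table show up.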

Yes, I can get the correct answer through simple matrix multiplication:

df$ones <- 1                 # constant column that pairs with the intercept
df <- df[, c(3, 1, 2)]       # reorder to ones, age, height to match coef
result <- as.matrix(coef) %*% t(as.matrix(df))

But it seems to me one ought to be able to do this more generally using apply() and a custom function. Using apply() is more compact and puts me at less risk of getting my matrix columns in the wrong order. Any suggestions?
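
The column-order worry can also be handled inside the matrix approach by matching columns by name rather than by position. A sketch, assuming the coefficient columns intercept, b1 and b2 pair with a constant, age and height as in the model equation (the names X and pred and the sim/obs labels are made up for illustration):

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# Design matrix whose columns are named after the coefficient they pair with.
X <- cbind(intercept = 1, b1 = df$age, b2 = df$height)

# Select the coefficient columns by name, so a reordered coef data frame
# cannot silently misalign the product.
pred <- as.matrix(coef[, colnames(X)]) %*% t(X)
dimnames(pred) <- list(paste0("sim", seq_len(nrow(coef))),
                       paste0("obs", seq_len(nrow(df))))
pred
```

Because the pairing is spelled out once in cbind(), reordering the columns of either data frame should not change the result.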

If you really want to use apply(), you can do this:

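# Each row of coef arrives as x = (intercept, b1, b2); apply() returns one
# column per row, so t() puts the simulations back in rows.
result <- t(apply(coef, 1, function(x) x[1] + x[2]*df$age + x[3]*df$height))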
> result
     [,1] [,2]  [,3]  [,4]
[1,]  612  726 795.8 670.4
[2,]  688  805 881.5 750.0
[3,]  755  881 964.4 822.2
[4,]  702  813 889.3 762.4
[5,]  587  668 730.1 633.8
[6,]  621  729 798.6 677.8

But matrix multiplication is really preferable (and faster).
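
To check the speed claim on your own machine, here is a rough sketch with randomly generated inputs. The sizes, the names big_coef and big_df, and the value ranges are all made up; timings will vary, so none are shown.

```r
set.seed(1)
n_sim <- 2000   # hypothetical number of simulations
n_obs <- 500    # hypothetical number of observations

big_coef <- data.frame(intercept = rnorm(n_sim),
                       b1 = rnorm(n_sim),
                       b2 = rnorm(n_sim))
big_df <- data.frame(age = runif(n_obs, 18, 80),
                     height = runif(n_obs, 55, 80))

# Row-by-row version: one function call per simulation.
t_apply <- system.time(
  via_apply <- t(apply(big_coef, 1, function(x)
    x[1] + x[2] * big_df$age + x[3] * big_df$height))
)

# Matrix version: a single product plus the recycled intercept column.
t_matrix <- system.time(
  via_matrix <- big_coef[, 1] + as.matrix(big_coef[-1]) %*% t(big_df)
)

# Both approaches should agree up to floating-point tolerance.
all.equal(via_apply, via_matrix, check.attributes = FALSE)
c(apply = t_apply[["elapsed"]], matrix = t_matrix[["elapsed"]])
```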

We can do this with %*% (this assumes df still holds only its original age and height columns):

coef[,1] + as.matrix(coef[-1]) %*% t(df)
#     [,1] [,2]  [,3]  [,4]
#[1,]  612  726 795.8 670.4
#[2,]  688  805 881.5 750.0
#[3,]  755  881 964.4 822.2
#[4,]  702  813 889.3 762.4
#[5,]  587  668 730.1 633.8
#[6,]  621  729 798.6 677.8

Here is what I'd do:

sapply(seq_len(nrow(coef)), function(x) {
  sapply(seq_len(nrow(df)), function(y) {
    coef$intercept[[x]] + coef$b1[[x]]*df$age[[y]] + coef$b2[[x]]*df$height[[y]]
  })
})

Result:

     [,1]  [,2]  [,3]  [,4]  [,5]  [,6]
[1,] 612.0 688.0 755.0 702.0 587.0 621.0
[2,] 726.0 805.0 881.0 813.0 668.0 729.0
[3,] 795.8 881.5 964.4 889.3 730.1 798.6
[4,] 670.4 750.0 822.2 762.4 633.8 677.8

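This uses two sapply() calls, one for each object (df and coef). The result is the transpose of the layout in the question (one column per simulation); wrap the outer call in t() to get one row per simulation.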
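
The same loop over every (coef row, df row) pair can also be written with outer() over the row indices. This is an alternative sketch, not the answerer's code, and the names predict_pair and res are illustrative:

```r
coef <- data.frame(intercept = c(2, 3, 5, 7, 2, 1),
                   b1 = c(.2, .5, .6, .7, .9, .4),
                   b2 = c(10, 11, 12, 11, 9, 10))
df <- data.frame(age = c(50, 20, 19, 42),
                 height = c(60, 72, 79, 66))

# i indexes rows of coef, j indexes rows of df. outer() passes every (i, j)
# combination as two equal-length vectors, so the function must be vectorized.
predict_pair <- function(i, j) {
  coef$intercept[i] + coef$b1[i] * df$age[j] + coef$b2[i] * df$height[j]
}

res <- outer(seq_len(nrow(coef)), seq_len(nrow(df)), predict_pair)
res
```

Here res has one row per simulation and one column per row of df, so no t() is needed.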
