R ddply sum value from next row
I want to sum the column value from a row with the value from the next row.
> df
+----+------+--------+------+
| id | Val | Factor | Col |
+----+------+--------+------+
| 1 | 15 | 1 | 7 |
| 3 | 20 | 1 | 4 |
| 2 | 35 | 2 | 8 |
| 7 | 35 | 1 | 12 |
| 5 | 40 | 1 | 11 |
| 6 | 45 | 2 | 13 |
| 4 | 55 | 1 | 4 |
| 8 | 60 | 1 | 7 |
| 9 | 15 | 2 | 12 |
..........
I would like to have the mean of the sum Row$Val + nextRow$Val, based on their id and Col. I can't assume that the id or Col values are consecutive.
I am using ddply to summarize my df. I have tried:
> ddply(df, .(Factor), summarize,
max(Val),
sum(Val),
mean(Val + df[df$id == id+1 & df$Col = Col]$Val)
)
> "longer object length is not a multiple of shorter object length"
You can use rollapply from the zoo package. Since you want the mean of only two consecutive rows, you can try:
library(zoo)
rollapply(df[order(df$id), 2], 2, function(x) sum(x)/2)
#[1] 17.5 27.5 35.0 37.5 42.5 50.0 57.5 37.5
You can build a vector of values with:
sapply(df$id, function(x){mean(c(
subset(df, id == x, select = Val, drop = TRUE),
subset(df, id == x+1, select = Val, drop = TRUE)
))})
You could simplify this, but I tried to make it as readable as possible.
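Note that the lookup id == x + 1 only finds a partner row when the ids increase by exactly 1. If that cannot be assumed, one possible alternative (a base R sketch, not part of the answer above) is to sort by id and pair each row with whatever row comes next. The data frame is retyped from the nine rows shown in the question, and the names ord and pair_sums are only illustrative.

```r
# Sample rows copied from the question (only the nine rows shown there)
df <- data.frame(
  id     = c(1, 3, 2, 7, 5, 6, 4, 8, 9),
  Val    = c(15, 20, 35, 35, 40, 45, 55, 60, 15),
  Factor = c(1, 1, 2, 1, 1, 2, 1, 1, 2),
  Col    = c(7, 4, 8, 12, 11, 13, 4, 7, 12)
)

# Sort by id so that "next row" means "next larger id", whatever the gap
ord <- df[order(df$id), ]

# Add each Val to the Val of the following row; the last row has no partner
pair_sums <- head(ord$Val, -1) + tail(ord$Val, -1)

# Mean of the pairwise sums
mean(pair_sums)
```

Because rows are paired purely by sort order, gaps in id do not matter here.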
You can do something like this with the dplyr package:
library(dplyr)
df <- arrange(df, id)
mean(df$Val + lead(df$Val), na.rm = TRUE)
[1] 76.25
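The ddply call in the question summarizes per Factor. A hedged sketch of the same idea with dplyr is to group before calling lead(), so that each row is only paired with the next row of its own group. The sample data is retyped from the nine rows shown in the question, and the output column names (max_val, sum_val, mean_pair_sum) are made up for illustration. Forming pairs within a Factor group rather than across the whole table is an assumption; the question does not say which is wanted.

```r
library(dplyr)

# Sample rows copied from the question (only the nine rows shown there)
df <- data.frame(
  id     = c(1, 3, 2, 7, 5, 6, 4, 8, 9),
  Val    = c(15, 20, 35, 35, 40, 45, 55, 60, 15),
  Factor = c(1, 1, 2, 1, 1, 2, 1, 1, 2),
  Col    = c(7, 4, 8, 12, 11, 13, 4, 7, 12)
)

res <- df %>%
  arrange(id) %>%          # "next row" = next larger id
  group_by(Factor) %>%     # pair rows only within the same Factor
  summarise(
    max_val       = max(Val),
    sum_val       = sum(Val),
    # lead(Val) is NA for the last row of each group, so drop it with na.rm
    mean_pair_sum = mean(Val + lead(Val), na.rm = TRUE)
  )

res
```

If the pairs should instead span the whole table, dropping the group_by() step for that one column would be the change to make.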