R ddply sum value from next row

I want to sum a column's value in each row with the value in the next row.

> df

+----+------+--------+------+
| id |  Val | Factor | Col  |
+----+------+--------+------+
|  1 |   15 |      1 |    7 |
|  3 |   20 |      1 |    4 |
|  2 |   35 |      2 |    8 | 
|  7 |   35 |      1 |   12 |
|  5 |   40 |      1 |   11 |
|  6 |   45 |      2 |   13 |
|  4 |   55 |      1 |    4 |
|  8 |   60 |      1 |    7 |
|  9 |   15 |      2 |   12 |
..........

I would like to get the mean of Row$Val + nextRow$Val, where the next row is determined by id and Col. I can't assume that the id or Col values are consecutive.

I am using ddply to summarize my df. I have tried

> ddply(df, .(Factor), summarize, 
       max(Val), 
       sum(Val), 
       mean(Val + df[df$id == id+1 & df$Col == Col]$Val)
       )

> "longer object length is not a multiple of shorter object length"

You can use rollapply from the zoo package. Since you want the mean of only two consecutive rows, you can try

library(zoo)
rollapply(df[order(df$id), 2], 2, function(x) sum(x)/2)

#[1] 25.0 27.5 37.5 47.5 42.5 40.0 47.5 37.5
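
If you would rather not load zoo for a window of width 2, the same pairwise means can be sketched in base R by adding the vector to a copy of itself shifted by one position. This is only an illustration: the data frame below is rebuilt from the nine rows shown in the question (the real df has more rows), and pair_mean is a name I made up.

```r
# Rebuild the nine rows shown in the question (the real df is longer).
df <- data.frame(
  id     = c(1, 3, 2, 7, 5, 6, 4, 8, 9),
  Val    = c(15, 20, 35, 35, 40, 45, 55, 60, 15),
  Factor = c(1, 1, 2, 1, 1, 2, 1, 1, 2),
  Col    = c(7, 4, 8, 12, 11, 13, 4, 7, 12)
)

# Val sorted by id, so "next row" means the row with the next larger id.
v <- df$Val[order(df$id)]

# head(v, -1) drops the last element, tail(v, -1) drops the first,
# so element i of the sum is v[i] + v[i + 1].
pair_mean <- (head(v, -1) + tail(v, -1)) / 2
print(pair_mean)
```

The result has one element fewer than df has rows, because the last row has no successor.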

You can build a vector of values with

sapply(df$id, function(x){mean(c(
    subset(df, id == x, select = Val, drop = TRUE), 
    subset(df, id == x+1, select = Val, drop = TRUE)
    ))})

You could simplify this, but I tried to make it as readable as possible.

You can do something like this with the dplyr package:
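
One thing to watch with the lookup above: as far as I can tell, when no row has id == x + 1 the second subset() is empty, so mean() quietly returns the row's own Val. A vectorised sketch of the same "id + 1" lookup that marks those rows as NA instead could use match(). The data frame is rebuilt from the nine rows shown in the question, and next_val and pair_mean are illustrative names.

```r
# Rebuild the nine rows shown in the question (the real df is longer).
df <- data.frame(
  id     = c(1, 3, 2, 7, 5, 6, 4, 8, 9),
  Val    = c(15, 20, 35, 35, 40, 45, 55, 60, 15),
  Factor = c(1, 1, 2, 1, 1, 2, 1, 1, 2),
  Col    = c(7, 4, 8, 12, 11, 13, 4, 7, 12)
)

# For each row, find the position of the row whose id is exactly one larger.
# match() returns NA when there is no such id, so next_val is NA there.
next_val <- df$Val[match(df$id + 1, df$id)]

# Mean of each row's Val and its successor's Val, in the original row order.
pair_mean <- (df$Val + next_val) / 2
names(pair_mean) <- df$id
print(pair_mean)
```

Rows without a successor (here id 9) come out as NA, so you can drop them explicitly with na.rm = TRUE or is.na() rather than having them blend into the result.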

library(dplyr)
df <- arrange(df, id)
mean(df$Val + lead(df$Val), na.rm = TRUE)
[1] 76.25
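
The question summarises by Factor, so here is a rough grouped version of the same idea. It assumes that "next row" means the next row within the same Factor after sorting by id, which may not be what you need if the pairing should also depend on Col. The data frame is rebuilt from the nine rows shown in the question, and the column names max_val, sum_val and mean_pair are made up for the example.

```r
library(dplyr)

# Rebuild the nine rows shown in the question (the real df is longer).
df <- data.frame(
  id     = c(1, 3, 2, 7, 5, 6, 4, 8, 9),
  Val    = c(15, 20, 35, 35, 40, 45, 55, 60, 15),
  Factor = c(1, 1, 2, 1, 1, 2, 1, 1, 2),
  Col    = c(7, 4, 8, 12, 11, 13, 4, 7, 12)
)

res <- df %>%
  arrange(id) %>%          # define row order by id
  group_by(Factor) %>%     # lead() below is evaluated within each Factor
  summarise(
    max_val   = max(Val),
    sum_val   = sum(Val),
    # lead(Val) is NA for the last row of each group; na.rm drops that pair.
    mean_pair = mean(Val + lead(Val), na.rm = TRUE)
  )
print(res)
```

Because lead() runs per group, a row is never paired with a row from a different Factor.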
