Rolling mean (varying lag)

library(tidyverse)

How can I calculate the means of variables y and z over the following depth ranges: 1-2, 1-3, 1-4, …, 1-10? Note that in my real data the depths are not equally spaced, so I can't simply use rollapply directly.

set.seed(123)
df <- data.frame(depth = seq(1, 10, length.out = 100), y = rnorm(100), z = rnorm(100))

head(df)
#>      depth           y           z
#> 1 1.000000 -0.56047565 -0.71040656
#> 2 1.090909 -0.23017749  0.25688371
#> 3 1.181818  1.55870831 -0.24669188
#> 4 1.272727  0.07050839 -0.34754260
#> 5 1.363636  0.12928774 -0.95161857
#> 6 1.454545  1.71506499 -0.04502772

Example of the desired outputs:

df %>% 
  filter(between(depth, 1, 2)) %>% 
  summarise_at(vars(y, z), mean) %>% 
  mutate(start_depth = 1, end_depth = 2)
#>           y          z start_depth end_depth
#> 1 0.1941793 -0.3271552           1         2

df %>% 
  filter(between(depth, 1, 3)) %>% 
  summarise_at(vars(y, z), mean) %>% 
  mutate(start_depth = 1, end_depth = 3)
#>            y          z start_depth end_depth
#> 1 0.02263796 -0.3699128           1         3

df %>% 
  filter(between(depth, 1, 4)) %>% 
  summarise_at(vars(y, z), mean) %>% 
  mutate(start_depth = 1, end_depth = 4)
#>            y          z start_depth end_depth
#> 1 0.01445704 -0.1993295           1         4

And so on…

Created on 2018-10-23 by the reprex package (v0.2.1)

The OP already has code that produces the output one range at a time, so I guess the request is to do it all at once:

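library(data.table)
setDT(df)  # convert df to a data.table by reference

cols = c("y", "z")
# one row per depth range: start fixed at 1, end running from 1 to 10
mDT = data.table(start_depth = 1, end_depth = as.numeric(1:10))
# non-equi join: for each row of mDT, average cols over the rows of df in that range
res = df[mDT, on=.(depth >= start_depth, depth <= end_depth),
  lapply(.SD, mean), by=.EACHI, .SDcols=cols]
# the join columns come back named after depth, so restore the range names
setnames(res, c(names(mDT), cols))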

    start_depth end_depth           y           z
 1:           1         1 -0.56047565 -0.71040656
 2:           1         2  0.19417934 -0.32715522
 3:           1         3  0.02263796 -0.36991283
 4:           1         4  0.01445704 -0.19932946
 5:           1         5  0.06702734 -0.27118566
 6:           1         6  0.08145323 -0.21811183
 7:           1         7  0.03197788 -0.13311881
 8:           1         8  0.01918313 -0.10335488
 9:           1         9  0.03956002 -0.08520866
10:           1        10  0.09040591 -0.10754680

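This is a non-equi join: each row of mDT is matched to every row of df whose depth lies between start_depth and end_depth, and the means are taken per matched group. The extra setnames step may change soon.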
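To make clear what the join computes, here is a minimal base R sketch of the same per-range means, without data.table. The helper name range_mean and the result name res_base are made up for this illustration and do not come from the answer; the data is rebuilt as in the question.

```r
# Rebuild the example data from the question
set.seed(123)
df <- data.frame(depth = seq(1, 10, length.out = 100), y = rnorm(100), z = rnorm(100))

# Hypothetical helper: column means over the rows with lo <= depth <= hi
range_mean <- function(data, lo, hi, cols = c("y", "z")) {
  keep <- data$depth >= lo & data$depth <= hi
  colMeans(data[keep, cols, drop = FALSE])
}

# One row per depth range, as in mDT above
ranges <- data.frame(start_depth = 1, end_depth = 1:10)

# mapply returns one column per range, so transpose to get one row per range
means <- t(mapply(range_mean, lo = ranges$start_depth, hi = ranges$end_depth,
                  MoreArgs = list(data = df)))
res_base <- cbind(ranges, means)
```

This rescans df once per range, so it is likely slower than the join on large data, but the ranges in `ranges` can be arbitrary.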

A non-equi join may be suitable if your ranges are arbitrary, but in the OP's case the range simply grows from a fixed start, so the natural solution is a rolling computation (e.g. with RcppRoll).
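As a rough sketch of that idea, and not the RcppRoll code the answer has in mind, a growing window with a fixed start can be handled with cumulative sums in base R. It assumes every range starts at the smallest depth and that each end depth covers at least one row; the names ends, k and res_cum are made up for this illustration.

```r
# Rebuild the example data from the question
set.seed(123)
df <- data.frame(depth = seq(1, 10, length.out = 100), y = rnorm(100), z = rnorm(100))

# Cumulative sums only make sense if rows are sorted by depth
df <- df[order(df$depth), ]

ends <- 1:10
# k[i] = number of rows with depth <= ends[i]; works for unequally spaced depths
k <- findInterval(ends, df$depth)

# Mean over the first k rows = cumulative sum at row k divided by k
res_cum <- data.frame(
  start_depth = min(df$depth),
  end_depth = ends,
  y = cumsum(df$y)[k] / k,
  z = cumsum(df$z)[k] / k
)
```

Each column is scanned only once, whatever the number of end depths, which is the advantage of treating the problem as a cumulative rather than an arbitrary-range computation.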
