Fill NAs with next columns for moving average
library(dplyr)  # provides %>%, group_by() and mutate() used below
set.seed(123)
df <- data.frame(loc.id = rep(1:3, each = 4*10),
                 year = rep(rep(1980:1983, each = 10), times = 3),
                 day = rep(1:10, times = 3*4),
                 x = sample(123:200, 4*3*10, replace = TRUE))
I want to add one more column, x.mv, which is the 3-day moving average of x for each loc.id and year combination.
df %>% group_by(loc.id, year) %>% mutate(x.mv = zoo::rollmean(x, 3, fill = NA, align = "right"))
   loc.id  year   day     x  x.mv
    <int> <int> <int> <int> <dbl>
 1      1  1980     1   145   NA
 2      1  1980     2   184   NA
 3      1  1980     3   154  161
 4      1  1980     4   191  176.
 5      1  1980     5   196  180.
 6      1  1980     6   126  171
 7      1  1980     7   164  162
 8      1  1980     8   192  161.
 9      1  1980     9   166  174
10      1  1980    10   158  172
What I want to do is replace the NAs in the x.mv column with the corresponding values of x. I tried this:
df %>% group_by(loc.id,year) %>% mutate(x.mv = zoo::rollmean(x, 3, fill = x[1:2], align = "right"))
   loc.id  year   day     x  x.mv
    <int> <int> <int> <int> <dbl>
 1      1  1980     1   145  145
 2      1  1980     2   184  145
 3      1  1980     3   154  161
 4      1  1980     4   191  176.
 5      1  1980     5   196  180.
 6      1  1980     6   126  171
 7      1  1980     7   164  162
 8      1  1980     8   192  161.
 9      1  1980     9   166  174
10      1  1980    10   158  172
But instead of using the corresponding value of x, this fills both NAs with the first value of x. How do I fix it?
Skip the fill argument and pad manually:
df %>%
  group_by(loc.id, year) %>%
  mutate(x.mv = c(x[1:2], zoo::rollmean(x, 3, align = "right"))) %>%
  ungroup()
# # A tibble: 120 x 5
#    loc.id  year   day     x     x.mv
#     <int> <int> <int> <int>    <dbl>
#  1      1  1980     1   145 145.0000
#  2      1  1980     2   184 184.0000
#  3      1  1980     3   154 161.0000
#  4      1  1980     4   191 176.3333
#  5      1  1980     5   196 180.3333
#  6      1  1980     6   126 171.0000
#  7      1  1980     7   164 162.0000
#  8      1  1980     8   192 160.6667
#  9      1  1980     9   166 174.0000
# 10      1  1980    10   158 172.0000
# # ... with 110 more rows
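As for why fill = x[1:2] picked up the first value twice: as far as I can tell from the zoo documentation, fill is not matched position by position to the missing rows. It is read as up to three values (left of the data, interior, right of the data), and shorter vectors are recycled. Here is a minimal sketch on the first five x values from the output above. The standalone vector x and the marker values -1, -2, -3 are made up for illustration only.

```r
# First five values of x for loc.id 1, year 1980 (copied from the output above)
x <- c(145, 184, 154, 191, 196)

# x[1:2] is c(145, 184); zoo treats it as (left, interior, ...) fill values,
# so both leading positions receive the "left" value, 145
filled <- zoo::rollmean(x, 3, fill = x[1:2], align = "right")

# Hypothetical marker values make the left / interior / right roles visible
marked <- zoo::rollmean(x, 3, fill = c(-1, -2, -3), align = "right")

print(filled)
print(marked)
```

With a right-aligned window only the left value is ever used, which is why x[2] never appears in the result.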
You might want to use dplyr::cummean(x[1:2]) instead of x[1:2], so that the second value is already an average of the first two. Or, in this case, use @g-grothendieck's suggestion from the comments and rewrite your mutate call as mutate(x.mv = zoo::rollapplyr(x, 3, mean, partial = TRUE)).
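To compare the three padding options side by side, here is a small sketch on a plain vector rather than the grouped data frame. The vector x holds the first five values from the output above; the names roll3, pad_raw, pad_cum and partial are just illustrative.

```r
# First five values of x for loc.id 1, year 1980 (copied from the output above)
x <- c(145, 184, 154, 191, 196)

# Without fill, rollmean returns only the complete windows: length(x) - 2 values
roll3 <- zoo::rollmean(x, 3, align = "right")

# Option 1: pad with the raw values of x
pad_raw <- c(x[1:2], roll3)

# Option 2: pad with running means, so position 2 is mean(x[1:2])
pad_cum <- c(dplyr::cummean(x[1:2]), roll3)

# Option 3: let zoo average whatever part of the window is available
partial <- zoo::rollapplyr(x, 3, mean, partial = TRUE)

print(rbind(pad_raw, pad_cum, partial))
```

Options 2 and 3 should agree here. The manual padding assumes every group is longer than the window, which holds for this data (ten days per group); partial = TRUE avoids having to think about that.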