Fill NAs with next columns for moving average
library(dplyr)  # provides %>%, group_by() and mutate() used below

set.seed(123)
df <- data.frame(loc.id = rep(1:3, each = 4*10),
                 year = rep(rep(1980:1983, each = 10), times = 3),
                 day = rep(1:10, times = 3*4),
                 x = sample(123:200, 4*3*10, replace = TRUE))
I want to add another column, x.mv, which is the 3-day moving average of x for each combination of loc.id and year:
df %>% group_by(loc.id, year) %>% mutate(x.mv = zoo::rollmean(x, 3, fill = NA, align = "right"))
loc.id year day x x.mv
<int> <int> <int> <int> <dbl>
1 1 1980 1 145 NA
2 1 1980 2 184 NA
3 1 1980 3 154 161
4 1 1980 4 191 176.
5 1 1980 5 196 180.
6 1 1980 6 126 171
7 1 1980 7 164 162
8 1 1980 8 192 161.
9 1 1980 9 166 174
10 1 1980 10 158 172
What I want to do is replace the NAs in the x.mv column with the corresponding values of x. I tried this:
df %>% group_by(loc.id,year) %>% mutate(x.mv = zoo::rollmean(x, 3, fill = x[1:2], align = "right"))
loc.id year day x x.mv
<int> <int> <int> <int> <dbl>
1 1 1980 1 145 145
2 1 1980 2 184 145
3 1 1980 3 154 161
4 1 1980 4 191 176.
5 1 1980 5 196 180.
6 1 1980 6 126 171
7 1 1980 7 164 162
8 1 1980 8 192 161.
9 1 1980 9 166 174
10 1 1980 10 158 172
But this fills the NAs with the first value of x rather than with the corresponding values of x. How can I fix it?
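Skip the fill argument and fill manually. (fill is interpreted as left/interior/right padding values, recycled if shorter, not as one value per row, which is why x[1:2] put x[1] in both leading rows.)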
df %>%
group_by(loc.id,year) %>%
mutate(x.mv = c(x[1:2],zoo::rollmean(x, 3, align = "right"))) %>%
ungroup
# # A tibble: 120 x 5
# loc.id year day x x.mv
# <int> <int> <int> <int> <dbl>
# 1 1 1980 1 145 145.0000
# 2 1 1980 2 184 184.0000
# 3 1 1980 3 154 161.0000
# 4 1 1980 4 191 176.3333
# 5 1 1980 5 196 180.3333
# 6 1 1980 6 126 171.0000
# 7 1 1980 7 164 162.0000
# 8 1 1980 8 192 160.6667
# 9 1 1980 9 166 174.0000
# 10 1 1980 10 158 172.0000
# # ... with 110 more rows
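To see on a bare vector why the manual padding lines up row by row, here is a minimal sketch that uses the first ten x values from the printed output above. The names k, ma, padded and filled are only illustrative and do not come from the original post.

```r
library(zoo)

# First ten x values copied from the printed output (loc.id 1, year 1980)
x <- c(145, 184, 154, 191, 196, 126, 164, 192, 166, 158)
k <- 3  # window width

# Without fill, a right-aligned rolling mean returns length(x) - k + 1 values
ma <- rollmean(x, k, align = "right")

# Prepend the first k - 1 raw values so there is one entry per row
padded <- c(x[seq_len(k - 1)], ma)

# For comparison: fill = x[1:2] is recycled into left/interior/right padding,
# so both leading positions receive x[1]
filled <- rollmean(x, k, fill = x[1:2], align = "right")

print(padded)
print(filled)
```

Note that this padding assumes every group has at least k rows; a shorter group would need separate handling.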
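You may want to use dplyr::cummean(x[1:2]) instead of x[1:2] so that the second value is the average of the first two rows. In that case you can also follow @g-grothendieck's suggestion in the comments and rewrite your mutate call as mutate(x.mv = zoo::rollapplyr(x, 3, mean, partial = TRUE)).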
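As a quick check of how these two alternatives relate, the sketch below applies both to the same ten values of x from the printed output. The variable names via_cummean and via_partial are hypothetical, and the comparison assumes a window of 3 as in the question.

```r
library(zoo)
library(dplyr)

# First ten x values copied from the printed output (loc.id 1, year 1980)
x <- c(145, 184, 154, 191, 196, 126, 164, 192, 166, 158)

# Option 1: cumulative means for the first two rows, then the 3-day rolling mean
via_cummean <- c(cummean(x[1:2]), rollmean(x, 3, align = "right"))

# Option 2: partial = TRUE lets rollapplyr average the shorter windows at the start
via_partial <- rollapplyr(x, 3, mean, partial = TRUE)

# Both approaches should agree element by element
print(all.equal(via_cummean, via_partial))
```

With either option the second row becomes the mean of the first two values (164.5 here) rather than the raw value 184, so pick whichever matches what the leading rows should mean in your analysis.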