Fill missing values with a calculated next value
I have a data frame with a date column that has some missing values:
my_df <- data.frame(date = as.Date(c("2020-07-01", NA, NA, NA, "2022-07-01", "2023-07-01")))
my_df
# date
# 1 2020-07-01
# 2 <NA>
# 3 <NA>
# 4 <NA>
# 5 2022-07-01 # NAs to be replaced with 2022-07-01 minus one year
# 6 2023-07-01
I want to fill in the NA dates with the next non-NA date minus one year, filling upwards.
Desired result:
data.frame(date = as.Date(c("2020-07-01", "2021-07-01", "2021-07-01", "2021-07-01", "2022-07-01", "2023-07-01")))
# date
# 1 2020-07-01
# 2 2021-07-01
# 3 2021-07-01
# 4 2021-07-01
# 5 2022-07-01
# 6 2023-07-01
I like tidyverse, so I'm hoping to use fill() and something like %m-% years(1) from lubridate. My attempt:
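library(dplyr)
library(tidyr)
library(lubridate)

my_df <- my_df %>%
  mutate(date2 = date %m-% years(1)) %>%                 # every known date minus one year
  fill(date2, .direction = "up") %>%                     # carry those values upwards over the NAs
  mutate(date = if_else(is.na(date), date2, date)) %>%   # use them only where date is missing
  select(-date2)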
This seems to work, but is there a more direct method?
An option with zoo:
library(dplyr)
library(lubridate)
library(zoo)
my_df %>% mutate(date = na.locf(date, fromLast = TRUE) %m-% years(1 * is.na(date)))
Output:
date
1 2020-07-01
2 2021-07-01
3 2021-07-01
4 2021-07-01
5 2022-07-01
6 2023-07-01
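To see why the one-liner works, it may help to split it into its three steps. This is only a sketch on a plain vector with the same values as the question's date column. The names filled, was_missing and result are made up for illustration, and na.rm = FALSE is an addition that is not in the answer above.

```r
library(lubridate)
library(zoo)

# Same values as the date column in the question
date <- as.Date(c("2020-07-01", NA, NA, NA, "2022-07-01", "2023-07-01"))

# Step 1: carry the next observed date backwards (upwards) over the NAs.
# na.rm = FALSE (an addition here) keeps the length unchanged even if the
# vector were to end in NA.
filled <- na.locf(date, fromLast = TRUE, na.rm = FALSE)

# Step 2: 1 where the original value was missing, 0 where it was present
was_missing <- 1 * is.na(date)

# Step 3: subtract one year only at the originally missing positions;
# the known dates get years(0) subtracted, so they stay as they are
result <- filled %m-% years(was_missing)
result
```

By default na.locf() drops NAs it cannot fill (here that would be trailing ones, because of fromLast = TRUE), which would shorten the vector and make mutate() fail, so na.rm = FALSE is probably the safer choice if the column might end in NA.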