
Fill missing values with a calculated next value

I have a data frame with a date column that contains some missing values:

my_df <- data.frame(date = as.Date(c("2020-07-01", NA, NA, NA, "2022-07-01", "2023-07-01")))
my_df
#         date
# 1 2020-07-01
# 2       <NA>
# 3       <NA>
# 4       <NA>
# 5 2022-07-01 # NAs to be replaced with 2022-07-01 minus one year
# 6 2023-07-01

I want to fill in the NA dates as follows:

  • fill each NA upwards with the next non-NA value
  • subtract 1 year from the filled values.

Desired result:

data.frame(date = as.Date(c("2020-07-01", "2021-07-01", "2021-07-01", "2021-07-01", "2022-07-01", "2023-07-01")))

#         date
# 1 2020-07-01
# 2 2021-07-01 
# 3 2021-07-01 
# 4 2021-07-01 
# 5 2022-07-01
# 6 2023-07-01

I like tidyverse, so I'm hoping to use fill() and something like %m-% years(1) from lubridate. My attempt:

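library(dplyr)
library(tidyr)      # fill()
library(lubridate)  # %m-% and years()

my_df <- my_df %>%
  mutate(date2 = date %m-% years(1)) %>%               # candidate: each date minus one year
  fill(date2, .direction = "up") %>%                   # pull the next non-NA candidate upwards
  mutate(date = if_else(is.na(date), date2, date)) %>% # use the candidate only where date is NA
  select(-date2)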

This seems to work, but is there a more direct method?

An option with zoo:

library(dplyr)
library(lubridate)
library(zoo)

my_df %>% mutate(date = na.locf(date, fromLast = TRUE) %m-% years(1 * is.na(date)))

Output:

        date
1 2020-07-01
2 2021-07-01
3 2021-07-01
4 2021-07-01
5 2022-07-01
6 2023-07-01
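
To see why the one-liner works, it can help to unpack it into separate steps. The following is only an illustrative sketch on a plain vector; the intermediate names `filled`, `was_na` and `result` are made up here and are not part of the answer above.

```r
library(lubridate)
library(zoo)

date <- as.Date(c("2020-07-01", NA, NA, NA, "2022-07-01", "2023-07-01"))

# Step 1: carry the next non-NA value backwards over each run of NAs.
filled <- na.locf(date, fromLast = TRUE)

# Step 2: numeric flag, 1 where the original value was missing and 0 elsewhere.
was_na <- 1 * is.na(date)

# Step 3: subtract one year from the originally missing rows only;
# rows with a flag of 0 have a zero-year period subtracted and stay unchanged.
result <- filled %m-% years(was_na)
result
```

One caveat, assuming the data could end with an NA: by default na.locf() drops NAs it cannot fill (na.rm = TRUE), so the result would be shorter than the input and mutate() would fail. Passing na.rm = FALSE keeps the length and leaves such a trailing NA in place.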
