Add missing months for a range of dates in R
Say I have a data.frame as follows, where each month has at most one entry:
df <- read.table(text="date,gmsl
2009-01-17,58.4
2009-02-17,59.1
2009-04-16,60.9
2009-06-16,62.3
2009-09-16,64.6
2009-12-16,68.3",sep=",",header=TRUE)
## > df
## date gmsl
## 1 2009-01-17 58.4
## 2 2009-02-17 59.1
## 3 2009-04-16 60.9
## 4 2009-06-16 62.3
## 5 2009-09-16 64.6
## 6 2009-12-16 68.3
How can I fill in the missing months, with gmsl set to NaN, for the date range 2009-01 to 2009-12?
I have already extracted the year and month from the date column with df$Month_Yr <- format(as.Date(df$date), "%Y-%m").
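As a rough sketch of what that extraction gives you, the Month_Yr key can be compared against every month in the target range to see which ones are absent. The names all_months and missing_months are illustrative and not part of the original question.

```r
# Sample data from the question
df <- read.table(text = "date,gmsl
2009-01-17,58.4
2009-02-17,59.1
2009-04-16,60.9
2009-06-16,62.3
2009-09-16,64.6
2009-12-16,68.3", sep = ",", header = TRUE)

# Year-month key, as in the question
df$Month_Yr <- format(as.Date(df$date), "%Y-%m")

# Every month in the target range, in the same "%Y-%m" format
# (all_months is an illustrative name)
all_months <- format(
  seq(as.Date("2009-01-01"), as.Date("2009-12-01"), by = "month"),
  "%Y-%m"
)

# Months in the range that have no observation
missing_months <- setdiff(all_months, df$Month_Yr)
missing_months
# "2009-03" "2009-05" "2009-07" "2009-08" "2009-10" "2009-11"
```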
Here's a way to do this with tidyr::complete:
library(dplyr)
df %>%
  mutate(date = as.Date(date),
         first_date = as.Date(format(date, "%Y-%m-01"))) %>%
  tidyr::complete(first_date = seq(min(first_date), max(first_date), "1 month"))
# A tibble: 12 x 3
# first_date date gmsl
# <date> <date> <dbl>
# 1 2009-01-01 2009-01-17 58.4
# 2 2009-02-01 2009-02-17 59.1
# 3 2009-03-01 NA NA
# 4 2009-04-01 2009-04-16 60.9
# 5 2009-05-01 NA NA
# 6 2009-06-01 2009-06-16 62.3
# 7 2009-07-01 NA NA
# 8 2009-08-01 NA NA
# 9 2009-09-01 2009-09-16 64.6
#10 2009-10-01 NA NA
#11 2009-11-01 NA NA
#12 2009-12-01 2009-12-16 68.3
You can then decide which column to keep, either first_date or date, or combine the two.
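One possible way to combine the two columns is dplyr::coalesce, which keeps the observed date where there is one and falls back to the first of the month otherwise. The sketch below also swaps NA for NaN in gmsl, since the question asked for NaN. The name filled and the choice of final columns are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  date = c("2009-01-17", "2009-02-17", "2009-04-16",
           "2009-06-16", "2009-09-16", "2009-12-16"),
  gmsl = c(58.4, 59.1, 60.9, 62.3, 64.6, 68.3)
)

filled <- df %>%
  mutate(date = as.Date(date),
         first_date = as.Date(format(date, "%Y-%m-01"))) %>%
  complete(first_date = seq(min(first_date), max(first_date), by = "1 month")) %>%
  mutate(
    # Observed date where available, otherwise the first of the month
    date = coalesce(date, first_date),
    # The question asked for NaN rather than NA
    gmsl = if_else(is.na(gmsl), NaN, gmsl)
  ) %>%
  select(date, gmsl)

filled
```

Note that is.na() returns TRUE for NaN as well, so use is.nan() if you need to tell the two apart later.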
Data
df <- structure(list(date = structure(1:6, .Label = c("2009-01-17",
"2009-02-17", "2009-04-16", "2009-06-16", "2009-09-16", "2009-12-16"
), class = "factor"), gmsl = c(58.4, 59.1, 60.9, 62.3, 64.6,
68.3)), class = "data.frame", row.names = c(NA, -6L))
In base R you could match (using %in%) the substrings of a seq.Date against the substrings of the existing dates.
# seq.Date() expects Date objects, so build the endpoints with as.Date()
dt.match <- seq.Date(as.Date("2009-01-01"), as.Date("2009-12-01"), "month")
# First-of-month rows for months absent from dat; NA_real_ keeps gmsl numeric
missing_rows <- data.frame(
  date = as.character(dt.match)[!substr(dt.match, 1, 7) %in% substr(dat$date, 1, 7)],
  gmsl = NA_real_)
merge(dat, missing_rows, all=TRUE)
# date gmsl
# 1 2009-01-17 58.4
# 2 2009-02-17 59.1
# 3 2009-03-01 NA
# 4 2009-04-16 60.9
# 5 2009-05-01 NA
# 6 2009-06-16 62.3
# 7 2009-07-01 NA
# 8 2009-08-01 NA
# 9 2009-09-16 64.6
# 10 2009-10-01 NA
# 11 2009-11-01 NA
# 12 2009-12-16 68.3
Data
dat <- structure(list(date = c("2009-01-17", "2009-02-17", "2009-04-16",
"2009-06-16", "2009-09-16", "2009-12-16"), gmsl = c(58.4, 59.1,
60.9, 62.3, 64.6, 68.3)), row.names = c(NA, -6L), class = "data.frame")
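A variation on the base R idea, sketched here under the assumption that you want to reuse the Month_Yr key from the question: left-join the data onto a full sequence of months with merge(), then replace the resulting NA values in gmsl with NaN. The names all_months and full are illustrative.

```r
dat <- data.frame(
  date = c("2009-01-17", "2009-02-17", "2009-04-16",
           "2009-06-16", "2009-09-16", "2009-12-16"),
  gmsl = c(58.4, 59.1, 60.9, 62.3, 64.6, 68.3)
)

# Year-month key, as in the question
dat$Month_Yr <- format(as.Date(dat$date), "%Y-%m")

# One row per month in the target range
all_months <- data.frame(
  Month_Yr = format(
    seq(as.Date("2009-01-01"), as.Date("2009-12-01"), by = "month"),
    "%Y-%m"
  )
)

# Keep every month; months without data get NA in date and gmsl
full <- merge(all_months, dat, by = "Month_Yr", all.x = TRUE)

# The question asked for NaN rather than NA in gmsl
full$gmsl[is.na(full$gmsl)] <- NaN
full
```

Here the date column stays NA for the filled months, which makes it easy to see which rows were added.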