Adding the Century to 2-Digit Year
I currently have a df that looks like this:
STA YR MO DA MAX date
58716 33013 43 3 11 60 0043-03-11
58717 33013 43 3 12 55 0043-03-12
58718 33013 43 3 13 63 0043-03-13
58719 33013 43 3 14 50 0043-03-14
58720 33013 43 3 15 58 0043-03-15
58721 33013 43 3 16 63 0043-03-16
I ran
df$date <- as.Date(with(df, paste(YR, MO, DA,sep="-")), "%Y-%m-%d")
to get the date column, but because there is no '19' in front of the values in the year column, the year in the date comes out wrong (0043 instead of 1943). These are all 19xx dates. What would be a good way to fix this?
Try
df$date <- as.Date(with(df, paste(1900+YR, MO, DA,sep="-")), "%Y-%m-%d")
You should use %y since you have a two-digit year.
df$date <- as.Date(with(df, paste(YR, MO, DA,sep="-")), "%y-%m-%d")
However, this doesn't solve your problem, since with %y any two-digit year less than 69 is prefixed with 20, so 43 becomes 2043.
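To illustrate the cutoff described above, here is a minimal sketch. The sample strings are made up for illustration, and the 68/69 pivot is the behaviour R's strptime documentation describes for input, which it notes may change in a future version.

```r
# Two-digit years parsed with %y: 00-68 are read as 20xx, 69-99 as 19xx
two_digit <- c("43-03-11", "68-12-31", "69-01-01")
parsed <- as.Date(two_digit, format = "%y-%m-%d")

# Show only the inferred four-digit year for each input
years <- format(parsed, "%Y")
print(years)
```

So a 1943 record parsed this way lands a full century too late, which is why the century has to be supplied explicitly.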
If you know that all your years are of the form 19XX, you can do
# %02d zero-pads single-digit years (3 -> "03"), so the year reads 1903 rather than 193
df$date <- as.Date(with(df, sprintf('19%02d-%d-%d', YR, MO, DA)))
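One detail worth sketching when the century is prepended with sprintf: a single-digit year needs zero-padding, otherwise the year string ends up only three digits long. The vector `yr` below is a made-up example, not the asker's data.

```r
# Made-up two-digit years, including a single-digit one
yr <- c(43L, 3L)

# Without padding, year 3 becomes "193" instead of "1903"
unpadded <- sprintf("19%d-%d-%d", yr, 3L, 11L)

# With %02d the year is always two digits wide before the "19" prefix
padded <- sprintf("19%02d-%d-%d", yr, 3L, 11L)

# An explicit format avoids relying on as.Date() guessing it from the first element
dates <- as.Date(padded, format = "%Y-%m-%d")
print(dates)
```

For data where every year is 10 or greater, as in the question, both format strings give the same result.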
If your years contain a mixture of 2-digit years from more than one century, then this code converts them all into valid dates in the past (no future dates).
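dates_y2Y <- function(y, m, d) {
  # Zero-pad to two characters so the string comparisons below order correctly
  y <- stringr::str_pad(y, width = 2, pad = "0")
  m <- stringr::str_pad(m, width = 2, pad = "0")
  d <- stringr::str_pad(d, width = 2, pad = "0")
  # Today's two-digit year, month and day
  toyear <- format(Sys.Date(), "%y")
  tomnth <- format(Sys.Date(), "%m")
  today  <- format(Sys.Date(), "%d")
  # Dates up to and including today are parsed with %y (20xx for years 00-68);
  # anything later is assigned to the 1900s.
  # ifelse() drops the Date class, hence the outer as.Date() with an origin.
  as.Date(
    ifelse(y < toyear | y == toyear & m < tomnth | y == toyear & m == tomnth & d <= today,
           as.Date(paste(y, m, d, sep = "-"), format = "%y-%m-%d"),
           as.Date(paste(paste0("19", y), m, d, sep = "-"), format = "%Y-%m-%d")),
    origin = "1970-01-01")
}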
df$date <- dates_y2Y(df$YR, df$MO, df$DA)
df
STA YR MO DA date
1 33013 23 1 31 1923-01-31
2 33013 43 2 30 <NA>
3 33013 63 5 5 1963-05-05
4 33013 83 7 27 1983-07-27
5 33013 3 12 9 2003-12-09
6 33013 20 4 21 2020-04-21
7 33013 20 4 22 1920-04-22
Note that this output depends on the day the code is run: rows 6 and 7 show it was produced on 2020-04-21, so 20-4-21 stays in 2020 while 20-4-22 falls back to 1920.
Data:
df <- structure(list(STA = c(33013L, 33013L, 33013L, 33013L, 33013L,
33013L, 33013L), YR = c(23L, 43L, 63L, 83L, 3L, 20L, 20L), MO = c(1L,
2L, 5L, 7L, 12L, 4L, 4L), DA = c(31L, 30L, 5L, 27L, 9L, 21L,
22L), date = structure(c(-17137, NA, -2433, 4955, 12395, 18373,
-18151), class = "Date")), row.names = c(NA, -7L), class = "data.frame")
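Because dates_y2Y reads Sys.Date() internally, its result changes with the day it is run. As a rough sketch of the same "no future dates" idea, the hypothetical helper below (`expand_two_digit_year` and its `ref` argument are my own names, not part of the answer) uses base R only and takes the reference date as a parameter, so results are reproducible.

```r
# Hypothetical helper: place each two-digit year in the reference date's
# century unless that would be after `ref`, in which case use the century before.
expand_two_digit_year <- function(y, m, d, ref = Sys.Date()) {
  ref_century <- as.integer(format(ref, "%Y")) %/% 100L * 100L
  fmt <- "%04d-%02d-%02d"
  # An explicit format makes impossible dates (e.g. Feb 30) come back as NA
  candidate <- as.Date(sprintf(fmt, ref_century + y, m, d), format = "%Y-%m-%d")
  previous  <- as.Date(sprintf(fmt, ref_century - 100L + y, m, d), format = "%Y-%m-%d")
  out <- previous
  keep <- !is.na(candidate) & candidate <= ref
  out[keep] <- candidate[keep]
  out
}

# Same values as the YR, MO and DA columns of the sample data
yr <- c(23L, 43L, 63L, 83L, 3L, 20L, 20L)
mo <- c(1L, 2L, 5L, 7L, 12L, 4L, 4L)
da <- c(31L, 30L, 5L, 27L, 9L, 21L, 22L)

res <- expand_two_digit_year(yr, mo, da, ref = as.Date("2020-04-21"))
print(res)
```

With `ref` fixed at 2020-04-21, this gives the same dates as the date column in the sample data, including the NA for the impossible February 30.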
Another solution:
library(dplyr)      # provides %>% and mutate()
library(lubridate)  # provides make_date()
df %>%
mutate(date = make_date(year = 1900 + YR, month = MO, day = DA))
Another option with sprintf
# df[2:4] selects the YR, MO and DA columns; %02d zero-pads single-digit years
df$date <- as.Date(do.call(sprintf, c(fmt = '19%02d-%d-%d', df[2:4])))
Or with unite
library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(YR = str_c('19', str_pad(YR, width = 2, pad = '0'))) %>%
unite(date, YR, MO, DA, sep="-", remove = FALSE) %>%
mutate(date = as.Date(date))