
Adding the Century to 2-Digit Year

I currently have a df that looks like this:

        STA YR MO DA MAX       date
58716 33013 43  3 11  60 0043-03-11
58717 33013 43  3 12  55 0043-03-12
58718 33013 43  3 13  63 0043-03-13
58719 33013 43  3 14  50 0043-03-14
58720 33013 43  3 15  58 0043-03-15
58721 33013 43  3 16  63 0043-03-16

As you can see, I ran df$date <- as.Date(with(df, paste(YR, MO, DA,sep="-")), "%Y-%m-%d") to get the date column, but because there is no '19' in front of the values in the year column, the year in the date comes out wrong (0043 instead of 1943). These are all 19xx dates. What would be a good way to fix this?

Try:

df$date <- as.Date(with(df, paste(1900+YR, MO, DA,sep="-")), "%Y-%m-%d")

You should use %y since you have a two-digit year.

df$date <- as.Date(with(df, paste(YR, MO, DA,sep="-")), "%y-%m-%d")

However, this doesn't solve your problem: with %y, two-digit years below 69 are prefixed with 20, so 43 becomes 2043.
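The cutoff described above can be checked directly. The following is a minimal sketch using made-up input strings (not the asker's data); it only relies on base R's as.Date() and format().

```r
# Illustrative two-digit-year strings (hypothetical values, not from the question)
two_digit <- c("43-03-11", "68-12-31", "69-01-01")

# %y reads the year without its century
parsed <- as.Date(two_digit, format = "%y-%m-%d")

# Extract the four-digit year that R inferred
years <- format(parsed, "%Y")
print(years)
# "2043" "2068" "1969"
```

So 00 to 68 land in the 2000s and 69 to 99 land in the 1900s, which is why 43 does not come out as 1943.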

If you know that all your years are of the form 19XX, you can do:

df$date <- as.Date(with(df, sprintf('19%02d-%d-%d', YR, MO, DA)))

If your data contain a mixture of 2-digit years from more than one century, this code converts them all into valid dates in the past (no future dates).

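dates_y2Y <- function(y,m,d) {
  # Zero-pad to two characters so the string comparisons below sort correctly
  # (requires the stringr package to be installed)
  y <- stringr::str_pad(y, width=2, pad="0")
  m <- stringr::str_pad(m, width=2, pad="0")
  d <- stringr::str_pad(d, width=2, pad="0")

  # Today's two-digit year, month and day as zero-padded strings
  toyear <- format(Sys.Date(), "%y")
  tomnth <- format(Sys.Date(), "%m")
  today  <- format(Sys.Date(), "%d")

  # ifelse() drops the Date class, so convert the numeric result back to Date
  as.Date(
    ifelse(y<toyear | y==toyear & m<tomnth | y==toyear & m==tomnth & d<=today,
           # on or before today: let %y pick the century (20xx for 00-68)
           as.Date(paste(y,m,d,sep="-"), format="%y-%m-%d"),
           # otherwise 20xx would be in the future, so use 19xx
           as.Date(paste(paste0("19",y),m,d,sep="-"), format="%Y-%m-%d"))
    , origin="1970-01-01")
}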

df$date <- dates_y2Y(df$YR, df$MO, df$DA)
df

Output when run on 2020-04-21 (row 2 is NA because February 30 is not a valid date):

    STA YR MO DA       date
1 33013 23  1 31 1923-01-31
2 33013 43  2 30       <NA>
3 33013 63  5  5 1963-05-05
4 33013 83  7 27 1983-07-27
5 33013  3 12  9 2003-12-09
6 33013 20  4 21 2020-04-21
7 33013 20  4 22 1920-04-22

Data:

df <- structure(list(STA = c(33013L, 33013L, 33013L, 33013L, 33013L, 
33013L, 33013L), YR = c(23L, 43L, 63L, 83L, 3L, 20L, 20L), MO = c(1L, 
2L, 5L, 7L, 12L, 4L, 4L), DA = c(31L, 30L, 5L, 27L, 9L, 21L, 
22L), date = structure(c(-17137, NA, -2433, 4955, 12395, 18373, 
-18151), class = "Date")), row.names = c(NA, -7L), class = "data.frame")
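
As a rough alternative sketch, the same "no future dates" idea can be written in base R without stringr: build both the 19xx and the 20xx candidate for each row and keep the 20xx one only when it is not later than today. The function name fix_century and its today argument are hypothetical, introduced here for illustration; they are not part of the answer above. It also assumes the data only span the 1900s and 2000s.

```r
# Hypothetical helper (not from the original answer): choose the century so
# that no date ends up in the future. `today` is an argument so that the
# result can be reproduced for a fixed reference date.
fix_century <- function(yr, mo, da, today = Sys.Date()) {
  # Candidate dates in each century; an explicit format makes invalid
  # dates (e.g. February 30) come back as NA instead of raising an error
  recent <- as.Date(sprintf("20%02d-%02d-%02d", yr, mo, da), format = "%Y-%m-%d")
  older  <- as.Date(sprintf("19%02d-%02d-%02d", yr, mo, da), format = "%Y-%m-%d")

  # Default to 19xx, then switch to 20xx where that date is not in the future
  out <- older
  use_recent <- !is.na(recent) & recent <= today
  out[use_recent] <- recent[use_recent]
  out
}

# Usage with a few rows of the sample data, fixing "today" at 2020-04-21
res <- fix_century(c(23L, 43L, 3L, 20L, 20L),
                   c(1L, 2L, 12L, 4L, 4L),
                   c(31L, 30L, 9L, 21L, 22L),
                   today = as.Date("2020-04-21"))
print(res)
# "1923-01-31" NA "2003-12-09" "2020-04-21" "1920-04-22"
```

With the default today = Sys.Date(), the last row would resolve to 2020-04-22 on any later run date, so the result depends on when the code is executed, just as with dates_y2Y.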

Another solution:

library(dplyr)      # provides %>% and mutate()
library(lubridate)  # provides make_date()
df %>% 
  mutate(date = make_date(year = 1900 + YR, month = MO, day = DA))

Another option with sprintf:

df$date <- as.Date(do.call(sprintf, c(fmt = '19%02d-%d-%d', df[2:4])))

Or with unite:

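library(dplyr)
library(tidyr)
library(stringr)
df %>%
  # zero-pad YR first so a single-digit year such as 3 becomes 1903, not 193
  mutate(YR = str_c('19', str_pad(YR, width = 2, pad = '0'))) %>%
  unite(date, YR, MO, DA, sep="-", remove = FALSE) %>%
  mutate(date = as.Date(date))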

