
Extracting year from two different date formats

I have a column, say x, which contains two different date formats: 12/31/1998 and 12/--/98. As you can see, in the second format the day is missing and the year is given in 2 digits.

I need to extract the year from all the dates in my column. When I use Year<- data.frame(format(df$x, "%Y")), it returns the year for the first format. For the second format, it returns NA.

I would appreciate any help. Thanks.

You could get a bit creative and specify an ugly format for the entries with the missing day, and then just keep whichever of the two parses is valid:

vals <- c("12/31/1998", "12/--/98")
out <- pmax(
         as.Date(vals, "%m/%d/%Y"),
         as.Date(paste0("01",vals), "%d%m/--/%y"),
         na.rm=TRUE
       )
format(out, "%Y")
#[1] "1998" "1998"
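
To see why pmax with na.rm=TRUE picks the right value, it may help to look at the two parses separately. This is only a sketch of the intermediate step; the names full and partial are illustrative and not part of the answer above. Each input should match exactly one of the two formats, so each position has one valid date and one NA.

```r
vals <- c("12/31/1998", "12/--/98")

# Complete dates parse; "--" cannot match %d, so the second entry becomes NA
full <- as.Date(vals, "%m/%d/%Y")

# Prepend a dummy day "01" and treat "/--/" as literal text in the format,
# so only the entries with the missing day parse here
partial <- as.Date(paste0("01", vals), "%d%m/--/%y")

is.na(full)
is.na(partial)
```

Note that the day in partial is the dummy value 1, which is harmless here because only the year is kept.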

If they are all in a format where the year is the last number after a "/", you can use basename. Then you just need to convert the 2-digit years to 4-digit years:

vals <- c("12/31/1998", "12/--/98", "68", "69")
yrs <- basename(vals)
yrs <- ifelse(nchar(yrs) == 2, format(as.Date(yrs, format = "%y"), "%Y"), yrs)
yrs
# [1] "1998" "1998" "2068" "1969"

The catch is that this does not work for two-digit years before 1969: R's %y maps 00 to 68 onto 20xx and 69 to 99 onto 19xx, so "68" becomes 2068 rather than 1968.
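
If the column does contain two-digit years from before 1969, one possible workaround is to skip as.Date and expand the year by hand with an explicit cutoff. The following is a minimal sketch under that assumption; extract_year and its pivot argument are hypothetical names introduced here, not an existing R function, and the right cutoff depends on your data.

```r
# Hypothetical helper: extract_year() and `pivot` are illustrative names
extract_year <- function(x, pivot = 68L) {
  # Keep whatever follows the last "/" (or the whole string if there is none)
  s <- sub(".*/", "", x)
  yr <- as.integer(s)
  # Expand two-digit years: values <= pivot become 20xx, the rest 19xx
  two_digit <- nchar(s) == 2 & !is.na(yr)
  yr[two_digit] <- yr[two_digit] + ifelse(yr[two_digit] <= pivot, 2000L, 1900L)
  yr
}

vals <- c("12/31/1998", "12/--/98", "68", "69")
extract_year(vals)               # default pivot mirrors the %y rule
extract_year(vals, pivot = 24L)  # treats 25 to 99 as 19xx
```

With the default pivot the result matches the basename approach; lowering the pivot moves more two-digit years into the 1900s.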
