Find four-digit numbers and convert them to calendar dates in R

I have a data frame column that contains a mixture of date formats, for example 30/06/2020, 07/2020 and 2020. I would like to convert the four-digit numbers into a date (e.g. 2020 -> XX/XX/2020). I have different years, not just 2020, so I would prefer a generic expression if possible.

A supplementary question: when I read the data from an Excel file, I get five-digit numbers instead of dates. From what I have read, these numbers are the days elapsed since 1900. Hence the actual column contains five-digit numbers, four-digit numbers that represent a year, and the other date formats. I have dealt with that issue, but not in an optimal way. Is there a generic way to handle all these formats together? Sorry for the long post.

K

Thank you all for your ideas. You are right, I need to be more specific next time. To be honest, I focused on solving the problem, and I believe I have managed it.

Regarding the data, a simple illustration might be the following:

date
08/2003
12/06/2002
38054
2004
...
...
...

First, I found which elements of the data frame column (RHO_DataBase$date) are expressed as a year (e.g. 2003) and converted them to a date (e.g. 15/05/2003):

# Step 1: turn bare years (e.g. "2004") into full dates ("15/05/2004").
# Non-numeric strings become NA under as.numeric() (with a coercion warning) and are skipped.
# The < 2030 cutoff separates years from the five-digit Excel serial numbers.
counter1 <- which(!is.na(as.numeric(RHO_DataBase$date)) & as.numeric(RHO_DataBase$date) < 2030)
for (i in counter1) {
  RHO_DataBase$date[i] <- paste0("15/05/", RHO_DataBase$date[i])
}
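
As a possible alternative to the numeric cutoff, the year-only entries can be picked out with a regular expression that matches exactly four digits, which also avoids the coercion warnings. This is only a sketch: the vector `dates` is hypothetical sample data copied from the illustration above, and 15/05 is the same arbitrary day and month used in Step 1.

```r
# Hypothetical sample data mirroring the illustration above
dates <- c("08/2003", "12/06/2002", "38054", "2004")

# TRUE only for strings that consist of exactly four digits
is_year <- grepl("^[0-9]{4}$", dates)

# Prefix an arbitrary day and month to the bare years, all at once
dates[is_year] <- paste0("15/05/", dates[is_year])

print(dates)
```

Because the pattern is anchored with `^` and `$`, the five-digit Excel numbers and the month/year strings are left untouched.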

Then, I found which elements are expressed as numeric values (days since 30/12/1899) and converted them to day/month/year format:

# Step 2: the remaining numeric strings are Excel serial dates.
# Step 1 already turned the years into "15/05/YYYY", which is no longer numeric.
counter2 <- which(!is.na(as.numeric(RHO_DataBase$date)))
for (i in counter2) {
  RHO_DataBase$date[i] <- format(as.Date(as.numeric(RHO_DataBase$date[i]), origin = "1899-12-30"), "%d/%m/%Y")
}
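
To check the conversion on a single value, here is a small sketch using the serial number 38054 from the illustration. The origin 1899-12-30 (rather than 1900-01-01) compensates for Excel's 1900 date system, which numbers 1 January 1900 as day 1 and also counts a nonexistent 29 February 1900. The variable names are hypothetical.

```r
# Excel serial number taken from the sample data
serial <- 38054

# Days are counted from 1899-12-30 for files using Excel's 1900 date system
as_date <- as.Date(serial, origin = "1899-12-30")

# Render in the same day/month/year format as the rest of the column
formatted <- format(as_date, "%d/%m/%Y")
print(formatted)  # "08/03/2004"
```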

Finally, I found the elements of the column that are in the remaining format, in this case only month/year, and changed them to day/month/year using paste:

# Step 3: whatever still fails to parse as day/month/year is month/year; prefix day 01.
# Note: genuinely missing values would also be caught here and become "01/NA".
counter3 <- which(is.na(as.Date(RHO_DataBase$date, "%d/%m/%Y")))
for (i in counter3) {
  RHO_DataBase$date[i] <- paste0("01/", RHO_DataBase$date[i])
}
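
The three steps could also be folded into one vectorised helper that classifies each string by its shape and returns a proper Date vector. This is a minimal sketch, not a definitive solution: the function name `parse_mixed_dates` and its arguments are hypothetical, it assumes only the four formats shown in the illustration occur, and it keeps the arbitrary 15/05 default for year-only entries.

```r
# Hypothetical helper: normalise mixed date strings to Date values
parse_mixed_dates <- function(x, default_day_month = "15/05") {
  x <- trimws(as.character(x))
  out <- rep(as.Date(NA), length(x))

  # Classify each entry by its shape
  is_year   <- grepl("^[0-9]{4}$", x)                         # e.g. "2004"
  is_serial <- grepl("^[0-9]{5}$", x)                         # e.g. "38054"
  is_my     <- grepl("^[0-9]{1,2}/[0-9]{4}$", x)              # e.g. "08/2003"
  is_dmy    <- grepl("^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$", x)   # e.g. "12/06/2002"

  out[is_year]   <- as.Date(sprintf("%s/%s", default_day_month, x[is_year]),
                            format = "%d/%m/%Y")
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  out[is_my]     <- as.Date(sprintf("01/%s", x[is_my]), format = "%d/%m/%Y")
  out[is_dmy]    <- as.Date(x[is_dmy], format = "%d/%m/%Y")

  out  # entries matching none of the patterns stay NA
}

res <- parse_mixed_dates(c("08/2003", "12/06/2002", "38054", "2004", "n/a"))
print(format(res, "%d/%m/%Y"))
```

With a data frame like the one above, this would be applied as `RHO_DataBase$date <- parse_mixed_dates(RHO_DataBase$date)`. Anything that matches none of the patterns is left as NA instead of being silently prefixed.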

Cheers, K
