
Find four-digit numbers and convert them to calendar dates in R

I have a dataframe column that contains a mixture of date formats, for example 30/06/2020, 07/2020 and 2020. I would like to convert the four-digit numbers into full dates (e.g. 2020 -> XX/XX/2020). The column contains several different years, not just 2020, so I would prefer a generic expression if possible.

A supplementary question: when I read the data from an Excel file, some dates come through as five-digit numbers. From what I have read, these numbers are the days elapsed since 1900. So the actual column contains five-digit serial numbers, four-digit numbers that represent a year, and the other date formats. I have dealt with the issue, but not in an optimal way. Is there a generic way to handle all of these formats together? Sorry for the long post.

K

Thank you all for your ideas. You are right, I need to be more specific next time. To be honest, I was focused on solving the problem, and I believe I have done so.

Regarding the data, a simple illustration might be the following:

date
08/2003
12/06/2002
38054
2004
...
...
...

First, I found which elements of the dataframe column (RHO_DataBase$date) are expressed as a year (e.g. 2003) and converted them to a date (e.g. 15/05/2003):

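# Step 1: year-only entries (e.g. "2003") become "15/05/<year>"
# as.numeric() returns NA (with a warning) for entries such as "08/2003"
num1 <- suppressWarnings(as.numeric(RHO_DataBase$date))
counter1 <- which(!is.na(num1) & num1 < 2030)
RHO_DataBase$date[counter1] <- paste0("15/05/", RHO_DataBase$date[counter1])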
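The `< 2030` threshold works for this data, but a pattern match states the intent ("exactly four digits") more directly and does not depend on a cut-off year. A minimal sketch, assuming the column is a character vector; the `dates` vector is made-up sample data standing in for `RHO_DataBase$date`, and 15/05 is the same arbitrary placeholder as in Step 1:

```r
# Hypothetical sample data standing in for RHO_DataBase$date
dates <- c("08/2003", "12/06/2002", "38054", "2004")

# TRUE only for entries that consist of exactly four digits
is_year <- grepl("^[0-9]{4}$", dates)

# Prefix the placeholder day and month to the year-only entries
dates[is_year] <- paste0("15/05/", dates[is_year])
dates
```

Only "2004" is modified here; the five-digit serial "38054" does not match the pattern and is left for the next step.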

Then, I found which elements are expressed as numeric values (days since 30/12/1899) and converted them to day/month/year format:

# Step 2: the remaining numeric entries are Excel serial numbers (days since 1899-12-30)
num2 <- suppressWarnings(as.numeric(RHO_DataBase$date))
counter2 <- which(!is.na(num2))
RHO_DataBase$date[counter2] <- format(as.Date(num2[counter2], origin = "1899-12-30"), "%d/%m/%Y")
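As a quick sanity check of the origin, the serial number from the sample data can be converted on its own. Excel counts 1 January 1900 as day 1 and also treats 1900 as a leap year, which is why `1899-12-30` is the origin commonly used in R for serials from the Windows 1900 date system. The variable names below are illustrative only:

```r
# Serial number taken from the sample data above
serial <- 38054

# Interpret it as days since the Excel (1900 date system) origin
converted <- as.Date(serial, origin = "1899-12-30")

format(converted, "%d/%m/%Y")  # "08/03/2004"
```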

Finally, I found the elements of the column that are still in the remaining format, in this case only month/year, and changed them to day/month/year using paste:

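# Step 3: entries that still fail to parse as day/month/year are month/year
# (e.g. "08/2003"), so prefix a placeholder day; genuine NAs are left untouched
counter3 <- which(is.na(as.Date(RHO_DataBase$date, format = "%d/%m/%Y")) & !is.na(RHO_DataBase$date))
RHO_DataBase$date[counter3] <- paste0("01/", RHO_DataBase$date[counter3])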
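The three steps depend on being run in order, because each one relies on the previous one having already rewritten some entries. One possible alternative is to classify every entry by its pattern first and convert each group independently, returning a proper `Date` vector. This is only a sketch under assumptions: `normalise_dates` is a hypothetical helper name, the placeholder prefixes are the same arbitrary choices as above, serials are assumed to have exactly five digits, and anything that matches none of the four patterns becomes `NA`.

```r
# Hypothetical helper; name and default placeholders are illustrative
normalise_dates <- function(x, year_prefix = "15/05/", month_prefix = "01/") {
  x <- trimws(as.character(x))
  out <- rep(as.Date(NA), length(x))

  is_year   <- grepl("^[0-9]{4}$", x)                       # e.g. "2004"
  is_serial <- grepl("^[0-9]{5}$", x)                       # e.g. "38054"
  is_my     <- grepl("^[0-9]{1,2}/[0-9]{4}$", x)            # e.g. "08/2003"
  is_dmy    <- grepl("^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$", x) # e.g. "12/06/2002"

  out[is_year]   <- as.Date(paste0(year_prefix, x[is_year]), format = "%d/%m/%Y")
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  out[is_my]     <- as.Date(paste0(month_prefix, x[is_my]), format = "%d/%m/%Y")
  out[is_dmy]    <- as.Date(x[is_dmy], format = "%d/%m/%Y")
  out
}

# Made-up sample data mirroring the illustration above, plus a missing value
dates <- c("08/2003", "12/06/2002", "38054", "2004", NA)
result <- normalise_dates(dates)
format(result, "%d/%m/%Y")
```

Applied to the real column this would be something like `RHO_DataBase$date <- normalise_dates(RHO_DataBase$date)`; wrap the result in `format(..., "%d/%m/%Y")` if character strings are needed rather than `Date` values.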

Cheers, K

