R: how to replace/gsub a vector of values by another vector of values in a data.table

I have data with dates in a format that is not directly usable. The series are annual, quarterly or monthly. Annual dates are stored correctly, quarterly ones are in the form 1Q2010, and monthly ones in the form JAN2010. So something like

library(tidyverse)
library(data.table)

MWE <- data.table(date=c("JAN2020","FEB2020","1Q2020","2020"),
                  value=rnorm(4,2,1))

> MWE
      date     value
1: JAN2020 2.5886057
2: FEB2020 0.5913031
3:  1Q2020 1.6237973
4:    2020 1.4093762

I want to have them in a standard format. I think a reasonably readable way to do that is to replace the non-standard elements, that is, to have these elements:

Date_Brute <- c("JAN","FEB","MAR","APR","MAY","JUN","JUL","AUG","SEP","OCT","NOV","DEC","1Q","2Q","3Q","4Q")

replaced by these ones:

Date_Standardisee <- c("01-01","01-02","01-03","01-04","01-05","01-06","01-07", "01-08","01-09","01-10","01-11","01-12","01-01","01-04","01-07","01-10")

As far as I can tell, gsub does not accept a vector of patterns and a vector of replacements. I have found this answer that suggests using stringr::str_replace_all, but I have not been able to make it work in a data.table.

I am open to other functions that replace one vector of values by another, but I would like to avoid, for instance, slicing the data and using specific date-parsing functions.

Desired output:

> MWE
         date     value
1: 01-01-2020 2.5886057
2: 01-02-2020 0.5913031
3: 01-01-2020 1.6237973
4:       2020 1.4093762

We can use grep to select the quarterly and monthly rows, convert those 'date' elements to Date class with as.yearqtr and as.yearmon, and then format them as specified. Rows that contain only a year are left untouched.

library(zoo)
library(data.table)
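# Quarterly rows first: "1Q2020" -> first day of the quarter.
# They must be handled before the next step, because "[A-Z]" would also match the "Q".
MWE[grep('Q', date), date := format(as.Date(as.yearqtr(date, 
             '%qQ %Y')), '%d-%m-%Y')]
# Remaining rows that still contain letters are monthly: "JAN2020" -> first day of the month.
MWE[grep("[A-Z]", date), date := format(as.Date(as.yearmon(date)), '%d-%m-%Y')]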

-output

MWE
#         date     value
#1: 01-01-2020 0.8931051
#2: 01-02-2020 2.9813625
#3: 01-01-2020 1.1918638
#4:       2020 2.8001267
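
For the literal vector-for-vector replacement asked about in the question, a minimal base R sketch could look like the following. It rests on two assumptions that are not in the original post: the helper `replace_all_fixed` is a hypothetical name introduced here, and a trailing hyphen is appended to each replacement so that "JAN2020" becomes "01-01-2020" rather than "01-012020". The `value` column uses fixed numbers instead of rnorm so the result is reproducible.

```r
library(data.table)

MWE <- data.table(date  = c("JAN2020", "FEB2020", "1Q2020", "2020"),
                  value = c(2.5, 0.6, 1.6, 1.4))  # fixed values, illustration only

Date_Brute <- c("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
                "1Q", "2Q", "3Q", "4Q")

# Assumption: a trailing "-" is added so the year is separated from the month.
Date_Standardisee <- paste0(c("01-01", "01-02", "01-03", "01-04", "01-05", "01-06",
                              "01-07", "01-08", "01-09", "01-10", "01-11", "01-12",
                              "01-01", "01-04", "01-07", "01-10"), "-")

# Hypothetical helper: apply each pattern/replacement pair in turn.
# sub() is vectorised over x but takes a single pattern, hence the loop.
replace_all_fixed <- function(x, patterns, replacements) {
  stopifnot(length(patterns) == length(replacements))
  for (i in seq_along(patterns)) {
    x <- sub(patterns[i], replacements[i], x, fixed = TRUE)
  }
  x
}

MWE[, date := replace_all_fixed(date, Date_Brute, Date_Standardisee)]
MWE$date
# "01-01-2020" "01-02-2020" "01-01-2020" "2020"
```

The sequential loop is safe here because every pattern contains a letter while the replacements contain only digits and hyphens, so an earlier replacement can never be matched by a later pattern. stringr::str_replace_all also accepts a named vector of the form c(pattern = replacement), so `setNames(Date_Standardisee, Date_Brute)` should be usable in the same `date := ...` position.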

Another option is fcoalesce with myd from lubridate:

library(lubridate)
MWE[, date := fcoalesce(format(myd(date, truncated = 2), '%d-%m-%Y'), date)]
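
The fallback in that line relies on data.table::fcoalesce, which returns, element by element, the first non-missing value among its arguments. A small sketch with made-up vectors (the names `parsed` and `original` are illustrative stand-ins, not taken from the answer) shows the behaviour in isolation:

```r
library(data.table)

# Stand-in for the formatted parse result: NA where parsing failed.
parsed   <- c("01-01-2020", NA, "01-02-2020", NA)
# Stand-in for the untouched input strings.
original <- c("JAN2020", "2020", "FEB2020", "2019")

# Keep the parsed value where available, otherwise fall back to the original.
result <- fcoalesce(parsed, original)
result
# "01-01-2020" "2020" "01-02-2020" "2019"
```

Both arguments have to be of the same type, which is why the parsed dates are turned into character with format before being coalesced with the original strings.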

You can try lubridate::parse_date_time(), which takes a vector of candidate formats to attempt in the conversion:

library(lubridate)
library(data.table)

MWE[, date := parse_date_time(date, orders = c("bY","qY", "Y"))]

         date      value
1: 2020-01-01 -0.4948354
2: 2020-02-01  1.0227036
3: 2020-01-01  2.6285688
4: 2020-01-01  1.9158595
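
The output above is a date-time column, and the year-only row has become 2020-01-01, whereas the question asks for dd-mm-yyyy strings with bare years left alone. A possible post-processing step in base R, sketched under the assumption that the parsed values are the ones printed above (the vector `parsed` below is typed in by hand as a stand-in for the parse_date_time() result):

```r
original <- c("JAN2020", "FEB2020", "1Q2020", "2020")

# Stand-in for the parsed column shown above.
parsed <- as.POSIXct(c("2020-01-01", "2020-02-01", "2020-01-01", "2020-01-01"),
                     tz = "UTC")

# Rows that consist of a four-digit year only keep their original text.
year_only <- grepl("^[0-9]{4}$", original)

# Everything else is rendered as day-month-year.
result <- ifelse(year_only, original, format(parsed, "%d-%m-%Y"))
result
# "01-01-2020" "01-02-2020" "01-01-2020" "2020"
```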
