
Efficient way to recode multiple date values in R

I have a fairly large monthly database in which the dates are recorded poorly.

For instance, for January 2000, the value is "200001". So I have values ranging from "200001" to "200012". To make matters worse, each month is recorded in a different .csv file.

First, I loaded all the .csv files into a list called "tbl", so tbl[[1]] returns the values for January, for example. What I need is an efficient way to recode "20000i" to "2000-01-0i", where i goes from 1 to 12, and then convert those values to Date format.

What I've tried is:

for (i in length(tbl)) {
  if (i < 10) {
    tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("20000",i) = paste0("2000-01-0",i))) %>%  as.Date() -> tbl[[i]]$DATA
  } else {
    tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("2000",i) = paste0("2000-01-",i))) %>%  as.Date() -> tbl[[i]]$DATA
  }
}

This approach does not work and returns the following error: Error: unexpected '=' in " tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("2000",i) ="

Does anybody have a better idea?

EDIT: an example of my data

list(c("200001", "200001", "200001", "200001", "200001", "200001","200001", "200001", "200001", "200001", "200001", "200001"), 
c("200002", "200002", "200002", "200002", "200002", "200002", 
"200002", "200002", "200002", "200002", "200002", "200002"
), c("200003", "200003", "200003", "200003", "200003", "200003", 
"200003", "200003", "200003", "200003", "200003", "200003"
), c("200004", "200004", "200004", "200004", "200004", "200004", 
"200004", "200004", "200004", "200004", "200004", "200004"
), c("200005", "200005", "200005", "200005", "200005", "200005", 
"200005", "200005", "200005", "200005", "200005", "200005"
), c("200006", "200006", "200006", "200006", "200006", "200006", 
"200006", "200006", "200006", "200006", "200006", "200006"
), c("200007", "200007", "200007", "200007", "200007", "200007", 
"200007", "200007", "200007", "200007", "200007", "200007"
), c("200008", "200008", "200008", "200008", "200008", "200008", 
"200008", "200008", "200008", "200008", "200008", "200008"
), c("200009", "200009", "200009", "200009", "200009", "200009", 
"200009", "200009", "200009", "200009", "200009", "200009"
), c("200010", "200010", "200010", "200010", "200010", "200010", 
"200010", "200010", "200010", "200010", "200010", "200010"
), c("200011", "200011", "200011", "200011", "200011", "200011", 
"200011", "200011", "200011", "200011", "200011", "200011"
), c("200012", "200012", "200012", "200012", "200012", "200012", 
"200012", "200012", "200012", "200012", "200012", "200012"
))

To convert your input into a Date object, you need to append a day to the year-month value and then parse it with the matching format:

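for (i in seq_along(tbl)) {
   # Append "01" so each "YYYYMM" value becomes a complete "YYYYMMDD" string, then parse it
   tbl[[i]]$DATA <- as.Date(paste0(tbl[[i]]$DATA, "01"), format = "%Y%m%d")
}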

This sets every value to the first day of its month. For just a dozen items, a for loop is quick enough.
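
If you prefer to avoid the explicit loop, the same conversion can be written with lapply. The following is only a minimal sketch: it assumes each element of tbl is a data frame with a DATA column holding "YYYYMM" strings (the sample list is rebuilt here for illustration), and to_month_start is a hypothetical helper name, not something from the question.

```r
# Illustrative data: one data frame per month, each with a DATA column of
# "YYYYMM" strings. This structure is an assumption based on the question.
tbl <- lapply(sprintf("2000%02d", 1:12), function(ym) {
  data.frame(DATA = rep(ym, 3), stringsAsFactors = FALSE)
})

# Hypothetical helper: append "01" and parse the result as a Date,
# giving the first day of each month.
to_month_start <- function(x) {
  as.Date(paste0(x, "01"), format = "%Y%m%d")
}

# Apply the conversion to every data frame and return the modified copies.
tbl <- lapply(tbl, function(df) {
  df$DATA <- to_month_start(df$DATA)
  df
})

print(tbl[[2]]$DATA[1])
```

As for the original attempt: the error occurs because R does not accept an expression such as paste0(...) on the left-hand side of = inside c(). Names that are computed at run time have to be attached another way, for example with setNames(). Note also that for (i in length(tbl)) visits only the single value length(tbl), so that loop would have processed just the last file.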

