R: data.table using a for loop to wrangle multiple columns

I am currently working in R to build a for loop which will add the year to 7 columns that contain partial dates (dd/mm). I have been attempting to run the following for-loop and have not been successful. What am I doing wrong?

Here's a sample of what my data set looks like (The actual data set includes columns HomDate - HomDate_7 but I only included the first few as I know you'll get the point...)

    Participant  DateVisit  HomDate  HomDate_2  HomeDate_3  year_flag
    1            2012-04-25 18/04    19/04      20/04       NA
    2            2012-01-04 28/12    29/12      30/12       1
    3            2012-01-05 31/12    01/01      01/02       1
    4            2012-06-13 06/06    07/06      08/06       NA
    5            2012-02-12 05/02    06/02      07/02       NA

Here's the code I've been trying to use:

    hom_date <- list("HomDate", "HomDate_2", "HomDate_3", "HomDate_4", "HomDate_5", "HomDate_6",
                     "HomDate_7")
    set_dates <- function(x){
      home_morbid[,x:=as.character(x)]
      home_morbid[(substr(x, 4, 5)==12) & (year_flag==1), x:=paste(x, "/2011", sep="")]
      home_morbid[(substr(x, 4, 5)==01) & (year_flag==1), x:=paste(x, "/2012", sep="")]
      home_morbid[is.na(year_flag), x:=paste(x, "/", substr(DateVisit, 1, 4), sep="")]
    }

    for(i in 1:length(hom_date)){
      x <- hom_date[i]
      home_morbid_2<-set_dates(x)
    }

I'm not sure what should happen to the rows with an NA flag, so this first version leaves them unchanged. Here is an approach:

    library(data.table)
    # Positions of the columns whose names start with "Hom"
    to_replace <- grep("^Hom", names(df))
    # NA-flagged rows stay as they are; flagged rows get 2011 for December, 2012 otherwise
    df[, (to_replace) := lapply(.SD, function(x)
           ifelse(is.na(year_flag), x,
                  ifelse(substr(x, 4, 5) == "12",
                         paste0(x, "/", "2011"),
                         paste0(x, "/", "2012")))),
       .SDcols = to_replace][]
       Participant  DateVisit    HomDate  HomDate_2 HomeDate_3 year_flag
    1:           1 2012-04-25      18/04      19/04      20/04        NA
    2:           2 2012-01-04 28/12/2011 29/12/2011 30/12/2011         1
    3:           3 2012-01-05 31/12/2011 01/01/2012 01/02/2012         1
    4:           4 2012-06-13      06/06      07/06      08/06        NA
    5:           5 2012-02-12      05/02      06/02      07/02        NA

To give the NA-flagged rows the year from DateVisit instead (run on the original data, not on the result above):

    library(lubridate)
    to_replace <- grep("^Hom", names(df))
    # NA-flagged rows take the year from DateVisit; flagged rows get 2011 or 2012 as before
    df[, (to_replace) := lapply(.SD, function(x)
           ifelse(is.na(year_flag),
                  paste0(x, "/", year(ymd(DateVisit))),
                  ifelse(substr(x, 4, 5) == "12",
                         paste0(x, "/", "2011"),
                         paste0(x, "/", "2012")))),
       .SDcols = to_replace][]
       Participant  DateVisit    HomDate  HomDate_2 HomeDate_3 year_flag
    1:           1 2012-04-25 18/04/2012 19/04/2012 20/04/2012        NA
    2:           2 2012-01-04 28/12/2011 29/12/2011 30/12/2011         1
    3:           3 2012-01-05 31/12/2011 01/01/2012 01/02/2012         1
    4:           4 2012-06-13 06/06/2012 07/06/2012 08/06/2012        NA
    5:           5 2012-02-12 05/02/2012 06/02/2012 07/02/2012        NA
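
As for what goes wrong in the original loop: inside `set_dates`, `x` holds a column *name*, not the column. So `x := ...` writes to a column literally called `x`, `substr(x, 4, 5)` slices the name itself, and `== 01` compares against `"1"`, which never equals `"01"`. If you would rather keep an explicit for loop, here is a minimal sketch using `set()`. The names `hom_cols`, `visit_year`, `dm` and `yr` are just illustrative, the data is the sample from the question, and it assumes (as in the answer above) that every flagged date outside December belongs to 2012.

```r
library(data.table)

# Sample data copied from the question (three of the seven date columns).
home_morbid <- data.table(
  Participant = 1:5,
  DateVisit   = c("2012-04-25", "2012-01-04", "2012-01-05", "2012-06-13", "2012-02-12"),
  HomDate     = c("18/04", "28/12", "31/12", "06/06", "05/02"),
  HomDate_2   = c("19/04", "29/12", "01/01", "07/06", "06/02"),
  HomeDate_3  = c("20/04", "30/12", "01/02", "08/06", "07/02"),
  year_flag   = c(NA, 1, 1, NA, NA)
)

# Column names as a character vector (same pattern as in the answer above).
hom_cols <- grep("^Hom", names(home_morbid), value = TRUE)

# Year of the visit, used for the rows without a flag.
visit_year <- substr(as.character(home_morbid$DateVisit), 1, 4)

for (col in hom_cols) {
  # [[col]] fetches the column's values; a bare `col` would only be its name.
  dm <- as.character(home_morbid[[col]])
  # Unflagged rows: visit year. Flagged rows: 2011 for December, otherwise 2012.
  yr <- ifelse(is.na(home_morbid$year_flag), visit_year,
               ifelse(substr(dm, 4, 5) == "12", "2011", "2012"))
  # set() updates the column named by `col` by reference.
  set(home_morbid, j = col, value = paste0(dm, "/", yr))
}

print(home_morbid)
```

Because `set()` modifies `home_morbid` in place, there is no need to assign the result to a new object such as `home_morbid_2`.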
