
R - data.table and loops

I have a dataset named "dt" that is in data.table format. I am attempting to create and append multiple variables to this dataset, based on existing variables in the same dataset, as follows:

for (i in 1:17){
dt[, list(tmp_var[i] = Dose[i] * Freq[i] * (NA^!grepl("^12345",DRUG[i])))]
}

In words: wherever the value in DRUG[i] starts with 12345, assign the product of the two columns that follow it (Dose[i] and Freq[i], same row) to a new variable, tmp_var[i].

This doesn't seem to work and produces an error.

Can someone spot the problem or suggest an alternative method?

Thank you.

One option is to use get, assuming that Dose, Freq, Drug, etc. are separate character vectors that hold the corresponding column names. For a three-column case:

Dose <- paste0("Dose", 1:3)
Freq <- paste0("Freq", 1:3)
Drug <- paste0("Drug", 1:3)
tmp_var <- paste0("New_Var", 1:3)

 for(i in 1:3){
 dt[, (tmp_var[i]):= get(Dose[i]) * get(Freq[i])
                          *(NA^!grepl("^12345",get(Drug[i])))]
 }

However, I would use dt[[Dose[i]]] instead of get(Dose[i]).

 dt
 #        Drug1 Dose1 Freq1      Drug2 Dose2 Freq2      Drug3 Dose3 Freq3
 #1: 1234567890     2     1 1548768954    23   2.0 2222132435     2     2
 #2: 4356678344     2     2 6547894356     3   1.0 2123456789     2     2
 #3: 5673452976     4     1 1234567890     4   0.5 4568789076    33     4
 #  New_Var1 New_Var2 New_Var3
 #1:        2       NA       NA
 #2:       NA       NA       NA
 #3:       NA        2       NA
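As a rough sketch of the dt[[Dose[i]]] variant mentioned above, the loop below extracts each column with [[ ]] rather than get. It uses the first two column groups from the data section at the end of this answer. The helper variable hit is introduced here purely for illustration and is not part of the original code.

```r
library(data.table)

# First two column groups of the example data
dt <- data.table(
  Drug1 = c(1234567890, 4356678344, 5673452976),
  Dose1 = c(2L, 2L, 4L), Freq1 = c(1L, 2L, 1L),
  Drug2 = c(1548768954, 6547894356, 1234567890),
  Dose2 = c(23L, 3L, 4L), Freq2 = c(2, 1, 0.5)
)

# Character vectors holding the column names for each group
Dose <- paste0("Dose", 1:2)
Freq <- paste0("Freq", 1:2)
Drug <- paste0("Drug", 1:2)
tmp_var <- paste0("New_Var", 1:2)

for (i in seq_along(tmp_var)) {
  # dt[[name]] returns the column as a plain vector, so no get() is needed
  hit <- grepl("^12345", dt[[Drug[i]]])
  # Product where the drug code starts with 12345, NA elsewhere
  dt[, (tmp_var[i]) := dt[[Dose[i]]] * dt[[Freq[i]]] * NA^(!hit)]
}

dt
```

Because the column names are looked up as plain strings, this form should not depend on how names are resolved inside j.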

Update

Another option is to build the column names as symbols and use eval, which can be faster than get:

 for(i in 1:3){
   Dose <- as.symbol(paste0('Dose', i))
   Freq <- as.symbol(paste0('Freq', i))
   Drug <- as.symbol(paste0('Drug', i))
   dt[, (tmp_var[i]) := eval(Dose) * eval(Freq) *
                    (NA^!grepl('^12345', eval(Drug)))]
 }

 dt
 #        Drug1 Dose1 Freq1      Drug2 Dose2 Freq2      Drug3 Dose3 Freq3
 #1: 1234567890     2     1 1548768954    23   2.0 2222132435     2     2
 #2: 4356678344     2     2 6547894356     3   1.0 2123456789     2     2
 #3: 5673452976     4     1 1234567890     4   0.5 4568789076    33     4
 #   New_Var1 New_Var2 New_Var3
 #1:        2       NA       NA
 #2:       NA       NA       NA
 #3:       NA        2       NA
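Both loops rely on the NA^!grepl(...) idiom to blank out non-matching rows. The short sketch below, using the Drug1 and Dose1 values from the example data, shows why it works: in R, anything raised to the power 0 is 1 (even NA), while NA^1 stays NA. The variable names drug, hit and mask are illustrative only.

```r
# Drug codes from the Drug1 column of the example data
drug <- c(1234567890, 4356678344, 5673452976)

# grepl coerces the numbers to character; TRUE where the code starts with "12345"
hit <- grepl("^12345", drug)

# !hit is FALSE (0) for matches and TRUE (1) otherwise,
# so the mask is 1 for matching rows and NA for the rest
mask <- NA^(!hit)

# Multiplying by the mask keeps matching rows and turns the others into NA
result <- c(2, 2, 4) * mask
result
```

An ifelse(hit, value, NA) call would express the same thing more explicitly, at the cost of a little extra typing.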

data

 df <- structure(list(Drug1 = c(1234567890, 4356678344, 5673452976), 
 Dose1 = c(2L, 2L, 4L), Freq1 = c(1L, 2L, 1L), Drug2 = c(1548768954, 
 6547894356, 1234567890), Dose2 = c(23L, 3L, 4L), Freq2 = c(2, 
 1, 0.5), Drug3 = c(2222132435, 2123456789, 4568789076), Dose3 = c(2L, 
 2L, 33L), Freq3 = c(2L, 2L, 4L)), .Names = c("Drug1", "Dose1", 
 "Freq1", "Drug2", "Dose2", "Freq2", "Drug3", "Dose3", "Freq3"
 ), class = "data.frame", row.names = c(NA, -3L))    

 dt <- as.data.table(df)


 