
R - data.table and loops

I have a dataset named "dt" in data.table format. I am trying to create multiple new variables and append them to this dataset, based on existing variables in the same dataset, as follows:

for (i in 1:17){
dt[, list(tmp_var[i] = Dose[i] * Freq[i] * (NA^!grepl("^12345",DRUG[i])))]
}

In words: search dt for drug codes that start with 12345 and, wherever one is found, assign the product of the following two columns (Dose and Freq, from the same row) to a new variable, tmp_var[i].

This doesn't work, and the resulting error message reads: Error: unexpected '}' in "}"

Can someone spot the problem or suggest an alternative method?

Thank you.

One option is to use get, assuming that Dose, Freq, Drug, etc. are separate objects (character vectors) that hold the corresponding column names. For a 3-column case:

Dose <- paste0("Dose", 1:3)
Freq <- paste0("Freq", 1:3)
Drug <- paste0("Drug", 1:3)
tmp_var <- paste0("New_Var", 1:3)

 for(i in 1:3){
   dt[, (tmp_var[i]) := get(Dose[i]) * get(Freq[i]) *
                          (NA^!grepl("^12345", get(Drug[i])))]
 }

However, I would use dt[[Dose[i]]] instead of get(Dose[i]).

 dt
 #        Drug1 Dose1 Freq1      Drug2 Dose2 Freq2      Drug3 Dose3 Freq3
 #1: 1234567890     2     1 1548768954    23   2.0 2222132435     2     2
 #2: 4356678344     2     2 6547894356     3   1.0 2123456789     2     2
 #3: 5673452976     4     1 1234567890     4   0.5 4568789076    33     4
 #   New_Var1 New_Var2 New_Var3
 #1:        2       NA       NA
 #2:       NA       NA       NA
 #3:       NA        2       NA
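The `dt[[Dose[i]]]` alternative mentioned above is not spelled out in the answer, so here is a minimal sketch of how it might look. It assumes the same column-name vectors as before, uses only the first two drug/dose/frequency triplets of the sample data, and writes the result with data.table's `set()`, which is my choice here rather than something the answer prescribes. The names `k` and `value` are purely illustrative.

```r
library(data.table)

# Sample data: first two Drug/Dose/Freq triplets from the example
dt <- data.table(
  Drug1 = c(1234567890, 4356678344, 5673452976),
  Dose1 = c(2L, 2L, 4L),
  Freq1 = c(1L, 2L, 1L),
  Drug2 = c(1548768954, 6547894356, 1234567890),
  Dose2 = c(23L, 3L, 4L),
  Freq2 = c(2, 1, 0.5)
)

# Character vectors holding the column names
Dose    <- paste0("Dose", 1:2)
Freq    <- paste0("Freq", 1:2)
Drug    <- paste0("Drug", 1:2)
tmp_var <- paste0("New_Var", 1:2)

for (k in seq_along(tmp_var)) {
  # dt[[name]] extracts a column by its name given as a string
  value <- dt[[Dose[k]]] * dt[[Freq[k]]] *
    NA^!grepl("^12345", dt[[Drug[k]]])
  # set() adds or overwrites the column by reference
  set(dt, j = tmp_var[k], value = value)
}

dt
```

With `[[`, the column is looked up directly by name, so there is no risk of `get` picking up an unrelated object of the same name from the calling environment.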

Update

Another option is to use eval, which is faster than get:

 for(i in 1:3){
   Dose <- as.symbol(paste0('Dose', i))
   Freq <- as.symbol(paste0('Freq',i))
   Drug <- as.symbol(paste0('Drug', i))
   dt[,(tmp_var[i]):= eval(Dose)*eval(Freq)*
                   (NA^!grepl('^12345', eval(Drug)))]
    }

 dt
 #        Drug1 Dose1 Freq1      Drug2 Dose2 Freq2      Drug3 Dose3 Freq3
 #1: 1234567890     2     1 1548768954    23   2.0 2222132435     2     2
 #2: 4356678344     2     2 6547894356     3   1.0 2123456789     2     2
 #3: 5673452976     4     1 1234567890     4   0.5 4568789076    33     4
 #   New_Var1 New_Var2 New_Var3
 #1:        2       NA       NA
 #2:       NA       NA       NA
 #3:       NA        2       NA
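Both loops rely on `NA^!grepl(...)` as a row mask, which can be hard to read at first. As a small illustration (the vectors `drug`, `dose` and `freq` below are made up for this sketch): `NA^FALSE` evaluates to 1 and `NA^TRUE` to NA, so multiplying by the mask keeps the product where the pattern matches and gives NA elsewhere. An `ifelse` call is shown as a more explicit equivalent.

```r
# Hypothetical vectors, for illustration only
drug <- c("1234567890", "4356678344", "1234500000")
dose <- c(2, 2, 4)
freq <- c(1, 2, 3)

hit <- grepl("^12345", drug)   # TRUE where the code starts with 12345

# !hit is FALSE for matches, so NA^!hit is 1 for matches and NA otherwise
mask <- NA^!hit

masked   <- dose * freq * mask
explicit <- ifelse(hit, dose * freq, NA_real_)

masked
explicit
```

The two results should agree; the `ifelse` form is longer but states the intent directly.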

data

 df <- structure(list(Drug1 = c(1234567890, 4356678344, 5673452976), 
 Dose1 = c(2L, 2L, 4L), Freq1 = c(1L, 2L, 1L), Drug2 = c(1548768954, 
 6547894356, 1234567890), Dose2 = c(23L, 3L, 4L), Freq2 = c(2, 
 1, 0.5), Drug3 = c(2222132435, 2123456789, 4568789076), Dose3 = c(2L, 
 2L, 33L), Freq3 = c(2L, 2L, 4L)), .Names = c("Drug1", "Dose1", 
 "Freq1", "Drug2", "Dose2", "Freq2", "Drug3", "Dose3", "Freq3"
 ), class = "data.frame", row.names = c(NA, -3L))    

 library(data.table)
 dt <- as.data.table(df)

