In R: quickly setting new values in a data.table

Question

I am trying to set values in a data.table efficiently. The following code does what I want, but it is too slow for large datasets:

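library(data.table)

DTcars<-as.data.table(mtcars)
# Visit every cell except those in the last row; overwrite any value above 10
# with the last row's value from the same column.
for(i in 1:(dim(DTcars)[1]-1)){
  for(j in 1:dim(DTcars)[2]){
    if(DTcars[i,j, with=FALSE]>10){
      set(DTcars,
          i=as.integer(i),
          j =as.integer(j)  ,
          value = DTcars[dim(DTcars)[1],j,with=FALSE])
    }
  }
}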

What I want is something like the following. This code is completely wrong, but it expresses what I need, and I think the approach would be faster: subset the data.table, insert the same value into a particular column for the matching rows, and repeat for each column.

DTcars<-as.data.table(mtcars)
ns<-names(DTcars)
for(j in 1:length(ns)){
  DTcars[ns[j]>10]<-DTcars[20,ns[j]]
}

Answer 1

I think you're looking for:

for (j in names(DTcars)) set(DTcars,
  i     = which(DTcars[[j]]>10),
  j     = j,
  value = tail(DTcars[[j]],1)
)

Either column numbers or column names can be used as the for iterator here.

The replacement value differs between the two pieces of code in the OP (the last row in the first, row 20 in the second), so I'm not sure which one is intended. The code above uses the last row.
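To see what that loop does, here is a minimal sketch on a small made-up table. The table DT and its values are hypothetical and chosen only so the result is easy to check by hand; the set() call is the one from the answer.

```r
library(data.table)

# Hypothetical toy data: the last row supplies the replacement values.
DT <- data.table(a = c(5, 20, 8, 3), b = c(15, 2, 30, 12))

for (j in names(DT)) {
  set(DT,
      i     = which(DT[[j]] > 10),  # row numbers where column j exceeds 10
      j     = j,
      value = tail(DT[[j]], 1))     # last value of column j
}

print(DT)
```

After the loop, column a holds 5, 3, 8, 3 and column b holds 12, 2, 12, 12: every value above 10 has been replaced by the last value of its own column, and DT was modified by reference without copying.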

Answer 2

IMO, set should be used sparingly; regular := is almost always sufficient:

for (col in names(DTcars))
  DTcars[get(col) > 10, (col) := get(col)[.N]]
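One detail worth checking when adapting the := form: inside DT[i, j], j is evaluated on the rows selected by i, so get(col)[.N] is the last value among the rows above 10, not necessarily the last row of the whole table. On mtcars the two coincide, because every column that contains values above 10 also has a last-row value above 10. The sketch below uses hypothetical toy data where the last row is not above 10, and shows one possible way (an assumption, not part of the answer) to take the replacement from the full column instead.

```r
library(data.table)

# Hypothetical toy data: the last row (3) is not above 10.
DT1 <- data.table(a = c(5, 20, 8, 30, 3))
DT2 <- copy(DT1)

# Form from the answer: .N counts only the rows selected by i,
# so the replacement is the last value among the rows above 10.
for (col in names(DT1))
  DT1[get(col) > 10, (col) := get(col)[.N]]

# Assumed alternative: read the last value of the full column first,
# then assign it to the selected rows.
for (col in names(DT2)) {
  last_value <- DT2[[col]][nrow(DT2)]
  DT2[get(col) > 10, (col) := last_value]
}

print(DT1$a)
print(DT2$a)
```

Here DT1$a ends up as 5, 30, 8, 30, 3, while DT2$a ends up as 5, 3, 8, 3, 3. Which behaviour is wanted depends on the data; if the last row can itself fail the condition, the second form matches the set() version from the other answer.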

2 answers

Answer 1: 3, 2015-07-23 15:02:18
Answer 2: 2, accepted, 2015-07-23 16:07:47
