in R: Setting new Values in a data.table fast

Question

I am trying to set values in a data.table efficiently. The following code does what I want, but it is too slow for large datasets:

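library(data.table)

DTcars <- as.data.table(mtcars)
# For every row except the last, overwrite each value above 10
# with the last-row value of the same column.
for (i in 1:(nrow(DTcars) - 1)) {
  for (j in 1:ncol(DTcars)) {
    if (DTcars[i, j, with = FALSE] > 10) {
      set(DTcars,
          i = as.integer(i),
          j = as.integer(j),
          value = DTcars[nrow(DTcars), j, with = FALSE])
    }
  }
}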

What I want is something like the following. The code is wrong, but it expresses what I need, and I think this approach would be faster: subset the data.table, assign the same value to all matching rows of a particular column, and repeat for each column.

DTcars<-as.data.table(mtcars)
ns<-names(DTcars)
for(j in 1:length(ns)){
  DTcars[ns[j]>10]<-DTcars[20,ns[j]]
}

Answer 1

I think you're looking for

for (j in names(DTcars)) set(DTcars,
  i     = which(DTcars[[j]]>10),
  j     = j,
  value = tail(DTcars[[j]],1)
)

The column numbers or names can be used as the for iterator here.

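The replacement value differs between the two pieces of code in the OP (the last row in the first, row 20 in the second), so I'm not sure which one is intended. The code above uses the last row.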
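To illustrate the point about iterating over column numbers rather than names, here is a minimal sketch. The table `DT` and its values are made up for the example and are not from the OP's data; the replacement value is assumed to be the last-row value of each column.

```r
library(data.table)

# Hypothetical toy table, made up for illustration
DT <- data.table(a = c(5, 20, 30, 7), b = c(11, 2, 40, 3))

# Iterate over column positions instead of names
for (j in seq_along(DT)) {
  col <- DT[[j]]
  set(DT,
      i     = which(col > 10),     # rows to overwrite
      j     = j,                   # column position
      value = col[length(col)])    # last-row value of that column
}

print(DT)
```

Here the values above 10 in `a` become 7 and those in `b` become 3, since those are the last-row values. Each column is scanned once with a vectorised comparison, which is where the speed-up over the cell-by-cell loop should come from.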

Answer 2

IMO set should be used sparingly, and regular := is almost always sufficient:

for (col in names(DTcars))
  DTcars[get(col) > 10, (col) := get(col)[.N]]
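One thing worth checking before relying on this form: as far as I understand data.table's scoping, when `i` is supplied, `j` is evaluated on the filtered rows only, so `get(col)[.N]` is the last matching value rather than the last row of the whole column. On mtcars the two happen to coincide, but the sketch below uses a hypothetical one-column table (made up for illustration) where they do not.

```r
library(data.table)

# Hypothetical toy column: the last row (3) is not above 10
DT1 <- data.table(x = c(50, 20, 3))
DT2 <- copy(DT1)

# set(): the value is read from the full column, so it is the last row (3)
set(DT1, i = which(DT1[["x"]] > 10), j = "x", value = tail(DT1[["x"]], 1))

# := with i: get("x")[.N] is evaluated on the filtered rows (50, 20),
# so it picks the last matching value (20)
DT2[get("x") > 10, ("x") := get("x")[.N]]

print(DT1$x)
print(DT2$x)
```

If the last row of the full column is what you want, one option is to compute it before the subset, e.g. `last_val <- DT[[col]][nrow(DT)]`, and then assign `(col) := last_val` (the name `last_val` is just an illustrative choice).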

solution1
3 2015-07-23 15:02:18

solution2
2 ACCEPTED 2015-07-23 16:07:47
