
Update an entire row in data.table in R

I have a data.table object in R that has 9,000 columns. My code calculates new values for all 9,000 columns at once and returns them as a vector. I'd like to replace a row in the data.table with all of those values at once. With a data.frame this is easy, but I can't figure out how to get it working with a data.table.

library(data.table)

d <- data.table(q=c(1,2,3,4,5,6,7,8,9), x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
d[q==1, := c(5,5,5,5)] # FAILS
d[q==1, ] <- c(5,5,5,5) # FAILS

Any idea how to efficiently update the whole row at once?

You could use names(d) as the LHS of :=, then use as.list to convert your vector to a list, so that data.table understands it needs to assign each value to a different column instead of assigning all of the values to every column.

Note that the x column is character, so the numeric value assigned to it has to be coerced to character. data.table issues a warning in that case to make sure you are aware of the conversion.

vec <- c(5, 5, 5, 5)
d[q == 1L, names(d) := as.list(vec)][]
#    q x y v
# 1: 5 5 5 5
# 2: 2 a 3 2
# 3: 3 a 6 3
# 4: 4 b 1 4
# 5: 5 b 3 5
# 6: 6 b 6 6
# 7: 7 c 1 7
# 8: 8 c 3 8
# 9: 9 c 6 9
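If the coercion warning mentioned above is unwanted, one option is to supply the new row as a list whose elements already match each column's type, rather than as a single numeric vector. The following is a minimal sketch, assuming the same four-column table as in the question; the replacement values (including "e" for the x column) and the name new_row are made up for illustration.

```r
library(data.table)

d <- data.table(q = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                x = rep(c("a", "b", "c"), each = 3),
                y = rep(c(1, 3, 6), times = 3),
                v = 1:9)

# One list element per column, each already of that column's type
# (double, character, double, integer), so no coercion should be needed.
new_row <- list(5, "e", 5, 5L)

# names(d) on the LHS pairs each list element with one column, in order.
d[q == 1, names(d) := new_row]

d[1]
```

Because a list can hold a different type in each element, the columns keep their original classes after the update.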

This can also be done using set. For the example above, referencing the row by its number:

set(d, 1L, names(d), as.list(vec))

You may gain some speed using set instead, but lose some of the advantage if you need to retrieve the row numbers first.
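When the row is identified by a condition rather than a known position, the row number has to be looked up before calling set. A minimal sketch of one way to do that, using the which = TRUE argument of the data.table subset; the small table, the name row_idx, and the replacement values are assumptions for illustration only.

```r
library(data.table)

d <- data.table(q = c(1, 2, 3), x = c("a", "b", "c"), v = 1:3)

# which = TRUE makes the subset return the matching row numbers
# instead of the rows themselves.
row_idx <- d[q == 2, which = TRUE]

# set() takes the row numbers, the column names (or numbers),
# and a list with one value per column.
set(d, i = row_idx, j = names(d), value = list(7, "z", 7L))

d
```

The lookup is the part that costs time on a large table, so this mainly pays off when the same row numbers can be reused for several updates.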

library(data.table)
library(microbenchmark)

# Create a large data.table with 9,000 integer columns (col1 ... col9000)
DT = data.table(col1 = 1:1e5)
cols = paste0('col', 1:9e3)
for (col in cols){ DT[, (col) := 1:1e5] }
vec <- rep(5,9e3)

# Test options
microbenchmark(
  row_idnx <- DT[,.I[col1 == 1L]], # Retrieve row number
  set(DT, row_idnx, names(DT), as.list(vec)), # Update by row number
  DT[col1 == 1L, names(DT) := as.list(vec)]   # Update by condition
)

Unit: microseconds
                                          expr      min        lq      mean    median        uq       max neval
              row_idnx <- DT[, .I[col1 == 1L]] 1255.430 1969.5630 2168.9744 2129.2635 2302.1000  3269.947   100
    set(DT, row_idnx, names(DT), as.list(vec))  171.606  207.3235  323.7642  236.6765  274.6515  7725.120   100
 DT[col1 == 1L, `:=`(names(DT), as.list(vec))] 2761.289 2998.3750 3361.7842 3155.8165 3444.6310 13473.081   100
