
Are you able to reference columns outside of .SD columns in an R data.table vectorized function?

I have a data.table

library(data.table)
DT <- data.table(
    signal = c(1, -1, -5),
    draw_1 = c(NA, 3, NA),
    draw_2 = c(NA, NA, 2)                 
)
> DT
   signal draw_1 draw_2
1:      1     NA     NA
2:     -1      3     NA
3:     -5     NA      2

And I'd like to replace values of the draw_* columns where:

  1. signal is less than 0
  2. The draw_* column is NA

So the desired result is:

> desired
   signal draw_1 draw_2
1:      1     NA     NA
2:     -1      3     50
3:     -5     50      2

I tried the same approach I normally use for assigning values to groups of columns at a time:

draws <- c("draw_1", "draw_2")
replacement <- 50
DT[,(draws) := ifelse( is.na(.SD) & signal<0, replacement, .SD), .SDcols=draws]

But this results in an error,

Error in `[.data.table`(DT, , `:=`((draws), ifelse(is.na(.SD) & signal <  : 
Supplied 2 columns to be assigned 6 items. Please see NEWS for v1.12.2.

I don't understand what's going wrong here. I suspect it has to do with the use of signal, a column outside of .SDcols. If what I'm doing isn't possible, is there a better way to accomplish my goal?

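Referencing signal is not the problem: columns outside .SDcols are still visible in j. The problem is that ifelse expects vectors, whereas .SD is a subset of the data.table, i.e. a list of column vectors. For the first argument ('test'), is.na(.SD) & signal < 0 is coerced to a logical matrix, but the last argument ('no') remains a data.table, so the result is not a list of two columns, which is presumably why := complains about being supplied 6 items. Instead, we can loop over the columns with lapply so that each call operates on a plain vector: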

library(data.table)
DT[,(draws) := lapply(.SD, function(x)
    fifelse(is.na(x) & signal < 0, replacement, x)), .SDcols = draws]
DT
#   signal draw_1 draw_2
#1:      1     NA     NA
#2:     -1      3     50
#3:     -5     50      2

NOTE: The code above uses fifelse, data.table's faster alternative to base ifelse.
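
As an alternative to the lapply approach, the same replacement can be sketched with a for loop and data.table's set(), which updates each column by reference and only touches the matching rows. This is a minimal sketch rather than part of the original answer; it rebuilds DT, draws and replacement from the question so that it runs on its own, and the loop variable col and the helper vector rows are names introduced here for illustration.

```r
library(data.table)

# Recreate the example data from the question.
DT <- data.table(
    signal = c(1, -1, -5),
    draw_1 = c(NA, 3, NA),
    draw_2 = c(NA, NA, 2)
)
draws <- c("draw_1", "draw_2")
replacement <- 50

# For each draw column, find the rows where the value is NA and signal < 0,
# then overwrite just those cells in place.
for (col in draws) {
    rows <- which(is.na(DT[[col]]) & DT$signal < 0)
    set(DT, i = rows, j = col, value = replacement)
}

DT
```

Because set() assigns by reference, no copy of DT is made, which may matter when there are many draw_* columns; for a handful of columns the lapply version is likely just as convenient.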
