I am trying to make an operation conditional on the name of a column in a data.table. The example below illustrates what I mean. We have a DT with two columns, carrot and banana, each containing values. I want the carrot values to be multiplied by 2 and the banana values to be divided by 2. My code, however, does not work, because inside the function names(.SD) is a vector of length 2 (the same as names(DT)) rather than the name of the column currently being processed. Is there a way I can make this work with lapply()?
library(data.table)
carrot <- 1:5
banana <- 1:5
DT <- data.table(carrot, banana)
DT[, lapply(.SD, function(x) if(names(.SD) == 'carrot') {x * 2} else {x / 2}), .SDcols = names(DT)]
Do you have to do it in one operation? Multiple operations are cleaner, I think, e.g.:
carrot <- 1:5
banana <- 1:5
DT <- data.table(carrot, banana)
# simplest way: assign back to the original columns (or to new columns)
DT[, carrot := carrot*2]
DT[, banana := banana/2]
# lapply way - do it twice
DT <- data.table(carrot, banana)
cols1 <- "carrot"
cols2 <- "banana"
# forms new unassigned tables
DT[, lapply(.SD, function(x) x*2), .SDcols=cols1]
DT[, lapply(.SD, function(x) x/2), .SDcols=cols2]
# can also assign back into DT
DT[, (cols1) := lapply(.SD, function(x) x*2), .SDcols=cols1]
DT[]
DT[, (cols2) := lapply(.SD, function(x) x/2), .SDcols=cols2]
DT[]
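If the list of columns grows, the two-step idea above can be generalized with a lookup of one function per column name. This is only a sketch: the name `ops` is made up for illustration, and it assumes every name in `ops` is a column of DT.

```r
library(data.table)

DT <- data.table(carrot = 1:5, banana = 1:5)

# Hypothetical lookup: one function per column name
ops <- list(
  carrot = function(x) x * 2,
  banana = function(x) x / 2
)

# .SDcols = names(ops) makes .SD hold the columns in the same order as ops,
# so Map() pairs each function with its matching column
DT[, (names(ops)) := Map(function(f, x) f(x), ops, .SD), .SDcols = names(ops)]
DT[]
```

Adding a rule for another column then only means adding one entry to `ops`.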
The question/answer "Access lapply index names inside FUN" gave me the inspiration for a solution, which loops over the column indices and passes .SD and its names in as extra arguments:
DT[, lapply(seq_along(names(.SD)),
function(y, n, i) if(n[[i]] == 'carrot') {y[[i]] * 2} else {y[[i]] / 2},
y = .SD,
n = names(.SD)),
.SDcols = names(DT)]
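One thing to note about the index-based version above: lapply() over seq_along() returns an unnamed list, so the result columns come back as V1 and V2 rather than carrot and banana. A possible variant, offered as a sketch rather than the canonical way, is to iterate over the columns and their names in parallel with Map(), which takes the result names from its first list argument (`res` is just an illustrative name):

```r
library(data.table)

DT <- data.table(carrot = 1:5, banana = 1:5)

# Map() walks over the columns of .SD and their names together;
# nm is a single column name, so the if() condition has length 1
res <- DT[, Map(function(x, nm) if (nm == "carrot") x * 2 else x / 2,
                .SD, names(.SD)),
          .SDcols = names(DT)]
res
```

Because .SD is the first list passed to Map(), the output keeps the original column names.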