Create multiple new data.table columns simultaneously, by dividing old columns
Is there a better way to do this in data.table?
library(data.table)
n_obs <- 10
df <- data.frame(y=rnorm(n_obs), x1=runif(n_obs, 0, 1000), x2=runif(n_obs, 0, 1000),
x3=runif(n_obs, -1000, 1000), x4=runif(n_obs, -1000, 1000))
colnames <- c("x1", "x2", "x3", "x4")
colnames_scaled <- sprintf("%s_scaled", colnames)
df[, colnames_scaled] <- df[, colnames] / 1000 # Create four new columns in one line
all(colnames_scaled %in% names(df)) # TRUE
max(df[, colnames_scaled]) <= 1 # TRUE
## What's the right way to do the same thing with data.table?
dt <- data.table(y=rnorm(n_obs), x1=runif(n_obs, 0, 1000), x2=runif(n_obs, 0, 1000),
x3=runif(n_obs, -1000, 1000), x4=runif(n_obs, -1000, 1000))
for (i in seq_along(colnames)) {
  dt[[colnames_scaled[i]]] <- dt[[colnames[i]]] / 1000
}
all(colnames_scaled %in% names(dt)) # TRUE
max(dt[, colnames_scaled, with=FALSE]) <= 1 # TRUE
You can do:
dt[, paste(names(dt)[-1], "scaled", sep="_") := lapply(.SD, `/`, 1000), .SDcols=names(dt)[-1]]
Or, using your vectors colnames (which should really be called something other than the name of an existing base R function ;-) ) and colnames_scaled:
dt[, (colnames_scaled) := lapply(.SD, `/`, 1000), .SDcols = colnames]
Checking:
all(colnames_scaled %in% names(dt))
#[1] TRUE
max(dt[, colnames_scaled, with=FALSE]) <= 1
#[1] TRUE
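If you prefer to keep the explicit loop from the question, data.table's set() can assign each new column by reference instead of going through `[[<-`. The following is only a sketch under a few assumptions: the seed is fixed purely for reproducibility, and the vector names cols and cols_scaled are hypothetical replacements for colnames and colnames_scaled, chosen so that base::colnames is not masked.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n_obs <- 10
dt <- data.table(y = rnorm(n_obs),
                 x1 = runif(n_obs, 0, 1000), x2 = runif(n_obs, 0, 1000),
                 x3 = runif(n_obs, -1000, 1000), x4 = runif(n_obs, -1000, 1000))

# Hypothetical names, used instead of `colnames` to avoid masking base::colnames
cols <- c("x1", "x2", "x3", "x4")
cols_scaled <- sprintf("%s_scaled", cols)

# set() adds (or overwrites) one column by reference on each iteration
for (k in seq_along(cols)) {
  set(dt, j = cols_scaled[k], value = dt[[cols[k]]] / 1000)
}

# Same checks as in the question
has_all_cols <- all(cols_scaled %in% names(dt))
within_bound <- max(dt[, cols_scaled, with = FALSE]) <= 1
```

For four columns the `:=` form with lapply(.SD, ...) is more compact; the set() loop may be worth considering when each new column needs its own divisor or transformation.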