
Create multiple new data.table columns simultaneously, by dividing old columns

Is there a better way to do this in data.table?

library(data.table)

n_obs <- 10
df <- data.frame(y=rnorm(n_obs), x1=runif(n_obs, 0, 1000), x2=runif(n_obs, 0, 1000),
                 x3=runif(n_obs, -1000, 1000), x4=runif(n_obs, -1000, 1000))
colnames <- c("x1", "x2", "x3", "x4")
colnames_scaled <- sprintf("%s_scaled", colnames)
df[, colnames_scaled] <- df[, colnames] / 1000  # Create four new columns in one line
all(colnames_scaled %in% names(df))  # True
max(df[, colnames_scaled]) <= 1  # True

## What's the right way to do the same thing with data.table?
dt <- data.table(y=rnorm(n_obs), x1=runif(n_obs, 0, 1000), x2=runif(n_obs, 0, 1000),
                 x3=runif(n_obs, -1000, 1000), x4=runif(n_obs, -1000, 1000))
for(i in seq_along(colnames)) {
    dt[[colnames_scaled[i]]] <- dt[[colnames[i]]] / 1000
}
all(colnames_scaled %in% names(dt))  # True
max(dt[, colnames_scaled, with=FALSE]) <= 1  # True

You can do:

dt[, paste(names(dt)[-1], "scaled", sep="_") := lapply(.SD, `/`, 1000), .SDcols=names(dt)[-1]]

Or, using your vectors colnames (which should really be named something other than an existing base R function ;-) ) and colnames_scaled:

dt[, (colnames_scaled) := lapply(.SD, `/`, 1000), .SDcols = colnames]
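
For anyone who wants to run the second form in isolation, here is a minimal self-contained sketch. The names `cols` and `cols_scaled` are illustrative stand-ins for `colnames` and `colnames_scaled` (chosen so they do not mask base R's `colnames()`), and the seed is arbitrary. The parentheses around `cols_scaled` make data.table evaluate the variable and use its contents as the new column names, instead of creating one column literally called `cols_scaled`.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the example is reproducible
n_obs <- 10
dt <- data.table(y = rnorm(n_obs),
                 x1 = runif(n_obs, 0, 1000), x2 = runif(n_obs, 0, 1000),
                 x3 = runif(n_obs, -1000, 1000), x4 = runif(n_obs, -1000, 1000))

cols <- c("x1", "x2", "x3", "x4")       # illustrative name, replaces `colnames`
cols_scaled <- paste0(cols, "_scaled")  # illustrative name, replaces `colnames_scaled`

# .SDcols restricts .SD to the columns listed in `cols`; lapply() divides each
# of them by 1000, and := adds the resulting list as new columns by reference.
dt[, (cols_scaled) := lapply(.SD, `/`, 1000), .SDcols = cols]

names(dt)
```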

Checking:

all(colnames_scaled %in% names(dt))
#[1] TRUE
max(dt[, colnames_scaled, with=FALSE]) <= 1
#[1] TRUE
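
If a loop like the one in the question reads more naturally, data.table's `set()` is another option: it assigns by reference, so it avoids the copy that a base-style `dt[[name]] <- value` assignment may make. This is a different technique from the `:=` one-liner above, shown only as a sketch; the toy data and the names `dt2`, `cols` and `cols_scaled` are made up for illustration.

```r
library(data.table)

dt2 <- data.table(x1 = c(250, 500), x2 = c(-1000, 1000))  # toy data, illustrative only
cols <- c("x1", "x2")
cols_scaled <- paste0(cols, "_scaled")

for (k in seq_along(cols)) {
  # set() adds (or overwrites) column j of dt2 in place
  set(dt2, j = cols_scaled[k], value = dt2[[cols[k]]] / 1000)
}

dt2$x1_scaled  # 0.25 0.50
```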


 