
Scaling a subset of columns of data.table in R

I would like to scale a subset of columns in my data.table. There are many of these that I would like to scale, so I want to avoid specifying them all by name. The columns that are not being scaled I would just like to return as is. Here is what I was hoping would work, but it does not:

require(data.table)
x = data.table(id=1:10, a=sample(1:10,10), b=sample(1:10,10), c=sample(1:10,10))
> dput(x)
structure(list(id = 1:10, a = c(1L, 6L, 10L, 7L, 5L, 3L, 2L, 
4L, 9L, 8L), b = c(4L, 9L, 5L, 7L, 6L, 1L, 8L, 10L, 3L, 2L), 
    c = c(2L, 7L, 5L, 6L, 4L, 1L, 10L, 9L, 8L, 3L)), .Names = c("id", 
"a", "b", "c"), row.names = c(NA, -10L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x1a85d088>)

sx = x[,c(id, lapply(.SD, function(v) as.vector(scale(v)))), .SDcols = colnames(x)[2:4]]
   Error in eval(expr, envir, enclos) : object 'id' not found

Any suggestions?

You could also assign by reference in a copy of the data table:

sc <- names(x)[2:4]

sx <- copy(x)[ , (sc) := as.data.table(scale(.SD)), .SDcols = sc]

scale returns a matrix and, IIRC, data.table doesn't like matrix columns, hence the as.data.table() wrapper.
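To see what the remark about matrices means in practice, here is a minimal sketch using the data from the question's dput() output. The helper names m and v are only for illustration; the assignment line is the one from the answer.

```r
library(data.table)

# Data copied from the dput() output in the question
x <- data.table(
  id = 1:10,
  a = c(1L, 6L, 10L, 7L, 5L, 3L, 2L, 4L, 9L, 8L),
  b = c(4L, 9L, 5L, 7L, 6L, 1L, 8L, 10L, 3L, 2L),
  c = c(2L, 7L, 5L, 6L, 4L, 1L, 10L, 9L, 8L, 3L)
)

# scale() on a single column gives back a one-column matrix
m <- scale(x$a)
is.matrix(m)       # TRUE

# as.vector() drops the dim and "scaled:*" attributes
v <- as.vector(m)

# Scaling all selected columns at once: scale(.SD) is a 10 x 3 matrix,
# and as.data.table() splits it back into ordinary columns
sc <- names(x)[2:4]
sx <- copy(x)[, (sc) := as.data.table(scale(.SD)), .SDcols = sc]

# Check that no column of the result carries a dim attribute
sapply(sx, function(col) is.null(dim(col)))
```

Because of copy(x), the original x keeps its integer columns; only sx holds the scaled values.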

Or,

sx <- copy(x)[ , (sc) := lapply(.SD, function(v) as.vector(scale(v))), .SDcols = sc]

The parentheses around (sc) tell data.table to take the LHS column names from the value of the variable sc in the calling scope, rather than treating sc itself as a column name.
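A small sketch of that difference, on a toy table that is not part of the original thread (dt, dt1 and dt2 are made-up names). The behaviour shown is what I would expect from recent data.table versions; older releases may warn or behave differently.

```r
library(data.table)

# Toy data, purely for illustration
dt <- data.table(id = 1:3, a = c(1, 2, 3))
sc <- "a"

# Bare symbol on the LHS: adds a new column literally called "sc"
dt1 <- copy(dt)[, sc := a * 10]

# Parenthesised LHS: sc is evaluated, so column "a" is overwritten
dt2 <- copy(dt)[, (sc) := a * 10]

names(dt1)
names(dt2)
```

The same parenthesised form accepts a character vector of several names, which is what makes it convenient for scaling many columns at once.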

sx = cbind(x[,-(2:4),with=FALSE],data.table(scale(x[,2:4,with=FALSE])))

I suspect your workflow would be better served by melting your data.table into long format.
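The answer does not spell out what the long-format version would look like, so the following is only one possible reading: melt the columns to be scaled, scale within each original column using by, and optionally cast back to wide. The column names column, value and scaled are my own choices.

```r
library(data.table)

# Data copied from the dput() output in the question
x <- data.table(
  id = 1:10,
  a = c(1L, 6L, 10L, 7L, 5L, 3L, 2L, 4L, 9L, 8L),
  b = c(4L, 9L, 5L, 7L, 6L, 1L, 8L, 10L, 3L, 2L),
  c = c(2L, 7L, 5L, 6L, 4L, 1L, 10L, 9L, 8L, 3L)
)

# Long format: one row per (id, original column) pair
long <- melt(x, id.vars = "id", variable.name = "column", value.name = "value")

# Scale within each original column; as.vector() avoids a matrix column
long[, scaled := as.vector(scale(value)), by = column]

# Optional: back to wide, one scaled column per original column
wide <- dcast(long, id ~ column, value.var = "scaled")
```

In this layout the set of columns to scale is just the set of levels of column, so nothing has to be listed by name.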


 