
R data.table compute new column, but insert at beginning

In R data.tables, I can use this syntax to add a new column:

> dt <- data.table(a=c(1,2), b=c(3,4))
> dt[, c := a + b]
> dt
   a b c
1: 1 3 4
2: 2 4 6

But how would I insert c at the front of the dt like so:

   c a b
1: 4 1 3
2: 6 2 4

I looked on SO and found some people suggesting cbind for data.frames, but it's more convenient for me to use the := syntax here, so I was wondering whether there is a data.table-sanctioned way of doing this. My data.table has around 100 columns, so I don't want to list them all out.

Update: This feature has now been merged into the latest CRAN version of data.table (starting with v1.11.0), so installing the development version is no longer necessary to use this feature. From the release notes:

  1. setcolorder() now accepts less than ncol(DT) columns to be moved to the front, #592. Thanks @MichaelChirico for the PR.
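To check whether an installed copy already includes this change, a version comparison along these lines should do. This is only a minimal sketch: the variable name `has_partial_setcolorder` is made up for illustration, and the threshold is simply the v1.11.0 mentioned above.

```r
# Compare the installed data.table version against 1.11.0,
# the first CRAN release said (above) to accept a partial column list.
has_partial_setcolorder <- packageVersion("data.table") >= "1.11.0"
has_partial_setcolorder
```

If this is FALSE, the setdiff() approach given for older versions still applies.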

The current development version of data.table (v1.10.5) has updates to setcolorder() that make this much more convenient by accepting a partial list of columns. The columns provided are placed first, and then all non-specified columns are added after them in their existing order.

Installation instructions for development branch here.

Note regarding development branch stability: I've been running it for several months now to use the multi-threaded fread() in v1.10.5 (that alone is worth the update if you deal with multi-GB .csv files), and I have not noticed any bugs or regressions in my usage.

library(data.table)
DT <- as.data.table(mtcars)
DT[1:5]

gives

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

Re-order columns based on a partial list:

setcolorder(DT,c("gear","carb"))
DT[1:5]

now gives

   gear carb  mpg cyl disp  hp drat    wt  qsec vs am
1:    4    4 21.0   6  160 110 3.90 2.620 16.46  0  1
2:    4    4 21.0   6  160 110 3.90 2.875 17.02  0  1
3:    4    1 22.8   4  108  93 3.85 2.320 18.61  1  1
4:    3    1 21.4   6  258 110 3.08 3.215 19.44  1  0
5:    3    2 18.7   8  360 175 3.15 3.440 17.02  0  0

If for any reason you don't want to update to the development branch, the following works in previous (and current CRAN) versions.

newCols <- c("gear","carb")
## put newCols first, then every other column in its existing order (per Frank's advice in comments)
setcolorder(DT, c(newCols, setdiff(colnames(DT), newCols)))

## the long way I'd always done before seeing setdiff()
## setcolorder(DT,c(newCols,colnames(DT)[which(!colnames(DT) %in% newCols)]))
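Applied to the small table from the question, the same idea might look like the sketch below. It uses the setdiff() form so it should not depend on the data.table version; on v1.11.0 or later, `setcolorder(dt, "c")` alone should be enough.

```r
library(data.table)

dt <- data.table(a = c(1, 2), b = c(3, 4))
dt[, c := a + b]                                  # add the new column by reference
setcolorder(dt, c("c", setdiff(names(dt), "c")))  # move "c" first, keep the rest in order
names(dt)
# [1] "c" "a" "b"
```

Both := and setcolorder() modify dt in place, so the table is not copied and the other columns never have to be listed by hand.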
