R data.table: joining tables on multiple columns

Question

I have two huge data tables (dt1 and dt2) that are identical except for one column. I want to join the tables on the other p-1 columns, where p <- ncol(dt1). Should I setkey() on those p-1 columns and join using dt1[dt2]? If so, how do I pass the column names to setkey(), since I can't supply them as quoted strings?

Here is some simulated data:

library(data.table)

dt1 <- data.table(matrix(rnorm(260), 10, 26))
setnames(dt1, letters)
dt2 <- copy(dt1)
dt2[, z := rnorm(10)]

## Sections below won't run
setkey(dt1, get(letters[-which(letters=="z")]))
setkey(dt2, get(letters[-which(letters=="z")]))
dt1[dt2]

Answer 1

Use setkeyv(), which takes the key columns as a character vector:

setkeyv(dt1, letters[-which(letters=="z")])
setkeyv(dt2, letters[-which(letters=="z")])
dt1[dt2]
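
As a minimal sketch of the approach above, the key columns can be built once from the column names and reused. The seed and the name key_cols are illustrative assumptions, not part of the original post. The sketch also shows an ad hoc join with the on= argument, which should be available in data.table 1.9.6 and later and does not require setting keys first.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dt1 <- data.table(matrix(rnorm(260), 10, 26))
setnames(dt1, letters)
dt2 <- copy(dt1)
dt2[, z := rnorm(10)]

# Every column except the differing one becomes a join column
key_cols <- setdiff(names(dt1), "z")

# Ad hoc join: no keys needed, the tables are left unsorted
joined_on <- dt1[dt2, on = key_cols]

# Keyed join: setkeyv() sorts both tables by reference
setkeyv(dt1, key_cols)
setkeyv(dt2, key_cols)
joined <- dt1[dt2]

# dt1's column keeps the name z; dt2's version is added as i.z
names(joined)
```

In both results the z column comes from dt1 and i.z comes from dt2.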

Answer 2

If you know the name of the differing column, this works:

merge(dt1, dt2, by = setdiff(names(dt1), "z"))

The result keeps both versions of the differing column, named z.x (from dt1) and z.y (from dt2).
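
A short sketch to check how merge() names the two versions of the differing column, using the simulated data from the question. The seed and the name by_cols are illustrative assumptions. With the default suffixes, the two columns come out as z.x and z.y.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dt1 <- data.table(matrix(rnorm(260), 10, 26))
setnames(dt1, letters)
dt2 <- copy(dt1)
dt2[, z := rnorm(10)]

# Join on every column except z
by_cols <- setdiff(names(dt1), "z")
merged <- merge(dt1, dt2, by = by_cols)

# The two versions of z get the default suffixes .x and .y
grep("^z", names(merged), value = TRUE)
```

Unlike dt1[dt2], merge() does not require keys to be set beforehand.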

Answer 1: score 2, accepted, 2014-07-23 15:09:41

Answer 2: score 0, 2014-07-23 15:04:56