Merging 2 data.tables on more than one column, the data.table way

Question

I have two data.tables as follows:

library(data.table)

a <- data.table(id = 1:10, val = 2010:2019)
b <- data.table(id = c(1, 2, 4, 6), year = 1:4)

Now if I merge b and a as follows:

b[a, val := i.val, on = "id"]

This will create an extra column in b called val. It will also not reallocate memory for the b data.table.

I wanted to know what to do if a had more than 2 columns, as follows:

    a <- data.table(id = 1:10, val = 2010:2019,
                    twr = c(10, 13, 22, 43, 23, 23, -4, 33, -54, 34))

How do I merge the two data.tables (b and a) the data.table way, i.e. not using merge or any of the join functions, but using the [, , on = "id"] syntax?

I want to know this because using any of the join functions or merge creates a whole new object, whereas the data.table way only creates the new columns and not a whole new object.

Thanks in advance.

Answer 1

If there are only two columns to be returned, just wrap them in a list (or the short form .()) after joining on 'id', and assign (:=) those columns to 'b':

b[a, names(a)[-1] := .(i.val, i.twr), on = .(id)]

If there are many columns to be returned:

nm1 <- names(a)[-1]
b[a, (nm1) := mget(paste0("i.", nm1)), on = .(id)]

Output:

b
   id year  val twr
1:  1    1 2010  10
2:  2    2 2011  13
3:  4    3 2013  43
4:  6    4 2015  23
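To check the by-reference behaviour the question asks about, here is a minimal sketch that compares the address of b before and after the update join, using data.table's address() helper. The variable names addr_before, addr_after and merged are only illustrative, and the check assumes b was created with data.table() so that it has spare column slots.

```r
library(data.table)

a <- data.table(id = 1:10, val = 2010:2019,
                twr = c(10, 13, 22, 43, 23, 23, -4, 33, -54, 34))
b <- data.table(id = c(1, 2, 4, 6), year = 1:4)

# Memory address of b before the update join
addr_before <- address(b)

# Bring over every non-key column of a, by reference
nm1 <- setdiff(names(a), "id")
b[a, (nm1) := mget(paste0("i.", nm1)), on = .(id)]

# Memory address of b after the update join
addr_after <- address(b)
identical(addr_before, addr_after)

# For contrast, merge() returns a separate object and leaves its inputs alone
merged <- merge(b[, .(id, year)], a, by = "id")
identical(address(merged), address(b))
```

If the first identical() call returns TRUE, b was extended in place rather than copied.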

Answer 2

With development version 1.14.1, data.table has gained the env parameter, which is meant for programming on data.table:

cols <- setdiff(names(a), "id")
b[a, on = "id", (cols) := acols, env = list(acols = as.list(cols))][]

   id year  val twr
1:  1    1 2010  10
2:  2    2 2011  13
3:  4    3 2013  43
4:  6    4 2015  23

This will work in the many cases where there are no duplicate column names in a and b except those to join on. However, we can explicitly refer to the columns of a by using the prefix i.:

b[a, on = "id", (cols) := acols, env = list(acols = as.list(paste0("i.", cols)))][]
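To illustrate why the i. prefix matters when names collide, here is a small sketch that does not need the env parameter. It assumes a hypothetical variant of b that already has its own val column, filled with 0 as a placeholder. The names b_plain and b_prefixed are made up for this example.

```r
library(data.table)

a <- data.table(id = 1:10, val = 2010:2019)

# Hypothetical variant of b that already has a 'val' column (placeholder 0s)
b_plain    <- data.table(id = c(1, 2, 4, 6), val = rep(0L, 4))
b_prefixed <- copy(b_plain)

# Unprefixed: 'val' on the right-hand side resolves to b's own column,
# so the placeholder values are simply written back
b_plain[a, val := val, on = "id"]

# Prefixed: 'i.val' refers to the column of a (the table in the i argument)
b_prefixed[a, val := i.val, on = "id"]

b_plain
b_prefixed
```

Only the prefixed version picks up the values from a, which is why building the names with paste0("i.", cols) is the safer default.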

Answer 1
2 2021-07-15 16:30:39

Answer 2
0 2021-07-16 16:23:02