
R How to join by condition with data.table?

After some research, especially here, I discovered what I think is a very interesting method to join two data.tables by condition: DT[WHERE, v := FROM[.SD, on=, x.v]].

Unfortunately, I have not managed to make this work despite many attempts.

From DT1 and DT2, I need to create DTres, that is, perform the join on only some of the DT1 rows.

DT3 is one of my unsuccessful attempts.

Is it possible to do this, and if so, how?

Many thanks for your help.

library(data.table)

DT1 <- data.table(crit = rep(c('AA', 'BB', 'CC', 'DD'),each = 3),
                  num = rep(1:3, 4), 
                  val = rnorm(12)^2)
DT1

DT2 <- data.table(BB = c(1,3),
                  cross = c(128, 183))
DT2

DTres <- data.table(crit = rep(c('AA', 'BB', 'CC', 'DD'),each = 3),
                    num = rep(1:3, 4), 
                    val = rnorm(12)^2,
                    cross = c(rep(NA,3), 128, NA, 183, rep(NA, 6))
)
DTres

DT3 <- DT1[crit == 'BB', cross := DT2[DT1, on = .('BB' = num), x.cross]]

Create a 'crit' column in the second dataset, do the join, and assign the value of 'cross' from 'DT2' to 'DT1':

DT1[DT2[, c(.(crit = 'BB'), .SD)], cross := i.cross, on = .(crit, num = BB)]

DT1
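To see what the join is matching on, it may help to build the helper table as a separate step. The following is a minimal sketch using the question's data; `lookup` is a name introduced here for illustration only, and `i.cross` is used so that the value is taken from the helper table even if DT1 already has a `cross` column.

```r
library(data.table)

DT1 <- data.table(crit = rep(c("AA", "BB", "CC", "DD"), each = 3),
                  num  = rep(1:3, 4),
                  val  = rnorm(12)^2)
DT2 <- data.table(BB = c(1, 3), cross = c(128, 183))

# Tag every DT2 row with the group it applies to; the length-1
# 'crit' value is recycled to the number of rows in DT2.
lookup <- DT2[, c(.(crit = "BB"), .SD)]
print(lookup)

# Update join: rows of DT1 that match on (crit, num) receive cross;
# all other rows are left as NA.
DT1[lookup, cross := i.cross, on = .(crit, num = BB)]
print(DT1)
```

Because the join key includes `crit`, rows from other groups that happen to have `num` equal to 1 or 3 are not touched.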

Or melt the second dataset into 'long' format:

DT1[melt(DT2, id.vars = 'cross', variable.name = 'crit'),
    cross := i.cross, on = .(crit, num = value)]



DT1
#    crit num          val cross
# 1:   AA   1 4.720241e+00    NA
# 2:   AA   2 2.261093e-01    NA
# 3:   AA   3 5.040239e-01    NA
# 4:   BB   1 3.729867e-01   128
# 5:   BB   2 8.725384e-01    NA
# 6:   BB   3 1.571597e+00   183
# 7:   CC   1 8.494091e-02    NA
# 8:   CC   2 1.965077e-01    NA
# 9:   CC   3 1.221802e-06    NA
#10:   DD   1 5.526632e-03    NA
#11:   DD   2 3.475349e-01    NA
#12:   DD   3 3.233841e-01    NA
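One possible advantage of the melt approach is that it extends to several criterion columns without extra code. The sketch below assumes a hypothetical wider lookup table, `DT2w`, with an additional `CC` column that is not part of the original question; `variable.factor = FALSE` is set so that `crit` is melted as character, like the `crit` column in DT1.

```r
library(data.table)

DT1 <- data.table(crit = rep(c("AA", "BB", "CC", "DD"), each = 3),
                  num  = rep(1:3, 4),
                  val  = rnorm(12)^2)

# Hypothetical lookup table: one column per 'crit' value (assumption).
DT2w <- data.table(BB = c(1, 3), CC = c(2, 3), cross = c(128, 183))

# Long format: one row per (crit, value) pair, carrying 'cross' along.
long <- melt(DT2w, id.vars = "cross", variable.name = "crit",
             variable.factor = FALSE)

# Update join on both the group and the number.
DT1[long, cross := i.cross, on = .(crit, num = value)]
print(DT1)
```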

Or another option, based on the OP's attempt, is:

DT1[crit == 'BB' & num %in% DT2$BB, 
      cross := .SD[DT2, on = .(num = BB)]$cross]
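For comparison, here is a sketch of how the DT[WHERE, v := FROM[.SD, on=, x.v]] idiom from the question might be applied to this data. The attempt in the question passes DT1 rather than .SD to the inner join, which would return one value per row of DT1 (12) instead of one per selected row (3), and it quotes the column name inside on=. This version is offered as an illustration under those observations, not as the only way to write it.

```r
library(data.table)

DT1 <- data.table(crit = rep(c("AA", "BB", "CC", "DD"), each = 3),
                  num  = rep(1:3, 4),
                  val  = rnorm(12)^2)
DT2 <- data.table(BB = c(1, 3), cross = c(128, 183))

# WHERE: restrict the update to the 'BB' rows of DT1.
# FROM[.SD, on=, x.v]: look up each selected row in DT2 and return
# DT2's 'cross' (x.cross), giving NA where there is no match.
DT1[crit == "BB", cross := DT2[.SD, on = .(BB = num), x.cross]]
print(DT1)
```

Since the inner join is driven by .SD, the result has one element per selected row and in the same order, so no `%in%` filter is needed in the row condition.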

Here is another data.table option:

DT2[, c(stack(.SD[, .(BB)]), .(cross = cross))][DT1, on = .(ind = crit, values = num)]

which gives

    values ind cross          val
 1:      1  AA    NA 0.0080997625
 2:      2  AA    NA 0.0001964834
 3:      3  AA    NA 1.2621554895
 4:      1  BB   128 1.8066857886
 5:      2  BB    NA 2.3200035029
 6:      3  BB   183 0.1780571706
 7:      1  CC    NA 1.8521153969
 8:      2  CC    NA 3.0757963595
 9:      3  CC    NA 2.4597679400
10:      1  DD    NA 1.6815750082
11:      2  DD    NA 0.0564519787
12:      3  DD    NA 1.4985435547


 