
Merge two data tables, take value from dt2 if dt1 has NA
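I have two data tables. I merge them on 4 common columns (primary key = col1). However, many rows in dt1 have NA in the 3 sub-keys (col2, col3, col4). When merged, those rows keep the NA instead of taking the existing values from the matching rows in dt2.
Example: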
setkey(dt1, letter, number)
setkey(dt2, letter, number)
dt1                        dt2
letter number size         letter number color
a      10     big          a      10     blue
b      NA     small        b      20     orange
c      30     big          c      30     yellow
d      40     big          d      40     red
dt_merged <- merge(dt1, dt2, all=TRUE)
dt_merged
letter number size  color
a      10     big   blue
b      NA     small orange
c      30     big   yellow
d      40     big   red
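How can I adjust the merge so that it takes the values of these 3 columns (e.g. col2, col3, col4) from dt2 whenever dt1 has NA?
Edit: added the size column, since without it there seemed to be no need to merge the DTs with the given values.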
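For this example dataset, coalesce from dplyr solves the problem of dt1 having NA. If you have more columns (you mentioned col2, col3, col4), you should probably coalesce all of them between the two datasets, for the same reason.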
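library(dplyr)

# coalesce() works element by element, so this assumes dt1 and dt2
# list the same letters in the same row order
dt1 %>%
  mutate(number = coalesce(number, dt2$number)) %>%
  inner_join(dt2, by = c("letter", "number"))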
#   letter number  size  color
# 1      a     10   big   blue
# 2      b     20 small orange
# 3      c     30   big yellow
# 4      d     40   big    red
Data
dt1 <- structure(list(letter = c("a", "b", "c", "d"),
                      number = c(10L, NA, 30L, 40L),
                      size = c("big", "small", "big", "big")),
                 class = "data.frame", row.names = c(NA, -4L))
dt2 <- structure(list(letter = c("a", "b", "c", "d"),
                      number = c(10L, 20L, 30L, 40L),
                      color = c("blue", "orange", "yellow", "red")),
                 class = "data.frame", row.names = c(NA, -4L))
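For the multi-column case from the question (col1 as the key, col2 to col4 possibly NA in dt1), the same idea might be sketched as follows. This is not taken from the answer above: the data frames are made up to match the layout the question describes, and joining on col1 first (instead of relying on row order) is an assumption about what is wanted. The .x and .y suffixes are the dplyr defaults for columns that exist in both inputs.

```r
library(dplyr)

# Hypothetical data following the col1..col4 layout described in the question
dt1 <- data.frame(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, NA, 30L, 40L),
                  col3 = c(10L, NA, 30L, 40L),
                  col4 = c(10L, NA, 30L, 40L),
                  size = c("big", "small", "big", "big"),
                  stringsAsFactors = FALSE)
dt2 <- data.frame(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, 20L, 30L, 40L),
                  col3 = c(10L, 20L, 30L, 40L),
                  col4 = c(10L, 20L, 30L, 40L),
                  color = c("blue", "orange", "yellow", "red"),
                  stringsAsFactors = FALSE)

merged <- dt1 %>%
  # Join on the primary key only; shared columns become col2.x / col2.y, etc.
  left_join(dt2, by = "col1") %>%
  # Keep dt1's value where present, otherwise fall back to dt2's value
  mutate(col2 = coalesce(col2.x, col2.y),
         col3 = coalesce(col3.x, col3.y),
         col4 = coalesce(col4.x, col4.y)) %>%
  # Drop the suffixed helper columns
  select(col1, col2, col3, col4, size, color)

merged
```

Because the rows are matched on col1 before coalescing, this version does not depend on dt1 and dt2 being in the same row order.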
Here is an option:
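library(data.table)
cols <- paste0("col", 2L:4L)
# Look up the missing values from dt2 first, before merging.
# In the inner join, .SD holds the dt1 rows where col2 is NA and the x. prefix refers to dt2's columns.
# Only col2 is tested for NA, so this assumes col3 and col4 are missing in the same rows.
dt1[is.na(col2), (cols) := dt2[.SD, on=.(col1), mget(paste0("x.", cols))]]
merge(dt1, dt2, all=TRUE)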
Output:
   col1 col2 col3 col4  size  color
1:    a   10   10   10   big   blue
2:    b   20   20   20 small orange
3:    c   30   30   30   big yellow
4:    d   40   40   40   big    red
Data:
dt1 <- data.table(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, NA, 30L, 40L),
                  col3 = c(10L, NA, 30L, 40L),
                  col4 = c(10L, NA, 30L, 40L),
                  size = c("big", "small", "big", "big"))
dt2 <- data.table(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, 20L, 30L, 40L),
                  col3 = c(10L, 20L, 30L, 40L),
                  col4 = c(10L, 20L, 30L, 40L),
                  color = c("blue", "orange", "yellow", "red"))
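If the NAs in col2, col3 and col4 do not always fall in the same rows, a per-column fill may be safer than testing col2 alone. The following is only a sketch of that variant, not the approach from the answer: it assumes col1 is unique in dt2 and uses fcoalesce, which is available in recent versions of data.table. The names idx and merged2 are made up for illustration.

```r
library(data.table)

dt1 <- data.table(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, NA, 30L, 40L),
                  col3 = c(10L, NA, 30L, 40L),
                  col4 = c(10L, NA, 30L, 40L),
                  size = c("big", "small", "big", "big"))
dt2 <- data.table(col1 = c("a", "b", "c", "d"),
                  col2 = c(10L, 20L, 30L, 40L),
                  col3 = c(10L, 20L, 30L, 40L),
                  col4 = c(10L, 20L, 30L, 40L),
                  color = c("blue", "orange", "yellow", "red"))

cols <- paste0("col", 2L:4L)

# For every row of dt1, find the row of dt2 with the same col1
# (NA where dt2 has no such key; assumes col1 is unique in dt2)
idx <- match(dt1$col1, dt2$col1)

# Fill each column separately: keep dt1's value, fall back to dt2's where dt1 is NA.
# set() modifies dt1 by reference.
for (col in cols) {
  set(dt1, j = col, value = fcoalesce(dt1[[col]], dt2[[col]][idx]))
}

# With the gaps filled, all four shared columns match and the merge lines up
merged2 <- merge(dt1, dt2, by = c("col1", cols), all = TRUE)
merged2
```

Note that dt1 is updated in place here, so take a copy() first if the original NAs need to be kept.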