
Joining two incomplete data.tables with the same column names

I have two incomplete data.tables with the same column names.

    library(data.table)

    dt1 <- data.table(id = c(1, 2, 3), v1 = c("w", "x", NA), v2 = c("a", NA, "c"))
    dt2 <- data.table(id = c(2, 3, 4), v1 = c(NA, "y", "z"), v2 = c("b", "c", NA))

They look like this:

    > dt1
       id   v1   v2
    1:  1    w    a
    2:  2    x <NA>
    3:  3 <NA>    c
    > dt2
       id   v1   v2
    1:  2 <NA>    b
    2:  3    y    c
    3:  4    z <NA>

Is there a way to merge the two so that each table fills in the values missing from the other?

This is the result I'm after:

       id v1   v2
    1:  1  w    a
    2:  2  x    b
    3:  3  y    c
    4:  4  z <NA>

I've tried various data.table joins and merges, but I either get the columns repeated:

    > merge(dt1,
    +       dt2,
    +       by = "id",
    +       all = TRUE)
       id v1.x v2.x v1.y v2.y
    1:  1    w    a <NA> <NA>
    2:  2    x <NA> <NA>    b
    3:  3 <NA>    c    y    c
    4:  4 <NA> <NA>    z <NA>

or the rows repeated:

    > merge(dt1,
    +       dt2,
    +       by = names(dt1),
    +       all = TRUE)
       id   v1   v2
    1:  1    w    a
    2:  2 <NA>    b
    3:  2    x <NA>
    4:  3 <NA>    c
    5:  3    y    c
    6:  4    z <NA>

Both data.tables have the same column names.

You can group by id and, within each group, keep the unique values after omitting NAs, i.e.

    library(data.table)

    # Full outer merge on all shared columns, then collapse each id to its non-NA values
    merge(dt1, dt2, all = TRUE)[, lapply(.SD, function(i) na.omit(unique(i))), by = id][]
    #    id v1   v2
    # 1:  1  w    a
    # 2:  2  x    b
    # 3:  3  y    c
    # 4:  4  z <NA>

You could also start out with rbind():

    # Stack both tables, then keep the unique non-NA values per id
    rbind(dt1, dt2)[, lapply(.SD, \(x) unique(x[!is.na(x)])), by = id]
    #       id     v1     v2
    #    <num> <char> <char>
    # 1:     1      w      a
    # 2:     2      x      b
    # 3:     3      y      c
    # 4:     4      z   <NA>
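Both grouped approaches above can leave a zero-length vector for a group whose values are all NA (here `v2` for `id` 4), so the final `<NA>` depends on how data.table pads that empty result. A minimal sketch of a variant that returns an explicit NA instead; `first_non_na` is a hypothetical helper name, not part of data.table:

```r
library(data.table)

dt1 <- data.table(id = c(1, 2, 3), v1 = c("w", "x", NA), v2 = c("a", NA, "c"))
dt2 <- data.table(id = c(2, 3, 4), v1 = c(NA, "y", "z"), v2 = c("b", "c", NA))

# Hypothetical helper: first non-missing value in x,
# or an NA of the same type when every value is missing.
first_non_na <- function(x) {
  keep <- x[!is.na(x)]
  if (length(keep) > 0L) keep[1L] else x[1L]
}

# Stack the tables, then reduce every column to one value per id.
stacked <- rbind(dt1, dt2)
result <- stacked[, lapply(.SD, first_non_na), by = id]
print(result)
```

Unlike the `unique()` versions, this always returns exactly one row per id: if the two tables disagree on a value, the one from `dt1` wins silently because it comes first in the stacked table.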

First do a full_join, then group_by id, fill the missing values within each group, and keep one row per id:

    library(dplyr)
    library(tidyr)

    dt1 %>%
      full_join(dt2, by = c("id", "v1", "v2")) %>%
      group_by(id) %>%
      fill(starts_with("v"), .direction = "updown") %>%
      slice(1) %>%
      ungroup()

Output:

    # A tibble: 4 × 3
         id v1    v2
      <dbl> <chr> <chr>
    1     1 w     a
    2     2 x     b
    3     3 y     c
    4     4 z     NA
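The suffixed columns produced by the `by = "id"` merge in the question can also be collapsed directly rather than avoided. The following is a sketch only, not taken from the answers above; it assumes a data.table version that provides `fcoalesce()` and relies on the default `.x` / `.y` suffixes that `merge()` adds to clashing column names:

```r
library(data.table)

dt1 <- data.table(id = c(1, 2, 3), v1 = c("w", "x", NA), v2 = c("a", NA, "c"))
dt2 <- data.table(id = c(2, 3, 4), v1 = c(NA, "y", "z"), v2 = c("b", "c", NA))

# Full outer join on id; the shared value columns come back as v1.x, v1.y, ...
wide <- merge(dt1, dt2, by = "id", all = TRUE)

# For each value column, take the dt1 value and fall back to dt2 where it is NA.
value_cols <- setdiff(names(dt1), "id")
for (col in value_cols) {
  set(wide, j = col,
      value = fcoalesce(wide[[paste0(col, ".x")]], wide[[paste0(col, ".y")]]))
}

# Keep only the id and the coalesced columns.
result <- wide[, c("id", value_cols), with = FALSE]
print(result)
```

This keeps one row per id as long as id is unique within each table, and it prefers the `dt1` value whenever both tables have one.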
