Multiple joins from a list of data frames with differing lengths and differing keys

Question

Let's say I've got this list of data frames:

library(tidyverse)
df_list <- list(data.frame(cheese = c("ex","ok","bd"), 
                           cheese_val = 3:1, 
                           stringsAsFactors = FALSE),
                data.frame(egg = c("great","good","bad", "eww"), 
                           egg_val = 4:1,
                           stringsAsFactors = FALSE),
                data.frame(milk = c("good","bad"), 
                           milk_val = 2:1, 
                           stringsAsFactors = FALSE))

And I've got this core data set:

core_dat <- data.frame(cheese = c("ex","ok","ok", "bd", "ok"), 
                       egg = c("great", "bad", "bad", "eww", "great"), 
                       milk = c("good", "good", "good", "bad", "good"), 
                       stringsAsFactors = FALSE)

I'd like to get core_dat joined individually with each element of df_list.

I then tried this:

for(i in 1:length(df_list)) {
  gg<-core_dat %>% 
    left_join(df_list[[i]], by = names(df_list[[i]][1]), copy = T)
}

This ran, but only the join on the milk column survived: the only additional column in gg was milk_val, whereas I expected to see cheese_val and egg_val too.

I suspect there are more appropriate options than a for loop here and I am looking for suggestions. Note that my actual data set has many more data frames than this small example.

I should note that I expect the resulting data frame, in this case gg, to contain 6 columns in total (3 with the standard names + 3 with the "val" suffix), so that it looks like the printed version of this:

data.frame(cheese = c("ex","ok","ok", "bd", "ok"), 
           egg = c("great", "bad", "bad", "eww", "great"), 
           milk = c("good", "good", "good", "bad", "good"), 
           cheese_val = c(3, 2, 2, 1, 2), 
           egg_val = c(4, 2, 2, 1, 4), 
           milk_val = c(2, 2, 2, 1, 2))

I've seen many "multiple joins" answers here, but none that quite line up with what I'm trying to accomplish (differing key columns, differing lengths of data).

Answer 1

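You can use map to get a list of joined data frames, then use reduce to join them all together. rownames_to_column gives every row of core_dat a unique rowname, so the duplicated rows (rows 2 and 3 are identical) are not multiplied by the final full_join.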

map(df_list, right_join, rownames_to_column(core_dat)) %>%
  reduce(full_join)
# Joining, by = "cheese"
# Joining, by = "egg"
# Joining, by = "milk"
# Joining, by = c("cheese", "rowname", "egg", "milk")
# Joining, by = c("cheese", "rowname", "egg", "milk")
#   cheese cheese_val rowname   egg milk egg_val milk_val
# 1     ex          3       1 great good       4        2
# 2     ok          2       2   bad good       2        2
# 3     ok          2       3   bad good       2        2
# 4     bd          1       4   eww  bad       1        1
# 5     ok          2       5 great good       4        2
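
If you would rather name the key for each join explicitly, one possible variant is to fold the lookup tables into core_dat one at a time with reduce and left_join. This is only a sketch: it assumes the first column of every data frame in df_list is its join key, and joined is just an illustrative name.

```r
library(dplyr)
library(purrr)

# Same example data as in the question
df_list <- list(
  data.frame(cheese = c("ex", "ok", "bd"), cheese_val = 3:1,
             stringsAsFactors = FALSE),
  data.frame(egg = c("great", "good", "bad", "eww"), egg_val = 4:1,
             stringsAsFactors = FALSE),
  data.frame(milk = c("good", "bad"), milk_val = 2:1,
             stringsAsFactors = FALSE)
)
core_dat <- data.frame(
  cheese = c("ex", "ok", "ok", "bd", "ok"),
  egg    = c("great", "bad", "bad", "eww", "great"),
  milk   = c("good", "good", "good", "bad", "good"),
  stringsAsFactors = FALSE
)

# Start from core_dat and left-join each lookup table in turn.
# Assumption: the first column of each lookup table is the key.
joined <- reduce(
  df_list,
  function(acc, lookup) left_join(acc, lookup, by = names(lookup)[1]),
  .init = core_dat
)
```

Because left_join keeps the rows of its first argument in place, joined has the same row order as core_dat and no helper rowname column is needed.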

Answer 2

This should give the desired output:

Reduce(merge,c(df_list,list(core_dat)))
  cheese   egg milk cheese_val egg_val milk_val
1     bd   eww  bad          1       1        1
2     ex great good          3       4        2
3     ok   bad good          2       2        2
4     ok   bad good          2       2        2
5     ok great good          2       4        2
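
Note that merge sorts the result by the join keys by default, so the rows above are no longer in the order of core_dat. If row order matters, a base R lookup with match is one option. The sketch below assumes every data frame in df_list has exactly two columns (key first, value second) and that the keys are unique within each one; joined, key and val are illustrative names.

```r
# Same example data as in the question
df_list <- list(
  data.frame(cheese = c("ex", "ok", "bd"), cheese_val = 3:1,
             stringsAsFactors = FALSE),
  data.frame(egg = c("great", "good", "bad", "eww"), egg_val = 4:1,
             stringsAsFactors = FALSE),
  data.frame(milk = c("good", "bad"), milk_val = 2:1,
             stringsAsFactors = FALSE)
)
core_dat <- data.frame(
  cheese = c("ex", "ok", "ok", "bd", "ok"),
  egg    = c("great", "bad", "bad", "eww", "great"),
  milk   = c("good", "good", "good", "bad", "good"),
  stringsAsFactors = FALSE
)

joined <- core_dat
for (lookup in df_list) {
  key <- names(lookup)[1]  # assumed: first column is the key
  val <- names(lookup)[2]  # assumed: second column is the value
  # match() returns, for each core row, the position of its key in the lookup
  joined[[val]] <- lookup[[val]][match(joined[[key]], lookup[[key]])]
}
```

Unlike the loop in the question, this one keeps adding columns to the same joined object instead of starting again from core_dat on every iteration.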
