
Join list of dataframes in sequence using full_join in R

I have a list of dataframes with similar variable names that I'm looking to join using full_join, in the order in which they appear in the list.

library(tidyverse)

x<-data.frame(id=c("a","a","b","b","b","c","c","c","c"),
              sub.id=c("1","2","1","2","3","1","2","3","4"))

y<-data.frame(id = as.character(rep(1:4,each=2)),
              sub.id = c("AA","CC","DD","AA","GG","OO","PP","OW"))

z<-data.frame(id = c("AA","CC","DD","GG","OO","OW","PP"),
              sub.id = as.character(1:7))

dfs<-list(x,y,z)

I've tried using reduce from the purrr package, but this joins every dataframe in the list to the first dataframe, in this case x.

dfs %>% 
  reduce(full_join,by = c("sub.id" = "id"))

Is there a way to perform a full_join on the dataframes in a list such that the by argument follows the order in which the dataframes appear in the list? In this example, the sub.id of x would match the id of y, and then the sub.id that came from y would match the id of z for the final join.

EDIT: The expected result should be similar to the following:

    id sub.id.x sub.id.y sub.id.y.y
1   a        1       AA          1
2   a        1       CC          2
3   a        2       DD          3
4   a        2       AA          1
5   b        1       AA          1
6   b        1       CC          2
7   b        2       DD          3
8   b        2       AA          1
9   b        3       GG          4
10  b        3       OO          5
11  c        1       AA          1
12  c        1       CC          2
13  c        2       DD          3
14  c        2       AA          1
15  c        3       GG          4
16  c        3       OO          5
17  c        4       PP          7
18  c        4       OW          6

The suffixes on the joined column names are left unchanged for now.
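For reference, the two joins described above can be written out by hand. This is only a minimal sketch: it assumes dplyr's default suffix behaviour, where the sub.id column coming from y is returned as sub.id.y after the first join, and the names step1 and step2 are placeholders. stringsAsFactors = FALSE is added so the columns stay character on older R versions.

```r
library(dplyr)

x <- data.frame(id = c("a","a","b","b","b","c","c","c","c"),
                sub.id = c("1","2","1","2","3","1","2","3","4"),
                stringsAsFactors = FALSE)
y <- data.frame(id = as.character(rep(1:4, each = 2)),
                sub.id = c("AA","CC","DD","AA","GG","OO","PP","OW"),
                stringsAsFactors = FALSE)
z <- data.frame(id = c("AA","CC","DD","GG","OO","OW","PP"),
                sub.id = as.character(1:7),
                stringsAsFactors = FALSE)

# Step 1: match x's sub.id against y's id.
# Assumption: y's own sub.id clashes with the key name and comes back
# as "sub.id.y" (dplyr's default suffix).
step1 <- full_join(x, y, by = c("sub.id" = "id"))

# Step 2: match the sub.id that came from y against z's id.
step2 <- full_join(step1, z, by = c("sub.id.y" = "id"))
```

Recent dplyr versions may warn about a many-to-many relationship in the first join; with this data that is expected, since each key value occurs several times on both sides.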

Perhaps we need a for loop that renames the columns of the output after each join, so that the next join can again use sub.id as its key:

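out <- dfs[[1]]
for (i in 2:length(dfs)) {
  # join the running result's sub.id to the id of the next dataframe
  out <- full_join(out, dfs[[i]], by = c('sub.id' = 'id'))
  # set aside the key that was just used under a numbered name
  names(out)[names(out) == 'sub.id'] <- paste0("sub.id", i)
  # the newly joined sub.id becomes the key for the next join
  names(out)[names(out) == 'sub.id.y'] <- 'sub.id'
}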

Output:

out
#   id sub.id2 sub.id3 sub.id
#1   a       1      AA      1
#2   a       1      CC      2
#3   a       2      DD      3
#4   a       2      AA      1
#5   b       1      AA      1
#6   b       1      CC      2
#7   b       2      DD      3
#8   b       2      AA      1
#9   b       3      GG      4
#10  b       3      OO      5
#11  c       1      AA      1
#12  c       1      CC      2
#13  c       2      DD      3
#14  c       2      AA      1
#15  c       3      GG      4
#16  c       3      OO      5
#17  c       4      PP      7
#18  c       4      OW      6
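The same chaining can also be sketched without a loop, by renaming the columns before joining rather than after. This is an alternative under stated assumptions, not the method above: it assumes every dataframe has exactly two columns in the order (id, sub.id), and the level1, level2, ... column names and the renamed variable are made up for illustration.

```r
library(dplyr)
library(purrr)

x <- data.frame(id = c("a","a","b","b","b","c","c","c","c"),
                sub.id = c("1","2","1","2","3","1","2","3","4"),
                stringsAsFactors = FALSE)
y <- data.frame(id = as.character(rep(1:4, each = 2)),
                sub.id = c("AA","CC","DD","AA","GG","OO","PP","OW"),
                stringsAsFactors = FALSE)
z <- data.frame(id = c("AA","CC","DD","GG","OO","OW","PP"),
                sub.id = as.character(1:7),
                stringsAsFactors = FALSE)
dfs <- list(x, y, z)

# Rename the two columns of the i-th dataframe to level<i> and level<i+1>
# (hypothetical names), so consecutive dataframes share exactly one column.
renamed <- imap(dfs, function(df, i) setNames(df, paste0("level", c(i, i + 1))))

# Each join uses the single column that the running result and the
# next dataframe have in common.
out <- reduce(renamed, function(acc, df) {
  full_join(acc, df, by = intersect(names(acc), names(df)))
})
```

Because the shared column is computed at each step, no suffix handling is needed, and the result has one column per level of the chain.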

If we can assume that the joining column is always the last column of the first dataframe and the first column of the second dataframe, then you could do:

In base R:

Reduce(function(x,y) merge(x,y,by.x = tail(names(x),1), by.y = names(y)[1], all = TRUE), dfs)

   sub.id1 sub.id0 id sub.id11
1       AA       1  a        1
2       AA       2  a        1
3       AA       1  c        1
4       AA       2  b        1
5       AA       1  b        1
6       AA       2  c        1
7       CC       1  a        2
8       CC       1  b        2
9       CC       1  c        2
10      DD       2  b        3
11      DD       2  a        3
12      DD       2  c        3
13      GG       3  b        4
14      GG       3  c        4
15      OO       3  b        5
16      OO       3  c        5
17      OW       4  c        6
18      PP       4  c        7


 