R: merge two lists of matched dataframes

I have two lists that contain the same number of dataframes, and the order of the dataframes in both lists indicates which dataframes belong together. In other words, the first dataframe in the first list goes together with the first dataframe in the second list, the second with the second, and so on. I want to merge the dataframes in the two lists with each other, but only the dataframes that belong together. Let's say the first list has these three dataframes:

df1:
id var1
1 0.2
2 0.1
3 0.4
4 0.3

df2:
id var1
1 0.2
6 0.5

df3:
id var1
1 0.2
3 0.1
6 0.4

And the second list has the following dataframes:

df1:
id var2
1 A
2 B
3 C
4 C

df2:
id var2
1 B
6 B

df3:
id var2
1 A
3 D
6 D

I would like to merge them on the variable "id", so that the end result is the following:

df1:
id var1 var2
1 0.2 A
2 0.1 B 
3 0.4 C
4 0.3 C

df2:
id var1 var2
1 0.2 B
6 0.5 B

df3:
id var1 var2
1 0.2 A 
3 0.1 D
6 0.4 D

How do I do this?

First list of datasets:

list1 <- list(df1, df2, df3)

Second list of datasets (here df1, df2 and df3 stand for the second set of dataframes, the ones holding var2):

list2 <- list(df1, df2, df3)

Result:

lapply(seq_along(list1), function(i) merge(list1[[i]], list2[[i]], by = "id"))
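The index-based approach above can be tried end to end with the data from the question. The following is a minimal sketch, assuming the dataframes are rebuilt by hand as shown; the variable name `merged` is introduced here for illustration only.

```r
# Dataframes from the question, rebuilt so the example runs on its own.
list1 <- list(
  data.frame(id = c(1, 2, 3, 4), var1 = c(0.2, 0.1, 0.4, 0.3)),
  data.frame(id = c(1, 6),       var1 = c(0.2, 0.5)),
  data.frame(id = c(1, 3, 6),    var1 = c(0.2, 0.1, 0.4))
)
list2 <- list(
  data.frame(id = c(1, 2, 3, 4), var2 = c("A", "B", "C", "C")),
  data.frame(id = c(1, 6),       var2 = c("B", "B")),
  data.frame(id = c(1, 3, 6),    var2 = c("A", "D", "D"))
)

# Pairing by position only makes sense if both lists have the same length.
stopifnot(length(list1) == length(list2))

# Merge the i-th dataframe of list1 with the i-th dataframe of list2 on "id".
merged <- lapply(seq_along(list1), function(i) {
  merge(list1[[i]], list2[[i]], by = "id")
})

merged[[2]]
#   id var1 var2
# 1  1  0.2    B
# 2  6  0.5    B
```

Using `seq_along(list1)` rather than `1:length(list1)` avoids iterating over `c(1, 0)` when the list happens to be empty.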

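Using either base R or tidyverse:

# base R: merge() joins each pair on their common column (id)
Map(merge, l1, l2)

# tidyverse: map2() comes from purrr, inner_join() from dplyr
library(tidyverse)
map2(l1, l2, inner_join)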

# [[1]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 
# [[2]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 
# [[3]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 
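Both calls above keep only the ids present in both dataframes of a pair. If unmatched ids should be kept as well, extra arguments can be handed to `merge` through a small wrapper function. This is a sketch under assumed toy data (`a`, `b`, `inner` and `full` are hypothetical names, not from the question):

```r
# Toy pair whose ids only partly overlap (2 and 3 are shared).
a <- list(data.frame(id = 1:3, var1 = c(0.2, 0.1, 0.4)))
b <- list(data.frame(id = 2:4, var2 = c("B", "C", "C")))

# Inner join: the default behaviour of merge(), made explicit with by = "id".
inner <- Map(function(x, y) merge(x, y, by = "id"), a, b)

# Full outer join: all = TRUE keeps unmatched rows and fills the gaps with NA.
full <- Map(function(x, y) merge(x, y, by = "id", all = TRUE), a, b)

nrow(inner[[1]])  # 2 rows: ids 2 and 3
nrow(full[[1]])   # 4 rows: ids 1 to 4
```

Naming the key with `by = "id"` also guards against an accidental join on any other column that the two dataframes happen to share.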

Data

l1 <- replicate(3, data.frame(id = 1:2, a = c(0.1, 0.2)), simplify = FALSE)

l1
# [[1]]
#   id   a
# 1  1 0.1
# 2  2 0.2
# 
# [[2]]
#   id   a
# 1  1 0.1
# 2  2 0.2
# 
# [[3]]
#   id   a
# 1  1 0.1
# 2  2 0.2

l2 <- replicate(3, data.frame(id = 1:2, b = c("A", "B")), simplify = FALSE)
l2
# [[1]]
#   id b
# 1  1 A
# 2  2 B
# 
# [[2]]
#   id b
# 1  1 A
# 2  2 B
# 
# [[3]]
#   id b
# 1  1 A
# 2  2 B
# 

Use Map like this:

Map(merge, L1, L2)

giving:

$df1
  id var1 var2
1  1  0.2    A
2  2  0.1    B
3  3  0.4    C
4  4  0.3    C

$df2
  id var1 var2
1  1  0.2    B
2  6  0.5    B

$df3
  id var1 var2
1  1  0.2    A
2  3  0.1    D
3  6  0.4    D
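`Map` pairs list elements by position and labels the result with the names of the first list. When both lists are named, as here, the second list can be reordered by name first so that a different storage order does not produce wrong pairs. A minimal sketch, assuming small named lists built by hand (the shuffled order of `L2` is deliberate and hypothetical):

```r
L1 <- list(df1 = data.frame(id = c(1, 6), var1 = c(0.2, 0.5)),
           df2 = data.frame(id = c(1, 3), var1 = c(0.2, 0.1)))

# Same names as L1, deliberately stored in a different order.
L2 <- list(df2 = data.frame(id = c(1, 3), var2 = c("A", "D")),
           df1 = data.frame(id = c(1, 6), var2 = c("B", "B")))

# Both lists must contain exactly the same set of names.
stopifnot(setequal(names(L1), names(L2)))

# Reorder L2 to follow L1's names, then merge pair by pair.
merged <- Map(merge, L1, L2[names(L1)])

names(merged)
# [1] "df1" "df2"
merged$df1
#   id var1 var2
# 1  1  0.2    B
# 2  6  0.5    B
```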

Note

The input lists in reproducible form are:

Lines1 <- "df1:
id var1
1 0.2
2 0.1
3 0.4
4 0.3

df2:
id var1
1 0.2
6 0.5

df3:
id var1
1 0.2
3 0.1
6 0.4"
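Read <- function(Lines) {
 L <- readLines(textConnection(Lines))
 ix <- grep(":", L)                    # header lines such as "df1:"
 nms <- sub(":", "", L[ix])            # dataframe names without the colon
 g <- nms[cumsum(L[-ix] == "") + 1]    # each blank line starts the next group
 # read each group as a table; read.table skips the blank separator lines
 lapply(split(L[-ix], g), function(x) read.table(text = x, header = TRUE))
}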
L1 <- Read(Lines1)

and

Lines2 <- "df1:
id var2
1 A
2 B
3 C
4 C

df2:
id var2
1 B
6 B

df3:
id var2
1 A
3 D
6 D"
L2 <- Read(Lines2)
