
R: merge two lists of matched dataframes

I have two lists that contain the same number of dataframes, and the order of the dataframes in both lists indicates which dataframes belong together. In other words, the first dataframe in the first list goes together with the first dataframe in the second list, the second with the second, and so on. I want to merge the dataframes in the two lists with each other, but only the dataframes that belong together. Let's say the first list has these three dataframes:

df1:
id var1
1 0.2
2 0.1
3 0.4
4 0.3

df2:
id var1
1 0.2
6 0.5

df3:
id var1
1 0.2
3 0.1
6 0.4

And the second list has the following dataframes:

df1:
id var2
1 A
2 B
3 C
4 C

df2:
id var2
1 B
6 B

df3:
id var2
1 A
3 D
6 D

I would like to merge them on the variable "id", and the end result should be the following:

df1:
id var1 var2
1 0.2 A
2 0.1 B 
3 0.4 C
4 0.3 C

df2:
id var1 var2
1 0.2 B
6 0.5 B

df3:
id var1 var2
1 0.2 A 
3 0.1 D
6 0.4 D

How do I do this?

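First list of data frames:

list1 <- list(df1, df2, df3)

Second list of data frames (the second set of df1, df2 and df3 from the question; in a real session these need object names different from the first set):

list2 <- list(df1, df2, df3)

Result:

lapply(seq_along(list1), function(x) merge(list1[[x]], list2[[x]], by = 'id'))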
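The index-based approach above can be sketched end to end as follows. This is a minimal, self-contained example, not the answerer's exact code: the data values are copied from the question, the data frames are built inline to avoid the name clash between the two sets, and the length check is an added assumption about how one might guard the positional pairing.

```r
# Rebuild the question's two lists (values copied from the question).
list1 <- list(
  data.frame(id = c(1, 2, 3, 4), var1 = c(0.2, 0.1, 0.4, 0.3)),
  data.frame(id = c(1, 6),       var1 = c(0.2, 0.5)),
  data.frame(id = c(1, 3, 6),    var1 = c(0.2, 0.1, 0.4))
)
list2 <- list(
  data.frame(id = c(1, 2, 3, 4), var2 = c("A", "B", "C", "C")),
  data.frame(id = c(1, 6),       var2 = c("B", "B")),
  data.frame(id = c(1, 3, 6),    var2 = c("A", "D", "D"))
)

# Pairing is purely positional, so both lists must have the same length.
stopifnot(length(list1) == length(list2))

# Merge the i-th data frame of list1 with the i-th data frame of list2.
merged <- lapply(seq_along(list1), function(i) {
  merge(list1[[i]], list2[[i]], by = "id")
})

merged[[2]]
```

seq_along(list1) is used instead of 1:length(list1) because it yields an empty sequence for an empty list, whereas 1:0 would iterate over 1 and 0.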

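Using either base R or the tidyverse:

# base R: merge the data frames pairwise, by position
Map(merge, l1, l2)

# tidyverse: map2() comes from purrr, inner_join() from dplyr
library(tidyverse)
map2(l1, l2, inner_join)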

# [[1]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 
# [[2]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 
# [[3]]
#   id   a b
# 1  1 0.1 A
# 2  2 0.2 B
# 

data

l1 <- replicate(3, data.frame(id = 1:2, a = c(0.1, 0.2)), simplify = FALSE)

l1
# [[1]]
#   id   a
# 1  1 0.1
# 2  2 0.2
# 
# [[2]]
#   id   a
# 1  1 0.1
# 2  2 0.2
# 
# [[3]]
#   id   a
# 1  1 0.1
# 2  2 0.2

l2 <- replicate(3, data.frame(id = 1:2, b = c("A", "B")), simplify = FALSE)
l2
# [[1]]
#   id b
# 1  1 A
# 2  2 B
# 
# [[2]]
#   id b
# 1  1 A
# 2  2 B
# 
# [[3]]
#   id b
# 1  1 A
# 2  2 B
# 
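Both one-liners in this answer leave the join key implicit: merge() falls back to the column names the two data frames share, and inner_join() does likewise. The sketch below shows one possible way to name the key explicitly in base R; wrapping merge() in an anonymous function is an illustrative choice, not something the answer itself prescribes.

```r
# Same toy data as in this answer.
l1 <- replicate(3, data.frame(id = 1:2, a = c(0.1, 0.2)), simplify = FALSE)
l2 <- replicate(3, data.frame(id = 1:2, b = c("A", "B")), simplify = FALSE)

# Name the key explicitly instead of relying on merge()'s default
# (all column names the two data frames have in common).
merged <- Map(function(x, y) merge(x, y, by = "id"), l1, l2)

str(merged[[1]])
```

Naming the key matters once the paired data frames share more than one column name, because the default would then join on all of the shared columns.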

Use Map like this:

Map(merge, L1, L2)

giving:

$df1
  id var1 var2
1  1  0.2    A
2  2  0.1    B
3  3  0.4    C
4  4  0.3    C

$df2
  id var1 var2
1  1  0.2    B
2  6  0.5    B

$df3
  id var1 var2
1  1  0.2    A
2  3  0.1    D
3  6  0.4    D

Note

The input lists in reproducible form are:

Lines1 <- "df1:
id var1
1 0.2
2 0.1
3 0.4
4 0.3

df2:
id var1
1 0.2
6 0.5

df3:
id var1
1 0.2
3 0.1
6 0.4"
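Read <- function(Lines) {
 # Split the text into individual lines.
 L <- readLines(textConnection(Lines))
 # Header lines ("df1:", "df2:", ...) supply the data frame names.
 ix <- grep(":", L)
 nms <- sub(":", "", L[ix])
 # Each blank line starts the next block; label every remaining line with its block's name.
 g <- nms[cumsum(L[-ix] == "")+1]
 # Parse each block as a table; read.table skips blank lines by default.
 lapply(split(L[-ix], g), function(x) read.table(text = x, header = TRUE))
}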
L1 <- Read(Lines1)

and

Lines2 <- "df1:
id var2
1 A
2 B
3 C
4 C

df2:
id var2
1 B
6 B

df3:
id var2
1 A
3 D
6 D"
L2 <- Read(Lines2)
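Map pairs the elements of L1 and L2 by position and takes the names of the result from the first list. If the two lists are named but might not be in the same order, a cautious variant is to reorder the second list by the names of the first before merging. The sketch below uses small hypothetical lists (shortened from the question's data, with L2 deliberately shuffled) and assumes both lists carry exactly the same set of names.

```r
# Hypothetical named lists; L2 is deliberately in a different order.
L1 <- list(
  df1 = data.frame(id = c(1, 2), var1 = c(0.2, 0.1)),
  df2 = data.frame(id = c(1, 6), var1 = c(0.2, 0.5))
)
L2 <- list(
  df2 = data.frame(id = c(1, 6), var2 = c("B", "B")),
  df1 = data.frame(id = c(1, 2), var2 = c("A", "B"))
)

# Assumption: both lists use exactly the same names.
stopifnot(setequal(names(L1), names(L2)))

# Reorder L2 by L1's names so that each pair really belongs together,
# then merge pairwise; the result keeps L1's names.
merged <- Map(merge, L1, L2[names(L1)])

names(merged)
```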

