
Merging lists of data frames in R efficiently

I have 12 sets of 4 lists, each list containing between 2 and 13 data frames. I want to merge them into 1 set of 4 lists of data frames.

Although I call them sets, they are simply stored in the global environment as: list_a_1, list_b_1, list_c_1, list_d_1, list_a_2, list_b_2, list_c_2, list_d_2, ...

list_a_1 through list_a_12 contain data frames with exactly the same names and the same columns.

My desired outcome is 4 lists, each containing the data frames from all 12 sets merged together.


df1 = data.frame(A = 1:5, B = 100:104)
df2 = data.frame(C = 6:10, D = 100:104)

list_a_1 = list(df1, df2)
list_a_2 = list(df1, df2)



desired_outcome
df1
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

df2
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104

I tried writing a function with rbind, append, merge, etc., intending to use it with lapply, but I cannot seem to get it right. Since each list is quite large, efficiency is also an important factor.

Since corresponding elements of the lists need to be combined with rbind, use Map from base R:

Map(rbind, list_a_1, list_a_2)
#[[1]]
#   A   B
#1  1 100
#2  2 101
#3  3 102
#4  4 103
#5  5 104
#6  1 100
#7  2 101
#8  3 102
#9  4 103
#10 5 104

#[[2]]
#    C   D
#1   6 100
#2   7 101
#3   8 102
#4   9 103
#5  10 104
#6   6 100
#7   7 101
#8   8 102
#9   9 103
#10 10 104

Or loop over the index sequence of one list, extract the element at each index from both lists, and rbind them:

lapply(seq_along(list_a_1), function(i) rbind(list_a_1[[i]], list_a_2[[i]]))
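Map is not limited to two lists, so the same idea extends to any number of sets. The following is a minimal sketch, assuming a third set list_a_3 that is made up here purely for illustration. It passes a list of lists to Map through do.call, so each position is row-bound in a single rbind call:

```r
df1 <- data.frame(A = 1:5, B = 100:104)
df2 <- data.frame(C = 6:10, D = 100:104)

list_a_1 <- list(df1, df2)
list_a_2 <- list(df1, df2)
list_a_3 <- list(df1, df2)  # hypothetical third set, not in the question

# Collect the sets, then call Map(rbind, list_a_1, list_a_2, list_a_3)
all_a <- list(list_a_1, list_a_2, list_a_3)
merged_a <- do.call(Map, c(f = rbind, all_a))

# One merged data frame per position; each has 3 * 5 = 15 rows
sapply(merged_a, nrow)
```

Because rbind receives all data frames for a position at once, no intermediate results are built up pairwise.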

For multiple lists, we can build the object names with

v1 <- paste0('list_', letters[1:4], "_", rep(1:2, each = 4))

and then fetch the objects into a named list with mget

lst1 <- mget(v1)

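Or the names can be collected automatically with a regex pattern (the 'b', 'c' and 'd' lists are created here as copies so that the example has all four groups)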

list_b_1 <- list_a_1
list_b_2 <- list_a_2
list_c_1 <- list_a_1
list_c_2 <- list_a_2
list_d_1 <- list_a_1
list_d_2 <- list_a_2
nms <- ls(pattern = '^list_[a-d]_\\d+$')
lst1 <- mget(nms)
grps <- sub("list_([a-d])_\\d+", "\\1", nms)
lst2 <- split(lst1, grps)
out <- lapply(lst2, function(lstnew) do.call(Map, c(f = rbind, unname(lstnew))))

Checking the output for the 'a' objects

out$a
[[1]]
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

[[2]]
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104

And for the 'd' objects

out$d
[[1]]
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

[[2]]
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104
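The regex-based steps above can also be wrapped in a function. This is only a sketch: the helper name merge_list_sets and its env argument are hypothetical, and the toy lists are created in a separate environment so the example does not touch the global environment.

```r
# Hypothetical helper (not from the original answer): collects objects named
# list_<letter>_<number> from `env`, groups them by letter, and row-binds the
# data frames at matching positions within each group.
merge_list_sets <- function(env = globalenv(),
                            pattern = "^list_([a-d])_\\d+$") {
  nms  <- ls(envir = env, pattern = pattern)
  lst  <- mget(nms, envir = env)
  grps <- sub(pattern, "\\1", nms)
  lapply(split(lst, grps), function(sets) {
    do.call(Map, c(f = rbind, unname(sets)))
  })
}

# Toy data: 4 letters x 3 sets, each a list of two data frames
demo_env <- new.env()
df1 <- data.frame(A = 1:5, B = 100:104)
df2 <- data.frame(C = 6:10, D = 100:104)
for (letter in letters[1:4]) {
  for (i in 1:3) {
    assign(paste0("list_", letter, "_", i), list(df1, df2), envir = demo_env)
  }
}

out_demo <- merge_list_sets(demo_env)
names(out_demo)           # one entry per letter
sapply(out_demo$a, nrow)  # 15 rows in each merged data frame
```

Note that ls returns names in sorted order, so with 12 sets list_a_10 comes before list_a_2. This only affects the order of the rows in the result, not which rows are included.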

Or use map2 from purrr together with bind_rows from dplyr

library(dplyr)
library(purrr)
map2(list_a_1, list_a_2, bind_rows)
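map2 handles exactly two lists. For more than two, purrr's pmap takes a list of lists and calls the function on the elements at each position. The sketch below assumes dplyr and purrr are installed, and again uses a made-up third set list_a_3:

```r
library(dplyr)
library(purrr)

df1 <- data.frame(A = 1:5, B = 100:104)
df2 <- data.frame(C = 6:10, D = 100:104)

list_a_1 <- list(df1, df2)
list_a_2 <- list(df1, df2)
list_a_3 <- list(df1, df2)  # hypothetical third set, not in the question

# For each position i, call bind_rows on the i-th data frame of every set
merged_purrr <- pmap(list(list_a_1, list_a_2, list_a_3), bind_rows)

sapply(merged_purrr, nrow)  # 15 rows in each merged data frame
```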


 