
Merge 3 data.frames by column names

I have three independent data.frames. They have the same number of columns, the same number of rows, and the same column names. I'm trying to merge the three data.frames according to column names. I'm using the following code, which was written to merge two data.frames and return the number of matches for each column:

 Merged_DF = sapply(names(DF1),function(n) nrow(merge(DF1, DF2, by=n)))

The problem is that this example handles two data.frames, whereas I have three. How can I modify the code to merge three data.frames instead of two? I tried simply adding the third data.frame to the call, but it does not work:

  Merged_DF = sapply(names(DF1),function(n) nrow(merge(DF1, DF2, DF3,  by=n)))

It returns the following error:

 Error in fix.by(by.x, x) :  'by' must specify column(s) as numbers, names or logical

Example:

DF1

 G1 G2 G3
 a  b  f
 b  c  a
 c  d  b

DF2

 G1 G2 G3
 A  b  f
 b  c  a
 h  M  b

DF3

 G1 G2 G3
 a  b  f
 b  l  a
 j  M  v

The data.frames have around 250 rows and 50 columns.

You can use the Reduce function to merge multiple data frames:

df_list <- list(DF1, DF2, DF3)
Reduce(function(x, y) merge(x, y, all=TRUE), df_list, accumulate=FALSE)
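
For the per-column match count asked about in the question, the same Reduce idea might be sketched as follows. The data are made up to resemble the question's example, and count_matches is a hypothetical helper name, not part of any package. The original three-argument call most likely fails because merge() joins exactly two data frames, so the third positional argument (DF3) is matched to by.x instead of being treated as another table.

```r
# Made-up data shaped like the question's example (an assumption).
DF1 <- data.frame(G1 = c("a", "b", "c"), G2 = c("b", "c", "d"), G3 = c("f", "a", "b"))
DF2 <- data.frame(G1 = c("A", "b", "h"), G2 = c("b", "c", "M"), G3 = c("f", "a", "b"))
DF3 <- data.frame(G1 = c("a", "b", "j"), G2 = c("b", "l", "M"), G3 = c("f", "a", "v"))
df_list <- list(DF1, DF2, DF3)

# count_matches is a hypothetical helper: for one column name n, keep only
# that column from every data frame, inner-join them pairwise on it with
# Reduce, and return the number of rows that survive all joins.
count_matches <- function(n, dfs) {
  cols <- lapply(dfs, function(d) d[n])
  nrow(Reduce(function(x, y) merge(x, y, by = n), cols))
}

# One count per column name, as in the two-data-frame version.
Merged_DF <- sapply(names(DF1), count_matches, dfs = df_list)
print(Merged_DF)
# G1 G2 G3
#  1  1  2
```

Keeping only the key column before merging does not change the row count of an inner join, and it avoids the suffixed duplicate columns (G2.x, G2.y, ...) that pile up when full data frames are merged repeatedly.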

Or merge_recurse from the reshape package:

library(reshape)
data <- merge_recurse(df_list)

See also the R Wiki: Merge data frames

After researching this very same question for a couple of hours today, I came up with this simple but elegant solution using a combination of 'dplyr' pipes and the base R 'merge()' function.

library(dplyr)  # provides the %>% pipe
MergedDF <- merge(DF1, DF2) %>%
              merge(DF3)

As you mention in your post, this assumes that the column names are the same and that each data frame you are merging has the same number of rows. It also automatically eliminates any duplicate columns (i.e., identifiers) that were used in the merging process.
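
One thing worth checking with this approach: when no by argument is given, merge() joins on every column name the two data frames share, so with identical column names only rows that match in all columns are kept. A small sketch with made-up data (an assumption, not the poster's real data), written with nested calls instead of the pipe so that it needs only base R:

```r
# Made-up data with identical column names in all three frames (an assumption).
DF1 <- data.frame(G1 = c("a", "b", "c"), G2 = c("b", "c", "d"), G3 = c("f", "a", "b"))
DF2 <- data.frame(G1 = c("A", "b", "h"), G2 = c("b", "c", "M"), G3 = c("f", "a", "b"))
DF3 <- data.frame(G1 = c("a", "b", "j"), G2 = c("b", "l", "M"), G3 = c("f", "a", "v"))

# Default merge: inner join on all shared columns (G1, G2 and G3 together).
# Same as merge(DF1, DF2) %>% merge(DF3).
inner <- merge(merge(DF1, DF2), DF3)
nrow(inner)   # 0: no row is identical in all three data frames

# all = TRUE turns each step into a full outer join, keeping unmatched rows.
outer <- merge(merge(DF1, DF2, all = TRUE), DF3, all = TRUE)
nrow(outer)   # 7: the distinct rows across the three data frames
```

If the goal is to join on a single column, passing by = "G1" (or another column name) to each merge() call restricts the key to that column.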

Just in case anyone wants to merge multiple data frames with the same column names but unequal numbers of rows, this article was helpful: https://medium.com/coinmonks/merging-multiple-dataframes-in-r-72629c4632a3

Basically, you use the do.call and rbind functions:

Merged <- do.call("rbind", list(df1, df2, df3, df4))
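
Note that rbind stacks rows rather than matching them on a key, so the result has as many rows as all the inputs combined. A minimal sketch with made-up data frames of unequal length (the names and values are assumptions for illustration):

```r
# Made-up data frames with the same column names but different row counts.
df1 <- data.frame(G1 = c("a", "b"),      G2 = c(1, 2))
df2 <- data.frame(G1 = c("c", "d", "e"), G2 = c(3, 4, 5))
df3 <- data.frame(G1 = "f",              G2 = 6)

# do.call passes the list elements as separate arguments: rbind(df1, df2, df3).
Merged <- do.call("rbind", list(df1, df2, df3))

nrow(Merged)   # 6, i.e. 2 + 3 + 1 rows stacked on top of each other
```

This is appending rather than a join: no rows are dropped or matched, and the column names must agree across the inputs.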

Adding a data.table solution:

library(data.table)

Merged <- rbindlist(list(df2, df3, df4))
