How to concatenate multiple dataframes with different column names?
I am trying to concatenate multiple tables. Let's say each of them has 20 columns, but the column names differ between tables. How would I concatenate them?
Table 1:
library(dplyr)  # provides the %>% pipe
a <- matrix(1:6, ncol = 2, byrow = TRUE) %>%
  as.data.frame() %>%
  setNames(c("A1", "B1"))
Table 2:
b <- matrix(7:10, ncol = 2, byrow = TRUE) %>%
  as.data.frame() %>%
  setNames(c("A2", "B2"))
Expected output:
A B Number
1 2 1
3 4 1
5 6 1
7 8 2
9 10 2
I need to do this all the time for tables that can have hundreds of columns. This is my approach, using a reference table with the standardised names under the "name" column.
For big projects I find it helpful to keep the reference table in an Excel file, which is imported with readxl::read_xlsx().
#' fun
#'
#' Rename data frame columns based on reference table
#'
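#' @param df_names character vector of the data frame's current column names
#' @param df name of the column in `reference` that holds this data frame's original column names
#' @param reference data frame of column name mappings, with the standardised names in its `name` column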
#'
#' @return character vector of mapped names
#'
#' @export
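fun <- function(df_names, df, reference) {
  # Look each column name up in the reference column named by `df`;
  # return the standardised name if found, otherwise keep the original name.
  sapply(df_names,
         function(x, y, d) ifelse(x %in% d[[y]], d[d[[y]] == x, ]$name, x),
         y = df,
         d = reference)
}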
reference <- data.frame(name = c("A", "B"), a = c("A1", "B1"), b = c("A2", "B2"))
names(a) <- fun(names(a), "a", reference)
names(b) <- fun(names(b), "b", reference)
a$Number <- 1
b$Number <- 2
rbind(a, b)
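With more than two tables, the same rename-then-bind idea can be applied over a named list rather than one table at a time. The following is only a minimal base R sketch under some assumptions: the list `tables` and the helper `standardise()` are hypothetical names introduced here, and each list element is assumed to be named after its column in `reference`.

```r
# Reference table and inputs as in the example above
reference <- data.frame(name = c("A", "B"),
                        a = c("A1", "B1"),
                        b = c("A2", "B2"))
a <- data.frame(A1 = c(1L, 3L, 5L), B1 = c(2L, 4L, 6L))
b <- data.frame(A2 = c(7L, 9L), B2 = c(8L, 10L))

# Assumption: list names match the column names in `reference`
tables <- list(a = a, b = b)

# Hypothetical helper: rename the columns of `df` using the reference
# column `key`; columns without a mapping keep their original name.
standardise <- function(df, key, reference) {
  lookup <- setNames(as.character(reference$name),
                     as.character(reference[[key]]))
  hit <- names(df) %in% names(lookup)
  names(df)[hit] <- lookup[names(df)[hit]]
  df
}

# Rename every table, tag it with its position in the list, then stack
renamed <- Map(function(df, key) standardise(df, key, reference),
               tables, names(tables))
for (i in seq_along(renamed)) renamed[[i]]$Number <- i

result <- do.call(rbind, renamed)
rownames(result) <- NULL
result
```

Because rbind() matches data frame columns by name, every table needs the same set of standardised names after renaming; a column with no entry in the reference table would make the bind fail.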
Maybe you can try something like the following:
library(dplyr)
library(tidyr)
df1$Number <- 1
df2$Number <- 2
dfout <- bind_rows(df1, df2) %>%
  unite("A", c("A1", "A2"), na.rm = TRUE) %>%
  unite("B", c("B1", "B2"), na.rm = TRUE)
which gives
> dfout
A B Number
1 1 2 1
2 3 4 1
3 5 6 1
4 7 8 2
5 9 10 2
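One thing to keep in mind is that unite() pastes values together as strings, so A and B in the result above are character columns even though they print like numbers. If the original types matter, a possible variation is to merge the columns with dplyr::coalesce() instead. This is only a sketch of that swapped-in technique, assuming the same df1 and df2 as in the Data section:

```r
library(dplyr)

# Same inputs as in the Data section
df1 <- data.frame(A1 = c(1L, 3L, 5L), B1 = c(2L, 4L, 6L), Number = 1)
df2 <- data.frame(A2 = c(7L, 9L), B2 = c(8L, 10L), Number = 2)

# bind_rows() fills the non-matching columns with NA;
# coalesce() then takes the first non-missing value in each row
dfout <- bind_rows(df1, df2) %>%
  mutate(A = coalesce(A1, A2),
         B = coalesce(B1, B2)) %>%
  select(A, B, Number)

str(dfout)
```

Here A and B stay integer, at the cost of listing each pair of columns by hand, which may not scale well to hundreds of columns.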
Data
> dput(df1)
structure(list(A1 = c(1L, 3L, 5L), B1 = c(2L, 4L, 6L), Number = c(1,
1, 1)), row.names = c(NA, -3L), class = "data.frame")
> dput(df2)
structure(list(A2 = c(7L, 9L), B2 = c(8L, 10L), Number = c(2,
2)), row.names = c(NA, -2L), class = "data.frame")