Append multiple Excel files into one Excel file by sheet in R

I have multiple Excel files and want to append them into a single Excel file, sheet by sheet (for example, data from Sheet1 is always appended to Sheet1, data from Sheet2 to Sheet2, and so on).

I want to keep the header from only one of the files; when appending the other files, I want to drop their header rows.

So far I have tried:

library(dplyr)
library(xlsx)

path<- "C:/Users/KJD14/Documents/Test - "
dataFolders<- list.files(path,pattern = "*.xlsx")
dataFolders<- sort(dataFolders[starts_with(match = "Test - ", vars = dataFolders)])
files<- lapply(lapply(dataFolders, FUN = function(x){
  paste(path,x,sep = "/")
}), FUN = function(x){
  list.files(x, pattern = "*.xlsx", full.names = TRUE)
})

Using this approach, you can read in all the sheets of one Excel file:

library(data.table)
library(readxl)
# Using the example Excel file, read in only the first sheet, three times
example_file <- readxl_example("datasets.xlsx")
first_sheet <- excel_sheets(example_file)[1]
list.import <- lapply(rep(first_sheet, 3), function(sheet) {
  read_xlsx(example_file, sheet = sheet)
})

dt <- rbindlist(list.import)

With an additional loop you can also read in more than one Excel file if you like. I just found a new package which, at the moment, is only on GitHub but can probably be installed from there. Please check out https://github.com/ropensci/writexl. To install it and write the result:

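 install.packages("devtools")
 library(devtools)
 install_github("ropensci/writexl")  # install the package from GitHub
 writexl::write_xlsx(dt, path = "temp.xlsx")  # write the combined table to an Excel file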

Please note that I haven't checked whether the last lines of code work properly, so please test them yourself.
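The "additional loop" mentioned above could look roughly like the sketch below. It is an untested-in-production illustration, not a definitive solution: append_by_sheet is a made-up helper name, the two demo workbooks are throwaway files created only so the example runs on its own, and it assumes writexl::write_xlsx is given a named list of data frames so that each element becomes one sheet. Because read_excel treats the first row of every sheet as the header, the header is kept once and not repeated when the pieces are stacked.

```r
library(readxl)
library(data.table)
library(writexl)

# Hypothetical helper: stack same-named sheets from several workbooks.
append_by_sheet <- function(files) {
  # All sheet names that occur in any of the workbooks
  sheets <- unique(unlist(lapply(files, excel_sheets)))
  names(sheets) <- sheets
  lapply(sheets, function(sheet) {
    # Only read the files that actually contain this sheet
    has_sheet <- Filter(function(f) sheet %in% excel_sheets(f), files)
    parts <- lapply(has_sheet, read_excel, sheet = sheet)
    # fill = TRUE pads columns that are missing in some files with NA
    as.data.frame(rbindlist(parts, use.names = TRUE, fill = TRUE))
  })
}

# Demo with two small throwaway workbooks (illustrative data only)
tmp <- tempfile("append_demo_")
dir.create(tmp)
write_xlsx(list(Sheet1 = data.frame(a = 1:2), Sheet2 = data.frame(b = 3:4)),
           file.path(tmp, "f1.xlsx"))
write_xlsx(list(Sheet1 = data.frame(a = 5:6), Sheet2 = data.frame(b = 7:8)),
           file.path(tmp, "f2.xlsx"))

files <- list.files(tmp, pattern = "\\.xlsx$", full.names = TRUE)
combined <- append_by_sheet(files)

# One output workbook, one sheet per original sheet name
write_xlsx(combined, file.path(tmp, "combined.xlsx"))
```

For real data, replace the demo section with your own list.files() call; the pattern "\\.xlsx$" is a regular expression, not a shell glob.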

Here's an option that returns a data frame with columns for the file name and sheet name of each file. In this example, not every file has the same sheets or columns: test2.xlsx has only one sheet, and Sheet1 of test3.xlsx does not have col3.

library(tidyverse)
library(readxl)

dir_path <- "~/test_dir/"         # target directory where the xlsx files are located. 
re_file <- "^test[0-9]\\.xlsx"    # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc.

read_sheets <- function(dir_path, file){
  xlsx_file <- paste0(dir_path, file)
  xlsx_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
    mutate(file_name = file) %>% 
    select(file_name, sheet_name, everything())
}

df <- list.files(dir_path, re_file) %>% 
  map_df(~ read_sheets(dir_path, .))

# A tibble: 15 x 5
   file_name  sheet_name  col1  col2  col3
   <chr>      <chr>      <dbl> <dbl> <dbl>
 1 test1.xlsx Sheet1         1     2     4
 2 test1.xlsx Sheet1         3     2     3
 3 test1.xlsx Sheet1         2     4     4
 4 test1.xlsx Sheet2         3     3     1
 5 test1.xlsx Sheet2         2     2     2
 6 test1.xlsx Sheet2         4     3     4
 7 test2.xlsx Sheet1         1     3     5
 8 test2.xlsx Sheet1         4     4     3
 9 test2.xlsx Sheet1         1     2     2
10 test3.xlsx Sheet1         3     9    NA
11 test3.xlsx Sheet1         4     7    NA
12 test3.xlsx Sheet1         5     3    NA
13 test3.xlsx Sheet2         1     3     4
14 test3.xlsx Sheet2         2     5     9
15 test3.xlsx Sheet2         4     3     1

Then export to a single XLSX file:

library(xlsx)
# as.data.frame() converts the tibble to a plain data frame before writing
write.xlsx(as.data.frame(df), 'file_name.xlsx', sheetName="Sheet1", row.names=TRUE)
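Since the question asks for Sheet1 data to end up in Sheet1 and Sheet2 data in Sheet2, one possible variation is to split the combined data frame on sheet_name before writing. The sketch below is only an illustration: the small df is a stand-in with made-up values, the output path is arbitrary, and it swaps in writexl::write_xlsx (not the xlsx package used above) on the assumption that a named list of data frames is written as one sheet per element.

```r
library(writexl)

# Small stand-in for the combined `df` (values are illustrative only)
df <- data.frame(
  file_name  = c("test1.xlsx", "test1.xlsx", "test3.xlsx", "test3.xlsx"),
  sheet_name = c("Sheet1", "Sheet2", "Sheet1", "Sheet2"),
  col1 = c(1, 3, 3, 1),
  col2 = c(2, 3, 9, 3),
  col3 = c(4, 1, NA, 4)
)

# One data frame per sheet name; the list names become the sheet names
by_sheet <- split(df, df$sheet_name)

out_file <- file.path(tempdir(), "combined_by_sheet.xlsx")
write_xlsx(by_sheet, out_file)
```

The file_name column is kept in each sheet so every row can still be traced back to its source workbook; drop it before writing if it is not needed.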
