
R to Merge All Sheets From All Excel Files

I am trying to merge data from all sheets in all Excel files in a folder. All sheets and all files have the same headers and the same data sets. I thought the code below would read all sheets, but it seems to be reading ONLY the first sheet in each file.

# This needs several other packages
# install.packages("XLConnect")
require(XLConnect)

setwd("C:/Users/Excel/Desktop/Coding/R Programming/Excel/Excel_Files/")

fpattern <- "File.*.xls*?"  # pattern for filenames
output.file <- "Test.xls"
lfiles <- list.files(pattern = fpattern)

# Read data from all sheets
lfiles %>% 
  excel_sheets() %>% 
  set_names() %>% 
  map(read_excel, lfiles = lfiles)

I think the following does what you're looking for. In this example, not every file has the same sheets or columns: test2.xlsx has only one sheet, and Sheet1 of test3.xlsx does not have col3. It also labels each row with the file name and sheet name it came from.

library(tidyverse)
library(readxl)

dir_path <- "~/test_dir/"         # target directory where the xlsx files are located. 
re_file <- "^test[0-9]\\.xlsx"    # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc.

read_sheets <- function(dir_path, file){
  xlsx_file <- paste0(dir_path, file)
  xlsx_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
    mutate(file_name = file) %>% 
    select(file_name, sheet_name, everything())
}

df <- list.files(dir_path, re_file) %>% 
  map_df(~ read_sheets(dir_path, .))

# A tibble: 15 x 5
   file_name  sheet_name  col1  col2  col3
   <chr>      <chr>      <dbl> <dbl> <dbl>
 1 test1.xlsx Sheet1         1     2     4
 2 test1.xlsx Sheet1         3     2     3
 3 test1.xlsx Sheet1         2     4     4
 4 test1.xlsx Sheet2         3     3     1
 5 test1.xlsx Sheet2         2     2     2
 6 test1.xlsx Sheet2         4     3     4
 7 test2.xlsx Sheet1         1     3     5
 8 test2.xlsx Sheet1         4     4     3
 9 test2.xlsx Sheet1         1     2     2
10 test3.xlsx Sheet1         3     9    NA
11 test3.xlsx Sheet1         4     7    NA
12 test3.xlsx Sheet1         5     3    NA
13 test3.xlsx Sheet2         1     3     4
14 test3.xlsx Sheet2         2     5     9
15 test3.xlsx Sheet2         4     3     1
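
The NA values in rows 10 to 12 come from how the per-sheet data frames are combined: map_df() binds rows the way dplyr::bind_rows() does, matching columns by name and filling missing ones with NA, while .id turns the list names into a column. A minimal sketch of that mechanism, using small in-memory tibbles (made up for illustration) in place of real sheets:

```r
library(dplyr)

# Hypothetical stand-ins for two sheets read from one workbook;
# the first sheet lacks col3, like test3.xlsx Sheet1 above.
sheets <- list(
  Sheet1 = tibble(col1 = c(3, 4), col2 = c(9, 7)),
  Sheet2 = tibble(col1 = c(1, 2), col2 = c(3, 5), col3 = c(4, 9))
)

# bind_rows() matches columns by name, fills gaps with NA, and
# stores each list element's name in the column given by .id.
combined <- bind_rows(sheets, .id = "sheet_name")
print(combined)
```

In purrr 1.0.0 and later, map_df() is marked as superseded; map() followed by list_rbind(names_to = "sheet_name") should give the same result if you prefer the newer functions.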

This is an example using only base R functions and XLConnect:

library(XLConnect)

testDir <- "Excel_Files"

re_file <- ".+\\.xls.?"
testFiles <- list.files(testDir, re_file, full.names = TRUE)

# This function rbinds in a single dataframe
# the content of multiple sheets in the same workbook
# (assuming that all the sheets have the same column types)
rbindAllSheets <- function(file) {
  wb <- loadWorkbook(file)
  sheets <- getSheets(wb)
  do.call(rbind,
          lapply(sheets, function(sheet) {
            readWorksheet(wb, sheet)
          })
  )
}

# Getting a single dataframe for all the Excel files
result <- do.call(rbind, lapply(testFiles, rbindAllSheets))
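
Base rbind() stops with an error when the data frames do not share the same column names, so the function above only works when every sheet has an identical layout. If some sheets lack a column, one option is to pad each data frame to the union of all column names before binding. A minimal base R sketch; rbindFill is a hypothetical helper, not part of XLConnect, and the data frames below are made up to stand in for sheets returned by readWorksheet():

```r
# Hypothetical helper: rbind data frames whose columns may differ,
# filling columns that are absent from a data frame with NA.
rbindFill <- function(dfs) {
  allCols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    missingCols <- setdiff(allCols, names(df))
    for (col in missingCols) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df[, allCols, drop = FALSE]  # same column order everywhere
  })
  do.call(rbind, padded)
}

# Made-up data standing in for two worksheets
sheetA <- data.frame(col1 = c(3, 4), col2 = c(9, 7))
sheetB <- data.frame(col1 = c(1, 2), col2 = c(3, 5), col3 = c(4, 9))

merged <- rbindFill(list(sheetA, sheetB))
print(merged)
```

To use it with the code above, the inner do.call(rbind, ...) in rbindAllSheets and the outer one over testFiles could each be swapped for rbindFill(...).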
