R to Merge All Sheets From All Excel Files
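I am trying to merge the data from all sheets in all Excel files in a folder. All sheets and all files have the same headers and the same kind of data. I thought the code below would read every sheet, but it seems to read only the first sheet of each file.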
# This needs several other packages
# install.packages("XLConnect")
require(XLConnect)
setwd("C:/Users/Excel/Desktop/Coding/R Programming/Excel/Excel_Files/")
fpattern <- "File.*.xls*?" # pattern for filenames
output.file <- "Test.xls"
lfiles <- list.files(pattern = fpattern)
# Read data from all sheets
lfiles %>%
excel_sheets() %>%
set_names() %>%
map(read_excel, lfiles = lfiles)
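One thing worth checking in the code above is the filename pattern: the `pattern` argument of `list.files()` is a regular expression, not a shell glob, so `"File.*.xls*?"` is looser than it looks. The sketch below uses `grepl()` on a made-up vector of file names (the names and the stricter pattern are assumptions for illustration, not from the original post) to compare the two.

```r
# Hypothetical file names, used only to illustrate the matching behaviour
files <- c("File1.xlsx", "File2.xls", "MyFile_old_xl.txt", "File3.xlsx.bak")

# Pattern from the question: unanchored, and "." matches any character
loose <- "File.*.xls*?"

# A stricter alternative (an assumption about the intent):
# name starts with "File" and ends in ".xls" or ".xlsx"
strict <- "^File.*\\.xlsx?$"

loose_hits  <- files[grepl(loose, files)]
strict_hits <- files[grepl(strict, files)]

print(loose_hits)
print(strict_hits)
```

If the stricter behaviour is what you want, the same pattern can be passed directly to `list.files(pattern = ...)`.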
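I think the following does what you need. In this example, not every file has the same sheets or columns: test2.xlsx has only one sheet, and Sheet1 of test3.xlsx has no col3. It also tags each row with the file name and sheet name it came from.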
library(tidyverse)
library(readxl)

dir_path <- "~/test_dir/"         # target directory where the xlsx files are located.
re_file <- "^test[0-9]\\.xlsx"    # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc.

read_sheets <- function(dir_path, file){
  xlsx_file <- paste0(dir_path, file)
  xlsx_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>%
    mutate(file_name = file) %>%
    select(file_name, sheet_name, everything())
}

df <- list.files(dir_path, re_file) %>%
  map_df(~ read_sheets(dir_path, .))
# A tibble: 15 x 5
   file_name  sheet_name  col1  col2  col3
   <chr>      <chr>      <dbl> <dbl> <dbl>
 1 test1.xlsx Sheet1         1     2     4
 2 test1.xlsx Sheet1         3     2     3
 3 test1.xlsx Sheet1         2     4     4
 4 test1.xlsx Sheet2         3     3     1
 5 test1.xlsx Sheet2         2     2     2
 6 test1.xlsx Sheet2         4     3     4
 7 test2.xlsx Sheet1         1     3     5
 8 test2.xlsx Sheet1         4     4     3
 9 test2.xlsx Sheet1         1     2     2
10 test3.xlsx Sheet1         3     9    NA
11 test3.xlsx Sheet1         4     7    NA
12 test3.xlsx Sheet1         5     3    NA
13 test3.xlsx Sheet2         1     3     4
14 test3.xlsx Sheet2         2     5     9
15 test3.xlsx Sheet2         4     3     1
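The tagging step in `read_sheets()` relies on `set_names()` naming each element after its sheet, which `map_df()` then writes into the `.id` column. A minimal base R sketch of the same idea follows. It uses small in-memory data frames as stand-ins for sheets so it runs without any Excel files; the data and the helper name `tag_source` are assumptions for illustration only.

```r
# Stand-ins for the sheets of one workbook (hypothetical data)
sheets <- list(
  Sheet1 = data.frame(col1 = c(1, 3), col2 = c(2, 2)),
  Sheet2 = data.frame(col1 = c(3, 2), col2 = c(3, 2))
)

# Hypothetical helper: prepend a column recording where the rows came from
tag_source <- function(df, sheet_name) {
  data.frame(sheet_name = sheet_name, df, stringsAsFactors = FALSE)
}

# Pair each data frame with its list name, then stack the results
tagged <- Map(tag_source, sheets, names(sheets))
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

print(combined)
```

With real workbooks, the list would come from reading each sheet by name rather than being built by hand.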
Here is an example that uses only base R functions and XLConnect:
library(XLConnect)

testDir <- "Excel_Files"
re_file <- ".+\\.xls.?"
testFiles <- list.files(testDir, re_file, full.names = TRUE)

# This function rbinds in a single dataframe
# the content of multiple sheets in the same workbook
# (assuming that all the sheets have the same column types)
rbindAllSheets <- function(file) {
  wb <- loadWorkbook(file)
  sheets <- getSheets(wb)
  do.call(rbind,
    lapply(sheets, function(sheet) {
      readWorksheet(wb, sheet)
    })
  )
}

# Getting a single dataframe for all the Excel files
result <- do.call(rbind, lapply(testFiles, rbindAllSheets))
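Note that `do.call(rbind, ...)` on data frames requires every piece to have the same columns, as the comment in `rbindAllSheets()` assumes. If some sheets are missing a column (as with test3.xlsx in the earlier example), one possible workaround is to pad each data frame with `NA` columns before binding. The sketch below shows this in base R; `rbind_fill` is a hypothetical helper written for this illustration, not a function from XLConnect or base R, and the data frames are made up.

```r
# Hypothetical helper: rbind data frames whose columns differ,
# filling columns that are absent from a data frame with NA
rbind_fill <- function(dfs) {
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- NA          # add the missing column
    }
    d[all_cols]               # put columns in a common order
  })
  do.call(rbind, padded)
}

# Made-up stand-ins for two sheets, the second lacking col3
full_sheet    <- data.frame(col1 = c(1, 3), col2 = c(2, 2), col3 = c(4, 3))
partial_sheet <- data.frame(col1 = c(3, 4), col2 = c(9, 7))

stacked <- rbind_fill(list(full_sheet, partial_sheet))
print(stacked)
```

In the XLConnect example, this helper could stand in for the two `do.call(rbind, ...)` calls if the sheets are not guaranteed to share columns.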