Read a folder of Excel files and import individual sheets as separate data frames in R with names

I have a folder of Excel files, each containing multiple sheets. The sheets have the same names in every workbook. I'm trying to import one specific named sheet from all of the Excel files as separate data frames. I have been able to import them; however, the names become df_1, df_2, df_3, and so on. I've been trying to take the first word of each Excel file name and use it to identify the data frame.

For example, for an Excel file named "AAPL Multiple Sheets", the sheet I'm importing as a data frame is named "Balance". I would like "AAPL Balance df" as the result.

The code below came closest to what I'm looking for; however, it names the data frames df_1, df_2, and so on.

library(purrr)
library(readxl)

files_list <- list.files(path = 'C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/',
pattern = "*.xlsx",full.names = TRUE)

files_list %>% 
    walk2(1:length(files_list),
          ~ assign(paste0("df_", .y), read_excel(path = .x), envir = globalenv()))

I tried using the file path variable 'files_list' in the paste0 function to label them and ended up with

df_C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/.xlsx1, df_C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/.xlsx2, df_C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/.xlsx1, df_C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/.xlsx2,

and so on.

I also tried to make a list of file names to use. This read the file names and created a list, but I couldn't make it work with the code above.

files_Names<-list.files(path='C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data/', pattern=NULL, all.files=FALSE, full.names=FALSE)

This produced entries of the form "AAPL Analysis of Data.xlsx" for every file in the list.

I could not run your example, so I hope I have understood it correctly. I would create a function to have more control over the new file name.

I would suggest:

library(purrr)
library(readxl)
library(openxlsx)

target_folder <- 'C:/Users/example/Drive/Desktop/Total_Related_Data/Analysis of Data'

# Full paths of every .xlsx workbook in the folder
files_list <- list.files(path = target_folder,
                         pattern = "\\.xlsx$", full.names = TRUE)

# Read the "Balance" sheet of one workbook and save it as a new workbook
tease_out <- function(file) {
  data <- read_excel(file, sheet = "Balance")
  filename <- basename(file) %>% tools::file_path_sans_ext()
  new_filename <- paste0(target_folder, "/", filename, " Balance df.xlsx")

  write.xlsx(data, file = new_filename)
}

map(files_list, tease_out)

Let me know if it works. I assume you are only targeting the sheet "Balance"?
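If the goal is data frames in the R session rather than new files on disk, the naming step can be separated from the reading step. The following is only a minimal sketch under that assumption: `make_df_name` and `read_sheet_from_all` are hypothetical helper names, and the example paths are made up, so nothing is read from disk when the block runs.

```r
# Hypothetical helper: "…/AAPL Multiple Sheets.xlsx" -> "AAPL_Balance_df"
make_df_name <- function(path, sheet = "Balance") {
  if (length(path) == 0) return(character(0))        # nothing to name
  stem <- tools::file_path_sans_ext(basename(path))  # drop folder and extension
  first_word <- sub(" .*$", "", stem)                # keep text before the first space
  paste0(first_word, "_", sheet, "_df")
}

# Hypothetical wrapper: read one sheet from every workbook into a named list.
# Defined here but not called, because it needs real files and the readxl package.
read_sheet_from_all <- function(files, sheet = "Balance") {
  dfs <- lapply(files, function(f) readxl::read_excel(f, sheet = sheet))
  stats::setNames(dfs, make_df_name(files, sheet))
}

# Made-up paths, used only to show the resulting names
example_paths <- c("C:/data/AAPL Multiple Sheets.xlsx",
                   "C:/data/MSFT Multiple Sheets.xlsx")
df_names <- make_df_name(example_paths)
print(df_names)
```

With real files, something like `balance_dfs <- read_sheet_from_all(files_list)` would give a list whose elements can be reached as `balance_dfs$AAPL_Balance_df`. Underscores are used because object names cannot contain spaces without backticks.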

You can do the following (note that I'm using the openxlsx package to read the Excel files, but you can of course replace that part with readxl):

library(openxlsx)
library(tidyverse)

Starting with your `files_list` we can do:

# using lapply to read in all files and store them as list elements in one list
list_of_dfs <- lapply(as.list(files_list), function(x) readWorkbook(x, sheet = "Balance"))

# Create a vector of names based on the first word of the filename + "Balance"
# Note that we can't use empty space in object names, hence the underscore
df_names <- paste0(str_extract(basename(files_list), "[^ ]+"), "_Balance_df")

# Assign the names to our list of dfs
names(list_of_dfs) <- df_names

# Push the list elements (i.e. data frames) to the global environment.
# I highly recommend NOT doing this: in 99% of cases it's better to keep working
# with the list or to combine the individual dfs into one large df.
list2env(list_of_dfs, envir = .GlobalEnv)
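To illustrate what staying in the list structure might look like, here is a small base R sketch. The two toy data frames and their values are invented stand-ins for the real `list_of_dfs`, and the `source` column name is an assumption chosen for the example.

```r
# Toy stand-in for list_of_dfs (made-up values, for illustration only)
list_of_dfs <- list(
  AAPL_Balance_df = data.frame(item = c("Cash", "Debt"), value = c(10, 4)),
  MSFT_Balance_df = data.frame(item = c("Cash", "Debt"), value = c(7, 2))
)

# Access a single data frame by its name
aapl <- list_of_dfs[["AAPL_Balance_df"]]

# Apply the same operation to every data frame
row_counts <- vapply(list_of_dfs, nrow, integer(1))

# Stack everything into one data frame, keeping the list name as a "source" column
tagged <- Map(function(df, nm) cbind(source = nm, df),
              list_of_dfs, names(list_of_dfs))
combined <- do.call(rbind, unname(tagged))
rownames(combined) <- NULL
```

In a tidyverse workflow, `dplyr::bind_rows(list_of_dfs, .id = "source")` performs the same stacking in one call.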
