
How do I write a function (analogous to a SAS macro) in R to import and format a list of Excel files?

I'm looking for a more efficient way to write the following:

Read in all my Excel files

DF1 <- read_excel(DF1, sheet = "ABC", range = cell_cols(1:10) )
DF2 <- read_excel(DF2, sheet = "ABC", range = cell_cols(1:10) )
etc...
DF50 <- read_excel(DF50, sheet = "ABC", range = cell_cols(1:10) )

Add a column to each DF with a location

DF1$Location <- location1
DF2$Location <- location2
etc...
DF50$Location <- location50

Keep only columns with specified names, get rid of blank rows, and convert column CR_NUMBER to an integer

library(hablar)
DF1 <- DF1 %>% select(all_of(colnames_r)) %>% filter(!is.na(NAME)) %>% convert(int(CR_NUMBER))
DF2 <- DF2 %>% select(all_of(colnames_r)) %>% filter(!is.na(NAME)) %>% convert(int(CR_NUMBER))
etc...
DF50 <- DF50 %>% select(all_of(colnames_r)) %>% filter(!is.na(NAME)) %>% convert(int(CR_NUMBER))

You can read all the files into a single list of data frames, applying the same steps to each one:

library(readxl)
library(hablar)
library(dplyr)

# Get the full paths of all files whose names contain "DF" followed by a number.
file_names <- list.files('/folder/path', pattern = 'DF\\d+', full.names = TRUE)

list_data <- lapply(seq_along(file_names), function(x) {
  # Read the first 10 columns of sheet "ABC" from the x-th file.
  data <- read_excel(file_names[x], sheet = "ABC", range = cell_cols(1:10))
  data %>%
    mutate(Location = paste0('location', x)) %>%
    select(all_of(colnames_r)) %>%
    filter(!is.na(NAME)) %>%
    convert(int(CR_NUMBER))
})
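
One caveat with labelling by position: list.files() returns names in alphabetical order, so DF10 sorts before DF2 and the index x does not necessarily match the number in the file name. A minimal sketch of one way around this, assuming the files are named like DF1.xlsx (the .xlsx extension and the example paths are assumptions, not from the question), is to extract the number from each file name and use it for both ordering and labelling:

```r
# Hypothetical paths, in the alphabetical order list.files() would return them.
file_names <- c("/folder/path/DF1.xlsx",
                "/folder/path/DF10.xlsx",
                "/folder/path/DF2.xlsx")

# Pull the number that follows "DF" out of each file name.
file_num <- as.integer(sub("^DF(\\d+).*$", "\\1", basename(file_names)))

# Reorder so the files run DF1, DF2, ..., DF10 in numeric order.
file_names <- file_names[order(file_num)]
file_num <- sort(file_num)

# Build each label from the file's own number rather than its position.
locations <- paste0("location", file_num)
```

Inside the lapply() call, paste0('location', file_num[x]) could then replace paste0('location', x).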

list_data is a list of data frames, which is usually easier to manage than 50 separate data frames in the global environment. If you still want each data frame as its own object, name the list elements and use list2env.

names(list_data) <- paste0('DF', seq_along(list_data))
list2env(list_data, .GlobalEnv)
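
Since the question asks for something analogous to a SAS macro, the per-file steps can also be wrapped in a named function that takes the file path and location as arguments. The following is only a sketch: clean_sheet, import_sheet and the locations vector are hypothetical names, and as.integer() inside mutate() is swapped in for hablar's convert(int()) to keep the dependencies down. The cleaning step is split out so it can be tried on an in-memory data frame without any Excel files.

```r
library(readxl)
library(dplyr)

# Cleaning step on an in-memory data frame.
clean_sheet <- function(data, location, keep_cols) {
  data %>%
    mutate(Location = location) %>%            # add the location column
    select(all_of(keep_cols)) %>%              # keep only the named columns
    filter(!is.na(NAME)) %>%                   # drop blank rows
    mutate(CR_NUMBER = as.integer(CR_NUMBER))  # convert CR_NUMBER to integer
}

# Read one workbook and clean it; sheet name and column range follow the question.
import_sheet <- function(path, location, keep_cols) {
  data <- read_excel(path, sheet = "ABC", range = cell_cols(1:10))
  clean_sheet(data, location, keep_cols)
}

# Toy data standing in for one sheet (made-up values).
toy <- data.frame(NAME = c("a", NA, "c"),
                  CR_NUMBER = c("1", "2", "3"),
                  EXTRA = 1:3)
cleaned <- clean_sheet(toy, "Site A", c("NAME", "CR_NUMBER", "Location"))

# With real files, one call per file/location pair
# (locations is a hypothetical character vector, one entry per file):
# list_data <- Map(function(p, loc) import_sheet(p, loc, colnames_r),
#                  file_names, locations)
```

Passing the location as an argument keeps the function independent of how the locations are stored, so the same function works whether they come from a vector, a lookup table, or the file name.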


 