简体   繁体   中英

R download and read many excel files, automatically

I need to download a few hundred number of excel files and import them into R each day. Each one should be their own data-frame. I have a csv. file with all the adresses (the adresses remains static).

The csv. file looks like this:

http://www.www.somehomepage.com/chartserver/hometolotsoffiles%a
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%b
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%a0
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%aa11
etc.....

I can do it with a single file like this:

library(XLConnect)

my.url <- "http://www.somehomepage.com/chartserver/hometolotsoffiles%a"

loc.download <- "C:/R/lotsofdata/" # each files probably needs to have their own name here? 

download.file(my.url, loc.download, mode="wb")

df.import.x1 = readWorksheetFromFile("loc.download", sheet=2)) 

# This kind of import works on all the files, if you ran them individually

But I have no idea how to download each file, and place it separately in a folder, and then import them all into R as individual data frames.

It's hard to answer your question as you haven't provided a reproducible example and it isn't clear what you exactly want. Anyway, the code below should point you in the right direction.

  1. You have a list of urls you want to visit:

     urls = c("http://www/chartserver/hometolotsoffiles%a", "http://www/chartserver/hometolotsoffiles%b") 

    in your example, you load this from a csv file

  2. Next we download each file and put it in a separate directory (you mentioned that in your question

     for(url in urls) { split_url = strsplit(url, "/")[[1]] ##Extract final part of URL dir = split_url[length(split_url)] ##Create a directory dir.create(dir) ##Download the file download.file(url, dir, mode="wb") } 
  3. Then we loop over the directories and files and store the results in a list.

     ##Read in files l = list(); i = 1 dirs = list.dirs("/data/", recursive=FALSE) for(dir in dirs){ file = list.files(dir, full.names=TRUE) ##Do something? ##Perhaps store sheets as a list l[[i]] = readWorksheetFromFile(file, sheet=2) i = i + 1 } 

We could of course combine steps two and three into a single loop. Or drop the loops and use sapply .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM