Download all files (.zip and .txt) from a web page using R

Question

I'm trying to download all the files (.zip and .txt) from a website, but I can't find a way to do it. I tried the suggestions from this and this, without success.

Website: https://pubs.usgs.gov/sir/2007/5107/downloads/

(I need to do this for several similar USGS pages, so doing it by hand is not an option.)

Answer 1

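Here, try this. It works because the file URLs follow a repeatable pattern. Pulling the file names out of the web page is a bit clunky, but it does seem to work.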

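Many of the text files may be missing an end-of-line marker on the last line (this is common), which may raise an error or warning. It is probably not an important one. If it happens, open the downloaded txt file to make sure it downloaded correctly. There is no doubt a way to automate that step, but I'm out of time for this, Dude (or Dudette, or whatever you prefer).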
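One way the manual check might be automated is sketched below. The helper name `check_txt_download` is hypothetical (it is not part of the answer's code), and it assumes that "downloaded correctly" simply means the file exists, is not empty, and can be read line by line. `warn = FALSE` silences the warning `readLines()` gives for an incomplete final line.

```r
# Hypothetical helper (an assumption, not from the original answer):
# TRUE when a downloaded text file exists, is non-empty, and is readable.
check_txt_download <- function(path) {
  if (!file.exists(path)) return(FALSE)
  if (file.info(path)$size == 0) return(FALSE)
  # warn = FALSE suppresses the "incomplete final line" warning raised
  # when the last line has no end-of-line marker
  lines <- tryCatch(readLines(path, warn = FALSE),
                    error = function(e) NULL)
  !is.null(lines) && length(lines) > 0
}

# Self-contained demonstration: a temporary file with no trailing newline
tmp <- tempfile(fileext = ".txt")
cat("first line\nsecond line", file = tmp)
ok <- check_txt_download(tmp)

# A path that was never written should fail the check
missing_ok <- check_txt_download(tempfile())
print(c(ok, missing_ok))
```

After a real run, the same helper could be applied to the downloaded file names, for example with `sapply(list.files(pattern = "\\.txt$"), check_txt_download)`.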

#get homepage for locations
page <- "https://pubs.usgs.gov/sir/2007/5107/downloads/"
a <- readLines(page)

#find lines of interest
#fixed = TRUE treats "." as a literal dot, not a regex wildcard
loc.txt <- grep(".txt", a, fixed = TRUE)
loc.zip <- grep(".zip", a, fixed = TRUE)

#A convenience function that takes
#line   : a line from the original page
#marker : file-type marker used to locate the file name
#page   : url of the original page
#------------------------------------
convfn <- function(line, marker, page){
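  #position of the first character after href="
  i <- unlist(gregexpr(pattern = 'href="', line, fixed = TRUE)) + 6
  #position of the last character of the extension (marker starts at the dot)
  i2 <- unlist(gregexpr(pattern = marker, line, fixed = TRUE)) + 3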
  #target file
  .destfile <- substring(line, i[1], i2[1])
  #target url
  #drop any trailing slash from page first to avoid a double "//"
  .url      <- paste(sub("/$", "", page), .destfile, sep = "/")
  #print targets
  cat(.url, '\n', .destfile, '\n')
  #the workhorse function
  #mode = "wb" keeps binary files such as .zip intact on Windows
  download.file(url=.url, destfile=.destfile, mode="wb")
  }
#--------------------------------------------

#they will save in your working directory
#use setwd() to change if needed
print(getwd())
  
#get the .txt files and download them
sapply(a[loc.txt], 
       FUN = convfn, 
       marker = '.txt"', #this is key part, locates text file name
       page = page)

#get the .zip files and download them
sapply(a[loc.zip], 
       FUN = convfn, 
       marker = '.zip"', #this is key part, locates zip file name
       page = page)
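Since extracting names by character position is admittedly clunky, and the question mentions several similar pages, a regex-based variant might look like the sketch below. The names `extract_links` and `download_from_page` are hypothetical, the sample HTML lines are made up for illustration, and the pattern assumes the links appear as plain `href="name.ext"` attributes with relative file names.

```r
# Hypothetical helper: return every href value ending in one of the
# given extensions, from a character vector of HTML lines.
extract_links <- function(html, exts = c("txt", "zip")) {
  # builds a pattern such as href="[^"]*\.(txt|zip)"
  pattern <- paste0('href="[^"]*\\.(', paste(exts, collapse = "|"), ')"')
  hits <- unlist(regmatches(html, gregexpr(pattern, html)))
  # strip the leading href=" and the trailing quote
  unique(sub('^href="', "", sub('"$', "", hits)))
}

# Made-up sample lines standing in for readLines(page)
sample_html <- c(
  '<li><a href="readme.txt">readme.txt</a></li>',
  '<li><a href="data.zip">data.zip</a> <a href="notes.txt">notes.txt</a></li>',
  '<li><a href="index.html">index.html</a></li>'
)
files <- extract_links(sample_html)
print(files)

# Full URLs with exactly one slash between page and file name
page <- "https://pubs.usgs.gov/sir/2007/5107/downloads/"
urls <- paste0(sub("/+$", "", page), "/", files)

# Hypothetical wrapper for one page (defined here but not run, since it
# needs network access); files are saved in the working directory.
download_from_page <- function(page, exts = c("txt", "zip")) {
  files <- extract_links(readLines(page, warn = FALSE), exts)
  for (f in files) {
    download.file(url = paste0(sub("/+$", "", page), "/", f),
                  destfile = basename(f), mode = "wb")
  }
  invisible(files)
}
```

For several pages, something like `lapply(pages, download_from_page)` could be used, where `pages` is a character vector of page URLs you supply.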
