
How to use download.file to download images from the web in R

I have the following code:

library(rvest)
library(httr)
interval <- seq(0, 55, by = 5)
hour <- c("01", "02", "03")
for (h in hour) {
  for (i in interval) {
    print(i, length(i))
    if (i == 0 | i == 5) {
      url <- paste("https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_20220731", h, "0", i, "0000dBR.dpsri.png", sep = "")
    } else {
       url <- paste("https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_20220731", h, i, "0000dBR.dpsri.png", sep = "")
    }
    imgurl <- read_html(url) %>%
      html_node(css = "img") %>%
      html_attr("src")
    download.file (imgurl, destfile = "C:/Users/~/Downloads/DSA2101 Main/data/radar_files")
  }
} 

I am trying to scrape the 240 km radar scans from the URL in the code above and download them into radar_files, a folder I created, before zipping that folder. However, when I run the code, I get

Warning: URL NA: cannot open destfile 'C:/Users/~/Downloads/DSA2101 Main/data/radar_files', reason 'No such file or directory'
Warning: download had nonzero exit status

Where did I go wrong?

Thank you.

  1. From the error message I would guess that the folder does not exist.

  2. Even if the folder exists, your destfile contains only the path to the folder. You also have to add a filename. To this end I use basename(url), which returns the original filename from the url.

  3. As your url already points directly to the png, I don't understand what you are trying to achieve with the rvest code. The "URL NA" in the warning suggests that imgurl ended up as NA, so download.file never received a valid URL.
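To illustrate point 2, here is a minimal sketch of building a destfile that names a file rather than a folder. The out_dir under tempdir() is a placeholder chosen for illustration, so replace it with your own folder. Nothing is downloaded in this snippet.

```r
# One of the image URLs from the question (hour "01", minute "00")
url <- "https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_2022073101000000dBR.dpsri.png"

# Placeholder folder for illustration; replace with the path to your folder
out_dir <- file.path(tempdir(), "radar_files")

# Create the folder only if it does not exist yet
if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)

# basename() keeps only the part of the URL after the last "/"
basename(url)
#> [1] "dpsri_240km_2022073101000000dBR.dpsri.png"

# destfile must be folder + filename, not the folder alone
destfile <- file.path(out_dir, basename(url))
```

Passing this destfile to download.file gives it an actual file to write to inside the folder.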

That said, below is a minimal reprex that works fine for me.

Note: I download only three of the images.

# Create a temporary directory for the reprex
folder <- tempdir() # Replace with the path to your folder

# Create a folder to store the pngs
dir.create(file.path(folder, "radar_files"))

interval <- c(0, 5, 10)
hour <- c("01")

for (h in hour) {
  for (i in interval) {
    print(i) # show progress
    if (i %in% c(0, 5)) {
      url <- paste("https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_20220731", h, "0", i, "0000dBR.dpsri.png", sep = "")
    } else {
      url <- paste("https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_20220731", h, i, "0000dBR.dpsri.png", sep = "")
    }
    
    # mode = "wb" writes in binary mode, which matters for image files on Windows
    download.file(url, destfile = file.path(folder, "radar_files", basename(url)), mode = "wb")
  }
}
#> [1] 0
#> [1] 5
#> [1] 10

dir(file.path(folder, "radar_files"))
#> [1] "dpsri_240km_2022073101000000dBR.dpsri.png"
#> [2] "dpsri_240km_2022073101050000dBR.dpsri.png"
#> [3] "dpsri_240km_2022073101100000dBR.dpsri.png"
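As a possible variation on the loop above, the if/else that pads the minutes with a leading zero could be replaced by sprintf with the "%02d" format, building all URLs in one step. This is only a sketch: base_url, grid, urls and the download_radar helper are names introduced here for illustration, and whether every timestamp actually exists on the server is an assumption, hence the tryCatch.

```r
base_url <- "https://www.nea.gov.sg/docs/default-source/rain-area-240km/dpsri_240km_20220731"
hour <- c("01", "02", "03")
interval <- seq(0, 55, by = 5)

# All hour/minute combinations; minute varies fastest, like the nested loop
grid <- expand.grid(minute = interval, hour = hour, stringsAsFactors = FALSE)

# "%02d" pads the minute to two digits, so 0 becomes "00" and 5 becomes "05"
urls <- sprintf("%s%s%02d0000dBR.dpsri.png", base_url, grid$hour, as.integer(grid$minute))

# Hypothetical helper: downloads each URL into out_dir and skips failures
download_radar <- function(urls, out_dir) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  for (u in urls) {
    dest <- file.path(out_dir, basename(u))
    tryCatch(
      download.file(u, destfile = dest, mode = "wb"),
      error = function(e) message("Skipping ", u, ": ", conditionMessage(e))
    )
  }
  invisible(dir(out_dir))
}

# Usage (not run here, as it needs network access):
# download_radar(urls, file.path(tempdir(), "radar_files"))
```

The first three elements of urls correspond to the three files downloaded in the reprex above.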
