
How to download files via RSelenium

I have been trying to download files via RSelenium for 4 days without any success. All I want is for RSelenium to click the download link and save the file to local disk.

I am using Docker on a Windows 10 machine.

On this page: https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html

I want to download the "List of eligible securities" file.

Based on feedback from @Nad pat below, I have updated my code. I am still not able to download the files to the specified directory!

Updated code

library(tidyverse)
library(lubridate)
library(XML)
library(readxl)
library(janitor)    

library(RSelenium)


fprof <- makeFirefoxProfile(list(
    
    browser.download.manager.showWhenStarting = FALSE,
    
    browser.download.dir = str_replace_all("c:/users/joe/downloads/", "/", "\\\\\\\\"),
    # browser.download.dir = "c:/users/joe/downloads",
    # browser.download.useDownloadDir = str_replace_all("c:/users/joe/downloads/", "/", "\\\\\\\\"),

    
    browser.helperApps.neverAsk.openFile = "text/csv",
    
    browser.helperApps.neverAsk.saveToDisk = "text/plain,application/octet-stream,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    # browser.helperApps.neverAsk.saveToDisk="text/csv",
    
    browser.download.folderList = 2L
                                 )
)


link_to_chart <- "https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html"

remDr <- remoteDriver(remoteServerAddr = "localhost", 
                      port = 4445L, browserName = "firefox", 
                      extraCapabilities = fprof)

remDr$open(silent = TRUE)
remDr$navigate(link_to_chart)
remDr$screenshot(display = TRUE) #This will take a screenshot and display it in the RStudio viewer


#download file - 2 options
remDr$findElement(using ="class", "anchor-xls")$clickElement()
remDr$findElement(using ="xpath", '//*[@id="content"]/section/div[2]/table/tbody/tr[1]/th/a')$clickElement()


remDr$closeWindow()

Old Code

This is the best code I could come up with:

library(tidyverse)
library(lubridate)

library(XML)
library(readxl)
library(janitor)
library(platus)


library(RSelenium)

# MIME types that Firefox should save without prompting
firefor_exemptions <- c("application/vnd.openxmlformats-officedocument.presentationml.presentation",
  "application/vnd.openxmlformats-officedocument.presentationml.slide",
  "application/vnd.openxmlformats-officedocument.presentationml.slideshow",
  "application/vnd.openxmlformats-officedocument.presentationml.template",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.template",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
  "application/x-msbinder",
  "application/vnd.ms-officetheme",
  "application/onenote",
  "audio/vnd.ms-playready.media.pya",
  "video/vnd.ms-playready.media.pyv",
  "application/vnd.ms-powerpoint",
  "application/vnd.ms-powerpoint.addin.macroenabled.12",
  "application/vnd.ms-powerpoint.slide.macroenabled.12",
  "application/vnd.ms-powerpoint.presentation.macroenabled.12",
  "application/vnd.ms-powerpoint.slideshow.macroenabled.12",
  "application/vnd.ms-project",
  "application/x-mspublisher",
  "application/x-msschedule",
  "application/x-silverlight-app"
)

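fprof <- makeFirefoxProfile(list(
  # Firefox expects this preference as one comma-separated string, not a vector
  browser.helperApps.neverAsk.saveToDisk = paste(firefor_exemptions, collapse = ","),
  browser.download.dir = str_replace_all("c:/users/john_doe/downloads", "/", "\\\\\\\\")
))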



link_to_chart <- "https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html"

remDr <- remoteDriver(remoteServerAddr = "localhost", 
                      port = 4445L, browserName = "firefox", 
                      extraCapabilities = fprof)

remDr$open(silent = FALSE)
remDr$navigate(link_to_chart)

#I'm not sure what I should be selecting
remDr$findElement("class", "js-no-cache anchor-to-file anchor-xls")$clickElement()

remDr$findElement("link", "List of eligible securities")$clickElement()


remDr$closeWindow()

Any ideas on how to make this work?

To make Chrome download into the working directory, we can use:

# Working directory as a Windows-style path (backslash separators)
file_path <- normalizePath(getwd(), winslash = "\\")

# Tell Chrome to save downloads into that directory
eCaps <- list(
  chromeOptions =
    list(prefs = list('download.default_directory' = file_path))
)

driver <- rsDriver(browser = "chrome", port = 9995L, extraCapabilities = eCaps)

remDr <- driver[["client"]]
remDr$navigate(link_to_chart)

Using the XPath, I was able to download the file:

remDr$findElement(using ="xpath", '//*[@id="content"]/section/div[2]/table/tbody/tr[1]/th/a')$clickElement()

Or, using the class name:

remDr$findElement(using ="class", 'anchor-xls')$clickElement()
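clickElement() returns as soon as the click is sent, so the file may still be downloading when the next line of the script runs. A minimal sketch of one way to wait for it, assuming the browser writes partial downloads with a .crdownload (Chrome) or .part (Firefox) suffix; wait_for_download is a hypothetical helper, not part of RSelenium:

```r
# Hypothetical helper (not an RSelenium function): poll `dir` until a file
# matching `pattern` exists and no partial-download files remain, or stop
# with an error once `timeout` seconds have passed.
wait_for_download <- function(dir, pattern, timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    done    <- list.files(dir, pattern = pattern, full.names = TRUE)
    # Assumption: in-progress downloads end in .crdownload or .part
    partial <- list.files(dir, pattern = "\\.(crdownload|part)$")
    if (length(done) > 0 && length(partial) == 0) {
      return(done[1])
    }
    if (Sys.time() >= deadline) {
      stop("Timed out waiting for a download in ", dir)
    }
    Sys.sleep(interval)
  }
}

# Possible usage after the click, with file_path as defined above:
# xls_path <- wait_for_download(file_path, "\\.xls$")
```

Polling the directory keeps the script independent of the browser, at the cost of relying on the temporary-file naming assumed above.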

Why use Selenium? You can download the file with download.file once you know its URL:

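url <- "https://www.rba.gov.au/mkt-operations/xls/eligible-securities.xls"
# Append a timestamp as a query parameter so each request has a unique URL
url <- paste0(url, "?v=", gsub(" |:", "-", as.character(Sys.time())))
# mode = "wb" writes the file in binary mode, which .xls files need on Windows
download.file(url, "securities.xls", mode = "wb")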

This saves the file to disk.
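The same approach can be wrapped in small functions so the timestamped URL is built in one place. This is only a sketch: cache_busted_url and download_securities are hypothetical names, and the "?v=" parameter is assumed to serve only to avoid a cached copy, as in the snippet above.

```r
# Hypothetical helper: append a timestamp query parameter to `url`.
# Assumption: the server ignores the value of "v"; it only makes the URL unique.
cache_busted_url <- function(url, time = Sys.time()) {
  stamp <- format(time, "%Y-%m-%d-%H-%M-%S")
  paste0(url, "?v=", stamp)
}

# Hypothetical wrapper: download the spreadsheet and return the local path.
download_securities <- function(dest = "securities.xls") {
  base_url <- "https://www.rba.gov.au/mkt-operations/xls/eligible-securities.xls"
  # Binary mode keeps the .xls file intact on Windows
  download.file(cache_busted_url(base_url), dest, mode = "wb")
  # Fail loudly if nothing usable was written
  stopifnot(file.exists(dest), file.size(dest) > 0)
  dest
}

# Possible usage (needs network access):
# path <- download_securities()
```

Passing the time as an argument makes the URL builder easy to check without touching the network.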

