How to download files via RSelenium

I have been trying to download files via RSelenium for four days without success. All I want is for RSelenium to click the download link and save the file to a local directory on disk.

I am using Docker on a Windows 10 machine.

On this page: https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html

I want to download the "List of eligible securities" file.

Based on feedback from @Nad pat below, I have updated my code. I am still not able to download the files to the specified directory!

Updated code

library(tidyverse)
library(lubridate)
library(XML)
library(readxl)
library(janitor)    

library(RSelenium)


fprof <- makeFirefoxProfile(list(
    
    browser.download.manager.showWhenStarting = FALSE,
    
    browser.download.dir = str_replace_all("c:/users/joe/downloads/", "/", "\\\\\\\\"),
    # browser.download.dir = "c:/users/joe/downloads",
    # browser.download.useDownloadDir = str_replace_all("c:/users/joe/downloads/", "/", "\\\\\\\\"),

    
    browser.helperApps.neverAsk.openFile = "text/csv",
    
    browser.helperApps.neverAsk.saveToDisk = "text/plain,application/octet-stream,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    # browser.helperApps.neverAsk.saveToDisk="text/csv",
    
    browser.download.folderList = 2L
                                 )
)


link_to_chart <- "https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html"

remDr <- remoteDriver(remoteServerAddr = "localhost", 
                      port = 4445L, browserName = "firefox", 
                      extraCapabilities = fprof)

remDr$open(silent = TRUE)
remDr$navigate(link_to_chart)
remDr$screenshot(display = TRUE) #This will take a screenshot and display it in the RStudio viewer


#download file - 2 options
remDr$findElement(using ="class", "anchor-xls")$clickElement()
remDr$findElement(using ="xpath", '//*[@id="content"]/section/div[2]/table/tbody/tr[1]/th/a')$clickElement()


remDr$closeWindow()

Old code

This is the best code I could come up with:

library(tidyverse)
library(lubridate)

library(XML)
library(readxl)
library(janitor)
library(platus)


library(RSelenium)

firefor_exemptions <- c("application/vnd.openxmlformats-officedocument.presentationml.presentation",
  "application/vnd.openxmlformats-officedocument.presentationml.slide",
  "application/vnd.openxmlformats-officedocument.presentationml.slideshow",
  "application/vnd.openxmlformats-officedocument.presentationml.template",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.template",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
  "application/x-msbinder",
  "application/vnd.ms-officetheme",
  "application/onenote",
  "audio/vnd.ms-playready.media.pya",
  "video/vnd.ms-playready.media.pyv",
  "application/vnd.ms-powerpoint",
  "application/vnd.ms-powerpoint.addin.macroenabled.12",
  "application/vnd.ms-powerpoint.slide.macroenabled.12",
  "application/vnd.ms-powerpoint.presentation.macroenabled.12",
  "application/vnd.ms-powerpoint.slideshow.macroenabled.12",
  "application/vnd.ms-project",
  "application/x-mspublisher",
  "application/x-msschedule",
  "application/x-silverlight-app"
)

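# Firefox expects this preference as one comma-separated string of MIME types
fprof <- makeFirefoxProfile(list(browser.helperApps.neverAsk.saveToDisk = paste(firefor_exemptions, collapse = ","),
                                 browser.download.dir = str_replace_all("c:/users/john_doe/downloads", "/", "\\\\\\\\"))
)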



link_to_chart <- "https://www.rba.gov.au/mkt-operations/resources/tech-notes/eligible-securities.html"

remDr <- remoteDriver(remoteServerAddr = "localhost", 
                      port = 4445L, browserName = "firefox", 
                      extraCapabilities = fprof)

remDr$open(silent = FALSE)
remDr$navigate(link_to_chart)

#I'm not sure what I should be selecting
remDr$findElement("class", "js-no-cache anchor-to-file anchor-xls")$clickElement()

remDr$findElement("link", "List of eligible securities")$clickElement()


remDr$closeWindow()

Any ideas on how to make this work?

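To make Chrome download into the working directory, we can use:

# Assumes tidyverse and RSelenium are loaded and link_to_chart is defined, as in the question
# Replace forward slashes with escaped backslashes to get a Windows-style path
file_path <- getwd() %>% str_replace_all("/", "\\\\\\\\")

# Set Chrome's default download directory through its preferences
eCaps <- list(
  chromeOptions = 
    list(prefs = list('download.default_directory' = file_path))
)

driver <- rsDriver(browser = "chrome", port = 9995L, extraCapabilities = eCaps)

remDr <- driver[["client"]]
remDr$navigate(link_to_chart)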

Using the XPath, I was able to download the file:

remDr$findElement(using = "xpath", '//*[@id="content"]/section/div[2]/table/tbody/tr[1]/th/a')$clickElement()

Or using the class:

remDr$findElement(using = "class", 'anchor-xls')$clickElement()
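Clicking the link only starts the download, so a script that reads the file right away may run before the browser has finished writing it. Below is a minimal sketch of a polling helper, not part of the original answer. The name `wait_for_download` and its arguments are hypothetical. It assumes the download directory is visible from the R session (with Docker this presumably means a mounted volume) and that unfinished downloads carry the `.crdownload` (Chrome) or `.part` (Firefox) suffix.

```r
# Hypothetical helper: poll a directory until a finished download shows up.
# Assumes in-progress files end in .crdownload (Chrome) or .part (Firefox).
wait_for_download <- function(dir, pattern = "\\.xls$", timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    done <- list.files(dir, pattern = pattern, full.names = TRUE)
    partial <- list.files(dir, pattern = "\\.(crdownload|part)$")
    # Succeed once a matching file exists and nothing is still being written
    if (length(done) > 0 && length(partial) == 0) {
      return(done)
    }
    if (Sys.time() >= deadline) {
      stop("Timed out waiting for a download in ", dir)
    }
    Sys.sleep(interval)
  }
}
```

After the `clickElement()` call, something like `wait_for_download(file_path)` would then return the path(s) of the finished file, or stop with an error once the timeout expires.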

Why use Selenium? If you find out the file's URL, you can download it directly with download.file:

url <- "https://www.rba.gov.au/mkt-operations/xls/eligible-securities.xls"
# Append a timestamp as a query parameter so a cached copy is not served
url <- paste0(url, "?v=", gsub(" |:", "-", as.character(Sys.time())))
# mode = "wb" keeps the binary .xls file intact on Windows
download.file(url, "securities.xls", mode = "wb")

This saves the file to disk.
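The direct-download approach can be wrapped into two small functions, sketched here under stated assumptions. The names `cache_busted_url` and `download_binary` are hypothetical and not from the original answer. The sketch assumes the `?v=` parameter serves only as a cache-buster, and it uses a fixed timestamp format instead of `as.character(Sys.time())` so the result is predictable.

```r
# Hypothetical helper: append a timestamp as the "v" query parameter.
cache_busted_url <- function(url, time = Sys.time()) {
  stamp <- format(time, "%Y-%m-%d-%H-%M-%S")
  paste0(url, "?v=", stamp)
}

# Hypothetical helper: download in binary mode and fail loudly if nothing arrived.
download_binary <- function(url, destfile) {
  status <- utils::download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0 || !file.exists(destfile) || file.size(destfile) == 0) {
    stop("Download failed: ", url)
  }
  invisible(destfile)
}
```

Calling `download_binary(cache_busted_url(url), "securities.xls")` with the URL from the answer should then write the spreadsheet to the working directory, where it could be opened with `readxl::read_excel()`.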
