
R Selenium unable to findElement, returns error "Selenium message: Unable to locate element"

I am scraping this page: "https://lsf.uni-heidelberg.de/qisserver/rds?state=change&type=6&moduleParameter=personalSelect&nextdir=change&next=SearchSelect.vm&target=personSearch&subdir=person&init=y&source=state%3Dchange%26type%3D5%26moduleParameter%3DpersonSearch%26nextdir%3Dchange%26next%3Dsearch.vm%26subdir%3Dperson%26menuid%3Dsearch%26_form%3Ddisplay%26topitem%3Dmembers%26subitem%3D%26field%3DNachname&targetfield=Nachname&_form=display". I want to run the search for each person in order to collect their email addresses. I am doing the following, but I cannot find a way to click the button that submits the search.

library(rvest)      # read_html(), html_nodes(), html_text(), html_attrs(), %>%
library(stringr)    # str_split()
library(RSelenium)  # rsDriver()

# url
uni <- "https://lsf.uni-heidelberg.de/qisserver/rds?state=change&type=6&moduleParameter=personalSelect&nextdir=change&next=SearchSelect.vm&target=personSearch&subdir=person&init=y&source=state%3Dchange%26type%3D5%26moduleParameter%3DpersonSearch%26nextdir%3Dchange%26next%3Dsearch.vm%26subdir%3Dperson%26menuid%3Dsearch%26_form%3Ddisplay%26topitem%3Dmembers%26subitem%3D%26field%3DNachname&targetfield=Nachname&_form=display"

# people's names: text of every link, starting from the 40th
r <- read_html(uni)
name <- r %>%
  html_nodes("a") %>%
  html_text()
name <- name[40:length(name)]
name <- gsub("\n", "", name, fixed = TRUE)
name <- gsub("\t", "", name, fixed = TRUE)

# people's first link: deparse the attributes of each <a> and cut out the href
link <- r %>%
  html_nodes("a") %>%
  html_attrs() %>%
  as.character()
link <- link[40:length(link)]
link <- str_split(link, '"')
link <- sapply(link, "[", 6)

# loop: with RSelenium, click on search for each link and get the emails,
# which are on the next page
rD <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
remDr <- rD[["client"]]

for (i in seq_along(link)) {
  remDr$navigate(link[i])
  webElem <- remDr$findElement(using = 'xpath', '//*+[contains(concat( " ", @class, " " ), concat( " ", "abstand_search", " " ))]//font//input')

  webElem$clickElement()

  # here I get the error
}


Here are some pointers. I would go with CSS selectors, which are faster and easier to read, to gather the links:

library(rvest)

links <- read_html('https://lsf.uni-heidelberg.de/qisserver/rds?state=change&type=6&moduleParameter=personalSelect&nextdir=change&next=SearchSelect.vm&target=personSearch&subdir=person&init=y&source=state%3Dchange%26type%3D5%26moduleParameter%3DpersonSearch%26nextdir%3Dchange%26next%3Dsearch.vm%26subdir%3Dperson%26menuid%3Dsearch%26_form%3Ddisplay%26topitem%3Dmembers%26subitem%3D%26field%3DNachname&targetfield=Nachname&_form=display') %>% 
  html_nodes('.regular[name]') %>% 
  html_attr('href')
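
To see what `.regular[name]` selects without hitting the live site, here is a minimal sketch on inline HTML. The markup and the example.org URLs are hypothetical stand-ins, not copied from the real page; the point is only that the selector keeps elements that carry the class `regular` and also have a `name` attribute, which filters out other links of the same class.

```r
library(rvest)

# Hypothetical markup: one menu link plus two person links.
page <- read_html('
  <html><body>
    <a class="regular" href="https://example.org/menu">Menu</a>
    <a class="regular" name="W" href="https://example.org/person?id=1">Weber</a>
    <a class="regular" name="W" href="https://example.org/person?id=2">Wolf</a>
  </body></html>')

# Class "regular" AND a name attribute: the menu link is skipped.
people <- page %>% html_nodes(".regular[name]")

links <- people %>% html_attr("href")
person_names <- people %>% html_text(trim = TRUE)
```

Taking `href` with `html_attr()` returns the URL directly, so no string splitting on quote characters is needed.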

Then I would use the same strategy to target the search button:

webElem <- remDr$findElement(using = 'css selector', '.abstand_search + [value="Suche starten"]')  # this matches for the element which is interactable
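
The selector combines two pieces: `+` is the adjacent sibling combinator, and `[value="Suche starten"]` is an attribute match. A rough illustration on hypothetical markup (the real form may be structured differently), showing how the combination can single out one of two buttons with the same label:

```r
library(rvest)

# Hypothetical form with two identically labelled submit buttons.
# Only the second one directly follows an element with class "abstand_search".
page <- read_html('
  <html><body>
    <div><input type="submit" name="other_search" value="Suche starten"></div>
    <div><span class="abstand_search"></span><input type="submit" name="main_search" value="Suche starten"></div>
  </body></html>')

# The attribute match alone finds both buttons.
all_buttons <- page %>% html_nodes('[value="Suche starten"]')

# Adding the sibling combinator narrows it down to one.
target <- page %>% html_nodes('.abstand_search + [value="Suche starten"]')
target_name <- target %>% html_attr("name")
```

Checking a selector this way against the static HTML first can make it easier to tell a wrong selector apart from a Selenium timing problem.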
  

Finally, I would grab the name and email from the target page:

name <- remDr$findElement(using = 'css selector', '.regular')
email <- remDr$findElement(using = 'css selector', '[href*=mail]') # could also take 2nd match for .regular
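
Note that `findElement()` returns a webElement object rather than text; in RSelenium the text and attributes would then be read with `getElementText()` and `getElementAttribute("href")`. As a sketch that runs without a browser, the same two selectors can be applied to the page source with rvest. The helper name `extract_contact` and the sample markup are assumptions made for illustration only.

```r
library(rvest)

# Hypothetical helper: take a page's HTML as a string and return name and email.
extract_contact <- function(html) {
  page <- read_html(html)
  name <- page %>% html_node(".regular") %>% html_text(trim = TRUE)
  href <- page %>% html_node("[href*=mail]") %>% html_attr("href")
  # html_attr() gives NA when nothing matched, and sub() passes NA through.
  list(name = name, email = sub("^mailto:", "", href))
}

# Hypothetical result page.
sample_page <- '
  <html><body>
    <a class="regular" href="https://example.org/person?id=1">Dr. Anna Weber</a>
    <a class="regular" href="mailto:anna.weber@example.org">anna.weber@example.org</a>
  </body></html>'

contact <- extract_contact(sample_page)
```

With a running session, the input could be `remDr$getPageSource()[[1]]`, which avoids requesting the page a second time.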

I solved it by using rvest inside the loop, as follows:

# use RSelenium to download the emails
library(RSelenium)
library(rvest)
library(stringr)

rD <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
remDr <- rD[["client"]]
emails <- list()

for (i in seq_along(links)) {
  remDr$navigate(links[i])
  # this selector matches the search button that is interactable
  webElem <- remDr$findElement(using = 'css selector', '.abstand_search + [value="Suche starten"]')
  webElem$clickElement()

  # re-read the result page with rvest and keep the mailto: links
  r <- read_html(unlist(remDr$getCurrentUrl()))
  mail <- r %>%
    html_nodes("a") %>%
    html_attrs() %>%
    as.character() %>%
    str_subset("mailto:") %>%
    str_remove("mailto:")
  if (length(mail) != 0) {
    a <- unlist(str_split(mail, "href"))
    w <- which(grepl("@", a, fixed = TRUE))
    emails <- c(emails, a[w])
  } else {
    emails <- c(emails, NA)  # no email found on this page
  }
  rm(mail)
}
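
One possible way to shorten the extraction step is to select the `mailto:` links directly and collect one value per page with `vapply()`. This is only a sketch under assumptions: `first_mailto` is a hypothetical helper, the pages are made-up HTML strings, and it keeps only the first address per page, which may or may not be what is wanted.

```r
library(rvest)

# Hypothetical helper: first mailto address in a page, or NA if there is none.
first_mailto <- function(html) {
  href <- read_html(html) %>%
    html_node('a[href^="mailto:"]') %>%
    html_attr("href")
  sub("^mailto:", "", href)
}

# Made-up pages standing in for the result pages visited in the loop.
pages <- c(
  '<html><body><a href="mailto:a.weber@example.org">mail</a></body></html>',
  '<html><body><a href="https://example.org">no mail here</a></body></html>'
)

# One entry per page, NA where no address was found.
emails <- vapply(pages, first_mailto, character(1), USE.NAMES = FALSE)
```

Inside the Selenium loop the helper could be fed `remDr$getPageSource()[[1]]` after the click, so the result stays aligned with `links` position by position.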

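It is not the most elegant code, but it works. Because the names are more complicated, I could not find the right CSS or XPath approach for them. Let me know if you can think of more elegant, faster code, or whether the problem can only be solved the brute-force way.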

