
python selenium loop through some links

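I have an array of links. I am trying to visit each link, print something from it, return to the main page, then visit the second link and do the same, until I have gone through every link in the array.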

What happens is that only the first link works, as if all the other links in the array had disappeared. I get this error:

File "e:\work\MY CODE\scraping\learn.py", line 25, in theprint
    link.click()

from selenium import webdriver
# lets us send keyboard keys such as Enter, Esc, etc.
from selenium.webdriver.common.keys import Keys
import time

# lets us wait for an event to happen before running the next line of code
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# path to the Google Chrome driver (raw string so the backslashes stay literal)
PATH = r"E:\work\crom\chromedriver.exe"
# pass the path to the selenium webdriver constructor
driver = webdriver.Chrome(PATH)
# open the site we want
driver.get("https://app.dealroom.co/companies.startups/f/client_focus/anyof_business/company_status/not_closed/company_type/not_government%20nonprofit/employees/anyof_2-10_11-50_51-200/has_website_url/anyof_yes/slug_locations/anyof_france?sort=-revenue")

# wait for the page to load
time.sleep(5)

# collect the href of every link first, so nothing depends on elements
# from a page we have already left
links = []
the_links = driver.find_elements(By.CLASS_NAME, "table-list-item")
for link in the_links:
    links.append(link.get_attribute('href'))

# visit each link, print the website shown on that page, then go back
for link in links:
    driver.get(link)
    website = driver.find_element(By.CLASS_NAME, "item-details-info__url")
    print(website.text)
    driver.back()
    time.sleep(3)

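Your code will throw a stale element reference error, because once you navigate to another page, any variable holding an element from the previous page becomes unusable.

So what you need to do is store the href of every element in an array first, then loop over that array like this: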

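links = []
# By comes from selenium.webdriver.common.by; find_elements(By.CLASS_NAME, ...)
# replaces the find_elements_by_* helpers removed in newer Selenium 4 releases
the_links = driver.find_elements(By.CLASS_NAME, "table-list-item")
for link in the_links:
    # keep the URL string, not the element, so it survives navigation
    links.append(link.get_attribute('href'))

for link in links:
    driver.get(link)
    print("do something on this link")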
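The same idea can be wrapped in two small helpers. This is only a sketch: `collect_hrefs`, `visit_all` and `handle_page` are hypothetical names, not part of Selenium, and the code assumes `driver` is any object that offers `find_elements(by, value)` and `get(url)`, as a Selenium WebDriver does. It also skips elements without an `href`, on the assumption that some matches may not be links.

```python
def collect_hrefs(driver, by, value):
    """Return the href of every matching element as plain strings."""
    hrefs = []
    for element in driver.find_elements(by, value):
        href = element.get_attribute("href")
        if href:  # skip elements that carry no href attribute
            hrefs.append(href)
    return hrefs


def visit_all(driver, hrefs, handle_page):
    """Open each URL directly and collect whatever handle_page returns."""
    results = []
    for href in hrefs:
        driver.get(href)  # direct navigation, so no driver.back() is needed
        results.append(handle_page(driver))
    return results


# Usage with a real browser (assumes selenium is installed and `driver` is open):
#   from selenium.webdriver.common.by import By
#   hrefs = collect_hrefs(driver, By.CLASS_NAME, "table-list-item")
#   titles = visit_all(driver, hrefs, lambda d: d.title)
```

Because only strings are kept between page loads, nothing in the loop can go stale.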

Alternatively, you can keep your current approach but use a while loop, and populate the the_links variable again after each driver.back().
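A rough sketch of that while-loop variant follows. `click_each_by_index`, `on_page` and `pause` are hypothetical names introduced here; the code assumes the list page shows the same elements in the same order after every `driver.back()`, which may not hold for pages that load their rows lazily.

```python
import time


def click_each_by_index(driver, by, value, on_page, pause=0.0):
    """Click the matching elements one by one, re-finding them on every pass.

    Returns the number of elements that were clicked.
    """
    index = 0
    while True:
        # Fresh lookup: references from before the last navigation are stale.
        elements = driver.find_elements(by, value)
        if index >= len(elements):
            break
        elements[index].click()
        on_page(driver)    # do the per-page work while on the detail page
        driver.back()      # return to the list page
        time.sleep(pause)  # crude wait; an explicit wait is usually more robust
        index += 1
    return index


# Usage with a real browser (assumes selenium is installed and `driver` is open):
#   from selenium.webdriver.common.by import By
#   click_each_by_index(driver, By.CLASS_NAME, "table-list-item",
#                       lambda d: print(d.current_url), pause=3)
```

Tracking an index instead of iterating over the original list is what lets each pass work with elements that belong to the currently loaded page.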

Karim, does the element with class_name "item-details-info__url" exist on all of the pages? Also, what error does the get() method throw?
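If that element turns out to be missing on some pages, one way to avoid an exception is to use `find_elements`, which returns an empty list when nothing matches, whereas `find_element` raises `NoSuchElementException`. A minimal sketch, where `text_or_none` is a hypothetical helper name and `driver` is assumed to behave like a Selenium WebDriver:

```python
def text_or_none(driver, by, value):
    """Return the text of the first match, or None when nothing matches."""
    matches = driver.find_elements(by, value)  # empty list instead of an exception
    return matches[0].text if matches else None


# Usage with a real browser (assumes selenium is installed and `driver` is open):
#   from selenium.webdriver.common.by import By
#   website = text_or_none(driver, By.CLASS_NAME, "item-details-info__url")
#   print(website if website is not None else "no website listed")
```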
