Python Selenium Webscraping: find_elements_by_xpath returning an empty list
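I took a few coding courses at university and am trying to analyse tennis statistics by learning Selenium, which is completely new to me.
The page I am working with is here (https://www.atptour.com/en/scores/results-archive?year=2021), and I am following the guides on this website (https://www.scrapingbee.com/blog/selenium-python/, https://www.scrapingbee.com/blog/practical-xpath-for-web-scraping/). The specific problem I am having relates to the subsection "E-commerce product data extraction" in the second guide.
My goal is to iterate over the tournaments and extract the links behind the "Results" buttons, but I am stuck because my program just gives me an empty list.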
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
DRIVER_PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are kept literally
#driver = webdriver.Chrome(executable_path=DRIVER_PATH)
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
#driver.get("https://www.nintendo.com/")
#print(driver.page_source)
#driver.quit()
# 1 Data Collection
# 1.1 Find Links to All Tournaments
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
#tournament_class = "tourney-result"
driver.get(tournaments_2021_url) # print(driver.page_source)
tournaments_2021_url_list = driver.find_elements_by_xpath("//a[@class='button-border']")
print("\n tournament urls \n")
print(tournaments_2021_url_list)
print(len(tournaments_2021_url_list))
driver.quit()
# 1.2 For Each Tournament, Find Links to Each Match
# 1.3 For Each Match, Extract Relevant Statistics
I expected to get a list of elements, or some odd objects, from which I could extract the links, but I get an empty list with a length of 0. Thanks for your help.
To print the value of the href attribute of all the RESULTS links, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use any of the following locator strategies:
Using PARTIAL_LINK_TEXT:
driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.PARTIAL_LINK_TEXT, "Results")))])
driver.quit()
Using CSS_SELECTOR:
driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[href$='results']")))])
driver.quit()
Using XPATH:
driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[normalize-space()='Results']")))])
driver.quit()
Console output:
['https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results', 'https://www.atptour.com/en/scores/archive/antalya/9426/2021/results', 'https://www.atptour.com/en/scores/archive/auckland/301/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results', 'https://www.atptour.com/en/scores/archive/pune/891/2021/results', 'https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results', 'https://www.atptour.com/en/scores/archive/australian-open/580/2021/results', 'https://www.atptour.com/en/scores/archive/new-york/424/2021/results', 'https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results', 'https://www.atptour.com/en/scores/archive/singapore/9460/2021/results', 'https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results', 'https://www.atptour.com/en/scores/archive/montpellier/375/2021/results', 'https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results', 'https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results', 'https://www.atptour.com/en/scores/archive/doha/451/2021/results', 'https://www.atptour.com/en/scores/archive/marseille/496/2021/results', 'https://www.atptour.com/en/scores/archive/santiago/8996/2021/results', 'https://www.atptour.com/en/scores/archive/dubai/495/2021/results', 'https://www.atptour.com/en/scores/archive/acapulco/807/2021/results', 'https://www.atptour.com/en/scores/archive/miami/403/2021/results', 'https://www.atptour.com/en/scores/archive/marrakech/360/2021/results', 'https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results', 'https://www.atptour.com/en/scores/archive/marbella/9462/2021/results', 'https://www.atptour.com/en/scores/archive/houston/717/2021/results', 'https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results', 'https://www.atptour.com/en/scores/archive/barcelona/425/2021/results', 
'https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results', 'https://www.atptour.com/en/scores/archive/estoril/7290/2021/results', 'https://www.atptour.com/en/scores/archive/munich/308/2021/results', 'https://www.atptour.com/en/scores/archive/madrid/1536/2021/results', 'https://www.atptour.com/en/scores/archive/rome/416/2021/results', 'https://www.atptour.com/en/scores/archive/geneva/322/2021/results', 'https://www.atptour.com/en/scores/archive/lyon/7694/2021/results', 'https://www.atptour.com/en/scores/archive/parma/9510/2021/results', 'https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results', 'https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results', 'https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results', 'https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results', 'https://www.atptour.com/en/scores/archive/halle/500/2021/results', 'https://www.atptour.com/en/scores/archive/london/311/2021/results', 'https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results', 'https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results', 'https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results', 'https://www.atptour.com/en/scores/archive/hamburg/414/2021/results', 'https://www.atptour.com/en/scores/archive/newport/315/2021/results', 'https://www.atptour.com/en/scores/archive/bastad/316/2021/results', 'https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results', 'https://www.atptour.com/en/scores/archive/gstaad/314/2021/results', 'https://www.atptour.com/en/scores/archive/umag/439/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/96/2021/results', 'https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results', 'https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results', 'https://www.atptour.com/en/scores/archive/washington/418/2021/results', 'https://www.atptour.com/en/scores/archive/toronto/421/2021/results', 
'https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results', 'https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results', 'https://www.atptour.com/en/scores/archive/us-open/560/2021/results', 'https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results', 'https://www.atptour.com/en/scores/archive/metz/341/2021/results', 'https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results', 'https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results', 'https://www.atptour.com/en/scores/archive/sofia/7434/2021/results', 'https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results', 'https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results', 'https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results', 'https://www.atptour.com/en/scores/archive/beijing/747/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/329/2021/results', 'https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results', 'https://www.atptour.com/en/scores/archive/moscow/438/2021/results', 'https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results', 'https://www.atptour.com/en/scores/archive/vienna/337/2021/results', 'https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results', 'https://www.atptour.com/en/scores/archive/basel/328/2021/results', 'https://www.atptour.com/en/scores/archive/paris/352/2021/results', 'https://www.atptour.com/en/scores/archive/stockholm/429/2021/results', 'https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results', 'https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results']
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
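To make the idea behind the explicit wait more concrete, here is a minimal sketch of a polling loop in plain Python. It is not Selenium's implementation, only an illustration of the pattern that WebDriverWait(driver, 20).until(...) follows: call a condition repeatedly until it returns something truthy or the timeout expires. The names wait_until and SlowPage are hypothetical, and the sketch raises the built-in TimeoutError rather than Selenium's own exception.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for a page whose links only appear after a few polls.
class SlowPage:
    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_links(self):
        self.calls += 1
        if self.calls < self.ready_after:
            return []  # nothing rendered yet: the same symptom as the empty list
        return ["https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results"]


page = SlowPage(ready_after=3)
links = wait_until(page.find_links, timeout=5.0, poll_interval=0.01)
print(len(links))
```

A single immediate lookup corresponds to calling page.find_links() once, which returns an empty list here; the polling loop keeps asking until the links exist.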
Here is the updated code to add to the base code:
from selenium.webdriver.common.by import By
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
self.driver.get(tournaments_2021_url)
tournaments_2021_url_list = self.driver.find_elements(By.XPATH, "//a[@class='button-border']")
print("\nTournament URLs:\n")
for row in tournaments_2021_url_list:
    print(row.get_attribute("href"))
print("\nNumber of rows:")
print(len(tournaments_2021_url_list))
Here is the output after running everything:
Tournament URLs:
https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results
https://www.atptour.com/en/scores/archive/antalya/9426/2021/results
https://www.atptour.com/en/scores/archive/auckland/301/2021/results
https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results
https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results
https://www.atptour.com/en/scores/archive/pune/891/2021/results
https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results
https://www.atptour.com/en/scores/archive/australian-open/580/2021/results
https://www.atptour.com/en/scores/archive/new-york/424/2021/results
https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results
https://www.atptour.com/en/scores/archive/singapore/9460/2021/results
https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results
https://www.atptour.com/en/scores/archive/montpellier/375/2021/results
https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results
https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results
https://www.atptour.com/en/scores/archive/doha/451/2021/results
https://www.atptour.com/en/scores/archive/marseille/496/2021/results
https://www.atptour.com/en/scores/archive/santiago/8996/2021/results
https://www.atptour.com/en/scores/archive/dubai/495/2021/results
https://www.atptour.com/en/scores/archive/acapulco/807/2021/results
https://www.atptour.com/en/scores/archive/miami/403/2021/results
https://www.atptour.com/en/scores/archive/marrakech/360/2021/results
https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results
https://www.atptour.com/en/scores/archive/marbella/9462/2021/results
https://www.atptour.com/en/scores/archive/houston/717/2021/results
https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results
https://www.atptour.com/en/scores/archive/barcelona/425/2021/results
https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results
https://www.atptour.com/en/scores/archive/estoril/7290/2021/results
https://www.atptour.com/en/scores/archive/munich/308/2021/results
https://www.atptour.com/en/scores/archive/madrid/1536/2021/results
https://www.atptour.com/en/scores/archive/rome/416/2021/results
https://www.atptour.com/en/scores/archive/geneva/322/2021/results
https://www.atptour.com/en/scores/archive/lyon/7694/2021/results
https://www.atptour.com/en/scores/archive/parma/9510/2021/results
https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results
https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results
https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results
https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results
https://www.atptour.com/en/scores/archive/halle/500/2021/results
https://www.atptour.com/en/scores/archive/london/311/2021/results
https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results
https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results
https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results
https://www.atptour.com/en/scores/archive/hamburg/414/2021/results
https://www.atptour.com/en/scores/archive/newport/315/2021/results
https://www.atptour.com/en/scores/archive/bastad/316/2021/results
https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results
https://www.atptour.com/en/scores/archive/gstaad/314/2021/results
https://www.atptour.com/en/scores/archive/umag/439/2021/results
https://www.atptour.com/en/scores/archive/tokyo/96/2021/results
https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results
https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results
https://www.atptour.com/en/scores/archive/washington/418/2021/results
https://www.atptour.com/en/scores/archive/toronto/421/2021/results
https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results
https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results
https://www.atptour.com/en/scores/archive/us-open/560/2021/results
https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results
https://www.atptour.com/en/scores/archive/metz/341/2021/results
https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results
https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results
https://www.atptour.com/en/scores/archive/sofia/7434/2021/results
https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results
https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results
https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results
https://www.atptour.com/en/scores/archive/beijing/747/2021/results
https://www.atptour.com/en/scores/archive/tokyo/329/2021/results
https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results
https://www.atptour.com/en/scores/archive/moscow/438/2021/results
https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results
https://www.atptour.com/en/scores/archive/vienna/337/2021/results
https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results
https://www.atptour.com/en/scores/archive/basel/328/2021/results
https://www.atptour.com/en/scores/archive/paris/352/2021/results
https://www.atptour.com/en/scores/archive/stockholm/429/2021/results
https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results
https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results
Number of rows:
78
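This also has the added benefit of avoiding a deprecation warning. driver.find_elements_by_xpath is deprecated, and a warning message is shown after running pytest. The newer driver.find_elements(By.XPATH, XPATH) avoids this, although it does add one extra import line to the code: from selenium.webdriver.common.by import By.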
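If the links are needed for the next step (1.2 in the question) rather than just printed, they could be collected into a list. The helper below is only a sketch: collect_hrefs and FakeElement are hypothetical names, and the only element method it relies on is get_attribute("href"), which the code above already uses. FakeElement exists purely so the helper can be tried without a browser.

```python
def collect_hrefs(elements):
    """Return the unique, non-empty href values of the given elements, in order."""
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs


# Hypothetical stand-in for a Selenium WebElement, for trying the helper offline.
class FakeElement:
    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


sample = [
    FakeElement("https://www.atptour.com/en/scores/archive/doha/451/2021/results"),
    FakeElement(None),  # an anchor without an href is skipped
    FakeElement("https://www.atptour.com/en/scores/archive/doha/451/2021/results"),
    FakeElement("https://www.atptour.com/en/scores/archive/dubai/495/2021/results"),
]
tournament_urls = collect_hrefs(sample)
print(len(tournament_urls))
```

With a real driver, the same helper would presumably be called as collect_hrefs(driver.find_elements(By.XPATH, "//a[@class='button-border']")).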
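I took your code and ran it, and it works fine. It does what it is supposed to do. So my suggestion is to run it through a debugger and step through it to make sure everything happens as expected. Also remove the headless option so that you can confirm visually. Check your Chrome browser version and make sure it matches the chromedriver you are using. (Although if the versions did not match, it should give you an error message.) Finally, if all else fails, try it with another browser, such as Firefox with the appropriate geckodriver.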
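The version check can also be done in code. The sketch below compares the major versions found in a capabilities dictionary; it assumes the keys "browserVersion" and "chrome" / "chromedriverVersion", which is the shape a Chrome session typically reports through driver.capabilities, so treat the key names as an assumption to verify on your setup. The function names and the example values are made up for illustration.

```python
def major_version(version_string):
    """Return the leading integer of a dotted version such as '96.0.4664.45 (abc)'."""
    return int(version_string.strip().split(".")[0])


def versions_match(capabilities):
    """Compare browser and chromedriver major versions from a capabilities dict."""
    browser = capabilities["browserVersion"]
    chromedriver = capabilities["chrome"]["chromedriverVersion"]
    return major_version(browser) == major_version(chromedriver)


# Made-up example values, shaped like driver.capabilities for Chrome (assumed keys).
example_capabilities = {
    "browserVersion": "96.0.4664.110",
    "chrome": {"chromedriverVersion": "96.0.4664.45 (hypothetical-build-hash)"},
}
print(versions_match(example_capabilities))  # True: both majors are 96
```

In a live session this would presumably be called as versions_match(driver.capabilities) right after the driver is created.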