
Tick a checkbox using Selenium webdriver in Python

Fellows,

I'm doing some web scraping and need to download multiple PDFs from the www1.hkexnews.hk website.

However, I ran into a problem getting my Selenium chromedriver to tick the checkbox that appears every time one wants to download a PDF from that site. The code executes without errors, but the box remains unticked.

Please refer to my source code below - would appreciate any advice!

driver = webdriver.Chrome('/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/chromedriver',options=chrome_options)
driver.implicitly_wait(10)
driver.maximize_window()

start_address = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"

driver.get(start_address)
PDF_link = driver.find_element_by_xpath("//a[contains(text(),'Full Version')]")
print("Now clicking...'", PDF_link.text,"'")
PDF_link.click()

checkbox = driver.find_element_by_id('warning-statement-accept')
print("Now clicking...", checkbox.text)
checkbox.click

This should do it:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

link = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"

driver = webdriver.Chrome()
wait = WebDriverWait(driver,10)

driver.get(link)
elem = wait.until(EC.presence_of_element_located((By.XPATH,"//tr[@class='record-ap-phip']//a[contains(.,'Full Version')]")))
elem.click()
wait.until(EC.presence_of_element_located((By.XPATH,"//*[@id='warning-statement-dialog']//label[@for='warning-statement-accept']"))).click()
wait.until(EC.presence_of_element_located((By.XPATH,"//*[@id='warning-statement-dialog']//a[contains(@class,'btn-ok')]"))).click()
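The wait.until(...) calls above keep re-checking a condition until it succeeds or the timeout expires, which is why they cope with a dialog that appears a moment after the click. As a rough illustration of that idea only, here is a simplified pure-Python stand-in. The names wait_until and dialog_is_present are hypothetical and not part of Selenium, and the real WebDriverWait does more (for example, it ignores "element not found" errors while polling).

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper, not part of Selenium; it only mimics the idea
    behind WebDriverWait(driver, timeout).until(condition).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a dialog that only "appears" on the third check.
attempts = {"count": 0}


def dialog_is_present():
    attempts["count"] += 1
    return "dialog" if attempts["count"] >= 3 else None


found = wait_until(dialog_is_present, timeout=2.0, poll=0.01)
print(found, attempts["count"])
```

An implicit wait, by contrast, only applies to element lookups, so it cannot wait for conditions such as "visible" or "clickable".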

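There are several issues here:

  1. The "checkbox" locator is wrong: target the label for 'warning-statement-accept' rather than the element with that id.
  2. Your current code will download the first PDF file only.
  3. checkbox.click is missing its parentheses, so the method is never actually called.

It is preferable to use explicit waits with expected conditions instead of an implicit wait.
This should work better:
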
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome('/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/chromedriver',options=chrome_options)
wait = WebDriverWait(driver, 20)

driver.maximize_window()

start_address = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"

driver.get(start_address)
PDF_link = wait.until(EC.visibility_of_element_located((By.XPATH, "//a[contains(text(),'Full Version')]")))

print("Now clicking...'", PDF_link.text,"'")
PDF_link.click()

checkbox = wait.until(EC.visibility_of_element_located((By.XPATH, "//label[@for='warning-statement-accept']")))
print("Now clicking...", checkbox.text)
checkbox.click()  # the parentheses are required to actually perform the click
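
One detail from the question's code is worth illustrating on its own: writing checkbox.click without parentheses only looks up the method and never calls it, which matches the symptom "the code executes, but the box still appears unclicked". A minimal sketch with a hypothetical FakeCheckbox class standing in for a Selenium element (no browser needed):

```python
class FakeCheckbox:
    """Hypothetical stand-in for a Selenium WebElement; no browser needed."""

    def __init__(self):
        self.selected = False

    def click(self):
        # Toggle the state, as a real checkbox would on click.
        self.selected = not self.selected


box = FakeCheckbox()

box.click            # attribute access only: the method is never called
state_without_call = box.selected

box.click()          # the parentheses actually invoke the method
state_with_call = box.selected

print(state_without_call, state_with_call)  # prints: False True
```

Python raises no error for the bare box.click expression, so the script runs to completion while the checkbox stays untouched.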

 