Fellows,
I'm doing some webscraping and need to download multiple PDFs from the www1.hkexnews.hk website.
However, I encountered a problem while trying to make my Selenium chromedriver tick the box that appears every time one wants to download a PDF on the said website. The code executes, but the box still appears unclicked.
Please refer to my source code below - would appreciate any advice!
driver = webdriver.Chrome('/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/chromedriver',options=chrome_options)
driver.implicitly_wait(10)
driver.maximize_window()
start_address = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"
driver.get(start_address)
PDF_link = driver.find_element_by_xpath("//a[contains(text(),'Full Version')]")
print("Now clicking...'", PDF_link.text,"'")
PDF_link.click()
checkbox = driver.find_element_by_id('warning-statement-accept')
print("Now clicking...", checkbox.text)
checkbox.click
This should do it:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
link = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"
driver = webdriver.Chrome()
wait = WebDriverWait(driver,10)
driver.get(link)
elem = wait.until(EC.presence_of_element_located((By.XPATH,"//tr[@class='record-ap-phip']//a[contains(.,'Full Version')]")))
elem.click()
wait.until(EC.presence_of_element_located((By.XPATH,"//*[@id='warning-statement-dialog']//label[@for='warning-statement-accept']"))).click()
wait.until(EC.presence_of_element_located((By.XPATH,"//*[@id='warning-statement-dialog']//a[contains(@class,'btn-ok')]"))).click()
There are several issues here:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome('/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/chromedriver',options=chrome_options)
wait = WebDriverWait(driver, 20)
driver.maximize_window()
start_address = "https://www1.hkexnews.hk/app/appyearlyindex.html?lang=en&board=mainBoard&year=2021"
driver.get(start_address)
PDF_link = wait.until(EC.visibility_of_element_located((By.XPATH, "//a[contains(text(),'Full Version')]")))
print("Now clicking...'", PDF_link.text,"'")
PDF_link.click()
checkbox = wait.until(EC.visibility_of_element_located((By.XPATH, "//label[@for='warning-statement-accept']")))
print("Now clicking...", checkbox.text)
checkbox.click
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.