
Can't grab all the pdf links within a table from a webpage

I've written a script in Python, in combination with Selenium, to scrape the different pdf links that are generated upon clicking the different numbers (such as 110015710, 110015670, etc.) located within a table on a webpage.

Site link

My script can click on those links and reveal the pdf files, but it parses only 5 of them out of many.

How can I get them all?

I've tried so far:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

link = "replace_with_above_link"

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get(link)

[driver.execute_script("arguments[0].click();",item) for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"tr.Iec")))]
for elem in wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,".IecAttachments li a[href$='.pdf']"))):
    print(elem.get_attribute("href"))
driver.quit() 

When you click an element, the page fires an XHR request to fetch that row's pdf links, so add a delay after every click:

import time  # needed for the delay between clicks

for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "tr.Iec"))):
    driver.execute_script("arguments[0].click();", item)
    time.sleep(1)  # give the XHR time to complete before the next click
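A fixed `time.sleep(1)` works but is fragile: it wastes time when the XHR finishes quickly and still fails when it is slow. The more robust pattern (the same idea `WebDriverWait` implements) is to poll a condition until it becomes truthy or a timeout expires. A minimal standalone sketch of that pattern, with a fake "links loaded" condition standing in for the real DOM check:

```python
import time

def wait_until(condition, timeout=10.0, poll=0.25):
    """Poll `condition` until it returns a truthy value or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError(f"condition not met within {timeout:.1f}s")

# Simulated condition: pretend one more pdf link appears on each poll,
# the way links accumulate after each XHR completes.
links = []
def fake_xhr_done():
    links.append("a.pdf")
    return links if len(links) >= 3 else None

print(wait_until(fake_xhr_done))  # ['a.pdf', 'a.pdf', 'a.pdf']
```

In the real script, the condition would be a lambda that counts the `.IecAttachments li a[href$='.pdf']` elements and returns them once the count has grown since the last click, instead of sleeping a fixed second per row.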
