Hi, I'm new to web scraping and have been trying to use Selenium in Python to scrape a forum.
I am trying to get Selenium to click "Next" until the last page, but I am not sure how to break out of the loop, and I am also having trouble with the locator.
When I locate the Next button by partial link text, the automated clicking continues into the next thread, e.g. page 1 -> page 2 -> next thread -> page 1 of next thread -> page 2 of next thread:
while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Next")))
    next_link.click()
When I locate the Next button by class name, the automated clicking clicks the "Prev" button when it reaches the last page:
while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
    next_link.click()
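My questions are: which locator should I use for the Next button, and how do I break out of the loop on the last page?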
You can use any locator that uniquely identifies the element. Best practice suggests the following order.
To come out of the while loop when the element is not found, you can wrap the wait in a try block as shown below; the break statement exits the loop.
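from selenium.common.exceptions import TimeoutException

while True:
    try:
        # Note: if the "Prev" button shares the "prevnext" class, this wait still
        # succeeds on the last page, so the locator must match only "Next".
        next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
        next_link.click()
    except TimeoutException:
        # Nothing matched within 10 seconds: stop paging.
        break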
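The try/break pattern above can also be kept separate from the locator, which makes it easier to swap locators while experimenting. The following is only a minimal sketch under stated assumptions: the names `click_until_exhausted` and `make_selenium_clicker`, and the `max_pages` safety cap, are hypothetical and do not come from the answers here.

```python
def click_until_exhausted(click_next, max_pages=1000):
    """Call click_next() until it reports failure; return the number of clicks.

    max_pages is a hypothetical safety cap, so that a locator which never
    stops matching cannot keep the loop running forever.
    """
    clicks = 0
    while clicks < max_pages:
        if not click_next():
            break
        clicks += 1
    return clicks


def make_selenium_clicker(driver, locator, timeout=10):
    """Build a click_next() callable around an explicit wait.

    Selenium is imported lazily so the loop above can be tried without it.
    """
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def click_next():
        try:
            WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable(locator)
            ).click()
        except TimeoutException:
            return False  # no clickable element appeared within the timeout
        return True

    return click_next


# Stand-in for a three-page thread: two successful clicks, then nothing to click.
remaining = [True, True, False]
print(click_until_exhausted(lambda: remaining.pop(0)))
```

With a real driver, the call would look something like `click_until_exhausted(make_selenium_clicker(driver, (By.LINK_TEXT, "Next ›")))`, assuming that link text is what the forum renders.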
There are a couple of things you need to consider:
Since you are calling click() on the element, use the expected condition element_to_be_clickable() instead of presence_of_element_located().
Call click() within a try-except block and, in case of an exception, break out of the loop.
Here is the working code block:
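from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Written against the Selenium 3 API. Selenium 4 uses options= instead of
# chrome_options=, a Service object instead of executable_path=, and
# find_element(By.XPATH, ...) instead of find_element_by_xpath(...).
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://forums.hardwarezone.com.sg/money-mind-210/hdb-fully-paid-up-5744914.html")
# Click the first pager link below the "poststop" anchor to leave page 1.
driver.find_element_by_xpath("//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a").click()
while True:
    try:
        # Wait for a pager link whose text contains "Next", then click it.
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a[contains(.,'Next')]"))).click()
    except TimeoutException:
        # No clickable Next link appeared within 10 seconds: last page reached.
        print("No more pages left")
        break
driver.quit()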
Console Output :
No more pages left
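To illustrate why adding a text filter such as contains(.,'Next') lets the loop end, here is a rough sketch that uses only the standard library. The pager markup in `FIRST_PAGE` and `LAST_PAGE` is hypothetical (the real forum HTML may differ), and the helper names are made up for this example. It shows that a class-only match still finds the Prev link on the last page, while the text filter finds nothing there.

```python
from html.parser import HTMLParser


class PrevNextLinks(HTMLParser):
    """Collect the text of every <a> nested inside <li class="prevnext">."""

    def __init__(self):
        super().__init__()
        self.links = []       # text of each matching link, in document order
        self._in_li = False   # currently inside <li class="prevnext">
        self._in_a = False    # currently inside an <a> within that <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("class") == "prevnext":
            self._in_li = True
        elif tag == "a" and self._in_li:
            self._in_a = True
            self.links.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        if self._in_a:
            self.links[-1] += data


def prevnext_links(html):
    """Roughly what a class-only locator on "prevnext" can reach."""
    parser = PrevNextLinks()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.links]


def next_links(html):
    """The same links, filtered the way contains(.,'Next') would filter them."""
    return [text for text in prevnext_links(html) if "Next" in text]


# Hypothetical pager markup: page 1 offers only Next, the last page only Prev.
FIRST_PAGE = '<ul><li class="prevnext"><a href="p2.html">Next ›</a></li></ul>'
LAST_PAGE = '<ul><li class="prevnext"><a href="p1.html">‹ Prev</a></li></ul>'

for name, page in (("first", FIRST_PAGE), ("last", LAST_PAGE)):
    print(name, prevnext_links(page), next_links(page))
```

On the last page the filtered list is empty, which corresponds to the explicit wait timing out and the loop breaking.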
You can use the code below to click the Next button until the last page is reached, and break out of the loop once the button is no longer present:
from selenium.common.exceptions import TimeoutException

while True:
    try:
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, "Next ›"))).click()
    except TimeoutException:
        break
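The reason an exact link text behaves differently from a partial one can be sketched with plain string matching. This is a simplified model of the two locator strategies, not Selenium's own implementation, and the link labels (in particular "Next Thread") are assumed for illustration.

```python
def matches_link_text(visible_text, wanted):
    """Model of By.LINK_TEXT: the link's visible text must equal `wanted`."""
    return visible_text.strip() == wanted


def matches_partial_link_text(visible_text, wanted):
    """Model of By.PARTIAL_LINK_TEXT: `wanted` only has to appear in the text."""
    return wanted in visible_text


# Hypothetical link labels on the last page of a thread: the pager no longer
# shows "Next ›", but a link to the following thread is still present.
LAST_PAGE_LINKS = ["‹ Prev", "Next Thread"]

exact = [t for t in LAST_PAGE_LINKS if matches_link_text(t, "Next ›")]
partial = [t for t in LAST_PAGE_LINKS if matches_partial_link_text(t, "Next")]

print(exact)    # []
print(partial)  # ['Next Thread']
```

Under these assumptions the partial match still finds something to click on the last page, which is consistent with the loop wandering into the next thread, while the exact match finds nothing and the wait times out.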