Selenium clicking next button programmatically until the last page
Hi, I'm new to web scraping and have been trying to use Selenium to scrape a forum in Python.
I am trying to get Selenium to click "Next" until the last page, but I am not sure how to break out of the loop, and I am having trouble with the locator:
When I locate the next button by partial link text, the automated clicking continues into the next thread, e.g. page 1 -> page 2 -> next thread -> page 1 of next thread -> page 2 of next thread:
while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Next")))
    next_link.click()
When I locate the next button by class name, the automated clicking clicks the "Prev" button when it reaches the last page:
while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
    next_link.click()
My questions are:
You can use any locator that uniquely identifies the element. Best practices recommend a preferred order of locators.
To come out of the while loop when the element is not found, you can use a try block as given below. The break statement is used for this.
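from selenium.common.exceptions import TimeoutException

while True:
    try:
        next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
        next_link.click()
    except TimeoutException:
        # The wait timed out: no matching element, so stop paging
        break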
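The try/break pattern above can also be looked at separately from Selenium, which makes the stopping logic easier to reason about. The following is a minimal sketch, not part of the original answer: `click_until_last_page`, `FakePager` and `NoNextButton` are hypothetical names, and `max_pages` is an added safety cap so that a locator which keeps matching something (for example the "Prev" link) cannot loop forever. In a real script, `click_next` would wrap the `WebDriverWait(...).until(...).click()` call and `stop_on` would be `(TimeoutException,)`.

```python
from typing import Callable, Tuple, Type


class NoNextButton(Exception):
    """Stand-in for Selenium's TimeoutException so the sketch runs without a browser."""


def click_until_last_page(
    click_next: Callable[[], None],
    stop_on: Tuple[Type[BaseException], ...] = (NoNextButton,),
    max_pages: int = 1000,
) -> int:
    """Call click_next() until it raises one of stop_on; return the number of clicks."""
    clicks = 0
    while clicks < max_pages:
        try:
            click_next()
        except stop_on:
            break  # no "Next" button was found, so this is the last page
        clicks += 1
    return clicks


class FakePager:
    """Hypothetical stand-in for a thread with a fixed number of pages."""

    def __init__(self, total_pages: int) -> None:
        self.page = 1
        self.total_pages = total_pages

    def click_next(self) -> None:
        if self.page >= self.total_pages:
            raise NoNextButton("no Next link on the last page")
        self.page += 1


pager = FakePager(total_pages=4)
clicks = click_until_last_page(pager.click_next)
print(clicks, pager.page)
```

Catching only the exception types passed in `stop_on` keeps unrelated errors visible instead of silently ending the loop.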
There are a couple of things you need to consider:
Since you want to invoke click() on the element, use the expected condition element_to_be_clickable() instead of presence_of_element_located().
Invoke click() within a try-except block and, in case of an exception, break out of the loop.
Here is the working code block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Note: chrome_options=, executable_path= and find_element_by_xpath() are Selenium 3 style APIs.
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://forums.hardwarezone.com.sg/money-mind-210/hdb-fully-paid-up-5744914.html")

# First click: on page 1 the only prevnext link in the top pagination table is "Next"
driver.find_element_by_xpath("//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a").click()

while True:
    try:
        # Wait for a prevnext link whose text contains 'Next', then click it
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a[contains(.,'Next')]"))).click()
    except Exception:
        # No clickable "Next" link within 10 seconds: last page reached
        print("No more pages left")
        break

driver.quit()
Console Output:
No more pages left
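To illustrate why the XPath in the loop adds contains(.,'Next'), here is a small sketch using only the standard library. The markup is a hypothetical simplification of the pagination bar, not copied from the forum; it assumes, as the class-name attempt in the question suggests, that the "Prev" and "Next" links sit inside elements sharing the prevnext class. The helper names `first_prevnext` and `next_link` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination bars; the real forum markup may differ.
FIRST_PAGE = """
<ul>
  <li class="prevnext"><a href="page2.html">Next</a></li>
</ul>
""".strip()

LAST_PAGE = """
<ul>
  <li class="prevnext"><a href="page2.html">Prev</a></li>
</ul>
""".strip()


def first_prevnext(markup):
    """Mimic locating by class only: return the text of the first matching link."""
    link = ET.fromstring(markup).find(".//li[@class='prevnext']/a")
    return None if link is None else link.text


def next_link(markup):
    """Mimic the stricter locator: also require 'Next' in the link text."""
    for link in ET.fromstring(markup).iterfind(".//li[@class='prevnext']/a"):
        if "Next" in (link.text or ""):
            return link.text
    return None


print(first_prevnext(LAST_PAGE))  # class-only lookup still finds a link on the last page
print(next_link(LAST_PAGE))       # text filter finds nothing, so the loop can stop
```

With the class-only lookup, the last page still yields a clickable link, which matches the "clicks Prev on the last page" behaviour reported in the question.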
You can use the code below to click the Next button until the last page is reached and break the loop if the button is not present:
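from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` is an already created WebDriver on the first page of the thread
while True:
    try:
        # Exact link text, so other links that merely contain "Next" are not matched
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, "Next ›"))).click()
    except TimeoutException:
        break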
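The choice of By.LINK_TEXT over By.PARTIAL_LINK_TEXT matters here: a partial match on "Next" also matches any other link whose text contains that word, which is consistent with the jump to the next thread described in the question. A rough illustration with plain strings follows. The link texts are assumed for the example, and the two helper functions only imitate the matching rules rather than call Selenium.

```python
# Assumed visible link texts on a thread page; the real page may differ.
LINK_TEXTS = ["‹ Prev", "Next ›", "Next Thread »"]


def match_exact(texts, wanted):
    """Imitate By.LINK_TEXT: the whole visible text must equal `wanted`."""
    return [t for t in texts if t.strip() == wanted]


def match_partial(texts, wanted):
    """Imitate By.PARTIAL_LINK_TEXT: `wanted` only needs to appear in the text."""
    return [t for t in texts if wanted in t]


print(match_partial(LINK_TEXTS, "Next"))   # more than one candidate
print(match_exact(LINK_TEXTS, "Next ›"))   # only the pagination link
```

An exact match only works if the string, including the "›" character, is the link's full visible text, so it is worth checking the page source before relying on it.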