
Selenium clicking next button programmatically until the last page

Hi, I'm new to web scraping and have been trying to use Selenium to scrape a forum in Python.

I am trying to get Selenium to click "Next" until the last page, but I am not sure how to break the loop. I am also having trouble with the locator:

When I locate the next button by partial link text, the automated clicking carries on into the next thread, e.g. page 1 -> page 2 -> next thread -> page 1 of next thread -> page 2 of next thread.

while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Next")))
    next_link.click()

When I locate the next button by class name, the automated clicking clicks the "Prev" button when it reaches the last page.

while True:
    next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
    next_link.click()

My questions are:

  1. Which locator should I use (by class name, by partial link text, or something else)?
  2. How do I break the loop so it stops clicking when it reaches the last page?

  1. You can use any locator that identifies the element uniquely. Best practice suggests preferring them in the following order:

    • ID
    • Name
    • Class name
    • CSS selector
    • XPath
    • Others
  2. To come out of the while loop when the element can no longer be found, wrap the lookup in a try block as shown below and use break in the except clause.

     while True:
         try:
             next_link = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "prevnext")))
             next_link.click()
         except TimeoutException:
             break
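As the question notes, the prevnext class also matches the "Prev" link, so a loop keyed on the class name alone may keep clicking on the last page instead of timing out. Below is a minimal sketch of one way around this, assuming the pager links show visible text containing "Next". It fetches the candidates with find_elements, which returns an empty list rather than raising, and stops when none of them carries the expected text. The function name click_next_until_last, the label and max_pages parameters, and the CSS selector in the usage comment are illustrative assumptions, not part of the original answers.

```python
def click_next_until_last(driver, locator, label="Next", max_pages=1000):
    """Click the pager's "Next" link until no such link is left.

    driver  -- a Selenium WebDriver (any object with find_elements(by, value))
    locator -- a (by, value) tuple selecting the pager links
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_pages:  # safety cap so a bad locator cannot loop forever
        # find_elements returns [] when nothing matches, so the last page
        # can be detected without catching an exception.
        candidates = driver.find_elements(*locator)
        next_links = [el for el in candidates if label in el.text]
        if not next_links:
            break  # only "Prev" (or nothing) is left: last page reached
        next_links[0].click()  # first match; assumes top and bottom pagers are equivalent
        clicks += 1
    return clicks


# Usage with a real browser (an assumption, not run here):
#   from selenium.webdriver.common.by import By
#   click_next_until_last(driver, (By.CSS_SELECTOR, "li.prevnext a"))
```

This sketch assumes every click triggers a full page load. On pages that swap content dynamically, an explicit wait would probably still be needed before looking up the links again.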

There are a couple of things you need to consider:

  • There are two elements on the page with the text Next, one at the top and another at the bottom, so you need to decide which element you want to interact with and construct a unique locator strategy.
  • Since you want to invoke click() on the element, use the expected condition element_to_be_clickable() instead of presence_of_element_located().
  • When there is no element with the text Next, you need to carry on with the remaining steps, so invoke click() within a try-except block and break out in case of an exception.
  • For your requirement, XPath looks like a good locator strategy to me.
  • Here is the working code block:

     from selenium import webdriver
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.support import expected_conditions as EC

     options = webdriver.ChromeOptions()
     options.add_argument("start-maximized")
     options.add_argument('disable-infobars')
     driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\\Utility\\BrowserDrivers\\chromedriver.exe')
     driver.get("https://forums.hardwarezone.com.sg/money-mind-210/hdb-fully-paid-up-5744914.html")
     driver.find_element_by_xpath("//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a").click()
     while True:
         try:
             WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//a[@id='poststop' and @name='poststop']//following::table[1]//li[@class='prevnext']/a[contains(.,'Next')]"))).click()
         except:
             print("No more pages left")
             break
     driver.quit()
  • Console output:

     No more pages left 

You can use the code below to click the Next button until the last page is reached, and break out of the loop once the button is no longer present:

from selenium.common.exceptions import TimeoutException

while True:
    try:
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, "Next ›"))).click()
    except TimeoutException:
        break
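
The exact link text used here may also explain the behaviour described in the question. By.LINK_TEXT matches a link's whole visible text, while By.PARTIAL_LINK_TEXT matches any link whose text merely contains the given string. The snippet below imitates the two rules in plain Python. It is a simplified illustration rather than Selenium's implementation, and the link labels, in particular "Next Thread »", are made up for the example.

```python
def match_link_text(link_texts, wanted):
    """Imitate By.LINK_TEXT: the visible text must equal `wanted`."""
    return [t for t in link_texts if t.strip() == wanted]


def match_partial_link_text(link_texts, wanted):
    """Imitate By.PARTIAL_LINK_TEXT: the visible text must contain `wanted`."""
    return [t for t in link_texts if wanted in t]


# Hypothetical link texts on the last page of a thread: the pager has
# no "Next ›" link any more, but a link to the next thread still exists.
last_page_links = ["‹ Prev", "Next Thread »"]

print(match_partial_link_text(last_page_links, "Next"))  # ['Next Thread »']
print(match_link_text(last_page_links, "Next ›"))        # []
```

With the partial match, a link is still found on the last page, so the loop would move on to the next thread. With the exact match nothing is found, the wait times out, and the loop breaks.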


 