
Selenium Python - Do action, click next page, repeat until last page

I'm trying to perform an action on a webpage, click the Next button, and repeat that action until the last page is reached. I've tried answers from similar questions on Stack Overflow but I can't get them to work. Right now the webpage opens and nothing else happens: none of my code that works with the page runs. My code is below. Thanks for your help!

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://obamawhitehouse.archives.gov/briefing-room/speeches-and-remarks')

while True:
    next_page_btn = driver.find_elements_by_xpath("//li[@class = 'pagination-next']/a")
    if len(next_page_btn) < 1:
        print("No more pages left")
        break
    else:
        <MY CODE>
        WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.LINK_TEXT, 'Next'))).click() 

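I took a look at the site, and it seems that the pagination-next class doesn't exist. Instead, the "Next" button you are looking for sits inside an li with the class pager-next last. Because find_elements_by_xpath returns an empty list when nothing matches, your loop hits the break on its first pass and never reaches your code.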

I suggest changing this:

next_page_btn = driver.find_elements_by_xpath("//li[@class = 'pagination-next']/a")

to this:

next_page_btn = driver.find_elements_by_xpath("//li[@class = 'pager-next last']/a")

Let me know if this helps!
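
For reference, here is a minimal sketch of how the whole loop might look with the corrected locator. The function name visit_all_pages and the handle_page callback are hypothetical stand-ins for your own code, and the XPath assumes the site's markup is as described above. Unlike the loop in the question, this version runs the action first and checks for the Next link afterwards, so the last page is handled too.

```python
def visit_all_pages(driver, handle_page,
                    next_xpath="//li[@class='pager-next last']/a"):
    """Run handle_page(driver) on every page, following the Next link.

    driver is expected to be a Selenium WebDriver that has already
    loaded the first page. handle_page is a hypothetical callback
    standing in for the per-page work. Returns the number of pages handled.
    """
    pages = 0
    while True:
        handle_page(driver)
        pages += 1
        # find_elements returns an empty list (it does not raise) when
        # nothing matches. "xpath" is the string value of By.XPATH.
        next_links = driver.find_elements("xpath", next_xpath)
        if not next_links:
            print("No more pages left")
            return pages
        next_links[0].click()
```

With a real driver this would be called after driver.get(...), for example visit_all_pages(driver, my_function). If the next page loads slowly, an explicit wait before handle_page may still be needed.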

Please check the solution below for reference:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Selenium 3 style; Selenium 4 passes the driver path through a Service object instead.
driver = webdriver.Chrome(executable_path=r"\chromedriver.exe")

driver.get('https://obamawhitehouse.archives.gov/briefing-room/speeches-and-remarks')
wait = WebDriverWait(driver, 30)

while True:
    try:
        # Wait up to 30 seconds for a clickable "Next" link, then follow it.
        next_link = wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(text(),'Next')]")))
        next_link.click()
    except TimeoutException:
        # No "Next" link appeared, so this is the last page: stop looping.
        print("It is all good, no element there")
        break

I noticed that the pages of the site are addressed like this:

https://obamawhitehouse.archives.gov/briefing-room/speeches-and-remarks?term_node_tid_depth=31&page=1

The page numbers go up to page=473. So I was able to wrap my code in a while loop, add a counter, and build each URL with page={} and .format(counter).
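
A rough sketch of that approach, in case it helps someone else. The names page_urls, visit_by_url and handle_page are made up for illustration, and the default first page of 1 is an assumption (check whether the site starts counting at 0 or 1).

```python
URL_TEMPLATE = ("https://obamawhitehouse.archives.gov/briefing-room/"
                "speeches-and-remarks?term_node_tid_depth=31&page={}")


def page_urls(first_page, last_page, template=URL_TEMPLATE):
    """Yield one URL per page number, from first_page to last_page inclusive."""
    for number in range(first_page, last_page + 1):
        yield template.format(number)


def visit_by_url(driver, handle_page, first_page=1, last_page=473):
    """Load each page directly by URL and run handle_page(driver) on it.

    driver is expected to be a Selenium WebDriver. handle_page is a
    hypothetical callback standing in for the per-page work.
    """
    for url in page_urls(first_page, last_page):
        driver.get(url)
        handle_page(driver)
```

This avoids clicking the Next button entirely, at the cost of hard-coding the last page number, which would need updating if the number of pages changes.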

