
Selenium nested loops to next page

I have built a Selenium script that loops through the items on a page, prints the data, then goes to the next page and does the same.

Now I am trying to save the data to a CSV file, so I think I need a nested loop. Currently I just repeat the loop multiple times (as below).

How do I create a nested loop and then save the results to a CSV file?

Also, will the script fail if it reaches the last page and there isn't a next button there?

Thanks. This is the code I am using:

from selenium import webdriver
import time

browser = webdriver.Firefox(executable_path="/Users/path/geckodriver")

browser.get('https://www.tripadvisor.co.uk/Restaurants-g186338-zfn29367-London_England.html#EATERY_OVERVIEW_BOX')


meci = browser.find_elements_by_class_name('property_title')

for items in meci:
    title = items.text
    href = items.get_attribute('href')
    print(title)
    print(href)

time.sleep(3)
browser.find_element_by_css_selector('.next').click()
time.sleep(3)
meci = browser.find_elements_by_class_name('property_title')

for items in meci:
    title = items.text
    href = items.get_attribute('href')
    print(title)
    print(href)

time.sleep(3)
browser.find_element_by_css_selector('.next').click()
time.sleep(3)
meci = browser.find_elements_by_class_name('property_title')

for items in meci:
    title = items.text
    href = items.get_attribute('href')
    print(title)
    print(href)

browser.quit()

I have used try-except so that the program exits the loop when there isn't a next button. Instead of printing, you can write the results to a CSV file.

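from selenium.common.exceptions import NoSuchElementException

while True:
    # Collect every listing link on the current page.
    meci = browser.find_elements_by_class_name('property_title')

    for items in meci:
        title = items.text
        href = items.get_attribute('href')
        print(title)
        print(href)

    time.sleep(3)
    try:
        # Raises NoSuchElementException when the page has no "next" button.
        browser.find_element_by_css_selector('.next').click()
    except NoSuchElementException:
        break
    time.sleep(3)


browser.quit()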
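
To make the CSV part concrete, here is a minimal sketch of the nested loop using only the standard-library `csv` module. The function name `write_pages_to_csv`, the column names `title` and `href`, and the idea of passing the pages in as plain lists are assumptions for illustration, not part of the original code. In the real scraper the outer loop would be the `while True` page loop and the inner loop would be `for items in meci`.

```python
import csv


def write_pages_to_csv(pages, path):
    """Write (title, href) rows from every page into one CSV file.

    `pages` is any iterable that yields one list of (title, href) tuples
    per page (a hypothetical stand-in for the Selenium page loop).
    Returns the number of data rows written.
    """
    count = 0
    # newline='' lets the csv module control line endings itself.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['title', 'href'])  # header row
        for page in pages:                  # outer loop: one pass per page
            for title, href in page:        # inner loop: one row per listing
                writer.writerow([title, href])
                count += 1
    return count
```

In the scraper itself, one way to apply this would be to open the file and create the `csv.writer` before the `while True` loop, then call `writer.writerow([title, href])` where the two `print` calls are now.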
