
Can't go on clicking on the next page button while scraping certain fields from a website

I've written a Python script using pyppeteer that should keep clicking the next page button until there are no more pages. When it clicks the button, the script throws pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded., pointing at the line await page.waitForNavigation(). It does parse name and item_type from the landing page of the site, though. I know I could send POST requests with the appropriate payload to get the data, but my intention is to use pyppeteer and keep clicking the next page button while parsing the required fields.

website address

import asyncio
from pyppeteer import launch

link = "https://www.e-ports.com/ships"

async def get_content():
    wb = await launch(headless=True)
    [page] = await wb.pages()
    await page.goto(link)

    while True:
        await page.waitForSelector(".common_card", {'visible':True})

        elements = await page.querySelectorAll('.common_card')
        for element in elements:
            name = await element.querySelectorEval('span.title > a','e => e.innerText')
            item_type = await element.querySelectorEval('.bottom > span','e => e.innerText')
            print(name.strip(),item_type.strip())

        try:
            await page.click("button.btn-next")
            await page.waitForNavigation()
        except Exception: break

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(get_content())

Btw, if I manually click the next page button the first time, the script handles the rest successfully.

I don't know the exact syntax in pyppeteer, but the common way to use waitForNavigation is probably something like this:

await Promise.all([
   page.waitForNavigation(),
   page.click("button.btn-next")
])

Because both calls are passed to Promise.all in one array, the navigation wait is already in place when the click fires, and the combined promise resolves only once both have finished.
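In Python, the closest counterpart to Promise.all is asyncio.gather. The following is only a minimal sketch of the same idea for pyppeteer, assuming the click really triggers a page navigation; the helper name click_and_wait and its timeout parameter are made up for illustration and are not part of pyppeteer.

```python
import asyncio


async def click_and_wait(page, selector, timeout=30000):
    """Click `selector` and wait for the navigation it triggers.

    `page` is expected to be a pyppeteer Page. This helper is hypothetical,
    not a pyppeteer API; it only combines two existing Page methods.
    """
    # gather() schedules both coroutines together, so the navigation wait
    # is already set up when the click is sent to the browser.
    await asyncio.gather(
        page.waitForNavigation({'timeout': timeout}),
        page.click(selector),
    )
```

Inside the while loop of the question, the two awaits in the try block could then become a single await click_and_wait(page, "button.btn-next"). If the button only updates the list without navigating, this wait may still time out.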


 