Can't go on clicking on the next page button while scraping certain fields from a website

I've created a script using Python with pyppeteer to keep clicking the next page button until there are no more pages. When it clicks the next page button, the script throws pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded, pointing at the line await page.waitForNavigation(). It can parse name and item_type from the landing page of that site, though. I know I can send POST HTTP requests with the appropriate payload to get the data, but my intention is to use pyppeteer and keep clicking the next page button while parsing the required fields.

website address

import asyncio
from pyppeteer import launch

link = "https://www.e-ports.com/ships"

async def get_content():
    wb = await launch(headless=True)
    [page] = await wb.pages()
    await page.goto(link)

    while True:
        await page.waitForSelector(".common_card", {'visible':True})

        elements = await page.querySelectorAll('.common_card')
        for element in elements:
            name = await element.querySelectorEval('span.title > a','e => e.innerText')
            item_type = await element.querySelectorEval('.bottom > span','e => e.innerText')
            print(name.strip(),item_type.strip())

        try:
            await page.click("button.btn-next")
            await page.waitForNavigation()
        except Exception: break

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(get_content())

Btw, if I manually click the next page button the first time, the script completes the rest successfully.

I don't know the exact syntax in pyppeteer, but the usual pattern for waitForNavigation (shown here in Puppeteer's JavaScript) is probably this:

await Promise.all([
   page.waitForNavigation(),
   page.click("button.btn-next")
])

Because both promises are passed in the array to Promise.all, the await resolves only once every one of them has resolved, that is, once the click has been made and the navigation has finished.
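
For pyppeteer, the closest equivalent I know of to Promise.all is asyncio.gather. Below is a minimal sketch of the same pattern, not tested against the site in the question. The helper name click_and_wait and its timeout parameter are made up for illustration; page is assumed to be the pyppeteer page object from the question's script.

```python
import asyncio


async def click_and_wait(page, selector, timeout=30000):
    """Click `selector` and wait for the navigation it triggers.

    `click_and_wait` is a hypothetical helper name. `page` is assumed to be
    a pyppeteer Page, i.e. an object with async `click` and
    `waitForNavigation` methods.
    """
    # Schedule the navigation wait and the click together, rather than
    # clicking first and only then starting to wait.
    await asyncio.gather(
        page.waitForNavigation({'timeout': timeout}),
        page.click(selector),
    )
```

In the question's loop this would replace the separate click and waitForNavigation calls inside the try block with await click_and_wait(page, "button.btn-next"). If the button updates the list in place without a real navigation, this wait would still time out.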
