
How to get search result URLs with Pyppeteer?

I am trying to scrape the URLs of the search results with Pyppeteer in my Python program, but it doesn't work. Here is my code:

import asyncio
from pyppeteer import launch

URL = 'https://hk.appledaily.com/search/apple'

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto(URL)

    await page.waitForSelector(".flex-feature")

    elements = await page.querySelectorAll('.flex-feature')

    for el in elements:
        text = await page.evaluate('(el) => el.innerHTML.querySelectorAll("story-card")', el)
        print(text)

    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

I hope someone can help. Thanks!

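Sorry for the silly question, I have just solved it myself. In my first attempt, el.innerHTML is a string, so it has no querySelectorAll method, and the selector "story-card" was missing the leading dot that marks a class. Selecting the .story-card elements directly and reading the href of each one works: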

import asyncio
from pyppeteer import launch

# https://pypi.org/project/pyppeteer/

URL = 'https://hk.appledaily.com/search/apple'


async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto(URL)

    # Wait until the search results container has rendered
    await page.waitForSelector(".flex-feature")

    # Each search result is a .story-card element
    elements = await page.querySelectorAll('.story-card')

    for el in elements:
        # Read the link target of each result
        href = await page.evaluate('(el) => el.href', el)
        print(href)

    await browser.close()

asyncio.get_event_loop().run_until_complete(main())  

And the result will be:

https://hk.appledaily.com/entertainment/20201227/LJL5DQ64QZHLTHI7LFKHVXB7JM/
https://hk.appledaily.com/sports/20201227/7MQKJNXPQNA6HDXTFUCMWNGUAU/
https://hk.appledaily.com/local/20201227/SWIBOUDSLZB5JBULTIT4DPSEIQ/
https://hk.appledaily.com/entertainment/20201227/TA457F5YYRGQZCNDIR5OFJDLPU/
https://hk.appledaily.com/china/20201227/DY2RQZJVSZHJBDV6XDYBH5G73I/
https://hk.appledaily.com/sports/20201227/4FLJFIHZOFF3JMWPOOSTO5QLCQ/
https://hk.appledaily.com/local/20201227/NIWG4U4LBFGPHLA73RTWHEQCY4/
https://hk.appledaily.com/china/20201227/SUR6Q4UEIVE5HD7OLSCAYIVUUY/
https://hk.appledaily.com/international/20201227/N2P5IPMBKBEGRALQWMDFXJCVGY/
https://hk.appledaily.com/entertainment/20201227/MGG6H2JIJVGODEV3EE7OI6HEGI/
https://hk.appledaily.com/local/20201227/N3TQO3VOBRC3NKT2ILES76CSKY/
https://hk.appledaily.com/international/20201227/GJXFM53DAFAUVOFFZIRKBH3X24/
https://hk.appledaily.com/sports/20201227/2UQC7A4HCBFD5IF7IGJWVK3AOA/
https://hk.appledaily.com/entertainment/20201226/AI7CAJD6O5D5XP7UMZCWSQ5VU4/
https://hk.appledaily.com/entertainment/20201227/3BIOQMUCQVGHXKNP3A4KF7VC6A/
https://hk.appledaily.com/local/20201227/OOYOPLI5WFGJZGAFKGLHSVINPM/
https://hk.appledaily.com/local/20201227/6FXZ5FKNMVHS5JTTO6YWO55JZY/
https://hk.appledaily.com/local/20201227/VQTZMOKCUZGMFL4PYBZ5YZYOSQ/
https://hk.appledaily.com/international/20201227/4VPFDXJFKZH5ZFRXSKZW3OASAA/
https://hk.appledaily.com/entertainment/20201227/TCVCDXKK4JHE7HHEJ7U6MFSS5U/
https://hk.appledaily.com/local/20201227/NIWG4U4LBFGPHLA73RTWHEQCY4/
https://hk.appledaily.com/entertainment/20201227/GY4WJIFLPREKJHGJ2VQO7LDZAU/
https://hk.appledaily.com/entertainment/20201227/3BIOQMUCQVGHXKNP3A4KF7VC6A/
https://hk.appledaily.com/local/20201227/OOYOPLI5WFGJZGAFKGLHSVINPM/
https://hk.appledaily.com/local/20201227/N3TQO3VOBRC3NKT2ILES76CSKY/
https://hk.appledaily.com/local/20201227/Z4CRG7TLUJFMLO3JIY2KWBTL5A/
https://hk.appledaily.com/local/20201227/353WEBFTBZFHBCP2O4IXIARBEM/

Process finished with exit code 0
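
The output above contains a few repeated URLs (for example, the NIWG4U4LBFGPHLA73RTWHEQCY4 article is printed twice), and the loop makes one evaluate call per element. A minimal sketch of an alternative, assuming the .story-card elements are links as in the answer: fetch every href in a single querySelectorAllEval call, then drop repeats while keeping the original order. The helper names collect_hrefs and unique_in_order are hypothetical and not part of pyppeteer.

```python
# Runs in the browser: receives every matched node at once and returns their hrefs.
HREFS_JS = "(nodes) => nodes.map(n => n.href)"


async def collect_hrefs(page, selector):
    """Fetch the href of every element matching selector in one call."""
    # querySelectorAllEval is pyppeteer's counterpart to Puppeteer's page.$$eval
    hrefs = await page.querySelectorAllEval(selector, HREFS_JS)
    return [h for h in hrefs if h]  # skip elements without a link


def unique_in_order(urls):
    """Drop repeated URLs, keeping the first occurrence of each."""
    # dict preserves insertion order, so fromkeys keeps first-seen order
    return list(dict.fromkeys(urls))
```

Inside main() from the answer, the for loop could then be replaced by something like: urls = unique_in_order(await collect_hrefs(page, '.story-card')), followed by printing each entry of urls.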
