
Scraping a webpage with JavaScript functions using Python

I need to retrieve some information from a webpage. The page works like a PowerPoint presentation: several slides are shown one by one. To move from one slide to the next, you press a button that runs a JS function, " load_image_btn('plus') ", which changes the image. The page URL stays exactly the same, and the only thing that changes in the HTML is the URL of the img, " someurl/546 ".

Is there any way to execute that function from Python iteratively so I can get all the images?

One generic way to cope with JavaScript-induced problems is to use a headless browser, which fully executes each page (including its scripts), and then scrape the rendered result from there.
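As a rough sketch of the headless-browser idea, the snippet below uses Selenium (one possible tool, not necessarily the one meant here) to call load_image_btn('plus') repeatedly and record the img URL after each call. The "img" selector, the timeout values, the stop condition, and the helper names collect_image_urls, make_headless_chrome, and scrape are all assumptions made for illustration; only load_image_btn('plus') comes from the question.

```python
import time


def collect_image_urls(driver, img_selector="img", max_slides=500,
                       timeout=5.0, poll=0.1):
    """Call load_image_btn('plus') repeatedly and record each new image URL.

    `driver` is any object with a Selenium-style execute_script() method.
    The selector and the stop condition are assumptions about the page.
    """
    read_src = "return document.querySelector(arguments[0]).src;"
    urls = [driver.execute_script(read_src, img_selector)]
    for _ in range(max_slides):
        # Run the page's own function, as the button would.
        driver.execute_script("load_image_btn('plus');")
        # Poll until the image URL changes or the timeout expires.
        deadline = time.monotonic() + timeout
        src = driver.execute_script(read_src, img_selector)
        while src == urls[-1] and time.monotonic() < deadline:
            time.sleep(poll)
            src = driver.execute_script(read_src, img_selector)
        # Stop when the URL did not change (last slide) or the
        # slideshow wrapped around to a slide already seen.
        if src in urls:
            break
        urls.append(src)
    return urls


def make_headless_chrome():
    """Start headless Chrome; needs the selenium package and Chrome."""
    from selenium import webdriver  # imported lazily on purpose

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # flag for recent Chrome versions
    return webdriver.Chrome(options=options)


def scrape(page_url):
    """Open page_url in headless Chrome and return all slide image URLs."""
    driver = make_headless_chrome()
    try:
        driver.get(page_url)
        return collect_image_urls(driver)
    finally:
        driver.quit()
```

Calling scrape() with the real page address would return the list of image URLs, which can then be downloaded with any HTTP client. If the page has more than one img element, the selector would need to be narrowed to the slide image.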

For my last similar project I used Splash, a service that provides headless web browser instances that can be controlled through an API: https://scrapinghub.com/splash .
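For illustration, here is a minimal sketch of driving Splash through its HTTP API from Python, assuming a Splash instance is reachable at http://localhost:8050 and that its /execute endpoint accepts a Lua script. The "img" selector, the wait times, the need to pass the number of slides up front, and the helper names build_payload, normalize_urls, and fetch_image_urls are all assumptions, not something taken from the answer. Depending on the Splash version, a Lua table may come back as a JSON object keyed by "1", "2", ... rather than as an array, so the sketch accepts both shapes.

```python
import json
import urllib.request

# Lua script run by Splash: load the page, then alternate between reading
# the current image URL and calling the page's load_image_btn('plus').
LUA_SCRIPT = """
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(1.0))
  local urls = {}
  for i = 1, args.n_slides do
    urls[i] = splash:evaljs("document.querySelector('img').src")
    splash:runjs("load_image_btn('plus');")
    assert(splash:wait(0.5))
  end
  return {urls = urls}
end
"""


def build_payload(page_url, n_slides):
    """Build the JSON body for the /execute endpoint."""
    return {"lua_source": LUA_SCRIPT, "url": page_url, "n_slides": n_slides}


def normalize_urls(result):
    """Return the URLs as a list, whether Splash sent an array or an object."""
    urls = result["urls"]
    if isinstance(urls, dict):
        # Keys are 1-based indices encoded as strings; sort numerically.
        urls = [urls[key] for key in sorted(urls, key=int)]
    return list(urls)


def fetch_image_urls(page_url, n_slides, splash="http://localhost:8050"):
    """POST the script to Splash and return the collected image URLs."""
    body = json.dumps(build_payload(page_url, n_slides)).encode("utf-8")
    request = urllib.request.Request(
        splash + "/execute",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return normalize_urls(json.load(response))
```

The fixed waits are the weakest part of this sketch: if the image loads slowly, the same URL may be recorded twice, so the delays would need tuning against the real page.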

But I am sure there are many alternatives.
