
Scraping data from complex website (hidden content)

I am just starting with web scraping and, unfortunately, I have hit a showstopper: I would like to pull some financial data, but the website seems quite complex (dynamic content etc.).

Data I would like to pull

Website: https://www.de.vanguard/web/cf/professionell/de/produktart/detailansicht/etf/9527/EQUITY/performance

So far, I've used Beautiful Soup to get this done. However, I cannot even find the table. Any ideas?

Look into using Selenium to launch an automated web browser. It loads the web page along with its dynamic content, and it also lets you 'click' on web elements to load content that is only generated on click. You can use it in tandem with BeautifulSoup by passing driver.page_source to BeautifulSoup and parsing the rendered HTML that way.
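A minimal sketch of that approach, assuming Selenium 4 with a local Chrome install and that the data is rendered as an ordinary <table> element (I have not checked the page's actual markup, so the locator may need adjusting). The helper names fetch_rendered_html and extract_tables are made up for this example. The site may also show a cookie or consent dialog that has to be clicked away before the table appears.

```python
from bs4 import BeautifulSoup

URL = ("https://www.de.vanguard/web/cf/professionell/de/produktart/"
       "detailansicht/etf/9527/EQUITY/performance")


def fetch_rendered_html(url, timeout=20):
    """Load the page in headless Chrome and return the HTML after scripts ran."""
    # Selenium is imported here so the parsing helper below can be used
    # (and tested) on saved HTML without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Assumption: the data ends up in a <table>; change the locator if not.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "table"))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_tables(html):
    """Return every <table> as a list of rows, each row a list of cell texts."""
    soup = BeautifulSoup(html, "html.parser")
    tables = []
    for table in soup.find_all("table"):
        rows = []
        for tr in table.find_all("tr"):
            cells = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
            if cells:
                rows.append(cells)
        tables.append(rows)
    return tables


if __name__ == "__main__":
    for rows in extract_tables(fetch_rendered_html(URL)):
        for row in rows:
            print(row)
```

Waiting explicitly for the element matters here: driver.page_source taken immediately after driver.get() can still be the empty shell that plain requests + BeautifulSoup sees.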

This SO answer provides a basic example that would serve as a good starting point: "Python WebDriver how to print whole page source (html)"
