
How to get the result of a JavaScript function from Python code using Beautiful Soup?

I want to scrape data from a website using Beautiful Soup in Python. The site changes the values of a drop-down menu based on the user's selection. No API call is made when the values change. On closer inspection, I found that a JavaScript function is called internally to produce the drop-down values. My problem is that those values are not in the page source. They are produced by that JS function, but since there is no API call, I can't request them directly. Can anyone tell me how to call a JavaScript function from Python code? I'm using Beautiful Soup for web scraping.

Thanks

You can't. BeautifulSoup is an HTML parser.

You want to do more than parse HTML; you want to evaluate JavaScript.

Perhaps you are looking for a way to drive a JavaScript-capable browser, such as Selenium.
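As a rough sketch of the Selenium route: a WebDriver object can load the page and evaluate a script in it through `execute_script`, which returns the script's result to Python. The helper name `call_page_function`, the URL, and the function name `getDropdownValues` below are placeholders I made up, since the question does not name the real ones. This also assumes the function is defined globally on the page and returns something JSON-like (a list, dict, string, or number).

```python
def call_page_function(driver, url, function_name, *args):
    """Load `url` in a Selenium-driven browser and return the result of
    calling the page's global JavaScript function `function_name`.

    `driver` is any Selenium WebDriver instance. `function_name` is pasted
    into the script text, so only pass names you trust.
    """
    driver.get(url)
    # Selenium exposes the extra Python arguments to the script as the
    # JavaScript `arguments` object; `apply` forwards them to the function.
    script = f"return {function_name}.apply(null, arguments);"
    return driver.execute_script(script, *args)


# Usage sketch (needs `pip install selenium` and a locally installed browser;
# the URL and function name are hypothetical):
#
#   from selenium import webdriver
#
#   driver = webdriver.Firefox()
#   try:
#       values = call_page_function(
#           driver, "https://example.com/form", "getDropdownValues", "some-choice"
#       )
#       print(values)
#   finally:
#       driver.quit()
```

If the function only fills the drop-down and returns nothing, you would instead trigger the selection in the browser and read the resulting `<option>` elements from the live page.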

You might be interested in the PyV8 module; it lets you embed a JavaScript interpreter in Python code, but does not include a browser DOM. I give a short example in "Why is BeautifulSoup not finding a specific table class?"

For JavaScript that makes more extensive use of browser features, you may prefer ghost.py, a headless WebKit-based browser with a Python API.

Failing that, if you gave the page URL, we could take a look at the JavaScript and see if there's a quick way to duplicate the call in Python.
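To illustrate what "duplicating the call" might look like: if no network request is made, the data the function uses often sits in the page's own script as a literal. The sketch below assumes exactly that. The page source, the variable name `optionsByCountry`, and both helper functions are invented for illustration, and it only works when the literal happens to be valid JSON.

```python
import json
import re

# Hypothetical page source: the drop-down values live in a JS object literal.
SAMPLE_HTML = """
<select id="state"></select>
<script>
var optionsByCountry = {"IN": ["Delhi", "Goa"], "US": ["Ohio", "Utah"]};
function fillStates(country) { /* builds options from optionsByCountry[country] */ }
</script>
"""


def extract_js_object(html, var_name):
    """Return the object literal assigned to `var_name`, parsed as JSON."""
    # Non-greedy match from the opening brace to the first "};" after it.
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\});"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        raise ValueError(f"no assignment to {var_name!r} found")
    return json.loads(match.group(1))


def dropdown_values(html, selection):
    """Mimic the page's JS lookup: the values offered for one selection."""
    return extract_js_object(html, "optionsByCountry").get(selection, [])


print(dropdown_values(SAMPLE_HTML, "US"))
```

With the real page you would fetch the HTML as you do now and point the regular expression at whatever variable the script actually uses. If the literal is not valid JSON (single quotes, unquoted keys), this approach breaks and evaluating the JavaScript is the safer option.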

Beautiful Soup can't parse content that is loaded by JavaScript. You should use something like Selenium.

