Scraping a site that requires JavaScript enabled with Mechanize + BeautifulSoup (Python)

So, I have this site I am trying to scrape, but as I understand it, mechanize's lack of support for JavaScript and a stubborn site that requires a JavaScript-enabled browser are not a good mix...

I am looking for ideas on how to do this...

URL: https://members.iracing.com/membersite/login.jsp

Depending on what you need to do, you could use WebKit to load the page, which lets you get the final HTML after the JavaScript has been executed. You could then use any decent HTML parser, BeautifulSoup for example, to do the rest.
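A minimal sketch of that two-step idea, under some loud assumptions: `render` is a hypothetical callable that returns the HTML after the JavaScript has run (in practice it would wrap a WebKit-based browser engine), `fake_render` is a stand-in with made-up markup so the example runs without a browser, and the standard library's `HTMLParser` is swapped in for BeautifulSoup only to keep the sketch free of third-party dependencies. None of the names or field values below come from the actual site.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> element seen."""

    def __init__(self):
        super().__init__()
        self.field_names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.field_names.append(name)


def scrape_rendered(url, render):
    """Step 1: get the final HTML from `render`. Step 2: parse it.

    `render` is an assumption: any callable taking a URL and returning
    the page's HTML after its JavaScript has been executed.
    """
    html = render(url)
    collector = InputNameCollector()
    collector.feed(html)
    return collector.field_names


def fake_render(url):
    # Hypothetical stand-in for a real browser engine; the markup is invented.
    return (
        '<form action="/login">'
        '<input type="text" name="username">'
        '<input type="password" name="password">'
        '<input type="submit" value="Log in">'
        "</form>"
    )


if __name__ == "__main__":
    print(scrape_rendered("https://example.invalid/login", fake_render))
```

Keeping the rendering step behind a plain callable means the parsing code does not care which engine produced the HTML, so the parser can be swapped for BeautifulSoup without touching the rest.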

With JavaScript, I use Chickenfoot for simple sites and WebKit for more complex ones.
