Scraping a JavaScript / JSON object from a webpage using BeautifulSoup?
I am using BeautifulSoup to get the HTML of a webpage, and that works fine so far. What I really want, though, are the contents of a JavaScript chunk inside the HTML, which is enclosed in
<script type="text/javascript">
Somewhere inside that tag there is a giant array containing a lot of {} brackets, which I believe is a JSON array.
Is there a way to extract that entire array from within the HTML?
You are looking for the function json.loads, which parses a JSON string into Python objects:
>>> import json
>>> obj = json.loads('{"a": 12, "b": null}')
>>> obj
{'b': None, 'a': 12}
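json.loads expects a string that is nothing but JSON, so the array still has to be cut out of the surrounding JavaScript first. Here is a minimal sketch of one way to do that, assuming the array in the script is strict JSON (double-quoted keys, no trailing commas or comments) and that the first parseable [...] in a script is the one you want. The helper names script_texts, first_json_array and array_from_page are made up for illustration; the BeautifulSoup import sits inside script_texts only so that the JSON helper can be tried without the third-party package installed.

```python
import json


def script_texts(html):
    """Return the text content of every non-empty <script> tag in the page."""
    # Third-party dependency (pip install beautifulsoup4), imported locally so
    # first_json_array below can be used on its own.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [tag.string for tag in soup.find_all("script") if tag.string]


def first_json_array(text):
    """Decode the first valid JSON array found in text, or return None."""
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start` and
            # ignores whatever follows it (for example a trailing ";").
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # This "[" did not start valid JSON (e.g. an index like a[i]);
            # try the next one.
            start = text.find("[", start + 1)
    return None


def array_from_page(html):
    """Return the first JSON array found in any <script> tag, or None."""
    for text in script_texts(html):
        data = first_json_array(text)
        if data is not None:
            return data
    return None
```

Calling array_from_page(html) with the HTML you already fetched would return a Python list of dicts if such an array exists. If the script contains a JavaScript object literal that is not valid JSON (unquoted keys, single quotes), this approach will not parse it and a different strategy is needed.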