Scraping a JavaScript / JSON object from a web page with BeautifulSoup?

Question

I am using BeautifulSoup to get the HTML of a webpage, and that works fine so far. What I really want, though, is the contents of a JavaScript chunk inside the HTML. It is wrapped in <script type="text/javascript">, and somewhere inside that tag there is a giant array containing a lot of {} brackets, which I believe is a JSON array.

Is there a way to extract that entire array from within the HTML?

Answer 1

You are looking for the function json.loads, which parses a JSON string into Python objects.

>>> import json
>>> obj = json.loads('{"a": 12, "b": null}')
>>> obj
{'a': 12, 'b': None}
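
json.loads needs a string that is JSON and nothing else, so the array first has to be cut out of the surrounding JavaScript. Below is a minimal sketch of one way to do that with the standard library alone. It assumes the array literal is strict JSON (double-quoted keys, no trailing commas, no comments). The function name extract_json_array and the sample script body are made up for illustration and do not come from the question.

```python
import json


def extract_json_array(script_text):
    """Return the first JSON array found in script_text, or None."""
    decoder = json.JSONDecoder()
    start = script_text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever JavaScript follows it.
            value, _end = decoder.raw_decode(script_text, start)
            return value
        except json.JSONDecodeError:
            # This "[" did not begin valid JSON; try the next one.
            start = script_text.find("[", start + 1)
    return None


# Hypothetical script body, standing in for the text of the <script> tag.
sample = 'var items = [{"a": 12, "b": null}, {"a": 7, "b": "x"}];'
items = extract_json_array(sample)
print(items)
```

With BeautifulSoup, the text to pass in would typically be the .string of the matching tag, for example one of the results of soup.find_all("script", type="text/javascript"). If the script contains something like arr[0] before the real data, this sketch would return that small array first, so a real page may need a more specific starting point such as the variable name.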

Answer 1: 0, accepted, 2015-06-06 00:58:13
