
How to parse a JSON file from HTML into Python using BeautifulSoup?

I have an HTML page that contains various pieces of information, such as:

<input class="json-data" id="init-data" type="hidden" value='{"keyboardShortcuts":[{"name":"xxxx","description":"yyyyy", > etc...

How can I obtain something like:

name = xxxx

This is my code so far, but it prints None every time:

content = page.read()
html = BeautifulSoup(content, "html.parser")
element = html.find("input", class_="json-data", value_="keyboardShortcuts")

Pull out the value attribute, parse it with json.loads, and then you can read the value of the name key:

from bs4 import BeautifulSoup
import json

# Sample markup; the value attribute holds the JSON string
content = '''
<input class="json-data" id="init-data" type="hidden" value='{"keyboardShortcuts":[{"name":"xxxx","description":"yyyyy"}]}'>
'''

html = BeautifulSoup(content, "lxml")  # "html.parser" also works if lxml is not installed
# Find the input by id, take its value attribute, and decode it as JSON
jsonData = json.loads(html.find('input', {'id': 'init-data'})['value'])

print(jsonData['keyboardShortcuts'][0]['name'])

Output:

xxxx
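
As a side note on why the question's call prints None: BeautifulSoup only special-cases class_ among keyword filters, so value_ is treated as a filter on an attribute literally named "value_", which the tag does not have. Even value="keyboardShortcuts" would not match, because a string filter compares against the whole attribute value. The following is a minimal sketch of a more defensive variant, not a definitive implementation. The helper name extract_shortcut_names and the SAMPLE_HTML string are hypothetical, the sample markup assumes the truncated snippet closes the way the answer above closes it, and "html.parser" is used so that lxml is not required.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical sample markup, assuming the tag and the JSON are closed as in the answer.
SAMPLE_HTML = """
<input class="json-data" id="init-data" type="hidden" value='{"keyboardShortcuts":[{"name":"xxxx","description":"yyyyy"}]}'>
"""


def extract_shortcut_names(markup):
    """Return the "name" of each "keyboardShortcuts" entry, or [] if nothing usable is found."""
    soup = BeautifulSoup(markup, "html.parser")
    # Locate the element by class and id only; the JSON is read from its value afterwards.
    element = soup.find("input", class_="json-data", id="init-data")
    if element is None or not element.has_attr("value"):
        return []
    try:
        data = json.loads(element["value"])
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    return [entry.get("name") for entry in data.get("keyboardShortcuts", [])]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# Same filter as in the question: no tag has an attribute called "value_", so nothing matches.
broken = soup.find("input", class_="json-data", value_="keyboardShortcuts")
names = extract_shortcut_names(SAMPLE_HTML)

print(broken)  # None
print(names)   # ['xxxx']
```

Returning an empty list instead of raising keeps the caller simple when the input element is missing or its value is not valid JSON; raising an exception would be an equally reasonable choice if a missing element should count as an error.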
