Load JSON data from Google GitHub repo
I am trying to load the following JSON file (from the Google GitHub repo) in Python as follows:
import json
import requests
url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
data = r.text.splitlines(True)
# Remove the first 14 lines, which are not JSON (a commented license header)
data = ''.join(data[14:])
When I use json.loads(data), I get the following error:
JSONDecodeError: Expecting ',' delimiter: line 725 column 543 (char 54975)
As this has been saved as a JSON file by the GitHub repo owner (Google), I'm wondering what I'm doing wrong here.
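The error message already says where parsing stopped (line, column and character offset), so one way to investigate is to print the text around that offset. Below is a minimal sketch of that idea. It runs on a small made-up string rather than the real file, and `show_json_error` is a hypothetical helper name, not part of any library.

```python
import json


def show_json_error(text, context=30):
    """Return None if text parses as JSON, else details about the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the character offset reported as "(char N)" in the message.
        start = max(err.pos - context, 0)
        end = err.pos + context
        return {
            "msg": err.msg,
            "line": err.lineno,
            "column": err.colno,
            "snippet": text[start:end],
        }
    return None


# Made-up sample: the inner quotes are not escaped, so the string ends early.
sample = '{"text": "He said "hi" to me", "ok": false}'
report = show_json_error(sample)
print(report)
```

Calling the same helper on the downloaded text should show what sits at the reported position, which is usually enough to tell whether the file itself deviates from strict JSON.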
I found that the text returned by the request is plain text rather than valid JSON (I checked it at https://jsonformatter.curiousconcept.com/).
Here is the code I used to extract the JSON-like part of the response. I used the re module to find it.
import ast
import re
import requests
url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
text = r.text.strip()
# Grab everything from the first '{' to the last '}' (this skips the license comment)
m = re.search(r'\{.*\}', text, re.DOTALL)
# Python spells the literal 'False', not 'false'; note this also rewrites 'false' inside string values
s = m.group(0).replace('false', 'False')
# ast.literal_eval accepts only Python literals, so unlike eval it cannot run code from the response
d = ast.literal_eval(s)
print(d) # {...}
print(type(d)) # <class 'dict'>
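The extraction idea can also be tried offline on a small made-up string, which makes it easier to see what each step does. This is only a sketch under stated assumptions: `extract_object` is a hypothetical helper, the sample text is invented, and mapping `true` and `null` as well as `false` is an addition for illustration. It uses `ast.literal_eval`, which accepts only Python literals and so does not execute code found in the text.

```python
import ast
import re


def extract_object(text):
    """Find the outermost {...} span in text and evaluate it as a Python literal."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no {...} block found")
    body = match.group(0)
    # Map JSON keywords to their Python spellings.
    # Caveat: this also rewrites those words when they appear inside string values.
    for json_word, py_word in (("true", "True"), ("false", "False"), ("null", "None")):
        body = re.sub(rf"\b{json_word}\b", py_word, body)
    return ast.literal_eval(body)


# Made-up sample: a comment header followed by a JSON-like object.
sample = '''// license header (made up)
// more comment text
{
  "name": "demo",
  "enabled": false,
  "items": [1, 2, 3]
}
'''
result = extract_object(sample)
print(result)
print(type(result))
```

Because the keyword replacement is purely textual, this approach is best treated as a workaround for files that are close to JSON but not strictly valid, not as a general parser.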