Load JSON data from Google GitHub repo

I am trying to load the following JSON file (from the Google GitHub repo) in Python as follows:

import json
import requests

url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
data = r.text.splitlines(True)
# Remove the first 14 lines, which are not JSON (a commented license header)
data = ''.join(data[14:])

When I use json.loads(data) I get the following error:

JSONDecodeError: Expecting ',' delimiter: line 725 column 543 (char 54975)

As this has been saved as a JSON file by the GitHub repo owner (Google), I'm wondering what I'm doing wrong here.

I found that the text returned by the request is plain text rather than valid JSON (I checked it at https://jsonformatter.curiousconcept.com/).
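The same check can be done locally instead of through an online validator. The following is a minimal sketch: `locate_json_error` is a hypothetical helper name, and the sample string is made up for illustration rather than taken from the real file. It relies only on the `msg`, `lineno`, `colno`, and `pos` attributes that `json.JSONDecodeError` exposes.

```python
import json


def locate_json_error(text, context=30):
    """Return None if text parses as JSON, else details about the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Show a little of the text on either side of the failing position.
        start = max(err.pos - context, 0)
        end = err.pos + context
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            "char": err.pos,
            "snippet": text[start:end],
        }
    return None


# Made-up sample: two adjacent strings with no comma between them.
sample = '{"items": ["a" "b"]}'
info = locate_json_error(sample)
print(info)
```

Calling it on the `data` string from the question should print the text around the position reported in the error, which usually makes it clear what the parser objected to.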

Here is the code I used to pull the JSON-like part out of the response, using the re module.

import json
import requests
import re

url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
text = r.text.strip()

# Grab everything from the first '{' to the last '}' (re.DOTALL lets '.' match newlines)
m = re.search(r'\{.*\}', text, re.DOTALL)
# Python spells the literal 'False', not 'false'; note this also rewrites 'false' inside strings
s = m.group(0).replace('false', 'False')
# Caution: eval() executes arbitrary code, so only use it on content you trust
d = eval(s)

print(d) # {...}
print(type(d)) # <class 'dict'>
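If running `eval` on downloaded text is a concern, `ast.literal_eval` may be worth trying as a stricter substitute. The sketch below is an assumption-laden variant, not the answer's method: `parse_json_like` is a hypothetical helper, the sample text is made up, and it assumes the extracted block contains only constructs that are also valid Python literals and that its strings are double-quoted. If the real file uses anything else, `literal_eval` raises `ValueError` or `SyntaxError` instead of executing it.

```python
import ast
import re

# Matches a double-quoted string (left untouched) or a bare JSON keyword.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\b(true|false|null)\b')
_KEYWORDS = {"true": "True", "false": "False", "null": "None"}


def parse_json_like(text):
    """Parse the outermost {...} block of text as a Python literal."""
    # Take the span from the first '{' through the last '}'.
    body = text[text.index("{"): text.rindex("}") + 1]
    # Rewrite keywords only when they appear outside double-quoted strings.
    body = _TOKEN.sub(
        lambda m: _KEYWORDS[m.group(1)] if m.group(1) else m.group(0), body
    )
    return ast.literal_eval(body)


# Made-up sample, not the real file: a comment header plus a JSON-like body.
sample = '''// license header
{"flag": false, "note": "false alarm", "parts": ["a" "b"], "extra": null}
'''
parsed = parse_json_like(sample)
print(parsed)
```

Unlike a plain `str.replace`, the keyword rewrite here leaves text inside double-quoted strings alone, so a value such as "false alarm" is preserved.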