
Load JSON data from Google GitHub repo

I am trying to load the following JSON file (from Google's GitHub repo) in Python:

import json
import requests

url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
data = r.text.splitlines(True)
# Drop the first 14 lines, which are not JSON (a commented license header)
data = ''.join(data[14:])

When I use json.loads(data) I get the following error:

JSONDecodeError: Expecting ',' delimiter: line 725 column 543 (char 54975)

Since the repo owner (Google) saved this as a .json file, I'm wondering what I'm doing wrong here.

The text returned by the request is plain text rather than valid JSON (I checked it at https://jsonformatter.curiousconcept.com/ ).
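The same check can be done locally by catching json.JSONDecodeError and printing the text around the reported position. This is only a minimal sketch: error_context and the sample string are made up for illustration, and the real response text would be passed in instead.

```python
import json

def error_context(text, width=40):
    """Return the text around the point where json.loads fails, or None if it parses.

    Hypothetical helper for illustration; not part of the json module.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the character offset reported in the error message ("char N")
        start = max(err.pos - width, 0)
        return text[start:err.pos + width]
    return None

# Made-up sample with a missing comma between two members
sample = '{"a": "x" "b": 1}'
print(repr(error_context(sample, width=5)))
```

Calling error_context on the downloaded text should show which characters near char 54975 the parser is rejecting.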

Here is the code I used to extract the JSON-like part of the response, using the re module.

import json
import requests
import re

url = "https://raw.githubusercontent.com/google/vsaq/master/questionnaires/webapp.json"
r = requests.get(url)
text = r.text.strip()

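# Grab everything from the first '{' to the last '}' (re.DOTALL lets '.' match newlines)
m = re.search(r'\{.*\}', text, re.DOTALL)
# JSON spells it 'false', Python spells it 'False'; note this also rewrites 'false' inside string values
s = m.group(0).replace('false', 'False')
# eval executes arbitrary code, so only use it on content you trust
d = eval(s)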

print(d) # {...}
print(type(d)) # <class 'dict'>
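If running eval on downloaded text is a concern, one possible alternative is to parse the text with the ast module, swap the JSON keywords for Python constants, and hand the result to ast.literal_eval, which only accepts literals. This is a sketch under the same assumption as the code above (that the extracted text is a valid Python literal); parse_literal, JsonNameRewriter and the sample string are names invented here, and I have not verified it against the actual file.

```python
import ast

# JSON keywords and the Python values they correspond to
JSON_NAMES = {"true": True, "false": False, "null": None}

class JsonNameRewriter(ast.NodeTransformer):
    """Replace bare true/false/null names with Python constants."""

    def visit_Name(self, node):
        if node.id in JSON_NAMES:
            return ast.copy_location(ast.Constant(value=JSON_NAMES[node.id]), node)
        return node

def parse_literal(text):
    """Parse a JSON-like literal without executing code (hypothetical helper)."""
    tree = ast.parse(text.strip(), mode="eval")
    tree = JsonNameRewriter().visit(tree)
    # literal_eval raises ValueError for anything that is not a plain literal
    return ast.literal_eval(tree)

# Made-up sample; the word 'false' inside the string value is left untouched
d = parse_literal('{"enabled": false, "tags": [true, null], "note": "false alarm"}')
print(d)
print(type(d))
```

Because the keywords are replaced in the syntax tree rather than in the raw text, string values that happen to contain 'false' or 'true' are not modified.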