I have a Twitter dataset (multiple JSON files), but let's start with one file. I have to parse the JSON objects into Python, but json.loads() only parses a single object. A similar question has been asked here, but the proposed solutions either do not work or are not good enough.
1- I cannot convert the JSON objects into a list, as that is inefficient and I have too much data. Also, the proposed solutions split on "\n", while my Twitter objects run back to back like }{ with no newline between them, and I cannot add newlines manually (the objects are not stored one per line).
2- The second solution is JSONStream, but there is not much about it in the official documentation.
3- Is there any other efficient way? One option I am considering is MongoDB, but I have never worked with MongoDB, so I don't know whether it can handle this.
[Screenshot in original post: shows the length of one tweet object and the }{ boundary between objects.]
import json

with open('sampledata.json', 'r', encoding='utf8') as json_file:
    while True:
        # json.load expects exactly one JSON document per file,
        # so it fails as soon as it reaches the second object
        dataobj = json.load(json_file)
        print(dataobj)
        print("Printing each JSON Decoded Object")
The error (one tweet object spans 287 lines):
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 287 column 2 (char 10528)
The while loop used while reading the JSON file is not needed. You can read a JSON file like this:

import json

def read_json(path):
    with open(path, 'r') as file:
        return json.load(file)

my_data = read_json('sampledata.json')
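Note that json.load only works when the file contains a single JSON document. If the file holds several objects concatenated back to back (the }{ pattern from the question), it will still raise "Extra data". One way around this, using only the standard library, is json.JSONDecoder.raw_decode, which parses one object from a string and returns the index where it stopped. A minimal sketch (the inline sample string is a made-up stand-in for the tweet data):

```python
import json

def iter_json_objects(text):
    """Yield each object from a string of concatenated JSON, e.g. '{...}{...}'."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # raw_decode chokes on leading whitespace, so skip it first
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        # parse one object starting at idx; raw_decode returns (obj, end_index)
        obj, idx = decoder.raw_decode(text, idx)
        yield obj

# hypothetical sample mimicking back-to-back tweet objects with no newlines
sample = '{"id": 1, "text": "hello"}{"id": 2, "text": "world"}'
tweets = list(iter_json_objects(sample))
```

This reads the whole string into memory, so for files too large to fit in RAM you would need to read and decode in chunks (or use a streaming library such as ijson); for files of a few hundred MB it is usually fine.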