I have a JSON metadata file of around 26 GB. For obvious reasons, I need to extract the first 100 lines into a new JSON file, so that I can test my code on that sample with as few changes as possible and then, once debugging is complete, run the same code on the whole file.
I have read about exporting JSON to CSV, but I wish to keep the JSON structure and file type. Is it possible to do so with Python?
My file is JSON with some extra data, so I use a workaround to read it in the first place. It looks like this:
{"_id":{"$oid":"5b9fd47507b317551a7bfb8f"},"title":"It's Okay If You Didn't Like 'Boyhood', And Here Are Many Reasons Why","url":"https://m.huffpost.com/us/entry/6694772","article_text"
And I read it like this:
import json
with open('metadata.json', 'r') as f:
    # Join the one-object-per-line records into a single JSON array string
    data = json.loads("[" + f.read().replace("}\n{", "},\n{") + "]")
Thanks!
You can try:
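import json

# Assumes file.json holds a single valid JSON array and fits in memory
with open('file.json') as ip_file:
    o = json.load(ip_file)

chunkSize = 100
# Write each chunk of 100 records to its own file so every output is valid JSON
# (appending several arrays to one file would not parse as JSON)
for n, i in enumerate(range(0, len(o), chunkSize)):
    with open('new_file_{}.json'.format(n), 'w') as out_file:
        json.dump(o[i:i+chunkSize], out_file)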
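Note that the snippet above loads the whole file with json.load, which may not be practical at 26 GB and will fail if the file is a sequence of objects rather than one array. If the file really has one JSON object per line (an assumption, based on the replace("}\n{", ...) workaround in the question), a rough sketch that only reads the first 100 lines could look like this. The function name extract_first_lines and the output file name are made up for illustration.

```python
import json


def extract_first_lines(src_path, dst_path, n=100):
    """Copy the first n non-empty lines of src_path to dst_path.

    Assumes each line is one complete JSON object. The output keeps the
    same one-object-per-line layout as the input.
    """
    written = 0
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # iterates lazily, never loads the whole file
            line = line.strip()
            if not line:
                continue
            json.loads(line)  # sanity check: raises ValueError if not valid JSON
            dst.write(line + "\n")
            written += 1
            if written >= n:
                break
    return written
```

Calling extract_first_lines('metadata.json', 'metadata_sample.json') should then give a small file in the same layout as the original, so the reading code from the question can be pointed at it without other changes.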