I have 6 json files that I would like to merge into one. I know I need to use glob, but I am having trouble understanding how to do it. I have attached the names of the files and the code I have tried. I have also created an empty json file called 'merge.json' that I would like all the jsons to be merged into. They all have the same dictionary keys, but I would like to simply merge the files, not merge all of the values into one key. I have attached what the data looks like and what I would like it to look like when merged. Thank you!
file1 = 'file1.json'
...
file6 = 'file6.json'
file1:
{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]}
file2:
{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}
...
merged:
{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]},{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}
I saw on another thread to use:
import json
import glob

result = []
for f in glob.glob("*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))

with open("merged_file.json", "wb") as outfile:
    json.dump(result, outfile)
But I do not understand what goes in "*.json", and where the files are being called. Thank you!
Let's turn this into a full working program that uses argparse, so the files can be specified on the command line. The directory holding the JSON files is then chosen at run time, and you can use the shell's globbing to list them.
#!/usr/bin/env python
"""Read a list of JSON files holding a list of dictionaries and merge into
a single JSON file holding a list of all of the dictionaries"""
import sys
import argparse
import json


def do_merge(infiles, outfile):
    merged = []
    for infile in infiles:
        with open(infile, 'r', encoding='utf-8') as infp:
            data = json.load(infp)
            # Every input must hold a top-level JSON list
            assert isinstance(data, list), "invalid input"
            # extend() adds the items themselves, so the result stays one flat list
            merged.extend(data)
    with open(outfile, 'w', encoding="utf-8") as outfp:
        json.dump(merged, outfp)
    return 0


def main(argv):
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('outfile', help="File to hold merged JSON")
    parser.add_argument('infiles', nargs='+', help="List of files to merge")
    args = parser.parse_args(argv)
    retval = do_merge(args.infiles, args.outfile)
    print(f"Merged {len(args.infiles)} files into {args.outfile}")
    return retval


if __name__ == "__main__":
    retval = main(sys.argv[1:])
    sys.exit(retval)
With sample JSON files set up as
mytest/file1.json
[{"time": 12, "sizes": [1, 2, 3], "scores": [80, 100, 77]},
{"time": 42, "sizes": [2, 3, 1], "scores": [90, 50, 67]},
{"time": 88, "sizes": [162, 124, 1], "scores": [90, 100, 97]}]
mytest/file2.json
[{"time": 52, "sizes": [192, 242, 3], "scores": [80, 100, 77]},
{"time": 482, "sizes": [2, 376, 1], "scores": [9, 50, 27]},
{"time": 643, "sizes": [93, 12, 90], "scores": [10, 400, 97]}]
And the test
~/tmp$ ./jsonmerge.py mergedjson.json mytest/*.json
Merged 2 files into mergedjson.json
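The test above relies on the shell expanding mytest/*.json before Python starts. Some shells (the Windows cmd.exe prompt, for example) pass the pattern through unexpanded, so here is a minimal sketch of doing the expansion inside Python instead. The helper names expand_patterns and merge_json_lists are hypothetical and not part of the program above; the sketch assumes every input file holds a top-level JSON list.

```python
import glob
import json


def expand_patterns(patterns):
    """Expand wildcard patterns into a sorted, de-duplicated list of paths."""
    paths = set()
    for pattern in patterns:
        matches = glob.glob(pattern)
        # Keep a literal name when nothing matches, so a missing file is
        # reported by open() later instead of being silently dropped.
        paths.update(matches if matches else [pattern])
    return sorted(paths)


def merge_json_lists(paths, outfile):
    """Concatenate the top-level lists of several JSON files into one list."""
    merged = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as infp:
            data = json.load(infp)
        if not isinstance(data, list):
            raise ValueError(f"{path} does not hold a JSON list")
        merged.extend(data)
    with open(outfile, "w", encoding="utf-8") as outfp:
        json.dump(merged, outfp)
    # Return the number of merged items so the caller can report it
    return len(merged)
```

With this, something like merge_json_lists(expand_patterns(["mytest/*.json"]), "mergedjson.json") should behave the same whether or not the shell expanded the pattern first.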
Put all your JSON files in one directory and run this code from that same directory:
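import json
import glob

result = []
# sorted() gives a stable order; glob.glob() returns matches in arbitrary order
for f in sorted(glob.glob("*.json")):
    if f == "merged_file.json":
        continue  # skip the output left over from a previous run
    with open(f, "r", encoding="utf-8") as infile:
        # Each file's parsed content becomes one element of result.
        # Use result.extend(...) instead if every file holds a list
        # and you want a single flat list.
        result.append(json.load(infile))

# Text mode is required: json.dump() writes str, not bytes, in Python 3
with open("merged_file.json", "w", encoding="utf-8") as outfile:
    json.dump(result, outfile)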
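This will produce merged_file.json, which contains the data from all of the JSON files.
The call glob.glob("*.json") returns the name of every file in the current directory that ends in .json. The pattern "*.json" is a wildcard, so you do not list the files yourself. The order of the results is not guaranteed; wrap the call in sorted() if the order matters.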
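To make the wildcard concrete, here is a small sketch that applies the same pattern to a hard-coded list of names using the standard fnmatch module, so it runs without touching the file system. The helper select_inputs and the sample names are made up for illustration; glob.glob does this style of matching against the names actually present in a directory.

```python
import fnmatch

# Hypothetical directory listing, used only for illustration
names = ["file1.json", "file2.json", "notes.txt", "merged_file.json"]
OUTPUT = "merged_file.json"


def select_inputs(names, pattern="*.json", exclude=OUTPUT):
    """Return the names matching the pattern, minus the output file, sorted."""
    # "*" matches any run of characters, so "*.json" means "ends in .json"
    matches = fnmatch.filter(names, pattern)
    return sorted(name for name in matches if name != exclude)


print(select_inputs(names))  # ['file1.json', 'file2.json']
```

Excluding the output name matters when the script is run more than once in the same directory, because the previous result also ends in .json and would otherwise be picked up as an input.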
You could try something like this:
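import glob
import os

OUTPUT = "merged.json"
# List the inputs first, skipping the output file from any previous run
a = [p for p in glob.glob('./*.json') if os.path.basename(p) != OUTPUT]
print(a)

# Copy every line of every input file into the output, unchanged
with open(OUTPUT, "w") as merged:
    for i in a:
        with open(i, "r") as f:
            for j in f:
                merged.write(j)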
If you intend to use the merged file as valid JSON, it has to be structured properly: the files' contents need to be wrapped in brackets and separated by commas. (This assumes that each individual file is valid JSON.)
Building on @tdelaney's answer:
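import glob

OUTPUT = "merged_file.json"
# Collect the inputs before opening the output, and skip the output file
# itself so it is never read back into the merge.
files = sorted(f for f in glob.glob("*.json") if f != OUTPUT)

# Binary mode throughout, so everything written must be bytes
with open(OUTPUT, "wb") as outfile:
    outfile.write(b"[")
    for counter, f in enumerate(files, start=1):
        with open(f, "rb") as infile:
            line = None
            for line in infile:
                outfile.write(line)
            # Make sure each file's content ends with a newline
            if line is not None and not line.endswith(b"\n"):
                outfile.write(b"\n")
        # Separate the files with commas, but not after the last one
        if counter < len(files):
            outfile.write(b",")
    outfile.write(b"]")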
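Concatenating the files inside one pair of brackets gives a list whose elements are the individual files' contents, so if each file holds a list, the result is a list of lists rather than the single flat list shown in the question. A minimal sketch of checking that the merged file parses and flattening it by one level follows; flatten_one_level is a hypothetical helper name.

```python
import json


def flatten_one_level(merged_path):
    """Load a merged JSON file and flatten nested lists by one level."""
    with open(merged_path, "r", encoding="utf-8") as fp:
        # json.load raises json.JSONDecodeError if the file is not valid JSON
        outer = json.load(fp)
    if not isinstance(outer, list):
        raise ValueError("expected a top-level JSON list")
    flat = []
    for item in outer:
        if isinstance(item, list):
            flat.extend(item)  # unpack the contents of one input file
        else:
            flat.append(item)  # keep non-list items as they are
    return flat
```

The returned list can then be written back with json.dump if the flat form is the one you want on disk.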