How to merge several json files into one using python
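I have 6 json files that I want to merge into one. I know I need to use glob, but I can't work out how to use it. I've attached the file names and the code I tried. I have also created an empty json file called "merge.json" that I would like all of the json files merged into. They all have the same dictionary keys, but I simply want to join the files together, not merge all of the values under one key. I've attached what the data looks like and what I'd like it to look like once merged. Thanks!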
file1 = 'file1.json'
...
file6 = 'file6.json'
File 1:
{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]}
File 2:
{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}
...
Merged:
{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]},{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}
I saw on another thread that I should use:
import json
import glob
result = []
for f in glob.glob("*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))
with open("merged_file.json", "wb") as outfile:
    json.dump(result, outfile)
But I don't understand what goes in "*.json" or where the files actually get called. Thanks!
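Let's turn this into a complete working program using argparse so that the files can be specified on the command line. That way you decide at run time which directory holds the JSON files you want, and you can use the shell's globbing to list them.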
#!/usr/bin/env python
"""Read a list of JSON files holding a list of dictionaries and merge into
a single JSON file holding a list of all of the dictionaries"""
import sys
import argparse
import json

def do_merge(infiles, outfile):
    merged = []
    for infile in infiles:
        with open(infile, 'r', encoding='utf-8') as infp:
            data = json.load(infp)
            assert isinstance(data, list), "invalid input"
            merged.extend(data)
    with open(outfile, 'w', encoding="utf-8") as outfp:
        json.dump(merged, outfp)
    return 0

def main(argv):
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('outfile', help="File to hold merged JSON")
    parser.add_argument('infiles', nargs='+', help="List of files to merge")
    args = parser.parse_args(argv)
    retval = do_merge(args.infiles, args.outfile)
    print(f"Merged {len(args.infiles)} files into {args.outfile}")
    return retval

if __name__ == "__main__":
    retval = main(sys.argv[1:])
    sys.exit(retval)
With the sample JSON files set up as
mytest/file1.json
[{"time": 12, "sizes": [1, 2, 3], "scores": [80, 100, 77]},
{"time": 42, "sizes": [2, 3, 1], "scores": [90, 50, 67]},
{"time": 88, "sizes": [162, 124, 1], "scores": [90, 100, 97]}]
mytest/file2.json
[{"time": 52, "sizes": [192, 242, 3], "scores": [80, 100, 77]},
{"time": 482, "sizes": [2, 376, 1], "scores": [9, 50, 27]},
{"time": 643, "sizes": [93, 12, 90], "scores": [10, 400, 97]}]
and the test
~/tmp$ ./jsonmerge.py mergedjson.json mytest/*.json
Merged 2 files into mergedjson.json
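If you would rather not depend on the shell to expand the wildcard (on Windows, for example), the same merge can be driven from inside Python. The following is only a sketch under a few assumptions: the function name merge_json_lists is made up for illustration, every input file is assumed to hold a JSON list, and the output file is skipped in case it matches the pattern.

```python
import glob
import json
import os


def merge_json_lists(pattern, outfile):
    """Merge every JSON file matching pattern (each holding a list) into outfile.

    Hypothetical helper for illustration; returns the number of files merged.
    """
    out_abs = os.path.abspath(outfile)
    # sorted() gives a repeatable order; the output file is excluded so that
    # a second run does not merge the previous result into itself.
    infiles = sorted(p for p in glob.glob(pattern)
                     if os.path.abspath(p) != out_abs)
    merged = []
    for path in infiles:
        with open(path, "r", encoding="utf-8") as infp:
            data = json.load(infp)
        if not isinstance(data, list):
            raise ValueError(f"{path} does not hold a JSON list")
        merged.extend(data)
    with open(outfile, "w", encoding="utf-8") as outfp:
        json.dump(merged, outfp)
    return len(infiles)
```

With the sample layout above, merge_json_lists("mytest/*.json", "mergedjson.json") should write the same merged list as the command-line run.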
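Put all of the JSON files in one directory and run this code from that same directory
import json
import glob

result = []
for f in glob.glob("*.json"):
    if f == "merged_file.json":
        continue  # skip the output of a previous run
    with open(f, "r", encoding="utf-8") as infile:
        result.append(json.load(infile))

# json.dump writes text, so on Python 3 the output is opened with "w", not "wb"
with open("merged_file.json", "w", encoding="utf-8") as outfile:
    json.dump(result, outfile)
This will produce merged_file.json, which will contain the merged data from all of the JSON files. Because each file's content is appended as a single item, the result is a list with one entry per input file.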
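for f in glob.glob("*.json")
will iterate over every file in the current directory whose name ends in .json. The * matches any run of characters, so nothing has to be listed by hand. The files come back in whatever order the operating system reports them, which is not guaranteed to be alphabetical.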
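To see what the pattern actually picks up, and to get a predictable order, it can help to list the matches before merging anything. This is a small sketch; the helper name list_json_files is an assumption for illustration and is not part of the glob module.

```python
import glob
import os


def list_json_files(directory="."):
    """Return the paths of the .json files in directory, sorted by name."""
    # "*.json" matches file1.json, file2.json, ... but not notes.txt
    pattern = os.path.join(directory, "*.json")
    # sorted() makes the order repeatable across runs and platforms
    return sorted(glob.glob(pattern))
```

Printing list_json_files() from the directory that holds file1.json to file6.json is a quick way to confirm which files the merge loop will open.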
Maybe you can try something like the code below (check the repl.it code):
import glob
a = glob.glob('./*.json')
print(a)
merged = open("merged.json", "w+")
for i in a:
    with open(i, "r") as f:
        for j in f.readlines():
            merged.write(j)
merged.close()
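One thing to keep in mind with copying lines across unchanged: two JSON arrays written one after the other do not form a single JSON document. A quick check with made-up sample data (the string below is illustrative, not taken from the files in the question):

```python
import json

# Two valid JSON arrays written back to back, as plain concatenation produces
concatenated = '[{"time": 12}]\n[{"time": 52}]\n'

try:
    json.loads(concatenated)
    is_valid = True
except json.JSONDecodeError:
    # The parser stops after the first array and rejects the remaining data
    is_valid = False

print(is_valid)
```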
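If you intend to use the merged json as valid json, then you have to make it well structured. (This assumes the individual json files are valid json):
Building on @tdelaney's answer:
import glob

# Collect the inputs first so that the output file is not picked up by the glob
infiles = [f for f in sorted(glob.glob("*.json")) if f != "merged_file.json"]
with open("merged_file.json", "wb") as outfile:
    # The file is opened in binary mode, so everything written must be bytes
    outfile.write(b"[")
    for counter, f in enumerate(infiles, start=1):
        with open(f, "rb") as infile:
            line = None
            for line in infile:
                outfile.write(line)
            if line is not None and not line.endswith(b"\n"):
                outfile.write(b"\n")
        if counter < len(infiles):
            outfile.write(b",")
    outfile.write(b"]")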
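Wrapping each file's content in an outer list gives a list of lists when every input file already holds a list. If a single flat list is wanted, as in the question, the merged file can be loaded and flattened by one level afterwards. A sketch only; flatten_merged is a hypothetical helper name.

```python
import json


def flatten_merged(path):
    """Load a merged JSON file and flatten one level of list nesting, if any."""
    with open(path, "r", encoding="utf-8") as fp:
        # json.load raises json.JSONDecodeError if the merged file is malformed,
        # so this doubles as a validity check
        data = json.load(fp)
    flat = []
    for item in data:
        if isinstance(item, list):
            flat.extend(item)  # content of one input file that held a list
        else:
            flat.append(item)  # content of one input file that held a single value
    return flat
```

Calling flatten_merged("merged_file.json") and writing the result back with json.dump would give one list of dictionaries.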