
How to merge several json files into one using python

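I have 6 json files that I would like to merge into one. I know I need to use glob, but I am having trouble understanding how to use it. I have attached the names of the files and the code I have tried. I have also created an empty json file called "merge.json" that I would like all of the json files to be merged into. They all have the same dictionary keys, but I want to simply merge the files, not merge all of the values into one key. I have attached what the data looks like and what I would like it to look like once merged. Thank you!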

file1 = 'file1.json'
...
file6 = 'file6.json'

File 1:

{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]}

File 2:

{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}

...

Merged:

{time:12, 'sizes':[1,2,3], 'scores':[80,100,77]},{time:42, 'sizes':[2,3,1], 'scores':[90,50,67]},{time:88, 'sizes':[162,124,1], 'scores':[90,100,97]},{time:52, 'sizes':[192,242,3], 'scores':[80,100,77]},{time:482, 'sizes':[2,376,1], 'scores':[9,50,27]},{time:643, 'sizes':[93,12,90], 'scores':[10,400,97]}

I saw on another thread that I should use:

import json
import glob

result = []
for f in glob.glob("*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))

with open("merged_file.json", "w") as outfile:  # text mode: json.dump writes str, not bytes
     json.dump(result, outfile)

But I don't understand what goes in "*.json" or where the files are actually referenced in this code. Thank you!

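Let's turn this into a complete working program using argparse so that the files can be specified on the command line. Which directory holds the JSON files you want can then be decided at run time, and you can use the shell's globbing to list them.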

#!/usr/bin/env python

"""Read a list of JSON files holding a list of dictionaries and merge into
a single JSON file holding a list of all of the dictionaries"""

import sys
import argparse
import json

def do_merge(infiles, outfile):
    merged = []
    for infile in infiles:
        with open(infile, 'r', encoding='utf-8') as infp:
            data = json.load(infp)
            assert isinstance(data, list), "invalid input"
            merged.extend(data)
    with open(outfile, 'w', encoding="utf-8") as outfp:
        json.dump(merged, outfp)
    return 0

def main(argv):
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('outfile', help="File to hold merged JSON")
    parser.add_argument('infiles', nargs='+', help="List of files to merge")
    args = parser.parse_args(argv)
    retval = do_merge(args.infiles, args.outfile)
    print(f"Merged {len(args.infiles)} files into {args.outfile}")
    return retval

if __name__ == "__main__":
    retval = main(sys.argv[1:])
    sys.exit(retval)

The sample JSON files are set up as

mytest/file1.json

[{"time": 12, "sizes": [1, 2, 3], "scores": [80, 100, 77]},
{"time": 42, "sizes": [2, 3, 1], "scores": [90, 50, 67]},
{"time": 88, "sizes": [162, 124, 1], "scores": [90, 100, 97]}]

mytest/file2.json

[{"time": 52, "sizes": [192, 242, 3], "scores": [80, 100, 77]},
{"time": 482, "sizes": [2, 376, 1], "scores": [9, 50, 27]},
{"time": 643, "sizes": [93, 12, 90], "scores": [10, 400, 97]}]

and a test run:

~/tmp$ ./jsonmerge.py mergedjson.json mytest/*.json
Merged 2 files into mergedjson.json
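
The call to `merged.extend(data)` in `do_merge` is what yields one flat list of dictionaries, which is the shape the question asks for. Using `append` instead would nest each file's list inside the result. A minimal in-memory sketch of the difference, using shortened stand-ins for the sample data (the variable names here are illustrative only):

```python
import json

# Shortened stand-ins for the parsed contents of two input files.
file1 = json.loads('[{"time": 12}, {"time": 42}]')
file2 = json.loads('[{"time": 52}]')

nested = []
flat = []
for data in (file1, file2):
    nested.append(data)  # adds the whole list as a single element
    flat.extend(data)    # adds each dictionary individually

print(json.dumps(nested))  # [[{"time": 12}, {"time": 42}], [{"time": 52}]]
print(json.dumps(flat))    # [{"time": 12}, {"time": 42}, {"time": 52}]
```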

Put all of the JSON files in one directory and run this code from that same directory:

import json
import glob

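result = []
# sorted() gives a predictable order; skip the output file in case it already exists
for f in sorted(glob.glob("*.json")):
    if f == "merged_file.json":
        continue
    with open(f, "r", encoding="utf-8") as infile:
        result.append(json.load(infile))

# json.dump writes str, so the output file must be opened in text mode
with open("merged_file.json", "w", encoding="utf-8") as outfile:
    json.dump(result, outfile)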

This produces a merged_file.json containing the data from all of the JSON files, with each file's top-level value stored as one element of a list.

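for f in glob.glob("*.json") iterates over every file in the current directory whose name ends in .json. The order of the results is not guaranteed and depends on the operating system, so wrap the call in sorted() if the order matters.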
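
To make the pattern concrete: `"*.json"` is a shell-style wildcard in which `*` stands for any run of characters, so the files are never named individually in the code; `glob.glob` looks them up on disk. The sketch below is only an illustration: it creates throwaway files in a temporary directory (the file names and the helper `list_json_files` are made up for this example) and shows which ones the pattern picks up.

```python
import glob
import os
import tempfile

def list_json_files(directory):
    """Return the .json files directly inside `directory`, sorted by name."""
    pattern = os.path.join(directory, "*.json")
    return sorted(glob.glob(pattern))

# Demonstration with throwaway files (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("file2.json", "file1.json", "notes.txt"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fp:
            fp.write("[]")
    matched = [os.path.basename(p) for p in list_json_files(tmp)]
    print(matched)  # ['file1.json', 'file2.json']
```

The `.txt` file is ignored because it does not match the pattern, and `sorted()` fixes the order regardless of how the operating system lists the directory.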

Maybe you can try something like the following; see the repl.it code:

import glob

a = glob.glob('./*.json')
print (a)

merged = open("merged.json", "w+")
for i in a:
  with open(i, "r") as f:
    for j in f.readlines():
      merged.write(j)

merged.close()

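If you intend to use the merged json as valid json, then you have to give it a well-formed structure. (This assumes the individual json files are themselves valid json):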

Building on @tdelaney's answer:

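import glob

# Collect the inputs first so the output file is not picked up by the glob.
files = sorted(f for f in glob.glob("*.json") if f != "merged_file.json")

with open("merged_file.json", "wb") as outfile:
    outfile.write(b"[")
    for counter, f in enumerate(files, start=1):
        with open(f, "rb") as infile:
            line = None
            for line in infile:
                outfile.write(line)
            # Make sure each file's content ends with a newline.
            if line is not None and not line.endswith(b"\n"):
                outfile.write(b"\n")
        # Separate files with a comma, but not after the last one.
        if counter < len(files):
            outfile.write(b",")
    outfile.write(b"]")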
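
If the input files are small enough to parse, an alternative to stitching raw bytes together is to load each file with `json.load` and write the result with `json.dump`, which guarantees that the output is valid JSON. The following is a minimal sketch under a few assumptions: the helper name `merge_json_dir` and the default output name are hypothetical, files whose top level is a list are flattened into the result, and any other top-level value is appended as a single element.

```python
import glob
import json
import os
import tempfile

def merge_json_dir(directory, out_name="merged_file.json"):
    """Merge every *.json file in `directory` into one flat JSON list."""
    out_path = os.path.join(directory, out_name)
    merged = []
    for path in sorted(glob.glob(os.path.join(directory, "*.json"))):
        if os.path.abspath(path) == os.path.abspath(out_path):
            continue  # never read a previous run's output back in
        with open(path, "r", encoding="utf-8") as fp:
            data = json.load(fp)
        if isinstance(data, list):
            merged.extend(data)  # flatten a list of records
        else:
            merged.append(data)  # keep a single object as one record
    with open(out_path, "w", encoding="utf-8") as fp:
        json.dump(merged, fp)
    return merged

# Demonstration with throwaway files (contents are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "file1.json": [{"time": 12}, {"time": 42}],
        "file2.json": {"time": 52},
    }
    for name, content in samples.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fp:
            json.dump(content, fp)
    first = merge_json_dir(tmp)
    second = merge_json_dir(tmp)  # rerun: the existing output file is skipped
    print(first == second)  # True
```

Skipping the output path matters because the merged file itself ends in `.json`, so a second run in the same directory would otherwise merge the previous result back into itself.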

