Issue with merging JSON files with uneven fields

I am trying to merge multiple JSON files of varying size and with uneven fields.

Here is what I mean:

JSON Example #1

{"firstName":"John", "lastName":"Doe"}

JSON Example #2

{"firstName":"John", "lastName":"Doe", "middleName":"Doe"}

Both files are saved in the same location as Merger.py, the script I am using to merge these two JSON files.

Merger.py

import json
import glob

result = []
for f in glob.glob("*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))

with open("merged_file.json", "wb") as outfile:
     json.dump(result, outfile)

When I try to execute it, however, I keep getting this error:

Traceback (most recent call last):
  File "Merger.py", line 10, in <module>
    json.dump(result, outfile)
  File "C:\\Users\\<...Directory Path...>\\Programs\\Python\\Python37-32\\lib\\json__init__.py", line 180, in dump
    fp.write(chunk)
TypeError: a bytes-like object is required, not 'str'

I understand this is due to the uneven fields in the JSON files.

My question is: is there a workaround for this situation?

The original files consist of 91+ million records each, so manual merging is out of the question (not that I have not tried that too).

"I understand this is due to uneven fields in the JSON files."

No, that is not the problem.

You open the input files with open(f, "rb") and the output file with open("merged_file.json", "wb"). That is the mistake: JSON is text, and json.dump writes str objects, which a file opened in binary mode rejects. Use open(f, "r") and open("merged_file.json", "w") instead.

Use "rb" or "wb" only when the file is binary rather than text.

TypeError: a bytes-like object is required, not 'str'

The exception message itself points to the problem.
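The difference between the two modes can be reproduced without touching any files. The following is only a minimal sketch: io.BytesIO and io.StringIO stand in for files opened with "wb" and "w" respectively, and the sample record is taken from the question.

```python
import io
import json

record = {"firstName": "John", "lastName": "Doe"}

# A binary buffer stands in for a file opened with "wb".
binary_out = io.BytesIO()
try:
    json.dump(record, binary_out)
    error_message = None
except TypeError as exc:
    # json.dump passes str chunks to write(), which a binary stream rejects.
    error_message = str(exc)

# A text buffer stands in for a file opened with "w".
text_out = io.StringIO()
json.dump(record, text_out)
text_result = text_out.getvalue()

print(error_message)
print(text_result)
```

The binary buffer fails in the same way as the original script, while the text buffer accepts the output unchanged.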

So your code should probably look like this:

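import glob
import json

result = []
for f in glob.glob("*.json"):
    # Skip the output of a previous run so it is not merged into itself.
    if f == "merged_file.json":
        continue
    # Text mode ("r"): JSON is text, not binary data.
    with open(f, "r") as infile:
        result.append(json.load(infile))

# Wrap the list of loaded documents in a single top-level object.
Yourjson = {
    "result": result
}

# Text mode ("w"): json.dump writes str, not bytes.
with open("merged_file.json", "w") as outfile:
    json.dump(Yourjson, outfile, indent=4)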
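To check that uneven fields really are harmless, the same merge can be run on the two example records from the question. This is a sketch under stated assumptions: merge_json_files is a hypothetical helper name, the file names example1.json and example2.json are made up for illustration, and a temporary directory is used so nothing in the working directory is touched.

```python
import glob
import json
import os
import tempfile


def merge_json_files(directory, output_name="merged_file.json"):
    """Merge every *.json file in `directory` into one file (hypothetical helper)."""
    output_path = os.path.join(directory, output_name)
    result = []
    for path in sorted(glob.glob(os.path.join(directory, "*.json"))):
        if path == output_path:
            continue  # do not merge a previous output into itself
        with open(path, "r", encoding="utf-8") as infile:
            result.append(json.load(infile))
    with open(output_path, "w", encoding="utf-8") as outfile:
        json.dump({"result": result}, outfile, indent=4)
    return output_path


with tempfile.TemporaryDirectory() as tmp:
    # The two records from the question; the file names are assumptions.
    samples = {
        "example1.json": {"firstName": "John", "lastName": "Doe"},
        "example2.json": {"firstName": "John", "lastName": "Doe", "middleName": "Doe"},
    }
    for name, record in samples.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            json.dump(record, f)

    merged_path = merge_json_files(tmp)
    with open(merged_path, "r", encoding="utf-8") as f:
        merged = json.load(f)

print(merged)
```

Each document is kept as its own dictionary inside the list, so a record with an extra middleName key sits next to one without it. Note that json.load reads a whole file into memory, so with files of the size mentioned in the question this approach may need a lot of RAM.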
