
How to merge multiple JSON files into one using Python

Here's my code, really simple stuff...

Here, I am trying to merge multiple JSON files into a single JSON file:

import json
import glob

result = []
for f in glob.glob("*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))

with open("merged_file.json", "wb") as outfile:
    json.dump(result, outfile)

File 1 is:

{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3402}
{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3389}

File 2 is:

{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3402}
{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3389}

The files do contain redundant records, two records per file. How do I merge these two files into a single JSON file?

I am getting the following error:

 JSONDecodeError: Extra data: line 2 column 1 (char 62)

with the following traceback:

JSONDecodeError                           Traceback (most recent call last)
<ipython-input-2-d33a95f39988> in <module>
  5 for f in glob.glob("*.json"):
  6     with open(f, "rb") as infile:
----> 7         result.append(json.load(infile))
  8 
  9 with open("merged_file.json", "wb") as outfile:

~\anaconda3\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, 
parse_constant, object_pairs_hook, **kw)
294         cls=cls, object_hook=object_hook,
295         parse_float=parse_float, parse_int=parse_int,
--> 296         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
297 
298 

~\anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, 
parse_constant, object_pairs_hook, **kw)
346             parse_int is None and parse_float is None and
347             parse_constant is None and object_pairs_hook is None and not kw):
--> 348         return _default_decoder.decode(s)
349     if cls is None:
350         cls = JSONDecoder

~\anaconda3\lib\json\decoder.py in decode(self, s, _w)
338         end = _w(s, end).end()
339         if end != len(s):
--> 340             raise JSONDecodeError("Extra data", s, end)
341         return obj

Please help :(

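The file format is incorrect. Each file holds several JSON objects, one per line, which is not a single valid JSON document, so json.load stops after the first object and reports "Extra data". A valid file should contain either a list of dictionaries or one dictionary with unique keys.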
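To illustrate the point, here is a minimal sketch that uses the two records from the question as an in-memory string (the variable names are only illustrative). Parsing the whole text as one document fails, while parsing it line by line succeeds.

```python
import json

# Two JSON objects, one per line, as in the files from the question.
text = (
    '{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3402}\n'
    '{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3389}\n'
)

# Parsing the whole text as a single document fails after the first object.
try:
    json.loads(text)
    whole_file_error = None
except json.JSONDecodeError as exc:
    whole_file_error = exc.msg

# Parsing each non-empty line on its own works.
records = [json.loads(line) for line in text.splitlines() if line.strip()]

print(whole_file_error)
print(len(records))
```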

If you cannot modify the files, you can read their content as plain text and append it to the result.

Read each file and append its content to the result:

import glob

# Collect the raw text of every matching file, without parsing it
result = ''
for f in glob.glob("*.json"):
    with open(f, "r") as infile:
        result += infile.read()

Then write the final result into another file:

with open("merged_file.json", "w") as outfile:
    outfile.write(result)
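Plain concatenation has two pitfalls: a file that does not end with a newline gets glued to the first line of the next file, and on a second run the pattern "*.json" also matches merged_file.json itself. The following is one possible sketch that avoids both; the function name concat_json_lines and its parameters are made up for this example.

```python
import glob
import os


def concat_json_lines(pattern, out_path):
    """Write the non-blank lines of all files matching pattern to out_path.

    Returns the number of lines written. (Hypothetical helper, not part
    of the original answer.)
    """
    out_abs = os.path.abspath(out_path)
    lines = []
    for path in sorted(glob.glob(pattern)):
        # Skip the output file so a rerun does not merge it into itself.
        if os.path.abspath(path) == out_abs:
            continue
        with open(path, "r", encoding="utf-8") as infile:
            # Drop blank lines and strip surrounding whitespace.
            lines.extend(line.strip() for line in infile if line.strip())
    with open(out_path, "w", encoding="utf-8") as outfile:
        # Every record ends with exactly one newline.
        outfile.write("".join(line + "\n" for line in lines))
    return len(lines)
```

It could be called as concat_json_lines("*.json", "merged_file.jsonl"). The .jsonl extension is a common convention for files with one JSON object per line, and it keeps the output from matching "*.json".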

Output:

{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3402}
{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3389}
{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3402}
{"playlist_track.PlaylistId":1,"playlist_track.TrackId":3389}

With the above solution, I would definitely change the file extension to .txt or something else that is not .json, because the merged file is still not valid JSON.

My recommendation would be to convert the data into valid JSON and save it that way.

Read each line and convert it into a dict. The result will be a list of dicts, which is JSON serializable:

import glob
import json

result = []
for f in glob.glob("*.json"):
    with open(f, "r") as infile:
        for line in infile:
            if line.strip():  # skip blank lines, which json.loads would reject
                result.append(json.loads(line))

Once this is done, you can save the content as a JSON file:

with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile)
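Reading line by line assumes exactly one JSON object per line. If the objects might span several lines or sit next to each other on one line, one alternative is json.JSONDecoder.raw_decode, which parses a single value and returns the position where it stopped. The following is a sketch of that idea; the helper name iter_json_objects is hypothetical.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Parse one value and move pos to just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


sample = '{"a": 1}\n{"b":\n 2} {"c": 3}'
print(list(iter_json_objects(sample)))
```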

You can now open the file as a JSON file:

import json
from pprint import pformat
with open("merged_file.json", "r") as fp:
    print(pformat(json.load(fp)))

Output:

[{'playlist_track.PlaylistId': 1, 'playlist_track.TrackId': 3402},
 {'playlist_track.PlaylistId': 1, 'playlist_track.TrackId': 3389},
 {'playlist_track.PlaylistId': 1, 'playlist_track.TrackId': 3402},
 {'playlist_track.PlaylistId': 1, 'playlist_track.TrackId': 3389}]
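The question mentions that the two files contain the same records, so the merged list holds every record twice. If those duplicates are not wanted, they could be dropped before saving. This is a sketch under the assumption that two records count as duplicates only when all keys and values match; the function name dedupe_records is made up for this example.

```python
import json


def dedupe_records(records):
    """Return records without exact duplicates, keeping first occurrences."""
    seen = set()
    unique = []
    for record in records:
        # Dicts are not hashable, so use a canonical JSON string as the key.
        key = json.dumps(record, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# The merged list from the two sample files in the question.
merged = [
    {"playlist_track.PlaylistId": 1, "playlist_track.TrackId": 3402},
    {"playlist_track.PlaylistId": 1, "playlist_track.TrackId": 3389},
    {"playlist_track.PlaylistId": 1, "playlist_track.TrackId": 3402},
    {"playlist_track.PlaylistId": 1, "playlist_track.TrackId": 3389},
]
unique = dedupe_records(merged)
print(len(unique))
```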
