Merge several json picking the odd one for each value in Python

I currently have N json input files that all have the same structure, but where N - 1 values are defined as "None" for each of them. I want to combine them into a single json where, much like a git merge/patch, it always picks the set value (i.e., the one different from "None"). Here is an example (fictitious):

json 1: {'a': 'aaa', ['b': 'None', 'c': 'None']}
json 2: {'a': 'None', ['b': 'bbb', 'c': 'None']}
json 3: {'a': 'None', ['b': 'None', 'c': 'ccc']}

expected result: {'a': 'aaa', ['b': 'bbb', 'c': 'ccc']}

At the moment, I'm thinking of using a zip over all the input files, iterating over each value and picking whatever is not 'None' to compose the output file. However, I'm thinking there must be a cleaner way of doing it that I'm just not seeing now. Thanks in advance!

The format of your json files is incorrect right now: the square brackets around the key/value pairs are not valid json. You should verify that and update the code accordingly. For now, I have converted your json to the following format:

json_1 = {"a": "aaa", "b": "None", "c": "None"}
json_2 = {"a": "None", "b": "bbb", "c": "None"}
json_3 = {"a": "None", "b": "None", "c": "ccc"}

If the data is in files, you can load it like this:

import json

# json.load expects the file object itself, not the string returned by f.read()
with open('data.json', "r") as f:
    json_1 = json.load(f)

If the data is in string format, you can use:

import json

json_1 = json.loads('{"a": "aaa", "b": "None", "c": "None"}')
json_2 = json.loads('{"a": "None", "b": "bbb", "c": "None"}')
json_3 = json.loads('{"a": "None", "b": "None", "c": "ccc"}')
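
For the N-file case in the question, the loading step could be wrapped in a small helper. This is only a sketch: the name load_json_files and the assumption that all inputs are *.json files in one directory are mine, not something stated in the question.

```python
import json
from pathlib import Path


def load_json_files(directory="."):
    """Parse every *.json file in `directory` (hypothetical helper).

    Files are read in name order so the result is reproducible.
    """
    documents = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            # json.load reads directly from the open file object
            documents.append(json.load(f))
    return documents
```

The returned list of dicts can then be passed to whichever merge approach you pick.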

As for the solution, iterating over the json files will be the best option. An alternative approach is to remove all the keys whose value is "None" beforehand and then merge the remaining dicts into one. Sample code for that:

json_clean_1 = {k: v for k, v in json_1.items() if v != "None"}
json_clean_2 = {k: v for k, v in json_2.items() if v != "None"}
json_clean_3 = {k: v for k, v in json_3.items() if v != "None"}

output_json = dict(list(json_clean_1.items()) + list(json_clean_2.items()) + list(json_clean_3.items()))
print(output_json)
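
The same idea can be generalized to any number of inputs with a single loop instead of one cleaned dict per file. A minimal sketch, assuming flat dicts and the string "None" as the placeholder; the function name merge_set_values is made up for illustration:

```python
def merge_set_values(dicts, placeholder="None"):
    """Merge flat dicts, keeping for each key the value that is not the placeholder."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            if value != placeholder:
                # A real value always wins over the placeholder.
                merged[key] = value
            else:
                # Keep the key present even if no input ever sets it.
                merged.setdefault(key, placeholder)
    return merged


json_1 = {"a": "aaa", "b": "None", "c": "None"}
json_2 = {"a": "None", "b": "bbb", "c": "None"}
json_3 = {"a": "None", "b": "None", "c": "ccc"}

output_json = merge_set_values([json_1, json_2, json_3])
print(output_json)  # {'a': 'aaa', 'b': 'bbb', 'c': 'ccc'}
```

Unlike the filtered version above, a key that is "None" in every input stays in the result with the value "None" rather than disappearing. If two inputs set the same key, the later one wins.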

Based on the comment, you could use a solution similar to this. It should give you an output dataframe with a column containing the combined values:

import os
import json
import pandas as pd

# get files
json_files = [
    each_file for each_file in os.listdir(".") if each_file.endswith('.json')
]

# read files
dfs = []
for file in json_files:
    with open(file) as f:
        json_data = pd.json_normalize(json.loads(f.read()))
    dfs.append(json_data)

# combine and clean df
combined_df = pd.concat(dfs, ignore_index=True)
cleaned_df = combined_df.replace("None", pd.NA).transpose()

# df with required column
cleaned_df['combined_col'] = cleaned_df[cleaned_df.columns].apply(
    lambda x: ','.join(x.dropna().astype(str)), axis=1)
print(cleaned_df)
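
If the files are nested and you want a json-shaped result rather than a dataframe, plain recursion may be enough. This is a sketch under the assumption from the question that every input has exactly the same structure (same keys, same list lengths); merge_nested is a hypothetical name.

```python
def merge_nested(values, placeholder="None"):
    """Merge parallel values that share one structure, skipping the placeholder.

    Assumes every element of `values` has the same shape.
    """
    first = values[0]
    if isinstance(first, dict):
        # Merge the values found under each key across all inputs.
        return {key: merge_nested([v[key] for v in values], placeholder)
                for key in first}
    if isinstance(first, list):
        # Merge the lists position by position.
        return [merge_nested(list(items), placeholder) for items in zip(*values)]
    # Leaf: take the first value that is actually set, else the placeholder.
    return next((v for v in values if v != placeholder), placeholder)


doc_1 = {"a": "aaa", "inner": {"b": "None", "items": ["None", "None"]}}
doc_2 = {"a": "None", "inner": {"b": "bbb", "items": ["None", "y"]}}
doc_3 = {"a": "None", "inner": {"b": "None", "items": ["x", "None"]}}

merged = merge_nested([doc_1, doc_2, doc_3])
```

This is essentially the zip idea from the question applied at every level of nesting. It will not work if one file replaces a whole sub-structure with "None".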
