
Parsing multiple JSON objects and merging them into a single JSON object in Python

I have a string containing multiple JSON objects. I want to convert that string into a single JSON object.

For example, given the following input:

input = """
{
 "a" : {
        "x":"y",
        "w":"z" 
    }
} 

{
"b" : {
       "v":"w",
       "z":"l"
   }
}
"""

The expected output is:

{
"a" : {
       "x":"y",
       "w":"z"
   },
"b" : {
       "v":"w",
       "z":"l"
    }
}

If we treat them as dictionaries and have

>>> a = {'a':{'a':1}}
>>> b = {'b':{'b':1}}

we can simply merge them with update:

>>> a.update(b)
>>> a
{'a': {'a': 1}, 'b': {'b': 1}}
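
One caveat worth keeping in mind: update is a shallow merge, so if two of the objects share a top-level key, the later value replaces the earlier one wholesale rather than being combined. A small sketch with made-up values (the names first, second and merged are illustrative only, not from the question):

```python
# Illustrative values only; both objects share the top-level key "a".
first = {"a": {"x": "y"}}
second = {"a": {"w": "z"}, "b": {"v": "w"}}

merged = {}
merged.update(first)
merged.update(second)

# The inner dict that came from `first` is replaced, not merged.
print(merged)  # {'a': {'w': 'z'}, 'b': {'v': 'w'}}
```

For the input in the question the top-level keys ("a" and "b") are distinct, so this does not come into play.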

You can take advantage of the fact that each top-level object begins on a line that starts with '{':

import json

# named inp rather than input to avoid shadowing the builtin
inp = """
{
 "a" : {
        "x":"y",
        "w":"z"
    }
}

{
"b" : {
       "v":"w",
       "z":"l"
   }
}"""

my_dicts = {}
start_dict = False
one_json = ''

for line in inp.split('\n'):
    if line.startswith('{'):
        # a new top-level object starts here; parse the previous one, if any
        if start_dict:
            my_dicts.update(json.loads(one_json))
            one_json = ''
        else:
            start_dict = True

    one_json = f'{one_json}\n{line}'

# parse the last object
my_dicts.update(json.loads(one_json))

print(my_dicts)

Output (on Python 3.7+, where dictionaries keep insertion order):

{'a': {'x': 'y', 'w': 'z'}, 'b': {'v': 'w', 'z': 'l'}}

Build up a list of dictionaries by scanning the input character by character. One could also scan it line by line.

There is a good chance that an existing library already does this, but here is one way to go about it:

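import json

# inp holds the text from the question
# (input is a builtin, so it is renamed from input to inp)
braces = []      # stack of closing braces still expected
dicts = []       # parsed top-level objects
dict_chars = []  # characters of the object currently being read

for char in inp:
  dict_chars.append(char)
  if char == '{':
    braces.append('}')
  elif char == '}':
    braces.pop()
    if not braces:
      # the top-level object is complete, so parse it right away;
      # json.loads tolerates the whitespace collected before it
      dicts.append(json.loads(''.join(dict_chars)))
      dict_chars = []

# note: this simple counter is confused by braces inside string values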

Then, merge dictionaries in the list.

merged_dict = {}
for dct in dicts:
  merged_dict.update(dct)
print(merged_dict)

Output (Python 3):

{'a': {'x': 'y', 'w': 'z'}, 'b': {'v': 'w', 'z': 'l'}}

Output the merged dictionary as a JSON string with indentation.

merged_output = json.dumps(merged_dict, indent=4)
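
On the point about an existing library: the standard json module may already be enough. json.JSONDecoder.raw_decode parses one JSON value starting at a given index and returns the value together with the index where it stopped, so it can be called repeatedly over a string of concatenated objects. The sketch below is one possible way to use it; the helper name parse_concatenated and the compact sample string are assumptions made for illustration, not part of the answers above.

```python
import json


def parse_concatenated(text):
    """Yield each top-level JSON value found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here
        if text[pos].isspace():
            pos += 1
            continue
        # returns the parsed value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Same data as in the question, written compactly (illustrative sample)
inp = '{"a": {"x": "y", "w": "z"}}\n\n{"b": {"v": "w", "z": "l"}}\n'

merged = {}
for obj in parse_concatenated(inp):
    merged.update(obj)

print(json.dumps(merged, indent=4))
```

Because the decoder does the tokenising, this does not depend on where line breaks fall, and braces that appear inside string values are handled correctly.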
