Convert pandas dataframe of relational 1:N data to nested JSON

Question

I have this df:

   id_edo nombre_edo  id_mun nombre_mun
0       1        AGS       1         m1
1       1        AGS       2         m2
2       2         BC       3         m3
3       2         BC       4         m4
4       3        BCS       5         m5
5       3        BCS       6         m6
6       4        CAM       7         m7
7       4        CAM       8         m8

It is just a 1-to-N relationship representing states and municipalities in Mexico (it is only mock data, but you get the idea).

Now I want to get the representation of the data in a Nested JSON Array of Objects like this:

[{
        "id_edo": 1,
        "nombre_edo": "AGS",
        "children": [{
            "id_mun": 1,
            "nombre_mun": "m1"
        }, {
            "id_mun": 2,
            "nombre_mun": "m2"
        }]
    },
    {
        "id_edo": 2,
        "nombre_edo": "BC",
        "children": [{
            "id_mun": 3,
            "nombre_mun": "m3"
        }, {
            "id_mun": 4,
            "nombre_mun": "m4"
        }]
    },
    {
        "id_edo": 3,
        "nombre_edo": "BCS",
        "children": [{
            "id_mun": 5,
            "nombre_mun": "m5"
        }, {
            "id_mun": 6,
            "nombre_mun": "m6"
        }]
    },
    {
        "id_edo": 4,
        "nombre_edo": "CAM",
        "children": [{
            "id_mun": 7,
            "nombre_mun": "m7"
        }, {
            "id_mun": 8,
            "nombre_mun": "m8"
        }]
    }
]

This is as far as I could get:

dff = (df.groupby(['id_edo', 'nombre_edo'])[['id_mun', 'nombre_mun']]
       .apply(lambda x: [dict(x.values)])
       .reset_index(name='children')
       .to_dict(orient='records'))

Output: [{'id_edo': 1, 'nombre_edo': 'AGS', 'children': [{1: 'm1', 2: 'm2'}]}, {'id_edo': 2, 'nombre_edo': 'BC', 'children': [{3: 'm3', 4: 'm4'}]}, {'id_edo': 3, 'nombre_edo': 'BCS', 'children': [{5: 'm5', 6: 'm6'}]}, {'id_edo': 4, 'nombre_edo': 'CAM', 'children': [{7: 'm7', 8: 'm8'}]}]

As you can see in the output, each nested dict maps one value to the other ({id_mun: nombre_mun}), which is different from what I want to achieve ({"id_mun": value, "nombre_mun": value}).

Answer 1

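In the lambda function, you are calling dict(x.values). This turns each group's rows into a single dictionary that maps id_mun to nombre_mun. You need to iterate over that dictionary and split every key-value pair out into its own record.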
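To see where the {id_mun: nombre_mun} shape comes from, it can help to look at a single group in isolation. The following is a minimal sketch; the two-row frame named group is a hypothetical stand-in for what the lambda receives as x.

```python
import pandas as pd

# Hypothetical stand-in for one group (the "x" passed to the lambda)
group = pd.DataFrame({'id_mun': [1, 2], 'nombre_mun': ['m1', 'm2']})

# .values is a 2-D array with one row per municipality: [[1, 'm1'], [2, 'm2']]
pairs = group.values

# dict() treats every two-element row as a (key, value) pair,
# so the column names are lost and id_mun becomes the key
as_dict = dict(pairs)
print(as_dict)  # {1: 'm1', 2: 'm2'}
```

Because the column names never reach the dictionary, they have to be added back in a separate step.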

Instead of writing this:

   #.apply(lambda x: [dict(x.values)])

Call a helper function that converts that dictionary into the required list of dictionaries:

   .apply(lambda x: sp(dict(x.values)))

Here's the full code:

import pandas as pd

df = pd.DataFrame({'id_edo':[1,1,2,2,3,3,4,4],
              'nombre_edo':['AGS','AGS','BC','BC','BCS','BCS','CAM','CAM'],
              'id_mun':[1,2,3,4,5,6,7,8],
              'nombre_mun':['m1','m2','m3','m4','m5','m6','m7','m8']})

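def sp(a):
    # a maps id_mun -> nombre_mun for one state;
    # build one {'id_mun': ..., 'nombre_mun': ...} record per entry
    mun = []
    for k, v in a.items():
        mun.append({'id_mun': k, 'nombre_mun': v})
    return mun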

dff = (df.groupby(['id_edo', 'nombre_edo'])[['id_mun', 'nombre_mun']]
       #.apply(lambda x: [dict(x.values)])
       .apply(lambda x: sp(dict(x.values)))
       .reset_index(name='children')
       .to_dict(orient='records'))

The output of this will be as follows:

[{
   'id_edo': 1, 
   'nombre_edo': 'AGS', 
   'children': [{'id_mun': 1, 'nombre_mun': 'm1'}, 
                {'id_mun': 2, 'nombre_mun': 'm2'}]
 }, 
 {
   'id_edo': 2, 
   'nombre_edo': 'BC', 
   'children': [{'id_mun': 3, 'nombre_mun': 'm3'}, 
                {'id_mun': 4, 'nombre_mun': 'm4'}]
 }, 
 {
   'id_edo': 3, 
   'nombre_edo': 'BCS', 
   'children': [{'id_mun': 5, 'nombre_mun': 'm5'}, 
                {'id_mun': 6, 'nombre_mun': 'm6'}]
 }, 
 {
   'id_edo': 4, 
   'nombre_edo': 'CAM', 
   'children': [{'id_mun': 7, 'nombre_mun': 'm7'}, 
                {'id_mun': 8, 'nombre_mun': 'm8'}]
  }
]
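The result above is a Python list of dictionaries rather than a JSON string. As a possible alternative (not part of the answer above), the sketch below skips the intermediate dictionary by calling to_dict(orient='records') on each group, then serializes the result with json.dumps. The names records and nested_json are illustrative, and the sketch assumes a reasonably recent pandas version in which to_dict returns native Python types.

```python
import json
import pandas as pd

df = pd.DataFrame({'id_edo': [1, 1, 2, 2, 3, 3, 4, 4],
                   'nombre_edo': ['AGS', 'AGS', 'BC', 'BC', 'BCS', 'BCS', 'CAM', 'CAM'],
                   'id_mun': [1, 2, 3, 4, 5, 6, 7, 8],
                   'nombre_mun': ['m1', 'm2', 'm3', 'm4', 'm5', 'm6', 'm7', 'm8']})

# One parent record per (id_edo, nombre_edo) group,
# with the group's municipality rows as a list of child records
records = [
    {
        'id_edo': int(id_edo),  # cast so json.dumps gets a plain Python int
        'nombre_edo': nombre_edo,
        'children': group[['id_mun', 'nombre_mun']].to_dict(orient='records'),
    }
    for (id_edo, nombre_edo), group in df.groupby(['id_edo', 'nombre_edo'])
]

# Serialize the nested structure to a JSON string
nested_json = json.dumps(records, indent=4)
print(nested_json)
```

Since no dictionary keyed on id_mun is built along the way, rows that share an id_mun within a group would each keep their own child record.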

solution1
1 ACCEPTED 2020-09-09 02:50:10
