
Pandas json_normalize on recursively nested json

I have a JSON file with a deeply nested, recursive structure:

{"children": [{
              "val": x,
              "data": y,
              "children": [{
                           "val": x,
                           "data": y,
                           "children": [{
                                         ....
              }, {
              "val": x,
              "data": y,
              "children": [{
                           "val": x,
                           "data": y,
                           "children": [{
                                         ....

Using pandas json_normalize as follows:

json_normalize(data = self.data["children"], record_path="children")

gives a DataFrame in which the first level is flattened, but the deeper levels remain as unexpanded JSON (lists of dicts) inside the "children" column.

How can I flatten my DataFrame so that the entire JSON tree is unpacked and flattened?

Provided your JSON is well formed and has the same structure at every level, you can extract the data for each level by passing a list of keys to the record_path argument of json_normalize. Repeating 'children' n times in that list descends n levels into the tree.

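import pandas as pd

# Sample tree: two top-level nodes, each nested three levels deep.
data = {'children': [{
          'val': 1,
          'data': 2,
          'children': [{
                       'val': 3,
                       'data': 4,
                       'children': [{'val': 4,
                                     'data': 5}],
                       }],
          }, {
          'val': 6,
          'data': 7,
          'children': [{
                       'val': 8,
                       'data': 9,
                       'children': [{'val': 10,
                                     'data': 11}],
                       }]
          }]}

# One pass per level: record_path=['children'] * i descends i levels.
# The sample has three levels, so i runs from 1 to 3.
for i in range(1, 4):
    print(pd.json_normalize(data=data, record_path=['children'] * i))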

This gives the following output, one DataFrame per level (column order may differ between pandas versions). You can then combine the levels into a single DataFrame if you wish.

                                            children  data  val
0  [{'val': 3, 'data': 4, 'children': [{'val': 4,...     2    1
1  [{'val': 8, 'data': 9, 'children': [{'val': 10...     7    6
                    children  data  val
0    [{'val': 4, 'data': 5}]     4    3
1  [{'val': 10, 'data': 11}]     9    8
   data  val
0     5    4
1    11   10
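
If the depth is not known in advance, or you want every node in one DataFrame straight away, walking the tree yourself is an alternative to calling json_normalize once per level. The following is only a minimal sketch, not part of the original answer: flatten_children, the depth and parent columns, and the use of -1 as the parent of a top-level node are all illustrative choices, and it assumes every node is a dict whose nested nodes live under a "children" key.

```python
import pandas as pd


def flatten_children(tree, child_key="children"):
    """Return one row per node of a recursively nested dict.

    Hypothetical helper: 'depth' and 'parent' are illustrative columns.
    'parent' holds the row index of the parent node, or -1 at the top level.
    """
    rows = []
    # Each stack entry is (node, depth, parent_row_index).
    # Children are pushed in reverse so they are popped in document order.
    stack = [(child, 1, -1) for child in reversed(tree.get(child_key, []))]
    while stack:
        node, depth, parent = stack.pop()
        # Copy every field except the nested list itself.
        row = {k: v for k, v in node.items() if k != child_key}
        row["depth"] = depth
        row["parent"] = parent
        rows.append(row)
        index = len(rows) - 1
        for child in reversed(node.get(child_key, [])):
            stack.append((child, depth + 1, index))
    return pd.DataFrame(rows)


# Same shape as the sample data in the answer.
tree = {"children": [
    {"val": 1, "data": 2, "children": [
        {"val": 3, "data": 4, "children": [{"val": 4, "data": 5}]}]},
    {"val": 6, "data": 7, "children": [
        {"val": 8, "data": 9, "children": [{"val": 10, "data": 11}]}]},
]}

flat = flatten_children(tree)
print(flat)
```

Because the walk uses an explicit stack rather than recursion, it does not hit Python's recursion limit on very deep trees, and nodes without a "children" key are simply treated as leaves.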
