
Flattening nested JSON with LIST values in it using Pandas

I am trying to flatten this nested JSON file:

{
    "1": {
        "name": "Treez #0001",
        "description": "Treez #0001",
        "image": "",
        "atributes": [
            {
                "trait_type": "Apple Count",
                "value": "3"
            },
            {
                "trait_type": "Body Type",
                "value": "demeter"
            },
            {
                "trait_type": "Background",
                "value": "RiverLand"
            },
            {
                "trait_type": "Body Texture",
                "value": "TMac"
            },
            {
                "trait_type": "Stone Type",
                "value": "TeenSpirit"
            },
            {
                "trait_type": "Leaf Texture",
                "value": "Parisian"
            },
            {
                "trait_type": "Apple Texture",
                "value": "Pastel"
            },
            {
                "trait_type": "Animal",
                "value": "Monkey"
            },
            {
                "trait_type": "Rarity Score ",
                "value": "1271"
            }
        ]
    },
    "2": {
        "name": "Treez #0002",
        "description": "Treez #0002",
        "image": "",
        "atributes": [
            {
                "trait_type": "Apple Count",
                "value": "3"
            },
            {
                "trait_type": "Body Type",
                "value": "Naked"
            },
            {
                "trait_type": "Background",
                "value": "Heaven"
            },
            {
                "trait_type": "Body Texture",
                "value": "WcWall"
            },
            {
                "trait_type": "Stone Type",
                "value": "Fairy"
            },
            {
                "trait_type": "Leaf Texture",
                "value": "Haze"
            },
            {
                "trait_type": "Apple Texture",
                "value": "SummerTime"
            },
            {
                "trait_type": "Rarity Score ",
                "value": "767"
            }
        ]
    },
    "3": {
        "name": "Treez #0003",
        "description": "Treez #0003",
        "image": "",
        "atributes": [
            {
                "trait_type": "Apple Count",
                "value": "1"
            },
            {
                "trait_type": "Body Type",
                "value": "Naked"
            },
            {
                "trait_type": "Background",
                "value": "FutureCaveMens"
            },
            {
                "trait_type": "Body Texture",
                "value": "CottonCandy"
            },
            {
                "trait_type": "Stone Type",
                "value": "TeenSpirit"
            },
            {
                "trait_type": "Leaf Texture",
                "value": "Energy"
            },
            {
                "trait_type": "Apple Texture",
                "value": "Slushy"
            },
            {
                "trait_type": "Rarity Score ",
                "value": "517"
            }
        ]
    },
    "4": {
        "name": "Treez #0004",
        "description": "Treez #0004",
        "image": "",
        "atributes": [
            {
                "trait_type": "Apple Count",
                "value": "1"
            },
            {
                "trait_type": "Body Type",
                "value": "Naked"
            },
            {
                "trait_type": "Background",
                "value": "ColorfulSkies"
            },
            {
                "trait_type": "Body Texture",
                "value": "CottonCandy"
            },
            {
                "trait_type": "Stone Type",
                "value": "Poison"
            },
            {
                "trait_type": "Leaf Texture",
                "value": "LemonHaze"
            },
            {
                "trait_type": "Apple Texture",
                "value": "Wilderness"
            },
            {
                "trait_type": "Rarity Score ",
                "value": "502"
            }
        ]
    }
}

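pandas.io.json.read_json loads the file without parsing the list: each record becomes a column, and the "atributes" cell still holds the raw list of dicts.

[screenshot of the resulting DataFrame]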

When I try json_normalize:

import json
import pandas as pd

with open("file.t", "r") as f:
    data = json.load(f)
pd.json_normalize(data)

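everything lands in a single row with dotted column names such as 1.name and 1.atributes, and the lists are still unparsed:

[screenshot of the resulting DataFrame]

I couldn't find any parameters that make it work.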

My expected output has these columns: ['name', 'Apple Count', 'Body Type', 'Background', 'Body Texture', 'Stone Type', 'Leaf Texture', 'Apple Texture', 'Animal', 'Rarity Score']

As @sammywemmy suggested, I first ran:

df = pd.json_normalize(data['1'], 'atributes', ['name'])

which gave me a long table with the columns trait_type, value and name:

[screenshot of the long-format table]

Then I pivoted the table:

df.pivot(index='name', columns='trait_type', values='value')

and the result is as expected:

[screenshot of the pivoted table]
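The two steps above handle a single record (data['1']). As a rough sketch of how the same idea might extend to every record at once, the dict's values can be passed to json_normalize as a list. The helper name flatten_traits and the abbreviated sample data are assumptions for illustration, not part of the original post. Stripping trait_type is also an added step, there because the source data spells one trait "Rarity Score " with a trailing space.

```python
import pandas as pd

# Two abbreviated records in the same shape as the question's JSON
# ("atributes" is the key spelling used in the source file).
data = {
    "1": {"name": "Treez #0001", "atributes": [
        {"trait_type": "Apple Count", "value": "3"},
        {"trait_type": "Animal", "value": "Monkey"},
        {"trait_type": "Rarity Score ", "value": "1271"},
    ]},
    "2": {"name": "Treez #0002", "atributes": [
        {"trait_type": "Apple Count", "value": "3"},
        {"trait_type": "Rarity Score ", "value": "767"},
    ]},
}


def flatten_traits(data):
    """Hypothetical helper: one row per record, one column per trait."""
    # json_normalize accepts a list of records, so pass the dict's values.
    long_df = pd.json_normalize(
        list(data.values()), record_path="atributes", meta=["name"]
    )
    # Some trait names carry a trailing space ("Rarity Score "); strip it.
    long_df["trait_type"] = long_df["trait_type"].str.strip()
    # Pivot long -> wide; traits a record lacks become NaN.
    wide = long_df.pivot(index="name", columns="trait_type", values="value")
    # Turn "name" back into a column and drop the "trait_type" axis label.
    return wide.reset_index().rename_axis(columns=None)


wide = flatten_traits(data)
print(wide)
```

pivot raises an error if a record lists the same trait twice, so this sketch assumes each (name, trait_type) pair is unique, as it is in the data shown in the question.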

    import pandas as pd
    tree = {
        "1": {
            "name": "Treez #0001",
            "description": "Treez #0001",
            "image": "",
            "atributes": [
                {
                    "trait_type": "Apple Count",
                    "value": "3"
                },
                {
                    "trait_type": "Body Type",
                    "value": "demeter"
                },
                {
                    "trait_type": "Background",
                    "value": "RiverLand"
                },
                {
                    "trait_type": "Body Texture",
                    "value": "TMac"
                },
                {
                    "trait_type": "Stone Type",
                    "value": "TeenSpirit"
                },
                {
                    "trait_type": "Leaf Texture",
                    "value": "Parisian"
                },
                {
                    "trait_type": "Apple Texture",
                    "value": "Pastel"
                },
                {
                    "trait_type": "Animal",
                    "value": "Monkey"
                },
                {
                    "trait_type": "Rarity Score ",
                    "value": "1271"
                }
            ]
        },
        "2": {
            "name": "Treez #0002",
            "description": "Treez #0002",
            "image": "",
            "atributes": [
                {
                    "trait_type": "Apple Count",
                    "value": "3"
                },
                {
                    "trait_type": "Body Type",
                    "value": "Naked"
                },
                {
                    "trait_type": "Background",
                    "value": "Heaven"
                },
                {
                    "trait_type": "Body Texture",
                    "value": "WcWall"
                },
                {
                    "trait_type": "Stone Type",
                    "value": "Fairy"
                },
                {
                    "trait_type": "Leaf Texture",
                    "value": "Haze"
                },
                {
                    "trait_type": "Apple Texture",
                    "value": "SummerTime"
                },
                {
                    "trait_type": "Rarity Score ",
                    "value": "767"
                }
            ]
        },
        "3": {
            "name": "Treez #0003",
            "description": "Treez #0003",
            "image": "",
            "atributes": [
                {
                    "trait_type": "Apple Count",
                    "value": "1"
                },
                {
                    "trait_type": "Body Type",
                    "value": "Naked"
                },
                {
                    "trait_type": "Background",
                    "value": "FutureCaveMens"
                },
                {
                    "trait_type": "Body Texture",
                    "value": "CottonCandy"
                },
                {
                    "trait_type": "Stone Type",
                    "value": "TeenSpirit"
                },
                {
                    "trait_type": "Leaf Texture",
                    "value": "Energy"
                },
                {
                    "trait_type": "Apple Texture",
                    "value": "Slushy"
                },
                {
                    "trait_type": "Rarity Score ",
                    "value": "517"
                }
            ]
        },
        "4": {
            "name": "Treez #0004",
            "description": "Treez #0004",
            "image": "",
            "atributes": [
                {
                    "trait_type": "Apple Count",
                    "value": "1"
                },
                {
                    "trait_type": "Body Type",
                    "value": "Naked"
                },
                {
                    "trait_type": "Background",
                    "value": "ColorfulSkies"
                },
                {
                    "trait_type": "Body Texture",
                    "value": "CottonCandy"
                },
                {
                    "trait_type": "Stone Type",
                    "value": "Poison"
                },
                {
                    "trait_type": "Leaf Texture",
                    "value": "LemonHaze"
                },
                {
                    "trait_type": "Apple Texture",
                    "value": "Wilderness"
                },
                {
                    "trait_type": "Rarity Score ",
                    "value": "502"
                }
            ]
        }
    }
    
    
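    def traverse_parser_dfs(master_tree):
        # Collects one flat dict per leaf record; each one becomes a DataFrame row.
        flatten_tree_node = []

        def _process_leaves(tree: dict, prefix: str = "node", tree_node: dict = None, update: bool = True):
            if tree_node is None:
                tree_node = {}
            is_nested = False
            if isinstance(tree, dict):
                # First pass: copy string values and descend into nested dicts.
                for k in tree.keys():
                    if isinstance(tree[k], str):
                        tree_node[prefix + "_" + k] = tree[k]
                    elif isinstance(tree[k], dict):
                        # Note: prefix keeps growing across sibling keys, which is why
                        # the output has columns such as node_1_2_3_4_image.
                        prefix += "_" + k
                        _process_leaves(tree[k], prefix=prefix, tree_node=tree_node, update=False)
                # Second pass: each list element gets its own copy of the row built so far.
                for k in tree.keys():
                    if isinstance(tree[k], list):
                        is_nested = True
                        prefix += "_" + k
                        for leaf in tree[k]:
                            _process_leaves(leaf, prefix=prefix, tree_node=tree_node.copy())
                # Emit a row only for nodes that have no list children.
                if not is_nested and update:
                    flatten_tree_node.append(tree_node)

        _process_leaves(master_tree)
        df = pd.DataFrame(flatten_tree_node)
        # Replace characters that are awkward in column names.
        df.columns = df.columns.str.replace("@", "_")
        df.columns = df.columns.str.replace("#", "_")
        return df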
    
    
    print(traverse_parser_dfs(tree))
    
    node_1_name node_1_description node_1_image  ... node_1_2_3_4_image node_1_2_3_4_atributes_trait_type node_1_2_3_4_atributes_value
0   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
1   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
2   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
3   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
4   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
5   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
6   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
7   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
8   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
9   Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
10  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
11  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
12  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
13  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
14  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
15  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
16  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
17  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
18  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
19  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
20  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
21  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
22  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
23  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
24  Treez #0001        Treez #0001               ...                NaN                               NaN                          NaN
25  Treez #0001        Treez #0001               ...                                          Apple Count                            1
26  Treez #0001        Treez #0001               ...                                            Body Type                        Naked
27  Treez #0001        Treez #0001               ...                                           Background                ColorfulSkies
28  Treez #0001        Treez #0001               ...                                         Body Texture                  CottonCandy
29  Treez #0001        Treez #0001               ...                                           Stone Type                       Poison
30  Treez #0001        Treez #0001               ...                                         Leaf Texture                    LemonHaze
31  Treez #0001        Treez #0001               ...                                        Apple Texture                   Wilderness
32  Treez #0001        Treez #0001               ...                                        Rarity Score                           502
33  Treez #0001        Treez #0001               ...                                                  NaN                          NaN

[34 rows x 20 columns]
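The run above yields one row per attribute and a separate set of columns per record, which is not the wide layout the question asks for. As a hedged alternative in plain Python, each record can be turned into one flat dict before the DataFrame is built. The helper name rows_from_records and the abbreviated sample dict are assumptions for illustration. The .strip() call is an added step to handle the trailing space in "Rarity Score ".

```python
import pandas as pd

# Abbreviated sample in the same shape as the tree dict above.
tree = {
    "1": {"name": "Treez #0001", "description": "Treez #0001", "image": "",
          "atributes": [
              {"trait_type": "Apple Count", "value": "3"},
              {"trait_type": "Animal", "value": "Monkey"},
              {"trait_type": "Rarity Score ", "value": "1271"},
          ]},
    "2": {"name": "Treez #0002", "description": "Treez #0002", "image": "",
          "atributes": [
              {"trait_type": "Apple Count", "value": "3"},
              {"trait_type": "Rarity Score ", "value": "767"},
          ]},
}


def rows_from_records(records):
    """Hypothetical helper: build one flat dict per record."""
    rows = []
    for record in records.values():
        row = {"name": record["name"]}
        # Each attribute contributes one column named after its trait_type.
        for trait in record.get("atributes", []):
            row[trait["trait_type"].strip()] = trait["value"]
        rows.append(row)
    return rows


# Keys missing from a row (e.g. "Animal" for record 2) are filled with NaN.
df = pd.DataFrame(rows_from_records(tree))
print(df)
```

This skips json_normalize and pivot altogether. If a record lists the same trait twice, the later value silently overwrites the earlier one.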
