
How to flatten nested JSON arrays with their parent in Python using pandas

I need to parse and flatten nested JSON in Python using the pandas module.

SOURCE JSON:

{
    "people": [{
            "name": "ABC",
            "age": "33",
            "mobile": "44545",
            "location": "hyderabad",
            "interests": [{
                "hobby": "dancing",
                "food": "continental",
                "city": "Paris"
            }]
        },
        {
            "name": "DEF",
            "age": "11",
            "mobile": "12121212",
            "location": "pune",
            "interests": [{
                "hobby": "reading",
                "food": "Pizza",
                "city": "France"
            }]
        }
    ]
}

From the source JSON file above, I need to produce two different JSON files, as follows:

OUTPUT JSON 1:

{"name": "ABC", "age": "33", "mobile": "44545", "location": "hyderabad"}
{"name": "DEF", "age": "11", "mobile": "12121212", "location": "pune"}

OUTPUT JSON 2:

{"name": "ABC", "interests_hobby": "dancing", "interests_food": "continental", "interests_city": "Paris"}
{"name": "DEF", "interests_hobby": "reading", "interests_food": "Pizza", "interests_city": "France"}

The condition is that we must use Python and the pandas module (pd.json_normalize).

First, open the JSON file and load it into a Python object (for example with json.load). Then use the command

df = pd.json_normalize(json_obj, record_path="people")

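to obtain a DataFrame with the top-level fields of each person. The "interests" field is not split by this call: it stays a single column holding the nested list. To flatten it, repeat the step, calling pd.json_normalize again with record_path pointing at "interests" and meta listing the parent fields you want to keep, such as "name".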
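A minimal sketch of both steps, assuming the source JSON has already been loaded into a dict (inlined here as `data` for illustration; the names `people_df` and `interests_df` are not from the original post). The first call yields the flat per-person fields, and the second repeats json_normalize on the nested "interests" list while carrying the parent's "name" along through meta:

```python
import pandas as pd

# In practice this dict would come from json.load() on the source file.
data = {
    "people": [
        {"name": "ABC", "age": "33", "mobile": "44545", "location": "hyderabad",
         "interests": [{"hobby": "dancing", "food": "continental", "city": "Paris"}]},
        {"name": "DEF", "age": "11", "mobile": "12121212", "location": "pune",
         "interests": [{"hobby": "reading", "food": "Pizza", "city": "France"}]},
    ]
}

# Output 1: one row per person. "interests" remains a list column, so drop it.
people_df = pd.json_normalize(data, record_path="people").drop(columns="interests")

# Output 2: one row per entry in "interests", with prefixed column names
# and the parent's "name" attached as metadata.
interests_df = pd.json_normalize(
    data["people"],
    record_path="interests",
    meta=["name"],
    record_prefix="interests_",
)

# Metadata columns are appended last; move "name" to the front.
interests_df = interests_df[
    ["name", "interests_hobby", "interests_food", "interests_city"]
]
```

Passing data["people"] rather than the whole dict keeps the meta column named plainly "name", so no renaming is needed afterwards.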

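To produce JSON in your desired format, I recommend

df.to_json(orient='records')

which returns a JSON string (a single array of records) without the DataFrame index. Add lines=True to get one object per line, as in your desired output, and pass a file path as the first argument to write the result to a file instead of returning it.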
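As a sketch of the output step, assuming a small DataFrame like the one built from the "people" records (the name `people_df`, the file name `people.jsonl` and the temporary directory are illustrative only): with lines=True, to_json emits one object per line, which matches the layout of the desired output files.

```python
import json
import os
import tempfile

import pandas as pd

people_df = pd.DataFrame([
    {"name": "ABC", "age": "33", "mobile": "44545", "location": "hyderabad"},
    {"name": "DEF", "age": "11", "mobile": "12121212", "location": "pune"},
])

# Without a path, to_json returns the JSON text as a string.
json_lines = people_df.to_json(orient="records", lines=True)

# With a path as the first argument, the same text is written to that file.
out_path = os.path.join(tempfile.mkdtemp(), "people.jsonl")
people_df.to_json(out_path, orient="records", lines=True)

# Read the file back to check that each line is a standalone JSON object.
with open(out_path, encoding="utf-8") as fh:
    rows = [json.loads(line) for line in fh if line.strip()]
```

The same call on the flattened interests DataFrame would produce the second output file.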
