
How to split one DataFrame row into multiple rows when one column holds a nested array

I have a DataFrame df1 =

       name   age  data        date of joining
    1  Steve  27   nestedjson  01-08-2021

Here nestedjson is:

[{'inputImage': 'url1', 'outputImage': 'url2', 'timeTracks': {'main': {'actualTotal': 2.993}, 'request': {'Code': 'Internal Test1'}}}, {'inputImage': 'url3', 'outputImage': 'url4', 'timeTracks': {'main': {'actualTotal': 3.283}, 'request': {'Code': 'Internal Test2'}}}, {'inputImage': 'url5', 'outputImage': 'url6', 'timeTracks': {'main': {'actualTotal': 3.31}, 'request': {'Code': 'Internal Test3'}}}]

I need the final DataFrame, df2, to look like this: [image: enter image description here]

Try this, if it helps:

    import pandas as pd

    # One record per person; 'data' holds the nested list to expand into rows
    sample_data = [{
        'name': 'Steve',
        'age': 27,
        'date_of_joining': '01-08-2021',
        'data': [{
            'inputImage': 'url1',
            'outputImage': 'url2',
            'timeTracks': {
                'main': {'actualTotal': 2.993},
                'request': {'Code': 'Internal Test1'}
            }
        }, {
            'inputImage': 'url3',
            'outputImage': 'url4',
            'timeTracks': {
                'main': {'actualTotal': 3.283},
                'request': {'Code': 'Internal Test2'}
            }
        }]
    }]

    # record_path: the nested list that becomes the rows
    # meta: parent fields repeated on every row
    df = pd.json_normalize(sample_data, record_path=['data'],
                           meta=['name', 'age', 'date_of_joining'], errors='ignore')
    print(df)

Output:

      inputImage outputImage  timeTracks.main.actualTotal timeTracks.request.Code   name age date_of_joining
    0       url1        url2                        2.993          Internal Test1  Steve  27      01-08-2021
    1       url3        url4                        3.283          Internal Test2  Steve  27      01-08-2021
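
If the data is already in a DataFrame like df1 rather than in a list of dicts, one possible approach is to explode the nested column and flatten each element. This is only a sketch under some assumptions: the nested column is named `data`, each cell holds a list of dicts (or its string representation), and `to_records` is a hypothetical helper added for illustration, not part of pandas.

```python
import ast

import pandas as pd

nested = [
    {'inputImage': 'url1', 'outputImage': 'url2',
     'timeTracks': {'main': {'actualTotal': 2.993}, 'request': {'Code': 'Internal Test1'}}},
    {'inputImage': 'url3', 'outputImage': 'url4',
     'timeTracks': {'main': {'actualTotal': 3.283}, 'request': {'Code': 'Internal Test2'}}},
    {'inputImage': 'url5', 'outputImage': 'url6',
     'timeTracks': {'main': {'actualTotal': 3.31}, 'request': {'Code': 'Internal Test3'}}},
]

# Rebuild df1 from the question: a single row whose 'data' cell is a list of dicts
df1 = pd.DataFrame([{'name': 'Steve', 'age': 27, 'data': nested,
                     'date of joining': '01-08-2021'}])


def to_records(value):
    """Hypothetical helper: parse the cell if it was stored as text."""
    return ast.literal_eval(value) if isinstance(value, str) else value


df1['data'] = df1['data'].apply(to_records)

# One row per list element; the other columns are repeated
exploded = df1.explode('data', ignore_index=True)

# Flatten each dict into dotted column names, e.g. 'timeTracks.main.actualTotal'
flat = pd.json_normalize(exploded['data'].tolist())

# Both frames share a 0..n-1 index, so a column-wise concat lines the rows up
df2 = pd.concat([exploded.drop(columns='data'), flat], axis=1)
print(df2)
```

The `ignore_index=True` argument to `explode` needs pandas 1.1 or later; on older versions, calling `reset_index(drop=True)` after `explode` should have the same effect.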

